Difference between revisions of "SMILA/Documentation/ApertureMimeTypeIdentifier"

From Eclipsepedia

Line 1: Line 1:
 
<span style="color:#ff0000">'''This component is not yet available in our repository. We are in the process of creating CQs for the required third-party code and hope to get permission to use it in our project.'''</span>
 
<span style="color:#ff0000">'''This component is not yet available in our repository. We are in the process of creating CQs for the required third-party code and hope to get permission to use it in our project.'''</span>
  
== Bundle: <tt>org.eclipse.smila.processing.pipelets.aperture.ApertureMimeTypeIdentifier</tt> ==
+
== Class: <tt>org.eclipse.smila.aperture.ApertureMimeTypeIdentifier</tt> ==  
 +
 
 +
Located in bundle: <tt>org.eclipse.smila.aperture</tt>
  
 
=== Description ===
 
=== Description ===
This ProcessingService is used to identify the mime type of a document. The service uses either the document's content (a byte[]), a file extension, or both, so the record does not need to contain a value for both properties ''ContentAttachment'' and ''FileExtensionAttribute''. The identified mime type is stored in an attribute of the record.
 
  
It is strongly recommended that you use both (input data and extension) to identify the mime type of the data, since the Aperture mime type identification mainly focuses on the magic numbers in the file and so often fails to determine e.g. office documents' mime types when no content is given.
+
This service implements the [http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/package-summary.html MimeTypeIdentifier] interface using the "magic" identification functionality of [http://aperture.sourceforge.net/index.html Aperture]. The service uses either the document's content (a byte[]), a file extension, or both. For best results, use both (input data and extension) to identify the mime type of the data, since the Aperture mime type identification mainly focuses on the magic numbers in the file and so often fails to determine e.g. office documents' mime types when no content is given.
  
 
For further information on the Aperture mime type extraction, please consult the appropriate [http://aperture.sourceforge.net/ Aperture] documentation pages (e.g. [http://sourceforge.net/apps/trac/aperture/wiki/MIMETypeIdentification MIMETypeIdentification]).
 
For further information on the Aperture mime type extraction, please consult the appropriate [http://aperture.sourceforge.net/ Aperture] documentation pages (e.g. [http://sourceforge.net/apps/trac/aperture/wiki/MIMETypeIdentification MIMETypeIdentification]).
Line 12: Line 13:
 
The javadoc for the implemented interface can be found [http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/package-summary.html here].
 
The javadoc for the implemented interface can be found [http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/package-summary.html here].
  
==== Useful Information ====
+
To enable the service, start bundle <tt>org.eclipse.smila.aperture</tt> and get an OSGi service reference for interface <tt>org.eclipse.smila.common.mimetype.MimeTypeIdentifier</tt>. Take care not to start the <tt>org.eclipse.smila.common.mimetype.impl</tt> bundle, to ensure that the Aperture-based implementation is used and not the simplistic one that SMILA provides as a fallback. The service rankings are set such that the Aperture implementation should be preferred if both are running, but it is always better to be sure what happens in your system.
 
+
Note that this ProcessingService also is a DeclarativeService that implements interface <tt>org.eclipse.smila.processing.pipelets.aperture.MimeTypeIdentifier</tt> and can be used outside the workflow as well.
+
  
 
==== Interaction with the MimeTypeIdentifyPipelet ====
 
==== Interaction with the MimeTypeIdentifyPipelet ====
  
The MimeTypeIdentifyPipelet uses OSGi to access a MimeTypeIdentifier service.
+
When the Aperture-based MimeTypeIdentifier is started, the <tt>org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet</tt> uses it automatically (unless another MimeTypeIdentifier service with an even higher service ranking is active).
 
+
So if you want to use the Aperture mime type identifier, you should start the aperture bundle and take care not to start the <tt>org.eclipse.smila.common.mimetype.impl</tt> bundle that contains the <tt>SimpleMimeTypeIdentifier</tt>, a mime type identifier that identifies mime types based only on file extensions, using a mapping file (see [http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/impl/package-summary.html org.eclipse.smila.common.mimetype.impl]).
+
 
+
=== Configuration ===
+
  
 
For information on how to configure the mime type identification pipelet, which uses the MimeTypeIdentifier service to recognize attachment mime types, please refer to [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets#Bundle:_org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet|MimeTypeIdentifyPipelet]].
 
For information on how to configure the mime type identification pipelet, which uses the MimeTypeIdentifier service to recognize attachment mime types, please refer to [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets#Bundle:_org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet|MimeTypeIdentifyPipelet]].
  
 
[[Category:SMILA]] [[Category:SMILA/Processing Service]]
 
[[Category:SMILA]] [[Category:SMILA/Processing Service]]

Revision as of 05:49, 19 September 2011

This component is not yet available in our repository. We are in the process of creating CQs for the required third-party code and hope to get permission to use it in our project.

Class: org.eclipse.smila.aperture.ApertureMimeTypeIdentifier

Located in bundle: org.eclipse.smila.aperture

Description

This service implements the MimeTypeIdentifier interface using the "magic" identification functionality of Aperture. The service uses either the document's content (a byte[]), a file extension, or both. For best results, use both (input data and extension) to identify the mime type of the data, since the Aperture mime type identification mainly focuses on the magic numbers in the file and so often fails to determine e.g. office documents' mime types when no content is given.
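
To illustrate why content and extension complement each other, here is a minimal sketch of magic-number identification with an extension hint. It is not Aperture's algorithm and does not use the SMILA API: the class `MimeSketch`, its method `identify`, and both lookup tables are hypothetical names chosen for this example. It relies only on two well-known facts: PDF files start with `%PDF`, and ZIP containers (which include the OOXML office formats) start with `PK\x03\x04`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: not Aperture's algorithm and not the SMILA API. */
final class MimeSketch {
    // Hypothetical table of ZIP-based office formats, keyed by file extension.
    private static final Map<String, String> ZIP_BASED = Map.of(
        "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");

    // Hypothetical extension-only fallback table.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "pdf", "application/pdf",
        "txt", "text/plain");

    private static final byte[] PDF_MAGIC = "%PDF".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    /** Returns true if data begins with the given magic bytes. */
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns a MIME type, or null if neither content nor extension is recognized. */
    static String identify(byte[] data, String extension) {
        String ext = extension == null ? "" : extension.toLowerCase(Locale.ROOT);
        if (startsWith(data, PDF_MAGIC)) {
            return "application/pdf";
        }
        if (startsWith(data, ZIP_MAGIC)) {
            // The magic number only says "ZIP container"; the extension refines it.
            return ZIP_BASED.getOrDefault(ext, "application/zip");
        }
        // No recognized magic number: fall back to the extension alone.
        String zipBased = ZIP_BASED.get(ext);
        return zipBased != null ? zipBased : BY_EXTENSION.get(ext);
    }
}
```

In this sketch, ZIP content without an extension can only be reported as a generic ZIP archive, while an extension without content cannot be checked against the actual bytes, which is why supplying both gives the most reliable answer.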

For further information on the Aperture mime type extraction, please consult the appropriate Aperture documentation pages (e.g. MIMETypeIdentification).

The javadoc for the implemented interface can be found here.

To enable the service, start bundle org.eclipse.smila.aperture and get an OSGi service reference for interface org.eclipse.smila.common.mimetype.MimeTypeIdentifier. Take care not to start the org.eclipse.smila.common.mimetype.impl bundle, to ensure that the Aperture-based implementation is used and not the simplistic one that SMILA provides as a fallback. The service rankings are set such that the Aperture implementation should be preferred if both are running, but it is always better to be sure what happens in your system.
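
The ranking rule mentioned above can be sketched without an OSGi runtime. When several services are registered under the same interface, OSGi hands out the one with the highest service ranking and, on a tie, the one with the lowest service id. The following plain-Java model only simulates that selection rule; the class `RankingSketch`, the implementation labels, and the ranking values are assumptions made for illustration and are not the values SMILA actually registers.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Plain-Java model of OSGi's default service selection; no OSGi API is used. */
final class RankingSketch {

    /** A hypothetical service registration: label, service ranking, service id. */
    static final class Registration {
        final String implementation;
        final int ranking;
        final long serviceId;

        Registration(String implementation, int ranking, long serviceId) {
            this.implementation = implementation;
            this.ranking = ranking;
            this.serviceId = serviceId;
        }
    }

    /** Picks the registration with the highest ranking; ties go to the lowest service id. */
    static Optional<Registration> select(List<Registration> registrations) {
        Comparator<Registration> byRankingDescending =
            (a, b) -> Integer.compare(b.ranking, a.ranking);
        Comparator<Registration> order =
            byRankingDescending.thenComparingLong(r -> r.serviceId);
        return registrations.stream().min(order);
    }

    public static void main(String[] args) {
        // Ranking values here are made up for the example.
        List<Registration> active = List.of(
            new Registration("simple (fallback)", 0, 41L),
            new Registration("aperture", 10, 42L));
        select(active).ifPresent(r -> System.out.println("selected: " + r.implementation));
    }
}
```

If only one of the two bundles is started, the list contains a single registration and the question of ranking never arises, which is the safer setup recommended above.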

Interaction with the MimeTypeIdentifyPipelet

When the Aperture-based MimeTypeIdentifier is started, the org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet uses it automatically (unless another MimeTypeIdentifier service with an even higher service ranking is active).

For information on how to configure the mime type identification pipelet, which uses the MimeTypeIdentifier service to recognize attachment mime types, please refer to MimeTypeIdentifyPipelet.