Difference between revisions of "SMILA/Documentation/MimeTypeIdentifier"

(Overview)
(org.eclipse.smila.tika)
 
(12 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
== Overview ==
 
  
A <tt>MimeTypeIdentifier</tt> is an OSGi service which can be used to identify the MIME type of a given <tt>byte[]</tt> or a file extension. The documentation of a pipelet which uses this service can be found at [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets#Bundle:_org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet| SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets ]].
+
A <tt>MimeTypeIdentifier</tt> is an OSGi service which can be used to identify the MIME type of a given <tt>byte[]</tt>, <tt>InputStream</tt> or a file extension. The documentation of a pipelet which uses this service can be found at [[SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets#org.eclipse.smila.processing.pipelets.MimeTypeIdentifyPipelet| SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets ]].
  
 
== API ==
 
 +
The Javadoc for the <tt>MimeTypeIdentifier</tt> interface is available [http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/MimeTypeIdentifier.html here].
  
<source lang="java">
+
==Implementations==
/**
+
* Service interface to identify a MimeType.
+
*/
+
public interface MimeTypeIdentifier {
+
  
  /**
+
It is possible to provide different implementations of the <tt>MimeTypeIdentifier</tt> interface. In general it makes sense to activate only one <tt>MimeTypeIdentifier</tt> implementation at a time, which is achieved by starting only the bundle with the desired implementation. If multiple implementations are started, a client using the <tt>MimeTypeIdentifier</tt> has to use a filter to select between them, for example on the component name. Otherwise the client gets whichever reference the OSGi framework selects: the registration with the highest <tt>service.ranking</tt> property or, if the rankings are equal, the one that was registered first.
  * Identifies a MimeType based on the given data.
+
  *
+
  * @param data
+
  *          a byte[] containing the data
+
  * @return the detected MimeType
+
  * @throws MimeTypeParseException
+
  *          if any error occurs
+
  */
+
  String identify(byte[] data) throws MimeTypeParseException;
+
  
  /**
+
Below is a list of the currently available implementations.
  * Identifies a MimeType based on the file extension.
+
  *
+
  * @param extension
+
  *          the extension of the filename
+
  * @return the detected MimeType
+
  * @throws MimeTypeParseException
+
  *          if any error occurs
+
  */
+
  String identify(String extension) throws MimeTypeParseException;
+
  
  /**
+
===org.eclipse.smila.common.mimetype.impl===
  * Identifies a MimeType based on the given data and file extension.
+
  *
+
  * @param data
+
  *          a byte[] containing the data
+
  * @param extension
+
  *          the extension of the filename
+
  * @return the detected MimeType
+
  * @throws MimeTypeParseException
+
  *          if any error occurs
+
  */
+
  String identify(byte[] data, String extension) throws MimeTypeParseException;
+
}
+
</source>
+
  
==Implementations==
+
The default implementation <tt>SimpleMimeTypeIdentifier</tt> can identify MIME types only by file extension. Identification by <tt>byte[]</tt> or <tt>InputStream</tt> is not supported. The mapping file <tt>mime.types</tt> (located inside the bundle) is used to map from file extension to MIME type.
  
It is possible to provide different implementations of the MimeTypeIdentifier interface. In general it makes sense to activate only one MimeTypeIdentifier implementation at a time. This is achieved by starting only the bundle with the desired implementation. If multiple implementations are started, a client using the MimeTypeIdentifier has to use a filter to select between the available implementations. Otherwise it gets a reference randomly. The component name could be used for filtering.
+
==== Configuration ====
 +
There are no configuration options available for this bundle.
  
Below is a list of the currently available implementations.
+
===org.eclipse.smila.tika===
  
===org.eclipse.smila.common.mimetype.impl===
+
{{Tip|This implementation is not yet available in SMILA. The CQ process for Apache Tika and its dependencies has to be finished first.}}
 +
 
 +
The bundle <tt>org.eclipse.smila.tika</tt> contains an implementation of the MimeTypeIdentifier service based on the [http://tika.apache.org/1.3/detection.html Detector services] provided by [http://tika.apache.org/ Apache Tika]. The default detector service started by Tika combines detection by magic bytes with detection by filename (i.e. the filename suffix). For details see the [http://tika.apache.org/1.3/detection.html Detector services Tika documentation].
  
The default implementation can identify MIME types only by file extension. Identification by <tt>byte[]</tt> is not supported. The mapping file <tt>mime.types</tt> (located inside the bundle) is used to map from file extension to MIME type.
+
This service has a higher <tt>service.ranking</tt> property than the default implementation so it should be selected for service references if both bundles are started.
  
 
==== Configuration ====
 
 
There are no configuration options available for this bundle.
 

Latest revision as of 07:04, 1 February 2013

Overview

A MimeTypeIdentifier is an OSGi service that identifies a MIME type from a given byte[], an InputStream, or a file extension. The pipelet that uses this service is documented at SMILA/Documentation/Bundle_org.eclipse.smila.processing.pipelets.

API

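The Javadoc for the MimeTypeIdentifier interface is available at http://build.eclipse.org/rt/smila/javadoc/current/index.html?org/eclipse/smila/common/mimetype/MimeTypeIdentifier.html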

Implementations

It is possible to provide different implementations of the MimeTypeIdentifier interface. In general it makes sense to activate only one MimeTypeIdentifier implementation at a time, which is achieved by starting only the bundle with the desired implementation. If multiple implementations are started, a client using the MimeTypeIdentifier has to use a filter to select between them, for example on the component name. Otherwise the client gets whichever reference the OSGi framework selects: the registration with the highest service.ranking property or, if the rankings are equal, the one that was registered first.
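The selection rule described above can be modelled in a few lines of plain Java. This is only a sketch for illustration: Registration and ServiceSelection are hypothetical stand-ins and not OSGi or SMILA types, and the component names used with them are made up. In a real client the same choice would be expressed as a service filter on the component.name property.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified model of a service registration (not an OSGi type). */
final class Registration {
  final String componentName;
  final int ranking;     // models the service.ranking property
  final long serviceId;  // models the service.id property, assigned in registration order

  Registration(String componentName, int ranking, long serviceId) {
    this.componentName = componentName;
    this.ranking = ranking;
    this.serviceId = serviceId;
  }
}

/** Hypothetical helper that mimics how a single service reference is chosen. */
final class ServiceSelection {
  // Highest ranking first; ties are broken by the lowest service id.
  private static final Comparator<Registration> ORDER =
      Comparator.comparingInt((Registration r) -> r.ranking)
          .reversed()
          .thenComparingLong(r -> r.serviceId);

  /** No filter: the framework-style default choice among all candidates. */
  static Optional<Registration> pickDefault(List<Registration> candidates) {
    return candidates.stream().min(ORDER);
  }

  /** With a filter on the component name: only matching candidates are considered. */
  static Optional<Registration> pickByName(List<Registration> candidates, String componentName) {
    return candidates.stream()
        .filter(r -> r.componentName.equals(componentName))
        .min(ORDER);
  }
}
```

With two registrations of different ranking, pickDefault returns the higher-ranked one, while pickByName lets the client insist on a specific implementation regardless of ranking.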

Below is a list of the currently available implementations.

org.eclipse.smila.common.mimetype.impl

The default implementation SimpleMimeTypeIdentifier can identify MIME types only by file extension. Identification by byte[] or InputStream is not supported. The mapping file mime.types (located inside the bundle) is used to map from file extension to MIME type.
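An extension-based lookup of this kind can be sketched as follows. The sketch assumes that the mapping file uses the common mime.types layout (one MIME type per line, followed by its extensions, with # starting a comment); the actual format of the file in the bundle is not described here. ExtensionMimeTypes is a hypothetical class, not SMILA's SimpleMimeTypeIdentifier.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for an extension-to-MIME-type lookup (not SMILA code). */
final class ExtensionMimeTypes {
  private final Map<String, String> byExtension = new HashMap<>();

  /** Parses lines of the assumed form "type/subtype ext1 ext2 ...". */
  static ExtensionMimeTypes parse(List<String> lines) {
    ExtensionMimeTypes result = new ExtensionMimeTypes();
    for (String raw : lines) {
      // Drop a trailing comment, if any.
      int hash = raw.indexOf('#');
      String line = hash >= 0 ? raw.substring(0, hash) : raw;
      String[] tokens = line.trim().split("\\s+");
      // tokens[0] is the MIME type, the remaining tokens are extensions.
      for (int i = 1; i < tokens.length; i++) {
        // The first mapping for an extension wins.
        result.byExtension.putIfAbsent(tokens[i].toLowerCase(Locale.ROOT), tokens[0]);
      }
    }
    return result;
  }

  /** Returns the MIME type for an extension such as "pdf" or ".PDF", or null if unknown. */
  String identify(String extension) {
    if (extension == null) {
      return null;
    }
    String key = extension.startsWith(".") ? extension.substring(1) : extension;
    return byExtension.get(key.toLowerCase(Locale.ROOT));
  }
}
```

The lines could be read with java.nio.file.Files.readAllLines. Returning null for an unknown extension is a choice made for this sketch only; the real service may report that case differently.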

Configuration

There are no configuration options available for this bundle.

org.eclipse.smila.tika

Tip: This implementation is not yet available in SMILA. The CQ process for Apache Tika and its dependencies has to be finished first.


The bundle org.eclipse.smila.tika contains an implementation of the MimeTypeIdentifier service based on the Detector services provided by Apache Tika. The default detector service started by Tika combines detection by magic bytes with detection by filename (i.e. the filename suffix). For details see the Detector services section of the Tika documentation.
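To illustrate how the two detection strategies combine, here is a deliberately tiny sketch in plain Java. It is not Tika's implementation and does not use the Tika API: MagicThenExtensionDetector is a hypothetical class that knows only two magic-byte signatures and two filename suffixes, and the order of the checks is an assumption made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical illustration of magic-byte plus filename detection (not Tika code). */
final class MagicThenExtensionDetector {
  // PDF files start with the ASCII characters "%PDF-".
  private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
  // PNG files start with this fixed 8-byte signature.
  private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

  /** Checks the content first and falls back to the filename suffix. */
  static String detect(byte[] data, String fileName) {
    if (startsWith(data, PDF)) {
      return "application/pdf";
    }
    if (startsWith(data, PNG)) {
      return "image/png";
    }
    String name = fileName == null ? "" : fileName.toLowerCase(Locale.ROOT);
    if (name.endsWith(".txt")) {
      return "text/plain";
    }
    if (name.endsWith(".html") || name.endsWith(".htm")) {
      return "text/html";
    }
    // Generic fallback for content that could not be identified.
    return "application/octet-stream";
  }

  private static boolean startsWith(byte[] data, byte[] prefix) {
    if (data == null || data.length < prefix.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (data[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }
}
```

Checking the content before the name means that a PDF saved as report.txt is still reported as application/pdf, which is the main advantage over a purely extension-based lookup.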

This service has a higher service.ranking property than the default implementation so it should be selected for service references if both bundles are started.

Configuration

There are no configuration options available for this bundle.
