Difference between revisions of "SMILA/Documentation/AperturePipelet"
Revision as of 07:01, 12 August 2008
Bundle: org.eclipse.eilf.processing.pipelets.aperture.AperturePipelet
Description
This Pipelet converts various document formats (such as PDF and XLS) to plain text using Aperture technology (see Glossary#Aperture). It reads the document content from the attachment named by AttachmentContent and stores the plain-text result in the attachment named by AttachmentText. The MimeType of the content, read from the attribute named by AttachmentMimeType, is optional and is used for the conversion when present. If no MimeType is provided, the Pipelet identifies it itself using a MimeTypeIdentifier service.
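The flow described above can be sketched in plain Java. This is only an illustration under stated assumptions: the types `DocRecord`, `MimeTypeIdentifier` and `TextConverter` below are hypothetical stand-ins written for this example, not the SMILA/EILF or Aperture APIs, and the names "Content", "Text" and "MimeType" are simply the values from the example configuration on this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; every type here is a hypothetical stand-in. */
public class AperturePipeletSketch {

    /** Hypothetical stand-in for the MimeTypeIdentifier service. */
    interface MimeTypeIdentifier {
        String identify(byte[] content);
    }

    /** Hypothetical stand-in for the Aperture-based text extraction. */
    interface TextConverter {
        String toPlainText(byte[] content, String mimeType);
    }

    /** Minimal record model: named binary attachments plus named string attributes. */
    static class DocRecord {
        final Map<String, byte[]> attachments = new HashMap<>();
        final Map<String, String> attributes = new HashMap<>();
    }

    // Names taken from the example ConverterConfig.xml (assumed here, normally configured).
    static final String CONTENT = "Content";
    static final String TEXT = "Text";
    static final String MIME_TYPE = "MimeType";

    static void process(DocRecord record, MimeTypeIdentifier identifier, TextConverter converter) {
        byte[] content = record.attachments.get(CONTENT);
        if (content == null) {
            return; // nothing to convert
        }
        // Use the provided MimeType if present, otherwise identify it from the content.
        String mimeType = record.attributes.get(MIME_TYPE);
        if (mimeType == null || mimeType.isEmpty()) {
            mimeType = identifier.identify(content);
        }
        String text = converter.toPlainText(content, mimeType);
        record.attachments.put(TEXT, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        DocRecord record = new DocRecord();
        record.attachments.put(CONTENT, "hello".getBytes(StandardCharsets.UTF_8));
        // No MimeType attribute is set, so the identifier is consulted.
        process(record,
                content -> "text/plain",
                (content, mimeType) -> new String(content, StandardCharsets.UTF_8));
        System.out.println(new String(record.attachments.get(TEXT), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is the order of the lookups: a MimeType supplied with the record takes precedence, and identification from the content is only a fallback.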
Configuration
Configuration File: configuration/org.eclipse.eilf.processing.pipelets.aperture/ConverterConfig.xml
Property | Type | Description |
---|---|---|
AttachmentContent | String | name of the attachment containing the original document content |
AttachmentText | String | name of the attachment to store the converted text in |
AttachmentMimeType | String | name of the attribute containing the MimeType of the original document content |
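Note that all three properties are required. This includes AttachmentMimeType: the property must be configured even though the attribute it names may be left unset on a record, in which case the MimeType is identified inside the Pipelet.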
Example
The following example was used in the EILF example application to convert documents delivered by the Filesystem crawler and the Web crawler to plain text.
ConverterConfig.xml
<PipeletConfiguration xmlns="http://www.eclipse.org/eilf/processor">
  <Property name="AttachmentContent">
    <Value>Content</Value>
  </Property>
  <Property name="AttachmentText">
    <Value>Text</Value>
  </Property>
  <Property name="AttachmentMimeType">
    <Value>MimeType</Value>
  </Property>
</PipeletConfiguration>
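To check a configuration file against the rule that all three properties must be present, something like the following could be used. It is a minimal sketch that relies only on the JDK's DOM parser; the class `ConverterConfigReader` and its `read` method are hypothetical names invented for this example and are not part of EILF. How the Pipelet itself loads its configuration is not shown on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reads a ConverterConfig.xml document and validates it. */
public class ConverterConfigReader {

    // Namespace as used in the example configuration.
    static final String NS = "http://www.eclipse.org/eilf/processor";

    // The three properties documented as required.
    static final String[] REQUIRED = {
        "AttachmentContent", "AttachmentText", "AttachmentMimeType"
    };

    static Map<String, String> read(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Collect name -> first <Value> text for every <Property> element.
        Map<String, String> props = new LinkedHashMap<>();
        NodeList properties = doc.getElementsByTagNameNS(NS, "Property");
        for (int i = 0; i < properties.getLength(); i++) {
            Element property = (Element) properties.item(i);
            NodeList values = property.getElementsByTagNameNS(NS, "Value");
            if (values.getLength() > 0) {
                props.put(property.getAttribute("name"),
                          values.item(0).getTextContent().trim());
            }
        }

        // Reject configurations that omit a required property or leave it empty.
        for (String name : REQUIRED) {
            String value = props.get(name);
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("Missing required property: " + name);
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<PipeletConfiguration xmlns=\"" + NS + "\">"
          + "<Property name=\"AttachmentContent\"><Value>Content</Value></Property>"
          + "<Property name=\"AttachmentText\"><Value>Text</Value></Property>"
          + "<Property name=\"AttachmentMimeType\"><Value>MimeType</Value></Property>"
          + "</PipeletConfiguration>";
        System.out.println(read(xml));
    }
}
```

Failing early on a missing property keeps a misconfigured Pipelet from silently skipping documents at processing time.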