Talk:SMILA/Documentation/HowTo/How to implement a crawler

Comment originally by Ivan Churkin:

In my opinion, having 3rd-party developers implement the crawler interface and create a service is not a good idea. It requires too much knowledge of the technologies involved (SCA, declarative services, OSGi, our interfaces...).

It would be much better to offer a simple iterable interface for them to implement, such as:

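interface DataExtractor {
  // Called once before iteration, with the crawl configuration.
  void start(IndexOrderConfiguration config);
  // Advances to the next entry; returns false when none remain.
  boolean moveNext();
  // Returns the named attribute of the current entry.
  Object readAttribute(String name);
  // Called once after iteration to release resources.
  void finish();
}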

A single wrapper class would then be written that implements the Crawler interface and uses the user's "DataExtractor" object for crawling (creating Records and so on). Please also look at the page comment.
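To make the wrapper idea concrete, here is a minimal sketch of how such an adapter might look. It is not SMILA code: the configuration type is reduced to a plain Map, the output is a list of attribute maps instead of SMILA Record objects, and the names DataExtractorCrawler and ListExtractor are hypothetical. A real wrapper would implement the actual Crawler interface instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// The interface proposed above. Assumption: the config type is replaced by a
// plain Map so that this sketch compiles without any SMILA dependency.
interface DataExtractor {
    void start(Map<String, String> config);
    boolean moveNext();
    Object readAttribute(String name);
    void finish();
}

// Hypothetical wrapper: drives any DataExtractor and builds one record per
// entry. A real version would implement SMILA's Crawler interface instead.
final class DataExtractorCrawler {
    private final DataExtractor extractor;
    private final List<String> attributeNames;

    DataExtractorCrawler(DataExtractor extractor, List<String> attributeNames) {
        this.extractor = extractor;
        this.attributeNames = attributeNames;
    }

    List<Map<String, Object>> crawl(Map<String, String> config) {
        List<Map<String, Object>> records = new ArrayList<>();
        extractor.start(config);
        try {
            while (extractor.moveNext()) {
                // Read every configured attribute of the current entry.
                Map<String, Object> record = new LinkedHashMap<>();
                for (String name : attributeNames) {
                    record.put(name, extractor.readAttribute(name));
                }
                records.add(record);
            }
        } finally {
            extractor.finish(); // always called, even if reading fails
        }
        return records;
    }
}

// Toy extractor over an in-memory list, standing in for third-party code.
final class ListExtractor implements DataExtractor {
    private final List<Map<String, Object>> rows;
    private int index = -1;
    boolean finished = false;

    ListExtractor(List<Map<String, Object>> rows) {
        this.rows = rows;
    }

    public void start(Map<String, String> config) {
        index = -1;
        finished = false;
    }

    public boolean moveNext() {
        return ++index < rows.size();
    }

    public Object readAttribute(String name) {
        return rows.get(index).get(name);
    }

    public void finish() {
        finished = true;
    }
}
```

With this split, a third-party developer only writes something like ListExtractor; the OSGi and service plumbing would live once in the wrapper.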


Develop your crawler

That section is, to say the least, insufficient. Looking into the existing crawlers doesn't help much, since they contain few comments and some rather tangled code. A clear description of the methods to implement and what they are supposed to do (perhaps with a how-to outline) is the minimum of required information. I've browsed the SMILA wiki a lot lately and am still surprised that a project aiming that high provides so little information suitable for beginners.
