Talk:SMILA/Documentation/Web Crawler
Revision as of 03:52, 10 August 2010 by Unnamed Poltroon (Talk)
There are some inconsistencies in this page:

- In Process you write "Policy: there are five types of policies offered on how to deal with robots.txt rules:" but then list only four types. What is the fifth one?
- For CrawlingModel you name two available models, MaxBreadth and MaxDepth. In the multiple-website configuration example, however, you use <CrawlingModel Type="MaxIterations" Value="20"/>. What about this model?
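To make the mismatch concrete, here is how the three model names would look side by side in a crawler configuration. Only the MaxIterations line is taken from the page's own example; the MaxDepth and MaxBreadth lines are my assumption that the documented models use the same Type/Value attribute shape, with placeholder values:

```xml
<!-- Model actually used in the page's multiple-website example: -->
<CrawlingModel Type="MaxIterations" Value="20"/>

<!-- The two models the page documents, written in the same
     Type/Value shape (this shape and the values are assumptions): -->
<CrawlingModel Type="MaxDepth" Value="5"/>
<CrawlingModel Type="MaxBreadth" Value="100"/>
```

If MaxIterations is a third supported model, it should be added to the CrawlingModel section; if not, the example should use one of the two documented names.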
What I am missing in the whole crawler description is a word about content: what does the content actually look like, and how can I configure its structure?