Talk:SMILA/Documentation/Web Crawler
Revision as of 03:52, 10 August 2010 by Andrej.rosenheinrich.unister-gmbh.de
There are some inconsistencies in this page:
- In Process you write "Policy: there are five types of policies offered on how to deal with robots.txt rules:" but list only four types. What is the fifth one?
- For CrawlingModel you name two available models, MaxBreadth or MaxDepth. In the multiple website configuration example, however, you're using <CrawlingModel Type="MaxIterations" Value="20"/>. What about this model?
What I am missing in the whole description for crawlers is a word about content: what does the content actually look like, and how can I configure what it looks like?