White and black list

Using the white list and the black list, you can specify very precisely what gets into the index and what does not.

The basic rule is always: a document gets into the index if its URL matches at least one entry in the white list and no entry in the black list.
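The basic rule can be illustrated with a small Java sketch. This is not the crawler's actual code: the class name `UrlListFilter`, its method `isAccepted`, and the example.com URLs are all hypothetical, and the sketch assumes that every list entry is a plain URL prefix.

```java
import java.util.List;

/**
 * Illustrative sketch of the white/black list rule.
 * Assumption: every list entry is a plain URL prefix.
 * The class and method names are hypothetical, not part of the crawler.
 */
public class UrlListFilter {
    private final List<String> whitePrefixes;
    private final List<String> blackPrefixes;

    public UrlListFilter(List<String> whitePrefixes, List<String> blackPrefixes) {
        // Defensive copies so later changes to the caller's lists have no effect.
        this.whitePrefixes = List.copyOf(whitePrefixes);
        this.blackPrefixes = List.copyOf(blackPrefixes);
    }

    /** Returns true if the URL matches a white list entry and no black list entry. */
    public boolean isAccepted(String url) {
        boolean onWhiteList = whitePrefixes.stream().anyMatch(url::startsWith);
        boolean onBlackList = blackPrefixes.stream().anyMatch(url::startsWith);
        return onWhiteList && !onBlackList;
    }

    public static void main(String[] args) {
        // Hypothetical example URLs.
        UrlListFilter filter = new UrlListFilter(
                List.of("http://www.example.com"),
                List.of("http://www.example.com/private/"));

        // On the white list, not on the black list: indexed.
        System.out.println(filter.isAccepted("http://www.example.com/docs/index.html"));
        // On both lists: the black list wins, so not indexed.
        System.out.println(filter.isAccepted("http://www.example.com/private/notes.html"));
        // On neither list: not indexed.
        System.out.println(filter.isAccepted("http://other.example.org/index.html"));
    }
}
```

Note that a URL matching neither list is rejected: the white list decides what is eligible at all, and the black list only removes entries from that set.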

How can I use this feature?

The lists are defined in CrawlerConfiguration.xml using the <whitelist> and <blacklist> tags.

For example, a configuration can specify that all URLs starting with one prefix are indexed, except for those starting with another prefix.


Additionally, a crawler plugin can be written to blacklist files based on more complex conditions (e.g. file size).

features/white_and_black_list.txt · Last modified: 2014/10/29 10:22 (external edit)