Using the whitelist and the blacklist, you can specify very precisely what gets into the index and what does not.
The basic rule is always: a document gets into the index if its URL matches at least one entry from the whitelist and no entry from the blacklist.
For example, the following configuration indexes all URLs that start with
http://www.mydomain.de, except for those that start with http://www.mydomain.de/some/dynamic/content/:
<whitelist>
  <prefix><nowiki>http://www.mydomain.de</nowiki></prefix>
</whitelist>
<blacklist>
  <prefix><nowiki>http://www.mydomain.de/some/dynamic/content/</nowiki></prefix>
</blacklist>
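The rule described above can be sketched in a few lines of Python. This is only an illustration of the prefix-matching logic, not the crawler's actual implementation; the function name `is_accepted` and the two list constants are hypothetical and simply mirror the configuration shown.

```python
# Illustrative sketch only: the names below are assumptions, not part of the crawler.
WHITELIST = ["http://www.mydomain.de"]
BLACKLIST = ["http://www.mydomain.de/some/dynamic/content/"]


def is_accepted(url, whitelist, blacklist):
    """Return True if the URL matches at least one whitelist prefix
    and no blacklist prefix."""
    on_whitelist = any(url.startswith(prefix) for prefix in whitelist)
    on_blacklist = any(url.startswith(prefix) for prefix in blacklist)
    return on_whitelist and not on_blacklist
```

With these lists, a URL such as `http://www.mydomain.de/index.html` would be accepted, while anything below `http://www.mydomain.de/some/dynamic/content/` or on a different host would be rejected.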
Additionally, you can write a crawler plugin to blacklist files based on more complex conditions (e.g. file size).