This shows you the differences between two versions of the page.
components:crawler_plugins [2011/07/28 13:40] benjamin link |
components:crawler_plugins [2014/10/29 10:22] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Crawler Plugins ====== | ||
- | |||
- | Crawler Plugins hook into the [[project_info:crawling_process|crawling process]] in order to add advanced functionality. | ||
- | |||
- | ==== What can crawler plugins do? ==== | ||
- | Some examples: | ||
- | |||
- | * Modify the result of preparators | ||
- | * by specifying default values if the chosen preparator does not fill in a certain field (''onBeforePrepare'') | ||
- | * by overriding or modifying the results of whatever preparator was chosen (''onAfterPrepare'') | ||
- | * Modify how those results are stored in the Lucene index | ||
- | * Run an action at every start or end of the crawling process (e.g. notify the administrator via email) | ||
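
The hook points listed above can be illustrated with a minimal sketch. It is not the real regain API: ''CrawlerPluginSketch'' is a hypothetical stand-in interface, the field map replaces regain's own document and preparator types, and only the hook names ''onBeforePrepare'' and ''onAfterPrepare'' come from this page. The actual ''CrawlerPlugin'' signatures should be taken from the regain sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the real CrawlerPlugin interface. The actual
// regain method signatures differ; this only mirrors the hook points.
interface CrawlerPluginSketch {
    void onStartCrawling();
    void onBeforePrepare(Map<String, String> fields);
    void onAfterPrepare(Map<String, String> fields);
    void onFinishCrawling();
}

class DefaultAuthorPlugin implements CrawlerPluginSketch {
    int documentsSeen = 0;

    @Override
    public void onStartCrawling() {
        documentsSeen = 0;
    }

    // Runs before the preparator: supply a default value that the
    // preparator is free to overwrite.
    @Override
    public void onBeforePrepare(Map<String, String> fields) {
        fields.putIfAbsent("author", "unknown");
    }

    // Runs after the preparator: modify whatever it produced.
    @Override
    public void onAfterPrepare(Map<String, String> fields) {
        String title = fields.get("title");
        if (title != null) {
            fields.put("title", title.trim());
        }
        documentsSeen++;
    }

    // Runs once at the end; a real plugin might email the administrator here.
    @Override
    public void onFinishCrawling() {
        System.out.println("Crawl finished, documents seen: " + documentsSeen);
    }
}

class PluginHookDemo {
    // Simulates one document passing through the hooks around a fake preparator.
    static Map<String, String> crawlOne(CrawlerPluginSketch plugin, String rawTitle) {
        Map<String, String> fields = new LinkedHashMap<>();
        plugin.onBeforePrepare(fields);
        fields.put("title", rawTitle); // stands in for the preparator's work
        plugin.onAfterPrepare(fields);
        return fields;
    }

    public static void main(String[] args) {
        DefaultAuthorPlugin plugin = new DefaultAuthorPlugin();
        plugin.onStartCrawling();
        System.out.println(crawlOne(plugin, "  Annual Report  "));
        plugin.onFinishCrawling();
    }
}
```

In this sketch the default is set before preparation so that a preparator which does fill in the field wins, while corrections that must always apply go into the after-hook.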
- | |||
- | |||
- | ==== How to create a crawler plugin ==== | ||
- | |||
- | - Create a class that implements ''CrawlerPlugin''. | ||
- | - Package it (and all its dependencies) as a .jar file. | ||
- | * In the manifest file, the attribute ''Plugin-Class'' must be set to the complete class name of the implementing class. | ||
- | - Drop the .jar file into the ''plugins'' directory. | ||
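
As a rough illustration of the manifest requirement in the steps above, the following sketch builds a manifest with the ''Plugin-Class'' attribute using the standard ''java.util.jar'' classes. The class name ''com.example.MyCrawlerPlugin'' is a placeholder, and in practice the manifest is more likely to be produced by the build tool than by hand-written code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class PluginManifestSketch {
    // Builds a manifest whose Plugin-Class attribute names the implementing class.
    static Manifest build(String pluginClassName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // The JDK documentation requires Manifest-Version to be set before writing.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Plugin-Class", pluginClassName);
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder class name, used here for illustration only.
        Manifest manifest = build("com.example.MyCrawlerPlugin");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

The same ''Manifest'' object can be passed to a ''java.util.jar.JarOutputStream'' when the .jar is assembled programmatically.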
- | |||
- | |||
- | |||
- | |||
- | ==== Existing Plugins ==== | ||
- | |||
- | * Create Thumbnails of indexed documents (https://github.com/benjamin4ruby/java-thumbnailer) | ||
- | |||