====== Differences ======
This shows you the differences between two versions of the page.
project_info:crawling_process [2012/01/30 09:43] benjamin add checkDynamicBlacklist |
project_info:crawling_process [2024/09/18 08:31] (current) |
||
---|---|---|---|
Line 23: | Line 23: | ||
* Create a new Index entry (IndexWriterManager::createNewIndexEntry) | * Create a new Index entry (IndexWriterManager::createNewIndexEntry) | ||
* First the document is prepared for indexation (DocumentFactory::createDocument) | * First the document is prepared for indexation (DocumentFactory::createDocument) | ||
+ | * [[features:auxiliary_fields|Auxiliary Fields]] are calculated. | ||
+ | * The [[features:access_rights_management|Crawler Access Controller]] (if available) is asked to retrieve the allowed groups. | ||
* The MIME-Type is identified (org.semanticdesktop.aperture.mime.identifier.magic.MagicMimeTypeIdentifier::identify()). | * The MIME-Type is identified (org.semanticdesktop.aperture.mime.identifier.magic.MagicMimeTypeIdentifier::identify()). | ||
* All preparators that accept this MIME-Type are collected. | * All preparators that accept this MIME-Type are collected. | ||
Line 30: | Line 32: | ||
* Then it is added to the [[http://lucene.apache.org|Lucene]] [[components:search_index|index]], after notification of the plugins (''__onCreateIndexEntry(Document doc, IndexWriter index)__''). | * Then it is added to the [[http://lucene.apache.org|Lucene]] [[components:search_index|index]], after notification of the plugins (''__onCreateIndexEntry(Document doc, IndexWriter index)__''). | ||
- | At the en, ''__onFinishCrawling(Crawler)__'' is called. | + | At the end, ''__onFinishCrawling(Crawler)__'' is called. |
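
The order of steps described in this revision (collect the preparators that accept the MIME-Type, notify the plugins, add the document to the index, and call the finish hook at the end) can be sketched roughly as follows. This is only an illustrative sketch, not the project's actual code: ''CrawlSketch'', ''Preparator'', ''Plugin'', ''collectPreparators'' and ''finishCrawling'' are hypothetical stand-ins, and the document and index are modelled as a plain ''String'' and a ''List<String>'' rather than the Lucene ''Document'' and ''IndexWriter'' that the real hook signatures take.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the step ordering; none of these types are the real project classes.
class CrawlSketch {

    /** Stand-in for a preparator: it only declares which MIME types it accepts. */
    interface Preparator {
        boolean accepts(String mimeType);
    }

    /** Stand-in for a crawler plugin, with simplified versions of the two hooks. */
    interface Plugin {
        void onCreateIndexEntry(String doc, List<String> index);
        void onFinishCrawling();
    }

    /** Collects every preparator that accepts the identified MIME type. */
    static List<Preparator> collectPreparators(List<Preparator> all, String mimeType) {
        List<Preparator> matching = new ArrayList<>();
        for (Preparator preparator : all) {
            if (preparator.accepts(mimeType)) {
                matching.add(preparator);
            }
        }
        return matching;
    }

    /** Notifies the plugins first, then adds the document to the index. */
    static void createNewIndexEntry(String doc, List<String> index, List<Plugin> plugins) {
        for (Plugin plugin : plugins) {
            plugin.onCreateIndexEntry(doc, index);
        }
        index.add(doc);
    }

    /** Called once, after all documents have been processed. */
    static void finishCrawling(List<Plugin> plugins) {
        for (Plugin plugin : plugins) {
            plugin.onFinishCrawling();
        }
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        List<String> events = new ArrayList<>();

        // A plugin that records when each hook fires and how large the index was.
        Plugin recorder = new Plugin() {
            @Override
            public void onCreateIndexEntry(String doc, List<String> idx) {
                events.add("create " + doc + " (index size " + idx.size() + ")");
            }

            @Override
            public void onFinishCrawling() {
                events.add("finish");
            }
        };

        Preparator plainText = mime -> mime.equals("text/plain");
        Preparator pdf = mime -> mime.equals("application/pdf");
        List<Preparator> accepted =
                collectPreparators(Arrays.asList(plainText, pdf), "text/plain");
        System.out.println("accepting preparators: " + accepted.size());

        createNewIndexEntry("readme.txt", index, Arrays.asList(recorder));
        finishCrawling(Arrays.asList(recorder));
        System.out.println(events);
    }
}
```

Because the plugin hook runs before ''index.add'', a plugin in this sketch sees the index without the new entry, which mirrors the "after notification of the plugins" wording above. Whether the real implementation lets plugins modify the document at that point is not stated here.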