<?xml version="1.0" encoding="utf-8"?>
<!-- generator="FeedCreator 1.7.2-ppt DokuWiki" -->
<?xml-stylesheet href="http://regain.murfman.de/wiki/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://regain.murfman.de/wiki/feed.php">
        <title>regain manual</title>
        <description></description>
        <link>http://regain.murfman.de/wiki/</link>
        <image rdf:resource="http://regain.murfman.de/wiki/lib/images/favicon.ico" />
        <dc:date>2012-01-31T10:21:46+01:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=components:crawler_plugins&amp;rev=1327913032&amp;do=diff"/>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=project_info:crawling_process&amp;rev=1327912988&amp;do=diff"/>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=features:white_and_black_list&amp;rev=1327912607&amp;do=diff"/>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=project_info:sourcecode&amp;rev=1322902403&amp;do=diff"/>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=project_info:howto_regain_in_an_ide&amp;rev=1320048811&amp;do=diff"/>
                <rdf:li rdf:resource="http://regain.murfman.de/wiki/doku.php?id=de:project_info:building_regain&amp;rev=1320048589&amp;do=diff"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="http://regain.murfman.de/wiki/lib/images/favicon.ico">
        <title>regain manual</title>
        <link>http://regain.murfman.de/wiki/</link>
        <url>http://regain.murfman.de/wiki/lib/images/favicon.ico</url>
    </image>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=components:crawler_plugins&amp;rev=1327913032&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2012-01-30T09:43:52+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>Crawler Plugins - at least one</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=components:crawler_plugins&amp;rev=1327913032&amp;do=diff</link>
        <description>Crawler Plugins hook into the crawling process in order to add advanced functionality. 

What can crawler plugins do?

Some examples:


	*  Modify the result of preparators
		*  by specifying default values if the chosen preparator does not fill in a certain field (onBeforePrepare)
		*  by overriding or modifying the results of whatever preparator was chosen (onAfterPrepare)</description>
    </item>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=project_info:crawling_process&amp;rev=1327912988&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2012-01-30T09:43:08+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>Crawling Process - add checkDynamicBlacklist</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=project_info:crawling_process&amp;rev=1327912988&amp;do=diff</link>
        <description>How does the crawling process work? Where do the Crawler Plugins interact?

At the beginning, onStartCrawling(Crawler) is called for all plugins.

1. Creating the job queue

	*  The start URLs are checked against the Crawling Whitelist and Blacklist and, if accepted, added to the job queue (Crawler::addJob). The plugins are informed by a call to onAcceptURL(String url, CrawlerJob job) or onDeclineURL(String url).
			*  If a job would be accepted, the crawler plugins are asked if they want to blacklist it anyway (boolean c…</description>
    </item>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=features:white_and_black_list&amp;rev=1327912607&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2012-01-30T09:36:47+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>White and black list</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=features:white_and_black_list&amp;rev=1327912607&amp;do=diff</link>
        <description>The white list and the black list let you specify precisely what gets into the index and what does not.

The basic rule is always: a document gets into the index if its URL matches at least one entry from the white list and no entry from the black list.</description>
    </item>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=project_info:sourcecode&amp;rev=1322902403&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2011-12-03T09:53:23+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>Sourcecode - link</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=project_info:sourcecode&amp;rev=1322902403&amp;do=diff</link>
        <description>If you are interested in the details of regain or if you want to fix a bug, you'll need the source code.

There are several ways to get the source code. You can either download the zipped source of a particular version or get the newest developer version directly from the SVN repository.</description>
    </item>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=project_info:howto_regain_in_an_ide&amp;rev=1320048811&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2011-10-31T09:13:31+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>How to set up an IDE - basics</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=project_info:howto_regain_in_an_ide&amp;rev=1320048811&amp;do=diff</link>
        <description>In any case, you need to install the JDK first: Download.

You should also set the environment variable JAVA_HOME to the directory where you installed the JDK, e.g. C:/Programme/Java/jdk1.6.0_12. The PATH variable should contain the bin folder, e.g. %PATH%;%JAVA_HOME%\bin.</description>
    </item>
    <item rdf:about="http://regain.murfman.de/wiki/doku.php?id=de:project_info:building_regain&amp;rev=1320048589&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2011-10-31T09:09:49+01:00</dc:date>
        <dc:creator>Benjamin</dc:creator>
        <title>Building regain - 1.6</title>
        <link>http://regain.murfman.de/wiki/doku.php?id=de:project_info:building_regain&amp;rev=1320048589&amp;do=diff</link>
        <description>This page shows how to build regain yourself from the source code on the command line. (Alternatively, you can install and configure a code editor.)

Setting up the development environment


To build regain, you first need to install the following:</description>
    </item>
</rdf:RDF>

