crawler
=======

Backend for the Adblock Plus Crawler. It provides the following URLs:

* */crawlableSites* - Return a list of sites to be crawled
* */crawlerData* - Receive data on filtered elements

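The two URLs above can be exercised by a short client. The sketch below is illustrative only: it assumes */crawlableSites* returns one URL per line and */crawlerData* accepts a JSON body of filtered elements (the real formats may differ), and it runs a local stub server standing in for the actual backend so the round trip is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stub standing in for the crawler backend; response and payload
# formats here are assumptions, not the backend's actual wire format.
class StubCrawlerBackend(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/crawlableSites":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"http://example.com/\nhttp://example.org/\n")
        else:
            self.send_response(404)
            self.end_headers()

    def do_POST(self):
        if self.path == "/crawlerData":
            length = int(self.headers["Content-Length"])
            # Remember the reported data so the demo can inspect it.
            self.server.received = json.loads(self.rfile.read(length))
            self.send_response(200)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), StubCrawlerBackend)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_address[1]

# A crawler client first fetches the list of sites to crawl ...
with urllib.request.urlopen(base + "/crawlableSites") as response:
    sites = response.read().decode("utf-8").split()
print(sites)  # ['http://example.com/', 'http://example.org/']

# ... and later reports data on filtered elements for each site.
report = json.dumps({"site": sites[0], "filtered": ["#ad-banner"]})
request = urllib.request.Request(base + "/crawlerData",
                                 data=report.encode("utf-8"),
                                 headers={"Content-Type": "application/json"})
urllib.request.urlopen(request).close()
print(server.received["filtered"])  # ['#ad-banner']

server.shutdown()
```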
Required packages
-----------------

* [simplejson](http://pypi.python.org/pypi/simplejson/)

Database setup
--------------

Just execute the statements in _schema.sql_.
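In practice that is a one-liner against your database server, e.g. `mysql crawler < schema.sql`. As a minimal self-contained sketch of "execute the statements in a schema script" — using sqlite3 and an invented table purely for illustration; the real _schema.sql_ defines the actual tables and presumably targets MySQL:

```python
import sqlite3

# Hypothetical stand-in for schema.sql -- table and columns invented
# for this demo; the real schema ships with the crawler.
schema = """
CREATE TABLE crawler_sites (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL
);
"""

connection = sqlite3.connect(":memory:")
connection.executescript(schema)  # runs every statement in the script

# The tables are now ready for the insert statements generated later.
tables = [row[0] for row in connection.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)  # ['crawler_sites']
```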

[…]

Extracting crawler sites
------------------------

Make _filter\_list\_repository_ in the _crawler_ configuration section
point to the local Mercurial repository of a filter list.

Then execute the following:

    python -m sitescripts.crawler.bin.extract_crawler_sites > crawler_sites.sql

Now you can execute the insert statements from _crawler\_sites.sql_.
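Conceptually, the extraction step turns entries from the configured filter list into SQL insert statements. A toy sketch of that idea — the site URLs, table name, and column are invented here; the real module derives the sites from the Mercurial repository configured above:

```python
# Toy illustration only: stand-in site list and an assumed
# crawler_sites(url) table, mirroring the kind of statements
# written to crawler_sites.sql by the extraction step.
sites = ["http://example.com/", "http://example.org/"]

statements = ["INSERT INTO crawler_sites (url) VALUES ('%s');" % site
              for site in sites]
print("\n".join(statements))
```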