crawler
=======

Backend for the Adblock Plus Crawler. It provides the following URLs:

* */crawlableSites* - Returns a list of sites to be crawled
* */crawlerData* - Receives data on filtered elements

Required packages
-----------------

* [simplejson](http://pypi.python.org/pypi/simplejson/)

Database setup
--------------

Just execute the statements in _schema.sql_.

Configuration
-------------

Just add an empty _crawler_ section to _/etc/sitescripts_ or _.sitescripts_.

Also make sure that the following keys are configured in the _DEFAULT_
section:

* _database_
* _dbuser_
* _dbpassword_
* _basic\_auth\_realm_
* _basic\_auth\_username_
* _basic\_auth\_password_

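Putting the two together, a minimal configuration file might look like this; every value is a placeholder to be replaced with your own settings:

```ini
[DEFAULT]
database=crawler
dbuser=user
dbpassword=password
basic_auth_realm=Crawler
basic_auth_username=username
basic_auth_password=password

; Left empty here; the extraction step below adds a key to this section.
[crawler]
```
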
Extracting crawler sites
------------------------

Make _filter\_list\_repository_ in the _crawler_ configuration section
point to the local Mercurial repository of a filter list.

Then execute the following:

    python -m sitescripts.crawler.bin.extract_crawler_sites > crawler_sites.sql

Now you can execute the insert statements from _crawler\_sites.sql_.