crawler
=======

Backend for the Adblock Plus Crawler. It provides the following URLs:

* */crawlableSites* - Return a list of sites to be crawled
* */crawlerData* - Receive data on filtered elements

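For illustration only, here is a minimal client sketch in Python 2 for these two
URLs. The base URL, the assumption that */crawlableSites* returns one site per
line, and the payload sent to */crawlerData* are all hypothetical; only the two
URL paths come from this README.

    # Minimal client sketch (hypothetical): the base URL and the payload format
    # are assumptions; only the two URL paths come from this README.
    import json
    import urllib2

    BASE_URL = "http://localhost:5000"  # adjust to wherever the backend runs

    # Fetch the list of sites to be crawled (one site per line is assumed).
    sites = urllib2.urlopen(BASE_URL + "/crawlableSites").read().splitlines()

    # Report data on filtered elements for a crawled site (illustrative payload).
    payload = json.dumps([{"site": sites[0], "filtered_element": "#ad-banner"}])
    urllib2.urlopen(BASE_URL + "/crawlerData", payload)
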
Required packages
-----------------

(...skipping 24 matching lines...)

------------------------

Make _filter\_list\_repository_ in the _crawler_ configuration section
point to the local Mercurial repository of a filter list.

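For example, assuming the usual ini-style sitescripts configuration file (the
repository path below is only a placeholder), the entry could look like this:

    [crawler]
    filter_list_repository=/home/user/easylist
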
Then execute the following:

    python -m sitescripts.crawler.bin.extract_sites > sites.sql

Now you can execute the insert statements from _sites.sql_.
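
If the tables live in a MySQL database, the generated file can be fed to the
_mysql_ client, for instance (the database name _crawler_ and the user are
placeholders, not taken from this README):

    mysql -u <user> -p crawler < sites.sql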

Extracting domain-specific filters
----------------------------------

Make _filter\_list\_repository_ in the _crawler_ configuration section
point to the local Mercurial repository of a filter list.

You also have to set _domain\_specific\_filter\_files_ to a comma-separated
list of files in the filter list repository that contain
domain-specific rules.

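Continuing the hypothetical configuration snippet from above, this could look
as follows (the file names are only examples):

    [crawler]
    filter_list_repository=/home/user/easylist
    domain_specific_filter_files=easylist_specific_block.txt,easylist_specific_hide.txt
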
Then execute the following:

    python -m sitescripts.crawler.bin.extract_filters > filters.sql
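
Like _sites.sql_ above, the generated _filters.sql_ presumably contains insert
statements that can be executed against the same database, for example:

    mysql -u <user> -p crawler < filters.sql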