Index: sitescripts/crawler/README.md
===================================================================
--- a/sitescripts/crawler/README.md
+++ b/sitescripts/crawler/README.md
@@ -4,7 +4,7 @@
 Backend for the Adblock Plus Crawler. It provides the following URLs:
 
 * */crawlableSites* - Return a list of sites to be crawled
-* */crawlerData* - Receive data on filtered elements
+* */crawlerRequests* - Receive all requests made and whether they were filtered
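+
+Purely as an illustration, assuming the URLs live under a deployment-specific
+address and are protected by the basic auth credentials mentioned below, a
+client might talk to them like this (the host name and the _requests.json_
+payload are placeholders; the actual data format is up to the crawler client):
+
+    curl -u user:password https://crawler.example.com/crawlableSites
+    curl -u user:password -X POST --data-binary @requests.json \
+        https://crawler.example.com/crawlerRequests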
 
 Required packages
 -----------------
@@ -21,6 +21,10 @@
 
 Just add an empty _crawler_ section to _/etc/sitescripts_ or _.sitescripts_.
 
+If you want to import crawlable sites or domain-specific filters from
+easylist (see below), you need to make _easylist\_repository_ point to
+the local Mercurial repository of easylist.
+
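+For reference, a _crawler_ section with the repository configured might look
+like this (assuming the usual INI syntax of the sitescripts configuration
+files; the path is only an example):
+
+    [crawler]
+    easylist_repository=/path/to/easylist
+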
 Also make sure that the following keys are configured in the _DEFAULT_
 section:
 
@@ -31,14 +35,12 @@
 * _basic\_auth\_username_
 * _basic\_auth\_password_
 
-Extracting crawler sites
-------------------------
+Importing crawlable sites from easylist
+---------------------------------------
 
-Make _filter\_list\_repository_ in the _crawler_ configuration section
-point to the local Mercurial repository of a filter list.
+    python -m sitescripts.crawler.bin.import_sites
 
-Then execute the following:
+Importing domain-specific filters from easylist
+-----------------------------------------------
 
-    python -m sitescripts.crawler.bin.extract_sites > sites.sql
-
-Now you can execute the insert statements from _crawler.sql_.
+    python -m sitescripts.crawler.bin.import_filters