
Delta Between Two Patch Sets: README.md

Issue 5288886037118976: Adblock Plus Crawler rewrite (Closed)
Left Patch Set: Created April 24, 2015, 3:38 p.m.
Right Patch Set: Addressed comments Created May 7, 2015, 12:04 a.m.
LEFT | RIGHT
1 abpcrawler 1 abpcrawler
2 ========== 2 ==========
3 3
4 Firefox extension that loads a range of websites and records which 4 This tool loads a range of websites in Firefox and records which requests are
Sebastian Noack 2015/04/27 14:55:50 Apparently it's not only a Firefox extension but al
Wladimir Palant 2015/05/07 00:04:59 Done.
5 elements are filtered by [Adblock Plus](http://adblockplus.org). 5 blocked by the [Adblock Plus extension](http://adblockplus.org).
6 6
7 Requirements 7 Requirements
8 ------------ 8 ------------
9 9
10 * [Python 2.x](https://www.python.org) 10 * [Python 2.7](https://www.python.org)
Sebastian Noack 2015/04/27 14:55:50 We actually require Python 2.7 specifically, as we
Wladimir Palant 2015/05/07 00:04:59 Done.
11 * [The Jinja2 module](http://jinja.pocoo.org/docs) 11 * [The Jinja2 module](http://jinja.pocoo.org/docs)
12 * [mozrunner module](https://pypi.python.org/pypi/mozrunner) 12 * [mozrunner module](https://pypi.python.org/pypi/mozrunner)
Sebastian Noack 2015/04/27 14:55:50 I think you should add Firefox to that list as wel
Wladimir Palant 2015/05/07 00:04:59 Done.
13 * [Firefox](https://www.mozilla.org/en-US/firefox/)
13 14
14 Running 15 Running
15 ------- 16 -------
16 17
17 Execute the following: 18 Execute the following:
18 19
19 ./run.py -b /usr/bin/firefox urls.txt outputdir 20 ./run.py -b /usr/bin/firefox urls.txt outputdir
20 21
21 This will run the specified Firefox binary to crawl the URLs from `urls.txt` 22 This will run the specified Firefox binary to crawl the URLs from `urls.txt`
22 (one URL per line). The resulting data and screenshots will be written to the 23 (one URL per line). The resulting data and screenshots will be written to the
23 `outputdir` directory. Firefox will close automatically once all URLs have been 24 `outputdir` directory. Firefox will close automatically once all URLs have been
24 processed. 25 processed.
25 26
26 Optionally, you can provide the path to the Adblock Plus repository - Adblock 27 The complete list of command line flags:
saroyanm 2015/05/04 18:13:43 Maybe it makes sense to also add some small notes abou
Wladimir Palant 2015/05/07 00:04:59 Done.
27 Plus will no longer be downloaded then. 28
29 -h, --help show help message and exit
30 -b BINARY, --binary BINARY
31 path to the Firefox binary
32 -a ABPDIR, --abpdir ABPDIR
33 path to the Adblock Plus repository
34 -f url [url ...], --filters url [url ...]
35 filter lists to install in Adblock Plus. The arguments
36 can also have the format path=url, the data will be
37 read from the specified path then.
38 -t TIMEOUT, --timeout TIMEOUT
39 Load timeout (seconds)
40 -x MAXTABS, --maxtabs MAXTABS
41 Maximal number of tabs to open in parallel
28 42
29 License 43 License
saroyanm 2015/05/04 18:13:43 Is there a reason why we use MPL instead of GPL?
Wladimir Palant 2015/05/07 00:04:59 This extension was written before we switched to t
saroyanm 2015/05/07 13:19:00 Got it, thanks.
30 ------- 44 -------
31 45
32 This Source Code is subject to the terms of the Mozilla Public License 46 This Source Code is subject to the terms of the Mozilla Public License
33 version 2.0 (the "License"). You can obtain a copy of the License at 47 version 2.0 (the "License"). You can obtain a copy of the License at
34 http://mozilla.org/MPL/2.0/. 48 http://mozilla.org/MPL/2.0/.
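
For readers following the flag list added in the right patch set, here is a minimal sketch of how those flags could be declared with Python's `argparse`. It is not the actual `run.py`: the positional names `urls` and `outdir`, the integer types for `-t` and `-x`, and the `split_filter` helper are assumptions made for illustration, and no default values are implied by the README. The `path=url` heuristic (treat the text before the first `=` as a path only if it contains no `://`) is likewise an assumption, chosen so that URLs with query strings are not split by mistake.

```python
import argparse


def build_parser():
    """Declare the flags listed in the README (illustrative, not run.py itself)."""
    parser = argparse.ArgumentParser(
        description="Crawl a list of URLs in Firefox with Adblock Plus")
    parser.add_argument("-b", "--binary",
                        help="path to the Firefox binary")
    parser.add_argument("-a", "--abpdir",
                        help="path to the Adblock Plus repository")
    parser.add_argument("-f", "--filters", metavar="url", nargs="+", default=[],
                        help="filter lists to install in Adblock Plus; "
                             "an argument may also have the format path=url")
    # Assumption: timeout and tab count are whole numbers.
    parser.add_argument("-t", "--timeout", type=int,
                        help="load timeout (seconds)")
    parser.add_argument("-x", "--maxtabs", type=int,
                        help="maximal number of tabs to open in parallel")
    # Hypothetical positional names for the two trailing arguments.
    parser.add_argument("urls", help="file with one URL per line")
    parser.add_argument("outdir", help="directory for data and screenshots")
    return parser


def split_filter(arg):
    """Split 'path=url' into (path, url); a plain URL yields (None, url).

    Assumption: the part before the first '=' is a local path only when it
    does not itself look like a URL (contains no '://').
    """
    path, sep, url = arg.partition("=")
    if sep and "://" not in path:
        return path, url
    return None, arg


if __name__ == "__main__":
    args = build_parser().parse_args()
    for path, url in map(split_filter, args.filters):
        print(path, url)
```

Because `-f` accepts several values, a call such as `./run.py -b /usr/bin/firefox -f local.txt=https://example.com/list.txt -t 60 urls.txt outputdir` needs another flag (here `-t`) between the filter arguments and the positional arguments, otherwise `argparse` would consume `urls.txt` and `outputdir` as filters.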