Index: README.md
===================================================================
new file mode 100644
--- /dev/null
+++ b/README.md
@@ -0,0 +1,34 @@
+abpcrawler
+==========
+
+Firefox extension that loads a range of websites and records which
Sebastian Noack
2015/04/27 14:55:50
Apparently it's not only a Firefox extension but al
Wladimir Palant
2015/05/07 00:04:59
Done.
+elements are filtered by [Adblock Plus](http://adblockplus.org).
+
+Requirements
+------------
+
+* [Python 2.x](https://www.python.org)
Sebastian Noack
2015/04/27 14:55:50
We actually require Python 2.7 specifically, as we
Wladimir Palant
2015/05/07 00:04:59
Done.
+* [The Jinja2 module](http://jinja.pocoo.org/docs)
+* [mozrunner module](https://pypi.python.org/pypi/mozrunner)
Sebastian Noack
2015/04/27 14:55:50
I think you should add Firefox to that list as wel
Wladimir Palant
2015/05/07 00:04:59
Done.
+
+Running
+-------
+
+Execute the following:
+
+    ./run.py -b /usr/bin/firefox urls.txt outputdir
+
+This will run the specified Firefox binary to crawl the URLs from `urls.txt`
+(one URL per line). The resulting data and screenshots will be written to the
+`outputdir` directory. Firefox will close automatically once all URLs have been
+processed.
+
+Optionally, you can provide the path to the Adblock Plus repository - Adblock
saroyanm
2015/05/04 18:13:43
Maybe it makes sense to also add some small notes abou
Wladimir Palant
2015/05/07 00:04:59
Done.
+Plus will no longer be downloaded then.
+
+License
saroyanm
2015/05/04 18:13:43
Is there a reason why we use MPL instead of GPL?
Wladimir Palant
2015/05/07 00:04:59
This extension was written before we switched to t
saroyanm
2015/05/07 13:19:00
Got it, thanks.
+-------
+
+This Source Code is subject to the terms of the Mozilla Public License
+version 2.0 (the "License"). You can obtain a copy of the License at
+http://mozilla.org/MPL/2.0/.