
Side by Side Diff: README.md

Issue 29465720: Issue 4970 - Document the library API of python-abp (Closed)
Patch Set: Rebase to match the new master and retouch the docstrings. Created Oct. 24, 2017, 4:06 p.m.
OLD NEW
1 # python-abp 1 # python-abp
2 2
3 This repository contains the script that is used for building Adblock Plus 3 This repository contains a library for working with Adblock Plus filter lists
4 filter lists from the form in which they are authored into the format suitable 4 and the script that is used for building Adblock Plus filter lists from the
5 for consumption by the adblocking software. 5 form in which they are authored into the format suitable for consumption by the
6 adblocking software.
6 7
7 ## Installation 8 ## Installation
8 9
9 Prerequisites: 10 Prerequisites:
10 11
11 * Linux, Mac OS X or Windows (any modern Unix should work too), 12 * Linux, Mac OS X or Windows (any modern Unix should work too),
12 * Python (2.7 or 3.5), 13 * Python (2.7 or 3.5+),
13 * pip. 14 * pip.
14 15
15 To install: 16 To install:
16 17
17 $ pip install -U python-abp 18 $ pip install -U python-abp
18 19
19 ## Rendering of filter lists 20 ## Rendering of filter lists
20 21
21 The filter lists are originally authored in relatively smaller parts focused 22 The filter lists are originally authored in relatively smaller parts focused
22 on a particular type of filters, related to a specific topic or relevant 23 on a particular type of filters, related to a specific topic or relevant
23 for a particular geographical area. 24 for a particular geographical area.
24 We call these parts _filter list fragments_ (or just _fragments_) 25 We call these parts _filter list fragments_ (or just _fragments_)
25 to distinguish them from full filter lists that are 26 to distinguish them from full filter lists that are
26 consumed by the adblocking software such as Adblock Plus. 27 consumed by the adblocking software such as Adblock Plus.
27 28
28 Rendering is a process that combines filter list fragments into a filter list. 29 Rendering is a process that combines filter list fragments into a filter list.
29 It starts with one fragment that can include other ones and so forth. 30 It starts with one fragment that can include other ones and so forth.
30 The produced filter list is marked with a version, a timestamp and 31 The produced filter list is marked with a version, a timestamp and
31 a [checksum](https://adblockplus.org/filters#special-comments). 32 a [checksum][1].
32 33
33 Python-abp contains a script that can do this called `flrender`: 34 Python-abp contains a script that can do this called `flrender`:
34 35
35 $ flrender fragment.txt filterlist.txt 36 $ flrender fragment.txt filterlist.txt
36 37
37 This will take the top level fragment in `fragment.txt`, render it and save into 38 This will take the top level fragment in `fragment.txt`, render it and save into
38 `filterlist.txt`. 39 `filterlist.txt`.
39 40
40 Fragments might reference other fragments that should be included into them. 41 Fragments might reference other fragments that should be included into them.
41 The references come in two forms: http(s) includes and local includes: 42 The references come in two forms: http(s) includes and local includes:
42 43
43 %include http://www.server.org/dir/list.txt% 44 %include http://www.server.org/dir/list.txt%
44 %include easylist:easylist/easylist_general_block.txt 45 %include easylist:easylist/easylist_general_block.txt%
45 46
46 The first instruction contains a URL that will be fetched and inserted at the 47 The first instruction contains a URL that will be fetched and inserted at the
47 point of reference. 48 point of reference.
48 The second one contains a path inside easylist repository. 49 The second one contains a path inside easylist repository.
49 `flrender` needs to be able to find a copy of the repository on the local 50 `flrender` needs to be able to find a copy of the repository on the local
50 filesystem. We use `-i` option to point it to the right directory: 51 filesystem. We use `-i` option to point it to the right directory:
51 52
52 $ flrender -i easylist=/home/abc/easylist input.txt output.txt 53 $ flrender -i easylist=/home/abc/easylist input.txt output.txt
53 54
54 Now the second reference above will be resolved to 55 Now the second reference above will be resolved to
55 `/home/abc/easylist/easylist/easylist_general_block.txt` and the fragment will 56 `/home/abc/easylist/easylist/easylist_general_block.txt` and the fragment will
56 be read from this file. 57 be loaded from this file.
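
The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the mapping from an `-i` value and an include reference to a file path, not the code that `flrender` actually runs; the function names `parse_source_option` and `resolve_include` are hypothetical.

```python
import posixpath


def parse_source_option(option):
    """Split an `-i` value such as 'easylist=/home/abc/easylist'."""
    name, _, path = option.partition('=')
    return name, path


def resolve_include(reference, sources):
    """Map 'name:relative/path' to a file path inside the named source."""
    name, _, relative_path = reference.partition(':')
    if name not in sources:
        raise KeyError('Unknown source: {!r}'.format(name))
    # posixpath keeps the result identical on every platform.
    return posixpath.join(sources[name], relative_path)


sources = dict([parse_source_option('easylist=/home/abc/easylist')])
path = resolve_include('easylist:easylist/easylist_general_block.txt', sources)
print(path)  # /home/abc/easylist/easylist/easylist_general_block.txt
```

The part before ":" in the reference and the part before "=" in the option have to match, which is why both are reduced to the same `name` key here.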
57 58
58 Directories that contain filter list fragments that are used during rendering 59 Directories that contain filter list fragments that are used during rendering
59 are called sources. 60 are called sources.
60 They are normally working copies of the repositories that contain filter list 61 They are normally working copies of the repositories that contain filter list
61 fragments. 62 fragments.
62 Each source is identified by a name: that's the part that comes before ":" 63 Each source is identified by a name: that's the part that comes before ":"
63 in the include instruction and it should be the same as what comes before "=" 64 in the include instruction and it should be the same as what comes before "="
64 in the `-i` option. 65 in the `-i` option.
65 66
66 Commonly used sources have generally accepted names. For example the main 67 Commonly used sources have generally accepted names. For example the main
67 EasyList repository is referred to as `easylist`. 68 EasyList repository is referred to as `easylist`.
68 If you don't know all the source names that are needed to render some list, 69 If you don't know all the source names that are needed to render some list,
69 just run `flrender` and it will report what it's missing: 70 just run `flrender` and it will report what it's missing:
70 71
71 $ flrender easylist.txt output/easylist.txt 72 $ flrender easylist.txt output/easylist.txt
72 Unknown source: 'easylist' when including 'easylist:easylist/easylist_gener 73 Unknown source: 'easylist' when including 'easylist:easylist/easylist_gener
73 al_block.txt' from 'easylist.txt' 74 al_block.txt' from 'easylist.txt'
74 75
75 You can clone the necessary repositories to a local directory and add `-i` 76 You can clone the necessary repositories to a local directory and add `-i`
76 options accordingly. 77 options accordingly.
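
If you would rather list the source names up front, a small script can scan a fragment for local includes. The sketch below assumes the include syntax is exactly as in the examples above (`%include ...%` on its own line) and looks at a single fragment only, without following nested includes; `required_sources` is a hypothetical helper, not part of python-abp.

```python
import re

# Assumed shape of an include instruction, taken from the examples above.
INCLUDE = re.compile(r'^%include\s+(.+)%$')


def required_sources(lines):
    """Return the sorted source names used by local includes in `lines`."""
    names = set()
    for line in lines:
        match = INCLUDE.match(line.strip())
        if not match:
            continue
        target = match.group(1).strip()
        # http(s) includes are fetched directly and need no local source.
        if target.startswith(('http://', 'https://')):
            continue
        name, separator, _ = target.partition(':')
        if separator:
            names.add(name)
    return sorted(names)


fragment = [
    '%include http://www.server.org/dir/list.txt%',
    '%include easylist:easylist/easylist_general_block.txt%',
]
print(required_sources(fragment))  # ['easylist']
```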
77 78
79 ## Library API
80
81 Python-abp can also be used as a library for parsing filter lists. For example
82 to read a filter list (we use Python 3 syntax here but the API is the same):
83
84 from abp.filters import parse_filterlist
85
86 with open('filterlist.txt') as filterlist:
87 for line in parse_filterlist(filterlist):
88 print(line)
89
90 If `filterlist.txt` contains a filter list:
91
92 [Adblock Plus 2.0]
93 ! Title: Example list
94
95 abc.com,cdf.com##div#ad1
96 abc.com/ad$image
97 @@/abc\.com/
98 ...
99
100 the output will look something like:
101
102 Header(version='Adblock Plus 2.0')
103 Metadata(key='Title', value='Example list')
104 EmptyLine()
105 Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])
106 Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])
107 Filter(text='@@/abc\\.com/', selector={'type': 'url-regexp', 'value': 'abc\\.com'}, action='allow', options=[])
108 ...
109
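
Once the lines are parsed, they can be processed like ordinary Python objects. The sketch below uses hypothetical `namedtuple` stand-ins that mirror the printed output above, so it runs without python-abp installed; the real classes come from `abp.filters` and may differ in detail. It also assumes that `True` in a domain option marks a domain the filter applies to.

```python
from collections import namedtuple

# Hypothetical stand-ins shaped like the output shown above.
Metadata = namedtuple('Metadata', 'key value')
Filter = namedtuple('Filter', 'text selector action options')

parsed_lines = [
    Metadata(key='Title', value='Example list'),
    Filter(text='abc.com,cdf.com##div#ad1',
           selector={'type': 'css', 'value': 'div#ad1'},
           action='hide',
           options=[('domain', [('abc.com', True), ('cdf.com', True)])]),
    Filter(text='abc.com/ad$image',
           selector={'type': 'url-pattern', 'value': 'abc.com/ad'},
           action='block',
           options=[('image', True)]),
]


def domains_of(filter_line):
    """Return the domains a filter is restricted to, or an empty list."""
    for name, value in filter_line.options:
        if name == 'domain':
            # Assumption: the boolean is True for included domains.
            return [domain for domain, included in value if included]
    return []


blocking = [line.text for line in parsed_lines
            if isinstance(line, Filter) and line.action == 'block']
print(blocking)                    # ['abc.com/ad$image']
print(domains_of(parsed_lines[1]))  # ['abc.com', 'cdf.com']
```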
110 `abp.filters` module also exports a lower-level function for parsing individual
111 lines of a filter list: `parse_line`. It returns a parsed line object just like
112 the items in the iterator returned by `parse_filterlist`.
113
114 For further information on the library API use `help()` on `abp.filters` and
115 its contents in an interactive Python session, read the docstrings or look at the
116 tests for some usage examples.
117
78 ## Testing 118 ## Testing
79 119
80 Unit tests for `python-abp` are located in the `/tests` directory. 120 Unit tests for `python-abp` are located in the `/tests` directory.
81 [Pytest](http://pytest.org/) is used for quickly running the tests 121 [Pytest][3] is used for quickly running the tests
82 during development. 122 during development.
83 [Tox](https://tox.readthedocs.org/) is used for testing in different 123 [Tox][4] is used for testing in different
84 environments (Python 2.7, Python 3.5 and PyPy) and code quality 124 environments (Python 2.7, Python 3.5+ and PyPy) and code quality
85 reporting. 125 reporting.
86 126
87 In order to execute the tests, first create and activate development 127 In order to execute the tests, first create and activate development
88 virtualenv: 128 virtualenv:
89 129
90 $ python setup.py devenv 130 $ python setup.py devenv
91 $ . devenv/bin/activate 131 $ . devenv/bin/activate
92 132
93 With the development virtualenv activated use pytest for a quick test run: 133 With the development virtualenv activated use pytest for a quick test run:
94 134
95 (devenv) $ py.test tests 135 (devenv) $ pytest tests
96 136
97 and tox for a comprehensive report: 137 and tox for a comprehensive report:
98 138
99 (devenv) $ tox 139 (devenv) $ tox
140
141 ## Development
142
143 When adding new functionality, add tests for it (preferably first). Code
144 coverage (as measured by `tox -e qa`) should not decrease and the tests
145 should pass in all Tox environments.
146
147 All public functions, classes and methods should have docstrings compliant with the
148 [NumPy/SciPy documentation guide][5]. One exception is the constructors of
149 classes that the user is not expected to instantiate (such as exceptions).
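
As an illustration of the expected docstring layout, here is a minimal sketch of a function documented in the NumPy/SciPy style. The function `count_filters` is hypothetical and is not part of python-abp; only the docstring structure matters here.

```python
def count_filters(lines):
    """Count the filter lines in a sequence of raw filter list lines.

    Parameters
    ----------
    lines : iterable of str
        Lines of a filter list.

    Returns
    -------
    int
        Number of lines that are not empty, comments or headers.

    """
    count = 0
    for line in lines:
        stripped = line.strip()
        # Comments start with "!" and the header starts with "[".
        if stripped and not stripped.startswith(('!', '[')):
            count += 1
    return count


example = ['[Adblock Plus 2.0]', '! Title: Example list', '', 'abc.com/ad$image']
print(count_filters(example))  # 1
```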
150
151 [1]: https://adblockplus.org/filters#special-comments
152 [2]: https://adblockplus.org/filters#options
153 [3]: http://pytest.org/
154 [4]: https://tox.readthedocs.org/
155 [5]: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt