
Delta Between Two Patch Sets: README.md

Issue 29465720: Issue 4970 - Document the library API of python-abp (Closed)
Left Patch Set: Update README to match the changes from https://codereview.adblockplus.org/29465715/ Created Aug. 7, 2017, 8:28 p.m.
Right Patch Set: Rebase to match the new master and retouche the docstrings. Created Oct. 24, 2017, 4:06 p.m.
1 # python-abp 1 # python-abp
2 2
3 This repository contains a library for working with Adblock Plus filter lists 3 This repository contains a library for working with Adblock Plus filter lists
4 and the script that is used for building Adblock Plus filter lists from the 4 and the script that is used for building Adblock Plus filter lists from the
5 form in which they are authored into the format suitable for consumption by the 5 form in which they are authored into the format suitable for consumption by the
6 adblocking software. 6 adblocking software.
mathias 2017/08/08 12:24:35 For an introduction that is a bit too much. How ab
7 7
8 ## Installation 8 ## Installation
9 9
10 Prerequisites: 10 Prerequisites:
11 11
12 * Linux, Mac OS X or Windows (any modern Unix should work too), 12 * Linux, Mac OS X or Windows (any modern Unix should work too),
13 * Python (2.7 or 3.5+), 13 * Python (2.7 or 3.5+),
14 * pip. 14 * pip.
15 15
16 To install: 16 To install:
(...skipping 30 matching lines...)
47 The first instruction contains a URL that will be fetched and inserted at the 47 The first instruction contains a URL that will be fetched and inserted at the
48 point of reference. 48 point of reference.
49 The second one contains a path inside easylist repository. 49 The second one contains a path inside easylist repository.
50 `flrender` needs to be able to find a copy of the repository on the local 50 `flrender` needs to be able to find a copy of the repository on the local
51 filesystem. We use the `-i` option to point it to the right directory: 51 filesystem. We use the `-i` option to point it to the right directory:
52 52
53 $ flrender -i easylist=/home/abc/easylist input.txt output.txt 53 $ flrender -i easylist=/home/abc/easylist input.txt output.txt
54 54
55 Now the second reference above will be resolved to 55 Now the second reference above will be resolved to
56 `/home/abc/easylist/easylist/easylist_general_block.txt` and the fragment will 56 `/home/abc/easylist/easylist/easylist_general_block.txt` and the fragment will
57 be read from this file. 57 be loaded from this file.
58 58
59 Directories that contain filter list fragments that are used during rendering 59 Directories that contain filter list fragments that are used during rendering
60 are called sources. 60 are called sources.
61 They are normally working copies of the repositories that contain filter list 61 They are normally working copies of the repositories that contain filter list
62 fragments. 62 fragments.
63 Each source is identified by a name: that's the part that comes before ":" 63 Each source is identified by a name: that's the part that comes before ":"
64 in the include instruction and it should be the same as what comes before "=" 64 in the include instruction and it should be the same as what comes before "="
65 in the `-i` option. 65 in the `-i` option.
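
The name-to-directory lookup described above can be sketched as follows. This is only an illustration, not the code `flrender` uses: `resolve_include` and the `sources` mapping are hypothetical names, the include target `easylist:easylist/easylist_general_block.txt` is inferred from the resolved path shown earlier, and URL includes are deliberately not handled.

```python
import posixpath


def resolve_include(target, sources):
    """Resolve a 'name:path' include target to a local file path.

    `sources` maps source names (the part before '=' in `-i`) to
    directories. Hypothetical helper for illustration only.
    """
    name, separator, path = target.partition(':')
    if not separator or name not in sources:
        raise ValueError('Unknown source in include target: ' + repr(target))
    return posixpath.join(sources[name], path)


# Equivalent of passing `-i easylist=/home/abc/easylist`.
sources = {'easylist': '/home/abc/easylist'}
resolved = resolve_include(
    'easylist:easylist/easylist_general_block.txt', sources)
print(resolved)
```

A target whose name is not in `sources` raises `ValueError` here; how the real tool reports that case is not covered by this sketch.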
66 66
67 Commonly used sources have generally accepted names. For example the main 67 Commonly used sources have generally accepted names. For example the main
(...skipping 22 matching lines...)
90 If `filterlist.txt` contains a filter list: 90 If `filterlist.txt` contains a filter list:
91 91
92 [Adblock Plus 2.0] 92 [Adblock Plus 2.0]
93 ! Title: Example list 93 ! Title: Example list
94 94
95 abc.com,cdf.com##div#ad1 95 abc.com,cdf.com##div#ad1
96 abc.com/ad$image 96 abc.com/ad$image
97 @@/abc\.com/ 97 @@/abc\.com/
98 ... 98 ...
99 99
100 the output will look similar to the following: 100 the output will look something like:
101 101
102 Header(version='Adblock Plus 2.0') 102 Header(version='Adblock Plus 2.0')
103 Metadata(key='Title', value='Example list') 103 Metadata(key='Title', value='Example list')
104 EmptyLine() 104 EmptyLine()
105 Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])]) 105 Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])
106 Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)]) 106 Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])
107 Filter(text='@@/abc\\.com/', selector={'type': 'url-regexp', 'value': 'abc\\.com'}, action='allow', options=[]) 107 Filter(text='@@/abc\\.com/', selector={'type': 'url-regexp', 'value': 'abc\\.com'}, action='allow', options=[])
108 ... 108 ...
109 109
110 In general `parse_filterlist` takes an iterable of strings (such as a list or
111 an open file) and returns an iterable of parsed filter list lines. Each line
112 will have its `.type` attribute set to a string indicating its type. It will
113 also have a `.to_string()` method that converts it to a unicode string in the
114 filter list format (most of the time it's the same as the string from which the
115 filter was parsed). Further attributes depend on the type of the line.
116
117 **Note:** `parse_filterlist` returns an iterator, not a list, and only consumes
118 the input lines when its output is iterated over. This allows much more memory
119 efficient handling of large filter lists, but there are two things to watch
120 out for:
121
122 **Note:** iteration over parsed lines may throw a `ParseError` exception if a
123 line cannot be parsed. The exception will contain the information about the
124 error and the original line that failed parsing.
mathias 2017/08/08 12:24:35 It is not clear what bits this is about (I assume
Vasily Kuznetsov 2017/08/08 14:31:12 Yeah, we've discussed this. But for now that chang
125
126 - When you're parsing filters from a file, you need to complete the iteration
127 before you close the file.
128 - Once you have iterated over the output of `parse_filterlist`, it will be
129 consumed and you won't be able to iterate over it again.
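
Both caveats are general properties of Python iterators and can be reproduced without the library. In the sketch below `parse_lines` is a hypothetical stand-in generator, not `parse_filterlist` itself, and `io.StringIO` plays the role of an open file.

```python
import io


def parse_lines(lines):
    """Stand-in lazy parser: yields one stripped line per input line."""
    for line in lines:
        yield line.strip()


# Caveat 2: the iterator is exhausted after the first pass.
source = io.StringIO(u'[Adblock Plus 2.0]\n! Title: Example list\n')
parsed = parse_lines(source)
first_pass = list(parsed)
second_pass = list(parsed)  # nothing left to yield

# Caveat 1: closing the input before iterating breaks the iteration,
# because no input line is read until the output is consumed.
closed_source = io.StringIO(u'abc.com/ad$image\n')
lazy = parse_lines(closed_source)
closed_source.close()
try:
    list(lazy)
    failed = False
except ValueError:  # raised when reading from a closed file object
    failed = True
```

Calling `list()` on the iterator while the input is still open avoids both problems, at the cost of holding all parsed lines in memory.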
130
131 If you find that this is bothering you, you probably want to convert the output
mathias 2017/08/08 12:24:34 Everything in this section from here on, maybe inc
132 of `parse_filterlist` to a list:
133
134 lines_list = list(parse_filterlist(filterlist))
135
136 This will load the whole file into memory but unless you're dealing with a
137 gigantic filter list that should not be a problem.
138
139 ### Line types
140
141 As mentioned above, lines of different types have different attributes:
142
143 | type | attributes |
mathias 2017/08/08 12:24:35 Are you sure this kind of table markup is supporte
Vasily Kuznetsov 2017/08/08 14:31:12 Indeed the table markup was not part of the origin
144 |------------|------------------------------------------------------------------------|
145 | header | `version` - plugin version string |
146 | emptyline | no attributes |
147 | comment | `text` - text of the comment |
148 | metadata | `key` - name of the metadata field, `value` - value of the field |
149 | include | `target` - url/path of the file to include |
150 | filter | `text` - text of the filter, `selector` - what to look for, `action` - what to do with selected items, `options` - filter options |
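
A consumer would typically branch on `.type` and then read the attributes listed for that type. The sketch below uses hypothetical namedtuple stand-ins that carry only the attributes from the table; the real line objects come from the parser and may differ in other respects.

```python
from collections import namedtuple

# Hypothetical stand-ins for parsed lines (illustration only).
Header = namedtuple('Header', 'type version')
Metadata = namedtuple('Metadata', 'type key value')
EmptyLine = namedtuple('EmptyLine', 'type')
Filter = namedtuple('Filter', 'type text')

lines = [
    Header('header', 'Adblock Plus 2.0'),
    Metadata('metadata', 'Title', 'Example list'),
    EmptyLine('emptyline'),
    Filter('filter', 'abc.com/ad$image'),
]

# Collect metadata into a dict and filter texts into a list,
# ignoring every other line type.
metadata = {}
filter_texts = []
for line in lines:
    if line.type == 'metadata':
        metadata[line.key] = line.value
    elif line.type == 'filter':
        filter_texts.append(line.text)
```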
151
152 #### Filter attributes
mathias 2017/08/08 12:24:35 This section mentions "Selector" but not ".selecto
153
154 Selector is a dictionary with two keys:
155
156 | key | meaning |
157 |--------------|------------------------------------------------------------------|
158 | type | 'css', 'abp-simple', 'url-pattern', 'url-regexp', 'extended-css' |
159 | value | the selector itself, the meaning is type-dependent |
160
161 It's preferable to import the `SELECTOR_TYPE` namespace from `abp.filters` to refer
162 to selector types instead of using strings. `SELECTOR_TYPE` contains constants
163 for each selector type: `SELECTOR_TYPE.CSS`, `SELECTOR_TYPE.ABP_SIMPLE`,
164 `SELECTOR_TYPE.URL_PATTERN`, `SELECTOR_TYPE.URL_REGEXP` and
165 `SELECTOR_TYPE.XCSS`.
166
167 Action instructs adblocking software on what should be done with the items
168 matching the selector:
169
170 | action | meaning |
171 |--------|------------------------------------------------------------------------|
172 | block | block http(s) request that matches the selector |
173 | allow | allow http(s) request that matches the filter (whitelist the resource) |
174 | hide | hide the DOM element that matches the selector |
175 | show | show the DOM element that matches the selector (whitelist the element) |
176
177 The action constants are contained in `FILTER_ACTION` namespace, which can also
178 be imported from `abp.filters` (`FILTER_ACTION.BLOCK`, `FILTER_ACTION.ALLOW`,
179 etc.)
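
As a rough illustration of how the action and selector fit together, the sketch below summarises a filter from its `action` and `selector`. It uses the plain string values from the tables and a hypothetical namedtuple in place of the real filter object; with the library available, the `FILTER_ACTION` constants would be the preferred way to spell the actions.

```python
from collections import namedtuple

# Hypothetical stand-in with the attributes listed for `filter` lines.
Filter = namedtuple('Filter', 'text selector action options')


def describe(filter_line):
    """Summarise what a filter does, based on its action and selector."""
    if filter_line.action in ('block', 'allow'):
        target = 'request'   # these actions apply to http(s) requests
    elif filter_line.action in ('hide', 'show'):
        target = 'element'   # these actions apply to DOM elements
    else:
        raise ValueError('Unexpected action: ' + repr(filter_line.action))
    return '{} {} matching {} {!r}'.format(
        filter_line.action, target,
        filter_line.selector['type'], filter_line.selector['value'])


blocking = Filter(
    text='abc.com/ad$image',
    selector={'type': 'url-pattern', 'value': 'abc.com/ad'},
    action='block',
    options=[('image', True)],
)
summary = describe(blocking)
```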
180
181 Options is a list of tuples consisting of option name and option value. The
182 option value is `True` or `False` for flags or, for options with a value, it's
183 a string, list of strings or a list of `(string, boolean)` tuples. See
184 [documentation on authoring the filter rules][2] for the list of existing
185 options and their meanings.
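
Since options are a plain list of tuples, they can be inspected with ordinary Python. The sketch below uses the option values from the example output earlier; `option_value` and `included_domains` are hypothetical helpers, and reading the boolean in each `(string, boolean)` pair as "domain is included" is an assumption.

```python
# Options copied from the example output: a list of (name, value) tuples.
element_filter_options = [('domain', [('abc.com', True), ('cdf.com', True)])]
request_filter_options = [('image', True)]


def option_value(options, name, default=None):
    """Return the value of the first option called `name`, or `default`."""
    for key, value in options:
        if key == name:
            return value
    return default


def included_domains(options):
    """List domains flagged True in the 'domain' option.

    Assumes True means the filter applies on that domain.
    """
    pairs = option_value(options, 'domain', [])
    return [domain for domain, included in pairs if included]


domains = included_domains(element_filter_options)
is_image_filter = option_value(request_filter_options, 'image', False)
```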
186
187 ### Other functions
188
189 `abp.filters` module also exports a lower-level function for parsing individual 110 `abp.filters` module also exports a lower-level function for parsing individual
190 lines of a filter list: `parse_line`. It returns a parsed line object just like 111 lines of a filter list: `parse_line`. It returns a parsed line object just like
191 the items in the iterator returned by `parse_filterlist`. 112 the items in the iterator returned by `parse_filterlist`.
113
114 For further information on the library API use `help()` on `abp.filters` and
115 its contents in an interactive Python session, read the docstrings or look at the
116 tests for some usage examples.
192 117
193 ## Testing 118 ## Testing
194 119
195 Unit tests for `python-abp` are located in the `/tests` directory. 120 Unit tests for `python-abp` are located in the `/tests` directory.
196 [Pytest][3] is used for quickly running the tests 121 [Pytest][3] is used for quickly running the tests
197 during development. 122 during development.
198 [Tox][4] is used for testing in different 123 [Tox][4] is used for testing in different
199 environments (Python 2.7, Python 3.5+ and PyPy) and code quality 124 environments (Python 2.7, Python 3.5+ and PyPy) and code quality
200 reporting. 125 reporting.
201 126
202 In order to execute the tests, first create and activate development 127 In order to execute the tests, first create and activate development
203 virtualenv: 128 virtualenv:
204 129
205 $ python setup.py devenv 130 $ python setup.py devenv
206 $ . devenv/bin/activate 131 $ . devenv/bin/activate
207 132
208 With the development virtualenv activated use pytest for a quick test run: 133 With the development virtualenv activated use pytest for a quick test run:
209 134
210 (devenv) $ pytest tests 135 (devenv) $ pytest tests
211 136
212 and tox for a comprehensive report: 137 and tox for a comprehensive report:
213 138
214 (devenv) $ tox 139 (devenv) $ tox
215 140
141 ## Development
142
143 When adding new functionality, add tests for it (preferably first). Code
144 coverage (as measured by `tox -e qa`) should not decrease and the tests
145 should pass in all Tox environments.
146
147 All public functions, classes and methods should have docstrings compliant with
148 [NumPy/SciPy documentation guide][5]. One exception is the constructors of
149 classes that the user is not expected to instantiate (such as exceptions).
216 150
217 [1]: https://adblockplus.org/filters#special-comments 151 [1]: https://adblockplus.org/filters#special-comments
218 [2]: https://adblockplus.org/filters#options 152 [2]: https://adblockplus.org/filters#options
219 [3]: http://pytest.org/ 153 [3]: http://pytest.org/
220 [4]: https://tox.readthedocs.org/ 154 [4]: https://tox.readthedocs.org/
155 [5]: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt