Side by Side Diff: abp/filters/blocks.py

Issue 30053555: Issue 7471 - Add an API for working with blocks of filters (Closed) Base URL: https://hg.adblockplus.org/python-abp
Patch Set: Adjust the API in response to review comments Created May 9, 2019, 4:22 p.m.
# This file is part of Adblock Plus <https://adblockplus.org/>,
# Copyright (C) 2006-present eyeo GmbH
#
# Adblock Plus is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 3 as
# published by the Free Software Foundation.
#
# Adblock Plus is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Adblock Plus.  If not, see <http://www.gnu.org/licenses/>.

"""Extract blocks of filters separated by comments.

Blocks of filters separated by comments are common in real world filter lists
(e.g. easylist). This structure itself is not documented or standardized but
it's often useful to be able to parse it.

This module exports one function, to_blocks(), that further processes a filter
list (after it has been parsed by abp.filters.parser) by splitting it into
blocks of filters. The comments preceding each block are merged to produce the
block description.

Some filter lists (e.g. ABP exception list) also make use of variable notation
("!:varname=value") to define specific attributes of filter blocks. This
module supports this notation and will collect those variables in a dictionary
that's placed into the `variables` attribute of the block. If variables are
present in the comments preceding a block, only non-variable comments that
follow the first variable declaration will be included in the block
description.

Blocks also provide a method to convert them to dictionaries: .to_dict() --
this can be used for JSON conversion.

Example
-------

The following code will dump the blocks as dictionaries:

    from abp.filters import parse_filterlist
    from abp.filters.blocks import to_blocks

    with open(fl_path) as f:
        for block in to_blocks(parse_filterlist(f)):
            print(block.to_dict())

This will produce output like this:

    {'variables': {'partner_token': 'abc', 'partner_id': '3372',
     'type': 'partner'}, 'description': 'Some comments', 'filters': [...]}

"""

from __future__ import unicode_literals

import re

__all__ = ['to_blocks']

VAR_REGEXP = re.compile(r'^:(\w+)=(.*)$')


class FiltersBlock(object):
    """A block of filters (preceded by comments)."""

    def __init__(self, comments, filters):
        """Create a filter block from filters and comments preceding them."""
        self.filters = filters
        self.variables = {}
        descr_lines = []

        for comment in comments:
            match = VAR_REGEXP.search(comment.text)
            if match:
                if not self.variables:
                    # Normal comments before first variable are not included in
                    # the description.
                    descr_lines = []
                name, value = match.groups()
                self.variables[name] = value
            else:
                descr_lines.append(comment.text)

        self.description = '\n'.join(descr_lines)

    def to_dict(self):
        """Convert the block to a dictionary (e.g. for JSON conversion)."""
        ret = dict(self.__dict__)
        ret['filters'] = [f.to_dict() for f in ret['filters']]
        return ret


def to_blocks(parsed_lines):
    """Convert a sequence of parsed filter list lines to blocks.

    Parameters
    ----------
    parsed_lines : iterable of namedtuple
        Parsed filter list (see `parser.py` for details on how it's
        represented).

    Returns
    -------
    blocks : iterable of FiltersBlock
        Blocks extracted from the parsed filter list. Each block carries
        filters in the `.filters` attribute, the merged comments in the
        `.description` attribute and variable-defining comments in
        `.variables`.

    """
    comments = []
    filters = []

    for line in parsed_lines:
        if line.type == 'comment':
            if filters:
                yield FiltersBlock(comments, filters)
                comments = []
                filters = []
            comments.append(line)
        elif line.type == 'filter':
            filters.append(line)

    if filters:
        yield FiltersBlock(comments, filters)
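
For readers of this patch, here is a minimal standalone sketch of the grouping and variable-handling rules described in the module docstring. It does not import abp: `Comment`, `Filter`, `split_comments` and `group_lines` are hypothetical stand-ins written for this illustration, and it assumes that comment text arrives without the leading "!" (which is what the `^:` anchor in VAR_REGEXP suggests).

```python
import re
from collections import namedtuple

# Hypothetical stand-ins for parsed lines (assumption: the real parser yields
# namedtuples with a `type` attribute, and comments expose `text`).
Comment = namedtuple('Comment', 'text type')
Filter = namedtuple('Filter', 'text type')

VAR_REGEXP = re.compile(r'^:(\w+)=(.*)$')


def split_comments(comment_texts):
    """Return (variables, description) for the comments preceding a block."""
    variables = {}
    descr_lines = []
    for text in comment_texts:
        match = VAR_REGEXP.search(text)
        if match:
            if not variables:
                # Comments seen before the first variable are dropped.
                descr_lines = []
            name, value = match.groups()
            variables[name] = value
        else:
            descr_lines.append(text)
    return variables, '\n'.join(descr_lines)


def group_lines(lines):
    """Yield (comments, filters) pairs; a comment after filters starts a new block."""
    comments, filters = [], []
    for line in lines:
        if line.type == 'comment':
            if filters:
                yield comments, filters
                comments, filters = [], []
            comments.append(line)
        elif line.type == 'filter':
            filters.append(line)
    if filters:
        yield comments, filters


lines = [
    Comment('Ignored preamble', 'comment'),
    Comment(':partner_id=3372', 'comment'),
    Comment('Some comments', 'comment'),
    Filter('||example.com^', 'filter'),
    Comment('Second block', 'comment'),
    Filter('@@||example.org^', 'filter'),
]

blocks = []
for comments, filters in group_lines(lines):
    variables, description = split_comments([c.text for c in comments])
    blocks.append({
        'variables': variables,
        'description': description,
        'filters': [f.text for f in filters],
    })

for block in blocks:
    print(block)
```

In this sketch the first block keeps only "Some comments" as its description, because "Ignored preamble" precedes the first variable declaration. Note also that trailing comments with no filters after them do not produce a block, since a block is only yielded when `filters` is non-empty.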