cms/converters.py - Issue 29380555: Noissue - undo accidental logic changes introduced by 78d85c096f9e

Side by Side Diff: cms/converters.py

Issue 29380555: Noissue - undo accidental logic changes introduced by 78d85c096f9e (Closed)

Patch Set: Created March 11, 2017, 4:58 p.m.

OLD	NEW
1 # This file is part of the Adblock Plus web scripts,	1 # This file is part of the Adblock Plus web scripts,

2 # Copyright (C) 2006-2016 Eyeo GmbH	2 # Copyright (C) 2006-2016 Eyeo GmbH

3 #	3 #

4 # Adblock Plus is free software: you can redistribute it and/or modify	4 # Adblock Plus is free software: you can redistribute it and/or modify

5 # it under the terms of the GNU General Public License version 3 as	5 # it under the terms of the GNU General Public License version 3 as

6 # published by the Free Software Foundation.	6 # published by the Free Software Foundation.

7 #	7 #

8 # Adblock Plus is distributed in the hope that it will be useful,	8 # Adblock Plus is distributed in the hope that it will be useful,

9 # but WITHOUT ANY WARRANTY; without even the implied warranty of	9 # but WITHOUT ANY WARRANTY; without even the implied warranty of

10 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the	10 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

11 # GNU General Public License for more details.	11 # GNU General Public License for more details.

12 #	12 #

13 # You should have received a copy of the GNU General Public License	13 # You should have received a copy of the GNU General Public License

14 # along with Adblock Plus. If not, see <http://www.gnu.org/licenses/>.	14 # along with Adblock Plus. If not, see <http://www.gnu.org/licenses/>.

15	15

	16 from __future__ import unicode_literals

	17

16 import os	18 import os

17 import HTMLParser	19 import HTMLParser

18 import re	20 import re

19	21

20 import jinja2	22 import jinja2

21 import markdown	23 import markdown

22	24

23	25

24 # Monkey-patch Markdown's isBlockLevel function to ensure that no paragraphs	26 # Monkey-patch Markdown's isBlockLevel function to ensure that no paragraphs

25 # are inserted into the <head> tag	27 # are inserted into the <head> tag

(...skipping 44 matching lines...)
70 finally:	72 finally:

71 self._string = None	73 self._string = None

72 self._attrs = None	74 self._attrs = None

73 self._pagename = None	75 self._pagename = None

74 self._inside_fixed = False	76 self._inside_fixed = False

75 self._fixed_strings = None	77 self._fixed_strings = None

76	78

77 def handle_starttag(self, tag, attrs):	79 def handle_starttag(self, tag, attrs):

78 if self._inside_fixed:	80 if self._inside_fixed:

79 raise Exception("Unexpected HTML tag '{}' inside a fixed string"	81 raise Exception("Unexpected HTML tag '{}' inside a fixed string"

80 'on page {}'.format(tag, self._pagename))	82 ' on page {}'.format(tag, self._pagename))

81 if tag == 'fix':	83 if tag == 'fix':
Wladimir Palant 2017/03/12 16:02:22
For reference, I'm not a huge fan of removing every "unnecessary" else statement. This function exemplifies the point nicely. It handles four mutually exclusive scenarios, and before the changes you could tell that just by looking at the overall structure. Now you have to realize that the first branch has a raise statement in it and so execution won't continue. For the Python interpreter both versions are identical, but for humans the current version of the code is harder to read.

From what I can tell, PEP8 doesn't mandate replacing elif with an if statement here. It would also be a rather unpythonic rule, with Python being very much about making structures obvious.

Vasily Kuznetsov 2017/03/13 12:25:29
I agree that applying a strict "no unnecessary else" rule here makes the logic less clear. We could of course add returns to the other branches and then not have any elifs or elses at all, but it seems like the original structure was actually better. Unfortunately our flake-eyeo plugin is rather strict about this, so we seem to have shot ourselves in the foot a bit.

I have created a ticket to deal with this (https://issues.adblockplus.org/ticket/4980) but for now I will revert the `if` and add an exception to `tox.ini`.
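A minimal sketch of why the branch structure matters here, going by what the flattened diff shows: in the OLD column the whitelist check starts a fresh `if`, so a `fix` tag falls through to the final `else` and raises, while the NEW column chains it with `elif`. The function names, the `WHITELIST` contents and the returned strings below are illustrative assumptions, not code from cms/converters.py.

```python
WHITELIST = {'a', 'em', 'strong'}  # hypothetical whitelist, for illustration only


def start_tag_flat(tag, inside_fixed):
    """Mirrors the OLD column: the whitelist check is a separate `if`."""
    if inside_fixed:
        raise ValueError('tag inside a fixed string')
    if tag == 'fix':
        outcome = 'open fixed string'
    if tag in WHITELIST:  # new chain: also evaluated when tag == 'fix'
        outcome = 'record whitelisted tag'
    else:
        raise ValueError('unexpected tag')
    return outcome


def start_tag_chained(tag, inside_fixed):
    """One if/elif chain: the four scenarios are visibly exclusive."""
    if inside_fixed:
        raise ValueError('tag inside a fixed string')
    elif tag == 'fix':
        return 'open fixed string'
    elif tag in WHITELIST:
        return 'record whitelisted tag'
    else:
        raise ValueError('unexpected tag')
```

With these definitions `start_tag_chained('fix', False)` succeeds, whereas `start_tag_flat('fix', False)` raises even though `fix` is a supported tag. Both behave the same for whitelisted tags.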
82 self._inside_fixed = True	84 self._inside_fixed = True

83 self._fixed_strings.append([])	85 self._fixed_strings.append([])

84 if tag in self._whitelist:	86 elif tag in self._whitelist:

85 self._attrs.setdefault(tag, []).append(attrs)	87 self._attrs.setdefault(tag, []).append(attrs)

86 self._string.append('<{}>'.format(tag))	88 self._string.append('<{}>'.format(tag))

87 else:	89 else:

88 raise Exception("Unexpected HTML tag '{}' inside a fixed string"	90 raise Exception("Unexpected HTML tag '{}' inside a fixed string"

89 'on page {}'.format(tag, self._pagename))	91 ' on page {}'.format(tag, self._pagename))

90	92

91 def handle_endtag(self, tag):	93 def handle_endtag(self, tag):

92 if tag == 'fix':	94 if tag == 'fix':

93 self._string.append('{{{}}}'.format(self._fixed_strings))	95 self._string.append('{{{}}}'.format(len(self._fixed_strings)))

94 self._inside_fixed = False	96 self._inside_fixed = False

95 else:	97 else:

96 self._string.append('</{}>'.format(tag))	98 self._string.append('</{}>'.format(tag))

97	99

98 def _append_text(self, s):	100 def _append_text(self, s):

99 if self._inside_fixed:	101 if self._inside_fixed:

100 self._fixed_strings[-1].append(s)	102 self._fixed_strings[-1].append(s)

101 else:	103 else:

102 self._string.append(s)	104 self._string.append(s)

103	105

(...skipping 45 matching lines...)
149 return re.escape(escape(s))	151 return re.escape(escape(s))

150	152

151 # Handle duplicated strings	153 # Handle duplicated strings

152 if default:	154 if default:

153 self._seen_defaults[(page, name)] = (default, comment)	155 self._seen_defaults[(page, name)] = (default, comment)

154 else:	156 else:

155 try:	157 try:

156 default, comment = self._seen_defaults[(page, name)]	158 default, comment = self._seen_defaults[(page, name)]

157 except KeyError:	159 except KeyError:

158 raise Exception('Text not yet defined for string {} on page'	160 raise Exception('Text not yet defined for string {} on page'

159 '{}'.format(name, page))	161 ' {}'.format(name, page))

160	162

161 # Extract tag attributes from default string	163 # Extract tag attributes from default string

162 default, saved_attributes, fixed_strings = (	164 default, saved_attributes, fixed_strings = (

163 self._attribute_parser.parse(default, self._params['page']))	165 self._attribute_parser.parse(default, self._params['page']))

164	166

165 # Get translation	167 # Get translation

166 locale = self._params['locale']	168 locale = self._params['locale']

167 if locale == self._params['defaultlocale']:	169 if locale == self._params['defaultlocale']:

168 result = default	170 result = default

169 elif name in localedata:	171 elif name in localedata:

170 result = localedata[name].strip()	172 result = localedata[name].strip()

171 else:	173 else:

172 result = default	174 result = default

173 self.missing_translations += 1	175 self.missing_translations += 1

174 self.total_translations += 1	176 self.total_translations += 1

175	177

176 # Perform callback with the string if required, e.g. for the	178 # Perform callback with the string if required, e.g. for the

177 # translations script	179 # translations script

178 callback = self._params['localized_string_callback']	180 callback = self._params['localized_string_callback']

179 if callback:	181 if callback:

180 callback(page, locale, name, result, comment, fixed_strings)	182 callback(page, locale, name, result, comment, fixed_strings)

181	183

182 # Insert fixed strings	184 # Insert fixed strings

183 for i, fixed_string in enumerate(fixed_strings, 1):	185 for i, fixed_string in enumerate(fixed_strings, 1):

184 result = result.replace('{{{%d}}}'.format(i), fixed_string)	186 result = result.replace('{{{}}}'.format(i), fixed_string)
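A short sketch of the brace escaping involved in the line above, assuming placeholders of the form `{1}`, `{2}`, ... as the diff suggests. In `str.format`, `{{` and `}}` are literal braces, so `'{{{}}}'` is a literal `{`, one replacement field, and a literal `}`. The OLD pattern `'{{{%d}}}'` instead contains a field named `%d`, which `format` cannot resolve from positional arguments. The helper names here are hypothetical.

```python
def placeholder(index):
    # '{{' -> '{', '{}' -> the index, '}}' -> '}'
    return '{{{}}}'.format(index)


def old_placeholder(index):
    # '{%d}' is parsed as a replacement field named '%d',
    # so this raises KeyError instead of building a placeholder.
    return '{{{%d}}}'.format(index)


def insert_fixed_strings(text, fixed_strings):
    # Same loop shape as in the diff: placeholders are numbered from 1.
    for i, fixed_string in enumerate(fixed_strings, 1):
        text = text.replace(placeholder(i), fixed_string)
    return text
```

For example, `insert_fixed_strings('Install {1} for {2}', ['Adblock Plus', 'Firefox'])` substitutes both placeholders, while `old_placeholder(1)` fails with `KeyError`.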

185	187

186 # Insert attributes	188 # Insert attributes

187 result = escape(result)	189 result = escape(result)

188	190

189 def stringify_attribute((name, value)):	191 def stringify_attribute((name, value)):

190 return '{}="{}"'.format(	192 return '{}="{}"'.format(

191 escape(name),	193 escape(name),

192 escape(self.insert_localized_strings(value, {}))	194 escape(self.insert_localized_strings(value, {}))

193 )	195 )

194	196

195 for tag in self.whitelist:	197 for tag in self.whitelist:

196 allowed_contents = '(?:[^<>]|{})'.format('|').join((	198 allowed_contents = '(?:[^<>]|{})'.format('|'.join((

197 '<(?:{}[^<>]*?|/{})>'.format(t, t)	199 '<(?:{}[^<>]*?|/{})>'.format(t, t)

198 for t in map(re.escape, self.whitelist - {tag})	200 for t in map(re.escape, self.whitelist - {tag})

199 ))	201 )))

200 saved = saved_attributes.get(tag, [])	202 saved = saved_attributes.get(tag, [])

201 for attrs in saved:	203 for attrs in saved:

202 attrs = map(stringify_attribute, attrs)	204 attrs = map(stringify_attribute, attrs)

203 result = re.sub(	205 result = re.sub(

204 r'{}({}*?){}'.format(re_escape('<{}>'.format(tag)),	206 r'{}({}*?){}'.format(re_escape('<{}>'.format(tag)),

205 allowed_contents,	207 allowed_contents,

206 re_escape('</{}>'.format(tag))),	208 re_escape('</{}>'.format(tag))),

207 lambda match: r'<{}{}>{}</{}>'.format(	209 lambda match: '<{}{}>{}</{}>'.format(
Wladimir Palant 2017/03/12 16:02:22
Like with my other comment, by not using raw strings for regular expressions you aren't making it any easier for the Python interpreter, yet humans will no longer see what's going on. More importantly, people changing these regular expressions will have to remember that they need to switch to raw strings when adding backslashes.

Vasily Kuznetsov 2017/03/13 12:25:29
I switched from raw strings to unicode strings to address the stricter behavior of .format with respect to unicode. However, after I added `from __future__ import unicode_literals` to the top, this is no longer necessary, so I reverted it to a raw string, which indeed is usually better for regular expressions. Thanks for catching this.
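To illustrate the point about backslashes, here is a small sketch; the patterns and the `matches` helper are made up for the example and do not appear in cms/converters.py. Without the `r` prefix, Python interprets `\b` as a backspace character before the `re` module ever sees it, so the pattern silently stops meaning "word boundary".

```python
import re

raw_pattern = r'\bfix\b'    # backslash + 'b': a word boundary for re
plain_pattern = '\bfix\b'   # '\b' is already a backspace character here


def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None
```

Here `matches(raw_pattern, 'a fix here')` is true and `matches(plain_pattern, 'a fix here')` is false, although the two literals differ only by the prefix.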
208 tag,	210 tag,

209 ' ' + ' '.join(attrs) if attrs else '',	211 ' ' + ' '.join(attrs) if attrs else '',

210 match.group(1),	212 match.group(1),

211 tag	213 tag

212 ),	214 ),

213 result, 1, flags=re.S	215 result, 1, flags=re.S

214 )	216 )

215 result = re.sub(	217 result = re.sub(

216 r'{}({}*?){}'.format(re_escape('<{}>'.format(tag)),	218 r'{}({}*?){}'.format(re_escape('<{}>'.format(tag)),

217 allowed_contents,	219 allowed_contents,

(...skipping 61 matching lines...)
279 self._params['includedata'] = (	281 self._params['includedata'] = (

280 self._params['source'].read_include(name, format_))	282 self._params['source'].read_include(name, format_))

281	283

282 converter = converter_class(self._params,	284 converter = converter_class(self._params,

283 key='includedata')	285 key='includedata')

284 result = converter()	286 result = converter()

285 self.missing_translations += converter.missing_translations	287 self.missing_translations += converter.missing_translations

286 self.total_translations += converter.total_translations	288 self.total_translations += converter.total_translations

287 return result	289 return result

288 raise Exception('Failed to resolve include {}'	290 raise Exception('Failed to resolve include {}'

289 'on page {}'.format(name, self._params['page']))	291 ' on page {}'.format(name, self._params['page']))

290	292

291 return re.sub(	293 return re.sub(

292 r'{}\?\sinclude\s+([^\s<>"]+)\s\?{}'.format(	294 r'{}\?\sinclude\s+([^\s<>"]+)\s\?{}'.format(

293 self.include_start_regex,	295 self.include_start_regex,

294 self.include_end_regex	296 self.include_end_regex

295 ),	297 ),

296 resolve_include,	298 resolve_include,

297 text	299 text

298 )	300 )

299	301

(...skipping 90 matching lines...)
390 continue	392 continue

391	393

392 path = os.path.join(dirname, filename)	394 path = os.path.join(dirname, filename)

393 namespace = self._params['source'].exec_file(path)	395 namespace = self._params['source'].exec_file(path)

394	396

395 name = os.path.basename(root)	397 name = os.path.basename(root)

396 try:	398 try:

397 dictionary[name] = namespace[name]	399 dictionary[name] = namespace[name]

398 except KeyError:	400 except KeyError:

399 raise Exception('Expected symbol {} not found'	401 raise Exception('Expected symbol {} not found'

400 'in {}'.format(name, path))	402 ' in {}'.format(name, path))
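The one-character change in the message above (and in the similar `' on page {}'` lines) comes down to implicit concatenation: adjacent string literals are joined with nothing in between, and `.format` then applies to the joined string. A minimal sketch, with hypothetical function names:

```python
def message_without_space(name, path):
    # OLD column: the two literals join as '...not foundin {}'
    return ('Expected symbol {} not found'
            'in {}'.format(name, path))


def message_with_space(name, path):
    # NEW column: the leading space keeps the words apart
    return ('Expected symbol {} not found'
            ' in {}'.format(name, path))
```

So `message_without_space('foo', 'bar.py')` runs the words together as "not foundin bar.py", which is the defect the NEW column corrects.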

401	403

402 self._env = jinja2.Environment(	404 self._env = jinja2.Environment(

403 loader=SourceTemplateLoader(self._params['source']),	405 loader=SourceTemplateLoader(self._params['source']),

404 autoescape=True)	406 autoescape=True)

405 self._env.filters.update(filters)	407 self._env.filters.update(filters)

406 self._env.globals.update(globals)	408 self._env.globals.update(globals)

407	409

408 def get_html(self, source, filename):	410 def get_html(self, source, filename):

409 env = self._env	411 env = self._env

410 code = env.compile(source, None, filename)	412 code = env.compile(source, None, filename)

(...skipping 66 matching lines...)
477 stack.pop()	479 stack.pop()

478 stack[-1]['subitems'].append(item)	480 stack[-1]['subitems'].append(item)

479 stack.append(item)	481 stack.append(item)

480 return structured	482 return structured

481	483

482 converters = {	484 converters = {

483 'html': RawConverter,	485 'html': RawConverter,

484 'md': MarkdownConverter,	486 'md': MarkdownConverter,

485 'tmpl': TemplateConverter,	487 'tmpl': TemplateConverter,

486 }	488 }
