Issue 29319007: Issue 2711 - Refactored ChainedConfigParser, allowing manipulation of list items

Sebastian Noack

June 22, 2015, 9:53 p.m. (2015-06-22 21:53:44 UTC) #1

Wladimir Palant

June 23, 2015, 9:43 a.m. (2015-06-23 09:43:36 UTC) #2

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py
File chainedconfigparser.py (right):

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:37: have a source attribute serving the same purpose.
Extend documentation to mention the new += and -= syntax?

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:48: parser._read(file, filename)
Please use public API here:

  parser.readfp(file, filename)

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:68: old_value = self.get(section, option)
This will throw when trying to change an option that doesn't exist yet. While we
probably don't want to sweep this condition under the carpet, a meaningful
warning would do better IMHO.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:71: value = '%s %s' % (old_value, value)
I don't think that we want duplicate values - we are essentially treating the
option as a set. So adding a value that is already there shouldn't have an
effect. Maybe:

  existing = old_value.split()
  value = ' '.join(existing + [v for v in value.split() if v not in existing])

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:72: elif removal:
Nit: no need for elif, can be simply else. But I don't mind leaving it as is for
readability.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:73: value = re.sub(r'\b(?:%s)\b\s*' %
'|'.join(map(re.escape, value.split())), '', old_value).rstrip()
I don't think that regular expressions are the right tool here. How about:

  blacklist = value.split()
  value = ' '.join([v for v in old_value.split() if v not in blacklist])

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:77: def _read(self, file, filename):
Is it really a good idea to override private methods? There is relatively little
danger of this one changing of course but still. IMHO we should be overriding
public API - meaning read() and readfp(). This won't really add much to the
complexity.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:93: self._sections[section] = self._dict()
I cannot say that I like this hack. How about simply skipping section 'default'?
There is lots of magic attached to it, I don't think we want it for metadata
files. Would also make sense to have the "inherit" option no longer visible in
the result.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:95: self._origin[(section, option)] = filename
Ok, we have a problem right here... Consider the following scenario:

foo/bar/metadata.chrome defines document_start = foo.js. foo/metadata.chrome
defines document_start += bar.js. The resulting value is document_start = foo.js
bar.js. Where are foo.js and bar.js supposed to be located?

With this change they will both be looked up in foo/ which is wrong. foo.js
should definitely be looked up in foo/bar/. Big question is bar.js - would be
logical to assume that its origin is foo/. However, that would be inconsistent
with -= directives - these use the paths from the original option, not paths
relative to the current metadata file. So I think that bar.js should also be
located in foo/bar/, additions and removals simply shouldn't change the origin
of the option.

Sebastian Noack

June 24, 2015, 8:51 a.m. (2015-06-24 08:51:10 UTC) #3

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py
File chainedconfigparser.py (right):

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:37: have a source attribute serving the same purpose.
On 2015/06/23 09:43:36, Wladimir Palant wrote:
> Extend documentation to mention the new += and -= syntax?

Done.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:48: parser._read(file, filename)
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> Please use public API here:
> 
>   parser.readfp(file, filename)

This code was removed while addressing the comment below.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:68: old_value = self.get(section, option)
On 2015/06/23 09:43:36, Wladimir Palant wrote:
> This will throw when trying to change an option that doesn't exist yet. While
> we probably don't want to sweep this condition under the carpet, a meaningful
> warning would do better IMHO.

Done.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:71: value = '%s %s' % (old_value, value)
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> I don't think that we want duplicate values - we are essentially treating the
> option as a set. So adding a value that is already there shouldn't have an
> effect. Maybe:
> 
>   existing = old_value.split()
>   value = ' '.join(existing + [v for v in value.split() if v not in existing])

Done.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:72: elif removal:
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> Nit: no need for elif, can be simply else. But I don't mind leaving it as is
> for readability.

Done.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:73: value = re.sub(r'\b(?:%s)\b\s*' %
'|'.join(map(re.escape, value.split())), '', old_value).rstrip()
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> I don't think that regular expressions are the right tool here. How about:
> 
>   blacklist = value.split()
>   value = ' '.join([v for v in old_value.split() if v not in blacklist])

Done.
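
For readers following along, the two suggestions adopted above (no duplicates on addition, plain list filtering on removal) can be sketched as a single helper. This is only an illustration: the name apply_list_diff and its removal flag are hypothetical and are not taken from the patch.

```python
import re


def apply_list_diff(old_value, value, removal=False):
    """Treat a whitespace-separated option value as an ordered set.

    Hypothetical helper; it mirrors the snippets suggested in the review.
    """
    existing = old_value.split()
    changes = value.split()
    if removal:
        # Keep every existing item that is not listed for removal.
        result = [v for v in existing if v not in changes]
    else:
        # Append only the items that are not present yet.
        result = existing + [v for v in changes if v not in existing]
    return ' '.join(result)


# The regular expression from the first patch set, for comparison: it also
# matches inside longer names because '-' counts as a word boundary.
regex_result = re.sub(r'\b(?:%s)\b\s*' % '|'.join(map(re.escape, ['foo.js'])),
                      '', 'my-foo.js bar.js').rstrip()
list_result = apply_list_diff('my-foo.js bar.js', 'foo.js', removal=True)
```

With the regular expression, removing foo.js from "my-foo.js bar.js" damages the unrelated entry, while the list-based version leaves the value unchanged.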

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:77: def _read(self, file, filename):
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> Is it really a good idea to override private methods? There is relatively
> little danger of this one changing of course but still. IMHO we should be
> overriding public API - meaning read() and readfp(). This won't really add
> much to the complexity.

Overriding read() and readfp() would require quite some additional complexity
and duplication. I went for only supporting read() now, which doesn't seem to be
too bad.

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:93: self._sections[section] = self._dict()
On 2015/06/23 09:43:36, Wladimir Palant wrote:
> I cannot say that I like this hack. How about simply skipping section
> 'default'? There is lots of magic attached to it, I don't think we want it for
> metadata files. Would also make sense to have the "inherit" option no longer
> visible in the result.

"default" != "DEFAULT". The latter is handled specially and is already ignored
by our code. However, due to a bug, add_section() checks for "DEFAULT"
case-insensitively and therefore refuses to add a section called "default", even
though the rest of the API handles "default" as a regular section.
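
A minimal sketch of the distinction, written against Python 3's configparser so that it runs today (an assumption; the code under review targets the older ConfigParser module, whose add_section() is the one with the case-insensitive check). The option names shared and own are made up for the example.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[DEFAULT]
shared = 1

[default]
own = 2
""")

# "DEFAULT" is the special section: it is not listed, but its options are
# visible from every other section. "default" is just a regular section.
regular_sections = parser.sections()
inherited = parser.get('default', 'shared')
own = parser.get('default', 'own')
```

Here only the lowercase section shows up in sections(), and it still sees the option defined under DEFAULT.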

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser...
chainedconfigparser.py:95: self._origin[(section, option)] = filename
On 2015/06/23 09:43:35, Wladimir Palant wrote:
> Ok, we have a problem right here... Consider the following scenario:
> 
> foo/bar/metadata.chrome defines document_start = foo.js. foo/metadata.chrome
> defines document_start += bar.js. The resulting value is document_start =
> foo.js bar.js. Where are foo.js and bar.js supposed to be located?
> 
> With this change they will both be looked up in foo/ which is wrong. foo.js
> should definitely be looked up in foo/bar/. Big question is bar.js - would be
> logical to assume that its origin is foo/. However, that would be inconsistent
> with -= directives - these use the paths from the original option, not paths
> relative to the current metadata file. So I think that bar.js should also be
> located in foo/bar/, additions and removals simply shouldn't change the origin
> of the option.

I first thought that we could simply rely on the source of the addition/removal,
as it is merely a shorthand syntax for repeating the pre-existing items. But you
are right.

Ideally, we would resolve the filenames relative to the location of the metadata
file they occurred in, before applying the diff. However, then we have to assume
that the option lists only filenames.

So without adding vague assumptions or too much complexity, it seems that the
best we can do is indeed to preserve the source of the original option when
applying diffs. :(
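
The conclusion above can be sketched roughly as follows. This is not the code from the patch: merge_options, the layers structure and the operator strings are hypothetical, and the warning for a missing option is only one possible reading of the earlier comment.

```python
import warnings


def merge_options(layers):
    """Merge option directives from a chain of config files.

    layers is a list of (filename, directives) pairs, applied in order;
    each directive is an (option, operator, value) tuple where operator
    is '=', '+=' or '-='. Returns (values, origin).
    """
    values = {}
    origin = {}
    for filename, directives in layers:
        for option, operator, value in directives:
            if operator == '=':
                # A plain assignment replaces the value and its origin.
                values[option] = value
                origin[option] = filename
                continue
            if operator not in ('+=', '-='):
                raise ValueError('Unknown operator: %r' % operator)
            if option not in values:
                warnings.warn('%s: cannot modify unset option %r'
                              % (filename, option))
                continue
            existing = values[option].split()
            changes = value.split()
            if operator == '+=':
                existing += [v for v in changes if v not in existing]
            else:
                existing = [v for v in existing if v not in changes]
            values[option] = ' '.join(existing)
            # The origin is deliberately left untouched for += and -=.
    return values, origin


values, origin = merge_options([
    ('foo/bar/metadata.chrome', [('document_start', '=', 'foo.js')]),
    ('foo/metadata.chrome', [('document_start', '+=', 'bar.js')]),
])
```

For the scenario from the review, document_start ends up as "foo.js bar.js" while its origin stays foo/bar/metadata.chrome, so both files would be looked up in foo/bar/.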

Wladimir Palant

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() On 2015/06/24 08:51:09, Sebastian Noack wrote: ...

June 25, 2015, 2:11 p.m. (2015-06-25 14:11:40 UTC) #4

Sebastian Noack

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() On 2015/06/25 14:11:40, Wladimir Palant wrote: ...

June 25, 2015, 4:06 p.m. (2015-06-25 16:06:28 UTC) #5

Wladimir Palant

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() On 2015/06/25 16:06:28, Sebastian Noack wrote: ...

June 25, 2015, 4:12 p.m. (2015-06-25 16:12:28 UTC) #6

Sebastian Noack

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() On 2015/06/25 16:12:28, Wladimir Palant wrote: ...

June 25, 2015, 11:05 p.m. (2015-06-25 23:05:31 UTC) #7

Wladimir Palant

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() On 2015/06/25 23:05:31, Sebastian Noack wrote: ...

June 26, 2015, 1:17 p.m. (2015-06-26 13:17:22 UTC) #8

Sebastian Noack

https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29319014/chainedconfigparser.py#newcode93 chainedconfigparser.py:93: self._sections[section] = self._dict() So your concerns here are only ...

June 26, 2015, 1:37 p.m. (2015-06-26 13:37:53 UTC) #9

Sebastian Noack

I found two regressions, fixed with the new patch set: 1. Config files weren't decoded ...

July 7, 2015, 1:09 p.m. (2015-07-07 13:09:50 UTC) #11

Wladimir Palant

https://codereview.adblockplus.org/29319007/diff/29321431/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29321431/chainedconfigparser.py#newcode47 chainedconfigparser.py:47: As opposed to ChainedConfigParser, files are decoded as UTF-8 ...

July 7, 2015, 2:53 p.m. (2015-07-07 14:53:18 UTC) #12

Sebastian Noack

https://codereview.adblockplus.org/29319007/diff/29321431/chainedconfigparser.py File chainedconfigparser.py (right): https://codereview.adblockplus.org/29319007/diff/29321431/chainedconfigparser.py#newcode47 chainedconfigparser.py:47: As opposed to ChainedConfigParser, files are decoded as UTF-8 ...

July 7, 2015, 3:20 p.m. (2015-07-07 15:20:44 UTC) #13

LGTM

Issue 29319007: Issue 2711 - Refactored ChainedConfigParser, allowing manipulation of list items (Closed)

Patch Set 1

Patch Set 2: Addressed comments

Patch Set 3: Don't rely on internal APIs to work around add_section("default") bug

Patch Set 4: Unimplement add_section as well

Patch Set 5: Don't convert options to lowercase and decode config files as UTF-8

Patch Set 6: Addressed comments