|
|
Created:
Oct. 5, 2017, 9:41 p.m. by Sebastian Noack Modified:
Oct. 11, 2017, 7:50 p.m. Reviewers:
Vasily Kuznetsov CC:
tlucas Visibility:
Public. |
DescriptionNoissue - Improved accuracy of evaluated expressions for A103 and A207
Patch Set 1 #
Total comments: 5
Patch Set 2 : Rebased #Patch Set 3 : Changed behavior of A207 #
Total comments: 1
MessagesTotal messages: 9
https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... File flake8-eyeo/tests/A103.py (right): https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... flake8-eyeo/tests/A103.py:30: return dir == x This was causing an A103 (yoda expression), since we determine whether something is a constant expression with eval(). So that if it can successfully be executed without accessing any global or local variables, it is considered a constant expression, and constant expressions read weird on the left hand side of a comparison. However, eval('dir', {}) doesn't fail because "dir" is a builtin.
Hi Sebastian, I'm not completely convinced about this change, at least not yet. It seems that it does make quite a bit of sense to treat builtins as more or less constant. Do you have an example that shows a situation when one would want to use a builtin as a non-constant thing in a "false yoda expression"? Cheers, Vasily https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... File flake8-eyeo/tests/A103.py (right): https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... flake8-eyeo/tests/A103.py:30: return dir == x On 2017/10/05 22:07:11, Sebastian Noack wrote: > This was causing an A103 (yoda expression), since we determine whether something > is a constant expression with eval(). So that if it can successfully be executed > without accessing any global or local variables, it is considered a constant > expression, and constant expressions read weird on the left hand side of a > comparison. However, eval('dir', {}) doesn't fail because "dir" is a builtin. Isn't this kind of a genuine yoda expression though, since `dir` is more or less like a constant (and we even show another error if people try to redefine it)? Same for the other builtins.
Hey Sebastian, Please find my comment below. Naive as i am, shouldn't a change to flake8-eyeo also increase the version somehow? Best, Tristan https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... File flake8-eyeo/flake8_eyeo.py (right): https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... flake8-eyeo/flake8_eyeo.py:50: names = {'__builtins__': {'True': True, 'False': False, 'None': None}} Why are these values necessary? From what i understand, they only seem to be needed to get A207 (duplicate keys) evaluated correctly, which also seems to work out of the box for python3*. However, providing those values produces the following output for the given example: if False == foo: bar() if True == foo: bar() ./main.py:1:4: A103 yoda condition ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if not cond:' ./main.py:4:4: A103 yoda condition ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if cond:' If you would only set names to 'names = {'__builtins__': {}}' the following (IMHO more valuable) output would be produced: ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if not cond:' ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if cond: Since evaluate() is only called for is_const() (A103) or the A207 checks, you could add an argument 'names' to evaluate(), which could default to {}. What do you think about this?
https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... File flake8-eyeo/flake8_eyeo.py (right): https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... flake8-eyeo/flake8_eyeo.py:50: names = {'__builtins__': {'True': True, 'False': False, 'None': None}} On 2017/10/06 11:41:36, tlucas wrote: > Why are these values necessary? > > From what i understand, they only seem to be needed to get A207 (duplicate keys) > evaluated correctly, which also seems to work out of the box for python3*. > > However, providing those values produces the following output for the given > example: > > if False == foo: > bar() > > if True == foo: > bar() > > ./main.py:1:4: A103 yoda condition > ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if > not cond:' > ./main.py:4:4: A103 yoda condition > ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if > cond:' > > > If you would only set names to 'names = {'__builtins__': {}}' the following > (IMHO more valuable) output would be produced: > > ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if > not cond:' > ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if cond: > > > Since evaluate() is only called for is_const() (A103) or the A207 checks, you > could add an argument 'names' to evaluate(), which could default to {}. > > What do you think about this? True, False and None are special. They are the only builtins which are primitives (i.e. immutable values of scalar built-in types). Semantically they are more similar to numeric or string literals. That the Python tokenizer treats them as names which are then resolved as builtins, is just an implementation detail. However, in Python 3, they are read-only (and redefining them in Python 2 would break almost every program). So it is safe to assume that any occurrence of False, True and None in the source code refer to the original value, while other builtins are occasionally overridden. Hence treating every built-in but True/False/None as volatile seems logical, both for A103 and A207. Yes, `False == foo` is both violating A103 and E712. But from the different column they indicate, you can see that E712 is referring to the operator, while A103 is referring to the order of operands. So it is not really redundant. Also that means that `False is foo` only violates A103 but not E712. https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... File flake8-eyeo/tests/A103.py (right): https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... flake8-eyeo/tests/A103.py:30: return dir == x On 2017/10/06 10:46:21, Vasily Kuznetsov wrote: > On 2017/10/05 22:07:11, Sebastian Noack wrote: > > This was causing an A103 (yoda expression), since we determine whether > something > > is a constant expression with eval(). So that if it can successfully be > executed > > without accessing any global or local variables, it is considered a constant > > expression, and constant expressions read weird on the left hand side of a > > comparison. However, eval('dir', {}) doesn't fail because "dir" is a builtin. > > Isn't this kind of a genuine yoda expression though, since `dir` is more or less > like a constant (and we even show another error if people try to redefine it)? > Same for the other builtins. Another way to look at it is that builtin functions are rather like default globals than an inherit part of the programming language. Hence in the (rather theoretical) case of `dir == orig_dir` this doesn't quite sound like Yoda speak, while `0 == foo` or `False is foo` certainly does. Another (much more common) case is when builtin names (like "dir") are shadowed by local variables. It took me quite a while to figure out where the bogus A111 came from in this line: https://hg.adblockplus.org/buildtools/file/d4ca9bfb82b5/localeTools.py#l412
On 2017/10/06 18:11:10, Sebastian Noack wrote: > https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... > File flake8-eyeo/flake8_eyeo.py (right): > > https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/flake8_... > flake8-eyeo/flake8_eyeo.py:50: names = {'__builtins__': {'True': True, 'False': > False, 'None': None}} > On 2017/10/06 11:41:36, tlucas wrote: > > Why are these values necessary? > > > > From what i understand, they only seem to be needed to get A207 (duplicate > keys) > > evaluated correctly, which also seems to work out of the box for python3*. > > > > However, providing those values produces the following output for the given > > example: > > > > if False == foo: > > bar() > > > > if True == foo: > > bar() > > > > ./main.py:1:4: A103 yoda condition > > ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if > > not cond:' > > ./main.py:4:4: A103 yoda condition > > ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if > > cond:' > > > > > > If you would only set names to 'names = {'__builtins__': {}}' the following > > (IMHO more valuable) output would be produced: > > > > ./main.py:1:10: E712 comparison to False should be 'if cond is False:' or 'if > > not cond:' > > ./main.py:4:9: E712 comparison to True should be 'if cond is True:' or 'if > cond: > > > > > > Since evaluate() is only called for is_const() (A103) or the A207 checks, you > > could add an argument 'names' to evaluate(), which could default to {}. > > > > What do you think about this? > > True, False and None are special. They are the only builtins which are > primitives (i.e. immutable values of scalar built-in types). Semantically they > are more similar to numeric or string literals. That the Python tokenizer treats > them as names which are then resolved as builtins, is just an implementation > detail. However, in Python 3, they are read-only (and redefining them in Python > 2 would break almost every program). So it is safe to assume that any occurrence > of False, True and None in the source code refer to the original value, while > other builtins are occasionally overridden. Hence treating every built-in but > True/False/None as volatile seems logical, both for A103 and A207. > > Yes, `False == foo` is both violating A103 and E712. But from the different > column they indicate, you can see that E712 is referring to the operator, while > A103 is referring to the order of operands. So it is not really redundant. Also > that means that `False is foo` only violates A103 but not E712. > > https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... > File flake8-eyeo/tests/A103.py (right): > > https://codereview.adblockplus.org/29565854/diff/29565855/flake8-eyeo/tests/A... > flake8-eyeo/tests/A103.py:30: return dir == x > On 2017/10/06 10:46:21, Vasily Kuznetsov wrote: > > On 2017/10/05 22:07:11, Sebastian Noack wrote: > > > This was causing an A103 (yoda expression), since we determine whether > > something > > > is a constant expression with eval(). So that if it can successfully be > > executed > > > without accessing any global or local variables, it is considered a constant > > > expression, and constant expressions read weird on the left hand side of a > > > comparison. However, eval('dir', {}) doesn't fail because "dir" is a > builtin. > > > > Isn't this kind of a genuine yoda expression though, since `dir` is more or > less > > like a constant (and we even show another error if people try to redefine it)? > > Same for the other builtins. > > Another way to look at it is that builtin functions are rather like default > globals than an inherit part of the programming language. Hence in the (rather > theoretical) case of `dir == orig_dir` this doesn't quite sound like Yoda speak, > while `0 == foo` or `False is foo` certainly does. Another (much more common) > case is when builtin names (like "dir") are shadowed by local variables. It took > me quite a while to figure out where the bogus A111 came from in this line: > https://hg.adblockplus.org/buildtools/file/d4ca9bfb82b5/localeTools.py#l412 I agree that generating two errors for cases which are both yoda conditions and incorrect comparison operator is appropriate. Also, this is current behavior and there's no particularly good reason to change it. However, I'm still not convinced about the mutability of other builtins. In overwhelming majority of cases builtins won't be redefined, and we even have an error (A302) to prevent this. Given this, it makes sense to treat them as constants in comparisons (A103) and even more so in dictionary key and set element evaluation (A207). I can see how the example that you give is annoying, but the reason it ends up with an A103 is that it also deserves an A302. When our code gets more compliant, and error ignore lists are gone, you won't be wondering why you're getting an A103 somewhere, because you'll also see an A302 next to it and once that's fixed A103 will be gone too. The reasoning from above paragraph doesn't apply for non-essential builtins, like `file`, `apply`, etc. for which A302 will not be raised. I think relaxing A103 (and probably A207) for those could make sense, since we seem to be implying that it's ok to redefine those names. So basically it seems to me that A103 (and probably also A207) should be consistent with A302. Isn't this logical?
On 2017/10/09 20:14:22, Vasily Kuznetsov wrote: > I agree that generating two errors for cases which are both yoda conditions and > incorrect comparison operator is appropriate. Also, this is current behavior and > there's no particularly good reason to change it. > > However, I'm still not convinced about the mutability of other builtins. In > overwhelming majority of cases builtins won't be redefined, and we even have an > error (A302) to prevent this. Given this, it makes sense to treat them as > constants in comparisons (A103) and even more so in dictionary key and set > element evaluation (A207). > > I can see how the example that you give is annoying, but the reason it ends up > with an A103 is that it also deserves an A302. When our code gets more > compliant, and error ignore lists are gone, you won't be wondering why you're > getting an A103 somewhere, because you'll also see an A302 next to it and once > that's fixed A103 will be gone too. > > The reasoning from above paragraph doesn't apply for non-essential builtins, > like `file`, `apply`, etc. for which A302 will not be raised. I think relaxing > A103 (and probably A207) for those could make sense, since we seem to be > implying that it's ok to redefine those names. So basically it seems to me that > A103 (and probably also A207) should be consistent with A302. Isn't this > logical? Let's see it pragmatic: There is virtually no real case scenario where built-in functions are used as dict/set keys. There is virtually no scenario in which you get an A103 involving a built-in function on the left-hand where the A302 (which you get as well) isn't the actual/larger issue. But on the other hand, a common case we have in our existing code base is that A302 is ignored, and you find the shadowed builtin on left-hand side of an operation, causing a rather confusing A103. Now let's look at it semantically: A Yoda condition (A103) is an expression with a literal part on the left-hand side and a variable part on the right-hand side of an operation, so that it reads as if Yoda would say it (as opposed to natural English language), e.g. "10 being larger than 3 when you add to number" (i.e. "10 > 3 + n"). There are no real constants in Python, and even if you consider builtins constant (which they aren't) I cannot come up with a realistic example, so let's just pretend for a moment that there would be constants in Python (or just assume a different programming language): "The (constant) maximum number of connections is larger than the active number of connections (i.e. "MAX_CONNECTIONS > n_connection"). Does that sound like Yoda? For A207 this is less an issue, since we essentially just want to avoid duplicate keys. So even if we cannot determine the final value for sure, if the same name is given for two keys this is a duplicate. However, in that case it doesn't matter whether the name refers to a builtin, global or local variable. So that {dir, dir} gives and A207 while {x, x} doesn't is so inherently inconsistent and confusing that I consider it a positive side effect that this fix is eliminating this inconsistency. When I initially wrote this code, I never meant to consider built-ins (other than True, False and None). It ever was an unintended side effect that I merely accepted because back then, 1. I considered false positives extremely rare, 2. I didn't know that you can control the built-ins made available by eval().
On 2017/10/11 02:52:48, Sebastian Noack wrote: > On 2017/10/09 20:14:22, Vasily Kuznetsov wrote: > > I agree that generating two errors for cases which are both yoda conditions > and > > incorrect comparison operator is appropriate. Also, this is current behavior > and > > there's no particularly good reason to change it. > > > > However, I'm still not convinced about the mutability of other builtins. In > > overwhelming majority of cases builtins won't be redefined, and we even have > an > > error (A302) to prevent this. Given this, it makes sense to treat them as > > constants in comparisons (A103) and even more so in dictionary key and set > > element evaluation (A207). > > > > I can see how the example that you give is annoying, but the reason it ends up > > with an A103 is that it also deserves an A302. When our code gets more > > compliant, and error ignore lists are gone, you won't be wondering why you're > > getting an A103 somewhere, because you'll also see an A302 next to it and once > > that's fixed A103 will be gone too. > > > > The reasoning from above paragraph doesn't apply for non-essential builtins, > > like `file`, `apply`, etc. for which A302 will not be raised. I think relaxing > > A103 (and probably A207) for those could make sense, since we seem to be > > implying that it's ok to redefine those names. So basically it seems to me > that > > A103 (and probably also A207) should be consistent with A302. Isn't this > > logical? I'm a bit worried that we spend more time on this issue than what it deserves and that it's a bit slow to discuss this over a code review, especially when we're in very different timezones. Maybe we should find a better way to converge to something quicker. > Let's see it pragmatic: There is virtually no real case scenario where built-in > functions are used as dict/set keys. There is virtually no scenario in which you > get an A103 involving a built-in function on the left-hand where the A302 (which > you get as well) isn't the actual/larger issue. But on the other hand, a common > case we have in our existing code base is that A302 is ignored, and you find the > shadowed builtin on left-hand side of an operation, causing a rather confusing > A103. Indeed it's hard to find a real world case where built-in functions are used in set/dict keys, but I'm still not thrilled when I think that something like `{'foo': 1, str('foo'): 2}` would become an acceptable dict, while before it was correctly flagged as a somewhat broken one. This seems to be a regression, even though it's not very significant. > Now let's look at it semantically: A Yoda condition (A103) is an expression with > a literal part on the left-hand side and a variable part on the right-hand side > of an operation, so that it reads as if Yoda would say it (as opposed to natural > English language), e.g. "10 being larger than 3 when you add to number" (i.e. > "10 > 3 + n"). There are no real constants in Python, and even if you consider > builtins constant (which they aren't) I cannot come up with a realistic example, > so let's just pretend for a moment that there would be constants in Python (or > just assume a different programming language): "The (constant) maximum number of > connections is larger than the active number of connections (i.e. > "MAX_CONNECTIONS > n_connection"). Does that sound like Yoda? Yeah, this sounds a bit Yoda. > For A207 this is less an issue, since we essentially just want to avoid > duplicate keys. So even if we cannot determine the final value for sure, if the > same name is given for two keys this is a duplicate. However, in that case it > doesn't matter whether the name refers to a builtin, global or local variable. > So that {dir, dir} gives and A207 while {x, x} doesn't is so inherently > inconsistent and confusing that I consider it a positive side effect that this > fix is eliminating this inconsistency. Here the second example `{x, x}` is also broken and it's a pity that we can't detect this automatically. However, saying that because we can't detect that, we should also allow the first broken one, which we can detect sounds a bit strange. I'd prefer moving in direction of detecting more errors, not less. Otherwise yes, inconsistencies are annoying. > When I initially wrote this code, I never meant to consider built-ins (other > than True, False and None). It ever was an unintended side effect that I merely > accepted because back then, 1. I considered false positives extremely rare, 2. I > didn't know that you can control the built-ins made available by eval(). I'm still not very convinced about removing the check that I think makes sense in order to make our code, which we agree could use some fixing, pass the checker. On the other hand I don't think Yoda conditions are such a big deal that they would be worth all the time we spent on this review. But how about making a version that doesn't break A207?
On 2017/10/11 16:13:09, Vasily Kuznetsov wrote: > On 2017/10/11 02:52:48, Sebastian Noack wrote: > > There are no real constants in Python, and even if you consider > > builtins constant (which they aren't) I cannot come up with a > > realistic example, > > so let's just pretend for a moment that there would be constants in Python > > (or just assume a different programming language): "The (constant) maximum > > number of connections is larger than the active number of connections (i.e. > > "MAX_CONNECTIONS > n_connection"). Does that sound like Yoda? > > Yeah, this sounds a bit Yoda. It doesn't. > > For A207 this is less an issue, since we essentially just want to avoid > > duplicate keys. So even if we cannot determine the final value for sure, if > the > > same name is given for two keys this is a duplicate. However, in that case it > > doesn't matter whether the name refers to a builtin, global or local variable. > > So that {dir, dir} gives and A207 while {x, x} doesn't is so inherently > > inconsistent and confusing that I consider it a positive side effect that this > > fix is eliminating this inconsistency. > > Here the second example `{x, x}` is also broken and it's a pity that we can't > detect this automatically. However, saying that because we can't detect that, we > should also allow the first broken one, which we can detect sounds a bit > strange. I'd prefer moving in direction of detecting more errors, not less. > Otherwise yes, inconsistencies are annoying. > > > When I initially wrote this code, I never meant to consider built-ins (other > > than True, False and None). It ever was an unintended side effect that I > merely > > accepted because back then, 1. I considered false positives extremely rare, 2. > I > > didn't know that you can control the built-ins made available by eval(). > > I'm still not very convinced about removing the check that I think makes sense > in order to make our code, which we agree could use some fixing, pass the > checker. On the other hand I don't think Yoda conditions are such a big deal > that they would be worth all the time we spent on this review. But how about > making a version that doesn't break A207? There you go. I now only ignore builtins other than True/False/None only for A103, and fixed the inconsistency in A207 so that {x,x} will give you an error as well.
Allright, I think we can agree on this compromise :D LGTM https://codereview.adblockplus.org/29565854/diff/29573863/flake8-eyeo/flake8_... File flake8-eyeo/flake8_eyeo.py (right): https://codereview.adblockplus.org/29565854/diff/29573863/flake8-eyeo/flake8_... flake8-eyeo/flake8_eyeo.py:339: namespace = collections.defaultdict(object, vars(builtins)) This is clever! |