|
|
Created:
May 20, 2017, 11:50 p.m. by Manish Jethani Modified:
May 24, 2017, 12:06 p.m. Base URL:
https://hg.adblockplus.org/abp2blocklist Visibility:
Public. |
DescriptionNoissue - Do not add subdomain wildcard if subdomain excluded
For a filter like this:
foo$domain=bar.com|~de.bar.com
We should generate a rule like this:
[
{
"trigger": {
"url-filter": "^https?://.*foo",
"resource-type": [
...
],
"if-domain": [
"bar.com"
]
},
...
}
]
i.e. no subdomain wildcard, because at least one subdomain has been excluded.
Patch Set 1 #Patch Set 2 : Add "www" prefix #
Total comments: 9
Patch Set 3 : Add unit tests #Patch Set 4 : Use includes #Patch Set 5 : Rebase #MessagesTotal messages: 15
Patch Set 1
Maybe we should add the "www." In this case.
On 2017/05/21 00:34:23, Manish Jethani wrote: > Maybe we should add the "www." In this case. Yea, I think that's a good idea. At least then the new logic wont be any worse for this cases.
Patch Set 2: Add "www" prefix
What's about tests for this scenario? https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:293: if ((filter instanceof filterClasses.BlockingFilter || Ẁhy do we have to explicitly check for BlockingFilter/ElemHideFilter here? https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) Is this check actually correct? What if the domain is foo.www.example.com?
Patch Set 3: Add unit tests https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:293: if ((filter instanceof filterClasses.BlockingFilter || On 2017/05/21 20:56:13, Sebastian Noack wrote: > Ẁhy do we have to explicitly check for BlockingFilter/ElemHideFilter here? Because what is desired in the case of whitelisting filters is the exact opposite. For example, "@@foo$domain=bar.com|~evil.bar.com" should still generate "if-domain='*bar.com'". i.e. we want to err on the side of not blocking rather than blocking. Correct me if I'm wrong. https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) On 2017/05/21 20:56:13, Sebastian Noack wrote: > Is this check actually correct? What if the domain is foo.www.example.com? Can you give a more complete example of where this might fail?
https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:293: if ((filter instanceof filterClasses.BlockingFilter || On 2017/05/21 21:49:05, Manish Jethani wrote: > On 2017/05/21 20:56:13, Sebastian Noack wrote: > > Ẁhy do we have to explicitly check for BlockingFilter/ElemHideFilter here? > > Because what is desired in the case of whitelisting filters is the exact > opposite. For example, mailto:"@@foo$domain=bar.com|~evil.bar.com" should still > generate "if-domain='*bar.com'". i.e. we want to err on the side of not blocking > rather than blocking. > > Correct me if I'm wrong. That might be a good point. It seems to be better to let some ads through occasionally than potentially blocking too much breaking some website. Perhaps the comment here should explain this though. https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) On 2017/05/21 21:49:05, Manish Jethani wrote: > On 2017/05/21 20:56:13, Sebastian Noack wrote: > > Is this check actually correct? What if the domain is foo.www.example.com? > > Can you give a more complete example of where this might fail? So in case of a filter like this: $domain=example.com|~foo.www.example.com as far as I understand we wouldn't add www.example.com, though doing so wouldn't hurt a we don't add a wildcard, right? Or asked differently, any reason to not just use `notSubdomains[0] != "www"`?
https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) On 2017/05/22 08:06:11, Sebastian Noack wrote: > On 2017/05/21 21:49:05, Manish Jethani wrote: > > On 2017/05/21 20:56:13, Sebastian Noack wrote: > > > Is this check actually correct? What if the domain is foo.www.example.com? > > > > Can you give a more complete example of where this might fail? > > So in case of a filter like this: > > $domain=example.com|~foo.www.example.com > > as far as I understand we wouldn't add http://www.example.com, though doing so wouldn't > hurt a we don't add a wildcard, right? > > Or asked differently, any reason to not just use `notSubdomains[0] != "www"`? In this case the code would be comparing with "foo.www" rather than just "www", so it would go ahead and add the "www." prefix.
https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) On 2017/05/22 10:37:39, Manish Jethani wrote: > On 2017/05/22 08:06:11, Sebastian Noack wrote: > > On 2017/05/21 21:49:05, Manish Jethani wrote: > > > On 2017/05/21 20:56:13, Sebastian Noack wrote: > > > > Is this check actually correct? What if the domain is foo.www.example.com? > > > > > > Can you give a more complete example of where this might fail? > > > > So in case of a filter like this: > > > > $domain=example.com|~foo.www.example.com > > > > as far as I understand we wouldn't add http://www.example.com, though doing so > wouldn't > > hurt a we don't add a wildcard, right? > > > > Or asked differently, any reason to not just use `notSubdomains[0] != "www"`? > > In this case the code would be comparing with "foo.www" rather than just "www", > so it would go ahead and add the "www." prefix. Alright. But I think it still can be slightly simplified by using includes() instead of some().
Patch Set 4: Use includes https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.js File lib/abp2blocklist.js (right): https://codereview.adblockplus.org/29443587/diff/29444555/lib/abp2blocklist.j... lib/abp2blocklist.js:300: if (!notSubdomains.some(name => name == "www")) On 2017/05/22 11:08:59, Sebastian Noack wrote: > On 2017/05/22 10:37:39, Manish Jethani wrote: > > On 2017/05/22 08:06:11, Sebastian Noack wrote: > > > On 2017/05/21 21:49:05, Manish Jethani wrote: > > > > On 2017/05/21 20:56:13, Sebastian Noack wrote: > > > > > Is this check actually correct? What if the domain is > foo.www.example.com? > > > > > > > > Can you give a more complete example of where this might fail? > > > > > > So in case of a filter like this: > > > > > > $domain=example.com|~foo.www.example.com > > > > > > as far as I understand we wouldn't add http://www.example.com, though doing > so > > wouldn't > > > hurt a we don't add a wildcard, right? > > > > > > Or asked differently, any reason to not just use `notSubdomains[0] != > "www"`? > > > > In this case the code would be comparing with "foo.www" rather than just > "www", > > so it would go ahead and add the "www." prefix. > > Alright. But I think it still can be slightly simplified by using includes() > instead of some(). Done.
LGTM
LGTM
Patch Set 5: Rebase (Is it normal to resubmit for review after a LGTM+rebase even if there are no conflicts during the merge?)
On 2017/05/23 17:05:51, Manish Jethani wrote: > Patch Set 5: Rebase > > (Is it normal to resubmit for review after a LGTM+rebase even if there are no > conflicts during the merge?) I don't think we always do it if there are no conflicts. But it's actually possible that an automatic merge breaks something. So at least make sure to test your changes after merges. Perhaps, we should also always re-review in this case, but I guess this would have to be discussed somewhere else. Still LGTM. |