|
|
Created:
April 20, 2018, 6:41 p.m. by kzar Modified:
May 4, 2018, 1:32 p.m. Reviewers:
Manish Jethani CC:
sergei Visibility:
Public. |
DescriptionIssue 6562 - Remove filter keys from the element hiding code
Patch Set 1 #
Total comments: 8
Patch Set 2 : Avoid adding duplicate hiding filters #
Total comments: 15
Patch Set 3 : Addressed nits #
Total comments: 12
Patch Set 4 : Don't note if a filter is genericly disabled #Patch Set 5 : Only note excluded filters and only if the domain isn't "" #Patch Set 6 : Avoid checking currentDomain == "" more than necessary #
Total comments: 6
Patch Set 7 : Address some final feedback #
Total comments: 2
Patch Set 8 : == instead of === #MessagesTotal messages: 22
Patch Set 1 I've checked and the performance seems about the same, but my test was simplistic: let {ElemHide} = require("elemHide"); (function() { var start = performance.now(); for (var i = 0; i < 1000; i += 1) ElemHide.getSelectorsForDomain("www.extremetech.com"); console.log("RESULT", performance.now() - start); }()); I've also confirmed the tests still pass and smoke tested element hiding. Next week I'll test a bit more thoroughly and double check the tests still pass in between the three commits, but I've run out of time for today.
https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:30: * @type {Map.<string,Map.<Filter,boolean>>} I am unsure about this one. Previously we needed the layer of indirection since we needed a filter key to get to the filter anyway. But now we don't any more, so I wonder if we could avoid the nested maps, while still supporting element hiding filters which specifically don't apply to a domain. One idea I had was to instead have two lookups, filtersByDomain and filtersByDomainExcluded or something like that. What do you guys think? Any ideas?
https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#old... lib/elemHide.js:142: if (keyByFilter.has(filter)) If we already have the filter, we don't add it again. The same is done for exceptions above (knownExceptions). So I think we should keep this, perhaps combine the two so that now we simply have a "knownFilters" and we skip this entire function is knownFilters contains the filter. https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:30: * @type {Map.<string,Map.<Filter,boolean>>} On 2018/04/20 18:52:38, kzar wrote: > I am unsure about this one. Previously we needed the layer of indirection since > we needed a filter key to get to the filter anyway. But now we don't any more, > so I wonder if we could avoid the nested maps, while still supporting element > hiding filters which specifically don't apply to a domain. > > One idea I had was to instead have two lookups, filtersByDomain and > filtersByDomainExcluded or something like that. What do you guys think? Any > ideas? Unless I'm missing something, filtersByDomain is only needed in _addToFiltersByDomain, and that function is only interested in the "included" filters. Then perhaps we could skip the excluded ones and make this a Map.<string,Set.<Filter>> instead?
https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:30: * @type {Map.<string,Map.<Filter,boolean>>} On 2018/04/21 11:11:57, Manish Jethani wrote: > Unless I'm missing something, filtersByDomain is only needed in > _addToFiltersByDomain, and that function is only interested in the "included" > filters. It's also needed by getSelectorsForDomain, which is actually the whole reason why we maintain it. > Then perhaps we could skip the excluded ones and make this a > Map.<string,Set.<Filter>> instead? I don't think we can skip the exlucded ones, but we could potentially have two lookups like you suggest, one for included and one for excluded.
https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:30: * @type {Map.<string,Map.<Filter,boolean>>} On 2018/04/21 12:01:00, kzar wrote: > On 2018/04/21 11:11:57, Manish Jethani wrote: > > Unless I'm missing something, filtersByDomain is only needed in > > _addToFiltersByDomain, and that function is only interested in the "included" > > filters. > > It's also needed by getSelectorsForDomain, which is actually the whole reason > why we maintain it. Sorry, I meant getSelectorForDomain there. getSelectorsForDomain is only interested in the "included" filters, so it could just be a Set of those filters?
Patch Set 2 : Avoid adding duplicate hiding filters https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#old... lib/elemHide.js:142: if (keyByFilter.has(filter)) On 2018/04/21 11:11:57, Manish Jethani wrote: > If we already have the filter, we don't add it again. The same is done for > exceptions above (knownExceptions). So I think we should keep this, perhaps > combine the two so that now we simply have a "knownFilters" and we skip this > entire function is knownFilters contains the filter. Yea, I think you're right. I've gone back through some of the early commits and it seems like the intention was always to avoid adding duplicate hiding or exception filters here. Done. https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:30: * @type {Map.<string,Map.<Filter,boolean>>} On 2018/04/22 11:35:42, Manish Jethani wrote: > Sorry, I meant getSelectorForDomain there. getSelectorsForDomain is only > interested in the "included" filters, > so it could just be a Set of those filters? OK, I read back through the old discussions to refresh my memory. Take the example filter ~foo.example.com,example.com##foo , if we call getSelectorsForDomain for foo.example.com we'll first find the excluded filter for foo.example.com, then the included filter for example.com. The included filter is (rightly) ignored since we already saw it without using it earlier. If we didn't keep track of the excluded filter for foo.example.com we'd only find the included filter for example.com and so that filter would have (wrongly) been used. I had a think about this for a while and can't see much of a better way to do it. https://codereview.adblockplus.org/29757591/diff/29757592/lib/elemHide.js#new... lib/elemHide.js:165: return; Note: I think this was a bug, we wouldn't emit the "elemhideupdate" event when removing unconditionally applied hiding filters.
https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:55: * @type {Set.<string>} This is really {Set.<ElemHideBase>} https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:57: let knownFilters = new Set(); I wonder if we should make this a WeakSet since we never need to iterate over it. We are holding on to the filter references elsewhere? https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:98: * @param {ElemHideFilter} filter This is really {ElemHideBase} https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:143: * @param {ElemHideFilter} filter This is really {ElemHideBase} https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) Just a thought, but I wonder if we could do this (I haven't thought this through): let excludedFilterSets = []; while (true) { excludedFilterSets.push(excludedFiltersByDomain.get(currentDomain)); for (let filter of includedFiltersByDomain.get(currentDomain)) { for (let set of excludedFilterSets) { if (set.has(filter)) { filter = null; break; } } if (filter && !this.getException(filter, domain)) selectors.push(filter.selector); } ... } So I think this is the idea you were toying with, I see it now. We could have two sets, one for excluded and one for included. I'm not sure if this is better though. It would mean fewer iterations but more set lookups. Ultimately it depends on the kinds of filters we have. https://codereview.adblockplus.org/29757591/diff/29762586/test/filterListener.js File test/filterListener.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/test/filterListener... test/filterListener.js:84: for (let filter of elemHide.knownFilters.values()) Call .values is redundant here, but if you prefer it then it's fine.
https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/27 12:39:36, Manish Jethani wrote: > Just a thought, but I wonder if we could do this (I haven't thought this > through): > [...] Anyway, I'm all for not changing things here as part of this patch.
https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/27 12:39:36, Manish Jethani wrote: > So I think this is the idea you were toying with, I see it now. We could have > two sets, one for excluded and one for included. I'm not sure if this is better > though. It would mean fewer iterations but more set lookups. Ultimately it > depends on the kinds of filters we have. Exactly, that's what I was playing with. Except perhaps we would have to keep track of the already seen included filters as well if we want to keep behaviour the same... but not sure it matters. I'm also not clear this is a change worth making, I could see aspects of both which are nicer. Maintaining two sets would also have (even) more complexity for the filter adding and removing code. > Anyway, I'm all for not changing things here as part of this patch. Yea, I don't want to do it for now I think. Well unless I can be convinced it's clearly better! What do you think?
https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/27 17:01:50, kzar wrote: > On 2018/04/27 12:39:36, Manish Jethani wrote: > > So I think this is the idea you were toying with, I see it now. We could have > > two sets, one for excluded and one for included. I'm not sure if this is > better > > though. It would mean fewer iterations but more set lookups. Ultimately it > > depends on the kinds of filters we have. > > Exactly, that's what I was playing with. Except perhaps we would have to keep > track of the already seen included filters as well if we want to keep behaviour > the same... but not sure it matters. That's what excludedFilterSets in the above code snippet was. Anyway, I did some tests, the performance seems to improve for EasyList. Here's the patch I tested: https://codereview.adblockplus.org/29764555/ > I'm also not clear this is a change worth making, I could see aspects of both > which are nicer. Maintaining two sets would also have (even) more complexity for > the filter adding and removing code. So I think filter adding and removing happens less often than lookups for filters by domain, which means the lookup code should perform better. > > Anyway, I'm all for not changing things here as part of this patch. > > Yea, I don't want to do it for now I think. Well unless I can be convinced it's > clearly better! What do you think? It seems to be better. Let's take this discussion to the other patch that I've uploaded. This one can focus on the original issue.
Patch Set 3 : Addressed nits https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:55: * @type {Set.<string>} On 2018/04/27 12:39:36, Manish Jethani wrote: > This is really {Set.<ElemHideBase>} Done. https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:57: let knownFilters = new Set(); On 2018/04/27 12:39:36, Manish Jethani wrote: > I wonder if we should make this a WeakSet since we never need to iterate over > it. We are holding on to the filter references elsewhere? I considered that as well but we keep a reference to exceptions in the exceptions lookup, the rest in filtersByDomain and filtersBySelector anyway. If we remove the filter from those lookups but not from this one then we've got a bug. (We do iterate over this in the tests, not that that's a good reason to avoid making this a WeakSet.) https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:98: * @param {ElemHideFilter} filter On 2018/04/27 12:39:36, Manish Jethani wrote: > This is really {ElemHideBase} Well spotted, Done. https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/28 16:22:17, Manish Jethani wrote: > On 2018/04/27 17:01:50, kzar wrote: > > Except perhaps we would have to keep track of the already seen > > included filters as well if we want to keep behaviour > > the same... but not sure it matters. > > That's what excludedFilterSets in the above code snippet was. Well no, excludedFilterSets contained the excluded filters so far. I was talking about the included filters so far. > Anyway, I did some tests, the performance seems to improve for EasyList. Here's > the patch I tested: > > https://codereview.adblockplus.org/29764555/ Thanks for looking into it some more. You mention the performance improving, could you share how you tested and the numbers? > So I think filter adding and removing happens less often than lookups for > filters by domain, which means the lookup code should perform better. Interesting insight. https://codereview.adblockplus.org/29757591/diff/29762586/test/filterListener.js File test/filterListener.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/test/filterListener... test/filterListener.js:84: for (let filter of elemHide.knownFilters.values()) On 2018/04/27 12:39:36, Manish Jethani wrote: > Call .values is redundant here, but if you prefer it then it's fine. Done.
https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29762586/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/30 10:11:46, kzar wrote: > [...] > > > Anyway, I did some tests, the performance seems to improve for EasyList. > Here's > > the patch I tested: > > > > https://codereview.adblockplus.org/29764555/ > > Thanks for looking into it some more. You mention the performance improving, > could you share how you tested and the numbers? Let's do that as part of the other patch. https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:92: filters.set(filter, isIncluded); I think we should change this to: if (domain || isIncluded) filters.set(filter, isIncluded); I don't see why we would add an entry if the key is an empty string and the value is false. It is not necessary, it doesn't do anything. https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) So your comment about included vs excluded filters makes me wonder why we should add included filters to the seenFilters set. Isn't it really just about keeping track of exceptions on a subdomain? e.g. ~no.example.com,example.com##foo. Should we really care if there's a filter like cool.example.com,example.com##foo and it ends up causing the function to slow down and have the selector included more than once? I think maybe it's a good idea to change it to this: for (let [filter, isIncluded] of filters) { if (!isIncluded) { seenFilters.add(filter); } else if (!seenFilters.has(filter) && !this.getException(filter, domain)) { selectors.push(filter.selector); } } Can you think of a case where this would break?
Patch Set 4 : Don't note if a filter is generically disabled https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:92: filters.set(filter, isIncluded); On 2018/04/30 17:55:21, Manish Jethani wrote: > I think we should change this to: > > if (domain || isIncluded) > filters.set(filter, isIncluded); > > I don't see why we would add an entry if the key is an empty string and the > value is false. It is not necessary, it doesn't do anything. Good point, you're right and I checked and this applies to quite a lot of filters in practice. Done, except I've made it explicit + avoided a pointless filtersByDomain lookup. https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/04/30 17:55:21, Manish Jethani wrote: > So your comment about included vs excluded filters makes me wonder why we should > add included filters to the seenFilters set. Isn't it really just about keeping > track of exceptions on a subdomain? e.g. ~no.example.com,example.com##foo. > Should we really care if there's a filter like cool.example.com,example.com##foo > and it ends up causing the function to slow down and have the selector included > more than once? > > I think maybe it's a good idea to change it to this: > > for (let [filter, isIncluded] of filters) > { > if (!isIncluded) > { > seenFilters.add(filter); > } > else if (!seenFilters.has(filter) && > !this.getException(filter, domain)) > { > selectors.push(filter.selector); > } > } > > Can you think of a case where this would break? This is pretty much the same point you made on your review[1]. It's interesting, but I'm not sure either way about it yet. Regardless I don't see the point of making this change here, I'd rather save that discussion for your review. [1] - https://codereview.adblockplus.org/29764555/diff/29764556/lib/elemHide.js#new...
https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/05/01 16:13:59, kzar wrote: > On 2018/04/30 17:55:21, Manish Jethani wrote: > > So your comment about included vs excluded filters makes me wonder why we > should > > add included filters to the seenFilters set. Isn't it really just about > keeping > > track of exceptions on a subdomain? e.g. ~no.example.com,example.com##foo. > > Should we really care if there's a filter like > cool.example.com,example.com##foo > > and it ends up causing the function to slow down and have the selector > included > > more than once? > > > > I think maybe it's a good idea to change it to this: > > > > for (let [filter, isIncluded] of filters) > > { > > if (!isIncluded) > > { > > seenFilters.add(filter); > > } > > else if (!seenFilters.has(filter) && > > !this.getException(filter, domain)) > > { > > selectors.push(filter.selector); > > } > > } > > > > Can you think of a case where this would break? > > This is pretty much the same point you made on your review[1]. It's interesting, > but I'm not sure either way about it yet. Regardless I don't see the point of > making this change here, I'd rather save that discussion for your review. > > [1] - > https://codereview.adblockplus.org/29764555/diff/29764556/lib/elemHide.js#new... Fair enough. In that case, do you think it makes sense to check if currentDomain is non-empty before adding the filter to the seenFilters set? if (currentDomain != "") seenFilters.add(filter); Adding to a set also has some cost, we're not going to look up those entries since we'll be exiting the loop. Since there are so many generic filters typically, it should help if we skipped that operation for generic filters. It might also make the lookup faster. Think about it, for a random domain there are no domain-specific filters and no exceptions, it's just a bunch of generic filters. A lookup in an empty set should be instantaneous.
https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/05/02 12:28:11, Manish Jethani wrote: > [...] > might also make the lookup faster. Think about it, for a random domain there are > no domain-specific filters and no exceptions, it's just a bunch of generic > filters. A lookup in an empty set should be instantaneous. I just checked here and with just EasyList enabled the set is populated with ~1,100 entries that are then never looked up. By adding that check we reduce the size of the set to 0. I also checked if it helps to check the size before doing a lookup: (function () { let set = new Set(); let count = 0; let start = Date.now(); for (let i = 0; i < 10000000; i++) { if (set.size > 0 && set.has("foo")) count++; //if (Math.random() < .01) // set.add("foo"); } console.log("time:", Date.now() - start); console.log("count:", count); })(); The performance improves drastically if we check the value of the size property before doing a lookup, if the set is empty. If the set is not empty, the performance doesn't seem to degrade if we check the size property (uncomment the two lines so the set gets populated). So I think our checks should be: if (seenFilters.size > 0 && seenFilters.has(filter)) continue; if (currentDomain != "") seenFilters.add(filter); https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:272: if (currentDomain == "") I'm wondering if we should update these checks: if (!currentDomain) break;
Patch Set 5 : Only note excluded filters and only if the domain isn't "" Patch Set 6 : Avoid checking currentDomain == "" more than necessary https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:261: for (let [filter, isIncluded] of filters) On 2018/05/02 12:57:34, Manish Jethani wrote: > On 2018/05/02 12:28:11, Manish Jethani wrote: > > [...] > > might also make the lookup faster. Think about it, for a random domain there > are > > no domain-specific filters and no exceptions, it's just a bunch of generic > > filters. A lookup in an empty set should be instantaneous. > > I just checked here and with just EasyList enabled the set is populated with > ~1,100 entries that are then never looked up. By adding that check we reduce the > size of the set to 0. > > I also checked if it helps to check the size before doing a lookup: > > (function () > { > let set = new Set(); > > let count = 0; > let start = Date.now(); > > for (let i = 0; i < 10000000; i++) > { > if (set.size > 0 && set.has("foo")) > count++; > > //if (Math.random() < .01) > // set.add("foo"); > } > > console.log("time:", Date.now() - start); > console.log("count:", count); > })(); > > The performance improves drastically if we check the value of the size property > before doing a lookup, if the set is empty. If the set is not empty, the > performance doesn't seem to degrade if we check the size property (uncomment the > two lines so the set gets populated). > > So I think our checks should be: > > if (seenFilters.size > 0 && seenFilters.has(filter)) > continue; > > if (currentDomain != "") > seenFilters.add(filter); You make some good points and I think on reflection you're probably right about there not being much use in ignoring filters that were previously included. Done. https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:272: if (currentDomain == "") On 2018/05/02 12:57:34, Manish Jethani wrote: > I'm wondering if we should update these checks: > > if (!currentDomain) > break; Well I think it's generally a good idea to check the value you mean rather than just falsey... not that I always do that to be honest. Anyway I've tweaked the logic here so we're checking the domain string much less now.
https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:272: if (currentDomain == "") On 2018/05/02 16:45:39, kzar wrote: > [...] > > Well I think it's generally a good idea to check the value you mean rather than > just falsey... not that I always do that to be honest. Anyway I've tweaked the > logic here so we're checking the domain string much less now. OK, the Mozilla style guide, which we refer to, advises to do truthy/falsey checks unless there's a chance of confusion [1] and I think we've pretty much followed this everywhere else in the code. [1]: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style... https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:258: let currentDomainIsGeneric = currentDomain == ""; Comparison with an empty string really should not need to be cached, it's probably superoptimized by the JS engine anyway. I would drop this. Also see below. https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:272: if (excludedFilters.size > 0 && excludedFilters.has(filter)) We only need to do this check for included filters, so it should be inside the if block. https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:280: else if (!currentDomainIsGeneric) currentDomain will always be non-empty if the filter is excluded, we made sure of that while populating filtersByDomain (which is why I did not mention this when I thought we would just not add the included filters to the set). So I think it comes down to this: if (!isIncluded) { excludedFilters.add(filter); } else if ((excludedFilters.size == 0 || !excludedFilters.has(filter)) && !this.getException(filter, domain)) { selectors.push(filter.selector); } Note there's no check for currentDomain now, I think we can drop all the related changes there.
On 2018/05/02 18:11:41, Manish Jethani wrote: > [...] > > So I think it comes down to this: > > if (!isIncluded) > { > excludedFilters.add(filter); > } > else if ((excludedFilters.size == 0 || !excludedFilters.has(filter)) && > !this.getException(filter, domain)) > { > selectors.push(filter.selector); > } I did some performance measurement on this patch, including the above suggestion. At the end of lib/elemHide.js I added the following line: window.ElemHide = exports.ElemHide; Then I built and reloaded the extension, opened the background page console, and pasted the following code into it: (function() { var s = performance.now(); for (let i = 0; i < 1000; i++) ElemHide.getSelectorsForDomain(`s${i}.example.com`); console.log(performance.now() - s); })(); The results I got on Chrome before and after: - before: ~2.4s - after: ~1.4s Similarly on Firefox: - before: ~6.6s - after: ~3.2s So I think these changes are definitely making ElemHide.getSelectorsForDomain work faster. After I applied the changes for patch #29764555 ElemHide.getSelectorForDomain went from ~1.4s to ~1.25s on Chrome, but I think we can look into that as part of the code review for that patch.
Patch Set 7 : Address some final feedback > So I think these changes are definitely making > ElemHide.getSelectorsForDomain work faster. Awesome, thanks for checking. You know I'm really happy about this change. You're seeing something like 2x speed up in this critical part of the code, we've reduced LOC and even added some more comments. Thanks for all your help with it. https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:272: if (currentDomain == "") On 2018/05/02 18:11:40, Manish Jethani wrote: > On 2018/05/02 16:45:39, kzar wrote: > > [...] > > > > Well I think it's generally a good idea to check the value you mean rather > than > > just falsey... not that I always do that to be honest. Anyway I've tweaked the > > logic here so we're checking the domain string much less now. > > OK, the Mozilla style guide, which we refer to, advises to do truthy/falsey > checks unless there's a chance of confusion [1] and I think we've pretty much > followed this everywhere else in the code. > > [1]: > https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style... I prefer checking for an empty string specifically in this code, I think it makes it easier to grok. https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:258: let currentDomainIsGeneric = currentDomain == ""; On 2018/05/02 18:11:40, Manish Jethani wrote: > Comparison with an empty string really should not need to be cached, it's > probably superoptimized by the JS engine anyway. I would drop this. Also see > below. OK, Done. https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:272: if (excludedFilters.size > 0 && excludedFilters.has(filter)) On 2018/05/02 18:11:40, Manish Jethani wrote: > We only need to do this check for included filters, so it should be inside the > if block. Good point, Done. https://codereview.adblockplus.org/29757591/diff/29768633/lib/elemHide.js#new... lib/elemHide.js:280: else if (!currentDomainIsGeneric) On 2018/05/02 18:11:40, Manish Jethani wrote: > currentDomain will always be non-empty if the filter is excluded, we made sure > of that while populating filtersByDomain (which is why I did not mention this > when I thought we would just not add the included filters to the set). > > So I think it comes down to this: > > if (!isIncluded) > { > excludedFilters.add(filter); > } > else if ((excludedFilters.size == 0 || !excludedFilters.has(filter)) && > !this.getException(filter, domain)) > { > selectors.push(filter.selector); > } OK, Done. > Note there's no check for currentDomain now, I think we can drop all the related > changes there. Well there still are a couple of checks for currentDomain elsewhere.
Looks good to me other than the one comment below. LGTM https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29766555/lib/elemHide.js#new... lib/elemHide.js:272: if (currentDomain == "") On 2018/05/03 15:36:00, kzar wrote: > On 2018/05/02 18:11:40, Manish Jethani wrote: > > On 2018/05/02 16:45:39, kzar wrote: > > > [...] > > > > > > Well I think it's generally a good idea to check the value you mean rather > > than > > > just falsey... not that I always do that to be honest. Anyway I've tweaked > the > > > logic here so we're checking the domain string much less now. > > > > OK, the Mozilla style guide, which we refer to, advises to do truthy/falsey > > checks unless there's a chance of confusion [1] and I think we've pretty much > > followed this everywhere else in the code. > > > > [1]: > > > https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style... > > I prefer checking for an empty string specifically in this code, I think it > makes it easier to grok. Acknowledged. https://codereview.adblockplus.org/29757591/diff/29769578/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29769578/lib/elemHide.js#new... lib/elemHide.js:163: else if (filterBySelector.get(filter.selector) === filter) Looking at the code above, the value returned by filterBySelector.get will always be a non-null object. In that case we should change the strict equality here to non-strict equality. There's no harm in keeping it, but when I see a "===" I think there's a good reason for it to be there, it tells a story, but here it's just noise. (It would be the opposite if our style guide mandated "===" and made "==" the exception.)
Patch Set 8 : == instead of === https://codereview.adblockplus.org/29757591/diff/29769578/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29757591/diff/29769578/lib/elemHide.js#new... lib/elemHide.js:163: else if (filterBySelector.get(filter.selector) === filter) On 2018/05/04 11:59:55, Manish Jethani wrote: > Looking at the code above, the value returned by filterBySelector.get will > always be a non-null object. In that case we should change the strict equality > here to non-strict equality. There's no harm in keeping it, but when I see a > "===" I think there's a good reason for it to be there, it tells a story, but > here it's just noise. > > (It would be the opposite if our style guide mandated "===" and made "==" the > exception.) Done.
LGTM |