|
|
DescriptionIssue 235 - Speed up ElemHide.getSelectorsForDomain
Patch Set 1 #
Total comments: 6
Patch Set 2 : Cache filtersByDomain[domain] #
Total comments: 8
Patch Set 3 : Addressed feedback #MessagesTotal messages: 8
Patch Set 1
https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:158: if (!(domain in filtersByDomain)) I think we should cache the lookup here: let filters = filtersByDomain[domain]; if (!filters) filters = filtersByDomain[domain] = Object.create(null); filters[filter.text] = ...; https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:420: if (filterText in seenFilters) Is this some kind of micro optimization or an attempt to exclude duplicate selectors? If it's supposed to be the latter, I think it makes more sense to simply use an object for the resulting selectors and simply returning Object.keys(selectors) below, as this will also remove duplicate selectors from different filters. https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:425: if (filter && !this.getException(filter, domain)) I still have a hard time to understand how/whether filters like ~example.com##div are working. As far as I understand the logic here, you simply ignore such filters, as the variable "filter" will be false for them here.
Patch Set 2 : Cache filtersByDomain[domain] https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:158: if (!(domain in filtersByDomain)) On 2016/05/17 09:47:49, Sebastian Noack wrote: > I think we should cache the lookup here: > > let filters = filtersByDomain[domain]; > if (!filters) > filters = filtersByDomain[domain] = Object.create(null); > filters[filter.text] = ...; Done. https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:420: if (filterText in seenFilters) On 2016/05/17 09:47:49, Sebastian Noack wrote: > Is this some kind of micro optimization or an attempt to exclude duplicate > selectors? If it's supposed to be the latter, I think it makes more sense to > simply use an object for the resulting selectors and simply returning > Object.keys(selectors) below, as this will also remove duplicate selectors from > different filters. There's a comment in the old codereview about how this logic should work https://codereview.adblockplus.org/4529242486341632/diff/29321269/lib/elemHid... "~foo.example.com,example.com##foo really shouldn't apply on foo.example.com even if that filter is allowed on example.com." So let's say the domain given to getSelectorsForDomain is foo.example.com. The matched filter is false for foo.example.com, meaning that the filter should not be active. Now without the seenFilters logic we would continue to match the filter example.com but this time it would be active. So the end result would be that the filter would be applied to foo.example.com, which would be incorrect. https://codereview.adblockplus.org/29341426/diff/29341427/lib/elemHide.js#new... lib/elemHide.js:425: if (filter && !this.getException(filter, domain)) On 2016/05/17 09:47:49, Sebastian Noack wrote: > I still have a hard time to understand how/whether filters like > ~example.com##div are working. As far as I understand the logic here, you simply > ignore such filters, as the variable "filter" will be false for them here. For ~example.com##div there should be two entries added to filtersByDomain. filtersByDomain["example.com"]["~example.com##div"][false] and filtersByDomain[""]["~example.com##div"]. The first has the value of false and the second has of the filter itself. (Originally it was going to simply be true but this way we avoid a second lookup to get the filter object.)
LGTM, assuming the logic works like described in your comments, i.e. like it did before. But I guess Wladimir wants to have a go as well.
https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:154: let domainMatches = filter.domains || {"": true}; You are iterating over object keys below but {"": true} has more properties than just "" (unlike filter.domains which doesn't have a prototype). It's probably best to have a global defaultDomains variable that has this object without prototype. https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:157: domain = domain.toUpperCase(); This operation is unnecessary, the domains are already upper-case. https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:161: filters[filter.text] = domainMatches[domain] && filter; I know, Sebastian loves such constructs. Still, I consider this too non-obvious. Could you write this out, with a proper if statement? if (domainMatches[domain]) filters[filter.text] = filter; else filters[filter.text] = false;
https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:154: let domainMatches = filter.domains || {"": true}; On 2016/05/18 13:38:06, Wladimir Palant wrote: > You are iterating over object keys below but {"": true} has more properties than > just "" (unlike filter.domains which doesn't have a prototype). It's probably > best to have a global defaultDomains variable that has this object without > prototype. Also note that for..in loops on objects in hash-table mode cause the function to de-deoptimized in V8. However, whether this is the case here depends a lot of the context. So please make sure that there is no deopt. And if necessary resort to Object.keys() + for(;;)
Patch Set 3 : Addressed feedback https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:154: let domainMatches = filter.domains || {"": true}; On 2016/05/18 13:38:06, Wladimir Palant wrote: > You are iterating over object keys below but {"": true} has more properties than > just "" (unlike filter.domains which doesn't have a prototype). It's probably > best to have a global defaultDomains variable that has this object without > prototype. Done. https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:154: let domainMatches = filter.domains || {"": true}; On 2016/05/18 13:48:25, Sebastian Noack wrote: > On 2016/05/18 13:38:06, Wladimir Palant wrote: > > You are iterating over object keys below but {"": true} has more properties > than > > just "" (unlike filter.domains which doesn't have a prototype). It's probably > > best to have a global defaultDomains variable that has this object without > > prototype. > > Also note that for..in loops on objects in hash-table mode cause the function to > de-deoptimized in V8. However, whether this is the case here depends a lot of > the context. So please make sure that there is no deopt. And if necessary resort > to Object.keys() + for(;;) The only way I know to check if a function is deoptimized is to find it in the profiler. It was a little hard to find it but finally found it and the Elemhide.add function is not deoptimized with this change apparently. (Unfortunately I only found it in Botttom Up view so far, finding it in Top Down was even harder.) https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:157: domain = domain.toUpperCase(); On 2016/05/18 13:38:06, Wladimir Palant wrote: > This operation is unnecessary, the domains are already upper-case. Done. https://codereview.adblockplus.org/29341426/diff/29341462/lib/elemHide.js#new... lib/elemHide.js:161: filters[filter.text] = domainMatches[domain] && filter; On 2016/05/18 13:38:05, Wladimir Palant wrote: > I know, Sebastian loves such constructs. Still, I consider this too non-obvious. > Could you write this out, with a proper if statement? > > if (domainMatches[domain]) > filters[filter.text] = filter; > else > filters[filter.text] = false; Done.
LGTM |