Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code

Issue 29773570: Issue 6652 - Implement fast selector lookups for unknown domains (Closed)

Created:
May 7, 2018, 3:38 p.m. by Manish Jethani
Modified:
March 16, 2019, 3:54 p.m.
Reviewers:
kzar
CC:
sergei
Base URL:
https://hg.adblockplus.org/adblockpluscore/
Visibility:
Public.

Description

ElemHide.getSelectorsForDomain does filter matching for every domain. This is not necessary; instead, it can do filter matching only for known domains (i.e. those that have filters that would apply to them). For any random domain like x4aabd7c0ddcd.com or manishjethani.com the lookups can be much, much faster, because there are no domain-specific filters that apply and no exceptions to check against (except for generic exceptions like "#@#foo", which would still apply to "##foo" on x4aabd7c0ddcd.com or manishjethani.com). Since the list of generic selectors that apply on most domains is the same, ElemHide.getSelectorsForDomain can cache this list and, for most domains, return the cached list concatenated with the generated list of domain-specific selectors. With these changes, the performance improvement for most domains is 10x for EastList+AA (as of 12 May 2018) and possibly significantly higher as the list of subscriptions grows. There is no performance penalty for domains to which this optimization does not apply.

Patch Set 1 #

Total comments: 10

Patch Set 2 : Check generic exception set size before looking up #

Total comments: 10

Patch Set 3 : Look up filters before deciding path #

Patch Set 4 : Merge getConditionalGenericSelectors into getConditionalSelectorsForDomain #

Total comments: 1

Patch Set 5 : Rebase #

Patch Set 6 : Split out getConditionalGenericSelectors again and cache result #

Total comments: 2

Patch Set 7 : Used cached list for generic-friendly domains #

Total comments: 2

Patch Set 8 : Limit size of generic-friendly domains set #

Patch Set 9 : Rebase on master #

Patch Set 10 : Clean up #

Patch Set 11 : Fix JSDoc #

Patch Set 12 : Use longest subdomain as key in generic-friendly domains set #

Patch Set 13 : Move selector matching into its own function #

Total comments: 19

Patch Set 14 : Rebase #

Patch Set 15 : Inline doesFilterApply #

Patch Set 16 : Reinclude test/elemHide.js #

Patch Set 17 : Strip out optimization for "generic-friendly" domains #

Patch Set 18 : Avoid unnecessary Array.concat #

Unified diffs Side-by-side diffs Delta from patch set Stats (+182 lines, -47 lines) Patch
M lib/elemHide.js View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 5 chunks +173 lines, -47 lines 0 comments Download
M test/elemHide.js View 15 1 chunk +9 lines, -0 lines 0 comments Download

Messages

Total messages: 34
Manish Jethani
May 7, 2018, 3:38 p.m. (2018-05-07 15:38:36 UTC) #1
Manish Jethani
Patch Set 1 I'll file an issue for this shortly, but see the description here. ...
May 7, 2018, 4:03 p.m. (2018-05-07 16:03:03 UTC) #2
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29773571/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773571/lib/elemHide.js#newcode98 lib/elemHide.js:98: function getConditionalGenericSelectors() To give you some numbers, there are ...
May 7, 2018, 4:10 p.m. (2018-05-07 16:10:34 UTC) #3
Manish Jethani
Patch Set 2: Check generic exception set size before looking up https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): ...
May 7, 2018, 4:15 p.m. (2018-05-07 16:15:09 UTC) #4
Manish Jethani
Now about how I tested this. First I generated a list of 3,000 filters using ...
May 7, 2018, 4:31 p.m. (2018-05-07 16:31:14 UTC) #5
sergei
I think it is worth an issue.
May 7, 2018, 5:23 p.m. (2018-05-07 17:23:33 UTC) #6
Manish Jethani
On 2018/05/07 17:23:33, sergei wrote: > I think it is worth an issue. Done. https://issues.adblockplus.org/ticket/6652
May 7, 2018, 5:53 p.m. (2018-05-07 17:53:47 UTC) #7
kzar
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) I don't quite understand how your change ...
May 8, 2018, 1:46 p.m. (2018-05-08 13:46:47 UTC) #8
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) On 2018/05/08 13:46:47, kzar wrote: > I ...
May 8, 2018, 2:20 p.m. (2018-05-08 14:20:43 UTC) #9
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) On 2018/05/08 14:20:43, Manish Jethani wrote: > ...
May 8, 2018, 3:49 p.m. (2018-05-08 15:49:01 UTC) #10
kzar
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) On 2018/05/08 15:49:00, Manish Jethani wrote: > ...
May 8, 2018, 5:13 p.m. (2018-05-08 17:13:44 UTC) #11
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) On 2018/05/08 17:13:43, kzar wrote: > On ...
May 8, 2018, 5:37 p.m. (2018-05-08 17:37:13 UTC) #12
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29773574/lib/elemHide.js#newcode335 lib/elemHide.js:335: if (isDomainKnown(currentDomain)) On 2018/05/08 17:37:13, Manish Jethani wrote: > ...
May 8, 2018, 5:49 p.m. (2018-05-08 17:49:30 UTC) #13
kzar
On 2018/05/08 17:49:30, Manish Jethani wrote: > Anyway I have something in mind here so ...
May 9, 2018, 10:18 a.m. (2018-05-09 10:18:11 UTC) #14
Manish Jethani
On 2018/05/09 10:18:11, kzar wrote: > On 2018/05/08 17:49:30, Manish Jethani wrote: > > Anyway ...
May 9, 2018, 3:40 p.m. (2018-05-09 15:40:00 UTC) #15
Manish Jethani
Patch Set 3: Look up filters before deciding path Now rebased on patch #29774573. We ...
May 9, 2018, 7:57 p.m. (2018-05-09 19:57:56 UTC) #16
Manish Jethani
Patch Set 4: Merge getConditionalGenericSelectors into getConditionalSelectorsForDomain I'm still not convinced merging these two is ...
May 9, 2018, 9:35 p.m. (2018-05-09 21:35:45 UTC) #17
Manish Jethani
Patch Set 5: Rebase Patch Set 6: Split out getConditionalGenericSelectors again and cache result Patch ...
May 11, 2018, 10 a.m. (2018-05-11 10:00:32 UTC) #18
Manish Jethani
Patch Set 8: Limit size of generic-friendly domains set https://codereview.adblockplus.org/29773570/diff/29778566/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29778566/lib/elemHide.js#newcode175 lib/elemHide.js:175: ...
May 11, 2018, 10:09 a.m. (2018-05-11 10:09:09 UTC) #19
Manish Jethani
Patch Set 9: Rebase on master
May 11, 2018, 1:31 p.m. (2018-05-11 13:31:55 UTC) #20
Manish Jethani
Patch Set 10: Clean up Patch Set 11: Fix JSDoc A note about the "generic-friendly ...
May 12, 2018, 10:35 a.m. (2018-05-12 10:35:05 UTC) #21
Manish Jethani
Patch Set 12: Use longest subdomain as key in generic-friendly domains set We use the ...
May 12, 2018, 5:39 p.m. (2018-05-12 17:39:35 UTC) #22
Manish Jethani
Patch Set 13: Move selector matching into its own function Fewer lines of code. Later ...
May 13, 2018, 12:21 p.m. (2018-05-13 12:21:44 UTC) #23
kzar
https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#oldcode271 lib/elemHide.js:271: if (specificOnly && currentDomain == "") Why don't we ...
May 15, 2018, 1:26 p.m. (2018-05-15 13:26:31 UTC) #24
Manish Jethani
Patch Set 14: Rebase Patch Set 15: Inline doesFilterApply https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#oldcode271 lib/elemHide.js:271: ...
May 15, 2018, 4 p.m. (2018-05-15 16:00:08 UTC) #25
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#newcode471 lib/elemHide.js:471: let selectors = getConditionalSelectorsForDomain(domain, specificOnly); On 2018/05/15 16:00:08, Manish ...
May 15, 2018, 4:16 p.m. (2018-05-15 16:16:15 UTC) #26
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (right): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#newcode139 lib/elemHide.js:139: function getSpecificFiltersForDomain(domain) Note: getSpecificFiltersForDomain could be reused by ElemHideEmulation ...
May 15, 2018, 4:20 p.m. (2018-05-15 16:20:59 UTC) #27
Manish Jethani
Patch Set 16: Reinclude test/elemHide.js
May 15, 2018, 4:27 p.m. (2018-05-15 16:27:19 UTC) #28
kzar
https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#oldcode271 lib/elemHide.js:271: if (specificOnly && currentDomain == "") On 2018/05/15 16:00:08, ...
May 18, 2018, 2:34 p.m. (2018-05-18 14:34:54 UTC) #29
Manish Jethani
https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js File lib/elemHide.js (left): https://codereview.adblockplus.org/29773570/diff/29780555/lib/elemHide.js#oldcode271 lib/elemHide.js:271: if (specificOnly && currentDomain == "") On 2018/05/18 14:34:53, ...
May 18, 2018, 5:18 p.m. (2018-05-18 17:18:27 UTC) #30
kzar
On 2018/05/18 17:18:27, Manish Jethani wrote: > We have a 100+ million user base though, ...
May 22, 2018, 1:59 p.m. (2018-05-22 13:59:40 UTC) #31
Manish Jethani
Patch Set 17: Strip out optimization for "generic-friendly" domains On 2018/05/22 13:59:40, kzar wrote: > ...
May 23, 2018, 12:12 a.m. (2018-05-23 00:12:26 UTC) #32
Manish Jethani
Patch Set 18: Avoid unnecessary Array.concat
May 23, 2018, 12:22 a.m. (2018-05-23 00:22:43 UTC) #33
kzar
May 23, 2018, 12:07 p.m. (2018-05-23 12:07:14 UTC) #34
On 2018/05/23 00:12:26, Manish Jethani wrote:
> > On 2018/05/18 17:18:27, Manish Jethani wrote:
> > >  We have a 100+ million user base though, if I understand correctly...
> > 
> > Please don't talk publicly about usage numbers without checking first, also
we
> > talk about "active devices" not users. For the record I'm not sure what our
> > numbers are at the moment.
> 
> OK, it's on the website so I thought these numbers are well known [1]
> 
> [1]: https://adblockplus.org/blog/100-million-users-100-million-thank-yous

Fair enough, I stand corrected!

Powered by Google App Engine
This is Rietveld