|
|
DescriptionIssue 3585 - Merge element hiding rules for the same domain
Patch Set 1 #
Total comments: 5
Patch Set 2 : Rebased to use ES2015 features everywhere, limit to 5000 selectors per rule and comma delimit selec… #
Total comments: 6
Patch Set 3 : Addressed more feedback #Patch Set 4 : Renamed matchDomain variable #Patch Set 5 : Fixed mistake in generated JSON structure #MessagesTotal messages: 10
Patch Set 1 Not sure how to test this, but I've attached an example generated list to the issue: https://issues.adblockplus.org/attachment/ticket/3585/easylist.json.gz https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js#ne... abp2blocklist.js:297: rule["action"]["selector"] = ":matches(" + selectors.join(", ") + ")"; Note: I wasn't sure if we need to escape the selectors here.
https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js#ne... abp2blocklist.js:270: for (let filter of elemhideFilters) Does these ES2015 features even work in node.js without any additional command line options? I remember that when I initially wrote that code I couldn't use let statements. Moreover, regardless of ES2015 support in the latest node version, I think we should either stick to var, or use let consistently in the whole file. https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js#ne... abp2blocklist.js:297: rule["action"]["selector"] = ":matches(" + selectors.join(", ") + ")"; On 2016/02/13 19:33:24, kzar wrote: > Note: I wasn't sure if we need to escape the selectors here. Well, if any CSS selector is invalid it will break the whole group of selectors. But that is the same with the logic we have on Chrome. So I don't think we need to bother. And as far as I understand :matches, it's not possible to prevent that by escaping as we need raw selectors here. However, I wonder whether we even need :matches() or whether we can simply join the selectors by comma. It seems that is what AdBlock is doing. While looking at their code I also figured that they never merge more than 1000 selectors into one rule. Perhaps, also Dean can give some advice here?
On 2016/02/15 14:33:11, Sebastian Noack wrote: > Well, if any CSS selector is invalid it will break the whole group of selectors. > But that is the same with the logic we have on Chrome. So I don't think we need > to bother. And as far as I understand :matches, it's not possible to prevent > that by escaping as we need raw selectors here. > > However, I wonder whether we even need :matches() or whether we can simply join > the selectors by comma. It seems that is what AdBlock is doing. While looking at > their code I also figured that they never merge more than 1000 selectors into > one rule. Perhaps, also Dean can give some advice here? There is definitely a limit on how many selectors can be in one rule, but I'm not 100% sure what it is. However I have over 5500 selectors per rule with no issues. I'd recommend capping it at 5000 per rule when generating it programatically, as its a nice round number, unless there is any real reason to have it limited to 1000? AFAIK, the fewer rules, the better for performance.
Patch Set 2 : Rebased to use ES2015 features everywhere, limit to 5000 selectors per rule and comma delimit selectors https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js#ne... abp2blocklist.js:270: for (let filter of elemhideFilters) On 2016/02/15 14:33:11, Sebastian Noack wrote: > Does these ES2015 features even work in node.js without any additional command > line options? I remember that when I initially wrote that code I couldn't use > let statements. > > Moreover, regardless of ES2015 support in the latest node version, I think we > should either stick to var, or use let consistently in the whole file. (As discussed in IRC we will use ES2015 features that are supported by node.js stable without special arguments.) https://codereview.adblockplus.org/29336349/diff/29336350/abp2blocklist.js#ne... abp2blocklist.js:297: rule["action"]["selector"] = ":matches(" + selectors.join(", ") + ")"; On 2016/02/15 14:33:11, Sebastian Noack wrote: > On 2016/02/13 19:33:24, kzar wrote: > > Note: I wasn't sure if we need to escape the selectors here. > > Well, if any CSS selector is invalid it will break the whole group of selectors. > But that is the same with the logic we have on Chrome. So I don't think we need > to bother. And as far as I understand :matches, it's not possible to prevent > that by escaping as we need raw selectors here. > > However, I wonder whether we even need :matches() or whether we can simply join > the selectors by comma. It seems that is what AdBlock is doing. While looking at > their code I also figured that they never merge more than 1000 selectors into > one rule. Perhaps, also Dean can give some advice here? Acknowledged.
https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:105: return [included.map(matchDomain), filter.selector]; How about returning an object here? That would make the calling code a little more straight-forward, by using descriptive properties rather than bogus numeric indices. https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:281: if (!groupedElemhideFilters.has(domain)) The additional lookup is redundant. IMO, better: let group = groupedElemhideFilters.get(domain); if (!group) { group = []; groupedElemhideFilters.set(domain, group); } https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:292: trigger: {"url-filter": domain}, The variable name isn't accurate. It's not a domain but a regexp matching the URL. Perhaps, it makes even more sense to generate the regexp here rather than passing it around, as the variable names already indicate.
Patch Set 3 : Addressed more feedback https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:105: return [included.map(matchDomain), filter.selector]; On 2016/02/15 19:48:48, Sebastian Noack wrote: > How about returning an object here? That would make the calling code a little > more straight-forward, by using descriptive properties rather than bogus numeric > indices. Done. https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:281: if (!groupedElemhideFilters.has(domain)) On 2016/02/15 19:48:48, Sebastian Noack wrote: > The additional lookup is redundant. IMO, better: > > let group = groupedElemhideFilters.get(domain); > if (!group) > { > group = []; > groupedElemhideFilters.set(domain, group); > } Done, kind of. We're either going to have to perform an extra lookup, an extra assignment or set a flag like `wasEmpty` to check before performing the assignment. I don't mind which, so far I went with the extra assignment. https://codereview.adblockplus.org/29336349/diff/29336418/abp2blocklist.js#ne... abp2blocklist.js:292: trigger: {"url-filter": domain}, On 2016/02/15 19:48:47, Sebastian Noack wrote: > The variable name isn't accurate. It's not a domain but a regexp matching the > URL. Perhaps, it makes even more sense to generate the regexp here rather than > passing it around, as the variable names already indicate. Acknowledged.
Patch Set 4 : Renamed matchDomain variable
LGTM
Patch Set 5 : Fixed mistake in generated JSON structure
Still LGTM |