|
|
DescriptionIssue 3671 - Split out contentBlockerList API
Patch Set 1 #
Total comments: 6
Patch Set 2 : Addressed feedback, removed some modules #
Total comments: 44
Patch Set 3 : Reworked API into ContentBlockerList class and addressed other feedback #
Total comments: 7
Patch Set 4 : Addressed more feedback #
Total comments: 1
Patch Set 5 : Skip lines only containing whitespace #
Total comments: 5
Patch Set 6 : Tweaked line skipping logic and addressed a nit #Patch Set 7 : Use Sebastian's regexp suggestion #Patch Set 8 : Renamed lib/contentBlockList.js to lib/abp2blocklist.js #
MessagesTotal messages: 23
Patch Set 1
https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/filter... File node_modules/filterNotifier.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/filter... node_modules/filterNotifier.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, Why didn't you leave filter classes.js as is, not adding this module? https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/punyco... File node_modules/punycode.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/punyco... node_modules/punycode.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, Nice.js already comes with punycode. So does adblockpluschrome. Adding that code here seems redundant. https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/url.js File node_modules/url.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/url.js... node_modules/url.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, Wrapping tldjs here seems more appropriate, than copying even more code from Adblock Plus.
Patch Set 2 : Addressed feedback, removed some modules https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/filter... File node_modules/filterNotifier.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/filter... node_modules/filterNotifier.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, On 2016/02/21 02:21:08, Sebastian Noack wrote: > Why didn't you leave filter classes.js as is, not adding this module? Done. https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/punyco... File node_modules/punycode.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/punyco... node_modules/punycode.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, On 2016/02/21 02:21:08, Sebastian Noack wrote: > Nice.js already comes with punycode. So does adblockpluschrome. Adding that code > here seems redundant. Done. https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/url.js File node_modules/url.js (right): https://codereview.adblockplus.org/29336753/diff/29336754/node_modules/url.js... node_modules/url.js:2: * This file is part of Adblock Plus <https://adblockplus.org/>, On 2016/02/21 02:21:09, Sebastian Noack wrote: > Wrapping tldjs here seems more appropriate, than copying even more code from > Adblock Plus. Done.
https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:10: require.cache["url"] = require.cache[require.resolve("url.js")]; I'd rather rename our url module, here and in Adblock Plus, rather than going with this hack. https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") What's about empty lines? It seems that we didn't handle those before either. But wouldn't an empty line result result into a filter that would block everything? https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ I'd rather call that module simply abp2blocklist, to keep things straight-forward, as this is the module of the project, with this module being the only public API. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:26: Nit: One empty line should suffice here. Same below. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:72: let result = ""; This is unrelated, but we probably should use an array here, and join it in the end as we do in convertIDSelectorsToAttributeSelectors(). https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:151: if (filter.contentType & (typeMap.XMLHTTPREQUEST | typeMap.OBJECT_SUBREQUEST Nit: This check reads easier if the types are aligned: if (filter.contentType & (typeMap.XMLHTTPREQUEST | typeMap.OBJECT_SUBREQUEST | typeMap.PING | typeMap.OTHER)) https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:211: for (let name of Object.getOwnPropertyNames(obj)) When we start to transpile the code with jsHydra, Object.getOwnPropertyNames(obj) will be called on each iteration unless you use a variable. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:262: newSelector.push('[id=' + selector.substring(pos.start + 1, pos.end) + ']'); Sorry, I missed that one on the previous review; the string concatenation here is unneccessary. You should rather push 3 strings on the array having them concatenated by the join in the end. This can be done in a single call: newSelector.push('[id=', selector.substring(pos.start + 1, pos.end), ']'); https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:276: exports.convertFilters = function(filters) On the command line we read filters line by line. In Adblock Plus we would iterate over the active subscriptions, and then over their active filters. So in both cases, it would be useful be able to report filter by filter, rather than compiling an array to pass it in. Theoretically, a generator would do, but wrapping the calling code into a generator requires some boilerplate as well, and in Adblock Plus for Safari we wouldn't have real generators anyway. So I think an API that can be used like below would make more sense: let converter = new FilterConverter(); for (let filter of filters) converter.addFilter(filter); let rules = converter.getRules(); I wouldn't insist on the name FilterConverter though. BlockList or ContentBlockerList would be neat names as well. I leave it up to you. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js File node_modules/url.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:1: /* Please correct me if I'm wrong, as I'm not too familiar with that whole node.js universe either, but to me it seems that node_modules is only for modules installed through npm, while own modules should go under lib. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:18: /** @module url */ Since the existence of this module is just an implementation detail, I think, I wouldn't want to have it documented, if we will ever generate documentation for this project. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:22: exports.getBaseDomain = hostname => require("tldjs").getDomain(hostname); Nit: exports.getBaseDomain = require("tldjs").getDomain; https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:22: exports.getBaseDomain = hostname => require("tldjs").getDomain(hostname); How about a comment that this is mimicking the url module in adblockpluschrome for compatibility?
https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/21 20:34:45, Sebastian Noack wrote: > I'd rather call that module simply abp2blocklist, to keep things > straight-forward, as this is the module of the project, with this module being > the only public API. s/is the module of the project/is the name of the project/
https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (left): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ol... abp2blocklist.js:1: "use strict"; Mind adding the missing license disclaimer here?
Patch Set 3 : Reworked API into ContentBlockerList class and addressed other feedback https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (left): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ol... abp2blocklist.js:1: "use strict"; On 2016/02/21 21:21:43, Sebastian Noack wrote: > Mind adding the missing license disclaimer here? Done. https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:10: require.cache["url"] = require.cache[require.resolve("url.js")]; On 2016/02/21 20:34:44, Sebastian Noack wrote: > I'd rather rename our url module, here and in Adblock Plus, rather than going > with this hack. OK I've renamed the module urlHelpers and removed the hack. (Open to suggestions if you have a better idea for the name, I just made that up on the spot.) https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") On 2016/02/21 20:34:44, Sebastian Noack wrote: > What's about empty lines? It seems that we didn't handle those before either. > But wouldn't an empty line result result into a filter that would block > everything? It appears that contentBlockerLists.js logic was handling that case somehow but I can't see exactly why. Anyway I think you're right, better to check here explicitly. Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/21 20:36:15, Sebastian Noack wrote: > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > I'd rather call that module simply abp2blocklist, to keep things > > straight-forward, as this is the module of the project, with this module being > > the only public API. > > s/is the module of the project/is the name of the project/ I think abp2blocklist is a fine name for the script that converts an Adblock Plus filter list to a content blocker list, but not for a library that will one day be included in adblockpluscore. I've renamed to contentBlockerList to match the class name, hope that's OK. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:26: On 2016/02/21 20:34:45, Sebastian Noack wrote: > Nit: One empty line should suffice here. Same below. Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:72: let result = ""; On 2016/02/21 20:34:45, Sebastian Noack wrote: > This is unrelated, but we probably should use an array here, and join it in the > end as we do in convertIDSelectorsToAttributeSelectors(). Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:151: if (filter.contentType & (typeMap.XMLHTTPREQUEST | typeMap.OBJECT_SUBREQUEST On 2016/02/21 20:34:45, Sebastian Noack wrote: > Nit: This check reads easier if the types are aligned: > > if (filter.contentType & (typeMap.XMLHTTPREQUEST | > typeMap.OBJECT_SUBREQUEST | > typeMap.PING | > typeMap.OTHER)) Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:211: for (let name of Object.getOwnPropertyNames(obj)) On 2016/02/21 20:34:45, Sebastian Noack wrote: > When we start to transpile the code with jsHydra, > Object.getOwnPropertyNames(obj) will be called on each iteration unless you use > a variable. Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:262: newSelector.push('[id=' + selector.substring(pos.start + 1, pos.end) + ']'); On 2016/02/21 20:34:45, Sebastian Noack wrote: > Sorry, I missed that one on the previous review; the string concatenation here > is unneccessary. You should rather push 3 strings on the array having them > concatenated by the join in the end. This can be done in a single call: > > newSelector.push('[id=', selector.substring(pos.start + 1, pos.end), ']'); Done. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:276: exports.convertFilters = function(filters) On 2016/02/21 20:34:44, Sebastian Noack wrote: > On the command line we read filters line by line. In Adblock Plus we would > iterate over the active subscriptions, and then over their active filters. > > So in both cases, it would be useful be able to report filter by filter, rather > than compiling an array to pass it in. Theoretically, a generator would do, but > wrapping the calling code into a generator requires some boilerplate as well, > and in Adblock Plus for Safari we wouldn't have real generators anyway. So I > think an API that can be used like below would make more sense: > > let converter = new FilterConverter(); > for (let filter of filters) > converter.addFilter(filter); > let rules = converter.getRules(); > > I wouldn't insist on the name FilterConverter though. BlockList or > ContentBlockerList would be neat names as well. I leave it up to you. Done. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js File node_modules/url.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:1: /* On 2016/02/21 20:34:45, Sebastian Noack wrote: > Please correct me if I'm wrong, as I'm not too familiar with that whole node.js > universe either, but to me it seems that node_modules is only for modules > installed through npm, while own modules should go under lib. The documentation[1] seems to say that while the lib/node/ and .node_libraries/ directories can also be used as well as node_modules/ it is for historic reasons and discouraged. I think how the files are laid out now is more or less correct, but like you I'm no expert. 1: https://nodejs.org/api/modules.html#modules_loading_from_the_global_folders https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:18: /** @module url */ On 2016/02/21 20:34:45, Sebastian Noack wrote: > Since the existence of this module is just an implementation detail, I think, I > wouldn't want to have it documented, if we will ever generate documentation for > this project. Done. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:18: /** @module url */ On 2016/02/21 20:34:45, Sebastian Noack wrote: > Since the existence of this module is just an implementation detail, I think, I > wouldn't want to have it documented, if we will ever generate documentation for > this project. Done. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:22: exports.getBaseDomain = hostname => require("tldjs").getDomain(hostname); On 2016/02/21 20:34:45, Sebastian Noack wrote: > How about a comment that this is mimicking the url module in adblockpluschrome > for compatibility? Done. https://codereview.adblockplus.org/29336753/diff/29336776/node_modules/url.js... node_modules/url.js:22: exports.getBaseDomain = hostname => require("tldjs").getDomain(hostname); On 2016/02/21 20:34:45, Sebastian Noack wrote: > Nit: exports.getBaseDomain = require("tldjs").getDomain; I did that originally but it didn't work. I've changed it, what do you think? (I don't think it matters much either way.)
https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:10: require.cache["url"] = require.cache[require.resolve("url.js")]; On 2016/02/22 12:28:05, kzar wrote: > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > I'd rather rename our url module, here and in Adblock Plus, rather than going > > with this hack. > > OK I've renamed the module urlHelpers and removed the hack. (Open to suggestions > if you have a better idea for the name, I just made that up on the spot.) I might have got an even a better idea. How about splitting out url.geBaseDomain (and url.isThirdParty), in the extension, into a separate module called tldjs and renaming getBaseDomain to getDomain. That way we wouldn't need any abstraction here at all. What do you think? https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") On 2016/02/22 12:28:05, kzar wrote: > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > What's about empty lines? It seems that we didn't handle those before either. > > But wouldn't an empty line result result into a filter that would block > > everything? > > It appears that contentBlockerLists.js logic was handling that case somehow but > I can't see exactly why. Anyway I think you're right, better to check here > explicitly. Done. OK, I looked into it and figured that this case was already handled by following check in the original code: if (filter instanceof filterClasses.RegExpFilter && !filter.regexpSource) return; This check was meant to exclude raw regular expression filters. But it also bails if the filter has been parsed from and empty string, in which case regexpSource is "" rather than null. I think that check should be changed to |filter.regexpSource == null|. While empty lines should be explicitly ignored here. Also what is if a line in a filter list contains only whitespace characters? I didn't look into it yet, but we should make sure that we treat that case the same as when parsing filter lists in Adblock Plus. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/22 12:28:06, kzar wrote: > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > I'd rather call that module simply abp2blocklist, to keep things > > > straight-forward, as this is the module of the project, with this module > being > > > the only public API. > > > > s/is the module of the project/is the name of the project/ > > I think abp2blocklist is a fine name for the script that converts an Adblock > Plus filter list to a content blocker list, but not for a library that will one > day be included in adblockpluscore. > > I've renamed to contentBlockerList to match the class name, hope that's OK. Whether it will be included into adblockpluscore or remain as a separate repo isn't clear yet. Currently, in particular with the great changes you did here, I mildly lend towards the latter. It's extremely uncommon that libraries like that call their (main) module differently than the project itself. IMO, it's quite confusing. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... File lib/contentBlockerList.js (right): https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:106: result.push("\\"); Nit: Perhaps we should change the code here to: result.push("\\", c); break; The missing drop-through here might be a little sneaky and hard to spot when reading the code. Plus that way we have one function call less, when escaping charterers. But I wouldn't insist. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:151: typeMap.PING | typeMap.OTHER)) Nit: I think it reads slightly better, and looks more consistent, if all types are aligned. But I wouldn't insist. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:285: ContentBlockerList.prototype = {}; There is no need to set the prototype to an empty object. Each function automatically gets a prototype object setup.
https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... File lib/contentBlockerList.js (right): https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:106: result.push("\\"); On 2016/02/22 17:35:28, Sebastian Noack wrote: > Nit: Perhaps we should change the code here to: > > result.push("\\", c); > break; > > The missing drop-through here might be a little sneaky and hard to spot when > reading the code. Plus that way we have one function call less, when escaping > charterers. But I wouldn't insist. s/missing drop-through/missing break/
Patch Set 4 : Addressed more feedback https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:10: require.cache["url"] = require.cache[require.resolve("url.js")]; On 2016/02/22 17:35:27, Sebastian Noack wrote: > On 2016/02/22 12:28:05, kzar wrote: > > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > > I'd rather rename our url module, here and in Adblock Plus, rather than > going > > > with this hack. > > > > OK I've renamed the module urlHelpers and removed the hack. (Open to > suggestions > > if you have a better idea for the name, I just made that up on the spot.) > > I might have got an even a better idea. How about splitting out url.geBaseDomain > (and url.isThirdParty), in the extension, into a separate module called tldjs > and renaming getBaseDomain to getDomain. That way we wouldn't need any > abstraction here at all. What do you think? Sounds good, done. https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") On 2016/02/22 17:35:27, Sebastian Noack wrote: > On 2016/02/22 12:28:05, kzar wrote: > > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > > What's about empty lines? It seems that we didn't handle those before > either. > > > But wouldn't an empty line result result into a filter that would block > > > everything? > > > > It appears that contentBlockerLists.js logic was handling that case somehow > but > > I can't see exactly why. Anyway I think you're right, better to check here > > explicitly. Done. > > OK, I looked into it and figured that this case was already handled by following > check in the original code: > > if (filter instanceof filterClasses.RegExpFilter && !filter.regexpSource) > return; > > This check was meant to exclude raw regular expression filters. But it also > bails if the filter has been parsed from and empty string, in which case > regexpSource is "" rather than null. > > I think that check should be changed to |filter.regexpSource == null|. While > empty lines should be explicitly ignored here. > > Also what is if a line in a filter list contains only whitespace characters? I > didn't look into it yet, but we should make sure that we treat that case the > same as when parsing filter lists in Adblock Plus. Done. As for lines with only white-space, we're currently passing them to Filter.fromText, which I guess is all that Adblock Plus does. I tried it out and here's what a rule generated from a filter with a fitertext of " " looks like: { "trigger": { "url-filter": "^https?://.* ", "resource-type": [ "image", "style-sheet", "script", "font", "media", "raw", "document" ] }, "action": { "type": "block" } } I'm not entirely sure if we want abp2blocklist to skip over lines with only whitespace or treat them as any other line. I guess probably the former, as it's more likely a mistake than anything else. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/22 17:35:28, Sebastian Noack wrote: > On 2016/02/22 12:28:06, kzar wrote: > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > straight-forward, as this is the module of the project, with this module > > being > > > > the only public API. > > > > > > s/is the module of the project/is the name of the project/ > > > > I think abp2blocklist is a fine name for the script that converts an Adblock > > Plus filter list to a content blocker list, but not for a library that will > one > > day be included in adblockpluscore. > > > > I've renamed to contentBlockerList to match the class name, hope that's OK. > > Whether it will be included into adblockpluscore or remain as a separate repo > isn't clear yet. Currently, in particular with the great changes you did here, I > mildly lend towards the latter. > > It's extremely uncommon that libraries like that call their (main) module > differently than the project itself. IMO, it's quite confusing. I'd really rather keep the name lib/contentBlockerLists.js for this, I don't think it's too confusing. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... File lib/contentBlockerList.js (right): https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:106: result.push("\\"); On 2016/02/22 17:35:28, Sebastian Noack wrote: > Nit: Perhaps we should change the code here to: > > result.push("\\", c); > break; > > The missing drop-through here might be a little sneaky and hard to spot when > reading the code. Plus that way we have one function call less, when escaping > charterers. But I wouldn't insist. Done. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:151: typeMap.PING | typeMap.OTHER)) On 2016/02/22 17:35:28, Sebastian Noack wrote: > Nit: I think it reads slightly better, and looks more consistent, if all types > are aligned. But I wouldn't insist. Done. https://codereview.adblockplus.org/29336753/diff/29336800/lib/contentBlockerL... lib/contentBlockerList.js:285: ContentBlockerList.prototype = {}; On 2016/02/22 17:35:28, Sebastian Noack wrote: > There is no need to set the prototype to an empty object. Each function > automatically gets a prototype object setup. Done. https://codereview.adblockplus.org/29336753/diff/29336815/lib/contentBlockerL... File lib/contentBlockerList.js (right): https://codereview.adblockplus.org/29336753/diff/29336815/lib/contentBlockerL... lib/contentBlockerList.js:23: let tldjs = require("tldjs"); Note: `let getDomain = require("tldjs").getDomain` doesn't work as `this` gets messed up.
https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") On 2016/02/22 18:09:28, kzar wrote: > On 2016/02/22 17:35:27, Sebastian Noack wrote: > > On 2016/02/22 12:28:05, kzar wrote: > > > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > > > What's about empty lines? It seems that we didn't handle those before > > either. > > > > But wouldn't an empty line result result into a filter that would block > > > > everything? > > > > > > It appears that contentBlockerLists.js logic was handling that case somehow > > but > > > I can't see exactly why. Anyway I think you're right, better to check here > > > explicitly. Done. > > > > OK, I looked into it and figured that this case was already handled by > following > > check in the original code: > > > > if (filter instanceof filterClasses.RegExpFilter && !filter.regexpSource) > > return; > > > > This check was meant to exclude raw regular expression filters. But it also > > bails if the filter has been parsed from and empty string, in which case > > regexpSource is "" rather than null. > > > > I think that check should be changed to |filter.regexpSource == null|. While > > empty lines should be explicitly ignored here. > > > > Also what is if a line in a filter list contains only whitespace characters? I > > didn't look into it yet, but we should make sure that we treat that case the > > same as when parsing filter lists in Adblock Plus. > > Done. > > As for lines with only white-space, we're currently passing them to > Filter.fromText, which I guess is all that Adblock Plus does. I tried it out and > here's what a rule generated from a filter with a fitertext of " " looks like: > > { > "trigger": { > "url-filter": "^https?://.* ", > "resource-type": [ > "image", > "style-sheet", > "script", > "font", > "media", > "raw", > "document" > ] > }, > "action": { > "type": "block" > } > } > > I'm not entirely sure if we want abp2blocklist to skip over lines with only > whitespace or treat them as any other line. I guess probably the former, as it's > more likely a mistake than anything else. Yeah, I already figured that myself. The point is we want abp2blocklist behave exactly like Adblock Plus here. Can you try out what happens to empty lines and lines who only have whitespace when Adblock Plus parses filter subscriptions? https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/22 18:09:29, kzar wrote: > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > On 2016/02/22 12:28:06, kzar wrote: > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > straight-forward, as this is the module of the project, with this module > > > being > > > > > the only public API. > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > I think abp2blocklist is a fine name for the script that converts an Adblock > > > Plus filter list to a content blocker list, but not for a library that will > > one > > > day be included in adblockpluscore. > > > > > > I've renamed to contentBlockerList to match the class name, hope that's OK. > > > > Whether it will be included into adblockpluscore or remain as a separate repo > > isn't clear yet. Currently, in particular with the great changes you did here, > I > > mildly lend towards the latter. > > > > It's extremely uncommon that libraries like that call their (main) module > > differently than the project itself. IMO, it's quite confusing. > > I'd really rather keep the name lib/contentBlockerLists.js for this, I don't > think it's too confusing. So I guess we need a third opinion here. fhd, what do you think?
Patch Set 5 : Skip lines only containing whitespace https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/abp2blocklist.js#ne... abp2blocklist.js:22: if (line.charAt(0) != "[") On 2016/02/22 18:20:26, Sebastian Noack wrote: > On 2016/02/22 18:09:28, kzar wrote: > > On 2016/02/22 17:35:27, Sebastian Noack wrote: > > > On 2016/02/22 12:28:05, kzar wrote: > > > > On 2016/02/21 20:34:44, Sebastian Noack wrote: > > > > > What's about empty lines? It seems that we didn't handle those before > > > either. > > > > > But wouldn't an empty line result result into a filter that would block > > > > > everything? > > > > > > > > It appears that contentBlockerLists.js logic was handling that case > somehow > > > but > > > > I can't see exactly why. Anyway I think you're right, better to check here > > > > explicitly. Done. > > > > > > OK, I looked into it and figured that this case was already handled by > > following > > > check in the original code: > > > > > > if (filter instanceof filterClasses.RegExpFilter && !filter.regexpSource) > > > return; > > > > > > This check was meant to exclude raw regular expression filters. But it also > > > bails if the filter has been parsed from and empty string, in which case > > > regexpSource is "" rather than null. > > > > > > I think that check should be changed to |filter.regexpSource == null|. While > > > empty lines should be explicitly ignored here. > > > > > > Also what is if a line in a filter list contains only whitespace characters? > I > > > didn't look into it yet, but we should make sure that we treat that case the > > > same as when parsing filter lists in Adblock Plus. > > > > Done. > > > > As for lines with only white-space, we're currently passing them to > > Filter.fromText, which I guess is all that Adblock Plus does. I tried it out > and > > here's what a rule generated from a filter with a fitertext of " " looks > like: > > > > { > > "trigger": { > > "url-filter": "^https?://.* ", > > "resource-type": [ > > "image", > > "style-sheet", > > "script", > > "font", > > "media", > > "raw", > > "document" > > ] > > }, > > "action": { > > "type": "block" > > } > > } > > > > I'm not entirely sure if we want abp2blocklist to skip over lines with only > > whitespace or treat them as any other line. I guess probably the former, as > it's > > more likely a mistake than anything else. > > Yeah, I already figured that myself. The point is we want abp2blocklist behave > exactly like Adblock Plus here. Can you try out what happens to empty lines and > lines who only have whitespace when Adblock Plus parses filter subscriptions? OK add the custom subscription of http://static.kzar.co.uk/dave.txt and then run this in the background console: require("subscriptionClasses").Subscription.knownSubscriptions["http://static.kzar.co.uk/dave.txt"].filters.map(function(f) { return f.text; }).join("\n") It turns out that the lines with just whitespace are stripped, so I've updated abp2blocklist.js to do the same. https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/22 18:20:26, Sebastian Noack wrote: > On 2016/02/22 18:09:29, kzar wrote: > > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > > On 2016/02/22 12:28:06, kzar wrote: > > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > > straight-forward, as this is the module of the project, with this > module > > > > being > > > > > > the only public API. > > > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > > > I think abp2blocklist is a fine name for the script that converts an > Adblock > > > > Plus filter list to a content blocker list, but not for a library that > will > > > one > > > > day be included in adblockpluscore. > > > > > > > > I've renamed to contentBlockerList to match the class name, hope that's > OK. > > > > > > Whether it will be included into adblockpluscore or remain as a separate > repo > > > isn't clear yet. Currently, in particular with the great changes you did > here, > > I > > > mildly lend towards the latter. > > > > > > It's extremely uncommon that libraries like that call their (main) module > > > differently than the project itself. IMO, it's quite confusing. > > > > I'd really rather keep the name lib/contentBlockerLists.js for this, I don't > > think it's too confusing. > > So I guess we need a third opinion here. fhd, what do you think? I mean by the same logic, wouldn't having the library called abp2blocklist be even more confusing? Then it's not obvious that it just provides a class called ContentBlockerLists.
https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js#ne... abp2blocklist.js:29: if (line.charAt(0) != "[" && line.search(/\S/) > -1) After reading through the respective code in adblockpluscore that downloads and parses filter lists, I found two seemingly inconsistencies with the logic here. But feel free to double check. 1. Leading whitespaces (e.g. " [foo]") don't prevent a header from being recognized as such. 2. After downloading the filter subscription, filters are normalized before they are parsed and saved, e.g. "||example.com " would be treated as "||example.com", while without normalization the filter would also match those spaces. So if I read the code correctly, we should use following logic here: if (/^\s*[^\[\s]/.test(line)) converter.addFilter(Filter.fromText(Filter.normalize(line)));
https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js#ne... abp2blocklist.js:25: var converter = new ContentBlockerList(); Nit: As you went to call the class ContentBlockerList, mind calling the variable "blocklist" (or similar) to match the semantics?
Patch Set 6 : Tweaked line skipping logic and addressed a nit https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js#ne... abp2blocklist.js:25: var converter = new ContentBlockerList(); On 2016/02/24 00:49:50, Sebastian Noack wrote: > Nit: As you went to call the class ContentBlockerList, mind calling the variable > "blocklist" (or similar) to match the semantics? Done. https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js#ne... abp2blocklist.js:29: if (line.charAt(0) != "[" && line.search(/\S/) > -1) On 2016/02/24 00:45:41, Sebastian Noack wrote: > After reading through the respective code in adblockpluscore that downloads and > parses filter lists, I found two seemingly inconsistencies with the logic here. > But feel free to double check. > > 1. Leading whitespaces (e.g. " [foo]") don't prevent a header from being > recognized as such. > 2. After downloading the filter subscription, filters are normalized before they > are parsed and saved, e.g. "||example.com " would be treated as > "||example.com", while without normalization the filter would also match those > spaces. > > So if I read the code correctly, we should use following logic here: > > if (/^\s*[^\[\s]/.test(line)) > converter.addFilter(Filter.fromText(Filter.normalize(line))); I don't think that will do it as it also matches blank lines. I can't see an easy way to do it with just a regexp, but how about this?
Sorry for the delay, here goes: https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/22 18:20:26, Sebastian Noack wrote: > On 2016/02/22 18:09:29, kzar wrote: > > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > > On 2016/02/22 12:28:06, kzar wrote: > > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > > straight-forward, as this is the module of the project, with this > module > > > > being > > > > > > the only public API. > > > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > > > I think abp2blocklist is a fine name for the script that converts an > Adblock > > > > Plus filter list to a content blocker list, but not for a library that > will > > > one > > > > day be included in adblockpluscore. > > > > > > > > I've renamed to contentBlockerList to match the class name, hope that's > OK. > > > > > > Whether it will be included into adblockpluscore or remain as a separate > repo > > > isn't clear yet. Currently, in particular with the great changes you did > here, > > I > > > mildly lend towards the latter. > > > > > > It's extremely uncommon that libraries like that call their (main) module > > > differently than the project itself. IMO, it's quite confusing. > > > > I'd really rather keep the name lib/contentBlockerLists.js for this, I don't > > think it's too confusing. > > So I guess we need a third opinion here. fhd, what do you think? No particularly strong opinion, but since you're asking, here's how I see it: If we're doing it in an object oriented way, it's a thing, a thing that can do something, and the name should be a noun that describes the thing. If we're doing it in a procedural way, it's essentially a function, and the name should be a verb that describes the function. ("abp2blocklist" is a verb to me, in this sense, since I see it as an abbreviation for something like "Convert ABP lists to content blocker lists".) Here, Dave started out procedural and switched to something object oriented later. I couldn't find any discussion on that at first glance, maybe that's the more interesting discussion. No strong opinion either, but IMO, while we can really worry about whether to move this into adblockpluscore later, when in doubt, I would lean towards using the same paradigms across projects, and our core code is pretty object oriented. But when it comes to the module name: How I see it, this module is really nothing but a wrapper around either the function or the class (depending on which paradigm we go for), and I would lean towards giving it a matching name. So if the code stays like this, I would go for `contentBlockerList`, singular. Hope that helps.
Patch Set 7 : Use Sebastian's regexp suggestion https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js File abp2blocklist.js (right): https://codereview.adblockplus.org/29336753/diff/29337491/abp2blocklist.js#ne... abp2blocklist.js:29: if (line.charAt(0) != "[" && line.search(/\S/) > -1) On 2016/02/24 09:59:27, kzar wrote: > On 2016/02/24 00:45:41, Sebastian Noack wrote: > > After reading through the respective code in adblockpluscore that downloads > and > > parses filter lists, I found two seemingly inconsistencies with the logic > here. > > But feel free to double check. > > > > 1. Leading whitespaces (e.g. " [foo]") don't prevent a header from being > > recognized as such. > > 2. After downloading the filter subscription, filters are normalized before > they > > are parsed and saved, e.g. "||example.com " would be treated as > > "||example.com", while without normalization the filter would also match those > > spaces. > > > > So if I read the code correctly, we should use following logic here: > > > > if (/^\s*[^\[\s]/.test(line)) > > converter.addFilter(Filter.fromText(Filter.normalize(line))); > > I don't think that will do it as it also matches blank lines. I can't see an > easy way to do it with just a regexp, but how about this? (Sorry, as discussed in IRC I made a mistake during testing this. I didn't realise the testing tool had multi-line mode enabled, the whitespace lines matched were actually just a continued match from the previous line.)
https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/24 13:06:30, Felix Dahlke wrote: > On 2016/02/22 18:20:26, Sebastian Noack wrote: > > On 2016/02/22 18:09:29, kzar wrote: > > > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > > > On 2016/02/22 12:28:06, kzar wrote: > > > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > > > straight-forward, as this is the module of the project, with this > > module > > > > > being > > > > > > > the only public API. > > > > > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > > > > > I think abp2blocklist is a fine name for the script that converts an > > Adblock > > > > > Plus filter list to a content blocker list, but not for a library that > > will > > > > one > > > > > day be included in adblockpluscore. > > > > > > > > > > I've renamed to contentBlockerList to match the class name, hope that's > > OK. > > > > > > > > Whether it will be included into adblockpluscore or remain as a separate > > repo > > > > isn't clear yet. Currently, in particular with the great changes you did > > here, > > > I > > > > mildly lend towards the latter. > > > > > > > > It's extremely uncommon that libraries like that call their (main) module > > > > differently than the project itself. IMO, it's quite confusing. > > > > > > I'd really rather keep the name lib/contentBlockerLists.js for this, I don't > > > think it's too confusing. > > > > So I guess we need a third opinion here. fhd, what do you think? > > No particularly strong opinion, but since you're asking, here's how I see it: > > If we're doing it in an object oriented way, it's a thing, a thing that can do > something, and the name should be a noun that describes the thing. If we're > doing it in a procedural way, it's essentially a function, and the name should > be a verb that describes the function. ("abp2blocklist" is a verb to me, in this > sense, since I see it as an abbreviation for something like "Convert ABP lists > to content blocker lists".) > > Here, Dave started out procedural and switched to something object oriented > later. I couldn't find any discussion on that at first glance, maybe that's the > more interesting discussion. No strong opinion either, but IMO, while we can > really worry about whether to move this into adblockpluscore later, when in > doubt, I would lean towards using the same paradigms across projects, and our > core code is pretty object oriented. > > But when it comes to the module name: How I see it, this module is really > nothing but a wrapper around either the function or the class (depending on > which paradigm we go for), and I would lean towards giving it a matching name. > So if the code stays like this, I would go for `contentBlockerList`, singular. > > Hope that helps. Thanks for your feedback. But first of all "abp2blocklist" is arguably the name of this project, the same as "Adblock Plus" is the name of our browser extension, and a name is by definition a noun. As for procedural vs. object oriented, it doesn't really matter, as we are not talking about class and function names, but about the name of the module. Integration or consistence with adblockpluscore isn't anything to be considered here. As of now, we don't have any plans to integrate this code with adblockpluscore. And I don't think the structure of adblockpluscore can be compared to the case here. While adblockpluscore is a collection of different modules that are very specific to Adblock Plus, abp2blocklist is more like a generic library with a single entry point just like punycode, tldjs, express, async, etc.
https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/24 19:31:51, Sebastian Noack wrote: > On 2016/02/24 13:06:30, Felix Dahlke wrote: > > On 2016/02/22 18:20:26, Sebastian Noack wrote: > > > On 2016/02/22 18:09:29, kzar wrote: > > > > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > > > > On 2016/02/22 12:28:06, kzar wrote: > > > > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > > > > straight-forward, as this is the module of the project, with this > > > module > > > > > > being > > > > > > > > the only public API. > > > > > > > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > > > > > > > I think abp2blocklist is a fine name for the script that converts an > > > Adblock > > > > > > Plus filter list to a content blocker list, but not for a library that > > > will > > > > > one > > > > > > day be included in adblockpluscore. > > > > > > > > > > > > I've renamed to contentBlockerList to match the class name, hope > that's > > > OK. > > > > > > > > > > Whether it will be included into adblockpluscore or remain as a separate > > > repo > > > > > isn't clear yet. Currently, in particular with the great changes you did > > > here, > > > > I > > > > > mildly lend towards the latter. > > > > > > > > > > It's extremely uncommon that libraries like that call their (main) > module > > > > > differently than the project itself. IMO, it's quite confusing. > > > > > > > > I'd really rather keep the name lib/contentBlockerLists.js for this, I > don't > > > > think it's too confusing. > > > > > > So I guess we need a third opinion here. fhd, what do you think? > > > > No particularly strong opinion, but since you're asking, here's how I see it: > > > > If we're doing it in an object oriented way, it's a thing, a thing that can do > > something, and the name should be a noun that describes the thing. If we're > > doing it in a procedural way, it's essentially a function, and the name should > > be a verb that describes the function. ("abp2blocklist" is a verb to me, in > this > > sense, since I see it as an abbreviation for something like "Convert ABP lists > > to content blocker lists".) > > > > Here, Dave started out procedural and switched to something object oriented > > later. I couldn't find any discussion on that at first glance, maybe that's > the > > more interesting discussion. No strong opinion either, but IMO, while we can > > really worry about whether to move this into adblockpluscore later, when in > > doubt, I would lean towards using the same paradigms across projects, and our > > core code is pretty object oriented. > > > > But when it comes to the module name: How I see it, this module is really > > nothing but a wrapper around either the function or the class (depending on > > which paradigm we go for), and I would lean towards giving it a matching name. > > So if the code stays like this, I would go for `contentBlockerList`, singular. > > > > Hope that helps. > > Thanks for your feedback. > > But first of all "abp2blocklist" is arguably the name of this project, the same > as "Adblock Plus" is the name of our browser extension, and a name is by > definition a noun. > > As for procedural vs. object oriented, it doesn't really matter, as we are not > talking about class and function names, but about the name of the module. > > Integration or consistence with adblockpluscore isn't anything to be considered > here. As of now, we don't have any plans to integrate this code with > adblockpluscore. And I don't think the structure of adblockpluscore can be > compared to the case here. While adblockpluscore is a collection of different > modules that are very specific to Adblock Plus, abp2blocklist is more like a > generic library with a single entry point just like punycode, tldjs, express, > async, etc. Dave and I continued the discussion on IRC: <snoack> So you still disagree? <kzar> Well yea, I would still prefer to call it lib/contentBlockerList.js than lib/abp2blocklist.js <snoack> What's about comparing abp2blocklist to libraries like punycode (whose module isn't called "DomainConverter" either) or tldjs (not going for "TopLevelDomainDatabase") or express (not going for "WebFrameworkInterface"), but with a module name that matches their respective project name? <kzar> I guess here's the way I look at it: abp2blocklist started off as a script to generate content blocker lists and its name made sense then. Now we're splitting off the library, in the future the library might well not even be in the same repository as the abp2blocklist script. I think when naming the library we should choose something that will make sense for it even if the abp2blocklist repository no longer exists. <kzar> Look at the names of modules in adblockpluscore, things like lib/filterClasses.js. I think lib/contentBlockList.js would fix in with those names better than lib/abp2blocklist.js. <kzar> But ultimately this is a subjective matter, either opinion about it is valid <snoack> As I said, that really doesn't matter. As of now, there is no point of moving that code to adblockpluscore. With your changes it can persist as a separate repo, which also fits better into our granular repo approach, in particular since the code in there really is rather "platform" then "core", and we definitely don't want to have repos with code of <snoack> different modules anymore. We've been there before. <snoack> And even if we should move that code somewhere else in the future, we have to make changes anyway, renaming the imports while doing so wouldn't be a big deal. It seems that we can not get a consensus. So let's leave it up to Felix!
On 2016/02/24 22:10:04, Sebastian Noack wrote: > It seems that we can not get a consensus. So let's leave it up to Felix! Sounds good to me, happy to just go with whichever he says at this point! Are there any other things you want me to address before pushing this? (If not would you mind giving a provisional L... so I can push once we have a decision?)
On 2016/02/24 22:27:16, kzar wrote: > On 2016/02/24 22:10:04, Sebastian Noack wrote: > > It seems that we can not get a consensus. So let's leave it up to Felix! > > Sounds good to me, happy to just go with whichever he says at this point! > > Are there any other things you want me to address before pushing this? (If not > would you mind giving a provisional L... so I can push once we have a decision?) Yeah, otherwise, LGTM
Sorry for the delay! https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... File lib/contentBlockerLists.js (right): https://codereview.adblockplus.org/29336753/diff/29336776/lib/contentBlockerL... lib/contentBlockerLists.js:18: /** @module contentBlockerLists */ On 2016/02/24 22:10:04, Sebastian Noack wrote: > On 2016/02/24 19:31:51, Sebastian Noack wrote: > > On 2016/02/24 13:06:30, Felix Dahlke wrote: > > > On 2016/02/22 18:20:26, Sebastian Noack wrote: > > > > On 2016/02/22 18:09:29, kzar wrote: > > > > > On 2016/02/22 17:35:28, Sebastian Noack wrote: > > > > > > On 2016/02/22 12:28:06, kzar wrote: > > > > > > > On 2016/02/21 20:36:15, Sebastian Noack wrote: > > > > > > > > On 2016/02/21 20:34:45, Sebastian Noack wrote: > > > > > > > > > I'd rather call that module simply abp2blocklist, to keep things > > > > > > > > > straight-forward, as this is the module of the project, with > this > > > > module > > > > > > > being > > > > > > > > > the only public API. > > > > > > > > > > > > > > > > s/is the module of the project/is the name of the project/ > > > > > > > > > > > > > > I think abp2blocklist is a fine name for the script that converts an > > > > Adblock > > > > > > > Plus filter list to a content blocker list, but not for a library > that > > > > will > > > > > > one > > > > > > > day be included in adblockpluscore. > > > > > > > > > > > > > > I've renamed to contentBlockerList to match the class name, hope > > that's > > > > OK. > > > > > > > > > > > > Whether it will be included into adblockpluscore or remain as a > separate > > > > repo > > > > > > isn't clear yet. Currently, in particular with the great changes you > did > > > > here, > > > > > I > > > > > > mildly lend towards the latter. > > > > > > > > > > > > It's extremely uncommon that libraries like that call their (main) > > module > > > > > > differently than the project itself. IMO, it's quite confusing. > > > > > > > > > > I'd really rather keep the name lib/contentBlockerLists.js for this, I > > don't > > > > > think it's too confusing. > > > > > > > > So I guess we need a third opinion here. fhd, what do you think? > > > > > > No particularly strong opinion, but since you're asking, here's how I see > it: > > > > > > If we're doing it in an object oriented way, it's a thing, a thing that can > do > > > something, and the name should be a noun that describes the thing. If we're > > > doing it in a procedural way, it's essentially a function, and the name > should > > > be a verb that describes the function. ("abp2blocklist" is a verb to me, in > > this > > > sense, since I see it as an abbreviation for something like "Convert ABP > lists > > > to content blocker lists".) > > > > > > Here, Dave started out procedural and switched to something object oriented > > > later. I couldn't find any discussion on that at first glance, maybe that's > > the > > > more interesting discussion. No strong opinion either, but IMO, while we can > > > really worry about whether to move this into adblockpluscore later, when in > > > doubt, I would lean towards using the same paradigms across projects, and > our > > > core code is pretty object oriented. > > > > > > But when it comes to the module name: How I see it, this module is really > > > nothing but a wrapper around either the function or the class (depending on > > > which paradigm we go for), and I would lean towards giving it a matching > name. > > > So if the code stays like this, I would go for `contentBlockerList`, > singular. > > > > > > Hope that helps. > > > > Thanks for your feedback. > > > > But first of all "abp2blocklist" is arguably the name of this project, the > same > > as "Adblock Plus" is the name of our browser extension, and a name is by > > definition a noun. > > > > As for procedural vs. object oriented, it doesn't really matter, as we are not > > talking about class and function names, but about the name of the module. > > > > Integration or consistence with adblockpluscore isn't anything to be > considered > > here. As of now, we don't have any plans to integrate this code with > > adblockpluscore. And I don't think the structure of adblockpluscore can be > > compared to the case here. While adblockpluscore is a collection of different > > modules that are very specific to Adblock Plus, abp2blocklist is more like a > > generic library with a single entry point just like punycode, tldjs, express, > > async, etc. > > Dave and I continued the discussion on IRC: > > <snoack> So you still disagree? > <kzar> Well yea, I would still prefer to call it lib/contentBlockerList.js than > lib/abp2blocklist.js > <snoack> What's about comparing abp2blocklist to libraries like punycode (whose > module isn't called "DomainConverter" either) or tldjs (not going for > "TopLevelDomainDatabase") or express (not going for "WebFrameworkInterface"), > but with a module name that matches their respective project name? > <kzar> I guess here's the way I look at it: abp2blocklist started off as a > script to generate content blocker lists and its name made sense then. Now we're > splitting off the library, in the future the library might well not even be in > the same repository as the abp2blocklist script. I think when naming the library > we should choose something that will make sense for it even if the abp2blocklist > repository no longer exists. > <kzar> Look at the names of modules in adblockpluscore, things like > lib/filterClasses.js. I think lib/contentBlockList.js would fix in with those > names better than lib/abp2blocklist.js. > <kzar> But ultimately this is a subjective matter, either opinion about it is > valid > <snoack> As I said, that really doesn't matter. As of now, there is no point of > moving that code to adblockpluscore. With your changes it can persist as a > separate repo, which also fits better into our granular repo approach, in > particular since the code in there really is rather "platform" then "core", and > we definitely don't want to have repos with code of > <snoack> different modules anymore. We've been there before. > <snoack> And even if we should move that code somewhere else in the future, we > have to make changes anyway, renaming the imports while doing so wouldn't be a > big deal. > > It seems that we can not get a consensus. So let's leave it up to Felix! Alright, read up on the discussion again and had a closer look at the code, here I go: My previous reasoning was from this angle: We have an internal contentBlockerList module that is used in the abp2blocklist script, which is, in a way, the public API of this repository. So I was commenting on the name of an internal module. If you intend to turn this into a library project, then it does make more sense to me to name the module that contains its public API just like the library, that's how it seems to be done in other projects most of the time. (We could give the library a different name, I don't care much. But note that we already have a repository called "contentblockerlists", that stores pregenerated lists.) So I guess the real discussion is, after all, whether we turn abp2blocklist into a library, or whether we move the code to adblockpluscore. I suggest we do the former for now, for the sake of progress. And as Sebastian said, currently this code is in Platform, not Core, which is IMO the better place for it right now, if only because the people who actually work on it own it that way. BUT: If the name of the library stays "abp2blocklist", can you move the "abp2blocklist.js" script to /bin, and/or give it some other name? I find it confusing if we have a file of the same name both in the root and in the lib folders. We might even want to not have a conversion script at all here and rather move that into sitescripts - we seem to only need it for producing pregenerated lists, which we only need in sitescripts.
Patch Set 8 : Renamed lib/contentBlockList.js to lib/abp2blocklist.js |