|
|
Created:
March 6, 2019, 3:21 p.m. by hub Modified:
May 10, 2019, 2:17 p.m. CC:
Manish Jethani, Felix Dahlke Base URL:
https://hg.adblockplus.org/adblockpluscore/ Visibility:
Public. |
DescriptionIssue 7450 - Implement hide-if-contains-visible-text snippet
Patch Set 1 #
Total comments: 9
Patch Set 2 : Now handle when the element is hidden #Patch Set 3 : This is now a snippet #Patch Set 4 : added docs, more test. #
Total comments: 26
Patch Set 5 : Rework snippet #Patch Set 6 : Minor documentation tweak #
Total comments: 16
Patch Set 7 : Review comments addressed #Patch Set 8 : a couple of jsdocs fixes #Patch Set 9 : Snippet is now type 1 #
Total comments: 7
Patch Set 10 : Fixed a few nits. Unfactored test code. #
Total comments: 20
Patch Set 11 : Added some more CSS properties to check. Minor review comments addressed. #
Total comments: 2
Patch Set 12 : Properly handle visibility #
Total comments: 4
Patch Set 13 : Check about null parent and use parentElement instead #Patch Set 14 : Updated patch for real #
MessagesTotal messages: 49
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); I wonder if it is worth complexifying the code to call this function in the outer code to avoid redundancy on what *may* be expensive.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:375: function getVisibleContent(element) Instead of doing all this, why not just use the `innerText` property? https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/innerText
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:375: function getVisibleContent(element) On 2019/03/12 08:44:01, Manish Jethani wrote: > Instead of doing all this, why not just use the `innerText` property? > > https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/innerText innerText doesn't know about opacity, 0 sized font, etc. It will only consider `display` and `visibility`. And I need to figure out individual which text node are visible or not.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); On 2019/03/06 16:34:11, hub wrote: > I wonder if it is worth complexifying the code to call this function in the > outer code to avoid redundancy on what *may* be expensive. If you have `<div style="display: none"><div>Sponsored</div></div>` then this won't work anyway, because the immediate parent does not have any of the hiding styles. We need to check whether a node is visible before descending into it. Actually we need to check only for the non-inherited properties [1] first before descending into a node. The inherited properties (like color) can be checked when we're examining a text node (second case block). [1]: https://developer.mozilla.org/en-US/docs/Web/CSS/inheritance#Non-inherited_pr... https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:375: function getVisibleContent(element) On 2019/03/12 09:29:29, hub wrote: > On 2019/03/12 08:44:01, Manish Jethani wrote: > > Instead of doing all this, why not just use the `innerText` property? > > > > https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/innerText > > innerText doesn't know about opacity, 0 sized font, etc. It will only consider > `display` and `visibility`. And I need to figure out individual which text node > are visible or not. Acknowledged.
Also see my comment over here: https://issues.adblockplus.org/ticket/7337#comment:9 I think we should go with this syntax tentatively.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); On 2019/03/12 12:24:09, Manish Jethani wrote: > On 2019/03/06 16:34:11, hub wrote: > > I wonder if it is worth complexifying the code to call this function in the > > outer code to avoid redundancy on what *may* be expensive. > > If you have `<div style="display: none"><div>Sponsored</div></div>` then this > won't work anyway, because the immediate parent does not have any of the hiding > styles. We need to check whether a node is visible before descending into it. > > Actually we need to check only for the non-inherited properties [1] first before > descending into a node. The inherited properties (like color) can be checked > when we're examining a text node (second case block). > > [1]: > https://developer.mozilla.org/en-US/docs/Web/CSS/inheritance#Non-inherited_pr... I think we can constrain to "visible within the specified element". With your example from the outer div (that has the style attribute), the text "Sponsored" is visible. But it is the outer element that is hidden. Also I have seen some case where setting the outer element style to `display:none` make `innerText` return the text even for the element that also is `display: none` so it really makes for us having to drill into the node tree.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); On 2019/03/14 10:12:40, hub wrote: > On 2019/03/12 12:24:09, Manish Jethani wrote: > > On 2019/03/06 16:34:11, hub wrote: > > > I wonder if it is worth complexifying the code to call this function in the > > > outer code to avoid redundancy on what *may* be expensive. > > > > If you have `<div style="display: none"><div>Sponsored</div></div>` then this > > won't work anyway, because the immediate parent does not have any of the > hiding > > styles. We need to check whether a node is visible before descending into it. > > > > Actually we need to check only for the non-inherited properties [1] first > before > > descending into a node. The inherited properties (like color) can be checked > > when we're examining a text node (second case block). > > > > [1]: > > > https://developer.mozilla.org/en-US/docs/Web/CSS/inheritance#Non-inherited_pr... > > I think we can constrain to "visible within the specified element". With your > example from the outer div (that has the style attribute), the text "Sponsored" > is visible. But it is the outer element that is hidden. OK, perhaps I can change my example to this: `<article><div style="display: none"><div>Sponsored</div></div></article>`. The filter `#?#article:-abp-contains-visible(Sponsored)` should hide the `<article>` element. With the current code it wouldn't.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); On 2019/03/14 11:36:33, Manish Jethani wrote: > OK, perhaps I can change my example to this: `<article><div style="display: > none"><div>Sponsored</div></div></article>`. The filter > `#?#article:-abp-contains-visible(Sponsored)` should hide the `<article>` > element. With the current code it wouldn't. Actually I believe it should not hide it. But it does.
https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... File lib/content/elemHideEmulation.js (right): https://codereview.adblockplus.org/30024560/diff/30024561/lib/content/elemHid... lib/content/elemHideEmulation.js:352: let style = window.getComputedStyle(element); On 2019/03/20 14:19:55, hub wrote: > On 2019/03/14 11:36:33, Manish Jethani wrote: > > > OK, perhaps I can change my example to this: `<article><div style="display: > > none"><div>Sponsored</div></div></article>`. The filter > > `#?#article:-abp-contains-visible(Sponsored)` should hide the `<article>` > > element. With the current code it wouldn't. > > Actually I believe it should not hide it. But it does. This is handled properly now.
This is the snippet approach. I need to have a wider test coverage, and I need to test it IRL. (and at that point rewrite the trac issue)
On 2019/04/05 21:06:06, hub wrote: > This is the snippet approach. I need to have a wider test coverage, and I need > to test it IRL. > > (and at that point rewrite the trac issue) Thanks, I'll take a look at this tomorrow.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) I know the use case here is to target a node based on visible text, but I can see the definition of "visible" changing both between releases and for different sites. How about we make it an additional parameter to the `hide-if-contains-and-matches-style` snippet? We could call it `textStyle`. In addition to doing everything else it already does, it would also do what it does here, i.e. check the style of the text. If we do this, it would be more flexible but would not cover the case of `color == background-color`, but filter list authors can just specify the actual color on the site and it should work. Since regular expressions would be supported, it should cover more cases. To use the example in the Trac ticket: <style> .c978y4 { display: none; } </style> <div id="foo"> <span>S</span><span class="c978y4">A</span><span>ponso</span><span class="c978y4">x</span><span>red</span> </div> The filter would look like this: example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' '' 'display: inline' With named arguments: example.com#$#hide-if-contains-and-matches-style search=Sponsored selector=#foo text-style='display: inline' What do you think?
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/09 12:31:23, Manish Jethani wrote: > I know the use case here is to target a node based on visible text, but I can > see the definition of "visible" changing both between releases and for different > sites. How about we make it an additional parameter to the > `hide-if-contains-and-matches-style` snippet? We could call it `textStyle`. In > addition to doing everything else it already does, it would also do what it does > here, i.e. check the style of the text. > > If we do this, it would be more flexible but would not cover the case of `color > == background-color`, but filter list authors can just specify the actual color > on the site and it should work. Since regular expressions would be supported, it > should cover more cases. > > To use the example in the Trac ticket: > > <style> > .c978y4 { > display: none; > } > </style> > <div id="foo"> > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > class="c978y4">x</span><span>red</span> > </div> > > The filter would look like this: > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' '' > 'display: inline' > > With named arguments: > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > selector=#foo text-style='display: inline' > > What do you think? I think this defeat the design of that snippet that establish the visibility: the idea is that the filter authors don't have to deal with it, and therefor don't need to update the filter when things change. We should instead ensure we can cover all the bases here. Also we have seen in the wild case were some of the text has opacity=0 and other has font-size=0. In the same label. Which this snippet handles fine.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 16:16:21, hub wrote: > On 2019/04/09 12:31:23, Manish Jethani wrote: > > I know the use case here is to target a node based on visible text, but I can > > see the definition of "visible" changing both between releases and for > different > > sites. How about we make it an additional parameter to the > > `hide-if-contains-and-matches-style` snippet? We could call it `textStyle`. In > > addition to doing everything else it already does, it would also do what it > does > > here, i.e. check the style of the text. > > > > If we do this, it would be more flexible but would not cover the case of > `color > > == background-color`, but filter list authors can just specify the actual > color > > on the site and it should work. Since regular expressions would be supported, > it > > should cover more cases. > > > > To use the example in the Trac ticket: > > > > <style> > > .c978y4 { > > display: none; > > } > > </style> > > <div id="foo"> > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > class="c978y4">x</span><span>red</span> > > </div> > > > > The filter would look like this: > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' '' > > 'display: inline' > > > > With named arguments: > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > selector=#foo text-style='display: inline' > > > > What do you think? > > I think this defeat the design of that snippet that establish the visibility: > the idea is that the filter authors don't have to deal with it, and therefor > don't need to update the filter when things change. We should instead ensure we > can cover all the bases here. > > Also we have seen in the wild case were some of the text has opacity=0 and other > has font-size=0. In the same label. Which this snippet handles fine. Instead of hard-coding the styles that would identify text that is "visible" (by some definition of the term), what I am suggesting is that we make it dynamic so that filter list authors can specify their own style. In the case of ABP filters, we would then specify the styles ourselves - but not in the snippet's code but in the filter. This way we can use it for multiple purposes to cover more cases and still retain the flexibility of updating our technique to identify visible text without having to do a whole new release. What do you think is the downside to this approach exactly?
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 16:39:23, Manish Jethani wrote: > On 2019/04/10 16:16:21, hub wrote: > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > I know the use case here is to target a node based on visible text, but I > can > > > see the definition of "visible" changing both between releases and for > > different > > > sites. How about we make it an additional parameter to the > > > `hide-if-contains-and-matches-style` snippet? We could call it `textStyle`. > In > > > addition to doing everything else it already does, it would also do what it > > does > > > here, i.e. check the style of the text. > > > > > > If we do this, it would be more flexible but would not cover the case of > > `color > > > == background-color`, but filter list authors can just specify the actual > > color > > > on the site and it should work. Since regular expressions would be > supported, > > it > > > should cover more cases. > > > > > > To use the example in the Trac ticket: > > > > > > <style> > > > .c978y4 { > > > display: none; > > > } > > > </style> > > > <div id="foo"> > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > class="c978y4">x</span><span>red</span> > > > </div> > > > > > > The filter would look like this: > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' '' > > > 'display: inline' > > > > > > With named arguments: > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > selector=#foo text-style='display: inline' > > > > > > What do you think? > > > > I think this defeat the design of that snippet that establish the visibility: > > the idea is that the filter authors don't have to deal with it, and therefor > > don't need to update the filter when things change. We should instead ensure > we > > can cover all the bases here. > > > > Also we have seen in the wild case were some of the text has opacity=0 and > other > > has font-size=0. In the same label. Which this snippet handles fine. > > Instead of hard-coding the styles that would identify text that is "visible" (by > some definition of the term), what I am suggesting is that we make it dynamic so > that filter list authors can specify their own style. In the case of ABP > filters, we would then specify the styles ourselves - but not in the snippet's > code but in the filter. This way we can use it for multiple purposes to cover > more cases and still retain the flexibility of updating our technique to > identify visible text without having to do a whole new release. > > What do you think is the downside to this approach exactly? By the way, if you think we should add a `hide-if-contains-visible-text` snippet, let's add that too, but let's route the implementation through `hide-if-contains-and-matches-style`. This way if it gets circumvented at least we don't have to wait for a new release, plus the text-style matching functionality might come in handy in other cases. Does this make sense?
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 16:39:23, Manish Jethani wrote: > On 2019/04/10 16:16:21, hub wrote: > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > I know the use case here is to target a node based on visible text, but I > can > > > see the definition of "visible" changing both between releases and for > > different > > > sites. How about we make it an additional parameter to the > > > `hide-if-contains-and-matches-style` snippet? We could call it `textStyle`. > In > > > addition to doing everything else it already does, it would also do what it > > does > > > here, i.e. check the style of the text. > > > > > > If we do this, it would be more flexible but would not cover the case of > > `color > > > == background-color`, but filter list authors can just specify the actual > > color > > > on the site and it should work. Since regular expressions would be > supported, > > it > > > should cover more cases. > > > > > > To use the example in the Trac ticket: > > > > > > <style> > > > .c978y4 { > > > display: none; > > > } > > > </style> > > > <div id="foo"> > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > class="c978y4">x</span><span>red</span> > > > </div> > > > > > > The filter would look like this: > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' '' > > > 'display: inline' > > > > > > With named arguments: > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > selector=#foo text-style='display: inline' > > > > > > What do you think? > > > > I think this defeat the design of that snippet that establish the visibility: > > the idea is that the filter authors don't have to deal with it, and therefor > > don't need to update the filter when things change. We should instead ensure > we > > can cover all the bases here. > > > > Also we have seen in the wild case were some of the text has opacity=0 and > other > > has font-size=0. In the same label. Which this snippet handles fine. > > Instead of hard-coding the styles that would identify text that is "visible" (by > some definition of the term), what I am suggesting is that we make it dynamic so > that filter list authors can specify their own style. In the case of ABP > filters, we would then specify the styles ourselves - but not in the snippet's > code but in the filter. This way we can use it for multiple purposes to cover > more cases and still retain the flexibility of updating our technique to > identify visible text without having to do a whole new release. > > What do you think is the downside to this approach exactly? I'd be ok, despite being more complicated, to allow filter authors to specify properties/values to either override or supplement the defaults. However this mean we need to agree on a syntax and that can get complicated. But it is important that this snippets works well with default values (I have added more CSS properties than what I found used in the wild, and I'm ok to justify adding more). Also another thing that is not being handled, because I believe it is complicated, but could fall into the area of this snippets, is the absolute position of the element could out of the display to hide it. I have deliberately left this out for the first version because of its inherent complexity. But if it came it was used... https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 17:15:17, Manish Jethani wrote: > On 2019/04/10 16:39:23, Manish Jethani wrote: > > On 2019/04/10 16:16:21, hub wrote: > > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > > I know the use case here is to target a node based on visible text, but I > > can > > > > see the definition of "visible" changing both between releases and for > > > different > > > > sites. How about we make it an additional parameter to the > > > > `hide-if-contains-and-matches-style` snippet? We could call it > `textStyle`. > > In > > > > addition to doing everything else it already does, it would also do what > it > > > does > > > > here, i.e. check the style of the text. > > > > > > > > If we do this, it would be more flexible but would not cover the case of > > > `color > > > > == background-color`, but filter list authors can just specify the actual > > > color > > > > on the site and it should work. Since regular expressions would be > > supported, > > > it > > > > should cover more cases. > > > > > > > > To use the example in the Trac ticket: > > > > > > > > <style> > > > > .c978y4 { > > > > display: none; > > > > } > > > > </style> > > > > <div id="foo"> > > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > > class="c978y4">x</span><span>red</span> > > > > </div> > > > > > > > > The filter would look like this: > > > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' > '' > > > > 'display: inline' > > > > > > > > With named arguments: > > > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > > selector=#foo text-style='display: inline' > > > > > > > > What do you think? > > > > > > I think this defeat the design of that snippet that establish the > visibility: > > > the idea is that the filter authors don't have to deal with it, and therefor > > > don't need to update the filter when things change. We should instead ensure > > we > > > can cover all the bases here. > > > > > > Also we have seen in the wild case were some of the text has opacity=0 and > > other > > > has font-size=0. In the same label. Which this snippet handles fine. > > > > Instead of hard-coding the styles that would identify text that is "visible" > (by > > some definition of the term), what I am suggesting is that we make it dynamic > so > > that filter list authors can specify their own style. In the case of ABP > > filters, we would then specify the styles ourselves - but not in the snippet's > > code but in the filter. This way we can use it for multiple purposes to cover > > more cases and still retain the flexibility of updating our technique to > > identify visible text without having to do a whole new release. > > > > What do you think is the downside to this approach exactly? > > By the way, if you think we should add a `hide-if-contains-visible-text` > snippet, let's add that too, but let's route the implementation through > `hide-if-contains-and-matches-style`. This way if it gets circumvented at least > we don't have to wait for a new release, plus the text-style matching > functionality might come in handy in other cases. > > Does this make sense? I don't think it make sense to merge the code of these two snippets.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 19:40:05, hub wrote: > On 2019/04/10 16:39:23, Manish Jethani wrote: > > On 2019/04/10 16:16:21, hub wrote: > > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > > I know the use case here is to target a node based on visible text, but I > > can > > > > see the definition of "visible" changing both between releases and for > > > different > > > > sites. How about we make it an additional parameter to the > > > > `hide-if-contains-and-matches-style` snippet? We could call it > `textStyle`. > > In > > > > addition to doing everything else it already does, it would also do what > it > > > does > > > > here, i.e. check the style of the text. > > > > > > > > If we do this, it would be more flexible but would not cover the case of > > > `color > > > > == background-color`, but filter list authors can just specify the actual > > > color > > > > on the site and it should work. Since regular expressions would be > > supported, > > > it > > > > should cover more cases. > > > > > > > > To use the example in the Trac ticket: > > > > > > > > <style> > > > > .c978y4 { > > > > display: none; > > > > } > > > > </style> > > > > <div id="foo"> > > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > > class="c978y4">x</span><span>red</span> > > > > </div> > > > > > > > > The filter would look like this: > > > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo '' > '' > > > > 'display: inline' > > > > > > > > With named arguments: > > > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > > selector=#foo text-style='display: inline' > > > > > > > > What do you think? > > > > > > I think this defeat the design of that snippet that establish the > visibility: > > > the idea is that the filter authors don't have to deal with it, and therefor > > > don't need to update the filter when things change. We should instead ensure > > we > > > can cover all the bases here. > > > > > > Also we have seen in the wild case were some of the text has opacity=0 and > > other > > > has font-size=0. In the same label. Which this snippet handles fine. > > > > Instead of hard-coding the styles that would identify text that is "visible" > (by > > some definition of the term), what I am suggesting is that we make it dynamic > so > > that filter list authors can specify their own style. In the case of ABP > > filters, we would then specify the styles ourselves - but not in the snippet's > > code but in the filter. This way we can use it for multiple purposes to cover > > more cases and still retain the flexibility of updating our technique to > > identify visible text without having to do a whole new release. > > > > What do you think is the downside to this approach exactly? > > I'd be ok, despite being more complicated, to allow filter authors to specify > properties/values to either override or supplement the defaults. However this > mean we need to agree on a syntax and that can get complicated. But it is > important that this snippets works well with default values (I have added more > CSS properties than what I found used in the wild, and I'm ok to justify adding > more). We don't have to invent anything new for the syntax, it can use the same syntax as the `style` and `searchStyle` parameters? Isn't it? > Also another thing that is not being handled, because I believe it is > complicated, but could fall into the area of this snippets, is the absolute > position of the element could out of the display to hide it. I have deliberately > left this out for the first version because of its inherent complexity. But if > it came it was used... Ack, but we can cross that bridge when we get to it. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 19:40:05, hub wrote: > On 2019/04/10 17:15:17, Manish Jethani wrote: > > On 2019/04/10 16:39:23, Manish Jethani wrote: > > > On 2019/04/10 16:16:21, hub wrote: > > > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > > > I know the use case here is to target a node based on visible text, but > I > > > can > > > > > see the definition of "visible" changing both between releases and for > > > > different > > > > > sites. How about we make it an additional parameter to the > > > > > `hide-if-contains-and-matches-style` snippet? We could call it > > `textStyle`. > > > In > > > > > addition to doing everything else it already does, it would also do what > > it > > > > does > > > > > here, i.e. check the style of the text. > > > > > > > > > > If we do this, it would be more flexible but would not cover the case of > > > > `color > > > > > == background-color`, but filter list authors can just specify the > actual > > > > color > > > > > on the site and it should work. Since regular expressions would be > > > supported, > > > > it > > > > > should cover more cases. > > > > > > > > > > To use the example in the Trac ticket: > > > > > > > > > > <style> > > > > > .c978y4 { > > > > > display: none; > > > > > } > > > > > </style> > > > > > <div id="foo"> > > > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > > > class="c978y4">x</span><span>red</span> > > > > > </div> > > > > > > > > > > The filter would look like this: > > > > > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo > '' > > '' > > > > > 'display: inline' > > > > > > > > > > With named arguments: > > > > > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > > > selector=#foo text-style='display: inline' > > > > > > > > > > What do you think? > > > > > > > > I think this defeat the design of that snippet that establish the > > visibility: > > > > the idea is that the filter authors don't have to deal with it, and > therefor > > > > don't need to update the filter when things change. We should instead > ensure > > > we > > > > can cover all the bases here. > > > > > > > > Also we have seen in the wild case were some of the text has opacity=0 and > > > other > > > > has font-size=0. In the same label. Which this snippet handles fine. > > > > > > Instead of hard-coding the styles that would identify text that is "visible" > > (by > > > some definition of the term), what I am suggesting is that we make it > dynamic > > so > > > that filter list authors can specify their own style. In the case of ABP > > > filters, we would then specify the styles ourselves - but not in the > snippet's > > > code but in the filter. This way we can use it for multiple purposes to > cover > > > more cases and still retain the flexibility of updating our technique to > > > identify visible text without having to do a whole new release. > > > > > > What do you think is the downside to this approach exactly? > > > > By the way, if you think we should add a `hide-if-contains-visible-text` > > snippet, let's add that too, but let's route the implementation through > > `hide-if-contains-and-matches-style`. This way if it gets circumvented at > least > > we don't have to wait for a new release, plus the text-style matching > > functionality might come in handy in other cases. > > > > Does this make sense? > > I don't think it make sense to merge the code of these two snippets. What I'm saying is that the entire functionality that you have implemented here can be implemented with _fewer_ lines of code _and_ combined with other functionality in an existing snippet. What's wrong with this? Most of the functionality is common, it's just an extra parameter and a few more lines of code.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/10 19:48:20, Manish Jethani wrote: > On 2019/04/10 19:40:05, hub wrote: > > On 2019/04/10 17:15:17, Manish Jethani wrote: > > > On 2019/04/10 16:39:23, Manish Jethani wrote: > > > > On 2019/04/10 16:16:21, hub wrote: > > > > > On 2019/04/09 12:31:23, Manish Jethani wrote: > > > > > > I know the use case here is to target a node based on visible text, > but > > I > > > > can > > > > > > see the definition of "visible" changing both between releases and for > > > > > different > > > > > > sites. How about we make it an additional parameter to the > > > > > > `hide-if-contains-and-matches-style` snippet? We could call it > > > `textStyle`. > > > > In > > > > > > addition to doing everything else it already does, it would also do > what > > > it > > > > > does > > > > > > here, i.e. check the style of the text. > > > > > > > > > > > > If we do this, it would be more flexible but would not cover the case > of > > > > > `color > > > > > > == background-color`, but filter list authors can just specify the > > actual > > > > > color > > > > > > on the site and it should work. Since regular expressions would be > > > > supported, > > > > > it > > > > > > should cover more cases. > > > > > > > > > > > > To use the example in the Trac ticket: > > > > > > > > > > > > <style> > > > > > > .c978y4 { > > > > > > display: none; > > > > > > } > > > > > > </style> > > > > > > <div id="foo"> > > > > > > <span>S</span><span class="c978y4">A</span><span>ponso</span><span > > > > > > class="c978y4">x</span><span>red</span> > > > > > > </div> > > > > > > > > > > > > The filter would look like this: > > > > > > > > > > > > example.com#$#hide-if-contains-and-matches-style Sponsored #foo #foo > > '' > > > '' > > > > > > 'display: inline' > > > > > > > > > > > > With named arguments: > > > > > > > > > > > > example.com#$#hide-if-contains-and-matches-style search=Sponsored > > > > > > selector=#foo text-style='display: inline' > > > > > > > > > > > > What do you think? > > > > > > > > > > I think this defeat the design of that snippet that establish the > > > visibility: > > > > > the idea is that the filter authors don't have to deal with it, and > > therefor > > > > > don't need to update the filter when things change. We should instead > > ensure > > > > we > > > > > can cover all the bases here. > > > > > > > > > > Also we have seen in the wild case were some of the text has opacity=0 > and > > > > other > > > > > has font-size=0. In the same label. Which this snippet handles fine. > > > > > > > > Instead of hard-coding the styles that would identify text that is > "visible" > > > (by > > > > some definition of the term), what I am suggesting is that we make it > > dynamic > > > so > > > > that filter list authors can specify their own style. In the case of ABP > > > > filters, we would then specify the styles ourselves - but not in the > > snippet's > > > > code but in the filter. This way we can use it for multiple purposes to > > cover > > > > more cases and still retain the flexibility of updating our technique to > > > > identify visible text without having to do a whole new release. > > > > > > > > What do you think is the downside to this approach exactly? > > > > > > By the way, if you think we should add a `hide-if-contains-visible-text` > > > snippet, let's add that too, but let's route the implementation through > > > `hide-if-contains-and-matches-style`. This way if it gets circumvented at > > least > > > we don't have to wait for a new release, plus the text-style matching > > > functionality might come in handy in other cases. > > > > > > Does this make sense? > > > > I don't think it make sense to merge the code of these two snippets. > > What I'm saying is that the entire functionality that you have implemented here > can be implemented with _fewer_ lines of code _and_ combined with other > functionality in an existing snippet. What's wrong with this? Most of the > functionality is common, it's just an extra parameter and a few more lines of > code. To give an example of what I'm saying, let's say there's a function like this: /** * Finds needle string in haystack string. */ function indexOf(haystack: string, needle: string, fromIndex: int = 0); Now we want to implement a `startsWith()` function. What I'm saying is that we should extend the `indexOf()` function to take an additional parameter: function indexOf(haystack: string, needle: string, fromIndex: int = 0, toIndex: int = -1); If `toIndex` is 0, it stops the search at index 0. Then we implement `startsWith()` like this: function startsWith(haystack: string, needle: string) { return indexOf(haystack, needle, 0, 0); } What you're saying is that we repeat the code for searching for a string within a string in `startsWith()`. I don't understand why. We might do it for performance reasons, but in this case I don't see how such a reason would apply. If the parameter is not passed then the function would not follow that path, period.
On 2019/04/10 19:58:53, Manish Jethani wrote: > To give an example of what I'm saying, let's say there's a function like this: > > /** > * Finds needle string in haystack string. > */ > function indexOf(haystack: string, needle: string, fromIndex: int = 0); > > Now we want to implement a `startsWith()` function. What I'm saying is that we > should extend the `indexOf()` function to take an additional parameter: > > function indexOf(haystack: string, needle: string, fromIndex: int = 0, > toIndex: int = -1); > > If `toIndex` is 0, it stops the search at index 0. > > Then we implement `startsWith()` like this: > > function startsWith(haystack: string, needle: string) > { > return indexOf(haystack, needle, 0, 0); > } > > What you're saying is that we repeat the code for searching for a string within > a string in `startsWith()`. I don't understand why. We might do it for > performance reasons, but in this case I don't see how such a reason would apply. > If the parameter is not passed then the function would not follow that path, > period. I don't really see what this snippet has in common with other snippets. The text matching is what make this snippet different. I don't use startsWith() in that snippet. isTextVisible() and getVisibleContent() could be moved out of the function is needed, but for now it is not.
On 2019/04/17 04:02:21, hub wrote: > On 2019/04/10 19:58:53, Manish Jethani wrote: > > > To give an example of what I'm saying, let's say there's a function like this: > > > > /** > > * Finds needle string in haystack string. > > */ > > function indexOf(haystack: string, needle: string, fromIndex: int = 0); > > > > Now we want to implement a `startsWith()` function. What I'm saying is that we > > should extend the `indexOf()` function to take an additional parameter: > > > > function indexOf(haystack: string, needle: string, fromIndex: int = 0, > > toIndex: int = -1); > > > > If `toIndex` is 0, it stops the search at index 0. > > > > Then we implement `startsWith()` like this: > > > > function startsWith(haystack: string, needle: string) > > { > > return indexOf(haystack, needle, 0, 0); > > } > > > > What you're saying is that we repeat the code for searching for a string > within > > a string in `startsWith()`. I don't understand why. We might do it for > > performance reasons, but in this case I don't see how such a reason would > apply. > > If the parameter is not passed then the function would not follow that path, > > period. > > I don't really see what this snippet has in common with other snippets. The text > matching is what make this snippet different. I don't use startsWith() in that > snippet. The `startsWith()` and `indexOf()` thing was an _analogy_. Just as one would add an additional parameter to an _existing_ `indexOf()` implementation and then call it with that additional parameter from `startsWith()`, over here we have a snippet that does most of the work, it just needs an additional parameter, and then the new snippet can simply call into the existing snippet with the additional parameter. This is what I meant. Since the analogy was lost, let me explain what's common between the two snippets: 1. `hide-if-contains-and-matches-style` looks for a node containing some text 2. `hide-if-contains-visible` will look for a node containing some text matching certain styles The "matching certain styles" part is the additional parameter (i.e. `textStyle`) that I am talking about. Just to be _absolutely_ clear this time, this is what I mean: function hideIfContainsAndMatchesStyle(search, selector = "*", searchSelector = null, style = null, searchStyle = null, textStyle = null /* new parameter */) { ... } function hideIfContainsVisible(search, selector = "*", searchSelector = null) { hideIfContainsAndMatchesStyle(search, selector, searchSelector, null, null, /display: inline.*font-size: [1-9].*opacity: [1-9].*visibility: visible/); } Basically the idea is to use a regular expression to match the text styles, _exactly_ as we do in the `hide-if-contains-and-matches-style` snippet. This would continue the existing method of doing it _and_ make it possible to use the `hide-if-contains-and-matches-style` snippet directly if the current regular expression gets circumvented somehow. Anyway, in order to move this forward faster, let's get someone else into this code review. In fact, feel free to put me in CC.
On 2019/04/17 06:32:31, Manish Jethani wrote: > Basically the idea is to use a regular expression to match the text styles, > _exactly_ as we do in the `hide-if-contains-and-matches-style` snippet. This > would continue the existing method of doing it _and_ make it possible to use the > `hide-if-contains-and-matches-style` snippet directly if the current regular > expression gets circumvented somehow. Now that I think about it, the implementation is actually quite straightforward if the code is separated. Hmm ... yeah, I think if we go for simplicity then it's probably better to do it the way you're doing it (but it gives less flexibility). OK, let me take a look at the current patch as it is.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:908: Nit: Maybe a good idea to group all the `hide-if-contains-*` snippets together? https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:910: * Hide an element matching a selector if it contains a specified visible text. Nits or consistency: s/Hide/Hides/ s/an element/any HTML element/ s/a selector/a CSS selector/ s/if it contains a specified visible text/<placeholder>/ <placeholder>: if the visible text content of the element contains a given string. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) Nit: parameter names for consistency with other snippets: search selector searchSelector (All nits are mere suggestions.) https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:924: * Determine if the text inside the element is visible. Nit: s/Determine/Determines/ https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:925: * @param {Element} element the leaf element we are checking. Nit: So far in this file the doc string starts with a capital letter, i.e. "The leaf element ...". https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:933: style = window.getComputedStyle(element); There's a local `getComputedStyle()` which covers all supported platforms. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:984: innerSelector ? element.querySelectorAll(innerSelector) : [element]; Here there's a second call to `querySelectorAll()`, whereas in `hide-if-contains` there is only one call, even though the selector for hiding vs. searching (inner) is also different over there. Maybe it's better to follow the same pattern?
Reworked patch https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:908: On 2019/04/17 13:54:26, Manish Jethani wrote: > Nit: Maybe a good idea to group all the `hide-if-contains-*` snippets together? yes https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:910: * Hide an element matching a selector if it contains a specified visible text. On 2019/04/17 13:54:25, Manish Jethani wrote: > Nits or consistency: > > s/Hide/Hides/ > s/an element/any HTML element/ > s/a selector/a CSS selector/ > > s/if it contains a specified visible text/<placeholder>/ > > <placeholder>: if the visible text content of the element contains a given > string. Done. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:921: function hideIfContainsVisibleText(search, hideSelector, innerSelector = null) On 2019/04/17 13:54:25, Manish Jethani wrote: > Nit: parameter names for consistency with other snippets: > > search > selector > searchSelector > > (All nits are mere suggestions.) They have a different meaning, but.... (see below) https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:924: * Determine if the text inside the element is visible. On 2019/04/17 13:54:25, Manish Jethani wrote: > Nit: s/Determine/Determines/ Done. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:925: * @param {Element} element the leaf element we are checking. On 2019/04/17 13:54:26, Manish Jethani wrote: > Nit: So far in this file the doc string starts with a capital letter, i.e. "The > leaf element ...". Done. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:933: style = window.getComputedStyle(element); On 2019/04/17 13:54:26, Manish Jethani wrote: > There's a local `getComputedStyle()` which covers all supported platforms. I only see `getComputedCSSText()` which calls`getComputedStyle()`. But here I'm not interested in the text version of the computed style. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:984: innerSelector ? element.querySelectorAll(innerSelector) : [element]; On 2019/04/17 13:54:26, Manish Jethani wrote: > Here there's a second call to `querySelectorAll()`, whereas in > `hide-if-contains` there is only one call, even though the selector for hiding > vs. searching (inner) is also different over there. Maybe it's better to follow > the same pattern? Let's rework this in the next patch. (see comment about the argument names)
https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:543: makeInjector(hideIfContainsVisibleText, hideIfMatch, hideElement); It doesn't need to be a type 2. I need to fix the testing code to allow testing type 1.
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:933: style = window.getComputedStyle(element); On 2019/04/17 17:25:06, hub wrote: > On 2019/04/17 13:54:26, Manish Jethani wrote: > > There's a local `getComputedStyle()` which covers all supported platforms. > > I only see `getComputedCSSText()` which calls`getComputedStyle()`. But here I'm > not interested in the text version of the computed style. Sorry, my bad. Ack. https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:984: innerSelector ? element.querySelectorAll(innerSelector) : [element]; On 2019/04/17 17:25:06, hub wrote: > On 2019/04/17 13:54:26, Manish Jethani wrote: > > Here there's a second call to `querySelectorAll()`, whereas in > > `hide-if-contains` there is only one call, even though the selector for hiding > > vs. searching (inner) is also different over there. Maybe it's better to > follow > > the same pattern? > > Let's rework this in the next patch. (see comment about the argument names) Just so I get this right, we'll have a follow-up patch which (1) makes this Type 1 snippet and (2) refactor the code a little bit. Yes?
https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30039555/lib/content/snippet... lib/content/snippets.js:984: innerSelector ? element.querySelectorAll(innerSelector) : [element]; On 2019/04/17 18:33:28, Manish Jethani wrote: > On 2019/04/17 17:25:06, hub wrote: > > On 2019/04/17 13:54:26, Manish Jethani wrote: > > > Here there's a second call to `querySelectorAll()`, whereas in > > > `hide-if-contains` there is only one call, even though the selector for > hiding > > > vs. searching (inner) is also different over there. Maybe it's better to > > follow > > > the same pattern? > > > > Let's rework this in the next patch. (see comment about the argument names) > > Just so I get this right, we'll have a follow-up patch which (1) makes this Type > 1 snippet and (2) refactor the code a little bit. Yes? The current patch already refactor :-)
Aside from the nits, the implementation is looking great. I'll look at the test files tomorrow. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:416: * @param {string?} [searchSelector] The CSS selector that an HTML element Nit: Maybe a good opportunity to fix the JSDoc: `?string` instead of `string?` https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:420: function hideIfMatch(match, selector, searchSelector) Nit: Suggestion: `hideIfMatches()` https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:467: * @param {string} search The text to match to the visible text. Is considered Nit: s/The text to match/The string to match/ Nit: s/Is considered visible text that isn't/Text is considered visible if it isn't/ https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:481: * @param {Element} element The leaf element we are checking. Nit: Shouldn't this just be "The element", since whether it's a leaf or not is not relevant to this function? https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:489: style = window.getComputedStyle(element); Nit: `window.` is kinda redundant here, but maybe it also makes things clearer. Up to you. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:503: * Returns the visible text content from an element and its children. Nit: "and its children" is kinda redundant, but if we want to be specific then "and its descendants" would be more accurate since it's recursive. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:535: hideIfMatch(element => Wondering if we can shorten this to: hideIfMatch(element => getVisibleContent(element).test(content), selector, searchSelector);
Updated patch https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:416: * @param {string?} [searchSelector] The CSS selector that an HTML element On 2019/04/17 18:52:35, Manish Jethani wrote: > Nit: Maybe a good opportunity to fix the JSDoc: `?string` instead of `string?` Done. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:420: function hideIfMatch(match, selector, searchSelector) On 2019/04/17 18:52:34, Manish Jethani wrote: > Nit: Suggestion: `hideIfMatches()` Done. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:467: * @param {string} search The text to match to the visible text. Is considered On 2019/04/17 18:52:35, Manish Jethani wrote: > Nit: s/The text to match/The string to match/ > > Nit: s/Is considered visible text that isn't/Text is considered visible if it > isn't/ Done. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:481: * @param {Element} element The leaf element we are checking. On 2019/04/17 18:52:35, Manish Jethani wrote: > Nit: Shouldn't this just be "The element", since whether it's a leaf or not is > not relevant to this function? Done. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:489: style = window.getComputedStyle(element); On 2019/04/17 18:52:34, Manish Jethani wrote: > Nit: `window.` is kinda redundant here, but maybe it also makes things clearer. > Up to you. Acknowledged. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:503: * Returns the visible text content from an element and its children. On 2019/04/17 18:52:35, Manish Jethani wrote: > Nit: "and its children" is kinda redundant, but if we want to be specific then > "and its descendants" would be more accurate since it's recursive. Done. https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:535: hideIfMatch(element => On 2019/04/17 18:52:35, Manish Jethani wrote: > Wondering if we can shorten this to: > > hideIfMatch(element => getVisibleContent(element).test(content), > selector, searchSelector); No exactly. But ok.
https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047560/lib/content/snippet... lib/content/snippets.js:535: hideIfMatch(element => On 2019/04/17 21:38:06, hub wrote: > On 2019/04/17 18:52:35, Manish Jethani wrote: > > Wondering if we can shorten this to: > > > > hideIfMatch(element => getVisibleContent(element).test(content), > > selector, searchSelector); > > No exactly. But ok. Ah, sorry I got that wrong. Thanks. https://codereview.adblockplus.org/30024560/diff/30047579/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/lib/content/snippet... lib/content/snippets.js:460: selector, searchSelector); Shouldn't the `s` in `selector` here be aligned with the `e` in `element` above? https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/_utils.js File test/browser/_utils.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/_utils... test/browser/_utils.js:31: { I really don't think this is generic enough to keep these functions here. I would rather move them back to their individual files. https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... test/browser/snippets.js:300: test, "hide-if-contains-visible-text", "Spon", "#parent > div" Why not just test for "Sponsored" (updating the code above), since that is literally what we're going to target anyway?
Minus the one nit, the change in `lib/*` looks good to me. For the test file, I'd also like to invite Andrea to this review. Note: I'd still like to see `expectHidden()` and `expectVisible()` back in their individual test files.
updated patch. https://codereview.adblockplus.org/30024560/diff/30047579/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/lib/content/snippet... lib/content/snippets.js:460: selector, searchSelector); On 2019/04/18 08:40:43, Manish Jethani wrote: > Shouldn't the `s` in `selector` here be aligned with the `e` in `element` above? Done. https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/_utils.js File test/browser/_utils.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/_utils... test/browser/_utils.js:31: { On 2019/04/18 08:40:43, Manish Jethani wrote: > I really don't think this is generic enough to keep these functions here. I > would rather move them back to their individual files. Done. https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... test/browser/snippets.js:300: test, "hide-if-contains-visible-text", "Spon", "#parent > div" On 2019/04/18 08:40:43, Manish Jethani wrote: > Why not just test for "Sponsored" (updating the code above), since that is > literally what we're going to target anyway? I don't think it matters. Also "Sponsored" is only in English. So just any string will work.
LGTM for the changes in `lib/*` Andrea: For `test/*`, it would be great if you could give it a look. I'll move myself to CC now. https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30047579/test/browser/snippe... test/browser/snippets.js:300: test, "hide-if-contains-visible-text", "Spon", "#parent > div" On 2019/04/18 17:48:40, hub wrote: > On 2019/04/18 08:40:43, Manish Jethani wrote: > > Why not just test for "Sponsored" (updating the code above), since that is > > literally what we're going to target anyway? > > I don't think it matters. Also "Sponsored" is only in English. So just any > string will work. That's true, a string like "12fd2810" would also work as far as we're interested in testing the functionality. But one of the benefits of having a test case that is close to the problem we're trying to solve in reality is that it doubles up as documentation. i.e. you and I know that "Sponsored", "Spon", and "12fd2810" are all the same, but the person who maintains this code two years from now could benefit from realistic examples in the test file itself.
Overall not bad, but I have a couple of questions I'd like to be addressed before going forward, if possible, thanks. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:438: .observe(document, {childList: true, characterData: true, subtree: true}); I might make a similar observation we had before (so that this is maybe a broader issue whenever we use MutationObserver). A CSS selector might have `a[data-whatever]` condition in it, but it'll never have anything able to reach the text content (only the `match` callback might be interested, not the `querySelectorAll` result) see: https://developer.mozilla.org/en-US/docs/Web/API/MutationObserverInit/charact... Accordingly, are we sure we don't want to improve our CSS selector abilities so that if it does contain a square bracket we should also watch for attribute changes? 'cause speaking of CSS selectors, I find attributes change more relevant than unreachable text content which never interests the CSS selector itself (different story for XPath where this point was initially made). What do you think? https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:485: * falsey it will be queried. typo? did you mean `falsy` ? https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:497: if (style.getPropertyValue("color") == shouldn't `color` be checked against `transparent` too, which might not be the same of having a transparent background? https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:535: let re = toRegExp(search); this feature is overall OK but I have one concern: it's easy with CSS to capitalize or upper case text, but this technique seems to suffer a general issue we have with our `toRegExp` function that doesn't accept flags as in `/find-me/i` I think the ignorecase is crucial for this scenario, so I wonder if we're working on this already, or we'll need to manually address all case-sensitive variants of any text we're looking for. https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... test/browser/snippets.js:319: </div>`; nit: I wonder if for test maintainability sake we should properly format/indent HTML so it's easier to change/review on the fly eventual changes https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... test/browser/snippets.js:346: test.done(); Happy to discuss this offline/elsewhere though, I just wanted to point this little weird thing out. This test looks OK, and maybe this is for another ticket/discussion, but I don't understand why we need both `async` function declaration and `test.done()` at the end. The `test.done()` thing is from pre promise era, while here we're writing already a flow that can wait anything, if needed, via `async`. Accordingly, wouldn't be normal to have every test awaited so that `test.done()` can be triggered automatically and it's one less thing to think about when writing asynchronous tests?
Oooops, forgot one question https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:517: shouldn't `clientWidth` and `clientHeight` be also checked against 0? (either is fine) that should already screen a lot of fancy CSS use cases, right?
Updated patch. I'll address clienthHeight / clientWidth on the next iteration. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:438: .observe(document, {childList: true, characterData: true, subtree: true}); On 2019/04/26 13:33:58, a.giammarchi wrote: > I might make a similar observation we had before (so that this is maybe a > broader issue whenever we use MutationObserver). > > A CSS selector might have `a[data-whatever]` condition in it, but it'll never > have anything able to reach the text content (only the `match` callback might be > interested, not the `querySelectorAll` result) > > see: > https://developer.mozilla.org/en-US/docs/Web/API/MutationObserverInit/charact... > > Accordingly, are we sure we don't want to improve our CSS selector abilities so > that if it does contain a square bracket we should also watch for attribute > changes? > > 'cause speaking of CSS selectors, I find attributes change more relevant than > unreachable text content which never interests the CSS selector itself > (different story for XPath where this point was initially made). > > What do you think? The current code is just refactoring of the other snippet "hide-if-contains". It's a valid point but I'd rather take it as a separate issue as to make sure we don't impact the other snippet. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:485: * falsey it will be queried. On 2019/04/26 13:33:58, a.giammarchi wrote: > typo? did you mean `falsy` ? Both are acceptable, and we use `falsey` elsewhere in the code base. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:497: if (style.getPropertyValue("color") == On 2019/04/26 13:33:57, a.giammarchi wrote: > shouldn't `color` be checked against `transparent` too, which might not be the > same of having a transparent background? good point. I knew there were things I'm missing. Done https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:517: On 2019/04/26 13:51:06, a.giammarchi wrote: > shouldn't `clientWidth` and `clientHeight` be also checked against 0? (either is > fine) > > that should already screen a lot of fancy CSS use cases, right? Good point. I'll look into it. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:535: let re = toRegExp(search); On 2019/04/26 13:33:58, a.giammarchi wrote: > this feature is overall OK but I have one concern: it's easy with CSS to > capitalize or upper case text, but this technique seems to suffer a general > issue we have with our `toRegExp` function that doesn't accept flags as in > `/find-me/i` > > I think the ignorecase is crucial for this scenario, so I wonder if we're > working on this already, or we'll need to manually address all case-sensitive > variants of any text we're looking for. Whichever effect applies to the text with CSS, the content (that we check against) doesn't change. The 'i' flag would be useful to have the filter more tolerant to changes. https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... test/browser/snippets.js:319: </div>`; On 2019/04/26 13:33:58, a.giammarchi wrote: > nit: I wonder if for test maintainability sake we should properly format/indent > HTML so it's easier to change/review on the fly eventual changes A '\n' in the markup generate a blank space in the DOM, which mean it will be translated as a space in the `contentText`. I can possibly indent a little bit, but really not all of it. Done https://codereview.adblockplus.org/30024560/diff/30048561/test/browser/snippe... test/browser/snippets.js:346: test.done(); On 2019/04/26 13:33:59, a.giammarchi wrote: > Happy to discuss this offline/elsewhere though, I just wanted to point this > little weird thing out. > > This test looks OK, and maybe this is for another ticket/discussion, but I don't > understand why we need both `async` function declaration and `test.done()` at > the end. > > The `test.done()` thing is from pre promise era, while here we're writing > already a flow that can wait anything, if needed, via `async`. > > Accordingly, wouldn't be normal to have every test awaited so that `test.done()` > can be triggered automatically and it's one less thing to think about when > writing asynchronous tests? This is because we still use nodeunit. There is issue #6820 to get rid of it (and a patch that uses mocha in TDD instead). The `async` here is needed only for the `await runSnippet()`
https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:438: .observe(document, {childList: true, characterData: true, subtree: true}); On 2019/04/27 17:32:05, hub wrote: > On 2019/04/26 13:33:58, a.giammarchi wrote: > > I might make a similar observation we had before (so that this is maybe a > > broader issue whenever we use MutationObserver). > > > > A CSS selector might have `a[data-whatever]` condition in it, but it'll never > > have anything able to reach the text content (only the `match` callback might > be > > interested, not the `querySelectorAll` result) > > > > see: > > > https://developer.mozilla.org/en-US/docs/Web/API/MutationObserverInit/charact... > > > > Accordingly, are we sure we don't want to improve our CSS selector abilities > so > > that if it does contain a square bracket we should also watch for attribute > > changes? > > > > 'cause speaking of CSS selectors, I find attributes change more relevant than > > unreachable text content which never interests the CSS selector itself > > (different story for XPath where this point was initially made). > > > > What do you think? > > The current code is just refactoring of the other snippet "hide-if-contains". > It's a valid point but I'd rather take it as a separate issue as to make sure we > don't impact the other snippet. Acknowledged. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:485: * falsey it will be queried. On 2019/04/27 17:32:05, hub wrote: > On 2019/04/26 13:33:58, a.giammarchi wrote: > > typo? did you mean `falsy` ? > > Both are acceptable, and we use `falsey` elsewhere in the code base. Acknowledged. https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:535: let re = toRegExp(search); On 2019/04/27 17:32:05, hub wrote: > Whichever effect applies to the text with CSS, the content (that we check > against) doesn't change. If I understand correctly, we might check for "Ads" but the content might be "aDs" instead, or "adS", or "ADS", or "AdS", or "aDS" and each char visually normalized by CSS. In this scenrio, the `i` flag does make the difference, isn't it?
https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:535: let re = toRegExp(search); On 2019/04/29 07:27:25, a.giammarchi wrote: > On 2019/04/27 17:32:05, hub wrote: > > Whichever effect applies to the text with CSS, the content (that we check > > against) doesn't change. > > If I understand correctly, we might check for "Ads" but the content might be > "aDs" instead, or "adS", or "ADS", or "AdS", or "aDS" and each char visually > normalized by CSS. > > In this scenrio, the `i` flag does make the difference, isn't it? I see where this is going. Good point. Because this impact other snippets, I have filed https://gitlab.com/eyeo/adblockplus/adblockpluscore/issues/11
https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:438: .observe(document, {childList: true, characterData: true, subtree: true}); On 2019/04/29 07:27:25, a.giammarchi wrote: > On 2019/04/27 17:32:05, hub wrote: > > On 2019/04/26 13:33:58, a.giammarchi wrote: > > > I might make a similar observation we had before (so that this is maybe a > > > broader issue whenever we use MutationObserver). > > > > > > A CSS selector might have `a[data-whatever]` condition in it, but it'll > never > > > have anything able to reach the text content (only the `match` callback > might > > be > > > interested, not the `querySelectorAll` result) > > > > > > see: > > > > > > https://developer.mozilla.org/en-US/docs/Web/API/MutationObserverInit/charact... > > > > > > Accordingly, are we sure we don't want to improve our CSS selector abilities > > so > > > that if it does contain a square bracket we should also watch for attribute > > > changes? > > > > > > 'cause speaking of CSS selectors, I find attributes change more relevant > than > > > unreachable text content which never interests the CSS selector itself > > > (different story for XPath where this point was initially made). > > > > > > What do you think? > > > > The current code is just refactoring of the other snippet "hide-if-contains". > > It's a valid point but I'd rather take it as a separate issue as to make sure > we > > don't impact the other snippet. > > Acknowledged. BTW I filed https://gitlab.com/eyeo/adblockplus/adblockpluscore/issues/12
not sure if you want to address now the `clientWidth` and `clientHeight` part, but since all other things have been addressed, this LGTM (but happy to review changes if you want to land these here) https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30048561/lib/content/snippet... lib/content/snippets.js:535: let re = toRegExp(search); On 2019/04/29 13:03:47, hub wrote: > On 2019/04/29 07:27:25, a.giammarchi wrote: > > On 2019/04/27 17:32:05, hub wrote: > > > Whichever effect applies to the text with CSS, the content (that we check > > > against) doesn't change. > > > > If I understand correctly, we might check for "Ads" but the content might be > > "aDs" instead, or "adS", or "ADS", or "AdS", or "aDS" and each char visually > > normalized by CSS. > > > > In this scenrio, the `i` flag does make the difference, isn't it? > > I see where this is going. Good point. > > Because this impact other snippets, I have filed > https://gitlab.com/eyeo/adblockplus/adblockpluscore/issues/11 Acknowledged.
clientWidth / clientHeight has issue where some elements default to 0,0, so it needs more considerations and might not be worth it. https://codereview.adblockplus.org/30024560/diff/30051555/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30051555/test/browser/snippe... test/browser/snippets.js:353: expectHidden(test, element, "article"); Actually this should be `expectVisible` since the parent element is `display: none`. But the code is incorrect and doesn't check that far up.
updated patch https://codereview.adblockplus.org/30024560/diff/30051555/test/browser/snippe... File test/browser/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30051555/test/browser/snippe... test/browser/snippets.js:353: expectHidden(test, element, "article"); On 2019/04/29 18:24:26, hub wrote: > Actually this should be `expectVisible` since the parent element is `display: > none`. But the code is incorrect and doesn't check that far up. This is now fixed in current patch.
I have made changes (patch 12) to fix an issue that deserve an extra review if you don't mind. Thanks !
Added some major concern over current implementation of `isVisible`, please have a look, thanks. https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... lib/content/snippets.js:527: return isVisible(element.parentNode, null, closest); I have bad feelings about this for at least two reasons: 1. it breaks if it reaches the `document`, since no `parentElement` would be there 2. it's very expensive in complex DOM I have a coupe of suggestions here: 1. how about we cut the mustard and use element.getBoundingClientRect() directly, so that if either width or height are 0 we can call it a day without crawling to the root each node that is visible? 2. if gBCR() is not a solution, mind checking that `parentNode` is a `nodeType` === `Node.ELEMENT_NODE` so that stuff in fragments, or document itself, wouldn't break?
updated patch https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... lib/content/snippets.js:527: return isVisible(element.parentNode, null, closest); On 2019/05/08 09:43:47, a.giammarchi wrote: > I have bad feelings about this for at least two reasons: > > 1. it breaks if it reaches the `document`, since no `parentElement` would be > there I'll stop if parent is null. > 2. it's very expensive in complex DOM How come? it is already limited to closest. I realise I should have used `parentElement` instead. > > I have a coupe of suggestions here: > > 1. how about we cut the mustard and use element.getBoundingClientRect() > directly, so that if either width or height are 0 we can call it a day without > crawling to the root each node that is visible? To "closest" > 2. if gBCR() is not a solution, mind checking that `parentNode` is a `nodeType` > === `Node.ELEMENT_NODE` so that stuff in fragments, or document itself, wouldn't > break? Using `parentElement` now and stop when it returns null
clarified which issue might happen with current implementation https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... lib/content/snippets.js:527: return isVisible(element.parentNode, null, closest); On 2019/05/08 17:03:42, hub wrote: > I'll stop if parent is null. where does this happen? 'cause all I see is that you grab the computed style right away from an element without checking that element is live/not null I guess logically, all I am saying is that if !closest || element == closest || !element.parentElement you should return and avoid recursion. Please note I am talking about the signature as is, where one could pass closest as plain `document` simply to know if a generic element is visible. In that case: * closest is document * element is anything * the moment element crawl up to documentElement, closest is there but no parentElement is returned so that the following breaks: window.getComputedStyle(document.documentElement.parentElement)
Really updated patch this time. https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... File lib/content/snippets.js (right): https://codereview.adblockplus.org/30024560/diff/30052555/lib/content/snippet... lib/content/snippets.js:527: return isVisible(element.parentNode, null, closest); On 2019/05/09 08:58:59, a.giammarchi wrote: > On 2019/05/08 17:03:42, hub wrote: > > I'll stop if parent is null. > > where does this happen? 'cause all I see is that you grab the computed style > right away from an element without checking that element is live/not null > > I guess logically, all I am saying is that if !closest || element == closest || > !element.parentElement you should return and avoid recursion. > > Please note I am talking about the signature as is, where one could pass closest > as plain `document` simply to know if a generic element is visible. > > In that case: > > * closest is document > * element is anything > * the moment element crawl up to documentElement, closest is there but no > parentElement is returned so that the following breaks: > > window.getComputedStyle(document.documentElement.parentElement) It is actually better if I actually properly submit the changes for review. Sorry about that.
LGTM |