|
|
Created:
Nov. 10, 2014, 12:26 p.m. by sergei Modified:
Nov. 27, 2014, 2:57 p.m. CC:
Eric Visibility:
Public. |
Description- IHTMLElement::getAttribute is inconvenient to use (see below) but since Windows XP all versions support IHTMLDOMAttribute and IHTMLElement4.
IHTMLElement::getAttribute sometimes returns not only BSTR or integer but also null variant or IDispatch which is different from the value written in the html file. For the cases like <some-tag on-some-event="do();"/> IE returns the value of 'on-some-event' attribute as 'function(event){\ndo();\n}'. Our filters are not designed for such transformations, so we need to retrive the value of (body) of /^.*?\{(body)\}$/. Using IHTMLDOMAttribute and IHTMLElement4 solves the troubles and simplifies the code.
- fix predecessor processing. We should call `tagName.ToLower();` after the tag name retrieved. The bug (typo) was we called it before which makes no sense.
Patch Set 1 #Patch Set 2 : fix IE8 and simplify the code #
Total comments: 13
Patch Set 3 : merge GetAttributeValueAsString into the caller #Patch Set 4 : fix according to comments #Patch Set 5 : fix according to the discussion #
Total comments: 6
MessagesTotal messages: 10
http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:23: std::wstring GetAttributeValueAsString(const ATL::CComVariant& vAttr) Is it really required for this to be a separate function? It's used once. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:36: std::pair<std::wstring, bool> GetHtmlElementAttribute(IHTMLElement& htmlElement, const ATL::CComBSTR& attributeName) How about something like this instead std::wstring GetHtmlElementAttribute(IHTMLElement* const htmlElemen t, const ATL::CComBSTR& attributeName, /*out*/ bool* const result) Looks more consistent with all the rest of the code and Windows APIs, IMO. Note the first element is a pointer and there's a third element for a return value. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:41: ATL::CComQIPtr<IHTMLElement4> htmlElement4 = &htmlElement; I prefer manually calling QueryInterface. We are checking the result anyway, so it's not like this saves a lot of typing. And with CComQIPtr we'll only introduce another class, not managed by us. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:317: hr = pEl->get_tagName(&tagName); Well this was a long living bug :D. Nice! http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginUtil.h (left): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginUtil.h:20: void ReplaceString(std::wstring& input, const std::wstring placeholder, const std::wstring replacement); Pretty sure we still use it. Why remove this?
http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:23: std::wstring GetAttributeValueAsString(const ATL::CComVariant& vAttr) On 2014/11/19 03:24:23, Oleksandr wrote: > Is it really required for this to be a separate function? It's used once. It's not required. Firstly I extracted it during developing, then it turned out to be quite small and it's indeed used only once. I thought to remove it, but it contains several exit points and the caller is already quite long, so I decided to keep it separately. Merged into the caller. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:36: std::pair<std::wstring, bool> GetHtmlElementAttribute(IHTMLElement& htmlElement, const ATL::CComBSTR& attributeName) On 2014/11/19 03:24:23, Oleksandr wrote: > How about something like this instead > std::wstring GetHtmlElementAttribute(IHTMLElement* const htmlElemen > t, const ATL::CComBSTR& attributeName, /*out*/ bool* const result) > Looks more consistent with all the rest of the code and Windows APIs, IMO. Note > the first element is a pointer and there's a third element for a return value. - I don't think that making `htmlElement` to be pointer is a good idea because in that case we ought to verify that it's not nullptr which only complicates this method as well as the caller. In most cases the code would be much cleaner/safer/better readable if we prefer references to pointers. If it's inconsistent with the current approach then we should change it. - I would not like to spread the return value among arguments and the return value until we don't have to do it. For `/*out*/ bool* const result` I would put the meaning whether the method is succeeded or not. So I would say there could be two types of method signatures: `RetValue method(parameters..., bool* isOK = nullptr)`, where `bool* isOK` can be `ErrorCode* errorCode`, and `bool method(parameters..., RetVaelu& retValue)` or it can return some `ErrorCode`. In the current case `retResult.second` means whether the header found or not, similar how std::pair<iterator, bool> is used in stl. I personally would like to have a custom structure `struct GetHtmlElementAttributeResult { std::wstring header; bool headerFound };` for better readability. If it makes sense let me know. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:41: ATL::CComQIPtr<IHTMLElement4> htmlElement4 = &htmlElement; On 2014/11/19 03:24:23, Oleksandr wrote: > I prefer manually calling QueryInterface. We are checking the result anyway, so > it's not like this saves a lot of typing. And with CComQIPtr we'll only > introduce another class, not managed by us. We don't need HRESULT here, so I would say that ATL::CComQIPtr<IHTMLElement4> htmlElement4 = &htmlElement; if (!htmlElement4) { return retResult; } is better than ATL::CComPtr<IHTMLElement4> htmlElement4; if (FAILED(htmlElement.QueryInterface(&htmlElement)) || !htmlElement4) { return retResult; } It's not about amount of typing, I would say, less logic and operations in our code is better in most cases. I don't think ATL::CComQIPtr causes or could cause any troubles. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginUtil.h (left): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginUtil.h:20: void ReplaceString(std::wstring& input, const std::wstring placeholder, const std::wstring replacement); On 2014/11/19 03:24:23, Oleksandr wrote: > Pretty sure we still use it. Why remove this? Because it's from 'shared' project and it's declared in 'src\shared\Utils.h'. There is no reason to declare it again in this project.
Let me know if we should ask for a third opinion. We can ask Eric or Felix to take side here. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:36: std::pair<std::wstring, bool> GetHtmlElementAttribute(IHTMLElement& htmlElement, const ATL::CComBSTR& attributeName) > If it's > inconsistent with the current approach then we should change it. - I think it is inconsistent. We use pointers for COM interfaces everywhere. For example IsMatchFilterElementHide or IsElementHidden in this file. Also what would be complicated in the caller if we use a pointer instead? It would only make the function call more readable, IMHO. - Sorry, I do not understand your point here. When would be the situation when we will *have* to spread the return value among the arguments? We can always return some custom structure, so by that logic we can never split those. Why not use one of the two types of signatures you've mentioned. I would actually prefer "`bool method(parameters..., RetVaelu& retValue)'", which is very COM-like, which is used a lot in the codebase. Just for consistency sake. http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:41: ATL::CComQIPtr<IHTMLElement4> htmlElement4 = &htmlElement; On 2014/11/19 11:53:24, sergei wrote: > On 2014/11/19 03:24:23, Oleksandr wrote: > > I prefer manually calling QueryInterface. We are checking the result anyway, > so > > it's not like this saves a lot of typing. And with CComQIPtr we'll only > > introduce another class, not managed by us. > > We don't need HRESULT here, so I would say that > ATL::CComQIPtr<IHTMLElement4> htmlElement4 = &htmlElement; > if (!htmlElement4) > { > return retResult; > } > is better than > ATL::CComPtr<IHTMLElement4> htmlElement4; > if (FAILED(htmlElement.QueryInterface(&htmlElement)) || !htmlElement4) > { > return retResult; > } > It's not about amount of typing, I would say, less logic and operations in our > code is better in most cases. > I don't think ATL::CComQIPtr causes or could cause any troubles. We do QueryInterface in a lot of places in the code, but this would be the first time we use CComQIPtr. I don't see a point of it, so I disagree here.
http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:36: std::pair<std::wstring, bool> GetHtmlElementAttribute(IHTMLElement& htmlElement, const ATL::CComBSTR& attributeName) On 2014/11/20 10:21:56, Oleksandr wrote: > > If it's > > inconsistent with the current approach then we should change it. > > - I think it is inconsistent. We use pointers for COM interfaces everywhere. For > example IsMatchFilterElementHide or IsElementHidden in this file. Also what > would be complicated in the caller if we use a pointer instead? It would only > make the function call more readable, IMHO. In the function we need to check that it's not null and the caller should verify the return value. I don't think that such additional logic is required here actually. BTW, the caller, CFilterElementHide::IsMatchFilterElementHide, here also can accept reference instead of a pointer. > > - Sorry, I do not understand your point here. When would be the situation when > we will *have* to spread the return value among the arguments? We can always > return some custom structure, so by that logic we can never split those. Why not > use one of the two types of signatures you've mentioned. I would actually prefer > "`bool method(parameters..., RetVaelu& retValue)'", which is very COM-like, > which is used a lot in the codebase. Just for consistency sake. The return value here consists from the attribute value and from the flag whether it was found or not. The return code indicates some error during the procedure, they are from different domains. In general the return code is negative when arguments are incorrect or some resources are not available or something else went wrong. BTW, the signature "`bool method(parameters..., RetValue& retValue)'" requires to always care about the return type, so we cannot use `auto` for the return type.
http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5741031244955648/src/... src/plugin/PluginFilter.cpp:36: std::pair<std::wstring, bool> GetHtmlElementAttribute(IHTMLElement& htmlElement, const ATL::CComBSTR& attributeName) On 2014/11/21 12:02:21, sergei wrote: > On 2014/11/20 10:21:56, Oleksandr wrote: > > > If it's > > > inconsistent with the current approach then we should change it. > > > > - I think it is inconsistent. We use pointers for COM interfaces everywhere. > For > > example IsMatchFilterElementHide or IsElementHidden in this file. Also what > > would be complicated in the caller if we use a pointer instead? It would only > > make the function call more readable, IMHO. > In the function we need to check that it's not null and the caller should verify > the return value. I don't think that such additional logic is required here > actually. BTW, the caller, > CFilterElementHide::IsMatchFilterElementHide, here also can accept reference > instead of a pointer. I agree with Sergei here, IMO we should go with references wherever possible, i.e. when the function doesn't need to deal with null references. Less room for error and increased readability (no need to ponder whether the function can deal with null or not). > > > > - Sorry, I do not understand your point here. When would be the situation when > > we will *have* to spread the return value among the arguments? We can always > > return some custom structure, so by that logic we can never split those. Why > not > > use one of the two types of signatures you've mentioned. I would actually > prefer > > "`bool method(parameters..., RetVaelu& retValue)'", which is very COM-like, > > which is used a lot in the codebase. Just for consistency sake. > > The return value here consists from the attribute value and from the flag > whether it was found or not. The return code indicates some error during the > procedure, they are from different domains. In general the return code is > negative when arguments are incorrect or some resources are not available or > something else went wrong. > BTW, the signature "`bool method(parameters..., RetValue& retValue)'" requires > to always care about the return type, so we cannot use `auto` for the return > type. Not a fan of returning the pair here, either. I'd like either of your suggestions (output parameter and struct return value), leaning a bit towards the output parameter because it's less code and clear enough IMO.
LGTM
Looks good all in all. I think the whitespace changes (I particularly like the ones getting rid of trailing whitespace) should go to a separate commit though. Noissue would be fine. http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:23: struct GetHtmlElementAttributeResult How about "HtmlAttributeMatch" or something in that spirit? http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:332: ATL::CString value; This change seems unrelated, hm? Plus we're still using CString (without ATL::) elsewhere in this file.
http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:23: struct GetHtmlElementAttributeResult On 2014/11/27 13:58:46, Felix H. Dahlke wrote: > How about "HtmlAttributeMatch" or something in that spirit? I also don't feel it to be a good name, but I find "HtmlAttributeMatch" even more confusing. http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:332: ATL::CString value; On 2014/11/27 13:58:46, Felix H. Dahlke wrote: > This change seems unrelated, hm? Plus we're still using CString (without ATL::) > elsewhere in this file. Yes, it's not important here, just was irritating the eye. I would not say that we don't use it in this file elsewhere is an argument. We should tend to use ATL:: for the same reason that we don't have `using namespace std` in any header.
http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:23: struct GetHtmlElementAttributeResult On 2014/11/27 14:47:25, sergei wrote: > On 2014/11/27 13:58:46, Felix H. Dahlke wrote: > > How about "HtmlAttributeMatch" or something in that spirit? > > I also don't feel it to be a good name, but I find "HtmlAttributeMatch" even > more confusing. Fair enough, yeah. Let's go with GetHtmlElementAttributeResult then. http://codereview.adblockplus.org/6433575100481536/diff/5754903989321728/src/... src/plugin/PluginFilter.cpp:332: ATL::CString value; On 2014/11/27 14:47:25, sergei wrote: > On 2014/11/27 13:58:46, Felix H. Dahlke wrote: > > This change seems unrelated, hm? Plus we're still using CString (without > ATL::) > > elsewhere in this file. > > Yes, it's not important here, just was irritating the eye. I would not say that > we don't use it in this file elsewhere is an argument. We should tend to use > ATL:: for the same reason that we don't have `using namespace std` in any > header. Yes I agree. It's just that we should avoid unrelated changes, particularly with bug fixes. A commit should make one logical change. That way we can trace down regressions and back commits out much easier. |