|
|
Created:
Oct. 14, 2014, 11:23 p.m. by Oleksandr Modified:
Nov. 5, 2014, 4:51 p.m. Visibility:
Public. |
DescriptionThis builds on http://codereview.adblockplus.org/6390087684194304/, and will very likely have tons of conflicts when merging. Which is why I submit this as an overview of the idea in general. There will most likely be another patch with exact code to be committed after we push the patches from the above code review.
The idea is that we can get the real Referer header, which is available on BeginningTransaction AFTER the original BeginningTransaction is called. Doing that leaves no reason to have our ShouldBlock check running in OnStart, so all that code is moved to BeginningTransaction as well.
It works and it finally fixes the issue.
Patch Set 1 #
Total comments: 16
Patch Set 2 : Update to the repositorory tip #
Total comments: 11
Patch Set 3 : Cleanup #
Total comments: 2
Patch Set 4 : Revert the incorrect cleanup #
Total comments: 16
Patch Set 5 : Refactoring lambda and ExtractHttpHeaders functions. Coding style cleanup. #
Total comments: 5
Patch Set 6 : Revert refactoring of ExtractHttpHeaders. Keeping it simple. Typo fixes. #
Total comments: 3
Patch Set 7 : Coding style changes #
Total comments: 6
Patch Set 8 : Reorder stuff in BeginningTransaction for clarity. #Patch Set 9 : Rename the parameter #
MessagesTotal messages: 23
Looks OK. Merging this directly does indeed look difficult, given the number of changes since the version you were working from. Were there changes to PluginWbPassthrough.h also? There's a variable 'm_boundDomain' I don't see a declaration for. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (left): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:131: CString src = szUrl; The review I just posted converts this CString to std::wstring. http://codereview.adblockplus.org/4912420225024000/ Since you're planning on rewriting this anyway, it would be useful to get more of CString occurrences out of the way quickly. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:351: I'm assuming this is new code that's not showing up as such because of the way the patch set was generated. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:219: If we really need both standard and wide versions of this function, at least use a template to eliminate purely duplicated code. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:245: HRESULT nativeHresult = spHttpNegotiate ? spHttpNegotiate->BeginningTransaction(szURL, szHeaders,dwReserved, pszAdditionalHeaders) : S_OK; It would seem to me that if the service query fails that we have an error. I realize this expression is a legacy from previous code but it's fairly opaque as to why it's correct. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:255: auto acceptHeader = [&]() -> std::string This function does not look like a good candidate for a lambda. Since you're invoking it immediately, there's little advantage defining a function. Simply declare the variable 'acceptHeader' and initialize it in place of the 'return' statement. It also seems reasonable to define a separate function to retrieve the headers, given how verbose the implementation for such a simple-looking function. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:255: auto acceptHeader = [&]() -> std::string Also, it would be better generally to have 'acceptHeader' be 'std::wstring' for consistency with the rest of the code. It's used as an argument to 'GetContentTypeFromMimeType', whose 'CString' argument is slated for conversion. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:258: // only HTTP_QUERY_RAW_HEADERS_CRLF | HTTP_QUERY_FLAG_REQUEST_HEADERS does dork. Doesn't HTTP_QUERY_CUSTOM work? If so, that would greatly simplify things, including not needing to define a std::string version of ExtractHTTPHeader. "Causes HttpQueryInfo to search for the header name specified in lpvBuffer and store the header data in lpvBuffer." http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:273: std::string buf(size, '\0'); I'll simply register my distaste for writing into an internal string buffer that's declared read-only. It works today, but it may not work later. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:284: m_boundDomain = ExtractHTTPHeader(std::wstring(*pszAdditionalHeaders), L"Referer:").c_str(); In general, use 'ToCString()' here preferably. It makes explicit the boundary between the old type and the new. In this case, however, there's a local variable 'boundDomain' that's already been converted. The name 'm_boundDomain' seems to be a new member variable, but I see no code outside the present function that would use its value. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:284: m_boundDomain = ExtractHTTPHeader(std::wstring(*pszAdditionalHeaders), L"Referer:").c_str(); Is it the case the "Referer" only ever appears in the additional headers from PassthroughAPP and not in the headers provided through QueryInfo? It seems oddly inconsistent. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:300: { FYI. A solid reason to need a rewrite. The change set that removes SUPPORT_FRAME_CACHING has now been pushed.
http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:255: auto acceptHeader = [&]() -> std::string On 2014/10/15 17:02:33, Eric wrote: > This function does not look like a good candidate for a lambda. Since you're > invoking it immediately, there's little advantage defining a function. Simply > declare the variable 'acceptHeader' and initialize it in place of the 'return' > statement. > > It also seems reasonable to define a separate function to retrieve the headers, > given how verbose the implementation for such a simple-looking function. I think it's good to leave this as is, since it helps us avoid a lot of the nested if's and also the code looks cleaner this way, IMHO. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:258: // only HTTP_QUERY_RAW_HEADERS_CRLF | HTTP_QUERY_FLAG_REQUEST_HEADERS does dork. On 2014/10/15 17:02:33, Eric wrote: > Doesn't HTTP_QUERY_CUSTOM work? If so, that would greatly simplify things, > including not needing to define a std::string version of ExtractHTTPHeader. > > "Causes HttpQueryInfo to search for the header name specified in lpvBuffer and > store the header data in lpvBuffer." No, it doesn't work. Added a comment about that.
- http://www.netzwelt.de still does not work as expected. - BTW, as a privacy feature we can cut or replace User-Agent, found x-flash-version and other headers. http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:284: m_boundDomain = ExtractHTTPHeader(std::wstring(*pszAdditionalHeaders), L"Referer:").c_str(); It indeed looks inconsistent although there is may be some logic, but I completely agree that no one can guarantee that there is no Referer in the string obtained through QueryInfo. We could put some debug tracing here which is triggered when Referer is available via QueryInfo. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... File src/plugin/PluginWbPassThrough.cpp (left): http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:149: pOIBindInfo->GetBindString(BINDSTRING_ACCEPT_MIMES, mime, 10, &resLen); so, can we rely only on the Accept header? http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:256: m_spInternetProtocolSink->ReportResult(INET_E_REDIRECTING, 301, L"data:"); Previously we did that trick for such content types, but now we return html page for any type when it's blocked. It might be not a problem, because the server can return some html error pages like 502 or 404. So the question is whether it's OK or not? Even if it's not acceptable I would make it in the next change. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:204: DEBUG_GENERAL(src); DEBUG_GENERAL expects ATL::CString, it's not compilable. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:266: // Here is the heuristic which detects the requests issued by Flash.ocx. I've also discovered that flash plugin on my computer adds "x-flash-version: 15,0,0,189" header into `pszAdditionalHeaders`, I would add it but as another issue. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:288: // For IE6 and earlier there is iframe back button issue, so avoid it. the comment is not relevant here http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:320: // call here and eating it, we will call the proper ReportResult later by ourself. The code is correct but I've recognized that the comment is not enough clear. I think we should document it somewhere, may be not directly here. Normally this method is called from Read/OnRead but if we return E_ABORT from BeginningTransaction then it's called by the original implementation of Start. In that case the original implementation becomes into aborted state and does not do anything else. So it seems the last chance for it to notify the client that the request aborted and then client becomes into corresponding state and does not expect anything else as well. But it's not what we want, we need the client waiting for the result which we are generating just after original Start called and we call m_spInternetProtocolSink->ReportResult from OnRead. JIC, BaseClass::ReportResult(...) makes the same, m_spInternetProtocolSink->ReportResult. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:362: if (!m_hasOriginalStartCalled) Since we always call original Start let's delete m_hasOriginalStartCalled completely. Just remove member and check compilation errors initialization in ctr, if blocks in Lock/Unlock and an assigning in src/plugin/SinkPolicy.inl,
On 2014/10/28 14:08:45, sergei wrote: > - http://www.netzwelt.de still does not work as expected. Awesome, sorry, it does work! It was a local trouble on my machine.
http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:284: m_boundDomain = ExtractHTTPHeader(std::wstring(*pszAdditionalHeaders), L"Referer:").c_str(); On 2014/10/28 14:08:46, sergei wrote: > It indeed looks inconsistent although there is may be some logic, but I > completely agree that no one can guarantee that there is no Referer in the > string obtained through QueryInfo. > We could put some debug tracing here which is triggered when Referer is > available via QueryInfo. In my tests the Referer is available through QueryInfo only in ReportResult. I have never seen it available before the request was made. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... File src/plugin/PluginWbPassThrough.cpp (left): http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:149: pOIBindInfo->GetBindString(BINDSTRING_ACCEPT_MIMES, mime, 10, &resLen); On 2014/10/28 14:08:46, sergei wrote: > so, can we rely only on the Accept header? From my experience the Accept header in BeginningTransaction is available every time when that header is actually sent in a request. Not the case with BINDSTRING_ACCEPT_MIMES. So I think relying on Accept header in BeginningTransaction is our best shot. Do you have any other ideas? http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:256: m_spInternetProtocolSink->ReportResult(INET_E_REDIRECTING, 301, L"data:"); On 2014/10/28 14:08:46, sergei wrote: > Previously we did that trick for such content types, but now we return html page > for any type when it's blocked. It might be not a problem, because the server > can return some html error pages like 502 or 404. > So the question is whether it's OK or not? > Even if it's not acceptable I would make it in the next change. I don't think it's a problem at all. We did the "data:" trick because we had to redirect requests somewhere and it was the best location for that purpose. Not because we were ever afraid to return unexpected mime. Before "data:" we've actually used 'res://' links, which were returning html, but that turned out to cause issues on 'https://' web sites ("Only secure content is displayed" warning). In this case we don't need to redirect anywhere, so I'll vote for getting rid of this hack.
http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... src/plugin/PluginWbPassThrough.cpp:284: m_boundDomain = ExtractHTTPHeader(std::wstring(*pszAdditionalHeaders), L"Referer:").c_str(); On 2014/10/30 23:40:48, Oleksandr wrote: > On 2014/10/28 14:08:46, sergei wrote: > > It indeed looks inconsistent although there is may be some logic, but I > > completely agree that no one can guarantee that there is no Referer in the > > string obtained through QueryInfo. > > We could put some debug tracing here which is triggered when Referer is > > available via QueryInfo. > > In my tests the Referer is available through QueryInfo only in ReportResult. I > have never seen it available before the request was made. It's OK for me. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... File src/plugin/PluginWbPassThrough.cpp (left): http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:149: pOIBindInfo->GetBindString(BINDSTRING_ACCEPT_MIMES, mime, 10, &resLen); On 2014/10/30 23:40:48, Oleksandr wrote: > On 2014/10/28 14:08:46, sergei wrote: > > so, can we rely only on the Accept header? > > From my experience the Accept header in BeginningTransaction is available every > time when that header is actually sent in a request. Not the case with > BINDSTRING_ACCEPT_MIMES. So I think relying on Accept header in > BeginningTransaction is our best shot. Do you have any other ideas? No any idea, I observe the same. http://codereview.adblockplus.org/4974480757620736/diff/5676830073815040/src/... src/plugin/PluginWbPassThrough.cpp:256: m_spInternetProtocolSink->ReportResult(INET_E_REDIRECTING, 301, L"data:"); On 2014/10/30 23:40:48, Oleksandr wrote: > On 2014/10/28 14:08:46, sergei wrote: > > Previously we did that trick for such content types, but now we return html > page > > for any type when it's blocked. It might be not a problem, because the server > > can return some html error pages like 502 or 404. > > So the question is whether it's OK or not? > > Even if it's not acceptable I would make it in the next change. > > I don't think it's a problem at all. We did the "data:" trick because we had to > redirect requests somewhere and it was the best location for that purpose. Not > because we were ever afraid to return unexpected mime. Before "data:" we've > actually used 'res://' links, which were returning html, but that turned out to > cause issues on 'https://' web sites ("Only secure content is displayed" > warning). In this case we don't need to redirect anywhere, so I'll vote for > getting rid of this hack. It's OK for me. http://codereview.adblockplus.org/4974480757620736/diff/5700735861784576/src/... File src/plugin/SinkPolicy.inl (left): http://codereview.adblockplus.org/4974480757620736/diff/5700735861784576/src/... src/plugin/SinkPolicy.inl:387: if (E_ABORT == hr && pSink->m_blockedInTransaction) It should not be removed, only one line with m_hasOriginalStartCalled above. I'm afraid it even does not work now.
http://codereview.adblockplus.org/4974480757620736/diff/5700735861784576/src/... File src/plugin/SinkPolicy.inl (left): http://codereview.adblockplus.org/4974480757620736/diff/5700735861784576/src/... src/plugin/SinkPolicy.inl:387: if (E_ABORT == hr && pSink->m_blockedInTransaction) On 2014/10/31 12:47:57, sergei wrote: > It should not be removed, only one line with m_hasOriginalStartCalled above. > I'm afraid it even does not work now. You're right. Removing this would not let us respond with custom content.
LGTM
http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:203: std::wstring src = szURL; Shouldn't this be stored as a CString? It's only ever used as a CString it seems. http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:206: CComPtr<IHttpNegotiate> spHttpNegotiate; Other local variables here don't use hungarian notation, why this one? http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:210: HRESULT nativeHr = spHttpNegotiate ? spHttpNegotiate->BeginningTransaction(szURL, szHeaders,dwReserved, pszAdditionalHeaders) : S_OK; Space after "szHeaders,". http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:212: auto acceptHeader = [&]() -> std::string Looks like this could move up to be the first thing in the function, since it's the first thing we do and it doesn't need pszAdditionalHeaders, nativeHr etc. I suggest this: std::wstring src = szURL; DEBUG_GENERAL(ToCString(src)); auto acceptHeader = ... if (pszAdditionalHeaders) { *pszAdditionalHeaders = nullptr; } CComPtr<IHTTPNegotiate> spHttpNegotiate; ... That said, I'd find this even easier to follow if the lambda was a named function, ExtractAcceptHeader or something. http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:214: // Despite there is HTTP_QUERY_ACCEPT and other query info flags, they don't work here, s/is/being/? http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:218: if(FAILED(hr)) Space before "(" please, also below. http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:225: /*buffer*/nullptr, /* get size */&size, &flags, /*reserved*/ 0); How about /*get size*/ for consistency? http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:231: hr = winInetHttpInfo->QueryInfo(HTTP_QUERY_RAW_HEADERS_CRLF | HTTP_QUERY_FLAG_REQUEST_HEADERS, Can you store the headers to retrieve in a variable to avoid duplication here? http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:265: { Also something I'd suggest to move to a dedicated function, the comment shows it wants to be free IMO :) Up to you too. http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:278: && (BINDF_ASYNCHRONOUS | BINDF_ASYNCSTORAGE| BINDF_PULLDATA) == grfBINDF According to Mozilla's style, we should break after &&/||, not before. http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/shared/Utils.h:54: T ExtractHTTPHeader(const T& allHeaders, const T& targetHeaderName, const T& delimiter) 1. I think we generally capitalise it like this: ExtractHttpHeader - but I wouldn't dare say it's consistent :) 2. Since HTTP headers generally end with \n, I don't think we need to pass in delimiter here. 3. I find it somewhat unintuitive that targetHeaderName needs to include the colon here. IMO we should just pass the name here and deal with the colon in this function.
http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/plugin/PluginWbPassThrough.cpp:203: std::wstring src = szURL; On 2014/11/03 04:58:16, Felix H. Dahlke wrote: > Shouldn't this be stored as a CString? It's only ever used as a CString it > seems. There is a folow-up review where we will be removing the CString occurences. Since I was rewriting this code it made sense to use std::wstring off the bat. See http://codereview.adblockplus.org/4974480757620736/diff/5629499534213120/src/... http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/shared/Utils.h:54: T ExtractHTTPHeader(const T& allHeaders, const T& targetHeaderName, const T& delimiter) On 2014/11/03 04:58:16, Felix H. Dahlke wrote: > 1. I think we generally capitalise it like this: ExtractHttpHeader - but I > wouldn't dare say it's consistent :) > 2. Since HTTP headers generally end with \n, I don't think we need to pass in > delimiter here. > 3. I find it somewhat unintuitive that targetHeaderName needs to include the > colon here. IMO we should just pass the name here and deal with the colon in > this function. 2. Http headers can be seperated by '\0' as well. See HTTP_QUERY_RAW_HEADERS http://msdn.microsoft.com/en-us/library/windows/desktop/aa385351(v=vs.85).aspx. I just want to keep this function as flexible as possible for future. Made that argument optional.
http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/shared/Utils.h:54: T ExtractHTTPHeader(const T& allHeaders, const T& targetHeaderName, const T& delimiter) On 2014/11/03 04:58:16, Felix H. Dahlke wrote: > 2. Since HTTP headers generally end with \n, I don't think we need to pass in > delimiter here. > 3. I find it somewhat unintuitive that targetHeaderName needs to include the > colon here. IMO we should just pass the name here and deal with the colon in > this function. If we really want to have it in utils then tests are required as well. But I would say we are trying to generalize it too early and it leads only to unnecessary logic and questions (btw, it violates such buzzwords KISS and "the rule of three [refactoring]"). I would keep it in PluginWbPassThrough.cpp file without default values and with requirement to include colon in target header name. http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... src/shared/Utils.h:50: T TrimString(T text, std::function<bool(typename T::value_type)> differentiator) const references are missed. http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... src/shared/Utils.h:60: T CreateConstString(std::string val); `cosnt std::string&`
http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/shared/Utils.h:54: T ExtractHTTPHeader(const T& allHeaders, const T& targetHeaderName, const T& delimiter) On 2014/11/03 14:17:46, sergei wrote: > On 2014/11/03 04:58:16, Felix H. Dahlke wrote: > > 2. Since HTTP headers generally end with \n, I don't think we need to pass in > > delimiter here. > > 3. I find it somewhat unintuitive that targetHeaderName needs to include the > > colon here. IMO we should just pass the name here and deal with the colon in > > this function. > > If we really want to have it in utils then tests are required as well. > But I would say we are trying to generalize it too early and it leads only to > unnecessary logic and questions (btw, it violates such buzzwords KISS and "the > rule of three [refactoring]"). I would keep it in PluginWbPassThrough.cpp file > without default values and with requirement to include colon in target header > name. Yes, I can see how the changes to make this more generalized turned out unnecessarily complex for reading. And indeed we are not very likely to use this code anywhere else (no third use). So I would actually agree with Sergei to keep it simple and revert this part to just a narrowly specialized function in PluginWbPassThrough.cpp.
http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4974480757620736/diff/5681717746597888/src/... src/shared/Utils.h:54: T ExtractHTTPHeader(const T& allHeaders, const T& targetHeaderName, const T& delimiter) On 2014/11/04 12:02:42, Oleksandr wrote: > On 2014/11/03 14:17:46, sergei wrote: > > On 2014/11/03 04:58:16, Felix H. Dahlke wrote: > > > 2. Since HTTP headers generally end with \n, I don't think we need to pass > in > > > delimiter here. > > > 3. I find it somewhat unintuitive that targetHeaderName needs to include the > > > colon here. IMO we should just pass the name here and deal with the colon in > > > this function. > > > > If we really want to have it in utils then tests are required as well. > > But I would say we are trying to generalize it too early and it leads only to > > unnecessary logic and questions (btw, it violates such buzzwords KISS and "the > > rule of three [refactoring]"). I would keep it in PluginWbPassThrough.cpp file > > without default values and with requirement to include colon in target header > > name. > > Yes, I can see how the changes to make this more generalized turned out > unnecessarily complex for reading. And indeed we are not very likely to use this > code anywhere else (no third use). So I would actually agree with Sergei to keep > it simple and revert this part to just a narrowly specialized function in > PluginWbPassThrough.cpp. Fully agree, good points Sergei. http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... src/plugin/PluginWbPassThrough.cpp:196: std::string ExtractHttpAcceptHeader(IInternetProtocol* inernetProtocol) internetProtocol with a t, I guess? Can you move this to the unnamed namespace above to avoid polluting the global namespace? http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... src/plugin/PluginWbPassThrough.cpp:199: // only HTTP_QUERY_RAW_HEADERS_CRLF | HTTP_QUERY_FLAG_REQUEST_HEADERS does dork. does dork? That a freudian? :D http://codereview.adblockplus.org/4974480757620736/diff/5713144022302720/src/... src/plugin/PluginWbPassThrough.cpp:262: std::string acceptHeader = ExtractHttpAcceptHeader(m_spTargetProtocol); I still think we can move this up, i.e. this and the line below could be the first thing after DEBUG_GENERAL(ToCString(src)). The point is mainly that the code before and after is dealing with pszAdditionalHeaders, so this seems intermixed.
http://codereview.adblockplus.org/4974480757620736/diff/5678807906254848/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5678807906254848/src/... src/plugin/PluginWbPassThrough.cpp:39: std::string ExtractHttpAcceptHeader(IInternetProtocol* internetProtocol) Would you shift this function one line below, so one line between functions and no empty line below. http://codereview.adblockplus.org/4974480757620736/diff/5678807906254848/src/... src/plugin/PluginWbPassThrough.cpp:63: return ExtractHttpHeader<std::string>(buf, std::string("Accept:"), "\r\n"); It's not necessary to explicitly construct std::string argument because we call it already as `ExtractHttpHeader<std::string>` here and below with std::wstring. But it's OK. http://codereview.adblockplus.org/4974480757620736/diff/5678807906254848/src/... src/plugin/PluginWbPassThrough.cpp:65: this line
On 2014/11/05 00:27:23, Oleksandr wrote: LGTM
http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:26: auto targetHeaderBeginsAt = allHeaders.find(targetHeaderName); I still find it unintuitive that the colon is part of "header name" here. Why not just append the colon here? More importantly, header names are generally case insensitive, are we sure both HTTP_QUERY_RAW_HEADERS_CRLF and pszAdditionalHeaders will always return them in a way our case sensitive comparison expects? http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:272: DEBUG_GENERAL(ToCString(src)); Hehe, still not quite what I meant :D Wouldn't it make more sense to: 1. Have these two lines (i.e. DEBUG_GENERAL) be the first thing the function does 2. Move the if (pszAdditionalHeaders) down to where we actually use pszAdditionalHeaders? Just wanted to bring across what I mean, go with what you think makes most sense.
http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:26: auto targetHeaderBeginsAt = allHeaders.find(targetHeaderName); On 2014/11/05 09:04:52, Felix H. Dahlke wrote: > I still find it unintuitive that the colon is part of "header name" here. Why > not just append the colon here? > > More importantly, header names are generally case insensitive, are we sure both > HTTP_QUERY_RAW_HEADERS_CRLF and pszAdditionalHeaders will always return them in > a way our case sensitive comparison expects? 1. Colon is part of header name because this is a templated function and otherwise we'll end up with stuff CreateConstString (see Patchset 5). This is to keep it simple. 2. The headers we get from IE are constructed by IE (not some third party code). So it is always in the same format. Besides, if we wanted to convert everything to lower case ::tolower wouldn't work (since it's meant to be used for chars only, not wchars. So we'll make things unnecessarily complex again. http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:272: DEBUG_GENERAL(ToCString(src)); On 2014/11/05 09:04:52, Felix H. Dahlke wrote: > Hehe, still not quite what I meant :D Wouldn't it make more sense to: > > 1. Have these two lines (i.e. DEBUG_GENERAL) be the first thing the function > does > 2. Move the if (pszAdditionalHeaders) down to where we actually use > pszAdditionalHeaders? > > Just wanted to bring across what I mean, go with what you think makes most > sense. That makes sense, yes.
LGTM!
http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:26: auto targetHeaderBeginsAt = allHeaders.find(targetHeaderName); On 2014/11/05 10:01:58, Oleksandr wrote: > On 2014/11/05 09:04:52, Felix H. Dahlke wrote: > > I still find it unintuitive that the colon is part of "header name" here. Why > > not just append the colon here? > > > > More importantly, header names are generally case insensitive, are we sure > both > > HTTP_QUERY_RAW_HEADERS_CRLF and pszAdditionalHeaders will always return them > in > > a way our case sensitive comparison expects? > > 1. Colon is part of header name because this is a templated function and > otherwise we'll end up with stuff CreateConstString (see Patchset 5). This is to > keep it simple. > 2. The headers we get from IE are constructed by IE (not some third party code). > So it is always in the same format. Besides, if we wanted to convert everything > to lower case ::tolower wouldn't work (since it's meant to be used for chars > only, not wchars. So we'll make things unnecessarily complex again. 1. T is std::string or std::wstring. The variant with template specialization looks like overengineering here. May be such functions like `ConvertToResult<From, To>::Type ConvertTo<To>(const From&)` could help but we don't have them as well as we can use something like `static_cast<typename T::value_type>(':')`, but it still smells. What about `targetHeaderNameWithColon`? 2. It works now, but I also think it's important. I would say that content type should also be case-insensitive. BTW, the headers are not always constructed by IE. `BeginningTransaction` is provided by the client and the client sets or modifies the header. (as I pointed before we can actually use it to implement additional features like hiding the real user agent). A real example is flash addon, as I see it adds its own header which we can use for better object-subrequest detection as well (should be as another issue). I think that we can discuss the details how it should be implemented in the issue tracker. I would like to have it as a good-first-bug, it's an excellent challenging task which can wait for now I think ;)
Renamed the parameter. Closing this and pushing. http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/4974480757620736/diff/5630721452408832/src/... src/plugin/PluginWbPassThrough.cpp:26: auto targetHeaderBeginsAt = allHeaders.find(targetHeaderName); On 2014/11/05 10:28:59, sergei wrote: > On 2014/11/05 10:01:58, Oleksandr wrote: > > On 2014/11/05 09:04:52, Felix H. Dahlke wrote: > > > I still find it unintuitive that the colon is part of "header name" here. > Why > > > not just append the colon here? > > > > > > More importantly, header names are generally case insensitive, are we sure > > both > > > HTTP_QUERY_RAW_HEADERS_CRLF and pszAdditionalHeaders will always return them > > in > > > a way our case sensitive comparison expects? > > > > 1. Colon is part of header name because this is a templated function and > > otherwise we'll end up with stuff CreateConstString (see Patchset 5). This is > to > > keep it simple. > > 2. The headers we get from IE are constructed by IE (not some third party > code). > > So it is always in the same format. Besides, if we wanted to convert > everything > > to lower case ::tolower wouldn't work (since it's meant to be used for chars > > only, not wchars. So we'll make things unnecessarily complex again. > > 1. T is std::string or std::wstring. The variant with template specialization > looks like overengineering here. May be such functions like > `ConvertToResult<From, To>::Type ConvertTo<To>(const From&)` could help but we > don't have them as well as we can use something like `static_cast<typename > T::value_type>(':')`, but it still smells. What about > `targetHeaderNameWithColon`? > 2. It works now, but I also think it's important. I would say that content type > should also be case-insensitive. > BTW, the headers are not always constructed by IE. `BeginningTransaction` is > provided by the client and the client sets or modifies the header. (as I > pointed before we can actually use it to implement additional features like > hiding the real user agent). A real example is flash addon, as I see it adds its > own header which we can use for better object-subrequest detection as well > (should be as another issue). > I think that we can discuss the details how it should be implemented in the > issue tracker. I would like to have it as a good-first-bug, it's an excellent > challenging task which can wait for now I think ;) Created https://issues.adblockplus.org/ticket/1529 on this. |