Issue 5921969115496448: Issue 1115 - Some yahoo page not correctly shown on IE8 when ABP enabled

sergei

Jan. 29, 2015, 3:24 p.m. (2015-01-29 15:24:51 UTC) #1

Eric

Since the content of http://codereview.adblockplus.org/5316782940225536/ has overlaps with the present review, I'd expect a rebased/reworked ...

Feb. 2, 2015, 6:41 p.m. (2015-02-02 18:41:58 UTC) #2

sergei

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/plugin/PluginWbPassThrough.cpp#newcode93 src/plugin/PluginWbPassThrough.cpp:93: auto lastDotIt = std::find(value.rbegin(), value.rend(), L'.').base(); On 2015/02/02 18:41:58, ...

Feb. 12, 2015, 2:44 p.m. (2015-02-12 14:44:06 UTC) #3

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/plugin/PluginWbPassThrough.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/plugin/PluginWbPassThrough.cpp:93: auto lastDotIt =
std::find(value.rbegin(), value.rend(), L'.').base();
On 2015/02/02 18:41:58, Eric wrote:
> It's much clearer just to call 'rfind':
> 
>     auto lastDotPos = value.rfind(L'.');

Good point!
fixed

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/plugin/PluginWbPassThrough.cpp:175: {
On 2015/02/02 18:41:58, Eric wrote:
> As it's already been mentioned in the comments, it's hazardous to try to
derive
> a content type from the query string.
> https://issues.adblockplus.org/ticket/1115#comment:7
> 
> Since this is only a problem on IE 8, we should add a version check here. If
the
> version is >= 9, we should always return here, even for 'contentTypeAny'

Actually, I'm not sure that it's a problem of only IE8, the execution flow gets
there in IE11 as well but I don't know concrete web site where it can derive the
content type from the query string parameter. However, I also agree that it can
be hazardous, so I've added it only for IE8.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:157: }
On 2015/02/02 18:41:58, Eric wrote:
> If you increment questionSignPos here, you get a (valid) position for the
> beginning of the query string. It eliminates the +1 and -1 in the return
> statement and makes the code a little easier to follow.

done

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:166: void SplitString(const std::wstring& value, wchar_t
delimiter,
On 2015/02/02 18:41:58, Eric wrote:
> A better name would be ForEachToken(), by analogy with 'std::for_each'. When I
> see SplitString, I'm expecting something rather returns, say, a vector.
done.
> 
> I'd recommend changing the name of tokenHandler, which is pretty generic
already
> to something completely generic, like 'fn' or 'op' (the first as in
> 'std::for_each', the second as in 'std::transform').
Why? I've changed them in the header but in the code they reduce the
readability.
> Perhaps even better, just make a token iterator class and call it with
> 'for_each' instead of the (just one single) call this function below.
I also very like the idea with token iterators but I don't think it's better
here. It's harder to make a mistake working with `ForEachToken(src, delimiter,
tokenHandler)` than with custom iterators. Let's do it later when we really need
it.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:172: }
On 2015/02/02 18:41:58, Eric wrote:
> I would write the following loop as a 'for' loop. It makes termination
clearer.
> 
>   for( size_t delimiterPos = 0 ; delimiterPos != std::wstring::npos ;
> ++delimiterPos )
>   {
>     size_t prevDelimiterPos = delimiterPos;
>     delimiterPos = value.find(delimiter, prevDelimiterPos);
>     //...
>   }

Firstly, the proposed statements in the loop are not correct. Secondly, for-loop
consists from the following steps, 'initialization', 'condition', 'increment'
and 'statement' (`for(initialization, condition, increment){statement();}`).
Taking into account that 'increment' is last three lines of the loop body as
well as we can leave the loop not only because the 'condition' is false, I would
say it's more logical to use while-loop here.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:188: void ProcessQueryStringParameters(const std::wstring&
queryString,
On 2015/02/02 18:41:58, Eric wrote:
> Similiary, 'ForEachQueryParameter()' would be a better name.

done.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:191: if (queryString.empty() || !parameterHandler)
On 2015/02/02 18:41:58, Eric wrote:
> Why bother with the term "!parameterHandler" when you're passing
> 'parameterHandler' by reference? Under what circumstances would that term ever
> be true?

std::function<...> can be an empty object, so before calling it, we should
ensure that it's not empty.

Anyway, I've removed it here and in ForEachToken because it's actually covered
by the test (see
http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...)

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:195: auto unparsedParameterHandler =
[&parameterHandler](std::wstring::const_iterator begin,
std::wstring::const_iterator end)->bool
On 2015/02/02 18:41:58, Eric wrote:
> No need for a local variable here; this lambda can be defined directly as a
> function argument below.

done

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/shared/Utils.h (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.h:59: /// Splits the string using delimiter
On 2015/02/02 18:41:58, Eric wrote:
> Might as well use a doxygen-style comment here. 
> 
>   /**
>    * Splits a string into tokens separated by a delimiter.
>    * Applies a function to each token.
>    */

fixed.

BTW, it's not necessary to have additional lines /** and leading */, doxygen
should perfectly work with /// as well.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.h:63: /// until it returns false.
On 2015/02/02 18:41:58, Eric wrote:
> /**
>  * Applies a function to each name=value pair in a query string.
>  * Terminates early if the function returns false.
>  */

done

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
File test/UtilGetQueryString.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilGetQueryString.cpp:29: }
On 2015/02/02 18:41:58, Eric wrote:
> Need a test for an empty query string
> 
>     L"schema://host/path1/path2?"

added

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilGetQueryString.cpp:34: }
On 2015/02/02 18:41:58, Eric wrote:
> There are three tests with empty anchor strings. Need three more with
non-empty
> anchor strings.

I disagree here. According to such logic we should create the infinite number of
tests to test absolutely all cases which we cannot achieve. Now we cover all
execution paths of the tested function body, it should be enough.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
File test/UtilProcessQueryStringParameters.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilProcessQueryStringParameters.cpp:26:
ProcessQueryStringParameters(L"some=qyery&string", std::function<bool(const
std::wstring&, const std::wstring&)>());
On 2015/02/02 18:41:58, Eric wrote:
> Use ASSERT_NO_THROW here to clearly indicate the intent of the test.
I've changed it, now it throws and there is ASSERT_THROW

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilProcessQueryStringParameters.cpp:216: }
On 2015/02/02 18:41:58, Eric wrote:
> You need a test for a query string with a non-empty name and an empty value,
> both with and without the '=' symbol.
> 
>   L"param="
>   L"param"
> 
It's already here, ParameterWithoutAssignSign, ParametersWithoutAssignSign,
ParameterWithEmptyValue.

> You also need a test for a query string with an empty name and non-empty
value.
> 
>   L"=valueonly"
> 
> It's not a valid URL string, but you should ensure that it behaves as you
want,
> either throwing an exception, returning an error, or simply initializing an
> empty name. At the very least, if it's not supposed to throw an exception,
test
> that it doesn't.
I've added such tests.

JIC, I want to clarify, we should not throw any exception on unusual user input.

Eric

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp File src/shared/Utils.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp#newcode172 src/shared/Utils.cpp:172: } On 2015/02/12 14:44:06, sergei wrote: > Firstly, the ...

Feb. 12, 2015, 5:24 p.m. (2015-02-12 17:24:41 UTC) #4

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:172: }
On 2015/02/12 14:44:06, sergei wrote:
> Firstly, the proposed statements in the loop are not correct. 

Look more closely.

> Taking into account that 'increment' is last three lines of the loop body as
> well as we can leave the loop not only because the 'condition' is false, I
would
> say it's more logical to use while-loop here.

Providing a second condition to leave the loop is part of the point. 

The critique you are making, that there are two ways of terminating the loop, is
also present in the existing code. There's termination if a position reaches the
end or termination if the function argument returns true. There's no difference
there.

The advantage of the for-loop form is that you don't have to read the body of
the loop to verify that the loop terminates.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:191: if (queryString.empty() || !parameterHandler)
On 2015/02/12 14:44:06, sergei wrote:
> std::function<...> can be an empty object, so before calling it, we should
> ensure that it's not empty.

I had forgotten that you can have a non-empty reference and also a null
function, so this condition is actually meaningful.

Any iteration with a null function is the same as an empty statement. So I'm
withdrawing my original objection. The original behavior, to do nothing if the
function provide is null, is better behavior.

> Anyway, I've removed it here and in ForEachToken because it's actually covered
> by the test

This works too, though now that I've seen both I prefer the original. If you
want to keep the condition out (as with patch set 2), it deserves a document
comment in the header file that this function throws if the function argument is
null.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/shared/Utils.h (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.h:59: /// Splits the string using delimiter
On 2015/02/12 14:44:06, sergei wrote:
> BTW, it's not necessary to have additional lines /** and leading */, doxygen
> should perfectly work with /// as well.

Minimizing whitespace is not always a virtue. I find the /**-style doxygen
comment much easier to read. That's what you've done, though, so I'm fine.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
File test/UtilGetQueryString.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilGetQueryString.cpp:34: }
On 2015/02/12 14:44:06, sergei wrote:
> I disagree here. According to such logic we should create the infinite number
of
> tests to test absolutely all cases which we cannot achieve. 

No, not an infinite number of tests. Simply a few more tests here.

> Now we cover all
> execution paths of the tested function body, it should be enough.

Unit tests should treat their units as black boxes, not white boxes. You can't
see inside a black box; you see inside a white box. Arguing that all execution
paths are exercised is contrary to this principle.

Unit tests should be written so that someone else implementing the same function
has the benefit of exhaustive testing. I've written lots of parsers in my life.
It is particularly useful for parsers to test exhaustively, for some reason.
When testing a parser, it's important to test as many variations in the grammar
as possible. In the present case, if it's possible for some element of a parsed
string to be empty, there should be unit tests when it's empty.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
File test/UtilProcessQueryStringParameters.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilProcessQueryStringParameters.cpp:216: }
On 2015/02/12 14:44:06, sergei wrote:
> JIC, I want to clarify, we should not throw any exception on unusual user
input.

This comment belongs as documentation as a function header.

Sloppy error handling in parsing user input is a major source of security
defects. It matters less how you do error handling than that it be done
explicitly and precisely.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/plugin/PluginWbPassThrough.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/plugin/PluginWbPassThrough.cpp:185: contentType ==
ContentType::CONTENT_TYPE_OTHER)
I don't think it matters all that much, but it would be a little better to test
content type first and check the IE version only if we need to.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/shared/Utils.cpp:175: }
This version much clearer and easier to verify correctness, including better
local variable names.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
File test/UtilForEachQueryStringParameterTest.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:24: struct ParameterHandler
The code reads much better with this class present.

Now that I've seen the new version, I'll just observe that there's lot of
duplication in verifying the resulting vectors. For whatever reason, it bothers
me less.

That's partly because our current compiler doesn't support C++11 uniform
initializer lists, so writing a single expression for expected results isn't
possible.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:43: struct LimitedParameterHandler
: ParameterHandler
I think I'd prefer than ParameterHandler take the moral equivalent of
shouldContinue() here as a constructor argument. My reason for this is that the
code here only applies to a single test case and so the code for that test is
split between two, separate, non-contiguous types.

I'd be satisfied, though, if you declared this call immediately before the unit
test in which it's used.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:52: return shouldContinue() &&
ParameterHandler::operator()(parameter, value) && shouldContinue();
No need for the first shouldContinue() call here. The only time it could matter
in the present case is when 'limit' is zero.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:72:
ForEachQueryStringParameter(L"", std::ref(parameterHandler));
Do we need reference wrappers? It seems like would should be able to declare the
ForEach... function arguments as rvalue references rather than regular
references. 

If we do, we should likely use 'std::cref' here.

Oleksandr

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp#newcode187 src/plugin/PluginWbPassThrough.cpp:187: ForEachQueryStringParameter(GetQueryString(src), Why can't we just use strtok here? The ...

Feb. 13, 2015, 3:32 a.m. (2015-02-13 03:32:16 UTC) #5

sergei

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp File src/shared/Utils.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp#newcode172 src/shared/Utils.cpp:172: } On 2015/02/12 17:24:41, Eric wrote: > On 2015/02/12 ...

Feb. 13, 2015, 3:21 p.m. (2015-02-13 15:21:56 UTC) #6

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:172: }
On 2015/02/12 17:24:41, Eric wrote:
> On 2015/02/12 14:44:06, sergei wrote:
> > Firstly, the proposed statements in the loop are not correct. 
> 
> Look more closely.
It results in an infinite cycle. Look, at the end `delimiterPos =
value.find(delimiter, prevDelimiterPos);` sets `delimiterPos` to
std::string::npos which is the greatest possible value for an element of type
size_t and then 'increment' step of for-loop is executed, (`++delimiterPos`)
which results into setting `delimiterPos` value to zero (actually, integer
overflow is undefined behaviour), and it starts from the beginning.
So, I still cannot say that the proposed approach is easier to understand.

> 
> > Taking into account that 'increment' is last three lines of the loop body as
> > well as we can leave the loop not only because the 'condition' is false, I
> would
> > say it's more logical to use while-loop here.
> 
> Providing a second condition to leave the loop is part of the point. 
> 
> The critique you are making, that there are two ways of terminating the loop,
is
> also present in the existing code. There's termination if a position reaches
the
> end or termination if the function argument returns true. There's no
difference
> there.
> 
> The advantage of the for-loop form is that you don't have to read the body of
> the loop to verify that the loop terminates.
I agree here, I personally don't like to have exit points in the body of any
loop as well as several exist points from the function, except the cases when we
return the error. I usually create additional boolean variable which indicates
whether the loop should stop or not and the function returns the correct value
only at one place. But as I got feeling that it's considered as something bad
here.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/...
src/shared/Utils.cpp:191: if (queryString.empty() || !parameterHandler)
On 2015/02/12 17:24:41, Eric wrote:
> On 2015/02/12 14:44:06, sergei wrote:
> > std::function<...> can be an empty object, so before calling it, we should
> > ensure that it's not empty.
> 
> I had forgotten that you can have a non-empty reference and also a null
> function, so this condition is actually meaningful.
> 
> Any iteration with a null function is the same as an empty statement. So I'm
> withdrawing my original objection. The original behavior, to do nothing if the
> function provide is null, is better behavior.
> 
> > Anyway, I've removed it here and in ForEachToken because it's actually
covered
> > by the test
> 
> This works too, though now that I've seen both I prefer the original. If you
> want to keep the condition out (as with patch set 2), it deserves a document
> comment in the header file that this function throws if the function argument
is
> null.

I've added the comments.

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
File test/UtilGetQueryString.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/test...
test/UtilGetQueryString.cpp:34: }
On 2015/02/12 17:24:41, Eric wrote:
> On 2015/02/12 14:44:06, sergei wrote:
> > I disagree here. According to such logic we should create the infinite
number
> of
> > tests to test absolutely all cases which we cannot achieve. 
> 
> No, not an infinite number of tests. Simply a few more tests here.
> 
> > Now we cover all
> > execution paths of the tested function body, it should be enough.
> 
> Unit tests should treat their units as black boxes, not white boxes. You can't
> see inside a black box; you see inside a white box. Arguing that all execution
> paths are exercised is contrary to this principle.
> 
> Unit tests should be written so that someone else implementing the same
function
> has the benefit of exhaustive testing. I've written lots of parsers in my
life.
> It is particularly useful for parsers to test exhaustively, for some reason.
> When testing a parser, it's important to test as many variations in the
grammar
> as possible. In the present case, if it's possible for some element of a
parsed
> string to be empty, there should be unit tests when it's empty.

Added.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/plugin/PluginWbPassThrough.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/plugin/PluginWbPassThrough.cpp:185: contentType ==
ContentType::CONTENT_TYPE_OTHER)
On 2015/02/12 17:24:41, Eric wrote:
> I don't think it matters all that much, but it would be a little better to
test
> content type first and check the IE version only if we need to.

Changed.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/plugin/PluginWbPassThrough.cpp:187:
ForEachQueryStringParameter(GetQueryString(src),
On 2015/02/13 03:32:16, Oleksandr wrote:
> Why can't we just use strtok here? The code would be much simpler, IMHO.

strtok is unsafe and I would say the using of any function from the family is
not simpler, https://msdn.microsoft.com/en-us/library/ftsafwz3.aspx

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/shared/Utils.cpp:156: schemeAndHierarchicalPartEndsAt = url.find(L'#');
On 2015/02/13 03:32:16, Oleksandr wrote:
> So what if there's neither '?' nor '#' in the url?
It means there is neither query nor fragment and we will return the original
URL. Actually, there is the test for that case.

> Even more, why not just use CreateUri
> (https://msdn.microsoft.com/en-us/library/ie/ms775098%28v=vs.85%29.aspx) and
then
> just IUri::GetQuery? Or maybe WinHttpCrackUrl or InternetCrackUrl. For this
and
> GetQueryString below.
>
https://msdn.microsoft.com/en-us/library/windows/desktop/aa384092%2528v=vs.85...
> or
>
https://msdn.microsoft.com/en-us/library/windows/desktop/aa384376%28v=vs.85%2...

I thought that it's an unnecessary overhead, especially I'm afraid that to parse
the URL they can instantiate a protocol handler (BTW, our implementation of
protocol handler) to use IInternetProtocolInfo::ParseUrl. At least I've seen
such behaviour but I don't remember whether it was for one of the mentioned
function or not.
CreateUri looks fine for me, should I investigate it and change?

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
File test/UtilForEachQueryStringParameterTest.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:24: struct ParameterHandler
On 2015/02/12 17:24:41, Eric wrote:
> The code reads much better with this class present.
> 
> Now that I've seen the new version, I'll just observe that there's lot of
> duplication in verifying the resulting vectors. For whatever reason, it
bothers
> me less.
> 
> That's partly because our current compiler doesn't support C++11 uniform
> initializer lists, so writing a single expression for expected results isn't
> possible.

Yes, it's not possible due to compiler limitation, in addition, having
verification statements in the body of test (not in the common function) can be
useful because in that case we immediately know the exact location of the
failure.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:43: struct LimitedParameterHandler
: ParameterHandler
On 2015/02/12 17:24:41, Eric wrote:
> I think I'd prefer than ParameterHandler take the moral equivalent of
> shouldContinue() here as a constructor argument. My reason for this is that
the
> code here only applies to a single test case and so the code for that test is
> split between two, separate, non-contiguous types.
> 
> I'd be satisfied, though, if you declared this call immediately before the
unit
> test in which it's used.

I thought about it when I was creating of it and decided that it would be better
to have it as separate class. 
It's moved before the usage.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:52: return shouldContinue() &&
ParameterHandler::operator()(parameter, value) && shouldContinue();
On 2015/02/12 17:24:41, Eric wrote:
> No need for the first shouldContinue() call here. The only time it could
matter
> in the present case is when 'limit' is zero.
I've added it with another intention but it's fine without it. Removed.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:72:
ForEachQueryStringParameter(L"", std::ref(parameterHandler));
On 2015/02/12 17:24:41, Eric wrote:
> Do we need reference wrappers? It seems like would should be able to declare
the
> ForEach... function arguments as rvalue references rather than regular
> references. 
> 
> If we do, we should likely use 'std::cref' here.

The tested function accepts `std::function` but we are passing an object with
the `operator()`. When `std::function` is implicitly constructed from it, it
copies the passed object, in that case the tests fail because the tested
function will populate the temporary object held by `std::function`. To avoid
such copying one can pass std::reference_wrapper or use some another technique,
like put `ParameterHandler` members into shared object or use lambda which
captures it by reference. I guess, `std::reference_wrapper` is easier here.

We cannot use std::cref because the object is not constant, we need call
ParameterHandler::operator()(...) which has no `const` modifier. An attempt to
use `std::cref` results into compilation error.

Eric

LGTM. There are some small ways of improving the code in this change set, but ...

Feb. 13, 2015, 4:33 p.m. (2015-02-13 16:33:42 UTC) #7

LGTM.

There are some small ways of improving the code in this change set, but nothing
that I think must be done beforehand.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/plugin/PluginWbPassThrough.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/plugin/PluginWbPassThrough.cpp:187:
ForEachQueryStringParameter(GetQueryString(src),
On 2015/02/13 03:32:16, Oleksandr wrote:
> Why can't we just use strtok here? The code would be much simpler, IMHO.

For the reasons that strtok won't work here, see
http://stackoverflow.com/questions/4031075/strtok-function-thread-safety

Microsoft doesn't supply any variation of strtok_r, which might work.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
File test/UtilForEachQueryStringParameterTest.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:24: struct ParameterHandler
On 2015/02/13 15:21:56, sergei wrote:
> having
> verification statements in the body of test (not in the common function) can
be
> useful because in that case we immediately know the exact location of the
> failure.

I've written such things, and you can get the location by how you write the
common verification function. On the other hand, I've used lots of unit test
frameworks but I've not needed to do this in google-test, so I'm not sure what
the details and limitations are for the package we're using...

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:43: struct LimitedParameterHandler
: ParameterHandler
On 2015/02/13 15:21:56, sergei wrote:
> I thought about it when I was creating of it and decided that it would be
better
> to have it as separate class. 

If we had more variation in the tests, subclassing each time would become
problematic. With just one variation, it's fine.

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/test...
test/UtilForEachQueryStringParameterTest.cpp:72:
ForEachQueryStringParameter(L"", std::ref(parameterHandler));
On 2015/02/13 15:21:56, sergei wrote:
> The tested function accepts `std::function` but we are passing an object with
> the `operator()`.

You may have misunderstood. I was suggesting declaring

    void ForEachQueryStringParameter(..., const std::function<...>&&
parameterHandler)

instead of 

    void ForEachQueryStringParameter(..., const std::function<...>&&
parameterHandler)

> To avoid
> such copying one can pass std::reference_wrapper or use some another
technique,
> like put `ParameterHandler` members into shared object or use lambda which
> captures it by reference.

My suggestion boils down to avoiding copy by replacing it with a move.

> We cannot use std::cref because the object is not constant, we need call
> ParameterHandler::operator()(...) which has no `const` modifier. An attempt to
> use `std::cref` results into compilation error.

I'm misunderstanding std::cref, I guess. I would have called it std::pure_ref if
that's its actual behavior.

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/src/...
File src/shared/Utils.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/src/...
src/shared/Utils.cpp:178: const std::function<bool(std::wstring::const_iterator
begin, std::wstring::const_iterator end)>& tokenHandler)
You might change "tokenHandler" to "fn" here. Or at least "tokenFn".

This is not a strong opinion.

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/src/...
src/shared/Utils.cpp:200: const std::function<bool(const std::wstring& name,
const std::wstring& value)>& parameterHandler)
See comment above.

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/test...
File test/UtilForEachQueryStringParameterTest.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/test...
test/UtilForEachQueryStringParameterTest.cpp:160: } parameterHandler(2);
If you'd like to have the subclass definition inside the function body for a
single use, it would be clearer to eliminate the constructor argument and
initialize with the clause "limit(2)".

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/test...
File test/UtilGetQueryStringTest.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5728116278296576/test...
test/UtilGetQueryStringTest.cpp:38: // 1 1 1 1 1
Ah. This is a good comment.

Exhaustive testing makes me very pleased.

Eric

http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp File src/shared/Utils.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5668600916475904/src/shared/Utils.cpp#newcode172 src/shared/Utils.cpp:172: } While I think the loop could be improved, ...

Feb. 13, 2015, 4:38 p.m. (2015-02-13 16:38:22 UTC) #8

Oleksandr

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp#newcode187 src/plugin/PluginWbPassThrough.cpp:187: ForEachQueryStringParameter(GetQueryString(src), strotok_s would actually work good here. It has ...

Feb. 16, 2015, 6:59 a.m. (2015-02-16 06:59:28 UTC) #9

sergei

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp#newcode187 src/plugin/PluginWbPassThrough.cpp:187: ForEachQueryStringParameter(GetQueryString(src), On 2015/02/16 06:59:28, Oleksandr wrote: > This would ...

Feb. 16, 2015, 10:54 a.m. (2015-02-16 10:54:33 UTC) #10

Eric

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp#newcode187 src/plugin/PluginWbPassThrough.cpp:187: ForEachQueryStringParameter(GetQueryString(src), There are two underlying observations here. The first ...

Feb. 16, 2015, 6:18 p.m. (2015-02-16 18:18:06 UTC) #11

sergei

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp File src/plugin/PluginWbPassThrough.cpp (right): http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/plugin/PluginWbPassThrough.cpp#newcode187 src/plugin/PluginWbPassThrough.cpp:187: ForEachQueryStringParameter(GetQueryString(src), On 2015/02/16 18:18:07, Eric wrote: > There are ...

Feb. 20, 2015, 11:17 a.m. (2015-02-20 11:17:03 UTC) #12

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
File src/plugin/PluginWbPassThrough.cpp (right):

http://codereview.adblockplus.org/5921969115496448/diff/5766466041282560/src/...
src/plugin/PluginWbPassThrough.cpp:187:
ForEachQueryStringParameter(GetQueryString(src),
On 2015/02/16 18:18:07, Eric wrote:
> There are two underlying observations here. The first is that there's a
library
> function, albeit a non-standard Microsoft-specific one, that can split tokens.
> The second is that, _for this particular function_, there's no need to
> distinguish between name and value in a URL parameter string.
Good, but what does it give us? We use a lot of Windows API, yet one
Microsoft-specific should not cause any more trouble. And it looks reasonable to
start with the value and then try with the name but it's too premature
assumption, so I've removed it without any comments.
> 
> Having seen the new code, I greatly prefer the use of a function iterator and
an
> inline lambda. It find it easier to understand that the new commingled
version.
> Separating syntax and semantics is almost always the right thing to do, in no
> small part because large numbers of security problems come from bad parsing
(the
> syntax part), and because it's much easier to validate parsing code when it's
> not commingled with other concerns.
I completely support it!
> 
> However, the previous version of the code could have been simpler. The
> particular algorithm used makes no distinction between name and value. Hence
it
> could have been written with 'ForEachToken' rather than
> 'ForEachQueryStringParameter'. In that case the body of the lambda would be
much
> smaller, something like this:
> 
>     if (token.empty()) return true;
>     contentType = GetContentTypeFromString(token);
>     return contentType == ContentType::CONTENT_TYPE_OTHER;
> 
> The other part, that about 'strtok_s', is a viable way of rewriting
> 'ForEachToken'. I wasn't fond of its previous implementation, although it was
> adequate. Using the library function would clean it up.
Actually, I find the previous implementation a bit more complicated than the
current because the previous one requires at least to read the description of
the function to understand the aim of the return value of the lambda. It's a
design flaw, it's better to use enum instead of bool there, but all these things
only complicate it, I guess.

I would not say that the using of any function from `strtok` family is easier
and is good for functions like `ForEachToken`. Pay attention that we pass
"immutable" object to `ForEachToken` (const string&) but `strtok` modifies it.
In this particular case it's clear and we are modifying the temporary object,
but, as I said, I find all such peculiarities only complicating the code.
In addition, the previous implementation of `ForEachToken` works with iterators
which is safer than working with the raw pointers.

So, I would like to keep the current version (with `strtok`) but if you want
it's completely fine for me to implement `ForEachToken` which works with a set
of delimiters (but not based on `strtok`).

Oleksandr

With the 2 nits addressed - LGTM http://codereview.adblockplus.org/5921969115496448/diff/5636470266134528/src/shared/Utils.h File src/shared/Utils.h (right): http://codereview.adblockplus.org/5921969115496448/diff/5636470266134528/src/shared/Utils.h#newcode62 src/shared/Utils.h:62: * Returns ...

March 3, 2015, 6:43 p.m. (2015-03-03 18:43:17 UTC) #13

sergei

http://codereview.adblockplus.org/5921969115496448/diff/5636470266134528/src/shared/Utils.h File src/shared/Utils.h (right): http://codereview.adblockplus.org/5921969115496448/diff/5636470266134528/src/shared/Utils.h#newcode62 src/shared/Utils.h:62: * Returns the begging of the URL which includes ...

March 4, 2015, 1:24 p.m. (2015-03-04 13:24:36 UTC) #14

Oleksandr

April 2, 2015, 8:14 a.m. (2015-04-02 08:14:32 UTC) #15

I didn't think this needed another LGTM. Here it goes.

Issue 5921969115496448: Issue 1115 - Some yahoo page not correctly shown on IE8 when ABP enabled (Closed)

Patch Set 1 : x #

Patch Set 2 : rebase and address comments #

Patch Set 3 : address comments #

Patch Set 4 : simplify processing of query string #

Patch Set 5 : rebase #

Patch Set 6 : fix and rebase #

Messages