|
|
Patch Set 1 #
Total comments: 5
Patch Set 2 : change tests #
Total comments: 2
Patch Set 3 : remove WriteStrings and ReadStrings #
Total comments: 11
MessagesTotal messages: 17
http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/src/... src/shared/Communication.h:41: TYPE_PROC, TYPE_STRING, TYPE_WSTRING, TYPE_INT64, TYPE_INT32, TYPE_BOOL, TYPE_STRINGS We should use TYPE_WSTRINGS. The string type used in the filter code is std::wstring (that is, where the conversion away from CString is complete). It will be simpler to transfer the type used in the plugin and do the conversion in the engine. If you think we will need both TYPE_STRINGS and TYPE_WSTRINGS, you can do it analogously to how TYPE_STRING and TYPE_WSTRING are already working. http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/src/... src/shared/Communication.h:95: InputBuffer& ReadStrings(std::vector<std::string>& value) This should be a template function parallel to ReadString() immediately above. http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/test... File test/CommunicationTest.cpp (right): http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/test... test/CommunicationTest.cpp:119: Communication::OutputBuffer outputBuffer; The core of these tests is the same code. It would be easier to read to use a common testing function here, in large part to make it clear that these testing are all doing essentially the same thing, varying only in arguments. http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/test... test/CommunicationTest.cpp:127: TEST(InputOutputBuffersTests, NonemptyVectors) I'd make this two tests.
http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5629499534213120/src/... src/shared/Communication.h:41: TYPE_PROC, TYPE_STRING, TYPE_WSTRING, TYPE_INT64, TYPE_INT32, TYPE_BOOL, TYPE_STRINGS On 2015/01/13 16:20:45, Eric wrote: > We should use TYPE_WSTRINGS. > > The string type used in the filter code is std::wstring (that is, where the > conversion away from CString is complete). It will be simpler to transfer the > type used in the plugin and do the conversion in the engine. > > If you think we will need both TYPE_STRINGS and TYPE_WSTRINGS, you can do it > analogously to how TYPE_STRING and TYPE_WSTRING are already working. I'm confused, we recently accepted https://issues.adblockplus.org/ticket/715 . I guess that the direction is that we should not work with std::wstring on the engine side, at least in the messages to communicate between Addon and Engine. So, I guess we don't need a version for std::wstring.
On 2015/01/27 15:28:38, sergei wrote: > I'm confused, we recently accepted https://issues.adblockplus.org/ticket/715 . Just saw that. > I > guess that the direction is that we should not work with std::wstring on the > engine side, at least in the messages to communicate between Addon and Engine. That seems to be what's happening, although I think that's an unfortunate choice. Wide-character is the standard character type for Win32. It would have been better to have standardized on wide characters everywhere, including the engine, which would mean doing all the conversions to/from libadblockplus on the engine side. > So, I guess we don't need a version for std::wstring. Unfortunately, I have to agree.
I might be missing something, but why do we need both ReadStrings in AdblockPlusClient.cpp, WriteStrings in Main.cpp and this? http://codereview.adblockplus.org/4806567450902528/diff/5685265389584384/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5685265389584384/src/... src/shared/Communication.h:172: for (const auto& str : value) How about being specific here and providing type explicitly? There is only one type possible here anyway. That would let us use WriteString instead of operator<<. Alternatively we can use templates here, I suppose, but I think that'll be premature. Same about for ReadStrings
On 2015/02/04 20:49:37, Oleksandr wrote: > I might be missing something, but why do we need both ReadStrings in > AdblockPlusClient.cpp, WriteStrings in Main.cpp and this? We need them to pass the frame hierarchy. Now in details. In particular, to decide whether we need to inject CSS or not we has to check whether it's whitelisted or not. To do that we need a frame hierarchy. BTW, in comparison with requests (remember referrer maps), for frames we can obtain the perfect frame hierarchy and we can do it only in Addon, not in Engine. I'm sure it would be too inefficient to have a series of calls to the engine for each frame, we could, firstly, obtain the frame hierarchy in Addon and then pass it to the engine. BTW, I guess, it's also useful and you might missed https://issues.adblockplus.org/ticket/1711, which implies that we need to pass the frame hierarchy to the core, I guess, for this ticket each node in the hierarchy should also contain the sitekey beside the frame URL. So, this ticket also implies that we firstly obtain the information about frames and then pass it to the core instead of having a series of calls to the core. http://codereview.adblockplus.org/4806567450902528/diff/5685265389584384/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5685265389584384/src/... src/shared/Communication.h:172: for (const auto& str : value) On 2015/02/04 20:49:37, Oleksandr wrote: > How about being specific here and providing type explicitly? There is only one > type possible here anyway. That would let us use WriteString instead of > operator<<. I would say that using of `operator<<(value)` is cleaner than `WriteString(value, TYPE_STRING);` because we cannot pass wrong type. The same for reading. What is the advantage of using `WriteString` here? > > Alternatively we can use templates here, I suppose, but I think that'll be > premature. > > Same about for ReadStrings I also think there is no need to use template now because it occurs only once here. I have not thought about it, but if we indeed start to pass std::vector of different types then we can refactor it into template method `template<typename T> OutputBuffer& WriteValues(const std::vector<T>& values, ValueType type);`.
> > I might be missing something, but why do we need both ReadStrings in > > AdblockPlusClient.cpp, WriteStrings in Main.cpp and this? > > We need them to pass the frame hierarchy. I understand why we need this, but I don't understand why do we need both ReadStrings and WriteStrings I mentioned above and this. Are you proposing to remove the old versions? Will that be done in a separate changeset?
On 2015/02/06 15:12:46, Oleksandr wrote: > > > I might be missing something, but why do we need both ReadStrings in > > > AdblockPlusClient.cpp, WriteStrings in Main.cpp and this? > > > > We need them to pass the frame hierarchy. > > I understand why we need this, but I don't understand why do we need both > ReadStrings and WriteStrings I mentioned above and this. Are you proposing to > remove the old versions? Will that be done in a separate changeset? I've created them analogously to the methods which work with a single string. Actually, we can move the impl into the body of the operators. Should I move it?
Sorry, now I see ReadStrings in AdblockPlusClient.cpp and WriteStrings in Main.cpp. Yes, it makes sense to use functions suggested in that review, so I will remove old functions in the next patch set.
On 2015/02/09 11:48:55, sergei wrote: > Sorry, now I see ReadStrings in AdblockPlusClient.cpp and WriteStrings in > Main.cpp. Yes, it makes sense to use functions suggested in that review, so I > will remove old functions in the next patch set. done
Now that the code is clearer, it has become clear that there's a design flaw in 'InputBuffer' and 'OutputBuffer'. The problem was there already. The present review perpetuates that flaw, but it would be better to fix it. The problem shows itself by needing to call 'ToUtf16Strings' immediately after every call to 'operator>>'. Requiring such call pairs is bad design, as it invites the error of forgetting to do manual type conversion. Likewise, the client code contains a large number of explicit calls to 'ToUtf16String' (for individual strings) immediately following 'operator>>'. And if the parallel problem isn't obvious enough, the client code also contains explicit calls to 'ToUtf8String' immediately following 'operator<<'. The right design is fairly clear 'operator>>' should return UTF-16 on the client side and UTF-8 on the server side, and analogously for 'operator<<'. This should be true for both individual strings and vectors of them. The trouble is with the implementations of 'ReadString' and 'WriteString'. They are template functions taking either 'string' or 'wstring' as the template argument. The problem is that this template argument chooses character encoding over the pipe and is ignorant of UTF issues. The template argument ought to determine how the data in the pipe is interpreted, not how it's transmitted. We can transmit data always in UTF-8. When the argument is 'string', we use the encoding as-is. When the argument is 'wstring', we would call 'ToUtf8String' as part of 'WriteString' and we would call 'ToUtf16String' as part of 'ReadString'. We can retain common, templated implementations of 'ReadString' and 'WriteString' by adding a second template parameter, a traits class 'Tr', declared with a default 'Tr = Traits<T>'. The traits class 'Traits' should have only a trivial base definition and then a pair of specializations, one each for 'string' and 'wstring'. Each of the specializations has two conversion functions, one for read and one for write. The conversion functions for 'string' are trivial. The conversion functions for 'wstring' perform the UTF conversions. With all this, the implementations of 'ReadStrings' and 'WriteStrings' in the present review would not need to be modified, since they are calling 'operator<<' and 'operator>>', which would be doing the conversions internally. That would allow elimination of 'ToUtf16Strings' entirely. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/plugin/AdblockPlusClient.cpp:300: return ToUtf16Strings(selectors); Here and below, you're following a call to operator>> with a call to ToUtf16Strings. Note that this is on the client side. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/plugin/AdblockPlusClient.cpp:369: return ToUtf16Strings(domains); Second pair of calls: operator>>, ToUtf16Strings http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:87: buffer.read(reinterpret_cast<char*>(data.get()), sizeof(T::value_type) * length); This should be 'const_cast'. It should also be documented here that we're reading into the constant internal buffer of a string. As we've discussed before, this happens to work but is not guaranteed to work. A note should appear here acknowledging that. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:91: value.assign(data.get(), length); This is the place to use either plain assignment or a UTF conversion function. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:161: buffer.write(reinterpret_cast<const char*>(value.c_str()), sizeof(T::value_type) * length); Please take the opportunity to change this to 'const_cast' http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Utils.h:39: std::vector<std::wstring> ToUtf16Strings(const std::vector<std::string>& value); Where's the other half of the pair ToUtf8Strings?
I'm pretty sure it's beyond the scope of this issue, so could you move it into the issue tracker, please? I agree only with the fact that working with these buffers we should not manually call conversion functions. The implementation of any class should not depend on the side, whether it's client side or server side. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:87: buffer.read(reinterpret_cast<char*>(data.get()), sizeof(T::value_type) * length); On 2015/02/09 16:18:57, Eric wrote: > This should be 'const_cast'. > > It should also be documented here that we're reading into the constant internal > buffer of a string. As we've discussed before, this happens to work but is not > guaranteed to work. A note should appear here acknowledging that. Here we are not reading the data into the constant buffer of the string, though instead of `std::auto_ptr<T::value_type>` here should be `std::unique_ptr<T::value_type[]>`, so this particular line is correct. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:161: buffer.write(reinterpret_cast<const char*>(value.c_str()), sizeof(T::value_type) * length); On 2015/02/09 16:18:57, Eric wrote: > Please take the opportunity to change this to 'const_cast' We should not change this to `const_cast`, here, as well as above, `reinterpret_cast` is used because we are reading and writing bytes. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/shared/Utils.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Utils.h:39: std::vector<std::wstring> ToUtf16Strings(const std::vector<std::string>& value); On 2015/02/09 16:18:57, Eric wrote: > Where's the other half of the pair ToUtf8Strings? We don't need, it would be a dead code.
On 2015/02/09 16:55:37, sergei wrote: > I'm pretty sure it's beyond the scope of this issue, so could you move it into > the issue tracker, please? Given that we haven't agreed on what is to be done, that's a bit premature. > I agree only with the fact that working with these > buffers we should not manually call conversion functions. Good. Then at least ensure that the array-of-strings functions have the proper interface, the one that does type conversion behind the scenes. We can change the rest of the code later. I am less concerned that we fix everything now than with propagating a bad design in new code. If that means what looks like code duplication, that's fine with me in this case. Getting the interface right is more important than getting a perfectly clean internal implementation. It convert the problem from one that's present everywhere it's called to a problem that only exists in one place. That's worth a bit of duplicate-looking code for a while. > The implementation of > any class should not depend on the side, whether it's client side or server > side. Nice high-minded ideal, but I don't think it's applicable here. There was a design decision at the outset to use different string class between client and server. IMO, that was a mistake, but we've got to live with it now.
http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... File src/shared/Communication.h (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/src/... src/shared/Communication.h:87: buffer.read(reinterpret_cast<char*>(data.get()), sizeof(T::value_type) * length); On 2015/02/09 16:55:37, sergei wrote: > Here we are not reading the data into the constant buffer of the string, though > instead of `std::auto_ptr<T::value_type>` here should be > `std::unique_ptr<T::value_type[]>`, so this particular line is correct. Sorry. Too hasty about 'string'. Nevertheless, using 'reinterpret_cast' would be totally unnecessary if we were always transmitting UTF-8 across the pipe. We'd always get 'string' out of it, and then we'd either assign it or convert it as appropriate.
On 2015/02/09 17:33:52, Eric wrote: > On 2015/02/09 16:55:37, sergei wrote: > > I'm pretty sure it's beyond the scope of this issue, so could you move it into > > the issue tracker, please? > > Given that we haven't agreed on what is to be done, that's a bit premature. > I like Eric's idea of the proposed change in general, but I agree that it should be done in s separate change-set.
LGTM
For reference, the concerns I had are being addressed in this review: http://codereview.adblockplus.org/4882650246414336/ I have one reservation about implementation of a unit test, but we can address it in the next review. Neither of these caveats need to block this. LGTM. http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/test... File test/CommunicationTest.cpp (right): http://codereview.adblockplus.org/4806567450902528/diff/5700735861784576/test... test/CommunicationTest.cpp:120: Communication::InputBuffer inputBuffer(outputBuffer.Get()); I don't like the use of Get() here, but I'll save comments for the subsequent change set. |