|
|
Created:
June 20, 2014, 9:22 p.m. by Eric Modified:
Dec. 30, 2014, 1:47 p.m. Reviewers:
Visibility:
Public. |
DescriptionEliminates CString, CComBSTR, and CComVariant.
This review consists a few kinds of things:
1) Conversion of Microsoft-specific types to standard ones.
2) Replacement of member functions on removed types with utility functions, and their unit tests.
3) Support functions for dealing with COM arguments and return values.
4) Wrappers for COM objects and system calls. These are used to isolate the necessarily-remaining Microsoft-specific types (notable BSTR and VARIANT) into the wrapper classes.
Most of the changes in (1) come in matched pairs, particularly in the member declarations. Some of the string algorithms needed more extensive rewrites.
The doxygen files have been sitting in my local copy for a while, uncommitted. Type 'doxygen' (no arguments needed) in the root directory of the project to generate. The configuration is set to build call graphs, so GraphViz needs to be installed for that to work.
Patch Set 1 #
Total comments: 47
Patch Set 2 : Defect corrections #
MessagesTotal messages: 16
It's big, and it's finally ready.
Replacing Wladimir with Felix for this review.
http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:32: std::unique_ptr< wchar_t[] > buffer ; Nit: Here and in other places no space after < and before >. Or after ( and before ). Or [ and ]. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:88: createProcRes = CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, We've had a similar issue (a need to get a writable wstring's buffer) in libadblockplus\src\Utils.cpp, and the solution there was to use something like std::wstring params; CreateProcessW(..., ¶ms[0],...) I agree that the temporary_wchar_buffer can be considered safer then the approach above, since in theory some compilers can have strings in multiple memory blocks. But in C++11 there is a requirement that std::basic_string's buffer was allocated contiguously in memory, and in most C++03 compilers it is de facto the case. I would argue to use the same approach here for code simplicity. We don't need temporary_wchar_buffer then. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:251: bool CAdblockPlusClient::ShouldBlock( std::wstring src, int contentType, const std::wstring & domain, bool addDebug) Nit: no space after const std::wstring http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/COM_Client.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:1: I guess COM_Client.h is not needed? http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:24: class Returned_BSTR This multiple classes approach is a bit confusing, IMO. Why not have one class for all cases? I don't think an extra copy operation makes a difference. For example something like below (NOTE: not tested) class BSTR_AdblockPlus { BSTR m_str; BSTR_AdblockPlus(wchar_t* b) { m_str = ::SysAllocStringLen(s.c_str(), s.length()); } BSTR_AdblockPlus(BSTR b) { m_str = ::SysAllocStringByteLen((char*)b, ::SysStringByteLen(b)); ::SysFreeString(b); } ~BSTR_AdblockPlus() { ::SysFreeString(m_str); } ... } http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginMutex.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginMutex.cpp:22: m_hMutex = ::CreateMutexW( NULL, FALSE, mutex_name.c_str() ); We should try to open a mutex in Local namespace if Global failed. It should not be the same mutex_name here. Should be "Local\\Adblockplus" + name. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:39: { Since these functions are generic enough, I think it would be appropriate to move them to Utils.cpp instead. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:60: { We already gave AdblockPlus::Utils::TrimString http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.h:11: namespace ABP { I'd rather have the namespace called AdblockPlus.IE. We use AdblockPlus namespace in libadblockplus. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.cpp:12: std::wstring & Unescape_URL( std::wstring & url ) This belongs to Utils.cpp, I think http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:31: class Browser I suppose the general idea is to make these classes to be smart pointers for COM objects later, correct? Still, I don't quite understand the need for CComQIPtr here, when we are using just plain interface pointers in other similar classes. Can's it be just IWebBrowser* _browser here?
A number of the suggestions here I'd prefer to do later. We can create an issue to capture them if that seems useful. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:88: createProcRes = CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, This issue is not contiguous memory. Rather, it's that CreateProcessW is specifically documented to need writable memory. *> The Unicode version of this function, CreateProcessW, can modify *> the contents of this string. Therefore, this parameter cannot be *> a pointer to read-only memory (such as a const variable or a literal *> string). If this parameter is a constant string, the function may *> cause an access violation. Using ¶ms[0] or params.c_str() both give access to the internal representation of the string, which is not a writable buffer. It's possible there's not actual problem here, but without knowing implementation details of basic_string, it's a wish and a prayer to do otherwise. And even if there's no problem now, the implementation could change in a future version. I don't like the separate copy; it's inefficient. In this specific case, though, this function gets called infrequently, so there's barely a performance hit. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/COM_Client.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:1: On 2014/06/26 00:48:43, Oleksandr wrote: > I guess COM_Client.h is not needed? I take it you mean COM_Client.cpp. Right now, all that I've got so far is these parameter-marshalling classes, which are simple enough to make inline. The implementations for the other COM support classes won't be so simple, and I anticipate I'll be using the file for that. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:24: class Returned_BSTR On 2014/06/26 00:48:43, Oleksandr wrote: > This multiple classes approach is a bit confusing, IMO. > Why not have one class for all cases? The biggest confusion to my eye is the naming. I don't particularly like these names, but I didn't know what was needed when I started. I'm planning to rationalize them at the end of this process. It looks like we have four cases, each with slightly different lifecycle requirements: 1) As caller: outgoing arguments to external functions 2) As caller: incoming return values from external functions 3) As callee: incoming arguments to our functions 4) As callee: outgoing return values from our functions I'll move to a more consistent naming convention when I rationalize all the wrapper functions. That'll be at the end of the ATL removal. > I don't think an extra copy operation makes a difference. Extraneous memory copying is one of the biggest performance penalties in software generally. It's something to pay attention to here, since there are lots of IE and DOM calls that use BSTR. We're already taking a performance hit by using std::wstring and copying to BSTR, but without a basic_string implementations that's known also to be a BSTR, that's what we're stuck with. For the record, I think the maintenance advantages of using standard strings outweigh the performance disadvantages. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginMutex.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginMutex.cpp:22: m_hMutex = ::CreateMutexW( NULL, FALSE, mutex_name.c_str() ); On 2014/06/26 00:48:43, Oleksandr wrote: > Should be "Local\\Adblockplus" + name. This a defect I introduced. I don't think I noticed that the string constants were different. Fixed in next patch set. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:39: { On 2014/06/26 00:48:43, Oleksandr wrote: > Since these functions are generic enough, I think it would be appropriate to > move them to Utils.cpp instead. I'm OK with that. I'd prefer to do it in a separate step; dealing with this large patch set is already problematic. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:60: { On 2014/06/26 00:48:43, Oleksandr wrote: > We already gave AdblockPlus::Utils::TrimString I noticed that after I posted the review. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.h:11: namespace ABP { On 2014/06/26 00:48:43, Oleksandr wrote: > I'd rather have the namespace called AdblockPlus.IE. We use AdblockPlus > namespace in libadblockplus. I don't have a firm opinion on these namespace names. I would, however, rather do all the renaming and code movement in a later change. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.cpp:12: std::wstring & Unescape_URL( std::wstring & url ) On 2014/06/26 00:48:43, Oleksandr wrote: > This belongs to Utils.cpp, I think Since I started on this file, I've come to think that everything in this file belongs elsewhere. I started it just to isolate all the argument wrangling for wrapped functions generally, but it has become apparent that this isn't the most relevant distinction. There are really three kinds of things here. Each belongs in its own kind of source file. 1) utility functions provided by the system 2) browser-specific functions 3) DOM functions http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:31: class Browser On 2014/06/26 00:48:43, Oleksandr wrote: > I suppose the general idea is to make these classes to be smart pointers for COM > objects later, correct? Yes, basically. > Can's it be just IWebBrowser* _browser here? Yes. This was the first one I wrote, and I just copied the declaration that it came from. I never changed it to match the others. Changed in next patch set.
http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:88: createProcRes = CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, On 2014/06/26 15:13:56, Eric wrote: > Using ¶ms[0] or params.c_str() both give access to the internal > representation of the string, which is not a writable buffer. c_str() returns a const char, but we using ¶ms[0] as writable memory seems to be well-defined in C++11. I vote for that. (Note that I was _against_ doing this with C++03, but we still do it. Nobody's aware of any compiler that does not store strings contiguously.)
Doesn't look like code formatting changes have been addressed here. On 2014/06/30 13:35:39, Felix H. Dahlke wrote: > http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... > File src/plugin/AdblockPlusClient.cpp (right): > > http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... > src/plugin/AdblockPlusClient.cpp:88: createProcRes = > CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, > On 2014/06/26 15:13:56, Eric wrote: > > Using ¶ms[0] or params.c_str() both give access to the internal > > representation of the string, which is not a writable buffer. > > c_str() returns a const char, but we using ¶ms[0] as writable memory seems > to be well-defined in C++11. I vote for that. (Note that I was _against_ doing > this with C++03, but we still do it. Nobody's aware of any compiler that does > not store strings contiguously.)
On 2014/06/20 21:52:37, Eric wrote: > It's big, and it's finally ready. Yes, it's indeed big and not only the amount of changes. There is a lot of things are mentioned and not fixed yet, and a lot are not mentioned (beside, for me for some parts tests are required). As well it contains not only the changes related to CString, CComBSTR, and CComVariant but also some wrappers and helpers. I think that it would be better to split it up into several changesets and review and merge them separately. - it will increase the quality of the review and significantly simplify it - it will speed up it a lot - it will be possible to parallelize the process, while one is fixing one part somebody else can fix or safely review another part. Your thoughts about it? If there is no one on it right now, I can take it.
I've not tested it yet, just several comments regarding the style and the most bright mistakes. It's already causes conflicts while applying the patch, so to simplify the merging, and perform a good reviewing I want to say again that splitting will help a lot. As it mentioned in the previous discussion, it can be firstly some utility functions (with tests and string functions should be tested not only with ASCII characters, but as well as with other characters with different locales, Turkish famous case with I<>i<>İ, Chinese when one wchar_t is not enough for the character). In the next changeset we can get rid from ATL string classes, again with tests. And then we can introduce the wrappers. We will see the situation with tests there, may be we can use some helpers like {{{ return getString(&WrappedType::get_display, m_element, /*by ref*/retValue); // -> bool }}} http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:23: class temporary_wchar_buffer { Despite below it's mentioned that we can proceed even without additional copying, this class can be replaced by `std::vector<wchar_t>`, there is non-const `data()` method which returns a pointer to the underlying data, which should be allocated in the continues memory. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:38: temporary_wchar_buffer( std::wstring s ) Just in case, it's safer to have such constructors to be `explicit`, to avoid any confusing behaviour. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:88: createProcRes = CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, I also vote for `¶ms[0]`. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:111: createProcRes = CreateProcessAsUserW(newToken, engineExecutablePath.c_str(), (LPWSTR) params, First of all, the compile should understand that the conversion is required even without explicit cast, secondly it's better to use `static_cast` instead of C-style cast. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.h:52: bool ShouldBlock( std::wstring src, int contentType, const std::wstring & domain, bool addDebug=false); `src` also should be `const std::wstring&`. It happens in many places, just go through all changes and fix it. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/COM_Client.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:24: class Returned_BSTR I also would like to have only one class. As well in COM it's a common practice to have `operator&` which returns the valid pointer if the internal state is "empty". One thing which should be taken into the account designing such classes is to be exception friendly. Compare the cases: BSTR result; getResult(&result); some code // which throws an exception if (some condition) // condition is false, who will free result? { value = result ? Returned_BSTR( result ) : std::wstring(); } vs AnotherSmartBSTR result; // we can exit from the function at any point, the result will be freed. getResult(&result); some code // which throws an exception std::wstring value = result; Another thing here is the absence of copy-ctr, assignment operator and for C++11 a couple of members with the move semantics. If you don't want to fiddle around with them I would like to use `std::unique_ptr` with custom deleter. On 2014/06/26 15:13:56, Eric wrote: > On 2014/06/26 00:48:43, Oleksandr wrote: > > This multiple classes approach is a bit confusing, IMO. > > Why not have one class for all cases? > > The biggest confusion to my eye is the naming. I don't particularly like these > names, but I didn't know what was needed when I started. I'm planning to > rationalize them at the end of this process. It looks like we have four cases, > each with slightly different lifecycle requirements: > > 1) As caller: outgoing arguments to external functions > 2) As caller: incoming return values from external functions > 3) As callee: incoming arguments to our functions > 4) As callee: outgoing return values from our functions > > I'll move to a more consistent naming convention when I rationalize all the > wrapper functions. That'll be at the end of the ATL removal. > > > I don't think an extra copy operation makes a difference. > > Extraneous memory copying is one of the biggest performance penalties in > software generally. It's something to pay attention to here, since there are > lots of IE and DOM calls that use BSTR. We're already taking a performance hit > by using std::wstring and copying to BSTR, but without a basic_string > implementations that's known also to be a BSTR, that's what we're stuck with. > > For the record, I think the maintenance advantages of using standard strings > outweigh the performance disadvantages. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Config.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Config.h:107: #define DEBUG_ERROR(x) DEBUG_ERROR_FUNC(std::string("!!! Error:") + x); In defines it's always better to use braces, `(x)`. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginDebug.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginDebug.cpp:33: return std::wstring( s.begin(), s.end() ); Please, get acquainted with the concept of utf-8. If you need some help just let me know. BTW, using `std::codecvt` does not work with MS Visual Studio, use Windows API. Tests are appreciated. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginMutex.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginMutex.h:13: std::wstring mutex_name; in `m_someVar` "m" means "member", since we are already in the class which name contains "mutex" there is no need to call it `mutex_name`, just `m_name`. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:51: if ( std::tolower( a[j] ) != std::tolower( b[j] ) ) Strictly speaking it's not a correct implementation. I would prefer to use Windows API here for string comparison. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:7: #ifndef WRAPPER_H `WRAPPER_H` is too generic and it's very easy to get it clashed with another header. I personally prefer to use `#pragma once`, because it's much safer and easier, it is a very well supported by compilers and it's not so easy to find a compiler which does not support it. I don't insist on `#pragma once` but if the policy is to use ifdef/define/endif, then it should contain the the project name with the full path to the file or in addition UUID. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:56: bool Location_URL( std::wstring & value ) const; It should be without the underscore. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:62: class Internet_Bind_Info In the other parts of the project the naming convention is different, it looks like `class SomeClass;` members are named as `m_someMember`. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:76: class HTML_Element But here for me `HTML_Element` looks better than `HTMLElement` and even better than having `html` namespace.
I'm replying to few important bits this morning. This is not a complete response. On 2014/07/07 16:19:05, sergei wrote: > it would be better to split it up I did split it up. What you're not seeing in this review is some dead code removal and two other ATL types that cleanly separated from the rest of the code. I looked at doing a more incremental set of changes. The trouble is that when you start down that path that there's no clean separation between the various type usages. The intermediate code was positively riddled with various string conversions back and forth between CString and std::wstring. Splitting it up introduces code that exists in one changeset and then is removed in another. Overall it's more code to review to ensure that it all works. Same for the utility functions. They're all replacements for various CString built-ins. If you don't have utility functions, then you need to convert from wstring to CString, use the built-in, and then convert back. Same for the wrappers. There's less code to review with the wrappers because (for the most part) I wrote wrappers where I saw the same argument conversion needed more than once. Again, it's more code to ensure that all works correctly if you copy the code from the wrapper back to its point of use just to have the wrapper in a separate change set. It might be possible to trace out the data dependencies starting from various primitive sources of strings, but I really doubt that it's worth the effort.
On 2014/07/07 09:59:50, Oleksandr wrote: > Doesn't look like code formatting changes have been addressed here. No. I just fixed the things that were actual defects. Code formatting to happen later.
On 2014/07/08 11:58:34, sergei wrote: > It's already causes conflicts while applying the patch Yes, because a bunch of work has been proceeding in parallel rather than trying to get this thing finished and committed. Given how pervasive the changes are, just about every future change is likely to conflict with this one. For this reason I would like to defer just about every change we can manage that's not about defects proper in order to be able to not get bogged down. (I've made a concession to Oleksandr about whitespace, but that's a fairly trivial one to deal with.) I'd prefer that the best not be the enemy of the good. In this case in particular it's going to mean multiple changes over time. > string functions should be tested not only with ASCII That's true if we want to write a general-purpose Unicode library. It's not clear that's a proper direction to take. Insofar as I can recall, we're not using strings for any natural-language, user-generated inputs. Instead we have a fairly restricted set of texts: 1) HTML attribute text ("style" and "class" especially) 2) fixed file names used by the plugin (all ASCII, none user-generated) 3) The inner HTML of an <object> tag. 4) std::exception::what() There are probably some others I'm not thinking of. I don't recall a case, though, where there's a defect introduced by not handling multibyte UTF-16 Chinese correctly. If there _is_ such a defect, let's get that fixed in this change. > Another thing here is the absence of copy-ctr, assignment operator and for C++11 > a couple of members with the move semantics. Yes, these classes aren't complete yet. Part of the reason is that some of the special-purpose classes don't need assignment. Visual Studio 2012, our current build environment, does not have full C++11 support. In particular it doesn't recognize "= delete". Again, though, if there are defects, I am concerned about that. Other things I think we need to handle later. > As well in COM it's a common practice to have `operator&` which returns the > valid pointer if the internal state is "empty". Don't get me started on this one right now. I'm currently writing a replacement for CComPtr, which has just such an operator, and I'm convinced that the way it's implemented is responsible for various rare crash-to-desktop defects in published software. > One thing which should be taken into the account designing such classes is to be > exception friendly. well, yes, except that the current state of the code base is not exception-friendly. We don't have default-catch blocks, for example, at all the entry points, so at present a stray exception (like std::bad_alloc) will propagate out of the C++ environment, doing god-knows-what to the caller. It's my next project after ATL to get all those try-catch statements in place. There's a large amount of cruft in the code checking for various things that are better handled by exceptions.
After making some (what I thought were going to be) quick comments, I decided to address the latest round of comments this morning, rather than let them sit even a day or two longer. The merge pain will only get worse. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/AdblockPlusClient.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:23: class temporary_wchar_buffer { On 2014/07/08 11:58:34, sergei wrote: < this class can be replaced by `std::vector<wchar_t>` No, it can't be. The API call requires a null-terminated string, and std::wstring is not. (quote: "Accessing the value at data()+size() produces undefined behavior") This means we need a custom constructor and thus a separate class. Now we might look at deriving such a class from a standard container, admittedly. std::vector::data() is a C++11 extension, so if we're going to use a C++11 construct, it might as well be std::array. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:38: temporary_wchar_buffer( std::wstring s ) On 2014/07/08 11:58:34, sergei wrote: > Just in case, it's safer to have such constructors to be `explicit` I have no problem with that. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:88: createProcRes = CreateProcessW(engineExecutablePath.c_str(), (LPWSTR) params, On 2014/06/30 13:35:39, Felix H. Dahlke wrote: > c_str() returns a const char, but we using ¶ms[0] as writable memory seems > to be well-defined in C++11. I vote for that. (Note that I was _against_ doing > this with C++03, but we still do it. Nobody's aware of any compiler that does > not store strings contiguously.) We have two issues: 1) Contiguous memory. 2) Writable memory. We get contiguous memory by explicit guarantee in std::array and std::vector, and in practice with std::wstring. However, we don't get writable memory with std::wstring, since its data() method is 'const'. That's the whole reason for the buffer class above. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/AdblockPlusClient.cpp:111: createProcRes = CreateProcessAsUserW(newToken, engineExecutablePath.c_str(), (LPWSTR) params, On 2014/07/08 11:58:34, sergei wrote: > First of all, the compile should understand that the conversion is required even > without explicit cast, secondly it's better to use `static_cast` instead of > C-style cast. I have to admit that the C-style cast is copied from the Microsoft documentation page. Brain fart. As long as the type of 'params' admits a writable buffer (as this code is) and not a std::wstring, then '¶ms[0]' is fine, both here and above, since they're each a variant on the same kind of system call. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/COM_Client.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/COM_Client.h:24: class Returned_BSTR On 2014/07/08 11:58:34, sergei wrote: > As well in COM it's a common practice to have `operator&` which returns the > valid pointer if the internal state is "empty". Here's the place for the longer rant about address-of and COM. The way CComPtr is implemented, it returns a writable pointer-to-pointer in all cases, not just when the pointer is zero, but only in Release configurations, because in Debug there's an assertion that will catch it. In other words, it does not actually enforce its own invariant. Since, however, it catches the most obvious defective usages in the development cycle, all that's left are the rare and unusual ones. This is pretty much the worst case possible. That said, it's possible to write an efficient and correct class that has 'operator&'. I just didn't do it in this case. This class was written before I had much experience with what the right pattern is for this kind of type conversion interface for this class of system calls. I know how to do it better now. I'm putting it on my to-do list to rewrite this class when I work on the IE wrapper classes next. > One thing which should be taken into the account designing such classes is to be > exception friendly. Yes, the overall point is well-taken. I should point out, however, that the present code doesn't leak memory, since all the uses of this class have intervening code that can't throw. (I just verified this.) > Another thing here is the absence of copy-ctr, assignment operator and for C++11 > a couple of members with the move semantics. For this class, whose sole purpose is to receive-and-convert, all those operators should be declared "= delete", which VS 2012 doesn't support. (For the record: copy constructor, move constructor, copy assignment, and move assignment. None are used presently and none should be used.) http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Config.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Config.h:107: #define DEBUG_ERROR(x) DEBUG_ERROR_FUNC(std::string("!!! Error:") + x); On 2014/07/08 11:58:34, sergei wrote: > In defines it's always better to use braces, `(x)`. Yep. It's also used exactly once in the code base. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginDebug.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginDebug.cpp:33: return std::wstring( s.begin(), s.end() ); On 2014/07/08 11:58:34, sergei wrote: > Please, get acquainted with the concept of utf-8. I am fully acquainted with UTF-8. This isn't a full solution, but rather a quick hack to get this change set working. It's in the namespace "ABP::debug" for a reason, because it's not intended to deal with anything other than std::exception::what(). At the present, the only thing it receives are standard library messages and our own. All of these are ASCII, so this usage, as blunt an instrument as it is, does not introduce defects at present. It ought to be eliminated, I agree. It was expedient, however, to do the conversion out of CString entirely to std::wstring rather than to try to distinguish between debugging-only strings and more general ones. Indeed, there are lots of cases where the development logs write such general strings. Lacking an already-present Unicode library that works on Windows, it wasn't feasible to try to deal with this problem with any kind of comprehensiveness at this time. Personally, I think it would be a good idea to convert all the development logs to UTF-8, but that's not the purpose of the present change set, which is large enough as it is. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginMutex.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginMutex.h:13: std::wstring mutex_name; The "m_" prefix (and the "s_" also) are carryovers from PassthroughAPP, written my a Microsoft guy who couldn't break away from Hungarian-style notation. Neither Oleksandr nor I like Hungarian notation, and I've not introduced any new variable names with these prefixes. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginUtil.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginUtil.cpp:51: if ( std::tolower( a[j] ) != std::tolower( b[j] ) ) On 2014/07/08 11:58:34, sergei wrote: > Strictly speaking it's not a correct implementation. I would prefer to use > Windows API here for string comparison. That's right, it's not correct for every purpose. The only way it's used, however, is to compare file names in exactly one place. Amusingly, it's mostly irrelevant there, because IE 11 seems to converting a "file://" URL to a Windows path name, so the comparison is failing anyway and needs to be rewritten. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/Wrapper.h (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/Wrapper.h:7: #ifndef WRAPPER_H On 2014/07/08 11:58:34, sergei wrote: > `WRAPPER_H` is too generic and it's very easy to get it clashed with another > header. This is the convention we've been using in all recent code. If we want to change the convention, this review is not the right place to have that discussion. > I personally prefer to use `#pragma once`, because it's much safer and easier, > it is a very well supported by compilers and it's not so easy to find a compiler > which does not support it. FWIW. Only recently did GCC support it correctly. We don't have issues with files being copied before compilation (not in the current build environment), but #pragma once can fail in those cases.
I agree with Sergei that this review is getting way too extensive. In fact even rietveld doesn't allow me to comment here with previous comments included, since it's more then 10000 characters already. So that's a proof :) Eric, could you please split it into atoms? Like make separate reviews for util functions (ABP.util namespace and maybe another for other utils), separate for Wrapper, and for COM_Client, BSTR etc?
Can we close this? On 2014/07/09 12:03:55, Oleksandr wrote: > I agree with Sergei that this review is getting way too extensive. In fact even > rietveld doesn't allow me to comment here with previous comments included, since > it's more then 10000 characters already. So that's a proof :) > > > Eric, could you please split it into atoms? Like make separate reviews for util > functions (ABP.util namespace and maybe another for other utils), separate for > Wrapper, and for COM_Client, BSTR etc?
On 2014/08/05 15:09:13, Oleksandr wrote: > Can we close this? I'm going to leave it open for a while. There's a lot of reference code in here that remains relevant for what the code will look like eventually. I haven't started making URL references into it, but that's been because I haven't seen much need to.
http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginFilter.cpp (left): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginFilter.cpp:327: { The old way. Explicit type conversions, and you have to dig through a bunch of cruft to see the inputs and outputs. http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... File src/plugin/PluginFilter.cpp (right): http://codereview.adblockplus.org/5750789393874944/diff/5629499534213120/src/... src/plugin/PluginFilter.cpp:339: { The new way. A single line. |