|
|
DescriptionIssue #1234 - std::wstring version of UnescapeUrl
Added a std::wstring version of UnescapeUrl as a standalone function. The
existing version remains in place; refactoring all calls to this function
simultaneously causes too many problems.
Patch Set 1 : #
Total comments: 7
Patch Set 2 : #
Total comments: 17
Patch Set 3 : #Patch Set 4 : #Patch Set 5 : #MessagesTotal messages: 15
http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:85: HRESULT hr = UrlUnescapeW(const_cast<wchar_t *>(url.c_str()), result.get(), & result_length, 0); Nit: no space after & http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:97: { I personally am very cautious with empty catch-all exceptions. Maybe at least let's log something here?
http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:81: std::unique_ptr<wchar_t> result(new wchar_t[result_length]); // can throw bad_alloc It's better to use `std::unique_ptr<wchar_t[]>`. http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:97: { +1 here. Each time such all exception eating construction looks very dangerous. I've seen Eric's comments above in the code. Let's create a ticket to make a plan what we need and what we understand under the maintenance of this code. If there is no big plans of evolving of the code, then it could be OK.
http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:81: std::unique_ptr<wchar_t> result(new wchar_t[result_length]); // can throw bad_alloc On 2014/08/05 12:53:04, sergei wrote: > It's better to use `std::unique_ptr<wchar_t[]>`. Yep. Missed that one. http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:97: { Look, I agree with both of you. I do not like blanket suppression of exceptions. In the present case, however, it is acceptable as an interim measure. At the worst, it has the same behavior as the previous version of this code, which is to leave the URL unchanged. In the general case, if you catch bad_alloc, it's generally time to gracefully shut down the application or something related. We don't have such a mechanism other than std::terminate, which would shut down IE as a whole. I added a ticket such as Sergei wants yesterday. #1173 Ensure that C++ exceptions do not propagate out of entry points https://issues.adblockplus.org/ticket/1173 I did a bit of code audit to identify most (probably not all) of the places we need it. This isn't the entirety of what we need for this task, admittedly. Some kind of unified logging would also be appropriate. Nevertheless, if it were safe to throw exceptions in the current code base, I would have left out the exception handler entirely; bad_alloc is unusual enough that it could be handled at the top level.
All fixed. http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5668600916475904/src/... src/plugin/PluginClientBase.cpp:85: HRESULT hr = UrlUnescapeW(const_cast<wchar_t *>(url.c_str()), result.get(), & result_length, 0); On 2014/08/05 12:20:58, Oleksandr wrote: > Nit: no space after & Done.
LGTM
http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:72: * When the code can tolerate exceptions better, I think this comment should go into the catch block below, something like: "Ignoring exceptions until the calling code can tolerate them better." http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:79: * The buffer length is greater than 2 Kb, so we keep it off the stack and allocate it. I don't think this is worth a comment, kind of a no-brainer. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape This is pretty obvious from the documentation of UrlUnescape and not worth a comment IMO. We're doing this in various places when interacting with WinAPI without specifically pointing out why the cast is safe. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:85: HRESULT hr = UrlUnescapeW(const_cast<wchar_t *>(url.c_str()), result.get(), &result_length, 0); No space between wchar_t and *. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:92: * This matches the behavior of the previous version of this function. How about changing "the previous version" to "the wrapped version"?
http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:72: * When the code can tolerate exceptions better, On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > I think this comment should go into the catch block below, something like: > "Ignoring exceptions until the calling code can tolerate them better." OK. Since I've written this code, I've come to think that this try-catch statement has a longer lifetime in the code base than I first anticipated. The reason is that the first thing we'll have is a bare minimum of exception safety, and that won't include doing something useful in the case of low memory. Eventually we'll want to rethrow bad_alloc, but (perhaps) not rethrow other exceptions (whatever they might be; I don't see any others but that doesn't mean they aren't there). I'll add an explicit catch statement for bad_alloc and put the comment there. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:79: * The buffer length is greater than 2 Kb, so we keep it off the stack and allocate it. On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > I don't think this is worth a comment, kind of a no-brainer. Done. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > This is pretty obvious from the documentation of UrlUnescape and not worth a > comment IMO. I'm going to leave this comment in, since the call looks on the surface like a defect. The real problem is that UrlUnescape is badly overloaded, mixing const and non-const arguments. Not requiring a reader of this code to look up the documentation for UrlUnescape is worth a comment line. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:85: HRESULT hr = UrlUnescapeW(const_cast<wchar_t *>(url.c_str()), result.get(), &result_length, 0); On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > No space between wchar_t and *. Done. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:92: * This matches the behavior of the previous version of this function. On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > How about changing "the previous version" to "the wrapped version"? I'm moving this comment to the function header, since it's really about the specification, not the implementation.
http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape On 2014/09/29 11:50:07, Eric wrote: > On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > > This is pretty obvious from the documentation of UrlUnescape and not worth a > > comment IMO. > > I'm going to leave this comment in, since the call looks on the surface like a > defect. The real problem is that UrlUnescape is badly overloaded, mixing const > and non-const arguments. Not requiring a reader of this code to look up the > documentation for UrlUnescape is worth a comment line. We have the exact same problem in several cases - that's the nature of working with WinAPI. Are you suggesting we should point out that the function we call won't modify the string we pass in for every const_cast we do for this purpose? The fact that we do a const_cast already says that it's safe - otherwise we wouldn't do it. If someone reading the code feels this could be wrong, they can look up the docs. But I guess we won't agree. Sergei, your verdict? http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:92: * This matches the behavior of the previous version of this function. On 2014/09/29 11:50:07, Eric wrote: > On 2014/08/15 15:32:29, Felix H. Dahlke wrote: > > How about changing "the previous version" to "the wrapped version"? > > I'm moving this comment to the function header, since it's really about the > specification, not the implementation. Yeah, good point. I think the comment remaining here is redundant now - it's quite obvious that we leave the string alone when hr != S_OK.
http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape On 2014/09/30 09:10:28, Felix H. Dahlke wrote: > We have the exact same problem in several cases - that's the nature of working > with WinAPI. UrlUnescape is worse than the usual Windows API, because some modes do alter the string and some do not. While typical of Microsoft, that's just an awful interface; there's no way to call it cleanly in all cases. Other API calls, at least, always treat the argument as constant even if they don't declare it. > The fact that we do a const_cast already says that it's safe - otherwise we > wouldn't do it. A good design never needs const_cast, but the world is not made up of good designs, so it's a necessary evil. The unfortunate situation, though, is that most of the uses of const_cast I see in other people's code are a result of laziness. Therefore personally, I consider every use of const_cast guilty before proven innocent. It always looks jarring to me; indeed, I added the comment after I first wrote the call to UrlUnescape because it kept looking wrong to me. I appreciate the point that const_cast won't enter the code base without review, but review is not infallible. > But I guess we won't agree. Sergei, your verdict? Oh, Felix, it takes at least two round-trips to know that this is true. I would rather commit the code without this comment than to let it linger any more than it already has, so if I haven't persuaded you with the two points above, I'm removing it. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:92: * This matches the behavior of the previous version of this function. On 2014/09/30 09:10:28, Felix H. Dahlke wrote: > I think the comment remaining here is redundant now - it's > quite obvious that we leave the string alone when hr != S_OK. I'll fix the comment again. What's not obvious is that when we do nothing we're masking non-trivial error returns from UrlUnescape. It's a situation that deserves to be logged, however rare a case it might be.
http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape On 2014/09/30 15:22:21, Eric wrote: > UrlUnescape is worse than the usual Windows API, because some modes do alter the > string and some do not. While typical of Microsoft, that's just an awful > interface; there's no way to call it cleanly in all cases. > > Other API calls, at least, always treat the argument as constant even if they > don't declare it. Yeah, that's true. Still - at the end of the day the problem is that WinAPI can't be arsed about const at all, and we have to const_cast around that. > The unfortunate situation, though, is that > most of the uses of const_cast I see in other people's code are a result of > laziness. Therefore personally, I consider every use of const_cast guilty before > proven innocent. It always looks jarring to me; indeed, I added the comment > after I first wrote the call to UrlUnescape because it kept looking wrong to me. > > I appreciate the point that const_cast won't enter the code base without review, > but review is not infallible. While the last part is true, we are trying very hard to keep const_cast (and any kind of casting really) to an absolute minimum. We're not there yet when it comes to the legacy code in ABP for IE, but the general assumption should be that at least two people thought about each cast, trying to avoid it. So I don't think we need to explicitly legitimise casts with comments - people can still look up the docs to see why it had to be that way. > > But I guess we won't agree. Sergei, your verdict? > > Oh, Felix, it takes at least two round-trips to know that this is true. > > I would rather commit the code without this comment than to let it linger any > more than it already has, so if I haven't persuaded you with the two points > above, I'm removing it. That's up to you of course :) I still think the comment is pointing out the obvious and thus hurts more than it helps. http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:92: * This matches the behavior of the previous version of this function. On 2014/09/30 15:22:21, Eric wrote: > On 2014/09/30 09:10:28, Felix H. Dahlke wrote: > > I think the comment remaining here is redundant now - it's > > quite obvious that we leave the string alone when hr != S_OK. > > I'll fix the comment again. > > What's not obvious is that when we do nothing we're masking non-trivial error > returns from UrlUnescape. It's a situation that deserves to be logged, however > rare a case it might be. Fair enough.
On 2014/09/30 16:08:16, Felix H. Dahlke wrote: > That's up to you of course :) I still think the comment is pointing out the > obvious and thus hurts more than it helps. It's gone in patch set 5. I thought I had removed it in patch set 4, but I hadn't. This should be everything at this point.
On 2014/10/01 18:55:14, Eric wrote: > On 2014/09/30 16:08:16, Felix H. Dahlke wrote: > > That's up to you of course :) I still think the comment is pointing out the > > obvious and thus hurts more than it helps. > > It's gone in patch set 5. I thought I had removed it in patch set 4, but I > hadn't. > > This should be everything at this point. LGTM.
LGTM
LGTM http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... File src/plugin/PluginClientBase.cpp (right): http://codereview.adblockplus.org/6012307226230784/diff/5717271485874176/src/... src/plugin/PluginClientBase.cpp:83: * Casting away const here is harmless because we're not using the in-place modification mode of UrlUnescape I would say that this comment could be here or may be even a little bit changed like // Casting away const here is required by the method signature // for the case of in place modification, despite it's not our case // and url will not be modified. HRESULT hr = UrlUnescapeW( /* src */ const_cast<wchar_t*>(url.c_str()), /* dst */ result.get(), /* dst size without terminating '\0' */ &result_length, /* flags, 0 - don't modify src in place */ 0); But the current version is also OK. Firstly, const_cast indeed is attracting the attention, each time I see const_cast I start to suspect something bad is going here. Also I don't remember the entire Windows API and for this particular function each time I have to read its documentation, especially here are attracting const_cast and magic number 0. Since anyway I personally have to read the documentation the comment seems to be redundant. It would be great if I don't need to open the reference of the function or at least it can simplify the reading if it explains the reason of the decision, so the everything I need is mere to check whether it's truth or not. BTW, in most cases I would like to avoid comments for arguments by proper naming. |