Issue 5698769010032640: Issue #1466 - fix usage of BSTR

sergei

Oct. 8, 2014, 3:07 p.m. (2014-10-08 15:07:33 UTC) #1

Eric

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp File src/plugin/AdblockPlusDomTraverser.cpp (right): http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp#newcode52 src/plugin/AdblockPlusDomTraverser.cpp:52: if (SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt == VT_BSTR && ...

Oct. 9, 2014, 1:54 p.m. (2014-10-09 13:54:01 UTC) #2

sergei

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp File src/plugin/AdblockPlusDomTraverser.cpp (right): http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp#newcode52 src/plugin/AdblockPlusDomTraverser.cpp:52: if (SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt == VT_BSTR && ...

Oct. 16, 2014, 8:53 a.m. (2014-10-16 08:53:25 UTC) #3

Oleksandr

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp File src/plugin/AdblockPlusDomTraverser.cpp (right): http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp#newcode52 src/plugin/AdblockPlusDomTraverser.cpp:52: if (SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt == VT_BSTR && ...

Oct. 17, 2014, 7:38 a.m. (2014-10-17 07:38:02 UTC) #4

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
File src/plugin/AdblockPlusDomTraverser.cpp (right):

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
src/plugin/AdblockPlusDomTraverser.cpp:52: if
(SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt ==
VT_BSTR && ::SysStringLen(vAttr.bstrVal) > 0)
- I agree that using ATL's class is better then our custom class. We will need
to catch that exception either way, so it's not like it would result in cleaner
code.
- I already admitted that I would be fine if we did not handle that exception at
all for now (that is really a very extreme case when it would be thrown, IMHO,
and crashing wouldn't be the worst option in that case), but if we do, Sergei's
suggestion with catchExceptions makes sense to me.
 
On 2014/10/16 08:53:25, sergei wrote:
> - I would say that speaking about maintainability I am completely sure that
ATL
> is much better than our custom classes. There is a plenty of documentation and
> it's proved by huge usage. As well if I understood correctly we decided to
stick
> with ATL.
> So, since we use ATL and we definitely cannot control all cases when it can
> throw an exception I would say that it should not be considered as something
> avoidable. Instead of counting lines of code we should think how to implement
it
> in a good way. For example, we can make each entry method (see #1173) like
> HRESULT SomeClassImpl::SomeMethod(arg1, arg2, ...)
> {
>   return catchExceptions([]()->HRESULT
>   {
>     // our current impl
>   });
> }
> 
> I don't see a big problem in several lines of code to catch additional ATL
> exception inside `catchExceptions`. Even more, ATL exception allows to return
> the real HRESULT instead of manual mapping each exception to HRESULT.
> - I am also convinced that we should not pollute the code by try-catch
> statements, it would be better to have it centralized.
> - As well as I don't know how to compare what is worse right now, wrong usage
of
> wchar_t* strings instead of BSTR or propagating of exceptions in case of there
> is no enough free memory.
> 
> Conclusion: use ATL, implement exception handling as #1173.
> @Oleksandr, @Felix.

Eric

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp File src/plugin/AdblockPlusDomTraverser.cpp (right): http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp#newcode52 src/plugin/AdblockPlusDomTraverser.cpp:52: if (SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt == VT_BSTR && ...

Oct. 20, 2014, 4:06 a.m. (2014-10-20 04:06:24 UTC) #5

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
File src/plugin/AdblockPlusDomTraverser.cpp (right):

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
src/plugin/AdblockPlusDomTraverser.cpp:52: if
(SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt ==
VT_BSTR && ::SysStringLen(vAttr.bstrVal) > 0)
On 2014/10/17 07:38:02, Oleksandr wrote:
> - I agree that using ATL's class is better then our custom class. We will need
> to catch that exception either way, so it's not like it would result in
cleaner
> code.

The main way that the code becomes cleaner by using our own class is that we can
define a conversion function to std::wstring.

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
src/plugin/AdblockPlusDomTraverser.cpp:52: if
(SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt ==
VT_BSTR && ::SysStringLen(vAttr.bstrVal) > 0)
On 2014/10/16 08:53:25, sergei wrote:
> As well if I understood correctly we decided to stick
> with ATL.

What we decided was that there was little rush to continue the process of
removing ATL. That is not at all the same thing as sticking with it. What we
really do not need is to use _more_ ATL.

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
src/plugin/AdblockPlusDomTraverser.cpp:52: if
(SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt ==
VT_BSTR && ::SysStringLen(vAttr.bstrVal) > 0)
On 2014/10/16 08:53:25, sergei wrote:
> - I would say that speaking about maintainability I am completely sure that
ATL
> is much better than our custom classes.

From what I can tell, I don't think you spent any significant time even reading
the code I posted for them, neither the first iteration nor the second. I
certainly have seen little indication that you know or appreciate the range of
issues I've been concerned with in these classes.

> So, since we use ATL and we definitely cannot control all cases when it can
> throw an exception I would say that it should not be considered as something
> avoidable.

I've studied the issue, and we can control the cases where ATL might throw. 

In the following, I'm excluding PassthroughAPP. Even though it has numerous
occurrences of ATLEXCEPTION and ATLTRACE, its code calls ours, not vice-versa,
so it's not our primary concern.

For the vast majority of our own code, the only ATL we're using is the two
classes CComBSTR and CComVariant.  I've studied the source code to CComBSTR and
also audited the context of how we use it. I am reasonably certain that the only
exception that CComBSTR can throw from within our code is for a failure to
allocate memory. Similarly, the only ones that CComVariant could throw are
out-of-memory and some outlier cases where an IE defect would return an
incorrect type. Most all of the uses of CComPtr don't throw (unless for some
bizarre reason AddRef throws). There are a few simple uses of CComQIPtr; these
would only throw in practice if IE returned an object of invalid type.

Other than these value-classes, the only other cases I know of that can throw
ATL exceptions are the places we're using default ATL implementations of certain
interfaces. These are 'IObjectWithSiteImpl' and 'IDispatchImpl'. There is only
one possible occurrence where we'd ever see an ATL exception, and it's in a call
to 'IObjectWithSiteImpl<CPluginClass>::SetSite'.

This is not an insurmountable problem.

> Instead of counting lines of code we should think how to implement it
> in a good way. For example, we can make each entry method (see #1173) like
> HRESULT SomeClassImpl::SomeMethod(arg1, arg2, ...)
> {
>   return catchExceptions([]()->HRESULT
>   {
>     // our current impl
>   });
> }

This implementation pattern requires that all exceptions are handled identically
in all cases. There is an assumption here that all exceptions _should_ be
handled identically. I am very certain I am not ready to make that assumption at
the current time. I suspect we may only need a handful of exception patterns,
but that number is in my present opinion very likely to be greater than one.

I'm not opposed to considering the use of such a pattern once we understand
better what the commonalities are. We don't understand them yet.

> Even more, ATL exception allows to return
> the real HRESULT instead of manual mapping each exception to HRESULT.

This claim has false assumptions. For exceptions thrown by the standard library,
there is no such thing as a "real HRESULT". For exceptions thrown by systems
calls, we can define a subclass of 'std::runtime_exception' that not only
captures the HRESULT but also always calls 'GetLastError' for good measure.

> - I am also convinced that we should not pollute the code by try-catch
> statements, it would be better to have it centralized.

Centralization is acceptable if you know you need common behavior. We do not yet
know that.

I have to say that I have a distaste for the idea that try-catch statements are
somehow a pollution. It's possible to overuse them, to be sure. There are plenty
of places in the code where error handling is non-existent or rather crude.
Whether such error handling gets expressed in a try-catch block or by explicit
checking of return values is less important than that the code handle the error
correctly. As an example, the current code does a poor job dealing with mutex
failures.

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/...
src/plugin/AdblockPlusDomTraverser.cpp:52: if
(SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt ==
VT_BSTR && ::SysStringLen(vAttr.bstrVal) > 0)
On 2014/10/16 08:53:25, sergei wrote:
> - As well as I don't know how to compare what is worse right now, wrong usage
of
> wchar_t* strings instead of BSTR or propagating of exceptions in case of there
> is no enough free memory.

It is possible to define a class for the kind of effectively-constant BSTR
arguments we're using here. There's no such thing as an actually-constant BSTR,
since they're all required to be system-allocated. Among other things, it means
you can't take advantage of 'constexpr' to define the BSTR memory layout at
compile time. (It probably even works today, but there's no guarantee it will
work tomorrow.) Nevertheless, there are ways of dealing with it.

One way is to use a global static variable. This moves the exception from
run-time to load-time. This is an improvement, since there's more likely to be
free memory when IE starts up than after there are two dozen tabs open. The
problem with this is that an exception during static initialization doesn't have
a handler.

Better is to use local variable for a singleton class that allocates on first
use but not thereafter. It means checking a flag each time the value is
referenced, but that's faster than allocating and copying the constant string
each time. You still a low-memory exception to handle.

At the cost of almost-certainly-too-much code overhead, you could do all your
allocations on the first call to DllMain. That gets rid of the exception, but I
doubt it's worth it.

The upshot is that I don't see a practical way to eliminate dealing with the
memory exception. This is, however, not really a problem that's unique here.
We've got a number of dynamic memory structures that could throw the same
exception as a result of processing pages. We're going to have to deal with it
anyway. My main concern with respect to this particular change set is that we
not put ourselves in a position of need to deal with multiple kinds of memory
allocation exceptions.

Oleksandr

http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp File src/plugin/AdblockPlusDomTraverser.cpp (right): http://codereview.adblockplus.org/5698769010032640/diff/5629499534213120/src/plugin/AdblockPlusDomTraverser.cpp#newcode52 src/plugin/AdblockPlusDomTraverser.cpp:52: if (SUCCEEDED(pEl->getAttribute(ATL::CComBSTR(L"src"), 0, &vAttr)) && vAttr.vt == VT_BSTR && ...

Oct. 21, 2014, 11:53 p.m. (2014-10-21 23:53:19 UTC) #6

Eric

LGTM. I really don't like to have to allocate BSTR each time for what could ...

Feb. 2, 2015, 3:17 p.m. (2015-02-02 15:17:55 UTC) #9

sergei

On 2015/02/02 15:17:55, Eric wrote: > LGTM. > > I really don't like to have ...

Feb. 2, 2015, 3:22 p.m. (2015-02-02 15:22:03 UTC) #10

Eric

On 2015/02/02 15:22:03, sergei wrote: > Yes, I also don't like it but it's better ...

Feb. 2, 2015, 4:09 p.m. (2015-02-02 16:09:41 UTC) #11

sergei

On 2015/02/02 16:09:41, Eric wrote: > On 2015/02/02 15:22:03, sergei wrote: > > Yes, I ...

Feb. 4, 2015, 11:52 a.m. (2015-02-04 11:52:57 UTC) #12

Eric

Feb. 4, 2015, 12:36 p.m. (2015-02-04 12:36:08 UTC) #13

On 2015/02/04 11:52:57, sergei wrote:
> You propose to prepare at the compile time the
> memory layout for such BSTR strings, thus initialize the structure { const
> uint32_t length; const wchar_t value[];}; at the compile time, don't you? 

Yes, that's exactly the idea.

It should also be possible to do something similar at static initialization
time, to set up the correct memory layout just once, and to do it in a way that
always succeeds and never throws. If that were working, then adding the
declaration 'constexpr' would simply cause it to be initialized at compile time.
The key would be to use a template function that grabs the length of a
wide-character array, and then to use that length as a template argument to
declare the memory-layout structure.

> That looks actually very attractive, feel free to create the ticket for it.

Repeating the allocations with each call is hardly the worst of our
inefficiencies.

And while it looks attractive, there remains one problem with it. There's a
stated requirement (although not stated very explicitly or even frequently) that
all BSTR that goes through COM be allocated with SysAllocString(). Perhaps this
only really means "we require the BSTR memory layout", but it might also mean
"COM needs to interact with the allocator in mysterious ways". It's possible
this requirement arose at the time "for future development" and that future
development never happened. If that's the case, then compile-time initialization
should be fine, because COM is basically finished and done at this point.

The fully correct way of getting the effect of static initialization and also
uses SysAllocString() is to write an explicit initialization routine that runs
when the DLL is loaded. It would need a matching routine that runs when the DLL
is unloaded to deallocate.

Issue 5698769010032640: Issue #1466 - fix usage of BSTR (Closed)

Description

Patch Set 1 #

Patch Set 2 : rebased #

Messages