BlueMonkMN February 2016

Application hanging during CoCreateInstance of .NET-based COM object

I have a C++ DLL that is creating an instance of a COM object that's implemented in .NET. Under many circumstances this works fine, but under certain circumstances, it hangs the application, and I see it stuck with the following call stack (this is just the part underneath the level of my DLL's code):

ntdll.dll!_NtAlpcSendWaitReceivePort@32()
rpcrt4.dll!LRPC_CASSOCIATION::AlpcSendWaitReceivePort(unsigned long,struct _PORT_MESSAGE *,struct _ALPC_MESSAGE_ATTRIBUTES *,struct _PORT_MESSAGE *,unsigned long *,struct _ALPC_MESSAGE_ATTRIBUTES *,union _LARGE_INTEGER *)
rpcrt4.dll!LRPC_BASE_CCALL::DoSendReceive(void)
rpcrt4.dll!LRPC_BASE_CCALL::SendReceive(struct _RPC_MESSAGE *)
rpcrt4.dll!_I_RpcSendReceive@4()
rpcrt4.dll!_NdrSendReceive@8()
rpcrt4.dll!@NdrpSendReceive@4()
rpcrt4.dll!_NdrClientCall2()
combase.dll!ServerAllocateOXIDAndOIDs(void * hServer, void * phProcess, unsigned __int64 * poxidServer, unsigned long cOids, unsigned __int64 * aOid, unsigned long * pcOidsAllocated, const tagOXID_INFO * poxidInfo, tagDUALSTRINGARRAY * pdsaStringBindings, tagDUALSTRINGARRAY * pdsaSecurityBindings, unsigned __int64 * pdwOrBindingsID, tagDUALSTRINGARRAY * * ppdsaOrBindings) Line 246
combase.dll!CRpcResolver::ServerRegisterOXID(const tagOXID_INFO & oxidInfo, unsigned __int64 * poxid, unsigned long * pcOidsToAllocate, unsigned __int64 * arNewOidList) Line 1020
combase.dll!OXIDEntry::RegisterOXIDAndOIDs(unsigned long * pcOids, unsigned __int64 * pOids) Line 1631
combase.dll!OXIDEntry::AllocOIDs(unsigned long * pcOidsAlloc, unsigned __int64 * pOidsAlloc, unsigned long cOidsReturn, unsigned __int64 * pOidsReturn)
combase.dll!CComApartment::CallTheResolver() Line 856
combase.dll!CComApartment::InitRemoting() Line 1166
combase.dll!CComApartment::StartServer() Line 1386
combase.dll!InitChannelIfNecessary() Line 1393
combase.dll!ClassicSTAThreadWaitForHandles(unsigned long dwFlags, unsigned long dwTimeout, unsigned long cHandles, void * * pHandles, unsigned long * pd        

Answers


BlueMonkMN February 2016

With the assistance of Microsoft support and DebugDiag we have determined that the cause of the problem seems to be related to a Loader Lock. Loader Lock is documented in detail at https://msdn.microsoft.com/en-us/library/ms173266(v=vs.120).aspx but basically, there are certain restrictions that apply to code that runs within the scope of DllMain or dynamic initialization of static un-managed code objects whose instances require dynamic initialization on loading the DLL (because they're in global scope). One way around this is to tell the C++ compiler that the code should be compiled with CLR support so it's not handling the initialization in DllMain, but rather another function that doesn't retain the loader lock.

In our code we had a global declaration:

CFSCoCultureWrapper cultureWrapper;

Which had a constructor that called CoCreateInstance on a managed COM object, which in turn had a reference to Microsoft.InteropToolbox. Applying the /clr switch to that one source file allowed the DLL to be loaded without hanging.

It's not clear why the behavior changed in different deployments, but, as the article linked states, the hang does not necessarily always occur, so these can be difficult problems to debug. To illustrate, even our simple test case was 4 levels deep loading DLLs before we encountered the problem - EXE loads (LoadLibrary) unmanaged DLL loads (CoCreateInstance) managed DLL loads Microsoft DLL. We decided with the level of complexity involved in these sorts of problems it was well enough understood and did not pursue further understanding why the problem only occurred in certain deployments.

Simple answer, don't create global instances of objects that load managed code during the constructor from unmanaged code. Use lazy initialization or switch the code file to use the /c

Post Status

Asked in February 2016
Viewed 1,186 times
Voted 9
Answered 1 times

Search




Leave an answer