Troubleshoot: Client Calls to DCOM Server Failed With "A security package specific error occurred. (Exception from HRESULT: 0x80070721)."

The customer has a web application that needs to call a remote DCOM server. However, the web application intermittently received this error message:
"A security package specific error occurred. (Exception from HRESULT: 0x80070721)."

Before going further, check whether the environment is affected by either of the following known issues:

A COM+ application may stop working on Windows Server 2008 when the identity user logs off

Error message when you try to connect to the Product Studio middle-tier server that is running on a Windows Server 2008-based computer from a computer that is running Windows XP: "Could not connect to the Product Studio datastore listed above"

Note: The second issue is described for the Product Studio scenario, but it actually applies to all DCOM calls from Windows XP to Windows Server 2008.

If neither known issue applies, continue as described below.

This error message usually occurs when the DCOM client tries to authenticate to the remote DCOM server, especially when Kerberos authentication fails.
To troubleshoot this kind of issue, first capture Network Monitor traces on both the DCOM client side (the web application) and the DCOM server side, then analyze the traces for Kerberos failures that need to be fixed. The general steps are:

1. Install the Network Monitor tool on the DCOM client (the web server) and on the DCOM server from the link below:

 (Select the x86 version for 32-bit systems and the x64 version for 64-bit systems.)
 
https://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=983b941d-06cb-4658-b7f6-3088333d062f

2. On the DCOM client, run this command to purge the Kerberos ticket cache:

klist purge

Note: Windows 7 and Windows Server 2008 include the klist tool by default. For Windows Server 2003, klist is available in the Windows Server 2003 Resource Kit.

3. On the DCOM client side, run this command in a command window to start the Network Monitor capture:
 
nmcap /network * /capture /file test.cap:500M /stopwhen /keypress q

On the DCOM server side, run the same command in a command window:
 
nmcap /network * /capture /file test.cap:500M /stopwhen /keypress q

4. From an Internet Explorer client, browse to the affected function of the web application and reproduce the DCOM error message.

5. Press Ctrl+C to stop the network trace, then open the .cap file in Network Monitor and start analyzing it. Continue troubleshooting based on any Kerberos error or other authentication error that appears in the trace.
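The command-line parts of steps 2 and 3 can be wrapped in a small helper so that the same invocation is used on both machines. The following is only a minimal sketch in Python: the function names are hypothetical, the command strings are copied from the steps above, and it assumes that klist and nmcap are on the PATH of a Windows machine.

```python
import subprocess


def build_purge_command():
    """Command from step 2: purge the Kerberos ticket cache."""
    return ["klist", "purge"]


def build_capture_command(capture_file="test.cap", max_size="500M", stop_key="q"):
    """Command from step 3, with the file name, size limit and key as parameters."""
    return [
        "nmcap", "/network", "*", "/capture",
        "/file", f"{capture_file}:{max_size}",
        "/stopwhen", "/keypress", stop_key,
    ]


def run_capture(on_client):
    """Run the capture; purge the ticket cache first on the DCOM client only.

    Assumption: klist and nmcap are installed and on the PATH (Windows only).
    """
    if on_client:
        subprocess.run(build_purge_command(), check=True)
    subprocess.run(build_capture_command(), check=True)
```

Calling run_capture(on_client=True) on the DCOM client and run_capture(on_client=False) on the DCOM server would start both captures; the issue still has to be reproduced by hand as in step 4.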

In this case, we applied "KerberosV5" as the display filter and found several Kerberos errors, including KDC_ERR_ETYPE_NOSUPP.
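If the filtered frame summaries are saved as plain text, the Kerberos errors can also be tallied with a short script. This is a rough sketch in Python: the function name and the layout of the sample lines are hypothetical and do not reflect the exact Network Monitor export format. It only looks for error names of the form KDC_ERR_*, KRB5KDC_ERR_*, KRB_AP_ERR_* or KRB_ERR_*.

```python
import re
from collections import Counter

# Matches Kerberos error names such as KDC_ERR_ETYPE_NOSUPP or KRB_AP_ERR_MODIFIED.
KERBEROS_ERROR = re.compile(r"\b(?:KRB5KDC|KDC|KRB_AP|KRB)_ERR_[A-Z0-9_]+\b")


def count_kerberos_errors(lines):
    """Count how often each Kerberos error name appears in the given text lines."""
    counts = Counter()
    for line in lines:
        counts.update(KERBEROS_ERROR.findall(line))
    return counts


# Hypothetical frame summaries, used for illustration only.
sample_lines = [
    "12  KerberosV5  KRB_ERROR - KDC_ERR_PREAUTH_REQUIRED (25)",
    "13  KerberosV5  AS Request",
    "20  KerberosV5  KRB_ERROR - KDC_ERR_ETYPE_NOSUPP (14)",
    "31  KerberosV5  KRB_ERROR - KDC_ERR_ETYPE_NOSUPP (14)",
]
error_counts = count_kerberos_errors(sample_lines)
```

An error that repeats many times in the tally, like KDC_ERR_ETYPE_NOSUPP in this case, is usually the one to investigate first.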

In this case, the customer hit the issue after upgrading the domain controllers to Windows Server 2008 R2, after which Advanced Encryption Standard (AES 128 and AES 256) is supported for the Kerberos protocol.

Raising the functional level therefore triggers the following two actions directly on the PDC:

1. Modify the msDS-Behavior-Version attribute on the domain partition object.
2. Change the password of the krbtgt object.

However, if a DC receives change 2 from the PDC before change 1, it rebuilds the password hashes in its memory before the functional level upgrade has taken effect. Because change 1 (the functional level change) has not arrived yet, the DC believes it does not support AES, so the hashes are rebuilt in memory without AES. (Existing passwords stored in the AD database are rebuilt with the legacy encryption types; none are rebuilt with AES.) When change 1 arrives later, the DC considers itself AES-capable, but it still cannot use AES because there is no AES hash in memory.

So, with two DCs in the domain, suppose DC1 receives change 1 before change 2, whereas DC2 receives change 2 before change 1. A TGT requested from DC1 is then encrypted with AES and cannot be decrypted by DC2, which returns KRB5KDC_ERR_ETYPE_NOSUPP.
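The ordering problem can be illustrated with a toy model. The sketch below is Python and is not how a domain controller is actually implemented: the class, its methods and the "RC4"/"AES" labels are hypothetical simplifications. It only replays the two replication orders described above and shows why a restart, which rebuilds the in-memory hashes, clears the error.

```python
class DomainController:
    """Toy model of the in-memory krbtgt key state of a DC (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.level_raised = False      # has change 1 arrived?
        self.krbtgt_etypes = {"RC4"}   # encryption types with a key in memory

    def _rebuild_keys(self):
        # AES keys are only built if the DC already knows the new functional level.
        self.krbtgt_etypes = {"RC4", "AES"} if self.level_raised else {"RC4"}

    def receive(self, change):
        if change == "functional_level":      # change 1
            self.level_raised = True
        elif change == "krbtgt_password":     # change 2
            self._rebuild_keys()
        else:
            raise ValueError(f"unknown change: {change}")

    def issue_tgt(self):
        """Return the encryption type this DC would use for a new TGT."""
        return "AES" if self.level_raised and "AES" in self.krbtgt_etypes else "RC4"

    def validate_tgt(self, etype):
        """Try to decrypt a TGT that was encrypted with the given type."""
        return "OK" if etype in self.krbtgt_etypes else "KDC_ERR_ETYPE_NOSUPP"

    def restart(self):
        # A restart rebuilds the in-memory hashes with the current functional level.
        self._rebuild_keys()


dc1 = DomainController("DC1")
for change in ("functional_level", "krbtgt_password"):   # change 1, then change 2
    dc1.receive(change)

dc2 = DomainController("DC2")
for change in ("krbtgt_password", "functional_level"):   # change 2, then change 1
    dc2.receive(change)

tgt_etype = dc1.issue_tgt()                      # DC1 issues an AES-encrypted TGT
before_restart = dc2.validate_tgt(tgt_etype)     # DC2 has no AES key in memory
dc2.restart()
after_restart = dc2.validate_tgt(tgt_etype)      # AES key now present
```

In this model, before_restart is the KDC_ERR_ETYPE_NOSUPP failure and after_restart succeeds, which matches the behavior observed in this case.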

To make sure every DC rebuilds the hashes in memory, restart the DCs one by one. The customer rebooted the DCs, and the issue was resolved completely.

Regards,
Freist Li from APGC DSI Team