Azure Monitor Alert: AgentExpiry on VaultXXX ( microsoft.recoveryservices/vaults ).

AmazingPhone6248 20 Reputation points
2024-12-03T11:09:08.2133333+00:00

"Azure Monitor Alert AgentExpiry on VaultXXX ( microsoft.recoveryservices/vaults )"

"Critical: Azure Site Recovery components of Recovery Services vault: 'VaultXXX' has expired."

Hello,

We've had two of the alerts stated above, over the past couple of weeks.

This is Hyper-V to Azure replication.

Each time I've checked the systems and everything looks up to date and replicating just fine.

So what am I supposed to be looking for?

I'm concerned as the alert says Critical and Severity 1, but I can't find anything wrong.

Thanks.

Azure Backup

2 answers

  1. Ashok Gandhi Kotnana 2,660 Reputation points Microsoft Vendor
    2024-12-04T10:37:02.97+00:00

    Hi @AmazingPhone6248

    Welcome to Microsoft Q&A Forum, thank you for posting your query here!

    The alerts you're seeing, "Azure Monitor Alert AgentExpiry" and "Critical: Azure Site Recovery components... has expired", suggest that the Azure Site Recovery (ASR) components in your setup may be running an outdated version.

    Both are built-in Azure Monitor alerts for Azure Site Recovery, intended as preventive warnings.

    1. "Critical: Azure Site Recovery components of Recovery Services vault: 'VaultXXX' has expired."
    We recommend always upgrading to the latest component versions. With every new version 'N' of an Azure Site Recovery component that's released, all versions below 'N-4' are considered to be out of support.

    Important

    Official support is for upgrading from > N-4 version to N version. For example, if you are on N-6, you need to first upgrade to N-4, and then upgrade to N.
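The N-4 rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the release labels `R1`..`R8` are hypothetical placeholders, not real Site Recovery versions, and the sketch assumes (reading the N-6 example) that a host exactly on N-4 can go straight to N and that a single hop to N-4 is enough for anything older. Check the service updates page for the actual versions involved.

```python
def versions_behind(releases, current):
    """How many releases `current` is behind the newest one.

    `releases` is ordered oldest to newest; the labels are hypothetical.
    """
    return len(releases) - 1 - releases.index(current)


def upgrade_path(releases, current, window=4):
    """Return the list of versions to install, in order.

    Assumption: a version at most `window` releases behind upgrades
    directly to the latest (N); anything older goes to N-4 first.
    """
    behind = versions_behind(releases, current)
    latest = releases[-1]
    if behind == 0:
        return []                      # already on N
    if behind <= window:
        return [latest]                # within N-4: direct upgrade to N
    return [releases[-1 - window], latest]  # older: N-4 first, then N


# Hypothetical release history, oldest to newest (R8 plays the role of N).
releases = ["R1", "R2", "R3", "R4", "R5", "R6", "R7", "R8"]
print(upgrade_path(releases, "R2"))  # host on N-6
print(upgrade_path(releases, "R5"))  # host on N-3
```

With this placeholder history, a host on `R2` (N-6) is routed through `R4` (N-4) before `R8`, matching the example in the quoted policy.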

    Please go through the below reference document.
    Refer: https://learn.microsoft.com/en-us/azure/site-recovery/service-updates-how-to
    https://learn.microsoft.com/en-us/azure/site-recovery/site-recovery-whats-new

    2. "Azure Monitor Alert AgentExpiry on VaultXXX ( microsoft.recoveryservices/vaults )"

    In this scenario, we strongly recommend that you enable automatic updates. You can allow Site Recovery to manage updates as follows:

    When a new agent update is available, Site Recovery provides a notification in the vault towards the top of the page. In the vault > Replicated Items, click this notification at the top of the screen:

    New Site Recovery replication agent update is available. Click to install ->

    Select the VMs for which you want to apply the update, and then click OK.

    On the VM disaster recovery overview page, you will find the 'Agent status' field which will say 'Critical Upgrade' if the agent is due to expire. Click on it and follow the subsequent instructions to manually upgrade the virtual machine.

    Refer: https://learn.microsoft.com/en-us/azure/site-recovery/service-updates-how-to

    Please let me know how it goes; we are always happy to help as needed.


  2. Patrick Gawthorne 0 Reputation points
    2025-01-08T06:10:59.1266667+00:00

    I was having this exact issue as well with the same version numbers and alerts from Azure, complete with broken links to the upgrade documentation in the alert email. 🙄

    Your mileage may vary; here is an overview of what I did to resolve this issue:

    I had to completely uninstall both the Recovery Services Agent and the Site Recovery Provider. When uninstalled from Control Panel, the agent (MARS) doesn't appear to remove itself completely; you can try stopping the service before uninstalling to see whether the steps below are necessary. Replication on that host halts while the agent is not running, but thankfully this doesn't take long. I performed this on a cluster, so I was able to move replicated VMs around the hosts to retain our RPO.

    After uninstalling via Control Panel, I stopped the Azure Recovery Services Agent (obengine) service, deleted the service manually, and removed its associated files in Program Files. I don't know why this is even a solution; it was a last resort for me.

    To remove the stale service, I ran the following in cmd as admin:

    SC STOP obengine
    SC DELETE obengine
    

    Once the MARS agent is stopped and the service deleted, I cleaned up its program files by removing the folder below. It's probably best to move it rather than delete it, in case you need to go back to the logs, as some are retained there.

    C:\Program Files\Microsoft Azure Recovery Services Agent\

    After moving or deleting, I reinstalled both the provider and agent via the bundled setup downloaded from the Azure Portal. Here is the link to save you time: https://aka.ms/downloaddra_ae

    Once it has finished installing, it will say the server is already registered; you can just click Exit to finish the setup. Check the status of the server in the Recovery Services vault (under Site Recovery Infrastructure | Hyper-V Hosts); it should state Connected. If the version hasn't updated yet, click the server and then click the Refresh Server button. After a few minutes and a couple of browser refreshes, the version should show as up to date. It won't have "(latest)" appended to it, but it should be at or higher than the roll up (mine is higher, at 2.0.9940.0).
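If you want to check "at or higher than" without eyeballing it, dotted version strings need to be compared numerically, part by part, rather than as text. A minimal sketch, assuming four-part numeric versions like the ones quoted in this thread; the helper names are made up for illustration, and the minimum version to compare against should come from the roll up notes rather than from this example.

```python
def parse_version(text):
    """Turn '2.0.9940.0' into (2, 0, 9940, 0) so parts compare as numbers."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed, required):
    """True if `installed` is the same as or newer than `required`."""
    return parse_version(installed) >= parse_version(required)


# Version numbers taken from this thread, used here only as sample inputs.
print(is_at_least("2.0.9940.0", "2.0.9260.0"))
print(is_at_least("2.0.9260.0", "2.0.9940.0"))
```

Comparing tuples of integers avoids the trap where plain string comparison would rank a build such as 10000 below 9940.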

    After a lot of troubleshooting, it appears that the patch to 2.0.9940.0 (which is in the bundled setup) is what fails to install. If you look at the MSI install logs in the Windows temp folder and at the extracted setup files of marsagentinstaller.exe (bundled in AzureSiteRecoveryProvider.exe from the Recovery Services Vault -> Add Host -> Download Installer), you'll see an MSI with version 2.0.8673.0 and a patch file named as a KB number in the "installer" folder. The setup detects whether MARS is already installed and then applies a patch (.msp) if applicable. In our case the patch fails, but setup continues because the service is already on the machine, so MARS never gets updated and is left behind, leaving us stuck on 2.0.9260.0. I have seen warnings of expected version mismatches in the MSI install logs, but this was when attempting to upgrade before doing any cleanup. A fresh install seems to install version 2.0.8673.0 and then apply the same patch to 2.0.9940.0 straight after. You might be able to verify this by opening OBPatch.log in C:\Windows\Temp and searching for "IncompatiblePatchError".
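The log check mentioned above can also be scripted. The following is a rough sketch, not a tested diagnostic: the function name is made up for illustration, and the file encoding is an assumption (the thread doesn't say how OBPatch.log is encoded, so pass a different `encoding` if nothing matches on a log you know contains the marker).

```python
def find_marker_lines(log_path, marker="IncompatiblePatchError", encoding="utf-8"):
    """Return (line_number, text) pairs for lines containing `marker`.

    The match is case-insensitive. Undecodable bytes are replaced rather
    than raising, since the log's real encoding is an assumption here.
    """
    hits = []
    needle = marker.lower()
    with open(log_path, "r", encoding=encoding, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line.lower():
                hits.append((number, line.rstrip()))
    return hits


# Example call (path as given in the post above; run on the Hyper-V host):
# for number, text in find_marker_lines(r"C:\Windows\Temp\OBPatch.log"):
#     print(number, text)
```

An empty result only means the marker wasn't found in that file as read; it doesn't prove the patch applied cleanly.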

    Hope this helps.

