Updating Hyper-V hosts with CAU

Peter 20 Reputation points
2024-11-19T10:26:26.1733333+00:00

Hello All,

Hope you all are okay and doing fine.

At the company I recently joined, I'm looking at optimizing the patching process for our Hyper-V hosts, which are all in a Windows cluster.

While researching, I came across Cluster-Aware Updating (CAU), which is already available on the clusters. I have never used it before, so apologies if these questions are basic.

I've started testing it, and it is doing its job, but I have some requirements that I would like it to follow during the process.

First, as far as I understand, when a run starts, it downloads and installs the packages on all hosts simultaneously, then picks the least-used node, drains the VMs and storage from it, reboots it, runs its checks, and so on until the cluster is patch compliant.

I would like to ask for your help with the following.

Yesterday, I hit an issue where storage could not be migrated off the host that CAU was putting into maintenance mode (MM), and this broke the storage that the host was holding.

The node went into 'Draining' and stopped responding, and the disk it was hosting froze and showed as Down.

After a hard restart, the system released the lock on the storage, and it became operational once I brought it online on another host.

1. What can I do to prevent further disk locks and failed migrations?

I have something in mind: can I put the host into MM first? (I noticed that CAU has an option to run a script before the updates, so the script could be a simple PowerShell command that puts the host into MM.)

2. I'm not sure why the disk was locked and why the drain hung and failed.

Any advice that comes to mind is much appreciated.

As mentioned, I'm still testing it, but I want to configure it correctly, with the requirement that the host is put into MM before anything else happens. (I think the storage hangs after the patches are installed, but I don't know for sure.) Once it is working fine, I want to schedule it so we can use the automation.

3. To everyone who is using it: I'm looking for advice on how to set this up correctly so that I can prevent disruptions to storage, failed VM migrations, or anything else that can go wrong.

Looking forward to hearing your feedback.

Have a nice one.

BR,

Peter

Hyper-V
Windows Server Clustering

2 answers

  1. Alex Bykovskyi 2,166 Reputation points
    2024-11-20T21:40:05.12+00:00

    Hey,

    I would recommend running cluster validation. The issue might be related to your storage: the owner node of a CSV should be able to move between the nodes without issues. The following best practices cover this in more detail: https://github.com/MicrosoftDocs/windowsserverdocs/blob/main/WindowsServerDocs/failover-clustering/cluster-aware-updating-requirements.md

    Might also help:

    https://www.starwindsoftware.com/blog/whip-your-hyperconverged-failover-cluster-into-shape-automatically-and-with-no-downtime-using-microsofts-cluster-aware-updating/
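
    As a rough sketch of the checks suggested above, the following PowerShell runs cluster validation, runs the CAU readiness check, and tests whether a CSV can change owner outside of a CAU run. The cluster, node, and disk names are hypothetical placeholders, and it assumes the FailoverClusters and ClusterAwareUpdating modules are installed on the machine you run it from.

    ```powershell
    # Assumption: run from a cluster node or a management host that has the
    # Failover Clustering and CAU PowerShell modules installed.
    Import-Module FailoverClusters, ClusterAwareUpdating

    $ClusterName = 'HV-CLUSTER01'   # hypothetical cluster name
    $TargetNode  = 'HV-NODE02'      # hypothetical node that should take over the CSV
    $CsvName     = 'Cluster Disk 1' # hypothetical CSV resource name

    # Run cluster validation and produce the validation report.
    # Consider doing this in a maintenance window, since it can take a while.
    Test-Cluster -Cluster $ClusterName

    # Check that the cluster meets the CAU prerequisites.
    Test-CauSetup -ClusterName $ClusterName

    # Show every CSV with its current state and owner node.
    Get-ClusterSharedVolume -Cluster $ClusterName |
        Select-Object Name, State, OwnerNode

    # Manually move one CSV to another node to confirm ownership can change.
    Move-ClusterSharedVolume -Cluster $ClusterName -Name $CsvName -Node $TargetNode
    ```

    If the manual move hangs or fails in the same way the CAU drain did, the problem is more likely in the storage or cluster configuration than in CAU itself.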

    Cheers,

    Alex Bykovskyi

    StarWind Software

    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.


  2. Ian Xue 37,966 Reputation points Microsoft Vendor
    2024-11-21T03:29:56.47+00:00

    Hi Peter,

    Thanks for your post. To track down the storage errors, please check the Event Viewer for related events, which will help with root cause analysis. There are also some known issues with live migration errors; the reference below is for your information and may help.

    Reference: Troubleshoot live migration issues - Windows Server | Microsoft Learn
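
    As a minimal sketch of collecting that evidence with PowerShell instead of the Event Viewer UI, the following pulls recent critical, error, and warning events from the failover clustering operational log on one node and generates the cluster logs. The node name, cluster name, and two-day time window are assumptions for illustration.

    ```powershell
    # Assumption: hypothetical names; replace them with your own.
    $ClusterName = 'HV-CLUSTER01'
    $NodeName    = 'HV-NODE01'             # node that hung in 'Draining'
    $Since       = (Get-Date).AddDays(-2)  # look back two days

    # Critical (1), error (2), and warning (3) events from the clustering log.
    Get-WinEvent -ComputerName $NodeName -ErrorAction SilentlyContinue -FilterHashtable @{
        LogName   = 'Microsoft-Windows-FailoverClustering/Operational'
        Level     = 1, 2, 3
        StartTime = $Since
    } | Select-Object TimeCreated, Id, LevelDisplayName, Message |
        Format-List

    # Generate the detailed cluster log for every node, covering the same
    # window (TimeSpan is in minutes), and copy the files to a local folder.
    Get-ClusterLog -Cluster $ClusterName -TimeSpan 2880 -Destination $env:TEMP -UseLocalTime
    ```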

    Also, please note that CAU supports updating Storage Spaces Direct clusters regardless of the deployment type: hyper-converged or converged. Specifically, CAU orchestration ensures that suspending each cluster node waits for the underlying clustered storage space to be healthy.

    Reference: Cluster-Aware Updating - Frequently Asked Questions | Microsoft Learn
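
    If the cluster does use Storage Spaces Direct, a quick manual look at storage health before a run could be sketched as follows. This assumes it is run locally on a cluster node; on other storage types these cmdlets may return nothing useful.

    ```powershell
    # Assumption: run on a node of a Storage Spaces Direct cluster.
    # Virtual disks should report HealthStatus 'Healthy' before a node is drained.
    Get-VirtualDisk |
        Select-Object FriendlyName, HealthStatus, OperationalStatus

    # Repair or rebuild jobs that are still running indicate storage is resyncing.
    Get-StorageJob
    ```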

    Best Regards,

    Ian Xue


    If the Answer is helpful, please click "Accept Answer" and upvote it.

