Troubleshoot a Linux VM by attaching the OS disk to a recovery VM with the Azure CLI
Applies to: ✔️ Linux VMs
If your Linux virtual machine (VM) encounters a boot or disk error, you may need to perform troubleshooting steps on the virtual hard disk itself. A common example is an invalid entry in /etc/fstab that prevents the VM from booting successfully. This article details how to use the Azure CLI to connect your virtual hard disk to another Linux VM to fix any errors, then re-create your original VM.
Recovery process overview
The troubleshooting process is as follows:
- Stop the affected VM.
- Take a snapshot from the OS disk of the VM.
- Create a disk from the OS disk snapshot.
- Attach and mount the new OS disk to another Linux VM for troubleshooting purposes.
- Connect to the troubleshooting VM. Edit files or run any tools to fix issues on the new OS disk.
- Unmount and detach the new OS disk from the troubleshooting VM.
- Change the OS disk for the affected VM.
To perform these troubleshooting steps, you need the latest Azure CLI installed, and you must be logged in to an Azure account using az login.
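A quick pre-flight check can confirm both prerequisites before you start. The following is a minimal shell sketch, not an official tool: check_az_ready is a hypothetical helper name, and it assumes that az account show exits with a non-zero status when no account is logged in.

```shell
# check_az_ready: hypothetical helper (not part of the Azure CLI).
# Returns 1 if az is not on PATH, 2 if no login session is found.
check_az_ready() {
    if ! command -v az >/dev/null 2>&1; then
        echo "Azure CLI (az) not found" >&2
        return 1
    fi
    # Assumption: az account show fails when nobody is logged in.
    if ! az account show >/dev/null 2>&1; then
        echo "Not logged in; run: az login" >&2
        return 2
    fi
    echo "Azure CLI ready"
}
```

Call check_az_ready at the top of a recovery script and stop if it returns a non-zero status.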
You can use the VM repair commands to automate steps 1, 2, 3, 4, 6, and 7. For more documentation and instructions, see Repair a Linux VM by using the Azure Virtual Machine repair commands.
Important
The scripts in this article apply only to VMs that use managed disks.
In the following examples, replace the example parameter values, such as myResourceGroup and myVM, with your own values.
Determine boot issues
Examine the serial output to determine why your VM is not able to boot correctly. A common example is an invalid entry in /etc/fstab, or the underlying virtual hard disk being deleted or moved.
Get the boot logs with az vm boot-diagnostics get-boot-log. The following example gets the serial output from the VM named myVM in the resource group named myResourceGroup:
az vm boot-diagnostics get-boot-log --resource-group myResourceGroup --name myVM
Review the serial output to determine why the VM is failing to boot. If the serial output doesn't give any indication, you may need to review log files in /var/log once you have the virtual hard disk connected to a troubleshooting VM.
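If the serial output is long, it can help to save it to a file and search it for messages that commonly accompany boot failures. The sketch below is an illustration only: scan_boot_log is a hypothetical helper, and the pattern list is an assumed starting point rather than a complete catalog.

```shell
# scan_boot_log FILE: hypothetical helper that prints (with line numbers)
# lines matching a few well-known systemd and kernel failure messages.
# The pattern list is an assumption; extend it for your distro.
scan_boot_log() {
    local log_file="$1"
    grep -n -i -E \
        'emergency mode|dependency failed|kernel panic|timed out waiting for device' \
        "$log_file"
}
```

For example, save the output with az vm boot-diagnostics get-boot-log --resource-group myResourceGroup --name myVM > boot.log, then run scan_boot_log boot.log. grep exits with a non-zero status when nothing matches, so an empty result only means that none of these particular patterns appeared.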
Stop the VM
The following example stops the VM named myVM in the resource group named myResourceGroup:
az vm stop --resource-group myResourceGroup --name myVM
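Before taking the snapshot, you may want to confirm that the VM has actually stopped. A minimal sketch, assuming a hypothetical helper named vm_power_state that wraps az vm get-instance-view:

```shell
# vm_power_state RESOURCE_GROUP VM_NAME: hypothetical helper that prints
# the display status of the VM's PowerState entry.
vm_power_state() {
    local resource_group="$1" vm_name="$2"
    az vm get-instance-view \
        --resource-group "$resource_group" \
        --name "$vm_name" \
        --query "instanceView.statuses[?starts_with(code, 'PowerState/')].displayStatus" \
        --output tsv
}
```

After az vm stop, the reported status is typically VM stopped. A VM stopped with az vm deallocate reports VM deallocated instead.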
Take a snapshot from the OS Disk of the affected VM
A snapshot is a full, read-only copy of a VHD. It cannot be attached to a VM directly, so in the next step you create a disk from this snapshot. The following example creates a snapshot named mySnapshot from the OS disk of the VM named myVM.
#Get the OS disk Id
$osdiskid=(az vm show -g myResourceGroup -n myVM --query "storageProfile.osDisk.managedDisk.id" -o tsv)
#creates a snapshot of the disk
az snapshot create --resource-group myResourceGroup --source "$osdiskid" --name mySnapshot
Create a disk from the snapshot
This script creates a managed disk named myNewOSDisk from the snapshot named mySnapshot.
#Provide the name of your resource group
$resourceGroup="myResourceGroup"
#Provide the name of the snapshot that will be used to create Managed Disks
$snapshot="mySnapshot"
#Provide the name of the Managed Disk
$osDisk="myNewOSDisk"
#Provide the size of the disks in GB. It should be greater than the VHD file size.
$diskSize=128
#Provide the storage type for Managed Disk. Premium_LRS or Standard_LRS.
$storageType="Premium_LRS"
#Provide the OS type
$osType="linux"
#Get the snapshot Id
$snapshotId=(az snapshot show --name $snapshot --resource-group $resourceGroup --query id -o tsv)
# Create a new managed disk using the snapshot Id.
az disk create --resource-group $resourceGroup --name $osDisk --sku $storageType --size-gb $diskSize --source $snapshotId
If the resource group and the source snapshot are not in the same region, you receive a "Resource is not found" error when you run az disk create. In this case, specify --location <region> to create the disk in the same region as the source snapshot.
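One way to avoid the region mismatch is to read the snapshot's location and pass it to az disk create explicitly. The following shell sketch illustrates the idea. create_disk_from_snapshot is a hypothetical wrapper, and it uses Bash-style assignment rather than the $name=(...) form used in the script above.

```shell
# create_disk_from_snapshot RG SNAPSHOT DISK_NAME SKU SIZE_GB
# Hypothetical wrapper: creates the disk in the same region as the snapshot.
create_disk_from_snapshot() {
    local resource_group="$1" snapshot="$2" disk_name="$3" sku="$4" size_gb="$5"
    local snapshot_id location

    # Look up the snapshot's resource ID and region.
    snapshot_id=$(az snapshot show --name "$snapshot" \
        --resource-group "$resource_group" --query id --output tsv) || return 1
    location=$(az snapshot show --name "$snapshot" \
        --resource-group "$resource_group" --query location --output tsv) || return 1

    # Pass --location so the disk lands in the snapshot's region.
    az disk create \
        --resource-group "$resource_group" \
        --name "$disk_name" \
        --sku "$sku" \
        --size-gb "$size_gb" \
        --source "$snapshot_id" \
        --location "$location"
}
```

With the example values from this section, the call would be create_disk_from_snapshot myResourceGroup mySnapshot myNewOSDisk Premium_LRS 128.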
Now you have a copy of the original OS disk. You can mount this new disk to another Linux VM for troubleshooting purposes.
Attach the new virtual hard disk to another VM
For the next few steps, you use another VM for troubleshooting purposes. You attach the disk to this troubleshooting VM to browse and edit the disk's content. This process allows you to correct any configuration errors or review additional application or system log files.
This script attaches the disk myNewOSDisk to the VM MyTroubleshootVM:
# Get ID of the OS disk that you just created.
$myNewOSDiskid=(az disk show -g $resourceGroup -n $osDisk --query id -o tsv)
# Attach the disk to the troubleshooting VM
az vm disk attach --disk $myNewOSDiskid --resource-group $resourceGroup --size-gb $diskSize --sku $storageType --vm-name MyTroubleshootVM
Mount the attached data disk
Note
The following examples detail the steps required on an Ubuntu VM. If you are using a different Linux distro, such as Red Hat Enterprise Linux or SUSE, the log file locations and mount commands may be a little different. Refer to the documentation for your specific distro for the appropriate changes in commands.
- SSH to your troubleshooting VM using the appropriate credentials. If this disk is the first data disk attached to your troubleshooting VM, the disk is likely connected to /dev/sdc. Use dmesg to view attached disks:
  dmesg | grep SCSI
  The output is similar to the following example:
  [ 0.294784] SCSI subsystem initialized
  [ 0.573458] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
  [ 7.110271] sd 2:0:0:0: [sda] Attached SCSI disk
  [ 8.079653] sd 3:0:1:0: [sdb] Attached SCSI disk
  [ 1828.162306] sd 5:0:0:0: [sdc] Attached SCSI disk
  In the preceding example, the OS disk is at /dev/sda and the temporary disk provided for each VM is at /dev/sdb. If you had multiple data disks, they should be at /dev/sdd, /dev/sde, and so on.
- Create a directory to mount your existing virtual hard disk. The following example creates a directory named troubleshootingdisk:
  sudo mkdir /mnt/troubleshootingdisk
- If you have multiple partitions on your existing virtual hard disk, mount the required partition. The following example mounts the first primary partition at /dev/sdc1:
  sudo mount /dev/sdc1 /mnt/troubleshootingdisk
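To pick out the device names from the dmesg output without reading every line, you could filter it. This is a small sketch: list_scsi_disks is a hypothetical helper that assumes the kernel message format shown in the example output.

```shell
# list_scsi_disks: hypothetical filter that reads dmesg text on stdin and
# prints the device name from each "[sdX] Attached SCSI disk" message.
list_scsi_disks() {
    grep -o '\[sd[a-z]*\] Attached SCSI disk' |
        sed 's/^\[\(sd[a-z]*\)\].*/\1/'
}
```

Run it as dmesg | list_scsi_disks. With the example output above, it prints sda, sdb, and sdc, one per line.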
Note
Best practice is to mount data disks on VMs in Azure using the universally unique identifier (UUID) of the virtual hard disk. For this short troubleshooting scenario, mounting the virtual hard disk using the UUID is not necessary. However, under normal use, editing /etc/fstab to mount virtual hard disks using device name rather than UUID may cause the VM to fail to boot.
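For reference, a UUID-based /etc/fstab entry can be composed as in the sketch below. fstab_line_for_uuid is a hypothetical helper. The defaults,nofail options and the trailing 0 2 fields are assumptions chosen for a typical data disk, so adjust them to your needs.

```shell
# fstab_line_for_uuid UUID MOUNT_POINT FS_TYPE
# Hypothetical helper: prints one tab-separated fstab entry keyed by UUID.
# The options (defaults,nofail) and dump/pass fields (0 2) are assumptions.
fstab_line_for_uuid() {
    local uuid="$1" mount_point="$2" fs_type="$3"
    printf 'UUID=%s\t%s\t%s\tdefaults,nofail\t0\t2\n' \
        "$uuid" "$mount_point" "$fs_type"
}
```

The UUID and file system type of a partition can be read with sudo blkid -s UUID -o value /dev/sdc1 and sudo blkid -s TYPE -o value /dev/sdc1.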
Fix issues on the new OS disk
With the existing virtual hard disk mounted, you can now perform any maintenance and troubleshooting steps as needed. Once you have addressed the issues, continue with the following steps.
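Because an invalid /etc/fstab entry is a common cause of boot failures, one useful first check is to list entries on the mounted disk that refer to a device name instead of a UUID. A minimal sketch, assuming the disk is mounted at /mnt/troubleshootingdisk as in the previous section. find_device_name_mounts is a hypothetical helper.

```shell
# find_device_name_mounts FSTAB_FILE
# Hypothetical helper: prints "line-number: entry" for every non-comment
# fstab entry whose first field is a /dev/sd* device name.
find_device_name_mounts() {
    local fstab_file="$1"
    awk '!/^[ \t]*#/ && $1 ~ /^\/dev\/sd/ { print NR ": " $0 }' "$fstab_file"
}
```

For the mounted disk, the call would be find_device_name_mounts /mnt/troubleshootingdisk/etc/fstab. Any entry it reports is a candidate for review, not necessarily the cause of the failure.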
Unmount and detach the new OS disk
Once your errors are resolved, you unmount and detach the existing virtual hard disk from your troubleshooting VM. You cannot use your virtual hard disk with any other VM until the lease attaching the virtual hard disk to the troubleshooting VM is released.
From the SSH session to your troubleshooting VM, unmount the existing virtual hard disk. Change out of the parent directory for your mount point first:
cd /
Now unmount the existing virtual hard disk. The following example unmounts the device at /dev/sdc1:
sudo umount /dev/sdc1
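To confirm that the unmount succeeded before detaching the disk, you could check the kernel's mount table. In this sketch, is_mounted is a hypothetical helper that reads /proc/mounts by default. The optional second argument exists only so that the check can be pointed at another file.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Hypothetical helper: succeeds (exit 0) if DEVICE appears in the first
# column of the mount table, fails (exit 1) otherwise.
is_mounted() {
    local device="$1" mounts_file="${2:-/proc/mounts}"
    awk -v dev="$device" '$1 == dev { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

For example, if is_mounted /dev/sdc1 succeeds after umount, the device is still mounted and should not be detached yet.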
Exit the SSH session to your troubleshooting VM, then detach the virtual hard disk from the VM:
az vm disk detach -g myResourceGroup --vm-name MyTroubleshootVM --name myNewOSDisk
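Before swapping the OS disk, you may want to confirm that the detach operation has completed. The sketch below polls the disk state. wait_until_unattached is a hypothetical helper, and the default attempt count and delay are arbitrary assumptions.

```shell
# wait_until_unattached RG DISK_NAME [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls the managed disk until its diskState is
# "Unattached". Returns 1 if the state never changes within ATTEMPTS polls.
wait_until_unattached() {
    local resource_group="$1" disk_name="$2"
    local attempts="${3:-10}" delay="${4:-6}"
    local state="" i=1
    while [ "$i" -le "$attempts" ]; do
        state=$(az disk show --resource-group "$resource_group" \
            --name "$disk_name" --query diskState --output tsv)
        if [ "$state" = "Unattached" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "Disk $disk_name is still in state: $state" >&2
    return 1
}
```

With the example names, the call would be wait_until_unattached myResourceGroup myNewOSDisk.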
Change the OS disk for the affected VM
You can use Azure CLI to swap the OS disks. You don't have to delete and recreate the VM.
This example stops the VM named myVM, assigns the disk named myNewOSDisk as the new OS disk, and then starts the VM again.
# Stop the affected VM
az vm stop -n myVM -g myResourceGroup
# Get ID of the OS disk that is repaired.
$myNewOSDiskid=(az disk show -g $resourceGroup -n $osDisk --query id -o tsv)
# Change the OS disk of the affected VM to "myNewOSDisk"
az vm update -g myResourceGroup -n myVM --os-disk $myNewOSDiskid
# Start the VM
az vm start -n myVM -g myResourceGroup
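After the VM starts, you can check that it now uses the repaired disk. A minimal sketch, assuming a hypothetical helper named verify_os_disk:

```shell
# verify_os_disk RG VM_NAME EXPECTED_DISK_NAME
# Hypothetical helper: succeeds if the VM's current OS disk name matches
# the expected name, fails otherwise.
verify_os_disk() {
    local resource_group="$1" vm_name="$2" expected_disk="$3"
    local current
    current=$(az vm show --resource-group "$resource_group" \
        --name "$vm_name" --query "storageProfile.osDisk.name" --output tsv)
    [ "$current" = "$expected_disk" ]
}
```

For example, verify_os_disk myResourceGroup myVM myNewOSDisk succeeds only when the swap has taken effect. If the VM still fails to boot, review the serial output again as described in the Determine boot issues section.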
Next steps
If you are having issues connecting to your VM, see Troubleshoot SSH connections to an Azure VM. For issues with accessing applications running on your VM, see Troubleshoot application connectivity issues on a Linux VM.
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.