
Swap space in Linux VMs on Windows Azure – Part 1

With the recent GA of IaaS on Windows Azure, many new scenarios are opening up. One such scenario was a recent project where I needed to deploy a multi-node Hadoop cluster on Linux-based machines in Azure. To get started, I used a CentOS 6.3 image from the Azure gallery to provision a medium-sized VM and proceeded to deploy a single-node core Hadoop installation. This seemed to work fine, except that as I started testing slightly heavier workloads, I noticed that the VM would often freeze or become unresponsive.

It is not difficult to guess that this had something to do with the limited resources of a medium-sized VM; after all, it has only 2 CPU cores and 3.5 GB of memory. But while I would have expected the workloads to run slowly, I was not expecting the whole VM to become unresponsive or start dropping connections. After I discussed this issue with my friends and colleagues, we determined that the VM did not have swap (the equivalent of a page file on Windows) configured at all. Thus, its virtual memory system could not swap to disk when memory pressure increased.

You can check how the system is doing memory-wise by running the “free” command from the Linux shell prompt. In particular, “cat /proc/swaps” shows the state of the swap space: how much is configured and how much is in use. See the screenshot below.

 

If the swap space is not configured at all, as is the default with Linux VMs provisioned in Azure IaaS, “cat /proc/swaps” returns nothing, and likewise the “free” command shows no swap activity.
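
The check described above can also be scripted. The following is a minimal sketch, assuming the usual /proc/swaps layout (an optional header line followed by one line per swap area, with the size in KiB in the third column). The helper name swap_total_kib is hypothetical and not part of any standard tool.

```shell
#!/bin/sh
# swap_total_kib [FILE]: sum the Size column (KiB) of a /proc/swaps-style file.
# Hypothetical helper; FILE defaults to the live /proc/swaps.
swap_total_kib() {
    # Skip the header line (NR > 1) and add up the third column.
    # "total + 0" prints 0 when there are no swap areas at all.
    awk 'NR > 1 { total += $3 } END { print total + 0 }' "${1:-/proc/swaps}"
}

# Report on the running system only if /proc/swaps is available.
if [ -r /proc/swaps ]; then
    total=$(swap_total_kib)
    if [ "$total" -eq 0 ]; then
        echo "No swap configured"
    else
        echo "Swap configured: ${total} KiB"
    fi
fi
```

On a freshly provisioned VM as described here, this would be expected to report that no swap is configured.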

An interesting question is why provisioning a VM from a Linux library image (i.e. from the Azure gallery) doesn’t automatically configure swap space. Perhaps the thinking is that the user should decide on the size and location of the swap and set it up after provisioning. However, this is not documented anywhere, and it is quite possible to keep using the VM without swap ever being configured until processes begin to crash or the VM freezes up.

That said, once the problem has been diagnosed, the solution is a simple set of steps to configure a file-based swap on the resource disk; a medium-sized IaaS VM in Azure comes with 135 GB of resource disk mounted as “/mnt/resource”. Given below is a walkthrough of the steps for configuring a file-based swap space on the VM.

  • Use the “fallocate” command to allocate a swap file of a suitable size, say 5 GB, on the resource disk. The syntax is: “fallocate -l 5g /mnt/resource/swap5g”, where “swap5g” is the name of the file.
  • Change the permissions on the file using the “chmod” command so that only the root user has read/write permissions on the swap file. The syntax is: “chmod 600 /mnt/resource/swap5g”
  • Use the “mkswap” command to set up the file as a swap area. The syntax is: “mkswap /mnt/resource/swap5g”
  • Enable the swap file using the “swapon” command. The syntax is: “swapon /mnt/resource/swap5g”
  • The swap is now ready for use, and the “cat /proc/swaps” command should confirm it.
  • Add an entry to the “/etc/fstab” file so that the swap settings are retained even if the VM recycles in Azure. The syntax is: echo "/mnt/resource/swap5g none swap sw 0 0" >> /etc/fstab (note the plain double quotes). Several readers report in the comments that this entry alone did not restore the swap after a reboot on their VMs, because /mnt/resource is mounted by the WALinuxAgent.
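
The steps above can be collected into one script. This is a minimal sketch rather than a tested deployment tool: the SWAP_FILE and SWAP_SIZE variables, the helper names add_fstab_entry and setup_swap, the --apply switch, and the duplicate check before appending to /etc/fstab are all illustrative additions that do not appear in the walkthrough. It must be run as root to change anything.

```shell
#!/bin/sh
# Path and size from the walkthrough; override via the environment if needed.
SWAP_FILE="${SWAP_FILE:-/mnt/resource/swap5g}"
SWAP_SIZE="${SWAP_SIZE:-5g}"

# add_fstab_entry SWAP_PATH FSTAB: append the swap line to FSTAB only if
# the exact same line is not already there (hypothetical helper).
add_fstab_entry() {
    swap_path=$1
    fstab=$2
    entry="$swap_path none swap sw 0 0"
    grep -qxF -- "$entry" "$fstab" 2>/dev/null || echo "$entry" >> "$fstab"
}

# setup_swap: the walkthrough steps in order, stopping at the first failure.
setup_swap() {
    # Allocate the file on the resource disk.
    fallocate -l "$SWAP_SIZE" "$SWAP_FILE" || return 1
    # Restrict access to root only.
    chmod 600 "$SWAP_FILE" || return 1
    # Write the swap signature.
    mkswap "$SWAP_FILE" || return 1
    # Enable the swap area immediately.
    swapon "$SWAP_FILE" || return 1
    # Record it in /etc/fstab for later boots.
    add_fstab_entry "$SWAP_FILE" /etc/fstab || return 1
    # Show the result.
    cat /proc/swaps
}

# Nothing is changed unless the script is called with --apply.
if [ "${1:-}" = "--apply" ]; then
    setup_swap
fi
```

Keeping the fstab append behind a duplicate check means the script can be re-run without stacking identical lines in the file.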

 

Here is a transcript of the above commands executed in my VM.

[root@mylinvm ~]# fallocate -l 5g /mnt/resource/swap5g

[root@mylinvm ~]# ll /mnt/resource/
total 5242904
drwxr-xr-x. 4 root root       4096 May 10 19:55 hadoop
drwx------. 2 root root      16384 May  1 17:08 lost+found
-rw-r--r--. 1 root root 5368709120 May 17 22:23 swap5g

[root@mylinvm ~]# chmod 600 /mnt/resource/swap5g

[root@mylinvm ~]# ll /mnt/resource/
total 5242904
drwxr-xr-x. 4 root root       4096 May 10 19:55 hadoop
drwx------. 2 root root      16384 May  1 17:08 lost+found
-rw-------. 1 root root 5368709120 May 17 22:23 swap5g

[root@mylinvm ~]# mkswap /mnt/resource/swap5g
mkswap: /mnt/resource/swap5g: warning: don't erase bootbits sectors
        on whole disk. Use -f to force.
Setting up swapspace version 1, size = 5242876 KiB
no label, UUID=564242e6-ac36-4a82-9766-7f9590b12369

[root@mylinvm ~]# swapon /mnt/resource/swap5g

[root@mylinvm ~]# free
             total       used       free     shared    buffers     cached
Mem:       3399192    2356524    1042668          0      17844     310620
-/+ buffers/cache:    2028060    1371132
Swap:      5242872          0    5242872


Acknowledgement: Thanks to my colleague Amit Srivastava for help with troubleshooting and resolving the swap issue.

Comments

  • Anonymous
    February 16, 2014
    Thanks for writing this. I was running R in an extra small CentOS image and install.package() was silently failing. I eventually narrowed down the reason with strace -e trace=process, and when I saw I had no swap space, google brought me to this article.

  • Anonymous
    February 27, 2014
    That command to remount it does not seem to work. I see that it adds the entry correctly to the file, but the structure in the file seems to be very different. I tested by rebooting the VM, and it did not remount it. Any suggestions?

  • Anonymous
    March 03, 2014
    I never got that echo command to work; it would never recreate the swap or start it on boot, so I added the following, which creates it every time the machine starts:
    sudo -i
    echo "fallocate -l 8g /mnt/resource/swapfile" >> /etc/rc3.d/S99local
    echo "chmod 600 /mnt/resource/swapfile" >> /etc/rc3.d/S99local
    echo "mkswap /mnt/resource/swapfile" >> /etc/rc3.d/S99local
    echo "swapon /mnt/resource/swapfile" >> /etc/rc3.d/S99local

  • Anonymous
    May 23, 2014
    Be advised that the procedure described is completely wrong for Linux VMs created in Azure. /mnt/resource is mounted by the WALinuxAgent, and it is not clear when that happens, so even the suggestion to add the commands to /etc/rc3.d/S99local doesn't work: it creates the file before /mnt/resource is mounted, and in my case I ended up with a "/" mount point with no space left. The correct way is to make the changes in /etc/waagent.conf as described in part 2 of this article (just change "part-1" to "part-2" in the URL). Be sure to delete any files created by the fallocate command before rebooting; the WALinuxAgent will recreate the file /mnt/resource/swapfile on every boot.

  • Anonymous
    November 05, 2014
    Sorted my azure vm swap issues out, thank you!

  • Anonymous
    November 19, 2014
    This is a great solution! Thank you

  • Anonymous
    June 15, 2015
    Thanks for this - a great resource.
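
The May 23, 2014 comment points to /etc/waagent.conf as the place to configure swap. A minimal sketch of that approach follows, assuming the agent's ResourceDisk.EnableSwap and ResourceDisk.SwapSizeMB settings; those option names come from the WALinuxAgent configuration file rather than from this article, so check them against the waagent.conf on your own VM. The helper set_conf_key and the --apply switch are hypothetical, and 5120 MB simply mirrors the 5 GB used in the walkthrough.

```shell
#!/bin/sh
# set_conf_key FILE KEY VALUE: set "KEY=VALUE" in a key=value config file,
# replacing existing lines for KEY or appending one if none exist.
# Hypothetical helper, not part of the WALinuxAgent.
set_conf_key() {
    _file=$1
    _key=$2
    _value=$3
    _tmp=$(mktemp) || return 1
    # index(...) == 1 matches lines that start with "KEY=" literally,
    # so dots in the key are not treated as regex wildcards.
    awk -v k="$_key" -v v="$_value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$_file" > "$_tmp" && cat "$_tmp" > "$_file"
    _status=$?
    rm -f "$_tmp"
    return "$_status"
}

# Nothing is changed unless the script is called with --apply (as root).
if [ "${1:-}" = "--apply" ]; then
    set_conf_key /etc/waagent.conf ResourceDisk.EnableSwap y &&
    set_conf_key /etc/waagent.conf ResourceDisk.SwapSizeMB 5120
fi
```

Writing the result back with cat keeps the original file's ownership and permissions. The agent typically needs a restart or a reboot before the new settings take effect.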