SAS on Azure architecture

Azure Virtual Machines
Azure Virtual Network
Azure Files

This solution runs SAS analytics workloads on Azure. The guidance covers various deployment scenarios. For instance, multiple versions of SAS are available. You can run SAS software on self-managed virtual machines (VMs). You can also deploy container-based versions by using Azure Kubernetes Service (AKS).

Architecture

Architecture diagram showing how to deploy SAS products on Azure.

The diagram contains a large rectangle with the label Azure Virtual Network. Inside it, another large rectangle has the label Proximity placement group. Two rectangles are inside it. They're stacked vertically, and each has the label Network security group. Each security group rectangle contains several computer icons that are arranged in rows. In the upper rectangle, the computer icons on the left side of the upper row have the label Mid tier. The icons on the right have the label Metadata tier. The lower row of icons has the label Compute tier. In the lower rectangle, the upper row of computer icons has the label MGS and MDS servers. The lower row has the label OSTs and OSS servers.

Download a Visio file of this architecture.

Workflow

SAS Azure deployments typically contain three layers:

  • An API or visualization tier. Within this layer:

    • The metadata tier gives client apps access to metadata on data sources, resources, servers, and users.
    • Web apps provide access to intelligence data in the mid tier.
  • A compute platform, where SAS servers process data.

  • A storage tier that SAS uses for permanent storage. Popular choices on Azure are:

    • Lustre
    • IBM Spectrum Scale
    • Network File System (NFS)

An Azure Virtual Network isolates the system in the cloud. Within that network:

  • A proximity placement group reduces latency between VMs.
  • Network security groups protect SAS resources from unwanted traffic.

Prerequisites

Before deploying a SAS workload, ensure the following components are in place:

  • A sizing recommendation from a SAS sizing team
  • A SAS license file
  • Access to a resource group for deploying your resources
  • A virtual central processing unit (vCPU) subscription quota that takes into account your sizing document and VM choice
  • Access to a secure Lightweight Directory Access Protocol (LDAP) server

Scenario details

Along with discussing different implementations, this guide also aligns with Microsoft Azure Well-Architected Framework tenets for achieving excellence in the areas of cost, DevOps, resiliency, scalability, and security. In addition to following this guide, consult a SAS team to validate your particular use case.

As partners, Microsoft and SAS are working to develop a roadmap for organizations that innovate in the cloud. Both companies are committed to ensuring high-quality deployments of SAS products and solutions on Azure.

Introduction to SAS

SAS analytics software provides a suite of services and tools for drawing insights from data and making intelligent decisions. SAS platforms fully support SAS solutions for areas such as data management, fraud detection, risk analysis, and visualization. SAS offers these primary platforms, which Microsoft has validated:

  • SAS Grid 9.4
  • SAS Viya

The following architectures have been tested:

  • SAS Grid 9.4 on Linux
  • SAS 9 Foundation
  • SAS Viya 3.5 with symmetric multiprocessing (SMP) and massively parallel processing (MPP) architectures on Linux
  • SAS Viya 2020 and up with an MPP architecture on AKS

This guide provides general information for running SAS on Azure, not platform-specific information. These guidelines assume that you host your own SAS solution on Azure in your own tenant. In that model, SAS doesn't host the solution for you. For information about the Azure hosting and management services that SAS does provide, see SAS Managed Application Services.

Recommendations

Consider the points in the following sections when designing your implementation.

SAS documentation provides requirements per core, meaning per physical CPU core. But Azure provides vCPU listings. On the VMs that we recommend for use with SAS, there are two vCPUs for every physical core. As a result, to calculate the value of a vCPU requirement, use half the core requirement value. For instance, a physical core requirement of 150 MBps translates to 75 MBps per vCPU. For more information on Azure computing performance, see Azure compute unit (ACU).
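The conversion above can be sketched as a small shell calculation. This is only an illustration: the function names are hypothetical, and it assumes the two-vCPUs-per-core ratio stated above. Shell arithmetic is integer-only, so odd inputs round down.

```shell
# Convert per-physical-core SAS requirements into per-vCPU and per-VM figures.
# Assumes 2 vCPUs per physical core, as stated for the recommended VMs.

# per_vcpu <per_core_value>: halve a per-core requirement.
per_vcpu() {
  echo $(( $1 / 2 ))
}

# vm_total <per_core_value> <vcpu_count>: throughput the whole VM must sustain.
vm_total() {
  echo $(( $1 * $2 / 2 ))
}

per_vcpu 150      # MBps per vCPU for a 150 MBps-per-core requirement
vm_total 150 32   # total MBps for a 32-vCPU (16-core) VM
```

The second call shows why the per-VM figure matters: the aggregate requirement, not the per-vCPU value, is what you compare against a VM's published disk and network throughput limits.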

Note

If you're scaling up and persisting data in a single-node SAS deployment (and not to an externalized file system), the SAS documentation recommends bandwidth of at least 150 MB/s. To achieve this bandwidth, you need to stripe multiple P30 Premium (or larger) disks.

Operating systems

Linux works best for running SAS workloads. SAS supports 64-bit versions of the following operating systems:

  • Red Hat 7 or later
  • SUSE Linux Enterprise Server (SLES) 12.2
  • Oracle Linux 6 or later

For more information about specific SAS releases, see the SAS Operating System support matrix. In environments that use multiple machines, it's best to run the same version of Linux on all machines. Azure doesn't support Linux 32-bit deployments.

To optimize compatibility and integration with Azure, start with an operating system image from Azure Marketplace. Using a custom image without additional configurations can degrade SAS performance.

Kernel issues

When choosing an operating system, be aware of a soft lockup issue that affects the entire Red Hat 7.x series. It occurs in these kernels:

  • Linux 3.x kernels
  • Versions earlier than 4.4

A problem with the memory and I/O management of Linux and Hyper-V causes the issue. When it occurs, the system logs contain entries like this one that mention a non-maskable interrupt (NMI):

Message from syslogd@ronieuwe-sas-e48-2 at Sep 13 08:26:08
kernel:NMI watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [swapper/12:0]

Another issue affects older versions of Red Hat. Specifically, it can happen in versions that meet these conditions:

  • Have Linux kernels that precede 3.10.0-957.27.2
  • Use non-volatile memory express (NVMe) drives

When the system experiences high memory pressure, the generic Linux NVMe driver might not allocate sufficient memory for a write operation. As a result, the system reports a soft lockup that stems from an actual deadlock.
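To see whether a VM runs a kernel older than the version named above, a check along the following lines might help. It is a sketch that relies on GNU `sort -V` for version ordering. The function and variable names are illustrative, and distribution suffixes such as `.el7.x86_64` can affect edge cases, so confirm borderline results manually.

```shell
# kernel_older_than <version> <threshold>: succeeds when <version> sorts
# strictly before <threshold> under GNU version ordering (sort -V).
kernel_older_than() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Threshold taken from the text above; the variable name is illustrative.
NVME_FIX_KERNEL="3.10.0-957.27.2"

# Report on the kernel that is currently running.
check_running_kernel() {
  if kernel_older_than "$(uname -r)" "$NVME_FIX_KERNEL"; then
    echo "kernel $(uname -r) predates $NVME_FIX_KERNEL: plan an upgrade"
  else
    echo "kernel $(uname -r) is not older than $NVME_FIX_KERNEL"
  fi
}

# Usage: check_running_kernel
```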

Upgrade your kernel to avoid both issues. Alternatively, try this possible workaround:

  • Set /sys/block/nvme0n1/queue/max_sectors_kb to 128 instead of using the default value, 512.
  • Change this setting on each NVMe device in the VM and on each VM boot.

Run these commands to adjust that setting:

# cat /sys/block/nvme0n1/queue/max_sectors_kb
512
# echo 128 >/sys/block/nvme0n1/queue/max_sectors_kb
# cat /sys/block/nvme0n1/queue/max_sectors_kb
128
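Because the setting has to be applied to every NVMe device and again after every boot, a loop such as the following sketch can be convenient. The function name and the optional sysfs-root argument are assumptions added for illustration (the argument makes the function easy to try against a test directory). On a real VM it must run as root.

```shell
# set_nvme_max_sectors [value] [sysfs_root]
# Writes <value> (default 128) to queue/max_sectors_kb of every NVMe namespace
# found under <sysfs_root> (default /sys) and prints the resulting value.
set_nvme_max_sectors() {
  value="${1:-128}"
  root="${2:-/sys}"
  for f in "$root"/block/nvme*n*/queue/max_sectors_kb; do
    [ -e "$f" ] || continue        # the glob matched nothing
    echo "$value" > "$f"
    echo "$f -> $(cat "$f")"
  done
}

# Usage on a VM, as root: set_nvme_max_sectors 128
```

To reapply the value on each boot, call the function from whatever boot-time mechanism your image already uses.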

VM sizing recommendations

SAS deployments often use the following VM SKUs:

Edsv5-series

VMs in the Edsv5-series are the default SAS machines for Viya and Grid. They offer these features:

  • Constrained cores. With many machines in this series, you can constrain the VM vCPU count.
  • A good CPU-to-memory ratio.
  • A high-throughput locally attached disk. I/O speed is important for folders like SASWORK and the Cloud Analytic Services (CAS) cache, CAS_CACHE, that SAS uses for temporary files.

If the Edsv5-series VMs are unavailable, use the prior generation. The Edsv4-series VMs have been tested and perform well on SAS workloads.

Ebsv5-series

In some cases, the locally attached disk doesn't have sufficient storage space for SASWORK or CAS_CACHE. To get a larger working directory, use the Ebsv5-series of VMs with premium attached disks. These VMs offer these features:

  • Same specifications as the Edsv5 and Esv5 VMs
  • High throughput to remote attached disks, up to 4 GB/s, so that you can make SASWORK or CAS_CACHE as large as needed while still meeting the I/O needs of SAS.

If the Edsv5-series VMs offer enough storage, it's better to use them because they're more cost efficient.

M-series

Many workloads use M-series VMs, including:

  • SAS Programming Runtime Environment (SPRE) implementations that use a Viya approach to software architecture.
  • Certain SAS Grid workloads.

M-series VMs offer these features:

  • Constrained cores
  • Up to 3.8 TiB of memory, suited for workloads that use a large amount of memory
  • High throughput to remote disks, which works well for the SASWORK folder when the locally available disk is insufficient

Ls-series

Certain I/O-heavy environments should use Lsv2-series or Lsv3-series VMs. In particular, implementations that require fast, low-latency I/O and a large amount of memory benefit from this type of machine. Examples include systems that make heavy use of the SASWORK folder or CAS_CACHE.

Note

SAS optimizes its services for use with the Intel Math Kernel Library (MKL).

  • With math-heavy workloads, avoid VMs that don't use Intel processors: the Lsv2 and Lasv3.
  • When selecting an AMD CPU, validate how the MKL performs on it.

Warning

When possible, avoid using Lsv2 VMs. Use Lsv3 VMs with Intel chipsets instead.

With Azure, you can scale SAS Viya systems on demand to meet deadlines:

  • By increasing the compute capacity of the node pool.
  • By using the AKS Cluster Autoscaler to add nodes and scale horizontally.
  • By temporarily scaling up infrastructure to accelerate a SAS workload.
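For the AKS-based options above, the following sketch shows how the Azure CLI might be used to enable the cluster autoscaler on a node pool or to set a fixed node count. The resource group, cluster, and pool names are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is an illustrative convenience for previewing commands.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical names; replace them with your own.
RG="${RG:-sas-viya-rg}"
CLUSTER="${CLUSTER:-sas-viya-aks}"
POOL="${POOL:-compute}"

# enable_autoscaler <min_nodes> <max_nodes>
# Lets the cluster autoscaler add and remove nodes between the two bounds.
enable_autoscaler() {
  run az aks nodepool update --resource-group "$RG" --cluster-name "$CLUSTER" \
    --name "$POOL" --enable-cluster-autoscaler --min-count "$1" --max-count "$2"
}

# scale_pool <node_count>
# Sets a fixed node count on a pool that doesn't use the autoscaler.
scale_pool() {
  run az aks nodepool scale --resource-group "$RG" --cluster-name "$CLUSTER" \
    --name "$POOL" --node-count "$1"
}

# Preview without touching Azure: DRY_RUN=1; enable_autoscaler 1 5
```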

Note

When scaling computing components, also consider scaling up storage to avoid storage I/O bottlenecks.

With Viya 3.5 and Grid workloads, Azure doesn't currently support horizontal or vertical scaling. Viya 2022 supports horizontal scaling.

Network and VM placement considerations

SAS workloads are often chatty. As a result, they can transfer a significant amount of data. With all SAS platforms, follow these recommendations to reduce the effects of chatter:

  • Deploy SAS and storage platforms on the same virtual network. This approach also avoids incurring peering costs.
  • Place SAS machines in a proximity placement group to reduce latency between nodes.
  • When possible, deploy SAS machines and VM-based data storage platforms in the same proximity placement group.
  • Deploy SAS and storage appliances in the same availability zone to avoid cross-zone latency. If you can't confirm your solution components are deployed in the same zone, contact Azure support.
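The placement recommendations above could be applied at creation time roughly as follows. This is a sketch under stated assumptions: all resource names, the region, and the zone number are hypothetical, the `run` helper is an illustrative dry-run wrapper, and you should confirm that your chosen VM size supports the zone, the proximity placement group, and accelerated networking together.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical values; replace them with your own.
RG="${RG:-sas-rg}"
LOCATION="${LOCATION:-eastus2}"
PPG="${PPG:-sas-ppg}"
ZONE="${ZONE:-1}"

# Create the proximity placement group that the SAS VMs will share.
create_ppg() {
  run az ppg create --resource-group "$RG" --name "$PPG" --location "$LOCATION"
}

# create_sas_vm <vm_name> <vm_size> <image>
# Creates a VM in the shared placement group and zone with accelerated networking.
create_sas_vm() {
  run az vm create --resource-group "$RG" --name "$1" --size "$2" --image "$3" \
    --ppg "$PPG" --zone "$ZONE" --accelerated-networking true
}

# Preview without touching Azure: DRY_RUN=1; create_ppg
```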

SAS has specific fully qualified domain name (FQDN) requirements for VMs. Set machine FQDNs correctly, and ensure that domain name system (DNS) services are working. You can set the names with Azure DNS. You can also edit the /etc/hosts file on each machine.

Note

Turn on accelerated networking on all nodes in the SAS deployment. When you turn this feature off, performance suffers significantly.

To turn on accelerated networking on a VM, follow these steps:

  1. Run this command in the Azure CLI to deallocate the VM:

    az vm deallocate --resource-group <resource_group_name> --name <VM_name>

  2. Verify that the VM is turned off.

  3. Run this command in the CLI:

    az network nic update -n <network_interface_name> -g <resource_group_name> --accelerated-networking true
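The steps above can be sketched as a single script. The `az vm start` call at the end is an addition that brings the VM back online after the change. The function name and the `run` dry-run helper are illustrative, not part of the Azure CLI.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# enable_accelerated_networking <resource_group> <vm_name> <nic_name>
enable_accelerated_networking() {
  # Deallocate the VM so that the network interface can be modified.
  run az vm deallocate --resource-group "$1" --name "$2"
  # Turn on accelerated networking on the network interface.
  run az network nic update --resource-group "$1" --name "$3" --accelerated-networking true
  # Start the VM again.
  run az vm start --resource-group "$1" --name "$2"
}

# Preview without touching Azure:
#   DRY_RUN=1; enable_accelerated_networking my-rg my-vm my-nic
```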

When you migrate data or interact with SAS in Azure, we recommend that you use one of these solutions to connect on-premises resources to Azure:

  • An Azure ExpressRoute circuit
  • A site-to-site virtual private network (VPN)

For production SAS workloads in Azure, ExpressRoute provides a private, dedicated, and reliable connection that offers these advantages over a site-to-site VPN:

  • Higher speed
  • Lower latency
  • Tighter security

Be aware of latency-sensitive interfaces between SAS and non-SAS applications. Consider moving data sources and sinks close to SAS.

Identity management

SAS platforms can use local user accounts. They can also use a secure LDAP server to validate users. We recommend running a domain controller in Azure. Then use the domain join feature to properly manage security access. If you haven't set up domain controllers, consider deploying Microsoft Entra Domain Services. When you use the domain join feature, ensure machine names don't exceed the 15-character limit.
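A quick pre-deployment check of planned machine names against the 15-character limit might look like the following sketch. The function names and the sample host names are hypothetical.

```shell
# valid_computer_name <name>: succeeds when the name is at most 15 characters.
valid_computer_name() {
  [ "${#1}" -le 15 ]
}

# check_name <name>: print a short verdict for one planned machine name.
check_name() {
  if valid_computer_name "$1"; then
    echo "$1: ok (${#1} characters)"
  else
    echo "$1: too long (${#1} characters, limit is 15)"
  fi
}

check_name sas-compute-01
check_name sas-viya-compute-node-01
```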

Note

In some environments, there's a requirement for on-premises connectivity or shared datasets between on-premises and Azure-hosted SAS environments. In these situations, we strongly recommend deploying a domain controller in Azure.

The Microsoft Entra Domain Services forest creates users that can authenticate against Microsoft Entra devices but not on-premises resources and vice versa.

Data sources

SAS solutions often access data from multiple systems. These data sources fall into two categories:

  • SAS datasets, which SAS stores in the SASDATA folder
  • Databases, which SAS often places a heavy load on

For best performance:

  • Position data sources as close as possible to SAS infrastructure.
  • Limit the number of network hops and appliances between data sources and SAS infrastructure.

Note

If you can't move data sources close to SAS infrastructure, avoid running analytics on them. Instead, run extract, transform, load (ETL) processes first and analytics later. Take the same approach with data sources that are under stress.

Permanent remote storage for SAS Data

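SAS and Microsoft have tested a series of data platforms that you can use to host SAS datasets. The SAS blogs document the results in detail, including performance characteristics. The tests include the following platforms, each of which is described later in this section:

  • Sycomp Storage Fueled by IBM Spectrum Scale (GPFS)
  • Azure Managed Lustre
  • Azure Files premium tier
  • Azure NetApp Files (NFS)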

SAS offers performance-testing scripts for the Viya and Grid architectures. The SAS forums provide documentation on tests with scripts on these platforms.

Sycomp Storage Fueled by IBM Spectrum Scale (GPFS)

For information about how Sycomp Storage Fueled by IBM Spectrum Scale meets performance expectations, see SAS review of Sycomp for SAS Grid.

For sizing, Sycomp makes the following recommendations:

  • Provide one GPFS scale node per eight cores with a configuration of 150 MBps per core.
  • Use a minimum of five P30 drives per instance.
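As a worked example of these sizing rules, the following sketch derives the node count, aggregate throughput target, and minimum drive count from a core count. It assumes that "instance" means one scale node and that partial groups of eight cores round up to a whole node. The function name is hypothetical.

```shell
# gpfs_sizing <physical_cores>
# Prints the scale-node count, the aggregate throughput target in MBps,
# and the minimum number of P30 drives.
gpfs_sizing() {
  cores="$1"
  nodes=$(( (cores + 7) / 8 ))     # one scale node per eight cores, rounded up
  throughput=$(( cores * 150 ))    # 150 MBps per core
  drives=$(( nodes * 5 ))          # at least five P30 drives per node
  echo "nodes=$nodes throughput_mbps=$throughput min_p30_drives=$drives"
}

gpfs_sizing 32
```

Treat the result as a starting point for the conversation with Sycomp and the SAS sizing team rather than as a final design.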

Azure Managed Lustre

Azure Managed Lustre is a managed file system created for high-performance computing (HPC) and AI workloads. Managed Lustre can run SAS 9 and Viya workloads in parallel. To optimize the performance of your file system, follow these recommendations:

  • When you deploy Managed Lustre, perform tuning on all client nodes to increase Lustre client readahead and optimize concurrency for SAS I/O patterns. Run the following command to perform this tuning:

    lctl set_param mdc.*.max_rpcs_in_flight=128 osc.*.max_pages_per_rpc=16M osc.*.max_rpcs_in_flight=16 osc.*.max_dirty_mb=1024 llite.*.max_read_ahead_mb=2048 osc.*.checksums=0 llite.*.max_read_ahead_per_file_mb=256

  • Enable accelerated networking on all SAS VMs.

  • To reduce network latency, place the SAS VMs in the same availability zone that Managed Lustre is deployed in.

Azure Files premium tier

Azure Files premium tier is a managed service that supports the NFS 4.1 protocol. It provides a cost-efficient, elastic, performant, and POSIX-compliant file system. The IOPS and throughput of NFS shares scale with the provisioned capacity. SAS has extensively tested the premium tier of Azure Files and found that performance is more than sufficient to power SAS installations.

You can use the nconnect mount option to improve performance. This option spreads I/O requests over multiple channels. For more information, see NFS performance.

When using an NFS Azure file share in Azure Files, consider the following points:

  • Adjust the provisioned capacity to meet performance requirements, because the IOPS and throughput of NFS shares scale with it. For more information, see NFS performance.
  • Mount shares with nconnect=4 so that I/O is spread efficiently across parallel channels.
  • Set read-ahead to 15 times the rsize and wsize. For most workloads, we recommend an rsize and wsize of 1 MB and a read-ahead setting of 15 MB. For more information, see Increase read-ahead size.
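The mount settings above can be sketched as follows. The option string uses the rsize, wsize, and nconnect values from the list. The `vers=4,minorversion=1,sec=sys` options, the placeholder share path, and the mount point are assumptions based on common Azure Files NFS usage, so verify them against the current Azure Files documentation. The script only prints the mount command.

```shell
# read_ahead_kb <rsize_bytes>: 15 times the rsize, expressed in KB.
read_ahead_kb() {
  echo $(( $1 * 15 / 1024 ))
}

# nfs_mount_options <rsize_bytes> <nconnect>: option string for mount -o.
nfs_mount_options() {
  echo "vers=4,minorversion=1,sec=sys,nconnect=$2,rsize=$1,wsize=$1"
}

RSIZE=1048576   # 1 MB

# Print the mount command and the matching read-ahead value for review.
echo "mount -t nfs -o $(nfs_mount_options "$RSIZE" 4) <storage_account>.file.core.windows.net:/<storage_account>/<share> /mnt/sasdata"
echo "read_ahead_kb=$(read_ahead_kb "$RSIZE")"
```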

Azure NetApp Files (NFS)

SAS tests have validated NetApp performance for SAS Grid. Specifically, testing shows that Azure NetApp Files is a viable primary storage option for SAS Grid clusters of up to 32 physical cores across multiple machines. When NetApp-provided optimizations and Linux features are used, Azure NetApp Files can be the primary option for clusters up to 48 physical cores across multiple machines.

Consider the following points when using this service:

  • Azure NetApp Files works well with Viya deployments. Don't use Azure NetApp Files for the CAS cache in Viya, because the write throughput is inadequate. If possible, use your VM's local ephemeral disk instead.
  • On SAS 9 Foundation with Grid 9.4, the performance of Azure NetApp Files with SAS for SASDATA files is good for clusters up to 32 physical cores. This increases to 48 cores when tuning is applied.
  • To ensure good performance, select the Premium or Ultra service level when deploying Azure NetApp Files. You can choose the Standard service level for very large volumes. Consider starting with the Premium level and switching to Ultra or Standard later. Service level changes can be done online, without disruption or data migrations.
  • Read performance and write performance differ for Azure NetApp Files. Write throughput for SAS hits limits at around 1600 MiB/s, while read throughput goes beyond that, to around 4500 MiB/s. If you need continuous high write throughput, Azure NetApp Files might not be a good fit.

NFS read-ahead tuning

To improve the performance of your SAS workload, it's important to tune the read-ahead kernel setting, which affects how NFS shares are mounted. When read-ahead is enabled, the Linux kernel can request blocks before any actual I/O by the application. The result is improved sequential read throughput. Most SAS workloads read many large files for further processing, so SAS benefits tremendously from large read-ahead buffers.

With Linux kernels 5.4 or later, the default read-ahead changed from 15 MB to 128 KB. The new default reduces read performance for SAS. To maximize your performance, increase the read-ahead setting on your SAS Linux VMs. SAS and Microsoft recommend that you set read-ahead to be 15 times the rsize and wsize. Ideally, the rsize and wsize are both 1 MB, and the read-ahead is 15 MB.

Setting the read-ahead on a virtual machine is straightforward. It requires adding a udev rule.
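A udev rule for this purpose might be generated as in the following sketch. The rule text follows a commonly documented pattern for NFS backing devices, but treat it, the file name `99-nfs-readahead.rules`, and the function name as assumptions to verify against the Azure Files read-ahead guidance. The value 15360 is 15 MB expressed in KB.

```shell
# write_readahead_rule [rules_dir]
# Writes a udev rule that sets read_ahead_kb to 15360 (15 MB) for NFS mounts.
# rules_dir defaults to /etc/udev/rules.d, which requires root.
write_readahead_rule() {
  rules_dir="${1:-/etc/udev/rules.d}"
  cat > "$rules_dir/99-nfs-readahead.rules" <<'EOF'
SUBSYSTEM=="bdi", ACTION=="add", PROGRAM="/usr/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes", ATTR{read_ahead_kb}="15360"
EOF
}

# On a VM, as root, you would then reload udev and remount the share:
#   write_readahead_rule
#   udevadm control --reload
```

The rule applies when a backing device is added, so shares that are already mounted keep their old read-ahead value until they're remounted.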

For Kubernetes, this process is more complex because it needs to be done on the host and not on the pod. SAS provides scripts for Viya on AKS that automatically set the read-ahead value on the host. For more information, see Using NFS Premium shares in Azure Files for SAS Viya on Kubernetes.

Other data sources

SAS platforms support various data sources:

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.

Security

Security provides assurances against deliberate attacks and the abuse of your valuable data and systems. For more information, see Overview of the security pillar.

The output of your SAS workloads can be one of your organization's critical assets. SAS output provides insight into internal efficiencies and can play a critical role in your reporting strategy. It's important, then, to secure access to your SAS architecture. To achieve this goal, use secure authentication and address network vulnerabilities. Use encryption to help protect all data moving in and out of your architecture.

Azure delivers SAS by using an infrastructure as a service (IaaS) cloud model. Microsoft builds security protections into the service at the following levels:

  • Physical datacenter
  • Physical network
  • Physical host
  • Hypervisor

Carefully evaluate the services and technologies that you select for the areas above the hypervisor, such as the guest operating system for SAS. Make sure to provide the proper security controls for your architecture.

SAS currently doesn't fully support Microsoft Entra ID. For authentication into the visualization layer for SAS, you can use Microsoft Entra ID. But for back-end authorization, use a strategy that's similar to on-premises authentication. When managing IaaS resources, you can use Microsoft Entra ID for authentication and authorization to the Azure portal. When using Microsoft Entra Domain Services, you can't authenticate guest accounts. Guest attempts to sign in will fail.

Use network security groups to filter network traffic to and from resources in your virtual network. With these groups, you can define rules that grant or deny access to your SAS services. Examples include:

  • Giving access to CAS worker ports from on-premises IP address ranges.
  • Blocking access to SAS services from the internet.
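The two example rules could be expressed with the Azure CLI roughly as follows. This is a sketch: the resource group, network security group, rule names, and priorities are hypothetical, the port range is passed in because it depends on your SAS configuration, and the `run` helper is an illustrative dry-run wrapper.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical names; replace them with your own.
RG="${RG:-sas-rg}"
NSG="${NSG:-sas-compute-nsg}"

# allow_cas_from_onprem <on_premises_cidr> <port_range>
allow_cas_from_onprem() {
  run az network nsg rule create --resource-group "$RG" --nsg-name "$NSG" \
    --name AllowCasFromOnPrem --priority 200 --direction Inbound --access Allow \
    --protocol Tcp --source-address-prefixes "$1" --destination-port-ranges "$2"
}

# Deny all inbound traffic that originates from the internet.
deny_internet_inbound() {
  run az network nsg rule create --resource-group "$RG" --nsg-name "$NSG" \
    --name DenyInternetInbound --priority 4000 --direction Inbound --access Deny \
    --protocol '*' --source-address-prefixes Internet --destination-port-ranges '*'
}

# Preview without touching Azure: DRY_RUN=1; deny_internet_inbound
```

Lower priority numbers are evaluated first, so the allow rule for on-premises ranges takes precedence over the broader deny rule.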

You can use Azure Disk Encryption for encryption within the operating system. This solution uses the DM-Crypt feature of Linux. But we currently don't recommend using Azure Disk Encryption. It can severely degrade performance, especially when you use SASWORK files locally.

Server-side encryption (SSE) of Azure Disk Storage protects your data. It also helps you meet organizational security and compliance commitments. With Azure managed disks, SSE encrypts the data at rest when persisting it to the cloud. This behavior applies by default to both OS and data disks. You can use platform-managed keys or your own keys to encrypt your managed disk.

Protect your infrastructure

Control access to the Azure resources that you deploy. Every Azure subscription has a trust relationship with a Microsoft Entra tenant. Use Azure role-based access control (Azure RBAC) to grant users within your organization the correct permissions to Azure resources. Grant access by assigning Azure roles to users or groups at a certain scope. The scope can be a subscription, a resource group, or a single resource. Make sure to audit all changes to infrastructure.

Manage remote access to your VMs through Azure Bastion. Don't expose any of these components to the internet:

  • VMs
  • Secure Shell Protocol (SSH) ports
  • Remote Desktop Protocol (RDP) ports

Deploy this scenario

It's best to deploy workloads using an infrastructure as code (IaC) process. SAS workloads can be sensitive to misconfigurations that often occur in manual deployments and reduce productivity.

When building your environment, see the quickstart reference material at CoreCompete SAS 9 or Viya on Azure.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal authors:

Other contributors:


Next steps

For help getting started, see the following resources:

For help with the automation process, see the following templates that SAS provides: