What's New in Microsoft HPC Pack 2012
Applies To: Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2
This document lists the new features and changes that are available in Microsoft® HPC Pack 2012.
In this topic:
Deployment
Windows Azure integration
Node management
Job scheduling
Runtimes and development
Deployment
Operating system and software requirements have changed. HPC Pack 2012 has an updated set of requirements for operating system and other prerequisite software.
The following table shows the operating system requirements of HPC Pack 2012 for the different node roles.
Role: Operating system requirement

Head node: Windows Server® 2012. Note: Windows Server 2008 R2 is no longer supported.
Compute node: Windows Server 2012 or Windows Server 2008 R2. Note: Windows Server® 2008 is no longer supported.
WCF broker node: Windows Server 2012. Note: Windows Server 2008 R2 is no longer supported.
Workstation node: Windows® 8 or Windows® 7.
Unmanaged server node: Windows Server 2012 or Windows Server 2008 R2. Note: Windows Server 2008 is no longer supported.
Windows Azure node: Windows Server 2012 or Windows Server 2008 R2.
Client computer (with only the client utilities installed): Windows Server 2012, Windows 8, Windows Server 2008 R2, Windows 7, Windows Server 2008, or Windows Vista®.
In addition to the operating system, HPC Pack 2012 updates the requirements for prerequisite software as follows:
HPC Pack 2012 requires and supports Microsoft® SQL Server® 2008 R2 or SQL Server 2012 to manage the HPC cluster databases. HPC Pack 2012 no longer supports SQL Server 2008.
HPC Pack 2012 requires Microsoft® .NET Framework 4. Installation of SQL Server® 2008 R2 or SQL Server 2012 still requires .NET Framework 3.5.
HPC Pack 2012 now supports the Server Core installation option of the Windows Server 2012 operating system for the following node roles:
Compute node
WCF Broker node
Unmanaged server node
HPC Services for Excel is not supported on the Server Core installation option of the Windows Server 2012 operating system.
The SQL permissions required for remote database setup have been reduced. You no longer need to be a member of the SQL Server sysadmin role to install HPC Pack 2012 with remote databases. Before you install HPC Pack 2012 with remote databases, ask the database administrator to run the SetupHpcDatabase.cmd script in the Setup folder or to manually perform or modify the tasks in the script. For details, see 1.3. Decide if you want to deploy your cluster with remote databases in Step 1 of the Getting Started Guide.
Windows Azure integration
Nodes can be deployed in Windows Azure deployments in which Windows Azure Virtual Network is available. Virtual Network securely extends your enterprise network to Windows Azure, which allows applications that run on Windows Azure nodes to access resources on the enterprise network. With Virtual Network, you can build traditional site-to-site virtual private networks (VPNs) to scale data centers, and create hybrid applications that span from an on-premises HPC cluster to Windows Azure. This feature gives applications the means to access files in shared folders and to reach servers for license validation.
The number of proxy nodes is configurable. Cluster administrators can now configure the number of proxy nodes that a Windows Azure deployment uses. Proxy nodes facilitate communication between on-premises head nodes and nodes hosted in Windows Azure. You can specify that the deployment uses a fixed number of proxy nodes, or that the deployment uses a number of proxy nodes that varies with the size of the deployment. Proxy nodes use Medium size Azure worker roles.
An application VHD can be specified to provision Azure worker role instances. Cluster administrators now can specify an application virtual hard disk (VHD) that is automatically mounted when Windows Azure worker role instances are provisioned.
All Windows Azure burst communication that goes through TCP port 443 can now use HTTPS. In this release, all communication through TCP port 443 between the on-premises cluster and Windows Azure burst node deployments can be configured to use the HTTPS protocol. In previous releases, NetTcp communication was used for the Service-Oriented Architecture (SOA), job scheduling, and file staging services to Windows Azure deployments. HTTPS communication is allowed in many enterprise environments in which non-HTTPS traffic on port 443 is blocked. By default, NetTcp communication continues to be configured for these services in HPC Pack 2012, because it improves performance for some Windows Azure burst deployments.
HPC Pack 2012 cluster nodes can be deployed on Windows Azure virtual machines. Using Windows Azure virtual machines, an administrator or independent software vendor (ISV) can create and run an HPC cluster and workload fully in Windows Azure with only minimal or no investment in on-premises infrastructure. The domain controller for the cluster can be either on-premises (if there is an existing enterprise domain) or in Windows Azure. You have options to deploy separate virtual machines for the head node, Microsoft SQL Server, and the Active Directory domain controller. You can add Windows Azure compute nodes to the cluster in the same way that you “burst” (add) Windows Azure nodes to an on-premises HPC cluster, or use Windows Azure virtual machines to deploy additional cluster roles. For steps to start using Windows Azure virtual machines, see Microsoft HPC Pack in a Windows Azure Virtual Machine.
Windows Azure trace log data can be written to persistent storage. This release allows the Windows Azure trace log data that is generated on Windows Azure nodes to be written to persistent Windows Azure table storage, instead of to local storage on the role instances. The trace log data facilitates troubleshooting of problems that can occur with Windows Azure burst node deployments. Writing the trace log data to persistent table storage is enabled by setting the AzureLoggingEnabled cluster property to True with the HPC PowerShell Set-HpcClusterProperty cmdlet. This affects all new deployments (not existing ones), and logs only Critical, Error, and Warning error messages to the WADLogsTable in the storage account associated with each deployment.
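For example, the following HPC PowerShell commands, run on the head node, turn persistent logging on before you create a new Windows Azure node deployment and turn it off again afterward. This is a minimal sketch that assumes the AzureLoggingEnabled property is exposed directly as a parameter of Set-HpcClusterProperty.
# Enable writing of Windows Azure trace log data to persistent table storage
# for deployments created after this point (assumes -AzureLoggingEnabled is a
# parameter of Set-HpcClusterProperty).
Set-HpcClusterProperty -AzureLoggingEnabled $true

# When troubleshooting is finished, turn persistent logging off again.
Set-HpcClusterProperty -AzureLoggingEnabled $false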
Important
Logging of Windows Azure deployment activities uses table storage space and generates storage transactions on the storage account associated with each deployment; the storage space and the storage transactions are billed to your account. Activity logging is generally enabled only when problems occur with a deployment and is used to aid in troubleshooting. After you disable logging, the logs are not automatically removed from Windows Azure storage. You may want to download the logs and keep them for future reference. You can clean up the log entries by removing the WADLogsTable from your storage account.
Windows Azure HPC Scheduler for HPC Pack 2012 integrates with Windows Azure SDK 1.8. The Windows Azure HPC Scheduler that is compatible with HPC Pack 2012 requires both the Windows Azure HPC Scheduler SDK 1.8 (available for download from the Microsoft Download Center) and version 1.8 of the Windows Azure SDK for .NET x64.
Node management
New features are available for discovering and managing the power plan or power scheme setting for on-premises nodes. The Windows power plan or power scheme setting on the compute nodes in your HPC cluster can affect the performance of the cluster. To help identify and alleviate performance problems that result from power plan settings, you now can run the Active Power Scheme Report diagnostic test, which reports the power plan that is active on the nodes. You can run this diagnostic test on on-premises nodes to ensure that the active power plan is the desired one, or to verify that the power plan has not changed unexpectedly. Unexpected changes to the power plan might result from a group policy, for example. The desired power plan in most cases is High performance. You can find the Active Power Scheme Report diagnostic test in Diagnostics, under System, and then under System Configuration.
Additionally, you can now add a new Power Scheme Setting node template task for on-premises nodes that changes the power plan of the nodes to any of the default power plans, or to a specific power plan that you or your organization has created. This new node template task can run when you deploy nodes from bare-metal, or when you add preconfigured nodes to the HPC cluster. Also, the node template task can run when you run maintenance on existing nodes. You can find the Power Scheme Setting node template task in the Maintenance section of compute node templates.
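Outside of the diagnostic test and the node template task, you can also inspect or change the power plan across nodes from a command prompt on the head node. The following is a minimal sketch that uses the clusrun command and the built-in powercfg utility; the node group name ComputeNodes is an assumption about your cluster.
# Report the active power plan on every node in the ComputeNodes group.
clusrun /nodegroup:ComputeNodes powercfg /getactivescheme

# Switch those nodes to the built-in High performance plan (powercfg alias SCHEME_MIN).
clusrun /nodegroup:ComputeNodes powercfg /setactive SCHEME_MIN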
Job scheduling
Custom job and task properties can be changed at run time. The owner of a job or a cluster administrator now can change the values of custom-defined job and task properties at any time.
A job can depend on another job. You now can specify one or more parent jobs when you submit a job. The HPC Job Scheduler Service considers a dependent job for scheduling only after its parent jobs finish.
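As an illustration, the following HPC PowerShell sketch submits a post-processing job that depends on a parent simulation job. The -ParentJobIds parameter name and the application command lines are assumptions; check the New-HpcJob Help in your installation for the exact way to specify parent jobs.
# Submit the parent job first and capture the job object so that its ID can be reused.
$parent = New-HpcJob -Name "Simulation"
Add-HpcTask -Job $parent -CommandLine "simulate.exe input.dat"
Submit-HpcJob -Job $parent

# Create a job that the scheduler considers only after the parent job finishes.
# -ParentJobIds is assumed here to be the parameter that declares the dependency.
$post = New-HpcJob -Name "PostProcessing" -ParentJobIds $parent.Id
Add-HpcTask -Job $post -CommandLine "aggregate.exe results"
Submit-HpcJob -Job $post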
Jobs can be allocated to nodes based on their memory requirements. You can set a job property, EstimatedProcessMemory, to the estimated maximum amount of memory that a process in the job will consume. The HPC Job Scheduler Service uses this value to help allocate the job to cluster nodes that have sufficient memory resources to perform the job. By default, job allocation does not take the memory requirements of the job into account.
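A minimal sketch of setting this property from HPC PowerShell follows. The -EstimatedProcessMemory parameter name and the unit (megabytes) are assumptions; consult the New-HpcJob Help for the exact parameter and units.
# Create a job whose processes are each expected to use up to about 2 GB of memory,
# so the scheduler can place it on nodes with enough free memory.
# -EstimatedProcessMemory (in MB) is assumed to be exposed as a New-HpcJob parameter.
$job = New-HpcJob -Name "MemoryIntensiveRun" -EstimatedProcessMemory 2048
Add-HpcTask -Job $job -CommandLine "myapp.exe"
Submit-HpcJob -Job $job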
Additional flexibility is provided for scheduling jobs on node groups. You now can define a value for a property, NodeGroupOp, to specify an operator that affects how the HPC Job Scheduler Service uses node groups to allocate resources to a job. This new flexibility is useful for scheduling jobs on hybrid clusters where some nodes are Windows Azure nodes, and some nodes are on-premises compute nodes, workstation nodes, or unmanaged server nodes. Of the on-premises nodes, some can have attributes that certain applications require, such as additional memory or cores. A submission sketch follows the list of values below.
The NodeGroupOp property can have the following values:

Intersect: The job runs only on nodes that belong to all of the node groups in the list. Prior to HPC Pack 2012, this was the only behavior available when you specified node groups for a job. You can use this option to define the combination of attributes that nodes must have to be appropriate for the job. For example, if you have a node group for nodes that have at least 4 gigabytes (GB) of memory, and another node group for nodes that have at least 8 cores, you specify those node groups for the NodeGroups property and Intersect for the NodeGroupOp property to run an application on nodes that have at least 4 GB of memory and at least 8 cores.

Union: The job runs only on nodes that belong to any node group in the list. For example, you specify ComputeNodes and WorkstationNodes for the NodeGroups property and Union for the NodeGroupOp property to run an application on either compute nodes or workstation nodes. Because you did not include AzureNodes in the node group list for the job, the job always runs on on-premises resources.

Uniform: The job runs only on nodes that all belong to the same node group in the list. For example, the Uniform value is useful for hybrid clusters that contain on-premises compute nodes and Windows Azure nodes. You may want to run an application in either environment, but not allow the application to span both on-premises and Windows Azure nodes concurrently. You can obtain this behavior if you specify ComputeNodes and AzureNodes for the NodeGroups property and Uniform for the NodeGroupOp property.
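The following HPC PowerShell sketch creates a job that uses the Uniform operator so that it runs entirely on either on-premises compute nodes or Windows Azure nodes, but never on a mixture of the two. The -NodeGroupOp parameter name is an assumption about how the property is surfaced; the node group names match the examples above.
# Run the job on nodes from exactly one of the listed groups: all on-premises
# compute nodes, or all Windows Azure nodes, but not both at once.
# -NodeGroupOp is assumed here to be exposed as a New-HpcJob parameter.
$job = New-HpcJob -Name "HybridCapableRun" -NodeGroups "ComputeNodes","AzureNodes" -NodeGroupOp Uniform
Add-HpcTask -Job $job -CommandLine "myapp.exe"
Submit-HpcJob -Job $job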
You can specify exit codes other than 0 that indicate success for tasks. Jobs and tasks now have a ValidExitCodes property that you can use to specify a list of exit codes that indicate successful task completion. If no list is specified, then 0 is the only task exit code that indicates success. If you specify a value for the ValidExitCodes property for a job, the specified list of successful exit codes applies to all tasks within the job, unless you override that list by specifying a different value for the ValidExitCodes property of the task itself.
This feature is especially useful for some commercial applications that do not always return 0 when they run successfully. The list of successful exit codes applies only to the exit code of the task itself; exit codes generated while the task is being set up are not covered. For example, if the file specified for the standard output of the task is not valid, the exit code is 3 for "file not found." Even if you include 3 in the list of values that you specify for the ValidExitCodes property of the task, the task still fails, because the failure occurs during task configuration and is not the result of the task starting and finishing.
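For example, the following sketch treats exit codes 0, 1, and 2 as successful for every task in a job and then narrows the list for one task. The -ValidExitCodes parameter name and the list format are assumptions about how the property is surfaced in HPC PowerShell.
# Treat exit codes 0, 1, and 2 as success for every task in the job.
# -ValidExitCodes is assumed to be exposed on New-HpcJob and Add-HpcTask.
$job = New-HpcJob -Name "CommercialApp" -ValidExitCodes "0,1,2"

# Override the job-level list for this task: only 0 and 10 count as success.
Add-HpcTask -Job $job -CommandLine "solver.exe model.in" -ValidExitCodes "0,10"
Submit-HpcJob -Job $job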
The most recent output of a task is cached, rather than the start of the output. Starting with this release, HPC Pack caches the most recent 4000 characters per task, not the first 4000 characters.
The HoldUntil job property can be set without cluster administrator permissions, and can be set any time before the job runs. Prior to this release, you could only set the HoldUntil property if you were a cluster administrator, and could only set this property on jobs in the queued state. In this release, all users can set this property, and they can change this property for a job any time before the job starts running.
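As a sketch, the following shows a job owner holding a queued job so that it does not start before a given time. The -HoldUntil parameter of Set-HpcJob, the pipeline usage, and the date format are assumptions; the job ID is a placeholder.
# Hold job 142 so that the HPC Job Scheduler Service does not start it before 10:00 PM.
# -HoldUntil is assumed to be exposed as a Set-HpcJob parameter; 142 is a placeholder job ID.
Get-HpcJob -Id 142 | Set-HpcJob -HoldUntil "2012-12-14 22:00"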
A job can run on a single node without reserving all of the resources on the node. You now can run a job on a single node without reserving all the cores of the node. For example, prior to this release, if you specified that a job should run on a minimum of 2 cores and a maximum of 4 cores, the job could run on 2 cores, each located on a different node. You can now specify that a job should run on a minimum of 2 cores and a maximum of 4 cores, but still must run on a single node. This feature provides more efficient use of cluster resources for jobs that must run on a single node, such as OpenMP applications.
You can specify whether dependent tasks run if a parent task fails. You can now set a new job property named FailDependentTasks to specify whether dependent tasks should continue if a parent task fails or is canceled. The property is set to false by default; in that case, all dependent tasks continue to run even if some of the parent tasks fail or are canceled. If you set this property to true, all dependent tasks fail when any parent task fails or is canceled.
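A minimal sketch follows, assuming the property is exposed as a -FailDependentTasks parameter of New-HpcJob and that the task dependency is declared with the -DependsOn parameter of Add-HpcTask.
# If any parent task fails or is canceled, fail its dependent tasks instead of running them.
$job = New-HpcJob -Name "Pipeline" -FailDependentTasks $true

Add-HpcTask -Job $job -Name "Prepare" -CommandLine "prepare.exe"
# This task depends on Prepare and fails automatically if Prepare fails or is canceled.
Add-HpcTask -Job $job -Name "Analyze" -CommandLine "analyze.exe" -DependsOn "Prepare"
Submit-HpcJob -Job $job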
Task-level preemption is now the default preemption behavior. In Queued scheduling mode, the default option for preemption behavior is now task-level immediate preemption, rather than job-level preemption. This new default behavior means that only as many tasks of low priority jobs are preempted as are needed to provide the resources required for the higher priority jobs, rather than preempting all of the tasks in the low priority jobs.
HPC Basic Profile Web Service is removed. The HPC Basic Profile Web Service was deprecated as of HPC Pack 2008 R2 with Service Pack 2 (SP2), and has been removed in this release. Instead, use the Web Service Interface, which is an HTTP web service that is based on the representational state transfer (REST) model. For information about the Web Service Interface, see Working with the Web Service Interface.
Runtimes and development
Easier monitoring of SOA jobs and sessions is available. You now can use HPC Cluster Manager or HPC Job Manager to view detailed information about the progress of SOA jobs and sessions, and to view message-level traces for SOA sessions. You can also export SOA traces and share them offline with support personnel.
The common data framework is extended to work on Windows Azure. This release extends support of the common data framework for service-oriented architecture (SOA) applications to Windows Azure burst deployments, using Windows Azure blob storage to stage data. Client applications can use the SOA common data API as before and expect the data to be available on Windows Azure compute nodes.
Collective operations in the Microsoft Message Passing Interface can now take advantage of hierarchical processor topologies. In modern HPC clusters, communication between Message Passing Interface (MPI) ranks located on the same nodes is up to an order of magnitude faster than the communication between MPI ranks on different nodes. Similarly, on Non-Uniform Memory Access (NUMA) hardware, communication within a socket happens significantly faster than communication across sockets on the same machine. In this release, MPI applications can use hierarchical collective operations to take advantage of hierarchical processor topologies, by minimizing the number of messages and bytes sent over the slower links where possible.
This feature is enabled through the mpiexec MSMPI_HA_COLLECTIVE environment variable; a usage sketch follows the list of affected operations below. For more details, see the mpiexec command-line Help.
This feature applies to the following MPI collective operations:
MPI_Barrier
MPI_Bcast
MPI_Reduce
MPI_Allreduce
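The following command line is a minimal sketch of enabling hierarchy-aware collectives for a run. The value 1 is an assumption about how the variable is switched on, and the rank count and application name are placeholders; see the mpiexec command-line Help for the supported values.
# Run 128 ranks with hierarchy-aware collective operations enabled.
# The value 1 is assumed to enable the feature; check the mpiexec Help for details.
mpiexec -n 128 -env MSMPI_HA_COLLECTIVE 1 MyMpiApp.exe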
Microsoft MPI now offers a tuning framework to help configure the appropriate algorithmic selection for collective operations. In this release, Microsoft MPI can run basic collective performance benchmarks and optimize the specific algorithms used for the collectives based on the cluster configuration. This facility is exposed through the following mpiexec environment variables (a usage sketch follows the list):
MSMPI_TUNE_COLLECTIVE
MSMPI_TUNE_PRINT_SETTINGS
MSMPI_TUNE_SETTINGS_FILE
MSMPI_TUNE_TIME_LIMIT
MSMPI_TUNE_VERBOSE
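As a sketch, the following command runs an application with tuning enabled and saves the selected algorithm settings to a file. The values assigned to the variables, the time limit unit, and the UNC path are all assumptions; only the variable names come from the list above.
# Benchmark the collectives for this cluster, limit the tuning time, and save the
# chosen algorithm settings to a share (values and path are illustrative assumptions).
mpiexec -n 64 -env MSMPI_TUNE_COLLECTIVE 1 -env MSMPI_TUNE_TIME_LIMIT 600 -env MSMPI_TUNE_SETTINGS_FILE \\headnode\share\msmpi_tuning.txt MyMpiApp.exe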
Microsoft MPI can use message compression to improve performance over the socket interconnect. In this release, Microsoft MPI can compress messages before sending them over the sockets channel. Compression potentially reduces the amount of time applications spend waiting for communication, in exchange for the additional processing time needed to perform the compression.
Compression only applies to message traffic over the sockets channel. The performance impact of this feature depends on the configuration of the hardware for the HPC cluster and the entropy of the data being communicated between MPI ranks.
This feature can be enabled through the mpiexec MSMPI_COMPRESSION_THRESHOLD environment variable.
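A minimal sketch follows. The threshold value is an assumption, interpreted here as the message size in bytes above which compression is applied; see the mpiexec command-line Help for the actual semantics.
# Compress messages larger than roughly 16 KB before they are sent over the sockets channel.
# The interpretation of the value as a byte threshold is an assumption.
mpiexec -n 64 -env MSMPI_COMPRESSION_THRESHOLD 16384 MyMpiApp.exe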
Microsoft MPI offers enhanced diagnostics. To help troubleshoot errors, Microsoft MPI can capture the program state when MPI errors are encountered. The system can be configured to capture individual process core dumps at various levels of detail after an application terminates abnormally. This facility is controlled through the MSMPI_DUMP_MODE mpiexec environment variable.
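A sketch of enabling dump capture for a run follows. The value passed to MSMPI_DUMP_MODE, and its meaning, are assumptions; the mpiexec command-line Help lists the supported dump levels.
# Ask MS-MPI to capture per-process dumps if the application terminates abnormally.
# The value 1 is assumed to select a basic (mini) dump level; higher levels may capture more state.
mpiexec -n 32 -env MSMPI_DUMP_MODE 1 MyMpiApp.exe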
Microsoft MPI offers enhanced affinity support. In this release, Microsoft MPI supports a richer set of options for process placement. The system is aware of the processor topology on the nodes and is able to configure process layout based on this information. Typical configurations include:
One process per NUMA node to support hybrid parallelism (MPI/OpenMP)
Sequential process allocation across physical cores, to maximize locality
Spread process allocation, to maximize aggregate bandwidth
This facility is exposed by the mpiexec affinity and affinity_layout command-line parameters, or the MPIEXEC_AFFINITY environment variable.
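For example, the following sketches show the affinity switch and an explicit layout. The layout string shown for affinity_layout (a spread placement) is an assumption about the syntax; the mpiexec command-line Help describes the accepted layout values.
# Pin each rank to cores chosen by the default affinity policy.
mpiexec -n 32 -affinity MyMpiApp.exe

# Request a spread placement to maximize aggregate memory bandwidth.
# The layout string "spr" is an assumption; see the mpiexec Help for the exact syntax.
mpiexec -n 32 -affinity_layout spr MyMpiApp.exe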