About over-the-air updates
Updates are an important part of the Azure Sphere security model because they embody the property of renewable security. Ensuring that updates take place regularly helps keep your devices compliant with the seven properties of highly secured devices. Azure Sphere devices check for updates when they first connect to the internet after powering on or after the Reset button is pressed. Thereafter, checks occur at regular intervals (currently 20 hours).
There are three types of updates: pre-requisite updates, OS updates, and deployment updates. Pre-requisite updates ensure that the components the update process itself relies upon, currently the Trusted Key Store (TKS) and the certificate store, are up to date. The TKS is used to authenticate the images to be downloaded and installed, while the certificate store validates internet connections. An OS update targets the Microsoft-supplied software on the device, including the normal-world operating system that your applications run in, as well as lower-level firmware such as the Pluton subsystem and the Security Monitor. Deployment updates target your own software: your high-level and real-time capable applications and board configuration images (if any). Pre-requisite and OS updates are managed by Azure Sphere; deployment updates are coordinated by Azure Sphere based on deployments created by your organization.
In order for any device to receive pre-requisite or OS updates:
- It must be connected to the internet.
- Networking requirements must be appropriately configured.
In order for any device to update its application and board configuration images:
- It must not have the application development capability.
- It must be claimed by a catalog.
- It must belong to a device group.
- The device group to which it belongs must be targeted by a deployment.
- The deployment must contain application images (and optionally a board configuration image) created by or on behalf of your organization.
- The device group must have the UpdateAll update policy. You can disable application updates for a particular device group by using the az sphere device-group update command.
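The conditions above can be read as a checklist. The following is a minimal sketch of that checklist in Python; it is an illustrative model only, not part of the Azure Sphere SDK or CLI. The `DeviceState` type, its field names, and `blocking_reasons` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeviceState:
    """Hypothetical snapshot of the facts that gate deployment updates."""
    has_app_development_capability: bool
    catalog: Optional[str]        # None if the device is unclaimed
    device_group: Optional[str]   # None if the device is not in a group
    # Images in the deployment targeting the group; empty if there is none.
    deployment_images: List[str] = field(default_factory=list)
    update_policy: str = "UpdateAll"


def blocking_reasons(device: DeviceState) -> List[str]:
    """Return every condition that prevents deployment updates."""
    reasons = []
    if device.has_app_development_capability:
        reasons.append("device has the application development capability")
    if device.catalog is None:
        reasons.append("device is not claimed by a catalog")
    if device.device_group is None:
        reasons.append("device does not belong to a device group")
    if not device.deployment_images:
        reasons.append("no deployment with application images targets the device group")
    if device.update_policy != "UpdateAll":
        reasons.append("device group does not have the UpdateAll update policy")
    return reasons
```

An empty list means that, in this model, nothing prevents the device from receiving its application and board configuration images.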
For all devices in a given device group, deployments targeting that device group are considered the source of truth for imaging those devices. Any images on the device that do not exist in the deployment will be removed from the device. The one exception, as of Azure Sphere OS 21.04, is that board configuration images are not removed unless they are replaced with a new board configuration image.
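To make the "source of truth" rule concrete, here is a small Python sketch of how the set of images on a device could be reconciled against a deployment. It is a simplified model under stated assumptions: images are identified by an ID and a type string, and the function name `reconcile` and the type label `"board_config"` are hypothetical, not Azure Sphere identifiers.

```python
from typing import Dict, Set, Tuple

BOARD_CONFIG = "board_config"  # hypothetical type label for this sketch


def reconcile(installed: Dict[str, str],
              deployment: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Compare installed images with the deployment.

    Both arguments map image ID -> image type.
    Returns (image IDs to install, image IDs to remove).
    """
    to_install = set(deployment) - set(installed)
    deployment_has_board_config = BOARD_CONFIG in deployment.values()

    to_remove = set()
    for image_id, image_type in installed.items():
        if image_id in deployment:
            continue  # already matches the deployment
        if image_type == BOARD_CONFIG and not deployment_has_board_config:
            continue  # board configs are removed only when replaced
        to_remove.add(image_id)
    return to_install, to_remove
```

In this model, an installed board configuration image survives a deployment that contains only applications, but is removed when the deployment supplies a different board configuration image.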
The device update check occurs in three phases corresponding to the three types of updates:
- In the first phase, Azure Sphere obtains a manifest listing the current versions of the TKS and certificate store. If the TKS and certificate store on the device are up to date, the update continues with the second phase. If not, the current images are downloaded and installed.
- In the second phase, Azure Sphere obtains a manifest listing the current versions of the various OS component images. If any images on the device are out of date, the current images are downloaded along with rollback images, which can be used to return the device to a known good state if the update process fails. The OS and rollback images are downloaded and stored in a staging area on the device, then the OS images are installed and the device is rebooted.
- In the third phase, Azure Sphere checks for deployment updates if the device group is accepting them. As with the OS update, rollback images for the applications are also staged as needed. Application and rollback images are downloaded and stored in the staging area, then the application images are installed.
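The manifest comparison at the heart of each phase can be sketched as follows. This Python model only illustrates the comparison: it assumes each manifest is a simple mapping from component name to version string, and it does not model downloading, staging, rollback images, or rebooting. The function names, component names, and version labels are all hypothetical.

```python
from typing import Dict, List

PHASES = ("prerequisite", "os", "deployment")


def outdated(installed: Dict[str, str], manifest: Dict[str, str]) -> List[str]:
    """Return the components whose installed version differs from the manifest."""
    return sorted(name for name, version in manifest.items()
                  if installed.get(name) != version)


def plan_update_check(installed: Dict[str, Dict[str, str]],
                      manifests: Dict[str, Dict[str, str]],
                      group_accepts_deployments: bool) -> Dict[str, List[str]]:
    """List, per phase, the components that would need to be downloaded."""
    plan = {}
    for phase in PHASES:
        if phase == "deployment" and not group_accepts_deployments:
            plan[phase] = []  # third phase skipped for this device group
            continue
        plan[phase] = outdated(installed.get(phase, {}), manifests.get(phase, {}))
    return plan
```

For example, a device whose certificate store and application are behind the manifests, in a group that accepts deployment updates, would get a plan with one entry under `"prerequisite"` and one under `"deployment"`.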
Update rollback
Each part of the update process includes a rollback option. In the pre-requisite update, the rollback image is simply a backup of the pre-update state. If the update fails, the pre-update state is restored.
Rollback at any level forces rollback at all higher levels: if any firmware image fails to boot, both the firmware and application partitions are rolled back.
For the OS update, either a signature verification failure or runtime difficulties can trigger a rollback. In case of a signature verification failure, an attempt is made to correct the image; if this fails, a full rollback is triggered. In a full rollback, the staged rollback images are installed for both the OS and applications.
OS updates and deployments have independent release cycles, so multiple deployments can occur between OS updates. If this happens, the rollback target for the applications is not the most recent deployment but the deployment that was in place at the time of the last OS update. This ensures that the OS and applications work together in the rolled-back state.
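The choice of rollback target can be illustrated with a short Python sketch. It assumes the device's history is available as an ordered list of completed updates, which is a simplification for illustration; the `app_rollback_target` function and the event labels are hypothetical.

```python
from typing import List, Optional, Tuple

# (kind, label), where kind is "os" or "deployment"; oldest event first.
Event = Tuple[str, str]


def app_rollback_target(history: List[Event]) -> Optional[str]:
    """Return the deployment that was in place at the last OS update."""
    current_deployment = None
    target = None
    for kind, label in history:
        if kind == "deployment":
            current_deployment = label
        elif kind == "os":
            # Snapshot the deployment that is known to work with this OS.
            target = current_deployment
    return target


history = [
    ("deployment", "D1"),
    ("os", "OS-A"),
    ("deployment", "D2"),
    ("deployment", "D3"),
]
target = app_rollback_target(history)  # "D1", not the most recent "D3"
```

Deployments D2 and D3 arrived after the last OS update, so in this model neither becomes the rollback target.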
Interrupted updates
If an update is interrupted, for example by a power outage or loss of connectivity, there are four possible scenarios for each update type:
- If a complete set of images was successfully downloaded and staged but not yet installed, the installation will complete when power is restored.
- If some but not all of the images were downloaded and staged, the update will continue downloading missing images and then proceed to installation.
- If an update is interrupted during installation after download is complete, the install will restart on boot.
- If no image was completely downloaded, the update process will begin fresh when power is restored, as there will be nothing ready to install.
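The four scenarios above amount to a decision based on what is already in the staging area. A minimal Python sketch of that decision follows; the `Resume` enum, the `resume_action` function, and its parameters are hypothetical names used only to illustrate the logic.

```python
from enum import Enum
from typing import Set


class Resume(Enum):
    RESTART_UPDATE = "begin the update process fresh"
    CONTINUE_DOWNLOAD = "download missing images, then install"
    INSTALL_STAGED = "install the staged images"
    RESTART_INSTALL = "restart the installation on boot"


def resume_action(expected: Set[str], staged: Set[str],
                  install_started: bool) -> Resume:
    """Decide how an interrupted update resumes.

    expected: images the update requires.
    staged: images that were completely downloaded and staged.
    install_started: True if installation had begun before the interruption.
    """
    if not staged:
        return Resume.RESTART_UPDATE       # nothing is ready to install
    if expected - staged:
        return Resume.CONTINUE_DOWNLOAD    # some images are still missing
    if install_started:
        return Resume.RESTART_INSTALL      # download done, install interrupted
    return Resume.INSTALL_STAGED           # fully staged, not yet installed
```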
Updates in power-down scenarios
Azure Sphere supports low-power scenarios that enable devices to be powered down for extended periods to conserve battery life. In such scenarios, it is important that the device be allowed to check for updates periodically. The Power Down sample application demonstrates how to properly reduce power consumption while still ensuring the device will periodically stay awake to check for OS and app updates.
Deferred updates
To prevent critical tasks from being interrupted by updates, high-level applications can incorporate deferred updates. This feature allows the application to complete its critical tasks and then prepare for shutdown so that the update can proceed. The DeferredUpdate sample demonstrates how to implement a deferred update.