Windows 8.1 Audio streaming – Part 2: Power savings via H/W offload

 

Overview

As I mentioned in my previous blog post, one of the policies that is determined by the audio category relates to power savings. To lower the power consumption of audio, we need to wake up the CPU less frequently and put less load on it (by offloading more functionality to H/W that is dedicated to audio, such as the audio codec or the audio DSP). This feature is called Hardware-Offloaded Audio Processing (HAP) or simply audio offload.

 

Audio buffer sizes in Windows 8.1

To explain this functionality better, it makes sense to start with the first step of streaming audio: the application writes the audio data into a buffer, so that it can be processed by the audio stack. There are 2 available buffer sizes:

  1. 10 ms (this is called the “host pin” or the “system pin”)

    1. The CPU wakes up every 10ms, fills in the 10ms buffer and can go to lower power states.
  2. 1 second (this is called the “offload pin”)

    1. The CPU wakes up every 1sec, fills in the 1-second buffer and can go to lower power states.

There is a trade-off between using each of the two buffer sizes.

The 10ms buffer leads to:

  1. Lower latency (if the application has just filled the buffer and immediately needs to play a “ding” tone, it only needs to wait 10ms until the next pass)

  2. High power consumption (we need to wake up the CPU every 10ms to write data into the audio buffer)

In general, we want to use this buffer for anything that requires low latency, such as VOIP calls, alerts, sound effects, etc.

The 1 second buffer leads to:

  1. High latency (after the data is written into the buffer, we need to wait 1 second before being able to write the next “ding” into the buffer)

  2. Low power consumption (we wake up the CPU every 1 second)

In general, we want to use the 1-second buffer when we play audio files or movies. In that case, we want the battery to last as long as possible, and waiting a few seconds for an mp3 file or a movie to start is not important.
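To put rough numbers on this trade-off, here is a small Python sketch. It is only an illustration, not part of any Windows API: the stream format (48 kHz, 16-bit, stereo PCM) is an assumption, and the names `PcmFormat`, `wakeups_per_minute` and `buffer_bytes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmFormat:
    # Assumed format for illustration only: 48 kHz, 16-bit, stereo PCM.
    sample_rate_hz: int = 48_000
    channels: int = 2
    bytes_per_sample: int = 2


def wakeups_per_minute(buffer_ms: int) -> float:
    """How often the CPU must wake up to refill a buffer of this length."""
    return 60_000 / buffer_ms


def buffer_bytes(fmt: PcmFormat, buffer_ms: int) -> int:
    """Size of a buffer that holds buffer_ms milliseconds of audio."""
    frames = fmt.sample_rate_hz * buffer_ms // 1000
    return frames * fmt.channels * fmt.bytes_per_sample


fmt = PcmFormat()
for name, ms in (("host pin", 10), ("offload pin", 1000)):
    print(f"{name}: {wakeups_per_minute(ms):.0f} wake-ups/min, "
          f"{buffer_bytes(fmt, ms)} bytes per buffer")
```

Under these assumptions, the 10ms buffer holds 1,920 bytes and needs 6,000 wake-ups per minute, while the 1-second buffer holds 192,000 bytes and needs only 60.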

 

H/W and S/W requirements to support audio offload

Hopefully, with the above use cases in mind, it now makes sense why we’ve tied the 1-second buffers to the following 2 categories:

  1. BackgroundCapableMedia: Mostly used for audio playback

  2. ForegroundOnlyMedia: Mostly used for video playback

More specifically, one of the requirements for using the 1-second buffers is to use one of the two categories above. However, things become a little more complicated when we take into consideration that not all H/W and drivers support this feature, and when we include both “screen-on” and “screen-off” scenarios.

So, here are the requirements for an application to use the 1-second buffer when the screen is on:

  1. The application needs to be a Store application (offload is not available to Win32 apps; I will explain the reasons below)

  2. The application needs to set the audio category of a stream as BackgroundCapableMedia or ForegroundOnlyMedia

  3. The H/W needs to be offload-capable (either the DSP or the audio codec)

  4. The driver needs to support audio offload.

    1. Note: Even though I have not talked much about audio drivers yet, I’d like to point out that only drivers that use the WaveRT audio miniport model can support audio offload

In addition to all the above requirements, the following requirements need to be met for an application to use the 1-second buffer when the screen is off (also known as Low Power Audio or LPA):

  1. The application needs to set the audio category of a stream as BackgroundCapableMedia

  2. The application needs to declare in its manifest that it is background-capable (for more information: https://msdn.microsoft.com/en-us/library/windows/apps/hh700367.aspx)

  3. The H/W needs to support the Connected Standby (CS) low power state

Intel has a whitepaper for LPA at https://software.intel.com/en-us/articles/low-power-audio-playback-windows-store-whitepaper
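The two requirement lists above can be summarized as a pair of checks. The following Python sketch is only a model of those rules as stated in this post; `StreamContext` and both functions are hypothetical names, not Windows APIs, and the real decision is made by the audio stack and the driver.

```python
from dataclasses import dataclass

# The two categories that this post ties to the 1-second buffer.
OFFLOAD_CATEGORIES = {"BackgroundCapableMedia", "ForegroundOnlyMedia"}


@dataclass(frozen=True)
class StreamContext:
    # Hypothetical description of an app, its stream and the system.
    is_store_app: bool
    category: str
    hardware_offload_capable: bool
    driver_supports_offload: bool
    manifest_background_capable: bool = False
    hardware_supports_connected_standby: bool = False


def can_offload_screen_on(ctx: StreamContext) -> bool:
    """Screen-on requirements: Store app, category, H/W and driver support."""
    return (ctx.is_store_app
            and ctx.category in OFFLOAD_CATEGORIES
            and ctx.hardware_offload_capable
            and ctx.driver_supports_offload)


def can_offload_screen_off(ctx: StreamContext) -> bool:
    """Screen-off (LPA) adds three requirements on top of the screen-on ones."""
    return (can_offload_screen_on(ctx)
            and ctx.category == "BackgroundCapableMedia"
            and ctx.manifest_background_capable
            and ctx.hardware_supports_connected_standby)


video = StreamContext(True, "ForegroundOnlyMedia", True, True)
print(can_offload_screen_on(video), can_offload_screen_off(video))
```

In this model, a ForegroundOnlyMedia stream can be offloaded while the screen is on, but never qualifies for the screen-off case, because LPA requires BackgroundCapableMedia.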

 

Offload diagram

Apart from waking up the CPU less often (which is achieved by using a larger buffer), we also need to ask the CPU to do less work in order to achieve lower power consumption. In the diagram that I included in my previous post, I showed that the audio data is passed from the application to the Audio Device Graph (audiodg). AudioDG loads the Audio Engine (AudioEng), which allows 3rd-party DLLs called APOs (Audio Processing Objects) to process the audio stream. This is the path that is used with 10ms buffers. However, for 1-second buffers, Windows expects all the processing to be done in H/W. As a result, in Windows 8.1 we do not load APOs when we use the offload pin (1-second buffers). Here is a diagram that shows this:

 

To clarify, audio effects can be implemented in two places:

  1. In S/W: Audio Device Graph (audiodg) loads 3rd-party DLLs, which are called Audio Processing Objects (APOs)

    1. This option is only available when there is no offload (i.e. for 10ms buffers)
  2. In H/W: By the audio codec or the DSP

    1. This option is available regardless of whether we use offload (1-second buffers) or not (10ms buffers)

 

FAQ: Why is offload not supported for Win32 apps?

One more point that I’d like to make here is that Windows only requires drivers to support 2 offload pins. It’s up to H/W developers to decide whether they want to (or can) support more. Each offload pin corresponds to one stream that can be offloaded. So, if the H/W has 2 offload pins, 2 streams can be offloaded (i.e. use the 1-second buffer) at any point in time. The 3rd stream (and every stream after that) that tries to use an offload pin will be switched to the non-offloaded path (10ms buffers). Implementing an offload pin takes a lot of H/W resources, which is why most H/W cannot support many (i.e. 3+) offloaded streams simultaneously.
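The fallback behavior described above can be modeled as a small pool of pins. This Python sketch is only an illustration under the assumption of 2 offload pins; `OffloadPinPool` is a hypothetical name, and the real allocation is done by the audio stack and the driver.

```python
class OffloadPinPool:
    """Toy model: a fixed number of offload pins, with fallback to the host pin."""

    def __init__(self, offload_pins: int = 2):
        self._free = offload_pins   # offload pins that are currently unused
        self._offloaded = set()     # ids of streams that hold an offload pin

    def open_stream(self, stream_id: str) -> str:
        """Return "offload" if a pin is free, otherwise fall back to "host"."""
        if self._free > 0:
            self._free -= 1
            self._offloaded.add(stream_id)
            return "offload"
        return "host"

    def close_stream(self, stream_id: str) -> None:
        """Release the offload pin, if this stream was holding one."""
        if stream_id in self._offloaded:
            self._offloaded.remove(stream_id)
            self._free += 1


pool = OffloadPinPool()
paths = [pool.open_stream(name) for name in ("music", "movie", "podcast")]
print(paths)
```

With 2 pins, the first two streams take the offloaded path and the third falls back to the host pin. A pin only becomes available again once a stream that holds one is closed.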

This makes it easier to understand why we offer support for offload only to Store apps and not to Win32 apps. Most Store apps are not background-capable, so we can suspend their audio streams when they go into the background (or are minimized, etc). This means that the offload pin that they occupied becomes available for other apps to use. However, Win32 apps do not get suspended when they go into the background. So, if a Win32 application, such as Windows Media Player, is sitting in the background with a paused stream, we cannot take its offload pin and make it available to other apps. Since most Win32 apps have a long process lifetime (i.e. users let them run in the background), if we allowed them to offload audio streams, they would easily grab the two offload pins and keep them unused for long periods of time.

FAQ: Can a system have only S/W audio effects and support audio offload?

The last question, which H/W manufacturers often ask, is whether the following combination is possible:

  1. Support audio offload

  2. Implement audio effects in S/W (as APOs that are loaded by audiodg)

  3. Do not implement any audio effects in H/W

Unfortunately, this combination is not possible in Windows 8.1. All systems that support audio offload need to have their audio effects implemented in H/W. If the audio effects were implemented as APOs that are loaded by audiodg, then sometimes they would be applied to the audio stream (when the audio stream is going through audiodg) and sometimes they would not be applied (when the audio stream is being offloaded). As a result, the user would experience inconsistent behavior (the same file would sound different depending on whether the stream is offloaded or not).
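The inconsistency is easy to see in a toy model. The Python sketch below only illustrates the rule described in this post (APOs run on the host path only, H/W effects run on both paths); the function name and the effect names are made up for the example.

```python
def effects_applied(path: str, apo_effects: list, hw_effects: list) -> list:
    """Return the effects that a stream would receive on the given path.

    path is "host" (10ms buffers, through audiodg) or "offload" (1-second buffers).
    """
    applied = []
    if path == "host":
        # S/W effects (APOs) are only loaded when the stream goes through audiodg.
        applied.extend(apo_effects)
    # H/W effects are applied by the codec or DSP on both paths.
    applied.extend(hw_effects)
    return applied


# S/W-only effects: the result depends on whether the stream is offloaded.
print(effects_applied("host", ["equalizer"], []))
print(effects_applied("offload", ["equalizer"], []))

# H/W effects: the result is the same on both paths.
print(effects_applied("host", [], ["equalizer"]))
print(effects_applied("offload", [], ["equalizer"]))
```

With S/W-only effects, the two paths produce different results for the same stream, which is exactly the behavior that the requirement is meant to prevent.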

If a H/W manufacturer cannot implement the audio effects in H/W and still wants to support audio offload, they can choose to implement the audio effects in kernel mode (loaded by the audio drivers). This makes driver development more complicated, but Windows does provide this option.

Additional resources:

  1. MSDN link for Audio offload: https://msdn.microsoft.com/en-us/library/windows/hardware/dn302038%28v=vs.85%29.aspx

  2. MSDN link for Low Power Audio (LPA): https://msdn.microsoft.com/en-us/library/windows/hardware/dn621143(v=vs.85).aspx

  3. Intel’s whitepaper for Low Power Audio: https://software.intel.com/en-us/articles/low-power-audio-playback-windows-store-whitepaper

Comments

  • Anonymous
    February 18, 2015
    awesome article!

  • Anonymous
    April 11, 2015
    1 second (this is called the “offload pin”) The CPU wakes up every 10ms, fills in the 1-second buffer and can go to lower power states. I think the above statement has to be changed. Instead of 10ms it should be 1sec.

  • Anonymous
    April 11, 2015
    Please clarify my doubt: In system pin path, double buffering is used? First the application copies data to the audio engine buffer exposed to application by the audio engine. Then the audio engine copies the processed audio data to the driver mapped buffer. Is this the flow?