Audio in Vista, the big picture

So I've talked a bit about some of the details of the Vista audio architecture, but I figure a picture's worth a bunch of text, so here's a simple version of the audio architecture:

This picture is for "shared" mode; I'll talk about exclusive mode in a future post.

The picture looks complicated, but in reality it isn't.  There are a boatload of new constructs to discuss here, so bear with me a bit.

The flow of audio samples through the audio engine is represented by the arrows - data flows from the application, to the right in this example.

The first thing to notice is that once the audio leaves the application, it flows through a very simple graph - the topology is quite straightforward, but it's a graph nonetheless, and I tend to refer to samples as moving through the graph.

Starting from the left, the audio system introduces the concept of an "audio session".  An audio session is essentially a container for audio streams.  In general there is only one session per process, although this isn't strictly true.

Next, we have the application that's playing audio.  The application (using WASAPI) renders audio to a "Cross Process Transport".  The CPT's job is to get the audio samples to the audio engine running in the Windows Audio service.

In general, the terminal nodes in the graph are transports.  Three transports ship with Vista: the cross process transport I mentioned above, a "Kernel Streaming" transport (used for rendering audio to a local audio adapter), and an "RDP Transport" (used for rendering audio over a Remote Desktop Connection).

As the audio samples flow from the cross process transport to the kernel streaming transport, they pass through a series of Audio Processing Objects, or APOs.  APOs are used to provide DSP on the audio samples.  Some examples of the APOs shipped in Vista are:

  • Volume - The volume APO provides mute and gain control.
  • Format Conversion - The format converter APOs (there are several) provide data format conversion - int to float32, float32 to int, etc.
  • Mixer - The mixer APO mixes multiple audio streams.
  • Meter - The meter APO remembers the peak and RMS values of the audio samples pumped through it.
  • Limiter - The limiter APO prevents audio samples from clipping when rendering.
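
To make the list above a bit more concrete, here's a rough sketch (in Python, purely for illustration) of what each kind of APO does to a buffer of samples.  None of this is the real APO interface: the function names, the mono list-of-floats buffer, and the order in which the stages are chained are all assumptions made for the example, and the limiter is a crude whole-buffer peak scaler standing in for a real one.

```python
import math
from typing import List, Sequence

# Hypothetical stand-ins for the APOs listed above; not the Vista interfaces.

def int16_to_float(pcm: Sequence[int]) -> List[float]:
    """Format conversion: 16-bit integer PCM to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in pcm]

def float_to_int16(buf: Sequence[float]) -> List[int]:
    """Format conversion back to 16-bit PCM, saturating at the int16 range."""
    return [max(-32768, min(32767, round(s * 32768.0))) for s in buf]

def apply_volume(buf: Sequence[float], gain: float, mute: bool = False) -> List[float]:
    """Volume: scale every sample by a gain, or by zero when muted."""
    g = 0.0 if mute else gain
    return [s * g for s in buf]

def mix(streams: Sequence[Sequence[float]]) -> List[float]:
    """Mixer: sum equal-length streams sample by sample."""
    return [sum(frame) for frame in zip(*streams)]

class Meter:
    """Meter: remembers the peak and RMS of the last buffer it saw."""

    def __init__(self) -> None:
        self.peak = 0.0
        self.rms = 0.0

    def process(self, buf: Sequence[float]) -> List[float]:
        if buf:
            self.peak = max(abs(s) for s in buf)
            self.rms = math.sqrt(sum(s * s for s in buf) / len(buf))
        return list(buf)  # samples pass through unchanged

def limit(buf: Sequence[float], ceiling: float = 1.0) -> List[float]:
    """Limiter (simplified): scale the whole buffer down if its peak exceeds the ceiling."""
    peak = max((abs(s) for s in buf), default=0.0)
    if peak <= ceiling:
        return list(buf)
    scale = ceiling / peak
    return [s * scale for s in buf]

def render_example() -> List[float]:
    """Two streams go through conversion and volume, then get mixed, metered, and limited."""
    stream_a = apply_volume(int16_to_float([16384, -16384, 32767]), gain=1.0)
    stream_b = apply_volume(int16_to_float([16384, 16384, 16384]), gain=1.0)
    meter = Meter()
    return limit(meter.process(mix([stream_a, stream_b])))
```

Calling render_example() pushes two short streams through the whole chain; the mixed signal peaks above full scale, so the limiter stage scales it back down before it would reach the output.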

All of the code above runs in user mode except for the audio driver at the very end.

Comments

  • Anonymous
    March 07, 2006
    I can't see anything in Firefox or Konqueror (same rendering engine as Apple's Safari)

  • Anonymous
    March 07, 2006
    I did include a VML warning :(

    I don't know how to get the image to work using firefox unfortunately :(

  • Anonymous
    March 07, 2006
    I've put a screengrab up at:

    http://www.visuar.com/vistasharedaudiostack.gif

    Larry: you can download my image and put that up on your web/blog host and use that instead of the VML...

  • Anonymous
    March 07, 2006
    This thing is excessively broken on Safari.  There's text apparently from the VML drawing all over the post, and I can't select any of the underlying text.  Also there's no drawing at all.  I had to tab through the entire navigation bar to get to the comment button...

    Vorn

  • Anonymous
    March 07, 2006
    Ok, vml fixed.

  • Anonymous
    March 07, 2006
    Can 3rd parties write their own transports and/or APOs, i.e. will there be publicly documented interfaces for implementing them?

    In particular I'm interested in writing a transport similar to the RDP transport to route audio to a remote network device.

  • Anonymous
    March 07, 2006
    Sean, yes, IHVs will have the ability to write APOs for their audio solution.

    I'm not 100% on the transport issue.

  • Anonymous
    March 07, 2006
    What about ISVs writing an APO, e.g. a graphic equalizer that is independent of any particular hardware audio solution?

    If so then if ISVs can't write their own transport I could get my APO inserted into the graph which would copy the audio samples to the target network device and allow the samples to continue through the graph to the local audio driver.

  • Anonymous
    March 07, 2006
    Hey Larry,

    I think you could reach a bigger audience if you just took a screenshot of the page in IE and replaced the VML with the image of the screenshot. I created a GIF image from the page in IE and it was only 45 KB, so I don't think that bandwidth would be a big deal.

  • Anonymous
    March 07, 2006
    During this series, can you work in a discussion of how Secure Audio Path fits in?

  • Anonymous
    March 07, 2006
    All I want from Vista’s Audio is this:

    I will go home. (it is there today).
    I will open my Tablet PC. (it is there today).
    I will start a game over my wireless network (it is there today).

    I will hear the game’s sound over my surround speakers at home wirelessly, either using the media edition PC that is there in the house or any other way. I want a wireless sound driver, not streaming :-) (It does not exist today)

  • Anonymous
    March 07, 2006
    G.T., I know I've read that there are people who are actively investigating wireless speaker solutions, so there's no reason to believe that it won't work in the future.

  • Anonymous
    March 14, 2006
    With the new audio stuff in vista, is it possible for the user to push a slider or something that merges all audio channels to one speaker? Occasionally one speaker of my headphones will break and some of the songs I listen to make heavy use of stereo effects and it's kind of annoying.

  • Anonymous
    March 14, 2006
    asdf, actually there is.  The multimedia control panel applet lets you choose the output format of the speaker.  Just choose a mono format and you'll get mono (assuming your audio solution supports mono).

  • Anonymous
    April 21, 2006
    Is the audio a distinct User Mode process, and if so how is the process scheduled vs other processes?
