What's actually on an audio CD?

Before I can talk about reading audio CDs using DAE (Digital Audio Extraction), I need to talk a bit about what data's actually on the audio CD, since we're going to be reading the raw audio data from the CD.

An audio CD contains stereo audio sampled at 44.1kHz. Each sample is 16 bits long and there are two channels, so one second of audio takes up 44,100 x 2 x 2 = 176,400 bytes. This will become important during playback, since we'll need to tell the audio adapter what it's rendering. There's a lot of math and physics that went into figuring out those numbers; essentially, 44.1kHz/16 bits is "good enough" to be considered full fidelity by most people.
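The byte rate falls straight out of those format parameters. Here's a minimal sketch of the arithmetic in Python. The field names are borrowed from the Windows WAVEFORMATEX structure purely for illustration, and the helper name `cd_audio_format` is made up for this example; nothing here talks to a real audio adapter.

```python
def cd_audio_format():
    """Describe CD audio using WAVEFORMATEX-style field names (illustrative only)."""
    samples_per_sec = 44_100   # 44.1kHz sample rate
    channels = 2               # stereo
    bits_per_sample = 16       # each sample is 16 bits

    # One "block" is one sample for every channel: 2 channels * 2 bytes.
    block_align = channels * bits_per_sample // 8
    # Bytes consumed by one second of audio.
    avg_bytes_per_sec = samples_per_sec * block_align

    return {
        "nSamplesPerSec": samples_per_sec,
        "nChannels": channels,
        "wBitsPerSample": bits_per_sample,
        "nBlockAlign": block_align,
        "nAvgBytesPerSec": avg_bytes_per_sec,
    }


fmt = cd_audio_format()
print(fmt["nBlockAlign"], fmt["nAvgBytesPerSec"])  # 4 176400
```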

The data on the audio CD is laid out in "sectors", each of which contains 1/75th of a second's worth of data. So each sector is 176,400 / 75 = 2352 bytes in length.

That number will be critical to the reading process later, since we'll be reading the data from the CD one sector at a time.
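To make the sector arithmetic concrete, here's a small Python sketch. The constant and function names are hypothetical, chosen for this example only, and the "flat stream" it indexes into is an assumption about how you might buffer the raw audio once it has been read; it is not part of any Windows API.

```python
BYTES_PER_SECOND = 176_400    # 44.1kHz * 2 channels * 2 bytes per sample
SECTORS_PER_SECOND = 75       # each sector holds 1/75th of a second
BYTES_PER_SECTOR = BYTES_PER_SECOND // SECTORS_PER_SECOND


def sectors_for_duration(seconds):
    """Number of sectors that cover a whole number of seconds of audio."""
    return seconds * SECTORS_PER_SECOND


def sector_byte_range(sector_index):
    """Byte offsets (start, end) of one sector within a flat stream of raw audio."""
    start = sector_index * BYTES_PER_SECTOR
    return start, start + BYTES_PER_SECTOR


print(BYTES_PER_SECTOR)          # 2352
print(sectors_for_duration(60))  # 4500
print(sector_byte_range(75))     # (176400, 178752)
```

Sector 75 starts exactly one second (176,400 bytes) into the stream, which is a handy sanity check on the numbers.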

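Note that the sector size a CDROM drive uses for data discs is 2048 bytes, not 2352. The smaller size makes life easier for applications; the remaining bytes of each raw sector carry sync, header, and error-correction information.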

When the CDROM filesystem sees an audio CD, it reads the track database off the CD (I'll talk about that when I'm showing the code that reads the database) and synthesizes a tiny RIFF file for CD audio.  The file contains enough information for Windows to hand to the media player to have it play the media back.

 

How Stuff Works has a pretty good tutorial on how the actual data is laid out on the CD here.

Oh, and they have a much better series of examples of how digital audio is sampled here.

Prof. Kelin Kuhn has an awesome breakdown of the details in the course materials for UW's EE498 course.

Comments

  • Anonymous
    April 29, 2005
    Jeremy: Typically you'd use a program such as IsoBuster to convert that RAW data into user data (or, in VCDs, MPEG frames, which are written in a different format). Of course, if you're implementing it yourself that doesn't help you, but it might be valuable for comparing inputs and outputs. The last link in Larry's post might help you decipher the data; actual C sources are available online if you'd rather not translate the mathematics and tables, though I'm not sure where. I remember looking into this a couple of years ago.

    Oh, ISOs and bins can contain raw or user data, but they generally stick to the usual formats that you pointed out. Bins in particular can come in a wide variety of non-standard formats. Some formats, like tao, can only contain user data.

    Hope any of that helps!