Intro to Audio Programming, Part 3: Synthesizing Simple Wave Audio using C#

If you’ve been following this series, you’re probably thinking, “Finally! He is going to show us some code!”

Well, I hate to disappoint you. So I’ll go ahead and show some code.

We’ve already discussed how audio is represented and what the WAV format looks like. The time has come to put these concepts into practice.

WaveFun! Wave Generator

The app we will be building is really not all that flashy. It just generates a simple waveform with 1 second of audio and plays it back for you. Nothing configurable or anything like that. Trust me, though, the flashy stuff is coming.

>> DOWNLOAD THE TASTY CODE HERE (18.8 KB) <<

If one giant button on a form isn’t the pinnacle of UI design, I have no idea what to do in this world.

Anyway, this is what the structure of the app looks like:

image

Chunk Wrappers (Chunks.cs)

(oooh, delicious!)

The first thing we care about is Chunks.cs, which contains wrappers for the header and two chunks that we learned about in the last article.

Let’s look at the code for the WaveHeader wrapper class. Note the data types we use instead of just “int.” The strings will be converted to character arrays later when we write the file. If you don’t convert them, BinaryWriter writes each string with a length prefix in front of it, and that extra byte ruins your file. dwFileLength is initialized to zero, but is determined later (retroactively) after we have written the stream and we know how long the file is.

 public class WaveHeader
 {
     public string sGroupID; // RIFF
     public uint dwFileLength; // total file length minus 8 (the "RIFF" tag plus this length field)
     public string sRiffType; // always WAVE
  
     /// <summary>
     /// Initializes a WaveHeader object with the default values.
     /// </summary>
     public WaveHeader()
     {
         dwFileLength = 0;
         sGroupID = "RIFF";
         sRiffType = "WAVE";
     }
 }
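
To see what the conversion to character arrays buys us, here is a minimal sketch that writes the same four-character tag both ways into a MemoryStream and hands back the bytes. The HeaderWriteDemo class and its method names are made up for illustration and are not part of the WaveFun download.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class HeaderWriteDemo
{
    // Writes the tag as a string: BinaryWriter emits a length prefix first.
    public static byte[] WriteAsString(string tag)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(tag);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Writes the tag as a char array: only the characters themselves are emitted.
    public static byte[] WriteAsChars(string tag)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(tag.ToCharArray());
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

For "RIFF", WriteAsString hands back 5 bytes (a length byte of 4 followed by the four letters), while WriteAsChars hands back exactly the 4 bytes the WAV format expects.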

Next up is the code for the Format chunk wrapper class. Again, note that the datatypes are consistent with the wave file format spec. Also note that we can explicitly set the chunk size in the constructor to 16 bytes, because the size of this chunk never changes (just add up the number of bytes taken up by each field after dwChunkSize: 2 + 2 + 4 + 4 + 2 + 2 = 16).

 public class WaveFormatChunk
 {
     public string sChunkID;         // Four bytes: "fmt "
     public uint dwChunkSize;        // Length of the rest of this chunk in bytes (16)
     public ushort wFormatTag;       // 1 (MS PCM)
     public ushort wChannels;        // Number of channels
     public uint dwSamplesPerSec;    // Sample rate in Hz... 44100
     public uint dwAvgBytesPerSec;   // for estimating RAM allocation
     public ushort wBlockAlign;      // sample frame size, in bytes
     public ushort wBitsPerSample;   // bits per sample
  
     /// <summary>
     /// Initializes a format chunk with the following properties:
     /// Sample rate: 44100 Hz
     /// Channels: Stereo
     /// Bit depth: 16-bit
     /// </summary>
     public WaveFormatChunk()
     {
         sChunkID = "fmt ";
         dwChunkSize = 16;
         wFormatTag = 1;
         wChannels = 2;
         dwSamplesPerSec = 44100;
         wBitsPerSample = 16;
         wBlockAlign = (ushort)(wChannels * (wBitsPerSample / 8));
         dwAvgBytesPerSec = dwSamplesPerSec * wBlockAlign;            
     }
 }
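
If you want to check the arithmetic behind those derived fields, the following sketch repeats it as standalone functions. FormatMath and its method names are hypothetical and exist only for this example; the formulas are the same ones used in the constructor above.

```csharp
using System;

// Hypothetical helper for illustration only; mirrors the WaveFormatChunk constructor.
public static class FormatMath
{
    // Bytes in one sample frame: one sample for every channel.
    public static ushort BlockAlign(ushort channels, ushort bitsPerSample)
    {
        return (ushort)(channels * (bitsPerSample / 8));
    }

    // Bytes of audio data consumed per second of playback.
    public static uint AvgBytesPerSec(uint samplesPerSec, ushort blockAlign)
    {
        return samplesPerSec * blockAlign;
    }

    // Size of the format chunk body: the six fields that follow dwChunkSize.
    public static uint FormatChunkSize()
    {
        return (uint)(sizeof(ushort)    // wFormatTag
                    + sizeof(ushort)    // wChannels
                    + sizeof(uint)      // dwSamplesPerSec
                    + sizeof(uint)      // dwAvgBytesPerSec
                    + sizeof(ushort)    // wBlockAlign
                    + sizeof(ushort));  // wBitsPerSample
    }
}
```

With the defaults from this article (2 channels, 16 bits, 44100 Hz), the block align comes out to 4 bytes and the average byte rate to 176,400 bytes per second.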

Finally, let’s have a look at the wrapper for the Data chunk. Here, we use an array of shorts because we have 16-bit samples as specified in the format chunk. If you want to change to 8-bit audio, use an array of bytes (8-bit WAV samples are unsigned, with silence sitting at 128). If you want to use 32-bit floating-point audio, use an array of floats and set wFormatTag to 3 (IEEE float) instead of 1. dwChunkSize is initialized to zero and is determined after the wave data is generated, when we know how long the array is and what the bit depth is.

 public class WaveDataChunk
 {
     public string sChunkID;     // "data"
     public uint dwChunkSize;    // Length of the sample data in bytes
     public short[] shortArray;  // 16-bit audio samples
  
     /// <summary>
     /// Initializes a new data chunk with default values.
     /// </summary>
     public WaveDataChunk()
     {
         shortArray = new short[0];
         dwChunkSize = 0;
         sChunkID = "data";
     }   
 }
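
As a rough sketch of what switching to 8-bit audio involves, the helper below converts 16-bit samples to bytes. It assumes the usual convention that 8-bit PCM in a WAV file is unsigned; BitDepthDemo and ToEightBit are hypothetical names, and you would also need to set wBitsPerSample to 8 in the format chunk.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class BitDepthDemo
{
    // Converts signed 16-bit samples to unsigned 8-bit samples by keeping
    // the high byte and shifting the range from -128..127 up to 0..255.
    public static byte[] ToEightBit(short[] samples)
    {
        var result = new byte[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            result[i] = (byte)((samples[i] >> 8) + 128);
        }
        return result;
    }
}
```

Silence (0) maps to 128, the most positive 16-bit sample maps to 255, and the most negative maps to 0.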

Now we have all the tools we need to assemble a wave file!

The Wave Generator (WaveGenerator.cs)

This class does two things. Its constructor instantiates all these chunks and then uses a very simple algorithm to generate sample data for a sine wave oscillating at 440Hz, which results in an audible pitch known as Concert A. Its Save method then writes the chunks out to a file.

In this file, we have an enum called WaveExampleType, which is used to identify what kind of wave we want to create. Feel free to create your own and modify the “big switch statement” to add different sound wave options.

 public enum WaveExampleType
{
    ExampleSineWave = 0
}

The WaveGenerator class only has three members, and they are all chunks.

 public class WaveGenerator
{
    // Header, Format, Data chunks
    WaveHeader header;
    WaveFormatChunk format;
    WaveDataChunk data;
    
    /// <snip>
}

The constructor of the WaveGenerator class takes in an argument of type WaveExampleType, which we switch on to determine what kind of wave to generate. Lots of stuff happens in the constructor, so I’ll use line numbers to refer to the code in the walkthrough that follows it.

    1: public WaveGenerator(WaveExampleType type)
    2: {          
    3:     // Init chunks
    4:     header = new WaveHeader();
    5:     format = new WaveFormatChunk();
    6:     data = new WaveDataChunk();            
    7:  
    8:     // Fill the data array with sample data
    9:     switch (type)
   10:     {
   11:         case WaveExampleType.ExampleSineWave:
   12:  
   13:             // Number of samples = sample rate * channels (one second of audio)
   14:             uint numSamples = format.dwSamplesPerSec * format.wChannels;
   15:             
   16:             // Initialize the 16-bit array
   17:             data.shortArray = new short[numSamples];
   18:  
   19:             int amplitude = 32760;  // Just under the 16-bit maximum of 32767
   20:             double freq = 440.0;    // Concert A: 440Hz
   21:  
   22:             // The "angle" used in the function, adjusted for the number of channels and sample rate.
   23:             // This is how far (in radians) the wave advances per array element.
   24:             double t = (Math.PI * 2 * freq) / (format.dwSamplesPerSec * format.wChannels);
   25:  
   26:             for (uint i = 0; i < numSamples; i += format.wChannels)
   27:             {
   28:                 // Fill with a simple sine wave at max amplitude
   29:                 for (int channel = 0; channel < format.wChannels; channel++)
   30:                 {
   31:                     data.shortArray[i + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));
   32:                 }                        
   33:             }
   34:  
   35:             // Calculate data chunk size in bytes
   36:             data.dwChunkSize = (uint)(data.shortArray.Length * (format.wBitsPerSample / 8));
   37:  
   38:             break;
   39:     }          
   40: }

Lines 4-6 instantiate the chunks.

On line 9, we switch on the wave type. This gives us an opportunity to try different things without breaking stuff that works, which I encourage you to do.

On line 14, we calculate the size of the data array by multiplying the sample rate and channel count together. In our case, we have 1 second of audio at 44,100 samples per second and 2 channels of data, giving us an array of length 88,200.

Line 19 specifies an important value: 32760 sits just under 32767, the largest value a 16-bit sample can hold. I discussed this in the second article. As an aside, the samples will range from -32760 to 32760; the negative values are provided by the fact that the sine function’s output ranges from -1.0 to 1.0. For other nonperiodic functions you may have to specify -32760 as your lower bound instead of zero – we’ll see this in action in a future article.

Line 20 specifies the frequency of the sound. 440Hz is concert A. You can use any other pitch you want – check out this awesome table for a handy reference.

On line 24, we are doing a little fun trig. See this article if you want to understand the math, otherwise just use this formula and love it.

Line 26 is where the magic happens! The outer loop walks through the array one sample frame at a time, so “i” advances by the number of channels on each pass (i += format.wChannels), and the inner loop writes that frame’s value to every channel. If the outer loop advanced by 1 instead, each pass would overwrite the second channel written by the previous pass, and with more than 2 channels the last pass would index past the end of the array.

It’s important to note how multichannel data is written. For WAV files, data is written in an interleaved manner. The sample at each time point is written to all the channels first before advancing to the next time. So shortArray[0] would be the sample in channel 1, and shortArray[1] would be the exact same sample in channel 2. That’s why we have a nested loop.

On line 31, we use Math.Sin to generate the sample data based on the “angle” (t) and the current position (i). This value is written once for each channel before “i” is advanced to the next frame.

On line 36, we set the chunk size of the data chunk. Most other chunks know how to do this themselves, but because the chunks are independent, the data chunk does not know what the bit depth is (it’s stored in the format chunk). So we set that value directly. The reason we need the bit depth is that the chunk size is stored in bytes, and each sample takes two bytes. Therefore we are setting the data chunk size to the array length times the number of bytes in a sample (2), which gives 88,200 × 2 = 176,400 bytes.
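
The interleaving and the data chunk size can also be sketched with an explicit frame index, which some people find easier to read than stepping through the array directly. This is only one way to write it: InterleaveDemo, GenerateSine and DataChunkSize are hypothetical names, and here the angle is computed per frame rather than per array element.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class InterleaveDemo
{
    // Builds an interleaved buffer: each frame holds one sample per channel,
    // and every channel in a frame receives the same sine value.
    public static short[] GenerateSine(int sampleRate, int channels, double freq,
                                       int amplitude, int frames)
    {
        var buffer = new short[frames * channels];
        double anglePerFrame = (Math.PI * 2 * freq) / sampleRate;

        for (int frame = 0; frame < frames; frame++)
        {
            short value = Convert.ToInt16(amplitude * Math.Sin(anglePerFrame * frame));
            for (int channel = 0; channel < channels; channel++)
            {
                // Frame n occupies indexes n * channels .. n * channels + (channels - 1)
                buffer[frame * channels + channel] = value;
            }
        }
        return buffer;
    }

    // Size of the sample data in bytes, as stored in the data chunk header.
    public static uint DataChunkSize(short[] buffer, int bitsPerSample)
    {
        return (uint)(buffer.Length * (bitsPerSample / 8));
    }
}
```

Calling GenerateSine(44100, 2, 440.0, 32760, 44100) produces the 88,200-element array from this article, and DataChunkSize for that array at 16 bits is 176,400.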

At this point, all of our chunks have the correct values and we are ready to write the chunks to a stream. This is where the Save method comes in.

Again, I’ll use line numbers to refer to the Save method below.

    1: public void Save(string filePath)
    2: {
    3:     // Create a file (it always overwrites)
    4:     FileStream fileStream = new FileStream(filePath, FileMode.Create);   
    5:  
    6:     // Use BinaryWriter to write the bytes to the file
    7:     BinaryWriter writer = new BinaryWriter(fileStream);
    8:  
    9:     // Write the header
   10:     writer.Write(header.sGroupID.ToCharArray());
   11:     writer.Write(header.dwFileLength);
   12:     writer.Write(header.sRiffType.ToCharArray());
   13:  
   14:     // Write the format chunk
   15:     writer.Write(format.sChunkID.ToCharArray());
   16:     writer.Write(format.dwChunkSize);
   17:     writer.Write(format.wFormatTag);
   18:     writer.Write(format.wChannels);
   19:     writer.Write(format.dwSamplesPerSec);
   20:     writer.Write(format.dwAvgBytesPerSec);
   21:     writer.Write(format.wBlockAlign);
   22:     writer.Write(format.wBitsPerSample);
   23:  
   24:     // Write the data chunk
   25:     writer.Write(data.sChunkID.ToCharArray());
   26:     writer.Write(data.dwChunkSize);
   27:     foreach (short dataPoint in data.shortArray)
   28:     {
   29:         writer.Write(dataPoint);
   30:     }
   31:  
   32:     writer.Seek(4, SeekOrigin.Begin);
   33:     uint filesize = (uint)writer.BaseStream.Length;
   34:     writer.Write(filesize - 8);
   35:     
   36:     // Clean up
   37:     writer.Close();
   38:     fileStream.Close();            
   39: }

Save takes one argument – a file path. Lines 4-7 set up our file stream and binary writer associated with that stream. The order in which values are written is EXTREMELY IMPORTANT!

Lines 10-12 write the header chunk to the stream. We use the .ToCharArray method on the strings to convert them to actual character arrays. If you don’t do this, BinaryWriter puts a length prefix in front of each string and your header gets messed up.

Lines 15-22 write the format chunk.

Lines 25 and 26 write the first two fields of the data chunk, and the foreach loop writes out every value of the data array.

Now we know exactly how long the file is, so we have to go back and specify the file length as the second value in the file. The first 4 bytes of the file are taken up with “RIFF” so we seek to byte 4 and write out the total length of the stream that we’ve written, minus 8 (as noted by the spec; we don’t count “RIFF” or the 4-byte length field itself).
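
Here is a minimal sketch of that write-then-patch trick on its own, using a MemoryStream and using blocks so the stream is disposed even if something throws. RiffLengthDemo and its methods are hypothetical names, and the payload argument just stands in for the format and data chunks.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class RiffLengthDemo
{
    // Writes a RIFF header with a placeholder length, appends the payload,
    // then seeks back to byte 4 and fills in (total length - 8).
    public static byte[] BuildWithPatchedLength(byte[] payload)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write("RIFF".ToCharArray());
            writer.Write(0u);                    // placeholder for the length field
            writer.Write("WAVE".ToCharArray());
            writer.Write(payload);               // stands in for the fmt and data chunks

            writer.Seek(4, SeekOrigin.Begin);
            uint riffLength = (uint)(stream.Length - 8);
            writer.Write(riffLength);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the little-endian length field back out of bytes 4..7.
    public static uint ReadLengthField(byte[] file)
    {
        return (uint)(file[4] | (file[5] << 8) | (file[6] << 16) | (file[7] << 24));
    }
}
```

For the file built in this article the numbers work out to 12 bytes of header, 24 bytes of format chunk, 8 bytes of data chunk header and 176,400 bytes of samples, so the file is 176,444 bytes long and the length field holds 176,436.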

Lastly, we close the streams. Our file is written! And it looks like this:

image

Zoom in a bit to see the awesome sine waviness:

image

All that’s left are the 5 lines of code that initialize the WaveGenerator object, save the file and play it back to you.

Putting it All Together – Main.cs

Let’s look at Main.cs, the codebehind for our main winform.

    1: using System;
    2: using System.Windows.Forms;
    3: using System.Media;
    4:  
    5: namespace WaveFun
    6: {
    7:     public partial class frmMain : Form
    8:     {
    9:         public frmMain()
   10:         {
   11:             InitializeComponent();
   12:         }
   13:  
   14:         private void btnGenerateWave_Click(object sender, EventArgs e)
   15:         {
   16:             string filePath = @"C:\Users\Dan\Desktop\test2.wav";
   17:             WaveGenerator wave = new WaveGenerator(WaveExampleType.ExampleSineWave);
   18:             wave.Save(filePath);            
   19:  
   20:             SoundPlayer player = new SoundPlayer(filePath);               
   21:             player.Play();
   22:         }
   23:     }
   24: }

On line 3, we reference System.Media. We need this namespace to play back our wave file.

Line 14 is the event handler for the Click event of the only huge button on the form.

On line 16, we define the location of the file to be written. IT IS VERY IMPORTANT THAT YOU CHANGE THIS TO A LOCATION THAT WORKS ON YOUR BOX.

Line 17 initializes the wave generator with a sine wave, and line 18 saves it to the location you defined.

Lines 20 and 21 use System.Media.SoundPlayer to play back the wave that we saved.
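
If you would rather not hard-code a path at all, one option is to build it at run time. The sketch below uses the temp folder as an assumed destination; OutputPathDemo and GetOutputPath are hypothetical names, and any folder your account can write to would do just as well.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class OutputPathDemo
{
    // Builds a path inside the current user's temp folder
    // instead of relying on a hard-coded desktop location.
    public static string GetOutputPath(string fileName)
    {
        return Path.Combine(Path.GetTempPath(), fileName);
    }
}
```

In the click handler you could then write string filePath = OutputPathDemo.GetOutputPath("test2.wav"); and leave the rest of the code unchanged.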

All Done!

Press F5 to run your program and bask in the glory of a very loud 440Hz sine wave.

Next Steps: If you are a math Jedi, you can experiment with the following code from WaveGenerator.cs:

 double t = (Math.PI * 2 * freq) / (format.dwSamplesPerSec * format.wChannels);

// Advance one whole frame (all channels) per pass so no sample gets overwritten
for (uint i = 0; i < numSamples; i += format.wChannels)
{
    // Fill with a simple sine wave at max amplitude
    for (int channel = 0; channel < format.wChannels; channel++)
    {
        data.shortArray[i + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));
    }                        
}

Just remember it’s two-channel audio, so you have to write each channel in the frame first before writing the next frame.
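
One easy experiment is to change freq to a different musical note. The sketch below computes pitches relative to Concert A, assuming standard twelve-tone equal temperament; PitchDemo and FrequencyFromA440 are hypothetical names made up for this example.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the WaveFun project.
public static class PitchDemo
{
    // Equal-tempered pitch a given number of semitones above
    // (positive) or below (negative) Concert A at 440Hz.
    public static double FrequencyFromA440(int semitones)
    {
        return 440.0 * Math.Pow(2.0, semitones / 12.0);
    }
}
```

Passing 12 gives the A one octave up (880Hz), and passing 3 gives the C above Concert A at roughly 523.25Hz.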

In the next article, we’ll look at some algorithms to generate other types of waves.

Currently Playing: Lamb of God – Wrath – Set to Fail

Comments

  • Anonymous
    June 24, 2009
    Nice series. Always had a bit of curiosity on audio programming but never got around to looking into it. This got me to finally poke a bit at it. Keep at it. =)

  • Anonymous
    November 13, 2009
    for (uint i = 0; i < numSamples - 1; i++) {    for (int channel = 0; channel < format.wChannels; channel++)    {        data.shortArray[i + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));    } } Aren't you overwriting Channel 2 each time? Ie. i=0: data[0] = asin(t0); data[1] = asin(t0); i=1: data[1] = asin(t1); data[2] = asin(t1); So you overwrote data[1]; Instead of  i++, shouldn't you use i+=format.wChannels;

  • Anonymous
    December 15, 2009
    YESS ! Set to fail ! what a track to code !

  • Anonymous
    May 05, 2010
    In response to Dave,"Instead of  i++, shouldn't you use i+=format.wChannels;" You're overlooking the fact that when you're assigning to data, it's assigning to i+channel, which is the current sample index plus 0 or 1 for the channel. So, in effect you're constantly incrementing whilst interleaving each channel.

  • Anonymous
    June 05, 2010
    Nate, it is the incrementing of the outer loop by 1 that will cause channel 2 to be overwritten.  The value of i needs to be incremented by the number of channels, not by one.

  • Anonymous
    July 10, 2010
    One solution could be for (int channel = 0; channel < format.wChannels; channel++)                    data.shortArray[(i * format.wChannels) + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));                }

  • Anonymous
    December 14, 2011
    Hi! The link to understand the math doesn't work :(

  • Anonymous
    February 27, 2012
    what if I want to generate a 24-bit audio wave file?

  • Anonymous
    September 17, 2012
    Can we read an existing WAVE file and edit its amplitude and frequency in run time.

  • Anonymous
    November 15, 2012
    Hi,   please help in saving Request.Inputstram as WAV file .By using this code i could be able to listen the beep sound.I want pass the stream(byte[] ) data of recorded voice .How can i do that? Please help me on this !

  • Anonymous
    February 18, 2013
    James Hersey, I think you can prevent the audio crackling this way:

  1. on the start: begin the wave by making a fade-in(volume from zero to normal) on the very beginning.
  2. on the end: end the wave file by making a fade-out(volume from normal to zero) on the very end of the wave audio.
  • Anonymous
    May 19, 2013
    How can I increase the length of the audio file? This code plays sound for 2 seconds; I wanted to increase it to 10.

  • Anonymous
    August 15, 2013
    Thanks for the great tutorial!!! I was able to convert it to VB.NET.

  • Anonymous
    April 12, 2014
    Regarding the crackling at the start/end, my guess would be that it is the result of an infinitely fast attack and release due to the function (and in turn waveform) starting at its peak amplitude. Using cosine instead of sin produces the exact same waveform, but with the start/end data points at zero.

  • Anonymous
    April 27, 2014
    It appears all the code links are now dead.  They go to Danwaters.com but that appears to be an artist in Australia.  Is the code still available anywhere?

  • Anonymous
    July 26, 2014
    As Rik says, the link for the code no longer works.

  • Anonymous
    March 05, 2015
    I love this program, and was able to create a simple app using these principles that creates a small keyboard with different notes and lengths.  One problem until I pick up the mouse on a key it doesn't play back, (because it calculates the duration of the key hit and makes output of the same length).  The problem is that it has to create the wave file then play it.  Is there a way I can make it play the note without creating the wave file, and only create the file if it is needed (in my app a file dialog happens when a record button then some keys (also buttons) and then a save button are pressed.  I have it creating a temp file for each note, and saving the values to a list, then creating a full wave file with all notes when I click save.  That works, except there is a delay with each note I play as I add them. The app will also just play back without saving (however it still saves the notes one at a time before playback). There is no delay in the playback of the entire saved wave file, but the playback is clunky when just using keyboard feature.

  • Anonymous
    July 09, 2015
    good one..!! How can I concatenate wave files with (TEXT TO SPEECH) wave file into a single Wave file ..?

  • Anonymous
    October 09, 2015
    Hey, I've been looking for a good tutorial on the basics of making audio files in C#, but I feel like downloading the sample code is necessary, and as others have stated, the link is dead.