The Leaky Bucket Buffer Decoding Model

When decoding compressed content, most decoders use a technique known as the "leaky bucket buffer model". This topic explains that model.

The buffer that is used by the codec when decoding compressed content is often referred to as a "leaky bucket" because the decoder removes samples from the buffer regularly, as if the buffer were a bucket with a hole in it. Understanding the nature of the decoder's buffer is important when selecting the format of compressed data, because the encoder uses the decoder's buffer values when compressing content to ensure that it can be decompressed properly.

The leaky bucket metaphor can help you to visualize the buffering process for decoding. If you understand the limitations that are imposed by the characteristics of the buffer, you can configure compressed media streams for higher-quality content.

The Size of the Bucket

The size of the buffer that is used by the encoder is determined by the bit rate and buffer window values that are set for the stream in the profile. The bit rate measures the average number of bits per second in the encoded stream. The buffer window measures the number of milliseconds of data at that bit rate that can fit in the buffer. You can find the size of the buffer, in bits, by multiplying the bit rate by the buffer window divided by 1,000 (R * (Bw / 1,000)). This is the size of the leaky bucket.
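
For example, the following sketch computes the bucket size from a bit rate and a buffer window. The stream values and variable names are illustrative only; they are not taken from any particular profile or SDK interface.

```c
#include <stdio.h>

int main(void)
{
    /* Hypothetical stream settings: a 128 kbps stream with a 3-second buffer window. */
    unsigned int bitRate        = 128000;  /* R, in bits per second */
    unsigned int bufferWindowMs = 3000;    /* Bw, in milliseconds   */

    /* Bucket size in bits: R * (Bw / 1,000). Multiply first to avoid
       integer truncation when Bw is not a multiple of 1,000.          */
    unsigned long long bucketBits =
        (unsigned long long)bitRate * bufferWindowMs / 1000;

    printf("Leaky bucket size: %llu bits\n", bucketBits);  /* 384000 bits */
    return 0;
}
```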

Supplying and Leaking Data

The bit rate also determines the rate at which the data is put into and removed from the buffer. Using the leaky bucket metaphor, this means that the bucket is being filled at the same rate that it is leaking from the hole in its bottom. If the bit rate were a constant value, no buffer would be required; the data would be received at the same rate that it was used. However, the nature of digital media compression is that encoded samples vary in size.

Sample Sizes

The size of individual media samples is what dictates the need for a decoding buffer. Sample size has to do with the complexity of the data. To illustrate this, think of a single violin holding a note for one second. The sound waves that make up that sound are relatively complex: the string vibrates and the sound reflects within the body of the instrument, making a complex waveform. However, if you compare this sound with that of four violins, each playing a different note during the same second, it is not difficult to understand that the second sound is far more complex than the first. Uncompressed media is not affected by complexity. An uncompressed 16-bit, 44-kHz stereo encoding of the two seconds of violin music described above uses the same amount of data for each sample, regardless of the complexity. However, an audio codec can take advantage of less complex passages to compress the content to a smaller size.

The same is true of video content: frames with large areas of uniform colors are easily compressed to very small sizes, while frames containing many varied shapes and colors are more complex and harder to compress. Video has the added factor of change over time. Much of the compression achieved by the Windows Media Video codecs is due to using delta frames, which contain only information tracking the changes from one frame to the next. This means that movement in video makes the content more complex. Complex samples must be larger than their simpler counterparts to maintain the same quality.

The Bucket in Use

As previously discussed, the average rate of data entering the buffer is equal to the average rate of the data being taken from the buffer. The bucket never fills if these rates are constant. However, samples are not all the same size. A larger than average sample can have the same effect as opening the valve on the spigot wider and momentarily allowing a higher rate of water into the bucket. Conversely, a smaller than average sample can have the same effect as restricting the flow of water into the bucket. As long as fluctuations in flow average out, the bucket never overflows.
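
The following sketch is a minimal simulation of this behavior, assuming made-up sample sizes and the hypothetical stream settings used in the example later in this topic: samples of varying size pour into the bucket, and the bucket leaks at the constant average bit rate.

```c
#include <stdio.h>

int main(void)
{
    /* Hypothetical values, not taken from any particular profile. */
    const double bitRate    = 6000.0;      /* constant leak, bits per second   */
    const double bucketBits = 18000.0;     /* bucket size: R * (Bw / 1,000)    */
    const double frameTime  = 1.0 / 30.0;  /* one sample arrives every 1/30 s  */

    /* Made-up sample sizes in bits: a large key frame, then smaller deltas. */
    const double samples[] = { 7000, 120, 80, 150, 90, 400, 60, 100, 3000, 75 };
    const size_t count = sizeof samples / sizeof samples[0];

    double fullness = 0.0;
    for (size_t i = 0; i < count; i++)
    {
        fullness += samples[i];            /* variable inflow: one sample      */
        fullness -= bitRate * frameTime;   /* constant leak during that frame  */
        if (fullness < 0.0)
            fullness = 0.0;                /* the bucket cannot go below empty */

        printf("sample %2zu: fullness = %7.0f bits%s\n",
               i + 1, fullness, fullness > bucketBits ? "  (overflow!)" : "");
    }
    return 0;
}
```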

The goal of an encoder is to ensure that the content never overflows the buffer. The encoder uses the bit rate and buffer window values as guides. The actual number of bits passed over any period of time equal to the buffer window can never be greater than twice the size of the buffer.
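
To put numbers on this bound, using the stream values from the example later in this topic: a 6,000-bits-per-second stream with a 3-second buffer window has an 18,000-bit bucket. Over any 3-second interval, the leak drains 18,000 bits and the bucket itself can rise from empty to full by another 18,000 bits, so at most 36,000 bits, twice the bucket size, can enter the buffer during that interval.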

Consider the following example: You have a 3-gallon bucket with a hole in it through which 1 gallon can flow per minute. You put the bucket under a spigot and open the valve to let out water at a rate of 1 gallon per minute. The water flows out of the bucket as quickly as it enters, leaving no extra in the bucket. Then you increase the flow from the spigot to 2 gallons per minute. Each minute that the water flows at this rate, 2 gallons go into the bucket and 1 gallon leaks out, leaving 1 gallon in the bucket. At the end of 3 minutes, 6 gallons of water have gone into the bucket, 3 gallons have leaked out, and the bucket is full.

In practice, the theoretical maximum data rate over an interval equal to the buffer window is never achieved. The previous example assumed a constant data rate. Given the same 3-gallon bucket, you could increase the flow rate from the spigot to 6 gallons per minute for one minute and then turn the spigot off for two minutes. Even though the total amount of water put into the bucket is within the theoretical maximum for the buffer window, the concentration of that amount into one part of the window causes the bucket to overflow. At 6 gallons per minute, the 3-gallon bucket overflows shortly after 30 seconds pass. Therefore, the actual maximum amount of data that can be delivered to the buffer over the duration of any interval equal to the buffer window setting depends upon the size of individual samples and when they are delivered.
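
Both scenarios can be traced numerically. The following sketch, which is only an illustration (the function name, step size, and overflow tolerance are arbitrary choices), simulates the 3-gallon bucket in small time steps: first with the steady 2-gallon-per-minute flow, then with the 1-minute burst at 6 gallons per minute.

```c
#include <stdio.h>

/* Simulate the 3-gallon bucket with its 1-gallon-per-minute leak in small
   time steps. Returns the time in minutes at which the bucket overflows,
   or -1.0 if it never overflows during the run. */
static double run(double inflowGalPerMin, double inflowMinutes, double totalMinutes)
{
    const double capacity = 3.0;          /* gallons               */
    const double leak     = 1.0;          /* gallons per minute    */
    const double step     = 1.0 / 600.0;  /* 0.1-second time steps */
    double level = 0.0;

    for (double t = 0.0; t < totalMinutes; t += step)
    {
        if (t < inflowMinutes)
            level += inflowGalPerMin * step;  /* water from the spigot */
        level -= leak * step;                 /* constant leak         */
        if (level < 0.0)
            level = 0.0;
        if (level > capacity + 1e-9)          /* tolerance for rounding */
            return t;
    }
    return -1.0;
}

int main(void)
{
    /* Steady 2 gal/min for 3 minutes: the bucket ends exactly full, never over. */
    double t1 = run(2.0, 3.0, 3.0);
    /* A 6 gal/min burst for 1 minute: overflow at about 0.6 minutes (36 seconds). */
    double t2 = run(6.0, 1.0, 3.0);

    printf("2 gal/min for 3 min: %s\n", t1 < 0.0 ? "no overflow" : "overflow");
    if (t2 >= 0.0)
        printf("6 gal/min burst:     overflow after %.2f minutes\n", t2);
    return 0;
}
```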

So far, the examples have discussed only the buffer used by the decoder, but a leaky bucket buffer is also used by the encoder that creates the compressed content. The encoder makes whatever adjustments are necessary to the compression algorithms to keep the bit rate of the compressed samples within the boundaries described by the bit rate and buffer window, assuming that the samples will be delivered to the decoder at a constant rate. You can think of the encoder bucket as mirroring the decoder bucket. The encoder bucket is filled at a variable rate determined by the size of the individual samples and leaks at a constant rate equal to the average bit rate.

Consider the following example of an encoder and decoder connected together over a network. You encode a video file at 30 frames per second with a bit rate of 6,000 bits per second and a buffer window of 3 seconds (a total buffer size of 18,000 bits). The first sample is encoded as a key frame and takes up 7,000 bits. The encoder buffer now contains 7,000 bits. The next 29 frames are all delta frames that total 3,000 bits. So the first second of content (30 frames) would put the buffer fullness at 10,000 bits if nothing were leaking out. We know that the bit rate of the stream is 6,000 bits per second, so after the first second of encoded content is put in the encoder buffer, the fullness drops to 4,000 bits. In the decoding application, this stream is delivered to the decoder buffer at 6,000 bits per second. After one second, the buffer contains 6,000 bits. The first sample contains 7,000 bits, so the decoder buffer must be filled more before the decoder begins removing samples.
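
The arithmetic in this example can be traced in a few lines. The following sketch uses the values from the example above; the last line, which estimates when the decoder can first remove the key frame, is an inference from those numbers (7,000 bits arriving at 6,000 bits per second) rather than a figure stated in this topic.

```c
#include <stdio.h>

int main(void)
{
    /* Values taken from the example above. */
    const double bitRate      = 6000.0;                  /* bits per second   */
    const double bufferWindow = 3.0;                     /* seconds           */
    const double bucketBits   = bitRate * bufferWindow;  /* 18,000 bits       */

    const double keyFrameBits   = 7000.0;  /* first sample (key frame)          */
    const double deltaTotalBits = 3000.0;  /* the next 29 delta frames in total */

    /* Encoder bucket after one second: samples pour in, the bit rate leaks out. */
    double encoderFullness = keyFrameBits + deltaTotalBits - bitRate * 1.0;
    printf("Encoder buffer after 1 s: %.0f of %.0f bits\n",
           encoderFullness, bucketBits);                  /* 4000 of 18000     */

    /* Decoder bucket after one second: data arrives at the constant bit rate. */
    double decoderFullness = bitRate * 1.0;
    printf("Decoder buffer after 1 s: %.0f bits\n", decoderFullness);  /* 6000 */

    /* The 7,000-bit key frame cannot be removed until enough data has arrived. */
    printf("Decoder can remove the first sample after about %.2f s\n",
           keyFrameBits / bitRate);                       /* about 1.17 s      */
    return 0;
}
```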

The Leaky Bucket and Stream Configuration

When configuring streams, you should consider the way in which the encoder uses the bit rate and buffer window values to encode content. Evaluate your content for consistency of complexity, and configure the stream with a buffer large enough to handle it. If the bit rate and buffer window values you select are too low for the content, you hinder the codec's ability to maintain the quality of the encoded content. Even at higher bit rates, content with a very wide range of complexity can result in inconsistent quality.

When you configure a stream, you must consider the nature of the content, the delivery method, and the importance of consistent playback quality. It often takes experimentation to configure streams that suit your content and bandwidth requirements. You may find that your content does not encode well below a certain bit rate, or that increasing the bit rate within a certain range does not noticeably improve quality. In any case, understanding the constraints that the codec encodes to is important for getting the best results.

See Also

Concepts

Encoding Methods