AudioFormat Class

Definition

The AudioFormat class is used to access a number of audio format and channel configuration constants.

[Android.Runtime.Register("android/media/AudioFormat", DoNotGenerateAcw=true)]
public class AudioFormat : Java.Lang.Object, Android.OS.IParcelable, IDisposable, Java.Interop.IJavaPeerable
[<Android.Runtime.Register("android/media/AudioFormat", DoNotGenerateAcw=true)>]
type AudioFormat = class
    inherit Object
    interface IParcelable
    interface IJavaObject
    interface IDisposable
    interface IJavaPeerable
Inheritance
AudioFormat
Attributes
Implements

Remarks

The AudioFormat class is used to access a number of audio format and channel configuration constants. They are for instance used in AudioTrack and AudioRecord, as valid values in individual parameters of constructors like AudioTrack#AudioTrack(int, int, int, int, int, int), where the fourth parameter is one of the AudioFormat.ENCODING_* constants. The AudioFormat constants are also used in MediaFormat to specify audio related values commonly used in media, such as for MediaFormat#KEY_CHANNEL_MASK.

The AudioFormat.Builder class can be used to create instances of the AudioFormat class. Refer to AudioFormat.Builder for documentation on the mechanics of configuring and building such instances. Here we describe the main concepts that the AudioFormat class allows you to convey in each instance: <ol> <li>sample rate</li><li>encoding</li><li>channel masks</li></ol>

Closely associated with the AudioFormat is the notion of an audio frame, which is used throughout the documentation to represent the minimum size complete unit of audio data.

<h4 id="sampleRate">Sample rate</h4>

Expressed in Hz, the sample rate in an AudioFormat instance expresses the number of audio samples for each channel per second in the content you are playing or recording. It is not the sample rate at which content is rendered or produced. For instance, a sound at a media sample rate of 8000Hz can be played on a device operating at a sample rate of 48000Hz; the platform handles the sample rate conversion automatically, so the sound will not play at 6x speed.
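The arithmetic behind the 8000Hz to 48000Hz example can be sketched in plain Java. This is only an illustration of why duration is preserved; the class and method names are hypothetical, and the platform's actual resampler is not shown here.

```java
public final class SampleRateSketch {
    /**
     * Frames needed at the device rate to cover the same duration as
     * sourceFrames at the source rate (rounded up to a whole frame).
     */
    public static long resampledFrameCount(long sourceFrames, int sourceRateHz, int deviceRateHz) {
        return (sourceFrames * deviceRateHz + sourceRateHz - 1) / sourceRateHz;
    }

    /** Duration in seconds of a given number of frames at a given rate. */
    public static double durationSeconds(long frames, int rateHz) {
        return (double) frames / rateHz;
    }

    public static void main(String[] args) {
        long sourceFrames = 8000; // one second of content at 8000 Hz
        long deviceFrames = resampledFrameCount(sourceFrames, 8000, 48000);
        // Six times as many frames, played six times as fast: same duration.
        System.out.println(deviceFrames + " frames, "
                + durationSeconds(deviceFrames, 48000) + " s");
    }
}
```

The frame count grows by the ratio of the two rates, which is why the content keeps its original length and pitch.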

As of API android.os.Build.VERSION_CODES#M, sample rates up to 192kHz are supported for AudioRecord and AudioTrack, with sample rate conversion performed as needed. To improve efficiency and avoid lossy conversions, it is recommended to match the sample rate for AudioRecord and AudioTrack to the endpoint device sample rate, and limit the sample rate to no more than 48kHz unless there are special device capabilities that warrant a higher rate.

<h4 id="encoding">Encoding</h4>

Audio encoding is used to describe the bit representation of audio data, which can be either linear PCM or compressed audio, such as AC3 or DTS.

For linear PCM, the audio encoding describes the sample size (8 bits, 16 bits, or 32 bits) and the sample representation (integer or float). <ul> <li> #ENCODING_PCM_8BIT: The audio sample is an 8 bit unsigned integer in the range [0, 255], with a 128 offset for zero. This is typically stored as a Java byte in a byte array or ByteBuffer. Since the Java byte is <em>signed</em>, be careful with math operations and conversions as the most significant bit is inverted. </li> <li> #ENCODING_PCM_16BIT: The audio sample is a 16 bit signed integer typically stored as a Java short in a short array. When the short is stored in a ByteBuffer, it is native endian (as compared to the default Java big endian). The short has full range from [-32768, 32767], and is sometimes interpreted as fixed point Q.15 data. </li> <li> #ENCODING_PCM_FLOAT: Introduced in API android.os.Build.VERSION_CODES#LOLLIPOP, this encoding specifies that the audio sample is a 32 bit IEEE single precision float. The sample can be manipulated as a Java float in a float array, though within a ByteBuffer it is stored in native endian byte order. The nominal range of ENCODING_PCM_FLOAT audio data is [-1.0, 1.0]. It is implementation dependent whether the positive maximum of 1.0 is included in the interval. Values outside of the nominal range are clamped before sending to the endpoint device. Beware that the handling of NaN is undefined; subnormals may be treated as zero; and infinities are generally clamped just like other values for AudioTrack &ndash; try to avoid infinities because they can easily generate a NaN. <br> To achieve higher audio bit depth than a signed 16 bit integer short, it is recommended to use ENCODING_PCM_FLOAT for audio capture, processing, and playback. Floats are efficiently manipulated by modern CPUs, have greater precision than 24 bit signed integers, and have greater dynamic range than 32 bit signed integers. AudioRecord as of API android.os.Build.VERSION_CODES#M and AudioTrack as of API android.os.Build.VERSION_CODES#LOLLIPOP support ENCODING_PCM_FLOAT. </li> <li> #ENCODING_PCM_24BIT_PACKED: Introduced in API android.os.Build.VERSION_CODES#S, this encoding specifies that the audio sample is an extended precision 24 bit signed integer stored as 3 Java bytes in a ByteBuffer or byte array in native endian (see java.nio.ByteOrder#nativeOrder()). Each sample has full range from [-8388608, 8388607], and can be interpreted as fixed point Q.23 data. </li> <li> #ENCODING_PCM_32BIT: Introduced in API android.os.Build.VERSION_CODES#S, this encoding specifies that the audio sample is an extended precision 32 bit signed integer stored as 4 Java bytes in a ByteBuffer or byte array in native endian (see java.nio.ByteOrder#nativeOrder()). Each sample has full range from [-2147483648, 2147483647], and can be interpreted as fixed point Q.31 data. </li> </ul>
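The byte-level details in the list above are easy to get wrong, so here is a minimal sketch of the conversions in plain Java. It uses only java.nio; the class and method names are hypothetical helpers for illustration and are not part of AudioFormat.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class PcmSampleSketch {
    /** 8 bit PCM: unsigned [0, 255] with 128 as zero, held in a signed Java byte. */
    public static int pcm8ToSigned(byte stored) {
        // Mask first to undo Java's sign extension, then remove the 128 offset.
        return (stored & 0xFF) - 128;
    }

    /** 16 bit PCM interpreted as Q.15 fixed point: divide by 2^15. */
    public static float pcm16ToFloat(short sample) {
        return sample / 32768.0f;
    }

    /** 24 bit packed PCM: three bytes in the given byte order, sign-extended to an int. */
    public static int pcm24PackedToInt(byte[] data, int offset, ByteOrder order) {
        int b0 = data[offset] & 0xFF;
        int b1 = data[offset + 1] & 0xFF;
        int b2 = data[offset + 2] & 0xFF;
        int value = (order == ByteOrder.LITTLE_ENDIAN)
                ? (b2 << 16) | (b1 << 8) | b0
                : (b0 << 16) | (b1 << 8) | b2;
        // Shift up and back down arithmetically to sign-extend from 24 to 32 bits.
        return (value << 8) >> 8;
    }

    /** Reads 16 bit samples from a ByteBuffer in native order instead of Java's big-endian default. */
    public static short[] readPcm16(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate().order(ByteOrder.nativeOrder());
        short[] samples = new short[view.remaining() / 2];
        view.asShortBuffer().get(samples);
        return samples;
    }
}
```

In real code the byte order passed to `pcm24PackedToInt` would normally be `ByteOrder.nativeOrder()`; it is a parameter here only so the behavior can be checked on any machine.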

For compressed audio, the encoding specifies the method of compression, for example #ENCODING_AC3 and #ENCODING_DTS. The compressed audio data is typically stored as bytes in a byte array or ByteBuffer. When a compressed audio encoding is specified for an AudioTrack, it creates a direct (non-mixed) track for output to an endpoint (such as HDMI) capable of decoding the compressed audio. For (most) other endpoints, which are not capable of decoding such compressed audio, you will need to decode the data first, typically by creating a MediaCodec. Alternatively, one may use MediaPlayer for playback of compressed audio files or streams.

When compressed audio is sent out through a direct AudioTrack, it need not be written in exact multiples of the audio access unit; this differs from MediaCodec input buffers.

<h4 id="channelMask">Channel mask</h4>

Channel masks are used in AudioTrack and AudioRecord to describe the samples and their arrangement in the audio frame. They are also used in the endpoint (e.g. a USB audio interface, a DAC connected to headphones) to specify allowable configurations of a particular device. <br>As of API android.os.Build.VERSION_CODES#M, there are two types of channel masks: channel position masks and channel index masks.

<h5 id="channelPositionMask">Channel position masks</h5> Channel position masks are the original Android channel masks, and have been in use since API android.os.Build.VERSION_CODES#BASE. For input and output, they imply a positional nature: the location of a speaker or a microphone for recording or playback. <br>For a channel position mask, each allowed channel position corresponds to a bit in the channel mask. If that channel position is present in the audio frame, that bit is set, otherwise it is zero. The order of the bits (from lsb to msb) corresponds to the order of that position's sample in the audio frame. <br>The canonical channel position masks by channel count are as follows: <br><table> <tr><td>channel count</td><td>channel position mask</td></tr> <tr><td>1</td><td>#CHANNEL_OUT_MONO</td></tr> <tr><td>2</td><td>#CHANNEL_OUT_STEREO</td></tr> <tr><td>3</td><td>#CHANNEL_OUT_STEREO | #CHANNEL_OUT_FRONT_CENTER</td></tr> <tr><td>4</td><td>#CHANNEL_OUT_QUAD</td></tr> <tr><td>5</td><td>#CHANNEL_OUT_QUAD | #CHANNEL_OUT_FRONT_CENTER</td></tr> <tr><td>6</td><td>#CHANNEL_OUT_5POINT1</td></tr> <tr><td>7</td><td>#CHANNEL_OUT_5POINT1 | #CHANNEL_OUT_BACK_CENTER</td></tr> <tr><td>8</td><td>#CHANNEL_OUT_7POINT1_SURROUND</td></tr> </table> <br>These masks are an ORed composite of individual channel masks. For example, #CHANNEL_OUT_STEREO is composed of #CHANNEL_OUT_FRONT_LEFT and #CHANNEL_OUT_FRONT_RIGHT.
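The bit-per-position idea can be sketched without any Android dependency. The bit values below are assumptions declared locally for illustration; real code should use the AudioFormat constants rather than literal values. The class and method names are hypothetical.

```java
public final class ChannelPositionMaskSketch {
    // Illustrative per-position bits (assumed values, declared locally so the
    // sketch runs on a plain JVM; use the AudioFormat constants in real code).
    public static final int FRONT_LEFT = 0x4;
    public static final int FRONT_RIGHT = 0x8;
    public static final int FRONT_CENTER = 0x10;
    public static final int LOW_FREQUENCY = 0x20;
    public static final int BACK_LEFT = 0x40;
    public static final int BACK_RIGHT = 0x80;

    // Composite masks are the OR of the individual position bits.
    public static final int STEREO = FRONT_LEFT | FRONT_RIGHT;
    public static final int FIVE_POINT_ONE =
            STEREO | FRONT_CENTER | LOW_FREQUENCY | BACK_LEFT | BACK_RIGHT;

    /** Each set bit is one channel present in the audio frame. */
    public static int channelCount(int positionMask) {
        return Integer.bitCount(positionMask);
    }

    /**
     * Index of a position's sample within the frame, following lsb-to-msb bit
     * order, or -1 if that position is not present in the mask.
     */
    public static int sampleIndexInFrame(int positionMask, int positionBit) {
        if ((positionMask & positionBit) == 0) {
            return -1;
        }
        // Count the present positions whose bit is lower than this one.
        return Integer.bitCount(positionMask & (positionBit - 1));
    }
}
```

For example, with the assumed values, `sampleIndexInFrame(FIVE_POINT_ONE, BACK_LEFT)` returns 4, meaning the back left sample is the fifth sample in each frame.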

The following diagram represents the layout of the output channels, as seen from above the listener (in the center at the "lis" position, facing the front-center channel).

                  TFL ----- TFC ----- TFR     T is Top
                  |  \       |       /  |
                  |   FL --- FC --- FR  |     F is Front
                  |   |\     |     /|   |
                  |   | BFL-BFC-BFR |   |     BF is Bottom Front
                  |   |             |   |
                  |   FWL   lis   FWR   |     W is Wide
                  |   |             |   |
                 TSL  SL    TC     SR  TSR    S is Side
                  |   |             |   |
                  |   BL --- BC -- BR   |     B is Back
                  |  /               \  |
                  TBL ----- TBC ----- TBR     C is Center, L/R is Left/Right

All "T" (top) channels are above the listener, all "BF" (bottom-front) channels are below the listener, and all others are in the listener's horizontal plane. When used in conjunction, LFE1 and LFE2 are below the listener; when used alone, the LFE plane is undefined. See the channel definitions for the abbreviations.

<h5 id="channelIndexMask">Channel index masks</h5> Channel index masks were introduced in API android.os.Build.VERSION_CODES#M. They allow the selection of a particular channel from the source or sink endpoint by number, i.e. the first channel, the second channel, and so forth. This avoids problems with artificially assigning positions to channels of an endpoint, or figuring out what the i<sup>th</sup> position bit is within an endpoint's channel position mask. <br>Here's an example where channel index masks address this confusion: dealing with a 4 channel USB device. Using a position mask, and based on the channel count, this would be a #CHANNEL_OUT_QUAD device, but really one is only interested in channel 0 through channel 3. The USB device would then have the following individual bit channel masks: #CHANNEL_OUT_FRONT_LEFT, #CHANNEL_OUT_FRONT_RIGHT, #CHANNEL_OUT_BACK_LEFT and #CHANNEL_OUT_BACK_RIGHT. But which is channel 0 and which is channel 3? <br>For a channel index mask, each channel number is represented as a bit in the mask, from the lsb (channel 0) upwards to the msb. Numerically, this bit value is 1 << channelNumber. A set bit indicates that channel is present in the audio frame, otherwise it is cleared. The order of the bits also corresponds to that channel number's sample order in the audio frame. <br>For the previous 4 channel USB device example, the device would have a channel index mask 0xF. Suppose we wanted to select only the first and the third channels; this would correspond to a channel index mask 0x5 (the first and third bits set). If an AudioTrack uses this channel index mask, the audio frame would consist of two samples, the first sample of each frame routed to channel 0, and the second sample of each frame routed to channel 2. The canonical channel index masks by channel count are given by the formula (1 << channelCount) - 1.
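The index mask arithmetic described above (1 << channelNumber per channel, and (1 << channelCount) - 1 for the canonical mask) can be sketched in a few lines of plain Java. The class and method names are hypothetical and exist only for this illustration.

```java
public final class ChannelIndexMaskSketch {
    /** Canonical index mask for the first channelCount channels (1 to 31 channels). */
    public static int canonicalMask(int channelCount) {
        return (1 << channelCount) - 1;
    }

    /** Builds an index mask from zero-based channel numbers. */
    public static int maskOf(int... channelNumbers) {
        int mask = 0;
        for (int channel : channelNumbers) {
            mask |= 1 << channel;
        }
        return mask;
    }

    /**
     * Lists the zero-based channel numbers selected by a mask, in the order
     * their samples appear in the audio frame (lsb to msb).
     */
    public static int[] channelsIn(int indexMask) {
        int[] channels = new int[Integer.bitCount(indexMask)];
        int next = 0;
        for (int channel = 0; channel < 32; channel++) {
            if ((indexMask & (1 << channel)) != 0) {
                channels[next++] = channel;
            }
        }
        return channels;
    }
}
```

With these helpers, `canonicalMask(4)` gives 0xF for the 4 channel USB device, and `maskOf(0, 2)` gives 0x5, whose two samples per frame map to channels 0 and 2.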

<h5>Use cases</h5> <ul> <li>Channel position mask for an endpoint: CHANNEL_OUT_FRONT_LEFT, CHANNEL_OUT_FRONT_CENTER, etc. for HDMI home theater purposes.</li> <li>Channel position mask for an audio stream: Creating an AudioTrack to output movie content, where 5.1 multichannel output is to be written.</li> <li>Channel index mask for an endpoint: USB devices for which input and output do not correspond to left or right speaker or microphone.</li> <li>Channel index mask for an audio stream: An AudioRecord may only want the third and fourth audio channels of the endpoint (i.e. the second channel pair), and not care about the position it corresponds to, in which case the channel index mask is 0xC. Multichannel AudioRecord sessions should use channel index masks.</li> </ul>

<h4 id="audioFrame">Audio Frame</h4>

For linear PCM, an audio frame consists of a set of samples captured at the same time, whose count and channel association are given by the channel mask, and whose sample contents are specified by the encoding. For example, a stereo 16 bit PCM frame consists of two 16 bit linear PCM samples, with a frame size of 4 bytes. For compressed audio, an audio frame may alternately refer to an access unit of compressed data bytes that is logically grouped together for decoding and bitstream access (e.g. MediaCodec), or a single byte of compressed data (e.g. AudioTrack#getBufferSizeInFrames()), or the linear PCM frame result from decoding the compressed data (e.g. AudioTrack#getPlaybackHeadPosition()), depending on the context where audio frame is used. For the purposes of AudioFormat#getFrameSizeInBytes(), a compressed data format returns a frame size of 1 byte.
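For linear PCM, the frame size follows directly from the channel count and the sample size. The following plain Java sketch shows that arithmetic; the class and method names are hypothetical, and on Android the value would normally be read from the FrameSizeInBytes property rather than computed by hand.

```java
public final class FrameSizeSketch {
    /** Linear PCM frame size: one sample per channel. */
    public static int pcmFrameSizeInBytes(int channelCount, int bytesPerSample) {
        return channelCount * bytesPerSample;
    }

    /** Number of whole PCM frames that fit in a buffer of the given size. */
    public static int framesInBuffer(int bufferSizeInBytes, int frameSizeInBytes) {
        return bufferSizeInBytes / frameSizeInBytes;
    }

    public static void main(String[] args) {
        int stereo16 = pcmFrameSizeInBytes(2, 2); // two 16 bit samples
        System.out.println("stereo 16 bit frame: " + stereo16 + " bytes");
        System.out.println("frames in 4096 bytes: " + framesInBuffer(4096, stereo16));
    }
}
```

The same helper gives 24 bytes for a six channel float frame (6 channels of 4 bytes each).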

Java documentation for android.media.AudioFormat.

Portions of this page are modifications based on work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.

Constructors

AudioFormat()
AudioFormat(IntPtr, JniHandleOwnership)

A constructor used when creating managed representations of JNI objects; called by the runtime.

Fields

ChannelInvalid

Invalid audio channel mask

ChannelOut5point1point2
Obsolete.

Output channel mask for 5.1.2.

ChannelOut5point1point4
Obsolete.

Output channel mask for 5.1.4.

ChannelOut6point1
Obsolete.

Output channel mask for 6.1.

ChannelOut7point1

This member is deprecated.

ChannelOut7point1point2
Obsolete.

Output channel mask for 7.1.2.

ChannelOut7point1point4
Obsolete.

Output channel mask for 7.1.4.

ChannelOut9point1point4
Obsolete.

Output channel mask for 9.1.4.

ChannelOut9point1point6
Obsolete.

Output channel mask for 9.1.6.

ChannelOutBottomFrontCenter
Obsolete.

Bottom front center output channel (see BFC in channel diagram below FC)

ChannelOutBottomFrontLeft
Obsolete.

Bottom front left output channel (see BFL in channel diagram below FL)

ChannelOutBottomFrontRight
Obsolete.

Bottom front right output channel (see BFR in channel diagram below FR)

ChannelOutFrontWideLeft
Obsolete.

Front wide left output channel (see FWL in channel diagram)

ChannelOutFrontWideRight
Obsolete.

Front wide right output channel (see FWR in channel diagram)

ChannelOutLowFrequency2
Obsolete.

The second LFE channel. When used in conjunction with #CHANNEL_OUT_LOW_FREQUENCY, it is intended to contain the right low-frequency effect signal, also referred to as "LFE2" in ITU-R BS.

ChannelOutTopBackCenter
Obsolete.

Top back center output channel (see TBC in channel diagram above BC)

ChannelOutTopBackLeft
Obsolete.

Top back left output channel (see TBL in channel diagram above BL)

ChannelOutTopBackRight
Obsolete.

Top back right output channel (see TBR in channel diagram above BR)

ChannelOutTopCenter
Obsolete.

Top center (above listener) output channel (see TC in channel diagram)

ChannelOutTopFrontCenter
Obsolete.

Top front center output channel (see TFC in channel diagram above FC)

ChannelOutTopFrontLeft
Obsolete.

Top front left output channel (see TFL in channel diagram above FL)

ChannelOutTopFrontRight
Obsolete.

Top front right output channel (see TFR in channel diagram above FR)

ChannelOutTopSideLeft
Obsolete.

Top side left output channel (see TSL in channel diagram above SL)

ChannelOutTopSideRight
Obsolete.

Top side right output channel (see TSR in channel diagram above SR)

EncodingDra
Obsolete.

Audio data format: DRA compressed

EncodingDsd
Obsolete.

Audio data format: Direct Stream Digital

EncodingDtsHdMa
Obsolete.

Audio data format: DTS HD Master Audio compressed. A DTS HD Master Audio stream is variable bit rate and contains lossless audio.

EncodingDtsUhd

Audio data format: DTS UHD Profile-1 compressed (aka DTS:X Profile 1). Has the same meaning and value as ENCODING_DTS_UHD_P1.

EncodingDtsUhdP1
Obsolete.

Audio data format: DTS UHD Profile-1 compressed (aka DTS:X Profile 1). Has the same meaning and value as the deprecated #ENCODING_DTS_UHD.

EncodingDtsUhdP2
Obsolete.

Audio data format: DTS UHD Profile-2 compressed. DTS-UHD Profile-2 supports delivery of Channel-Based Audio, Object-Based Audio and High Order Ambisonic presentations up to the fourth order.

EncodingMpeghBlL3
Obsolete.

Audio data format: MPEG-H baseline profile, level 3

EncodingMpeghBlL4
Obsolete.

Audio data format: MPEG-H baseline profile, level 4

EncodingMpeghLcL3
Obsolete.

Audio data format: MPEG-H low complexity profile, level 3

EncodingMpeghLcL4
Obsolete.

Audio data format: MPEG-H low complexity profile, level 4

EncodingOpus
Obsolete.

Audio data format: OPUS compressed.

EncodingPcm24bitPacked
Obsolete.

Audio data format: PCM 24 bit per sample packed as 3 bytes.

EncodingPcm32bit
Obsolete.

Audio data format: PCM 32 bit per sample.

SampleRateUnspecified

Sample rate will be a route-dependent value.

Properties

ChannelCount

Return the channel count.

ChannelIndexMask

Return the channel index mask.

ChannelMask

Return the channel mask.

Class

Returns the runtime class of this Object.

(Inherited from Object)
Creator
Encoding

Return the encoding.

FrameSizeInBytes

Return the frame size in bytes.

Handle

The handle to the underlying Android instance.

(Inherited from Object)
JniIdentityHashCode (Inherited from Object)
JniPeerMembers
PeerReference (Inherited from Object)
SampleRate

Return the sample rate.

ThresholdClass

This API supports the Mono for Android infrastructure and is not intended to be used directly from your code.

ThresholdType

This API supports the Mono for Android infrastructure and is not intended to be used directly from your code.

Methods

Clone()

Creates and returns a copy of this object.

(Inherited from Object)
DescribeContents()
Dispose() (Inherited from Object)
Dispose(Boolean) (Inherited from Object)
Equals(Object)

Indicates whether some other object is "equal to" this one.

(Inherited from Object)
GetHashCode()

Returns a hash code value for the object.

(Inherited from Object)
JavaFinalize()

Called by the garbage collector on an object when garbage collection determines that there are no more references to the object.

(Inherited from Object)
Notify()

Wakes up a single thread that is waiting on this object's monitor.

(Inherited from Object)
NotifyAll()

Wakes up all threads that are waiting on this object's monitor.

(Inherited from Object)
SetHandle(IntPtr, JniHandleOwnership)

Sets the Handle property.

(Inherited from Object)
ToArray<T>() (Inherited from Object)
ToString()

Returns a string representation of the object.

(Inherited from Object)
UnregisterFromRuntime() (Inherited from Object)
Wait()

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>.

(Inherited from Object)
Wait(Int64, Int32)

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>, or until a certain amount of real time has elapsed.

(Inherited from Object)
Wait(Int64)

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>, or until a certain amount of real time has elapsed.

(Inherited from Object)
WriteToParcel(Parcel, ParcelableWriteFlags)

Explicit Interface Implementations

IJavaPeerable.Disposed() (Inherited from Object)
IJavaPeerable.DisposeUnlessReferenced() (Inherited from Object)
IJavaPeerable.Finalized() (Inherited from Object)
IJavaPeerable.JniManagedPeerState (Inherited from Object)
IJavaPeerable.SetJniIdentityHashCode(Int32) (Inherited from Object)
IJavaPeerable.SetJniManagedPeerState(JniManagedPeerStates) (Inherited from Object)
IJavaPeerable.SetPeerReference(JniObjectReference) (Inherited from Object)

Extension Methods

JavaCast<TResult>(IJavaObject)

Performs an Android runtime-checked type conversion.

JavaCast<TResult>(IJavaObject)
GetJniTypeName(IJavaPeerable)
