AudioFormat Class
Definition
The AudioFormat class is used to access a number of audio format and channel configuration constants.
[Android.Runtime.Register("android/media/AudioFormat", DoNotGenerateAcw=true)]
public class AudioFormat : Java.Lang.Object, Android.OS.IParcelable, IDisposable, Java.Interop.IJavaPeerable
[<Android.Runtime.Register("android/media/AudioFormat", DoNotGenerateAcw=true)>]
type AudioFormat = class
inherit Object
interface IParcelable
interface IJavaObject
interface IDisposable
interface IJavaPeerable
Remarks
The AudioFormat class is used to access a number of audio format and channel configuration constants. They are used, for instance, in AudioTrack and AudioRecord as valid values for individual parameters of constructors such as AudioTrack#AudioTrack(int, int, int, int, int, int), where the fourth parameter is one of the AudioFormat.ENCODING_* constants. The AudioFormat constants are also used in MediaFormat to specify audio-related values commonly used in media, such as for MediaFormat#KEY_CHANNEL_MASK.
The AudioFormat.Builder class can be used to create instances of the AudioFormat class. Refer to AudioFormat.Builder for documentation on the mechanics of configuring and building such instances. Here we describe the main concepts that the AudioFormat class allows you to convey in each instance:
1. sample rate
2. encoding
3. channel masks
Closely associated with the AudioFormat is the notion of an audio frame, which is used throughout the documentation to represent the minimum size complete unit of audio data.
Sample rate
Expressed in Hz, the sample rate in an AudioFormat instance is the number of audio samples for each channel per second in the content you are playing or recording. It is not the sample rate at which content is rendered or produced. For instance, a sound at a media sample rate of 8000 Hz can be played on a device operating at a sample rate of 48000 Hz. The platform handles the sample rate conversion automatically, so the sound will not play at 6x speed.
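The arithmetic behind the 8000 Hz to 48000 Hz example can be sketched as follows. This is a minimal illustration, not platform code: the class `SampleRateSketch` and its methods are hypothetical names, and the real resampler is internal to Android. It only shows that conversion changes the frame count, not the duration.

```java
final class SampleRateSketch {
    /**
     * Number of frames needed at deviceRateHz to cover the same duration as
     * sourceFrames at sourceRateHz (rounded to the nearest frame).
     * Hypothetical helper; the platform performs the real conversion.
     */
    static long resampledFrameCount(long sourceFrames, int sourceRateHz, int deviceRateHz) {
        return (sourceFrames * deviceRateHz + sourceRateHz / 2) / sourceRateHz;
    }

    /** Duration in seconds of a run of frames at the given sample rate. */
    static double durationSeconds(long frames, int sampleRateHz) {
        return (double) frames / sampleRateHz;
    }

    public static void main(String[] args) {
        long sourceFrames = 8000; // one second of 8000 Hz content
        long deviceFrames = resampledFrameCount(sourceFrames, 8000, 48000);
        // Six times as many frames, but the same duration on both sides.
        System.out.println(deviceFrames + " frames at the device rate");
        System.out.println(durationSeconds(sourceFrames, 8000) + " s at the source rate");
        System.out.println(durationSeconds(deviceFrames, 48000) + " s at the device rate");
    }
}
```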
As of API android.os.Build.VERSION_CODES#M, sample rates up to 192kHz are supported for AudioRecord and AudioTrack, with sample rate conversion performed as needed. To improve efficiency and avoid lossy conversions, it is recommended to match the sample rate for AudioRecord and AudioTrack to the endpoint device sample rate, and limit the sample rate to no more than 48kHz unless there are special device capabilities that warrant a higher rate.
Encoding
Audio encoding is used to describe the bit representation of audio data, which can be either linear PCM or compressed audio, such as AC3 or DTS.
For linear PCM, the audio encoding describes the sample size (8 bits, 16 bits, or 32 bits) and the sample representation (integer or float).
- ENCODING_PCM_8BIT: The audio sample is an 8 bit unsigned integer in the range [0, 255], with a 128 offset for zero. This is typically stored as a Java byte in a byte array or ByteBuffer. Since the Java byte is signed, be careful with math operations and conversions as the most significant bit is inverted.
- ENCODING_PCM_16BIT: The audio sample is a 16 bit signed integer typically stored as a Java short in a short array, but when the short is stored in a ByteBuffer, it is native endian (as compared to the default Java big endian). The short has full range from [-32768, 32767], and is sometimes interpreted as fixed point Q.15 data.
- ENCODING_PCM_FLOAT: Introduced in API android.os.Build.VERSION_CODES#LOLLIPOP, this encoding specifies that the audio sample is a 32 bit IEEE single precision float. The sample can be manipulated as a Java float in a float array, though within a ByteBuffer it is stored in native endian byte order. The nominal range of ENCODING_PCM_FLOAT audio data is [-1.0, 1.0]. It is implementation dependent whether the positive maximum of 1.0 is included in the interval. Values outside of the nominal range are clamped before sending to the endpoint device. Beware that the handling of NaN is undefined, subnormals may be treated as zero, and infinities are generally clamped just like other values for AudioTrack. Try to avoid infinities because they can easily generate a NaN.
  To achieve higher audio bit depth than a signed 16 bit integer short, it is recommended to use ENCODING_PCM_FLOAT for audio capture, processing, and playback. Floats are efficiently manipulated by modern CPUs, have greater precision than 24 bit signed integers, and have greater dynamic range than 32 bit signed integers. AudioRecord as of API android.os.Build.VERSION_CODES#M and AudioTrack as of API android.os.Build.VERSION_CODES#LOLLIPOP support ENCODING_PCM_FLOAT.
- ENCODING_PCM_24BIT_PACKED: Introduced in API android.os.Build.VERSION_CODES#S, this encoding specifies that the audio sample is an extended precision 24 bit signed integer stored as 3 Java bytes in a ByteBuffer or byte array in native endian (see java.nio.ByteOrder#nativeOrder()). Each sample has full range from [-8388608, 8388607], and can be interpreted as fixed point Q.23 data.
- ENCODING_PCM_32BIT: Introduced in API android.os.Build.VERSION_CODES#S, this encoding specifies that the audio sample is an extended precision 32 bit signed integer stored as 4 Java bytes in a ByteBuffer or byte array in native endian (see java.nio.ByteOrder#nativeOrder()). Each sample has full range from [-2147483648, 2147483647], and can be interpreted as fixed point Q.31 data.
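The representation notes above can be illustrated with a few conversions. This is a sketch under stated assumptions: `PcmSampleSketch` and its method names are hypothetical, and it uses only `java.nio` rather than any Android API. It shows removing the 128 offset from an 8 bit sample, packing 16 bit samples in native byte order, sign-extending a packed 24 bit sample, and clamping a float sample to the nominal range.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmSampleSketch {
    /** 8 bit PCM is unsigned with 128 as zero; mask the signed Java byte, then remove the offset. */
    static int pcm8ToSigned(byte stored) {
        return (stored & 0xFF) - 128; // result is in [-128, 127]
    }

    /** Interprets a 16 bit sample as Q.15 fixed point and scales it to [-1.0, 1.0). */
    static float pcm16ToFloat(short sample) {
        return sample / 32768.0f;
    }

    /** Packs 16 bit samples into a ByteBuffer using native byte order instead of Java's big-endian default. */
    static ByteBuffer packPcm16Native(short[] samples) {
        ByteBuffer buffer = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.nativeOrder());
        for (short s : samples) {
            buffer.putShort(s);
        }
        buffer.flip(); // make the written bytes readable from position 0
        return buffer;
    }

    /** Reads one packed 24 bit sample (3 bytes) at the current position, honoring buffer.order(). */
    static int readPcm24Packed(ByteBuffer buffer) {
        int b0 = buffer.get() & 0xFF;
        int b1 = buffer.get() & 0xFF;
        int b2 = buffer.get() & 0xFF;
        int value = buffer.order() == ByteOrder.LITTLE_ENDIAN
                ? (b0 | (b1 << 8) | (b2 << 16))
                : ((b0 << 16) | (b1 << 8) | b2);
        return (value << 8) >> 8; // sign-extend from 24 to 32 bits
    }

    /** Clamps a float sample to the nominal range; NaN is passed through unchanged. */
    static float clampFloat(float sample) {
        return Math.max(-1.0f, Math.min(1.0f, sample));
    }
}
```

Calling `order(ByteOrder.nativeOrder())` explicitly matters because a freshly allocated `ByteBuffer` is big-endian regardless of the platform.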
For compressed audio, the encoding specifies the method of compression, for example ENCODING_AC3 and ENCODING_DTS. The compressed audio data is typically stored as bytes in a byte array or ByteBuffer. When a compressed audio encoding is specified for an AudioTrack, it creates a direct (non-mixed) track for output to an endpoint (such as HDMI) capable of decoding the compressed audio. For (most) other endpoints, which are not capable of decoding such compressed audio, you will need to decode the data first, typically by creating a MediaCodec. Alternatively, one may use MediaPlayer for playback of compressed audio files or streams.
When compressed audio is sent out through a direct AudioTrack, it need not be written in exact multiples of the audio access unit; this differs from MediaCodec input buffers.
Channel mask
Channel masks are used in AudioTrack and AudioRecord to describe the samples and their arrangement in the audio frame. They are also used in the endpoint (e.g. a USB audio interface, a DAC connected to headphones) to specify allowable configurations of a particular device.
As of API android.os.Build.VERSION_CODES#M, there are two types of channel masks: channel position masks and channel index masks.
Channel position masks
Channel position masks are the original Android channel masks, and have been in use since API android.os.Build.VERSION_CODES#BASE. For input and output, they imply a positional nature: the location of a speaker or a microphone for recording or playback.
For a channel position mask, each allowed channel position corresponds to a bit in the channel mask. If that channel position is present in the audio frame, that bit is set, otherwise it is zero. The order of the bits (from lsb to msb) corresponds to the order of that position's sample in the audio frame.
The canonical channel position masks by channel count are as follows:
channel count | channel position mask |
1 | CHANNEL_OUT_MONO |
2 | CHANNEL_OUT_STEREO |
3 | CHANNEL_OUT_STEREO | CHANNEL_OUT_FRONT_CENTER |
4 | CHANNEL_OUT_QUAD |
5 | CHANNEL_OUT_QUAD | CHANNEL_OUT_FRONT_CENTER |
6 | CHANNEL_OUT_5POINT1 |
7 | CHANNEL_OUT_5POINT1 | CHANNEL_OUT_BACK_CENTER |
8 | CHANNEL_OUT_7POINT1_SURROUND |
These masks are an ORed composite of individual channel masks. For example, CHANNEL_OUT_STEREO is composed of CHANNEL_OUT_FRONT_LEFT and CHANNEL_OUT_FRONT_RIGHT.
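The bit arithmetic behind position masks can be sketched as below. The class `ChannelPositionMaskSketch` and its helpers are hypothetical. The constant values are copied here on the assumption that they match the Android platform definitions, so real code should reference the AudioFormat constants instead of hard-coding numbers.

```java
final class ChannelPositionMaskSketch {
    // Assumed to match the platform constants; prefer the real AudioFormat fields in practice.
    static final int CHANNEL_OUT_FRONT_LEFT = 0x4;
    static final int CHANNEL_OUT_FRONT_RIGHT = 0x8;
    static final int CHANNEL_OUT_FRONT_CENTER = 0x10;
    // A composite mask is the OR of its individual position bits.
    static final int CHANNEL_OUT_STEREO = CHANNEL_OUT_FRONT_LEFT | CHANNEL_OUT_FRONT_RIGHT;

    /** Each set bit is one channel position present in the frame. */
    static int channelCount(int positionMask) {
        return Integer.bitCount(positionMask);
    }

    /** True when the single-bit position is present in the mask. */
    static boolean hasPosition(int positionMask, int position) {
        return (positionMask & position) == position;
    }

    /**
     * Index of a position's sample within the frame: the number of set bits
     * below it, since samples are ordered from lsb to msb. Returns -1 when
     * the position is absent. Assumes position has exactly one bit set.
     */
    static int sampleIndexInFrame(int positionMask, int position) {
        if (!hasPosition(positionMask, position)) {
            return -1;
        }
        return Integer.bitCount(positionMask & (position - 1));
    }
}
```

For the three-channel mask `CHANNEL_OUT_STEREO | CHANNEL_OUT_FRONT_CENTER`, this places front left at sample 0, front right at sample 1, and front center at sample 2.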
The following diagram represents the layout of the output channels, as seen from above the listener (in the center at the "lis" position, facing the front-center channel).
       TFL ----- TFC ----- TFR     T is Top
       |  \       |       /  |
       |   FL --- FC --- FR  |     F is Front
       |   |\     |     /|   |
       |   | BFL-BFC-BFR |   |     BF is Bottom Front
       |   |             |   |
       |   FWL   lis   FWR   |     W is Wide
       |   |             |   |
      TSL  SL    TC     SR  TSR    S is Side
       |   |             |   |
       |   BL --- BC -- BR   |     B is Back
       |  /               \  |
       TBL ----- TBC ----- TBR     C is Center, L/R is Left/Right
All "T" (top) channels are above the listener, all "BF" (bottom-front) channels are below the listener, and all others are in the listener's horizontal plane. When used in conjunction, LFE1 and LFE2 are below the listener; when used alone, the LFE plane is undefined. See the channel definitions for the abbreviations.
Channel index masks
Channel index masks were introduced in API android.os.Build.VERSION_CODES#M. They allow the selection of a particular channel from the source or sink endpoint by number, i.e. the first channel, the second channel, and so forth. This avoids problems with artificially assigning positions to channels of an endpoint, or figuring out what the i-th position bit is within an endpoint's channel position mask.
Here's an example where channel index masks address this confusion: dealing with a 4 channel USB device. Using a position mask, and based on the channel count, this would be a CHANNEL_OUT_QUAD device, but really one is only interested in channel 0 through channel 3. The USB device would then have the following individual bit channel masks: CHANNEL_OUT_FRONT_LEFT, CHANNEL_OUT_FRONT_RIGHT, CHANNEL_OUT_BACK_LEFT and CHANNEL_OUT_BACK_RIGHT. But which is channel 0 and which is channel 3?
For a channel index mask, each channel number is represented as a bit in the mask, from the lsb (channel 0) upwards to the msb. Numerically, this bit value is 1 << channelNumber. A set bit indicates that channel is present in the audio frame, otherwise it is cleared. The order of the bits also corresponds to that channel number's sample order in the audio frame.
For the previous 4 channel USB device example, the device would have a channel index mask 0xF. Suppose we wanted to select only the first and the third channels; this would correspond to a channel index mask 0x5 (the first and third bits set). If an AudioTrack uses this channel index mask, the audio frame would consist of two samples, the first sample of each frame routed to channel 0, and the second sample of each frame routed to channel 2. The canonical channel index masks by channel count are given by the formula (1 << channelCount) - 1.
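The mask values in this section follow directly from the two formulas, `1 << channelNumber` and `(1 << channelCount) - 1`. A minimal sketch, using the hypothetical class `ChannelIndexMaskSketch` and no Android dependencies:

```java
import java.util.ArrayList;
import java.util.List;

final class ChannelIndexMaskSketch {
    /** Canonical index mask for a channel count: the lowest channelCount bits set. */
    static int canonicalMask(int channelCount) {
        if (channelCount < 1 || channelCount > 31) {
            throw new IllegalArgumentException("channelCount out of range: " + channelCount);
        }
        return (1 << channelCount) - 1;
    }

    /** Builds an index mask by setting bit n for each requested channel number n. */
    static int maskForChannels(int... channelNumbers) {
        int mask = 0;
        for (int n : channelNumbers) {
            if (n < 0 || n > 31) {
                throw new IllegalArgumentException("channel number out of range: " + n);
            }
            mask |= 1 << n;
        }
        return mask;
    }

    /** Channel numbers in the order their samples appear in the frame (lsb first). */
    static List<Integer> channelsInFrameOrder(int indexMask) {
        List<Integer> channels = new ArrayList<>();
        for (int n = 0; n < 32; n++) {
            if ((indexMask & (1 << n)) != 0) {
                channels.add(n);
            }
        }
        return channels;
    }
}
```

With these helpers, the 4 channel USB device has canonical mask `0xF`, and selecting the first and third channels gives `0x5`, whose frame carries channel 0 followed by channel 2.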
Use cases
- Channel position mask for an endpoint: CHANNEL_OUT_FRONT_LEFT, CHANNEL_OUT_FRONT_CENTER, etc. for HDMI home theater purposes.
- Channel position mask for an audio stream: Creating an AudioTrack to output movie content, where 5.1 multichannel output is to be written.
- Channel index mask for an endpoint: USB devices for which input and output do not correspond to left or right speaker or microphone.
- Channel index mask for an audio stream: An AudioRecord may only want the third and fourth audio channels of the endpoint (i.e. the second channel pair), and not care about the position they correspond to, in which case the channel index mask is 0xC. Multichannel AudioRecord sessions should use channel index masks.
Audio Frame
For linear PCM, an audio frame consists of a set of samples captured at the same time, whose count and channel association are given by the channel mask, and whose sample contents are specified by the encoding. For example, a stereo 16 bit PCM frame consists of two 16 bit linear PCM samples, with a frame size of 4 bytes. For compressed audio, the meaning of an audio frame depends on the context in which the term is used. It may refer to an access unit of compressed data bytes that is logically grouped together for decoding and bitstream access (e.g. MediaCodec), to a single byte of compressed data (e.g. AudioTrack#getBufferSizeInFrames()), or to the linear PCM frame that results from decoding the compressed data (e.g. AudioTrack#getPlaybackHeadPosition()). For the purposes of AudioFormat#getFrameSizeInBytes(), a compressed data format returns a frame size of 1 byte.
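The frame size rule can be written out as a short calculation. This is an illustrative sketch only: `FrameSizeSketch` and its `Encoding` enum are hypothetical stand-ins for the ENCODING_* constants and do not reproduce the platform implementation of getFrameSizeInBytes().

```java
final class FrameSizeSketch {
    /** Hypothetical stand-in for the AudioFormat encoding constants. */
    enum Encoding {
        PCM_8BIT(1), PCM_16BIT(2), PCM_24BIT_PACKED(3), PCM_32BIT(4), PCM_FLOAT(4),
        COMPRESSED(0); // any compressed encoding, such as AC3 or DTS

        final int bytesPerSample;

        Encoding(int bytesPerSample) {
            this.bytesPerSample = bytesPerSample;
        }

        boolean isLinearPcm() {
            return bytesPerSample > 0;
        }
    }

    /**
     * Linear PCM: one sample per channel, so channelCount * bytesPerSample.
     * Compressed data: reported as 1 byte per frame, as described above.
     */
    static int frameSizeInBytes(Encoding encoding, int channelCount) {
        if (!encoding.isLinearPcm()) {
            return 1;
        }
        return channelCount * encoding.bytesPerSample;
    }

    public static void main(String[] args) {
        // Stereo 16 bit PCM: two 2-byte samples per frame.
        System.out.println(frameSizeInBytes(Encoding.PCM_16BIT, 2));
    }
}
```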
Java documentation for android.media.AudioFormat.
Portions of this page are modifications based on work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.
Constructors
AudioFormat() | |
AudioFormat(IntPtr, JniHandleOwnership) |
A constructor used when creating managed representations of JNI objects; called by the runtime. |
Fields
ChannelInvalid |
Invalid audio channel mask |
ChannelOut5point1point2 |
Obsolete.
Output channel mask for 5. |
ChannelOut5point1point4 |
Obsolete.
Output channel mask for 5. |
ChannelOut6point1 |
Obsolete.
Output channel mask for 6. |
ChannelOut7point1 |
This member is deprecated. |
ChannelOut7point1point2 |
Obsolete.
Output channel mask for 7. |
ChannelOut7point1point4 |
Obsolete.
Output channel mask for 7. |
ChannelOut9point1point4 |
Obsolete.
Output channel mask for 9. |
ChannelOut9point1point6 |
Obsolete.
Output channel mask for 9. |
ChannelOutBottomFrontCenter |
Obsolete.
Bottom front center output channel (see BFC in channel diagram below FC) |
ChannelOutBottomFrontLeft |
Obsolete.
Bottom front left output channel (see BFL in channel diagram below FL) |
ChannelOutBottomFrontRight |
Obsolete.
Bottom front right output channel (see BFR in channel diagram below FR) |
ChannelOutFrontWideLeft |
Obsolete.
Front wide left output channel (see FWL in channel diagram) |
ChannelOutFrontWideRight |
Obsolete.
Front wide right output channel (see FWR in channel diagram) |
ChannelOutLowFrequency2 |
Obsolete.
The second LFE channel
When used in conjunction with |
ChannelOutTopBackCenter |
Obsolete.
Top back center output channel (see TBC in channel diagram above BC) |
ChannelOutTopBackLeft |
Obsolete.
Top back left output channel (see TBL in channel diagram above BL) |
ChannelOutTopBackRight |
Obsolete.
Top back right output channel (see TBR in channel diagram above BR) |
ChannelOutTopCenter |
Obsolete.
Top center (above listener) output channel (see TC in channel diagram) |
ChannelOutTopFrontCenter |
Obsolete.
Top front center output channel (see TFC in channel diagram above FC) |
ChannelOutTopFrontLeft |
Obsolete.
Top front left output channel (see TFL in channel diagram above FL) |
ChannelOutTopFrontRight |
Obsolete.
Top front right output channel (see TFR in channel diagram above FR) |
ChannelOutTopSideLeft |
Obsolete.
Top side left output channel (see TSL in channel diagram above SL) |
ChannelOutTopSideRight |
Obsolete.
Top side right output channel (see TSR in channel diagram above SR) |
EncodingDra |
Obsolete.
Audio data format: DRA compressed |
EncodingDsd |
Obsolete.
Audio data format: Direct Stream Digital |
EncodingDtsHdMa |
Obsolete.
Audio data format: DTS HD Master Audio compressed DTS HD Master Audio stream is variable bit rate and contains lossless audio. |
EncodingDtsUhd |
Audio data format: DTS UHD Profile-1 compressed (aka DTS:X Profile 1) Has the same meaning and value as ENCODING_DTS_UHD_P1. |
EncodingDtsUhdP1 |
Obsolete.
Audio data format: DTS UHD Profile-1 compressed (aka DTS:X Profile 1)
Has the same meaning and value as the deprecated |
EncodingDtsUhdP2 |
Obsolete.
Audio data format: DTS UHD Profile-2 compressed DTS-UHD Profile-2 supports delivery of Channel-Based Audio, Object-Based Audio and High Order Ambisonic presentations up to the fourth order. |
EncodingMpeghBlL3 |
Obsolete.
Audio data format: MPEG-H baseline profile, level 3 |
EncodingMpeghBlL4 |
Obsolete.
Audio data format: MPEG-H baseline profile, level 4 |
EncodingMpeghLcL3 |
Obsolete.
Audio data format: MPEG-H low complexity profile, level 3 |
EncodingMpeghLcL4 |
Obsolete.
Audio data format: MPEG-H low complexity profile, level 4 |
EncodingOpus |
Obsolete.
Audio data format: OPUS compressed. |
EncodingPcm24bitPacked |
Obsolete.
Audio data format: PCM 24 bit per sample packed as 3 bytes. |
EncodingPcm32bit |
Obsolete.
Audio data format: PCM 32 bit per sample. |
SampleRateUnspecified |
Sample rate will be a route-dependent value. |
Properties
ChannelCount |
Return the channel count. |
ChannelIndexMask |
Return the channel index mask. |
ChannelMask |
Return the channel mask. |
Class |
Returns the runtime class of this |
Creator | |
Encoding |
Return the encoding. |
FrameSizeInBytes |
Return the frame size in bytes. |
Handle |
The handle to the underlying Android instance. (Inherited from Object) |
JniIdentityHashCode | (Inherited from Object) |
JniPeerMembers | |
PeerReference | (Inherited from Object) |
SampleRate |
Return the sample rate. |
ThresholdClass |
This API supports the Mono for Android infrastructure and is not intended to be used directly from your code. |
ThresholdType |
This API supports the Mono for Android infrastructure and is not intended to be used directly from your code. |
Methods
Clone() |
Creates and returns a copy of this object. (Inherited from Object) |
DescribeContents() | |
Dispose() | (Inherited from Object) |
Dispose(Boolean) | (Inherited from Object) |
Equals(Object) |
Indicates whether some other object is "equal to" this one. (Inherited from Object) |
GetHashCode() |
Returns a hash code value for the object. (Inherited from Object) |
JavaFinalize() |
Called by the garbage collector on an object when garbage collection determines that there are no more references to the object. (Inherited from Object) |
Notify() |
Wakes up a single thread that is waiting on this object's monitor. (Inherited from Object) |
NotifyAll() |
Wakes up all threads that are waiting on this object's monitor. (Inherited from Object) |
SetHandle(IntPtr, JniHandleOwnership) |
Sets the Handle property. (Inherited from Object) |
ToArray<T>() | (Inherited from Object) |
ToString() |
Returns a string representation of the object. (Inherited from Object) |
UnregisterFromRuntime() | (Inherited from Object) |
Wait() |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted. (Inherited from Object) |
Wait(Int64, Int32) |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted, or until a certain amount of real time has elapsed. (Inherited from Object) |
Wait(Int64) |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted, or until a certain amount of real time has elapsed. (Inherited from Object) |
WriteToParcel(Parcel, ParcelableWriteFlags) |
Explicit Interface Implementations
IJavaPeerable.Disposed() | (Inherited from Object) |
IJavaPeerable.DisposeUnlessReferenced() | (Inherited from Object) |
IJavaPeerable.Finalized() | (Inherited from Object) |
IJavaPeerable.JniManagedPeerState | (Inherited from Object) |
IJavaPeerable.SetJniIdentityHashCode(Int32) | (Inherited from Object) |
IJavaPeerable.SetJniManagedPeerState(JniManagedPeerStates) | (Inherited from Object) |
IJavaPeerable.SetPeerReference(JniObjectReference) | (Inherited from Object) |
Extension Methods
JavaCast<TResult>(IJavaObject) |
Performs an Android runtime-checked type conversion. |
JavaCast<TResult>(IJavaObject) | |
GetJniTypeName(IJavaPeerable) |
Gets the JNI name of the type of the instance |
JavaAs<TResult>(IJavaPeerable) |
Try to coerce |
TryJavaCast<TResult>(IJavaPeerable, TResult) |
Try to coerce |