AAC Encoder

The Microsoft Media Foundation AAC encoder is a Media Foundation Transform that encodes Advanced Audio Coding (AAC) Low Complexity (LC) profile audio, as defined by ISO/IEC 13818-7 (MPEG-2 Audio Part 7).

The AAC encoder does not support encoding to any other AAC profiles, such as Main, SSR, or LTP.

Class Identifier

The class identifier (CLSID) of the AAC encoder is CLSID_AACMFTEncoder, defined in the header file wmcodecdsp.h.

Media Types

The AAC encoder supports the following media types. You can set the types in either order: input type first, or output type first.

Input Types

Set the following attributes on the input media type.

Attribute Description Remarks
MF_MT_MAJOR_TYPE Major type. Must be MFMediaType_Audio.
MF_MT_SUBTYPE Subtype. Must be MFAudioFormat_PCM.
MF_MT_AUDIO_BITS_PER_SAMPLE Bits per sample. Must be 16.
MF_MT_AUDIO_SAMPLES_PER_SECOND Samples per second. The following values are supported:
  • 44100 (44.1 kHz)
  • 48000 (48 kHz)
MF_MT_AUDIO_NUM_CHANNELS Number of channels. Must be 1 (mono), 2 (stereo), or 6 (5.1). Note: Support for 6 audio channels was introduced in Windows 10 and is not available in earlier versions of Windows.
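
The constraints in the table above can be checked before the input type is handed to the encoder. The following is a minimal sketch in portable C++ that does not call Media Foundation; `PcmFormat`, `IsSupportedAacEncoderInput`, and the `allowSixChannels` flag are hypothetical names introduced for illustration, and the flag simply models the Windows 10 requirement for 5.1 input.

```cpp
#include <cstdint>

// Hypothetical helper type: the three numeric attributes of the PCM input type.
struct PcmFormat {
    uint32_t samplesPerSec;   // MF_MT_AUDIO_SAMPLES_PER_SECOND
    uint32_t channels;        // MF_MT_AUDIO_NUM_CHANNELS
    uint32_t bitsPerSample;   // MF_MT_AUDIO_BITS_PER_SAMPLE
};

// Returns true if the format satisfies the documented input constraints.
// Pass allowSixChannels = false to model Windows versions before Windows 10.
bool IsSupportedAacEncoderInput(const PcmFormat& f, bool allowSixChannels = true) {
    const bool rateOk = (f.samplesPerSec == 44100 || f.samplesPerSec == 48000);
    const bool channelsOk = (f.channels == 1 || f.channels == 2 ||
                             (allowSixChannels && f.channels == 6));
    return f.bitsPerSample == 16 && rateOk && channelsOk;
}
```

A caller might run this check first and only then build the real media type, so that unsupported formats are rejected with a clear message instead of a failure from the transform.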

After the input type is set, the encoder derives the following values and adds them to the media type:

Output Types

Set the following attributes on the output media type.

Attribute Description Remarks
MF_MT_MAJOR_TYPE Major type. Must be MFMediaType_Audio.
MF_MT_SUBTYPE Audio subtype. Must be MFAudioFormat_AAC.
MF_MT_AUDIO_BITS_PER_SAMPLE Bits per sample. Must be 16.
MF_MT_AUDIO_SAMPLES_PER_SECOND Samples per second. Must match the input type.
MF_MT_AUDIO_NUM_CHANNELS Number of channels. Must match the input type.
MF_MT_AUDIO_AVG_BYTES_PER_SECOND Bit rate of the encoded AAC stream, in bytes per second. The following values are supported:
  • 12000
  • 16000
  • 20000
  • 24000
If using 6 channels, multiply these values by 6.
The default value for both mono and stereo is 12000 (96 kbps). The default value for 6 channels is 72000 (576 kbps).
MF_MT_AAC_PAYLOAD_TYPE The AAC payload type. Optional. If the attribute is not set, the default value is zero, indicating that the stream contains raw_data_block elements only (raw AAC).
In Windows 7, if this attribute is set, the value must be zero.
Starting in Windows 8, the value can be 0 (raw AAC) or 1 (ADTS AAC).
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION The AAC audio profile and level. Optional. The following values are supported:
  • 0x29 (default)
  • 0x2A
  • 0x2B
  • 0x2C
  • 0x2E
  • 0x2F
  • 0x30
  • 0x31
  • 0x32
  • 0x33
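
The bit-rate rule for MF_MT_AUDIO_AVG_BYTES_PER_SECOND in the table above (four base values, multiplied by 6 for 5.1) can be expressed as a small check. This is a sketch under that reading of the table; `ByteRateScale`, `IsSupportedAacByteRate`, and `DefaultAacByteRate` are hypothetical helper names, not part of any Media Foundation API.

```cpp
#include <cstdint>
#include <initializer_list>

// Scale factor applied to the documented base byte rates.
// Returns 0 for channel counts the encoder does not accept.
uint32_t ByteRateScale(uint32_t channels) {
    if (channels == 1 || channels == 2) return 1;
    if (channels == 6) return 6;
    return 0;
}

// Returns true if bytesPerSec is one of the documented values for this layout.
bool IsSupportedAacByteRate(uint32_t channels, uint32_t bytesPerSec) {
    const uint32_t scale = ByteRateScale(channels);
    if (scale == 0) return false;
    for (uint32_t base : {12000u, 16000u, 20000u, 24000u}) {
        if (bytesPerSec == base * scale) return true;
    }
    return false;
}

// Documented default: 12000 bytes/s for mono and stereo, 72000 for 5.1.
// Returns 0 for an unsupported channel count.
uint32_t DefaultAacByteRate(uint32_t channels) {
    return 12000u * ByteRateScale(channels);
}
```

Multiplying a byte rate by 8 gives the bit rate, so the stereo default of 12000 corresponds to 96 kbps and the 5.1 default of 72000 to 576 kbps.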

The following table lists the values that can be used for the MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION attribute.

MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION value Profile
0x29 AAC Profile L2
0x2A AAC Profile L4
0x2B AAC Profile L5
0x2C High Efficiency v1 AAC Profile L2
0x2E High Efficiency v1 AAC Profile L4
0x2F High Efficiency v1 AAC Profile L5
0x30 High Efficiency v2 AAC Profile L2
0x31 High Efficiency v2 AAC Profile L3
0x32 High Efficiency v2 AAC Profile L4
0x33 High Efficiency v2 AAC Profile L5
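
For logging or diagnostics, the table above can be turned into a lookup. The sketch below only mirrors the rows listed in this document; `DescribeAacProfileLevel` is a hypothetical helper, and it returns an empty string for any value that is not in the table (for example 0x2D).

```cpp
#include <cstdint>
#include <string>

// Maps a profile-level indication value to the profile name from the table.
// Returns an empty string for values the table does not list.
std::string DescribeAacProfileLevel(uint8_t value) {
    switch (value) {
        case 0x29: return "AAC Profile L2";
        case 0x2A: return "AAC Profile L4";
        case 0x2B: return "AAC Profile L5";
        case 0x2C: return "High Efficiency v1 AAC Profile L2";
        case 0x2E: return "High Efficiency v1 AAC Profile L4";
        case 0x2F: return "High Efficiency v1 AAC Profile L5";
        case 0x30: return "High Efficiency v2 AAC Profile L2";
        case 0x31: return "High Efficiency v2 AAC Profile L3";
        case 0x32: return "High Efficiency v2 AAC Profile L4";
        case 0x33: return "High Efficiency v2 AAC Profile L5";
        default:   return "";
    }
}
```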

After the output type is set, the AAC encoder updates the type by adding the MF_MT_USER_DATA attribute. This attribute contains the portion of the HEAACWAVEINFO structure that appears after the WAVEFORMATEX structure (that is, after the wfx member). This is followed by the AudioSpecificConfig() data, as defined by ISO/IEC 14496-3.
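
To make the layout of MF_MT_USER_DATA concrete, the following sketch assembles a 14-byte blob of the shape described above. It is an illustration, not the encoder's implementation. It assumes the fields after wfx in HEAACWAVEINFO are wPayloadType, wAudioProfileLevelIndication, wStructType, wReserved1 (each 2 bytes, little-endian) and dwReserved2 (4 bytes), and that AudioSpecificConfig() for AAC LC is 16 bits: a 5-bit object type of 2, a 4-bit sampling frequency index, a 4-bit channel configuration, and three zero flag bits. `SamplingFrequencyIndex` and `BuildAacUserData` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// MPEG-4 samplingFrequencyIndex for the two rates this encoder accepts.
// Returns -1 for any other rate.
int SamplingFrequencyIndex(uint32_t samplesPerSec) {
    switch (samplesPerSec) {
        case 48000: return 3;
        case 44100: return 4;
        default:    return -1;
    }
}

// Builds 12 bytes of HEAACWAVEINFO (the part after wfx) followed by a
// 2-byte AAC LC AudioSpecificConfig.
std::array<uint8_t, 14> BuildAacUserData(uint32_t samplesPerSec, uint32_t channels,
                                         uint16_t profileLevel, uint16_t payloadType = 0) {
    const int freqIndex = SamplingFrequencyIndex(samplesPerSec);
    assert(freqIndex >= 0);
    assert(channels == 1 || channels == 2 || channels == 6);

    std::array<uint8_t, 14> blob{};  // all bytes start at zero
    // wPayloadType and wAudioProfileLevelIndication, little-endian.
    blob[0] = static_cast<uint8_t>(payloadType & 0xFF);
    blob[1] = static_cast<uint8_t>(payloadType >> 8);
    blob[2] = static_cast<uint8_t>(profileLevel & 0xFF);
    blob[3] = static_cast<uint8_t>(profileLevel >> 8);
    // Bytes 4..11 (wStructType, wReserved1, dwReserved2) stay zero.

    // AudioSpecificConfig, big-endian bit packing. For 1, 2, and 6 channels
    // the channel configuration value equals the channel count.
    const uint16_t asc = static_cast<uint16_t>(
        (2u << 11) | (static_cast<uint32_t>(freqIndex) << 7) | (channels << 3));
    blob[12] = static_cast<uint8_t>(asc >> 8);
    blob[13] = static_cast<uint8_t>(asc & 0xFF);
    return blob;
}
```

With 44100 Hz, 2 channels, and profile level 0x29, this produces the same bytes as the MF_MT_USER_DATA value in the example output media type later in this topic. In practice the encoder fills in the attribute itself, so code like this is mainly useful for checking or for constructing a type for a decoder.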

If MF_MT_AAC_PAYLOAD_TYPE is zero (the default value), each output sample contains one compressed AAC frame with no header. This format is equivalent to one raw_data_block() element as defined by ISO/IEC 13818-7. In Windows 7, the MF_MT_AAC_PAYLOAD_TYPE attribute, if present in the output type, must be set to zero.

Each output sample contains one compressed AAC frame corresponding to 1024 PCM samples. For example, at a 48-kHz sampling rate, the duration of one compressed frame is 21.33 ms.
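
The frame duration follows directly from the 1024-sample frame size stated above. A minimal sketch, expressing the result in the 100-nanosecond units that Media Foundation uses for sample times (the function and constant names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kSamplesPerAacFrame = 1024;  // PCM samples per channel per frame
constexpr int64_t kHnsPerSecond = 10000000;    // 100-ns units in one second

// Duration of one compressed AAC frame, in 100-ns units (rounded down).
int64_t AacFrameDurationHns(uint32_t samplesPerSec) {
    assert(samplesPerSec > 0);
    return (kSamplesPerAacFrame * kHnsPerSecond) / samplesPerSec;
}
```

At 48000 Hz this gives 213333 units, which is the 21.33 ms quoted above; at 44100 Hz it gives 232199 units, or about 23.22 ms.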

Example Media Types

Here is an example of the media types needed to encode 44.1-kHz stereo PCM audio to raw AAC at 160 kbps.

Input media type:

Attribute Value
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_SUBTYPE MFAudioFormat_PCM
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_AUDIO_SAMPLES_PER_SECOND 44100
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 176400 (optional)
MF_MT_AUDIO_BLOCK_ALIGNMENT 4 (optional)
MF_MT_ALL_SAMPLES_INDEPENDENT 1 (optional)
MF_MT_AVG_BITRATE 1411200 (optional)
MF_MT_FIXED_SIZE_SAMPLES 1 (optional)
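
The optional numeric values in the input example are consistent with standard PCM arithmetic on the three required values. The sketch below shows that arithmetic; it assumes uncompressed interleaved PCM, and `DerivedPcmValues` and `DerivePcmValues` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Values that follow from sample rate, channel count, and bit depth for PCM.
struct DerivedPcmValues {
    uint32_t blockAlign;      // bytes per sample frame (all channels)
    uint32_t avgBytesPerSec;  // bytes per second
    uint32_t avgBitrate;      // bits per second
};

DerivedPcmValues DerivePcmValues(uint32_t samplesPerSec, uint32_t channels,
                                 uint32_t bitsPerSample) {
    assert(bitsPerSample % 8 == 0);
    DerivedPcmValues v{};
    v.blockAlign = channels * (bitsPerSample / 8);
    v.avgBytesPerSec = samplesPerSec * v.blockAlign;
    v.avgBitrate = v.avgBytesPerSec * 8;
    return v;
}
```

For 44100 Hz, 2 channels, and 16 bits, this yields a block alignment of 4, 176400 bytes per second, and 1411200 bits per second, matching the table above.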

Output media type:

Attribute Value
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_SUBTYPE MFAudioFormat_AAC
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_AUDIO_SAMPLES_PER_SECOND 44100
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 20000
MF_MT_AAC_PAYLOAD_TYPE 0 (optional)
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION 0x29 (optional)
MF_MT_AUDIO_BLOCK_ALIGNMENT 1 (optional)
MF_MT_ALL_SAMPLES_INDEPENDENT 0 (optional)
MF_MT_AVG_BITRATE 160000 (optional)
MF_MT_USER_DATA {0x00, 0x00, 0x29, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x12, 0x10} (optional)
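
The last two bytes of the MF_MT_USER_DATA value above are the AudioSpecificConfig() data. As a cross-check, the following sketch decodes a 2-byte configuration, assuming the common short form: a 5-bit audio object type, a 4-bit sampling frequency index, and a 4-bit channel configuration. It does not handle the escape encodings for object type or sampling rate. `AacConfig` and `ParseAudioSpecificConfig` are hypothetical names.

```cpp
#include <cstdint>

struct AacConfig {
    uint32_t audioObjectType;       // 2 = AAC LC
    uint32_t samplesPerSec;         // decoded from the frequency index
    uint32_t channelConfiguration;  // 1 = mono, 2 = stereo, 6 = 5.1
};

// Decodes the first 16 bits of an AudioSpecificConfig.
// Returns false for escape or reserved values, which this sketch does not handle.
bool ParseAudioSpecificConfig(uint8_t byte0, uint8_t byte1, AacConfig& out) {
    static const uint32_t kRates[13] = {96000, 88200, 64000, 48000, 44100, 32000, 24000,
                                        22050, 16000, 12000, 11025, 8000,  7350};
    const uint32_t bits = (static_cast<uint32_t>(byte0) << 8) | byte1;
    const uint32_t objectType = bits >> 11;
    const uint32_t freqIndex = (bits >> 7) & 0xF;
    if (objectType == 31 || freqIndex >= 13) return false;
    out = {objectType, kRates[freqIndex], (bits >> 3) & 0xF};
    return true;
}
```

Decoding 0x12, 0x10 gives object type 2 (AAC LC), 44100 Hz, and channel configuration 2, which agrees with the other attributes in the example output type.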

Remarks

In the current implementation, every input sample must have a valid time and duration. To set the sample time, call IMFSample::SetSampleTime. To set the sample duration, call IMFSample::SetSampleDuration.

If the sample time is not set, the encoder's IMFTransform::ProcessInput method returns MF_E_NO_SAMPLE_TIMESTAMP. If the sample duration is not set, the ProcessInput method returns MF_E_NO_SAMPLE_DURATION.

Sample duration can be calculated as follows:

LONGLONG hnsSampleDuration = 
    ( nAudioSamplesPerChannel * (LONGLONG)10000000 )/nSamplesPerSec;

where nAudioSamplesPerChannel is the number of PCM audio samples per channel in the input buffer, and nSamplesPerSec is the sampling rate, in samples per second.
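
In practice, nAudioSamplesPerChannel usually has to be worked out from the size of the input buffer. The sketch below wraps the formula above in a function that starts from the buffer length in bytes; it assumes interleaved PCM with a whole number of bytes per sample, and `PcmBufferDurationHns` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Duration of an interleaved PCM buffer, in 100-ns units (rounded down).
int64_t PcmBufferDurationHns(uint32_t cbBuffer, uint32_t channels,
                             uint32_t bitsPerSample, uint32_t samplesPerSec) {
    assert(channels > 0 && samplesPerSec > 0);
    assert(bitsPerSample >= 8 && bitsPerSample % 8 == 0);
    const uint32_t blockAlign = channels * (bitsPerSample / 8);  // bytes per sample frame
    const int64_t samplesPerChannel = cbBuffer / blockAlign;
    return (samplesPerChannel * 10000000) / samplesPerSec;
}
```

For example, a 4096-byte buffer of 16-bit stereo audio at 44100 Hz holds 1024 samples per channel, so the duration is 232199 units (about 23.2 ms). The result would be passed to IMFSample::SetSampleDuration.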

Note

Due to a bug in the current implementation, if the sample duration is set to zero, the ProcessInput call succeeds, but a subsequent call to IMFTransform::ProcessOutput will throw a divide-by-zero exception. To avoid this error, set a valid nonzero duration on each input sample.
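
Because a zero duration is accepted by ProcessInput and only fails later, it can help to validate timing before the sample is submitted. The following is a sketch of such a pre-flight check in portable C++; `InputTiming`, `TimingCheck`, and `CheckInputTiming` are hypothetical names, and the caller is assumed to have recorded whether it set a time and duration on the sample.

```cpp
#include <cstdint>

// What the caller knows about the timing it put on an input sample.
struct InputTiming {
    bool hasTime;         // a sample time was set
    bool hasDuration;     // a sample duration was set
    int64_t durationHns;  // the duration, in 100-ns units
};

enum class TimingCheck { Ok, MissingTime, MissingDuration, NonPositiveDuration };

// Mirrors the documented failure cases, plus the zero-duration case that
// ProcessInput itself does not reject.
TimingCheck CheckInputTiming(const InputTiming& t) {
    if (!t.hasTime) return TimingCheck::MissingTime;
    if (!t.hasDuration) return TimingCheck::MissingDuration;
    if (t.durationHns <= 0) return TimingCheck::NonPositiveDuration;
    return TimingCheck::Ok;
}
```

A caller might skip or fix up any sample for which the result is not `TimingCheck::Ok`, rather than passing it to the encoder.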

Requirements

Requirement Value
Minimum supported client Windows 7 [desktop apps only]
Minimum supported server Windows Server 2008 R2 [desktop apps only]
DLL Mfaacenc.dll

See also

Codec Objects

AAC Decoder

AAC Media Types

Audio Media Types

MPEG-4 Support in Media Foundation

Supported Media Formats in Media Foundation