Multi-language audio with IIS Smooth Streaming: An example from Radiovaticana Live Streaming
One of the many useful features of IIS Media Services 4.0 and Smooth Streaming is the ability to stream live and on-demand content with multiple audio language tracks that the viewer can select. An example of this capability is the recent schedule of events that Vatican Radio (https://www.radiovaticana.org/) delivered during the last month. These included the Christmas night liturgical celebrations presided over by the Holy Father, the World Day of Peace on January 1st, and other events, all delivered with multiple audio tracks (Natural Audio, Italian, English, French, German, Spanish, Portuguese and Arabic). Another event with the Pope is scheduled for January 9 at 09:30 CET. The player used for these events was based on the Silverlight Media Framework (SMF) and lets the viewer select which audio track to listen to. Below is an image of the player with the audio track selector shown:
Leveraging the integrated media platform capabilities of IIS Media Services (IIS MS) 4.0, the Vatican was also able to deliver the same events to Apple devices, using the same encoders and media server infrastructure. Note that in this case only one audio track is available, because these devices do not natively support multiple audio tracks.
You can see the architecture implemented for these events below:
The Digital Rapids encoders received as inputs 8 audio signals from Vatican Radio channels and a video signal from the Vatican TV center. They generated 8 audio tracks in sync with 6 video tracks at different bit rates (quality levels) and pushed all the audio and video tracks to a publishing point on an IIS MS 4.0 ingest server. A subset of the video tracks and a single audio track went to a second publishing point for Apple devices on the same IIS MS 4.0 ingest server. The ingest server pushed the content of both publishing points to the origin server.
The first publishing point on the IIS MS origin server provided a client manifest (.ismc file) of all the available tracks and the actual content to Silverlight media players. The second origin server publishing point had the Apple Devices Adaptive Streaming feature selected. This enabled the origin server to do on-the-fly trans-muxing (repackaging from one file format to another) from the fragmented MP4 streams used by the Smooth Streaming format to the Apple HTTP Live Adaptive Streaming (HLS) format compatible with iPhone and iPad. It also created and published an HLS-compatible client manifest (.m3u8 file). An HTTP CDN (content delivery network) pulled the content from the origin publishing points and distributed it to viewers on their Silverlight or iPhone/iPad clients.
The Silverlight client player, based on SMF, read all the tracks published in the manifest and transparently adapted the video quality as needed, based on the available bandwidth and the video rendering capabilities of each client. The player also let the viewer choose the audio track.
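To make the idea of adapting video quality to bandwidth a bit more concrete, here is a minimal Python sketch of a bitrate picker. This is only an illustration and not the heuristic that SMF or the Silverlight Smooth Streaming client actually uses: the function name `pick_bitrate`, the `safety` factor and the bandwidth figures are all assumptions made up for this example.

```python
def pick_bitrate(available_bitrates, measured_bps, safety=0.8):
    """Return the highest bitrate that fits within a fraction of the measured bandwidth.

    'safety' is a hypothetical headroom factor, not a value taken from SMF.
    """
    budget = measured_bps * safety
    candidates = [b for b in sorted(available_bitrates) if b <= budget]
    # Fall back to the lowest quality level when nothing fits the budget.
    return candidates[-1] if candidates else min(available_bitrates)


# Bitrates of the two video tracks used in the manifests later in this post.
levels = [230000, 330000]
print(pick_bitrate(levels, 500000))
print(pick_bitrate(levels, 300000))
```

A real player also takes buffer level and rendering capabilities (dropped frames, screen size) into account, which this sketch ignores.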
If you are interested in more details about Smooth Streaming architectures for Live Streaming, you can start with this blog post.
As I mentioned before, the multiple audio tracks feature is available not only for live streaming but also for on-demand scenarios. On-demand gives you more flexibility, because you can add audio tracks to existing assets. The Smooth Streaming format lets you add audio tracks at any time without re-encoding the assets. This works because the format uses a server manifest to describe to the streaming server the tracks and source files that make up the asset, and a client manifest to describe the available tracks to the client.
If you are looking for more information about Smooth Streaming files, you can read this post and the Technical Overview of Smooth Streaming.
Smooth Streaming is built on top of technologies that Microsoft has released via the Community Promise Initiative, including the Protected Interoperable File Format (PIFF) and the IIS Smooth Streaming Transport Protocol (SSTP). The Protected Interoperable File Format (PIFF) Specification defines a standard file format for multimedia content delivery and playback. It includes the audio-video container, stream encryption, and metadata to support content delivery for multiple bit rate adaptive streaming, optionally using a standard encryption scheme that can support multiple digital rights management (DRM) systems. The IIS Smooth Streaming Transport Protocol Specification describes how live and on-demand Smooth Streaming audio/video content is distributed and cached over an HTTP network. It enables third parties to build their own client implementations that interoperate with IIS Media Services.
At any time you can encode a new PIFF (Protected Interoperable File Format) asset, creating a fragmented MP4 file (.ismv, or .isma in the case of an audio-only file) that contains the additional audio tracks. You can update the manifests to describe the additional tracks to the server and client. Ideally, you would synchronize the new audio track with the existing video track.
Here is an example of a server manifest (.ism extension) with two video tracks and one audio track:
<smil xmlns="https://www.w3.org/2001/SMIL20/Language">
  <head>
    <meta
      name="clientManifestRelativePath"
      content="big_buck_bunny_1080p_surround.ismc" />
    <metadata id="meta-rdf">
    ----omitted-------
  </head>
  <body>
    <switch>
      <video
        src="big_buck_bunny_1080p_surround_330.ismv"
        systemBitrate="330000">
        <param
          name="trackID"
          value="2"
          valuetype="data" />
        <param
          name="trackName"
          value="video"
          valuetype="data" />
        <param
          name="timeScale"
          value="10000000"
          valuetype="data" />
      </video>
      <video
        src="big_buck_bunny_1080p_surround_230.ismv"
        systemBitrate="230000">
        <param
          name="trackID"
          value="2"
          valuetype="data" />
        <param
          name="trackName"
          value="video"
          valuetype="data" />
        <param
          name="timeScale"
          value="10000000"
          valuetype="data" />
      </video>
      <audio
        src="big_buck_bunny_1080p_surround_6000.ismv"
        systemBitrate="128000">
        <param
          name="trackID"
          value="1"
          valuetype="data" />
        <param
          name="trackName"
          value="audio1"
          valuetype="data" />
        <param
          name="timeScale"
          value="10000000"
          valuetype="data" />
      </audio>
    </switch>
  </body>
</smil>
As you can see, this file describes to the server how many video and audio tracks are available and the location of the physical PIFF file where the content of each track is stored.
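If you want to check what a server manifest declares before editing it, a small script can help. The following is a minimal Python sketch, using only the standard library, that lists the tracks of a trimmed-down copy of the manifest above. The helper names `local_name` and `list_tracks` are my own for this illustration, and the embedded manifest is shortened, so treat it as a starting point rather than a complete tool.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the server manifest shown above (metadata omitted).
SERVER_MANIFEST = """<smil xmlns="https://www.w3.org/2001/SMIL20/Language">
  <head>
    <meta name="clientManifestRelativePath"
          content="big_buck_bunny_1080p_surround.ismc" />
  </head>
  <body>
    <switch>
      <video src="big_buck_bunny_1080p_surround_330.ismv" systemBitrate="330000">
        <param name="trackID" value="2" valuetype="data" />
      </video>
      <audio src="big_buck_bunny_1080p_surround_6000.ismv" systemBitrate="128000">
        <param name="trackID" value="1" valuetype="data" />
        <param name="trackName" value="audio1" valuetype="data" />
      </audio>
    </switch>
  </body>
</smil>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def list_tracks(manifest_xml):
    """Return one dict per <video>/<audio> element found in the manifest."""
    root = ET.fromstring(manifest_xml)
    tracks = []
    for elem in root.iter():
        kind = local_name(elem.tag)
        if kind not in ("video", "audio"):
            continue
        # Collect the <param name="..." value="..."/> children into a dict.
        params = {p.get("name"): p.get("value")
                  for p in elem if local_name(p.tag) == "param"}
        tracks.append({
            "kind": kind,
            "src": elem.get("src"),
            "bitrate": int(elem.get("systemBitrate")),
            "track_id": params.get("trackID"),
            "track_name": params.get("trackName"),
        })
    return tracks


for track in list_tracks(SERVER_MANIFEST):
    print(track)
```

Matching on the local tag name keeps the sketch independent of the exact namespace URI used in the manifest.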
Here is an example of a client manifest file (.ismc extension) for the same asset:
<SmoothStreamingMedia
  MajorVersion="2"
  MinorVersion="1"
  Duration="5964800000">
  <StreamIndex
    Type="video"
    Name="video"
    Chunks="312"
    QualityLevels="2"
    Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel
      Index="0"
      Bitrate="330000"
      FourCC="WVC1"
      MaxWidth="480"
      MaxHeight="272"
      CodecPrivateData="250000010FDB8A3BF21B8A3BF8EFF18044800000010E5A0040" />
    <QualityLevel
      Index="1"
      Bitrate="230000"
      FourCC="WVC1"
      MaxWidth="320"
      MaxHeight="180"
      CodecPrivateData="250000010FDB863BF21B8A3BF8EFF18044800000010E5A0040" />
    <c
      d="19999968"/>
    -----omitted-------
  </StreamIndex>
  <StreamIndex
    Type="audio"
    Index="0"
    Name="audio"
    Chunks="299"
    QualityLevels="1"
    Url="QualityLevels({bitrate})/Fragments(audio={start time})">
    <QualityLevel
      FourCC="WMAP"
      Bitrate="128000"
      SamplingRate="44100"
      Channels="2"
      BitsPerSample="16"
      PacketSize="5945"
      AudioTag="354"
      CodecPrivateData="1000030000000000000000000000E0000000" />
    <c
      d="20433560" />
    ----omitted ------
  </StreamIndex>
</SmoothStreamingMedia>
As you can see, this file describes to the client the video and audio tracks and fragments ("chunks") available.
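To show how a client might use the Url template and the chunk durations, here is a minimal Python sketch that turns them into fragment request URLs. It assumes that fragment start times are the running sum of the `d` (duration) values starting from zero, which holds when the chunks carry no explicit start time. The base URL `http://example.com/...ism` and the second duration value are hypothetical, since the manifest above only shows the first chunk.

```python
TIME_SCALE = 10_000_000  # ticks per second, from the timeScale parameter


def ticks_to_seconds(ticks, time_scale=TIME_SCALE):
    """Convert manifest time units (ticks) to seconds."""
    return ticks / time_scale


def fragment_urls(base_url, url_template, bitrate, durations):
    """Build one request URL per chunk from the StreamIndex Url template."""
    urls = []
    start = 0
    for duration in durations:
        path = (url_template
                .replace("{bitrate}", str(bitrate))
                .replace("{start time}", str(start)))
        urls.append(f"{base_url}/{path}")
        start += duration  # the next fragment starts where this one ends
    return urls


# Hypothetical publishing location; only the first duration is from the manifest.
base = "http://example.com/big_buck_bunny_1080p_surround.ism"
template = "QualityLevels({bitrate})/Fragments(video={start time})"
for url in fragment_urls(base, template, 330000, [19999968, 20000000]):
    print(url)

# The Duration attribute uses the same time units as the chunks.
print(ticks_to_seconds(5964800000), "seconds")
```

With a time scale of 10,000,000, the first video chunk of 19999968 ticks is roughly two seconds long.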
Again, if you would like to add an additional audio track, you can encode the track in a separate PIFF file and add it to the video asset by editing the server and client manifests to indicate the new track. Very flexible!
The work delivered by the Vatican Radio Web Team is a very good example of how you can use the robust features and flexibility of IIS Smooth Streaming to deliver a compelling and more engaging experience to more viewers.
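As a rough illustration of the server-side half of that edit, here is a minimal Python sketch that appends an `<audio>` entry to a server manifest. The file name `big_buck_bunny_italian.isma`, the track name `audio_ita`, the track ID and the bitrate are hypothetical values chosen for this example, and the function `add_audio_track` is my own helper, not part of any IIS tool.

```python
import xml.etree.ElementTree as ET

# Namespace as it appears in the server manifest shown earlier in this post.
NS = "https://www.w3.org/2001/SMIL20/Language"
ET.register_namespace("", NS)  # serialize without an 'ns0:' prefix

MANIFEST = f"""<smil xmlns="{NS}">
  <body>
    <switch>
      <audio src="big_buck_bunny_1080p_surround_6000.ismv" systemBitrate="128000">
        <param name="trackID" value="1" valuetype="data" />
        <param name="trackName" value="audio1" valuetype="data" />
        <param name="timeScale" value="10000000" valuetype="data" />
      </audio>
    </switch>
  </body>
</smil>"""


def add_audio_track(manifest_xml, src, bitrate, track_id, track_name):
    """Return the manifest text with one more <audio> entry in <switch>."""
    root = ET.fromstring(manifest_xml)
    switch = root.find(f"{{{NS}}}body/{{{NS}}}switch")
    if switch is None:
        raise ValueError("manifest has no <body>/<switch> element")
    audio = ET.SubElement(switch, f"{{{NS}}}audio",
                          {"src": src, "systemBitrate": str(bitrate)})
    for name, value in (("trackID", str(track_id)),
                        ("trackName", track_name),
                        ("timeScale", "10000000")):
        ET.SubElement(audio, f"{{{NS}}}param",
                      {"name": name, "value": value, "valuetype": "data"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical second audio track stored in its own audio-only file.
updated = add_audio_track(MANIFEST, "big_buck_bunny_italian.isma",
                          128000, 1, "audio_ita")
print(updated)
```

The client manifest needs a matching audio StreamIndex for the new track as well, with its own quality level and chunk list, which this sketch does not generate.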
Comments
Anonymous
April 26, 2011
Thanks for your note. I'm a product manager and we are using Smooth Streaming for an on-demand service. Do you know if it supports multiple subtitles? Thanks
Anonymous
July 31, 2011
Did you manage to deliver multiple audio with DRM-protected content? We are having trouble on the player: we are getting a 3009 AG_E_ATTRIBUTENOTFOUND error only when there are multiple audio tracks (the second one was added as you described) on DRM-protected content. Do you know of a bug or workaround for this issue?
Anonymous
August 01, 2011
Sebastian, yes, multiple subtitles are also supported. You can leverage an additional track in the manifest to insert subtitles.
Anonymous
August 01, 2011
Multiple audio with DRM is supported. I have some partners that implemented multi-audio with DRM in an on-demand scenario for a movie store. What type of scenario is required in your case? Live or on-demand? What type of encoder do you use? From the exception that you report, I suppose the problem comes from a malformed manifest and is not directly connected to DRM. Does the additional track work without DRM?
Anonymous
September 19, 2011
Hi Giuseppe, I was wondering how we specify a specific language to be chosen in a live stream. Currently it seems to be chosen randomly.
Anonymous
June 21, 2013
I want to link to the following website: 85.25.111.168/smoothstreamingplayer.html. But I want it to automatically select the audio track audio_ger. How must I call the URL? Please help me.
Anonymous
April 15, 2014
We are planning to use adaptive bit rate for audio-only files with DRM. Will that be supported? I do not see any samples with multiple bit rates for audio-only files.