Implementation of DTS Audio in IIS Smooth Streaming
28500 v1 Page 1 of 17
DTS Audio in IIS Smooth Streaming Version 1.0
COPYRIGHT AND LICENSE
Do Not Duplicate. © 2012 DTS, Inc. All Rights Reserved. Unauthorized duplication is a violation of State,
Federal, and International laws.
This publication and the Product are copyrighted and all rights are reserved by DTS, Inc (“DTS”). DTS
grants users of this document a nonexclusive, worldwide, royalty‐free copyright license to implement
the specifications stated herein. Notwithstanding the foregoing, no part of this publication may be
reproduced, photocopied, stored on a retrieval system, translated, or transmitted in any form or by any
means, electronic or otherwise, without the express prior written permission of DTS.
TRADEMARK
DTS, the Symbol, and DTS and the Symbol together are registered trademarks of DTS, Inc.
All other trademarks are the property of their respective owners.
Contents
1 Resources
2 Terms, Definitions and Abbreviations
3 Microsoft Smooth Streaming Manifest Files
3.1 Overview
3.2 ISM and ISMC files with DTS‐HD
3.2.1 IIS Smooth Streaming Server Manifest (On‐Demand)
3.2.2 IIS Smooth Streaming Server Manifest (Live)
3.3 IIS Smooth Streaming Client Manifest
3.3.1 Child Elements to StreamIndex
3.3.2 CodecPrivateData
3.3.3 ISMC example
4 Storing DTS‐HD Audio in PIFF 1.3 files
4.1 Overview
4.1.1 DTS Bitstream structure basics
4.1.2 Constraints on DTS within PIFF 1.3 files
4.1.3 Contents in the MDAT of each PIFF 1.3 fragment
4.1.4 Parsing a DTS bitstream
4.1.5 Constraints on the PIFF Track Fragments
4.2 Identifying DTS in a PIFF File
4.3 Encryption
4.4 Demultiplexing DTS from a PIFF file
4.4.1 Overview
4.4.2 Buffering considerations
4.4.3 Conflicts or known issues
Tables
Table 1 ‐ IIS .ism
Table 2 ‐ IIS .ism for On‐Demand
Table 3 ‐ IIS .ism for Live
Table 4 ‐ IIS .ismc Elements
Table 5 ‐ IIS .ismc Quality Attributes
Table 6 ‐ CodecPrivateData
Table 7 – SubFormat
Table 8 – Speaker Bitmasks
Table 9 ‐ dtsFlags
Table 10 ‐ Valid sync words
This document covers:
How to identify a DTS‐HD audio stream within a PIFF file
How to identify the availability of DTS‐HD streams in server and client manifests
The steps needed to extract the DTS‐HD audio stream from the PIFF file
1 Resources
The following documents are helpful in implementing the procedures described in this document:
[DTSHD] ETSI TS 102 114 (2011‐08), “DTS Coherent Acoustics Core and Extensions, with
Additional Profiles”, www.etsi.org
[DTSISO] “Implementations of DTS Audio in Media Files Based on ISO/IEC 14496”, DTS Inc.,
Document #9302J81100, www.dts.com
[ISOFF] ISO/IEC 14496‐12, Third Edition (2008), including corrigenda and amendments,
“Information technology – Coding of Audio‐Visual Objects, Part 12: ISO Base Media File
Format”, www.iso.org
[PIFF] Portable Encoding of Audio‐Video Objects: The Protected Interoperable File Format
(PIFF), go.microsoft.com
[SSLSM] IIS Smooth Streaming Live Server Manifest Format, msdn.microsoft.com
[SSCMF] IIS Smooth Streaming Client Manifest Format, msdn.microsoft.com
[WAVE] Multiple Channel Audio Data and Wave Files, msdn.microsoft.com
2 Terms, Definitions and Abbreviations
audio frame – A component of an audio stream that corresponds to a certain number of PCM audio
samples. Usually also an access unit (AU).
CBR ‐ Constant Bit Rate
core substream: A DTS audio stream, or a component of a DTS audio stream, that conforms to [DTSHD]
and always begins with the 32‐bit sync word 0x7FFE8001.
duration – The time represented by one decoded audio frame. It may be expressed in audio samples
per channel at a specific audio sampling frequency, or in seconds.
extension – For DTS bitstreams, a component of an audio frame that may or may not exist in sequence
with other extension components or a core component.
extension substream: A DTS audio stream, or a component of a DTS audio stream, that conforms to
[DTSHD] and always begins with the 32‐bit sync word 0x64582025.
LFE: Low Frequency Effects or subwoofer channel
PIFF: Protected Interoperable File Format
VBR ‐ Variable Bit Rate
XLL – The DTS‐HD lossless audio coding extension, a logical element within the DTS elementary stream
containing compressed audio data that decodes into a bit‐exact representation of the original signal.
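As an illustration of the two sync words defined above, the following minimal Python sketch classifies a
buffer by its first four bytes. The function name classify_substream and the returned labels are
hypothetical, chosen for this example only, and the sketch assumes the sync word is stored big‐endian,
exactly as written above.

```python
import struct

DTS_SYNCWORD_CORE = 0x7FFE8001       # core substream sync word
DTS_SYNCWORD_SUBSTREAM = 0x64582025  # extension substream sync word


def classify_substream(data: bytes) -> str:
    """Return 'core', 'extension' or 'unknown' based on the first 4 bytes."""
    if len(data) < 4:
        return "unknown"
    # Assumption: the 32-bit sync word is read in big-endian byte order.
    (sync,) = struct.unpack_from(">I", data, 0)
    if sync == DTS_SYNCWORD_CORE:
        return "core"
    if sync == DTS_SYNCWORD_SUBSTREAM:
        return "extension"
    return "unknown"
```

A demultiplexer could apply such a check to the start of each sample to confirm which component it is
about to hand to the decoder.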
element and the child <param> elements. The <audio> element is a child of the <switch> element, which
in turn falls under the <body> element.
In the <audio> element, the following attribute may be specific to DTS‐HD:
Attribute         Description
systemBitrate     Specifies the bit rate of the track. This value is matched to the argument of
                  the QualityLevels() noun on the URL. This attribute is required. For DTS‐HD
                  audio, this is the nominal average bit rate expressed in bits per second. It can
                  be derived from the avgBitrate parameter in the DTSSpecificBox [DTSISO] (see
                  chapter 4).

Attribute         Description
timescale         Specifies the timescale for this track, as the number of units that pass in one
                  second. This value shall be set to 10000000.

Attribute         Description
CodecPrivateData  As defined in [WAVE], and extended for DTS‐HD.
SamplingRate      SamplingRate is expressed in Hz, and must be equal to the maximum sampling
                  frequency of the audio encoded in the DTS‐HD bitstream. In the DTSSpecificBox,
                  this value is stored as DTSSamplingFrequency.
Channels          The maximum number of output channels encoded in the DTS‐HD bitstream. This can
                  be determined from ChannelLayout in the DTSSpecificBox.
BitsPerSample     BitsPerSample shall always be set to 16.
PacketSize        The value of PacketSize shall be set to 1.
AudioTag          The value of AudioTag shall be set to 65534.
FourCC            The value of “FourCC” shall be one of “dtsc”, “dtse”, “dtsh” or “dtsl” and shall
                  match the coding name used in the DTSSampleEntry box as defined in [DTSISO] and
                  used in the respective PIFF file.
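
The fixed values in the table above lend themselves to a simple consistency check. The sketch below is
one possible way to validate the audio attributes of a manifest entry. The function name
check_dts_quality_level and the dictionary‐based input are assumptions made for illustration and are not
part of any manifest API.

```python
VALID_FOURCC = {"dtsc", "dtse", "dtsh", "dtsl"}

# Attribute values that are fixed for DTS-HD, as strings as they appear in XML.
FIXED_VALUES = {"BitsPerSample": "16", "PacketSize": "1", "AudioTag": "65534"}


def check_dts_quality_level(attrs: dict) -> list:
    """Return a list of problems found in a dict of attribute name/value strings."""
    problems = []
    if attrs.get("FourCC") not in VALID_FOURCC:
        problems.append("FourCC shall be one of dtsc, dtse, dtsh or dtsl")
    for name, expected in FIXED_VALUES.items():
        if attrs.get(name) != expected:
            problems.append(f"{name} shall be set to {expected}")
    return problems
```

An empty list means that none of the fixed‐value rules were violated. The check does not verify that
FourCC matches the coding name in the PIFF file, which requires inspecting the file itself.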
</audio>
</switch>
</body>
</smil>
The following table summarizes the attributes for QualityLevel and how they are configured for DTS‐HD
3.3.2 CodecPrivateData
For DTS‐HD audio tracks, the client manifest contains specific information about the track in a
CodecPrivateData field. The CodecPrivateData is formatted as a WAVE_FORMAT_EXTENSIBLE field
followed by some DTS‐HD specific information.
The CodecPrivateData for DTS‐HD is described in Table 6, below.
Table 6 ‐ CodecPrivateData
mnemonic                      Word Size in Bits
CodecPrivateData () {
    wSamplesPerBlock          16 (little endian)
    dwChannelMask             32 (little endian)
    SubFormat
        Data1                 32 (little endian)
        Data2                 16 (little endian)
        Data3                 16 (little endian)
        Data4 () {
            for (int i=0; i<8; i++)
                Data4[i]      8
        }
    dtsStreamConstruction     8
    dtsChannelLayout          16 (big endian)
    dtsFlags                  8
    Reserved = 0              16
}
Table 7 – SubFormat
mnemonic Hexadecimal value
Data1 0x00002001
Data2 0x0000
Data3 0x0010
Data4[] 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71
Example of CodecPrivateData:
3.3.2.1 Semantics
wSamplesPerBlock is fully described in [WAVE] and contains the number of PCM samples per channel
decoded from each sample (i.e., audio frame) of the track file. For DTS‐HD, valid values are 512, 1024,
2048, 4096 and 6144. The correct value can be determined from FrameDuration in the DTSSpecificBox;
see [DTSISO] for details of FrameDuration.
dwChannelMask is fully described in [WAVE] and provides a representation of the available output
channels in the DTS‐HD bitstream. Since dwChannelMask cannot fully represent the possible speaker
layouts that can be represented in DTS‐HD, a more complete representation of the output channel
configuration is in dtsChannelLayout. The relationship between the channel notation and channel
description relating dtsChannelLayout and dwChannelMask can be seen in Table 8.
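To make the role of dwChannelMask concrete, the sketch below lists the speaker positions signalled by a
mask value. The bit assignments are the speaker position flags documented in [WAVE]. The helper name
speakers_in_mask is hypothetical, and the sketch does not attempt the dtsChannelLayout mapping.

```python
# Speaker position bits for dwChannelMask, as documented in [WAVE].
SPEAKER_BITS = [
    (0x00001, "FRONT_LEFT"),
    (0x00002, "FRONT_RIGHT"),
    (0x00004, "FRONT_CENTER"),
    (0x00008, "LOW_FREQUENCY"),
    (0x00010, "BACK_LEFT"),
    (0x00020, "BACK_RIGHT"),
    (0x00040, "FRONT_LEFT_OF_CENTER"),
    (0x00080, "FRONT_RIGHT_OF_CENTER"),
    (0x00100, "BACK_CENTER"),
    (0x00200, "SIDE_LEFT"),
    (0x00400, "SIDE_RIGHT"),
    (0x00800, "TOP_CENTER"),
    (0x01000, "TOP_FRONT_LEFT"),
    (0x02000, "TOP_FRONT_CENTER"),
    (0x04000, "TOP_FRONT_RIGHT"),
    (0x08000, "TOP_BACK_LEFT"),
    (0x10000, "TOP_BACK_CENTER"),
    (0x20000, "TOP_BACK_RIGHT"),
]


def speakers_in_mask(mask: int) -> list:
    """Return the names of all speaker positions whose bit is set in mask."""
    return [name for bit, name in SPEAKER_BITS if mask & bit]
```

For example, a mask of 0x0000063F selects eight positions: front left, front right, front center, LFE,
back left, back right, side left and side right.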
SubFormat shall be set according to Table 7.
dtsStreamConstruction shall be identical to StreamConstruction in the DTSSpecificBox defined in
[DTSISO].
dtsChannelLayout shall match ChannelLayout in the DTSSpecificBox defined in [DTSISO].
dtsFlags is defined according to Table 9. The value of each flag shall match the corresponding named bit
in the DTSSpecificBox defined in [DTSISO].
Table 9 ‐ dtsFlags
name              mask          description (informative)
MultiAssetFlag    0b10000000    0 = single asset, 1 = multiple asset
LBRDurationMod    0b01000000    0 = no modifier, 1 = LBR duration modifier, duration = 6144
reserved = 0      0b00111111    Reserved for future use, shall be set to 0
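
The layout in Table 6 and the flag masks in Table 9 can be exercised with a short parser. The following
Python sketch is one possible reading of those tables, not a reference implementation. The class and
function names are hypothetical, and the input is assumed to be the hexadecimal string carried in the
CodecPrivateData attribute of the client manifest.

```python
import struct
from typing import NamedTuple


class DtsCodecPrivateData(NamedTuple):
    samples_per_block: int    # wSamplesPerBlock
    channel_mask: int         # dwChannelMask
    sub_format: tuple         # (Data1, Data2, Data3, Data4 as bytes)
    stream_construction: int  # dtsStreamConstruction
    channel_layout: int       # dtsChannelLayout
    multi_asset: bool         # dtsFlags bit 0b10000000
    lbr_duration_mod: bool    # dtsFlags bit 0b01000000


def parse_codec_private_data(hex_string: str) -> DtsCodecPrivateData:
    """Decode the 28-byte structure laid out in Table 6."""
    data = bytes.fromhex(hex_string)
    if len(data) < 28:
        raise ValueError("CodecPrivateData is shorter than 28 bytes")
    # Little-endian part: wSamplesPerBlock, dwChannelMask, SubFormat (22 bytes).
    spb, mask, d1, d2, d3, d4 = struct.unpack_from("<HIIHH8s", data, 0)
    # DTS-specific part: dtsChannelLayout is big-endian; Reserved is ignored.
    construction, layout, flags, _reserved = struct.unpack_from(">BHBH", data, 22)
    return DtsCodecPrivateData(
        spb, mask, (d1, d2, d3, d4), construction, layout,
        bool(flags & 0b10000000), bool(flags & 0b01000000),
    )
```

Applied to the value 00103F0600000120000000001000800000AA00389B7112084B000000 used in the ISMC example
of this document, the sketch yields wSamplesPerBlock = 4096, dwChannelMask = 0x0000063F, a SubFormat
matching Table 7, dtsStreamConstruction = 18, dtsChannelLayout = 0x084B, and both flags clear.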
Type="video"
Name="video"
Chunks="12"
QualityLevels="1"
MaxWidth="1280"
MaxHeight="720"
DisplayWidth="1280"
DisplayHeight="720"
Url="QualityLevels({bitrate})/Fragments(video={start time})">
<QualityLevel
Index="0"
Bitrate="2962000"
FourCC="H264"
MaxWidth="1280"
MaxHeight="720"
CodecPrivateData="000000016764001FAC2CA5014016EFFC100010014808080A000007D200017700C100005A648000B4C9FE31C6080002D3240005A64FF18E1DA12251600000000168E9093525" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="20020000" />
<c d="18000000" />
</StreamIndex>
<StreamIndex
Type="audio"
Name="audio"
Chunks="12"
QualityLevels="2"
Url="QualityLevels({bitrate})/Fragments(audio={start time})">
<QualityLevel
Index="0"
FourCC="dtse"
Bitrate="768000"
SamplingRate="48000"
Channels="8"
BitsPerSample="16"
PacketSize="1"
AudioTag="65534"
CodecPrivateData="00103F0600000120000000001000800000AA00389B7112084B000000" />
<QualityLevel
Index="1"
FourCC="dtse"
Bitrate="447000"
SamplingRate="48000"
Channels="8"
BitsPerSample="16"
PacketSize="1"
AudioTag="65534"
CodecPrivateData="00103F0600000120000000001000800000AA00389B7112084B000000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="20480000" />
<c d="13620000" />
</StreamIndex>
</SmoothStreamingMedia>
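
The chunk durations in the example above can be converted to seconds as sketched below. The sketch
assumes a timescale of 10000000 units per second. The opening SmoothStreamingMedia element is not shown
above, so this value is an assumption, and the helper names are hypothetical.

```python
TIMESCALE = 10_000_000  # assumed number of time units per second


def chunk_seconds(durations, timescale=TIMESCALE):
    """Convert a list of chunk durations (the 'd' attributes) to seconds."""
    return [d / timescale for d in durations]


def total_seconds(durations, timescale=TIMESCALE):
    """Return the summed duration of all chunks in seconds."""
    return sum(durations) / timescale


# Audio chunk durations copied from the example: eleven full chunks and a final short one.
audio_chunks = [20480000] * 11 + [13620000]
print(chunk_seconds(audio_chunks)[0])
print(total_seconds(audio_chunks))
```

Under that assumption each full audio chunk lasts 2.048 seconds, which at a 48000 Hz sampling rate
corresponds to 98304 samples per channel, or 24 frames of 4096 samples.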
4.1 Overview
DTS‐HD audio covers a range of profiles within a unified container format. DTS‐HD decoders are
structured such that all DTS‐HD decoders can create a meaningful presentation from any DTS‐HD
compliant audio stream.
The above‐mentioned core and extension substreams may be used together or individually to compose
an audio presentation.
Additional constraints are enforced within the various DTS product licensing programs, but those
constraints do not affect the systems requirements of a server or client.
[Figure 1 ‐ Example of a DTS‐HD sample as stored in a PIFF fragment: one PIFF sample (access unit)
containing data that starts with DTS_SYNCWORD_CORE, followed by data that starts with
DTS_SYNCWORD_SUBSTREAM]
Box (nesting shown by indentation)   Description
smhd                                 Sound Media Header
stbl                                 Sample table
    stsd                             Sample Descriptions
        dtsh                         May also be ‘dtsc’, ‘dtse’ or ‘dtsl’, see [DTSISO]
            ddts                     Defined in [DTSISO]
    stts, stsc, stsz, stco           Normally these boxes are all NULL in PIFF files since the
                                     ‘moov’ box does not contain samples. The relevant information
                                     is in the track ‘trun’ in the fragments (‘moof’ boxes) that
                                     are downloaded if the track is selected.
4.3 Encryption
Encryption of DTS‐HD tracks in PIFF files is similar to that of any other media format. The
DTSSampleEntry 4CC is replaced by ‘enca’ and the requisite ‘sinf’ box is added, as shown below in Figure 3.
Box (nesting shown by indentation)   Reference
stbl                                 [ISOFF], section 8.5
    stsd                             [ISOFF], section 8.5.2
        enca
            ddts
            sinf                     [ISOFF], section 8.12.1 and [PIFF]
                frma
                schm                 scheme_type = ‘cenc’, scheme_version = 0x00010000
                schi
                    tenc
    stts
4.4.1 Overview
This section provides some design considerations for products that implement PIFF demultiplexing for
DTS‐HD audio bitstreams.
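$$
\text{bytes per frame} = \frac{\text{bitrate}}{8} \ast \frac{2^{\text{FrameDuration}} \ast 512}{\text{samplerate}} \ast \left(1 + \frac{\text{LBRDurationMod}}{2}\right)
$$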
The first multiplicand converts bits/second to bytes/second.
The second part of the equation calculates the frame duration in seconds. “FrameDuration” in
DTSSampleEntry is a code from 0 to 3, representing a frame duration of 512, 1024, 2048 or 4096 samples
respectively (relative to the base sample rate). The base sample rate, “samplerate” in the
DTSSampleEntry box, may have a value of 32000, 44100 or 48000.
The last piece of this equation accounts for a special case of LBR in which the audio is internally band
limited to 1/3 or 2/3 of the full band (useful for low bit rates). If LBRDurationMod is clear, no
modification is applied; if it is set, a modifier of 1.5 is applied (6144 sample frame duration).
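
The calculation described above can be sketched as follows. The function and parameter names are
hypothetical, and the choice of which bit rate to pass in (for example an average or a maximum rate) is
left to the caller, since it is not stated above.

```python
def bytes_per_frame(bitrate, frame_duration_code, samplerate, lbr_duration_mod=False):
    """Estimate the size in bytes of one audio frame.

    bitrate is in bits per second, frame_duration_code is the FrameDuration
    code (0 to 3), samplerate is the base sample rate in Hz.
    """
    if frame_duration_code not in (0, 1, 2, 3):
        raise ValueError("FrameDuration code must be 0, 1, 2 or 3")
    if samplerate not in (32000, 44100, 48000):
        raise ValueError("samplerate must be 32000, 44100 or 48000")
    bytes_per_second = bitrate / 8                            # bits/s to bytes/s
    frame_seconds = (512 << frame_duration_code) / samplerate  # 512, 1024, 2048 or 4096 samples
    modifier = 1.5 if lbr_duration_mod else 1.0               # LBR special case
    return bytes_per_second * frame_seconds * modifier
```

For instance, a 768000 bit/s stream with a FrameDuration code of 3 at 48000 Hz gives about 8192 bytes per
frame, and the same stream with LBRDurationMod set would give about 12288 bytes.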