<?xml version="1.0" standalone="no"?>
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [

]>

<appendix id="vorbis-over-ogg">
<appendixinfo>
<releaseinfo>
$Id: a1-encapsulation_ogg.xml 7186 2004-07-20 07:19:25Z xiphmont $
</releaseinfo>
</appendixinfo>
<title>Embedding Vorbis into an Ogg stream</title>

<section>
<title>Overview</title>

<para>
This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.</para>

<para>
The <xref linkend="vorbis-spec-intro"/> provides an overview of the construction
of Vorbis audio packets.</para>

<para>
The <ulink url="oggstream.html">Ogg
bitstream overview</ulink> and <ulink url="framing.html">Ogg logical
bitstream and framing spec</ulink> provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents. Please read them first.</para>

<section><title>Restrictions</title>

<para>
The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

<itemizedlist>
<listitem><simpara>
A meta-headerless Ogg file encapsulates the Vorbis I packets.
</simpara></listitem>
<listitem><simpara>
The Ogg stream may be chained, i.e. contain multiple, contiguous logical streams (links).
</simpara></listitem>
<listitem><simpara>
The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).
</simpara></listitem>
</itemizedlist>
</para>

<para>
This does not mean that Vorbis cannot currently be multiplexed
with other media types into a multi-stream Ogg file. At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a 'Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream. A compliant 'Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).
</para>
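
<para>
As an illustration only, the following Python sketch checks the
restriction above on an in-memory physical stream: links may follow one
another (chaining), but no second logical stream may begin while another
is still open (multiplexing). It relies on the page header layout
described in the Ogg framing spec. The names <literal>iter_pages</literal>,
<literal>is_unmultiplexed</literal> and <literal>make_page</literal> are
hypothetical helpers introduced here, not part of any Xiph library, and
the sketch neither verifies page checksums nor inspects packet contents.
</para>

<programlisting><![CDATA[
```python
import struct

# Fixed 27-byte Ogg page header: capture pattern, version, header type flags,
# granule position, stream serial number, page sequence number, CRC, and the
# number of lacing values that follow.
OGG_HEADER = struct.Struct("<4sBBqIIIB")

FLAG_BOS = 0x02  # 'beginning of stream' page flag
FLAG_EOS = 0x04  # 'end of stream' page flag


def iter_pages(data):
    """Yield (flags, granulepos, serial, body) for each Ogg page in data."""
    pos = 0
    while pos < len(data):
        capture, version, flags, granulepos, serial, _seq, _crc, nsegs = (
            OGG_HEADER.unpack_from(data, pos)
        )
        if capture != b"OggS" or version != 0:
            raise ValueError("no Ogg page at offset %d" % pos)
        lacing = data[pos + 27 : pos + 27 + nsegs]
        body_start = pos + 27 + nsegs
        body_end = body_start + sum(lacing)
        yield flags, granulepos, serial, data[body_start:body_end]
        pos = body_end


def is_unmultiplexed(data):
    """True if at most one logical stream is open at any point (chaining is fine)."""
    open_serial = None
    for flags, _granulepos, serial, _body in iter_pages(data):
        if flags & FLAG_BOS:
            if open_serial is not None:
                return False  # a second stream began before the link ended
            open_serial = serial
        elif serial != open_serial:
            return False  # page belongs to a stream that is not the open one
        if flags & FLAG_EOS:
            open_serial = None
    return True


def make_page(serial, flags, granulepos, body, seqno=0):
    """Build a toy page for experiments; the CRC field is left at zero (invalid)."""
    lacing = bytes([255] * (len(body) // 255) + [len(body) % 255])
    header = OGG_HEADER.pack(b"OggS", 0, flags, granulepos, serial, seqno, 0, len(lacing))
    return header + lacing + body
```
]]></programlisting>

<para>
Under these assumptions, two links written back to back pass the check,
while a buffer whose first two pages are both 'beginning of stream'
pages with different serial numbers is reported as multiplexed.
</para>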

</section>

<section><title>MIME type</title>

<para>
The correct MIME type of any Ogg file is <literal>application/ogg</literal>.
However, if a file is a Vorbis I audio file (which implies a
degenerate Ogg stream including only unmultiplexed Vorbis audio), the
MIME type <literal>audio/x-vorbis</literal> is also allowed.</para>

</section>

</section>

<section>
<title>Encapsulation</title>

<para>
Ogg encapsulation of a Vorbis packet stream is straightforward.</para>

<itemizedlist>

<listitem><simpara>
The first Vorbis packet (the identification header), which
uniquely identifies a stream as Vorbis audio, is placed alone in the
first page of the logical Ogg stream. This results in a first Ogg
page of exactly 58 bytes at the very beginning of the logical stream.
</simpara></listitem>

<listitem><simpara>
This first page is marked 'beginning of stream' in the page flags.
</simpara></listitem>

<listitem><simpara>
The second and third Vorbis packets (comment and setup
headers) may span one or more pages beginning on the second page of
the logical stream. However many pages they span, the third header
packet finishes the page on which it ends. The next (first audio) packet
must begin on a fresh page.
</simpara></listitem>

<listitem><simpara>
The granule position of these first pages containing only headers is zero.
</simpara></listitem>

<listitem><simpara>
The first audio packet of the logical stream begins a fresh Ogg page.
</simpara></listitem>

<listitem><simpara>
Packets are placed into Ogg pages in order until the end of stream.
</simpara></listitem>

<listitem><simpara>
The last page is marked 'end of stream' in the page flags.
</simpara></listitem>

<listitem><simpara>
Vorbis packets may span page boundaries.
</simpara></listitem>

<listitem><simpara>
The granule position of pages containing Vorbis audio is in units
of PCM audio samples (per channel; a stereo stream's granule position
does not increment at twice the speed of a mono stream).
</simpara></listitem>

<listitem><simpara>
The granule position of a page represents the end PCM sample
position of the last packet <emphasis>completed</emphasis> on that page.
A page that is entirely spanned by a single packet (that completes on a
subsequent page) has no granule position, and the granule position is
set to '-1'.
</simpara></listitem>

<listitem>
<simpara>
The granule (PCM) position of the first page need not indicate
that the stream started at position zero. Although the granule
position belongs to the last completed packet on the page and a
valid granule position must be positive, by
inference it may indicate that the PCM position of the beginning
of audio is positive or negative.
</simpara>

<itemizedlist>
<listitem><simpara>
A positive starting value simply indicates that this stream begins at
some positive time offset, potentially within a larger
program. This is a common case when connecting to the middle
of a broadcast stream.
</simpara></listitem>
<listitem><simpara>
A negative value indicates that
output samples preceding time zero should be discarded during
decoding; this technique is used to allow sample-granularity
editing of the stream start time of already-encoded Vorbis
streams. The number of samples to be discarded must not exceed
the overlap-add span of the first two audio packets.
</simpara></listitem>
</itemizedlist>

<simpara>
In both of these cases in which the initial audio PCM starting
offset is nonzero, the second finished audio packet must flush the
page on which it appears and the third packet must begin a fresh page.
This allows the decoder to always be able to perform PCM position
adjustments before needing to return any PCM data from synthesis,
resulting in correct positioning information without any additional
seeking logic.
</simpara>

<note><simpara>
Failure to do so should, at worst, cause a
decoder implementation to return incorrect positioning information
for seeking operations at the very beginning of the stream.
</simpara></note>
</listitem>

<listitem><simpara>
A granule position on the final page in a stream that indicates
less audio data than the final packet would normally return is used to
end the stream on other than even frame boundaries. The difference
between the actual available data returned and the declared amount
indicates how many trailing samples to discard from the decoding
process.
</simpara></listitem>
</itemizedlist>
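
<para>
The arithmetic behind several of the rules above can be sketched in
Python. This is a rough illustration under stated assumptions, not a
reference implementation: the function names are hypothetical, the
identification header is packed with placeholder field values, and the
sample counts passed to the trimming helpers are assumed to come from a
real decoder. The sketch shows why the first page is 58 bytes (a 27-byte
page header, one lacing value, and a 30-byte identification header), how
a page with no completed packet gets granule position -1, and how leading
and trailing samples to discard follow from the granule positions.
</para>

<programlisting><![CDATA[
```python
import struct

OGG_PAGE_HEADER_BYTES = 27  # fixed part of a page header, before the lacing values


def vorbis_id_header(channels, sample_rate, nominal_bitrate, blocksize_exponents=(8, 11)):
    """Pack a Vorbis I identification header; the field values are placeholders."""
    small, large = blocksize_exponents
    return struct.pack(
        "<B6sIBIiiiBB",
        1, b"vorbis",           # packet type 1 and the codec magic
        0,                      # vorbis_version
        channels, sample_rate,
        0, nominal_bitrate, 0,  # bitrate maximum, nominal, minimum
        (large << 4) | small,   # blocksize_0 in the low nibble, blocksize_1 in the high
        1,                      # framing flag
    )


def first_page_size(id_header):
    """Size of the first Ogg page, which holds only the identification header."""
    # A single lacing value suffices because the packet is shorter than 255 bytes.
    return OGG_PAGE_HEADER_BYTES + 1 + len(id_header)


def page_granulepos(completed_packet_end_positions):
    """Granule position for a page, given end PCM positions of packets completed on it."""
    if not completed_packet_end_positions:
        return -1  # the page is entirely spanned by a packet that ends later
    return completed_packet_end_positions[-1]


def leading_samples_to_discard(first_audio_page_granulepos, samples_decoded_on_page):
    """Samples before time zero, inferred from the first audio page."""
    start = first_audio_page_granulepos - samples_decoded_on_page
    return max(0, -start)


def trailing_samples_to_discard(decoded_end_position, final_page_granulepos):
    """Samples past the declared end of stream on the final page."""
    return max(0, decoded_end_position - final_page_granulepos)
```
]]></programlisting>

<para>
For example, if the packets completed on the first audio page decode to
1024 samples while the page declares a granule position of 1000, the
inferred start is -24 and the first 24 samples would be dropped. The
same subtraction applied to the final page gives the number of trailing
samples to drop.
</para>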

</section>

</appendix>

<!-- end appendix on Vorbis encapsulation in Ogg -->