| 1 | <?xml version="1.0" standalone="no"?> |
---|
| 2 | <!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
---|
| 3 | "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ |
---|
| 4 | |
---|
| 5 | ]> |
---|
| 6 | |
---|
| 7 | <appendix id="vorbis-over-ogg"> |
---|
| 8 | <appendixinfo> |
---|
| 9 | <releaseinfo> |
---|
| 10 | $Id: a1-encapsulation_ogg.xml 7186 2004-07-20 07:19:25Z xiphmont $ |
---|
| 11 | </releaseinfo> |
---|
| 12 | </appendixinfo> |
---|
| 13 | <title>Embedding Vorbis into an Ogg stream</title> |
---|
| 14 | |
---|
| 15 | <section> |
---|
| 16 | <title>Overview</title> |
---|
| 17 | |
---|
| 18 | <para> |
---|
| 19 | This document describes using Ogg logical and physical transport |
---|
| 20 | streams to encapsulate Vorbis compressed audio packet data into file |
---|
| 21 | form.</para> |
---|
| 22 | |
---|
| 23 | <para> |
---|
| 24 | The <xref linkend="vorbis-spec-intro"/> provides an overview of the construction |
---|
| 25 | of Vorbis audio packets.</para> |
---|
| 26 | |
---|
| 27 | <para> |
---|
| 28 | The <ulink url="oggstream.html">Ogg |
---|
| 29 | bitstream overview</ulink> and <ulink url="framing.html">Ogg logical |
---|
| 30 | bitstream and framing spec</ulink> provide detailed descriptions of Ogg |
---|
| 31 | transport streams. This specification document assumes a working |
---|
| 32 | knowledge of the concepts covered in these named background |
---|
| 33 | documents. Please read them first.</para> |
---|
| 34 | |
---|
| 35 | <section><title>Restrictions</title> |
---|
| 36 | |
---|
| 37 | <para> |
---|
| 38 | The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis |
---|
| 39 | streams use Ogg transport streams in degenerate, unmultiplexed |
---|
| 40 | form only. That is: |
---|
| 41 | |
---|
| 42 | <itemizedlist> |
---|
| 43 | <listitem><simpara> |
---|
| 44 | A meta-headerless Ogg file encapsulates the Vorbis I packets. |
---|
| 45 | </simpara></listitem> |
---|
| 46 | <listitem><simpara> |
---|
| 47 | The Ogg stream may be chained, i.e. contain multiple, contiguous logical streams (links). |
---|
| 48 | </simpara></listitem> |
---|
| 49 | <listitem><simpara> |
---|
| 50 | The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link). |
---|
| 51 | </simpara></listitem> |
---|
| 52 | </itemizedlist> |
---|
| 53 | </para> |
---|
| 54 | |
---|
| 55 | <para> |
---|
| 56 | This does not mean that Vorbis cannot currently be multiplexed |
---|
| 57 | with other media types into a multi-stream Ogg file. At the |
---|
| 58 | time this document was written, Ogg was becoming a popular container |
---|
| 59 | for low-bitrate movies consisting of DivX video and Vorbis audio. |
---|
| 60 | However, a 'Vorbis I audio file' is taken to imply Vorbis audio |
---|
| 61 | existing alone within a degenerate Ogg stream. A compliant 'Vorbis |
---|
| 62 | audio player' is not required to implement Ogg support beyond the |
---|
| 63 | specific support of Vorbis within a degenerate Ogg stream (naturally, |
---|
| 64 | application authors are encouraged to support full multiplexed Ogg |
---|
| 65 | handling). |
---|
| 66 | </para> |
---|
| 67 | |
---|
| 68 | </section> |
---|
| 69 | |
---|
| 70 | <section><title>MIME type</title> |
---|
| 71 | |
---|
| 72 | <para> |
---|
| 73 | The correct MIME type of any Ogg file is <literal>application/ogg</literal>. |
---|
| 74 | However, if a file is a Vorbis I audio file (which implies a |
---|
| 75 | degenerate Ogg stream including only unmultiplexed Vorbis audio), the |
---|
| 76 | MIME type <literal>audio/x-vorbis</literal> is also allowed.</para> |
---|
| 77 | |
---|
| 78 | </section> |
---|
| 79 | |
---|
| 80 | </section> |
---|
| 81 | |
---|
| 82 | <section> |
---|
| 83 | <title>Encapsulation</title> |
---|
| 84 | |
---|
| 85 | <para> |
---|
| 86 | Ogg encapsulation of a Vorbis packet stream is straightforward.</para> |
---|
| 87 | |
---|
| 88 | <itemizedlist> |
---|
| 89 | |
---|
| 90 | <listitem><simpara> |
---|
| 91 | The first Vorbis packet (the identification header), which |
---|
| 92 | uniquely identifies a stream as Vorbis audio, is placed alone in the |
---|
| 93 | first page of the logical Ogg stream. This results in a first Ogg |
---|
| 94 | page of exactly 58 bytes at the very beginning of the logical stream. |
---|
| 95 | </simpara></listitem> |
---|
| 96 | |
---|
| 97 | <listitem><simpara> |
---|
| 98 | This first page is marked 'beginning of stream' in the page flags. |
---|
| 99 | </simpara></listitem> |
---|
| 100 | |
---|
| 101 | <listitem><simpara> |
---|
| 102 | The second and third Vorbis packets (comment and setup |
---|
| 103 | headers) may span one or more pages beginning on the second page of |
---|
| 104 | the logical stream. However many pages they span, the third header |
---|
| 105 | packet finishes the page on which it ends. The next (first audio) packet |
---|
| 106 | must begin on a fresh page. |
---|
| 107 | </simpara></listitem> |
---|
| 108 | |
---|
| 109 | <listitem><simpara> |
---|
| 110 | The granule position of these first pages containing only headers is zero. |
---|
| 111 | </simpara></listitem> |
---|
| 112 | |
---|
| 113 | <listitem><simpara> |
---|
| 114 | The first audio packet of the logical stream begins a fresh Ogg page. |
---|
| 115 | </simpara></listitem> |
---|
| 116 | |
---|
| 117 | <listitem><simpara> |
---|
| 118 | Packets are placed into Ogg pages in order until the end of stream. |
---|
| 119 | </simpara></listitem> |
---|
| 120 | |
---|
| 121 | <listitem><simpara> |
---|
| 122 | The last page is marked 'end of stream' in the page flags. |
---|
| 123 | </simpara></listitem> |
---|
| 124 | |
---|
| 125 | <listitem><simpara> |
---|
| 126 | Vorbis packets may span page boundaries. |
---|
| 127 | </simpara></listitem> |
---|
| 128 | |
---|
| 129 | <listitem><simpara> |
---|
| 130 | The granule position of pages containing Vorbis audio is in units |
---|
| 131 | of PCM audio samples (per channel; a stereo stream's granule position |
---|
| 132 | does not increment at twice the speed of a mono stream). |
---|
| 133 | </simpara></listitem> |
---|
| 134 | |
---|
| 135 | <listitem><simpara> |
---|
| 136 | The granule position of a page represents the end PCM sample |
---|
| 137 | position of the last packet <emphasis>completed</emphasis> on that page. |
---|
| 138 | A page that is entirely spanned by a single packet (that completes on a |
---|
| 139 | subsequent page) has no granule position, and the granule position is |
---|
| 140 | set to '-1'. |
---|
| 141 | </simpara></listitem> |
---|
| 142 | |
---|
| 143 | <listitem> |
---|
| 144 | <simpara> |
---|
| 145 | The granule (PCM) position of the first page need not indicate |
---|
| 146 | that the stream started at position zero. Although the granule |
---|
| 147 | position belongs to the last completed packet on the page and a |
---|
| 148 | valid granule position must be positive, by |
---|
| 149 | inference it may indicate that the PCM position of the beginning |
---|
| 150 | of audio is positive or negative. |
---|
| 151 | </simpara> |
---|
| 152 | |
---|
| 153 | <itemizedlist> |
---|
| 154 | <listitem><simpara> |
---|
| 155 | A positive starting value simply indicates that this stream begins at |
---|
| 156 | some positive time offset, potentially within a larger |
---|
| 157 | program. This is a common case when connecting to the middle |
---|
| 158 | of a broadcast stream. |
---|
| 159 | </simpara></listitem> |
---|
| 160 | <listitem><simpara> |
---|
| 161 | A negative value indicates that |
---|
| 162 | output samples preceding time zero should be discarded during |
---|
| 163 | decoding; this technique is used to allow sample-granularity |
---|
| 164 | editing of the stream start time of already-encoded Vorbis |
---|
| 165 | streams. The number of samples to be discarded must not exceed |
---|
| 166 | the overlap-add span of the first two audio packets. |
---|
| 167 | </simpara></listitem> |
---|
| 168 | </itemizedlist> |
---|
| 169 | |
---|
| 170 | <simpara> |
---|
| 171 | In both of these cases in which the initial audio PCM starting |
---|
| 172 | offset is nonzero, the second finished audio packet must flush the |
---|
| 173 | page on which it appears and the third packet begin a fresh page. |
---|
| 174 | This lets the decoder always perform PCM position |
---|
| 175 | adjustments before needing to return any PCM data from synthesis, |
---|
| 176 | resulting in correct positioning information without any additional |
---|
| 177 | seeking logic. |
---|
| 178 | </simpara> |
---|
| 179 | |
---|
| 180 | <note><simpara> |
---|
| 181 | Failure to do so should, at worst, cause a |
---|
| 182 | decoder implementation to return incorrect positioning information |
---|
| 183 | for seeking operations at the very beginning of the stream. |
---|
| 184 | </simpara></note> |
---|
| 185 | </listitem> |
---|
| 186 | |
---|
| 187 | <listitem><simpara> |
---|
| 188 | A granule position on the final page in a stream that indicates |
---|
| 189 | less audio data than the final packet would normally return is used to |
---|
| 190 | end the stream on other than even frame boundaries. The difference |
---|
| 191 | between the actual available data returned and the declared amount |
---|
| 192 | indicates how many trailing samples to discard from the decoding |
---|
| 193 | process. |
---|
| 194 | </simpara></listitem> |
---|
| 195 | </itemizedlist> |
---|
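The 58-byte size of the first page, stated in the list above, can be checked by adding up field widths. The following Python sketch uses the widths as recalled from the Ogg framing and Vorbis I specifications. The dictionary and function names are illustrative assumptions and are not part of either specification or of any library.

```python
# Fixed-width fields of an Ogg page header, in bytes (assumed from the
# Ogg framing specification; the key names are illustrative).
OGG_PAGE_HEADER_FIELDS = {
    "capture_pattern": 4,            # the ASCII string "OggS"
    "stream_structure_version": 1,
    "header_type_flags": 1,          # continued / beginning / end of stream
    "granule_position": 8,
    "bitstream_serial_number": 4,
    "page_sequence_number": 4,
    "crc_checksum": 4,
    "segment_count": 1,
}

# Fields of the Vorbis identification header packet, in bytes (assumed
# from the Vorbis I specification; the key names are illustrative).
VORBIS_ID_HEADER_FIELDS = {
    "packet_type": 1,
    "codec_id": 6,                   # the ASCII string "vorbis"
    "vorbis_version": 4,
    "audio_channels": 1,
    "audio_sample_rate": 4,
    "bitrate_maximum": 4,
    "bitrate_nominal": 4,
    "bitrate_minimum": 4,
    "blocksizes": 1,                 # two 4-bit exponents packed together
    "framing_flag": 1,
}


def lacing_value_count(packet_len):
    """Number of segment-table entries needed for one complete packet."""
    # Each entry covers up to 255 bytes; a packet whose length is an exact
    # multiple of 255 needs a terminating entry of 0.
    return packet_len // 255 + 1


def single_packet_page_size(packet_len):
    """Size of a page that holds exactly one complete packet.

    Assumes the packet is small enough to fit in a single page.
    """
    header = sum(OGG_PAGE_HEADER_FIELDS.values())
    return header + lacing_value_count(packet_len) + packet_len


id_header_len = sum(VORBIS_ID_HEADER_FIELDS.values())
first_page_size = single_packet_page_size(id_header_len)
print(first_page_size)  # 27-byte header + 1 lacing value + 30-byte packet
```

Under these assumptions the sum is 27 + 1 + 30 = 58, which matches the figure given for the first page.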
| 196 | |
---|
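The start-offset rule in the list above can be sketched as arithmetic on the granule position of the first audio page. The sketch assumes that page holds exactly the first two audio packets, as the text requires when the offset is nonzero. It also assumes the per-packet output rule of the Vorbis I specification: the first audio packet returns no samples, and each later packet returns a quarter of the previous blocksize plus a quarter of the current one. All function names are hypothetical, and the overlap-add span is supplied by the caller rather than derived.

```python
def packet_sample_count(prev_blocksize, cur_blocksize):
    """PCM samples (per channel) returned by decoding one audio packet.

    prev_blocksize is None for the first audio packet, which only primes
    the overlap-add and returns nothing.
    """
    if prev_blocksize is None:
        return 0
    return prev_blocksize // 4 + cur_blocksize // 4


def stream_start_offset(first_page_granule, blocksizes_on_page):
    """PCM position of the first decoded sample of the stream.

    first_page_granule is the granule position of the first audio page;
    blocksizes_on_page lists the blocksize of each packet completed on it.
    """
    decoded = 0
    prev = None
    for cur in blocksizes_on_page:
        decoded += packet_sample_count(prev, cur)
        prev = cur
    return first_page_granule - decoded


def leading_samples_to_discard(start_offset, overlap_span):
    """Samples before time zero that the decoder should drop."""
    discard = max(0, -start_offset)
    if discard > overlap_span:
        raise ValueError("discard count exceeds the overlap-add span")
    return discard


# Two long blocks of 2048 samples: the second packet returns 1024 samples.
page = [2048, 2048]
print(stream_start_offset(1024, page))    # 0: stream starts at time zero
print(stream_start_offset(45124, page))   # 44100: positive starting offset
print(stream_start_offset(1000, page))    # -24: negative starting offset
# The span of 1024 is an assumed value for this example.
print(leading_samples_to_discard(-24, 1024))  # 24 samples to drop
```

Because the decoder sees the granule position as soon as the second packet completes, it can apply this adjustment before returning any PCM data.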
| 197 | </section> |
---|
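The end-of-stream rule in the Encapsulation list, where the final page declares less audio than the final packet would normally return, can be sketched as follows. The function names and the sample figures are hypothetical. The sketch assumes the caller already knows the last valid granule position before the final page and the number of samples returned by each packet completed on the final page.

```python
def trailing_samples_to_discard(prev_page_granule, final_page_sample_counts,
                                final_page_granule):
    """Samples to drop from the end of the decoded output.

    prev_page_granule: last valid (not -1) granule position before the
        final page.
    final_page_sample_counts: samples returned by each packet completed
        on the final page.
    final_page_granule: granule position declared on the final page.
    """
    decoded_end = prev_page_granule + sum(final_page_sample_counts)
    return max(0, decoded_end - final_page_granule)


def trim_tail(pcm, discard):
    """Return pcm without its last `discard` samples."""
    return pcm[:len(pcm) - discard]


# The decoder would reach sample 88064 + 1024 + 1024 = 90112, but the final
# page declares 90000, so the last 112 samples are dropped.
discard = trailing_samples_to_discard(88064, [1024, 1024], 90000)
print(discard)  # 112

print(trim_tail(list(range(10)), 3))  # [0, 1, 2, 3, 4, 5, 6]
```

When the declared position equals the decoded end position, the discard count is zero and the output is left untouched.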
| 198 | |
---|
| 199 | </appendix> |
---|
| 200 | |
---|
| 201 | <!-- end appendix on Vorbis encapsulation in Ogg --> |
---|