| 1 | <?xml version="1.0" standalone="no"?> |
---|
| 2 | <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
---|
| 3 | "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ |
---|
| 4 | |
---|
| 5 | ]> |
---|
| 6 | |
---|
| 7 | <section id="vorbis-spec-floor0"> |
---|
| 8 | <sectioninfo> |
---|
| 9 | <releaseinfo> |
---|
| 10 | $Id: 06-floor0.xml 10424 2005-11-23 08:44:18Z xiphmont $ |
---|
| 11 | </releaseinfo> |
---|
| 12 | </sectioninfo> |
---|
| 13 | <title>Floor type 0 setup and decode</title> |
---|
| 14 | |
---|
| 15 | |
---|
| 16 | <section> |
---|
| 17 | <title>Overview</title> |
---|
| 18 | |
---|
| 19 | <para> |
---|
| 20 | Vorbis floor type zero uses Line Spectral Pair (LSP, also |
---|
| 21 | known as Line Spectral Frequency or LSF) representation to encode a |
---|
| 22 | smooth spectral envelope curve as the frequency response of the LSP |
---|
| 23 | filter. This representation is equivalent to a traditional all-pole |
---|
| 24 | infinite impulse response filter as would be used in linear predictive |
---|
| 25 | coding; LSP representation may be converted to LPC representation and |
---|
| 26 | vice-versa.</para> |
---|
| 27 | |
---|
| 28 | </section> |
---|
| 29 | |
---|
| 30 | <section> |
---|
| 31 | <title>Floor 0 format</title> |
---|
| 32 | |
---|
| 33 | <para> |
---|
| 34 | Floor zero configuration consists of six integer fields and a list of |
---|
| 35 | VQ codebooks for use in coding/decoding the LSP filter coefficient |
---|
| 36 | values used by each frame. </para> |
---|
| 37 | |
---|
| 38 | <section><title>header decode</title> |
---|
| 39 | |
---|
| 40 | <para> |
---|
| 41 | Configuration information for instances of floor zero decodes from the |
---|
| 42 | codec setup header (third packet). Configuration decode proceeds as |
---|
| 43 | follows:</para> |
---|
| 44 | |
---|
| 45 | <screen> |
---|
| 46 | 1) [floor0_order] = read an unsigned integer of 8 bits |
---|
| 47 | 2) [floor0_rate] = read an unsigned integer of 16 bits |
---|
| 48 | 3) [floor0_bark_map_size] = read an unsigned integer of 16 bits |
---|
| 49 | 4) [floor0_amplitude_bits] = read an unsigned integer of 6 bits |
---|
| 50 | 5) [floor0_amplitude_offset] = read an unsigned integer of 8 bits |
---|
| 51 | 6) [floor0_number_of_books] = read an unsigned integer of 4 bits and add 1 |
---|
| 52 | 7) if any of [floor0_order], [floor0_rate], [floor0_bark_map_size], [floor0_amplitude_bits], |
---|
| 53 | [floor0_amplitude_offset] or [floor0_number_of_books] are less than zero, the stream is not decodable |
---|
| 54 | 8) array [floor0_book_list] = read a list of [floor0_number_of_books] unsigned integers of 8 bits each |
---|
| 55 | </screen> |
---|
| 56 | |
---|
| 57 | <para> |
---|
| 58 | An end-of-packet condition during any of these bitstream reads renders |
---|
| 59 | this stream undecodable. In addition, any element of the array |
---|
| 60 | <varname>[floor0_book_list]</varname> that is greater than the maximum codebook |
---|
| 61 | number for this bitstream is an error condition that also renders the |
---|
| 62 | stream undecodable.</para> |
---|
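<para>
The header reads above can be sketched in Python as follows. This is a
minimal illustration, not a reference decoder: the
<literal>BitReader</literal> class, the <literal>decode_floor0_header</literal>
function, the dictionary keys and the <literal>codebook_count</literal>
parameter are hypothetical names introduced here, and the reader assumes
the LSb-first bit order used by Vorbis bitpacking.</para>
<programlisting><![CDATA[
```python
class BitReader:
    """Hypothetical LSb-first bit reader; raises EOFError at end of packet."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, bits):
        if self.pos + bits > len(self.data) * 8:
            raise EOFError("end of packet")
        value = 0
        for k in range(bits):
            byte = self.data[self.pos // 8]
            value |= ((byte >> (self.pos % 8)) & 1) << k
            self.pos += 1
        return value


def decode_floor0_header(reader, codebook_count):
    """Read the six integer fields and the book list, in header order.

    An EOFError from the reader is left to propagate: end-of-packet during
    header decode means the stream is undecodable.
    """
    cfg = {
        "order": reader.read(8),
        "rate": reader.read(16),
        "bark_map_size": reader.read(16),
        "amplitude_bits": reader.read(6),
        "amplitude_offset": reader.read(8),
    }
    number_of_books = reader.read(4) + 1
    cfg["book_list"] = [reader.read(8) for _ in range(number_of_books)]
    # The maximum codebook number is codebook_count - 1.
    if any(book >= codebook_count for book in cfg["book_list"]):
        raise ValueError("floor0 book list names a codebook that does not exist")
    return cfg
```
]]></programlisting>
<para>
Raising an exception for both failure cases keeps the caller's handling
uniform: either way the whole stream is rejected.</para>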
| 63 | |
---|
| 64 | </section> |
---|
| 65 | |
---|
| 66 | <section id="vorbis-spec-floor0-decode"> |
---|
| 67 | <title>packet decode</title> |
---|
| 68 | |
---|
| 69 | <para> |
---|
| 70 | Extracting a floor0 curve from an audio packet consists of first |
---|
| 71 | decoding the curve amplitude and <varname>[floor0_order]</varname> LSP |
---|
| 72 | coefficient values from the bitstream, and then computing the floor |
---|
| 73 | curve, which is defined as the frequency response of the decoded LSP |
---|
| 74 | filter.</para> |
---|
| 75 | |
---|
| 76 | <para> |
---|
| 77 | Packet decode proceeds as follows:</para> |
---|
| 78 | <screen> |
---|
| 79 | 1) [amplitude] = read an unsigned integer of [floor0_amplitude_bits] bits |
---|
| 80 | 2) if ( [amplitude] is greater than zero ) { |
---|
| 81 | 3) [coefficients] is an empty, zero length vector |
---|
| 82 | 4) [booknumber] = read an unsigned integer of <link linkend="vorbis-spec-ilog">ilog</link>( [floor0_number_of_books] ) bits |
---|
| 83 | 5) if ( [booknumber] is greater than the highest-numbered decode codebook ) then the packet is undecodable |
---|
| 84 | 6) [last] = zero; |
---|
| 85 | 7) vector [temp_vector] = read vector from bitstream using codebook number [floor0_book_list] element [booknumber] in VQ context. |
---|
| 86 | 8) add the scalar value [last] to each scalar in vector [temp_vector] |
---|
| 87 | 9) [last] = the value of the last scalar in vector [temp_vector] |
---|
| 88 | 10) concatenate [temp_vector] onto the end of the [coefficients] vector |
---|
| 89 | 11) if ( length of vector [coefficients] is less than [floor0_order] ) continue at step 6 |
---|
| 90 | |
---|
| 91 | } |
---|
| 92 | |
---|
| 93 | 12) done. |
---|
| 94 | |
---|
| 95 | </screen> |
---|
| 96 | |
---|
| 97 | <para> |
---|
| 98 | Take note of the following properties of decode: |
---|
| 99 | <itemizedlist> |
---|
| 100 | <listitem><simpara>An <varname>[amplitude]</varname> value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis). Several later stages of decode don't occur for an unused channel.</simpara></listitem> |
---|
| 101 | <listitem><simpara>An end-of-packet condition during decode should be considered a |
---|
| 102 | nominal occurrence; if end-of-packet is reached during any read |
---|
| 103 | operation above, floor decode is to return 'unused' status as if the |
---|
| 104 | <varname>[amplitude]</varname> value had read zero at the beginning of decode.</simpara></listitem> |
---|
| 105 | |
---|
| 106 | <listitem><simpara>The book number used for decode |
---|
| 107 | can, in fact, be stored in the bitstream in <link linkend="vorbis-spec-ilog">ilog</link>( <varname>[floor0_number_of_books]</varname> - |
---|
| 108 | 1 ) bits. Nevertheless, the above specification is correct and values |
---|
| 109 | greater than the maximum possible book value are reserved.</simpara></listitem> |
---|
| 110 | |
---|
| 111 | <listitem><simpara>The number of scalars read into the vector <varname>[coefficients]</varname> |
---|
| 112 | may be greater than <varname>[floor0_order]</varname>, the number actually |
---|
| 113 | required for curve computation. For example, if the VQ codebook used |
---|
| 114 | for the floor currently being decoded has a |
---|
| 115 | <varname>[codebook_dimensions]</varname> value of three and |
---|
| 116 | <varname>[floor0_order]</varname> is ten, the only way to fill all the needed |
---|
| 117 | scalars in <varname>[coefficients]</varname> is to read a total of twelve |
---|
| 118 | scalars as four vectors of three scalars each. This is not an error |
---|
| 119 | condition, and care must be taken not to allow a buffer overflow in |
---|
| 120 | decode. The extra values are not used and may be ignored or discarded.</simpara></listitem> |
---|
| 121 | </itemizedlist> |
---|
| 122 | </para> |
---|
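<para>
The packet decode steps and the properties listed above can be sketched
in Python as follows. The names <literal>read_bits</literal> and
<literal>read_vector</literal> are hypothetical callables standing in for
the bitstream reader and the VQ codebook decode, and the
<literal>cfg</literal> dictionary keys are likewise invented for this
illustration. Two assumptions are made: a book number beyond the end of
<varname>[floor0_book_list]</varname> is treated as a reserved value and
therefore undecodable, and <varname>[last]</varname> is carried from one
vector to the next, which is the reading under which steps 8 and 9 have
an effect (a literal return to step 6 would reset it each time).</para>
<programlisting><![CDATA[
```python
def ilog(x):
    """Vorbis ilog: bits needed to represent x, 0 for non-positive x."""
    return x.bit_length() if x > 0 else 0


def decode_floor0_packet(read_bits, read_vector, cfg):
    """Return (amplitude, coefficients), or None for an unused channel.

    read_bits(n) returns an unsigned n-bit integer and read_vector(book)
    returns one decoded VQ vector as a list; both are hypothetical and are
    expected to raise EOFError at end of packet.
    """
    try:
        amplitude = read_bits(cfg["amplitude_bits"])
        if amplitude == 0:
            return None  # channel unused in this frame
        books = cfg["book_list"]
        booknumber = read_bits(ilog(len(books)))
        if booknumber >= len(books):
            # Assumption: reserved book values make the packet undecodable.
            raise ValueError("reserved floor0 book number")
        coefficients = []
        last = 0.0  # assumption: carried across vectors, not reset
        while len(coefficients) < cfg["order"]:
            temp = [value + last for value in read_vector(books[booknumber])]
            last = temp[-1]
            coefficients.extend(temp)
    except EOFError:
        return None  # end-of-packet is nominal: report 'unused'
    # Surplus scalars from the final vector are discarded.
    return amplitude, coefficients[:cfg["order"]]
```
]]></programlisting>
<para>
Truncating the result to <varname>[floor0_order]</varname> entries is one
way to honour the buffer-overflow caution; a decoder writing into a
fixed-size array would instead need to size it for the overshoot.</para>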
| 123 | |
---|
| 124 | </section> |
---|
| 125 | |
---|
| 126 | <section id="vorbis-spec-floor0-synth"> |
---|
| 127 | <title>curve computation</title> |
---|
| 128 | |
---|
| 129 | <para> |
---|
| 130 | Given an <varname>[amplitude]</varname> integer and <varname>[coefficients]</varname> |
---|
| 131 | vector from packet decode as well as the <varname>[floor0_order]</varname>, |
---|
| 132 | <varname>[floor0_rate]</varname>, <varname>[floor0_bark_map_size]</varname>, <varname>[floor0_amplitude_bits]</varname> and |
---|
| 133 | <varname>[floor0_amplitude_offset]</varname> values from floor setup, and an output |
---|
| 134 | vector size <varname>[n]</varname> specified by the decode process, we compute a |
---|
| 135 | floor output vector.</para> |
---|
| 136 | |
---|
| 137 | <para> |
---|
| 138 | If the value <varname>[amplitude]</varname> is zero, the return value is a |
---|
| 139 | length <varname>[n]</varname> vector with all-zero scalars. Otherwise, begin by |
---|
| 140 | assuming the following definitions for the given vector to be |
---|
| 141 | synthesized:</para> |
---|
| 142 | |
---|
| 143 | <informalequation> |
---|
| 144 | <mediaobject> |
---|
| 145 | <textobject><phrase>[lsp map equation]</phrase></textobject> |
---|
| 146 | <textobject role="tex"><phrase> |
---|
| 147 | <![CDATA[ |
---|
| 148 | \begin{math} |
---|
| 149 | \mathrm{map}_i = \left\{ |
---|
| 150 | \begin{array}{ll} |
---|
| 151 | \min ( |
---|
| 152 | \mathtt{floor0\_bark\_map\_size} - 1, |
---|
| 153 | foobar |
---|
| 154 | ) & \textrm{for } i \in [0,n-1] \\ |
---|
| 155 | -1 & \textrm{for } i = n |
---|
| 156 | \end{array} |
---|
| 157 | \right. |
---|
| 158 | \end {math} |
---|
| 159 | |
---|
| 160 | where |
---|
| 161 | |
---|
| 162 | \begin{math} |
---|
| 163 | foobar = |
---|
| 164 | \left\lfloor |
---|
| 165 | \mathrm{bark}\left(\frac{\mathtt{floor0\_rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\_bark\_map\_size}} {\mathrm{bark}(.5 \cdot \mathtt{floor0\_rate})} |
---|
| 166 | \right\rfloor |
---|
| 167 | \end{math} |
---|
| 168 | |
---|
| 169 | and |
---|
| 170 | |
---|
| 171 | \begin{math} |
---|
| 172 | \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2 + .0001x) |
---|
| 173 | \end{math} |
---|
| 174 | ]]> |
---|
| 175 | </phrase></textobject> |
---|
| 176 | <imageobject><imagedata fileref="lspmap.png"/></imageobject> |
---|
| 177 | </mediaobject> |
---|
| 178 | </informalequation> |
---|
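<para>
As an illustration, the map definition can be evaluated directly in
Python. The function names <literal>bark</literal> and
<literal>floor0_map</literal> and their parameter names are chosen here
for the sketch; the returned list has <varname>[n]</varname>+1 entries so
that the final -1 sentinel is present.</para>
<programlisting><![CDATA[
```python
import math


def bark(x):
    """Bark-scale approximation from the definition above (x in Hz)."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x + 0.0001 * x))


def floor0_map(n, rate, bark_map_size):
    """Return map elements 0..n; element n is the -1 sentinel."""
    scale = bark_map_size / bark(0.5 * rate)
    mapping = []
    for i in range(n):
        position = math.floor(bark(rate * i / (2 * n)) * scale)
        mapping.append(min(bark_map_size - 1, position))
    mapping.append(-1)
    return mapping
```
]]></programlisting>
<para>
Because the map depends only on <varname>[n]</varname> and the setup
values, a decoder can compute it once per block size rather than once
per frame.</para>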
| 179 | |
---|
| 180 | <para> |
---|
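| 181 | The above is used to synthesize the LSP curve on a Bark-scale frequency |
---|
| 182 | axis, then map the result to a linear-scale frequency axis. |
---|
| 183 | Similarly, the calculation below synthesizes the output LSP curve <varname>[output]</varname> on a log |
---|
| 184 | (dB) amplitude scale, mapping it to linear amplitude in the last step. Within it, ω is π * map element <varname>[i]</varname> / <varname>[floor0_bark_map_size]</varname>, re-evaluated whenever step 2 is reached, and c<subscript>k</subscript> is element k of <varname>[coefficients]</varname>:</para> |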
---|
| 185 | |
---|
| 186 | <orderedlist> |
---|
| 187 | <listitem><simpara> <varname>[i]</varname> = 0 </simpara></listitem> |
---|
| 188 | <listitem><para>if ( <varname>[floor0_order]</varname> is odd ) { |
---|
| 189 | <orderedlist> |
---|
| 190 | <listitem><para>calculate <varname>[p]</varname> and <varname>[q]</varname> according to: |
---|
| 191 | <informalequation> |
---|
| 192 | <mediaobject> |
---|
| 193 | <textobject><phrase>[equation for odd lsp]</phrase></textobject> |
---|
| 194 | <textobject role="tex"><phrase> |
---|
| 195 | <![CDATA[ |
---|
| 196 | \begin{eqnarray*} |
---|
| 197 | p & = & (1 - \cos^2\omega)\prod_{j=0}^{(\mathtt{order}-3)/2} 4 (\cos c_{2j+1} - \cos \omega)^2 \\ |
---|
| 198 | q & = & \frac{1}{4} \prod_{j=0}^{(\mathtt{order}-1)/2} 4 (\cos c_{2j} - \cos \omega)^2 |
---|
| 199 | \end{eqnarray*} |
---|
| 200 | ]]> |
---|
| 201 | </phrase></textobject> |
---|
| 202 | <imageobject><imagedata fileref="oddlsp.png"/></imageobject> |
---|
| 203 | </mediaobject> |
---|
| 204 | </informalequation> |
---|
| 205 | </para></listitem> |
---|
| 206 | </orderedlist> |
---|
| 207 | } else <varname>[floor0_order]</varname> is even { |
---|
| 208 | <orderedlist> |
---|
| 209 | <listitem><para>calculate <varname>[p]</varname> and <varname>[q]</varname> according to: |
---|
| 210 | <informalequation> |
---|
| 211 | <mediaobject> |
---|
| 212 | <textobject><phrase>[equation for even lsp]</phrase></textobject> |
---|
| 213 | <textobject role="tex"><phrase> |
---|
| 214 | <![CDATA[ |
---|
| 215 | \begin{eqnarray*} |
---|
| 216 | p & = & \frac{(1 - \cos\omega)}{2} \prod_{j=0}^{(\mathtt{order}-2)/2} 4 (\cos c_{2j+1} - \cos \omega)^2 \\ |
---|
| 217 | q & = & \frac{(1 + \cos\omega)}{2} \prod_{j=0}^{(\mathtt{order}-2)/2} 4 (\cos c_{2j} - \cos \omega)^2 |
---|
| 218 | \end{eqnarray*} |
---|
| 219 | ]]> |
---|
| 220 | </phrase></textobject> |
---|
| 221 | <imageobject><imagedata fileref="evenlsp.png"/></imageobject> |
---|
| 222 | </mediaobject> |
---|
| 223 | </informalequation> |
---|
| 224 | </para></listitem> |
---|
| 225 | </orderedlist> |
---|
| 226 | } |
---|
| 227 | </para></listitem> |
---|
| 228 | <listitem><para>calculate <varname>[linear_floor_value]</varname> according to: |
---|
| 229 | <informalequation> |
---|
| 230 | <mediaobject> |
---|
| 231 | <textobject><phrase>[expression for floorval]</phrase></textobject> |
---|
| 232 | <textobject role="tex"><phrase> |
---|
| 233 | <![CDATA[ |
---|
| 234 | \begin{math} |
---|
| 235 | \exp \left( .11512925 \left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\_amplitude\_offset}}{(2^{\mathtt{floor0\_amplitude\_bits}}-1)\sqrt{p+q}} |
---|
| 236 | - \mathtt{floor0\_amplitude\_offset} \right) \right) |
---|
| 237 | \end{math} |
---|
| 238 | ]]> |
---|
| 239 | </phrase></textobject> |
---|
| 240 | <imageobject><imagedata fileref="floorval.png"/></imageobject> |
---|
| 241 | </mediaobject> |
---|
| 242 | </informalequation> |
---|
| 243 | </para></listitem> |
---|
| 244 | <listitem><simpara><varname>[iteration_condition]</varname> = map element <varname>[i]</varname></simpara></listitem> |
---|
| 245 | <listitem><simpara><varname>[output]</varname> element <varname>[i]</varname> = <varname>[linear_floor_value]</varname></simpara></listitem> |
---|
| 246 | <listitem><simpara>increment <varname>[i]</varname></simpara></listitem> |
---|
| 247 | <listitem><simpara>if ( map element <varname>[i]</varname> is equal to <varname>[iteration_condition]</varname> ) continue at step 5</simpara></listitem> |
---|
| 248 | <listitem><simpara>if ( <varname>[i]</varname> is less than <varname>[n]</varname> ) continue at step 2</simpara></listitem> |
---|
| 249 | <listitem><simpara>done</simpara></listitem> |
---|
| 250 | </orderedlist> |
---|
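<para>
The loop above can be sketched in Python as follows. The function name
<literal>floor0_curve</literal>, the <literal>cfg</literal> dictionary
keys and the <literal>mapping</literal> argument (a list of
<varname>[n]</varname>+1 map elements ending in -1) are hypothetical.
The sketch rests on several assumptions that should be checked against
the equations: ω is taken as π * map element <varname>[i]</varname> /
<varname>[floor0_bark_map_size]</varname>, the product for
<varname>[p]</varname> runs over the odd-indexed coefficients and the
product for <varname>[q]</varname> over the even-indexed ones, and for
even order the leading factors are (1 - cos ω)/2 and (1 + cos ω)/2.</para>
<programlisting><![CDATA[
```python
import math


def floor0_curve(amplitude, coefficients, cfg, mapping, n):
    """Return the length-n linear floor curve (all zeros if unused)."""
    if amplitude == 0:
        return [0.0] * n
    order = cfg["order"]
    offset = cfg["amplitude_offset"]
    max_amplitude = 2 ** cfg["amplitude_bits"] - 1
    output = [0.0] * n
    i = 0
    while i < n:
        # Assumed definition of omega for the current map element.
        cos_w = math.cos(math.pi * mapping[i] / cfg["bark_map_size"])
        p = 1.0
        for c in coefficients[1:order:2]:  # odd-indexed coefficients
            p *= 4.0 * (math.cos(c) - cos_w) ** 2
        q = 1.0
        for c in coefficients[0:order:2]:  # even-indexed coefficients
            q *= 4.0 * (math.cos(c) - cos_w) ** 2
        if order % 2:
            p *= 1.0 - cos_w * cos_w
            q *= 0.25
        else:
            p *= (1.0 - cos_w) / 2.0
            q *= (1.0 + cos_w) / 2.0
        value = math.exp(0.11512925 * (
            amplitude * offset / (max_amplitude * math.sqrt(p + q)) - offset))
        # Reuse the value for every output bin sharing this map element;
        # the -1 sentinel at mapping[n] ends the run at i == n.
        iteration_condition = mapping[i]
        while True:
            output[i] = value
            i += 1
            if mapping[i] != iteration_condition:
                break
    return output
```
]]></programlisting>
<para>
The inner loop mirrors steps 5 through 7: adjacent output bins that land
on the same Bark-scale map element share one evaluation of the filter
response.</para>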
| 251 | |
---|
| 252 | </section> |
---|
| 253 | |
---|
| 254 | </section> |
---|
| 255 | |
---|
| 256 | </section> |
---|
| 257 | |
---|