1 | <?xml version="1.0" standalone="no"?> |
---|
2 | <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
---|
3 | "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ |
---|
4 | |
---|
5 | ]> |
---|
6 | |
---|
7 | <section id="vorbis-spec-codebook"> |
---|
8 | <sectioninfo> |
---|
9 | <releaseinfo> |
---|
10 | $Id: 03-codebook.xml 7186 2004-07-20 07:19:25Z xiphmont $ |
---|
11 | </releaseinfo> |
---|
12 | </sectioninfo> |
---|
13 | <title>Probability Model and Codebooks</title> |
---|
14 | |
---|
15 | <section> |
---|
16 | <title>Overview</title> |
---|
17 | |
---|
18 | <para> |
---|
19 | Unlike practically every other mainstream audio codec, Vorbis has no |
---|
20 | statically configured probability model, instead packing all entropy |
---|
21 | decoding configuration, VQ and Huffman, into the bitstream itself in |
---|
22 | the third header, the codec setup header. This packed configuration |
---|
23 | consists of multiple 'codebooks', each containing a specific |
---|
24 | Huffman-equivalent representation for decoding compressed codewords as |
---|
25 | well as an optional lookup table of output vector values to which a |
---|
26 | decoded Huffman value is applied as an offset, generating the final |
---|
27 | decoded output corresponding to a given compressed codeword.</para> |
---|
28 | |
---|
29 | <section><title>Bitwise operation</title> |
---|
30 | <para> |
---|
31 | The codebook mechanism is built on top of the Vorbis bitpacker. Both |
---|
32 | the codebooks themselves and the codewords they decode are unrolled |
---|
33 | from a packet as a series of arbitrary-width values read from the |
---|
34 | stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para> |
---|
35 | </section> |
---|
36 | |
---|
37 | </section> |
---|
38 | |
---|
39 | <section> |
---|
40 | <title>Packed codebook format</title> |
---|
41 | |
---|
42 | <para> |
---|
43 | For purposes of the examples below, we assume that the storage |
---|
44 | system's native byte width is eight bits. This is not universally |
---|
45 | true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion |
---|
46 | relating to non-eight-bit bytes.</para> |
---|
47 | |
---|
48 | <section><title>codebook decode</title> |
---|
49 | |
---|
50 | <para> |
---|
51 | A codebook begins with a 24 bit sync pattern, 0x564342: |
---|
52 | |
---|
53 | <screen> |
---|
54 | byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42) |
---|
55 | byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43) |
---|
56 | byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56) |
---|
57 | </screen></para> |
---|
58 | |
---|
59 | <para> |
---|
60 | Next come the 16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields: |
---|
61 | |
---|
62 | <screen> |
---|
63 | |
---|
64 | byte 3: [ X X X X X X X X ] |
---|
65 | byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned) |
---|
66 | |
---|
67 | byte 5: [ X X X X X X X X ] |
---|
68 | byte 6: [ X X X X X X X X ] |
---|
69 | byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned) |
---|
70 | |
---|
71 | </screen></para> |
---|
72 | |
---|
73 | <para> |
---|
74 | Next is the <varname>[ordered]</varname> bit flag: |
---|
75 | |
---|
76 | <screen> |
---|
77 | |
---|
78 | byte 8: [               X ] [ordered] (1 bit) |
---|
79 | |
---|
80 | </screen></para> |
---|
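<para>
The fixed-width fields read so far can be sketched in Python as follows.
This is a minimal illustration, not part of the specification: the
<varname>BitReader</varname> class and the
<varname>read_codebook_header</varname> function are hypothetical names,
and the reader assumes eight-bit bytes and the LSb-first ordering
described in the bitpacking section.</para>
<programlisting><![CDATA[
```python
class BitReader:
    """Hypothetical LSb-first bit reader following the Vorbis bitpacking convention."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, nbits):
        """Read nbits as an unsigned integer; the first bit read is the LSb."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("end of packet")  # an error during codebook decode
        value = 0
        for i in range(nbits):
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


def read_codebook_header(reader):
    """Return (dimensions, entries, ordered) or raise on a bad sync pattern."""
    if reader.read(24) != 0x564342:
        raise ValueError("codebook sync pattern not found")
    dimensions = reader.read(16)
    entries = reader.read(24)
    ordered = reader.read(1)
    return dimensions, entries, ordered


# Bytes 0-2 carry the sync pattern, bytes 3-4 the dimensions, bytes 5-7 the
# entry count, and the low bit of byte 8 the [ordered] flag.
packet = bytes([0x42, 0x43, 0x56, 0x02, 0x00, 0x08, 0x00, 0x00, 0x01])
print(read_codebook_header(BitReader(packet)))  # (2, 8, 1)
```
]]></programlisting>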
81 | |
---|
82 | <para> |
---|
83 | Each entry, numbering a |
---|
84 | total of <varname>[codebook_entries]</varname>, is assigned a codeword length. |
---|
85 | We now read the list of codeword lengths and store these lengths in |
---|
86 | the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is |
---|
87 | according to whether the <varname>[ordered]</varname> flag is set or unset. |
---|
88 | |
---|
89 | <itemizedlist> |
---|
90 | <listitem> |
---|
91 | <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not |
---|
92 | length ordered and the decoder needs to read each codeword length |
---|
93 | one-by-one.</para> |
---|
94 | |
---|
95 | <para>The decoder first reads one additional bit flag, the |
---|
96 | <varname>[sparse]</varname> flag. This flag determines whether or not the |
---|
97 | codebook contains unused entries that are not to be included in the |
---|
98 | codeword decode tree: |
---|
99 | |
---|
100 | <screen> |
---|
101 | byte 8: [             X 0 ] [sparse] flag (1 bit) |
---|
102 | </screen></para> |
---|
103 | |
---|
104 | <para> |
---|
105 | The decoder now performs for each of the <varname>[codebook_entries]</varname> |
---|
106 | codebook entries: |
---|
107 | |
---|
108 | <screen> |
---|
109 | |
---|
110 | 1) if([sparse] is set){ |
---|
111 | |
---|
112 | 2) [flag] = read one bit; |
---|
113 | 3) if([flag] is set){ |
---|
114 | |
---|
115 | 4) [length] = read a five bit unsigned integer; |
---|
116 | 5) codeword length for this entry is [length]+1; |
---|
117 | |
---|
118 | } else { |
---|
119 | |
---|
120 | 6) this entry is unused. mark it as such. |
---|
121 | |
---|
122 | } |
---|
123 | |
---|
124 | } else the sparse flag is not set { |
---|
125 | |
---|
126 | 7) [length] = read a five bit unsigned integer; |
---|
127 | 8) the codeword length for this entry is [length]+1; |
---|
128 | |
---|
129 | } |
---|
130 | |
---|
131 | </screen></para> |
---|
132 | </listitem> |
---|
133 | <listitem> |
---|
134 | <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this |
---|
135 | codebook is encoded in ascending length order. Rather than reading |
---|
136 | a length for every codeword, the decoder reads the number of |
---|
137 | codewords per length. That is, beginning at entry zero: |
---|
138 | |
---|
139 | <screen> |
---|
140 | 1) [current_entry] = 0; |
---|
141 | 2) [current_length] = read a five bit unsigned integer and add 1; |
---|
142 | 3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer |
---|
143 | 4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive, |
---|
144 | of the [codebook_codeword_lengths] array to [current_length] |
---|
145 | 5) set [current_entry] to [number] + [current_entry] |
---|
146 | 6) increment [current_length] by 1 |
---|
147 | 7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION; |
---|
148 | the decoder will not be able to read this stream. |
---|
149 | 8) if [current_entry] is less than [codebook_entries], repeat process starting at 3) |
---|
150 | 9) done. |
---|
151 | </screen></para> |
---|
152 | </listitem> |
---|
153 | </itemizedlist> |
---|
154 | |
---|
155 | After all codeword lengths have been decoded, the decoder reads the |
---|
156 | vector lookup table. Vorbis I supports three lookup types: |
---|
157 | <orderedlist> |
---|
158 | <listitem> |
---|
159 | <simpara>No lookup</simpara> |
---|
160 | </listitem><listitem> |
---|
161 | <simpara>Implicitly populated value mapping (lattice VQ)</simpara> |
---|
162 | </listitem><listitem> |
---|
163 | <simpara>Explicitly populated value mapping (tessellated or 'foam' |
---|
164 | VQ)</simpara> |
---|
165 | </listitem> |
---|
166 | </orderedlist> |
---|
167 | </para> |
---|
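<para>
The two codeword length decode procedures above (unordered, with or
without the sparse flag, and ordered) can be sketched in Python as
follows. This is an illustrative sketch only:
<varname>BitReader</varname>, <varname>pack_bits</varname> and
<varname>read_codeword_lengths</varname> are hypothetical names, unused
entries are represented by <varname>None</varname>, and
<varname>ilog</varname> is assumed to return the position of the highest
set bit, with zero for an input of zero. The reader is assumed to be
positioned just after the <varname>[ordered]</varname> flag.</para>
<programlisting><![CDATA[
```python
class BitReader:
    """Hypothetical LSb-first bit reader (Vorbis bitpacking convention)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, nbits):
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("end of packet")
        value = 0
        for i in range(nbits):
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


def pack_bits(fields):
    """Test helper: pack (value, width) pairs LSb-first into bytes."""
    acc, used = 0, 0
    for value, width in fields:
        acc |= value << used
        used += width
    return acc.to_bytes((used + 7) // 8, "little")


def ilog(x):
    """Position of the highest set bit; zero for non-positive input."""
    return x.bit_length() if x > 0 else 0


def read_codeword_lengths(reader, entries, ordered):
    lengths = [None] * entries  # None marks an unused entry
    if not ordered:
        sparse = reader.read(1)
        for entry in range(entries):
            if sparse and not reader.read(1):
                continue  # entry is unused
            lengths[entry] = reader.read(5) + 1
        return lengths
    current_entry = 0
    current_length = reader.read(5) + 1
    while current_entry < entries:
        number = reader.read(ilog(entries - current_entry))
        if current_entry + number > entries:
            raise ValueError("codeword count exceeds [codebook_entries]")
        for entry in range(current_entry, current_entry + number):
            lengths[entry] = current_length
        current_entry += number
        current_length += 1
    return lengths


# Ordered: start at length 2, then 2, 2 and 4 codewords of lengths 2, 3 and 4.
ordered_bits = pack_bits([(1, 5), (2, 4), (2, 3), (4, 3)])
print(read_codeword_lengths(BitReader(ordered_bits), 8, True))
# [2, 2, 3, 3, 4, 4, 4, 4]

# Unordered and sparse: entry 1 is flagged as unused.
sparse_bits = pack_bits([(1, 1), (1, 1), (1, 5), (0, 1), (1, 1), (0, 5)])
print(read_codeword_lengths(BitReader(sparse_bits), 3, False))
# [2, None, 1]
```
]]></programlisting>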
168 | |
---|
169 | <para> |
---|
170 | The lookup table type is read as a four bit unsigned integer: |
---|
171 | <screen> |
---|
172 | 1) [codebook_lookup_type] = read four bits as an unsigned integer |
---|
173 | </screen></para> |
---|
174 | |
---|
175 | <para> |
---|
176 | Codebook decode proceeds according to <varname>[codebook_lookup_type]</varname>: |
---|
177 | <itemizedlist> |
---|
178 | <listitem> |
---|
179 | <para>Lookup type zero indicates no lookup to be read. Proceed past |
---|
180 | lookup decode.</para> |
---|
181 | </listitem><listitem> |
---|
182 | <para>Lookup types one and two are similar, differing only in the |
---|
183 | number of lookup values to be read. Lookup type one reads a list of |
---|
184 | values that are permuted in a set pattern to build a list of vectors, |
---|
185 | each vector of order <varname>[codebook_dimensions]</varname> scalars. Lookup |
---|
186 | type two builds the same vector list, but reads each scalar for each |
---|
187 | vector explicitly, rather than building vectors from a smaller list of |
---|
188 | possible scalar values. Lookup decode proceeds as follows: |
---|
189 | |
---|
190 | <screen> |
---|
191 | 1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) |
---|
192 | 2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) |
---|
193 | 3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1 |
---|
194 | 4) [codebook_sequence_p] = read 1 bit as a boolean flag |
---|
195 | |
---|
196 | if ( [codebook_lookup_type] is 1 ) { |
---|
197 | |
---|
198 | 5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> ) |
---|
199 | |
---|
200 | } else { |
---|
201 | |
---|
202 | 6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname> |
---|
203 | |
---|
204 | } |
---|
205 | |
---|
206 | 7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each; |
---|
207 | store these in order in the array [codebook_multiplicands] |
---|
208 | </screen></para> |
---|
209 | </listitem><listitem> |
---|
210 | <para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved |
---|
211 | and indicates a stream that is not decodable by the specification in this |
---|
212 | document.</para> |
---|
213 | </listitem> |
---|
214 | </itemizedlist> |
---|
215 | </para> |
---|
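<para>
The number of values stored in <varname>[codebook_multiplicands]</varname>
differs considerably between the two lookup types. The Python sketch
below illustrates the computation; it assumes that
<varname>lookup1_values</varname> returns the greatest integer whose
<varname>[codebook_dimensions]</varname>-th power does not exceed
<varname>[codebook_entries]</varname>, and the function name
<varname>lookup_value_count</varname> is hypothetical.</para>
<programlisting><![CDATA[
```python
def lookup1_values(entries, dimensions):
    """Greatest integer r such that r ** dimensions <= entries."""
    if dimensions < 1:
        raise ValueError("this sketch requires at least one dimension")
    r = 0
    while (r + 1) ** dimensions <= entries:
        r += 1
    return r


def lookup_value_count(lookup_type, entries, dimensions):
    """Number of multiplicands to read for lookup types one and two."""
    if lookup_type == 1:
        return lookup1_values(entries, dimensions)
    if lookup_type == 2:
        return entries * dimensions
    raise ValueError("no multiplicands are read for this lookup type")


# A lattice codebook with 81 entries of dimension 4 stores only 3 scalars;
# the explicit form of the same codebook stores 81 * 4 = 324.
print(lookup_value_count(1, 81, 4))  # 3
print(lookup_value_count(2, 81, 4))  # 324
```
]]></programlisting>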
216 | |
---|
217 | <para> |
---|
218 | An 'end of packet' during any read operation in the above steps is |
---|
219 | considered an error condition rendering the stream undecodable.</para> |
---|
220 | |
---|
221 | <section><title>Huffman decision tree representation</title> |
---|
222 | |
---|
223 | <para> |
---|
224 | The <varname>[codebook_codeword_lengths]</varname> array and |
---|
225 | <varname>[codebook_entries]</varname> value uniquely define the Huffman decision |
---|
226 | tree used for entropy decoding.</para> |
---|
227 | |
---|
228 | <para> |
---|
229 | Briefly, each used codebook entry (recall that length-unordered |
---|
230 | codebooks support unused codeword entries) is assigned, in order, the |
---|
231 | lowest valued unused binary Huffman codeword possible. Assume the |
---|
232 | following codeword length list: |
---|
233 | |
---|
234 | <screen> |
---|
235 | entry 0: length 2 |
---|
236 | entry 1: length 4 |
---|
237 | entry 2: length 4 |
---|
238 | entry 3: length 4 |
---|
239 | entry 4: length 4 |
---|
240 | entry 5: length 2 |
---|
241 | entry 6: length 3 |
---|
242 | entry 7: length 3 |
---|
243 | </screen></para> |
---|
244 | |
---|
245 | <para> |
---|
246 | Assigning codewords in order (lowest possible value of the appropriate |
---|
247 | length to highest) results in the following codeword list: |
---|
248 | |
---|
249 | <screen> |
---|
250 | entry 0: length 2 codeword 00 |
---|
251 | entry 1: length 4 codeword 0100 |
---|
252 | entry 2: length 4 codeword 0101 |
---|
253 | entry 3: length 4 codeword 0110 |
---|
254 | entry 4: length 4 codeword 0111 |
---|
255 | entry 5: length 2 codeword 10 |
---|
256 | entry 6: length 3 codeword 110 |
---|
257 | entry 7: length 3 codeword 111 |
---|
258 | </screen></para> |
---|
259 | |
---|
260 | |
---|
261 | <note> |
---|
262 | <para> |
---|
263 | Unlike most binary numerical values in this document, we |
---|
264 | intend the above codewords to be read and used bit by bit from left to |
---|
265 | right, thus the codeword '001' is the bit string 'zero, zero, one'. |
---|
266 | When determining 'lowest possible value' in the assignment definition |
---|
267 | above, the leftmost bit is the MSb.</para> |
---|
268 | </note> |
---|
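<para>
The assignment rule can be sketched in Python as follows. This is a
deliberately simple search written for clarity; a practical decoder
would use a faster method. The function name
<varname>assign_codewords</varname> is hypothetical, codewords are
represented as strings of '0' and '1' read left to right, and unused
entries are represented by <varname>None</varname>.</para>
<programlisting><![CDATA[
```python
def assign_codewords(lengths):
    """Give each used entry, in order, the lowest valued free codeword."""
    assigned = []   # codewords handed out so far
    codewords = []  # result, indexed by entry number
    for length in lengths:
        if length is None:
            codewords.append(None)  # unused entries receive no codeword
            continue
        for value in range(2 ** length):
            candidate = format(value, "0{}b".format(length))
            # A candidate is free if it neither extends nor prefixes
            # any codeword that is already in the tree.
            clash = any(candidate.startswith(c) or c.startswith(candidate)
                        for c in assigned)
            if not clash:
                break
        else:
            raise ValueError("overspecified tree: no codeword of this length is free")
        assigned.append(candidate)
        codewords.append(candidate)
    return codewords


# The length list from the example above.
for entry, codeword in enumerate(assign_codewords([2, 4, 4, 4, 4, 2, 3, 3])):
    print("entry", entry, "codeword", codeword)
```
]]></programlisting>
<para>
Run on the example length list, the sketch reproduces the codeword list
given above.</para>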
269 | |
---|
270 | <para> |
---|
271 | It is clear that the codeword length list represents a Huffman |
---|
272 | decision tree with the entry numbers equivalent to the leaves numbered |
---|
273 | left-to-right: |
---|
274 | |
---|
275 | <mediaobject> |
---|
276 | <imageobject> |
---|
277 | <imagedata fileref="hufftree.png" format="PNG"/> |
---|
278 | </imageobject> |
---|
279 | <textobject> |
---|
280 | <phrase>[huffman tree illustration]</phrase> |
---|
281 | </textobject> |
---|
282 | </mediaobject> |
---|
283 | </para> |
---|
284 | |
---|
285 | <para> |
---|
286 | As we assign codewords in order, we see that each choice constructs a |
---|
287 | new leaf in the leftmost possible position.</para> |
---|
288 | |
---|
289 | <para> |
---|
290 | Note that it's possible to underspecify or overspecify a Huffman tree |
---|
291 | via the length list. In the above example, if codeword seven were |
---|
292 | eliminated, the tree would be left unfinished: |
---|
293 | |
---|
294 | <mediaobject> |
---|
295 | <imageobject> |
---|
296 | <imagedata fileref="hufftree-under.png" format="PNG"/> |
---|
297 | </imageobject> |
---|
298 | <textobject> |
---|
299 | <phrase>[underspecified huffman tree illustration]</phrase> |
---|
300 | </textobject> |
---|
301 | </mediaobject> |
---|
302 | </para> |
---|
303 | |
---|
304 | <para> |
---|
305 | Similarly, in the original codebook, it's clear that the tree is fully |
---|
306 | populated and a ninth codeword is impossible. Both underspecified and |
---|
307 | overspecified trees are an error condition rendering the stream |
---|
308 | undecodable.</para> |
---|
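<para>
One way to detect both conditions from the length list alone is to sum
2 to the power of minus the length over all used entries (the Kraft
sum): a fully populated tree sums to exactly one. The specification
does not mandate this check; the Python sketch below is one possible
approach, and the function name <varname>tree_status</varname> is
hypothetical.</para>
<programlisting><![CDATA[
```python
from fractions import Fraction


def tree_status(lengths):
    """Classify a codeword length list; None marks an unused entry."""
    total = sum(Fraction(1, 2 ** n) for n in lengths if n is not None)
    if total == 1:
        return "complete"
    return "underspecified" if total < 1 else "overspecified"


example = [2, 4, 4, 4, 4, 2, 3, 3]
print(tree_status(example))        # complete
print(tree_status(example[:-1]))   # underspecified (entry seven removed)
print(tree_status(example + [4]))  # overspecified (a ninth codeword)
```
]]></programlisting>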
309 | |
---|
310 | <para> |
---|
311 | Codebook entries marked 'unused' are simply skipped in the assigning |
---|
312 | process. They have no codeword and do not appear in the decision |
---|
313 | tree, thus it's impossible for any bit pattern read from the stream to |
---|
314 | decode to that entry number.</para> |
---|
315 | |
---|
316 | </section> |
---|
317 | |
---|
318 | <section><title>VQ lookup table vector representation</title> |
---|
319 | |
---|
320 | <para> |
---|
321 | Unpacking the VQ lookup table vectors relies on the following values: |
---|
322 | <programlisting> |
---|
323 | the [codebook_multiplicands] array |
---|
324 | [codebook_minimum_value] |
---|
325 | [codebook_delta_value] |
---|
326 | [codebook_sequence_p] |
---|
327 | [codebook_lookup_type] |
---|
328 | [codebook_entries] |
---|
329 | [codebook_dimensions] |
---|
330 | [codebook_lookup_values] |
---|
331 | </programlisting> |
---|
332 | </para> |
---|
333 | |
---|
334 | <para> |
---|
335 | Decoding (unpacking) a specific vector in the vector lookup table |
---|
336 | proceeds according to <varname>[codebook_lookup_type]</varname>. The unpacked |
---|
337 | vector values are what a codebook would return during audio packet |
---|
338 | decode in a VQ context.</para> |
---|
339 | |
---|
340 | <section><title>Vector value decode: Lookup type 1</title> |
---|
341 | |
---|
342 | <para> |
---|
343 | Lookup type one specifies a lattice VQ lookup table built |
---|
344 | algorithmically from a list of scalar values. Calculate (unpack) the |
---|
345 | final values of a codebook entry vector from the entries in |
---|
346 | <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname> |
---|
347 | is the output vector representing the vector of values for entry number |
---|
348 | <varname>[lookup_offset]</varname> in this codebook): |
---|
349 | |
---|
350 | <screen> |
---|
351 | 1) [last] = 0; |
---|
352 | 2) [index_divisor] = 1; |
---|
353 | 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) { |
---|
354 | |
---|
355 | 4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer |
---|
356 | division ) integer modulo [codebook_lookup_values] |
---|
357 | |
---|
358 | 5) vector [value_vector] element [i] = |
---|
359 | ( [codebook_multiplicands] array element number [multiplicand_offset] ) * |
---|
360 | [codebook_delta_value] + [codebook_minimum_value] + [last]; |
---|
361 | |
---|
362 | 6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] |
---|
363 | |
---|
364 | 7) [index_divisor] = [index_divisor] * [codebook_lookup_values] |
---|
365 | |
---|
366 | } |
---|
367 | |
---|
368 | 8) vector calculation completed. |
---|
369 | </screen></para> |
---|
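<para>
The lookup type one procedure can be sketched in Python as follows. The
function name <varname>unpack_vector_type1</varname> and its argument
order are hypothetical; the arguments correspond to the bracketed values
used in the steps above.</para>
<programlisting><![CDATA[
```python
def unpack_vector_type1(lookup_offset, multiplicands, dimensions,
                        lookup_values, minimum_value, delta_value, sequence_p):
    """Build the value vector for one entry of a lattice (type 1) codebook."""
    last = 0.0
    index_divisor = 1
    value_vector = []
    for _ in range(dimensions):
        # Treat the entry number as a base-[lookup_values] number and
        # take one digit per dimension, lowest digit first.
        multiplicand_offset = (lookup_offset // index_divisor) % lookup_values
        value = (multiplicands[multiplicand_offset] * delta_value
                 + minimum_value + last)
        value_vector.append(value)
        if sequence_p:
            last = value
        index_divisor *= lookup_values
    return value_vector


# Three scalars and two dimensions describe 3 * 3 = 9 entries.
print(unpack_vector_type1(5, [0, 1, 2], 2, 3, -1.0, 1.0, False))  # [1.0, 0.0]
print(unpack_vector_type1(5, [0, 1, 2], 2, 3, -1.0, 1.0, True))   # [1.0, 1.0]
```
]]></programlisting>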
370 | |
---|
371 | </section> |
---|
372 | |
---|
373 | <section><title>Vector value decode: Lookup type 2</title> |
---|
374 | |
---|
375 | <para> |
---|
376 | Lookup type two specifies a VQ lookup table in which each scalar in |
---|
377 | each vector is explicitly set by the <varname>[codebook_multiplicands]</varname> |
---|
378 | array in a one-to-one mapping. Calculate (unpack) the |
---|
379 | final values of a codebook entry vector from the entries in |
---|
380 | <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname> |
---|
381 | is the output vector representing the vector of values for entry number |
---|
382 | <varname>[lookup_offset]</varname> in this codebook): |
---|
383 | |
---|
384 | <screen> |
---|
385 | 1) [last] = 0; |
---|
386 | 2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions] |
---|
387 | 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) { |
---|
388 | |
---|
389 | 4) vector [value_vector] element [i] = |
---|
390 | ( [codebook_multiplicands] array element number [multiplicand_offset] ) * |
---|
391 | [codebook_delta_value] + [codebook_minimum_value] + [last]; |
---|
392 | |
---|
393 | 5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] |
---|
394 | |
---|
395 | 6) increment [multiplicand_offset] |
---|
396 | |
---|
397 | } |
---|
398 | |
---|
399 | 7) vector calculation completed. |
---|
400 | </screen></para> |
---|
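<para>
The lookup type two procedure can be sketched in Python as follows. As
before, the function name <varname>unpack_vector_type2</varname> and its
argument order are hypothetical; the arguments correspond to the
bracketed values used in the steps above.</para>
<programlisting><![CDATA[
```python
def unpack_vector_type2(lookup_offset, multiplicands, dimensions,
                        minimum_value, delta_value, sequence_p):
    """Build the value vector for one entry of an explicit (type 2) codebook."""
    last = 0.0
    # Each entry owns [codebook_dimensions] consecutive multiplicands.
    multiplicand_offset = lookup_offset * dimensions
    value_vector = []
    for _ in range(dimensions):
        value = (multiplicands[multiplicand_offset] * delta_value
                 + minimum_value + last)
        value_vector.append(value)
        if sequence_p:
            last = value
        multiplicand_offset += 1
    return value_vector


# Three entries of dimension two; entry 2 uses multiplicands 4 and 5.
table = [0, 1, 2, 3, 4, 5]
print(unpack_vector_type2(2, table, 2, 0.5, 0.25, False))  # [1.5, 1.75]
print(unpack_vector_type2(2, table, 2, 0.5, 0.25, True))   # [1.5, 3.25]
```
]]></programlisting>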
401 | |
---|
402 | </section> |
---|
403 | |
---|
404 | </section> |
---|
405 | |
---|
406 | </section> |
---|
407 | |
---|
408 | </section> |
---|
409 | |
---|
410 | <section> |
---|
411 | <title>Use of the codebook abstraction</title> |
---|
412 | |
---|
413 | <para> |
---|
414 | The decoder uses the codebook abstraction much as it does the |
---|
415 | bit-unpacking convention; a specific codebook reads a |
---|
416 | codeword from the bitstream, decoding it into an entry number, and then |
---|
417 | returns that entry number to the decoder (when used in a scalar |
---|
418 | entropy coding context), or uses that entry number as an offset into |
---|
419 | the VQ lookup table, returning a vector of values (when used in a context |
---|
420 | desiring a VQ value). Scalar or VQ context is always explicit; any call |
---|
421 | to the codebook mechanism requests either a scalar entry number or a |
---|
422 | lookup vector.</para> |
---|
423 | |
---|
424 | <para> |
---|
425 | Note that VQ lookup type zero indicates that there is no lookup table; |
---|
426 | requesting decode using a codebook of lookup type 0 in any context |
---|
427 | expecting a vector return value (even in a case where the vector is of |
---|
428 | dimension one) is forbidden. If decoder setup or decode requests such |
---|
429 | an action, that is an error condition rendering the packet |
---|
430 | undecodable.</para> |
---|
431 | |
---|
432 | <para> |
---|
433 | Using a codebook to read from the packet bitstream consists first of |
---|
434 | reading and decoding the next codeword in the bitstream. The decoder |
---|
435 | reads bits until the accumulated bits match a codeword in the |
---|
436 | codebook. This process can be thought of as logically walking the |
---|
437 | Huffman decode tree by reading one bit at a time from the bitstream, |
---|
438 | and using the bit as a decision boolean to take the 0 branch (left in |
---|
439 | the above examples) or the 1 branch (right in the above examples). |
---|
440 | Walking the tree finishes when the decode process hits a leaf in the |
---|
441 | decision tree; the result is the entry number corresponding to that |
---|
442 | leaf. Reading past the end of a packet propagates the 'end-of-packet' |
---|
443 | condition to the decoder.</para> |
---|
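<para>
The bit-by-bit walk can be sketched in Python as follows. The sketch
matches accumulated bits against a table of codeword strings rather than
building an explicit tree, which yields the same result for a valid
codebook. The function name <varname>decode_entry</varname> is
hypothetical, and <varname>next_bit</varname> is assumed to be a
callable that returns the next bit of the packet as 0 or 1.</para>
<programlisting><![CDATA[
```python
def decode_entry(next_bit, codewords):
    """Read bits until they match a codeword; return its entry number."""
    table = {cw: entry for entry, cw in enumerate(codewords) if cw is not None}
    longest = max(len(cw) for cw in table)
    bits = ""
    while len(bits) < longest:
        bits += str(next_bit())  # 0 takes the left branch, 1 the right
        if bits in table:
            return table[bits]   # reached a leaf
    raise ValueError("bits do not match any codeword")


# Codewords from the example tree, indexed by entry number.
codewords = ["00", "0100", "0101", "0110", "0111", "10", "110", "111"]
stream = iter([1, 1, 0, 0, 0, 0, 1, 0, 1])
print([decode_entry(stream.__next__, codewords) for _ in range(3)])  # [6, 0, 2]
```
]]></programlisting>
<para>
In a scalar context the returned entry number is the result; in a VQ
context it would be used as <varname>[lookup_offset]</varname> into the
lookup table.</para>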
444 | |
---|
445 | <para> |
---|
446 | When used in a scalar context, the resulting codeword entry is the |
---|
447 | desired return value.</para> |
---|
448 | |
---|
449 | <para> |
---|
450 | When used in a VQ context, the codeword entry number is used as an |
---|
451 | offset into the VQ lookup table. The value returned to the decoder is |
---|
452 | the vector of scalars corresponding to this offset.</para> |
---|
453 | |
---|
454 | </section> |
---|
455 | |
---|
456 | </section> |
---|
457 | |
---|
458 | <!-- end section of probability model and codebooks --> |
---|