
source: downloads/libvorbis-1.2.0/doc/xml/03-codebook.xml @ 16

Last change on this file since 16 was 16, checked in by landauf, 16 years ago

added libvorbis

1<?xml version="1.0" standalone="no"?>
2<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3                "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
4
5]>
6
7<section id="vorbis-spec-codebook">
8<sectioninfo>
9<releaseinfo>
10 $Id: 03-codebook.xml 7186 2004-07-20 07:19:25Z xiphmont $
11</releaseinfo>
12</sectioninfo>
13<title>Probability Model and Codebooks</title>
14
15<section>
16<title>Overview</title>
17
18<para>
19Unlike practically every other mainstream audio codec, Vorbis has no
20statically configured probability model, instead packing all entropy
21decoding configuration, VQ and Huffman, into the bitstream itself in
22the third header, the codec setup header.  This packed configuration
23consists of multiple 'codebooks', each containing a specific
24Huffman-equivalent representation for decoding compressed codewords as
25well as an optional lookup table of output vector values to which a
26decoded Huffman value is applied as an offset, generating the final
27decoded output corresponding to a given compressed codeword.</para>
28
29<section><title>Bitwise operation</title>
30<para>
31The codebook mechanism is built on top of the Vorbis bitpacker. Both
32the codebooks themselves and the codewords they decode are unrolled
33from a packet as a series of arbitrary-width values read from the
34stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para>
35</section>
36
37</section>
38
39<section>
40<title>Packed codebook format</title>
41
42<para>
43For purposes of the examples below, we assume that the storage
44system's native byte width is eight bits.  This is not universally
45true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion
46relating to non-eight-bit bytes.</para>
47
48<section><title>Codebook decode</title>
49
50<para>
51A codebook begins with a 24 bit sync pattern, 0x564342:
52
53<screen>
54byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
55byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
56byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
57</screen></para>
58
59<para>
6016 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields:
61
62<screen>
63
64byte 3: [ X X X X X X X X ]
65byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
66
67byte 5: [ X X X X X X X X ]
68byte 6: [ X X X X X X X X ]
69byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
70
71</screen></para>
72
73<para>
74Next is the <varname>[ordered]</varname> bit flag:
75
76<screen>
77
78byte 8: [               X ] [ordered] (1 bit)
79
80</screen></para>
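<para>
The fixed-size fields read so far can be sketched in Python as follows.
This is an illustration only, not the reference decoder: the helper names
<function>bit_reader</function> and <function>read_codebook_header</function>
are hypothetical, and the sketch assumes eight-bit bytes with values packed
least significant bit first, as described in the bitpacking section.
</para>
<programlisting><![CDATA[
```python
SYNC = 0x564342  # 24 bit codebook sync pattern from the text above


def bit_reader(data):
    """Return read(n): the next n bits of `data` as an unsigned integer.

    Assumption: bits are consumed least significant bit first within each
    byte, and earlier bits land in the lower positions of the result.
    """
    pos = 0

    def read(n):
        nonlocal pos
        if pos + n > 8 * len(data):
            raise EOFError("end of packet")
        value = 0
        for i in range(n):
            byte, bit = divmod(pos + i, 8)
            value |= ((data[byte] >> bit) & 1) << i
        pos += n
        return value

    return read


def read_codebook_header(read):
    """Read sync pattern, [codebook_dimensions], [codebook_entries], [ordered]."""
    if read(24) != SYNC:
        raise ValueError("bad codebook sync pattern")
    dimensions = read(16)
    entries = read(24)
    ordered = bool(read(1))
    return dimensions, entries, ordered
```
]]></programlisting>
<para>
For example, the nine bytes 0x42 0x43 0x56 0x02 0x00 0x08 0x00 0x00 0x01
would describe a two-dimensional codebook of eight entries with the
<varname>[ordered]</varname> flag set.
</para>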
81
82<para>
83Each entry, numbering a
84total of <varname>[codebook_entries]</varname>, is assigned a codeword length.
85We now read the list of codeword lengths and store these lengths in
86the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is
87according to whether the <varname>[ordered]</varname> flag is set or unset.
88
89<itemizedlist>
90<listitem>
91  <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not
92  length ordered and the decoder needs to read each codeword length
93  one-by-one.</para> 
94
95  <para>The decoder first reads one additional bit flag, the
96  <varname>[sparse]</varname> flag.  This flag determines whether or not the
97  codebook contains unused entries that are not to be included in the
98  codeword decode tree:
99
100<screen>
101byte 8: [             X 0 ] [sparse] flag (1 bit)
102</screen></para>
103
104<para>
105  The decoder now performs for each of the <varname>[codebook_entries]</varname>
106  codebook entries:
107
108<screen>
109 
110  1) if([sparse] is set){
111
112         2) [flag] = read one bit;
113         3) if([flag] is set){
114
115              4) [length] = read a five bit unsigned integer;
116              5) codeword length for this entry is [length]+1;
117
118            } else {
119
120              6) this entry is unused.  mark it as such.
121
122            }
123
124     } else the sparse flag is not set {
125
126        7) [length] = read a five bit unsigned integer;
127        8) the codeword length for this entry is [length]+1;
128       
129     }
130
131</screen></para>
132</listitem>
133<listitem>
134  <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this
135  codebook is encoded in ascending length order.  Rather than reading
136  a length for every codeword, the decoder reads the number of
137  codewords per length.  That is, beginning at entry zero:
138
139<screen>
140  1) [current_entry] = 0;
141  2) [current_length] = read a five bit unsigned integer and add 1;
142  3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer
143  4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive,
144    of the [codebook_codeword_lengths] array to [current_length]
145  5) set [current_entry] to [number] + [current_entry]
146  6) increment [current_length] by 1
147  7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION;
148    the decoder will not be able to read this stream.
149  8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
150  9) done.
151</screen></para>
152</listitem>
153</itemizedlist>
154
155After all codeword lengths have been decoded, the decoder reads the
156vector lookup table.  Vorbis I supports three lookup types:
157<orderedlist>
158<listitem>
159<simpara>No lookup</simpara>
160</listitem><listitem>
161<simpara>Implicitly populated value mapping (lattice VQ)</simpara>
162</listitem><listitem>
163<simpara>Explicitly populated value mapping (tessellated or 'foam'
164VQ)</simpara>
165</listitem>
166</orderedlist>
167</para>
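<para>
The codeword length decode described above (the step that precedes the
lookup table) can be sketched in Python as follows. This is a minimal
sketch under stated assumptions: <function>read</function> is a
hypothetical callable that returns the next n bits as an unsigned integer,
the <varname>[ordered]</varname> flag has already been read by the caller,
unused entries are represented by <literal>None</literal>, and ilog(x) is
taken to be the number of bits needed to represent x.
</para>
<programlisting><![CDATA[
```python
def read_codeword_lengths(read, entries, ordered):
    """Return the [codebook_codeword_lengths] list; None marks an unused entry."""
    lengths = []
    if not ordered:
        # Unordered: one length per entry, optionally with a per-entry "used" flag.
        sparse = read(1)
        for _ in range(entries):
            if sparse and not read(1):
                lengths.append(None)            # entry is unused
            else:
                lengths.append(read(5) + 1)
    else:
        # Ordered: a starting length, then the count of entries at each length.
        current_length = read(5) + 1
        while len(lengths) < entries:
            remaining = entries - len(lengths)
            number = read(remaining.bit_length())   # ilog(remaining) bits
            if number > remaining:
                raise ValueError("more codewords than codebook entries")
            lengths.extend([current_length] * number)
            current_length += 1
    return lengths
```
]]></programlisting>
<para>
The sketch raises an error before storing lengths past the end of the
array, which corresponds to the error condition in step 7 of the ordered
case.
</para>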
168
169<para>
170The lookup table type is read as a four bit unsigned integer:
171<screen>
172  1) [codebook_lookup_type] = read four bits as an unsigned integer
173</screen></para>
174
175<para>
176Codebook decode proceeds according to <varname>[codebook_lookup_type]</varname>:
177<itemizedlist>
178<listitem>
179<para>Lookup type zero indicates no lookup to be read.  Proceed past
180lookup decode.</para>
181</listitem><listitem>
182<para>Lookup types one and two are similar, differing only in the
183number of lookup values to be read.  Lookup type one reads a list of
184values that are permuted in a set pattern to build a list of vectors,
185each vector of order <varname>[codebook_dimensions]</varname> scalars.  Lookup
186type two builds the same vector list, but reads each scalar for each
187vector explicitly, rather than building vectors from a smaller list of
188possible scalar values.  Lookup decode proceeds as follows:
189
190<screen>
191  1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
192  2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
193  3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
194  4) [codebook_sequence_p] = read 1 bit as a boolean flag
195
196  if ( [codebook_lookup_type] is 1 ) {
197   
198     5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> )
199
200  } else {
201
202     6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname>
203
204  }
205
206  7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each;
207     store these in order in the array [codebook_multiplicands]
208</screen></para>
209</listitem><listitem>
210<para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved
211and indicates a stream that is not decodable by the specification in this
212document.</para>
213</listitem>
214</itemizedlist>
215</para>
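<para>
The lookup table read can be sketched in Python as follows. The bodies of
<function>float32_unpack</function> and <function>lookup1_values</function>
are defined elsewhere in the specification; the versions here are an
assumption based on those definitions (a 21 bit mantissa, a 10 bit exponent
biased by 788, and the largest integer whose
<varname>[codebook_dimensions]</varname>-th power does not exceed
<varname>[codebook_entries]</varname>). As before,
<function>read</function> is a hypothetical callable returning the next n
bits as an unsigned integer, and the returned dictionary is only an
illustrative container.
</para>
<programlisting><![CDATA[
```python
import math


def float32_unpack(x):
    """Assumed Vorbis float format: sign bit, 10 bit exponent, 21 bit mantissa."""
    mantissa = x & 0x1FFFFF
    exponent = (x & 0x7FE00000) >> 21
    if x & 0x80000000:
        mantissa = -mantissa
    return math.ldexp(mantissa, exponent - 788)


def lookup1_values(entries, dimensions):
    """Largest r such that r ** dimensions <= entries (assumed definition)."""
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    r = 0
    while (r + 1) ** dimensions <= entries:
        r += 1
    return r


def read_lookup(read, entries, dimensions):
    """Read [codebook_lookup_type] and, for types 1 and 2, the lookup values."""
    lookup_type = read(4)
    if lookup_type == 0:
        return None                      # no lookup table to read
    if lookup_type > 2:
        raise ValueError("reserved lookup type, stream not decodable")
    minimum = float32_unpack(read(32))
    delta = float32_unpack(read(32))
    value_bits = read(4) + 1
    sequence_p = bool(read(1))
    if lookup_type == 1:
        count = lookup1_values(entries, dimensions)
    else:
        count = entries * dimensions
    multiplicands = [read(value_bits) for _ in range(count)]
    return {"type": lookup_type, "minimum": minimum, "delta": delta,
            "sequence_p": sequence_p, "multiplicands": multiplicands}
```
]]></programlisting>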
216
217<para>
218An 'end of packet' during any read operation in the above steps is
219considered an error condition rendering the stream undecodable.</para>
220
221<section><title>Huffman decision tree representation</title>
222
223<para>
224The <varname>[codebook_codeword_lengths]</varname> array and
225<varname>[codebook_entries]</varname> value uniquely define the Huffman decision
226tree used for entropy decoding.</para>
227
228<para>
229Briefly, each used codebook entry (recall that length-unordered
230codebooks support unused codeword entries) is assigned, in order, the
231lowest valued unused binary Huffman codeword possible.  Assume the
232following codeword length list:
233
234<screen>
235entry 0: length 2
236entry 1: length 4
237entry 2: length 4
238entry 3: length 4
239entry 4: length 4
240entry 5: length 2
241entry 6: length 3
242entry 7: length 3
243</screen></para>
244
245<para>
246Assigning codewords in order (lowest possible value of the appropriate
247length to highest) results in the following codeword list:
248
249<screen>
250entry 0: length 2 codeword 00
251entry 1: length 4 codeword 0100
252entry 2: length 4 codeword 0101
253entry 3: length 4 codeword 0110
254entry 4: length 4 codeword 0111
255entry 5: length 2 codeword 10
256entry 6: length 3 codeword 110
257entry 7: length 3 codeword 111
258</screen></para>
259
260
261<note>
262<para>
263Unlike most binary numerical values in this document, we
264intend the above codewords to be read and used bit by bit from left to
265right, thus the codeword '001' is the bit string 'zero, zero, one'.
266When determining 'lowest possible value' in the assignment definition
267above, the leftmost bit is the MSb.</para>
268</note>
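<para>
The assignment rule can be sketched in Python as a direct transcription of
the definition above. It is not taken from any particular decoder; the
function name and the use of bit strings and <literal>None</literal> for
unused entries are choices made for this illustration. The sketch keeps a
list of the prefixes whose subtrees are still unclaimed and, for each used
entry, takes the lowest valued codeword of the requested length that one of
those prefixes allows.
</para>
<programlisting><![CDATA[
```python
def assign_codewords(lengths):
    """Map a codeword length list to codewords written as bit strings.

    Codewords are read left to right, leftmost bit first. A length of None
    marks an unused entry, which receives no codeword.
    """
    free = [""]            # prefixes of unclaimed subtrees; "" is the whole tree
    codewords = []
    for length in lengths:
        if length is None:
            codewords.append(None)
            continue
        usable = [p for p in free if len(p) <= length]
        if not usable:
            raise ValueError("overspecified codeword length list")
        # Lowest valued codeword: extend each candidate prefix with zeros.
        prefix = min(usable, key=lambda p: p.ljust(length, "0"))
        free.remove(prefix)
        word = prefix.ljust(length, "0")
        # Taking the leftmost leaf leaves one free right-hand sibling per level.
        free.extend(word[:k] + "1" for k in range(len(prefix), length))
        codewords.append(word)
    return codewords
```
]]></programlisting>
<para>
Applied to the length list 2, 4, 4, 4, 4, 2, 3, 3 this yields the codeword
list shown above.
</para>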
269
270<para>
271It is clear that the codeword length list represents a Huffman
272decision tree with the entry numbers equivalent to the leaves numbered
273left-to-right:
274
275<mediaobject>
276<imageobject>
277 <imagedata fileref="hufftree.png" format="PNG"/>
278</imageobject>
279<textobject>
280 <phrase>[huffman tree illustration]</phrase>
281</textobject>
282</mediaobject>
283</para>
284
285<para>
286As we assign codewords in order, we see that each choice constructs a
287new leaf in the leftmost possible position.</para>
288
289<para>
290Note that it's possible to underspecify or overspecify a Huffman tree
291via the length list.  In the above example, if codeword seven were
292eliminated, it's clear that the tree is unfinished:
293
294<mediaobject>
295<imageobject>
296 <imagedata fileref="hufftree-under.png" format="PNG"/>
297</imageobject>
298<textobject>
299 <phrase>[underspecified huffman tree illustration]</phrase>
300</textobject>
301</mediaobject>
302</para>
303
304<para>
305Similarly, in the original codebook, it's clear that the tree is fully
306populated and a ninth codeword is impossible.  Both underspecified and
307overspecified trees are an error condition rendering the stream
308undecodable.</para>
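<para>
One way to test a length list for these two conditions, offered here as a
sketch rather than as the method any decoder uses, is to sum the share of
the tree each used codeword occupies: a codeword of length n covers
1/2^n of the tree. Under the in-order assignment described above, a sum of
exactly one corresponds to a fully populated tree, a smaller sum to an
underspecified tree, and a larger sum to an overspecified one. The function
name and return strings are hypothetical.
</para>
<programlisting><![CDATA[
```python
from fractions import Fraction


def classify_length_list(lengths):
    """Classify a codeword length list; None marks an unused entry."""
    # Exact rational arithmetic avoids rounding for long codewords.
    total = sum(Fraction(1, 2 ** n) for n in lengths if n is not None)
    if total < 1:
        return "underspecified"
    if total > 1:
        return "overspecified"
    return "complete"
```
]]></programlisting>
<para>
For the example list the sum is 2/4 + 4/16 + 2/8 = 1, so the tree is
complete; removing the last entry leaves 7/8, and adding a ninth codeword
of any length pushes the sum above one.
</para>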
309
310<para>
311Codebook entries marked 'unused' are simply skipped in the assigning
312process.  They have no codeword and do not appear in the decision
313tree, thus it's impossible for any bit pattern read from the stream to
314decode to that entry number.</para>
315
316</section>
317
318<section><title>VQ lookup table vector representation</title>
319
320<para>
321Unpacking the VQ lookup table vectors relies on the following values:
322<programlisting>
323the [codebook_multiplicands] array
324[codebook_minimum_value]
325[codebook_delta_value]
326[codebook_sequence_p]
327[codebook_lookup_type]
328[codebook_entries]
329[codebook_dimensions]
330[codebook_lookup_values]
331</programlisting>
332</para>
333
334<para>
335Decoding (unpacking) a specific vector in the vector lookup table
336proceeds according to <varname>[codebook_lookup_type]</varname>.  The unpacked
337vector values are what a codebook would return during audio packet
338decode in a VQ context.</para>
339
340<section><title>Vector value decode: Lookup type 1</title>
341
342<para>
343Lookup type one specifies a lattice VQ lookup table built
344algorithmically from a list of scalar values.  Calculate (unpack) the
345final values of a codebook entry vector from the entries in
346<varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
347is the output vector representing the vector of values for entry number
348<varname>[lookup_offset]</varname> in this codebook):
349
350<screen>
351  1) [last] = 0;
352  2) [index_divisor] = 1;
353  3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
354       
355       4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer
356          division ) integer modulo [codebook_lookup_values]
357
358       5) vector [value_vector] element [i] =
359            ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
360            [codebook_delta_value] + [codebook_minimum_value] + [last];
361
362       6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
363
364       7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
365
366     }
367 
368  8) vector calculation completed.
369</screen></para>
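<para>
The steps above translate into Python roughly as follows. This is a sketch
with hypothetical names; it assumes
<varname>[codebook_lookup_values]</varname> equals the length of the
multiplicand list, which is how the list is read during codebook decode.
</para>
<programlisting><![CDATA[
```python
def unpack_lookup1(lookup_offset, multiplicands, minimum, delta,
                   sequence_p, dimensions):
    """Build the value vector for entry `lookup_offset` (lookup type 1)."""
    lookup_values = len(multiplicands)
    last = 0.0
    index_divisor = 1
    vector = []
    for _ in range(dimensions):
        # Treat the entry number as a base-[lookup_values] number, one digit
        # per dimension, least significant digit first.
        offset = (lookup_offset // index_divisor) % lookup_values
        value = multiplicands[offset] * delta + minimum + last
        vector.append(value)
        if sequence_p:
            last = value          # each element accumulates onto the previous
        index_divisor *= lookup_values
    return vector
```
]]></programlisting>
<para>
With multiplicands 0, 1, 2, a minimum of -1, a delta of 1 and two
dimensions, entry 5 selects multiplicands 2 and 1, giving the vector
(1, 0); with <varname>[codebook_sequence_p]</varname> set the same entry
gives (1, 1).
</para>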
370
371</section>
372
373<section><title>Vector value decode: Lookup type 2</title>
374
375<para>
376Lookup type two specifies a VQ lookup table in which each scalar in
377each vector is explicitly set by the <varname>[codebook_multiplicands]</varname>
378array in a one-to-one mapping.  Calculate (unpack) the
379final values of a codebook entry vector from the entries in
380<varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
381is the output vector representing the vector of values for entry number
382<varname>[lookup_offset]</varname> in this codebook):
383
384<screen>
385  1) [last] = 0;
386  2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
387  3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
388
389       4) vector [value_vector] element [i] =
390            ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
391            [codebook_delta_value] + [codebook_minimum_value] + [last];
392
393       5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
394
395       6) increment [multiplicand_offset]
396
397     }
398 
399  7) vector calculation completed.
400</screen></para>
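<para>
A corresponding Python sketch for lookup type two, again with hypothetical
names, reads <varname>[codebook_dimensions]</varname> consecutive
multiplicands starting at the entry's own offset:
</para>
<programlisting><![CDATA[
```python
def unpack_lookup2(lookup_offset, multiplicands, minimum, delta,
                   sequence_p, dimensions):
    """Build the value vector for entry `lookup_offset` (lookup type 2)."""
    last = 0.0
    offset = lookup_offset * dimensions   # first scalar of this entry
    vector = []
    for _ in range(dimensions):
        value = multiplicands[offset] * delta + minimum + last
        vector.append(value)
        if sequence_p:
            last = value          # each element accumulates onto the previous
        offset += 1
    return vector
```
]]></programlisting>
<para>
With multiplicands 0, 1, 2, 3, a minimum of 0.5, a delta of 0.25 and two
dimensions, entry 1 uses multiplicands 2 and 3 and unpacks to (1.0, 1.25).
</para>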
401
402</section>
403
404</section>
405
406</section>
407
408</section>
409
410<section>
411<title>Use of the codebook abstraction</title>
412
413<para>
414The decoder uses the codebook abstraction much as it does the
415bit-unpacking convention; a specific codebook reads a
416codeword from the bitstream, decoding it into an entry number, and then
417returns that entry number to the decoder (when used in a scalar
418entropy coding context), or uses that entry number as an offset into
419the VQ lookup table, returning a vector of values (when used in a context
420desiring a VQ value). Scalar or VQ context is always explicit; any call
421to the codebook mechanism requests either a scalar entry number or a
422lookup vector.</para>
423
424<para>
425Note that VQ lookup type zero indicates that there is no lookup table;
426requesting decode using a codebook of lookup type 0 in any context
427expecting a vector return value (even in a case where the vector has
428dimension one) is forbidden.  If decoder setup or decode requests such
429an action, that is an error condition rendering the packet
430undecodable.</para>
431
432<para>
433Using a codebook to read from the packet bitstream consists first of
434reading and decoding the next codeword in the bitstream. The decoder
435reads bits until the accumulated bits match a codeword in the
436codebook.  This process can be thought of as logically walking the
437Huffman decode tree by reading one bit at a time from the bitstream,
438and using the bit as a decision boolean to take the 0 branch (left in
439the above examples) or the 1 branch (right in the above examples).
440Walking the tree finishes when the decode process hits a leaf in the
441decision tree; the result is the entry number corresponding to that
442leaf.  Reading past the end of a packet propagates the 'end-of-stream'
443condition to the decoder.</para>
444
445<para>
446When used in a scalar context, the resulting codeword entry is the
447desired return value.</para>
448
449<para>
450When used in a VQ context, the codeword entry number is used as an
451offset into the VQ lookup table.  The value returned to the decoder is
452the vector of scalars corresponding to this offset.</para>
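<para>
The two ways of using a codebook can be sketched in Python as follows. The
names are hypothetical: <varname>bits</varname> is any iterator yielding
the packet's bits one at a time, <varname>codewords</varname> maps each
codeword (written as a left-to-right bit string) to its entry number, and
<varname>vq_table</varname> is a list of already unpacked vectors indexed
by entry number, or <literal>None</literal> for lookup type zero. Matching
accumulated bits against a dictionary stands in for walking the decision
tree; a real decoder would more likely use a tree or table.
</para>
<programlisting><![CDATA[
```python
def decode_entry(bits, codewords):
    """Scalar context: read one codeword and return its entry number."""
    prefix = ""
    for bit in bits:
        prefix += "1" if bit else "0"
        if prefix in codewords:       # reached a leaf of the decision tree
            return codewords[prefix]
    raise EOFError("end of packet while reading a codeword")


def decode_vector(bits, codewords, vq_table):
    """VQ context: read one codeword and return the vector it selects."""
    if vq_table is None:
        raise ValueError("codebook of lookup type 0 used in a VQ context")
    return vq_table[decode_entry(bits, codewords)]
```
]]></programlisting>
<para>
Because no codeword is a prefix of another, the first match found while
accumulating bits is the only possible one.
</para>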
453
454</section>
455
456</section>
457
458<!-- end section of probability model and codebooks -->