
source: downloads/libvorbis-1.2.0/doc/xml/08-residue.xml @ 16

Last change on this file since 16 was 16, checked in by landauf, 17 years ago

added libvorbis

File size: 17.9 KB
1<?xml version="1.0" standalone="no"?>
2<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3                "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
4
5]>
6
7<section id="vorbis-spec-residue">
8<sectioninfo>
9 <releaseinfo>
10  $Id: 08-residue.xml 13159 2007-06-21 05:22:35Z xiphmont $
11 </releaseinfo>
12</sectioninfo>
13<title>Residue setup and decode</title>
14
15
16<section>
17<title>Overview</title>
18
19<para>
20A residue vector represents the fine detail of the audio spectrum of
21one channel in an audio frame after the encoder subtracts the floor
22curve and performs any channel coupling.  A residue vector may
23represent spectral lines, spectral magnitude, spectral phase or
24hybrids as mixed by channel coupling.  The exact semantic content of
25the vector does not matter to the residue abstraction.</para>
26
27<para>
28Whatever the exact qualities, the Vorbis residue abstraction codes the
29residue vectors into the bitstream packet, and then reconstructs the
30vectors during decode.  Vorbis makes use of three different encoding
31variants (numbered 0, 1 and 2) of the same basic vector encoding
32abstraction.</para>
33
34</section>
35
36<section>
37<title>Residue format</title>
38
39<para>
40Residue format partitions each vector in the vector bundle into chunks,
41classifies each chunk, encodes the chunk classifications and finally
42encodes the chunks themselves using the specific VQ arrangement
43defined for each selected classification.
44The exact interleaving and partitioning vary by residue encoding number;
45however, the high-level process used to classify and encode the residue
46vector is the same in all three variants.</para>
47
48<para>
49The residue vectors coded as a set are all of the same length.  The high-level
50coding structure, ignoring for the moment exactly how a partition is
51encoded and simply trusting that it is, is as follows:</para>
52
53<para>
54<itemizedlist>
55<listitem><para>Each vector is partitioned into multiple equal-sized chunks
56according to the specified configuration.  If we have a vector size of
57<emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
58of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
59coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>.  It is
60important to note that the integer division truncates.  In the example
61below, we assume a <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
62
63<listitem><para>Each partition in each vector has a classification number that
64specifies which of multiple configured VQ codebook setups is used to
65decode that partition.  The classification numbers of each partition
66can be thought of as forming a vector in their own right, as in the
67illustration below.  Just as the residue vectors are coded in grouped
68partitions to increase encoding efficiency, the classification vector
69is also partitioned into chunks.  The integer elements of each
70classification chunk are built into a single scalar that
71represents the classification numbers in that chunk.  In the example
72below, the classification codeword encodes two classification
73numbers.</para></listitem>
74
75<listitem><para>The values in a residue vector may be encoded monolithically in a
76single pass through the residue vector, but more often efficient
77codebook design dictates that each vector is encoded as the additive
78sum of several passes through the residue vector using more than one
79VQ codebook.  Thus, each residue value potentially accumulates values
80from multiple decode passes.  The classification value associated with
81a partition is the same in each pass, thus the classification codeword
82is coded only in the first pass.</para></listitem>
83
84</itemizedlist>
85</para>
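<para>
The first two points above can be sketched in a few lines of Python.
This is only an illustration: the function names and sample values are
hypothetical, and the unpacking order of the classification codeword is
taken from the packet decode pseudocode given later in this section.</para>

```python
def total_partitions(n, residue_partition_size, ch):
    """Total partitioned chunks coded: n / residue_partition_size * ch.

    The integer division truncates, so a tail shorter than one partition
    is not counted.
    """
    return (n // residue_partition_size) * ch


def unpack_classword(codeword, residue_classifications, classwords_per_codeword):
    """Split one classification codeword into its classification numbers.

    The last number of the chunk is the least significant digit in base
    residue_classifications, so the digits are peeled off in descending order.
    """
    numbers = [0] * classwords_per_codeword
    for i in range(classwords_per_codeword - 1, -1, -1):
        numbers[i] = codeword % residue_classifications
        codeword //= residue_classifications
    return numbers


# Hypothetical sample values: 3 vectors of 100 scalars, partition size 8.
print(total_partitions(100, 8, 3))   # 12 whole partitions per vector, 36 total
# A codeword of 7 with 5 possible classes, two class numbers per codeword.
print(unpack_classword(7, 5, 2))     # 7 = 1*5 + 2, giving [1, 2]
```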
86
87<mediaobject>
88<imageobject>
89 <imagedata fileref="residue-pack.png" format="PNG"/>
90</imageobject>
91<textobject>
92 <phrase>[illustration of residue vector format]</phrase>
93</textobject>
94</mediaobject>
95
96</section>
97
98<section><title>residue 0</title>
99
100<para>
101Residue 0 and 1 differ only in the way the values within a residue
102partition are interleaved during partition encoding (visually treated
103as a black box--or cyan box or brown box--in the above figure).</para>
104
105<para>
106Residue encoding 0 interleaves VQ encoding according to the
107dimension of the codebook used to encode a partition in a specific
108pass.  The dimension of the codebook need not be the same in multiple
109passes; however, the partition size must be an integer multiple of the
110codebook dimension.</para>
111
112<para>
113As an example, assume a partition vector of size eight, to be encoded
114by residue 0 using codebook dimensions of 8, 4, 2 and 1:</para>
115
116<programlisting>
117
118            original residue vector: [ 0 1 2 3 4 5 6 7 ]
119
120codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
121
122codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
123
124codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
125
126codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
127
128</programlisting>
129
130<para>
131It is worth mentioning at this point that no configurable value in the
132residue coding setup is restricted to a power of two.</para>
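<para>
The residue 0 grouping shown in the listing above can be reproduced with
the following Python sketch.  The function name is hypothetical; the
stride is the partition size divided by the codebook dimension, as in
the format 0 decode pseudocode later in this section.  The last call
uses a partition size of six to show that powers of two are not
required.</para>

```python
def residue0_groups(partition, codebook_dimensions):
    """Group a partition's scalars the way residue 0 hands them to VQ.

    With step = len(partition) / codebook_dimensions, VQ vector i holds the
    scalars at positions i, i + step, i + 2*step, and so on.
    """
    n = len(partition)
    if n % codebook_dimensions != 0:
        raise ValueError("partition size must be a multiple of the dimension")
    step = n // codebook_dimensions
    return [partition[i::step] for i in range(step)]


partition = list(range(8))
for dim in (8, 4, 2, 1):
    print(dim, residue0_groups(partition, dim))

# Not a power of two: partition size 6, codebook dimension 3.
print(residue0_groups(list(range(6)), 3))
```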
133
134</section>
135
136<section><title>residue 1</title>
137
138<para>
139Residue 1 does not interleave VQ encoding.  It represents partition
140vector scalars in order.  As with residue 0, however, partition length
141must be an integer multiple of the codebook dimension, although
142dimension may vary from pass to pass.</para>
143
144<para>
145As an example, assume a partition vector of size eight, to be encoded
146by residue 1 using codebook dimensions of 8, 4, 2 and 1:</para>
147
148<programlisting>
149
150            original residue vector: [ 0 1 2 3 4 5 6 7 ]
151
152codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
153
154codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
155
156codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
157
158codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
159
160</programlisting>
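<para>
For comparison, a Python sketch of the residue 1 grouping in the listing
above; the function name is hypothetical.  The scalars are simply cut
into consecutive runs of the codebook dimension.</para>

```python
def residue1_groups(partition, codebook_dimensions):
    """Group a partition's scalars in order, without interleaving."""
    n = len(partition)
    if n % codebook_dimensions != 0:
        raise ValueError("partition size must be a multiple of the dimension")
    return [partition[i:i + codebook_dimensions]
            for i in range(0, n, codebook_dimensions)]


partition = list(range(8))
for dim in (8, 4, 2, 1):
    print(dim, residue1_groups(partition, dim))
```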
161
162</section>
163
164<section><title>residue 2</title>
165
166<para>
167Residue type two can be thought of as a variant of residue type 1.
168Rather than encoding multiple passed-in vectors as in residue type 1,
169the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
170interleaved and flattened into a single vector of length
171<emphasis>ch</emphasis>*<emphasis>n</emphasis>.  Encoding then proceeds as in type 1. Decoding is
172as in type 1 with decode interleave reversed. If operating on a single
173vector to begin with, residue type 1 and type 2 are equivalent.</para>
174
175<mediaobject>
176<imageobject>
177 <imagedata fileref="residue2.png" format="PNG"/>
178</imageobject>
179<textobject>
180 <phrase>[illustration of residue type 2]</phrase>
181</textobject>
182</mediaobject>
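<para>
A minimal Python sketch of the interleave step, assuming element
<emphasis>i</emphasis> of vector <emphasis>j</emphasis> lands at index
<emphasis>i</emphasis>*<emphasis>ch</emphasis>+<emphasis>j</emphasis> of the
flattened vector (the mapping used by the format 2 deinterleave
pseudocode later in this section).  The function name is
hypothetical.</para>

```python
def interleave(vectors):
    """Flatten ch vectors of length n into one vector of length ch*n."""
    ch = len(vectors)
    n = len(vectors[0])
    flat = [0] * (ch * n)
    for j, vec in enumerate(vectors):
        for i, value in enumerate(vec):
            flat[i * ch + j] = value
    return flat


print(interleave([[1, 2, 3], [10, 20, 30]]))  # [1, 10, 2, 20, 3, 30]
print(interleave([[1, 2, 3]]))                # one vector: unchanged
```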
183
184</section>
185
186<section>
187<title>Residue decode</title>
188
189<section><title>header decode</title>
190
191<para>
192Header decode for all three residue types is identical.</para>
193<programlisting>
194  1) [residue_begin] = read 24 bits as unsigned integer
195  2) [residue_end] = read 24 bits as unsigned integer
196  3) [residue_partition_size] = read 24 bits as unsigned integer and add one
197  4) [residue_classifications] = read 6 bits as unsigned integer and add one
198  5) [residue_classbook] = read 8 bits as unsigned integer
199</programlisting>
200
201<para>
202<varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
203sub-portion of each vector that is actually coded; this implements something akin
204to a bandpass where, for coding purposes, the vector effectively
205begins at element <varname>[residue_begin]</varname> and ends at
206<varname>[residue_end]</varname>.  Preceding and following values in the unpacked
207vectors are zeroed.  Note that for residue type 2, these values as
208well as <varname>[residue_partition_size]</varname> apply to the interleaved
209vector, not the individual vectors before interleave.
210<varname>[residue_partition_size]</varname> is as explained above,
211<varname>[residue_classifications]</varname> is the number of possible
212classifications to which a partition can belong and
213<varname>[residue_classbook]</varname> is the codebook number used to code
214classification codewords.  The number of dimensions in book
215<varname>[residue_classbook]</varname> determines how many classification values
216are grouped into a single classification codeword.</para>
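<para>
The five header fields can be read with a sketch like the following.
It assumes the Vorbis bitpacking convention of reading each byte from
its least significant bit upward; the class and function names, the
packing helper and the sample values are all hypothetical.</para>

```python
class BitReader:
    """LSb-first bit reader over a bytes object (assumed packing order)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position in the packet

    def read(self, bits):
        value = 0
        for k in range(bits):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise EOFError("end of packet")
            value |= ((self.data[byte_index] >> bit_index) & 1) << k
            self.pos += 1
        return value


def read_residue_header_fields(reader):
    """Steps 1-5 of header decode; two fields are stored minus one."""
    return {
        "residue_begin": reader.read(24),
        "residue_end": reader.read(24),
        "residue_partition_size": reader.read(24) + 1,
        "residue_classifications": reader.read(6) + 1,
        "residue_classbook": reader.read(8),
    }


def pack_fields(fields):
    """Pack (value, bit width) pairs LSb-first; only builds sample input."""
    acc, nbits = 0, 0
    for value, width in fields:
        acc |= value << nbits
        nbits += width
    return acc.to_bytes((nbits + 7) // 8, "little")


sample = pack_fields([(0, 24), (256, 24), (31, 24), (9, 6), (3, 8)])
print(read_residue_header_fields(BitReader(sample)))
```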
217
218<para>
219Next we read a bitmap pattern that specifies which partition classes
220code values in which passes.</para>
221
222<programlisting>
223  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
224 
225       2) [high_bits] = 0
226       3) [low_bits] = read 3 bits as unsigned integer
227       4) [bitflag] = read one bit as boolean
228       5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
229       6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
230     }
231  7) done
232</programlisting>
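<para>
A Python sketch of the cascade read above.  The
<varname>read_bits</varname> callable and the scripted stand-in reader
are hypothetical; any bit reader that returns an unsigned integer of
the requested width would do.</para>

```python
def read_residue_cascade(read_bits, residue_classifications):
    """Read one cascade bitmap per classification."""
    cascade = []
    for _ in range(residue_classifications):
        high_bits = 0
        low_bits = read_bits(3)
        bitflag = read_bits(1)
        if bitflag:
            high_bits = read_bits(5)
        # low_bits fills bits 0-2, high_bits fills bits 3-7.
        cascade.append(high_bits * 8 + low_bits)
    return cascade


def scripted_reader(script):
    """Stand-in for a real bit reader: replays (bit width, value) pairs."""
    items = iter(script)

    def read_bits(bits):
        width, value = next(items)
        assert width == bits, "unexpected read width"
        return value

    return read_bits


# Class 0: low bits 5, no high bits.  Class 1: low bits 1, high bits 2.
reader = scripted_reader([(3, 5), (1, 0), (3, 1), (1, 1), (5, 2)])
print(read_residue_cascade(reader, 2))  # [5, 17]
```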
233
234<para>
235Finally, we read in a list of book numbers, each corresponding to
236a specific bit set in the cascade bitmap.  We loop over the possible
237codebook classifications and the maximum possible number of encoding
238stages (8 in Vorbis I, as constrained by the elements of the cascade
239bitmap being eight bits):</para>
240
241<programlisting>
242  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
243 
244       2) iterate [j] over the range 0 ... 7 {
245 
246            3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
247
248                 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
249
250               } else {
251
252                 5) array [residue_books] element [i][j] = unused
253
254               }
255          }
256      }
257
258  6) done
259</programlisting>
260
261<para>
262An end-of-packet condition at any point in header decode renders the
263stream undecodable.  In addition, any codebook number greater than the
264maximum numbered codebook set up in this stream also renders the
265stream undecodable.</para>
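<para>
The book list read and the codebook number check can be sketched as
follows.  The names <varname>read_bits</varname>,
<varname>codebook_count</varname> and <varname>UNUSED</varname> are
hypothetical, and raising an exception is just one way to signal an
undecodable stream.</para>

```python
UNUSED = None  # marker for a class/pass combination with no codebook


def read_residue_books(read_bits, residue_cascade, codebook_count):
    """Read one 8-bit book number per bit set in each cascade bitmap."""
    books = []
    for cascade in residue_cascade:
        row = []
        for j in range(8):
            if cascade & (1 << j):
                book = read_bits(8)
                if book >= codebook_count:
                    raise ValueError("codebook number out of range")
                row.append(book)
            else:
                row.append(UNUSED)
        books.append(row)
    return books


# Cascade 5 has bits 0 and 2 set; cascade 17 has bits 0 and 4 set.
values = iter([10, 11, 12, 13])
for row in read_residue_books(lambda bits: next(values), [5, 17], 16):
    print(row)
```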
266
267</section>
268
269<section><title>packet decode</title>
270
271<para>
272Format 0 and 1 packet decode is identical except for specific
273partition interleave.  Format 2 packet decode can be built out of the
274format 1 decode process.  Thus we describe first the decode
275infrastructure common to all three formats.</para>
276
277<para>
278In addition to configuration information, the residue decode process
279is passed the number of vectors in the submap bundle and a vector of
280flags indicating if any of the vectors are not to be decoded.  If the
281passed in number of vectors is 3 and vector number 1 is marked 'do not
282decode', decode skips vector 1 during the decode loop.  However, even
283'do not decode' vectors are allocated and zeroed.</para>
284
285<para>
286Depending on the values of <varname>[residue_begin]</varname> and
287<varname>[residue_end]</varname>, it is obvious that the encoded
288portion of a residue vector may be the entire possible residue vector
289or some other strict subset of the actual residue vector size with
290zero padding at either uncoded end.  However, it is also possible to
291set <varname>[residue_begin]</varname> and
292<varname>[residue_end]</varname> to specify a range partially or
293wholly beyond the maximum vector size.  Before beginning residue
294decode, limit <varname>[residue_begin]</varname> and
295<varname>[residue_end]</varname> to the maximum possible vector size
296as follows.  We assume that the number of vectors being encoded,
297<varname>[ch]</varname> is provided by the higher level decoding
298process.</para>
299
300<programlisting>
301  1) [actual_size] = current blocksize/2;
302  2) if residue encoding is format 2
303       3) [actual_size] = [actual_size] * [ch];
304  4) [limit_residue_begin] = minimum of ([residue_begin],[actual_size]);
305  5) [limit_residue_end] = minimum of ([residue_end],[actual_size]);
306</programlisting>
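<para>
A Python sketch of the range limiting.  Because the stated purpose is
to limit both values to the maximum possible vector size, the sketch
clamps each value with <varname>min</varname>.  The function and
parameter names are hypothetical.</para>

```python
def limit_residue_range(residue_begin, residue_end, blocksize, ch, residue_type):
    """Clamp the coded range to the size of the vector actually decoded."""
    actual_size = blocksize // 2
    if residue_type == 2:
        # Format 2 works on the single interleaved vector of all channels.
        actual_size *= ch
    limit_begin = min(residue_begin, actual_size)
    limit_end = min(residue_end, actual_size)
    return limit_begin, limit_end


# Hypothetical setup: blocksize 256, two channels, range 0..1024 in the header.
print(limit_residue_range(0, 1024, 256, 2, 1))  # (0, 128)
print(limit_residue_range(0, 1024, 256, 2, 2))  # (0, 256)
```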
307
308<para>
309The following convenience values are useful for clarifying
310the decode process:</para>
311
312<programlisting>
313  1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
314  2) [n_to_read] = [limit_residue_end] - [limit_residue_begin]
315  3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
316</programlisting>
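<para>
The same three values as a short Python sketch; the function name and
sample numbers are hypothetical.  Note that the partition count uses
truncating integer division, so a trailing fragment shorter than one
partition is never read.</para>

```python
def convenience_values(classbook_dimensions, limit_residue_begin,
                       limit_residue_end, residue_partition_size):
    """Derive the helper values used by packet decode."""
    classwords_per_codeword = classbook_dimensions
    n_to_read = limit_residue_end - limit_residue_begin
    partitions_to_read = n_to_read // residue_partition_size
    return classwords_per_codeword, n_to_read, partitions_to_read


# Class book of dimension 2, coded range 0..100, partition size 32.
print(convenience_values(2, 0, 100, 32))  # (2, 100, 3)
```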
317
318<para>
319Packet decode proceeds as follows, matching the description offered earlier in the document. </para>
320<programlisting>
321  1) allocate and zero all vectors that will be returned.
322  2) if ([n_to_read] is zero), stop; there is no residue to decode.
323  3) iterate [pass] over the range 0 ... 7 {
324
325       4) [partition_count] = 0
326
327       5) while [partition_count] is less than [partitions_to_read] {
328
329            6) if ([pass] is zero) {
330
331                 7) iterate [j] over the range 0 .. [ch]-1 {
332
333                      8) if vector [j] is not marked 'do not decode' {
334
335                           9) [temp] = read from packet using codebook [residue_classbook] in scalar context
336                          10) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
337
338                               11) array [classifications] element [j],([i]+[partition_count]) =
339                                   [temp] integer modulo [residue_classifications]
340                               12) [temp] = [temp] / [residue_classifications] using integer division
341
342                              }
343
344                         }
345
346                    }
347
348               }
349
350           13) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
351               is also less than [partitions_to_read] {
352
353                 14) iterate [j] over the range 0 .. [ch]-1 {
354
355                      15) if vector [j] is not marked 'do not decode' {
356
357                           16) [vqclass] = array [classifications] element [j],[partition_count]
358                           17) [vqbook] = array [residue_books] element [vqclass],[pass]
359                           18) if ([vqbook] is not 'unused') {
360
361                                19) decode partition into output vector number [j], starting at scalar
362                                    offset [limit_residue_begin]+[partition_count]*[residue_partition_size] using
363                                    codebook number [vqbook] in VQ context
364                          }
365                     }
366                   }
367                 20) increment [partition_count] by one
368
369               }
370          }
371     }
372 
373 21) done
374
375</programlisting>
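<para>
The loop nesting above may be easier to follow as code.  The Python
sketch below mirrors the numbered steps under several assumptions that
are not part of this document: codebook access is hidden behind two
hypothetical callables (<varname>read_classword</varname> and
<varname>decode_partition</varname>), the configuration is passed as a
plain dictionary, unused books are represented by
<varname>None</varname>, and an end-of-packet condition is signalled by
<varname>EOFError</varname>.</para>

```python
def decode_residue(ch, do_not_decode, vector_size, cfg,
                   read_classword, decode_partition):
    """Shared packet decode loop for residue formats 0 and 1 (sketch)."""
    vectors = [[0] * vector_size for _ in range(ch)]             # step 1
    begin = cfg["limit_residue_begin"]
    psize = cfg["residue_partition_size"]
    nclass = cfg["residue_classifications"]
    cpc = cfg["classwords_per_codeword"]
    n_to_read = cfg["limit_residue_end"] - begin
    partitions_to_read = n_to_read // psize
    if n_to_read == 0:                                           # step 2
        return vectors
    # Slack of one codeword so a partly used final codeword stays in range.
    classifications = [[0] * (partitions_to_read + cpc) for _ in range(ch)]
    try:
        for pass_ in range(8):                                   # step 3
            partition_count = 0                                  # step 4
            while partition_count < partitions_to_read:          # step 5
                if pass_ == 0:                                   # steps 6-12
                    for j in range(ch):
                        if not do_not_decode[j]:
                            temp = read_classword()
                            for i in range(cpc - 1, -1, -1):
                                classifications[j][i + partition_count] = temp % nclass
                                temp //= nclass
                i = 0
                while i < cpc and partition_count < partitions_to_read:  # step 13
                    for j in range(ch):                          # steps 14-19
                        if not do_not_decode[j]:
                            vqclass = classifications[j][partition_count]
                            vqbook = cfg["residue_books"][vqclass][pass_]
                            if vqbook is not None:
                                offset = begin + partition_count * psize
                                decode_partition(vectors[j], offset, vqbook)
                    partition_count += 1                         # step 20
                    i += 1
    except EOFError:
        pass  # end of packet is nominal: return what was decoded so far
    return vectors


def add_ones(vector, offset, vqbook):
    """Stand-in partition decoder: adds 1 to each scalar of a 4-wide partition."""
    for k in range(4):
        vector[offset + k] += 1


cfg = {
    "limit_residue_begin": 0, "limit_residue_end": 8,
    "residue_partition_size": 4, "residue_classifications": 2,
    "classwords_per_codeword": 2,
    "residue_books": [[None] * 8, [7] + [None] * 7],
}
# Classword 1 unpacks to classes [0, 1]: only the second partition has a book.
print(decode_residue(1, [False], 8, cfg, lambda: 1, add_ones))
```

<para>
With these stand-ins only the second partition is touched, and only in
pass 0, since class 1 has a book for that pass alone.</para>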
376
377<para>
378An end-of-packet condition during packet decode is to be considered a
379nominal occurrence.  Decode returns the result of vector decode up to
380that point.</para>
381
382</section>
383
384<section><title>format 0 specifics</title>
385
386<para>
387Format zero decodes partitions exactly as described earlier in the
388'Residue Format: residue 0' section.  The following pseudocode
389presents the same algorithm. Assume:</para>
390
391<para>
392<itemizedlist>
393<listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
394<listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
395<listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
396</itemizedlist>
397</para>
398
399<programlisting>
400 1) [step] = [n] / [codebook_dimensions]
401 2) iterate [i] over the range 0 ... [step]-1 {
402
403      3) vector [entry_temp] = read vector from packet using current codebook in VQ context
404      4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
405
406           5) vector [v] element ([offset]+[i]+[j]*[step]) =
407                vector [v] element ([offset]+[i]+[j]*[step]) +
408                vector [entry_temp] element [j]
409
410         }
411
412    }
413
414  6) done
415
416</programlisting>
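<para>
The format 0 partition decode as a Python sketch.  The
<varname>read_vq_vector</varname> callable is a hypothetical stand-in
for reading one vector from the packet using the current codebook in VQ
context; the sample input replays the two vectors from the residue 0
example with a codebook dimension of 4.</para>

```python
def decode_partition_format0(v, offset, n, codebook_dimensions, read_vq_vector):
    """Accumulate one interleaved partition into v, starting at offset."""
    step = n // codebook_dimensions
    for i in range(step):
        entry_temp = read_vq_vector()
        for j in range(codebook_dimensions):
            # Scalar j of VQ vector i lands at stride step within the partition.
            v[offset + i + j * step] += entry_temp[j]


vq_vectors = iter([[0, 2, 4, 6], [1, 3, 5, 7]])
v = [0] * 8
decode_partition_format0(v, 0, 8, 4, lambda: next(vq_vectors))
print(v)  # [0, 1, 2, 3, 4, 5, 6, 7]
```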
417
418</section>
419
420<section><title>format 1 specifics</title>
421
422<para>
423Format 1 decodes partitions exactly as described earlier in the
424'Residue Format: residue 1' section.  The following pseudocode
425presents the same algorithm. Assume:</para>
426
427<para>
428<itemizedlist>
429<listitem><simpara> <varname>[n]</varname> is the value in
430<varname>[residue_partition_size]</varname></simpara></listitem>
431<listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
432<listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
433</itemizedlist>
434</para>
435
436<programlisting>
437 1) [i] = 0
438 2) vector [entry_temp] = read vector from packet using current codebook in VQ context
439 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
440
441      4) vector [v] element ([offset]+[i]) =
442          vector [v] element ([offset]+[i]) +
443          vector [entry_temp] element [j]
444      5) increment [i]
445
446    }
447 
448  6) if ( [i] is less than [n] ) continue at step 2
449  7) done
450</programlisting>
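<para>
The format 1 partition decode as a Python sketch, again using a
hypothetical <varname>read_vq_vector</varname> callable in place of a
real codebook read.  It assumes a partition size of at least one, so
the read-then-test loop of the pseudocode can be written as a plain
<varname>while</varname> loop.</para>

```python
def decode_partition_format1(v, offset, n, codebook_dimensions, read_vq_vector):
    """Accumulate one non-interleaved partition into v, starting at offset."""
    i = 0
    while i < n:
        entry_temp = read_vq_vector()
        for j in range(codebook_dimensions):
            v[offset + i] += entry_temp[j]
            i += 1


vq_vectors = iter([[0, 1, 2, 3], [4, 5, 6, 7]])
v = [0] * 8
decode_partition_format1(v, 0, 8, 4, lambda: next(vq_vectors))
print(v)  # [0, 1, 2, 3, 4, 5, 6, 7]
```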
451
452</section>
453
454<section><title>format 2 specifics</title>
455 
456<para>
457Format 2 is reducible to format 1.  It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
458</para>
459
460<para>
461Format 2 handles 'do not decode' vectors differently than residue 0 or
4621; if all vectors are marked 'do not decode', no decode occurs.
463However, if at least one vector is to be decoded, all the vectors are
464decoded.  We then request normal format 1 to decode a single vector
465representing all output channels, rather than a vector for each
466channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:</para>
467
468<orderedlist>
469 <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
470 <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
471 <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each output channel, according to:
472  <programlisting>
473  1) iterate [i] over the range 0 ... [n]-1 {
474
475       2) iterate [j] over the range 0 ... [ch]-1 {
476
477            3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
478
479          }
480     }
481
482  4) done
483  </programlisting>
484 </para></listitem>
485</orderedlist>
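<para>
The post-decode deinterleave step, sketched in Python.  The function
name is hypothetical, and the sketch assumes the length of
<varname>[v]</varname> is exactly <emphasis>ch*n</emphasis>.</para>

```python
def deinterleave(v, ch):
    """Split the single decoded vector into ch per-channel vectors."""
    n = len(v) // ch
    outputs = [[0] * n for _ in range(ch)]
    for i in range(n):
        for j in range(ch):
            outputs[j][i] = v[i * ch + j]
    return outputs


print(deinterleave([1, 10, 2, 20, 3, 30], 2))  # [[1, 2, 3], [10, 20, 30]]
```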
486
487</section>
488
489</section>
490
491</section>
492