1 | <?xml version="1.0" standalone="no"?> |
---|
2 | <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
---|
3 | "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ |
---|
4 | |
---|
5 | ]> |
---|
6 | |
---|
7 | <section id="vorbis-spec-floor1"> |
---|
8 | <sectioninfo> |
---|
9 | <releaseinfo> |
---|
10 | $Id: 07-floor1.xml 10466 2005-11-28 00:34:44Z giles $ |
---|
11 | </releaseinfo> |
---|
12 | </sectioninfo> |
---|
13 | <title>Floor type 1 setup and decode</title> |
---|
14 | |
---|
15 | <section> |
---|
16 | <title>Overview</title> |
---|
17 | |
---|
18 | <para> |
---|
19 | Vorbis floor type one uses a piecewise straight-line representation to |
---|
20 | encode a spectral envelope curve. The representation plots this curve |
---|
21 | mechanically on a linear frequency axis and a logarithmic (dB) |
---|
22 | amplitude axis. The integer plotting algorithm used is similar to |
---|
23 | Bresenham's algorithm.</para> |
---|
24 | |
---|
25 | </section> |
---|
26 | |
---|
27 | <section> |
---|
28 | <title>Floor 1 format</title> |
---|
29 | |
---|
30 | <section><title>model</title> |
---|
31 | |
---|
32 | <para> |
---|
33 | Floor type one represents a spectral curve as a series of |
---|
34 | line segments. Synthesis constructs a floor curve using iterative |
---|
35 | prediction in a process roughly equivalent to the following simplified |
---|
36 | description:</para> |
---|
37 | |
---|
38 | <para> |
---|
39 | <itemizedlist> |
---|
40 | <listitem><simpara> the first line segment (base case) is a logical line spanning |
---|
41 | from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the |
---|
42 | full range of the spectral floor to be computed.</simpara></listitem> |
---|
43 | |
---|
44 | <listitem><simpara>the induction step chooses a point x_new within an existing |
---|
45 | logical line segment and produces a y_new value at that point computed |
---|
46 | from the existing line's y value at x_new (as plotted by the line) and |
---|
47 | a difference value decoded from the bitstream packet.</simpara></listitem> |
---|
48 | |
---|
49 | <listitem><simpara>floor computation produces two new line segments, one running from |
---|
50 | x_0,y_0 to x_new,y_new and the other from x_new,y_new to x_1,y_1. This step is |
---|
51 | performed logically even if y_new represents no change to the |
---|
52 | amplitude value at x_new so that later refinement is additionally |
---|
53 | bounded at x_new.</simpara></listitem> |
---|
54 | |
---|
55 | <listitem><simpara>the induction step repeats, using a list of x values specified in |
---|
56 | the codec setup header at floor 1 initialization time. Computation |
---|
57 | is completed at the end of the x value list.</simpara></listitem> |
---|
58 | |
---|
59 | </itemizedlist> |
---|
60 | </para> |
---|
61 | |
---|
62 | <para> |
---|
63 | Consider the following example, with values chosen for ease of |
---|
64 | understanding rather than to represent a typical configuration:</para> |
---|
65 | |
---|
66 | <para> |
---|
67 | For the example below, we assume a floor setup with an [n] of 128. |
---|
68 | The list of selected X values in increasing order is |
---|
69 | 0,16,32,48,64,80,96,112 and 128. In list order, the values interleave |
---|
70 | as 0, 128, 64, 32, 96, 16, 48, 80 and 112. The corresponding |
---|
71 | list-order Y values as decoded from an example packet are 110, 20, -5, |
---|
72 | -45, 0, -25, -10, 30 and -10. We compute the floor in the following |
---|
73 | way, beginning with the first line:</para> |
---|
74 | |
---|
75 | <mediaobject> |
---|
76 | <imageobject> |
---|
77 | <imagedata fileref="floor1-1.png" format="PNG"/> |
---|
78 | </imageobject> |
---|
79 | <textobject> |
---|
80 | <phrase>[graph of example floor]</phrase> |
---|
81 | </textobject> |
---|
82 | </mediaobject> |
---|
83 | |
---|
84 | <para> |
---|
85 | We now draw new logical lines to reflect the correction to new_Y, and |
---|
86 | iterate for X positions 32 and 96:</para> |
---|
87 | |
---|
88 | <mediaobject> |
---|
89 | <imageobject> |
---|
90 | <imagedata fileref="floor1-2.png" format="PNG"/> |
---|
91 | </imageobject> |
---|
92 | <textobject> |
---|
93 | <phrase>[graph of example floor]</phrase> |
---|
94 | </textobject> |
---|
95 | </mediaobject> |
---|
96 | |
---|
97 | <para> |
---|
98 | Although the new Y value at X position 96 is unchanged, it is still |
---|
99 | used later as an endpoint for further refinement. From here on, the |
---|
100 | pattern should be clear; we complete the floor computation as follows:</para> |
---|
101 | |
---|
102 | <mediaobject> |
---|
103 | <imageobject> |
---|
104 | <imagedata fileref="floor1-3.png" format="PNG"/> |
---|
105 | </imageobject> |
---|
106 | <textobject> |
---|
107 | <phrase>[graph of example floor]</phrase> |
---|
108 | </textobject> |
---|
109 | </mediaobject> |
---|
110 | |
---|
111 | <mediaobject> |
---|
112 | <imageobject> |
---|
113 | <imagedata fileref="floor1-4.png" format="PNG"/> |
---|
114 | </imageobject> |
---|
115 | <textobject> |
---|
116 | <phrase>[graph of example floor]</phrase> |
---|
117 | </textobject> |
---|
118 | </mediaobject> |
---|
119 | |
---|
120 | |
---|
121 | <para> |
---|
122 | A more efficient algorithm with carefully defined integer rounding |
---|
123 | behavior is used for actual decode, as described later. The actual |
---|
124 | algorithm splits Y value computation and line plotting into two steps |
---|
125 | with modifications to the above algorithm to eliminate noise |
---|
126 | accumulation through integer roundoff/truncation. </para> |
---|
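<para>
The simplified induction above can be sketched in a few lines of Python.
This is only an illustration of the model, not the normative decode
algorithm: it uses exact rational arithmetic instead of the integer
rounding rules defined later, it treats the list-order Y values after the
first two as plain signed differences, and the name
<literal>simplified_floor</literal> is hypothetical.</para>

```python
from fractions import Fraction


def simplified_floor(xs, ys):
    """Apply the simplified induction; xs and ys are given in list order.

    The first two Y values are absolute endpoint amplitudes; every later
    Y value is a difference added to the existing line's value at that X.
    """
    points = {xs[0]: Fraction(ys[0]), xs[1]: Fraction(ys[1])}
    for x_new, diff in zip(xs[2:], ys[2:]):
        # The nearest already-placed points on either side bound the segment.
        x0 = max(x for x in points if x_new > x)
        x1 = min(x for x in points if x > x_new)
        y0, y1 = points[x0], points[x1]
        # Value of the existing line at x_new, then the decoded correction.
        on_line = y0 + (y1 - y0) * Fraction(x_new - x0, x1 - x0)
        points[x_new] = on_line + diff
    return dict(sorted(points.items()))


# Values from the worked example in this section.
xs = [0, 128, 64, 32, 96, 16, 48, 80, 112]
ys = [110, 20, -5, -45, 0, -25, -10, 30, -10]
curve = simplified_floor(xs, ys)
```

<para>
With the example values, the point at X position 64 lands at 60 (the
first line gives 65 there, corrected by -5), and the point at X position
96 stays on its line at 40 because its difference is zero.</para>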
127 | |
---|
128 | </section> |
---|
129 | |
---|
130 | <section><title>header decode</title> |
---|
131 | |
---|
132 | <para> |
---|
133 | A list of floor X values is stored in the codec setup header in interleaved |
---|
134 | format (used in list order during packet decode and synthesis). This |
---|
135 | list is split into partitions, and each partition is assigned to a |
---|
136 | partition class. X positions 0 and [n] are implicit and do not belong |
---|
137 | to an explicit partition or partition class.</para> |
---|
138 | |
---|
139 | <para> |
---|
140 | A partition class consists of a representation vector width (the |
---|
141 | number of Y values which the partition class encodes at once), a |
---|
142 | 'subclass' value representing the number of alternate entropy books |
---|
143 | the partition class may use in representing Y values, the list of |
---|
144 | [subclass] books and a master book used to encode which alternate |
---|
145 | books were chosen for representation in a given packet. The |
---|
146 | master/subclass mechanism is meant to be used as a flexible |
---|
147 | representation cascade while still using codebooks only in a scalar |
---|
148 | context.</para> |
---|
149 | |
---|
150 | <screen> |
---|
151 | |
---|
152 | 1) [floor1_partitions] = read 5 bits as unsigned integer |
---|
153 | 2) [maximum_class] = -1 |
---|
154 | 3) iterate [i] over the range 0 ... [floor1_partitions]-1 { |
---|
155 | |
---|
156 | 4) vector [floor1_partition_class_list] element [i] = read 4 bits as unsigned integer |
---|
157 | |
---|
158 | } |
---|
159 | |
---|
160 | 5) [maximum_class] = largest integer scalar value in vector [floor1_partition_class_list] |
---|
161 | 6) iterate [i] over the range 0 ... [maximum_class] { |
---|
162 | |
---|
163 | 7) vector [floor1_class_dimensions] element [i] = read 3 bits as unsigned integer and add 1 |
---|
164 | 8) vector [floor1_class_subclasses] element [i] = read 2 bits as unsigned integer |
---|
165 | 9) if ( vector [floor1_class_subclasses] element [i] is nonzero ) { |
---|
166 | |
---|
167 | 10) vector [floor1_class_masterbooks] element [i] = read 8 bits as unsigned integer |
---|
168 | |
---|
169 | } |
---|
170 | |
---|
171 | 11) iterate [j] over the range 0 ... (2 exponent [floor1_class_subclasses] element [i]) - 1 { |
---|
172 | |
---|
173 | 12) array [floor1_subclass_books] element [i],[j] = |
---|
174 | read 8 bits as unsigned integer and subtract one |
---|
175 | } |
---|
176 | } |
---|
177 | |
---|
178 | 13) [floor1_multiplier] = read 2 bits as unsigned integer and add one |
---|
179 | 14) [rangebits] = read 4 bits as unsigned integer |
---|
180 | 15) vector [floor1_X_list] element [0] = 0 |
---|
181 | 16) vector [floor1_X_list] element [1] = 2 exponent [rangebits] |
---|
182 | 17) [floor1_values] = 2 |
---|
183 | 18) iterate [i] over the range 0 ... [floor1_partitions]-1 { |
---|
184 | |
---|
185 | 19) [current_class_number] = vector [floor1_partition_class_list] element [i] |
---|
186 | 20) iterate [j] over the range 0 ... ([floor1_class_dimensions] element [current_class_number])-1 { |
---|
187 | 21) vector [floor1_X_list] element ([floor1_values]) = |
---|
188 | read [rangebits] bits as unsigned integer |
---|
189 | 22) increment [floor1_values] by one |
---|
190 | } |
---|
191 | } |
---|
192 | |
---|
193 | 23) done |
---|
194 | </screen> |
---|
195 | |
---|
196 | <para> |
---|
197 | An end-of-packet condition while reading any aspect of a floor 1 |
---|
198 | configuration during setup renders a stream undecodable. In |
---|
199 | addition, a <varname>[floor1_class_masterbooks]</varname> or |
---|
200 | <varname>[floor1_subclass_books]</varname> scalar element greater than the |
---|
201 | highest numbered codebook configured in this stream is an error |
---|
202 | condition that renders the stream undecodable.</para> |
---|
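<para>
The header decode steps can be sketched as follows. This is a minimal
illustration under stated assumptions, not a reference implementation:
<literal>BitReader</literal>, <literal>pack_fields</literal> and
<literal>decode_floor1_header</literal> are hypothetical names, and the
reader assumes that unsigned integers are packed least significant bit
first. The codebook range check follows the error condition described
above.</para>

```python
class BitReader:
    """Hypothetical bit reader over a list of 0/1 values (assumed LSB first)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def read(self, count):
        if self.pos + count > len(self.bits):
            # End of packet during setup renders the stream undecodable.
            raise EOFError("end of packet during floor 1 setup")
        value = sum(self.bits[self.pos + k] * 2 ** k for k in range(count))
        self.pos += count
        return value


def pack_fields(fields):
    """Hypothetical helper: turn (value, width) pairs into LSB-first bits."""
    return [(value >> k) % 2 for value, width in fields for k in range(width)]


def decode_floor1_header(reader, codebook_count):
    partitions = reader.read(5)                                # step 1
    class_list = [reader.read(4) for _ in range(partitions)]   # steps 3-4
    maximum_class = max(class_list, default=-1)                # steps 2, 5

    dimensions, subclasses, masterbooks, subclass_books = [], [], [], []
    for _ in range(maximum_class + 1):                         # step 6
        dimensions.append(reader.read(3) + 1)                  # step 7
        subclasses.append(reader.read(2))                      # step 8
        # Steps 9-10: a master book is present only with nonzero subclasses.
        masterbooks.append(reader.read(8) if subclasses[-1] else None)
        # Steps 11-12: a stored 0 becomes -1, meaning "no book".
        subclass_books.append(
            [reader.read(8) - 1 for _ in range(2 ** subclasses[-1])])

    used = [b for b in masterbooks if b is not None]
    used += [b for row in subclass_books for b in row]
    if any(b >= codebook_count for b in used):
        raise ValueError("codebook number out of range")

    multiplier = reader.read(2) + 1                            # step 13
    rangebits = reader.read(4)                                 # step 14
    x_list = [0, 2 ** rangebits]                               # steps 15-17
    for cls in class_list:                                     # steps 18-22
        x_list.extend(reader.read(rangebits) for _ in range(dimensions[cls]))

    return {"class_list": class_list, "dimensions": dimensions,
            "subclasses": subclasses, "masterbooks": masterbooks,
            "subclass_books": subclass_books, "multiplier": multiplier,
            "x_list": x_list}


# Made-up configuration: two partitions using classes 0 and 1.
example_bits = pack_fields([
    (2, 5), (0, 4), (1, 4),              # partition count and class list
    (0, 3), (0, 2), (5, 8),              # class 0: dimension 1, no subclasses
    (1, 3), (1, 2), (3, 8), (0, 8), (7, 8),  # class 1: dimension 2, 1 subclass bit
    (1, 2), (7, 4),                      # multiplier 2, rangebits 7
    (64, 7), (32, 7), (96, 7),           # explicit X values
])
header = decode_floor1_header(BitReader(example_bits), codebook_count=8)
```

<para>
In this made-up configuration the decoded X list is 0, 128, 64, 32, 96,
so <varname>[floor1_values]</varname> would be 5.</para>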
203 | |
---|
204 | <section id="vorbis-spec-floor1-decode"> |
---|
205 | <title>packet decode</title> |
---|
206 | |
---|
207 | <para> |
---|
208 | Packet decode begins by checking the <varname>[nonzero]</varname> flag:</para> |
---|
209 | |
---|
210 | <screen> |
---|
211 | 1) [nonzero] = read 1 bit as boolean |
---|
212 | </screen> |
---|
213 | |
---|
214 | <para> |
---|
215 | If <varname>[nonzero]</varname> is unset, that indicates this channel contained |
---|
216 | no audio energy in this frame. Decode immediately returns a status |
---|
217 | indicating this floor curve (and thus this channel) is unused this |
---|
218 | frame. (A return status of 'unused' is different from decoding a |
---|
219 | floor that has all points set to minimum representation amplitude, |
---|
220 | which happens to be approximately -140 dB.) |
---|
221 | </para> |
---|
222 | |
---|
223 | <para> |
---|
224 | Assuming <varname>[nonzero]</varname> is set, decode proceeds as follows:</para> |
---|
225 | |
---|
226 | <screen> |
---|
227 | 1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1) |
---|
228 | 2) vector [floor1_Y] element [0] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer |
---|
229 | 3) vector [floor1_Y] element [1] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer |
---|
230 | 4) [offset] = 2 |
---|
231 | 5) iterate [i] over the range 0 ... [floor1_partitions]-1 { |
---|
232 | |
---|
233 | 6) [class] = vector [floor1_partition_class_list] element [i] |
---|
234 | 7) [cdim] = vector [floor1_class_dimensions] element [class] |
---|
235 | 8) [cbits] = vector [floor1_class_subclasses] element [class] |
---|
236 | 9) [csub] = (2 exponent [cbits])-1 |
---|
237 | 10) [cval] = 0 |
---|
238 | 11) if ( [cbits] is greater than zero ) { |
---|
239 | |
---|
240 | 12) [cval] = read from packet using codebook number |
---|
241 | (vector [floor1_class_masterbooks] element [class]) in scalar context |
---|
242 | } |
---|
243 | |
---|
244 | 13) iterate [j] over the range 0 ... [cdim]-1 { |
---|
245 | |
---|
246 | 14) [book] = array [floor1_subclass_books] element [class],([cval] bitwise AND [csub]) |
---|
247 | 15) [cval] = [cval] right shifted [cbits] bits |
---|
248 | 16) if ( [book] is not less than zero ) { |
---|
249 | |
---|
250 | 17) vector [floor1_Y] element ([j]+[offset]) = read from packet using codebook |
---|
251 | [book] in scalar context |
---|
252 | |
---|
253 | } else [book] is less than zero { |
---|
254 | |
---|
255 | 18) vector [floor1_Y] element ([j]+[offset]) = 0 |
---|
256 | |
---|
257 | } |
---|
258 | } |
---|
259 | |
---|
260 | 19) [offset] = [offset] + [cdim] |
---|
261 | |
---|
262 | } |
---|
263 | |
---|
264 | 20) done |
---|
265 | </screen> |
---|
266 | |
---|
267 | <para> |
---|
268 | An end-of-packet condition during curve decode should be considered a |
---|
269 | nominal occurrence; if end-of-packet is reached during any read |
---|
270 | operation above, floor decode is to return 'unused' status as if the |
---|
271 | <varname>[nonzero]</varname> flag had been unset at the beginning of decode. |
---|
272 | </para> |
---|
273 | |
---|
274 | <para> |
---|
275 | Vector <varname>[floor1_Y]</varname> contains the values from packet decode |
---|
276 | needed for floor 1 synthesis.</para> |
---|
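<para>
Two details of packet decode are easy to get wrong: the bit width used
for the first two Y values, and the way one master book value selects a
subclass book for each Y value of a partition. The sketch below
illustrates both under stated assumptions. <literal>ilog</literal> is
restated here from its definition elsewhere in this specification,
<literal>decode_partition</literal> is a hypothetical name, and
<literal>read_book</literal> stands in for a scalar-context codebook
read, which is outside the scope of this section. Modulo and integer
division by a power of two stand in for the bitwise AND and right shift
of steps 14 and 15; the two forms agree for non-negative integers.</para>

```python
FLOOR1_RANGES = (256, 128, 86, 64)


def ilog(x):
    """Position of the highest set bit of x; zero when x is zero or negative."""
    return x.bit_length() if x > 0 else 0


def amplitude_bits(multiplier):
    """Bits read for each of the first two Y values (steps 1 to 3)."""
    return ilog(FLOOR1_RANGES[multiplier - 1] - 1)


def decode_partition(cdim, cbits, masterbook, subclass_books, read_book):
    """Steps 9 to 18 for one partition; returns cdim wrapped Y values."""
    span = 2 ** cbits                      # equals [csub] + 1
    cval = read_book(masterbook) if cbits > 0 else 0
    values = []
    for _ in range(cdim):
        book = subclass_books[cval % span]  # [cval] bitwise AND [csub]
        cval //= span                       # [cval] right shifted [cbits] bits
        # A negative book number means no value is coded; Y is zero.
        values.append(read_book(book) if book >= 0 else 0)
    return values


# Scripted stand-in for codebook reads: book number to queued results.
script = {3: [2], 6: [17]}
example_values = decode_partition(
    cdim=2, cbits=1, masterbook=3, subclass_books=[-1, 6],
    read_book=lambda book: script[book].pop(0))
```

<para>
In the scripted call, the master book yields 2. Its low bit selects
subclass book -1 for the first Y value, which therefore decodes as zero,
and the next bit selects book 6 for the second.</para>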
277 | |
---|
278 | </section> |
---|
279 | |
---|
280 | <section id="vorbis-spec-floor1-synth"> |
---|
281 | <title>curve computation</title> |
---|
282 | |
---|
283 | <para> |
---|
284 | Curve computation is split into two logical steps; the first step |
---|
285 | derives final Y amplitude values from the encoded, wrapped difference |
---|
286 | values taken from the bitstream. The second step plots the curve |
---|
287 | lines. Also, although zero-difference values are used in the |
---|
288 | iterative prediction to find final Y values, these points are |
---|
289 | conditionally skipped during final line computation in step two. |
---|
290 | Skipping zero-difference values allows a smoother line fit. </para> |
---|
291 | |
---|
292 | <para> |
---|
293 | Although some aspects of the below algorithm look like inconsequential |
---|
294 | optimizations, implementors are warned to follow the details closely. |
---|
295 | Deviation from implementing a strictly equivalent algorithm can result |
---|
296 | in serious decoding errors.</para> |
---|
297 | |
---|
298 | <section> |
---|
299 | <title>step 1: amplitude value synthesis</title> |
---|
300 | |
---|
301 | <para> |
---|
302 | Unwrap the always-positive-or-zero values read from the packet into |
---|
303 | +/- difference values, then apply to line prediction.</para> |
---|
304 | |
---|
305 | <screen> |
---|
306 | 1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1) |
---|
307 | 2) vector [floor1_step2_flag] element [0] = set |
---|
308 | 3) vector [floor1_step2_flag] element [1] = set |
---|
309 | 4) vector [floor1_final_Y] element [0] = vector [floor1_Y] element [0] |
---|
310 | 5) vector [floor1_final_Y] element [1] = vector [floor1_Y] element [1] |
---|
311 | 6) iterate [i] over the range 2 ... [floor1_values]-1 { |
---|
312 | |
---|
313 | 7) [low_neighbor_offset] = <link linkend="vorbis-spec-low_neighbor">low_neighbor</link>([floor1_X_list],[i]) |
---|
314 | 8) [high_neighbor_offset] = <link linkend="vorbis-spec-high_neighbor">high_neighbor</link>([floor1_X_list],[i]) |
---|
315 | |
---|
316 | 9) [predicted] = <link linkend="vorbis-spec-render_point">render_point</link>( vector [floor1_X_list] element [low_neighbor_offset], |
---|
317 | vector [floor1_final_Y] element [low_neighbor_offset], |
---|
318 | vector [floor1_X_list] element [high_neighbor_offset], |
---|
319 | vector [floor1_final_Y] element [high_neighbor_offset], |
---|
320 | vector [floor1_X_list] element [i] ) |
---|
321 | |
---|
322 | 10) [val] = vector [floor1_Y] element [i] |
---|
323 | 11) [highroom] = [range] - [predicted] |
---|
324 | 12) [lowroom] = [predicted] |
---|
325 | 13) if ( [highroom] is less than [lowroom] ) { |
---|
326 | |
---|
327 | 14) [room] = [highroom] * 2 |
---|
328 | |
---|
329 | } else [highroom] is not less than [lowroom] { |
---|
330 | |
---|
331 | 15) [room] = [lowroom] * 2 |
---|
332 | |
---|
333 | } |
---|
334 | |
---|
335 | 16) if ( [val] is nonzero ) { |
---|
336 | |
---|
337 | 17) vector [floor1_step2_flag] element [low_neighbor_offset] = set |
---|
338 | 18) vector [floor1_step2_flag] element [high_neighbor_offset] = set |
---|
339 | 19) vector [floor1_step2_flag] element [i] = set |
---|
340 | 20) if ( [val] is greater than or equal to [room] ) { |
---|
341 | |
---|
342 | 21) if ( [highroom] is greater than [lowroom] ) { |
---|
343 | |
---|
344 | 22) vector [floor1_final_Y] element [i] = [val] - [lowroom] + [predicted] |
---|
345 | |
---|
346 | } else [highroom] is not greater than [lowroom] { |
---|
347 | |
---|
348 | 23) vector [floor1_final_Y] element [i] = [predicted] - [val] + [highroom] - 1 |
---|
349 | |
---|
350 | } |
---|
351 | |
---|
352 | } else [val] is less than [room] { |
---|
353 | |
---|
354 | 24) if ([val] is odd) { |
---|
355 | |
---|
356 | 25) vector [floor1_final_Y] element [i] = |
---|
357 | [predicted] - (([val] + 1) divided by 2 using integer division) |
---|
358 | |
---|
359 | } else [val] is even { |
---|
360 | |
---|
361 | 26) vector [floor1_final_Y] element [i] = |
---|
362 | [predicted] + ([val] / 2 using integer division) |
---|
363 | |
---|
364 | } |
---|
365 | |
---|
366 | } |
---|
367 | |
---|
368 | } else [val] is zero { |
---|
369 | |
---|
370 | 27) vector [floor1_step2_flag] element [i] = unset |
---|
371 | 28) vector [floor1_final_Y] element [i] = [predicted] |
---|
372 | |
---|
373 | } |
---|
374 | |
---|
375 | } |
---|
376 | |
---|
377 | 29) done |
---|
378 | |
---|
379 | </screen> |
---|
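<para>
Step 1 can be sketched as follows. This is an illustration, not a
reference implementation: <literal>low_neighbor</literal>,
<literal>high_neighbor</literal> and <literal>render_point</literal> are
restated from their definitions elsewhere in this specification, and
<literal>synthesize_amplitudes</literal> is a hypothetical name. The
input values in the usage lines are made up for illustration.</para>

```python
FLOOR1_RANGES = (256, 128, 86, 64)


def low_neighbor(v, x):
    """Index n before x holding the greatest v[n] that is less than v[x]."""
    return max((n for n in range(x) if v[x] > v[n]), key=lambda n: v[n])


def high_neighbor(v, x):
    """Index n before x holding the smallest v[n] that is greater than v[x]."""
    return min((n for n in range(x) if v[n] > v[x]), key=lambda n: v[n])


def render_point(x0, y0, x1, y1, x):
    """Integer Y value at x on the line from (x0, y0) to (x1, y1)."""
    dy = y1 - y0
    off = abs(dy) * (x - x0) // (x1 - x0)
    return y0 + off if dy >= 0 else y0 - off


def synthesize_amplitudes(x_list, y, multiplier):
    rng = FLOOR1_RANGES[multiplier - 1]
    final_y = list(y[:2])
    flags = [True, True] + [False] * (len(x_list) - 2)
    for i in range(2, len(x_list)):
        lo = low_neighbor(x_list, i)
        hi = high_neighbor(x_list, i)
        predicted = render_point(x_list[lo], final_y[lo],
                                 x_list[hi], final_y[hi], x_list[i])
        val = y[i]
        highroom = rng - predicted
        lowroom = predicted
        room = 2 * min(highroom, lowroom)
        if val == 0:
            # Zero difference: stay on the predicted line, skip in step 2.
            flags[i] = False
            final_y.append(predicted)
            continue
        flags[lo] = flags[hi] = flags[i] = True
        if val >= room:
            # Only one direction has room left, so val is not interleaved.
            if highroom > lowroom:
                final_y.append(val - lowroom + predicted)
            else:
                final_y.append(predicted - val + highroom - 1)
        elif val % 2 == 1:
            final_y.append(predicted - (val + 1) // 2)   # odd: below the line
        else:
            final_y.append(predicted + val // 2)         # even: above the line
    return final_y, flags


example_final_y, example_flags = synthesize_amplitudes(
    [0, 128, 64, 32, 96], [110, 20, 9, 89, 0], multiplier=1)
```

<para>
For the made-up input, the wrapped value 9 at X position 64 is odd and
unwraps to a difference of -5 from the predicted 65, giving 60. The
zero at X position 96 leaves that point on the predicted line at 40 with
its step 2 flag unset.</para>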
380 | |
---|
381 | </section> |
---|
382 | |
---|
383 | <section> |
---|
384 | <title>step 2: curve synthesis</title> |
---|
385 | |
---|
386 | <para> |
---|
387 | Curve synthesis generates a return vector <varname>[floor]</varname> of length |
---|
388 | <varname>[n]</varname> (where <varname>[n]</varname> is provided by the decode process |
---|
389 | calling floor decode). Floor 1 curve synthesis makes use of the |
---|
390 | <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and |
---|
391 | <varname>[floor1_step2_flag]</varname> vectors, as well as the <varname>[floor1_multiplier]</varname> |
---|
392 | and <varname>[floor1_values]</varname> values.</para> |
---|
393 | |
---|
394 | <para> |
---|
395 | Decode begins by sorting the scalars from vectors |
---|
396 | <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and |
---|
397 | <varname>[floor1_step2_flag]</varname> together into new vectors |
---|
398 | <varname>[floor1_X_list]'</varname>, <varname>[floor1_final_Y]'</varname> and |
---|
399 | <varname>[floor1_step2_flag]'</varname> according to ascending sort order of the |
---|
400 | values in <varname>[floor1_X_list]</varname>. That is, sort the values of |
---|
401 | <varname>[floor1_X_list]</varname> and then apply the same permutation to |
---|
402 | elements of the other two vectors so that the X, Y and step2_flag |
---|
403 | values still match.</para> |
---|
404 | |
---|
405 | <para> |
---|
406 | Then compute the final curve in one pass:</para> |
---|
407 | |
---|
408 | <screen> |
---|
409 | 1) [hx] = 0 |
---|
410 | 2) [lx] = 0 |
---|
411 | 3) [ly] = vector [floor1_final_Y]' element [0] * [floor1_multiplier] |
---|
412 | 4) iterate [i] over the range 1 ... [floor1_values]-1 { |
---|
413 | |
---|
414 | 5) if ( [floor1_step2_flag]' element [i] is set ) { |
---|
415 | |
---|
416 | 6) [hy] = [floor1_final_Y]' element [i] * [floor1_multiplier] |
---|
417 | 7) [hx] = [floor1_X_list]' element [i] |
---|
418 | 8) <link linkend="vorbis-spec-render_line">render_line</link>( [lx], [ly], [hx], [hy], [floor] ) |
---|
419 | 9) [lx] = [hx] |
---|
420 | 10) [ly] = [hy] |
---|
421 | } |
---|
422 | } |
---|
423 | |
---|
424 | 11) if ( [hx] is less than [n] ) { |
---|
425 | |
---|
426 | 12) <link linkend="vorbis-spec-render_line">render_line</link>( [hx], [hy], [n], [hy], [floor] ) |
---|
427 | |
---|
428 | } |
---|
429 | |
---|
430 | 13) if ( [hx] is greater than [n] ) { |
---|
431 | |
---|
432 | 14) truncate vector [floor] to [n] elements |
---|
433 | |
---|
434 | } |
---|
435 | |
---|
436 | 15) for each scalar in vector [floor], perform a lookup substitution using |
---|
437 | the scalar value from [floor] as an offset into the vector <link linkend="vorbis-spec-floor1_inverse_dB_table">[floor1_inverse_dB_static_table]</link> |
---|
438 | |
---|
439 | 16) done |
---|
440 | |
---|
441 | </screen> |
---|
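<para>
Step 2 can be sketched as follows, again as an illustration rather than
a reference implementation. <literal>render_line</literal> is restated
from its definition elsewhere in this specification, and
<literal>synthesize_curve</literal> is a hypothetical name. Two
assumptions are made for the sketch: the work vector is allocated long
enough to hold the largest X value before it is truncated to
<varname>[n]</varname>, and <varname>[hy]</varname> starts out equal to
<varname>[ly]</varname> so that it is always defined. The final lookup
through <varname>[floor1_inverse_dB_static_table]</varname> (step 15) is
omitted, so the function returns table offsets.</para>

```python
def render_line(x0, y0, x1, y1, v):
    """Plot integer Y values for x0 up to, but not including, x1 into v."""
    dy = y1 - y0
    adx = x1 - x0
    ady = abs(dy)
    # Integer division rounding toward zero.
    base = ady // adx if dy >= 0 else -(ady // adx)
    sy = base + 1 if dy >= 0 else base - 1
    ady -= abs(base) * adx
    y, err = y0, 0
    v[x0] = y
    for x in range(x0 + 1, x1):
        err += ady
        if err >= adx:
            err -= adx
            y += sy
        else:
            y += base
        v[x] = y


def synthesize_curve(x_list, final_y, flags, multiplier, n):
    # Sort by X and apply the same permutation to the Y values and flags.
    order = sorted(range(len(x_list)), key=lambda i: x_list[i])
    xs = [x_list[i] for i in order]
    ys = [final_y[i] for i in order]
    fs = [flags[i] for i in order]

    floor = [0] * max(n, xs[-1])   # assumption: room for every plotted X
    hx, lx = 0, 0
    ly = ys[0] * multiplier
    hy = ly                        # assumption: defined before the loop
    for i in range(1, len(xs)):
        if fs[i]:                  # points with an unset flag are skipped
            hy = ys[i] * multiplier
            hx = xs[i]
            render_line(lx, ly, hx, hy, floor)
            lx, ly = hx, hy
    if n > hx:
        render_line(hx, hy, n, hy, floor)   # extend flat to the end
    return floor[:n]               # truncates when hx is greater than n


example_floor = synthesize_curve(
    [0, 128, 64, 32, 96], [110, 20, 60, 40, 40],
    [True, True, True, True, False], multiplier=1, n=128)
```

<para>
In the usage lines, the point at X position 96 has its flag unset, so a
single line is drawn from X position 64 to 128 and passes through that
point's predicted amplitude.</para>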
442 | |
---|
443 | </section> |
---|
444 | |
---|
445 | </section> |
---|
446 | |
---|
447 | </section> |
---|
448 | </section> |
---|
449 | </section> |
---|
450 | |
---|