source: downloads/libvorbis-1.2.0/doc/xml/04-codec.xml @ 16

Last change on this file since 16 was 16, checked in by landauf, 16 years ago
added libvorbis
File size: 34.6 KB

1	<?xml version="1.0" standalone="no"?>
2	<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3	"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
4	<!ENTITY pi "π"> <!-- GREEK SMALL LETTER PI -->
5	]>
6
7	<section id="vorbis-spec-codec">
8	<sectioninfo>
9	<releaseinfo>
10	$Id: 04-codec.xml 10466 2005-11-28 00:34:44Z giles $
11	</releaseinfo>
12	</sectioninfo>
13	<title>Codec Setup and Packet Decode</title>
14
15	<section>
16	<title>Overview</title>
17
18	<para>
19	This document serves as the top-level reference document for the
20	bit-by-bit decode specification of Vorbis I. This document assumes a
21	high-level understanding of the Vorbis decode process, which is
22	provided in <xref linkend="vorbis-spec-intro"/>. <xref
23	linkend="vorbis-spec-bitpacking"/> covers reading and writing bit fields from
24	and to bitstream packets.</para>
25
26	</section>
27
28	<section>
29	<title>Header decode and decode setup</title>
30
31	<para>
32	A Vorbis bitstream begins with three header packets. The header
33	packets are, in order, the identification header, the comments header,
34	and the setup header. All are required for decode compliance. An
35	end-of-packet condition while decoding the first or third header
36	packet renders the stream undecodable. An end-of-packet condition while
37	decoding the comment header is a non-fatal error condition.</para>
38
39	<section><title>Common header decode</title>
40
41	<para>
42	Each header packet begins with the same header fields.
43	</para>
44
45	<screen>
46	1) [packet_type] : 8 bit value
47	2) 0x76, 0x6f, 0x72, 0x62, 0x69, 0x73: the characters 'v','o','r','b','i','s' as six octets
48	</screen>
49
50	<para>
51	Decode continues according to packet type; the identification header
52	is type 1, the comment header type 3 and the setup header type 5
53	(these types are all odd as a packet with a leading single bit of '0'
54	is an audio packet). The packets must occur in the order of
55	identification, comment, setup.</para>
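
<para>
The common header check can be sketched as follows. This is a minimal
illustration rather than the reference implementation; the function name
<literal>parse_common_header</literal> and the returned labels are
assumptions made for this example.</para>

<programlisting><![CDATA[
```python
# Hypothetical helper: classify a header packet by its first seven octets.
HEADER_TYPES = {1: "identification", 3: "comment", 5: "setup"}


def parse_common_header(packet: bytes) -> str:
    """Return the header kind, or raise ValueError for a non-header packet."""
    if len(packet) < 7:
        raise ValueError("packet too short for a Vorbis header")
    packet_type = packet[0]              # 1) [packet_type], 8 bit value
    if packet[1:7] != b"vorbis":         # 2) the six signature octets
        raise ValueError("missing 'vorbis' signature")
    if packet_type not in HEADER_TYPES:  # even types have a leading 0 bit: audio
        raise ValueError(f"unknown header packet type {packet_type}")
    return HEADER_TYPES[packet_type]
```
]]></programlisting>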
56
57	</section>
58
59	<section><title>Identification header</title>
60
61	<para>
62	The identification header is a short header of only a few fields used
63	to declare the stream definitively as Vorbis, and provide a few externally
64	relevant pieces of information about the audio stream. The
65	identification header is coded as follows:</para>
66
67	<screen>
68	1) [vorbis_version] = read 32 bits as unsigned integer
69	2) [audio_channels] = read 8 bit integer as unsigned
70	3) [audio_sample_rate] = read 32 bits as unsigned integer
71	4) [bitrate_maximum] = read 32 bits as signed integer
72	5) [bitrate_nominal] = read 32 bits as signed integer
73	6) [bitrate_minimum] = read 32 bits as signed integer
74	7) [blocksize_0] = 2 exponent (read 4 bits as unsigned integer)
75	8) [blocksize_1] = 2 exponent (read 4 bits as unsigned integer)
76	9) [framing_flag] = read one bit
77	</screen>
78
79	<para>
80	<varname>[vorbis_version]</varname> is to read '0' in order to be compatible
81	with this document. Both <varname>[audio_channels]</varname> and
82	<varname>[audio_sample_rate]</varname> must read greater than zero. Allowed final
83	blocksize values are 64, 128, 256, 512, 1024, 2048, 4096 and 8192 in
84	Vorbis I. <varname>[blocksize_0]</varname> must be less than or equal to
85	<varname>[blocksize_1]</varname>. The framing bit must be nonzero. Failure to
86	meet any of these conditions renders a stream undecodable.</para>
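
<para>
As a sketch of the field layout and the checks above: because Vorbis packs
bits least-significant-bit first, the byte-aligned fields of this header can
be read as little-endian integers. The function name and the dictionary keys
below are illustrative assumptions, not part of any library API.</para>

<programlisting><![CDATA[
```python
import struct

ALLOWED_BLOCKSIZES = {64, 128, 256, 512, 1024, 2048, 4096, 8192}


def parse_identification_header(packet: bytes) -> dict:
    """Decode and validate an identification header (hypothetical helper)."""
    if len(packet) < 30:  # 7 common octets + 23 octets of fields
        raise ValueError("end of packet inside identification header")
    if packet[0] != 1 or packet[1:7] != b"vorbis":
        raise ValueError("not a Vorbis identification header")
    # LSb-first packing makes byte-aligned integers little-endian.
    (version, channels, sample_rate, br_max, br_nominal, br_min,
     exponents, framing) = struct.unpack_from("<IBIiiiBB", packet, 7)
    blocksize_0 = 1 << (exponents & 0x0F)  # first 4 bits read: low nibble
    blocksize_1 = 1 << (exponents >> 4)    # next 4 bits read: high nibble
    if version != 0:
        raise ValueError("unsupported [vorbis_version]")
    if channels == 0 or sample_rate == 0:
        raise ValueError("channels and sample rate must be greater than zero")
    if (blocksize_0 not in ALLOWED_BLOCKSIZES
            or blocksize_1 not in ALLOWED_BLOCKSIZES):
        raise ValueError("illegal blocksize")
    if blocksize_0 > blocksize_1:
        raise ValueError("[blocksize_0] exceeds [blocksize_1]")
    if (framing & 1) == 0:
        raise ValueError("framing bit not set")
    return {
        "audio_channels": channels,
        "audio_sample_rate": sample_rate,
        "bitrate_maximum": br_max,
        "bitrate_nominal": br_nominal,
        "bitrate_minimum": br_min,
        "blocksize_0": blocksize_0,
        "blocksize_1": blocksize_1,
    }
```
]]></programlisting>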
87
88	<para>
89	The bitrate fields above are used only as hints. The nominal bitrate
90	field especially may be considerably off in purely VBR streams. The
91	fields are meaningful only when greater than zero.</para>
92
93	<para>
94	<itemizedlist>
95	<listitem><simpara>All three fields set to the same value implies a fixed rate, or tightly bounded, nearly fixed-rate bitstream</simpara></listitem>
96	<listitem><simpara>Only nominal set implies a VBR or ABR stream that averages the nominal bitrate</simpara></listitem>
97	<listitem><simpara>Maximum and/or minimum set implies a VBR bitstream that obeys the bitrate limits</simpara></listitem>
98	<listitem><simpara>None set indicates the encoder does not care to speculate.</simpara></listitem>
99	</itemizedlist>
100	</para>
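
<para>
One possible reading of the four cases above, treating a field as set only
when it is greater than zero. The function name and the returned strings are
assumptions for illustration; the hints carry no normative weight.</para>

<programlisting><![CDATA[
```python
def describe_bitrate_hints(maximum: int, nominal: int, minimum: int) -> str:
    """Interpret the three bitrate hint fields (hypothetical helper)."""
    max_set = maximum > 0
    nom_set = nominal > 0
    min_set = minimum > 0
    if max_set and maximum == nominal == minimum:
        return "fixed or nearly fixed rate"
    if max_set or min_set:
        return "VBR within bitrate limits"
    if nom_set:
        return "VBR or ABR averaging the nominal bitrate"
    return "no hint"
```
]]></programlisting>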
101
102	</section>
103
104	<section><title>Comment header</title>
105	<para>
106	Comment header decode and data specification is covered in
107	<xref linkend="vorbis-spec-comment"/>.</para>
108	</section>
109
110	<section><title>Setup header</title>
111
112	<para>
113	Vorbis codec setup is configurable to an extreme degree:
114
115	<mediaobject>
116	<imageobject>
117	<imagedata fileref="components.png" format="PNG"/>
118	</imageobject>
119	<textobject>
120	<phrase>[decoder pipeline configuration]</phrase>
121	</textobject>
122	</mediaobject>
123	</para>
124
125	<para>
126	The setup header contains the bulk of the codec setup information
127	needed for decode. The setup header contains, in order, the lists of
128	codebook configurations, time-domain transform configurations
129	(placeholders in Vorbis I), floor configurations, residue
130	configurations, channel mapping configurations and mode
131	configurations. It finishes with a framing bit of '1'. Header decode
132	proceeds in the following order:</para>
133
134	<section><title>Codebooks</title>
135
136	<orderedlist>
137	<listitem><simpara><varname>[vorbis_codebook_count]</varname> = read eight bits as unsigned integer and add one</simpara></listitem>
138	<listitem><simpara>Decode <varname>[vorbis_codebook_count]</varname> codebooks in order as defined
139	in <xref linkend="vorbis-spec-codebook"/>. Save each configuration, in
140	order, in an array of
141	codebook configurations <varname>[vorbis_codebook_configurations]</varname>.</simpara></listitem>
142	</orderedlist>
143
144	</section>
145
146	<section><title>Time domain transforms</title>
147
148	<para>
149	These hooks are placeholders in Vorbis I. Nevertheless, the
150	configuration placeholder values must be read to maintain bitstream
151	sync.</para>
152
153	<orderedlist>
154	<listitem><simpara><varname>[vorbis_time_count]</varname> = read 6 bits as unsigned integer and add one</simpara></listitem>
155	<listitem><simpara>read <varname>[vorbis_time_count]</varname> 16 bit values; each value should be zero. If any value is nonzero, this is an error condition and the stream is undecodable.</simpara></listitem>
156	</orderedlist>
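
<para>
The two steps above can be sketched with a small bit reader. The
<literal>BitReader</literal> class is a hypothetical helper written for this
example (it is not a libvorbis API); it assumes the least-significant-bit-first
packing described in <xref linkend="vorbis-spec-bitpacking"/>.</para>

<programlisting><![CDATA[
```python
class BitReader:
    """Minimal LSb-first bit reader over one packet (illustrative only)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("end of packet")
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i  # the first bit read is the least significant
            self.pos += 1
        return value


def read_time_transforms(reader: BitReader) -> int:
    """Read and verify the time-domain placeholders; return their count."""
    count = reader.read(6) + 1
    for _ in range(count):
        if reader.read(16) != 0:
            raise ValueError("nonzero time placeholder; stream undecodable")
    return count
```
]]></programlisting>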
157
158	</section>
159
160	<section><title>Floors</title>
161
162	<para>
163	Vorbis uses two floor types; header decode is handed to the decode
164	abstraction of the appropriate type.</para>
165
166	<orderedlist>
167	<listitem><simpara><varname>[vorbis_floor_count]</varname> = read 6 bits as unsigned integer and add one</simpara></listitem>
168	<listitem><para>For each <varname>[i]</varname> of <varname>[vorbis_floor_count]</varname> floor numbers:
169	<orderedlist>
170	<listitem><simpara>read the floor type: vector <varname>[vorbis_floor_types]</varname> element <varname>[i]</varname> =
171	read 16 bits as unsigned integer</simpara></listitem>
172	<listitem><simpara>If the floor type is zero, decode the floor
173	configuration as defined in <xref linkend="vorbis-spec-floor0"/>; save
174	this
175	configuration in slot <varname>[i]</varname> of the floor configuration array <varname>[vorbis_floor_configurations]</varname>.</simpara></listitem>
176	<listitem><simpara>If the floor type is one,
177	decode the floor configuration as defined in <xref
178	linkend="vorbis-spec-floor1"/>; save this configuration in slot <varname>[i]</varname> of the floor configuration array <varname>[vorbis_floor_configurations]</varname>.</simpara></listitem>
179	<listitem><simpara>If the floor type is greater than one, this stream is undecodable; ERROR CONDITION</simpara></listitem>
180	</orderedlist>
181	</para></listitem>
182	</orderedlist>
183
184	</section>
185
186	<section><title>Residues</title>
187
188	<para>
189	Vorbis uses three residue types; header decode of each type is identical.
190	</para>
191
192	<orderedlist>
193	<listitem><simpara><varname>[vorbis_residue_count]</varname> = read 6 bits as unsigned integer and add one
194	</simpara></listitem>
195	<listitem><para>For each <varname>[i]</varname> of <varname>[vorbis_residue_count]</varname> residue numbers:
196	<orderedlist>
197	<listitem><simpara>read the residue type; vector <varname>[vorbis_residue_types]</varname> element <varname>[i]</varname> = read 16 bits as unsigned integer</simpara></listitem>
198	<listitem><simpara>If the residue type is zero,
199	one or two, decode the residue configuration as defined in <xref
200	linkend="vorbis-spec-residue"/>; save this configuration in slot <varname>[i]</varname> of the residue configuration array <varname>[vorbis_residue_configurations]</varname>.</simpara></listitem>
201	<listitem><simpara>If the residue type is greater than two, this stream is undecodable; ERROR CONDITION</simpara></listitem>
202	</orderedlist>
203	</para></listitem>
204	</orderedlist>
205
206	</section>
207
208	<section><title>Mappings</title>
209
210	<para>
211	Mappings are used to set up specific pipelines for encoding
212	multichannel audio with varying channel mapping applications. Vorbis I
213	uses a single mapping type (0), with implicit PCM channel mappings.</para>
214
215	<orderedlist>
216	<listitem><simpara><varname>[vorbis_mapping_count]</varname> = read 6 bits as unsigned integer and add one</simpara></listitem>
217	<listitem><para>For each <varname>[i]</varname> of <varname>[vorbis_mapping_count]</varname> mapping numbers:
218	<orderedlist>
219	<listitem><simpara>read the mapping type: 16 bits as unsigned integer. There's no reason to save the mapping type in Vorbis I.</simpara></listitem>
220	<listitem><simpara>If the mapping type is nonzero, the stream is undecodable</simpara></listitem>
221	<listitem><para>If the mapping type is zero:
222	<orderedlist>
223	<listitem><para>read 1 bit as a boolean flag
224	<orderedlist>
225	<listitem><simpara>if set, <varname>[vorbis_mapping_submaps]</varname> = read 4 bits as unsigned integer and add one</simpara></listitem>
226	<listitem><simpara>if unset, <varname>[vorbis_mapping_submaps]</varname> = 1</simpara></listitem>
227	</orderedlist>
228	</para>
229	</listitem>
230	<listitem><para>read 1 bit as a boolean flag
231	<orderedlist>
232	<listitem><para>if set, square polar channel mapping is in use:
233	<orderedlist>
234	<listitem><simpara><varname>[vorbis_mapping_coupling_steps]</varname> = read 8 bits as unsigned integer and add one</simpara></listitem>
235	<listitem><para>for each <varname>[j]</varname> of <varname>[vorbis_mapping_coupling_steps]</varname> steps:
236	<orderedlist>
237	<listitem><simpara>vector <varname>[vorbis_mapping_magnitude]</varname> element <varname>[j]</varname>= read <link linkend="vorbis-spec-ilog">ilog</link>(<varname>[audio_channels]</varname> - 1) bits as unsigned integer</simpara></listitem>
238	<listitem><simpara>vector <varname>[vorbis_mapping_angle]</varname> element <varname>[j]</varname>= read <link linkend="vorbis-spec-ilog">ilog</link>(<varname>[audio_channels]</varname> - 1) bits as unsigned integer</simpara></listitem>
239	<listitem><simpara>the numbers read in the above two steps are channel numbers representing the channel to treat as magnitude and the channel to treat as angle, respectively. If for any coupling step the angle channel number equals the magnitude channel number, the magnitude channel number is greater than <varname>[audio_channels]</varname>-1, or the angle channel is greater than <varname>[audio_channels]</varname>-1, the stream is undecodable.</simpara></listitem>
240	</orderedlist>
241	</para>
242	</listitem>
243	</orderedlist>
244	</para>
245	</listitem>
246	<listitem><simpara>if unset, <varname>[vorbis_mapping_coupling_steps]</varname> = 0</simpara></listitem>
247	</orderedlist>
248	</para>
249	</listitem>
250	<listitem><simpara>read 2 bits (reserved field); if the value is nonzero, the stream is undecodable</simpara></listitem>
251	<listitem><simpara>if <varname>[vorbis_mapping_submaps]</varname> is greater than one, we read channel multiplex settings. For each <varname>[j]</varname> of <varname>[audio_channels]</varname> channels:</simpara>
252	<orderedlist>
253	<listitem><simpara>vector <varname>[vorbis_mapping_mux]</varname> element <varname>[j]</varname> = read 4 bits as unsigned integer</simpara></listitem>
254	<listitem><simpara>if the value is greater than the highest numbered submap (<varname>[vorbis_mapping_submaps]</varname> - 1), this is an error condition rendering the stream undecodable</simpara></listitem>
255	</orderedlist>
256	</listitem>
257	<listitem><simpara>for each submap <varname>[j]</varname> of <varname>[vorbis_mapping_submaps]</varname> submaps, read the floor and residue numbers for use in decoding that submap:</simpara>
258	<orderedlist>
259	<listitem><simpara>read and discard 8 bits (the unused time configuration placeholder)</simpara></listitem>
260	<listitem><simpara>read 8 bits as unsigned integer for the floor number; save in vector <varname>[vorbis_mapping_submap_floor]</varname> element <varname>[j]</varname></simpara></listitem>
261	<listitem><simpara>verify the floor number is not greater than the highest number floor configured for the bitstream. If it is, the bitstream is undecodable</simpara></listitem>
262	<listitem><simpara>read 8 bits as unsigned integer for the residue number; save in vector <varname>[vorbis_mapping_submap_residue]</varname> element <varname>[j]</varname></simpara></listitem>
263	<listitem><simpara>verify the residue number is not greater than the highest number residue configured for the bitstream. If it is, the bitstream is undecodable</simpara></listitem>
264	</orderedlist>
265	</listitem>
266	<listitem><simpara>save this mapping configuration in slot <varname>[i]</varname> of the mapping configuration array <varname>[vorbis_mapping_configurations]</varname>.</simpara></listitem>
267	</orderedlist></para>
268	</listitem>
269	</orderedlist>
270	</para></listitem>
271	</orderedlist>
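
<para>
The width of the coupling channel fields and the validity check on each
coupling step can be sketched as below. The <literal>ilog</literal> function
follows its usual definition (the position of the highest set bit, zero for
non-positive input); <literal>check_coupling_step</literal> is a hypothetical
name used only for this example.</para>

<programlisting><![CDATA[
```python
def ilog(x: int) -> int:
    """Number of bits needed to represent x; 0 for zero or negative x."""
    return x.bit_length() if x > 0 else 0


def check_coupling_step(magnitude: int, angle: int, audio_channels: int) -> None:
    """Raise ValueError if a coupling step makes the stream undecodable."""
    highest = audio_channels - 1
    if magnitude == angle:
        raise ValueError("magnitude and angle channel are the same")
    if magnitude > highest or angle > highest:
        raise ValueError("coupling channel number out of range")


# Each channel number is read as ilog([audio_channels] - 1) bits.
stereo_field_bits = ilog(2 - 1)
surround_field_bits = ilog(6 - 1)
```
]]></programlisting>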
272
273	</section>
274
275	<section><title>Modes</title>
276
277	<orderedlist>
278	<listitem><simpara><varname>[vorbis_mode_count]</varname> = read 6 bits as unsigned integer and add one</simpara></listitem>
279	<listitem><simpara>For each <varname>[i]</varname> of <varname>[vorbis_mode_count]</varname> mode numbers:</simpara>
280	<orderedlist>
281	<listitem><simpara><varname>[vorbis_mode_blockflag]</varname> = read 1 bit</simpara></listitem>
282	<listitem><simpara><varname>[vorbis_mode_windowtype]</varname> = read 16 bits as unsigned integer</simpara></listitem>
283	<listitem><simpara><varname>[vorbis_mode_transformtype]</varname> = read 16 bits as unsigned integer</simpara></listitem>
284	<listitem><simpara><varname>[vorbis_mode_mapping]</varname> = read 8 bits as unsigned integer</simpara></listitem>
285	<listitem><simpara>verify ranges; zero is the only legal value in Vorbis I for
286	<varname>[vorbis_mode_windowtype]</varname>
287	and <varname>[vorbis_mode_transformtype]</varname>. <varname>[vorbis_mode_mapping]</varname> must not be greater than the highest number mapping in use. Any illegal values render the stream undecodable.</simpara></listitem>
288	<listitem><simpara>save this mode configuration in slot <varname>[i]</varname> of the mode configuration array
289	<varname>[vorbis_mode_configurations]</varname>.</simpara></listitem>
290	</orderedlist>
291	</listitem>
292	<listitem><simpara>read 1 bit as a framing flag. If unset, a framing error occurred and the stream is not
293	decodable.</simpara></listitem>
294	</orderedlist>
295
296	<para>
297	After reading mode descriptions, setup header decode is complete.
298	</para>
299
300	</section>
301
302	</section>
303
304	</section>
305
306	<section>
307	<title>Audio packet decode and synthesis</title>
308
309	<para>
310	Following the three header packets, all packets in a Vorbis I stream
311	are audio. The first step of audio packet decode is to read and
312	verify the packet type. <emphasis>A non-audio packet when audio is expected
313	indicates stream corruption or a non-compliant stream. The decoder
314	must ignore the packet and not attempt decoding it to audio</emphasis>.
315	</para>
316
317	<section><title>packet type, mode and window decode</title>
318
319	<orderedlist>
320	<listitem><simpara>read 1 bit <varname>[packet_type]</varname>; check that packet type is 0 (audio)</simpara></listitem>
321	<listitem><simpara><varname>[mode_number]</varname> = read <link linkend="vorbis-spec-ilog">ilog</link>(<varname>[vorbis_mode_count]</varname>-1) bits
322	as unsigned integer</simpara></listitem>
323	<listitem><simpara>decode blocksize <varname>[n]</varname> is equal to <varname>[blocksize_0]</varname> if
324	<varname>[vorbis_mode_blockflag]</varname> is 0, else <varname>[n]</varname> is equal to <varname>[blocksize_1]</varname>.</simpara></listitem>
325	<listitem><simpara>perform window selection and setup; this window is used later by the inverse MDCT:</simpara>
326	<orderedlist>
327	<listitem><simpara>if this is a long window (the <varname>[vorbis_mode_blockflag]</varname> flag of this mode is
328	set):</simpara>
329	<orderedlist>
330	<listitem><simpara>read 1 bit for <varname>[previous_window_flag]</varname></simpara></listitem>
331	<listitem><simpara>read 1 bit for <varname>[next_window_flag]</varname></simpara></listitem>
332	<listitem><simpara>if <varname>[previous_window_flag]</varname> is not set, the left half
333	of the window will be a hybrid window for lapping with a
334	short block. See <xref
335	linkend="vorbis-spec-window"/> for an illustration of overlapping
336	dissimilar
337	windows. Else, the left half window will have normal long
338	shape.</simpara></listitem>
339	<listitem><simpara>if <varname>[next_window_flag]</varname> is not set, the right half of
340	the window will be a hybrid window for lapping with a short
341	block. See <xref linkend="vorbis-spec-window"/> for an
342	illustration of overlapping dissimilar
343	windows. Else, the right half window will have normal long
344	shape.</simpara></listitem>
345	</orderedlist>
346	</listitem>
347	<listitem><simpara> if this is a short window, the window is always the same
348	short-window shape.</simpara></listitem>
349	</orderedlist>
350	</listitem>
351	</orderedlist>
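
<para>
A sketch of the packet prologue described above. The
<literal>read_bits</literal> callable, the function name and the
<literal>mode_blockflags</literal> list (one <varname>[vorbis_mode_blockflag]</varname>
per configured mode) are assumptions for this example. The range check on
<varname>[mode_number]</varname> is a defensive addition of this sketch and is
not stated in the list above.</para>

<programlisting><![CDATA[
```python
def ilog(x: int) -> int:
    """Number of bits needed to represent x; 0 for zero or negative x."""
    return x.bit_length() if x > 0 else 0


def read_audio_prologue(read_bits, mode_blockflags, blocksize_0, blocksize_1):
    """Return (mode_number, n, previous_window_flag, next_window_flag).

    read_bits(k) is a hypothetical callable returning the next k bits of the
    packet as an unsigned integer. The two window flags are None for short
    windows, where they are not present in the packet.
    """
    if read_bits(1) != 0:
        raise ValueError("not an audio packet")
    mode_number = read_bits(ilog(len(mode_blockflags) - 1))
    if mode_number >= len(mode_blockflags):  # defensive check (assumption)
        raise ValueError("mode number out of range")
    blockflag = mode_blockflags[mode_number]
    n = blocksize_1 if blockflag else blocksize_0
    previous_window_flag = next_window_flag = None
    if blockflag:  # long window: two lapping flags follow
        previous_window_flag = bool(read_bits(1))
        next_window_flag = bool(read_bits(1))
    return mode_number, n, previous_window_flag, next_window_flag
```
]]></programlisting>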
352
353	<para>
354	Vorbis windows all use the slope function y=sin(0.5 * π * sin^2((x+.5)/n * π)),
355	where n is window size and x ranges 0...n-1, but dissimilar
356	lapping requirements can affect overall shape. Window generation
357	proceeds as follows:</para>
358
359	<orderedlist>
360	<listitem><simpara> <varname>[window_center]</varname> = <varname>[n]</varname> / 2</simpara></listitem>
361	<listitem><para> if (<varname>[vorbis_mode_blockflag]</varname> is set and <varname>[previous_window_flag]</varname> is
362	not set) then
363	<orderedlist>
364	<listitem><simpara><varname>[left_window_start]</varname> = <varname>[n]</varname>/4 -
365	<varname>[blocksize_0]</varname>/4</simpara></listitem>
366	<listitem><simpara><varname>[left_window_end]</varname> = <varname>[n]</varname>/4 + <varname>[blocksize_0]</varname>/4</simpara></listitem>
367	<listitem><simpara><varname>[left_n]</varname> = <varname>[blocksize_0]</varname>/2</simpara></listitem>
368	</orderedlist>
369	else
370	<orderedlist>
371	<listitem><simpara><varname>[left_window_start]</varname> = 0</simpara></listitem>
372	<listitem><simpara><varname>[left_window_end]</varname> = <varname>[window_center]</varname></simpara></listitem>
373	<listitem><simpara><varname>[left_n]</varname> = <varname>[n]</varname>/2</simpara></listitem>
374	</orderedlist></para>
375	</listitem>
376	<listitem><para> if (<varname>[vorbis_mode_blockflag]</varname> is set and <varname>[next_window_flag]</varname> is not
377	set) then
378	<orderedlist>
379	<listitem><simpara><varname>[right_window_start]</varname> = <varname>[n]*3</varname>/4 -
380	<varname>[blocksize_0]</varname>/4</simpara></listitem>
381	<listitem><simpara><varname>[right_window_end]</varname> = <varname>[n]*3</varname>/4 +
382	<varname>[blocksize_0]</varname>/4</simpara></listitem>
383	<listitem><simpara><varname>[right_n]</varname> = <varname>[blocksize_0]</varname>/2</simpara></listitem>
384	</orderedlist>
385	else
386	<orderedlist>
387	<listitem><simpara><varname>[right_window_start]</varname> = <varname>[window_center]</varname></simpara></listitem>
388	<listitem><simpara><varname>[right_window_end]</varname> = <varname>[n]</varname></simpara></listitem>
389	<listitem><simpara><varname>[right_n]</varname> = <varname>[n]</varname>/2</simpara></listitem>
390	</orderedlist></para>
391	</listitem>
392	<listitem><simpara> window from range 0 ... <varname>[left_window_start]</varname>-1 inclusive is zero</simpara></listitem>
393	<listitem><simpara> for <varname>[i]</varname> in range <varname>[left_window_start]</varname> ...
394	<varname>[left_window_end]</varname>-1, window(<varname>[i]</varname>) = sin(.5 * π * sin^2( (<varname>[i]</varname>-<varname>[left_window_start]</varname>+.5) / <varname>[left_n]</varname> * .5 * π) )</simpara></listitem>
395	<listitem><simpara> window from range <varname>[left_window_end]</varname> ... <varname>[right_window_start]</varname>-1
396	inclusive is one</simpara></listitem><listitem><simpara> for <varname>[i]</varname> in range <varname>[right_window_start]</varname> ... <varname>[right_window_end]</varname>-1, window(<varname>[i]</varname>) = sin(.5 * π * sin^2( (<varname>[i]</varname>-<varname>[right_window_start]</varname>+.5) / <varname>[right_n]</varname> * .5 * π + .5 * π) )</simpara></listitem>
397	<listitem><simpara> window from range <varname>[right_window_end]</varname> ... <varname>[n]</varname>-1 is
398	zero</simpara></listitem>
399	</orderedlist>
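
<para>
The window generation steps can be sketched as follows, with the final zero
region taken to start at <varname>[right_window_end]</varname>. The function
name and its parameter defaults are assumptions for this example; the
arithmetic follows the steps above, using integer division.</para>

<programlisting><![CDATA[
```python
import math


def vorbis_window(n, blocksize_0, blockflag,
                  previous_window_flag=True, next_window_flag=True):
    """Build the length-n window as a list of floats (illustrative sketch)."""
    center = n // 2
    if blockflag and not previous_window_flag:
        left_start = n // 4 - blocksize_0 // 4
        left_end = n // 4 + blocksize_0 // 4
        left_n = blocksize_0 // 2
    else:
        left_start, left_end, left_n = 0, center, n // 2
    if blockflag and not next_window_flag:
        right_start = n * 3 // 4 - blocksize_0 // 4
        right_end = n * 3 // 4 + blocksize_0 // 4
        right_n = blocksize_0 // 2
    else:
        right_start, right_end, right_n = center, n, n // 2

    def slope(phase):
        return math.sin(0.5 * math.pi * math.sin(phase) ** 2)

    window = [0.0] * n  # regions outside the slopes stay zero
    for i in range(left_start, left_end):
        window[i] = slope((i - left_start + 0.5) / left_n * 0.5 * math.pi)
    for i in range(left_end, right_start):
        window[i] = 1.0
    for i in range(right_start, right_end):
        window[i] = slope((i - right_start + 0.5) / right_n * 0.5 * math.pi
                          + 0.5 * math.pi)
    return window
```
]]></programlisting>

<para>
For two equal-sized blocks the rising and falling slopes are power
complementary: the squares of overlapping window samples sum to one.</para>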
400
401	<para>
402	An end-of-packet condition up to this point should be considered an
403	error that discards this packet from the stream. An end of packet
404	condition past this point is to be considered a possible nominal
405	occurrence.</para>
406
407	</section>
408
409	<section><title>floor curve decode</title>
410
411	<para>
412	From this point on, we assume our decode context is using mode number
413	<varname>[mode_number]</varname> from configuration array
414	<varname>[vorbis_mode_configurations]</varname> and the map number
415	<varname>[vorbis_mode_mapping]</varname> (specified by the current mode) taken
416	from the mapping configuration array
417	<varname>[vorbis_mapping_configurations]</varname>.</para>
418
419	<para>
420	Floor curves are decoded one-by-one in channel order.</para>
421
422	<para>
423	For each floor <varname>[i]</varname> of <varname>[audio_channels]</varname>
424	<orderedlist>
425	<listitem><simpara><varname>[submap_number]</varname> = element <varname>[i]</varname> of vector <varname>[vorbis_mapping_mux]</varname></simpara></listitem>
426	<listitem><simpara><varname>[floor_number]</varname> = element <varname>[submap_number]</varname> of vector
427	<varname>[vorbis_mapping_submap_floor]</varname></simpara></listitem>
428	<listitem><simpara>if the floor type of this
429	floor (vector <varname>[vorbis_floor_types]</varname> element
430	<varname>[floor_number]</varname>) is zero then decode the floor for
431	channel <varname>[i]</varname> according to the
432	<xref linkend="vorbis-spec-floor0-decode"/></simpara></listitem>
433	<listitem><simpara>if the type of this floor
434	is one then decode the floor for channel <varname>[i]</varname> according
435	to the <xref linkend="vorbis-spec-floor1-decode"/></simpara></listitem>
436	<listitem><simpara>save the needed decoded floor information for channel <varname>[i]</varname> for later synthesis</simpara></listitem>
437	<listitem><simpara>if the decoded floor returned 'unused', set vector <varname>[no_residue]</varname> element
438	<varname>[i]</varname> to true, else set vector <varname>[no_residue]</varname> element <varname>[i]</varname> to
439	false</simpara></listitem>
440	</orderedlist>
441	</para>
442
443	<para>
444	An end-of-packet condition during floor decode shall result in packet
445	decode zeroing all channel output vectors and skipping to the
446	add/overlap output stage.</para>
447
448	</section>
449
450	<section><title>nonzero vector propagate</title>
451
452	<para>
453	A possible result of floor decode is that a specific vector is marked
454	'unused' which indicates that that final output vector is all-zero
455	values (and the floor is zero). The residue for that vector is not
456	coded in the stream, save for one complication. If some vectors are
457	used and some are not, channel coupling could result in mixing a
458	zeroed and nonzeroed vector to produce two nonzeroed vectors.</para>
459
460	<para>
461	for each <varname>[i]</varname> from 0 ... <varname>[vorbis_mapping_coupling_steps]</varname>-1
462
463	<orderedlist>
464	<listitem><simpara>if either <varname>[no_residue]</varname> entry for channel
465	(<varname>[vorbis_mapping_magnitude]</varname> element <varname>[i]</varname>)
466	or channel
467	(<varname>[vorbis_mapping_angle]</varname> element <varname>[i]</varname>)
468	is set to false, then both must be set to false. Note that an 'unused'
469	floor has no decoded floor information; it is important that this is
470	remembered at floor curve synthesis time.</simpara></listitem>
471	</orderedlist>
472	</para>
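
<para>
A minimal sketch of this propagation step, assuming the three vectors are
plain Python lists; the function name is hypothetical.</para>

<programlisting><![CDATA[
```python
def propagate_nonzero(no_residue, mapping_magnitude, mapping_angle):
    """Return a copy of no_residue after propagating across coupled pairs."""
    flags = list(no_residue)
    # Steps are visited in order 0 ... [vorbis_mapping_coupling_steps]-1.
    for mag_ch, ang_ch in zip(mapping_magnitude, mapping_angle):
        if not (flags[mag_ch] and flags[ang_ch]):
            # At least one channel carries residue, so both must be decoded.
            flags[mag_ch] = False
            flags[ang_ch] = False
    return flags
```
]]></programlisting>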
473
474	</section>
475
476	<section><title>residue decode</title>
477
478	<para>
479	Unlike floors, which are decoded in channel order, the residue vectors
480	are decoded in submap order.</para>
481
482	<para>
483	for each submap <varname>[i]</varname> in order from 0 ... <varname>[vorbis_mapping_submaps]</varname>-1</para>
484
485	<orderedlist>
486	<listitem><simpara><varname>[ch]</varname> = 0</simpara></listitem>
487	<listitem><simpara>for each channel <varname>[j]</varname> in order from 0 ... <varname>[audio_channels]</varname> - 1</simpara>
488	<orderedlist>
489	<listitem><simpara>if channel <varname>[j]</varname> is in submap <varname>[i]</varname> (vector <varname>[vorbis_mapping_mux]</varname> element <varname>[j]</varname> is equal to <varname>[i]</varname>)</simpara>
490	<orderedlist>
491	<listitem><para>if vector <varname>[no_residue]</varname> element <varname>[j]</varname> is true
492	<orderedlist>
493	<listitem><simpara>vector <varname>[do_not_decode_flag]</varname> element <varname>[ch]</varname> is set</simpara></listitem>
494	</orderedlist>
495	else
496	<orderedlist>
497	<listitem><simpara>vector <varname>[do_not_decode_flag]</varname> element <varname>[ch]</varname> is unset</simpara></listitem>
498	</orderedlist></para>
499	</listitem>
500	<listitem><simpara>increment <varname>[ch]</varname></simpara></listitem>
501	</orderedlist>
502	</listitem>
503	</orderedlist>
504	</listitem><listitem><simpara><varname>[residue_number]</varname> = vector <varname>[vorbis_mapping_submap_residue]</varname> element <varname>[i]</varname></simpara></listitem>
505	<listitem><simpara><varname>[residue_type]</varname> = vector <varname>[vorbis_residue_types]</varname> element <varname>[residue_number]</varname></simpara></listitem>
506	<listitem><simpara>decode <varname>[ch]</varname> vectors using residue <varname>[residue_number]</varname>, according to type <varname>[residue_type]</varname>, also passing vector <varname>[do_not_decode_flag]</varname> to indicate which vectors in the bundle should not be decoded. Correct per-vector decode length is <varname>[n]</varname>/2.</simpara></listitem>
507	<listitem><simpara><varname>[ch]</varname> = 0</simpara></listitem>
508	<listitem><simpara>for each channel <varname>[j]</varname> in order from 0 ... <varname>[audio_channels]</varname> - 1</simpara>
509	<orderedlist>
510	<listitem><simpara>if channel <varname>[j]</varname> is in submap <varname>[i]</varname> (vector <varname>[vorbis_mapping_mux]</varname> element <varname>[j]</varname> is equal to <varname>[i]</varname>)</simpara>
511	<orderedlist>
512	<listitem><simpara>residue vector for channel <varname>[j]</varname> is set to decoded residue vector <varname>[ch]</varname></simpara></listitem>
513	<listitem><simpara>increment <varname>[ch]</varname></simpara></listitem>
514	</orderedlist>
515	</listitem>
516	</orderedlist>
517	</listitem>
518	</orderedlist>
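
<para>
The bundling and unbundling of channels per submap can be sketched as below.
The actual residue decode is abstracted behind <literal>decode_bundle</literal>,
a hypothetical callback that receives the submap number and the
<varname>[do_not_decode_flag]</varname> vector and returns one vector per
bundled channel.</para>

<programlisting><![CDATA[
```python
def decode_submap_residues(mapping_mux, no_residue, submaps, decode_bundle):
    """Return the per-channel residue vectors, decoded in submap order."""
    channels = len(mapping_mux)
    residue = [None] * channels
    for i in range(submaps):
        # Channels belonging to submap i, in ascending channel order.
        members = [j for j in range(channels) if mapping_mux[j] == i]
        do_not_decode_flag = [no_residue[j] for j in members]
        vectors = decode_bundle(i, do_not_decode_flag)
        # Scatter decoded vector [ch] back to its channel [j].
        for ch, j in enumerate(members):
            residue[j] = vectors[ch]
    return residue
```
]]></programlisting>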
519
520	</section>
521
522	<section><title>inverse coupling</title>
523
524	<para>
525	for each <varname>[i]</varname> from <varname>[vorbis_mapping_coupling_steps]</varname>-1 descending to 0
526
527	<orderedlist>
528	<listitem><simpara><varname>[magnitude_vector]</varname> = the residue vector for channel
529	(vector <varname>[vorbis_mapping_magnitude]</varname> element <varname>[i]</varname>)</simpara></listitem>
530	<listitem><simpara><varname>[angle_vector]</varname> = the residue vector for channel (vector
531	<varname>[vorbis_mapping_angle]</varname> element <varname>[i]</varname>)</simpara></listitem>
532	<listitem><simpara>for each scalar value <varname>[M]</varname> in vector <varname>[magnitude_vector]</varname> and the corresponding scalar value <varname>[A]</varname> in vector <varname>[angle_vector]</varname>:</simpara>
533	<orderedlist>
534	<listitem><para>if (<varname>[M]</varname> is greater than zero)
535	<orderedlist>
536	<listitem><para>if (<varname>[A]</varname> is greater than zero)
537	<orderedlist>
538	<listitem><simpara><varname>[new_M]</varname> = <varname>[M]</varname></simpara></listitem>
539	<listitem><simpara><varname>[new_A]</varname> = <varname>[M]</varname>-<varname>[A]</varname></simpara></listitem>
540	</orderedlist>
541	else
542	<orderedlist>
543	<listitem><simpara><varname>[new_A]</varname> = <varname>[M]</varname></simpara></listitem>
544	<listitem><simpara><varname>[new_M]</varname> = <varname>[M]</varname>+<varname>[A]</varname></simpara></listitem>
545	</orderedlist>
546	</para></listitem>
547	</orderedlist>
548	else
549	<orderedlist>
550	<listitem><para>if (<varname>[A]</varname> is greater than zero)
551	<orderedlist>
552	<listitem><simpara><varname>[new_M]</varname> = <varname>[M]</varname></simpara></listitem>
553	<listitem><simpara><varname>[new_A]</varname> = <varname>[M]</varname>+<varname>[A]</varname></simpara></listitem>
554	</orderedlist>
555	else
556	<orderedlist>
557	<listitem><simpara><varname>[new_A]</varname> = <varname>[M]</varname></simpara></listitem>
558	<listitem><simpara><varname>[new_M]</varname> = <varname>[M]</varname>-<varname>[A]</varname></simpara></listitem>
559	</orderedlist>
560	</para></listitem>
561	</orderedlist>
562	</para></listitem>
563	<listitem><simpara>set scalar value <varname>[M]</varname> in vector <varname>[magnitude_vector]</varname> to <varname>[new_M]</varname></simpara></listitem>
564	<listitem><simpara>set scalar value <varname>[A]</varname> in vector <varname>[angle_vector]</varname> to <varname>[new_A]</varname></simpara></listitem>
565	</orderedlist>
566	</listitem>
567	</orderedlist>
568	</para>
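
<para>
The branch structure above translates directly into code. This sketch assumes
each residue vector is a mutable Python list and updates the vectors in place;
the function names are illustrative.</para>

<programlisting><![CDATA[
```python
def inverse_couple_pair(magnitude_vector, angle_vector):
    """Undo square polar coupling for one magnitude/angle pair, in place."""
    for k, (m, a) in enumerate(zip(magnitude_vector, angle_vector)):
        if m > 0:
            if a > 0:
                new_m, new_a = m, m - a
            else:
                new_a, new_m = m, m + a
        else:
            if a > 0:
                new_m, new_a = m, m + a
            else:
                new_a, new_m = m, m - a
        magnitude_vector[k] = new_m
        angle_vector[k] = new_a


def inverse_coupling(residue, mapping_magnitude, mapping_angle):
    """Apply all coupling steps from the last step down to step 0."""
    for i in reversed(range(len(mapping_magnitude))):
        inverse_couple_pair(residue[mapping_magnitude[i]],
                            residue[mapping_angle[i]])
```
]]></programlisting>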
569
570	</section>
571
572	<section><title>dot product</title>
573
574	<para>
575	For each channel, synthesize the floor curve from the decoded floor
576	information, according to floor type. Note that the vector synthesis
577	length for floor computation is <varname>[n]</varname>/2.</para>
578
579	<para>
580	For each channel, multiply each element of the floor curve by the
581	corresponding element of that channel's residue vector. The result is the
582	element-wise ('dot') product of the floor and residue vectors for each channel; the produced
583	vectors are the length <varname>[n]</varname>/2 audio spectrum for each
584	channel.</para>
585
586	<para>
587	A common mistake in fixed-point implementations of this product is to
588	assume that a 32 bit fixed-point representation for floor and residue,
589	with direct multiplication of the vectors, gives acceptable spectral
590	depth in all cases. It does not; it only happens to mostly work with
591	streams produced by the current Xiph.Org reference encoder, which is
592	not a guarantee for other encoders.</para>
593
594	<para>
595	However, floor vector values can span ~140dB (~24 bits unsigned), and
596	the audio spectrum vector should represent a minimum of 120dB (~21
597	bits with sign), even when output is to a 16 bit PCM device. For the
598	residue vector to represent full scale if the floor is nailed to
599	-140dB, it must be able to span 0 to +140dB. For the residue vector
600	to reach full scale if the floor is nailed at 0dB, it must be able to
601	represent -140dB to +0dB. Thus, in order to handle full range
602	dynamics, a residue vector may span -140dB to +140dB entirely within
603	spec. A 280dB range is approximately 48 bits with sign; thus the
604	residue vector must be able to represent a 48 bit range and the dot
605	product must be able to handle an effective 48 bit times 24 bit
606	multiplication. This range may be achieved using large (64 bit or
607	larger) integers, or implementing a movable binary point
608	representation.</para>
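
<para>
The bit widths quoted above follow from the usual rule that each bit of
amplitude resolution covers 20*log10(2), or about 6.02, dB. The sketch
below reproduces the arithmetic; the helper name
<literal>bits_for_range</literal> is hypothetical, and rounding up to a
whole bit is an assumption about how the approximate figures were
obtained.
</para>

<programlisting><![CDATA[
```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of amplitude range per bit


def bits_for_range(db, signed=False):
    """Whole bits needed to span `db` decibels, plus one sign bit if asked.

    Hypothetical helper for checking the figures in the text.
    """
    bits = math.ceil(db / DB_PER_BIT)
    return bits + 1 if signed else bits


floor_bits = bits_for_range(140)                  # floor span, unsigned
spectrum_bits = bits_for_range(120, signed=True)  # audio spectrum
residue_bits = bits_for_range(280, signed=True)   # -140dB to +140dB
```
]]></programlisting>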
609
610	</section>
611
612	<section><title>inverse MDCT</title>
613
614	<para>
615	Convert the audio spectrum vector of each channel back into time
616	domain PCM audio via an inverse Modified Discrete Cosine Transform
617	(MDCT). A detailed description of the MDCT is available in the paper
618	<ulink url="http://www.iocon.com/resource/docs/ps/eusipco_corrected.ps"><citetitle pubwork="article">The
619	use of multirate filter banks for coding of high quality digital
620	audio</citetitle></ulink>, by T. Sporer, K. Brandenburg and B. Edler. The window
621	function used for the MDCT is the function described earlier.</para>
622
623	</section>
624
625	<section><title>overlap_add</title>
626
627	<para>
628	Windowed MDCT output is overlapped and added with the right hand data
629	of the previous window such that the 3/4 point of the previous window
630	is aligned with the 1/4 point of the current window (as illustrated in
631	<xref linkend="vorbis-spec-window"/>). The overlapped portion
632	produced from overlapping the previous and current frame data is
633	finished data to be returned by the decoder. This data spans from the
634	center of the previous window to the center of the current window. In
635	the case of same-sized windows, the amount of data to return is
636	one-half block, consisting only of the overlapped portions. When
637	overlapping a short and long window, much of the returned range does not
638	actually overlap. This does not damage transform orthogonality. Take
639	care, however, to return the correct data range; the amount of
640	data to be returned is:
641
642	<programlisting>
643	window_blocksize(previous_window)/4+window_blocksize(current_window)/4
644	</programlisting>
645
646	from the center (element windowsize/2) of the previous window to the
647	center (element windowsize/2-1, inclusive) of the current window.</para>
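
<para>
The formula above is easy to check for the three possible window
pairings. The sketch assumes, purely for illustration, block sizes of
256 and 2048; real streams declare their own sizes in the identification
header, and the function name <literal>returned_samples</literal> is
hypothetical.
</para>

<programlisting><![CDATA[
```python
def returned_samples(previous_blocksize, current_blocksize):
    """Finished samples produced by overlapping two consecutive windows.

    Hypothetical helper; block sizes are powers of two, so integer
    division by four is exact.
    """
    return previous_blocksize // 4 + current_blocksize // 4


short_short = returned_samples(256, 256)    # half of a short block
short_long = returned_samples(256, 2048)    # mixed sizes
long_long = returned_samples(2048, 2048)    # half of a long block
```
]]></programlisting>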
648
649	<para>
650	Data is not returned from the first frame; it must be used to 'prime'
651	the decode engine. The encoder accounts for this priming when
652	calculating PCM offsets; after the first frame, the proper PCM output
653	offset is '0' (as no data has been returned yet).</para>
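
<para>
To illustrate the priming behaviour, the following sketch tracks the
running PCM output offset across a sequence of audio packets. It is a
simplification under stated assumptions: only the block size of each
packet is modelled, the example sizes are arbitrary, and the name
<literal>pcm_offsets</literal> is hypothetical.
</para>

<programlisting><![CDATA[
```python
def pcm_offsets(blocksizes):
    """Running count of finished samples after each decoded audio packet.

    Hypothetical helper; `blocksizes` lists the block size of each audio
    packet in decode order.
    """
    offsets = []
    total = 0
    previous = None
    for current in blocksizes:
        if previous is not None:
            # Every packet after the first completes the span between
            # the two window centers.
            total += previous // 4 + current // 4
        # The first packet only primes the decoder, so total stays 0.
        offsets.append(total)
        previous = current
    return offsets


offsets = pcm_offsets([256, 256, 2048, 2048])
```
]]></programlisting>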
654
655	</section>
656
657	<section><title>output channel order</title>
658
659	<para>
660	Vorbis I specifies only a channel mapping type 0. In mapping type 0,
661	channel mapping is implicitly defined as follows for standard audio
662	applications:</para>
663
664	<variablelist>
665	<varlistentry>
666	<term>one channel</term>
667	<listitem>
668	<simpara>the stream is monophonic</simpara>
669	</listitem>
670	</varlistentry><varlistentry>
671	<term>two channels</term><listitem>
672	<simpara>the stream is stereo. channel order: left, right</simpara>
673	</listitem>
674	</varlistentry><varlistentry>
675	<term>three channels</term><listitem>
676	<simpara>the stream is a 1d-surround encoding. channel order: left,
677	center, right</simpara>
678	</listitem>
679	</varlistentry><varlistentry>
680	<term>four channels</term><listitem>
681	<simpara>the stream is quadraphonic surround. channel order: front left,
682	front right, rear left, rear right</simpara>
683	</listitem>
684	</varlistentry><varlistentry>
685	<term>five channels</term><listitem>
686	<simpara>the stream is five-channel surround. channel order: front left,
687	front center, front right, rear left, rear right</simpara>
688	</listitem>
689	</varlistentry><varlistentry>
690	<term>six channels</term><listitem>
691	<simpara>the stream is 5.1 surround. channel order: front left, front
692	center, front right, rear left, rear right, LFE</simpara>
693	</listitem>
694	</varlistentry><varlistentry>
695	<term>greater than six channels</term><listitem>
696	<simpara>channel use and order are defined by the application</simpara>
697	</listitem>
698	</varlistentry>
699	</variablelist>
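
<para>
A decoder front end might hold the implicit mapping type 0 orders listed
above in a small lookup table. The sketch below is one possible
arrangement; the names <literal>CHANNEL_ORDER</literal> and
<literal>channel_order</literal> are hypothetical, and returning
<literal>None</literal> for more than six channels is an assumed way of
signalling that the application defines the layout.
</para>

<programlisting><![CDATA[
```python
# Channel orders for mapping type 0, keyed by channel count.
CHANNEL_ORDER = {
    1: ("mono",),
    2: ("left", "right"),
    3: ("left", "center", "right"),
    4: ("front left", "front right", "rear left", "rear right"),
    5: ("front left", "front center", "front right",
        "rear left", "rear right"),
    6: ("front left", "front center", "front right",
        "rear left", "rear right", "LFE"),
}


def channel_order(channels):
    """Return the implicit channel order, or None if application-defined.

    Hypothetical helper for illustration.
    """
    if channels < 1:
        raise ValueError("a stream has at least one channel")
    return CHANNEL_ORDER.get(channels)


stereo = channel_order(2)
```
]]></programlisting>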
700
701	<para>
702	Applications using Vorbis for dedicated purposes may define channel
703	mapping as they see fit. Future channel layouts (such as three and four
704	channel <ulink url="http://www.ambisonic.net/">Ambisonics</ulink>) will
705	use mapping types other than mapping type 0.</para>
706
707	</section>
708
709	</section>
710
711	</section>
