<?xml version="1.0" standalone="no"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [

]>

<section id="vorbis-spec-bitpacking">
<sectioninfo>
<releaseinfo>
 $Id: 02-bitpacking.xml 7186 2004-07-20 07:19:25Z xiphmont $
</releaseinfo>
</sectioninfo>
<title>Bitpacking Convention</title>


<section>
<title>Overview</title>

<para>
The Vorbis codec uses relatively unstructured raw packets containing
arbitrary-width binary integer fields.  Logically, these packets are a
bitstream in which bits are coded one-by-one by the encoder and then
read one-by-one in the same monotonically increasing order by the
decoder.  Most current binary storage arrangements group bits into a
native word size of eight bits (octets), sixteen bits, thirty-two bits
or, less commonly, other fixed word sizes.  The Vorbis bitpacking
convention specifies the correct mapping of the logical packet
bitstream into an actual representation in fixed-width words.
</para>

<section><title>octets, bytes and words</title>

<para>
In most contemporary architectures, a 'byte' is synonymous with an
'octet', that is, eight bits.  This has not always been the case;
seven, ten, eleven and sixteen bit 'bytes' have been used.  For
purposes of the bitpacking convention, a byte implies the native,
smallest integer storage representation offered by a platform.  On
modern platforms, this is generally assumed to be eight bits, not
necessarily because of the processor but because of the
filesystem/memory architecture: modern filesystems invariably offer
bytes as the fundamental atom of storage.  A 'word' is an integer
size that is a grouped multiple of this smallest size.</para>

<para>
The most ubiquitous architectures today consider a 'byte' to be an
octet (eight bits) and a word to be a group of two, four or eight
bytes (16, 32 or 64 bits).  Note, however, that the Vorbis bitpacking
convention is still well defined for any native byte size; Vorbis uses
the native bit-width of a given storage system.  This document assumes
that a byte is one octet for purposes of example.</para>

</section><section><title>bit order</title>

<para>
A byte has a well-defined 'least significant' bit (LSb), which is the
only bit set when the byte is storing the two's complement integer
value +1.  A byte's 'most significant' bit (MSb) is at the opposite
end of the byte.  Bits in a byte are numbered from zero at the LSb to
<emphasis>n</emphasis> (<emphasis>n</emphasis>=7 in an octet) for the
MSb.</para>

</section>

<section><title>byte order</title>

<para>
Words are native groupings of multiple bytes.  Several byte orderings
are possible in a word; the common ones are 3-2-1-0 ('big endian' or
'most significant byte first', in which the highest-valued byte comes
first), 0-1-2-3 ('little endian' or 'least significant byte first', in
which the lowest-valued byte comes first) and, less commonly, 3-1-2-0 and
0-2-1-3 ('mixed endian').</para>

<para>
The Vorbis bitpacking convention specifies storage and bitstream
manipulation at the byte, not word, level.  Host word ordering is
therefore of concern only during optimization, when writing
high-performance code that operates on a word of storage at a time
rather than a byte at a time.  Logically, bytes are always coded and
decoded in order from byte zero through byte
<emphasis>n</emphasis>.</para>

</section>

<section><title>coding bits into byte sequences</title>

<para>
The Vorbis codec needs to code arbitrary bit-width integers, from
zero to 32 bits wide, into packets.  These integer fields are not
aligned to the boundaries of the byte representation; the next field
is written at the bit position at which the previous field ends.</para>

<para>
The encoder logically packs integers by writing the LSb of a binary
integer to the logical bitstream first, followed by the next least
significant bit, and so on, until the requested number of bits have
been coded.  When packing the bits into bytes, the encoder begins by
placing the LSb of the integer to be written into the least
significant unused bit position of the destination byte, followed by
the next least significant bit of the source integer and so on up to
the requested number of bits.  When all bits of the destination byte
have been filled, encoding continues by zeroing all bits of the next
byte and writing the next bit into bit position 0 of that byte.
Decoding follows the same process as encoding, but by reading bits
from the byte stream and reassembling them into integers.</para>
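
<para>
As an illustration only, the packing rule described above can be
sketched in Python.  The class name BitWriter and its method names are
hypothetical, chosen for this sketch; they are not part of this
specification or of any reference library.  The sketch assumes
eight-bit bytes and writes one bit at a time for clarity rather than
speed.</para>

<programlisting><![CDATA[
```python
class BitWriter:
    """Packs integer fields LSb-first into a growing byte buffer (hypothetical helper)."""

    def __init__(self):
        self.buf = bytearray()
        self.bit_pos = 0  # next unused bit position in the last byte (0..7)

    def write(self, value, nbits):
        """Append the low `nbits` bits of `value`, least significant bit first."""
        if not 0 <= nbits <= 32:
            raise ValueError("field width must be between 0 and 32 bits")
        for i in range(nbits):
            if self.bit_pos == 0:
                self.buf.append(0)  # start a new byte with all bits zeroed
            bit = (value >> i) & 1  # also correct for negative Python ints
            self.buf[-1] |= bit << self.bit_pos
            self.bit_pos = (self.bit_pos + 1) % 8

    def getvalue(self):
        """Return the bytes written so far."""
        return bytes(self.buf)


w = BitWriter()
w.write(5, 3)    # b101 fills bits 0..2 of byte 0
w.write(1, 1)    # one bit lands in bit 3
w.write(511, 9)  # nine set bits straddle the byte boundary
print(w.getvalue().hex())  # fd1f
```
]]></programlisting>

<para>
The nine-bit field in the usage lines begins at bit 4 of byte 0 and
ends at bit 4 of byte 1, showing that fields are not aligned to byte
boundaries.</para>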

</section>

<section><title>signedness</title>

<para>
The signedness of a specific number resulting from decode is to be
interpreted by the decoder given decode context.  For example, the
three-bit binary pattern 'b111' can be taken to represent either 'seven'
as an unsigned integer, or '-1' as a signed, two's complement integer.
The encoder and decoder are responsible for knowing if fields are to
be treated as signed or unsigned.</para>
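
<para>
The two interpretations of a decoded bit pattern can be sketched as
follows.  The function names as_unsigned and as_signed are
hypothetical and exist only for this illustration; which one applies
to a given field is determined by decode context, not by the
bitstream.</para>

<programlisting><![CDATA[
```python
def as_unsigned(raw, nbits):
    """Interpret the low `nbits` bits of `raw` as an unsigned integer."""
    return raw & ((1 << nbits) - 1)


def as_signed(raw, nbits):
    """Interpret the low `nbits` bits of `raw` as a two's complement integer."""
    value = as_unsigned(raw, nbits)
    if nbits > 0 and value & (1 << (nbits - 1)):
        value -= 1 << nbits  # sign bit set: subtract 2**nbits
    return value


pattern = 0b111
print(as_unsigned(pattern, 3))  # 7
print(as_signed(pattern, 3))    # -1
```
]]></programlisting>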

</section>

<section><title>coding example</title>

<para>
Code the 4 bit integer value '12' [b1100] into an empty bytestream.
Bytestream result:

<screen>
              |
              V

        7 6 5 4 3 2 1 0
byte 0 [0 0 0 0 1 1 0 0]  &lt;-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte

</screen>
</para>

<para>
Continue by coding the 3 bit integer value '-1' [b111]:

<screen>
        |
        V

        7 6 5 4 3 2 1 0
byte 0 [0 1 1 1 1 1 0 0]  &lt;-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte
</screen>
</para>

<para>
Continue by coding the 7 bit integer value '17' [b0010001]:

<screen>
          |
          V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 0 0 0 1 0 0 0]  &lt;-
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 2 bytes
                          bit cursor == 6
</screen>
</para>

<para>
Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:

<screen>
                |
                V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  &lt;-
             ...
byte n [               ]  bytestream length == 4 bytes

</screen>
</para>
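
<para>
The four writes of this example can be checked with a short Python
sketch.  It relies on the observation that, because bits fill each
byte from the LSb upward and bytes are emitted in order, the finished
packet equals one large integer stored least significant byte first.
The function name pack_fields is hypothetical, and accumulating into a
single integer is a convenience of this sketch, not a method
prescribed by this document.</para>

<programlisting><![CDATA[
```python
def pack_fields(fields):
    """Pack (value, nbits) pairs LSb-first and return the resulting bytes."""
    acc = 0    # all bits written so far; stream bit i is bit i of acc
    total = 0  # number of bits written so far
    for value, nbits in fields:
        mask = (1 << nbits) - 1
        acc |= (value & mask) << total  # masking also handles negative values
        total += nbits
    nbytes = (total + 7) // 8  # round up; unused high bits stay zero
    return acc.to_bytes(nbytes, "little")


packet = pack_fields([(12, 4), (-1, 3), (17, 7), (6969, 13)])
print(packet.hex())  # fc48ce06
```
]]></programlisting>

<para>
The printed bytes 0xFC, 0x48, 0xCE and 0x06 correspond to bytes 0
through 3 of the final diagram.</para>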

</section>

<section><title>decoding example</title>

<para>
Reading from the beginning of the bytestream encoded in the above example:

<screen>
                      |
                      V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]  &lt;-
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  bytestream length == 4 bytes

</screen>
</para>

<para>
We read two two-bit integer fields, resulting in the returned numbers
'b00' and 'b11'.  Two things are worth noting here:

<itemizedlist>
<listitem>
<para>Although these four bits were originally written as a single
four-bit integer, reading some other combination of bit-widths from the
bitstream is well defined.  There are no artificial alignment
boundaries maintained in the bitstream.</para>
</listitem>
<listitem>
<para>The second value is the
two-bit-wide integer 'b11'.  This value may be interpreted either as
the unsigned value '3', or the signed value '-1'.  Signedness is
dependent on decode context.</para>
</listitem>
</itemizedlist>
</para>
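
<para>
The reads described above can be reproduced with a minimal Python
sketch.  The function name read_bits and its explicit bit-offset
argument are hypothetical choices for this illustration.  The sketch
returns unsigned values and does no bounds checking, so it does not
model the end-of-packet condition.</para>

<programlisting><![CDATA[
```python
def read_bits(data, bit_offset, nbits):
    """Return (value, new_offset) for an `nbits`-wide field read LSb-first."""
    value = 0
    for i in range(nbits):
        pos = bit_offset + i
        bit = (data[pos // 8] >> (pos % 8)) & 1
        value |= bit << i  # the i-th bit read becomes bit i of the result
    return value, bit_offset + nbits


packet = bytes([0xFC, 0x48, 0xCE, 0x06])  # bytestream from the coding example
first, offset = read_bits(packet, 0, 2)
second, offset = read_bits(packet, offset, 2)
print(first, second)  # 0 3
```
]]></programlisting>

<para>
Reading the same packet with the original widths of 4, 3, 7 and 13
bits returns 12, 7, 17 and 6969; the value 7 is the unsigned reading
of the field that was written as '-1'.</para>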

</section>

<section><title>end-of-packet alignment</title>

<para>
The typical use of bitpacking is to produce many independent
byte-aligned packets which are embedded into a larger byte-aligned
container structure, such as an Ogg transport bitstream.  Externally,
each bytestream (encoded bitstream) must begin and end on a byte
boundary.  Often, the encoded bitstream is not an integer number of
bytes, and so there is unused (uncoded) space in the last byte of a
packet.</para>

<para>
Unused space in the last byte of a bytestream is always zeroed during
the coding process.  Thus, should this unused space be read, it will
return binary zeroes.</para>

<para>
Attempting to read past the end of an encoded packet results in an
'end-of-packet' condition.  End-of-packet is not to be considered an
error; it is merely a state indicating that there is insufficient
remaining data to fulfill the desired read size.  Vorbis uses truncated
packets as a normal mode of operation, so decoders must handle
reading past the end of a packet as a routine event.  Any further read
operations after an 'end-of-packet' condition shall also return
'end-of-packet'.</para>
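
<para>
The amount of zeroed space in the last byte follows from simple
arithmetic, sketched here in Python.  The function name packet_size is
hypothetical, and eight-bit bytes are assumed.</para>

<programlisting><![CDATA[
```python
def packet_size(total_bits):
    """Return (byte_length, padding_bits) for a packet of `total_bits` coded bits."""
    byte_length = (total_bits + 7) // 8          # round up to whole bytes
    padding_bits = byte_length * 8 - total_bits  # zeroed bits in the last byte
    return byte_length, padding_bits


# The coding example writes 4 + 3 + 7 + 13 = 27 bits.
print(packet_size(27))  # (4, 5)

last_byte = 0x06             # final byte of that packet
print(last_byte >> (8 - 5))  # 0: the five unused high bits are zero
```
]]></programlisting>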

</section>

<section><title>reading zero bits</title>

<para>
Reading a zero-bit-wide integer returns the value '0' and does not
increment the stream cursor.  Reading to the end of the packet (but
not past, such that an 'end-of-packet' condition has not triggered)
and then reading a zero-bit integer shall succeed, returning 0, and
not trigger an end-of-packet condition.  Reading a zero-bit-wide
integer after a previous read has set 'end-of-packet' shall also fail
with 'end-of-packet'.</para>
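
<para>
The end-of-packet and zero-width rules can be sketched together in
Python.  The class name BitReader is hypothetical, and returning None
to signal 'end-of-packet' is an assumption of this sketch; this
document does not prescribe how an implementation reports the
condition.  Whether the cursor moves on a failed read is likewise left
unspecified here, so the sketch simply leaves it in place.</para>

<programlisting><![CDATA[
```python
class BitReader:
    """Reads LSb-first fields from a packet, tracking end-of-packet state."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0      # stream cursor, in bits
        self.eop = False  # sticky end-of-packet flag

    def read(self, nbits):
        """Return the next `nbits`-wide unsigned field, or None at end-of-packet."""
        if self.eop:
            return None      # every later read also reports end-of-packet
        if self.pos + nbits > len(self.data) * 8:
            self.eop = True  # not enough data left for this read
            return None
        value = 0
        for i in range(nbits):
            byte = self.data[(self.pos + i) // 8]
            value |= ((byte >> ((self.pos + i) % 8)) & 1) << i
        self.pos += nbits    # a zero-width read leaves the cursor unchanged
        return value


r = BitReader(bytes([0xFC]))
print(r.read(8))  # 252: consumes the whole packet
print(r.read(0))  # 0: zero-width read at the end still succeeds
print(r.read(1))  # None: reading past the end sets end-of-packet
print(r.read(0))  # None: end-of-packet persists, even for zero-width reads
```
]]></programlisting>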

</section>

</section>
</section>
