| 1 | <html> |
|---|
| 2 | |
|---|
| 3 | <head> |
|---|
| 4 | <title>libvorbisenc - API Overview</title> |
|---|
| 5 | <link rel=stylesheet href="style.css" type="text/css"> |
|---|
| 6 | </head> |
|---|
| 7 | |
|---|
| 8 | <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff"> |
|---|
| 9 | <table border=0 width=100%> |
|---|
| 10 | <tr> |
|---|
| 11 | <td><p class=tiny>libvorbisenc documentation</p></td> |
|---|
| 12 | <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td> |
|---|
| 13 | </tr> |
|---|
| 14 | </table> |
|---|
| 15 | |
|---|
| 16 | <h1>Libvorbisenc API Overview</h1> |
|---|
| 17 | |
|---|
| 18 | <p>Libvorbisenc is an encoding convenience library intended to |
|---|
| 19 | encapsulate the elaborate setup that libvorbis requires for encoding. |
|---|
| 20 | Libvorbisenc gives easy access to all high-level adjustments an |
|---|
| 21 | application may require when encoding and also exposes some low-level |
|---|
| 22 | tuning parameters to allow applications to make detailed adjustments |
|---|
| 23 | to the encoding process. <p> |
|---|
| 24 | |
|---|
| 25 | All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h". |
|---|
| 26 | |
|---|
| 27 | <em>Note: libvorbis and libvorbisenc always |
|---|
| 28 | encode in a single pass. Thus, all possible encoding setups will work |
|---|
| 29 | properly with live input and produce streams that decode properly when |
|---|
| 30 | streamed. See the subsection titled <a href="#BBR">"managed bitrate |
|---|
| 31 | modes"</a> for details on setting limits on bitrate usage when Vorbis |
|---|
| 32 | streams are used in a limited-bandwidth environment.</em> |
|---|
| 33 | |
|---|
| 34 | <h2>workflow</h2> |
|---|
| 35 | |
|---|
| 36 | <p>Libvorbisenc is used only during encoder setup; its function |
|---|
| 37 | is to automate initialization of a multitude of settings in a |
|---|
| 38 | <tt>vorbis_info</tt> structure which libvorbis then uses as a reference |
|---|
| 39 | during the encoding process. Libvorbisenc plays no part in the |
|---|
| 40 | encoding process after setup. |
|---|
| 41 | |
|---|
| 42 | <p>Encode setup using libvorbisenc consists of three steps: |
|---|
| 43 | |
|---|
| 44 | <ol> |
|---|
| 45 | <li>high-level initialization of a <tt>vorbis_info</tt> structure by |
|---|
| 46 | calling one of <a |
|---|
| 47 | href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a |
|---|
| 48 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> |
|---|
| 49 | with the basic input audio parameters (rate and channels) and the |
|---|
| 50 | basic desired encoded audio output parameters (VBR quality or ABR/CBR |
|---|
| 51 | bitrate)<p> |
|---|
| 52 | |
|---|
| 53 | <li>optional adjustment of the basic setup defaults using <a |
|---|
| 54 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p> |
|---|
| 55 | |
|---|
| 56 | <li>calling <a |
|---|
| 57 | href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to |
|---|
| 58 | finalize the high-level setup into the detailed low-level reference |
|---|
| 59 | values needed by libvorbis to encode audio. The <tt>vorbis_info</tt> |
|---|
| 60 | structure is then ready to use for encoding by libvorbis.<p> |
|---|
| 61 | |
|---|
| 62 | </ol> |
|---|
| 63 | |
|---|
| 64 | These three steps can be collapsed into a single call by using <a |
|---|
| 65 | href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr()</a> to set up a |
|---|
| 66 | quality-based VBR stream or <a |
|---|
| 67 | href="vorbis_encode_init.html">vorbis_encode_init()</a> to set up a managed |
|---|
| 68 | bitrate (ABR or CBR) stream.<p> |
|---|
| 69 | |
|---|
| 70 | <h2>adjustable encoding parameters</h2> |
|---|
| 71 | |
|---|
| 72 | <h3>input audio parameters</h3> |
|---|
| 73 | |
|---|
| 74 | <p> |
|---|
| 75 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
|---|
| 76 | <tr bgcolor=#cccccc> |
|---|
| 77 | <td><b>parameter</b></td> |
|---|
| 78 | <td><b>description</b></td> |
|---|
| 79 | </tr> |
|---|
| 80 | <tr valign=top> |
|---|
| 81 | <td>sampling rate</td> |
|---|
| 82 | <td> |
|---|
| 83 | The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample. |
|---|
| 84 | |
|---|
| 85 | </td> |
|---|
| 86 | </tr> |
|---|
| 87 | <tr valign=top> |
|---|
| 88 | <td>channels</td> |
|---|
| 89 | <td> |
|---|
| 90 | |
|---|
| 91 | The number of channels encoded in each input sample. By default, |
|---|
| 92 | stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such |
|---|
| 93 | that the stereo relationship between the samples is taken into account |
|---|
| 94 | when encoding. Stereo coupling may be disabled by using <a |
|---|
| 95 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a |
|---|
| 96 | href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>. |
|---|
| 97 | |
|---|
| 98 | </td> |
|---|
| 99 | </tr> |
|---|
| 100 | </table> |
|---|
| 101 | |
|---|
| 102 | <h3>quality and VBR modes</h3> |
|---|
| 103 | |
|---|
| 104 | Vorbis is natively a VBR codec; a user requests a given constant |
|---|
| 105 | <em>quality</em> and the encoder keeps the encoding quality constant |
|---|
| 106 | while allowing the bitrate to vary. 'Quality' modes (Variable BitRate) |
|---|
| 107 | will always produce the most consistent encoding results as well as |
|---|
| 108 | the highest quality for the number of bits used. |
|---|
| 109 | |
|---|
| 110 | <p> |
|---|
| 111 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
|---|
| 112 | <tr bgcolor=#cccccc> |
|---|
| 113 | <td><b>parameter</b></td> |
|---|
| 114 | <td><b>description</b></td> |
|---|
| 115 | </tr> |
|---|
| 116 | <tr valign=top> |
|---|
| 117 | <td>quality</td> |
|---|
| 118 | <td> |
|---|
| 119 | A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times. |
|---|
| 120 | |
|---|
| 121 | </td> |
|---|
| 122 | </tr> |
|---|
| 123 | </table> |
|---|
| 124 | |
|---|
| 125 | <a name="BBR"></a> |
|---|
| 126 | <h3>managed bitrate modes</h3> |
|---|
| 127 | |
|---|
| 128 | Although the Vorbis codec is natively VBR, libvorbis includes |
|---|
| 129 | infrastructure for 'managing' the bitrate of streams by setting |
|---|
| 130 | minimum and maximum usage constraints, as well as functionality for |
|---|
| 131 | nudging a stream toward a desired average value. These features |
|---|
| 132 | should <em>only</em> be used when there is a requirement to limit |
|---|
| 133 | bitrate in some way. Although the difference is usually slight, |
|---|
| 134 | managed bitrate modes will always produce output inferior to VBR |
|---|
| 135 | (given equal bitrate usage). Setting overly or impossibly tight |
|---|
| 136 | bitrate management requirements can affect output quality dramatically |
|---|
| 137 | for the worse.<p> |
|---|
| 138 | |
|---|
| 139 | Beginning in libvorbis 1.1, bitrate management is implemented using a |
|---|
| 140 | <em>bit-reservoir</em> algorithm. The encoder has a fixed-size |
|---|
| 141 | reservoir used as a 'savings account' in encoding. When a frame is |
|---|
| 142 | smaller than the target rate, the unused bits go into the reservoir so |
|---|
| 143 | that they may be used by future frames. When a frame is larger than |
|---|
| 144 | target bitrate, it draws 'banked' bits out of the reservoir. Encoding |
|---|
| 145 | is managed so that the reservoir never goes negative (when a maximum |
|---|
| 146 | bitrate is specified) or fills beyond a fixed limit (when a minimum |
|---|
| 147 | bitrate is specified). An 'average bitrate' request is used as the |
|---|
| 148 | set-point in a long-range bitrate tracker which adjusts the encoder's |
|---|
| 149 | aggressiveness up or down depending on whether frames are coming |
|---|
| 150 | in larger or smaller than the requested average. |
|---|
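<p>The 'savings account' behaviour described above can be illustrated with a toy model. This is only a sketch, not libvorbis code: the <tt>toy_reservoir</tt> type and <tt>toy_reservoir_frame()</tt> function are hypothetical names, and the model assumes that both a minimum and a maximum bitrate are set to the same target, so the reservoir may neither go negative nor exceed its capacity.</p>

```c
#include <assert.h>

/* Hypothetical toy model of a bit reservoir; not part of libvorbis. */
typedef struct {
    long capacity; /* reservoir size in bits */
    long fill;     /* bits currently banked, always 0..capacity */
} toy_reservoir;

/*
 * Account for one frame. target_bits is the per-frame budget implied by
 * the requested bitrate; desired_bits is what the encoder would like to
 * spend. Returns the number of bits the frame is allowed to use.
 */
long toy_reservoir_frame(toy_reservoir *r, long target_bits, long desired_bits)
{
    assert(r->capacity >= 0 && r->fill >= 0 && r->fill <= r->capacity);

    long granted = desired_bits;
    /* A small frame banks its unused bits; a large frame withdraws bits. */
    long fill = r->fill + target_bits - desired_bits;

    if (fill < 0) {
        /* Would underflow (maximum constraint): shrink the frame. */
        granted = desired_bits + fill; /* fill is negative here */
        fill = 0;
    } else if (fill > r->capacity) {
        /* Would overflow (minimum constraint): enlarge the frame. */
        granted = desired_bits + (fill - r->capacity);
        fill = r->capacity;
    }

    r->fill = fill;
    return granted;
}
```

<p>In this sketch a frame that fits within the banked bits is granted unchanged; only when the reservoir would underflow or overflow is the frame size adjusted, which corresponds to the point at which the real encoder has to change its psychoacoustic aggressiveness.</p>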
| 151 | |
|---|
| 152 | <p> |
|---|
| 153 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
|---|
| 154 | <tr bgcolor=#cccccc> |
|---|
| 155 | <td><b>parameter</b></td> |
|---|
| 156 | <td><b>description</b></td> |
|---|
| 157 | </tr> |
|---|
| 158 | <tr valign=top> |
|---|
| 159 | <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits |
|---|
| 160 | per second. If the bitrate would otherwise rise such that oversized |
|---|
| 161 | frames would underflow the bit-reservoir by consuming banked bits, |
|---|
| 162 | bitrate management will force the encoder to use fewer bits per frame |
|---|
| 163 | by encoding with a more aggressive psychoacoustic model.<p> This |
|---|
| 164 | setting is a hard limit; the bitstream will never be allowed, under |
|---|
| 165 | any circumstances, to increase above the specified bitrate over the |
|---|
| 166 | average period set by the reservoir; it may momentarily rise over if |
|---|
| 167 | inspected on a granularity much finer than the average period across |
|---|
| 168 | the reservoir. Normally, the encoder will conserve bits gracefully by |
|---|
| 169 | using more aggressive psychoacoustics to shrink a frame when forced |
|---|
| 170 | to. However, if the encoder runs out of means of gracefully shrinking |
|---|
| 171 | a frame, it will simply take the smallest frame it can otherwise |
|---|
| 172 | generate and truncate it to the maximum allowed length. Note that |
|---|
| 173 | this is not an error and although it will obviously adversely affect |
|---|
| 174 | audio quality, a Vorbis decoder will be able to decode a truncated |
|---|
| 175 | frame into audio. |
|---|
| 176 | |
|---|
| 177 | </td> |
|---|
| 178 | </tr> |
|---|
| 179 | |
|---|
| 180 | <tr valign=top> |
|---|
| 181 | <td>average bitrate</td> |
|---|
| 182 | |
|---|
| 183 | <td> |
|---|
| 184 | |
|---|
| 185 | The average desired bitrate of a stream, set |
|---|
| 186 | in bits per second. Average bitrate is tracked via a reservoir like |
|---|
| 187 | minimum and maximum bitrate; however, the averaging reservoir does not |
|---|
| 188 | impose a hard limit; it is used to nudge the bitrate toward the |
|---|
| 189 | desired average by slowly adjusting the psychoacoustic aggressiveness. |
|---|
| 190 | As such, the reservoir size does not affect the average bitrate |
|---|
| 191 | behavior. Because this setting alone is not used to impose hard |
|---|
| 192 | bitrate limits, the bitrate of a stream produced using only the |
|---|
| 193 | <tt>average bitrate</tt> constraint will track the average over time |
|---|
| 194 | but not necessarily adhere strictly to that average for any given |
|---|
| 195 | period. Should a strict localized average be required, <tt>average |
|---|
| 196 | bitrate</tt> should be used along with <tt>minimum bitrate</tt> and |
|---|
| 197 | <tt>maximum bitrate</tt>. |
|---|
| 198 | </td> |
|---|
| 199 | |
|---|
| 200 | </tr> |
|---|
| 201 | |
|---|
| 202 | <tr valign=top> |
|---|
| 203 | <td>minimum bitrate</td> |
|---|
| 204 | <td> |
|---|
| 205 | The minimum allowed bitrate, set in bits per second. If |
|---|
| 206 | the bitrate would otherwise fall such that undersized frames would |
|---|
| 207 | overflow the bit-reservoir with unused bits, bitrate management will |
|---|
| 208 | force the encoder to use more bits per frame by encoding with a less |
|---|
| 209 | aggressive psychoacoustic model.<p> This setting is a hard limit; the |
|---|
| 210 | bitstream will never be allowed, under any circumstances, to drop |
|---|
| 211 | below the specified bitrate over the average period set by the |
|---|
| 212 | reservoir; it may momentarily fall under if inspected on a granularity |
|---|
| 213 | much finer than the average period across the reservoir. Normally, |
|---|
| 214 | the encoder will fill out undersized frames with additional useful |
|---|
| 215 | coding information by increasing the perceived quality of the stream. |
|---|
| 216 | If the encoder runs out of useful ways to consume more bits, it will |
|---|
| 217 | pad frames out with zeroes. |
|---|
| 218 | </td> |
|---|
| 219 | </tr> |
|---|
| 220 | |
|---|
| 221 | <tr valign=top> |
|---|
| 222 | <td>reservoir size</td> <td> The size of the minimum/maximum bitrate |
|---|
| 223 | tracking reservoir, set in bits. The reservoir is used as a 'bit |
|---|
| 224 | bank' to average out localized surges and dips in bitrate while |
|---|
| 225 | providing predictable, guaranteed buffering behavior for streams to be |
|---|
| 226 | used in situations with constrained transport bandwidth. The default |
|---|
| 227 | setting is two seconds of average bitrate.<p> |
|---|
| 228 | |
|---|
| 229 | When a single frame is larger than the maximum allowed overall |
|---|
| 230 | bitrate, the bits are 'borrowed' from the bitrate reservoir; if the |
|---|
| 231 | reservoir contains insufficient bits to cover the deficit, the encoder |
|---|
| 232 | must find some way to reduce the frame size. <p> |
|---|
| 233 | |
|---|
| 234 | When a frame is under the minimum limit, the surplus bits are placed |
|---|
| 235 | into the reservoir, banking them for future use. If the reservoir is |
|---|
| 236 | already full of banked bits, the encoder is forced to find some way to |
|---|
| 237 | make the frame larger.<p> |
|---|
| 238 | |
|---|
| 239 | If the frame size is between the minimum and maximum rates (thus |
|---|
| 240 | implying the minimum and maximum allowed rates are different), the |
|---|
| 241 | reservoir gravitates toward a fill point configured by the |
|---|
| 242 | <tt>reservoir bias</tt> setting described next. If the reservoir is |
|---|
| 243 | fuller than the fill point (a 'surplus of surplus'), the encoder will |
|---|
| 244 | consume a number of bits from the reservoir equal to the number of |
|---|
| 245 | bits by which the frame exceeds minimum size. If the reservoir is |
|---|
| 246 | emptier than the fill point (a 'surplus of deficit'), bits are returned |
|---|
| 247 | to the reservoir equaling the current frame's number of bits under the |
|---|
| 248 | maximum frame size. The idea of the fill point is to buffer against |
|---|
| 249 | both underruns and overruns, by trying to hold the reservoir to a |
|---|
| 250 | middle course. |
|---|
| 251 | </td> |
|---|
| 252 | </tr> |
|---|
| 253 | |
|---|
| 254 | <tr valign=top> |
|---|
| 255 | <td>reservoir bias</td> |
|---|
| 256 | |
|---|
| 257 | <td> |
|---|
| 258 | |
|---|
| 259 | Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate |
|---|
| 260 | management toward smoothing bitrate spikes (0.0) or bitrate dips |
|---|
| 261 | (1.0); the default setting is 0.1.<p> |
|---|
| 262 | |
|---|
| 263 | Using settings toward 0.0 causes the bitrate manager to hoard bits in |
|---|
| 264 | the bit reservoir such that there is a large pool of banked surplus to |
|---|
| 265 | draw upon during short spikes in bitrate. As a result, the encoder |
|---|
| 266 | will react less aggressively and less drastically to curtail frame size |
|---|
| 267 | during brief surges in bitrate.<p> |
|---|
| 268 | |
|---|
| 269 | Using settings toward 1.0 causes the bitrate manager to empty the bit |
|---|
| 270 | reservoir such that there is a large buffer available to store surplus |
|---|
| 271 | bits during sudden drops in bitrate. As a result, the encoder will |
|---|
| 272 | react less aggressively and less drastically to support minimum frame |
|---|
| 273 | sizes during drops in bitrate and will tend not to store any extra |
|---|
| 274 | bits in the reservoir for future bitrate spikes.<p> |
|---|
| 275 | |
|---|
| 276 | </td> |
|---|
| 277 | </tr> |
|---|
| 278 | |
|---|
| 279 | <tr valign=top> |
|---|
| 280 | <td>average track damping</td> |
|---|
| 281 | <td> |
|---|
| 282 | |
|---|
| 283 | A decimal value, in seconds, that controls how quickly the average |
|---|
| 284 | bitrate tracker is allowed to slew from enforcing minimum frame sizes |
|---|
| 285 | to maximum frame sizes and vice versa. Default value is 1.5 |
|---|
| 286 | seconds.<p> |
|---|
| 287 | |
|---|
| 288 | When the 'average bitrate' setting is in use, the average bitrate |
|---|
| 289 | tracker uses an unbounded reservoir to track overall bitrate-to-date |
|---|
| 290 | in the stream. When bitrates are too low, the tracker will try to |
|---|
| 291 | nudge bitrates up and when the bitrate is too high, nudge it down. |
|---|
| 292 | The damping value regulates the maximum strength of the nudge; it |
|---|
| 293 | describes, in seconds, how quickly the tracker may transition from an |
|---|
| 294 | extreme nudge in one direction to an extreme nudge in the other.<p> |
|---|
| 295 | |
|---|
| 296 | </td> |
|---|
| 297 | </tr> |
|---|
| 298 | |
|---|
| 299 | </table> |
|---|
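<p>As a worked example of the reservoir arithmetic in the table above, the sketch below computes the default reservoir size (two seconds of average bitrate) and a fill point for a given reservoir bias. The function names are hypothetical and are not part of libvorbis. The endpoints follow the descriptions above (a bias of 0.0 keeps the reservoir full of banked bits, a bias of 1.0 keeps it empty); the linear interpolation between them is an assumption made for illustration only.</p>

```c
#include <assert.h>

/* Default reservoir size: two seconds of average bitrate, in bits. */
long toy_default_reservoir_bits(long average_bitrate)
{
    assert(average_bitrate >= 0);
    return 2 * average_bitrate;
}

/*
 * Fill point expressed as banked bits. ASSUMPTION: linear interpolation
 * between a full reservoir (bias 0.0) and an empty one (bias 1.0).
 */
long toy_banked_fill_point(long reservoir_bits, double bias)
{
    assert(reservoir_bits >= 0);
    assert(bias >= 0.0 && bias <= 1.0);
    /* Adding 0.5 rounds to the nearest whole bit before truncation. */
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}
```

<p>Under these assumptions, a 128000 bits-per-second average gives a default reservoir of 256000 bits, and the default bias of 0.1 places the fill point at 230400 banked bits.</p>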
| 300 | |
|---|
| 301 | <h3>encoding model adjustments</h3> |
|---|
| 302 | |
|---|
| 303 | The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides |
|---|
| 304 | a generalized interface for making encoding setup adjustments to the |
|---|
| 305 | basic high-level setup provided by <a |
|---|
| 306 | href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a |
|---|
| 307 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>. |
|---|
| 308 | In reality, these two calls use <a |
|---|
| 309 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a |
|---|
| 310 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust |
|---|
| 311 | most of the parameters set by other calls.<p> |
|---|
| 312 | |
|---|
| 313 | In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can |
|---|
| 314 | adjust the following additional parameters not described elsewhere: |
|---|
| 315 | |
|---|
| 316 | <p> |
|---|
| 317 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
|---|
| 318 | <tr bgcolor=#cccccc> |
|---|
| 319 | <td><b>parameter</b></td> |
|---|
| 320 | <td><b>description</b></td> |
|---|
| 321 | </tr> |
|---|
| 322 | <tr valign=top> |
|---|
| 323 | <td>management mode</td> <td> Configures whether bitrate |
|---|
| 324 | management is in use. Normally, this value is set implicitly |
|---|
| 325 | during encoding setup; however, the supported means of selecting a |
|---|
| 326 | quality mode by bitrate (that is, requesting a true VBR stream, but |
|---|
| 327 | doing so by asking for an approximate bitrate) is to use <a |
|---|
| 328 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> |
|---|
| 329 | and then to explicitly turn off bitrate management by calling <a |
|---|
| 330 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a |
|---|
| 331 | href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>. |
|---|
| 332 | </td> |
|---|
| 333 | </tr> |
|---|
| 334 | |
|---|
| 335 | <tr valign=top> |
|---|
| 336 | <td>coupling</td> <td> Stereo streams (and, in the future, surround |
|---|
| 337 | streams) are normally encoded assuming the channels form a stereo |
|---|
| 338 | image and that lossy-stereo modelling is appropriate; this is called |
|---|
| 339 | 'coupling'. Stereo coupling may be explicitly enabled or disabled. |
|---|
| 340 | </td> |
|---|
| 341 | </tr> |
|---|
| 342 | <tr valign=top> |
|---|
| 343 | <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode; |
|---|
| 344 | this may be used to conserve a few bits in high-rate audio that has |
|---|
| 345 | limited bandwidth, or in testing of the encoder's acoustic model. The |
|---|
| 346 | encoder is generally already configured with ideal lowpasses (if any |
|---|
| 347 | at all) for given modes; use of this parameter is strongly discouraged |
|---|
| 348 | if the point is to try to 'improve' a given encoding mode for general |
|---|
| 349 | encoding. |
|---|
| 350 | </td> |
|---|
| 351 | </tr> |
|---|
| 352 | |
|---|
| 353 | <tr valign=top> |
|---|
| 354 | <td>impulse coding aggressiveness</td> <td>By default, libvorbis |
|---|
| 355 | attempts to compromise between preventing wide bitrate swings and |
|---|
| 356 | high-resolution impulse coding (which is required for the crispest |
|---|
| 357 | possible attacks, but also requires a relatively large momentary |
|---|
| 358 | bitrate increase). This parameter allows an application to tune the |
|---|
| 359 | compromise or eliminate it; a value of 0.0 indicates normal behavior |
|---|
| 360 | while a value of -15.0 requests maximum possible impulse |
|---|
| 361 | resolution.</td> |
|---|
| 362 | </tr> |
|---|
| 363 | |
|---|
| 364 | </table> |
|---|
| 365 | |
|---|
| 366 | |
|---|
| 367 | <br><br> |
|---|
| 368 | <hr noshade> |
|---|
| 369 | <table border=0 width=100%> |
|---|
| 370 | <tr valign=top> |
|---|
| 371 | <td><p class=tiny>copyright © 2004 Vorbis team</p></td> |
|---|
| 372 | <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td> |
|---|
| 373 | </tr><tr> |
|---|
| 374 | <td><p class=tiny>libvorbisenc documentation</p></td> |
|---|
| 375 | <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td> |
|---|
| 376 | </tr> |
|---|
| 377 | </table> |
|---|
| 378 | |
|---|
| 379 | </body> |
|---|
| 380 | |
|---|
| 381 | </html> |
|---|
| 382 | |
|---|