| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" | 
|---|
| 2 | "http://www.w3.org/TR/html4/loose.dtd"> | 
|---|
| 3 |  | 
|---|
| 4 | <html> | 
|---|
| 5 | <head> | 
|---|
| 6 |   <meta http-equiv="Content-Language" content="en-us"> | 
|---|
| 7 |   <meta http-equiv="Content-Type" content="text/html; charset=us-ascii"> | 
|---|
| 8 |   <link rel="stylesheet" type="text/css" href="../../../../boost.css"> | 
|---|
| 9 |  | 
|---|
| 10 |   <title>Rounding Policies</title> | 
|---|
| 11 | </head> | 
|---|
| 12 |  | 
|---|
| 13 | <body lang="en"> | 
|---|
| 14 |   <h1>Rounding Policies</h1> | 
|---|
| 15 |  | 
|---|
| 16 |   <p>In order to be as general as possible, the library uses a class to | 
|---|
| 17 |   compute all the necessary functions rounded upward or downward. This class | 
|---|
| 18 |   is the first parameter of <code>policies</code>, it is also the type named | 
|---|
| 19 |   <code>rounding</code> in the policy definition of | 
|---|
| 20 |   <code>interval</code>.</p> | 
|---|
| 21 |  | 
|---|
| 22 |   <p>By default, it is <code>interval_lib::rounded_math<T></code>. The | 
|---|
| 23 |   class <code>interval_lib::rounded_math</code> is already specialized for | 
|---|
| 24 |   the standard floating types (<code>float</code> , <code>double</code> and | 
|---|
| 25 |   <code>long double</code>). So if the base type of your intervals is not one | 
|---|
| 26 |   of these, a good solution would probably be to provide a specialization of | 
|---|
| 27 |   this class. But if the default specialization of | 
|---|
| 28 |   <code>rounded_math<T></code> for <code>float</code>, | 
|---|
| 29 |   <code>double</code>, or <code>long double</code> is not what you seek, or | 
|---|
| 30 |   you do not want to specialize | 
|---|
| 31 |   <code>interval_lib::rounded_math<T></code> (say because you prefer to | 
|---|
| 32 |   work in your own namespace) you can also define your own rounding policy | 
|---|
| 33 |   and pass it directly to <code>interval_lib::policies</code>.</p> | 
|---|
| 34 |  | 
|---|
| 35 |   <h2>Requirements</h2> | 
|---|
| 36 |  | 
|---|
| 37 |   <p>Here comes what the class is supposed to provide. The domains are | 
|---|
| 38 |   written next to their respective functions (as you can see, the functions | 
|---|
| 39 |   do not have to worry about invalid values, but they have to handle infinite | 
|---|
| 40 |   arguments).</p> | 
|---|
| 41 |   <pre> | 
|---|
| 42 | /* Rounding requirements */ | 
|---|
| 43 | struct rounding { | 
|---|
| 44 |   // defaut constructor, destructor | 
|---|
| 45 |   rounding(); | 
|---|
| 46 |   ~rounding(); | 
|---|
| 47 |   // mathematical operations | 
|---|
| 48 |   T add_down(T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 49 |   T add_up  (T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 50 |   T sub_down(T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 51 |   T sub_up  (T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 52 |   T mul_down(T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 53 |   T mul_up  (T, T); // [-∞;+∞][-∞;+∞] | 
|---|
| 54 |   T div_down(T, T); // [-∞;+∞]([-∞;+∞]-{0}) | 
|---|
| 55 |   T div_up  (T, T); // [-∞;+∞]([-∞;+∞]-{0}) | 
|---|
| 56 |   T sqrt_down(T);   // ]0;+∞] | 
|---|
| 57 |   T sqrt_up  (T);   // ]0;+∞] | 
|---|
| 58 |   T exp_down(T);    // [-∞;+∞] | 
|---|
| 59 |   T exp_up  (T);    // [-∞;+∞] | 
|---|
| 60 |   T log_down(T);    // ]0;+∞] | 
|---|
| 61 |   T log_up  (T);    // ]0;+∞] | 
|---|
| 62 |   T cos_down(T);    // [0;2π] | 
|---|
| 63 |   T cos_up  (T);    // [0;2π] | 
|---|
| 64 |   T tan_down(T);    // ]-π/2;π/2[ | 
|---|
| 65 |   T tan_up  (T);    // ]-π/2;π/2[ | 
|---|
| 66 |   T asin_down(T);   // [-1;1] | 
|---|
| 67 |   T asin_up  (T);   // [-1;1] | 
|---|
| 68 |   T acos_down(T);   // [-1;1] | 
|---|
| 69 |   T acos_up  (T);   // [-1;1] | 
|---|
| 70 |   T atan_down(T);   // [-∞;+∞] | 
|---|
| 71 |   T atan_up  (T);   // [-∞;+∞] | 
|---|
| 72 |   T sinh_down(T);   // [-∞;+∞] | 
|---|
| 73 |   T sinh_up  (T);   // [-∞;+∞] | 
|---|
| 74 |   T cosh_down(T);   // [-∞;+∞] | 
|---|
| 75 |   T cosh_up  (T);   // [-∞;+∞] | 
|---|
| 76 |   T tanh_down(T);   // [-∞;+∞] | 
|---|
| 77 |   T tanh_up  (T);   // [-∞;+∞] | 
|---|
| 78 |   T asinh_down(T);  // [-∞;+∞] | 
|---|
| 79 |   T asinh_up  (T);  // [-∞;+∞] | 
|---|
| 80 |   T acosh_down(T);  // [1;+∞] | 
|---|
| 81 |   T acosh_up  (T);  // [1;+∞] | 
|---|
| 82 |   T atanh_down(T);  // [-1;1] | 
|---|
| 83 |   T atanh_up  (T);  // [-1;1]  | 
|---|
| 84 |   T median(T, T);   // [-∞;+∞][-∞;+∞] | 
|---|
| 85 |   T int_down(T);    // [-∞;+∞] | 
|---|
| 86 |   T int_up  (T);    // [-∞;+∞] | 
|---|
| 87 |   // conversion functions | 
|---|
| 88 |   T conv_down(U); | 
|---|
| 89 |   T conv_up  (U); | 
|---|
| 90 |   // unprotected rounding class | 
|---|
| 91 |   typedef ... unprotected_rounding; | 
|---|
| 92 | }; | 
|---|
| 93 | </pre> | 
|---|
| 94 |  | 
|---|
| 95 |   <p>The constructor and destructor of the rounding class have a very | 
|---|
| 96 |   important semantic requirement: they are responsible for setting and | 
|---|
| 97 |   resetting the rounding modes of the computation on T. For instance, if T is | 
|---|
| 98 |   a standard floating point type and floating point computation is performed | 
|---|
| 99 |   according to the Standard IEEE 754, the constructor can save the current | 
|---|
| 100 |   rounding state, each <code>_up</code> (resp. <code>_down</code>) function | 
|---|
| 101 |   will round up (resp. down), and the destructor will restore the saved | 
|---|
| 102 |   rounding state. Indeed this is the behavior of the default rounding | 
|---|
| 103 |   policy.</p> | 
|---|
| 104 |  | 
|---|
| 105 |   <p>The meaning of all the mathematical functions up until | 
|---|
| 106 |   <code>atanh_up</code> is clear: each function returns number representable | 
|---|
| 107 |   in the type <code>T</code> which is a lower bound (for <code>_down</code>) | 
|---|
| 108 |   or upper bound (for <code>_up</code>) on the true mathematical result of | 
|---|
| 109 |   the corresponding function. The function <code>median</code> computes the | 
|---|
| 110 |   average of its two arguments rounded to its nearest representable number. | 
|---|
| 111 |   The functions <code>int_down</code> and <code>int_up</code> compute the | 
|---|
| 112 |   nearest integer smaller or bigger than their argument. Finally, | 
|---|
| 113 |   <code>conv_down</code> and <code>conv_up</code> are responsible of the | 
|---|
| 114 |   conversions of values of other types to the base number type: the first one | 
|---|
| 115 |   must round down the value and the second one must round it up.</p> | 
|---|
| 116 |  | 
|---|
| 117 |   <p>The type <code>unprotected_rounding</code> allows to remove all | 
|---|
| 118 |   controls. For reasons why one might to do this, see the <a href= | 
|---|
| 119 |   "#Protection">protection</a> paragraph below.</p> | 
|---|
| 120 |  | 
|---|
| 121 |   <h2>Overview of the provided classes</h2> | 
|---|
| 122 |  | 
|---|
| 123 |   <p>A lot of classes are provided. The classes are organized by level. At | 
|---|
| 124 |   the bottom is the class <code>rounding_control</code>. At the next level | 
|---|
| 125 |   come <code>rounded_arith_exact</code>, <code>rounded_arith_std</code> and | 
|---|
| 126 |   <code>rounded_arith_opp</code>. Then there are | 
|---|
| 127 |   <code>rounded_transc_dummy</code>, <code>rounded_transc_exact</code>, | 
|---|
| 128 |   <code>rounded_transc_std</code> and <code>rounded_transc_opp</code>. And | 
|---|
| 129 |   finally are <code>save_state</code> and <code>save_state_nothing</code>. | 
|---|
| 130 |   Each of these classes provide a set of members that are required by the | 
|---|
| 131 |   classes of the next level. For example, a <code>rounded_transc_...</code> | 
|---|
| 132 |   class needs the members of a <code>rounded_arith_...</code> class.</p> | 
|---|
| 133 |  | 
|---|
| 134 |   <p>When they exist in two versions <code>_std</code> and <code>_opp</code>, | 
|---|
| 135 |   the first one does switch the rounding mode each time, and the second one | 
|---|
| 136 |   tries to keep it oriented toward plus infinity. The main purpose of the | 
|---|
| 137 |   <code>_opp</code> version is to speed up the computations through the use | 
|---|
| 138 |   of the "opposite trick" (see the <a href="#perf">performance notes</a>). | 
|---|
| 139 |   This version requires the rounding mode to be upward before entering any | 
|---|
| 140 |   computation functions of the class. It guarantees that the rounding mode | 
|---|
| 141 |   will still be upward at the exit of the functions.</p> | 
|---|
| 142 |  | 
|---|
| 143 |   <p>Please note that it is really a very bad idea to mix the | 
|---|
| 144 |   <code>_opp</code> version with the <code>_std</code> since they do not have | 
|---|
| 145 |   compatible properties.</p> | 
|---|
| 146 |  | 
|---|
| 147 |   <p>There is a third version named <code>_exact</code> which computes the | 
|---|
| 148 |   functions without changing the rounding mode. It is an "exact" version | 
|---|
| 149 |   because it is intended for a base type that produces exact results.</p> | 
|---|
| 150 |  | 
|---|
| 151 |   <p>The last version is the <code>_dummy</code> version. It does not do any | 
|---|
| 152 |   computations but still produces compatible results.</p> | 
|---|
| 153 |  | 
|---|
| 154 |   <p>Please note that it is possible to use the "exact" version for an | 
|---|
| 155 |   inexact base type, e.g. <code>float</code> or <code>double</code>. In that | 
|---|
| 156 |   case, the inclusion property is no longer guaranteed, but this can be | 
|---|
| 157 |   useful to speed up the computation when the inclusion property is not | 
|---|
| 158 |   desired strictly. For instance, in computer graphics, a small error due to | 
|---|
| 159 |   floating-point roundoff is acceptable as long as an approximate version of | 
|---|
| 160 |   the inclusion property holds.</p> | 
|---|
| 161 |  | 
|---|
| 162 |   <p>Here comes what each class defines. Later, when they will be described | 
|---|
| 163 |   more thoroughly, these members will not be repeated. Please come back here | 
|---|
| 164 |   in order to see them. Inheritance is also used to avoid repetitions.</p> | 
|---|
| 165 |   <pre> | 
|---|
| 166 | template <class T> | 
|---|
| 167 | struct rounding_control | 
|---|
| 168 | { | 
|---|
| 169 |   typedef ... rounding_mode; | 
|---|
| 170 |   void set_rounding_mode(rounding_mode); | 
|---|
| 171 |   void get_rounding_mode(rounding_mode&); | 
|---|
| 172 |   void downward (); | 
|---|
| 173 |   void upward   (); | 
|---|
| 174 |   void to_nearest(); | 
|---|
| 175 |   T to_int(T); | 
|---|
| 176 |   T force_rounding(T); | 
|---|
| 177 | }; | 
|---|
| 178 |  | 
|---|
| 179 | template <class T, class Rounding> | 
|---|
| 180 | struct rounded_arith_... : Rounding | 
|---|
| 181 | { | 
|---|
| 182 |   void init(); | 
|---|
| 183 |   T add_down(T, T); | 
|---|
| 184 |   T add_up  (T, T); | 
|---|
| 185 |   T sub_down(T, T); | 
|---|
| 186 |   T sub_up  (T, T); | 
|---|
| 187 |   T mul_down(T, T); | 
|---|
| 188 |   T mul_up  (T, T); | 
|---|
| 189 |   T div_down(T, T); | 
|---|
| 190 |   T div_up  (T, T); | 
|---|
| 191 |   T sqrt_down(T); | 
|---|
| 192 |   T sqrt_up  (T); | 
|---|
| 193 |   T median(T, T); | 
|---|
| 194 |   T int_down(T); | 
|---|
| 195 |   T int_up  (T); | 
|---|
| 196 | }; | 
|---|
| 197 |  | 
|---|
| 198 | template <class T, class Rounding> | 
|---|
| 199 | struct rounded_transc_... : Rounding | 
|---|
| 200 | { | 
|---|
| 201 |   T exp_down(T); | 
|---|
| 202 |   T exp_up  (T); | 
|---|
| 203 |   T log_down(T); | 
|---|
| 204 |   T log_up  (T); | 
|---|
| 205 |   T cos_down(T); | 
|---|
| 206 |   T cos_up  (T); | 
|---|
| 207 |   T tan_down(T); | 
|---|
| 208 |   T tan_up  (T); | 
|---|
| 209 |   T asin_down(T); | 
|---|
| 210 |   T asin_up  (T); | 
|---|
| 211 |   T acos_down(T); | 
|---|
| 212 |   T acos_up  (T); | 
|---|
| 213 |   T atan_down(T); | 
|---|
| 214 |   T atan_up  (T); | 
|---|
| 215 |   T sinh_down(T); | 
|---|
| 216 |   T sinh_up  (T); | 
|---|
| 217 |   T cosh_down(T); | 
|---|
| 218 |   T cosh_up  (T); | 
|---|
| 219 |   T tanh_down(T); | 
|---|
| 220 |   T tanh_up  (T); | 
|---|
| 221 |   T asinh_down(T); | 
|---|
| 222 |   T asinh_up  (T); | 
|---|
| 223 |   T acosh_down(T); | 
|---|
| 224 |   T acosh_up  (T); | 
|---|
| 225 |   T atanh_down(T); | 
|---|
| 226 |   T atanh_up  (T); | 
|---|
| 227 | }; | 
|---|
| 228 |  | 
|---|
| 229 | template <class Rounding> | 
|---|
| 230 | struct save_state_... : Rounding | 
|---|
| 231 | { | 
|---|
| 232 |   save_state_...(); | 
|---|
| 233 |   ~save_state_...(); | 
|---|
| 234 |   typedef ... unprotected_rounding; | 
|---|
| 235 | }; | 
|---|
| 236 | </pre> | 
|---|
| 237 |  | 
|---|
| 238 |   <h2>Synopsis.</h2> | 
|---|
| 239 |   <pre> | 
|---|
| 240 | namespace boost { | 
|---|
| 241 | namespace numeric { | 
|---|
| 242 | namespace interval_lib { | 
|---|
| 243 |  | 
|---|
| 244 | <span style="color: #FF0000">/* basic rounding control */</span> | 
|---|
| 245 | template <class T>  struct rounding_control; | 
|---|
| 246 |  | 
|---|
| 247 | <span style="color: #FF0000">/* arithmetic functions rounding */</span> | 
|---|
| 248 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_exact; | 
|---|
| 249 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_std; | 
|---|
| 250 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_opp; | 
|---|
| 251 |  | 
|---|
| 252 | <span style="color: #FF0000">/* transcendental functions rounding */</span> | 
|---|
| 253 | template <class T, class Rounding> struct rounded_transc_dummy; | 
|---|
| 254 | template <class T, class Rounding = rounded_arith_exact<T> > struct rounded_transc_exact; | 
|---|
| 255 | template <class T, class Rounding = rounded_arith_std<T> > struct rounded_transc_std; | 
|---|
| 256 | template <class T, class Rounding = rounded_arith_opp<T> > struct rounded_transc_opp; | 
|---|
| 257 |  | 
|---|
| 258 | <span style="color: #FF0000">/* rounding-state-saving classes */</span> | 
|---|
| 259 | template <class Rounding> struct save_state; | 
|---|
| 260 | template <class Rounding> struct save_state_nothing; | 
|---|
| 261 |  | 
|---|
| 262 | <span style="color: #FF0000">/* default policy for type T */</span> | 
|---|
| 263 | template <class T>  struct rounded_math; | 
|---|
| 264 | template <>  struct rounded_math<float>; | 
|---|
| 265 | template <>  struct rounded_math<double>; | 
|---|
| 266 |  | 
|---|
| 267 | <span style= | 
|---|
| 268 | "color: #FF0000">/* some metaprogramming to convert a protected to unprotected rounding */</span> | 
|---|
| 269 | template <class I> struct unprotect; | 
|---|
| 270 |  | 
|---|
| 271 | } // namespace interval_lib | 
|---|
| 272 | } // namespace numeric | 
|---|
| 273 | } // namespace boost | 
|---|
| 274 | </pre> | 
|---|
| 275 |  | 
|---|
| 276 |   <h2>Description of the provided classes</h2> | 
|---|
| 277 |  | 
|---|
| 278 |   <p>We now describe each class in the order they appear in the definition of | 
|---|
| 279 |   a rounding policy (this outermost-to-innermost order is the reverse order | 
|---|
| 280 |   from the synopsis).</p> | 
|---|
| 281 |  | 
|---|
| 282 |   <h3 id="Protection">Protection control</h3> | 
|---|
| 283 |  | 
|---|
| 284 |   <p>Protection refers to the fact that the interval operations will be | 
|---|
| 285 |   surrounded by rounding mode controls. Unprotecting a class means to remove | 
|---|
| 286 |   all the rounding controls. Each rounding policy provides a type | 
|---|
| 287 |   <code>unprotected_rounding</code>. The required type | 
|---|
| 288 |   <code>unprotected_rounding</code> gives another rounding class that enables | 
|---|
| 289 |   to work when nested inside rounding. For example, the first three lines | 
|---|
| 290 |   below should all produce the same result (because the first operation is | 
|---|
| 291 |   the rounding constructor, and the last is its destructor, which take care | 
|---|
| 292 |   of setting the rounding modes); and the last line is allowed to have an | 
|---|
| 293 |   undefined behavior (since no rounding constructor or destructor is ever | 
|---|
| 294 |   called).</p> | 
|---|
| 295 |   <pre> | 
|---|
| 296 | T c; { rounding rnd; c = rnd.add_down(a, b); } | 
|---|
| 297 | T c; { rounding rnd1; { rounding rnd2; c = rnd2.add_down(a, b); } } | 
|---|
| 298 | T c; { rounding rnd1; { rounding::unprotected_rounding rnd2; c = rnd2.add_down(a, b); } } | 
|---|
| 299 | T d; { rounding::unprotected_rounding rnd; d = rnd.add_down(a, b); } | 
|---|
| 300 | </pre> | 
|---|
| 301 |  | 
|---|
| 302 |   <p>Naturally <code>rounding::unprotected_rounding</code> may simply be | 
|---|
| 303 |   <code>rounding</code> itself. But it can improve performance if it is a | 
|---|
| 304 |   simplified version with empty constructor and destructor. In order to avoid | 
|---|
| 305 |   undefined behaviors, in the library, an object of type | 
|---|
| 306 |   <code>rounding::unprotected_rounding</code> is guaranteed to be created | 
|---|
| 307 |   only when an object of type <code>rounding</code> is already alive. See the | 
|---|
| 308 |   <a href="#perf">performance notes</a> for some additional details.</p> | 
|---|
| 309 |  | 
|---|
| 310 |   <p>The support library defines a metaprogramming class template | 
|---|
| 311 |   <code>unprotect</code> which takes an interval type <code>I</code> and | 
|---|
| 312 |   returns an interval type <code>unprotect<I>::type</code> where the | 
|---|
| 313 |   rounding policy has been unprotected. Some information about the types: | 
|---|
| 314 |   <code>interval<T, interval_lib::policies<Rounding, _> | 
|---|
| 315 |   >::traits_type::rounding</code> <b>is</b> the same type as | 
|---|
| 316 |   <code>Rounding</code>, and <code>unprotect<interval<T, | 
|---|
| 317 |   interval_lib::policies<Rounding, _> > >::type</code> <b>is</b> | 
|---|
| 318 |   the same type as <code>interval<T, | 
|---|
| 319 |   interval_lib::policies<Rounding::unprotected, _> ></code>.</p> | 
|---|
| 320 |  | 
|---|
| 321 |   <h3>State saving</h3> | 
|---|
| 322 |  | 
|---|
| 323 |   <p>First comes <code>save_state</code>. This class is responsible for | 
|---|
| 324 |   saving the current rounding mode and calling init in its constructor, and | 
|---|
| 325 |   for restoring the saved rounding mode in its destructor. This class also | 
|---|
| 326 |   defines the <code>unprotected_rounding</code> type.</p> | 
|---|
| 327 |  | 
|---|
| 328 |   <p>If the rounding mode does not require any state-saving or | 
|---|
| 329 |   initialization, <code>save_state_nothing</code> can be used instead of | 
|---|
| 330 |   <code>save_state</code>.</p> | 
|---|
| 331 |  | 
|---|
| 332 |   <h3>Transcendental functions</h3> | 
|---|
| 333 |  | 
|---|
| 334 |   <p>The classes <code>rounded_transc_exact</code>, | 
|---|
| 335 |   <code>rounded_transc_std</code> and <code>rounded_transc_opp</code> expect | 
|---|
| 336 |   the std namespace to provide the functions exp log cos tan acos asin atan | 
|---|
| 337 |   cosh sinh tanh acosh asinh atanh. For the <code>_std</code> and | 
|---|
| 338 |   <code>_opp</code> versions, all these functions should respect the current | 
|---|
| 339 |   rounding mode fixed by a call to downward or upward.</p> | 
|---|
| 340 |  | 
|---|
| 341 |   <p><strong>Please note:</strong> Unfortunately, the latter is rarely the | 
|---|
| 342 |   case. It is the reason why a class <code>rounded_transc_dummy</code> is | 
|---|
| 343 |   provided which does not depend on the functions from the std namespace. | 
|---|
| 344 |   There is no magic, however. The functions of | 
|---|
| 345 |   <code>rounded_transc_dummy</code> do not compute anything. They only return | 
|---|
| 346 |   valid values. For example, <code>cos_down</code> always returns -1. In this | 
|---|
| 347 |   way, we do verify the inclusion property for the default implementation, | 
|---|
| 348 |   even if this has strictly no value for the user. In order to have useful | 
|---|
| 349 |   values, another policy should be used explicitely, which will most likely | 
|---|
| 350 |   lead to a violation of the inclusion property. In this way, we ensure that | 
|---|
| 351 |   the violation is clearly pointed out to the user who then knows what he | 
|---|
| 352 |   stands against. This class could have been used as the default | 
|---|
| 353 |   transcendental rounding class, but it was decided it would be better for | 
|---|
| 354 |   the compilation to fail due to missing declarations rather than succeed | 
|---|
| 355 |   thanks to valid but unusable functions.</p> | 
|---|
| 356 |  | 
|---|
| 357 |   <h3>Basic arithmetic functions</h3> | 
|---|
| 358 |  | 
|---|
| 359 |   <p>The classes <code>rounded_arith_std</code> and | 
|---|
| 360 |   <code>rounded_arith_opp</code> expect the operators + - * / and the | 
|---|
| 361 |   function <code>std::sqrt</code> to respect the current rounding mode.</p> | 
|---|
| 362 |  | 
|---|
| 363 |   <p>The class <code>rounded_arith_exact</code> requires | 
|---|
| 364 |   <code>std::floor</code> and <code>std::ceil</code> to be defined since it | 
|---|
| 365 |   can not rely on <code>to_int</code>.</p> | 
|---|
| 366 |  | 
|---|
| 367 |   <h3>Rounding control</h3> | 
|---|
| 368 |  | 
|---|
| 369 |   <p>The functions defined by each of the previous classes did not need any | 
|---|
| 370 |   explanation. For example, the behavior of <code>add_down</code> is to | 
|---|
| 371 |   compute the sum of two numbers rounded downward. For | 
|---|
| 372 |   <code>rounding_control</code>, the situation is a bit more complex.</p> | 
|---|
| 373 |  | 
|---|
| 374 |   <p>The basic function is <code>force_rounding</code> which returns its | 
|---|
| 375 |   argument correctly rounded accordingly to the current rounding mode if it | 
|---|
| 376 |   was not already the case. This function is necessary to handle delayed | 
|---|
| 377 |   rounding. Indeed, depending on the way the computations are done, the | 
|---|
| 378 |   intermediate results may be internaly stored in a more precise format and | 
|---|
| 379 |   it can lead to a wrong rounding. So the function enforces the rounding. | 
|---|
| 380 |   <a href="#extreg">Here</a> is an example of what happens when the rounding | 
|---|
| 381 |   is not enforced.</p> | 
|---|
| 382 |  | 
|---|
| 383 |   <p>The function <code>get_rounding_mode</code> returns the current rounding | 
|---|
| 384 |   mode, <code>set_rounding_mode</code> sets the rounding mode back to a | 
|---|
| 385 |   previous value returned by <code>get_rounding_mode</code>. | 
|---|
| 386 |   <code>downward</code>, <code>upward</code> and <code>to_nearest</code> sets | 
|---|
| 387 |   the rounding mode in one of the three directions. This rounding mode should | 
|---|
| 388 |   be global to all the functions that use the type <code>T</code>. For | 
|---|
| 389 |   example, after a call to <code>downward</code>, | 
|---|
| 390 |   <code>force_rounding(x+y)</code> is expected to return the sum rounded | 
|---|
| 391 |   toward -∞.</p> | 
|---|
| 392 |  | 
|---|
| 393 |   <p>The function <code>to_int</code> computes the nearest integer | 
|---|
| 394 |   accordingly to the current rounding mode.</p> | 
|---|
| 395 |  | 
|---|
| 396 |   <p>The non-specialized version of <code>rounding_control</code> does not do | 
|---|
| 397 |   anything. The functions for the rounding mode are empty, and | 
|---|
| 398 |   <code>to_int</code> and <code>force_rounding</code> are identity functions. | 
|---|
| 399 |   The <code>pi_</code> constant functions return suitable integers (for | 
|---|
| 400 |   example, <code>pi_up</code> returns <code>T(4)</code>).</p> | 
|---|
| 401 |  | 
|---|
| 402 |   <p>The class template <code>rounding_control</code> is specialized for | 
|---|
| 403 |   <code>float</code>, <code>double</code> and <code>long double</code> in | 
|---|
| 404 |   order to best use the floating point unit of the computer.</p> | 
|---|
| 405 |  | 
|---|
| 406 |   <h2>Template class <tt>rounded_math</tt></h2> | 
|---|
| 407 |  | 
|---|
| 408 |   <p>The default policy (aka <code>rounded_math<T></code>) is simply | 
|---|
| 409 |   defined as:</p> | 
|---|
| 410 |   <pre> | 
|---|
| 411 | template <class T> struct rounded_math<T> : save_state_nothing<rounded_arith_exact<T> > {}; | 
|---|
| 412 | </pre> | 
|---|
| 413 |  | 
|---|
| 414 |   <p>and the specializations for <code>float</code>, <code>double</code> and | 
|---|
| 415 |   <code>long double</code> use <code>rounded_arith_opp</code>, as in:</p> | 
|---|
| 416 |   <pre> | 
|---|
| 417 | template <> struct rounded_math<float>       : save_state<rounded_arith_opp<float> >       {}; | 
|---|
| 418 | template <> struct rounded_math<double>      : save_state<rounded_arith_opp<double> >      {}; | 
|---|
| 419 | template <> struct rounded_math<long double> : save_state<rounded_arith_opp<long double> > {}; | 
|---|
| 420 | </pre> | 
|---|
| 421 |  | 
|---|
| 422 |   <h2 id="perf">Performance Issues</h2> | 
|---|
| 423 |  | 
|---|
| 424 |   <p>This paragraph deals mostly with the performance of the library with | 
|---|
| 425 |   intervals using the floating-point unit (FPU) of the computer. Let's | 
|---|
| 426 |   consider the sum of [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] as an | 
|---|
| 427 |   example. The result is [<code>down</code>(<i>a</i>+<i>c</i>), | 
|---|
| 428 |   <code>up</code>(<i>b</i>+<i>d</i>)], where <code>down</code> and | 
|---|
| 429 |   <code>up</code> indicate the rounding mode needed.</p> | 
|---|
| 430 |  | 
|---|
| 431 |   <h3>Rounding Mode Switch</h3> | 
|---|
| 432 |  | 
|---|
| 433 |   <p>If the FPU is able to use a different rounding mode for each operation, | 
|---|
| 434 |   there is no problem. For example, it's the case for the Alpha processor: | 
|---|
| 435 |   each floating-point instruction can specify a different rounding mode. | 
|---|
| 436 |   However, the IEEE-754 Standard does not require such a behavior. So most of | 
|---|
| 437 |   the FPUs only provide some instructions to set the rounding mode for all | 
|---|
| 438 |   subsequent operations. And generally, these instructions need to flush the | 
|---|
| 439 |   pipeline of the FPU.</p> | 
|---|
| 440 |  | 
|---|
| 441 |   <p>In this situation, the time needed to sum [<i>a</i>,<i>b</i>] and | 
|---|
| 442 |   [<i>c</i>,<i>d</i>] is far worse than the time needed to calculate | 
|---|
| 443 |   <i>a</i>+<i>b</i> and <i>c</i>+<i>d</i> since the two additions cannot be | 
|---|
| 444 |   parallelized. Consequently, the objective is to diminish the number of | 
|---|
| 445 |   rounding mode switches.</p> | 
|---|
| 446 |  | 
|---|
| 447 |   <p>If this library is not used to provide exact computations, but only for | 
|---|
| 448 |   pair arithmetic, the solution is quite simple: do not use rounding. In that | 
|---|
| 449 |   case, doing the sum [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] will be as | 
|---|
| 450 |   fast as computing <i>a</i>+<i>b</i> and <i>c</i>+<i>d</i>. Everything is | 
|---|
| 451 |   perfect.</p> | 
|---|
| 452 |  | 
|---|
| 453 |   <p>However, if exact computations are required, such a solution is totally | 
|---|
| 454 |   unthinkable. So, are we penniless? No, there is still a trick available. | 
|---|
| 455 |   Indeed, down(<i>a</i>+<i>c</i>) = -up(-<i>a</i>-<i>c</i>) if the unary | 
|---|
| 456 |   minus is an exact operation. It is now possible to calculate the whole sum | 
|---|
| 457 |   with the same rounding mode. Generally, the cost of the mode switching is | 
|---|
| 458 |   worse than the cost of the sign changes.</p> | 
|---|
| 459 |  | 
|---|
| 460 |   <h4>Speeding up consecutive operations</h4> | 
|---|
| 461 |  | 
|---|
| 462 |   <p>The interval addition is not the only operation; most of the interval | 
|---|
| 463 |   operations can be computed by setting the rounding direction of the FPU | 
|---|
| 464 |   only once. So the operations of the floating point rounding policy assume | 
|---|
| 465 |   that the direction is correctly set. This assumption is usually not true in | 
|---|
| 466 |   a program (the user and the standard library expect the rounding direction | 
|---|
| 467 |   to be to nearest), so these operations have to be enclosed in a shell that | 
|---|
| 468 |   sets the floating point environment. This protection is done by the | 
|---|
| 469 |   constructor and destructor of the rounding policy.</p> | 
|---|
| 470 |  | 
|---|
| 471 |   <p>Les us now consider the case of two consecutive interval additions: | 
|---|
| 472 |   [<i>a</i>,<i>b</i>] + [<i>c</i>,<i>d</i>] + [<i>e</i>,<i>f</i>]. The | 
|---|
| 473 |   generated code should look like:</p> | 
|---|
| 474 |   <pre> | 
|---|
| 475 | init_rounding_mode();    // rounding object construction during the first addition | 
|---|
| 476 | t1 = -(-a - c); | 
|---|
| 477 | t2 = b + d; | 
|---|
| 478 | restore_rounding_mode(); // rounding object destruction | 
|---|
| 479 | init_rounding_mode();    // rounding object construction during the second addition | 
|---|
| 480 | x = -(-t1 - e); | 
|---|
| 481 | y = t2 + f; | 
|---|
| 482 | restore_rounding_mode(); // rounding object destruction | 
|---|
| 483 | // the result is the interval [x,y] | 
|---|
| 484 | </pre> | 
|---|
| 485 |  | 
|---|
| 486 |   <p>Between the two operations, the rounding direction is restored, and then | 
|---|
| 487 |   initialized again. Ideally, compilers should be able to optimize this | 
|---|
| 488 |   useless code away. But unfortunately they are not, and this slows the code | 
|---|
| 489 |   down by an order of magnitude. In order to avoid this bottleneck, the user | 
|---|
| 490 |   can tell to the interval operations that they do not need to be protected | 
|---|
| 491 |   anymore. It will then be up to the user to protect the interval | 
|---|
| 492 |   computations. The compiler will then be able to generate such a code:</p> | 
|---|
| 493 |   <pre> | 
|---|
| 494 | init_rounding_mode();    // done by the user | 
|---|
| 495 | x = -(-a - c - e); | 
|---|
| 496 | y = b + d + f; | 
|---|
| 497 | restore_rounding_mode(); // done by the user | 
|---|
| 498 | </pre> | 
|---|
| 499 |  | 
|---|
| 500 |   <p>The user will have to create a rounding object. And as long as this | 
|---|
| 501 |   object is alive, unprotected versions of the interval operations can be | 
|---|
| 502 |   used. They are selected by using an interval type with a specific rounding | 
|---|
| 503 |   policy. If the initial interval type is <code>I</code>, then | 
|---|
| 504 |   <code>I::traits_type::rounding</code> is the type of the rounding object, | 
|---|
| 505 |   and <code>interval_lib::unprotect<I>::type</code> is the type of the | 
|---|
| 506 |   unprotected interval type.</p> | 
|---|
| 507 |  | 
|---|
| 508 |   <p>Because the rounding mode of the FPU is changed during the life of the | 
|---|
| 509 |   rounding object, any arithmetic floating point operation that does not | 
|---|
| 510 |   involve the interval library can lead to unexpected results. And | 
|---|
| 511 |   reciprocally, using unprotected interval operation when no rounding object | 
|---|
| 512 |   is alive will produce intervals that are not guaranteed anymore to contain | 
|---|
| 513 |   the real result.</p> | 
|---|
| 514 |  | 
|---|
| 515 |   <h4 id="perfexp">Example</h4> | 
|---|
| 516 |  | 
|---|
| 517 |   <p>Here is an example of Horner's scheme to compute the value of a polynom. | 
|---|
| 518 |   The rounding mode switches are disabled for the whole computation.</p> | 
|---|
| 519 |   <pre> | 
|---|
| 520 | // I is an interval class, the polynom is a simple array | 
|---|
| 521 | template<class I> | 
|---|
| 522 | I horner(const I& x, const I p[], int n) { | 
|---|
| 523 |  | 
|---|
| 524 |   // save and initialize the rounding mode | 
|---|
| 525 |   typename I::traits_type::rounding rnd; | 
|---|
| 526 |  | 
|---|
| 527 |   // define the unprotected version of the interval type | 
|---|
| 528 |   typedef typename boost::numeric::interval_lib::unprotect<I>::type R; | 
|---|
| 529 |  | 
|---|
| 530 |   const R& a = x; | 
|---|
| 531 |   R y = p[n - 1]; | 
|---|
| 532 |   for(int i = n - 2; i >= 0; i--) { | 
|---|
| 533 |     y = y * a + (const R&)(p[i]); | 
|---|
| 534 |   } | 
|---|
| 535 |   return y; | 
|---|
| 536 |  | 
|---|
| 537 |   // restore the rounding mode with the destruction of rnd | 
|---|
| 538 | } | 
|---|
| 539 | </pre> | 
|---|
| 540 |  | 
|---|
| 541 |   <p>Please note that a rounding object is specially created in order to | 
|---|
| 542 |   protect all the interval computations. Each interval of type I is converted | 
|---|
| 543 |   in an interval of type R before any operations. If this conversion is not | 
|---|
| 544 |   done, the result is still correct, but the interest of this whole | 
|---|
| 545 |   optimization has disappeared. Whenever possible, it is good to convert to | 
|---|
| 546 |   <code>const R&</code> instead of <code>R</code>: indeed, the function | 
|---|
| 547 |   could already be called inside an unprotection block so the types | 
|---|
| 548 |   <code>R</code> and <code>I</code> would be the same interval, no need for a | 
|---|
| 549 |   conversion.</p> | 
|---|
| 550 |  | 
|---|
| 551 |   <h4>Uninteresting remark</h4> | 
|---|
| 552 |  | 
|---|
| 553 |   <p>It was said at the beginning that the Alpha processors can use a | 
|---|
| 554 |   specific rounding mode for each operation. However, due to the instruction | 
|---|
| 555 |   format, the rounding toward plus infinity is not available. Only the | 
|---|
| 556 |   rounding toward minus infinity can be used. So the trick using the change | 
|---|
| 557 |   of sign becomes essential, but there is no need to save and restore the | 
|---|
| 558 |   rounding mode on both sides of an operation.</p> | 
|---|
| 559 |  | 
|---|
| 560 |   <h3 id="extreg">Extended Registers</h3> | 
|---|
| 561 |  | 
|---|
| 562 |   <p>There is another problem besides the cost of the rounding mode switch. | 
|---|
| 563 |   Some FPUs use extended registers (for example, float computations will be | 
|---|
| 564 |   done with double registers, or double computations with long double | 
|---|
| 565 |   registers). Consequently, many problems can arise.</p> | 
|---|
| 566 |  | 
|---|
| 567 |   <p>The first one is due to to the extended precision of the mantissa. The | 
|---|
| 568 |   rounding is also done on this extended precision. And consequently, we | 
|---|
| 569 |   still have down(<i>a</i>+<i>b</i>) = -up(-<i>a</i>-<i>b</i>) in the | 
|---|
| 570 |   extended registers. But back to the standard precision, we now have | 
|---|
| 571 |   down(<i>a</i>+<i>b</i>) < -up(-<i>a</i>-<i>b</i>) instead of an | 
|---|
| 572 |   equality. A solution could be not to use this method. But there still are | 
|---|
| 573 |   other problems, with the comparisons between numbers for example.</p> | 
|---|
| 574 |  | 
|---|
| 575 |   <p>Naturally, there is also a problem with the extended precision of the | 
|---|
| 576 |   exponent. To illustrate this problem, let <i>m</i> be the biggest number | 
|---|
| 577 |   before +<i>inf</i>. If we calculate 2*[<i>m</i>,<i>m</i>], the answer | 
|---|
| 578 |   should be [<i>m</i>,<i>inf</i>]. But due to the extended registers, the FPU | 
|---|
| 579 |   will first store [<i>2m</i>,<i>2m</i>] and then convert it to | 
|---|
| 580 |   [<i>inf</i>,<i>inf</i>] at the end of the calculus (when the rounding mode | 
|---|
| 581 |   is toward +<i>inf</i>). So the answer is no more accurate.</p> | 
|---|
| 582 |  | 
|---|
| 583 |   <p>There is only one solution: to force the FPU to convert the extended | 
|---|
| 584 |   values back to standard precision after each operation. Some FPUs provide | 
|---|
| 585 |   an instruction able to do this conversion (for example the PowerPC | 
|---|
| 586 |   processors). But for the FPUs that do not provide it (the x86 processors), | 
|---|
| 587 |   the only solution is to write the values to memory and read them back. Such | 
|---|
| 588 |   an operation is obviously very expensive.</p> | 
|---|
| 589 |  | 
|---|
| 590 |   <h2>Some Examples</h2> | 
|---|
| 591 |  | 
|---|
| 592 |   <p>Here come several cases:</p> | 
|---|
| 593 |  | 
|---|
| 594 |   <ul> | 
|---|
| 595 |     <li>if you need precise computations with the <code>float</code> or | 
|---|
| 596 |     <code>double</code> types, use the default | 
|---|
| 597 |     <code>rounded_math<T></code>;</li> | 
|---|
| 598 |  | 
|---|
| 599 |     <li>for fast wide intervals without any rounding nor precision, use | 
|---|
| 600 |     <code>save_state_nothing<rounded_transc_exact<T> | 
|---|
| 601 |     ></code>;</li> | 
|---|
| 602 |  | 
|---|
| 603 |     <li>for an exact type (like int or rational with a little help for | 
|---|
| 604 |     infinite and NaN values) without support for transcendental functions, | 
|---|
| 605 |     the solution could be | 
|---|
| 606 |     <code>save_state_nothing<rounded_transc_dummy<T, | 
|---|
| 607 |     rounded_arith_exact<T> > ></code> or directly | 
|---|
| 608 |     <code>save_state_nothing<rounded_arith_exact<T> | 
|---|
| 609 |     ></code>;</li> | 
|---|
| 610 |  | 
|---|
| 611 |     <li>if it is a more complex case than the previous ones, the best thing | 
|---|
| 612 |     is probably to directly define your own policy.</li> | 
|---|
| 613 |   </ul> | 
|---|
| 614 |   <hr> | 
|---|
| 615 |  | 
|---|
| 616 |   <p><a href="http://validator.w3.org/check?uri=referer"><img border="0" src= | 
|---|
| 617 |   "http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01 Transitional" | 
|---|
| 618 |   height="31" width="88"></a></p> | 
|---|
| 619 |  | 
|---|
| 620 |   <p>Revised  | 
|---|
| 621 |   <!--webbot bot="Timestamp" s-type="EDITED" s-format="%Y-%m-%d" startspan -->2006-12-24<!--webbot bot="Timestamp" endspan i-checksum="12172" --></p> | 
|---|
| 622 |  | 
|---|
| 623 |   <p><i>Copyright © 2002 Guillaume Melquiond, Sylvain Pion, Hervé | 
|---|
| 624 |   Brönnimann, Polytechnic University<br> | 
|---|
| 625 |   Copyright © 2004-2005 Guillaume Melquiond, ENS Lyon</i></p> | 
|---|
| 626 |  | 
|---|
| 627 |   <p><i>Distributed under the Boost Software License, Version 1.0. (See | 
|---|
| 628 |   accompanying file <a href="../../../../LICENSE_1_0.txt">LICENSE_1_0.txt</a> | 
|---|
| 629 |   or copy at <a href= | 
|---|
| 630 |   "http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</a>)</i></p> | 
|---|
| 631 | </body> | 
|---|
| 632 | </html> | 
|---|