| 1 | <title>Boost Filesystem FAQ</title> |
---|
| 2 | <body bgcolor="#FFFFFF"> |
---|
| 3 | |
---|
| 4 | <h1> |
---|
| 5 | <img border="0" src="../../../boost.png" align="center" width="277" height="86">Filesystem |
---|
| 6 | FAQ</h1> |
---|
| 7 | <p><b>Why base the generic-path string format on POSIX?</b></p> |
---|
| 8 | <p><a href="design.htm#POSIX-01">POSIX</a> is the basis for the most familiar path-string formats, including the |
---|
| 9 | URL portion of URI's and the native Windows format. It is ubiquitous and |
---|
| 10 | familiar. On many systems, it is very easy to implement because it is |
---|
| 11 | either the native operating system format (Unix and Windows) or is available via an |
---|
| 12 | operating-system-supplied |
---|
| 13 | POSIX library (z/OS, OS/390, and many more).</p> |
---|
| 14 | <p><b>Why not use a full URI (Universal Resource Identifier) based path?</b></p> |
---|
| 15 | <p><a href="design.htm#URI">URI's</a> would promise more than the Filesystem Library can actually deliver, |
---|
| 16 | since URI's extend far beyond what most operating systems consider a file or a |
---|
| 17 | directory. Thus for the primary "portable script-style file system |
---|
| 18 | operations" requirement of the Filesystem Library, full URI's appear to be over-specification.</p> |
---|
| 19 | <p><b>Why isn't <i>path</i> a base class with derived <i>directory_path</i> and |
---|
| 20 | <i>file_path</i> classes?</b></p> |
---|
| 21 | <p>Why bother? The behavior of all three classes is essentially identical. |
---|
| 22 | Several early versions did require users to identify each path as a file or |
---|
| 23 | directory path, and this seemed to increase errors and decrease code |
---|
| 24 | readability. There was no apparent upside benefit.</p> |
---|
| 25 | <p><b>Why are fully specified paths called <i>complete</i> rather than <i> |
---|
| 26 | <a name="absolute">absolute</a></i>?</b></p> |
---|
| 27 | <p>To avoid long-held assumptions (what do you mean, <i>"/foo"</i> isn't |
---|
| 28 | absolute on some systems?) by programmers used to single-rooted filesystems. |
---|
| 29 | Using an unfamiliar name for the concept and related functions causes |
---|
| 30 | programmers to read the specs rather than just assuming the meaning is known.</p> |
---|
| 31 | <p><b>Why do some function names have a "native_" prefix?</b></p> |
---|
| 32 | <p>To alert users that the results are inherently non-portable. The names are |
---|
| 33 | deliberately ugly to discourage use except where really necessary.</p> |
---|
| 34 | <p><b>Why not support a concept of specific kinds of file systems, such as |
---|
| 35 | posix_file_system or windows_file_system?</b></p> |
---|
| 36 | <p>Portability is one of the one or two most important requirements for the |
---|
| 37 | library. Gaining some advantage by using features specific to particular |
---|
| 38 | operating systems is not a requirement. There doesn't appear to be much need for |
---|
| 39 | the ability to manipulate, say, a classic Mac OS path while running on an |
---|
| 40 | OpenVMS machine.</p> |
---|
| 41 | <p>Furthermore, concepts like "posix_file_system" |
---|
| 42 | are very slippery. What happens when an NTFS or ISO 9660 file system is mounted |
---|
| 43 | in a directory on a machine running a POSIX-like operating system, for example?</p> |
---|
| 44 | <p><b>Why not supply a 'handle' type, and let the file and directory operations |
---|
| 45 | traffic in it?</b></p> |
---|
| 46 | <p>It isn't clear there is any feasible way to meet the "portable script-style |
---|
| 47 | file system operations" requirement with such a system. File systems exist where operations are usually performed on |
---|
| 48 | some non-string handle type. The classic Mac OS has been mentioned explicitly as a case where |
---|
| 49 | trafficking in paths isn't always natural. </p> |
---|
| 50 | <p>The case for the "handle" (opaque data type to identify a file) |
---|
| 51 | style may be strongest for the directory iterator value type. (See Jesse Jones' Jan 28, |
---|
| 52 | 2002, Boost postings). However, as class path has evolved, it seems sufficient |
---|
| 53 | even as the directory iterator value type.</p> |
---|
| 54 | <p><b>Why aren't directories considered to be files?</b></p> |
---|
| 55 | <p>Because |
---|
| 56 | directories cannot portably and usefully be opened as files using the C++ Standard Library stdio or fstream |
---|
| 57 | I/O facilities. An important additional rationale is that separating the concept |
---|
| 58 | of directories and files makes exposition and specification clearer. A |
---|
| 59 | particular problem is the naming and description of function arguments.</p> |
---|
| 60 | <div align="center"> |
---|
| 61 | <center> |
---|
| 62 | <table border="1" cellpadding="5" cellspacing="0"> |
---|
| 63 | <tr> |
---|
| 64 | <td colspan="3"> |
---|
| 65 | <p align="center"><b>Meaningful Names for Arguments</b></td> |
---|
| 66 | </tr> |
---|
| 67 | <tr> |
---|
| 68 | <td><b>Argument Intent</b></td> |
---|
| 69 | <td><b>Meaningful name if<br> |
---|
| 70 | directories are files</b></td> |
---|
| 71 | <td><b>Meaningful name if<br> |
---|
| 72 | directories aren't files</b></td> |
---|
| 73 | </tr> |
---|
| 74 | <tr> |
---|
| 75 | <td>A path to either a directory or a non-directory</td> |
---|
| 76 | <td align="center"><i>path</i></td> |
---|
| 77 | <td align="center"><i>path</i></td> |
---|
| 78 | </tr> |
---|
| 79 | <tr> |
---|
| 80 | <td>A path to a directory, but not to a non-directory</td> |
---|
| 81 | <td align="center"><i>directory_path</i></td> |
---|
| 82 | <td align="center"><i>directory_path</i></td> |
---|
| 83 | </tr> |
---|
| 84 | <tr> |
---|
| 85 | <td>A path to a non-directory, but not a directory</td> |
---|
| 86 | <td align="center"><i>non_directory_path</i></td> |
---|
| 87 | <td align="center"><i>file_path</i></td> |
---|
| 88 | </tr> |
---|
| 89 | </table> |
---|
| 90 | </center> |
---|
| 91 | </div> |
---|
| 92 | <p>The problem is that when directories are considered files, <i> |
---|
| 93 | non_directory_path</i> as an argument name, and the corresponding "non-directory |
---|
| 94 | path" in documentation, is ugly and lengthy, and so is shortened to just <i>path</i>, |
---|
| 95 | causing the code and documentation to be confusing if not downright wrong. The |
---|
| 96 | names which result from the "directories aren't files" approach are more |
---|
| 97 | acceptable and less likely to be used incorrectly. </p> |
---|
| 98 | <p><b>Why are the operations.hpp non-member functions so low-level?</b></p> |
---|
| 99 | <p>To provide a toolkit from which higher-level functionality can be created.</p> |
---|
| 100 | <p>An |
---|
| 101 | extended attempt to add convenience functions on top of, or as a replacement |
---|
| 102 | for, the low-level functionality failed because there is no widely acceptable |
---|
| 103 | set of simple semantics for most convenience functions considered. |
---|
| 104 | Attempts to provide alternate semantics, via either run-time options or |
---|
| 105 | compile-time policies, became overly complicated in relation to the value |
---|
| 106 | delivered, or became contentious. OTOH, the specific functionality needed for several trial |
---|
| 107 | applications was very easy for the user to construct from the lower-level |
---|
| 108 | toolkit functions. See <a href="design.htm#Abandoned_Designs">Failed Attempts</a>.</p> |
---|
| 109 | <p><b>Isn't it inconsistent then to provide a few convenience functions?</b></p> |
---|
| 110 | <p>Yes, but experience with this library, POSIX, and Windows indicates |
---|
| 111 | the utility of certain convenience functions, and that it is possible to provide |
---|
| 112 | simple, yet widely acceptable, semantics for them. For example, remove_all.</p> |
---|
| 113 | <p><b>Why are library functions so picky about errors?</b></p> |
---|
| 114 | <p>Safety. The default is to be safe rather than sorry. This is particularly |
---|
| 115 | important given the reality that on many computer systems files and directories |
---|
| 116 | are <a href="index.htm#Race-condition">globally shared</a> resources, and thus subject to |
---|
| 117 | unexpected errors.</p> |
---|
| 118 | <p><b>Why are errors reported by exception rather than return code or error |
---|
| 119 | notification variable?</b></p> |
---|
| 120 | <p>Safety. Return codes or error notification variables are often ignored |
---|
| 121 | by programmers. Exceptions are much harder to ignore, provide the desired |
---|
| 122 | default behavior (program termination) if not caught, yet allow error recovery |
---|
| 123 | if desired.</p> |
---|
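The contrast between the two reporting styles can be sketched as follows. This is an illustration only: it assumes C++17's `std::filesystem` (the standard-library facility that later grew out of this library) rather than the Boost headers, and the wrapper names `size_or_throw` and `size_with_code` are hypothetical.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Exception style: a failure propagates as fs::filesystem_error and
// terminates the program unless some caller catches it.
std::uintmax_t size_or_throw(const fs::path& p) {
    return fs::file_size(p);
}

// Error-code style: a failure is recorded in ec and the call returns
// normally, so nothing happens if the caller forgets to inspect ec.
std::uintmax_t size_with_code(const fs::path& p, std::error_code& ec) {
    return fs::file_size(p, ec);
}
```

With the second form, a caller that never looks at `ec` continues with a meaningless size; with the first form, the same mistake cannot go unnoticed.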
| 124 | <p><b>Why are attributes accessed via named functions rather than property maps?</b></p> |
---|
| 125 | <p>For a few commonly used attributes (existence, directory or file, emptiness), |
---|
| 126 | simple syntax and guaranteed presence outweigh other considerations. Because |
---|
| 127 | access to virtually all other attributes is inherently system dependent, |
---|
| 128 | property maps are viewed as the best hope for access and modification, but it is |
---|
| 129 | better design to provide such functionality in a separate library. (Historical |
---|
| 130 | note: even the apparently simple attribute "read-only" turned out to be so |
---|
| 131 | system dependent as to be disqualified as a "guaranteed presence" operation.)</p> |
---|
| 132 | <p><b>Why isn't there a set_current_directory function?</b></p> |
---|
| 133 | <p>Global variables are considered harmful [<a href="design.htm#Wulf-Shaw-73">wulf-shaw-73</a>]. |
---|
| 134 | While we can't prevent people from shooting themselves in the foot, we aren't |
---|
| 135 | about to hand them a loaded gun pointed right at their big toe.</p> |
---|
| 136 | <p><b>Why aren't there query functions for compound conditions like existing_directory?</b></p> |
---|
| 137 | <p>After several attempts, named queries for multi-attribute conditions proved a |
---|
| 138 | slippery slope; where do you stop?</p> |
---|
| 139 | <p><b>Why aren't <a name="wide-character_names">wide-character names</a> supported? Why not std::wstring or even |
---|
| 140 | a templated type?</b></p> |
---|
| 141 | <p>Wide-character names would provide an illusion of portability where |
---|
| 142 | portability does not in fact exist. Behavior would be completely different on |
---|
| 143 | operating systems (Windows, for example) that support wide-character names, than |
---|
| 144 | on systems which don't (POSIX). Providing functionality that appears to provide |
---|
| 145 | portability but in fact delivers only implementation-defined behavior is highly |
---|
| 146 | undesirable. Programs would not even be portable between library implementations |
---|
| 147 | on the same operating system, let alone portable to different operating systems.</p> |
---|
| 148 | <p>The C++ standards committee Library Working Group discussed this in some |
---|
| 149 | detail both on the committee's library reflector and at the Spring, 2002, <font face="Times New Roman"> meeting</font>, and feels that (1) names based on types other than char are |
---|
| 150 | extremely non-portable, (2) there are no agreed upon semantics for conversion |
---|
| 151 | between wide-character and narrow-character names for file systems which do not support |
---|
| 152 | wide-character names, and |
---|
| 153 | (3) even the committee members most interested in wide-character names are |
---|
| 154 | unsure that they are a good idea in the context of a portable library.</p> |
---|
| 155 | <p>[October, 2002 - PJ Plauger has suggested a locale based conversion |
---|
| 156 | scheme. Others have indicated support for such an experiment.]</p> |
---|
| 157 | <p><b>Why are file and directory name portability errors detected automatically |
---|
| 158 | when these aren't actually errors in some applications?</b></p> |
---|
| 159 | <p>For many uses, automatic portability error detection based on the generic |
---|
| 160 | path grammar is a sensible default. For cases where some other error check |
---|
| 161 | (including no check at all) is preferred for the entire application, |
---|
| 162 | functionality is provided to change the default. For cases where some other |
---|
| 163 | error check (including no check at all) is preferred for a particular |
---|
| 164 | path, the error check can be specified in the path constructor.</p> |
---|
| 165 | <p>The error checking functions can also be explicitly called. That provides |
---|
| 166 | yet another way to check for errors.</p> |
---|
| 167 | <p>The design makes error checking easy and automatic for common cases, yet |
---|
| 168 | provides explicit control when that is required.</p> |
---|
| 169 | <p><b>Why isn't more powerful name portability error detection provided, such as |
---|
| 170 | deferring checking until a path is actually used?</b></p> |
---|
| 171 | <p>A number (at least six) of designs for name validity error |
---|
| 172 | detection were evaluated, including at least four complete implementations. |
---|
| 173 | While the details for rejection differed, all of the more powerful name validity checking |
---|
| 174 | designs distorted |
---|
| 175 | otherwise simple aspects of the library. While name checking can be helpful, it |
---|
| 176 | isn't important enough to justify adding a lot of additional complexity.</p> |
---|
| 177 | <p><b>Why are paths sometimes manipulated by member functions and sometimes by |
---|
| 178 | non-member functions?</b></p> |
---|
| 179 | <p>The design rule is that purely lexical operations are supplied as <i>class |
---|
| 180 | path</i> member |
---|
| 181 | functions, while operations performed by the operating system are provided as |
---|
| 182 | free functions.</p> |
---|
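The split described above can be sketched as follows. The example assumes C++17's `std::filesystem`, which follows the same rule, instead of the Boost headers of this release, where member names may differ; `leaf_name` and `present` are hypothetical wrappers added for illustration.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Purely lexical: only the characters of the path are examined, so this
// works whether or not anything exists at that location.
std::string leaf_name(const fs::path& p) {
    return p.filename().string();  // member function of class path
}

// Operating system query: the answer depends on the current state of the
// file system and may change between two calls.
bool present(const fs::path& p) {
    return fs::exists(p);  // free function
}
```

A reader can therefore tell from the call syntax alone whether an expression touches the file system.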
| 183 | <p><b>Why is path <a href="path.htm#Normalized">normalized form</a> different |
---|
| 184 | from <a href="path.htm#Canonical">canonical form</a>?</b></p> |
---|
| 185 | <p>On operating systems such as POSIX which allow symbolic links to directories, |
---|
| 186 | the normalized form of a path can represent a different location than the |
---|
| 187 | canonical form. See <a href="design.htm#symbolic-link-use-case">use case</a> |
---|
| 188 | from Walter Landry.</p> |
---|
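To see why the two forms can diverge, consider a minimal sketch of purely lexical normalization for relative generic paths. The helper `lexically_normalize` is hypothetical and is not the library's implementation; it never consults the file system.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: normalizes a relative generic path by text alone.
// Drops empty and "." elements and cancels each "name/.." pair.
std::string lexically_normalize(const std::string& generic_path) {
    std::vector<std::string> kept;
    std::istringstream in(generic_path);
    std::string element;
    while (std::getline(in, element, '/')) {
        if (element.empty() || element == ".") {
            continue;  // contributes nothing to the location
        }
        if (element == ".." && !kept.empty() && kept.back() != "..") {
            kept.pop_back();  // "name/.." cancels lexically
        } else {
            kept.push_back(element);  // ordinary name, or a leading ".."
        }
    }
    std::string result;
    for (const std::string& e : kept) {
        if (!result.empty()) {
            result += '/';
        }
        result += e;
    }
    return result;
}
```

Here `lexically_normalize("a/link/../b")` yields `a/b`. If `a/link` is a symbolic link to a directory elsewhere, say `/x/y`, the operating system resolves `a/link/../b` to `/x/b`, which is a different location from `a/b`.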
| 189 | <hr> |
---|
| 190 | <p>Revised |
---|
| 191 | <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->02 August, 2005<!--webbot bot="Timestamp" endspan i-checksum="34600" --></p> |
---|
| 192 | <p>© Copyright Beman Dawes, 2002</p> |
---|
| 193 | <p> Use, modification, and distribution are subject to the Boost Software |
---|
| 194 | License, Version 1.0. (See accompanying file <a href="../../../LICENSE_1_0.txt"> |
---|
| 195 | LICENSE_1_0.txt</a> or copy at <a href="http://www.boost.org/LICENSE_1_0.txt"> |
---|
| 196 | www.boost.org/LICENSE_1_0.txt</a>)</p> |
---|