1 | <title>Boost Filesystem FAQ</title> |
---|
2 | <body bgcolor="#FFFFFF"> |
---|
3 | |
---|
4 | <h1> |
---|
5 | <img border="0" src="../../../boost.png" align="center" width="277" height="86">Filesystem |
---|
6 | FAQ</h1> |
---|
7 | <p><b>Why base the generic-path string format on POSIX?</b></p> |
---|
8 | <p><a href="design.htm#POSIX-01">POSIX</a> is the basis for the most familiar path-string formats, including the |
---|
9 | URL portion of URI's and the native Windows format. It is ubiquitous and |
---|
10 | familiar. On many systems, it is very easy to implement because it is |
---|
11 | either the native operating system format (Unix and Windows) or is available via an |
---|
12 | operating-system-supplied |
---|
13 | POSIX library (z/OS, OS/390, and many more.)</p> |
---|
14 | <p><b>Why not use a full URI (Universal Resource Identifier) based path?</b></p> |
---|
15 | <p><a href="design.htm#URI">URI's</a> would promise more than the Filesystem Library can actually deliver, |
---|
16 | since URI's extend far beyond what most operating systems consider a file or a |
---|
17 | directory. Thus for the primary "portable script-style file system |
---|
18 | operations" requirement of the Filesystem Library, full URI's appear to be over-specification.</p> |
---|
19 | <p><b>Why isn't <i>path</i> a base class with derived <i>directory_path</i> and |
---|
20 | <i>file_path</i> classes?</b></p> |
---|
21 | <p>Why bother? The behavior of all three classes is essentially identical. |
---|
22 | Several early versions did require users to identify each path as a file or |
---|
23 | directory path, and this seemed to increase errors and decrease code |
---|
24 | readability. There was no apparent upside benefit.</p> |
---|
25 | <p><b>Why are fully specified paths called <i>complete</i> rather than <i> |
---|
26 | <a name="absolute">absolute</a></i>?</b></p> |
---|
27 | <p>To avoid long-held assumptions (what do you mean, <i>"/foo"</i> isn't |
---|
28 | absolute on some systems?) by programmers used to single-rooted filesystems. |
---|
29 | Using an unfamiliar name for the concept and related functions causes |
---|
30 | programmers to read the specs rather than just assuming the meaning is known.</p> |
---|
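<p>As a hedged illustration of why <i>"/foo"</i> need not be fully specified, the sketch below uses <code>std::filesystem</code> from C++17, the standardized descendant of this library, rather than the interface this FAQ describes. That substitution and the helper name <code>describe_root</code> are assumptions made for the example. It reports which root components a path carries.</p>

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: reports which root components a path carries.
// On a single-rooted system a root directory is enough to fully specify
// a path; on a multi-rooted system a root name such as "C:" is needed too.
std::string describe_root(const fs::path& p) {
    std::string out;
    out += p.has_root_name() ? "root-name " : "no-root-name ";
    out += p.has_root_directory() ? "root-directory" : "no-root-directory";
    return out;
}
```

<p>On a multi-rooted system such as Windows, <i>"/foo"</i> has a root directory but no root name, so its meaning still depends on the current drive.</p>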
31 | <p><b>Why do some function names have a "native_" prefix?</b></p> |
---|
32 | <p>To alert users that the results are inherently non-portable. The names are |
---|
33 | deliberately ugly to discourage use except where really necessary.</p> |
---|
34 | <p><b>Why not support a concept of specific kinds of file systems, such as |
---|
35 | posix_file_system or windows_file_system?</b></p> |
---|
36 | <p>Portability is one of the one or two most important requirements for the |
---|
37 | library. Gaining some advantage by using features specific to particular |
---|
38 | operating systems is not a requirement. There doesn't appear to be much need for |
---|
39 | the ability to manipulate, say, a classic Mac OS path while running on an |
---|
40 | OpenVMS machine.</p> |
---|
41 | <p>Furthermore, concepts like "posix_file_system" |
---|
42 | are very slippery. What happens when an NTFS or ISO 9660 file system is mounted |
---|
43 | in a directory on a machine running a POSIX-like operating system, for example?</p> |
---|
44 | <p><b>Why not supply a 'handle' type, and let the file and directory operations |
---|
45 | traffic in it?</b></p> |
---|
46 | <p>It isn't clear there is any feasible way to meet the "portable script-style |
---|
47 | file system operations" requirement with such a system. File systems exist where operations are usually performed on |
---|
48 | some non-string handle type. The classic Mac OS has been mentioned explicitly as a case where |
---|
49 | trafficking in paths isn't always natural. </p> |
---|
50 | <p>The case for the "handle" (opaque data type to identify a file) |
---|
51 | style may be strongest for the directory iterator value type. (See Jesse Jones' Jan 28, |
---|
52 | 2002, Boost postings). However, as class path has evolved, it seems sufficient |
---|
53 | even as the directory iterator value type.</p> |
---|
54 | <p><b>Why aren't directories considered to be files?</b></p> |
---|
55 | <p>Because |
---|
56 | directories cannot portably and usefully be opened as files using the C++ Standard Library stdio or fstream |
---|
57 | I/O facilities. An important additional rationale is that separating the concept |
---|
58 | of directories and files makes exposition and specification clearer. A |
---|
59 | particular problem is the naming and description of function arguments.</p> |
---|
60 | <div align="center"> |
---|
61 | <center> |
---|
62 | <table border="1" cellpadding="5" cellspacing="0"> |
---|
63 | <tr> |
---|
64 | <td colspan="3"> |
---|
65 | <p align="center"><b>Meaningful Names for Arguments</b></td> |
---|
66 | </tr> |
---|
67 | <tr> |
---|
68 | <td><b>Argument Intent</b></td> |
---|
69 | <td><b>Meaningful name if<br> |
---|
70 | directories are files</b></td> |
---|
71 | <td><b>Meaningful name if<br> |
---|
72 | directories aren't files</b></td> |
---|
73 | </tr> |
---|
74 | <tr> |
---|
75 | <td>A path to either a directory or a non-directory</td> |
---|
76 | <td align="center"><i>path</i></td> |
---|
77 | <td align="center"><i>path</i></td> |
---|
78 | </tr> |
---|
79 | <tr> |
---|
80 | <td>A path to a directory, but not to a non-directory</td> |
---|
81 | <td align="center"><i>directory_path</i></td> |
---|
82 | <td align="center"><i>directory_path</i></td> |
---|
83 | </tr> |
---|
84 | <tr> |
---|
85 | <td>A path to a non-directory, but not to a directory</td> |
---|
86 | <td align="center"><i>non_directory_path</i></td> |
---|
87 | <td align="center"><i>file_path</i></td> |
---|
88 | </tr> |
---|
89 | </table> |
---|
90 | </center> |
---|
91 | </div> |
---|
92 | <p>The problem is that when directories are considered files, <i> |
---|
93 | non_directory_path</i> as an argument name, and the corresponding "non-directory |
---|
94 | path" in documentation, is ugly and lengthy, and so is shortened to just <i>path</i>, |
---|
95 | causing the code and documentation to be confusing if not downright wrong. The |
---|
96 | names which result from the "directories aren't files" approach are more |
---|
97 | acceptable and less likely to be used incorrectly. </p> |
---|
98 | <p><b>Why are the operations.hpp non-member functions so low-level?</b></p> |
---|
99 | <p>To provide a toolkit from which higher-level functionality can be created.</p> |
---|
100 | <p>An |
---|
101 | extended attempt to add convenience functions on top of, or as a replacement |
---|
102 | for, the low-level functionality failed because there is no widely acceptable |
---|
103 | set of simple semantics for most convenience functions considered. |
---|
104 | Attempts to provide alternate semantics, via either run-time options or |
---|
105 | compile-time policies, became overly complicated in relation to the value |
---|
106 | delivered, or became contentious. On the other hand, the specific functionality needed for several trial |
---|
107 | applications was very easy for the user to construct from the lower-level |
---|
108 | toolkit functions. See <a href="design.htm#Abandoned_Designs">Failed Attempts</a>.</p> |
---|
109 | <p><b>Isn't it inconsistent then to provide a few convenience functions?</b></p> |
---|
110 | <p>Yes, but experience with this library, POSIX, and Windows indicates |
---|
111 | the utility of certain convenience functions, and that it is possible to provide |
---|
112 | simple, yet widely acceptable, semantics for them. For example, remove_all.</p> |
---|
113 | <p><b>Why are library functions so picky about errors?</b></p> |
---|
114 | <p>Safety. The default is to be safe rather than sorry. This is particularly |
---|
115 | important given the reality that on many computer systems files and directories |
---|
116 | are <a href="index.htm#Race-condition">globally shared</a> resources, and thus subject to |
---|
117 | unexpected errors.</p> |
---|
118 | <p><b>Why are errors reported by exception rather than return code or error |
---|
119 | notification variable?</b></p> |
---|
120 | <p>Safety. Return codes or error notification variables are often ignored |
---|
121 | by programmers. Exceptions are much harder to ignore, provide the desired |
---|
122 | default behavior (program termination) if not caught, yet allow error recovery |
---|
123 | if desired.</p> |
---|
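<p>The following sketch shows what explicit error recovery can look like. It uses <code>std::filesystem</code> (C++17) as a stand-in for the interface this FAQ describes, and the wrapper name <code>try_file_size</code> is hypothetical.</p>

```cpp
#include <cstdint>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical wrapper: returns the size of the file at p, or
// std::nullopt if the query fails. Without the try block the exception
// would propagate and, if never caught, terminate the program.
std::optional<std::uintmax_t> try_file_size(const fs::path& p) {
    try {
        return fs::file_size(p);
    } catch (const fs::filesystem_error&) {
        return std::nullopt;  // deliberate, visible recovery
    }
}
```

<p>The recovery path has to be written out, so a failure cannot be dropped silently the way an unchecked return code can.</p>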
124 | <p><b>Why are attributes accessed via named functions rather than property maps?</b></p> |
---|
125 | <p>For a few commonly used attributes (existence, directory or file, emptiness), |
---|
126 | simple syntax and guaranteed presence outweigh other considerations. Because |
---|
127 | access to virtually all other attributes is inherently system dependent, |
---|
128 | property maps are viewed as the best hope for access and modification, but it is |
---|
129 | better design to provide such functionality in a separate library. (Historical |
---|
130 | note: even the apparently simple attribute "read-only" turned out to be so |
---|
131 | system dependent as to be disqualified as a "guaranteed presence" operation.)</p> |
---|
132 | <p><b>Why isn't there a set_current_directory function?</b></p> |
---|
133 | <p>Global variables are considered harmful [<a href="design.htm#Wulf-Shaw-73">wulf-shaw-73</a>]. |
---|
134 | While we can't prevent people from shooting themselves in the foot, we aren't |
---|
135 | about to hand them a loaded gun pointed right at their big toe.</p> |
---|
136 | <p><b>Why aren't there query functions for compound conditions like existing_directory?</b></p> |
---|
137 | <p>After several attempts, named queries for multi-attribute conditions proved a |
---|
138 | slippery slope; where do you stop?</p> |
---|
139 | <p><b>Why aren't <a name="wide-character_names">wide-character names</a> supported? Why not std::wstring or even |
---|
140 | a templated type?</b></p> |
---|
141 | <p>Wide-character names would provide an illusion of portability where |
---|
142 | portability does not in fact exist. Behavior would be completely different on |
---|
143 | operating systems (Windows, for example) that support wide-character names, than |
---|
144 | on systems which don't (POSIX). Providing functionality that appears to provide |
---|
145 | portability but in fact delivers only implementation-defined behavior is highly |
---|
146 | undesirable. Programs would not even be portable between library implementations |
---|
147 | on the same operating system, let alone portable to different operating systems.</p> |
---|
148 | <p>The C++ standards committee Library Working Group discussed this in some |
---|
149 | detail both on the committee's library reflector and at the Spring, 2002, meeting, and feels that (1) names based on types other than char are |
---|
150 | extremely non-portable, (2) there are no agreed upon semantics for conversion |
---|
151 | between wide-character and narrow-character names for file systems which do not support |
---|
152 | wide-character names, and |
---|
153 | (3) even the committee members most interested in wide-character names are |
---|
154 | unsure that they are a good idea in the context of a portable library.</p> |
---|
155 | <p>[October, 2002 - PJ Plauger has suggested a locale based conversion |
---|
156 | scheme. Others have indicated support for such an experiment.]</p> |
---|
157 | <p><b>Why are file and directory name portability errors detected automatically |
---|
158 | when these aren't actually errors in some applications?</b></p> |
---|
159 | <p>For many uses, automatic portability error detection based on the generic |
---|
160 | path grammar is a sensible default. For cases where some other error check |
---|
161 | (including no check at all) is preferred for the entire application, |
---|
162 | functionality is provided to change the default. For cases where some other |
---|
163 | error check (including no check at all) is preferred for a particular |
---|
164 | path, the error check can be specified in the path constructor.</p> |
---|
165 | <p>The error checking functions can also be called explicitly. That provides |
---|
166 | yet another way to check for errors.</p> |
---|
167 | <p>The design makes error checking easy and automatic for common cases, yet |
---|
168 | provides explicit control when that is required.</p> |
---|
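<p>As a rough idea of what an explicitly called name check might involve, the sketch below tests a single name against the POSIX portable filename character set. The function <code>is_portable_posix_name_sketch</code> is hypothetical and is not the library's own checker; its exact rules are an assumption for illustration.</p>

```cpp
#include <string>

// Hypothetical checker, not the library's own function: accepts a single
// file or directory name only if it is non-empty and every character is
// a letter, a digit, '.', '_' or '-'. The range comparisons assume an
// ASCII-compatible execution character set.
bool is_portable_posix_name_sketch(const std::string& name) {
    if (name.empty()) {
        return false;
    }
    for (char c : name) {
        const bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                        (c >= '0' && c <= '9') ||
                        c == '.' || c == '_' || c == '-';
        if (!ok) {
            return false;
        }
    }
    return true;
}
```

<p>An application with looser or stricter needs would substitute its own predicate.</p>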
169 | <p><b>Why isn't more powerful name portability error detection provided, such as |
---|
170 | deferring checking until a path is actually used?</b></p> |
---|
171 | <p>A number (at least six) of designs for name validity error |
---|
172 | detection were evaluated, including at least four complete implementations. |
---|
173 | While the details for rejection differed, all of the more powerful name validity checking |
---|
174 | designs distorted other |
---|
175 | otherwise simple aspects of the library. While name checking can be helpful, it |
---|
176 | isn't important enough to justify adding a lot of additional complexity.</p> |
---|
177 | <p><b>Why are paths sometimes manipulated by member functions and sometimes by |
---|
178 | non-member functions?</b></p> |
---|
179 | <p>The design rule is that purely lexical operations are supplied as <i>class |
---|
180 | path</i> member |
---|
181 | functions, while operations performed by the operating system are provided as |
---|
182 | free functions.</p> |
---|
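<p>The split can be sketched as follows, using <code>std::filesystem</code> (C++17), which keeps the same division, in place of the interface this FAQ describes. The two wrapper names are hypothetical.</p>

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Purely lexical: inspects only the characters of the path and never
// touches the disk, so it is reached through a member function.
std::string leaf_name(const fs::path& p) {
    return p.filename().string();
}

// Performed by the operating system: asks whether anything exists at
// the path, so it is reached through a free function.
bool present_on_disk(const fs::path& p) {
    return fs::exists(p);
}
```

<p>The lexical call gives the same answer whether or not the file exists. The free function's answer can change from one call to the next.</p>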
183 | <p><b>Why is path <a href="path.htm#Normalized">normalized form</a> different |
---|
184 | from <a href="path.htm#Canonical">canonical form</a>?</b></p> |
---|
185 | <p>On operating systems such as POSIX which allow symbolic links to directories, |
---|
186 | the normalized form of a path can represent a different location than the |
---|
187 | canonical form. See <a href="design.htm#symbolic-link-use-case">use case</a> |
---|
188 | from Walter Landry.</p> |
---|
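<p>The effect of a symbolic link on a textually collapsed path can be demonstrated with the sketch below. It assumes a POSIX-like system where symbolic links may be created, uses <code>std::filesystem</code> (C++17) rather than the interface this FAQ describes, and the function name is hypothetical. Note that <code>std::filesystem::canonical</code> resolves links through the operating system, which may not be the same notion as the canonical form linked above. It is used here only to reveal where each spelling actually leads.</p>

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical demo. base should be an absolute path that does not yet
// exist. Builds base/real/sub and a link base/link -> base/real/sub,
// then compares two readings of base/link/.. :
//   collapsed: remove "link/.." textually, which names base
//   resolved:  let the OS follow the link first, which names base/real
// Returns true when the two readings name different directories.
bool collapsing_dotdot_changes_target(const fs::path& base) {
    fs::create_directories(base / "real" / "sub");
    fs::create_directory_symlink(base / "real" / "sub", base / "link");

    const fs::path tricky = base / "link" / "..";
    const fs::path collapsed = fs::canonical(tricky.lexically_normal());
    const fs::path resolved = fs::canonical(tricky);
    return collapsed != resolved;
}
```

<p>Both results are passed through the same resolution step so that unrelated links higher up, such as a linked temporary directory, do not affect the comparison.</p>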
189 | <hr> |
---|
190 | <p>Revised |
---|
191 | <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->02 August, 2005<!--webbot bot="Timestamp" endspan i-checksum="34600" --></p> |
---|
192 | <p>© Copyright Beman Dawes, 2002</p> |
---|
193 | <p>Use, modification, and distribution are subject to the Boost Software |
---|
194 | License, Version 1.0. (See accompanying file <a href="../../../LICENSE_1_0.txt"> |
---|
195 | LICENSE_1_0.txt</a> or copy at <a href="http://www.boost.org/LICENSE_1_0.txt"> |
---|
196 | www.boost.org/LICENSE_1_0.txt</a>)</p> |
---|