1 | '\" |
---|
2 | '\" Copyright (c) 1993 The Regents of the University of California. |
---|
3 | '\" Copyright (c) 1994-1996 Sun Microsystems, Inc. |
---|
4 | '\" Copyright (c) 2000 Scriptics Corporation. |
---|
5 | '\" |
---|
6 | '\" See the file "license.terms" for information on usage and redistribution |
---|
7 | '\" of this file, and for a DISCLAIMER OF ALL WARRANTIES. |
---|
8 | '\" |
---|
9 | '\" RCS: @(#) $Id: scan.n,v 1.24 2007/12/13 15:22:33 dgp Exp $ |
---|
10 | '\" |
---|
11 | .so man.macros |
---|
12 | .TH scan n 8.4 Tcl "Tcl Built-In Commands" |
---|
13 | .BS |
---|
14 | '\" Note: do not modify the .SH NAME line immediately below! |
---|
15 | .SH NAME |
---|
16 | scan \- Parse string using conversion specifiers in the style of sscanf |
---|
17 | .SH SYNOPSIS |
---|
18 | \fBscan \fIstring format \fR?\fIvarName varName ...\fR? |
---|
19 | .BE |
---|
20 | .SH INTRODUCTION |
---|
21 | .PP |
---|
22 | This command parses substrings from an input string in a fashion similar |
---|
23 | to the ANSI C \fBsscanf\fR procedure and returns a count of the number of |
---|
24 | conversions performed, or -1 if the end of the input string is reached |
---|
25 | before any conversions have been performed. \fIString\fR gives the input |
---|
26 | to be parsed and \fIformat\fR indicates how to parse it, using \fB%\fR |
---|
27 | conversion specifiers as in \fBsscanf\fR. Each \fIvarName\fR gives the |
---|
28 | name of a variable; when a substring is scanned from \fIstring\fR that |
---|
29 | matches a conversion specifier, the substring is assigned to the |
---|
30 | corresponding variable. |
---|
31 | If no \fIvarName\fR variables are specified, then \fBscan\fR works in an |
---|
32 | inline manner, returning the data that would otherwise be stored in the |
---|
33 | variables as a list. In the inline case, an empty string is returned when |
---|
34 | the end of the input string is reached before any conversions have been |
---|
35 | performed. |
---|
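As a minimal sketch of the two calling styles described above, the following compares what scan returns with and without varName arguments. The sample strings and the variable names qty, item, and unused are illustrative assumptions, not taken from the Tcl sources.

```tcl
# With varName arguments, scan returns the number of conversions performed.
set count [scan "42 apples" "%d %s" qty item]
# count is 2, qty is 42, item is "apples"

# With no varName arguments, scan returns the converted values as a list.
set fields [scan "42 apples" "%d %s"]
# fields is the two-element list {42 apples}

# End of input reached before any conversion was performed:
set none  [scan "" "%d" unused]   ;# -1 when variables are given
set empty [scan "" "%d"]          ;# empty string in the inline case
```

The inline form is convenient with lassign or lindex when the values are only needed once.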
36 | .SH "DETAILS ON SCANNING" |
---|
37 | .PP |
---|
38 | \fBScan\fR operates by scanning \fIstring\fR and \fIformat\fR together. |
---|
39 | If the next character in \fIformat\fR is a blank or tab then it |
---|
40 | matches any number of white space characters in \fIstring\fR (including |
---|
41 | zero). |
---|
42 | Otherwise, if it is not a \fB%\fR character then it |
---|
43 | must match the next character of \fIstring\fR. |
---|
44 | When a \fB%\fR is encountered in \fIformat\fR, it indicates |
---|
45 | the start of a conversion specifier. |
---|
46 | A conversion specifier contains up to four fields after the \fB%\fR: |
---|
47 | an XPG3 position specifier (or a \fB*\fR to indicate the converted |
---|
48 | value is to be discarded instead of assigned to any variable); a number |
---|
49 | indicating a maximum substring width; a size modifier; and a |
---|
50 | conversion character. |
---|
51 | All of these fields are optional except for the conversion character. |
---|
52 | The fields that are present must appear in the order given above. |
---|
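The matching rules for ordinary format characters can be illustrated with a short sketch. The input strings and the variable names (n, m, p, q, r, s) are hypothetical and chosen only for this example.

```tcl
# Literal text in the format must match the input exactly.
set ok  [scan "id:42" "id:%d" n]    ;# 1, and n is 42
set bad [scan "no:42" "id:%d" m]    ;# 0, the literal "i" does not match "n"

# A blank in the format matches any run of white space, including none,
# so both inputs below parse the same way.
scan "a,b"   "%c , %c" p q
scan "a  ,b" "%c , %c" r s
# p and r are 97 (the code for "a"), q and s are 98 (the code for "b")
```

Because a failed literal match stops the scan without raising an error, comparing the return value against the expected number of conversions is the usual way to detect malformed input.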
53 | .PP |
---|
54 | When \fBscan\fR finds a conversion specifier in \fIformat\fR, it |
---|
55 | first skips any white-space characters in \fIstring\fR (unless the |
---|
56 | conversion character is \fB[\fR or \fBc\fR). |
---|
57 | Then it converts the next input characters according to the |
---|
58 | conversion specifier and stores the result in the variable given |
---|
59 | by the next argument to \fBscan\fR. |
---|
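A small sketch of the white-space rule above, using the inline form of scan. The input string is an illustrative assumption.

```tcl
# %s skips leading white space before converting.
set word  [scan "  x" %s]      ;# x

# %c does not skip it, so the first blank (code 32) is converted.
set blank [scan "  x" %c]      ;# 32

# A blank in the format skips the white space explicitly.
set code  [scan "  x" " %c"]   ;# 120, the code for "x"
```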
60 | .PP |
---|
61 | If the \fB%\fR is followed by a decimal number and a \fB$\fR, as in |
---|
62 | .QW \fB%2$d\fR , |
---|
63 | then the variable to use is not taken from the next |
---|
64 | sequential argument. Instead, it is taken from the argument indicated |
---|
65 | by the number, where 1 corresponds to the first \fIvarName\fR. If |
---|
66 | there are any positional specifiers in \fIformat\fR then all of the |
---|
67 | specifiers must be positional. Every \fIvarName\fR on the argument |
---|
68 | list must correspond to exactly one conversion specifier or an error |
---|
69 | is generated. In the inline case, any position can be specified |
---|
70 | at most once, and the empty positions will be filled in with empty strings. |
---|
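Positional specifiers can be sketched as follows. The input strings and the variable names month, year, and parts are hypothetical, and the format is written in braces so that Tcl does not try to substitute $d.

```tcl
# The first field is stored in the second variable and vice versa.
set n [scan "2024 06" {%2$d %1$d} month year]
# n is 2, month is 6, year is 2024

# Inline form: position 2 is never named, so it is filled with an empty string.
set parts [scan "7 8" {%3$d %1$d}]
# parts is the three-element list {8 {} 7}
```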
71 | .PP |
---|
72 | .VS 8.5 |
---|
73 | The size modifier field is used only when scanning a substring into |
---|
74 | one of Tcl's integer values. The size modifier field dictates the |
---|
75 | integer range acceptable to be stored in a variable, or, for the inline |
---|
76 | case, in a position in the result list. |
---|
77 | The syntactically valid values for the size modifier are \fBh\fR, \fBL\fR, |
---|
78 | \fBl\fR, and \fBll\fR. The \fBh\fR size modifier value is equivalent |
---|
79 | to the absence of a size modifier in the conversion specifier. |
---|
80 | Either one indicates the integer range to be stored is limited to |
---|
81 | the same range produced by the \fBint()\fR function of the \fBexpr\fR |
---|
82 | command. The \fBL\fR size modifier is equivalent to the \fBl\fR size |
---|
83 | modifier. Either one indicates the integer range to be stored is |
---|
84 | limited to the same range produced by the \fBwide()\fR function of |
---|
85 | the \fBexpr\fR command. The \fBll\fR size modifier indicates that |
---|
86 | the integer range to be stored is unlimited. |
---|
87 | .VE 8.5 |
---|
88 | .PP |
---|
89 | The following conversion characters are supported: |
---|
90 | .TP 10 |
---|
91 | \fBd\fR |
---|
92 | The input substring must be a decimal integer. |
---|
93 | It is read in and the integer value is stored in the variable, |
---|
94 | truncated as required by the size modifier value. |
---|
95 | .TP 10 |
---|
96 | \fBo\fR |
---|
97 | The input substring must be an octal integer. It is read in and the |
---|
98 | integer value is stored in the variable, |
---|
99 | truncated as required by the size modifier value. |
---|
100 | .TP 10 |
---|
101 | \fBx\fR |
---|
102 | The input substring must be a hexadecimal integer. |
---|
103 | It is read in and the integer value is stored in the variable, |
---|
104 | truncated as required by the size modifier value. |
---|
105 | .TP 10 |
---|
106 | \fBu\fR |
---|
107 | The input substring must be a decimal integer. |
---|
108 | The integer value is truncated as required by the size modifier |
---|
109 | value, and the corresponding unsigned value for that truncated |
---|
110 | range is computed and stored in the variable as a decimal string. |
---|
111 | The conversion makes no sense without reference to a truncation range, |
---|
112 | so the size modifier \fBll\fR is not permitted in combination |
---|
113 | with conversion character \fBu\fR. |
---|
114 | .TP 10 |
---|
115 | \fBi\fR |
---|
116 | The input substring must be an integer. The base (i.e. decimal, binary, |
---|
117 | octal, or hexadecimal) is determined in the same fashion as described in |
---|
118 | \fBexpr\fR. The integer value is stored in the variable, |
---|
119 | truncated as required by the size modifier value. |
---|
120 | .TP 10 |
---|
121 | \fBc\fR |
---|
122 | A single character is read in and its Unicode value is stored in |
---|
123 | the variable as an integer value. |
---|
124 | Initial white space is not skipped in this case, so the input |
---|
125 | substring may be a white-space character. |
---|
126 | .TP 10 |
---|
127 | \fBs\fR |
---|
128 | The input substring consists of all the characters up to the next |
---|
129 | white-space character; the characters are copied to the variable. |
---|
130 | .TP 10 |
---|
131 | \fBe\fR or \fBf\fR or \fBg\fR |
---|
132 | The input substring must be a floating-point number consisting |
---|
133 | of an optional sign, a string of decimal digits possibly |
---|
134 | containing a decimal point, and an optional exponent consisting |
---|
135 | of an \fBe\fR or \fBE\fR followed by an optional sign and a string of |
---|
136 | decimal digits. |
---|
137 | It is read in and stored in the variable as a floating-point value. |
---|
138 | .TP 10 |
---|
139 | \fB[\fIchars\fB]\fR |
---|
140 | The input substring consists of one or more characters in \fIchars\fR. |
---|
141 | The matching string is stored in the variable. |
---|
142 | If the first character between the brackets is a \fB]\fR then |
---|
143 | it is treated as part of \fIchars\fR rather than the closing |
---|
144 | bracket for the set. |
---|
145 | If \fIchars\fR |
---|
146 | contains a sequence of the form \fIa\fB\-\fIb\fR then any |
---|
147 | character between \fIa\fR and \fIb\fR (inclusive) will match. |
---|
148 | If the first or last character between the brackets is a \fB\-\fR, then |
---|
149 | it is treated as part of \fIchars\fR rather than indicating a range. |
---|
150 | .TP 10 |
---|
151 | \fB[^\fIchars\fB]\fR |
---|
152 | The input substring consists of one or more characters not in \fIchars\fR. |
---|
153 | The matching string is stored in the variable. |
---|
154 | If the character immediately following the \fB^\fR is a \fB]\fR then it is |
---|
155 | treated as part of the set rather than the closing bracket for |
---|
156 | the set. |
---|
157 | If \fIchars\fR |
---|
158 | contains a sequence of the form \fIa\fB\-\fIb\fR then any |
---|
159 | character between \fIa\fR and \fIb\fR (inclusive) will be excluded |
---|
160 | from the set. |
---|
161 | If the first or last character between the brackets is a \fB\-\fR, then |
---|
162 | it is treated as part of \fIchars\fR rather than indicating a range value. |
---|
163 | .TP 10 |
---|
164 | \fBn\fR |
---|
165 | No input is consumed from the input string. Instead, the total number |
---|
166 | of characters scanned from the input string so far is stored in the variable. |
---|
167 | .LP |
---|
168 | The number of characters read from the input for a conversion is the |
---|
169 | largest number that makes sense for that particular conversion (e.g. |
---|
170 | as many decimal digits as possible for \fB%d\fR, as |
---|
171 | many octal digits as possible for \fB%o\fR, and so on). |
---|
172 | The input substring for a given conversion terminates either when a |
---|
173 | white-space character is encountered or when the maximum substring |
---|
174 | width has been reached, whichever comes first. |
---|
175 | If a \fB*\fR is present in the conversion specifier |
---|
176 | then no variable is assigned and the next scan argument is not consumed. |
---|
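The following sketch pulls together several of the fields and conversion characters described above: a maximum width, a discarded conversion, a character set, and %n. All input strings and variable names are assumptions made for illustration.

```tcl
# A width caps how many characters one conversion may consume.
scan "12345" "%3d%d" head tail
# head is 123, tail is 45

# %*s reads a word and discards it, so no variable is consumed.
scan "skip keep" "%*s %s" kept
# kept is "keep"

# A negated character set reads up to the "=" sign.
# The format is braced so that Tcl does not treat the brackets as a command.
scan "key=value" {%[^=]=%s} name value
# name is "key", value is "value"

# %n reports how many characters have been scanned so far.
scan "abc def" "%s%n" first used
# first is "abc", used is 3
```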
177 | .SH "DIFFERENCES FROM ANSI SSCANF" |
---|
178 | .PP |
---|
179 | The behavior of the \fBscan\fR command is the same as the behavior of |
---|
180 | the ANSI C \fBsscanf\fR procedure except for the following differences: |
---|
181 | .IP [1] |
---|
182 | The \fB%p\fR conversion specifier is not supported. |
---|
183 | .IP [2] |
---|
184 | For \fB%c\fR conversions a single character value is |
---|
185 | converted to a decimal string, which is then assigned to the |
---|
186 | corresponding \fIvarName\fR; |
---|
187 | no substring width may be specified for this conversion. |
---|
188 | .IP [3] |
---|
189 | The \fBh\fR modifier is always ignored and the \fBl\fR and \fBL\fR |
---|
190 | modifiers are ignored when converting real values (i.e. type |
---|
191 | \fBdouble\fR is used for the internal representation). The \fBll\fR |
---|
192 | modifier has no \fBsscanf\fR counterpart. |
---|
193 | .IP [4] |
---|
194 | If the end of the input string is reached before any conversions have been |
---|
195 | performed and no variables are given, an empty string is returned. |
---|
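Two of the differences listed above can be observed directly. This is a sketch under the assumption that an unsupported or invalid specifier raises a Tcl error, which catch then reports as 1; the variable names are hypothetical.

```tcl
# %c yields the character code as a decimal string.
set code [scan "A" %c]                     ;# 65

# A field width on %c is rejected, as is the %p conversion.
set widthFails   [catch {scan "AB" %2c}]   ;# 1, an error was raised
set pointerFails [catch {scan "x" %p}]     ;# 1, an error was raised
```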
196 | .SH EXAMPLES |
---|
197 | Convert a Unicode character to its numeric value: |
---|
198 | .CS |
---|
199 | set char "x" |
---|
200 | set value [\fBscan\fR $char %c] |
---|
201 | .CE |
---|
202 | .PP |
---|
203 | Parse a simple color specification of the form \fI#RRGGBB\fR using |
---|
204 | hexadecimal conversions with maximum substring widths: |
---|
205 | .CS |
---|
206 | set string "#08D03F" |
---|
207 | \fBscan\fR $string "#%2x%2x%2x" r g b |
---|
208 | .CE |
---|
209 | .PP |
---|
210 | Parse a \fIHH:MM\fR time string, noting that this avoids problems with |
---|
211 | octal numbers by forcing interpretation as decimals (if we did not |
---|
212 | care, we would use the \fB%i\fR conversion instead): |
---|
213 | .CS |
---|
214 | set string "08:08" ;# *Not* octal! |
---|
215 | if {[\fBscan\fR $string "%d:%d" hours minutes] != 2} { |
---|
216 |     error "not a valid time string" |
---|
217 | } |
---|
218 | # We have to understand numeric ranges ourselves... |
---|
219 | if {$minutes < 0 || $minutes > 59} { |
---|
220 |     error "invalid number of minutes" |
---|
221 | } |
---|
222 | .CE |
---|
223 | .PP |
---|
224 | Break a string up into sequences of non-whitespace characters (note |
---|
225 | the use of the \fB%n\fR conversion so that we get skipping over |
---|
226 | leading whitespace correct): |
---|
227 | .CS |
---|
228 | set string " a string {with braced words} + leading space " |
---|
229 | set words {} |
---|
230 | while {[\fBscan\fR $string %s%n word length] == 2} { |
---|
231 |     lappend words $word |
---|
232 |     set string [string range $string $length end] |
---|
233 | } |
---|
234 | .CE |
---|
235 | .PP |
---|
236 | Parse a simple coordinate string, checking that it is complete by |
---|
237 | looking for the terminating character explicitly: |
---|
238 | .CS |
---|
239 | set string "(5.2,-4e-2)" |
---|
240 | # Note that the spaces before the literal parts of |
---|
241 | # the scan pattern are significant, and that ")" is |
---|
242 | # the Unicode character \eu0029 |
---|
243 | if { |
---|
244 |     [\fBscan\fR $string " (%f ,%f %c" x y last] != 3 |
---|
245 |     || $last != 0x0029 |
---|
246 | } then { |
---|
247 |     error "invalid coordinate string" |
---|
248 | } |
---|
249 | puts "X=$x, Y=$y" |
---|
250 | .CE |
---|
251 | .PP |
---|
252 | .VS 8.5 |
---|
253 | An interactive session demonstrating the truncation of integer |
---|
254 | values determined by size modifiers: |
---|
255 | .CS |
---|
256 | % set tcl_platform(wordSize) |
---|
257 | 4 |
---|
258 | % scan 20000000000000000000 %d |
---|
259 | 2147483647 |
---|
260 | % scan 20000000000000000000 %ld |
---|
261 | 9223372036854775807 |
---|
262 | % scan 20000000000000000000 %lld |
---|
263 | 20000000000000000000 |
---|
264 | .CE |
---|
265 | .VE 8.5 |
---|
266 | .SH "SEE ALSO" |
---|
267 | format(n), sscanf(3) |
---|
268 | .SH KEYWORDS |
---|
269 | conversion specifier, parse, scan |
---|