.\"
.\" Copyright (c) 1993 The Regents of the University of California.
.\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
.\"
.\" See the file "license.terms" for information on usage and redistribution
.\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
.\"
.\" RCS: @(#) $Id: string.n,v 1.43 2007/12/13 15:22:33 dgp Exp $
.\"
.so man.macros
.TH string n 8.1 Tcl "Tcl Built-In Commands"
.BS
.\" Note: do not modify the .SH NAME line immediately below!
.SH NAME
string \- Manipulate strings
.SH SYNOPSIS
\fBstring \fIoption arg \fR?\fIarg ...?\fR
.BE

.SH DESCRIPTION
.PP
Performs one of several string operations, depending on \fIoption\fR.
The legal \fIoption\fRs (which may be abbreviated) are:
.TP
\fBstring bytelength \fIstring\fR
Returns a decimal string giving the number of bytes used to represent
\fIstring\fR in memory. Because UTF\-8 uses one to three bytes to
represent Unicode characters, the byte length will not be the same as
the character length in general. The cases where a script cares about
the byte length are rare. In almost all cases, you should use the
\fBstring length\fR operation (including determining the length of a
Tcl ByteArray object). Refer to the \fBTcl_NumUtfChars\fR manual
entry for more details on the UTF\-8 representation.
.TP
\fBstring compare\fR ?\fB\-nocase\fR? ?\fB\-length int\fR? \fIstring1 string2\fR
Perform a character-by-character comparison of strings \fIstring1\fR
and \fIstring2\fR. Returns \-1, 0, or 1, depending on whether
\fIstring1\fR is lexicographically less than, equal to, or greater
than \fIstring2\fR. If \fB\-length\fR is specified, then only the
first \fIlength\fR characters are used in the comparison. If
\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is
specified, then the strings are compared in a case-insensitive manner.
.TP
\fBstring equal\fR ?\fB\-nocase\fR? ?\fB\-length int\fR? \fIstring1 string2\fR
Perform a character-by-character comparison of strings \fIstring1\fR
and \fIstring2\fR. Returns 1 if \fIstring1\fR and \fIstring2\fR are
identical, or 0 when not. If \fB\-length\fR is specified, then only
the first \fIlength\fR characters are used in the comparison. If
\fB\-length\fR is negative, it is ignored. If \fB\-nocase\fR is
specified, then the strings are compared in a case-insensitive manner.
.TP
\fBstring first \fIneedleString haystackString\fR ?\fIstartIndex\fR?
Search \fIhaystackString\fR for a sequence of characters that exactly match
the characters in \fIneedleString\fR. If found, return the index of the
first character in the first such match within \fIhaystackString\fR. If not
found, return \-1. If \fIstartIndex\fR is specified (in any of the
forms accepted by the \fBindex\fR method), then the search is
constrained to start with the character in \fIhaystackString\fR specified by
the index. For example,
.RS
.CS
\fBstring first a 0a23456789abcdef 5\fR
.CE
will return \fB10\fR, but
.CS
\fBstring first a 0123456789abcdef 11\fR
.CE
will return \fB\-1\fR.
.RE
.TP
\fBstring index \fIstring charIndex\fR
Returns the \fIcharIndex\fR'th character of the \fIstring\fR argument.
A \fIcharIndex\fR of 0 corresponds to the first character of the
string. \fIcharIndex\fR may be specified as follows:
.VS 8.5
.RS
.IP \fIinteger\fR 10
For any index value that passes \fBstring is integer -strict\fR,
the char specified at this integral index
(e.g. \fB2\fR would refer to the
.QW c
in
.QW abcd ).
.IP \fBend\fR 10
The last char of the string
(e.g. \fBend\fR would refer to the
.QW d
in
.QW abcd ).
.IP \fBend\fR\-\fIN\fR 10
The last char of the string minus the specified integer offset \fIN\fR
(e.g. \fBend\fR\-1 would refer to the
.QW c
in
.QW abcd ).
.IP \fBend\fR+\fIN\fR 10
The last char of the string plus the specified integer offset \fIN\fR
(e.g. \fBend\fR+\-1 would refer to the
.QW c
in
.QW abcd ).
.IP \fIM\fR+\fIN\fR 10
The char specified at the integral index that is the sum of
integer values \fIM\fR and \fIN\fR
(e.g. \fB1+1\fR would refer to the
.QW c
in
.QW abcd ).
.IP \fIM\fR\-\fIN\fR 10
The char specified at the integral index that is the difference of
integer values \fIM\fR and \fIN\fR
(e.g. \fB2\-1\fR would refer to the
.QW b
in
.QW abcd ).
.PP
In the specifications above, the integer value \fIM\fR contains no
trailing whitespace and the integer value \fIN\fR contains no
leading whitespace.
.PP
If \fIcharIndex\fR is less than 0 or greater than or equal to the
length of the string then this command returns an empty string.
.RE
.VE
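.RS
.PP
As a brief illustration of the index forms above (the sample string
is chosen only for this sketch),
.CS
\fBstring index abcd end\-1\fR
.CE
should return \fBc\fR, while
.CS
\fBstring index abcd 4\fR
.CE
should return an empty string, because index 4 lies past the last
character.
.RE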
.TP
\fBstring is \fIclass\fR ?\fB\-strict\fR? ?\fB\-failindex \fIvarname\fR? \fIstring\fR
Returns 1 if \fIstring\fR is a valid member of the specified character
class, otherwise returns 0. If \fB\-strict\fR is specified, then an
empty string returns 0, otherwise an empty string will return 1 on
any class. If \fB\-failindex\fR is specified, then if the function
returns 0, the index in the string where the class was no longer valid
will be stored in the variable named \fIvarname\fR. The \fIvarname\fR
will not be set if the function returns 1. The following character
classes are recognized (the class name can be abbreviated):
.RS
.IP \fBalnum\fR 12
Any Unicode alphabet or digit character.
.IP \fBalpha\fR 12
Any Unicode alphabet character.
.IP \fBascii\fR 12
Any character with a value less than \eu0080 (those that are in the
7\-bit ascii range).
.IP \fBboolean\fR 12
Any of the forms allowed to \fBTcl_GetBoolean\fR.
.IP \fBcontrol\fR 12
Any Unicode control character.
.IP \fBdigit\fR 12
Any Unicode digit character. Note that this includes characters
outside of the [0\-9] range.
.IP \fBdouble\fR 12
Any of the valid forms for a double in Tcl, with optional surrounding
whitespace. In case of under/overflow in the value, 0 is returned and
the \fIvarname\fR will contain \-1.
.IP \fBfalse\fR 12
Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is
false.
.IP \fBgraph\fR 12
Any Unicode printing character, except space.
.IP \fBinteger\fR 12
Any of the valid string formats for a 32-bit integer value in Tcl,
with optional surrounding whitespace. In case of under/overflow in
the value, 0 is returned and the \fIvarname\fR will contain \-1.
.IP \fBlist\fR 12
Any proper list structure, with optional surrounding whitespace. In
case of improper list structure, 0 is returned and the \fIvarname\fR
will contain the index of the
.QW element
where the list parsing fails, or \-1 if this cannot be determined.
.IP \fBlower\fR 12
Any Unicode lower case alphabet character.
.IP \fBprint\fR 12
Any Unicode printing character, including space.
.IP \fBpunct\fR 12
Any Unicode punctuation character.
.IP \fBspace\fR 12
Any Unicode space character.
.IP \fBtrue\fR 12
Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is
true.
.IP \fBupper\fR 12
Any upper case alphabet character in the Unicode character set.
.VS 8.5
.IP \fBwideinteger\fR 12
Any of the valid forms for a wide integer in Tcl, with optional
surrounding whitespace. In case of under/overflow in the value, 0 is
returned and the \fIvarname\fR will contain \-1.
.VE 8.5
.IP \fBwordchar\fR 12
Any Unicode word character. That is any alphanumeric character, and
any Unicode connector punctuation characters (e.g. underscore).
.IP \fBxdigit\fR 12
Any hexadecimal digit character ([0\-9A\-Fa\-f]).
.PP
In the case of \fBboolean\fR, \fBtrue\fR and \fBfalse\fR, if the
function will return 0, then the \fIvarname\fR will always be set to
0, due to the varied nature of a valid boolean value.
.RE
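.RS
.PP
As a small sketch of \fB\-strict\fR and \fB\-failindex\fR (the
variable name \fIidx\fR and the sample values are assumptions made for
this illustration),
.CS
\fBstring is alpha \-failindex idx abc1def\fR
.CE
should return \fB0\fR and store \fB3\fR, the index of the first
non-alphabetic character, in \fIidx\fR. Similarly,
.CS
\fBstring is digit {}\fR
.CE
should return \fB1\fR because the string is empty, whereas
.CS
\fBstring is digit \-strict {}\fR
.CE
should return \fB0\fR.
.RE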
.TP
\fBstring last \fIneedleString haystackString\fR ?\fIlastIndex\fR?
Search \fIhaystackString\fR for a sequence of characters that exactly match
the characters in \fIneedleString\fR. If found, return the index of the
first character in the last such match within \fIhaystackString\fR. If there
is no match, then return \-1. If \fIlastIndex\fR is specified (in any
of the forms accepted by the \fBindex\fR method), then only the
characters in \fIhaystackString\fR at or before the specified \fIlastIndex\fR
will be considered by the search. For example,
.RS
.CS
\fBstring last a 0a23456789abcdef 15\fR
.CE
will return \fB10\fR, but
.CS
\fBstring last a 0a23456789abcdef 9\fR
.CE
will return \fB1\fR.
.RE
.TP
\fBstring length \fIstring\fR
Returns a decimal string giving the number of characters in
\fIstring\fR. Note that this is not necessarily the same as the
number of bytes used to store the string. If the object is a
ByteArray object (such as those returned from reading a binary encoded
channel), then this will return the actual byte length of the object.
.TP
\fBstring map\fR ?\fB\-nocase\fR? \fImapping string\fR
Replaces substrings in \fIstring\fR based on the key-value pairs in
\fImapping\fR. \fImapping\fR is a list of \fIkey value key value ...\fR
as in the form returned by \fBarray get\fR. Each instance of a
key in the string will be replaced with its corresponding value. If
\fB\-nocase\fR is specified, then matching is done without regard to
case differences. Both \fIkey\fR and \fIvalue\fR may be multiple
characters. Replacement is done in an ordered manner, so the key
appearing first in the list will be checked first, and so on.
\fIstring\fR is only iterated over once, so earlier key replacements
will have no effect for later key matches. For example,
.RS
.CS
\fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR
.CE
will return the string \fB01321221\fR.
.PP
Note that if an earlier \fIkey\fR is a prefix of a later one, it will
completely mask the later one. So if the previous example is
reordered like this,
.CS
\fBstring map {1 0 ab 2 a 3 abc 1} 1abcaababcabababc\fR
.CE
it will return the string \fB02c322c222c\fR.
.RE
.TP
\fBstring match\fR ?\fB\-nocase\fR? \fIpattern\fR \fIstring\fR
See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0 if
it does not. If \fB\-nocase\fR is specified, then the pattern attempts
to match against the string in a case insensitive manner. For the two
strings to match, their contents must be identical except that the
following special sequences may appear in \fIpattern\fR:
.RS
.IP \fB*\fR 10
Matches any sequence of characters in \fIstring\fR, including a null
string.
.IP \fB?\fR 10
Matches any single character in \fIstring\fR.
.IP \fB[\fIchars\fB]\fR 10
Matches any character in the set given by \fIchars\fR. If a sequence
of the form \fIx\fB\-\fIy\fR appears in \fIchars\fR, then any
character between \fIx\fR and \fIy\fR, inclusive, will match. When
used with \fB\-nocase\fR, the end points of the range are converted to
lower case first. Whereas {[A\-z]} matches
.QW _
when matching case-sensitively (since
.QW _
falls between the
.QW Z
and
.QW a ),
with \fB\-nocase\fR this is considered like {[A\-Za\-z]} (and
probably what was meant in the first place).
.IP \fB\e\fIx\fR 10
Matches the single character \fIx\fR. This provides a way of avoiding
the special interpretation of the characters \fB*?[]\e\fR in
\fIpattern\fR.
.RE
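.RS
.PP
As a short sketch of these rules (the sample patterns and strings are
chosen only for illustration),
.CS
\fBstring match {[A\-z]} _\fR
.CE
should return \fB1\fR, while
.CS
\fBstring match \-nocase {[A\-z]} _\fR
.CE
should return \fB0\fR. A backslash makes a special character literal, so
.CS
\fBstring match {a\e*} abc\fR
.CE
should return \fB0\fR even though the pattern \fBa*\fR would match.
.RE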
.TP
\fBstring range \fIstring first last\fR
Returns a range of consecutive characters from \fIstring\fR, starting
with the character whose index is \fIfirst\fR and ending with the
character whose index is \fIlast\fR. An index of 0 refers to the first
character of the string. \fIfirst\fR and \fIlast\fR may be specified
as for the \fBindex\fR method. If \fIfirst\fR is less than zero then
it is treated as if it were zero, and if \fIlast\fR is greater than or
equal to the length of the string then it is treated as if it were
\fBend\fR. If \fIfirst\fR is greater than \fIlast\fR then an empty
string is returned.
.TP
\fBstring repeat \fIstring count\fR
Returns \fIstring\fR repeated \fIcount\fR number of times.
.TP
\fBstring replace \fIstring first last\fR ?\fInewstring\fR?
Removes a range of consecutive characters from \fIstring\fR, starting
with the character whose index is \fIfirst\fR and ending with the
character whose index is \fIlast\fR. An index of 0 refers to the
first character of the string. \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method. If \fInewstring\fR is
specified, then it is placed in the removed character range. If
\fIfirst\fR is less than zero then it is treated as if it were zero,
and if \fIlast\fR is greater than or equal to the length of the string
then it is treated as if it were \fBend\fR. If \fIfirst\fR is greater
than \fIlast\fR or the length of the initial string, or \fIlast\fR is
less than 0, then the initial string is returned untouched.
.VS 8.5
.TP
\fBstring reverse \fIstring\fR
Returns a string that is the same length as \fIstring\fR but with its
characters in the reverse order.
.VE 8.5
.TP
\fBstring tolower \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that all upper (or title)
case letters have been converted to lower case. If \fIfirst\fR is
specified, it refers to the first char index in the string to start
modifying. If \fIlast\fR is specified, it refers to the char index in
the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.TP
\fBstring totitle \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that the first character
in \fIstring\fR is converted to its Unicode title case variant (or
upper case if there is no title case variant) and the rest of the
string is converted to lower case. If \fIfirst\fR is specified, it
refers to the first char index in the string to start modifying. If
\fIlast\fR is specified, it refers to the char index in the string to
stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as
for the \fBindex\fR method.
.TP
\fBstring toupper \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that all lower (or title)
case letters have been converted to upper case. If \fIfirst\fR is
specified, it refers to the first char index in the string to start
modifying. If \fIlast\fR is specified, it refers to the char index in
the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.TP
\fBstring trim \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any leading or
trailing characters present in the string given by \fIchars\fR are removed. If
\fIchars\fR is not specified then white space is removed (spaces,
tabs, newlines, and carriage returns).
.TP
\fBstring trimleft \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any leading
characters present in the string given by \fIchars\fR are removed. If
\fIchars\fR is not specified then white space is removed (spaces,
tabs, newlines, and carriage returns).
.TP
\fBstring trimright \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any trailing
characters present in the string given by \fIchars\fR are removed. If
\fIchars\fR is not specified then white space is removed (spaces,
tabs, newlines, and carriage returns).
.TP
\fBstring wordend \fIstring charIndex\fR
Returns the index of the character just after the last one in the word
containing character \fIcharIndex\fR of \fIstring\fR. \fIcharIndex\fR
may be specified as for the \fBindex\fR method. A word is
considered to be any contiguous range of alphanumeric (Unicode letters
or decimal digits) or underscore (Unicode connector punctuation)
characters, or any single character other than these.
.TP
\fBstring wordstart \fIstring charIndex\fR
Returns the index of the first character in the word containing
character \fIcharIndex\fR of \fIstring\fR. \fIcharIndex\fR may be
specified as for the \fBindex\fR method. A word is considered to be any
contiguous range of alphanumeric (Unicode letters or decimal digits)
or underscore (Unicode connector punctuation) characters, or any
single character other than these.
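.RS
.PP
As a brief sketch of both word-boundary operations (the sample string
is an assumption made for this illustration),
.CS
\fBstring wordstart "hello world" 7\fR
.CE
should return \fB6\fR, the index of the
.QW w
that begins the word containing index 7, and
.CS
\fBstring wordend "hello world" 7\fR
.CE
should return \fB11\fR, one past the index of the final
.QW d .
.RE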
.SH EXAMPLE
Test if the string in the variable \fIstring\fR is a non-empty
prefix of the string \fBfoobar\fR.
.CS
set length [\fBstring length\fR $string]
if {$length == 0} {
    set isPrefix 0
} else {
    set isPrefix [\fBstring equal\fR -length $length $string "foobar"]
}
.CE
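.PP
As a further sketch (the variable names \fIinput\fR, \fIanswer\fR and
\fIaccepted\fR and the sample value are assumptions made for this
illustration), normalize a value before comparing it by stripping the
surrounding white space and folding it to lower case:
.CS
set input "  YES  "
set answer [\fBstring tolower\fR [\fBstring trim\fR $input]]
set accepted [\fBstring equal\fR $answer "yes"]
.CE
With this input, \fIanswer\fR should hold \fByes\fR and
\fIaccepted\fR should be \fB1\fR.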

.SH "SEE ALSO"
expr(n), list(n)

.SH KEYWORDS
case conversion, compare, index, match, pattern, string, word, equal,
ctype, character, reverse

.\" Local Variables:
.\" mode: nroff
.\" End: