'\"
'\" Copyright (c) 1998 Sun Microsystems, Inc.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\"
'\" RCS: @(#) $Id: regexp.n,v 1.28 2007/12/13 15:22:33 dgp Exp $
'\"
.so man.macros
.TH regexp n 8.3 Tcl "Tcl Built-In Commands"
.BS
'\" Note: do not modify the .SH NAME line immediately below!
.SH NAME
regexp \- Match a regular expression against a string

.SH SYNOPSIS
\fBregexp \fR?\fIswitches\fR? \fIexp string \fR?\fImatchVar\fR? ?\fIsubMatchVar subMatchVar ...\fR?
.BE

.SH DESCRIPTION
.PP
Determines whether the regular expression \fIexp\fR matches part or
all of \fIstring\fR and returns 1 if it does, 0 if it does not, unless
\fB\-inline\fR is specified (see below).
(Regular expression matching is described in the \fBre_syntax\fR
reference page.)
.LP
If additional arguments are specified after \fIstring\fR then they
are treated as the names of variables in which to return
information about which part(s) of \fIstring\fR matched \fIexp\fR.
\fIMatchVar\fR will be set to the range of \fIstring\fR that
matched all of \fIexp\fR. The first \fIsubMatchVar\fR will contain
the characters in \fIstring\fR that matched the leftmost parenthesized
subexpression within \fIexp\fR, the next \fIsubMatchVar\fR will
contain the characters that matched the next parenthesized
subexpression to the right in \fIexp\fR, and so on.
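.PP
As a small illustration (the variable names and the input string are
invented for this sketch), the following call returns 1 and sets
\fBwhole\fR to \fB12\-34\fR, \fBfirst\fR to \fB12\fR, and \fBlast\fR
to \fB34\fR:
.CS
\fBregexp\fR {(\ed+)\-(\ed+)} "pages 12\-34" whole first last
.CE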
.PP
If the initial arguments to \fBregexp\fR start with \fB\-\fR then
they are treated as switches. The following switches are
currently supported:
.TP 15
\fB\-about\fR
Instead of attempting to match the regular expression, returns a list
containing information about the regular expression. The first
element of the list is a subexpression count. The second element is a
list of property names that describe various attributes of the regular
expression. This switch is primarily intended for debugging purposes.
.TP 15
\fB\-expanded\fR
Enables use of the expanded regular expression syntax where
whitespace and comments are ignored. This is the same as specifying
the \fB(?x)\fR embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-indices\fR
Changes what is stored in \fImatchVar\fR and the \fIsubMatchVar\fRs.
Instead of storing the matching characters from \fIstring\fR,
each variable
will contain a list of two decimal strings giving the indices
in \fIstring\fR of the first and last characters in the matching
range of characters.
.TP 15
\fB\-line\fR
Enables newline-sensitive matching. By default, newline is a
completely ordinary character with no special meaning. With this
flag,
.QW [^
bracket expressions and
.QW .
never match newline,
.QW ^
matches an empty string after any newline in addition to its normal
function, and
.QW $
matches an empty string before any newline in
addition to its normal function. This flag is equivalent to
specifying both \fB\-linestop\fR and \fB\-lineanchor\fR, or the
\fB(?n)\fR embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-linestop\fR
Changes the behavior of
.QW [^
bracket expressions and
.QW .
so that they
stop at newlines. This is the same as specifying the \fB(?p)\fR
embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-lineanchor\fR
Changes the behavior of
.QW ^
and
.QW $
(the
.QW anchors )
so they match the
beginning and end of a line respectively. This is the same as
specifying the \fB(?w)\fR embedded option (see the \fBre_syntax\fR
manual page).
.TP 15
\fB\-nocase\fR
Causes upper-case characters in \fIstring\fR to be treated as
lower case during the matching process.
.TP 15
\fB\-all\fR
Causes the regular expression to be matched as many times as possible
in the string, returning the total number of matches found. If this
is specified with match variables, they will contain information for
the last match only.
.TP 15
\fB\-inline\fR
Causes the command to return, as a list, the data that would otherwise
be placed in match variables. When using \fB\-inline\fR,
match variables may not be specified. If used with \fB\-all\fR, the
list will be concatenated at each iteration, such that a flat list is
always returned. For each match iteration, the command will append the
overall match data, plus one element for each subexpression in the
regular expression. Examples are:
.CS
\fBregexp\fR -inline -- {\ew(\ew)} " inlined "
\fI\(-> in n\fR
\fBregexp\fR -all -inline -- {\ew(\ew)} " inlined "
\fI\(-> in n li i ne e\fR
.CE
.TP 15
\fB\-start\fR \fIindex\fR
Specifies a character index offset into the string at which to start
matching the regular expression.
.VS 8.5
The \fIindex\fR value is interpreted in the same manner
as the \fIindex\fR argument to \fBstring index\fR.
.VE 8.5
When using this switch,
.QW ^
will not match the beginning of the line, and \eA will still
match the start of the string at \fIindex\fR. If \fB\-indices\fR
is specified, the indices will be counted from the
absolute beginning of the input string.
The \fIindex\fR value is constrained to the bounds of the input string.
.TP 15
\fB\-\|\-\fR
Marks the end of switches. The argument following this one will
be treated as \fIexp\fR even if it starts with a \fB\-\fR.
.PP
If there are more \fIsubMatchVar\fRs than parenthesized
subexpressions within \fIexp\fR, or if a particular subexpression
in \fIexp\fR does not match the string (e.g. because it was in a
portion of the expression that was not matched), then the corresponding
\fIsubMatchVar\fR will be set to
.QW "\fB\-1 \-1\fR"
if \fB\-indices\fR has been specified or to an empty string otherwise.
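.PP
For instance, in the following sketch (pattern, string, and variable
names chosen only for illustration) the first alternative takes no part
in the match, so \fBs1\fR is set to
.QW "\fB\-1 \-1\fR"
while \fBm\fR and \fBs2\fR are both set to
.QW "\fB2 2\fR" :
.CS
\fBregexp\fR \-indices {(a)|(b)} xxb m s1 s2
.CE
Without \fB\-indices\fR, the same call would set \fBs1\fR to an empty
string and both \fBm\fR and \fBs2\fR to \fBb\fR.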
.SH EXAMPLES
Find the first occurrence of a word starting with \fBfoo\fR in a
string that is not actually an instance of \fBfoobar\fR, and get the
letters following it up to the end of the word into a variable:
.CS
\fBregexp\fR {\e<foo(?!bar\e>)(\ew*)} $string \-> restOfWord
.CE
Note that the whole matched substring has been placed in the variable
\fB\->\fR which is a name chosen to look nice given that we are not
actually interested in its contents.
.PP
Find the index of the word \fBbadger\fR (in any case) within a string
and store that in the variable \fBlocation\fR:
.CS
\fBregexp\fR \-indices {(?i)\e<badger\e>} $string location
.CE
.PP
Count the number of octal digits in a string:
.CS
\fBregexp\fR \-all {[0\-7]} $string
.CE
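.PP
Count the lines of a multi-line string that consist of a single word,
using \fB\-line\fR so that the anchors apply to each line (the sample
string is made up for illustration; this call returns 2):
.CS
\fBregexp\fR \-all \-line {^\ew+$} "alpha\enbeta gamma\endelta"
.CE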
.PP
List all words (consisting of all sequences of non-whitespace
characters) in a string:
.CS
\fBregexp\fR \-all \-inline {\eS+} $string
.CE
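.PP
Find the first run of digits at or after character index 4 and return
its position; the indices are relative to the whole string, so with
this made-up input the result is \fB{6 8}\fR:
.CS
\fBregexp\fR \-inline \-indices \-start 4 {\ed+} ab12cd345
.CE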

.SH "SEE ALSO"
re_syntax(n), regsub(n),
.VS 8.5
string(n)
.VE


.SH KEYWORDS
match, regular expression, string