'\"
'\" Copyright (c) 1998 Sun Microsystems, Inc.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\"
'\" RCS: @(#) $Id: regexp.n,v 1.28 2007/12/13 15:22:33 dgp Exp $
'\"
.so man.macros
.TH regexp n 8.3 Tcl "Tcl Built-In Commands"
.BS
'\" Note: do not modify the .SH NAME line immediately below!
.SH NAME
regexp \- Match a regular expression against a string

.SH SYNOPSIS
\fBregexp \fR?\fIswitches\fR? \fIexp string \fR?\fImatchVar\fR? ?\fIsubMatchVar subMatchVar ...\fR?
.BE

.SH DESCRIPTION
.PP
Determines whether the regular expression \fIexp\fR matches part or
all of \fIstring\fR and returns 1 if it does, 0 if it does not, unless
\fB\-all\fR, \fB\-about\fR or \fB\-inline\fR is specified (see below).
(Regular expression matching is described in the \fBre_syntax\fR
reference page.)
.PP
If additional arguments are specified after \fIstring\fR then they
are treated as the names of variables in which to return
information about which part(s) of \fIstring\fR matched \fIexp\fR.
\fIMatchVar\fR will be set to the range of \fIstring\fR that
matched all of \fIexp\fR. The first \fIsubMatchVar\fR will contain
the characters in \fIstring\fR that matched the leftmost parenthesized
subexpression within \fIexp\fR, the next \fIsubMatchVar\fR will
contain the characters that matched the next parenthesized
subexpression to the right in \fIexp\fR, and so on.
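.PP
As a minimal sketch of how the match variables are filled in, the
following uses a made-up input string and made-up variable names
(\fBline\fR, \fBmatched\fR, \fBkey\fR, \fBvalue\fR); none of these
names are part of the command itself.

```tcl
# Hypothetical sample input, chosen only for illustration.
set line "user=alice id=42"

# matched receives the text of the whole match; key and value receive
# the text of the first and second parenthesized subexpressions.
set found [regexp {(\w+)=(\w+)} $line matched key value]

puts "$found $matched $key $value"
# prints: 1 user=alice user alice
```

Only the first match is reported here, because \fB\-all\fR was not given.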
.PP
If the initial arguments to \fBregexp\fR start with \fB\-\fR then
they are treated as switches. The following switches are
currently supported:
.TP 15
\fB\-about\fR
Instead of attempting to match the regular expression, returns a list
containing information about the regular expression. The first
element of the list is a subexpression count. The second element is a
list of property names that describe various attributes of the regular
expression. This switch is primarily intended for debugging purposes.
.TP 15
\fB\-expanded\fR
Enables use of the expanded regular expression syntax where
whitespace and comments are ignored. This is the same as specifying
the \fB(?x)\fR embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-indices\fR
Changes what is stored in the \fImatchVar\fR and \fIsubMatchVar\fRs.
Instead of storing the matching characters from \fIstring\fR,
each variable
will contain a list of two decimal strings giving the indices
in \fIstring\fR of the first and last characters in the matching
range of characters.
.TP 15
\fB\-line\fR
Enables newline-sensitive matching. By default, newline is a
completely ordinary character with no special meaning. With this
flag,
.QW [^
bracket expressions and
.QW .
never match newline,
.QW ^
matches an empty string after any newline in addition to its normal
function, and
.QW $
matches an empty string before any newline in
addition to its normal function. This flag is equivalent to
specifying both \fB\-linestop\fR and \fB\-lineanchor\fR, or the
\fB(?n)\fR embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-linestop\fR
Changes the behavior of
.QW [^
bracket expressions and
.QW .
so that they
stop at newlines. This is the same as specifying the \fB(?p)\fR
embedded option (see the \fBre_syntax\fR manual page).
.TP 15
\fB\-lineanchor\fR
Changes the behavior of
.QW ^
and
.QW $
(the
.QW anchors )
so they match the
beginning and end of a line respectively. This is the same as
specifying the \fB(?w)\fR embedded option (see the \fBre_syntax\fR
manual page).
.TP 15
\fB\-nocase\fR
Causes upper-case characters in \fIstring\fR to be treated as
lower case during the matching process.
.TP 15
\fB\-all\fR
Causes the regular expression to be matched as many times as possible
in the string, returning the total number of matches found. If this
is specified with match variables, they will contain information for
the last match only.
.TP 15
\fB\-inline\fR
Causes the command to return, as a list, the data that would otherwise
be placed in match variables. When using \fB\-inline\fR,
match variables may not be specified. If used with \fB\-all\fR, the
results of each iteration are concatenated, so that a flat list is
always returned. For each match iteration, the command will append the
overall match data, plus one element for each subexpression in the
regular expression. Examples are:
.CS
\fBregexp\fR \-inline \-\- {\ew(\ew)} " inlined "
\fI\(-> in n\fR
\fBregexp\fR \-all \-inline \-\- {\ew(\ew)} " inlined "
\fI\(-> in n li i ne e\fR
.CE
.TP 15
\fB\-start\fR \fIindex\fR
Specifies a character index offset into the string at which to start
matching the regular expression.
.VS 8.5
The \fIindex\fR value is interpreted in the same manner
as the \fIindex\fR argument to \fBstring index\fR.
.VE 8.5
When using this switch,
.QW ^
will not match the beginning of the line, and \eA will still
match the start of the string at \fIindex\fR. If \fB\-indices\fR
is specified, the indices are reported relative to the
absolute beginning of the input string.
\fIindex\fR will be constrained to the bounds of the input string.
.TP 15
\fB\-\|\-\fR
Marks the end of switches. The argument following this one will
be treated as \fIexp\fR even if it starts with a \fB\-\fR.
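.PP
The effect of several of the switches above can be sketched on a
small, made-up three-line string. The variable names (\fBtext\fR,
\fBplain\fR, \fBperLine\fR, \fBspanning\fR, \fBstopped\fR,
\fBwhere\fR) are chosen for this illustration only.

```tcl
# Hypothetical three-line input used only for illustration.
set text "alpha\nbeta\ngamma"

# By default ^ and $ anchor only at the ends of the whole string.
set plain [regexp {^beta$} $text]             ;# 0
# With -line they also match after and before each newline.
set perLine [regexp -line {^beta$} $text]     ;# 1

# By default . matches newline, so a greedy .* can span lines and the
# match runs from the first "a" to the last one in the string.
set spanning [lindex [regexp -inline {a.*a} $text] 0]
# With -linestop the match cannot cross a newline.
set stopped [lindex [regexp -inline -linestop {a.*a} $text] 0]   ;# alpha

# -start skips the first six characters; -indices still reports
# positions relative to the start of the whole string.
regexp -indices -start 6 {a} $text where      ;# where is "9 9"
```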
.PP
If there are more \fIsubMatchVar\fRs than parenthesized
subexpressions within \fIexp\fR, or if a particular subexpression
in \fIexp\fR does not match the string (e.g. because it was in a
portion of the expression that was not matched), then the corresponding
\fIsubMatchVar\fR will be set to
.QW "\fB\-1 \-1\fR"
if \fB\-indices\fR has been specified or to an empty string otherwise.
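.PP
A short sketch of the unmatched-subexpression rule, using an
alternation so that only one branch takes part in the match. The
variable names (\fBwhole\fR, \fBaRun\fR, \fBbRun\fR, \fBfirst\fR,
\fBextra\fR and the \fBi\fR-prefixed variants) are hypothetical.

```tcl
# Only the second alternative matches, so the first group is unused.
regexp {(a+)|(b+)} "bbb" whole aRun bRun
# aRun is the empty string; bRun is "bbb".

# The same match with -indices reports the unused group as "-1 -1".
regexp -indices {(a+)|(b+)} "bbb" iWhole iARun iBRun
# iARun is "-1 -1"; iBRun is "0 2".

# More subMatchVars than parenthesized subexpressions: the surplus
# variable is set to the empty string.
regexp {(b+)} "bbb" whole first extra
```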
.SH EXAMPLES
Find the first occurrence of a word starting with \fBfoo\fR in a
string that is not actually an instance of \fBfoobar\fR, and get the
letters following it up to the end of the word into a variable:
.CS
\fBregexp\fR {\emfoo(?!bar\eM)(\ew*)} $string \-> restOfWord
.CE
Note that the whole matched substring has been placed in the variable
\fB\->\fR, which is a name chosen to look nice given that we are not
actually interested in its contents.
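.PP
To see this kind of pattern run end to end, the sketch below applies
it to a made-up input string, using the \fB\em\fR and \fB\eM\fR
word-boundary escapes described in \fBre_syntax\fR. The sample text
and the variable \fBfound\fR are assumptions for illustration.

```tcl
# Hypothetical sample input, chosen only for illustration.
set string "foobar food fool"

# \m and \M match at the beginning and end of a word.  The negative
# lookahead (?!bar\M) rejects the word "foobar" itself, so the first
# acceptable word is "food".
set found [regexp {\mfoo(?!bar\M)(\w*)} $string -> restOfWord]

# The variable named -> needs braces when it is substituted.
puts "$found ${->} $restOfWord"
# prints: 1 food d
```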
.PP
Find the indices of the first and last characters of the word
\fBbadger\fR (in any case) within a string
and store them in the variable \fBlocation\fR:
.CS
\fBregexp\fR \-indices {(?i)\embadger\eM} $string location
.CE
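.PP
A possible way to use the stored index pair is to pass it to
\fBstring range\fR. The sample text and the variable \fBword\fR
below are made up for this sketch.

```tcl
# Hypothetical sample input, chosen only for illustration.
set string "A Badger digs; badgers dig."

# (?i) makes the match case-insensitive; \m and \M restrict it to the
# whole word, so "badgers" would not qualify.
regexp -indices {(?i)\mbadger\M} $string location
# location is "2 7"

# Recover the matched text from the index pair.
set word [string range $string [lindex $location 0] [lindex $location 1]]
puts $word
# prints: Badger
```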
.PP
Count the number of octal digits in a string:
.CS
\fBregexp\fR \-all {[0\-7]} $string
.CE
.PP
List all words (consisting of all sequences of non-whitespace
characters) in a string:
.CS
\fBregexp\fR \-all \-inline {\eS+} $string
.CE
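.PP
Both \fB\-all\fR examples can be tried on one made-up string; the
sample text and the variables \fBcount\fR and \fBwords\fR are
assumptions for this sketch.

```tcl
# Hypothetical sample input, chosen only for illustration.
set string "mode 0755 and 0644, not 0899"

# -all alone returns the number of matches: four digits from 0755,
# four from 0644 and only the leading 0 of 0899.
set count [regexp -all {[0-7]} $string]
puts $count
# prints: 9

# -all -inline returns every match as one flat list.  Punctuation
# stays attached because \S+ matches any run of non-whitespace.
set words [regexp -all -inline {\S+} $string]
puts [llength $words]
# prints: 6
```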
.SH "SEE ALSO"
re_syntax(n), regsub(n),
.VS 8.5
string(n)
.VE

.SH KEYWORDS
match, regular expression, string