[section Dynamic Regexes]

[h2 Overview]

Static regexes are dandy, but sometimes you need something a bit more ... dynamic. Imagine you are developing
a text editor with a regex search/replace feature. You need to accept a regular expression from the end user
as input at run-time. There should be a way to parse a string into a regular expression. That's what xpressive's
dynamic regexes are for. They are built from the same core components as their static counterparts, but they
are late-bound so you can specify them at run-time.

[h2 Construction and Assignment]

There are two ways to create a dynamic regex: with the _regex_compile_
function or with the _regex_compiler_ class template. Use _regex_compile_
if you want the default locale, syntax and semantics. Use _regex_compiler_ if you need to
specify a different locale, or if you need more control over the regex syntax and semantics than the
_syntax_option_type_ enumeration gives you. ['(Editor's note: in xpressive v1.0, _regex_compiler_ does not support
customization of the dynamic regex syntax and semantics. It will in v2.0.)]

Here is an example of using `basic_regex<>::compile()`:

    sregex re = sregex::compile( "this|that", regex_constants::icase );

Here is the same example using _regex_compiler_:

    sregex_compiler compiler;
    sregex re = compiler.compile( "this|that", regex_constants::icase );

_regex_compile_ is implemented in terms of _regex_compiler_.

[h2 Dynamic xpressive Syntax]

Since the dynamic syntax is not constrained by the rules for valid C++ expressions, we are free to use familiar
syntax for dynamic regexes. For this reason, the syntax used by xpressive for dynamic regexes follows the
lead set by John Maddock's [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm proposal]
to add regular expressions to the Standard Library. It is essentially the syntax standardized by
[@http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf ECMAScript], with minor changes
in support of internationalization.

Since the syntax is documented exhaustively elsewhere, I will simply refer you to the existing standards, rather
than duplicate the specification here.

[h2 Customizing Dynamic xpressive Syntax]

xpressive v1.0 has limited support for the customization of dynamic regex syntax. The only customization allowed
is what can be specified via the _syntax_option_type_ enumeration.

[blurb
I have planned some future work in this area for v2.0, however. xpressive's design allows for powerful mechanisms
to customize the dynamic regex syntax. First, since the concept of "regex" is separated from the concept of
"regex compiler", it will be possible to offer multiple regex compilers, each of which accepts a different syntax.
Second, since xpressive allows you to build grammars using static regexes, it should be possible to build a
dynamic regex parser out of static regexes! Then, new dynamic regex grammars can be created by cloning an existing
regex grammar and modifying or disabling individual grammar rules to suit your needs.
]

[h2 Internationalization]

As with static regexes, dynamic regexes support internationalization by allowing you to specify a different
`std::locale`. To do this, you must use _regex_compiler_. The _regex_compiler_ class has an `imbue()` function.
After you have imbued a _regex_compiler_ object with a custom `std::locale`, all regex objects compiled by
that _regex_compiler_ will use that locale. For example:

    std::locale my_locale = /* initialize your locale object here */;
    sregex_compiler compiler;
    compiler.imbue( my_locale );
    sregex re = compiler.compile( "\\w+|\\d+" );

This regex will use `my_locale` when evaluating the intrinsic character sets `"\\w"` and `"\\d"`.

[endsect]