source: downloads/boost_1_33_1/libs/algorithm/string/doc/usage.xml @ 20

Last change on this file since 20 was 12, checked in by landauf, 18 years ago
added boost
File size: 17.0 KB

Line
1	<?xml version="1.0" encoding="utf-8"?>
2	<!DOCTYPE library PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
3	"http://www.boost.org/tools/boostbook/dtd/boostbook.dtd">
4	<section id="string_algo.usage" last-revision="$Date: 2005/12/01 13:42:02 $">
5	<title>Usage</title>
6
7	<using-namespace name="boost"/>
8	<using-namespace name="boost::algorithm"/>
9
10
11	<section>
12	<title>First Example</title>
13
14	<para>
15	Using the algorithms is straightforward. Let us have a look at the first example:
16	</para>
17	<programlisting>
18	#include <boost/algorithm/string.hpp>
19	using namespace std;
20	using namespace boost;
21
22	// ...
23
24	string str1(" hello world! ");
25	to_upper(str1); // str1 == " HELLO WORLD! "
26	trim(str1); // str1 == "HELLO WORLD!"
27
28	string str2=
29	to_lower_copy(
30	ireplace_first_copy(
31	str1,"hello","goodbye")); // str2 == "goodbye world!"
32	</programlisting>
33	<para>
34	This example converts str1 to upper case and trims spaces from the start and the end
35	of the string. str2 is then created as a lower-case copy of str1 with "hello" replaced by "goodbye" (the match ignores case).
36	This example demonstrates several important concepts used in the library:
37	</para>
38	<itemizedlist>
39	<listitem>
40	<para><emphasis role="bold">Container parameters:</emphasis>
41	Unlike the STL algorithms, these algorithms do not take their parameters only in the form
42	of iterators. The STL convention allows for great flexibility,
43	but it has several limitations. It is not possible to <emphasis>stack</emphasis> algorithms together,
44	because a container is passed in two parameters, so the return value of one algorithm
45	cannot be used directly as the input of another. It is also considerably easier to write
46	<code>to_lower(str1)</code> than <code>to_lower(str1.begin(), str1.end())</code>.
47	</para>
48	<para>
49	The magic of <ulink url="../../libs/range/index.html">Boost.Range</ulink>
50	provides a uniform way of handling different string types.
51	If there is a need to pass a pair of iterators,
52	<ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>
53	can be used to package iterators into a structure with a compatible interface.
54	</para>
55	</listitem>
56	<listitem>
57	<para><emphasis role="bold">Copy vs. Mutable:</emphasis>
58	Many algorithms in the library perform a transformation of the input.
59	The transformation can be done in-place, mutating the input sequence, or a copy
60	of the transformed input can be created, leaving the input intact. Neither of
61	these possibilities is superior to the other; each has its own
62	advantages and disadvantages. For this reason, both are provided by the library.
63	</para>
64	</listitem>
65	<listitem>
66	<para><emphasis role="bold">Algorithm stacking:</emphasis>
67	Copy versions return the transformed input as a result and thus allow simple chaining of
68	transformations within one expression (e.g. one can write <code>trim_copy(to_upper_copy(s))</code>).
69	Mutable versions return <code>void</code> to avoid misuse.
70	</para>
71	</listitem>
72	<listitem>
73	<para><emphasis role="bold">Naming:</emphasis>
74	Naming follows the conventions from the Standard C++ Library. If there is a
75	copy and a mutable version of the same algorithm, the mutable version has no suffix
76	and the copy version has the suffix <emphasis>_copy</emphasis>.
77	Some algorithms have the prefix <emphasis>i</emphasis>
78	(e.g. <functionname>ifind_first()</functionname>).
79	This prefix identifies that the algorithm works in a case-insensitive manner.
80	</para>
81	</listitem>
82	</itemizedlist>
83	<para>
84	To use the library, include the <headername>boost/algorithm/string.hpp</headername> header.
85	If the regex related functions are needed, include the
86	<headername>boost/algorithm/string_regex.hpp</headername> header.
87	</para>
88	</section>
89	<section>
90	<title>Case conversion</title>
91
92	<para>
93	The STL has a nice way of converting character case. Unfortunately, it works only
94	for a single character, and we want to convert a whole string:
95	</para>
96	<programlisting>
97	string str1("HeLlO WoRld!");
98	to_upper(str1); // str1=="HELLO WORLD!"
99	</programlisting>
100	<para>
101	<functionname>to_upper()</functionname> and <functionname>to_lower()</functionname> convert the case of
102	characters in a string using a specified locale.
103	</para>
104	<para>
105	For more information see the reference for <headername>boost/algorithm/string/case_conv.hpp</headername>.
106	</para>
107	</section>
108	<section>
109	<title>Predicates and Classification</title>
110	<para>
111	A part of the library deals with string related predicates. Consider this example:
112	</para>
113	<programlisting>
114	bool is_executable( const string& filename )
115	{
116	return
117	iends_with(filename, ".exe") ||
118	iends_with(filename, ".com");
119	}
120
121	// ...
122	string str1("command.com");
123	cout
124	<< str1
125	<< (is_executable("command.com")? " is": " is not")
126	<< " an executable"
127	<< endl; // prints "command.com is an executable"
128
129	//..
130	char text1[]="hello";
131	cout
132	<< text1
133	<< (all( text1, is_lower() )? " is": " is not")
134	<< " written in the lower case"
135	<< endl; // prints "hello is written in the lower case"
136	</programlisting>
137	<para>
138	The predicates determine whether a substring is contained in the input string
139	under various conditions. The conditions are: a string starts with the substring,
140	ends with the substring,
141	simply contains the substring, or both strings are equal. See the reference for
142	<headername>boost/algorithm/string/predicate.hpp</headername> for more details.
143	</para>
144	<para>
145	In addition, the algorithm <functionname>all()</functionname> checks
146	whether all elements of a container satisfy a condition specified by a predicate.
147	This predicate can be any unary predicate, but the library provides a number of
148	useful string-related predicates and combinators ready for use.
149	These are located in the <headername>boost/algorithm/string/classification.hpp</headername> header.
150	Classification predicates can be combined using logical combinators to form
151	more complex expressions, for example <code>is_from_range('a','z') || is_digit()</code>.
152	</para>
153	</section>
154	<section>
155	<title>Trimming</title>
156
157	<para>
158	When parsing the input from a user, strings usually have unwanted leading or trailing
159	characters. To get rid of them, we need trim functions:
160	</para>
161	<programlisting>
162	string str1=" hello world! ";
163	string str2=trim_left_copy(str1); // str2 == "hello world! "
164	string str3=trim_right_copy(str1); // str3 == " hello world!"
165	trim(str1); // str1 == "hello world!"
166
167	string phone="00423333444";
168	// remove leading 0 from the phone number
169	trim_left_if(phone,is_any_of("0")); // phone == "423333444"
170	</programlisting>
171	<para>
172	It is possible to trim the spaces on the right, on the left or on both sides of a string.
173	For those cases when there is a need to remove something other than blank space, there
174	are <emphasis>_if</emphasis> variants. Using these, a user can specify a functor which will
175	select the <emphasis>space</emphasis> to be removed. It is possible to use classification
176	predicates like <functionname>is_digit()</functionname> mentioned in the previous section.
177	See the reference for <headername>boost/algorithm/string/trim.hpp</headername>.
178	</para>
179	</section>
180	<section>
181	<title>Find algorithms</title>
182
183	<para>
184	The library contains a set of find algorithms. Here is an example:
185	</para>
186	<programlisting>
187	char text[]="hello dolly!";
188	iterator_range<char*> result=find_last(text,"ll");
189
190	transform( result.begin(), result.end(), result.begin(), bind2nd(plus<char>(), 1) );
191	// text == "hello dommy!"
192
193	to_upper(result); // text == "hello doMMy!"
194
195	// iterator_range is convertible to bool ("dolly" was modified above, so look for "hello")
196	if(find_first(text, "hello"))
197	{
198	cout << "Hello is there" << endl;
199	}
200	</programlisting>
201	<para>
202	We have used <functionname>find_last()</functionname> to search the <code>text</code> for "ll".
203	The result is given in the <ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>.
204	This range delimits the
205	part of the input which satisfies the find criteria. In our example it is the last occurrence of "ll".
206
207	As we can see, the input of the <functionname>find_last()</functionname> algorithm can also be
208	a char[], because this type is supported by
209	<ulink url="../../libs/range/index.html">Boost.Range</ulink>.
210
211	The following lines transform the result. Notice that
212	<ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink> has familiar
213	<code>begin()</code> and <code>end()</code> methods, so it can be used like any other STL container.
214	It is also convertible to bool, so it is easy to use find algorithms for simple containment checking.
215	</para>
216	<para>
217	Find algorithms are located in <headername>boost/algorithm/string/find.hpp</headername>.
218	</para>
219	</section>
220	<section>
221	<title>Replace Algorithms</title>
222	<para>
223	Find algorithms can be used for searching for a specific part of string. Replace goes one step
224	further. After a matching part is found, it is substituted with something else. The substitution is computed
225	from the original, using some transformation.
226	</para>
227	<programlisting>
228	string str1="Hello Dolly, Hello World!";
229	replace_first(str1, "Dolly", "Jane"); // str1 == "Hello Jane, Hello World!"
230	replace_last(str1, "Hello", "Goodbye"); // str1 == "Hello Jane, Goodbye World!"
231	erase_all(str1, " "); // str1 == "HelloJane,GoodbyeWorld!"
232	erase_head(str1, 5); // str1 == "Jane,GoodbyeWorld!"
233	</programlisting>
234	<para>
235	For the complete list of replace and erase functions see the
236	<link linkend="string_algo.reference">reference</link>.
237	There are many predefined functions for common usage; however, the library also allows you to
238	define a custom <code>replace()</code> that suits a specific need. There is a generic <functionname>find_format()</functionname>
239	function which, in addition to the input, takes two parameters.
240	The first one is a <link linkend="string_algo.finder_concept">Finder</link> object, the second one is
241	a <link linkend="string_algo.formatter_concept">Formatter</link> object.
242	The Finder object is a functor which performs the searching for the replacement part. The Formatter object
243	takes the result of the Finder (usually a reference to the found substring) and creates a
244	substitute for it. The replace algorithm puts these two together and makes the desired substitution.
245	</para>
246	<para>
247	Check <headername>boost/algorithm/string/replace.hpp</headername>, <headername>boost/algorithm/string/erase.hpp</headername> and
248	<headername>boost/algorithm/string/find_format.hpp</headername> for reference.
249	</para>
250	</section>
251	<section>
252	<title>Find Iterator</title>
253
254	<para>
255	An extension to the find algorithms is the Find Iterator. Instead of searching for just one part of a string,
256	the find iterator allows us to iterate over the substrings matching the specified criteria.
257	This facility uses the <link linkend="string_algo.finder_concept">Finder</link> to incrementally
258	search the string.
259	Dereferencing a find iterator yields a <ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>
260	object that delimits the current match.
261	</para>
262	<para>
263	There are two iterators provided: <classname>find_iterator</classname> and
264	<classname>split_iterator</classname>. The former iterates over substrings that are found using the specified
265	Finder. The latter iterates over the gaps between these substrings.
266	</para>
267	<programlisting>
268	string str1("abc-*-ABC-*-aBc");
269	// Find all 'abc' substrings (ignoring the case)
270	// Create a find_iterator
271	typedef find_iterator<string::iterator> string_find_iterator;
272	for(string_find_iterator It=
273	make_find_iterator(str1, first_finder("abc", is_iequal()));
274	It!=string_find_iterator();
275	++It)
276	{
277	cout << copy_range<std::string>(*It) << endl;
278	}
279
280	// Output will be:
281	// abc
282	// ABC
283	// aBc
284
285	typedef split_iterator<string::iterator> string_split_iterator;
286	for(string_split_iterator It=
287	make_split_iterator(str1, first_finder("-*-", is_iequal()));
288	It!=string_split_iterator();
289	++It)
290	{
291	cout << copy_range<std::string>(*It) << endl;
292	}
293
294	// Output will be:
295	// abc
296	// ABC
297	// aBc
298	</programlisting>
299	<para>
300	Note that the find iterators have only one template parameter. It is the base iterator type.
301	The Finder is specified at runtime. This allows us to typedef a find iterator for
302	common string types and reuse it. Additionally make_*_iterator functions help
303	to construct a find iterator for a particular range.
304	</para>
305	<para>
306	See the reference in <headername>boost/algorithm/string/find_iterator.hpp</headername>.
307	</para>
308	</section>
309	<section>
310	<title>Split</title>
311
312	<para>
313	Split algorithms are an extension to the find iterator for one common usage scenario.
314	These algorithms use a find iterator and store all matches into the provided
315	container. This container must be able to hold copies (e.g. <code>std::string</code>) or
316	references (e.g. <code>iterator_range</code>) of the extracted substrings.
317	</para>
318	<para>
319	Two algorithms are provided. <functionname>find_all()</functionname> finds all copies
320	of a string in the input. <functionname>split()</functionname> splits the input into parts.
321	</para>
322
323	<programlisting>
324	string str1("hello abc-*-ABC-*-aBc goodbye");
325
326	typedef vector< iterator_range<string::iterator> > find_vector_type;
327
328	find_vector_type FindVec; // #1: Search for separators
329	ifind_all( FindVec, str1, "abc" ); // FindVec == { [abc],[ABC],[aBc] }
330
331	typedef vector< string > split_vector_type;
332
333	split_vector_type SplitVec; // #2: Search for tokens
334	split( SplitVec, str1, is_any_of("-*"), token_compress_on ); // SplitVec == { "hello abc","ABC","aBc goodbye" }
335	</programlisting>
336	<para>
337	<code>[hello]</code> designates an <code>iterator_range</code> delimiting this substring.
338	</para>
339	<para>
340	The first example shows how to construct a container to hold references to all extracted
341	substrings. The algorithm <functionname>ifind_all()</functionname> puts into FindVec references
342	to all substrings that are equal to "abc", ignoring case.
343	</para>
344	<para>
345	The second example uses <functionname>split()</functionname> to split the string str1 into parts
346	separated by the characters '-' or '*'. These parts are then put into SplitVec.
347	It is possible to specify whether adjacent separators are merged (<code>token_compress_on</code>) or not (<code>token_compress_off</code>).
348	</para>
349	<para>
350	More information can be found in the reference: <headername>boost/algorithm/string/split.hpp</headername>.
351	</para>
352	</section>
353	</section>
