source: downloads/boost_1_34_1/libs/algorithm/string/doc/usage.xml @ 29

Last change on this file since 29 was 29, checked in by landauf, 17 years ago
updated boost from 1_33_1 to 1_34_1
File size: 17.2 KB

1	<?xml version="1.0" encoding="utf-8"?>
2	<!DOCTYPE library PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
3	"http://www.boost.org/tools/boostbook/dtd/boostbook.dtd">
4
5	<!-- Copyright (c) 2002-2006 Pavol Droba.
6	Subject to the Boost Software License, Version 1.0.
7	(See accompanying file LICENSE-1.0 or http://www.boost.org/LICENSE-1.0)
8	-->
9
10	<section id="string_algo.usage" last-revision="$Date: 2007/01/30 07:58:35 $">
11	<title>Usage</title>
12
13	<using-namespace name="boost"/>
14	<using-namespace name="boost::algorithm"/>
15
16
17	<section>
18	<title>First Example</title>
19
20	<para>
21	Using the algorithms is straightforward. Let us have a look at the first example:
22	</para>
23	<programlisting>
24	#include <boost/algorithm/string.hpp>
25	using namespace std;
26	using namespace boost;
27
28	// ...
29
30	string str1(" hello world! ");
31	to_upper(str1); // str1 == " HELLO WORLD! "
32	trim(str1); // str1 == "HELLO WORLD!"
33
34	string str2=
35	to_lower_copy(
36	ireplace_first_copy(
37	str1,"hello","goodbye")); // str2 == "goodbye world!"
38	</programlisting>
39	<para>
40	This example converts str1 to upper case and trims spaces from the start and the end
41	of the string. str2 is then created as a lower-case copy of str1 with "hello" replaced by "goodbye".
42	This example demonstrates several important concepts used in the library:
43	</para>
44	<itemizedlist>
45	<listitem>
46	<para><emphasis role="bold">Container parameters:</emphasis>
47	Unlike the STL algorithms, these algorithms do not take their input only as pairs
48	of iterators. The STL convention allows for great flexibility,
49	but it has several limitations. It is not possible to <emphasis>stack</emphasis> algorithms together,
50	because a container is passed in two parameters, so the return value of one algorithm
51	cannot be used directly as the input of another. It is also considerably easier to write
52	<code>to_lower(str1)</code> than <code>to_lower(str1.begin(), str1.end())</code>.
53	</para>
54	<para>
55	The magic of <ulink url="../../libs/range/index.html">Boost.Range</ulink>
56	provides a uniform way of handling different string types.
57	If there is a need to pass a pair of iterators,
58	<ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>
59	can be used to package iterators into a structure with a compatible interface.
60	</para>
61	</listitem>
62	<listitem>
63	<para><emphasis role="bold">Copy vs. Mutable:</emphasis>
64	Many algorithms in the library perform a transformation of the input.
65	The transformation can be done in-place, mutating the input sequence, or a copy
66	of the transformed input can be created, leaving the input intact. Neither
67	approach is superior to the other; each has its own
68	advantages and disadvantages. For this reason, both are provided by the library.
69	</para>
70	</listitem>
71	<listitem>
72	<para><emphasis role="bold">Algorithm stacking:</emphasis>
73	Copy versions return the transformed input as a result and thus allow simple chaining of
74	transformations within one expression (e.g. one can write <code>trim_copy(to_upper_copy(s))</code>).
75	Mutable versions return <code>void</code> to avoid misuse.
76	</para>
77	</listitem>
78	<listitem>
79	<para><emphasis role="bold">Naming:</emphasis>
80	Naming follows the conventions from the Standard C++ Library. If there is a
81	copy and a mutable version of the same algorithm, the mutable version has no suffix
82	and the copy version has the suffix <emphasis>_copy</emphasis>.
83	Some algorithms have the prefix <emphasis>i</emphasis>
84	(e.g. <functionname>ifind_first()</functionname>).
85	This prefix identifies that the algorithm works in a case-insensitive manner.
86	</para>
87	</listitem>
88	</itemizedlist>
89	<para>
90	To use the library, include the <headername>boost/algorithm/string.hpp</headername> header.
91	If the regex related functions are needed, include the
92	<headername>boost/algorithm/string_regex.hpp</headername> header.
93	</para>
94	</section>
95	<section>
96	<title>Case conversion</title>
97
98	<para>
99	STL has a nice way of converting character case. Unfortunately, it works only
100	for a single character, and we want to convert a whole string:
101	</para>
102	<programlisting>
103	string str1("HeLlO WoRld!");
104	to_upper(str1); // str1=="HELLO WORLD!"
105	</programlisting>
106	<para>
107	<functionname>to_upper()</functionname> and <functionname>to_lower()</functionname> convert the case of
108	characters in a string using a specified locale.
109	</para>
110	<para>
111	For more information see the reference for <headername>boost/algorithm/string/case_conv.hpp</headername>.
112	</para>
113	</section>
114	<section>
115	<title>Predicates and Classification</title>
116	<para>
117	A part of the library deals with string related predicates. Consider this example:
118	</para>
119	<programlisting>
120	bool is_executable( const string& filename )
121	{
122	return
123	iends_with(filename, ".exe") ||
124	iends_with(filename, ".com");
125	}
126
127	// ...
128	string str1("command.com");
129	cout
130	<< str1
131	<< (is_executable(str1) ? " is " : " is not ")
132	<< "an executable"
133	<< endl; // prints "command.com is an executable"
134
135	//..
136	char text1[]="hello";
137	cout
138	<< text1
139	<< (all( text1, is_lower() ) ? " is" : " is not")
140	<< " written in the lower case"
141	<< endl; // prints "hello is written in the lower case"
142	</programlisting>
143	<para>
144	The predicates determine whether a substring is contained in the input string
145	under various conditions. The conditions are: the string starts with the substring,
146	ends with the substring,
147	simply contains the substring, or both strings are equal. See the reference for
148	<headername>boost/algorithm/string/predicate.hpp</headername> for more details.
149	</para>
150	<para>
151	In addition, the algorithm <functionname>all()</functionname> checks
152	whether all elements of a container satisfy a condition specified by a predicate.
153	This predicate can be any unary predicate, but the library provides a number of
154	useful string-related predicates and combinators ready for use.
155	These are located in the <headername>boost/algorithm/string/classification.hpp</headername> header.
156	Classification predicates can be combined using logical combinators to form
157	more complex expressions. For example: <code>is_from_range('a','z') || is_digit()</code>
158	</para>
159	</section>
160	<section>
161	<title>Trimming</title>
162
163	<para>
164	When parsing the input from a user, strings usually have unwanted leading or trailing
165	characters. To get rid of them, we need trim functions:
166	</para>
167	<programlisting>
168	string str1=" hello world! ";
169	string str2=trim_left_copy(str1); // str2 == "hello world! "
170	string str3=trim_right_copy(str1); // str3 == " hello world!"
171	trim(str1); // str1 == "hello world!"
172
173	string phone="00423333444";
174	// remove leading 0 from the phone number
175	trim_left_if(phone,is_any_of("0")); // phone == "423333444"
176	</programlisting>
177	<para>
178	It is possible to trim the spaces on the right, on the left or on both sides of a string.
179	For cases where something other than white space needs to be removed, there
180	are <emphasis>_if</emphasis> variants. Using these, a user can specify a functor which will
181	select the <emphasis>space</emphasis> to be removed. It is possible to use classification
182	predicates like <functionname>is_digit()</functionname> mentioned in the previous section.
183	See the reference for <headername>boost/algorithm/string/trim.hpp</headername>.
184	</para>
185	</section>
186	<section>
187	<title>Find algorithms</title>
188
189	<para>
190	The library contains a set of find algorithms. Here is an example:
191	</para>
192	<programlisting>
193	char text[]="hello dolly!";
194	iterator_range<char*> result=find_last(text,"ll");
195
196	transform( result.begin(), result.end(), result.begin(), bind2nd(plus<char>(), 1) );
197	// text = "hello dommy!"
198
199	to_upper(result); // text == "hello doMMy!"
200
201	// iterator_range is convertible to bool
202	if(find_first(text, "dolly"))
203	{
204	cout << "Dolly is there" << endl;
205	}
206	</programlisting>
207	<para>
208	We have used <functionname>find_last()</functionname> to search the <code>text</code> for "ll".
209	The result is given in the <ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>.
210	This range delimits the
211	part of the input which satisfies the find criteria. In our example it is the last occurrence of "ll".
212
213	As we can see, the input of the <functionname>find_last()</functionname> algorithm can also be a
214	char[], because this type is supported by
215	<ulink url="../../libs/range/index.html">Boost.Range</ulink>.
216
217	The following lines transform the result. Notice that
218	<ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink> has familiar
219	<code>begin()</code> and <code>end()</code> methods, so it can be used like any other STL container.
220	It is also convertible to bool, so find algorithms are easy to use for simple containment checks.
221	</para>
222	<para>
223	Find algorithms are located in <headername>boost/algorithm/string/find.hpp</headername>.
224	</para>
225	</section>
226	<section>
227	<title>Replace Algorithms</title>
228	<para>
229	Find algorithms can be used for searching for a specific part of a string. Replace goes one step
230	further. After a matching part is found, it is substituted with something else. The substitution is computed
231	from the original, using some transformation.
232	</para>
233	<programlisting>
234	string str1="Hello Dolly, Hello World!";
235	replace_first(str1, "Dolly", "Jane"); // str1 == "Hello Jane, Hello World!"
236	replace_last(str1, "Hello", "Goodbye"); // str1 == "Hello Jane, Goodbye World!"
237	erase_all(str1, " "); // str1 == "HelloJane,GoodbyeWorld!"
238	erase_head(str1, 5); // str1 == "Jane,GoodbyeWorld!"
239	</programlisting>
240	<para>
241	For the complete list of replace and erase functions see the
242	<link linkend="string_algo.reference">reference</link>.
243	There are many predefined functions for common use; however, the library also allows you to
244	define a custom <code>replace()</code> that suits a specific need. There is a generic <functionname>find_format()</functionname>
245	function which, in addition to the input, takes two parameters.
246	The first one is a <link linkend="string_algo.finder_concept">Finder</link> object, the second one is
247	a <link linkend="string_algo.formatter_concept">Formatter</link> object.
248	The Finder object is a functor which performs the searching for the replacement part. The Formatter object
249	takes the result of the Finder (usually a reference to the found substring) and creates a
250	substitute for it. The replace algorithm puts these two together and makes the desired substitution.
251	</para>
252	<para>
253	Check <headername>boost/algorithm/string/replace.hpp</headername>, <headername>boost/algorithm/string/erase.hpp</headername> and
254	<headername>boost/algorithm/string/find_format.hpp</headername> for reference.
255	</para>
256	</section>
257	<section>
258	<title>Find Iterator</title>
259
260	<para>
261	An extension to find algorithms is the Find Iterator. Instead of searching for just one part of a string,
262	the find iterator allows us to iterate over the substrings matching the specified criteria.
263	This facility uses the <link linkend="string_algo.finder_concept">Finder</link> to incrementally
264	search the string.
265	Dereferencing a find iterator yields a <ulink url="../../libs/range/doc/utility_class.html"><code>boost::iterator_range</code></ulink>
266	object that delimits the current match.
267	</para>
268	<para>
269	There are two iterators provided: <classname>find_iterator</classname> and
270	<classname>split_iterator</classname>. The former iterates over substrings that are found using the specified
271	Finder. The latter iterates over the gaps between these substrings.
272	</para>
273	<programlisting>
274	string str1("abc-*-ABC-*-aBc");
275	// Find all 'abc' substrings (ignoring the case)
276	// Create a find_iterator
277	typedef find_iterator<string::iterator> string_find_iterator;
278	for(string_find_iterator It=
279	make_find_iterator(str1, first_finder("abc", is_iequal()));
280	It!=string_find_iterator();
281	++It)
282	{
283	cout << copy_range<std::string>(*It) << endl;
284	}
285
286	// Output will be:
287	// abc
288	// ABC
289	// aBc
290
291	typedef split_iterator<string::iterator> string_split_iterator;
292	for(string_split_iterator It=
293	make_split_iterator(str1, first_finder("-*-", is_iequal()));
294	It!=string_split_iterator();
295	++It)
296	{
297	cout << copy_range<std::string>(*It) << endl;
298	}
299
300	// Output will be:
301	// abc
302	// ABC
303	// aBc
304	</programlisting>
305	<para>
306	Note that the find iterators have only one template parameter. It is the base iterator type.
307	The Finder is specified at runtime. This allows us to typedef a find iterator for
308	common string types and reuse it. Additionally make_*_iterator functions help
309	to construct a find iterator for a particular range.
310	</para>
311	<para>
312	See the reference in <headername>boost/algorithm/string/find_iterator.hpp</headername>.
313	</para>
314	</section>
315	<section>
316	<title>Split</title>
317
318	<para>
319	Split algorithms are an extension to the find iterator for one common usage scenario.
320	These algorithms use a find iterator and store all matches into the provided
321	container. This container must be able to hold copies (e.g. <code>std::string</code>) or
322	references (e.g. <code>iterator_range</code>) of the extracted substrings.
323	</para>
324	<para>
325	Two algorithms are provided. <functionname>find_all()</functionname> finds all copies
326	of a string in the input. <functionname>split()</functionname> splits the input into parts.
327	</para>
328
329	<programlisting>
330	string str1("hello abc-*-ABC-*-aBc goodbye");
331
332	typedef vector< iterator_range<string::iterator> > find_vector_type;
333
334	find_vector_type FindVec; // #1: Search for separators
335	ifind_all( FindVec, str1, "abc" ); // FindVec == { [abc],[ABC],[aBc] }
336
337	typedef vector< string > split_vector_type;
338
339	split_vector_type SplitVec; // #2: Search for tokens
340	split( SplitVec, str1, is_any_of("-*"), token_compress_on ); // SplitVec == { "hello abc","ABC","aBc goodbye" }
341	</programlisting>
342	<para>
343	<code>[hello]</code> designates an <code>iterator_range</code> delimiting this substring.
344	</para>
345	<para>
346	The first example shows how to construct a container to hold references to all extracted
347	substrings. The algorithm <functionname>ifind_all()</functionname> puts into FindVec references
348	to all substrings that are equal to "abc" when compared case-insensitively.
349	</para>
350	<para>
351	The second example uses <functionname>split()</functionname> to split string str1 into parts
352	separated by characters '-' or '*'. These parts are then put into the SplitVec.
353	It is possible to specify whether adjacent separators are merged into one or not.
354	</para>
355	<para>
356	More information can be found in the reference: <headername>boost/algorithm/string/split.hpp</headername>.
357	</para>
358	</section>
359	</section>
