
source: orxonox.OLD/orxonox/trunk/src/util/newmat/newmatnl.h @ 4565

Last change on this file since 4565 was 4565, checked in by patrick, 19 years ago

orxonox/trunk: added the newmat library to the project. needs some translation in directory, temp under util/newmat. is needed by the collision detection engine to perform lin alg operations such as eigenvector decomposition. perhaps we will make our own library to do that later.

File size: 10.8 KB
//$$ newmatnl.h           definition file for non-linear optimisation

// Copyright (C) 1993,4,5: R B Davies

#ifndef NEWMATNL_LIB
#define NEWMATNL_LIB 0

#include "newmat.h"

#ifdef use_namespace
namespace NEWMAT {
#endif


/*

This is the beginning of a series of classes for non-linear optimisation.

At present there are two classes. FindMaximum2 is the basic optimisation
strategy for problems where one has first derivatives and estimates of
the second derivatives. Class NonLinearLeastSquares is derived from
FindMaximum2. This provides the functions that calculate function values
and derivatives.

A third class is now added. This is for doing maximum-likelihood when
you have first derivatives and something like the Fisher Information
matrix (eg the variance-covariance matrix of the first derivatives or
minus the second derivatives - this matrix is assumed to be positive
definite).


   class FindMaximum2

Suppose T is the ColumnVector of parameters, F(T) the function we want
to maximise, D(T) the ColumnVector of derivatives of F with respect to
T, and S(T) the matrix of second derivatives.

Then the basic iteration is: given a value of T, update it to

           T - S.i() * D

where .i() denotes inverse.

If F were quadratic this would give exactly the right answer (except it
might find a minimum rather than a maximum). Since F is not usually
quadratic, the simple procedure would be to recalculate S and D with the
new value of T and keep iterating until the process converges. This is
known as the Newton-Raphson method.
51
In practice, this method may not converge. FindMaximum2 considers an
iteration of the form

           T - x * S.i() * D

where x is a number. It tries x = 1 and uses the values of F and its
slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
chooses x to maximise the resulting cubic. This gives our new value of
T. The program checks that the value of F is getting better and carries
out a variety of strategies if it is not.
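The cubic step can be sketched in a few lines. This is a hypothetical
stand-alone version, not the library's code: given F and its slope at
x = 0 and x = 1, fit the interpolating cubic and return the x at its
maximum (the name cubic_step is ours).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch of the cubic line search, not the library's code.
// Fit c(x) = f0 + g0*x + b*x^2 + a*x^3 through f(0)=f0, f'(0)=g0,
// f(1)=f1, f'(1)=g1, then return the x that maximises c.
double cubic_step(double f0, double g0, double f1, double g1)
{
    double d = f1 - f0 - g0;            // a + b
    double e = g1 - g0;                 // 3a + 2b
    double a = e - 2.0 * d;
    double b = 3.0 * d - e;

    if (std::fabs(a) < 1e-12)           // cubic degenerates to a quadratic
        return (b < 0.0) ? -g0 / (2.0 * b) : 1.0;

    double disc = b * b - 3.0 * a * g0; // from c'(x) = g0 + 2b*x + 3a*x^2
    if (disc < 0.0) return 1.0;         // no stationary point: keep x = 1
    double r  = std::sqrt(disc);
    double x1 = (-b + r) / (3.0 * a);
    double x2 = (-b - r) / (3.0 * a);
    // the maximum is the root where c''(x) = 2b + 6a*x is negative
    return (2.0 * b + 6.0 * a * x1 < 0.0) ? x1 : x2;
}
```

The real routine also guards against steps that make F worse; this
sketch omits those safeguards.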
62
The program also has a second strategy. If the successive values of T
seem to be lying along a curve (eg we are following along a curved
ridge), the program will try to fit this ridge and project along it.
This does not work at present and is commented out.

FindMaximum2 has three virtual functions which need to be overridden by
a derived class.
70
   void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);

T is the column vector of parameters. The function returns the value of
the function in f, but may instead set oorg to true if the parameter
values are not valid. If wg is true it may also calculate and store the
second derivative information.

   bool NextPoint(ColumnVector& H, Real& d);

Using the value of T provided in the previous call of Value, find the
Newton-Raphson adjustment to T, that is - S.i() * D. Also return

           d = D.t() * S.i() * D.

NextPoint should return true if it considers that the process has
converged (d very small) and false otherwise. The previous call of Value
will have set wg to true, so that S will be available.
88
   Real LastDerivative(const ColumnVector& H);

Return the scalar product of H and the vector of derivatives at the last
value of T.

Fit is the function that carries out the iteration.

   void Fit(ColumnVector&, int);

The arguments are the trial parameter values as a ColumnVector and the
maximum number of iterations. The program throws a DataException if the
initial parameters are not valid and a ConvergenceException if the
process fails to converge.
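The contract between Fit and the three virtual functions can be
illustrated with a one-parameter toy written in plain doubles. The names
Parabola and newton_fit are made up for this sketch and are not part of
the library, and the loop omits the line search and safeguards described
above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical one-parameter analogue of the FindMaximum2 contract,
// maximising f(t) = -(t - 3)^2, so D = -2(t - 3) and S = -2.
struct Parabola
{
    double deriv;                           // D at the last T passed to Value

    void Value(double T, bool wg, double& f, bool& oorg)
    {
        oorg = false;                       // every T is valid here
        f = -(T - 3.0) * (T - 3.0);
        if (wg) deriv = -2.0 * (T - 3.0);   // store derivative information
    }
    bool NextPoint(double& h, double& d)
    {
        h = -deriv / (-2.0);                // - S.i() * D
        d = deriv * (1.0 / -2.0) * deriv;   // D.t() * S.i() * D
        return std::fabs(d) < 1e-10;        // converged when d is very small
    }
};

// A bare-bones Fit loop: just the basic iteration, no line search.
double newton_fit(Parabola& p, double T, int max_iterations)
{
    for (int i = 0; i < max_iterations; ++i)
    {
        double f, h, d; bool oorg;
        p.Value(T, true, f, oorg);          // wg = true so D and S are ready
        if (p.NextPoint(h, d)) break;       // convergence test
        T += h;                             // T - S.i() * D
    }
    return T;
}
```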


   class NonLinearLeastSquares

This class is derived from FindMaximum2 and carries out a non-linear
least squares fit. It uses a QR decomposition to carry out the
operations required by FindMaximum2.

A prototype class R1_Col_I_D is provided. The user needs to derive a
class from this which includes functions giving the predicted value of
each observation and its derivatives. An object from this class has to
be provided to class NonLinearLeastSquares.

Suppose we observe n normal random variables with the same unknown
variance and such that the i-th one has expected value given by f(i,P),
where P is a column vector of unknown parameters and f is a known
function. We wish to estimate P.
119
First derive a class from R1_Col_I_D and override Real operator()(int i)
to give the value of the function f in terms of i and the ColumnVector
para defined in class R1_Col_I_D. Also override ReturnMatrix
Derivatives() to give the derivatives of f at para and the value of i
used in the preceding call to operator(). Return the result as a
RowVector. Construct an object from this class. Suppose in what follows
it is called pred.

Now construct a NonLinearLeastSquares object accessing pred and
optionally an iteration limit and an accuracy criterion.

   NonLinearLeastSquares NLLS(pred, 1000, 0.0001);

The accuracy criterion should be somewhat less than one and 0.0001 is
about the smallest sensible value.

Define a ColumnVector P containing a guess at the values of the unknown
parameters, and a ColumnVector Y containing the observed data. Call

   NLLS.Fit(Y,P);

If the process converges, P will contain the estimates of the unknown
parameters. If it does not converge an exception will be generated.
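The Gauss-Newton idea that NonLinearLeastSquares implements can be shown
without the library. This is a hypothetical one-parameter sketch fitting
y_i = exp(p * x_i); the name gauss_newton_exp is illustrative, not a
library function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical one-parameter Gauss-Newton sketch, not the library's code.
// Model: y_i = exp(p * x_i); residual r_i = y_i - f_i; Jacobian J_i = x_i * f_i.
double gauss_newton_exp(const std::vector<double>& x,
                        const std::vector<double>& y,
                        double p, int max_iterations)
{
    for (int it = 0; it < max_iterations; ++it)
    {
        double jtj = 0.0, jtr = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            double fi = std::exp(p * x[i]);   // predicted value f(i, p)
            double ji = x[i] * fi;            // derivative of f wrt p
            double ri = y[i] - fi;            // residual
            jtj += ji * ji;                   // J'J
            jtr += ji * ri;                   // J'r
        }
        double step = jtr / jtj;              // (J'J)^-1 J'r
        p += step;
        if (std::fabs(step) < 1e-12) break;   // converged
    }
    return p;
}
```

The library version does the same kind of update but solves the linear
step via a QR decomposition instead of forming J'J.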
143
The following member functions can be called after you have done a fit.

   Real ResidualVariance() const;

The estimate of the variance of the observations.

   void GetResiduals(ColumnVector& Z) const;

The residuals of the individual observations.

   void GetStandardErrors(ColumnVector&);

The standard errors of the parameter estimates.

   void GetCorrelations(SymmetricMatrix&);

The correlations of the parameter estimates.

   void GetHatDiagonal(DiagonalMatrix&) const;

Forms a diagonal matrix of values between 0 and 1. If the i-th value is
larger than, say, 0.2, then the i-th data value could have an undue
influence on your estimates.
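For the simplest case, a straight-line fit y = a + b*x, the hat diagonal
has a closed form, h_i = 1/n + (x_i - mean)^2 / sum((x_j - mean)^2).
This stand-alone sketch (the name hat_diagonal is ours, not the
library's) shows how an outlying x value gets high leverage:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical illustration of the hat diagonal (leverage) for a
// straight-line fit; a value well above the average p/n flags an
// observation with undue influence on the estimates.
std::vector<double> hat_diagonal(const std::vector<double>& x)
{
    double n = static_cast<double>(x.size());
    double mean = 0.0;
    for (double xi : x) mean += xi;
    mean /= n;
    double sxx = 0.0;                       // sum of squared deviations
    for (double xi : x) sxx += (xi - mean) * (xi - mean);
    std::vector<double> h;
    for (double xi : x)
        h.push_back(1.0 / n + (xi - mean) * (xi - mean) / sxx);
    return h;
}
```

The leverages always sum to the number of fitted parameters (here 2),
so the average leverage is p/n.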


*/

class FindMaximum2
{
   virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
   virtual bool NextPoint(ColumnVector&, Real&) = 0;
   virtual Real LastDerivative(const ColumnVector&) = 0;
public:
   void Fit(ColumnVector&, int);
   virtual ~FindMaximum2() {}            // to keep gnu happy
};

class R1_Col_I_D
{
   // The prototype for a Real function of a ColumnVector and an
   // integer.
   // You need to derive your function from this one and put in your
   // function for operator() and Derivatives() at least.
   // You may also want to set up a constructor to enter additional
   // parameter values (that will not vary during the solve).

protected:
   ColumnVector para;                 // Current x value

public:
   virtual bool IsValid() { return true; }
                                       // is the current x value OK
   virtual Real operator()(int i) = 0; // i-th function value at current para
   virtual void Set(const ColumnVector& X) { para = X; }
                                       // set current para
   bool IsValid(const ColumnVector& X)
      { Set(X); return IsValid(); }
                                       // set para, check OK
   Real operator()(int i, const ColumnVector& X)
      { Set(X); return operator()(i); }
                                       // set para, return value
   virtual ReturnMatrix Derivatives() = 0;
                                       // return derivatives as RowVector
   virtual ~R1_Col_I_D() {}            // to keep gnu happy
};


class NonLinearLeastSquares : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   Matrix X;                         // the things we need to do the
   ColumnVector Y;                   // QR triangularisation
   UpperTriangularMatrix U;          // see the write-up in newmata.txt
   ColumnVector M;
   Real errorvar, criterion;
   int n_obs, n_param;
   const ColumnVector* DataPointer;
   RowVector Derivs;
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;
   R1_Col_I_D& Pred;                 // Reference to predictor object
   int Lim;                          // maximum number of iterations

public:
   NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
      : criterion(crit), Pred(pred), Lim(lim) {}
   void Fit(const ColumnVector&, ColumnVector&);
   Real ResidualVariance() const { return errorvar; }
   void GetResiduals(ColumnVector& Z) const { Z = Y; }
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);
   void GetHatDiagonal(DiagonalMatrix&) const;

private:
   void MakeCovariance();
};
244
245
246// The next class is the prototype class for calculating the
247// log-likelihood.
248// I assume first derivatives are available and something like the
249// Fisher Information or variance/covariance matrix of the first
250// derivatives or minus the matrix of second derivatives is
251// available. This matrix must be positive definite.

class LL_D_FI
{
protected:
   ColumnVector para;                  // current parameter values
   bool wg;                            // true if FI matrix wanted

public:
   virtual void Set(const ColumnVector& X) { para = X; }
                                       // set parameter values
   virtual void WG(bool wgx) { wg = wgx; }
                                       // set wg

   virtual bool IsValid() { return true; }
                                       // return true if para is OK
   bool IsValid(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return IsValid(); }

   virtual Real LogLikelihood() = 0;   // return the log-likelihood
   Real LogLikelihood(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return LogLikelihood(); }

   virtual ReturnMatrix Derivatives() = 0;
                                       // column vector of derivatives
   virtual ReturnMatrix FI() = 0;      // Fisher Information matrix
   virtual ~LL_D_FI() {}               // to keep gnu happy
};

// This is the class for doing the maximum likelihood estimation

class MLE_D_FI : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   // the things we need for the analysis
   LL_D_FI& LL;                        // reference to log-likelihood
   int Lim;                            // maximum number of iterations
   Real Criterion;                     // convergence criterion
   ColumnVector Derivs;                // for the derivatives
   LowerTriangularMatrix LT;           // Cholesky decomposition of FI
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;

public:
   MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
      : LL(ll), Lim(lim), Criterion(criterion) {}
   void Fit(ColumnVector& Parameters);
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);

private:
   void MakeCovariance();
};
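As a concrete illustration of the scoring iteration that MLE_D_FI
drives, here is a hypothetical stand-alone example for the Poisson rate
lambda, where the score is sum(x)/lambda - n and the Fisher Information
is n/lambda. The name poisson_scoring is illustrative, not part of the
library.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sketch of the method of scoring for the Poisson rate:
// repeat lambda <- lambda + FI.i() * score until the step is tiny.
double poisson_scoring(const std::vector<double>& x, double lambda,
                       int max_iterations)
{
    double n = static_cast<double>(x.size());
    double sum = 0.0;
    for (double xi : x) sum += xi;
    for (int it = 0; it < max_iterations; ++it)
    {
        double score = sum / lambda - n;       // d logL / d lambda
        double fi = n / lambda;                // Fisher Information
        double step = score / fi;              // FI.i() * score
        lambda += step;
        if (std::fabs(step) < 1e-12) break;    // converged
    }
    return lambda;
}
```

For the Poisson rate the iteration lands on the sample mean (the MLE)
in one step; the point of MLE_D_FI is to run the same update for models
where no closed form exists.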


#ifdef use_namespace
}
#endif



#endif

// body file: newmatnl.cpp