source: ntrip/trunk/BNC/newmat/newmatnl.h@ 8992

Last change on this file since 8992 was 8901, checked in by stuerze, 5 years ago

upgrade to newmat11 library

File size: 11.2 KB
Line 
1/// \ingroup newmat
2///@{
3
4/// \file newmatnl.h
5/// Header file for non-linear optimisation
6
7// Copyright (C) 1993,4,5: R B Davies
8
9#ifndef NEWMATNL_LIB
10#define NEWMATNL_LIB 0
11
12#include "newmat.h"
13
14#ifdef use_namespace
15namespace NEWMAT {
16#endif
17
18
19
20/*
21
22This is a beginning of a series of classes for non-linear optimisation.
23
24At present there are two classes. FindMaximum2 is the basic optimisation
25strategy when one is doing an optimisation where one has first
26derivatives and estimates of the second derivatives. Class
27NonLinearLeastSquares is derived from FindMaximum2. This provides the
28functions that calculate function values and derivatives.
29
30A third class is now added. This is for doing maximum-likelihood when
31you have first derviatives and something like the Fisher Information
32matrix (eg the variance covariance matrix of the first derivatives or
33minus the second derivatives - this matrix is assumed to be positive
34definite).
35
36
37
38 class FindMaximum2
39
40Suppose T is the ColumnVector of parameters, F(T) the function we want
41to maximise, D(T) the ColumnVector of derivatives of F with respect to
42T, and S(T) the matrix of second derivatives.
43
44Then the basic iteration is given a value of T, update it to
45
46 T - S.i() * D
47
48where .i() denotes inverse.
49
50If F was quadratic this would give exactly the right answer (except it
51might get a minimum rather than a maximum). Since F is not usually
52quadratic, the simple procedure would be to recalculate S and D with the
53new value of T and keep iterating until the process converges. This is
54known as the method of conjugate gradients.
55
56In practice, this method may not converge. FindMaximum2 considers an
57iteration of the form
58
59 T - x * S.i() * D
60
61where x is a number. It tries x = 1 and uses the values of F and its
62slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
63choses x to maximise the resulting function. This gives our new value of
64T. The program checks that the value of F is getting better and carries
65out a variety of strategies if it is not.
66
67The program also has a second strategy. If the successive values of T
68seem to be lying along a curve - eg we are following along a curved
69ridge, the program will try to fit this ridge and project along it. This
70does not work at present and is commented out.
71
72FindMaximum2 has three virtual functions which need to be over-ridden by
73a derived class.
74
75 void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
76
77T is the column vector of parameters. The function returns the value of
78the function to f, but may instead set oorg to true if the parameter
79values are not valid. If wg is true it may also calculate and store the
80second derivative information.
81
82 bool NextPoint(ColumnVector& H, Real& d);
83
84Using the value of T provided in the previous call of Value, find the
85conjugate gradients adjustment to T, that is - S.i() * D. Also return
86
87 d = D.t() * S.i() * D.
88
89NextPoint should return true if it considers that the process has
90converged (d very small) and false otherwise. The previous call of Value
91will have set wg to true, so that S will be available.
92
93 Real LastDerivative(const ColumnVector& H);
94
95Return the scalar product of H and the vector of derivatives at the last
96value of T.
97
98The function Fit is the function that calls the iteration.
99
100 void Fit(ColumnVector&, int);
101
102The arguments are the trial parameter values as a ColumnVector and the
103maximum number of iterations. The program calls a DataException if the
104initial parameters are not valid and a ConvergenceException if the
105process fails to converge.
106
107
108 class NonLinearLeastSquares
109
110This class is derived from FindMaximum2 and carries out a non-linear
111least squares fit. It uses a QR decomposition to carry out the
112operations required by FindMaximum2.
113
114A prototype class R1_Col_I_D is provided. The user needs to derive a
115class from this which includes functions the predicted value of each
116observation its derivatives. An object from this class has to be
117provided to class NonLinearLeastSquares.
118
119Suppose we observe n normal random variables with the same unknown
120variance and such the i-th one has expected value given by f(i,P)
121where P is a column vector of unknown parameters and f is a known
122function. We wish to estimate P.
123
124First derive a class from R1_Col_I_D and override Real operator()(int i)
125to give the value of the function f in terms of i and the ColumnVector
126para defined in class R1_CoL_I_D. Also override ReturnMatrix
127Derivatives() to give the derivates of f at para and the value of i
128used in the preceeding call to operator(). Return the result as a
129RowVector. Construct an object from this class. Suppose in what follows
130it is called pred.
131
132Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
133an iteration limit and an accuracy critierion.
134
135 NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
136
137The accuracy critierion should be somewhat less than one and 0.0001 is
138about the smallest sensible value.
139
140Define a ColumnVector P containing a guess at the value of the unknown
141parameter, and a ColumnVector Y containing the unknown data. Call
142
143 NLLS.Fit(Y,P);
144
145If the process converges, P will contain the estimates of the unknown
146parameters. If it does not converge an exception will be generated.
147
148The following member functions can be called after you have done a fit.
149
150Real ResidualVariance() const;
151
152The estimate of the variance of the observations.
153
154void GetResiduals(ColumnVector& Z) const;
155
156The residuals of the individual observations.
157
158void GetStandardErrors(ColumnVector&);
159
160The standard errors of the observations.
161
162void GetCorrelations(SymmetricMatrix&);
163
164The correlations of the observations.
165
166void GetHatDiagonal(DiagonalMatrix&) const;
167
168Forms a diagonal matrix of values between 0 and 1. If the i-th value is
169larger than, say 0.2, then the i-th data value could have an undue
170influence on your estimates.
171
172
173*/
174
175class FindMaximum2
176{
177 virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
178 virtual bool NextPoint(ColumnVector&, Real&) = 0;
179 virtual Real LastDerivative(const ColumnVector&) = 0;
180public:
181 void Fit(ColumnVector&, int);
182 virtual ~FindMaximum2() {} // to keep gnu happy
183};
184
185class R1_Col_I_D
186{
187 // The prototype for a Real function of a ColumnVector and an
188 // integer.
189 // You need to derive your function from this one and put in your
190 // function for operator() and Derivatives() at least.
191 // You may also want to set up a constructor to enter in additional
192 // parameter values (that will not vary during the solve).
193
194protected:
195 ColumnVector para; // Current x value
196
197public:
198 virtual bool IsValid() { return true; }
199 // is the current x value OK
200 virtual Real operator()(int i) = 0; // i-th function value at current para
201 virtual void Set(const ColumnVector& X) { para = X; }
202 // set current para
203 bool IsValid(const ColumnVector& X)
204 { Set(X); return IsValid(); }
205 // set para, check OK
206 Real operator()(int i, const ColumnVector& X)
207 { Set(X); return operator()(i); }
208 // set para, return value
209 virtual ReturnMatrix Derivatives() = 0;
210 // return derivatives as RowVector
211 virtual ~R1_Col_I_D() {} // to keep gnu happy
212};
213
214
215class NonLinearLeastSquares : public FindMaximum2
216{
217 // these replace the corresponding functions in FindMaximum2
218 void Value(const ColumnVector&, bool, Real&, bool&);
219 bool NextPoint(ColumnVector&, Real&);
220 Real LastDerivative(const ColumnVector&);
221
222 Matrix X; // the things we need to do the
223 ColumnVector Y; // QR triangularisation
224 UpperTriangularMatrix U; // see the write-up in newmata.txt
225 ColumnVector M;
226 Real errorvar, criterion;
227 int n_obs, n_param;
228 const ColumnVector* DataPointer;
229 RowVector Derivs;
230 SymmetricMatrix Covariance;
231 DiagonalMatrix SE;
232 R1_Col_I_D& Pred; // Reference to predictor object
233 int Lim; // maximum number of iterations
234
235public:
236 NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
237 : criterion(crit), Pred(pred), Lim(lim) {}
238 void Fit(const ColumnVector&, ColumnVector&);
239 Real ResidualVariance() const { return errorvar; }
240 void GetResiduals(ColumnVector& Z) const { Z = Y; }
241 void GetStandardErrors(ColumnVector&);
242 void GetCorrelations(SymmetricMatrix&);
243 void GetHatDiagonal(DiagonalMatrix&) const;
244
245private:
246 void MakeCovariance();
247};
248
249
250// The next class is the prototype class for calculating the
251// log-likelihood.
252// I assume first derivatives are available and something like the
253// Fisher Information or variance/covariance matrix of the first
254// derivatives or minus the matrix of second derivatives is
255// available. This matrix must be positive definite.
256
257class LL_D_FI
258{
259protected:
260 ColumnVector para; // current parameter values
261 bool wg; // true if FI matrix wanted
262
263public:
264 virtual void Set(const ColumnVector& X) { para = X; }
265 // set parameter values
266 virtual void WG(bool wgx) { wg = wgx; }
267 // set wg
268
269 virtual bool IsValid() { return true; }
270 // return true is para is OK
271 bool IsValid(const ColumnVector& X, bool wgx=true)
272 { Set(X); WG(wgx); return IsValid(); }
273
274 virtual Real LogLikelihood() = 0; // return the loglikelihhod
275 Real LogLikelihood(const ColumnVector& X, bool wgx=true)
276 { Set(X); WG(wgx); return LogLikelihood(); }
277
278 virtual ReturnMatrix Derivatives() = 0;
279 // column vector of derivatives
280 virtual ReturnMatrix FI() = 0; // Fisher Information matrix
281 virtual ~LL_D_FI() {} // to keep gnu happy
282};
283
284// This is the class for doing the maximum likelihood estimation
285
286class MLE_D_FI : public FindMaximum2
287{
288 // these replace the corresponding functions in FindMaximum2
289 void Value(const ColumnVector&, bool, Real&, bool&);
290 bool NextPoint(ColumnVector&, Real&);
291 Real LastDerivative(const ColumnVector&);
292
293 // the things we need for the analysis
294 LL_D_FI& LL; // reference to log-likelihood
295 int Lim; // maximum number of iterations
296 Real Criterion; // convergence criterion
297 ColumnVector Derivs; // for the derivatives
298 LowerTriangularMatrix LT; // Cholesky decomposition of FI
299 SymmetricMatrix Covariance;
300 DiagonalMatrix SE;
301
302public:
303 MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
304 : LL(ll), Lim(lim), Criterion(criterion) {}
305 void Fit(ColumnVector& Parameters);
306 void GetStandardErrors(ColumnVector&);
307 void GetCorrelations(SymmetricMatrix&);
308
309private:
310 void MakeCovariance();
311};
312
313
314#ifdef use_namespace
315}
316#endif
317
318
319
320#endif
321
322// body file: newmatnl.cpp
323
324
325
326///@}
327
Note: See TracBrowser for help on using the repository browser.