/// \ingroup newmat
///@{

/// \file newmatnl.h
/// Header file for non-linear optimisation

// Copyright (C) 1993,4,5: R B Davies

#ifndef NEWMATNL_LIB
#define NEWMATNL_LIB 0

#include "newmat.h"

#ifdef use_namespace
namespace NEWMAT {
#endif



/*

This is the beginning of a series of classes for non-linear optimisation.

At present there are two classes. FindMaximum2 is the basic optimisation
strategy when one is doing an optimisation where one has first
derivatives and estimates of the second derivatives. Class
NonLinearLeastSquares is derived from FindMaximum2. This provides the
functions that calculate function values and derivatives.

A third class is now added. This is for doing maximum-likelihood when
you have first derivatives and something like the Fisher Information
matrix (eg the variance-covariance matrix of the first derivatives or
minus the matrix of second derivatives - this matrix is assumed to be
positive definite).



class FindMaximum2

Suppose T is the ColumnVector of parameters, F(T) the function we want
to maximise, D(T) the ColumnVector of derivatives of F with respect to
T, and S(T) the matrix of second derivatives.

Then the basic iteration is: given a value of T, update it to

   T - S.i() * D

where .i() denotes inverse.
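
For example, a single update of this kind might be written in newmat as
follows (an illustrative sketch only - in practice D and S come from
your own derivative calculations rather than being loaded directly):

   ColumnVector T(2);   T << 0.0 << 0.0;      // current parameter values
   ColumnVector D(2);   D << 1.0 << -2.0;     // first derivatives at T
   SymmetricMatrix S(2);
   S << -2.0
     <<  0.0 << -4.0;                         // second derivatives at T
   T = T - S.i() * D;                         // the updated value of T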

If F was quadratic this would give exactly the right answer (except it
might get a minimum rather than a maximum). Since F is not usually
quadratic, the simple procedure would be to recalculate S and D with the
new value of T and keep iterating until the process converges. This is
the Newton-Raphson procedure.

In practice, this method may not converge. FindMaximum2 considers an
iteration of the form

   T - x * S.i() * D

where x is a number. It tries x = 1 and uses the values of F and its
slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
chooses x to maximise the resulting cubic. This gives our new value of
T. The program checks that the value of F is getting better and carries
out a variety of strategies if it is not.
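
(For reference, writing c(x) for the fitted cubic, the unique cubic with
the given values and slopes at x = 0 and x = 1 is

   c(x) = c(0) + c'(0) x + a x^2 + b x^3,  where
   a = 3 (c(1) - c(0)) - 2 c'(0) - c'(1),
   b = c'(0) + c'(1) - 2 (c(1) - c(0)),

and the new x is taken at a maximum of this cubic.)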

The program also has a second strategy. If the successive values of T
seem to be lying along a curve (eg we are following along a curved
ridge), the program will try to fit this ridge and project along it.
This does not work at present and is commented out.

FindMaximum2 has three virtual functions which need to be overridden by
a derived class.

   void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);

T is the column vector of parameters. The function returns the value of
the function in f, but may instead set oorg to true if the parameter
values are not valid. If wg is true it may also calculate and store the
second derivative information.

   bool NextPoint(ColumnVector& H, Real& d);

Using the value of T provided in the previous call of Value, find the
Newton-Raphson adjustment to T, that is, - S.i() * D. Also return

   d = D.t() * S.i() * D.

NextPoint should return true if it considers that the process has
converged (d very small) and false otherwise. The previous call of Value
will have set wg to true, so that S will be available.

   Real LastDerivative(const ColumnVector& H);

Return the scalar product of H and the vector of derivatives at the last
value of T.

The function Fit is the function that calls the iteration.

   void Fit(ColumnVector&, int);

The arguments are the trial parameter values as a ColumnVector and the
maximum number of iterations. The program throws a DataException if the
initial parameters are not valid and a ConvergenceException if the
process fails to converge.


class NonLinearLeastSquares

This class is derived from FindMaximum2 and carries out a non-linear
least squares fit. It uses a QR decomposition to carry out the
operations required by FindMaximum2.

A prototype class R1_Col_I_D is provided. The user needs to derive a
class from this which includes functions giving the predicted value of
each observation and its derivatives. An object from this class has to
be provided to class NonLinearLeastSquares.

Suppose we observe n normal random variables with the same unknown
variance and such that the i-th one has expected value given by f(i,P)
where P is a column vector of unknown parameters and f is a known
function. We wish to estimate P.
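
That is, P is estimated by least squares: the fitted value of P is the
one that minimises the sum over i of the squared differences between
the i-th observation and f(i,P).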

First derive a class from R1_Col_I_D and override Real operator()(int i)
to give the value of the function f in terms of i and the ColumnVector
para defined in class R1_Col_I_D. Also override ReturnMatrix
Derivatives() to give the derivatives of f at para and the value of i
used in the preceding call to operator(). Return the result as a
RowVector. Construct an object from this class. Suppose in what follows
it is called pred.
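
For example (an illustrative sketch only - the class name Pred, its
members and the model are not part of the library, and exp() assumes
<cmath> has been included), a model f(i,P) = P(1) * exp(-P(2) * x(i))
with known values x(i) might be coded as

   class Pred : public R1_Col_I_D
   {
      ColumnVector x;                         // the known x values
      int last_i;                             // i from the last call of operator()
   public:
      Pred(const ColumnVector& X) : x(X), last_i(1) {}
      Real operator()(int i)
         { last_i = i; return para(1) * exp(-para(2) * x(i)); }
      ReturnMatrix Derivatives()
      {
         RowVector D(2);
         Real e = exp(-para(2) * x(last_i));
         D(1) = e;                            // d f / d P(1)
         D(2) = -para(1) * x(last_i) * e;     // d f / d P(2)
         return D.ForReturn();
      }
   };

   ColumnVector xvals(4);  xvals << 1.0 << 2.0 << 3.0 << 4.0;
   Pred pred(xvals);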

Now construct a NonLinearLeastSquares object accessing pred and
optionally an iteration limit and an accuracy criterion.

   NonLinearLeastSquares NLLS(pred, 1000, 0.0001);

The accuracy criterion should be somewhat less than one and 0.0001 is
about the smallest sensible value.

Define a ColumnVector P containing a guess at the values of the unknown
parameters, and a ColumnVector Y containing the observed data. Call

   NLLS.Fit(Y,P);

If the process converges, P will contain the estimates of the unknown
parameters. If it does not converge an exception will be generated.

The following member functions can be called after you have done a fit.

   Real ResidualVariance() const;

The estimate of the variance of the observations.

   void GetResiduals(ColumnVector& Z) const;

The residuals of the individual observations.

   void GetStandardErrors(ColumnVector&);

The standard errors of the estimates of the parameters.

   void GetCorrelations(SymmetricMatrix&);

The correlations of the estimates of the parameters.

   void GetHatDiagonal(DiagonalMatrix&) const;

Forms a diagonal matrix of values between 0 and 1. If the i-th value is
larger than, say, 0.2, then the i-th data value could have an undue
influence on your estimates.
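
For example, after a successful fit (the variable names here are
arbitrary):

   Real s2 = NLLS.ResidualVariance();
   ColumnVector Res, SE;  SymmetricMatrix Corr;  DiagonalMatrix Hat;
   NLLS.GetResiduals(Res);
   NLLS.GetStandardErrors(SE);
   NLLS.GetCorrelations(Corr);
   NLLS.GetHatDiagonal(Hat);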


*/

class FindMaximum2
{
   virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
   virtual bool NextPoint(ColumnVector&, Real&) = 0;
   virtual Real LastDerivative(const ColumnVector&) = 0;
public:
   void Fit(ColumnVector&, int);
   virtual ~FindMaximum2() {}                 // to keep gnu happy
};

class R1_Col_I_D
{
   // The prototype for a Real function of a ColumnVector and an
   // integer.
   // You need to derive your function from this one and put in your
   // function for operator() and Derivatives() at least.
   // You may also want to set up a constructor to enter in additional
   // parameter values (that will not vary during the solve).

protected:
   ColumnVector para;                         // current parameter values

public:
   virtual bool IsValid() { return true; }
                                              // is the current para value OK
   virtual Real operator()(int i) = 0;        // i-th function value at current para
   virtual void Set(const ColumnVector& X) { para = X; }
                                              // set current para
   bool IsValid(const ColumnVector& X)
      { Set(X); return IsValid(); }
                                              // set para, check OK
   Real operator()(int i, const ColumnVector& X)
      { Set(X); return operator()(i); }
                                              // set para, return value
   virtual ReturnMatrix Derivatives() = 0;
                                              // return derivatives as a RowVector
   virtual ~R1_Col_I_D() {}                   // to keep gnu happy
};


class NonLinearLeastSquares : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   Matrix X;                                  // the things we need to do the
   ColumnVector Y;                            // QR triangularisation
   UpperTriangularMatrix U;                   // see the write-up in newmata.txt
   ColumnVector M;
   Real errorvar, criterion;
   int n_obs, n_param;
   const ColumnVector* DataPointer;
   RowVector Derivs;
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;
   R1_Col_I_D& Pred;                          // reference to predictor object
   int Lim;                                   // maximum number of iterations

public:
   NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
      : criterion(crit), Pred(pred), Lim(lim) {}
   void Fit(const ColumnVector&, ColumnVector&);
   Real ResidualVariance() const { return errorvar; }
   void GetResiduals(ColumnVector& Z) const { Z = Y; }
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);
   void GetHatDiagonal(DiagonalMatrix&) const;

private:
   void MakeCovariance();
};


// The next class is the prototype class for calculating the
// log-likelihood.
// I assume first derivatives are available and something like the
// Fisher Information or variance/covariance matrix of the first
// derivatives or minus the matrix of second derivatives is
// available. This matrix must be positive definite.

class LL_D_FI
{
protected:
   ColumnVector para;                         // current parameter values
   bool wg;                                   // true if FI matrix wanted

public:
   virtual void Set(const ColumnVector& X) { para = X; }
                                              // set parameter values
   virtual void WG(bool wgx) { wg = wgx; }
                                              // set wg

   virtual bool IsValid() { return true; }
                                              // return true if para is OK
   bool IsValid(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return IsValid(); }

   virtual Real LogLikelihood() = 0;          // return the log-likelihood
   Real LogLikelihood(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return LogLikelihood(); }

   virtual ReturnMatrix Derivatives() = 0;
                                              // column vector of derivatives
   virtual ReturnMatrix FI() = 0;             // Fisher Information matrix
   virtual ~LL_D_FI() {}                      // to keep gnu happy
};

// This is the class for doing the maximum likelihood estimation

class MLE_D_FI : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   // the things we need for the analysis
   LL_D_FI& LL;                               // reference to log-likelihood
   int Lim;                                   // maximum number of iterations
   Real Criterion;                            // convergence criterion
   ColumnVector Derivs;                       // for the derivatives
   LowerTriangularMatrix LT;                  // Cholesky decomposition of FI
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;

public:
   MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
      : LL(ll), Lim(lim), Criterion(criterion) {}
   void Fit(ColumnVector& Parameters);
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);

private:
   void MakeCovariance();
};
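
/*
   Example (an illustrative sketch only, not part of the library; the
   class PoissonLL, its members and the data are hypothetical, and log()
   assumes <cmath> has been included). Maximum-likelihood estimation of
   a Poisson rate lambda from counts y(1),...,y(n): the log-likelihood
   is Sum(y) * log(lambda) - n * lambda (up to a constant), its
   derivative is Sum(y) / lambda - n, and the Fisher Information is
   n / lambda, which is positive for lambda > 0.

   class PoissonLL : public LL_D_FI
   {
      ColumnVector Y;                         // observed counts
   public:
      PoissonLL(const ColumnVector& y) : Y(y) {}
      bool IsValid() { return para(1) > 0.0; }        // rate must be positive
      Real LogLikelihood()
         { return Sum(Y) * log(para(1)) - Y.Nrows() * para(1); }
      ReturnMatrix Derivatives()
      {
         ColumnVector D(1);
         D(1) = Sum(Y) / para(1) - Y.Nrows();
         return D.ForReturn();
      }
      ReturnMatrix FI()
      {
         SymmetricMatrix Info(1);
         Info(1,1) = Y.Nrows() / para(1);
         return Info.ForReturn();
      }
   };

   Usage might then look like

      ColumnVector Counts(5);  Counts << 3.0 << 1.0 << 4.0 << 1.0 << 5.0;
      PoissonLL ll(Counts);
      MLE_D_FI mle(ll, 100, 0.0001);
      ColumnVector Rate(1);  Rate(1) = 1.0;   // starting value
      mle.Fit(Rate);                          // Rate now holds the estimate
*/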


#ifdef use_namespace
}
#endif



#endif

// body file: newmatnl.cpp



///@}
