InvGN
Calculate Tikhonov-regularized, Gauss-Newton nonlinear iterated inversion to solve the damped nonlinear least squares problem (Matlab code).
Install / Use
/learn @aganse/InvGNREADME
(here's a Markdown-enabled version of manpage.txt which describes InvGN...)
##INVGN - Gauss-Newton inversion - version 1.0.
by Andrew Ganse, Applied Physics Laboratory, University of Washington, Seattle.
andy@ganse.org
Copyright (c) 2015, University of Washington, via 3-clause BSD license.
See LICENSE.txt for full license statement.
INVGN calculates Tikhonov-regularized, Gauss-Newton nonlinear iterated inversion to solve the following damped nonlinear least squares problem:
minimize ||g(m)-d||^2_2 + lambda^2||Lm||^2_2
For appropriate choices of regularization parameter lambda, this problem is equivalent to:
minimize ||Lm||_2 subject to ||g(m)-d||_2<delta
(where delta is some statistically-determined noise threshold)
and also to:
minimize ||g(m)-d||_2 subject to ||Lm||_2<epsilon
(where epsilon is some model-norm threshold)
The function call is set up to allow use on both nonlinear and linear problems, both regularized (inverse) and non-regularized (parameter estimation) problems, and both frequentist and Bayesian problems. See EXAMPLES below for various calls. The dampled NLS regularization is accomplished with the L-curve method (see e.g. Hansen 1987).
This code is based on material from:
- Professor Ken Creager's ESS-523 inverse theory class, Univ of Washington, 2005
- RC Aster, B Borchers, & CH Thurber, "Parameter Estimation and Inverse Problems," Elsevier Academic Press, 2004.
- A Tarantola, "Inverse Problem Theory and Methods for Model Parameter Estimation", Society for Industrial and Applied Mathematics, 2005.
- PC Hansen, "Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion," Society for Industrial Mathematics, 1987.
- PE Gill, W Murray, & MH Wright, "Practical Optimization," Academic Press, 1981.
INVGN is compatible with Octave (version >3), but to handle a few minor remaining differences between Matlab and Octave, do note the "usingOctave" flag among the options listed at top of the InvGN.m script. Perhaps in future versions INVGN will allow these options to be set via keyword/value pairs in the input argument list, but not implemented yet.
<br> ###USAGE: [m,normdm,normmisfit,normrough,lambda,dpred,G,C,R,Ddiag,i_conv] = ...
invGN(fwdfunc,derivfunc,epsr,dmeas,covd,lb,ub,m0,L,lambda,maxiters,useMpref,...
verbose,fid,vargin);
<br>
###INPUTS: (all vectors are always column vectors)
fwdfunc = Matlab/Octave function handle to forward problem function which
is of form outvector=myfunction(invector,...). (See example 1
of "help function_handle" in Matlab for how to implement)
derivfunc = Matlab/Octave function handle to derivatives function which
is of form outvector=myfunction(invector,...).
If derivfunc=[] then finitediffs are automatically used instead.
epsr = relative accuracy of forward problem computation (not the
accuracy of the modeling of the physical process, but of the
computation itself, see e.g. Gill,Murray,Wright 1981). Affects
convergence testing and is passed into finite diff function (if
used), in both cases providing stepsize h (h is about sqrt(epsr)
if the model params are all of order unity). Epsr has a minimum
at the tradeoff between degrading Talor series expansion accuracy
if h too large, and increasing roundoff error in the numerator
G(m+h)-G(m) of finite diffs if h too small.
Epsr can be scalar (ie same for all elements of model vector) or
vector of same length as model vector for 1-to-1 correspondence
to different variable "types" (eg layer bulk properties vs layer
thicknesses).
dmeas = measured data vector (includes noise)
covd = if sqr matrix, then this is measurement noise covariance matrix.
if col vector, then this is diag of meas noise covariance matrix.
if neg, then = -covd^(-1/2), allowing e.g. use of zeros for a mask,
so note in that case it's the inverse of the STDEVS (sqrt(cov)).
lb = lower model bounds vector (hard bnds, no tolerance)
ub = upper model bounds vector (hard bnds, no tolerance)
These bounds are only implemented at the iteration steps after
solving for the model perturbation without them; ie functions
such as BVLS or LSQNONNEG are not used. If the bounds are seen
to be passed in a candidate model perturbation, the stepsize of
the perturbation is reduced (but retaining its direction) until
within bounds. As long as we know the final solution must be
within the bounds and not on them, this is merely like changing
the starting point of inversion and is valid (perhaps somtimes
not as efficient as one of those more complicated functions).
But if the solution from this code ends up on a bound, note you
should treat that solution with some caution and not consider it
a true GN solution. In my own inverse problems using this code
the bounds only ever engage at the tails of my Lcurve where I'm
not too worried about being rigorous, so I haven't found it
necessary to complicate things with those other functions.
m0 = initial model vector estimate to start iterating from; if this
is instead a matrix (NxM), then it's a collection of M initial
model vectors of length N each, corresponding to M lambda values
(Typically this is done to continue iterations on previous runs
without starting over.)
Alternate use: if L (next input arg) = I (identity matrix), then
this is also the "preferred model" whose distance from the soln
model is minimized.
L = If this is a matrix, then this is the user-supplied finite
difference operator for Tikhonov regularization (function
finiteDiffOP.m can help construct it) in the frequentist
framework. If in the Bayesian framework and lambda is set to 1,
then L can be supplied as the Cholesky decomposition of the
inverse model prior covariance matrix.
If L is scalar then it is the number of profile segments (for
example if the model vector represents concatenated P & S wave
speed profiles then Lin=2) and 2nd order Tikhonov (smoothness)
frequentist regularization will be used.
lambda = Used for the frequentist inversion framework (if using Bayesian
framework then set lambda=1 and see Bayesian note for L above).
If lambda is a negative scalar, auto calculate Lcurve with num
pts equal to magnitude of the negative scalar. The auto-calc'd
curve starts at lambda_mid = sqrt(max(eig(G'*G))/max(eig(L'L)))
and decreases logarithmically from there to 3 orders of mag
less and more than lambda_mid by default. If lambda >=0, use
its value(s) directly as frequentist Tikhonov tradeoff param(s),
in which case lambda can be scalar or vector.
If lambda is a 3-element vector, lambda(1) is the pos or neg scalar
just described; and lambda(2:3) are how many orders of magnitude
greater (lambda(2)) or less (lambda(3)) than lambda_mid to sweep
in the auto-calc'd curve. (lambda(2:3) defaults are 3,-3)
useMpref = flag for one of three problem frameworks:
useMpref=0 is for a frequentist formulation with higher-order
Tikhonov regularization. Here the regularization doesn't
handle distance from preferred (initial) model m0, but no
restriction on L (unlike useMpref=1).
useMpref=1 is for using 0th-order-Tikh (ridge regr), ie
L==eye(size(L)), in a frequentist formulation in which m0 is
the "preferred model" the inversion minimizes distance to.
useMpref=2 is for using Bayesian formulation, in which case:
m0 is the prior model mean, lambda must =1, L=chol(inv(Cprior))
(user must set that L externally, note ironically L there must
be RIGHT triangular, the "L" notation is from frequentist case),
output C is now the Bayesian posterior covariance matrix (which
note is different from the frequentist C), outputs R and Ddiag
are empty, and normrough won't have literal meaning anymore.
maxiters = Number of GN iterations to perform at each lambda on the Lcurve.
So total number of derivative estimations will be
numlambdasmaxiters.
verbose = 0-3, with 0 = run quietly, and 3 = lots of progress text
Progress text (percent done) in findiffs is useful for slow
forward problems and/or debugging - but note percent done text
is unavailable in jacfwddist and jaccendist.
fid = File handle to which progress text is output in verbose option.
Note fid=1 outputs to stdout, but an output file is useful for
long batch runs when the forward problem is slow.
vargin = the extra arguments (if any) that must be passed to the fwdprob
function are tacked onto the end here. You may have none, in
which case your last arg is t
