optimizer.Rd
This function serves as a wrapper around optimize, optim, and ctmm's partial-Newton optimization routine, with standardized arguments and return values. It finds the optimal parameters that minimize a function, whether it be a cost, loss, risk, or negative log-likelihood function.
optimizer(par,fn,...,method="pNewton",lower=-Inf,upper=Inf,period=FALSE,reset=identity,
control=list())
par
Initial parameter guess.
fn
Function to be minimized with first argument par and optional argument zero (see 'Details' below).
...
Optional arguments fed to fn.
method
Optimization algorithm (see 'Details' below).
lower
Lower bound for parameters.
upper
Upper bound for parameters.
period
Period of circular parameters if not FALSE.
reset
Optional function to re-center parameters, if symmetry permits, to prevent numerical underflow.
control
Argument list for the optimization routine (see 'Details' below).
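As a rough illustration of the arguments above, the following sketch minimizes a hypothetical quadratic objective. The objective, the starting values, and the name `target` are assumptions made for this example only, and the call is guarded so that it is skipped when the ctmm package is not installed.

```r
# Hypothetical objective: a quadratic bowl with its minimum at `target`.
target <- c(1, 2)
fn <- function(par) sum((par - target)^2)

# Only call the wrapper if ctmm is available on this machine.
if (requireNamespace("ctmm", quietly = TRUE)) {
  # Default method is "pNewton"; par is the initial parameter guess.
  fit <- ctmm::optimizer(par = c(0.5, 0.5), fn = fn, method = "pNewton")
  print(fit$par)    # estimated location of the minimum
  print(fit$value)  # value of fn at that location
}
```

The starting values are deliberately non-zero here, since the default parscale is derived from the magnitude of par.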
Only method='pNewton' works in both one dimension and multiple dimensions. In one dimension, any other method argument is ignored in favor of optimize, with a backup evaluation of nlm (under a log-link) for cases where optimize is known to fail. In multiple dimensions, the methods other than pNewton are those detailed in optim.
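For orientation, the base R routines named above can be called directly as follows. This is only a sketch with hypothetical objectives (`fn1`, `fn2`), and it shows why a wrapper with standardized return values is convenient: optimize and optim label their results differently.

```r
# One dimension: optimize() searches an interval and returns
# $minimum (location) and $objective (function value).
fn1 <- function(par) (par - 3)^2
res1 <- optimize(fn1, interval = c(0, 10))

# Multiple dimensions: optim() returns $par (location) and $value (function value).
fn2 <- function(par) sum((par - c(1, 2))^2)
res2 <- optim(par = c(0, 0), fn = fn2, method = "BFGS")

print(c(res1$minimum, res1$objective))
print(c(res2$par, res2$value))
```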
method='pNewton' is ctmm's partial-Newton optimizer, which is a quasi-Newton method that is more accurate than BFGS-based methods when the gradient of fn must be calculated numerically. In short, while BFGS-based methods provide a single rank-1 update to the Hessian matrix per iteration, the partial-Newton algorithm provides length(par)+1 rank-1 updates to the Hessian matrix per iteration, at the same computational cost. Furthermore, length(par) of those updates have better numerical precision than the BFGS update, meaning that they can be used at smaller step sizes to obtain better numerical precision. The pNewton optimizer also supports several features not found in other R optimizers: the zero argument, the period argument, and parallelization.
The zero argument is an optional argument in fn supported by method='pNewton'. Briefly, if you rewrite a negative log-likelihood of the form \(fn = \sum_{i=1}^n fn_i\) as \(fn = \sum_{i=1}^n ( fn_i - zero/n ) + zero\), where zero is the current estimate of the minimum value of fn, then the sum becomes approximately "zeroed" and so the variance in numerical errors caused by the difference in magnitude between fn and fn_i is mitigated. In practice, without the zero argument, log-likelihood functions grow in magnitude with increasing data and then require increasing numerical precision to resolve the same differences in log-likelihood. But absolute differences in log-likelihoods (on the order of 1) are always important, even though most optimization routines more naturally consider relative differences as being important.
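The rewriting described above can be sketched in base R. The data `x`, the Gaussian model with unit variance, and the helper `nll_terms` are assumptions made for illustration. The sketch follows the formula literally by adding zero back inside fn; how the value of zero is exchanged with the optimizer internally is not spelled out here, so treat this as a demonstration of the algebra rather than a template for the exact interface.

```r
# Hypothetical data: 101 evenly spaced observations.
x <- seq(-1, 1, length.out = 101)
n <- length(x)

# Per-observation negative log-likelihood terms fn_i for a mean parameter `par`
# (Gaussian with known unit variance).
nll_terms <- function(par) 0.5 * (x - par)^2 + 0.5 * log(2 * pi)

# fn in the "zeroed" form: sum_i (fn_i - zero/n) + zero.
# With zero = 0 this reduces to the plain sum of the fn_i.
fn <- function(par, zero = 0) {
  sum(nll_terms(par) - zero / n) + zero
}

plain  <- fn(0.3)                # ordinary sum
zeroed <- fn(0.3, zero = plain)  # treat `plain` as the current minimum estimate

# The two forms agree up to floating-point rounding.
print(plain - zeroed)
```

When zero is close to the value of fn, the summed terms are individually close to zero on average, so the running sum stays small in magnitude instead of growing with n.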
The period argument informs method='pNewton' whether parameters are circular, such as with angles, and what their periods are.
The control list can take the following arguments, with defaults shown:
precision=1/2
Fraction of machine numerical precision to target in the maximized likelihood value. The optimal par will have half this precision. On most computers, precision=1 is approximately 16 decimal digits of precision for the objective function and 8 for the optimal par.
maxit=.Machine$integer.max
Maximum number of iterations allowed for optimization.
parscale=pmin(abs(par),abs(par-lower),abs(upper-par))
The natural scale of the parameters such that variations in par on the order of parscale produce variations in fn on the order of one.
trace=FALSE
Return step-by-step progress on optimization.
cores=1
Perform cores evaluations of fn in parallel, if running on UNIX. cores<=0 will use all available cores except abs(cores) of them. This feature is only supported by method='pNewton' and is only useful if fn is slow to evaluate, length(par)>1, and the total number of parallel evaluations required does not trigger fork-bomb detection by the OS.
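To make the control defaults more concrete, the sketch below evaluates the default parscale expression and translates a precision setting into approximate decimal digits. The values of par, lower, and upper are hypothetical, and the digit calculation is one reading of "fraction of machine numerical precision", not a statement of how the optimizer computes its tolerances internally.

```r
# Hypothetical starting values and bounds.
par   <- c(2, 0.5)
lower <- c(0, 0)
upper <- c(Inf, 1)

# Default parscale as listed above: the smallest of |par| and the
# distances from par to each bound, taken element-wise.
parscale <- pmin(abs(par), abs(par - lower), abs(upper - par))

# Approximate decimal digits implied by a given precision setting.
digits_full <- -log10(.Machine$double.eps)  # roughly 16 on most computers
precision   <- 1/2
digits_fn   <- precision * digits_full      # digits targeted in fn
digits_par  <- digits_fn / 2                # par has half that precision

# A control list using only the argument names listed above.
control <- list(precision = precision, maxit = 1000,
                parscale = parscale, trace = FALSE, cores = 1)
print(parscale)
print(c(digits_fn, digits_par))
```

Such a list would be passed as the control argument of optimizer.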
Returns a list with components par for the optimal parameters, value for the minimum value of fn, and possibly other components depending on the optimization routine employed.
method='pNewton' is very stringent about achieving its precision target and assumes that fn has small enough numerical errors (permitting the use of argument zero) to achieve that precision target. If the numerical errors in fn are too large, then the optimizer can fail to converge. ctmm.fit standardizes its input data before optimization, and back-transforms afterwards, as one method to minimize numerical errors in fn.