`optimizer.Rd`

This function serves as a wrapper around `optimize`, `optim`, and `ctmm`'s partial-Newton optimization routine, with standardized arguments and return values. It finds the optimal parameters that minimize a function, whether it be a cost, loss, risk, or negative log-likelihood function.

optimizer(par,fn,...,method="pNewton",lower=-Inf,upper=Inf,period=FALSE,reset=identity, control=list())

par | Initial parameter guess. |
---|---|
fn | Function to be minimized with first argument |
... | Optional arguments fed to |
method | Optimization algorithm (see 'Details' below). |
lower | Lower bound for parameters. |
upper | Upper bound for parameters. |
period | Period of circular parameters if not |
reset | Optional function to re-center parameters, if symmetry permits, to prevent numerical underflow. |
control | Argument list for the optimization routine (see 'Details' below). |

Only `method='pNewton'` will work in both one dimension and multiple dimensions. Any other `method` argument will be ignored in one dimension, in favor of `optimize` with a backup evaluation of `nlm` (under a log-link) for cases where `optimize` is known to fail. In multiple dimensions, methods other than `pNewton` include those detailed in `optim`.
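To illustrate why standardized return values are useful when the underlying routine changes with dimension, the following minimal sketch uses only base R. The toy objectives `f1` and `f2` and the helper `standardize` are hypothetical names introduced for illustration; this is not `ctmm`'s code.

```r
# Toy objectives (hypothetical, for illustration only).
f1 <- function(x) (x - 2)^2                      # one parameter
f2 <- function(p) (p[1] - 1)^2 + (p[2] + 3)^2    # two parameters

# Base R names the result components differently for each routine.
raw1 <- optimize(f1, interval = c(-10, 10))      # $minimum, $objective
raw2 <- optim(c(0, 0), f2, method = "BFGS")      # $par, $value, ...

# Hypothetical helper: map either result onto list(par = , value = ).
standardize <- function(res) {
  if (!is.null(res$minimum)) {
    list(par = res$minimum, value = res$objective)
  } else {
    list(par = res$par, value = res$value)
  }
}

fit1 <- standardize(raw1)
fit2 <- standardize(raw2)
```

After standardization, downstream code can read `fit$par` and `fit$value` without knowing which routine produced the result.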

`method='pNewton'` is `ctmm`'s partial-Newton optimizer, which is a quasi-Newton method that is more accurate than BFGS-based methods when the gradient of `fn` must be calculated numerically. In short, while BFGS-based methods provide a single rank-1 update to the Hessian matrix per iteration, the partial-Newton algorithm provides `length(par)+1` rank-1 updates to the Hessian matrix per iteration, at the same computational cost. Furthermore, `length(par)` of those updates have better numerical precision than the BFGS update, meaning that they can be used at smaller step sizes to obtain better numerical precision. The `pNewton` optimizer also supports several features not found in other `R` optimizers: the `zero` argument, the `period` argument, and parallelization.
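As general background on how a numerically differentiated objective can carry curvature information, the sketch below shows that the function evaluations needed for a central-difference gradient also give a second derivative along each coordinate axis. It is not the partial-Newton algorithm itself. The helper `axis_differences`, the step size `h`, and the test function `quad` are assumptions made for illustration.

```r
# Hypothetical helper: central differences along each coordinate axis.
axis_differences <- function(fn, par, h = 1e-3) {
  f0 <- fn(par)
  n <- length(par)
  grad <- numeric(n)
  curv <- numeric(n)
  for (i in seq_len(n)) {
    step <- numeric(n)
    step[i] <- h
    f_plus <- fn(par + step)
    f_minus <- fn(par - step)
    grad[i] <- (f_plus - f_minus) / (2 * h)        # first derivative
    curv[i] <- (f_plus - 2 * f0 + f_minus) / h^2   # second derivative
  }
  list(gradient = grad, curvature = curv)
}

# Quadratic test function with a known gradient and Hessian diagonal.
quad <- function(p) p[1]^2 + 3 * p[2]^2 + p[1] * p[2]
d <- axis_differences(quad, c(1, 2))
```

For this quadratic, the analytic gradient at `c(1, 2)` is `c(4, 13)` and the Hessian diagonal is `c(2, 6)`, which the finite differences reproduce up to rounding error.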

The `zero` argument is an optional argument in `fn` supported by `method='pNewton'`. Briefly, if you rewrite a negative log-likelihood of the form \(fn = \sum_{i=1}^n fn_i\) as \(fn = \sum_{i=1}^n ( fn_i - zero/n ) + zero\), where `zero` is the current estimate of the minimum value of `fn`, then the sum becomes approximately "zeroed" and so the variance in numerical errors caused by the difference in magnitude between `fn` and `fn_i` is mitigated. In practice, without the `zero` argument, log-likelihood functions grow in magnitude with increasing data and then require increasing numerical precision to resolve the same differences in log-likelihood. But absolute differences in log-likelihoods (on the order of 1) are always important, even though most optimization routines more naturally consider relative differences as being important.
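The rewrite described above can be sketched with a Gaussian negative log-likelihood in base R. The simulated data and the names `nll_terms` and `nll` are hypothetical. The sketch only demonstrates the algebraic identity and the difference in magnitude of the summed terms; it does not show how `optimizer` itself supplies `zero` to `fn`.

```r
set.seed(1)
n <- 1e5
x <- rnorm(n, mean = 3, sd = 2)

# Per-observation terms fn_i of a Gaussian negative log-likelihood (sd known).
nll_terms <- function(mu) -dnorm(x, mean = mu, sd = 2, log = TRUE)

# Objective with an optional `zero` argument, following the rewrite above.
nll <- function(mu, zero = 0) sum(nll_terms(mu) - zero / n) + zero

zero <- nll(mean(x))                            # current estimate of the minimum
plain <- nll(3.01)                              # magnitude grows with n
zeroed_sum <- sum(nll_terms(3.01) - zero / n)   # small compared with `plain`
```

Here `nll(3.01, zero = zero)` agrees with `plain`, while the zeroed sum is orders of magnitude smaller than `plain`, so differences of order one are no longer buried under a large total.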

The `period` argument informs `method='pNewton'` whether parameters are circular, such as angles, and what their periods are.
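To make concrete what circularity means for a parameter, the following sketch wraps values onto one period and computes the shortest signed difference between two circular values. The helpers `wrap_circular` and `circular_diff` are hypothetical and written in base R; how `pNewton` uses `period` internally is not documented here.

```r
# Hypothetical helper: map a circular parameter onto [-period/2, period/2).
wrap_circular <- function(theta, period = 2 * pi) {
  ((theta + period / 2) %% period) - period / 2
}

# Shortest signed difference between two circular values.
circular_diff <- function(a, b, period = 2 * pi) wrap_circular(a - b, period)

wrap_circular(2 * pi + 0.1)        # same angle as 0.1
circular_diff(0.1, 2 * pi - 0.1)   # about 0.2, not about -6.08
```

An optimizer that treats an angle as an ordinary real number would regard `0.1` and `2 * pi - 0.1` as far apart, even though they differ by only `0.2` on the circle.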

The `control` list can take the following arguments, with defaults shown:

`precision=1/2`

Fraction of machine numerical precision to target in the maximized likelihood value. The optimal `par` will have half this precision. On most computers, `precision=1` is approximately 16 decimal digits of precision for the objective function and 8 for the optimal `par`.

`maxit=.Machine$integer.max`

Maximum number of iterations allowed for optimization.

`parscale=pmin(abs(par),abs(par-lower),abs(upper-par))`

The natural scale of the parameters such that variations in `par` on the order of `parscale` produce variations in `fn` on the order of one.

`trace=FALSE`

Return step-by-step progress on optimization.

`cores=1`

Perform `cores` evaluations of `fn` in parallel, if running on UNIX. `cores<=0` will use all available cores except `abs(cores)`. This feature is only supported by `method='pNewton'` and is only useful if `fn` is slow to evaluate, `length(par)>1`, and the total number of parallel evaluations required does not trigger fork-bomb detection by the OS.
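The defaults listed above can be worked through by hand in base R. The parameter values and bounds below are hypothetical. Reading "fraction of machine precision" as a power of `.Machine$double.eps` is an assumption, chosen because it reproduces the 16 and 8 digit figures quoted above; it is not taken from `ctmm`'s source.

```r
par <- c(2, 0.9)
lower <- c(0, 0)
upper <- c(Inf, 1)

# Default parameter scale, as given in the control list above.
parscale <- pmin(abs(par), abs(par - lower), abs(upper - par))

# Assumption: precision acts as an exponent on the machine epsilon.
precision <- 1 / 2
tol_value <- .Machine$double.eps^precision       # target for the value of fn
tol_par <- .Machine$double.eps^(precision / 2)   # target for par (half the precision)
digits_full <- -log10(.Machine$double.eps)       # decimal digits at precision = 1

# A control list spelling out the documented defaults explicitly.
control <- list(precision = precision, maxit = .Machine$integer.max,
                parscale = parscale, trace = FALSE, cores = 1)
```

The second parameter sits close to its upper bound, so its default scale is the distance to that bound (about `0.1`) rather than its magnitude.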

Returns a list with components `par` for the optimal parameters, `value` for the minimum value of `fn`, and possibly other components depending on the optimization routine employed.

C. H. Fleming.

`method='pNewton'` is very stringent about achieving its `precision` target and assumes that `fn` has small enough numerical errors (permitting the use of argument `zero`) to achieve that `precision` target. If the numerical errors in `fn` are too large, then the optimizer can fail to converge. `ctmm.fit` standardizes its input data before optimization, and back-transforms afterwards, as one method to minimize numerical errors in `fn`.
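The general idea of standardizing before optimization and back-transforming afterwards can be sketched with a Gaussian fit in base R. This is an illustration under assumed data and uses `optim`; it is not the procedure that `ctmm.fit` applies, and the names `nll_z`, `mu_hat`, and `sigma_hat` are hypothetical.

```r
set.seed(42)
x <- 1e6 + rnorm(100, sd = 0.01)   # large offset, tiny spread

# Standardize so the optimizer works with order-one numbers.
m <- mean(x)
s <- sd(x)
z <- (x - m) / s

# Gaussian negative log-likelihood in standardized units; sd under a log link.
nll_z <- function(p) -sum(dnorm(z, mean = p[1], sd = exp(p[2]), log = TRUE))

fit_z <- optim(c(0, 0), nll_z, method = "BFGS", control = list(reltol = 1e-12))

# Back-transform the estimates to the original units.
mu_hat <- m + s * fit_z$par[1]
sigma_hat <- s * exp(fit_z$par[2])
```

Working in standardized units keeps the parameters and their natural scales near one, so the optimizer does not need to resolve a spread of `0.01` on top of an offset of one million.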