Levenberg-Marquardt [Levenberg44, Marquardt63] is a gradient descent
method. It is a hybrid between the steepest descent and the Newton methods.

The Newton optimization method searches for the minimum of a nonlinear function by following descent directions
determined from the function's first and second partial derivatives. The steepest descent method searches for a
minimum based only on the first derivatives of the function. While the Newton method converges quadratically towards
the minimum when started in its vicinity, it may not converge at all when started far away from the minimum. The
steepest descent method, on the other hand, converges only linearly but, given a suitable step size, is guaranteed to
converge.
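The contrast between the two methods can be illustrated with a small sketch. The test function $f(x) = \sqrt{1 + x^2}$, the starting points, and the fixed step size are assumptions chosen for illustration only; they are not taken from the method's description. For this function the Newton update reduces to $x \mapsto -x^3$, so it converges rapidly for $|x| < 1$ and diverges for $|x| > 1$, whereas steepest descent with a modest fixed step still reaches the minimum at $x = 0$.

```python
import math

# Illustrative test function f(x) = sqrt(1 + x^2) with its minimum at x = 0.
def grad(x):
    """First derivative of f."""
    return x / math.sqrt(1.0 + x * x)

def hess(x):
    """Second derivative of f (always positive, but tiny for large |x|)."""
    return (1.0 + x * x) ** -1.5

def newton(x, steps):
    """Plain Newton iteration: x <- x - f'(x) / f''(x)."""
    for _ in range(steps):
        x = x - grad(x) / hess(x)
    return x

def steepest_descent(x, steps, step_size=0.5):
    """Steepest descent with a fixed (assumed) step size."""
    for _ in range(steps):
        x = x - step_size * grad(x)
    return x

newton_near = newton(0.5, 5)            # started close to the minimum
newton_far = newton(1.5, 5)             # started too far away
sd_far = steepest_descent(1.5, 100)     # same distant start, many cheap steps

print("Newton from 0.5:          ", newton_near)
print("Newton from 1.5:          ", newton_far)
print("Steepest descent from 1.5:", sd_far)
```

Started at 0.5, Newton's iterates collapse towards zero within a few steps; started at 1.5, they grow without bound. Steepest descent needs far more iterations but approaches the minimum from the same distant start.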

Levenberg first suggested an improvement to the Newton method in order to make it more robust, i.e. to overcome the
problem of non-convergence. His suggestion was to add a factor to the diagonal elements of the Hessian matrix of
second derivatives when not close to the minimum (this can be judged by how positive definite the matrix is). When
this factor is large compared to the elements of the Hessian, the method effectively becomes the steepest descent
method. Later, Marquardt suggested that the factor should be multiplicative rather than additive and also defined a
heuristic to make this factor increase or decrease. The method known as Levenberg-Marquardt is thus an adaptive
method that effectively switches between the steepest descent and the Newton methods.
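In symbols, the two modifications can be written as follows. The notation is an assumption introduced here for illustration: $H$ is the Hessian, $g$ the gradient, $\lambda \ge 0$ the factor, $I$ the identity matrix, and $\Delta x$ the step taken from the current point.

$$
(H + \lambda I)\,\Delta x = -g \qquad \text{(Levenberg, additive)}
$$

$$
\bigl(H + \lambda\,\mathrm{diag}(H)\bigr)\,\Delta x = -g \qquad \text{(Marquardt, multiplicative)}
$$

For $\lambda \to 0$ both reduce to the Newton step $\Delta x = -H^{-1} g$. For large $\lambda$ the additive form gives $\Delta x \approx -g/\lambda$, a short step along the steepest descent direction. In the multiplicative form each diagonal element $H_{ii}$ is scaled by $(1 + \lambda)$, so the step length adapts to the curvature along each coordinate.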

The original suggestions of Levenberg and Marquardt were effective in enhancing the Gauss-Newton method, a variant of
the Newton method specifically for minimizing least-squares functions. In this case there is the additional advantage
that the second derivatives do not need to be calculated, as they are estimated from the gradients of the residuals.
Subsequently Goldfeld et al. [Goldfeld66] extended the method to the case of general non-linear functions.
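The Gauss-Newton estimate of the second derivatives can be made explicit. The symbols below are assumptions for illustration: $r_i(x)$ are the residuals, $J$ is their Jacobian with entries $J_{ij} = \partial r_i / \partial x_j$, and the factor $\tfrac{1}{2}$ is a common convention rather than something stated in the text.

$$
f(x) = \tfrac{1}{2} \sum_i r_i(x)^2, \qquad
g = J^{T} r, \qquad
H = J^{T} J + \sum_i r_i \, \nabla^2 r_i \;\approx\; J^{T} J
$$

Dropping the term that contains $\nabla^2 r_i$ leaves an approximation built from first derivatives only. It is accurate when the residuals are small or nearly linear in $x$.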

### Options for Levenberg-Marquardt

**Iteration Limit**
- This parameter is a positive integer determining the maximum number of iterations the algorithm shall perform.
The default is '200'.

**Tolerance**
- This parameter is a positive value determining the tolerance with which the solution shall be determined. If
the improvement between two steps is less than the tolerance, the algorithm stops. The default is '$10^{-5}$'.
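How the two options could interact with the damping factor is sketched below for a small least-squares fit. This is a minimal illustration under stated assumptions, not the implementation described here: the model $y = a\,e^{bt}$, the synthetic data, the starting damping value `1e-3`, the factor of 10 used to raise and lower it, and the reading of "improvement" as the decrease in the sum of squared residuals are all hypothetical choices. Rejected trial steps also count towards the iteration limit in this sketch.

```python
import math

def residuals(a, b, ts, ys):
    """Residuals of the assumed model y = a * exp(b * t)."""
    return [a * math.exp(b * t) - y for t, y in zip(ts, ys)]

def jacobian(a, b, ts):
    """Rows (d r_i / d a, d r_i / d b) of the residual Jacobian."""
    return [(math.exp(b * t), a * t * math.exp(b * t)) for t in ts]

def levenberg_marquardt(a, b, ts, ys, iteration_limit=200, tolerance=1e-5):
    lam = 1e-3  # assumed starting damping factor
    cost = sum(r * r for r in residuals(a, b, ts, ys))
    iterations = 0
    for iterations in range(1, iteration_limit + 1):
        r = residuals(a, b, ts, ys)
        J = jacobian(a, b, ts)
        # Gauss-Newton approximation: H ~ J^T J, gradient g = J^T r.
        h11 = sum(ja * ja for ja, _ in J)
        h12 = sum(ja * jb for ja, jb in J)
        h22 = sum(jb * jb for _, jb in J)
        g1 = sum(ja * ri for (ja, _), ri in zip(J, r))
        g2 = sum(jb * ri for (_, jb), ri in zip(J, r))
        # Marquardt's multiplicative damping of the diagonal elements.
        d11 = h11 * (1.0 + lam)
        d22 = h22 * (1.0 + lam)
        # Solve the damped 2x2 system for the step by Cramer's rule.
        det = d11 * d22 - h12 * h12
        da = (-g1 * d22 + h12 * g2) / det
        db = (-d11 * g2 + h12 * g1) / det
        new_cost = sum(x * x for x in residuals(a + da, b + db, ts, ys))
        if new_cost < cost:
            # Accept the step and move towards Newton-like behaviour.
            improvement = cost - new_cost
            a, b, cost = a + da, b + db, new_cost
            lam /= 10.0
            if improvement < tolerance:
                break
        else:
            # Reject the step and move towards steepest descent.
            lam *= 10.0
    return a, b, cost, iterations

# Synthetic, noise-free data generated from a = 2, b = -1 (assumed values).
ts = [0.5 * i for i in range(7)]
ys = [2.0 * math.exp(-t) for t in ts]
fit_a, fit_b, fit_cost, fit_iterations = levenberg_marquardt(1.0, 0.0, ts, ys)
print(fit_a, fit_b, fit_cost, fit_iterations)
```

A smaller tolerance makes the loop run longer before it stops, while the iteration limit bounds the work when the damping factor has to be raised repeatedly. The sketch assumes a nonzero starting value for `a`; with `a = 0` the second diagonal element vanishes and the multiplicative damping cannot regularize the system.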