approximate derivatives of a function (Jacobian or Hessian)
J = numderivative(f, x) J = numderivative(f, x, h) J = numderivative(f, x, h, order) J = numderivative(f, x, h, order, H_form) J = numderivative(f, x, h, order, H_form, Q) [J, H] = numderivative(...)
a function or a list, the function to differentiate.
a n-by-1 or 1-by-n vector of doubles, real, the point where to compute the derivatives.
a 1-by-1 or n-by-1 vector of doubles, the step used in the finite difference
approximations.
If h
is not provided, then the default step is computed
depending on x
and the order
.
If h
is a 1-by-1 matrix, it is expanded
to the same size as x
.
a 1-by-1 matrix of doubles, integer, positive, the order of the finite difference
formula (default order=2
).
The available values of order
are 1, 2 or 4.
a string, the form in which the Hessian will be
returned (default H_form="default"
).
The available values are "default", "blockmat" or "hypermat". See the section "The shape of the Hessian" below for details on this option.
a real matrix of doubles, modifies the directions of differentiation (default is Q=eye(n,n)
).
The matrix Q
is expected to be orthogonal. Q provides the possibility to remove
the arbitrariness of using the canonical basis to approximate the derivatives of a function.
a m-by-n matrix of doubles, the approximated Jacobian.
The row J(k, :)
approximates the gradient of fk
,
for k = 1, 2, ..., m
.
a matrix of doubles, the approximated Hessian.
The elements of H
approximate the second-order partial derivatives of f
.
Computes the approximated Jacobian and Hessian matrices of a function with finite differences.
The function f
takes as
input argument x
, a n-by-1
vector,
and returns y
, a m-by-1
vector.
In the following, the k-th component of f
is denoted by fk
and the Hessian matrix of
fk
is denoted by Hk
,
for k = 1, 2, ..., m
.
Any optional argument equal to the empty matrix []
is replaced by its default value.
In general, the order=1
formula provides the
largest error (least accurate), the order = 2
formula provides an average
error and the order=4
formula provides the lowest error (most accurate).
However, there are cases for which this is not true: see the section "Accuracy issues" below
for more details on this topic.
The second derivatives are computed by composition of first order derivatives.
The function f
must have the following header:
y = f (x)
where
a n-by-1 vector of doubles, the current point
a m-by-1 vector of doubles, the function value
It might happen that the function requires additional arguments to be evaluated.
In this case, we can use the following feature.
The argument f
can also be the list (myfun, a1, a2, ...)
.
In this case, myfun
, the first element in the list, must be a function and must
have the header:
y = myfun (x, a1, a2, ...)
a1, a2, ...
are automatically appended as parameters to the call.The H_form
option changes the dimensions of the output argument
H
.
This manages the general case where m
is different from 1
,
that is, when there are several functions to differentiate at the same time.
The possible values of H_form
are:
H is a m-by-(n^2)
matrix; in this form,
the row H(k, :)
contains Hk
:
H(k, :) == Hk(:)'
H is a (mn)-by-n
matrix: the n-by-n
Hessian
matrices of each component of f
are stacked row-by-row, that is:
H == [H1 ; H2 ; ... ; Hm]
H is a n-by-n matrix for m=1
, and a n-by-n-by-m hypermatrix otherwise.
The matrix H(:, :, k)
is the Hessian matrix of the k-th
component of f
, i.e.
H(:, :, k) == Hk
If the problem is correctly scaled, increasing the order of the finite difference formula may reduce
the error of approximation but requires more function evaluations.
The following list presents the number of function evaluations required to
compute the Jacobian depending on the order of the formula and the dimension of x
,
denoted by n
:
order = 1
, the number of function evaluations is n+1
,
order = 2
, the number of function evaluations is 2n
,
order = 4
, the number of function evaluations is 4n
.
Computing the Hessian matrix requires the square of the number of function evaluations, as detailed in the following list.
order = 1
, the number of function evaluations is (n+1)^2
(total is (n+1)^2+n+1
),
order = 2
, the number of function evaluations is 4n^2
(total is 4n^2+2n
),
order = 4
, the number of function evaluations is 16n^2
(total is 16n^2+4n
).
// The function to differentiate function y=f(x) f1 = sin(x(1)*x(2)) + exp(x(2)*x(3)+x(1)) f2 = sum(x.^3) y = [f1; f2] endfunction // The exact gradient function [g1, g2]=exactg(x) g1(1) = cos(x(1)*x(2))*x(2) + exp(x(2)*x(3)+x(1)) g1(2) = cos(x(1)*x(2))*x(1) + exp(x(2)*x(3)+x(1))*x(3) g1(3) = exp(x(2)*x(3) + x(1))*x(2) g2(1) = 3*x(1)^2 g2(2) = 3*x(2)^2 g2(3) = 3*x(3)^2 endfunction // The exact Hessian function [H1, H2]=exactH(x) H1(1, 1) = -sin(x(1)*x(2))*x(2)^2 + exp(x(2)*x(3)+x(1)) H1(1, 2) = cos(x(1)*x(2)) - sin(x(1)*x(2))*x(2)*x(1) + exp(x(2)*x(3)+x(1))*x(3) H1(1, 3) = exp(x(2)*x(3)+x(1))*x(2) H1(2, 1) = H1(1, 2) H1(2, 2) = -sin(x(1)*x(2))*x(1)^2 + exp(x(2)*x(3)+x(1))*x(3)^2 H1(2, 3) = exp(x(2)*x(3)+x(1)) + exp(x(2)*x(3)+x(1))*x(3)*x(2) H1(3, 1) = H1(1, 3) H1(3, 2) = H1(2, 3) H1(3, 3) = exp(x(2)*x(3)+x(1))*x(2)^2 // H2(1, 1) = 6*x(1) H2(1, 2) = 0 H2(1, 3) = 0 H2(2, 1) = H2(1, 2) H2(2, 2) = 6*x(2) H2(2, 3) = 0 H2(3, 1) = H2(1, 3) H2(3, 2) = H2(2, 3) H2(3, 3) = 6*x(3) endfunction // Compute the approximate Jacobian, the Hessian x = [1; 2; 3]; J = numderivative(f, x) [J, H] = numderivative(f, x) // Compare with exact derivatives [g1, g2] = exactg(x); Jexact = [g1'; g2'] [H1, H2] = exactH(x); Hexact = [H1(:)'; H2(:)'] // Configure the step J = numderivative(f, x, 1.e-1) // Configure the order J = numderivative(f, x, [], 4) // Configure Hessian shapes [J, H] = numderivative(f, x, [], [], "blockmat") [J, H] = numderivative(f, x, [], [], "hypermat") // Configure Q n = 3; Q = qr(rand(n, n)) J = numderivative(f, x, [], [], [], Q) // Test order 1, 2 and 4 formulas. // For the order 4 formula, there are some entries in H // which are computed as nonzero. [g1, g2] = exactg(x); [H1, H2] = exactH(x); Jexact = [g1'; g2']; Hexact = [H1(:)'; H2(:)']; for i = [1, 2, 4] [J, H] = numderivative(f, x, [], i); dJ = floor(min(assert_computedigits(J, Jexact))); dH = floor(min(assert_computedigits(H, Hexact))); mprintf("order = %d, dJ = %d, dH = %d \n", i, dJ, dH); end | ![]() | ![]() |
In the following example, we use a function which requires the extra-argument
p
.
Hence, we use the feature that the argument f
can be a list, where the first argument is the function
G
and the remaining elements are automatically passed
to the function.
If H
is given in its default form, then the Taylor series of
f(x)
up to terms of second order is given by:
In the following script, we check this formula with numerical differences.
// The function to differentiate function y=f(x) f1 = sin(x(1)*x(2)) + exp(x(2)*x(3)+x(1)) f2 = sum(x.^3) y = [f1; f2] endfunction x = [1; 2; 3]; dx = 1e-3*[1; 1; -1]; [J, H] = numderivative(f, x); f(x+dx) f(x+dx)-f(x) f(x+dx)-f(x)-J*dx f(x+dx)-f(x)-J*dx-1/2*H*(dx .*. dx) | ![]() | ![]() |
In the following example, we use a function which requires three extra-arguments
A
, p
and w
.
When the step h
is not provided, the numderivative
function tries to compute a step which provides a sufficient accuracy.
This step depends on the degree of the derivative (Jacobian or Hessian),
the order of the formula and the point x
.
More precisely, if x
is nonzero,
then the default step h
is a vector with the
same size as x
, scaled with the absolute value
of x
.
In the following example, we compute the derivative of the square root function. The following script plots a graph which represents the relative error of the numerical derivative depending on the step. We can see that there is an optimal step which minimizes the relative error.
// Its exact derivative function y=mydsqrt(x) y = 0.5 * x^(-0.5) endfunction x = 1.0; n = 1000; logharray = linspace (-16, 0, n); for i = 1:n h = 10^(logharray(i)); expected = mydsqrt(x); computed = numderivative (sqrt, x, h); relerr = abs(computed - expected) / abs(expected); logearray (i) = log10 (relerr); end scf(); plot (logharray, logearray); xtitle("Relative error of numderivative (x = 1.0)", .. "log(h)", "log(RE)"); | ![]() | ![]() |
The strategy in numderivative
provides a sufficient accuracy in many
cases, but can fail to be accurate in some cases.
In fact, the optimal step also depends on the function value f(x)
and is second derivative, both of which are unknown at the time where the
default step is computed.
See "Practical optimization", by Gill, Murray and Wright, Academic Press, 1981,
for more details on this topic.
The relevant sections are "4.6.1. Finite-difference Approximations to First
Derivatives" and "8.6 Computing finite differences".
Version | Description |
5.5.0 | Function numderivative introduced. |