Example: Gradient approximation

We will approximate the gradient of the Rosenbrock function at the point (1, 0, 0) using the forward and backward difference methods, each with two different step sizes.

We will also compute an approximation of the central difference, as the average of the forward and backward results.
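As a minimal sketch of the arithmetic behind these methods, the following uses plain NumPy rather than fiddy. The `rosenbrock` and `finite_difference` helpers are hypothetical names introduced only for this illustration. It computes the forward and backward differences along the first coordinate with a step size of 1e-5, then averages them.

```python
import numpy as np


def rosenbrock(x):
    """Rosenbrock function, written out so the sketch needs only NumPy."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)


def finite_difference(function, point, direction, size, forward=True):
    """One-sided difference quotient along `direction` (hypothetical helper)."""
    point = np.asarray(point, dtype=float)
    step = size * np.asarray(direction, dtype=float)
    if forward:
        return (function(point + step) - function(point)) / size
    return (function(point) - function(point - step)) / size


point = np.array([1.0, 0.0, 0.0])
direction_x = np.array([1.0, 0.0, 0.0])
size = 1e-5

forward = finite_difference(rosenbrock, point, direction_x, size, forward=True)
backward = finite_difference(rosenbrock, point, direction_x, size, forward=False)
# Averaging the two one-sided results gives the central difference.
central = 0.5 * (forward + backward)
print(forward, backward, central)
```

The one-sided results deviate from the exact value of 400 in opposite directions, so their average is considerably more accurate than either of them.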

Success will be determined by whether results between the different methods (forward, backward, central) are consistent (i.e., equal, within some tolerance).
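The idea of consistency within a tolerance can be sketched with `np.isclose`. This is an illustration only, not fiddy's `Consistency` implementation: the `is_consistent` helper is a hypothetical name, and comparing each value against the mean of all values is an assumption made for this sketch.

```python
import numpy as np


def is_consistent(values, rtol=1e-2, atol=1e-15):
    """Return True if every value is close to the mean of all values.

    Illustration only; not fiddy's `Consistency` implementation.
    """
    values = np.asarray(values, dtype=float)
    return bool(np.isclose(values, values.mean(), rtol=rtol, atol=atol).all())


# Values that agree to well within 1% of their mean.
agreeing = is_consistent([400.006, 399.994, 400.0])
# Values that deviate from their mean by about 5%.
disagreeing = is_consistent([400.0, 380.0, 420.0])
print(agreeing, disagreeing)
```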

Function inputs and outputs are NumPy arrays of arbitrary positive dimension.

[1]:
import numpy as np
from scipy.optimize import rosen, rosen_der

from fiddy import MethodId, get_derivative
from fiddy.analysis import ApproximateCentral
from fiddy.success import Consistency

# Point at which to compute the derivative
point = np.array([1, 0, 0])
# Step sizes for finite differences
sizes = [1e-10, 1e-5]

derivative = get_derivative(
    function=rosen,
    point=point,
    sizes=sizes,
    method_ids=[MethodId.FORWARD, MethodId.BACKWARD],
    direction_ids=["x", "y", "z"],
    analysis_classes=[ApproximateCentral],
    success_checker=Consistency(rtol=1e-2, atol=1e-15),
)
print("Computed derivative:", derivative.value)
Computed derivative: [ 400.00001657 -202.00002612    0.        ]

The full (derivative.df_full) or the concise (derivative.df) dataframe can be used for debugging gradients.

The IDs correspond to the directions in which finite differences were computed. These directions can be any vectors in the function’s parameter space. In this case, only direction IDs were given and the directions themselves were not specified, so the default directions were used: the standard basis vectors.
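To illustrate what a finite difference along an arbitrary direction means, here is a minimal NumPy sketch. It does not use fiddy's API; `rosenbrock` and `central_difference` are hypothetical helpers, and the diagonal direction is chosen purely as an example.

```python
import numpy as np


def rosenbrock(x):
    """Rosenbrock function, written out so the sketch needs only NumPy."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)


def central_difference(function, point, direction, size=1e-5):
    """Central difference quotient along `direction` (hypothetical helper)."""
    step = size * np.asarray(direction, dtype=float)
    return (function(point + step) - function(point - step)) / (2 * size)


point = np.array([1.0, 0.0, 0.0])

# Standard basis: each directional derivative is one gradient component.
standard_basis = np.eye(3)
gradient = np.array(
    [central_difference(rosenbrock, point, e) for e in standard_basis]
)

# Any other vector works too, e.g. a diagonal direction in the x-y plane.
diagonal = np.array([1.0, 1.0, 0.0])
directional = central_difference(rosenbrock, point, diagonal)

# For a differentiable function, the directional derivative equals the
# gradient projected onto the direction.
print(directional, gradient @ diagonal)
```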

[2]:
derivative.df
[2]:
           direction  success  value        completed  computer_results               analysis_results
direction
x          [1, 0, 0]  True     400.000017   True       method_id value metad...       method_id value met...
y          [0, 1, 0]  True     -202.000026  True       method_id value metad...       method_id value met...
z          [0, 0, 1]  True     0.000000     True       method_id value metadata 0...  method_id value metadata...

The *_results columns can be printed separately to view the specific derivative values that were computed.

These values differ from the values reported in derivative.value. This is because the success_checker (Consistency) reports each derivative value as the average of all consistent derivative values. Consistency is checked per step size, so if any of the values for 1e-05 were inconsistent with the rest, they would not contribute to the average reported by the Consistency success checker.

[3]:
derivative.df.loc["x", "computer_results"]
[3]:
   method_id          value       metadata
0  MethodId.FORWARD   400.000033  {'size': 1e-10}
1  MethodId.BACKWARD  400.000033  {'size': 1e-10}
2  MethodId.FORWARD   400.006010  {'size': 1e-05}
3  MethodId.BACKWARD  399.993990  {'size': 1e-05}
[4]:
derivative.df.loc["x", "analysis_results"]
[4]:
   method_id            value       metadata
0  approximate_central  400.000000  {'size': 1e-05}
1  approximate_central  400.000033  {'size': 1e-10}
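The averaging can be checked by hand from the `x` results printed above. The sketch below copies the rounded values from those tables, so it reproduces the reported value only to about six decimals. The mean is computed both for the four finite difference values alone and for all six values, because this sketch does not establish which set fiddy averages internally. Both give the same result here.

```python
import numpy as np

# Values copied from the printed tables above (rounded to six decimals).
computer_values = np.array([400.000033, 400.000033, 400.006010, 399.993990])
analysis_values = np.array([400.000000, 400.000033])

# Mean of the forward and backward results only.
mean_computer = computer_values.mean()
# Mean of the forward, backward, and approximate central results.
mean_all = np.concatenate([computer_values, analysis_values]).mean()
print(f"{mean_computer:.7f} {mean_all:.7f}")
```

Both means agree with the `x` entry of derivative.value (400.00001657) up to the rounding of the printed inputs.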

In this case, the finite difference values are all consistent with each other, and we now compare them with the expected derivative.

[5]:
expected_derivative = rosen_der(point)
print(f"{expected_derivative=}")
expected_derivative=array([ 400, -202,    0])
[6]:
np.isclose(derivative.value, expected_derivative).all()
[6]:
True
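The comparison above relies on the default tolerances of `np.isclose` (`rtol=1e-05`, `atol=1e-08`). As a small sketch of how much the outcome depends on them, the following repeats the check with the values printed earlier in this example, once with the defaults and once with a much stricter relative tolerance. The stricter tolerance is an arbitrary choice for illustration.

```python
import numpy as np

# Values printed earlier in this example.
computed = np.array([400.00001657, -202.00002612, 0.0])
expected = np.array([400.0, -202.0, 0.0])

# np.isclose tests |computed - expected| <= atol + rtol * |expected|,
# with defaults rtol=1e-05 and atol=1e-08.
default_check = np.isclose(computed, expected).all()
# With rtol=1e-8 the allowed deviation for the first entry is about 4e-6,
# which is smaller than the actual deviation of about 1.7e-5.
strict_check = np.isclose(computed, expected, rtol=1e-8, atol=1e-8).all()
print(default_check, strict_check)
```

The finite difference result passes at the default tolerance but not at the stricter one, so the tolerance should be chosen with the expected finite difference error in mind.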