Example: Gradient approximation
We will approximate the derivative of the Rosenbrock function at the point (1, 0, 0) using the forward and backward difference methods, each with two different step sizes.
We will also approximate the central difference as the average of the forward and backward results.
Success is determined by whether the results from the different methods (forward, backward, central) are consistent, i.e. equal within some tolerance.
Function inputs and outputs are NumPy arrays of arbitrary positive dimension.
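To make the three quantities concrete, here is a minimal sketch in plain NumPy, independent of fiddy. The helper names `forward_difference` and `backward_difference` are hypothetical and only illustrate the textbook formulas; they are not fiddy's implementation.

```python
import numpy as np
from scipy.optimize import rosen

point = np.array([1.0, 0.0, 0.0])
direction = np.array([1.0, 0.0, 0.0])  # first standard basis vector ("x")


def forward_difference(function, point, direction, size):
    """(f(p + h*d) - f(p)) / h"""
    return (function(point + size * direction) - function(point)) / size


def backward_difference(function, point, direction, size):
    """(f(p) - f(p - h*d)) / h"""
    return (function(point) - function(point - size * direction)) / size


results = {}
for size in [1e-10, 1e-5]:
    forward = forward_difference(rosen, point, direction, size)
    backward = backward_difference(rosen, point, direction, size)
    # Averaging forward and backward gives the central difference.
    central = (forward + backward) / 2
    results[size] = (forward, backward, central)
    print(f"size={size:g}: forward={forward:.6f}, "
          f"backward={backward:.6f}, central={central:.6f}")
```

With the larger step, the forward and backward values straddle the true derivative and their average largely cancels the leading truncation error. With the smaller step, floating-point rounding tends to dominate instead.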
[1]:
import numpy as np
from scipy.optimize import rosen, rosen_der
from fiddy import MethodId, get_derivative
from fiddy.analysis import ApproximateCentral
from fiddy.success import Consistency
# Point at which to compute the derivative
point = np.array([1, 0, 0])
# Step sizes for finite differences
sizes = [1e-10, 1e-5]
derivative = get_derivative(
    function=rosen,
    point=point,
    sizes=sizes,
    method_ids=[MethodId.FORWARD, MethodId.BACKWARD],
    direction_ids=["x", "y", "z"],
    analysis_classes=[ApproximateCentral],
    success_checker=Consistency(rtol=1e-2, atol=1e-15),
)
print("Computed derivative:", derivative.value)
[ 400.00001657 -202.00002612 0. ]
The full (derivative.df_full) or the concise (derivative.df) dataframe can be used for debugging gradients.
The IDs correspond to the directions in which finite differences were computed. These directions can be any vector in the function’s parameter space. In this case, only direction IDs were supplied, not the direction vectors themselves, so the default directions were used: the standard basis vectors.
[2]:
derivative.df
[2]:
| direction (index) | direction | success | value | completed | computer_results | analysis_results |
|---|---|---|---|---|---|---|
| x | [1, 0, 0] | True | 400.000017 | True | method_id value metad... | method_id value met... |
| y | [0, 1, 0] | True | -202.000026 | True | method_id value metad... | method_id value met... |
| z | [0, 0, 1] | True | 0.000000 | True | method_id value metadata 0... | method_id value metadata... |
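The direction column above holds the standard basis vectors, but a finite difference can be taken along any vector. The following sketch uses plain NumPy rather than fiddy, with a hypothetical non-basis direction (1, 1, 0), and compares the result with the analytical gradient projected onto that direction. The direction is not normalized here; whether fiddy normalizes user-supplied directions is not covered by this example.

```python
import numpy as np
from scipy.optimize import rosen, rosen_der

point = np.array([1.0, 0.0, 0.0])
direction = np.array([1.0, 1.0, 0.0])  # hypothetical direction, not normalized
size = 1e-5

# One-sided differences along the chosen direction.
forward = (rosen(point + size * direction) - rosen(point)) / size
backward = (rosen(point) - rosen(point - size * direction)) / size
central = (forward + backward) / 2

# Analytical directional derivative: gradient dotted with the direction.
expected = rosen_der(point) @ direction

print(f"forward={forward:.6f}, backward={backward:.6f}, central={central:.6f}")
print(f"expected={expected:.6f}")
```

Since the gradient at this point is (400, -202, 0), the projection onto (1, 1, 0) is 198, which the central value should approximate closely.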
The *_results columns can be printed separately to view the specific derivative values that were computed.
These values differ from the values reported in derivative.value. This is because the success_checker (Consistency) reports each derivative value as the average of all consistent derivative values. Consistency is checked per step size, so if any of the values for 1e-05 were inconsistent with the rest, the values for that size would not contribute to the average reported by the Consistency success checker.
[3]:
derivative.df.loc["x", "computer_results"]
[3]:
| | method_id | value | metadata |
|---|---|---|---|
| 0 | MethodId.FORWARD | 400.000033 | {'size': 1e-10} |
| 1 | MethodId.BACKWARD | 400.000033 | {'size': 1e-10} |
| 2 | MethodId.FORWARD | 400.006010 | {'size': 1e-05} |
| 3 | MethodId.BACKWARD | 399.993990 | {'size': 1e-05} |
[4]:
derivative.df.loc["x", "analysis_results"]
[4]:
| | method_id | value | metadata |
|---|---|---|---|
| 0 | approximate_central | 400.000000 | {'size': 1e-05} |
| 1 | approximate_central | 400.000033 | {'size': 1e-10} |
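The per-size consistency check and averaging described earlier can be sketched with the x-direction values from the two tables above. This is an illustration only: it assumes that "consistent" means every value for a given size is close to the mean of the values for that size, which may differ from what fiddy's `Consistency` class does internally. The variable names are hypothetical.

```python
import numpy as np

# Values for direction "x", copied as displayed in the tables above.
# Order per size: forward, backward, approximate central.
values_by_size = {
    1e-10: [400.000033, 400.000033, 400.000033],
    1e-05: [400.006010, 399.993990, 400.000000],
}

# Same tolerances as passed to Consistency in the first cell.
rtol, atol = 1e-2, 1e-15

consistent_values = []
for size, values in values_by_size.items():
    values = np.array(values)
    # Assumption: a size is consistent if all its values are close to their mean.
    if np.isclose(values, values.mean(), rtol=rtol, atol=atol).all():
        consistent_values.extend(values)

# The reported value is the average over all values from consistent sizes.
reported_value = np.mean(consistent_values)
print(f"{reported_value:.7f}")
```

Both sizes pass the check here, so all six values enter the average, and the result lands close to the 400.000017 shown for direction x in derivative.df.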
In this case, the finite difference values are all consistent with each other, and we now compare them with the expected derivative.
[5]:
expected_derivative = rosen_der(point)
print(f"{expected_derivative=}")
[ 400 -202 0]
[6]:
np.isclose(derivative.value, expected_derivative).all()
[6]:
True
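To see how much margin the final comparison has, the check can be spelled out by hand. The sketch below copies the printed derivative values from above and applies the default tolerances of `np.isclose` (`rtol=1e-05`, `atol=1e-08`), using its documented criterion `|a - b| <= atol + rtol * |b|`.

```python
import numpy as np

# Values copied from the printed outputs above.
computed = np.array([400.00001657, -202.00002612, 0.0])
expected = np.array([400.0, -202.0, 0.0])

# np.isclose defaults.
rtol, atol = 1e-5, 1e-8

absolute_error = np.abs(computed - expected)
allowed_error = atol + rtol * np.abs(expected)
within_tolerance = absolute_error <= allowed_error

for error, allowed, ok in zip(absolute_error, allowed_error, within_tolerance):
    print(f"error={error:.3e}, allowed={allowed:.3e}, ok={ok}")
```

The absolute errors are on the order of 1e-5, well inside the allowed error for the two non-zero components, so the check passes with the default tolerances.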