Authors:

The importance of differentiation as a mathematical tool is obvious. One of the first things we learn in elementary school is how to manually differentiate expressions using a few elementary formulas. Unfortunately the use of derivatives in scientific computing has been quite limited due to the misunderstanding that derivatives are hard to obtain. Many people still think that the only alternative to the symbolic way of obtaining derivatives is to use divided differences in which the difficulties in finding an expression for the derivatives are avoided. But by using divided differences, truncation errors are introduced and this usually has a negative effect on further computations - in fact it can lead to very inaccurate results.

The use of a symbolic differentiation package such as Maple or Mathematica can solve the problem of obtaining expressions for the derivatives. This method obviously avoids truncation errors but usually these packages have problems in handling large expressions and the time/space usage for computing derivatives can be enormous. In worst case it can even cause a program to crash. Furthermore, common sub expressions are usually not identified in the expressions and this leads to unnecessary computations during the evaluation of the derivatives.

Automatic differentiation is an alternative to the above methods. Here derivatives are computed by using the very well known chain rule for composite functions, in a clever way. In automatic differentiation the evaluation of a function and its derivatives are calculated simultaneously, using the same code and common temporary values. If the code for the evaluation is optimised, then the computation of the derivatives will automatically be optimised. The resulting differentiation is free from truncation errors, and if we calculate the derivatives using interval analysis we will obtain enclosures of the true derivatives. Automatic differentiation is easy to implement in languages with operator overloading such as C++, Ada and PASCAL-XSC.

Examples of applications:

Forward automatic differentiation on a function f : Rn-> Rn, evaluated in interval arithmetic, using the BIAS/PROFIL package, to obtain function values and derivatives. Used with the interval Newton method to obtain guaranteed enclosures of all solutions to the non-linear equation f(x)=0.

Forward-Backward automatic differentiation on a function f : Rn->R, evaluated in interval arithmetic, using the BIAS/PROFIL package and the backward method to obtain first order derivatives (the gradient) and the forward method to differentiate the first order derivatives to obtain the second order derivatives (the Hessian). Used with the interval Krawczyk method to perform global optimisation obtaining a guaranteed enclosure of the global minimum.

Numerical integration of a function f : RxRn->R, using a three-point-two-derivative formula. The Backward method has been used to differentiate the Numerical integration program obtaining the n partial derivatives of the integral with respect to the n parameters in the function.

Taylor expansion of the solution to an ordinary differential equation, used to solve initial value problems. The Forward method has been used to differentiate the initial value problem solver to obtain the solution of the variational problem.

Taylor expansion of the solution to an ordinary differential equation using interval arithmetic and using the Forward method to obtain derivatives of the Taylor coefficients with respect to the point of expansion, which are the values of the Taylor coefficients for the solution of the variational problem. Used in a method that solves initial value problems with guaranteed enclosures.

FADBAD++ works by overloading all elementary arithmetic operations to include calculation of derivatives. To enable these overloaded operations you have to change the arithmetic type (normally double) to the appropriate AD-type. The AD-types are defined by the three templates F<> for forward automatic differentiation, B<> for backward automatic differentiation and T<> for Taylor expansion.
Forward automatic differentiation
Lets see what that looks like when using forward automatic differentiation. Assume that we have a function, func, that we want to differentiate.
`double func(const double& x, const double& y){	double z=sqrt(x);	return y*z+sin(z);}`
Do a search and replace on "double" with "F<double>". Our code now looks like:
`F<double> func(const F<double>& x, const F<double>& y){	F<double> z=sqrt(x);	return y*z+sin(z);}`
We have to include the file "fadiff.h" to define the F<> template.

Our function is now prepared for computing derivatives. Before we call the function we have to specify the variables we want to differentiate with respect to.  After the call we obtain the function value and the derivatives. This can be done with the following code:

`	F<double> x,y,f;     // Declare variables x,y,f	x=1;                 // Initialize variable x	x.diff(0,2);         // Differentiate with respect to x (index 0 of 2)	y=2;                 // Initialize variable y	y.diff(1,2);         // Differentiate with respect to y (index 1 of 2)	f=func(x,y);         // Evaluate function and derivatives	double fval=f.x();   // Value of function	double dfdx=f.d(0);  // Value of df/dx (index 0 of 2)	double dfdy=f.d(1);  // Value of df/dy (index 1 of 2)	cout << "f(x,y)=" << fval << endl;	cout << "df/dx(x,y)=" << dfdx << endl;	cout << "df/dy(x,y)=" << dfdy << endl;`
Notice that the method diff(i,n) is called on the independent variables before the function call. Where n is the total number of independent variables that we want to differentiate with respect to and i denotes the index of the variable i=0...n-1. After the function call the actual function value is obtained from the dependent variable(s) with the x() method and the derivatives by the d(i) method, where i=0...n-1 is the index of the independent variable.
Backward automatic differentiation
Now, lets wee what this looks like when using backward automatic differentiation. Given the function func evaluated in doubles. We do a search and replace on "double" with "B<double>". Now the code looks like:
`B<double> func(const B<double>& x, const B<double>& y){	B<double> z=sqrt(x);	return y*z+sin(z);}`
The file "badiff.h" have to be included for defining the B<> template.

The code is now prepared for computing derivatives. Actually the function diff will now "record" a directed acyclic graph (DAG) representing the function, while computing the function value. After the function call we specify what variable we want to differentiate. When this has been done on all dependent variables we can obtain the derivatives. This is done with the following code:
`	B<double> x,y,f;    // Declare variables x,y,f	x=1;                // Initialize variable x	y=2;                // Initialize variable y	f=func(x,y);        // Evaluate function and record DAG	f.diff(0,1);        // Differentiate f (index 0 of 1)	double fval=f.x();  // Value of function	double dfdx=x.d(0); // Value of df/dx (index 0 of 1)	double dfdy=y.d(0); // Value of df/dy (index 0 of 1)	cout << "f(x,y)=" << fval << endl;	cout << "df/dx(x,y)=" << dfdx << endl;	cout << "df/dy(x,y)=" << dfdy << endl;`
Notice that this time the method diff(i,m) is called on the dependent variable(s) after the function-call. Where m is the total number of dependent variables we want to differentiate and i denotes the index of the variable i=0..m-1. The function value is obtained by the x() method on the dependent variables, while the derivaties are obtained by calling the d(i) on the independent variables, where i=0...m-1 is the index of the dependent variable.

The DAG is deallocated automatically when the derivatives are being computed backwards through the DAG.

It is very important that diff(i,m) is called on all dependent variables, i.e. variables that are dependent on the variables we want to differentiate with respect to. If this is not done then the derivatives are not propagated correctly into the independent variables, causing the derivatives to be wrong when obtaining them by calling d(i). A way to make a variable independent is to assign a constant to it. E.g. f=0; in the above example. Another way to avoid temporary dependent variables is to let them go out of scope by using {...} or by defining a functions that encapsulates the evaluation and only returns the dependent variables of interest.
Simple Taylor expansion
Lets asume we want to Taylor expand the function func with respect to the variable x. We now do a search and replace on "double" with "T<double>" and get:
`T<double> func(const T<double>& x, const T<double>& y){	T<double> z=sqrt(x);	return y*z+sin(z);}`
Where the template T<> is defined in the file "tadiff.h". The function will now "record" a directed acyclic graph (DAG) while computing the function value (which is the 0'th order Taylor-coefficient). This DAG can then be used to find the Taylor coefficients. This is done in the following code:
`	T<double> x,y,f;     // Declare variables x,y,f	x=1;                 // Initialize variable x	y=2;                 // Initialize variable y	x[1]=1;              // Taylor-expand wrt. x (dx/dx=1)	f=func(x,y);         // Evaluate function and record DAG	double fval=f[0];    // Value of function	f.eval(10);          // Taylor-expand f to degree 10	// f[0]...f[10] now contains the Taylor-coefficients.	cout << "f(x,y)=" << fval << endl;	for(int i=0;i<=10;i++)	{		double c=f[i];// The i'th taylor coefficient		cout << "(1/k!)*(d^"<<i<<"f/dx^" << i << ")=" << c << endl;	}`
We manually initialise the 1'st order Taylor coefficient of x to the value 1 to denote the independent variable in which we want to expand the function. We can now expand the function by calling the eval(k) method on the dependent variable(s), where k is the order in which we want to expand the function.

The DAG is not de-allocated automatically (like in the backward method) when the Taylor coefficients have been computed by the eval(k) method. The DAG will instead hold all Taylor coefficients of degree 0...k of all intermediate variables for calculating the function. If we call eval(l), where l>k, the Taylor coefficients that have already been calculated will be reused and only the coefficients (k+1)..l will be calculated. This is useful when Taylor expanding the solution to an ordinary differential equation, which we will see later. The coefficients in the DAG can be deleted by calling the method reset() on the dependent variable(s). This is done in the following code:
`	f.reset();           // Reset the values in the DAG	x[0]=3;              // New value for x	y[0]=4;              // New value for y	y[1]=1;              // Taylor-expand wrt. y (dy/dy=1)	f.eval(10);          // Taylor-expand f to degree 10	// f[0]...f[10] now contains the Taylor-coefficients.	for(int i=0;i<=10;i++)	{		double c=f[i];// The i'th taylor coefficient		cout << "(1/k!)*(d^"<<i<<"f/dy^" << i << ")=" << c << endl;	}`
After the coefficients in the DAG has been deleted we can re-initialise the values of x and y (0'th order coefficients) and now specify that we want to expand with respect to the variable y.

Finally the DAG will be de-allocated when the independent variables is de-allocated.
Taylor expanding the solution of an ODE
The feature that the coefficients in the DAG is not de-allocated can be used for Taylor expanding the solution of an ODE: x'=f(x). Assume that we want to expand the solution of x'=cos(x). We can encapsulate the DAG of the function in an object by defining a class:
`class TODE{public:	T<double> x;            // Independent variables	T<double> xp;           // Dependent variables	TODE(){xp=cos(x);}      // record DAG at construction};`
Objects of this class will have a DAG representing the ODE. The Taylor coefficients can then be computed one order at a time with the following code:
`	TODE ode;                // Construct ODE:	ode.x[0]=1;              // Set point of expansion:	for(int i=0;i<10;i++)	{		ode.xp.eval(i); // Evaluate i'th Taylor coefficient		ode.x[i+1]=ode.xp[i]/double(i+1);// Use dx/dt=ode(x).	}	// ode.x[0]...ode.x[10] now contains the Taylor-coefficients	// of the solution of the ODE.	// Print out the Taylor coefficients for the solution	// of the ODE:	for(int i=0;i<=10;i++)	{		cout << "x[" << i << "]=" << ode.x[i] << endl;	}`
We see that the ODE links the i'th coefficient of the dependent variable to the (i+1)'th coefficient of the independent variable. This dependency is specified explicitly in the loop.
Combinations of automatic differentiation
One of the very unique things of FADBAD++ is the ability to compute high order derivatives in a very flexible way by combining the methods of automatic differentiation. These combinations are produced by applying the templates on themselves. For example the type B< F< double > > can be used in optimisation for computing first order derivatives by using the backward method and second order derivatives by using a backward-forward method. The combination T< F<double> > can be used to Taylor expand the solution of an ODE, while computing derivatives of the coefficients with respect to the point of expansion. The derivatives are then by definition the Taylor coefficients to the solution of the variational equation. All these possibilities are described in more detail in the documents which can be downloaded from this site and in the examples that comes with the source code of FADBAD++.
Avoiding search/replace by using functors
One way to avoid searching and replacing of types when applying automatic differentiation is to prepare the code for automatic differentiation. This preparation can be done by using functors:
`class Func{	public:	template <class T>	T operator()(const T& x, const T& y)	{		T z=sqrt(x);		return y*z+sin(z);	}};`
Objects of class Func can now be used to evaluate the function with different arithmetic types, which are determined at compile-time. We can even create forward- and backward differentiating functors that takes a functor-class as template arguments and differentiates it:
`template <class C>class FDiff{	public:	template <class T>	T operator()(		T& o_dfdx, T& o_dfdy,		const T& i_x, const T& i_y)	{		F<T> x(i_x),y(i_y); // Initialize arguments		x.diff(0,2);        // Differentiate wrt. x		y.diff(1,2);        // Differentiate wrt. y		C func;             // Instantiate functor		F<T> f(func(x,y));  // Evaluate function and record DAG		o_dfdx=f.d(0);      // Value of df/dx		o_dfdy=f.d(1);      // Value of df/dy		return f.x();       // Return function value	}};template <class C>class BDiff{	public:	template <class T>	T operator()(		T& o_dfdx, T& o_dfdy,		const T& i_x, const T& i_y)	{		B<T> x(i_x),y(i_y); // Initialize arguments		C func;             // Instantiate functor		B<T> f(func(x,y));  // Evaluate function and record DAG		f.diff(0,1);        // Differentiate		o_dfdx=x.d(0);      // Value of df/dx		o_dfdy=y.d(0);      // Value of df/dy		return f.x();       // Return function value	}};`
The differentiating functors are used in the following code:
`	double x,y,f,dfdx,dfdy;    // Declare variables	x=1;                       // Initialize variable x	y=2;                       // Initialize variable y	FDiff<Func> FFunc;         // Functor for function and derivatives	f=FFunc(dfdx,dfdy,x,y);    // Evaluate function and derivatives	cout << "f(x,y)=" << f << endl;	cout << "df/dx(x,y)=" << dfdx << endl;	cout << "df/dy(x,y)=" << dfdy << endl;	BDiff<Func> BFunc;         // Functor for function and derivatives	f=BFunc(dfdx,dfdy,x,y);    // Evaluate function and derivatives	cout << "f(x,y)=" << f << endl;	cout << "df/dx(x,y)=" << dfdx << endl;	cout << "df/dy(x,y)=" << dfdy << endl;`
It is possible to make completely generic differentiating functors that differentiates any function f:Rn->Rm, by using STL vectors as input and output variables. See the test-code in FADBAD++.
It is possible to change the underlying arithmetic type for all function and derivative calculations - also called the BaseType. Normally the underlying arithmetic type is doubles, but automatic differentiation is also very useful in the field of interval arithmetic. E.g. to obtain narrow enclosures by using the mean value form or in global optimization.
The templates that are defined in FADBAD++ requre that some standard operations are defined for the underlying type: comparison operators, simple operations "+", "-" "*", "/" and simple functions "sin", "cos", "exp", etc.

`#define pow Power#define sqr Sqr#define exp Exp#define log Log#define sqrt Sqrt#define sin Sin#define cos Cos#define tan Tan#define asin ArcSin#define acos ArcCos#define atan ArcTan#define BaseType INTERVAL#include "tadiff.h"#include "fadiff.h"#undef BaseType#undef pow#undef exp#undef log#undef sqrt#undef sin#undef cos#undef tan#undef asin#undef acos#undef atan`

Notice that the packages FADBAD-TADIFF are old packages, while FADBAD++ is the new package. The old packages require you to run a configure script in a unix shell. FADBAD++ can be used right away - without any configuring. FADBAD++ has been tested with GCC 3.2 on Linux, Visual C++ 6.0 and .NET on Windows.
• Get the sourcecode for FADBAD (as zip) and the technical report (as pdf), "FADBAD, a Flexible C++ Package for Automatic Differentiation, using the forward and backward methods".
• Get the sourcecode for TADIFF (as zip) and the technical report (as pdf), "TADIFF, a Flexible C++ Package for Automatic Differentiation, using Taylor series expansion".

• Ole Stauning. "Flexible Automatic Differentiation using Templates and Operator Overloading in C++". Talk presented at the Automatic Differentiation Workshop at Shrivenham Campus, Cranfield University, June 6, 2003. Download PDF.
• N. S. Nedialkov and K. R. Jackson. "The Design and Implementation of an Object-Oriented Validated ODE Solver".
• N. S. Nedialkov. VNODE, A Validated Solver for Initial Value Problems for Ordinary Differential Equations.
• Ole Stauning, "ADIODES: Automatic Differentiation Interval Ordinary Differential Equation Solver". The C++ program package for enclosing solutions of ordinary differential equations which is descibed in the above Ph.D.-thesis. Download sourcecode (as zip).
• Ole Stauning, "ADIODES, a Self-Validating ODE Solver". Slides for a talk presented at the SIAM Annual Meeting, University of Toronto, July 15, 1998. Download PS, PDF.