FADBAD++


Flexible Automatic differentiation using templates and operator overloading in ANSI C++


Authors:

FADBAD++ was developed by Ole Stauning and Claus Bendtsen. Any comments, questions, flames etc. should be sent to the authors <fadbad@uning.dk>. We are also very interested in hearing if you have used FADBAD++ in your own application(s).

Copyright:

FADBAD++ is copyright © 1996-2003 Claus Bendtsen and Ole Stauning. You are free to use FADBAD++ for non-commercial purposes. You should contact the authors if you wish to use FADBAD++ in a commercial product. See also the copyright message which is distributed with the source code.

General introduction to automatic differentiation and FADBAD++:

The importance of differentiation as a mathematical tool is obvious. One of the first things we learn in elementary school is how to differentiate expressions by hand using a few elementary formulas. Unfortunately, the use of derivatives in scientific computing has been quite limited due to the misconception that derivatives are hard to obtain. Many people still think that the only alternative to the symbolic way of obtaining derivatives is to use divided differences, in which the difficulty of finding an expression for the derivatives is avoided. But divided differences introduce truncation errors, and this usually has a negative effect on further computations - in fact it can lead to very inaccurate results.

The use of a symbolic differentiation package such as Maple or Mathematica can solve the problem of obtaining expressions for the derivatives. This method obviously avoids truncation errors, but these packages usually have problems handling large expressions, and the time/space usage for computing derivatives can be enormous. In the worst case it can even cause a program to crash. Furthermore, common subexpressions are usually not identified in the resulting expressions, and this leads to unnecessary computations during the evaluation of the derivatives.

Automatic differentiation is an alternative to the above methods. Here, derivatives are computed by applying the well-known chain rule for composite functions in a clever way. In automatic differentiation the evaluation of a function and its derivatives is carried out simultaneously, using the same code and common temporary values. If the code for the evaluation is optimised, then the computation of the derivatives is automatically optimised as well. The resulting differentiation is free from truncation errors, and if we calculate the derivatives using interval analysis we obtain enclosures of the true derivatives. Automatic differentiation is easy to implement in languages with operator overloading such as C++, Ada and PASCAL-XSC.

FADBAD++ contains templates for performing automatic differentiation on functions implemented in C++ code. If the source code of a program that implements a differentiable function is available, then FADBAD++ can be applied to obtain the derivatives of this function. To apply automatic differentiation to a program, the arithmetic type used in the program (normally double) is changed to an overloading type (normally F<double>, B<double> or T<double>). Since the overloading type behaves just like the usual arithmetic type, and the functionality therefore is preserved, the program that performs the function evaluation does not have to be changed in any other way. When calling the program, the user specifies which output variables returned from the program to differentiate, and which input variables to differentiate with respect to. Since a program using automatic differentiation can itself be differentiated using FADBAD++, it is possible to obtain higher order derivatives by applying multiple layers of automatic differentiation. Any combination of the automatic differentiation methods implemented in FADBAD++ is possible. This way derivatives can be obtained in several ways, making it possible to optimise with respect to the time and space used in the process.

Examples of applications:

Forward automatic differentiation on a function f : R^n -> R^n, evaluated in interval arithmetic using the BIAS/PROFIL package, to obtain function values and derivatives. Used with the interval Newton method to obtain guaranteed enclosures of all solutions to the non-linear equation f(x)=0 (a simplified, non-interval sketch of this idea is shown after these examples).

Forward-Backward automatic differentiation on a function f : R^n -> R, evaluated in interval arithmetic using the BIAS/PROFIL package, with the backward method to obtain the first order derivatives (the gradient) and the forward method to differentiate the first order derivatives to obtain the second order derivatives (the Hessian). Used with the interval Krawczyk method to perform global optimisation, obtaining a guaranteed enclosure of the global minimum.

Numerical integration of a function f : R x R^n -> R, using a three-point-two-derivative formula. The backward method has been used to differentiate the numerical integration program, obtaining the n partial derivatives of the integral with respect to the n parameters in the function.

Taylor expansion of the solution to an ordinary differential equation, used to solve initial value problems. The Forward method has been used to differentiate the initial value problem solver to obtain the solution of the variational problem.

Taylor expansion of the solution to an ordinary differential equation using interval arithmetic and using the Forward method to obtain derivatives of the Taylor coefficients with respect to the point of expansion, which are the values of the Taylor coefficients for the solution of the variational problem. Used in a method that solves initial value problems with guaranteed enclosures.
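As a simplified, non-interval sketch of the first application above, the following program uses the forward method in ordinary double arithmetic to drive a plain Newton iteration; the diff(), x() and d() calls are explained in the crash course below. This only illustrates the idea - the guaranteed enclosures require the interval Newton method and the BIAS/PROFIL interval type. The solved equation x*x-2=0 is just an example chosen here, and the mixed double/F<double> arithmetic is assumed to be provided by "fadiff.h".

#include <iostream>
#include <cmath>
#include "fadiff.h"
using namespace std;

// The equation to solve: f(x) = x*x - 2 = 0.
F<double> f(const F<double>& x)
{
	return x*x-2.0;
}

int main()
{
	double x=1.0;                    // Starting guess
	for(int it=0;it<20;it++)
	{
		F<double> xad(x);            // Current iterate as an AD value
		xad.diff(0,1);               // One independent variable (index 0 of 1)
		F<double> fad=f(xad);        // Evaluate function and derivative
		double fx=fad.x();           // f(x)
		double dfx=fad.d(0);         // f'(x)
		if(fabs(fx)<1e-12) break;    // Converged
		x-=fx/dfx;                   // Newton step
	}
	cout << "root: " << x << endl;   // Converges to sqrt(2)
	return 0;
}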

Crash course in using FADBAD++:

FADBAD++ works by overloading all elementary arithmetic operations to include calculation of derivatives. To enable these overloaded operations you have to change the arithmetic type (normally double) to the appropriate AD-type. The AD-types are defined by the three templates F<> for forward automatic differentiation, B<> for backward automatic differentiation and T<> for Taylor expansion.
Forward automatic differentiation
Let's see what this looks like when using forward automatic differentiation. Assume that we have a function, func, that we want to differentiate.
double func(const double& x, const double& y)
{
double z=sqrt(x);
return y*z+sin(z);
}
Do a search and replace on "double" with "F<double>". Our code now looks like:
F<double> func(const F<double>& x, const F<double>& y)
{
	F<double> z=sqrt(x);
	return y*z+sin(z);
}
We have to include the file "fadiff.h" to define the F<> template.

Our function is now prepared for computing derivatives. Before we call the function we have to specify the variables we want to differentiate with respect to.  After the call we obtain the function value and the derivatives. This can be done with the following code:

	F<double> x,y,f;        // Declare variables x,y,f
	x=1;                    // Initialize variable x
	x.diff(0,2);            // Differentiate with respect to x (index 0 of 2)
	y=2;                    // Initialize variable y
	y.diff(1,2);            // Differentiate with respect to y (index 1 of 2)
	f=func(x,y);            // Evaluate function and derivatives
	double fval=f.x();      // Value of function
	double dfdx=f.d(0);     // Value of df/dx (index 0 of 2)
	double dfdy=f.d(1);     // Value of df/dy (index 1 of 2)

	cout << "f(x,y)=" << fval << endl;
	cout << "df/dx(x,y)=" << dfdx << endl;
	cout << "df/dy(x,y)=" << dfdy << endl;
Notice that the method diff(i,n) is called on the independent variables before the function call, where n is the total number of independent variables that we want to differentiate with respect to and i=0...n-1 is the index of the variable. After the function call the actual function value is obtained from the dependent variable(s) with the x() method and the derivatives with the d(i) method, where i=0...n-1 is the index of the independent variable.
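To see all the pieces together, here is a minimal complete program assembled from the snippets above. It assumes the header layout of this release, where including "fadiff.h" makes F<> and the overloaded sqrt and sin available in the global namespace:

#include <iostream>
#include "fadiff.h"
using namespace std;

F<double> func(const F<double>& x, const F<double>& y)
{
	F<double> z=sqrt(x);
	return y*z+sin(z);
}

int main()
{
	F<double> x,y,f;        // Declare variables x,y,f
	x=1;                    // Initialize variable x
	x.diff(0,2);            // Differentiate with respect to x (index 0 of 2)
	y=2;                    // Initialize variable y
	y.diff(1,2);            // Differentiate with respect to y (index 1 of 2)
	f=func(x,y);            // Evaluate function and derivatives
	cout << "f(x,y)=" << f.x() << endl;
	cout << "df/dx(x,y)=" << f.d(0) << endl;
	cout << "df/dy(x,y)=" << f.d(1) << endl;
	return 0;
}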
Backward automatic differentiation
Now, let's see what this looks like when using backward automatic differentiation. Given the function func evaluated in doubles, we do a search and replace on "double" with "B<double>". Now the code looks like:
B<double> func(const B<double>& x, const B<double>& y)
{
	B<double> z=sqrt(x);
	return y*z+sin(z);
}
The file "badiff.h" have to be included for defining the B<> template.

The code is now prepared for computing derivatives. The function func will now "record" a directed acyclic graph (DAG) representing the function while computing the function value. After the function call we specify which variables we want to differentiate. When this has been done for all dependent variables, we can obtain the derivatives. This is done with the following code:
	B<double> x,y,f;        // Declare variables x,y,f
	x=1;                    // Initialize variable x
	y=2;                    // Initialize variable y
	f=func(x,y);            // Evaluate function and record DAG
	f.diff(0,1);            // Differentiate f (index 0 of 1)
	double fval=f.x();      // Value of function
	double dfdx=x.d(0);     // Value of df/dx (index 0 of 1)
	double dfdy=y.d(0);     // Value of df/dy (index 0 of 1)

	cout << "f(x,y)=" << fval << endl;
	cout << "df/dx(x,y)=" << dfdx << endl;
	cout << "df/dy(x,y)=" << dfdy << endl;
Notice that this time the method diff(i,m) is called on the dependent variable(s) after the function call, where m is the total number of dependent variables we want to differentiate and i=0...m-1 is the index of the variable. The function value is obtained by the x() method on the dependent variables, while the derivatives are obtained by calling d(i) on the independent variables, where i=0...m-1 is the index of the dependent variable.

The DAG is deallocated automatically when the derivatives are being computed backwards through the DAG.

It is very important that diff(i,m) is called on all dependent variables, i.e. all variables that depend on the variables we want to differentiate with respect to. If this is not done, the derivatives are not propagated correctly into the independent variables, and the values obtained by calling d(i) will be wrong. One way to make a variable independent is to assign a constant to it, e.g. f=0; in the above example. Another way to avoid temporary dependent variables is to let them go out of scope by using {...}, or to define a function that encapsulates the evaluation and only returns the dependent variables of interest. A small sketch of the scoping approach is shown below.
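In the following sketch (using func from above), g is a temporary dependent variable that we do not want to differentiate; it is confined to an inner block and has gone out of scope before the derivatives are read. The choice of g=x*y is only an example of an intermediate quantity:

	B<double> x,y,f;        // Declare variables x,y,f
	x=1;                    // Initialize variable x
	y=2;                    // Initialize variable y
	{
		B<double> g=x*y;    // Temporary dependent variable, not of interest
		f=func(x,y);        // The dependent variable we care about
		// Alternatively, g=0; would cut g's dependency on x and y
	}                       // g goes out of scope here
	f.diff(0,1);            // Differentiate f (index 0 of 1)
	double fval=f.x();      // Value of function
	double dfdx=x.d(0);     // Value of df/dx
	double dfdy=y.d(0);     // Value of df/dy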
Simple Taylor expansion
Let's assume we want to Taylor-expand the function func with respect to the variable x. We now do a search and replace on "double" with "T<double>" and get:
T<double> func(const T<double>& x, const T<double>& y)
{
	T<double> z=sqrt(x);
	return y*z+sin(z);
}
The template T<> is defined in the file "tadiff.h". The function will now "record" a directed acyclic graph (DAG) while computing the function value (which is the 0'th order Taylor coefficient). This DAG can then be used to find the Taylor coefficients. This is done in the following code:
	T<double> x,y,f;        // Declare variables x,y,f
	x=1;                    // Initialize variable x
	y=2;                    // Initialize variable y
	x[1]=1;                 // Taylor-expand wrt. x (dx/dx=1)
	f=func(x,y);            // Evaluate function and record DAG
	double fval=f[0];       // Value of function
	f.eval(10);             // Taylor-expand f to degree 10
	// f[0]...f[10] now contains the Taylor-coefficients.
	cout << "f(x,y)=" << fval << endl;
	for(int i=0;i<=10;i++)
	{
		double c=f[i];      // The i'th Taylor coefficient
		cout << "(1/" << i << "!)*(d^" << i << "f/dx^" << i << ")=" << c << endl;
	}
We manually initialise the first-order Taylor coefficient of x to the value 1 to denote the independent variable in which we want to expand the function. We can now expand the function by calling the eval(k) method on the dependent variable(s), where k is the degree to which we want to expand the function.

The DAG is not de-allocated automatically (unlike in the backward method) when the Taylor coefficients have been computed by the eval(k) method. The DAG will instead hold all Taylor coefficients of degree 0...k of all intermediate variables used in calculating the function. If we call eval(l), where l>k, the Taylor coefficients that have already been calculated will be reused and only the coefficients (k+1)...l will be calculated. This is useful when Taylor expanding the solution to an ordinary differential equation, which we will see later. The coefficients in the DAG can be deleted by calling the method reset() on the dependent variable(s). This is done in the following code:
	f.reset();              // Reset the values in the DAG
	x[0]=3;                 // New value for x
	y[0]=4;                 // New value for y
	y[1]=1;                 // Taylor-expand wrt. y (dy/dy=1)
	f.eval(10);             // Taylor-expand f to degree 10
	// f[0]...f[10] now contains the Taylor-coefficients.
	for(int i=0;i<=10;i++)
	{
		double c=f[i];      // The i'th Taylor coefficient
		cout << "(1/" << i << "!)*(d^" << i << "f/dy^" << i << ")=" << c << endl;
	}
After the coefficients in the DAG have been deleted, we can re-initialise the values of x and y (the 0'th order coefficients) and specify that we now want to expand with respect to the variable y.

Finally, the DAG is de-allocated when the independent variables are de-allocated.
Taylor expanding the solution of an ODE
The fact that the coefficients in the DAG are not de-allocated can be used for Taylor expanding the solution of an ODE x'=f(x). Assume that we want to expand the solution of x'=cos(x). We can encapsulate the DAG of the function in an object by defining a class:
class TODE
{
public:
	T<double> x;            // Independent variables
	T<double> xp;           // Dependent variables
	TODE(){xp=cos(x);}      // record DAG at construction
};
Objects of this class will have a DAG representing the ODE. The Taylor coefficients can then be computed one order at a time with the following code:
	TODE ode;               // Construct ODE:
	ode.x[0]=1;             // Set point of expansion:
	for(int i=0;i<10;i++)
	{
		ode.xp.eval(i);     // Evaluate i'th Taylor coefficient
		ode.x[i+1]=ode.xp[i]/double(i+1); // Use dx/dt=ode(x).
	}
	// ode.x[0]...ode.x[10] now contains the Taylor-coefficients
	// of the solution of the ODE.

	// Print out the Taylor coefficients for the solution
	// of the ODE:
	for(int i=0;i<=10;i++)
	{
		cout << "x[" << i << "]=" << ode.x[i] << endl;
	}
We see that the ODE links the i'th coefficient of the dependent variable to the (i+1)'th coefficient of the independent variable. This dependency is specified explicitly in the loop.
Combinations of automatic differentiation
One of the unique features of FADBAD++ is the ability to compute high order derivatives in a very flexible way by combining the methods of automatic differentiation. These combinations are produced by applying the templates to themselves. For example, the type B< F< double > > can be used in optimisation to compute first order derivatives using the backward method and second order derivatives using a backward-forward method. The combination T< F<double> > can be used to Taylor expand the solution of an ODE while computing derivatives of the coefficients with respect to the point of expansion; these derivatives are by definition the Taylor coefficients of the solution of the variational equation. All these possibilities are described in more detail in the documents that can be downloaded from this site and in the examples that come with the source code of FADBAD++.
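The following is a sketch of the B< F<double> > combination, with func from the crash course obtained by replacing the arithmetic type with B< F<double> > and both "badiff.h" and "fadiff.h" included. The way the x() and d() accessors nest below is an assumption based on the crash course: the first order derivatives come out of the backward sweep and the second order derivatives out of the inner forward type.

	B< F<double> > x,y,f;      // Declare variables x,y,f
	x=1;                       // Initialize variable x
	y=2;                       // Initialize variable y
	x.x().diff(0,2);           // Inner forward type: differentiate wrt. x
	y.x().diff(1,2);           // Inner forward type: differentiate wrt. y
	f=func(x,y);               // Evaluate function and record DAG
	f.diff(0,1);               // Backward sweep (index 0 of 1)
	double fval  =f.x().x();   // Value of function
	double dfdx  =x.d(0).x();  // Value of df/dx
	double dfdy  =y.d(0).x();  // Value of df/dy
	double dfdxdx=x.d(0).d(0); // Value of d2f/dxdx
	double dfdxdy=x.d(0).d(1); // Value of d2f/dxdy
	double dfdydx=y.d(0).d(0); // Value of d2f/dydx
	double dfdydy=y.d(0).d(1); // Value of d2f/dydy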
Avoiding search/replace by using functors
One way to avoid searching and replacing of types when applying automatic differentiation is to prepare the code for automatic differentiation. This preparation can be done by using functors:
class Func
{
public:
	template <class T>
	T operator()(const T& x, const T& y)
	{
		T z=sqrt(x);
		return y*z+sin(z);
	}
};
Objects of class Func can now be used to evaluate the function with different arithmetic types, which are determined at compile time. We can even create forward- and backward-differentiating functors that take a functor class as a template argument and differentiate it:
template <class C>
class FDiff
{
public:
	template <class T>
	T operator()(T& o_dfdx, T& o_dfdy, const T& i_x, const T& i_y)
	{
		F<T> x(i_x),y(i_y); // Initialize arguments
		x.diff(0,2);        // Differentiate wrt. x
		y.diff(1,2);        // Differentiate wrt. y
		C func;             // Instantiate functor
		F<T> f(func(x,y));  // Evaluate function and derivatives
		o_dfdx=f.d(0);      // Value of df/dx
		o_dfdy=f.d(1);      // Value of df/dy
		return f.x();       // Return function value
	}
};

template <class C>
class BDiff
{
public:
	template <class T>
	T operator()(T& o_dfdx, T& o_dfdy, const T& i_x, const T& i_y)
	{
		B<T> x(i_x),y(i_y); // Initialize arguments
		C func;             // Instantiate functor
		B<T> f(func(x,y));  // Evaluate function and record DAG
		f.diff(0,1);        // Differentiate
		o_dfdx=x.d(0);      // Value of df/dx
		o_dfdy=y.d(0);      // Value of df/dy
		return f.x();       // Return function value
	}
};
The differentiating functors are used in the following code:
	double x,y,f,dfdx,dfdy; // Declare variables
	x=1;                    // Initialize variable x
	y=2;                    // Initialize variable y
	FDiff<Func> FFunc;      // Functor for function and derivatives
	f=FFunc(dfdx,dfdy,x,y); // Evaluate function and derivatives

	cout << "f(x,y)=" << f << endl;
	cout << "df/dx(x,y)=" << dfdx << endl;
	cout << "df/dy(x,y)=" << dfdy << endl;

	BDiff<Func> BFunc;      // Functor for function and derivatives
	f=BFunc(dfdx,dfdy,x,y); // Evaluate function and derivatives

	cout << "f(x,y)=" << f << endl;
	cout << "df/dx(x,y)=" << dfdx << endl;
	cout << "df/dy(x,y)=" << dfdy << endl;
It is possible to make completely generic differentiating functors that differentiate any function f : R^n -> R^m by using STL vectors as input and output variables. See the test code in FADBAD++. A sketch of such a functor for the forward method is shown below.
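The following is only a sketch of such a generic functor, not the actual test code. It assumes that the wrapped functor C maps a std::vector of inputs to a std::vector of outputs for any arithmetic type:

#include <vector>
#include "fadiff.h"

template <class C>
class FJacobian
{
public:
	template <class T>
	std::vector<T> operator()(std::vector< std::vector<T> >& o_jac,
	                          const std::vector<T>& i_x)
	{
		unsigned int n=i_x.size();
		std::vector< F<T> > x(n);
		for(unsigned int i=0;i<n;i++)
		{
			x[i]=i_x[i];               // Initialize argument
			x[i].diff(i,n);            // i'th independent variable of n
		}
		C func;                        // Instantiate functor
		std::vector< F<T> > f=func(x); // Evaluate on the forward AD type
		unsigned int m=f.size();
		std::vector<T> val(m);
		o_jac.assign(m,std::vector<T>(n));
		for(unsigned int j=0;j<m;j++)
		{
			val[j]=f[j].x();           // Function value of output j
			for(unsigned int i=0;i<n;i++)
				o_jac[j][i]=f[j].d(i); // df_j/dx_i
		}
		return val;
	}
};

An object of FJacobian<C> then returns the m function values and fills o_jac with the m x n Jacobian matrix.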
Using FADBAD++ with other types than double
It is possible to change the underlying arithmetic type used for all function and derivative calculations, also called the BaseType. Normally the underlying arithmetic type is double, but automatic differentiation is also very useful in the field of interval arithmetic, e.g. for obtaining narrow enclosures by using the mean value form, or in global optimisation.
The templates that are defined in FADBAD++ require that some standard operations are defined for the underlying type: comparison operators, the simple operations "+", "-", "*", "/" and simple functions such as "sin", "cos", "exp", etc.

One of the main ideas of FADBAD++ is the ability to change the underlying arithmetic type from double to e.g. intervals. This is done by defining BaseType to the name of the underlying type before including the FADBAD++ header files. If the functions for the BaseType are named in a non-standard way, e.g. "cos" is named "Cos", then these names have to be redefined before including the FADBAD++ header files. For example, to use FADBAD++ with the interval type provided by Profil/BIAS we have to include the FADBAD++ header files with the following code:
#define pow Power
#define sqr Sqr
#define exp Exp
#define log Log
#define sqrt Sqrt
#define sin Sin
#define cos Cos
#define tan Tan
#define asin ArcSin
#define acos ArcCos
#define atan ArcTan

#define BaseType INTERVAL

#include "tadiff.h"
#include "fadiff.h"

#undef BaseType
#undef pow
#undef sqr
#undef exp
#undef log
#undef sqrt
#undef sin
#undef cos
#undef tan
#undef asin
#undef acos
#undef atan
It is possible to configure FADBAD++ even further. See the header file "fadbad.h" for details.

Download source code and documentation for FADBAD++ (and FADBAD-TADIFF):

Notice that FADBAD-TADIFF are the old packages, while FADBAD++ is the new package. The old packages require you to run a configure script in a Unix shell. FADBAD++ can be used right away, without any configuring. FADBAD++ has been tested with GCC 3.2 on Linux and with Visual C++ 6.0 and .NET on Windows.

Work related to FADBAD++ (FADBAD-TADIFF):

Other Automatic differentiation Sites:



This page was last modified by Ole Stauning, Feb. 3, 2005.