Least Squares Data Fitting with Applications

Per Christian Hansen, Victor Pereyra, and Godela Scherer

Johns Hopkins University Press, 2012

This site contains material related to the above book.

Corrections to first printing

Here we provide a short list of corrections to the book (updated Dec. 7, 2012).

Overheads for PCH's lectures

These overheads are used in the DTU course 02610, Optimization and Data Fitting.

Foreword

The following foreword (which we were not able to include in the book) was written by Professor Stephen J. Wright, University of Wisconsin. We are grateful to Prof. Wright for his kind words.

Scientific computing is founded in models that capture the properties of systems under investigation, be they engineering systems; systems in the natural sciences; financial, economic, and social systems; or conceptual systems such as those that arise in machine learning or speech processing. For models to be useful, they must be calibrated against "real-world" systems and informed by data. The recent explosion in the availability of data opens up unprecedented opportunities to increase the fidelity, resolution, and power of models - but only if we have access to algorithms for incorporating this data into models, effectively and efficiently.

For this reason, least squares - the first and best-known technique for fitting models to data - remains central to scientific computing. The problem class continues to be a fascinating topic of study from a variety of perspectives. Least-squares formulations can be derived from statistical principles, as maximum-likelihood estimates for models in which the model-data discrepancies are assumed to arise from Gaussian white noise. In scientific computing, they provide the vital link between model and data, the final ingredient that brings the other elements of a model together. In their linear variants, least-squares problems were foundational in numerical linear algebra as that field grew rapidly in the 1960s and 1970s. From the perspective of optimization, nonlinear least squares has an appealing structure that can be exploited with great effectiveness in algorithm design.
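
As a concrete illustration of the statistical view above, here is a minimal sketch in Python/NumPy - not taken from the book, with invented data and variable names - of fitting a straight line by linear least squares. Under independent Gaussian noise of constant variance, the coefficients returned by np.linalg.lstsq minimize the sum of squared residuals and therefore coincide with the maximum-likelihood estimate.

```python
import numpy as np

# Illustrative sketch (not from the book): fit y = c0 + c1*t to noisy data.
# With i.i.d. Gaussian noise, minimizing the sum of squared residuals
# yields the maximum-likelihood estimate of (c0, c1).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * t + 0.1 * rng.standard_normal(t.size)  # synthetic observations

A = np.column_stack([np.ones_like(t), t])  # design matrix for c0 + c1*t
c, residual, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print("fitted coefficients:", c)  # close to the true values [2, 3]
```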

Least squares is foundational in another respect: it can be extended in a variety of ways - to alternative loss functions that are more robust to outliers in the observations, to one-sided "hinge loss" functions, to regularized models that impose structure on the model parameters in addition to fitting the data, and to "total least squares" models in which errors appear in the model coefficients as well as in the observations.
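
To make one of these extensions concrete, the following sketch - again illustrative only, with invented data and names - solves a Tikhonov (ridge) regularized problem min ||Ax - b||^2 + lam*||x||^2 by recasting it as an ordinary least-squares problem on an augmented system; the normal equations of the augmented problem are (A^T A + lam I) x = A^T b.

```python
import numpy as np

def ridge_lstsq(A, b, lam):
    """Solve min ||A x - b||^2 + lam * ||x||^2 by ordinary least squares
    on the augmented system [A; sqrt(lam) I] x ~ [b; 0]."""
    n = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
    b_aug = np.concatenate([b, np.zeros(n)])
    x, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 5))
x_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
b = A @ x_true + 0.05 * rng.standard_normal(30)

print(ridge_lstsq(A, b, lam=0.0))  # plain least squares
print(ridge_lstsq(A, b, lam=1.0))  # coefficients shrunk toward zero
```

Setting lam = 0 recovers ordinary least squares, so one solver covers both cases.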

This book surveys least-squares problems from all of these perspectives. It is both a comprehensive introduction to the subject and a valuable resource for those already well versed in the area. It covers statistical motivations along with thorough treatments of direct and iterative methods for linear least squares and of optimization methods for nonlinear least squares. The later chapters contain compelling case studies of both linear and nonlinear models, with discussions of model validation as well as model construction and interpretation. The book conveys both the rich history of the subject and its ongoing importance, and it reflects the many contributions that the authors have made to all aspects of the field.