**Multivariate Statistics and Future Industrial Image Analysis**

**Allan Aasbjerg Nielsen**

**Technical University of Denmark**

Informatics and Mathematical Modelling

http://www.imm.dtu.dk/~aa

DTU Vision Day 2002

There is a tendency in digital image data
acquired in industrial (medical and other) applications towards

- better spatial resolution, i.e., more pixels per unit area,
- better radiometric resolution, i.e., more bits per pixel, and
- better spectral resolution, i.e., more spectral bands or colours per pixel, also beyond the visible part of the electromagnetic (EM) spectrum.

The last of these, improved spectral resolution, where we record not just
greyscale or RGB data but an actual (maybe coarse) EM spectrum for each pixel,
leads to an increased need for digital techniques based on multivariate
statistics and spectral analysis.

The situation with improved spectral
resolution is characterised by more than one variable (also known as a feature)
for each observation (also known as a case, sample or experimental unit). This variable or feature is often the
reflectance at a given wavelength in the EM spectrum or a quantity derived
from it. The data can be organised in a
so-called data matrix with *n* rows (one for each of *n* observations) and *p* columns (one for each of *p* variables). This data matrix can be viewed or analysed
row-wise or column-wise, depending on whether one is primarily interested in relations
(e.g. similarity or proximity) between observations or between variables.
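As a minimal sketch of the data matrix described above, the following NumPy snippet unfolds a small synthetic multispectral image into an *n* by *p* matrix and looks at it both ways. The image size, the random pixel values and the names `image`, `X`, `R` and `D` are assumptions made for illustration only. Correlation and Euclidean distance are just two possible measures of relation.

```python
import numpy as np

# Synthetic stand-in for a multispectral image (assumed sizes, random values):
# 4 x 5 pixels with 3 spectral bands per pixel.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))

rows, cols, p = image.shape
n = rows * cols

# Data matrix: one row per observation (pixel), one column per variable (band).
X = image.reshape(n, p)

# Column-wise view: relations between variables, here the p x p correlation matrix.
R = np.corrcoef(X, rowvar=False)

# Row-wise view: relations between observations, here n x n Euclidean distances.
diff = X[:, None, :] - X[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

print(X.shape, R.shape, D.shape)
```

For real images *n* is large, so the full *n* by *n* distance matrix is rarely formed explicitly. The sketch does so only to make the row-wise view concrete.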

The main division between statistical
analysis methods is related to whether we *a
priori* assume some structure in the data (e.g. we may know beforehand that the
observations naturally divide into groups) or whether we assume no prior
knowledge and discover structure from the data matrix alone. Techniques associated with the latter
situation are often referred to as data mining techniques, and they have
received a lot of attention recently.
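To make the second situation concrete, here is a small sketch of discovering structure from the data matrix alone. It uses principal component analysis computed via the singular value decomposition, a technique chosen here purely for illustration and not named in the text above. The data are synthetic and the loadings, noise level and variable names are all assumptions.

```python
import numpy as np

# Synthetic data matrix (assumed): one hidden factor drives all p = 4 variables,
# plus a little independent noise. No group labels or other prior structure are used.
rng = np.random.default_rng(1)
n, p = 200, 4
latent = rng.normal(size=(n, 1))
loadings = np.array([[1.0, 0.8, 0.6, 0.4]])
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))

# Centre each variable, then decompose the centred data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total variance carried by each principal component.
explained = s ** 2 / np.sum(s ** 2)

# Principal component scores: the observations expressed in the new coordinates.
scores = Xc @ Vt.T

print(np.round(explained, 3))
```

With data built this way, almost all of the variance falls on the first component, which is how the analysis reveals the single underlying factor without being told about it.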

The main objectives of this talk are

- to give an impression of the potential of multivariate statistical techniques for industrial image analysis by sketching methods that have proved useful in other areas of application,
- to illustrate the usefulness of these methods by examples from those application areas, and
- to give an example of a rough-and-ready multivariate analysis of multispectral image data from an industrial application.