Multivariate Statistics and Future Industrial Image Analysis

Allan Aasbjerg Nielsen

Technical University of Denmark

Informatics and Mathematical Modelling

http://www.imm.dtu.dk/~aa

DTU Vision Day 2002, 3 June 2002

Digital image data acquired in industrial (medical and other) applications tend towards

• better spatial resolution, i.e., more pixels per unit area,
• better radiometric resolution, i.e., more bits per pixel, and
• better spectral resolution, i.e., more spectral bands or colours per pixel, also beyond the visible part of the electromagnetic (EM) spectrum.

The last of these, improved spectral resolution, where we record not just greyscale or RGB data but an actual (maybe coarse) EM spectrum for each pixel, leads to an increased need for digital techniques based on multivariate statistics and spectral analysis.

The situation with improved spectral resolution is characterised by more than one variable (also known as a feature) for each observation (also known as a case, sample or experimental unit). Each variable or feature is often the reflectance at a given wavelength in the EM spectrum or a quantity derived from it. The data can be organised in a so-called data matrix with n rows (one for each of n observations) and p columns (one for each of p variables). This data matrix can be viewed or analysed row-wise or column-wise, depending on whether one is primarily interested in relations (e.g. similarity or proximity) between observations or between variables.
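As a rough illustration of the data matrix described above, the following Python sketch reshapes a small multispectral image into n rows (pixels) and p columns (bands) and then looks at it both column-wise and row-wise. The image is synthetic, and the names image, X, band_corr and pixel_dist, the use of NumPy, and the choice of correlation and Euclidean distance as the relations are all assumptions made for illustration only; they are not taken from the talk.

```python
import numpy as np

# Hypothetical multispectral image: 4 x 5 pixels, 3 spectral bands.
# The values are synthetic random numbers, not real reflectances.
rng = np.random.default_rng(0)
rows, cols, bands = 4, 5, 3
image = rng.random((rows, cols, bands))

# Data matrix X: one row per observation (pixel), one column per variable (band).
X = image.reshape(-1, bands)
n, p = X.shape  # n = rows * cols observations, p = bands variables

# Column-wise view: relations between variables,
# here a p x p correlation matrix between bands.
band_corr = np.corrcoef(X, rowvar=False)

# Row-wise view: relations between observations,
# here an n x n matrix of Euclidean distances between pixel spectra.
diff = X[:, None, :] - X[None, :, :]
pixel_dist = np.sqrt((diff ** 2).sum(axis=-1))

print("data matrix:", X.shape)
print("band correlations:", band_corr.shape)
print("pixel distances:", pixel_dist.shape)
```

The n x n distance matrix grows quadratically with the number of pixels, so for images of realistic size one would normally work on a subsample of pixels or use the p x p column-wise summary instead.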

The main division between statistical analysis methods is whether we assume some structure in the data a priori (e.g. we may know beforehand that the observations naturally divide into groups) or whether we assume no prior knowledge and discover structure from the data matrix alone. Techniques associated with the latter situation are often referred to as “data mining” techniques, and they have received a lot of attention recently.

The main objectives of this talk are

• to give an impression of the potential of multivariate statistical techniques for industrial image analysis by sketching methods that have proved useful in other areas of application,

• to illustrate the usefulness of these methods by examples from those application areas, and

• to give an example of a rough-and-ready multivariate analysis of multispectral image data from an industrial application.