Multivariate Statistics and Future Industrial Image Analysis
Allan Aasbjerg Nielsen
Informatics and Mathematical Modelling
Vision Day 2002,
There is a tendency in digital image data acquired in industrial (medical and other) applications towards
The latter, improved spectral resolution, where we record not just greyscale or RGB data but an actual (maybe coarse) EM spectrum for each pixel, leads to an increased need for digital techniques based on multivariate statistics and spectral analysis.
The situation with improved spectral resolution is characterised by more than one variable (also known as a feature) for each observation (also known as a case, sample or experimental unit). This variable or feature is often the reflectance at a given wavelength in the EM spectrum, or a quantity derived from it. The data can be organised in a so-called data matrix with n rows (one for each of n observations) and p columns (one for each of p variables). This data matrix can be viewed or analysed row-wise or column-wise, depending on whether one is primarily interested in relations (e.g. similarity or proximity) between observations or between variables.
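As a minimal sketch of this layout, assume a multispectral image is held as a NumPy array of shape (rows, cols, bands). The array sizes, the random values and the helper name image_to_data_matrix are illustrative assumptions, not taken from the talk. Each pixel becomes one observation (a row) and each band one variable (a column). The covariance matrix and Euclidean distances are chosen only as simple examples of the column-wise and row-wise views.

```python
import numpy as np

def image_to_data_matrix(image):
    """Reshape a (rows, cols, bands) image into an n x p data matrix.

    n = rows * cols observations (pixels), p = bands variables.
    """
    rows, cols, bands = image.shape
    return image.reshape(rows * cols, bands)

# Hypothetical 4 x 5 image with 3 spectral bands (random values, illustration only)
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))

X = image_to_data_matrix(image)  # n = 20 observations, p = 3 variables

# Column-wise view: relations between variables (p x p covariance matrix)
cov = np.cov(X, rowvar=False)

# Row-wise view: relations between observations (n x n Euclidean distances)
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
```

For realistic image sizes the n x n distance matrix quickly becomes too large to hold in memory, so row-wise analyses would typically work on a subsample of pixels.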
The main division between statistical analysis methods is whether we assume some structure in the data a priori (e.g. we may know beforehand that the observations naturally divide into groups) or whether we assume no prior knowledge and discover structure from the data matrix alone. Techniques associated with the latter situation are often referred to as data mining techniques, and they have received a lot of attention recently.
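The two situations can be contrasted in a small sketch. The synthetic data, the group labels and the two techniques used here (assignment to the nearest group mean when groups are known, principal components when they are not) are assumptions made for illustration only. They are not presented as the methods of the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data matrix: two groups of 50 observations each, p = 3 variables
X = np.vstack([rng.normal(0.0, 1.0, (50, 3)),
               rng.normal(4.0, 1.0, (50, 3))])
labels = np.repeat([0, 1], 50)  # known group membership (prior structure)

# Prior structure assumed: estimate one mean per known group,
# then assign every observation to the nearest group mean.
means = np.array([X[labels == g].mean(axis=0) for g in (0, 1)])
d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=-1)
assigned = d.argmin(axis=1)

# No prior structure assumed: principal components computed from the
# data matrix alone, via the eigen-decomposition of the covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xc @ eigvecs                      # observations in component space
```

The labels enter only the first analysis. The second uses nothing but X, and any grouping would have to be read off from the scores afterwards.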
The main objectives of this talk are
to give an impression of the potential of multivariate statistical techniques in industrial image analysis by sketching methods that have proved useful in other areas of application,
to illustrate the usefulness of these methods by examples from those application areas, and
to give an example of a rough-and-ready multivariate analysis of multispectral image data from an industrial application.