Multivariate Statistics With a View to Future Medical Image Analysis

Allan Aasbjerg Nielsen

Knut Conradsen

Department of Mathematical Modelling

Technical University of Denmark

http://www.imm.dtu.dk/~aa

CAFIA, KAS Herlev, 29-30 May 2000

The motivation behind this presentation is a trend observed in digital image data, in medical applications and elsewhere, towards 1) better spatial resolution (more pixels per unit area); 2) better radiometric resolution (more bits per pixel); and 3) better spectral resolution (more spectral bands or colours), so that we obtain not just a grayscale or an RGB image but something that begins to look like an actual electromagnetic spectrum for each pixel in the image. This spectrum may extend beyond the visible range into the UV and IR regions. The last of these trends leads to an increased need for digital techniques based on multivariate statistics and spectral analysis.

Multivariate data are characterised by having more than one variable (here typically a physiological response) for each observation (also known as a case, a sample or an experimental unit; here typically an individual). The recorded data can be organised in a data matrix with *n* rows (one for each observation) and *p* columns (one for each variable). This *n* by *p* data matrix can be viewed and analysed row-wise or column-wise, depending on whether one is interested in relations (e.g. similarities) between observations or between variables.
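As a minimal sketch of this organisation in an imaging setting, the Python/NumPy fragment below treats each pixel as an observation and each spectral band as a variable. That mapping, the synthetic image, and the helper name `image_to_data_matrix` are assumptions made for illustration and are not taken from the presentation.

```python
import numpy as np

def image_to_data_matrix(image):
    """Reshape a (rows, cols, bands) image into an n-by-p data matrix.

    Assumption for illustration: each pixel is one observation (n = rows * cols)
    and each spectral band is one variable (p = bands).
    """
    rows, cols, bands = image.shape
    return image.reshape(rows * cols, bands)

# Synthetic 4 x 5 image with 3 bands (made-up data, not from the presentation).
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))
X = image_to_data_matrix(image)  # shape (20, 3): n = 20 observations, p = 3 variables

# Column-wise view: relations between variables, here a p-by-p covariance matrix.
cov_between_variables = np.cov(X, rowvar=False)

# Row-wise view: relations between observations, here n-by-n Euclidean distances.
diff = X[:, None, :] - X[None, :, :]
dist_between_observations = np.sqrt((diff ** 2).sum(axis=-1))
```

The column-wise summary has size *p* by *p* and the row-wise summary has size *n* by *n*, which is one practical reason the choice of view matters for images with many pixels.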

The main division among multivariate statistical methods is between 1) methods that assume a known structure in the data (e.g. cases may divide into groups) and 2) methods that rely on discovering structure from the data matrix alone (such techniques have lately become known under the joint phrase "data mining"). At the heart of multivariate statistics lie linear transformations of data such as the celebrated principal components analysis. Several examples of this and other types of transformations will be given. Factor analysis, different types of so-called canonical analysis, and unsupervised and supervised classification schemes will also be illustrated. In several cases these established multivariate methods will be extended for use with spatial data.
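To make the idea of a linear transformation concrete, the following is a minimal sketch of principal components analysis computed from the eigendecomposition of the sample covariance matrix. The function name `principal_components` and the synthetic data are assumptions made for illustration; the sketch does not reproduce the examples shown in the presentation.

```python
import numpy as np

def principal_components(X):
    """Principal components of an n-by-p data matrix X.

    Returns the component scores, the loadings (eigenvectors as columns)
    and the component variances (eigenvalues), sorted by decreasing variance.
    """
    Xc = X - X.mean(axis=0)                 # centre each variable
    cov = np.cov(Xc, rowvar=False)          # p-by-p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    scores = Xc @ eigvecs                   # the linear transformation of the data
    return scores, eigvecs, eigvals

# Synthetic correlated data (made up for this sketch): 200 observations, 3 variables.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing
scores, loadings, variances = principal_components(X)
```

The transformed variables in `scores` are mutually uncorrelated, and their variances equal the eigenvalues in `variances`, so the first component carries the largest share of the total variance.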

The main objective of this presentation is to give an impression of the potential of multivariate statistical techniques for image analysis by sketching methods that have proven useful in other areas of application such as geological mapping, change detection in remote sensing, and industrial inspection. The methods will be illustrated by examples from these application areas, and an example of a multivariate analysis of a fundus image will also be given.