# DATA PROCESSING AND ANALYSIS

Following faunal species identification, statistical analysis is carried out using the PRIMER® (Plymouth Routines in Multivariate Ecological Research) software package.

PRIMER® consists of a wide range of univariate, graphical and multivariate routines for analysing the species/samples abundance (or biomass) matrices that result from the biological sampling.

Univariate statistics are calculated in the form of three diversity indices: Margalef’s species richness index, Pielou’s evenness index and the Shannon-Wiener diversity index. Species richness is a measure of the total number of species present for a given number of individuals. Evenness is a measure of how evenly the individuals are distributed among the different species. The diversity index incorporates both of these parameters. Richness ranges from 0 (low richness) to 12 (high richness), evenness ranges from 0 (low evenness) to 1 (high evenness) and diversity ranges from 0 (low diversity) to 5 (high diversity).
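The three indices can be illustrated with a minimal Python sketch using their standard textbook definitions: Margalef's d = (S - 1) / ln N, Shannon-Wiener H' = -Σ p_i log p_i and Pielou's J' = H' / log S, where S is the number of species and N the total number of individuals in a sample. The function name, the example counts and the choice of natural logarithms for H' are assumptions made for illustration; this is not PRIMER® code, and the logarithm base actually used should be checked against the software settings.

```python
import math

def diversity_indices(abundances, log_base=math.e):
    """Return Margalef's d, Shannon-Wiener H' and Pielou's J' for one sample.

    abundances: counts of individuals per species (zeros are ignored).
    log_base:   base used for H' and J' (natural log here, an assumption).
    """
    counts = [n for n in abundances if n > 0]
    if not counts:
        raise ValueError("sample contains no individuals")
    S = len(counts)          # number of species present
    N = sum(counts)          # total number of individuals

    # Margalef's richness: undefined when ln(N) = 0, reported as 0.0 here.
    d = (S - 1) / math.log(N) if N > 1 else 0.0

    # Shannon-Wiener diversity from the proportions p_i = n_i / N.
    H = -sum((n / N) * math.log(n / N, log_base) for n in counts)

    # Pielou's evenness: observed diversity relative to the maximum, log(S).
    J = H / math.log(S, log_base) if S > 1 else 0.0

    return {"S": S, "N": N, "margalef_d": d, "shannon_H": H, "pielou_J": J}

# Hypothetical sample: four species with ten individuals each.
example = diversity_indices([10, 10, 10, 10])
print(example)
```

Because the individuals in the hypothetical sample are spread equally over the four species, evenness is at its maximum of 1 and H' equals log S.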

Following univariate analysis, the faunal data is transformed and used to prepare a similarity matrix. Transformations are carried out in order to weight the contributions of common and rare species. The similarity matrix is then used in classification analysis. The aim of this analysis is to find “natural groupings” of samples, i.e. groups in which samples are more similar to each other than they are to samples in other groups. The PRIMER® programme CLUSTER carries out this analysis by successively fusing the samples into groups and the groups into larger clusters, beginning with the highest mutual similarities and then gradually reducing the similarity level at which groups are formed. The result is represented graphically in a dendrogram, with the x-axis representing the full set of samples and the y-axis representing the similarity levels at which two samples/groups are said to have fused.
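The sequence of transformation, similarity matrix and successive fusion can be sketched as follows. The text does not name the transformation, the similarity coefficient or the linkage rule, so the square-root transformation, Bray-Curtis similarity and group-average linkage used here are assumptions, chosen because they are common in this kind of analysis. All function names and the abundance data are hypothetical, and the sketch is not the CLUSTER programme itself.

```python
def transform(samples, power=0.5):
    """Power-transform abundances (0.5 = square root) to down-weight dominants."""
    return [[x ** power for x in row] for row in samples]

def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two samples."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        return 0.0  # two empty samples: treated as dissimilar by convention here
    return 100.0 * (1 - sum(abs(x - y) for x, y in zip(a, b)) / total)

def similarity_matrix(samples):
    return [[bray_curtis_similarity(a, b) for b in samples] for a in samples]

def cluster(sim, labels):
    """Group-average agglomerative clustering.

    Returns the fusions in order, each as (group_1, group_2, similarity),
    which is the information a dendrogram displays.
    """
    groups = {i: [i] for i in range(len(labels))}
    fusions = []
    while len(groups) > 1:
        best = None
        keys = sorted(groups)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                pairs = [sim[i][j] for i in groups[a] for j in groups[b]]
                mean_sim = sum(pairs) / len(pairs)
                if best is None or mean_sim > best[0]:
                    best = (mean_sim, a, b)
        mean_sim, a, b = best
        fusions.append(([labels[i] for i in groups[a]],
                        [labels[i] for i in groups[b]],
                        mean_sim))
        groups[a] = groups[a] + groups.pop(b)   # fuse the two most similar groups
    return fusions

# Hypothetical abundance data: rows are samples, columns are species.
labels = ["A", "B", "C"]
raw = [[16, 4, 0],
       [9, 4, 0],
       [0, 1, 25]]
sim = similarity_matrix(transform(raw))
fusions = cluster(sim, labels)
for left, right, level in fusions:
    print(left, "+", right, "fuse at", round(level, 1), "% similarity")
```

In this hypothetical data set, samples A and B fuse first at about 91% similarity, and sample C joins them only at about 17%, which is the pattern a dendrogram would show as one tight pair and one outlying sample.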

The similarity matrix is also subjected to a non-metric multi-dimensional scaling (MDS) algorithm, using the PRIMER® programme MDS. This programme produces an ordination, which is a map of the samples in two or three dimensions, whereby the placement of samples reflects the similarity of their biological communities rather than their simple geographical location.

The species responsible for the grouping of samples in the cluster and ordination analyses are identified using the PRIMER® programme SIMPER. This programme determines the percentage contribution of each species to the similarity within each sample group and to the dissimilarity between sample groups. Only two groups of samples are compared at a time, and the influential species are identified for each specific comparison.
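The between-group part of this calculation can be sketched as below, assuming Bray-Curtis dissimilarity as the underlying coefficient (the text does not state which coefficient is used). For every pair of samples drawn from the two groups, each species' share of the pairwise dissimilarity is computed, and the shares are then averaged over all pairs. The function name, species names and data are hypothetical; this is an illustration of the idea, not the SIMPER programme.

```python
def simper_between(group_a, group_b, species):
    """Average contribution of each species to between-group dissimilarity.

    group_a, group_b: lists of samples (already transformed abundances).
    Returns (overall average dissimilarity, ranked list of
    (species, average contribution, percentage contribution)).
    """
    totals = [0.0] * len(species)
    n_pairs = 0
    for a in group_a:
        for b in group_b:
            denom = sum(x + y for x, y in zip(a, b))
            if denom == 0:
                continue  # skip pairs of empty samples
            for i, (x, y) in enumerate(zip(a, b)):
                # Species i's share of the Bray-Curtis dissimilarity of this pair.
                totals[i] += 100.0 * abs(x - y) / denom
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no comparable sample pairs")
    averages = [t / n_pairs for t in totals]
    overall = sum(averages)
    ranked = sorted(zip(species, averages), key=lambda item: item[1], reverse=True)
    result = [(name, avg, 100.0 * avg / overall if overall else 0.0)
              for name, avg in ranked]
    return overall, result

# Hypothetical square-root transformed abundances for two sample groups.
species = ["Species 1", "Species 2", "Species 3"]
group_1 = [[4.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
group_2 = [[0.0, 1.0, 5.0]]
overall, contributions = simper_between(group_1, group_2, species)
print("Average dissimilarity:", round(overall, 1))
for name, avg, pct in contributions:
    print(name, round(avg, 1), str(round(pct, 1)) + "%")
```

In this hypothetical comparison, Species 3 contributes just over half of the average dissimilarity between the two groups, so it would be reported as the most influential species for that pair of groups.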

The physical and chemical data is compiled into data matrices for principal component analysis (PCA). Essentially, the contribution that each parameter makes to the variance within and between samples is calculated, with the overall aim of identifying the parameter(s) that cause the variance in the original data set. Following transformation of the data and analysis using the PRIMER® programme PCA, a two-dimensional PCA plot is produced. The PCA plot defines the positions of the replicates/stations in relation to two axes, each of which represents the full set of variables. Each replicate/station acquires a place on this graph, and its location depends on the variables that are significant to that station and set it apart from the rest. The significant variables that increase to the right and to the left along axis 1 (PC1) are identified in the legend. Similarly, the significant variables that increase towards the top and the bottom of axis 2 (PC2) are also identified in the legend.
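A minimal sketch of how the two plotting axes arise is given below. It assumes that each variable is standardised to zero mean and unit standard deviation before analysis, which the text does not specify, and it finds the first two principal components by power iteration with deflation purely to stay within the Python standard library. The variable names, station values and function names are hypothetical, and the sketch is not the PRIMER® PCA programme.

```python
import math

def standardise(rows):
    """Scale each variable (column) to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardised data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenpair(mat, iterations=1000):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    p = len(mat)
    v = [1.0 + 0.1 * i for i in range(p)]   # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * mat[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v

def pca(rows, n_components=2):
    """Return [(eigenvalue, loadings, % variance)] and per-station scores."""
    z = standardise(rows)
    mat = correlation_matrix(z)
    p = len(mat)
    total_variance = sum(mat[i][i] for i in range(p))
    components = []
    for _ in range(n_components):
        eigenvalue, vec = leading_eigenpair(mat)
        components.append((eigenvalue, vec, 100.0 * eigenvalue / total_variance))
        # Deflation: remove the component just found before seeking the next.
        mat = [[mat[i][j] - eigenvalue * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(x * w for x, w in zip(row, vec)) for _, vec, _ in components]
              for row in z]
    return components, scores

# Hypothetical station data: depth (m), silt (%), organic carbon (%), temperature (°C).
variables = ["depth", "silt", "organic_carbon", "temperature"]
stations = [[10, 20, 1.0, 12.1],
            [15, 35, 1.8, 11.8],
            [20, 40, 2.1, 12.4],
            [25, 60, 3.2, 11.9],
            [30, 70, 3.4, 12.2]]
components, scores = pca(stations)
for axis, (eigenvalue, loadings, pct) in enumerate(components, start=1):
    print("PC%d explains %.1f%% of the variance" % (axis, pct))
    print(dict(zip(variables, [round(w, 2) for w in loadings])))
```

The loadings printed for each axis indicate which variables increase along it, which corresponds to the variables listed in the legend of a PCA plot, and each station's pair of scores gives its position on the plot. The sign of each axis is arbitrary, so a mirrored plot carries the same information.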