Microbial diversity

5. Introduction

5.3. Microbiota

5.3.3. Microbial diversity

Microbiome diversity is a measure of the number of different taxa and their abundances (73). A distinction is made between α-diversity and β-diversity: α-diversity is a measure of diversity within a sample or environment, also called biodiversity, whereas β-diversity is a measure of diversity between different samples or different environments (73, 74).

Several different α-diversity indices or measures exist; however, no consensus has been reached on which measure should be used in which setting (74). Some standard α-diversity measures are presented below.

• Richness is a quantitative measure of the number of different taxa or organisms within a particular sample or environment; for example, the number of species detected within a biopsy sample is called species richness (73). Richness is simply a count and does not take into account the abundances of the different taxa (Figure 7).

• Evenness describes the abundances of taxa in a sample and gives information regarding the distribution of taxa: are the taxa equally distributed, or are some taxa dominating? Low evenness describes an environment where some taxa are dominating, illustrated in the lower left box in Figure 7.

• Chao1 index is a non-parametric method which estimates species richness but also intends to correct for underestimation of species richness due to loss of species during sampling or sequencing (75). Chao1 index uses the number of species with one or two counts to correct the observed number of species in order to estimate a more realistic number of species within a sample (74).

• Shannon diversity is a complex index which takes both the species richness and the relative abundance of each species into account. The index is calculated on a logarithmic scale, so its numerical value cannot be interpreted directly (74).

Figure 7. Illustration of richness and evenness as measures of α-diversity. Richness increases from left to right (upper three boxes). Evenness increases from left to right (lower three boxes). Each symbol represents one taxon; identical symbols represent the same taxon.

• Simpson index also takes the species richness and relative abundance of each species into account, but whereas Shannon index emphasises species richness, the Simpson index emphasises species evenness (76).

• Phylogenetic diversity (PD) is a measure that reflects the molecular or evolutionary diversity of taxa within a sample; it estimates diversity by summing the branch lengths along the tree of life covered by one sample (77). PD provides information about the relatedness of the species or taxa within a sample based on evolutionary similarity, as opposed to the other α-diversity measures, which only give information regarding the count and distribution of taxa within a sample. Higher PD numbers reflect a more diverse sample covering a larger part of the tree of life (77).
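The count-based measures in this list can be illustrated with a minimal Python sketch. The function names and the two example communities are hypothetical, and two choices are assumptions not stated in the text: Chao1 is computed in its bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of taxa observed once and twice, and the Simpson index is reported as 1 minus the sum of squared relative abundances. The Shannon index uses the natural logarithm.

```python
import math


def richness(counts):
    """Number of taxa with at least one observation."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 estimate (assumed form, see lead-in)."""
    singletons = sum(1 for c in counts if c == 1)  # taxa seen once
    doubletons = sum(1 for c in counts if c == 2)  # taxa seen twice
    return richness(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts):
    """Shannon index, -sum(p * ln p) over the observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index reported as 1 - sum(p^2) (assumed convention)."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


# Hypothetical communities with equal richness but different evenness.
even = [5, 5, 5, 5]
skewed = [17, 1, 1, 1]

for name, community in (("even", even), ("skewed", skewed)):
    print(name, richness(community), chao1(community),
          round(shannon(community), 3), round(simpson(community), 3))
```

Both communities have a richness of 4, but the skewed community scores lower on the Shannon and Simpson indices because one taxon dominates. Its three singletons also raise the Chao1 estimate above the observed richness, which reflects the correction for taxa that were probably missed.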

β-diversity measures the diversity between different samples or environments and gives an estimate of how different two communities are (74). A high β-diversity indicates that the two samples or environments share few taxa or species, whereas a low β-diversity indicates that the samples are similar and share most of their taxa (74). β-diversity is often visualised in Principal Coordinates Analysis (PCoA) plots or non-metric multidimensional scaling (NMDS) plots. PCoA applies an eigenvalue decomposition to a distance matrix between observations and visualises the distances in a low-dimensional Euclidean space (78). As opposed to Principal Component Analysis (PCA), PCoA can use different measures of association to calculate the distance matrix, whereas PCA is based on covariance or correlation coefficients and requires a linear relationship between the observations or variables (79). PCoA and NMDS differ in how they use the distance matrix: PCoA is based on eigenvalues, whereas NMDS uses the rank order of the distances between observations (78).
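The eigenvalue step of PCoA can be made concrete with a small, dependency-free sketch. It follows the classical procedure (double-centring of the squared distances, then scaling the leading eigenvectors by the square root of their eigenvalues). Power iteration with deflation stands in for a proper eigensolver purely to avoid external libraries, and it assumes that the leading eigenvalues are positive and well separated. The function name `pcoa`, its parameters and the example points are hypothetical.

```python
import math


def pcoa(dist, n_axes=2, n_iter=500):
    """Return n_axes PCoA coordinates per sample from a symmetric distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * n_axes for _ in range(n)]
    for axis in range(n_axes):
        # Deterministic, normalised start vector for the power iteration.
        v = [math.sin(i + 1 + axis) for i in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue belonging to v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][axis] = v[i] * scale
        # Deflate B so that the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical example: four samples whose distances are exactly Euclidean.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0), (4.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
coords = pcoa(dist)
```

Because the example distances are Euclidean, two axes reproduce them exactly. With ecological dissimilarities such as Bray-Curtis, the first axes generally capture only part of the variation, and an analysis would normally rely on an established implementation rather than this sketch.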

The most common β-diversity measures of association are unweighted UniFrac, weighted UniFrac, the Jaccard index and Bray-Curtis dissimilarity.

• UniFrac, or the unique fraction metric, measures the phylogenetic distance between sets of taxa on the phylogenetic tree as the percentage of the tree's branch length that is unique to one sample or environment (80). If two samples or environments have no unique branches, they are considered phylogenetically similar; conversely, if two samples share no branches and each sample contains only unique branches, they are considered maximally different phylogenetically (80).

• Weighted UniFrac emphasises the abundance of taxa in the calculation so that the most abundant taxa are considered more important (81).

• Unweighted UniFrac only accounts for the presence or absence of different taxa and does not use abundance information in the calculation. Abundant and rare taxa are therefore weighted equally, which makes unweighted UniFrac sensitive to changes in rare taxa (81).

• Bray-Curtis dissimilarity is a measure which quantifies the compositional dissimilarity between two samples based on the taxa counts in each sample. Bray-Curtis is considered an abundance-based β-diversity index (82). The Bray-Curtis dissimilarity is calculated by the formula:

$$BC_{ij} = 1 - \frac{2C_{ij}}{S_i + S_j}$$

where i and j are the two samples, Cij is the sum of the lesser counts of each species found in both samples, and Si and Sj are the total numbers of individuals counted in samples i and j, respectively (Figure 8) (83). The Bray-Curtis dissimilarity is bounded between 0 and 1, where 0 implies that the two samples have the same composition (share all taxa) and 1 implies that the two samples do not share any taxa.

Figure 8. Illustration of the calculation of the Bray-Curtis measure of association. The circles (i and j) represent two different samples (Si and Sj). Each symbol within a circle represents one individual; identical symbols represent the same species. Sample i consists of 17 individuals in total from three different species. Sample j consists of 16 individuals from two different species. Cij is the sum of the lesser counts of each species found in both samples; Si and Sj are the total numbers of individuals in samples i and j, respectively.

• Jaccard index is a so-called presence-absence index which focuses more on rare species than abundance-based indices such as Bray-Curtis do (82). The Jaccard index is calculated by the formula:

$$d_J = 1 - \frac{a}{a + b + c}$$

where a is the number of shared species between samples, b is the number of species occurring exclusively in sample i, c is the number of species occurring exclusively in sample j (84).
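The measures of association in this list can be compared on one pair of samples with a short Python sketch. Bray-Curtis is computed in its dissimilarity form, 1 - 2Cij / (Si + Sj), and Jaccard as 1 - a / (a + b + c), so that 0 means identical samples for both. Unweighted UniFrac is computed as the share of covered branch length that belongs to only one sample, on a tree that is assumed to be already pruned to the taxa present. The toy tree, the sample counts and all function names are hypothetical and are not taken from the figures.

```python
def bray_curtis(counts_i, counts_j):
    """Bray-Curtis dissimilarity for two dicts mapping taxon -> count."""
    shared = counts_i.keys() & counts_j.keys()
    c_ij = sum(min(counts_i[t], counts_j[t]) for t in shared)
    return 1 - 2 * c_ij / (sum(counts_i.values()) + sum(counts_j.values()))


def jaccard_distance(counts_i, counts_j):
    """Jaccard distance 1 - a / (a + b + c), using presence/absence only."""
    taxa_i = {t for t, n in counts_i.items() if n > 0}
    taxa_j = {t for t, n in counts_j.items() if n > 0}
    a = len(taxa_i & taxa_j)  # shared taxa
    b = len(taxa_i - taxa_j)  # taxa only in sample i
    c = len(taxa_j - taxa_i)  # taxa only in sample j
    return 1 - a / (a + b + c)


def covered_branches(taxa, parent):
    """Branches (named by their child node) on the paths from the taxa to the root."""
    branches = set()
    for node in taxa:
        while node in parent:  # the root has no parent entry
            branches.add(node)
            node = parent[node][0]
    return branches


def unweighted_unifrac(counts_i, counts_j, parent):
    """Fraction of the covered branch length that is unique to one sample."""
    br_i = covered_branches([t for t, n in counts_i.items() if n > 0], parent)
    br_j = covered_branches([t for t, n in counts_j.items() if n > 0], parent)
    unique = sum(parent[b][1] for b in br_i ^ br_j)
    total = sum(parent[b][1] for b in br_i | br_j)
    return unique / total


# Hypothetical toy tree: child -> (parent, branch length).
TREE = {
    "A": ("n1", 1.0), "B": ("n1", 1.0),
    "C": ("n2", 1.0), "D": ("n2", 1.0),
    "n1": ("root", 2.0), "n2": ("root", 2.0),
}
# Hypothetical counts per taxon.
sample_i = {"A": 6, "B": 7, "C": 4}
sample_j = {"A": 10, "B": 6}

print(round(bray_curtis(sample_i, sample_j), 3))               # 0.273
print(round(jaccard_distance(sample_i, sample_j), 3))          # 0.333
print(round(unweighted_unifrac(sample_i, sample_j, TREE), 3))  # 0.429
```

The three values differ because each measure uses different information: Bray-Curtis uses the counts, Jaccard only records that taxon C is missing from sample j, and unweighted UniFrac additionally accounts for how much branch length leads exclusively to C.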