Genetic structure of diploid (2n = 12, 14) Scurvygrasses (Cochlearia) with emphasis on Icelandic populations
Luka Natassja Olsen
MSc Thesis
Centre for Ecology and Evolutionary Synthesis, Department of Biosciences UNIVERSITY OF OSLO
September 2015
© Luka Natassja Olsen 2015
Genetic structure of diploid (2n = 12, 14) Scurvygrasses (Cochlearia) with emphasis on Icelandic populations
Luka N. Olsen
http://www.duo.uio.no/
Cover art: Sondre Strøm Linde
Print: University Print Central, University of Oslo
Spoonwort doth warm, and also doth dry, In the Scurvy 'tis a great Remedy, It sends out all corrupt humors by sweat With this your mouth gargel often, and wet.
This plant which deserves so much of your praise The Apothecaries use six several wayes, It's Spirit, Syrup, Water procures health, So doth its Salt conserve, and the Herb itself.
'Six several ways' (of using the treasured Spoonwort (Cochlearia) as a remedy of scurvy), as cited by Lorenz (1953)
Preface
A big thank you to my main supervisor Anne, for meetings with coffee, tea and hugs. Thank you for help in the laboratory and with calculations, for countless corrections and comments on drafts, and for answers to a myriad of questions. Thank you to my co-supervisor Inger, for all your expert help with nomenclature and taxonomy in the field of Cochlearia. In addition, thank you for joining Anne and me in the field in Iceland as our private driver, even though it turned out to involve risky trips across hraun to scrape Cochlearia out of rock crevices! Thank you to my co-supervisor Charlotte, for helping with old Icelandic floras at Tøyen. Thank you to Marie, for showing me the way in RAD-seq, STACKS and everything else! I am glad we had time to get to know each other before you finished your thesis. Thank you also to the rest of Anne's group, for pleasant Tuesday meetings.
Thank you to Sigríður and Ingrid Johansen, for translating old Icelandic floras for me.
Thank you to Paul Grini, for taking the time to discuss epigenetics. Thank you, Robin, for all your thoughts, and for pushing me into trying bioinformatic gymnastics. Thank you, Annie, for feedback on my writing. Many thanks to Terezie Mandáková and Martin Lysák, for letting me stay in Brno. And to Terezie especially, for all help with the chromosome counts, both at Masaryk University in the Czech Republic, and here at UiO.
A particularly big thank you to Stine, Tonje and Mathilde, for letting me be a fraction of our fantastic four-leaf clover. I am not sure I would have enjoyed my studies as much had it not been for you. Now that the bad student conscience is finally fading, I hope we can do something together outside the reading room again. Thank you also to all my lovely flatmates, Åsa, Sverre, Veronika and Linnea, for the hugs that were often on offer and the dinner in the fridge when I came home late. Thank you to my family, especially Ronja, for all the encouraging words, and for forcing me to take coffee breaks even when I think I do not need them. Last but not least, thank you to Sondre, for making the cover for me, and for always thinking I am the best in the world, whatever I turn my hand to.
Index
Abstract
1 Introduction
1.1 Taxonomical treatment of Cochlearia in Iceland
1.2 Research aims and questions
2 Materials and methods
2.1 Plant material
2.2 Chromosome counting
2.3 Morphometry
2.4 DNA extraction
2.5 RAD-sequencing
Processing the raw RAD-seq reads
Population structure analysis
Tree and network analyses
PCA analyses
Maps
3 Results
3.1 Chromosome counting
3.2 Morphometry
3.3 RAD-sequencing
Population structure analysis
Tree and network analysis
Bayesian phylogeny
PCA analysis
4 Discussion
4.1 Do Icelandic plants with different chromosome number or ecology constitute different genetic clusters?
4.2 How is the evolutionary relationship between the Icelandic plants and other diploid Cochlearia species?
4.3 How can the results from this study be guiding for taxonomical decisions in Flora Nordica?
References
Appendix
Abstract
Section Cochlearia (Brassicaceae) includes highly polymorphic species complexes with regard to ploidal level, ecological adaptation and distribution. Low levels of chloroplast DNA divergence suggest that taxa most likely have diversified relatively recently, and that speciation is still ongoing. This has led to conflicting taxonomic treatments. The European Cochlearia displays a range of ploidal levels, from diploid to decaploid. Diploid species with chromosome number 2n = 12 dominate in southwestern Europe, whereas the arctic Cochlearia is diploid with 2n = 14. In Iceland, diploid plants of both basic numbers (2n = 12, 14) are found. Whereas the 2n = 12 plants are found only in beach cliffs along the Icelandic coast, the 2n = 14 plants are found in two different habitats: in snowbeds on inland mountains, and along the western coast of Iceland. There is still no agreement as to which taxa the Icelandic plants belong to. It has been suggested that the 2n = 14 plants belong either to the arctic diploid C. groenlandica (2n = 14) or constitute a subspecies of the tetraploid C. officinalis (2n = 24). The 2n = 12 plants have been related either to C. groenlandica or to the southwestern European diploid C. pyrenaica (2n = 12). In this study, single nucleotide polymorphisms (SNPs) derived from RAD-sequencing were applied to study whether the Icelandic Cochlearia plants constitute genetic clusters in accordance with chromosome number or ecology. Additionally, to investigate their evolutionary relation to other Cochlearia species, Icelandic plants were compared to recognized diploid species in Svalbard (C. groenlandica) and southwestern Europe (C. pyrenaica and C. aestuaria). Analyses of SNP data showed that Icelandic plants cluster according to ecology, and not according to chromosome number. Furthermore, the genetic variation among the Icelandic populations displays a geographic pattern, where plants sampled in closely located sites are more similar irrespective of chromosome number. Icelandic plants do not cluster with southwestern European plants, but alpine (2n = 14) plants on Iceland consistently group with C. groenlandica in Svalbard. Based on the results from this study, it is suggested to refer alpine Icelandic plants to C. groenlandica. The Icelandic coastal plants show no clear genetic or morphological separation between plants with different chromosome number (2n = 12, 14), and they should therefore be referred to the same taxon. However, because the relation to C. officinalis (2n = 24) was not addressed in this study, it is not decided which taxon they should be referred to.
1 Introduction
[Cochlearia] has always proved to me to be one of the most intractable boreal genera […] Habit, pods and leaves afford the characters hitherto made use of; and all are equally fallacious, as far as affording permanent distinctions (Hooker 1861, p. 317).
The genus Cochlearia L. is part of the mustard family (Brassicaceae) and is widely distributed in Europe and the circumpolar region. The genus comprises two sections: Cochlearia and Glaucocochlearia O.E. Schulz (Koch et al. 1996). The focus of this study will be on a selection of diploid taxa within the section Cochlearia.
Section Cochlearia is known as a notoriously difficult group when it comes to taxonomic delimitations (Saunte 1955, Gill 1965, Hultén 1971, Gill 1971a, 1971b, 1973, 1978, Nordal 1988, Nordal and Laane 1990, Nordal and Stabbetorp 1990, Koch et al. 1996, 1999). The opening quote of Hooker, on his observation of morphological traits more than 150 years ago, is still applicable today. The section as a whole appears to be quite young, possibly with a post- or late-glacial origin (Koch et al. 1996, Koch 2012). Speciation is evidently ongoing within this group, and considerable chromosome evolution has taken place without corresponding morphological differentiation (Koch et al. 1996). Two basic diploid series exist, based on haploid numbers x = 6 and x = 7. The x = 6 series is suggested to be the most basal, and it is found from south temperate to north boreal Europe, while the x = 7 series is widespread in arctic regions (Gill 1971a, 1973, Nordal and Laane 1990, Koch et al. 1996).
The geographically disjunct populations of 2n = 12 plants seen today in Iceland, Scotland and southwestern Europe are possibly relicts of a formerly wider distribution (Koch et al. 1996). Koch et al. (1996) suggested that during the Pleistocene, Cochlearia plants similar to the extant 2n = 12 species (C. pyrenaica DC. or C. aestuaria (J.Lloyd) Heywood) survived in refugia south of the glaciers in southwestern Europe. As the glaciers retreated, the diploids spread northwards from the refugia, underwent chromosomal changes and diversified.
Cytological studies have shown that the circumpolar 2n = 14 (x = 7) most likely derived from 2n = 12 (x = 6) by primary tetrasomy (doubling of one chromosome pair; Gill 1971a, 1973, Nordal and Laane 1990). Tetrasomy is suggested to create potential for greater variation (Nordal and Laane 1990), and it is speculated that the x = 7 series evolved in western Europe, possibly on Iceland, and subsequently colonized the circumpolar and arctic region (Svalbard, Greenland, North America and Siberia; Nordal and Laane 1990, Koch et al. 1996).
Large cytological variation is present in the section Cochlearia, ranging from diploids to decaploids (Saunte 1955, Gill 1965, 1973, Nordal et al. 1986, Nordal and Laane 1990).
Crossing experiments across the chromosome levels have resulted in fully or partially fertile hybrids (Crane and Gairdner 1923, Gill 1971a, 1973, 1975, Fearn 1977, Koch et al. 1996, Pegtel 1999). Taxon recognition based on morphology without knowing the ploidal level or geographic origin can be very difficult, if not impossible, as individuals of the same taxon may show large variation in morphology depending on their habitat. In particular, the level of soil nitrogen, light conditions and stress factors (e.g. salinity) are likely to influence phenotypic traits such as size, erect versus prostrate habit, leaf shape and development of stem leaves (Elkington 1984, Nordal et al. 1986, Nordal and Stabbetorp 1990, Pegtel 1999).
Accordingly, environmental influences must be accounted for when morphological traits are given taxonomic weight. On the other hand, separate evolutionary entities that are morphologically similar might be mistakenly included into the same taxon if morphological divergence has not happened at the same speed as genetic divergence and reproductive isolation (Shaw 2000, Chan et al. 2002, Whittall et al. 2004, Duminil and Di Michele 2009).
Determination of species’ boundaries of recently diverged lineages is a general problem in organism groups because populations may not have been isolated long enough to accumulate significant differences in the traits considered (Shaffer and Thomson 2007, Escobar García et al. 2009, Lega et al. 2012, Slovák et al. 2012, Dick et al. 2014). Different species concepts often focus on particular properties of the diverging populations, e.g. diagnosable morphological traits, internal reproductive isolation, restricted gene flow or monophyly (cf.
Medrano et al. 2014). However, the evolutionary changes that lead to such properties through speciation do not necessarily occur at the same time or in a regular order (De Queiroz 2005, 2007). Consequently, delimitation of taxa may vary among authors depending on which traits they focus on, and which species concept and methods they use. The treatment of diverging lineages might therefore often result in incompatible taxonomical treatments (De Queiroz 2005, 2007). In an attempt to unify the different more or less incompatible species concepts, De Queiroz (2005) recognized that all concepts are based on a general understanding of species as a separately evolving metapopulation lineage, and considered this to be the only necessary property. As an example, if two metapopulations (each consisting of several subpopulations) diverge with regard to one or several properties, this is according to De
Queiroz sufficient to postulate that they are different species. Consequently, properties such as reproductive isolation and ecological differentiation are not needed to delimit species, but can be used as evidence for species boundaries and to determine the degree of separation. In addition, these properties can be used as descriptions of the separation (e.g. reproductively isolated species or ecologically differentiated species). As mentioned, Cochlearia is of a young age and consists of species characterized by high phenotypic plasticity, introgression and ongoing differentiation. The degree of separation between units, however, is still unknown.
1.1 Taxonomical treatment of Cochlearia in Iceland
For the genus Cochlearia, Iceland is of particular interest since it is the only area where plants with different diploid chromosome counts co-occur (2n = 12, 14) (Gill 1971a, Nordal and Laane 1990, Koch et al. 1996, Koch et al. 1998). The plants are distributed along the entire coast of Iceland, and have in addition a scattered distribution on inland mountains.
Chromosome numbers have, so far, only been obtained on a limited number of populations (Appendix, Table A1). Two ecologically differentiated Cochlearia forms exist in Iceland. A dwarfed “alpine” form with a chromosome number of 2n = 14 is found in late snow beds on inland mountains, whereas a larger “coastal” form is found in beach cliffs along the coast.
Both diploid chromosome numbers are, so far, reported for the coastal plants, but 2n = 12 plants are reported only from a restricted part of the southern coast of Iceland (except that Gill referred to an unpublished observation of 2n = 12 plants in northwestern Iceland; Gill 1971a).
On the other hand, 2n = 14 plants are reported from the west coast of Iceland (Saunte 1955, Gill 1971a, Löve 1975, Nordal and Laane 1990).
Icelandic populations of Cochlearia have been treated very differently both with regards to taxonomy and nomenclature through the years, varying greatly depending on the author (Appendix, Table A2). In the second edition of Flóra Íslands (Stefánsson 1924, first edition from 1901, not seen), all Icelandic plants were referred to one species, C. officinalis L., and were subdivided into three varieties, var. groenlandica (L.) Gelert., var. oblongifolia (DC.) Gelert., and var. arctica (Schlecht.) Gelert. The mountain form was referred to var.
groenlandica, whereas the coastal forms were referred to var. oblongifolia and var. arctica based on differentiation in leaf and fruit forms (Stefánsson 1924). This delimitation was upheld in the third edition (Stefánsson 1948). In Íslenzkar Jurtir, Löve (1945) referred all
Cochlearia populations to one species, C. officinalis, without subspecific taxa. In Flora Europaea (Tutin et al. 1964), two species were referred to Iceland: C. fenestrata R.Br. and C.
groenlandica L., with C. fenestrata probably representing the coastal plants, and C.
groenlandica the alpine form. The paragraph on these taxa concludes: “There has been much confusion between the two taxa, consequently their distribution is uncertain” (Tutin et al.
1964). Pobedimova (1969) revised the circumpolar plants in the genus. She reported C.
groenlandica from northeastern Iceland and C. islandica Pobed. all along the coast, except the southern part of Iceland. Additionally, Pobedimova (1970) reported C. anglica L. in the northwest, close to Snæfellsnes. In Icelandic Excursion Flora, Löve (1970) referred the Icelandic plants to C. groenlandica ssp. islandica (Pobed.) Á. Löve. Later, however, he referred the Icelandic taxa to two genera, Cochlearia based on x = 6, and Cochleariopsis based on x = 7 (Löve 1983). He reported two taxa from Iceland: Cochlearia pyrenaica (2n = 12), and Cochleariopsis groenlandica (L.) Löve & Löve ssp. islandica (Pobed.) Löve & Löve (2n = 14). This has not been accepted in any later treatments (Elven 2011), as the morphological differences are negligible and the chromosome difference is probably a result of tetrasomy (Gill 1971a, Nordal and Laane 1990). In Flowering plants and ferns of Iceland, Kristinsson (1986) referred all the Icelandic plants to C. officinalis. He also mentioned the unclear origin of the mountain plants, and suggested that they might belong to a species distinct from the coastal plants (Kristinsson 1986). The treatment of the Icelandic plants as C.
officinalis is upheld in the third edition (Kristinsson 2010). Nordal and Laane (1990) chose not to group the Icelandic plants with C. officinalis, based on experiments that showed significant differences in flower and seed size as well as differences in mode of reproduction (Icelandic plants were supposed to be self-compatible, whereas C. officinalis proved to be obligate outcrossers). Instead, Nordal and Laane (1990) grouped the Icelandic plants together and referred them to C. groenlandica based on morphology, chromosome number and reproductive biology. They further suggested that the ecologically and morphologically differentiated alpine and coastal plants on Iceland might deserve subspecific recognition.
Few molecular studies have been conducted on the Icelandic Cochlearia plants, and it is so far unclear whether they belong to the same genetic cluster, or whether they represent two or more separate genetic clusters corresponding to chromosome number and/or habitat. In a study on chloroplast divergence in section Cochlearia, Koch et al. (1996) included European diploid (2n = 12) C. pyrenaica and C. aestuaria, distributed in southwestern Europe as well as Icelandic plants from a 2n = 12 coastal population and four 2n = 14 populations (sample
location is known for only one population, which is a coastal population), all referred to by the authors as C. groenlandica. All the 2n = 12 diploid plants turned out to have a distinct chloroplast (cpDNA) type (B) and grouped together irrespective of geographic origin. The four Icelandic 2n = 14 populations had another cpDNA type (F), and grouped with C.
officinalis individuals from northern Scandinavia, which also had this cpDNA type (Koch et al. 1996). Isoenzyme studies grouped the Icelandic 2n = 12 plants with C. pyrenaica, while Icelandic 2n = 14 plants were grouped with C. aestuaria (Koch et al. 1998).
1.2 Research aims and questions
Given the partly contradictory information on cytological, ecological, molecular and morphological data observed in Icelandic Cochlearia, the aim of this thesis is to further investigate evolutionary relationships and taxonomic status of these plants. Restriction site Associated DNA Sequencing (RAD-seq) will be used to detect possible genetic structure among Icelandic populations. RAD-seq is a method that makes it possible to create a reduced representation of the genome by sequencing fragments of nucleotides next to restriction enzyme cutting sites and searching these for molecular markers such as single-nucleotide polymorphisms (SNPs). This can be done for multiple individuals and populations simultaneously (Baird et al. 2008). RAD-seq can be performed de novo, without the need of a reference genome, which makes it suitable for studies on non-model species. Chromosome count information will be obtained from several populations, and results from a small pilot study on morphological leaf traits will be compared to the molecular and cytological results.
Specifically, the following research questions will be addressed:
Do Icelandic plants with different chromosome number (2n = 12/14) or ecology (coastal/alpine) constitute different genetic clusters?
How is the evolutionary relationship between the Icelandic plants and other diploid Cochlearia species, specifically: C. groenlandica (Svalbard), C. aestuaria (Spain), and C. pyrenaica (Spain, France)?
How can the results from this study be guiding for taxonomical decisions in Flora Nordica?
2 Materials and methods
2.1 Plant material
Plant material was mainly sampled in Iceland during one week in August 2014. Leaf samples were collected from ten individuals per population, twelve populations in total (abbreviations of locality names are used throughout the thesis according to Fig. 1; Appendix Table A3). Ten of the sampled populations were located close to the sea and considered as representatives of the ‘coastal’ ecotype, while two populations were sampled on inland mountains at 920-1030 m.a.s.l., representing the ‘alpine’ ecotype. Sampling localities were chosen based on previous publication of chromosome numbers (Saunte 1955, Gill 1971a, Löve 1975, Nordal and Laane 1990, Koch et al. 1998, Appendix Table A1), or from locality records from a database at the Icelandic Institute of Natural History (IINH).
In the field, healthy green leaves were harvested and dried instantly on silica gel for subsequent DNA extraction. Seeds were collected when present and later germinated in a phytotron at the University of Oslo, Department of Biosciences. From most populations (except for populations STK, HVA, DJU and HAF) living specimens were additionally sampled and cultivated in the phytotron. Herbarium vouchers were made from field collected specimens for all populations and will be deposited at the Natural History Museum, University of Oslo (Appendix, Table A3). For cytological studies, flower buds were harvested in the field or in the phytotron.
Additional material sampled by collaborators was obtained from three Icelandic populations (either as seeds, silica dried material or both, Fig. 1), and from Svalbard, France and Spain (Appendix Tables A3, A4).
Seeds were subjected to stratification before sowing. Five layers of paper tissue were placed in a petri dish and covered by filter paper. Using a syringe, 5 ml water was added before 14 healthy seeds were evenly distributed in the petri dish. The seeds were covered by filter paper and sprinkled with 2 ml water. Petri dishes were sealed with Parafilm before being placed in a cold room at 4 °C for approximately three weeks. Seeds were sown in standard soil (S-jord) in the phytotron with conditions of 18 h day at 18 °C and 6 h night at 10 °C. As most Cochlearia species are biennial or perennial (Gill 1971a) and do not flower until the second (biennial) and subsequent seasons (perennial), plants were allowed to develop a leaf rosette for about 2.5 months before they were moved to simulated winter conditions (10 h day and a temperature of less than 9 °C) for 1.5 months, and then moved back to summer conditions to induce flowering.
As not all plants flowered after the first induced winter, they were exposed to winter conditions for a further 1.5 months. During the second induced summer, conditions were changed to speed up the flowering process: 18 h day at 20 °C and 6 h night at 18 °C.
Figure 1: Map of Iceland with indication of locations from where 15 populations of Cochlearia were sampled; abbreviated names in brackets are used throughout the study. Inserted map in top left corner shows the distribution of Cochlearia in Iceland (map from Kristinsson 2010). For further locality information, see Appendix Table A3.
2.2 Chromosome counting
From plants in the field or the phytotron, we collected whole inflorescences containing buds which were early in their development (the biggest bud in the inflorescence was yellow).
Chromosome counts were obtained from altogether 12 Icelandic populations. From most populations only one individual per population was counted, except for population BAE from
which four individuals were counted. Population HJO was not included as no flowers were available in the field, planted seeds did not sprout, and the live plant in the phytotron did not flower. Seeds of populations HFN and LAT were not available. Inflorescences were placed in freshly prepared Carnoy’s fixative I (3 parts ethanol, 1 part glacial acetic acid), which was changed once before the inflorescences were transferred into tubes with 70 % ethanol and kept in the freezer for long-term storage. This material was later used for chromosome counts done either together with collaborators at Masaryk University in Brno, Czech Republic, or at the University of Oslo. Chromosome slide preparation was performed following parts of a protocol originally developed for chromosome painting (Lysak and Mandáková 2013).
Counted individuals did not necessarily correspond to those included in the RAD-seq library.
Floral material was first rinsed with distilled water for 10 min before suitable parts of the inflorescence were selected and washed twice in 1 x citrate buffer (Appendix Table A5) for 5 min while shaking on an orbital shaker. The 1 x citrate buffer was removed and the tissue submerged in ~1 ml pectolytic enzyme mixture (Appendix Table A6 a, b) at 37 °C for 3 h.
The enzyme mixture was replaced by 1 x citrate buffer and the digested material kept on ice until use. A single flower bud was placed on a FisherBrand Superfrost plus microscope slide (Fisher Scientific, Pittsburgh, USA) using a Pasteur pipette together with 20 µl 60% acetic acid. The bud was disintegrated to break the cells using a dissection needle, and the slide was placed on a heating block (50 °C) for 2 min. During this step the cell suspension was spread by circular stirring of the 60% acetic acid droplet using a needle that was held horizontally (without touching the slide). The chromosomes were fixed in 100 µl Carnoy’s fixative I (pipetted as four drops around the suspension drop and lastly one in the middle). The fluid was discarded by tilting the slide, which was then quickly dried for two seconds using a hair dryer. The slide was quality checked in a light microscope with phase contrast before 20 µl DAPI (4',6-diamidino-2-phenylindole; Appendix Table A7) was applied and the slide covered with a coverslip. For each individual, multiple microscope slides were prepared and examined. At Masaryk University, Brno, the search for clear nucleus spreads was done using an Olympus BX-61 epifluorescence microscope and CoolCube CCD Camera (Metasystem), and pictures were processed in Adobe Photoshop CS2 (Adobe Systems). Chromosome counts of three of the populations (STR, HNF and SUR) were determined in Oslo using both a light microscope with phase contrast and a Zeiss Axioplan Imaging2 epifluorescence microscope system equipped with Nomarski optics, epifluorescence attachment and the software Zeiss AxioVision 4.8. After chromosome counting, the slides were stored in a dust free box at 4 °C.
2.3 Morphometry
After several months (depending on whether plants were grown from seeds or collected as live plants) of exposure to the same conditions in the phytotron, five leaves from four individuals per population (10 populations in total) were sampled. For populations DYR, ING, BAE and OLF, material included in the morphometric analyses was collected both from plants sampled in the field (and subsequently cultivated in the phytotron) and from plants grown from seeds in the phytotron. For populations STK, HVA, DJU and HAF, only material sampled from plants germinated from seeds was included, as live plants were not sampled in the field. For the alpine populations EIR and GIL, only material from live plants sampled in the field was included, as seeds were not available or did not germinate. Only three individuals from the EIR population were available for morphometric analyses. The two populations HJO and STR were excluded from the morphometric analyses as phytotron material was available from only one plant. Population SUR was excluded since all individuals presumably suffered from a virus causing sickly plants with crippled leaves.
Populations HFN and LAT were only available as silica-dried field material and were therefore not included in the morphometric analyses either.
Because only a few individuals were flowering in several of the populations, flower traits were not used in morphometric analyses. Instead, leaf traits previously recognized by Nordal and Laane (1990) as informative were used: maximum leaf length (L), maximum leaf width (W) and leaf base angle (Fig. 2). Maximum leaf length was measured as the vertical line drawn from the leaf tip (apex) to the attachment of the leaf stem. The apex was considered the part of the lamina farthest removed from the point of attachment of the leaf to the stem. Maximum leaf width was the distance between the leftmost and rightmost points along the horizontal line perpendicular to the vertical leaf length line. The measurements were used to calculate leaf surface area (L x W) and leaf ratio (W/L). Leaf base angle was measured as the angle, in degrees, between the lines drawn from the point of attachment along the lower margins of the leaf.
Leaf surface area gives information about leaf size, while leaf ratio and leaf base angle give information about leaf shape. The average of an individual’s five leaves was calculated.
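The derived traits described above can be illustrated with a minimal sketch. The function names and the measurements below are hypothetical and only serve to show the arithmetic (area as L x W, ratio as W/L, and the per-individual mean over five leaves); they are not data from this study.

```python
from statistics import mean


def leaf_traits(length, width):
    """Return (surface area as L x W, leaf ratio as W/L) for one leaf."""
    return length * width, width / length


def individual_means(leaves):
    """Average area, ratio and base angle over the leaves of one individual.

    `leaves` is a list of (length, width, base_angle_in_degrees) tuples.
    """
    areas, ratios, angles = [], [], []
    for length, width, angle in leaves:
        area, ratio = leaf_traits(length, width)
        areas.append(area)
        ratios.append(ratio)
        angles.append(angle)
    return mean(areas), mean(ratios), mean(angles)


# Hypothetical measurements (length in mm, width in mm, angle in degrees)
# for five leaves of a single plant.
example_leaves = [(20, 10, 90), (10, 10, 120), (30, 15, 100),
                  (20, 20, 110), (40, 10, 80)]
mean_area, mean_ratio, mean_angle = individual_means(example_leaves)
print(mean_area, mean_ratio, mean_angle)
```

Note that L x W is a size proxy (the area of the bounding rectangle) and not the true lamina area.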
Differences in morphological traits were investigated using the following statistical analyses: Levene's test (Levene 1960), the Shapiro-Wilk test (Shapiro and Wilk 1965), the Kruskal-Wallis test (Kruskal and Wallis 1952) and the Mann-Whitney U test (Mann and Whitney 1947), as well as box plots. These tests were run in EXCEL using the REAL STATISTICS RESOURCE PACK software (Release 4.3, copyright (2013-2015) Charles Zaiontz, www.real-statistics.com).
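As an illustration of the rank-based logic behind the Mann-Whitney U test, the sketch below computes the two U statistics for a pair of independent samples by direct pairwise comparison. It is a hypothetical helper, not the Excel add-in used in this study, and it returns only the statistics; no p-value or tie correction for the significance test is computed.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b; ties count as 0.5.
    The two statistics always sum to len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical leaf base angles (degrees) for two groups of plants.
coastal = [110, 120, 125, 130]
alpine = [80, 95, 100]
print(mann_whitney_u(coastal, alpine))
```

When every value in one group exceeds every value in the other, as in the example, one statistic equals the number of pairs and the other is zero, which is the strongest possible separation for those sample sizes.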
Figure 2: Illustration roughly depicting how measurements were made in leaf morphometric analyses of Cochlearia plants from Iceland. Examples of two leaves with very different size and shape are shown.
Tests for the critical requirements of homoscedasticity (Levene's test) and normality (Shapiro-Wilk test) for parametric analysis were performed. Because these requirements were rejected, subsequent analyses were done using non-parametric tests: Kruskal-Wallis and Mann-Whitney U tests for two independent samples. The Kruskal-Wallis test was used to test for significant differences among the population samples with respect to leaf surface area, ratio and base angle. The Mann-Whitney U test was used to test for significant differences in the mentioned traits between two groups (according to habitat, chromosome number or chromosome number within or between habitat). To test which of the populations were significantly different, Dunn’s test for multiple comparisons (Dunn 1964) was performed using the DUNN.TEST 1.2.4 (Dinno 2015) package in R. To visualize potential trends or patterns when combining the effect of the three morphological variables (leaf surface area, leaf ratio and leaf base angle), a Principal Component Analysis (PCA) biplot was created in R using the VEGAN 2.3-0 (Oksanen et al. 2013) package. PCA identifies the ordination axes that correspond to the greatest variability in the data set by reducing multidimensional data into lower dimensions while still retaining most of the information (Sparks et al. 1999). To include the different variables in the same analysis, the data were normalized by calculating the mean and standard deviation of each variable (leaf surface area, ratio and angle) separately. Each observation (X_i) was converted into a corresponding Z score, Z_i = (X_i - µ)/s, where µ is the mean and s the standard deviation of the variable. This preserves the shape of the original data while setting the mean to 0 and the standard deviation to 1.
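The Z-score conversion can be sketched as follows. This is a minimal illustration with hypothetical values; it assumes that s is the sample standard deviation (n - 1 in the denominator), which the text does not specify.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one variable: Z_i = (X_i - mu) / s."""
    mu = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(x - mu) / s for x in values]


# Hypothetical mean leaf surface areas (mm^2) for four individuals.
areas = [120.0, 310.0, 95.0, 400.0]
standardized = z_scores(areas)
print(standardized)
```

Each variable (area, ratio, angle) would be standardized separately in this way before being combined in the PCA, so that variables measured on different scales contribute comparably.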
2.4 DNA extraction
DNA was extracted from approximately 30 mg dried leaf samples using the E.Z.N.A. SP plant DNA kit (Omega bio-tek, Norcross, USA) and the protocol issued by the manufacturers with minor modifications. The dried samples were crushed by adding two 3 mm tungsten carbide beads (Qiagen, Venlo, Netherlands) to a 2 ml Eppendorf tube containing the leaf sample, disrupting them for 2 min at 20 Hz in a TissueLyser II, Retsch MMo1 (Retsch, Castleford, UK). The remaining protocol was followed without any modifications until the elution steps at the end, where most samples were eluted in 100 µl elution buffer run once through the binding column or in two steps with 50 µl elution buffer at a time. Individuals with considerably less starting material than 30 mg were first eluted in 50 µl elution buffer, and then once again using the first eluate to increase the final concentration of the extracted DNA. DNA LoBind tubes (Eppendorf, Hamburg, Germany) were used for prolonged storage.
Quantification and quality check of the extracted DNA were performed using both a NanoDrop ND-1000 V3.10 Spectrophotometer (Thermo Scientific, USA) and the Qubit dsDNA BR assay kit (Life Technologies, Carlsbad, California, USA) with a Qubit fluorometer (Life Technologies).
2.5 RAD-sequencing
RAD-seq uses a restriction enzyme to cut DNA from each individual, producing sticky-ended fragments. The sticky-end fragments are ligated to adapters that contain a matching sticky-end and a barcode. Barcodes are used in subsequent analyses to recognize individuals (Davey and Blaxter 2010, Davey et al. 2011).
A single-digest RAD-seq library was prepared using double barcoding and size selection with magnetic beads according to a protocol adapted from Baird et al. (2008) and modified by Ovidiu Paun and Clemens Pachschwöll (University of Vienna). Further modification was done based on a protocol developed by Robin Cristofari (University of Oslo): the sub-libraries were kept separate throughout the procedure (i.e. pooling of sub-libraries was not done until after qPCR). The library comprised 82 samples (79 individuals and three replicate samples) separated into seven sub-libraries: six sub-libraries each with 12 samples and one sub-library with ten samples. Each sample was marked by a unique double barcode combination (Appendix, Table A8), where seven P2 adapters indicated sub-libraries and 12 P1 adapters indicated individuals within sub-libraries. The barcodes differed from each other by at least three nucleotides. Isolated genomic DNA was cleaned with NucleoSpin gDNA Clean-up (Macherey-Nagel, Düren, Germany).
Quantification and quality check of the cleaned DNA were done using both a NanoDrop ND-1000 V3.10 Spectrophotometer and a Qubit Fluorometer. Based on the quantification values, each sample was diluted to ensure that the same amount of DNA (250 ng) was included from each individual (i.e. normalization): x µl DNA and 44-x µl Milli-Q water. Genomic DNA was digested with the restriction enzyme PstI-HF (CTGCA/G) (NEB, New England Biolabs, UK) at 37 °C for about 2 h. Since PstI-HF cannot be heat inactivated, the samples were cleaned with SPRI beads (solid phase reversible immobilization; Beckman Coulter, Indiana, USA) with no size selection (1.8x) to remove the enzyme after restriction digestion. The samples were quantified using Qubit and normalized to a volume of 30 µl. Ligation of P1 adapters was done by adding 1.25 µl 100 nM P1 adapter, 1 µl 100 mM rATP (Promega, Fitchburg, USA), 1 µl NEB Buffer 2, 3.25 µl Milli-Q water, 3 µl 10x SmartCutBuffer and 0.5 µl 200 000 U T4 ligase (NEB) to each sample before incubating at 16 °C overnight in a PCR machine without heated lid. The samples were heat treated for 10 minutes at 65 °C to inactivate the enzyme, pooled into seven sub-libraries and randomly sonicated (stochastic shearing) using nine cycles (2 °C, 30 sec on and 30 sec off) on a Bioruptor Plus (Diagenode, Denville, USA) to obtain an optimal size of 300-600 bp. The sheared samples were cleaned using a MinElute Reaction Cleanup Kit (Qiagen), eluting in 15 µl elution buffer, and SPRI size selection was subsequently performed on both the left (0.7x) and right (0.55x) side. To polish the ends of the fragments, the Quick Blunting Kit (NEB) was used: 2.5 µl Buffer, 2.5 µl 100 mM dNTP and 1.0 µl enzyme were added to 19 µl DNA per sub-library and incubated for 30 min at room temperature.
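The dilution arithmetic behind the normalization (x µl DNA and 44-x µl water for 250 ng DNA) can be sketched as follows. The function name and the example concentration are hypothetical; the sketch only assumes that x is the target mass divided by the measured concentration.

```python
def dilution_volumes(conc_ng_per_ul, target_ng=250.0, total_ul=44.0):
    """Return (DNA volume, water volume) in µl for a fixed DNA mass."""
    dna_ul = target_ng / conc_ng_per_ul
    if dna_ul > total_ul:
        # Too dilute: the target mass does not fit in the reaction volume
        raise ValueError("sample concentration too low for target amount")
    return dna_ul, total_ul - dna_ul


# Hypothetical sample measured at 25 ng/µl
dna_ul, water_ul = dilution_volumes(25.0)
```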
Next, the samples were cleaned with MinElute Reaction Cleanup Kit, before dATP/adenine (Fermentas) overhangs were attached to the 3' end of the fragments by adding 2 µl (15U) Klenow Exo (NEB), 1 µl 100 mM dATP and 2 µl NEB Buffer to 15 µl DNA per sub-library before incubation at 37 °C for 30 min. Once again the samples were cleaned using MinElute Reaction Cleanup Kit. P2 adapters were ligated to the DNA fragments by adding 5 µl 2 µM P2 Adapter, 1 µl 199 mM rATP, 3 µl NEB Buffer 2 and 0.5 µl T4 ligase to 20 µl DNA solution, with subsequent incubation at room temperature for 30 min. A new purification was done using MinElute Reaction Cleanup Kit (Qiagen) as well as left side size selection (0.65x) with SPRI. The sub-libraries were amplified using PCR. PCR amplification was done as seven reactions, each containing 12.5 µl Phusion Polymerase remix (NEB), 1 µl Solexa primer (10 µM), 7.5 µl water and 4 µl sub-library template (DNA), with the following cycling conditions: 30 sec at 98 °C, followed by 18 cycles [10 sec 98 °C, 30 sec 65 °C, 30 sec 72 °C], 5 min at 72 °C, and incubation at 4 °C. The resulting products were run on a 2% TBE gel for 30 min to verify successful amplification. The sub-libraries were cleaned with MinElute Reaction Cleanup Kit (Qiagen) and left side SPRI size selection (0.65x). The sub-libraries were quantified using Qubit, and run on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA) with a High Sensitivity DNA Assay Kit (Agilent) to verify that the overall size range and quantity of the sub-libraries were optimal. The concentration of each sub-library was measured with a qPCR assay (KAPA Library Quantification Kits cat no KK4824, KAPA Biosystems, Massachusetts, USA), using a qPCR cycler (Lightcycler 96, Roche, Basel, Switzerland), ensuring that equal amounts of each sub-library were included in the final RAD-seq library before it was sequenced using paired-end sequencing (125 bp) in one Illumina HiSeq2500 lane at the Norwegian Sequencing Centre (NSC).
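The barcode design constraint mentioned in the library preparation (at least three nucleotide differences between any two barcodes) can be checked with a pairwise Hamming distance. The sketch below is illustrative only: the barcode sequences are made up and are not those listed in Appendix Table A8.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance among all pairs of barcodes."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical barcode set, for illustration only
barcodes = ["ACGTAC", "TGCAAC", "ACCATG", "GTGTCA"]
meets_constraint = min_pairwise_distance(barcodes) >= 3
```

A minimum distance of three means that a single sequencing error in a barcode can still be attributed unambiguously to the correct sample.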
Processing the raw RAD-seq reads
A total of 348,745,226 paired-end Illumina sequence reads (120 bp in length) were returned from the sequencing lab and processed with STACKS version 1.29, using high-throughput computation resources at the University of Oslo. STACKS is a pipeline program used to build loci and identify SNPs (Catchen et al. 2011, Catchen et al. 2013). In STACKS, the program PROCESS_RADTAGS was used to sort individuals according to barcodes (demultiplexing; Appendix Table A8) and to remove low-quality reads, resulting in 196,579,422 retained forward reads. In addition to the individuals included in the current RAD-seq library, ten diploid individuals (C. aestuaria and C. pyrenaica from population AST, PYR1, PYR2, PYR3, PYR4, see Appendix Table A4) were included from a RAD-seq library produced in a previous study (Brandrud 2014). Both forward and remainder forward reads were merged and used in the analysis, trimmed to 100 bp to match the additional sequences. Next, DENOVO_MAP.PL was used to execute three STACKS component programs: unique stacks (USTACKS), catalog stacks (CSTACKS) and search stacks (SSTACKS). The USTACKS program aligns sequence reads into matching stacks and uses these to build loci and call SNPs using maximum likelihood. CSTACKS builds a catalog by creating sets of consensus loci and merges alleles accordingly. SSTACKS searches among the stacks created by USTACKS, matching each sample against the catalog. Several runs with different settings were performed to see which combination maximized the number of reliable loci with the chosen filters. The resulting files were loaded into a MySQL database with LOAD_RADTAGS.PL and compared, allowing only loci with one to nine SNPs and counting only SNPs that appeared in at least 80 % of the individuals. The settings used in the end were m (minimum number of identical, raw reads required to create a stack) = 2, M (number of mismatches allowed between loci when processing a single individual) = 2 and n (number of mismatches allowed between loci when building the catalog) = 1.
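The criterion used to compare the parameter runs (loci with one to nine SNPs, counting only SNPs present in at least 80 % of the individuals) can be read in more than one way. The sketch below shows one possible reading and is not the database query actually used; the data structure and function name are hypothetical.

```python
def keep_locus(snp_calls, n_individuals,
               min_snps=1, max_snps=9, min_presence=0.8):
    """Decide whether a locus counts as reliable.

    snp_calls maps a SNP position to the set of individuals in which
    that SNP was scored (a hypothetical representation).
    """
    # Count only SNPs scored in a sufficient share of individuals
    counted = [pos for pos, inds in snp_calls.items()
               if len(inds) / n_individuals >= min_presence]
    return min_snps <= len(counted) <= max_snps


# Hypothetical locus with two SNPs among 10 individuals
locus = {17: set(range(9)), 42: set(range(5))}
reliable = keep_locus(locus, n_individuals=10)
```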
A high number of unique loci in the processed RAD-seq library raised suspicion about contamination. No Cochlearia genome has yet been sequenced, but as the genus is a relative of the model plant species Arabidopsis thaliana, the sequence reads were aligned to the A. thaliana reference genome using the National Centre for Biotechnology Information (NCBI). Alignment rates were somewhat low: < 30 % of the tested reads aligned more than once to the reference genome, approximately 10 % aligned once and the rest did not align at all. BLAST searches revealed that around 77 % of the reads were indeed from bacteria and fungi, some of them known endophytes of Arabidopsis. The probability of a read stemming from bacteria or fungi decreased steeply with the number of samples in which it was scored; almost all reads scored in at least 10 samples had Brassicaceae-like BLAST hits. To remove bacterial reads from the final files used in the data analysis, strict filter settings were implemented in the STACKS program POPULATIONS: only loci that were present in a minimum of 75-80 % of the individuals in a population were retained. Furthermore, only loci that were present in 75-85 % of the populations were kept. Additionally, filtered reads were searched with BLAST to check for remaining bacterial DNA, and no contamination was observed. Individuals were linked to their respective populations using POPULATIONS, and the following output file formats were chosen: STRUCTURE, VCF and PHYLIP. For more information about the settings used to produce the different output files, see Appendix Table A9.
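The logic of the two-level presence filter (within populations, then across populations) can be sketched as below. This is an illustration of the idea, not the STACKS implementation; the thresholds are written as proportions (0.75), and the input structure and names are hypothetical.

```python
def locus_passes(presence, min_ind_fraction=0.75, min_pop_fraction=0.75):
    """Two-level presence filter for a single locus.

    presence maps population name -> (individuals carrying the locus,
    total individuals sampled); a hypothetical representation.
    """
    # Populations in which enough individuals carry the locus
    pops_ok = sum(1 for have, total in presence.values()
                  if have / total >= min_ind_fraction)
    # Keep the locus only if enough populations qualify
    return pops_ok / len(presence) >= min_pop_fraction


# Hypothetical locus scored in four populations of four individuals
example = {"EIR": (3, 4), "GIL": (4, 4), "DYR": (3, 4), "ING": (1, 4)}
kept = locus_passes(example)
```

Because contaminant reads tend to occur in few samples, a filter of this kind removes most of them while retaining loci shared across the plant genomes.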
Because of the focus of this study (diploid Cochlearia), and the potential complications arising when mixing ploidy levels in the analyses, six tetraploid individuals (originally included in the RAD-seq library) were excluded from further analyses. Additionally, ten diploid individuals from a previously produced RAD-seq library (Brandrud 2014) were included, resulting in a total of 83 individuals in addition to the three replicates. Replicates were excluded from the final results.
Initial data analyses showed that two samples (in populations HFN and LAT) had most probably been switched in the lab during DNA extraction, as the sample tubes had identical numbers. In the analyses, one individual in the HFN population always grouped with the LAT population and vice versa. Based on these results, a decision was made to switch the samples back to their suspected true populations.
Population structure analysis
Population structure was investigated with the program STRUCTURE (Pritchard et al. 2000). STRUCTURE is based on the Hardy-Weinberg (HW) assumption, and uses Bayesian clustering to find the optimal number of groups (= K) that the dataset can be divided into, and then assigns individuals to these groups. For the analysis, the STRUCTURE output file was used (containing 1500 SNPs for 83 individuals). Since we are dealing with closely related species and populations, the admixture model was used, assuming that individuals may originate from more than one group. Correlated frequencies were chosen, allowing allele frequencies in the different populations to be quite similar (Falush et al. 2007). The dataset was run for each K from K = 1 to K = 10 using 1 000 000 iterations and a burn-in of 100 000, through the Lifeportal at the University of Oslo. Results were summarized in STRUCTURE HARVESTER (Earl and von Holdt 2012) and CLUMPAK (Kopelman et al. 2015). After evaluation of the likelihood graphs and the graphs of delta K produced by the Evanno method (Evanno et al. 2005), the data were visualized in DISTRUCT (Rosenberg 2004).
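The delta K statistic of the Evanno method relates the second-order rate of change of the log likelihood, |L(K+1) - 2L(K) + L(K-1)|, to the standard deviation of L(K) across replicate runs. The sketch below is a simplified illustration and not the STRUCTURE HARVESTER code. It assumes several replicate runs per K and takes the second-order difference on the replicate means, which is one possible way to compute it. All likelihood values are made up.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Delta K per interior K from replicate log likelihoods.

    lnp maps K -> list of ln P(D) values from replicate runs
    (hypothetical input; at least two replicates per K are needed).
    """
    means = {k: mean(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            # Absolute second-order difference of the mean likelihoods
            second = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second / stdev(lnp[k])
    return result


# Made-up likelihoods for K = 1..5, two replicates each
lnp = {1: [-1000, -1002], 2: [-900, -902], 3: [-700, -701],
       4: [-690, -694], 5: [-688, -690]}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
```

Delta K is undefined for the smallest and largest K tested, which is why K = 1 can never be selected by this criterion.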
Tree and network analyses
SPLITSTREE v4.13.1 (Huson and Bryant 2006) can be used to infer various phylogenetic trees and networks based on, for instance, SNPs, sequences or a distance matrix. A neighbor-joining network based on the PHYLIP file (containing 1500 SNPs for 83 individuals) was constructed using uncorrected_P as the distance measure for the SNP data, with each end node representing an individual.
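The uncorrected p-distance is the proportion of compared sites at which two sequences differ. The sketch below illustrates the idea only. How missing or ambiguous characters are handled here (skipped pairwise) is an assumption and may differ from the SPLITSTREE implementation; the sequences are made up.

```python
def p_distance(seq_a, seq_b, missing="N?-"):
    """Proportion of differing sites among sites scored in both."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # skip sites missing in either sequence (assumption)
        compared += 1
        diffs += a != b
    if compared == 0:
        raise ValueError("no sites scored in both sequences")
    return diffs / compared


# Hypothetical concatenated SNPs for two individuals
d = p_distance("ACGTAC", "ACGTTC")
```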
A Bayesian tree was created using a PHYLIP file (containing 2482 SNPs for 83 individuals represented as 24 populations), with each end node representing a population. This PHYLIP file was converted to a NEXUS file using ALIVIEW v1.17.1 (Larsson 2014). The best-fit model of nucleotide substitution was selected using JMODELTEST 2.1.7 (Darriba et al. 2012), where the number of substitution schemes was set to 3, rate variation to G (gamma) without invariable sites, and the base tree for likelihood calculations to Fixed BIONJ-JC. Otherwise default settings were used. Based on the result of this test, the F81+G substitution model was used when running a Bayesian inference phylogenetic analysis using the MCMC method in MRBAYES v3.2.5 (Ronquist and Huelsenbeck 2003). MRBAYES was set to perform two parallel runs using four heated chains. Each run had the number of generations set to 1 000 000, sampling every 100th generation and giving diagnostics every 1000th generation. To test if the Markov chains had converged, the standard deviation of split frequencies was monitored to make sure it fell below 0.01 when comparing the two independent runs. The p output files were opened in TRACER (Baldwin et al. 1995) to check the runs performed in MRBAYES, confirming that the chains had converged and deciding the burn-in. Next, the t output file was opened in TREEANNOTATOR v1.8.2 (Rambaut and Drummond 2011, available at: http://beast.bio.ed.ac.uk/treeannotator). TREEANNOTATOR was used to create a majority rule consensus tree using a burn-in percentage of 10. The posterior probability limit was set to 0.9, and PP values below this were not shown. The resulting tree was visualized in FIGTREE v1.4.2 (Rambaut 2014, available at: http://tree.bio.ed.ac.uk/software/figtree/) and edited in ADOBE ILLUSTRATOR CS4.
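The relationship between burn-in, sampled trees and posterior probability values can be illustrated with a small sketch. It is not how MRBAYES or TREEANNOTATOR are implemented. Trees are represented here simply as sets of clades, which is a hypothetical simplification, and the example sample is made up.

```python
from collections import Counter


def clade_support(sampled_trees, burnin_fraction=0.10):
    """Posterior support per clade after discarding the burn-in.

    sampled_trees is a list of trees in sampling order, each given as
    a set of clades (frozensets of tip names).
    """
    n_burn = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[n_burn:]
    counts = Counter(clade for tree in kept for clade in tree)
    # Support = share of retained trees that contain the clade
    return {clade: n / len(kept) for clade, n in counts.items()}


ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
# Made-up sample: 10 trees, the first is discarded as burn-in
trees = [{cd}] + [{ab, cd}] * 4 + [{ab}] * 5
support = clade_support(trees)
shown = {c: p for c, p in support.items() if p >= 0.9}
```

In this toy sample the clade {A, B} occurs in all nine retained trees and would be labelled, whereas {C, D} occurs in four of nine and would fall below both the majority-rule and the 0.9 display thresholds.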
PAUP* v4.0b10 (Swofford 2001) was used to perform a phylogenetic analysis, producing a most parsimonious tree based on the same NEXUS file as used in MRBAYES. The following settings were used: Heuristic search, 1000 replications saving 20 trees per replicate, tree bisection reconnection (TBR) as branch swapping and 1000 replications for the bootstrap analysis. The majority rule bootstrap consensus tree was visualized in FIGTREE, and bootstrap values from this tree were added to the Bayesian tree using ILLUSTRATOR CS4.
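The resampling step behind a bootstrap analysis of this kind draws alignment columns (here SNP sites) with replacement to build pseudo-replicate data sets, each of which is then analysed in the same way as the original data. The sketch below shows the resampling only, not the parsimony search, and is not PAUP* code. The alignment and names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment maps taxon name -> sequence; all sequences must have
    the same length. rng is a random.Random instance.
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, with replacement
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


# Made-up alignment of four SNPs for three populations
aln = {"pop1": "ACGT", "pop2": "ACGA", "pop3": "TCGA"}
replicate = bootstrap_alignment(aln, random.Random(1))
```

The bootstrap value of a branch is the percentage of such replicates in which the corresponding group is recovered.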
PCA analyses
PCA analyses were performed based on RAD-seq data using the VCF output file (containing 3246 SNPs for 83 individuals). PCA was performed in R using the ADEGENET package version 1.4.2 (Jombart 2008, Jombart and Ahmed 2011) with default parameters, without removing any outliers, and with missing data set to max 25 %. The data were plotted using the GGPLOT2 package (Wickham 2009).
Maps
Maps were made based on GPS coordinates sampled in the field (Appendix Table A3) using QGis-OSGeo4W-2.4.0-1 (QGIS Development Team, 2009. QGIS Geographic Information System. Open Source Geospatial Foundation. http://qgis.osgeo.org), and modified in ADOBE ILLUSTRATOR CS4.
3 Results
3.1 Chromosome counting
Chromosome counts were done on inflorescences sampled from 12 Icelandic populations.
Four populations (EIR, GIL, HAF and STK) had a chromosome number of 2n = 14 and eight populations (DJU, STR, DYR, HVA, ING, BAE, OLF and SUR) had a chromosome number of 2n = 12. Photos of nuclei were taken for nine of the populations (Fig. 3). Both alpine populations, EIR and GIL, had 14 chromosomes. Among the coastal populations, two (STK and HAF) had 14 chromosomes, and the remaining eight had 12 chromosomes.
Figure 3: Map of Iceland with 15 sampled Cochlearia populations indicated (for further locality information, see Appendix Table A3). Inserted map in top left corner shows the distribution of Cochlearia in Iceland (map from Kristinsson 2010). Different symbols (squares or triangles) indicate chromosome number of the 12 populations for which chromosome numbers were counted. For nine of these populations pictures displaying nuclei with visible chromosomes are shown. Populations with unknown chromosome number are indicated by a circle. Populations are further colored according to geographical areas, which are referred to throughout the results and discussion.
3.2 Morphometry
There were significant differences in leaf traits among the 10 measured populations (Kruskal-Wallis test, leaf surface area: p = 0.0031; leaf ratio: p = 0.0049; leaf base angle: p = 0.0003, Fig. 4 a, b and c). Dunn's post hoc test (multiple comparisons) indicated that there was a significant difference in leaf surface area when comparing the alpine population EIR and the two coastal populations DYR (p = 0.03) and ING (p = 0.05). There was also a significant difference in leaf surface area when comparing the other alpine population GIL with DYR (p = 0.007) and ING (p = 0.01). There was further a significant difference in leaf ratio when comparing EIR with DYR (p = 0.02) and OLF (p = 0.05) and when comparing GIL with DYR (p = 0.01) and OLF (p = 0.03). Lastly, there was a significant difference in leaf base angle when comparing EIR with DYR (p = 0.01), OLF (p = 0.03) and ING (p = 0.02) and when comparing GIL with DYR (p = 0.004), OLF (p = 0.01) and ING (p = 0.04). Thus, the leaves of alpine populations had overall lower values of leaf surface area, ratio and base angle than those of coastal populations, and significantly so in the cases mentioned above. For selected examples of leaf morphology, see Appendix Fig. A1. The Mann-Whitney U test showed a significant difference between the alpine and coastal populations in leaf surface area (p = 0.0001, Fig. 5 a), leaf ratio (p = 0.0001, Fig. 5 e) and leaf base angle (p < 0.0001, Fig. 5 i). Furthermore, there was a significant difference between the coastal and alpine populations when including only populations with chromosome number 2n = 14: leaf surface area (p = 0.01, Fig. 5 b), leaf ratio (p = 0.001, Fig. 5 f) and leaf base angle (p = 0.001, Fig. 5 j). The Mann-Whitney U test also showed a significant difference between the 2n = 14 and 2n = 12 plants in leaf surface area (p = 0.0002, Fig. 5 c), leaf ratio (p = 0.0002, Fig. 5 g) and leaf base angle (p = 0.003, Fig. 5 k). However, there was no significant difference between coastal populations with 2n = 14 and 2n = 12 when comparing leaf surface area (Fig. 5 d) or leaf base angle (Fig. 5 l), but there was a significant difference in leaf ratio (p = 0.03; Fig. 5 h).
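The two-group comparisons above rest on the Mann-Whitney U statistic, which is computed from the ranks of the pooled observations. The sketch below shows how U is obtained, with tied values given average ranks. It does not compute p-values and is not the software used in the study; the example values are made up.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    # Pool observations, remembering group membership (0 = a, 1 = b)
    pooled = sorted((v, g) for g, values in enumerate((group_a, group_b))
                    for v in values)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for a tie run
        rank_sum_a += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Made-up leaf areas (cm2) for two groups
u_alpine, u_coastal = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A U value of 0 for one group, as in this toy example, means that every observation in that group is smaller than every observation in the other.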
Figure 4: Boxplots illustrating differences in leaf morphological traits among 10 Icelandic Cochlearia populations (names abbreviated according to Appendix Table A3): a) Leaf surface area, b) Leaf ratio (width/length) and c) Leaf base angle. The average of 5 leaves from an individual was calculated. Populations with 2n = 14 are colored grey and populations with 2n = 12 are white. N = 4 (except EIR with N = 3). For further explanation of the boxplot, see Appendix Fig. A2.
[Figure 4, panels a-c: boxplots per population (EIR, GIL, STK, HAF, DYR, ING, HVA, DJU, BAE, OLF). a) Leaf surface area (cm2); b) Leaf ratio (W/L, in %); c) Leaf base angle.]
Figure 5: Boxplots illustrating differences in leaf traits between 10 Icelandic Cochlearia populations (names abbreviated according to Appendix Table A3) grouped according to habitat (coastal in blue and alpine in beige) or chromosome number (2n = 12 in white and 2n = 14 in grey) or a combination of habitat and chromosome number: a-d) Leaf surface area, e-h) Leaf ratio (width/length) and i-l) Leaf base angle. Asterisks indicate significant differences between the two groups. For explanation of the boxplot, see Appendix Fig. A2.
[Figure 5, panels a-l: boxplots. For each trait, the four panels compare, in order: Coastal vs. Alpine; Coastal 2n = 14 vs. Alpine 2n = 14; 2n = 12 vs. 2n = 14; Coastal 2n = 12 vs. Coastal 2n = 14. Axes: leaf surface area (cm2), leaf ratio (W/L, in %), leaf base angle.]
The PCA analysis (Fig. 6) showed a separation of populations according to habitat along the first principal axis (PC1; explaining 75.4 % of the total variation) and the second principal axis (PC2; explaining 15.4 % of the total variation). As indicated in the PCA plot by the direction of the arrows from the origin, above-mean and increasing values of leaf surface area, leaf ratio and leaf base angle were correlated with PC1, with points close to the extremity of an arrow indicating higher values. The alpine populations (EIR and GIL) constituted a dense group at the left side of the plot, away from the direction of the arrows, indicating low values of leaf surface area, leaf ratio and leaf base angle. The coastal populations were more scattered, showing overall higher variation along both PC1 and PC2. Some individuals from populations OLF, ING and DYR had the highest values along PC1, indicating high values of the three morphological traits.
Figure 6: Biplot resulting from the PCA analysis of leaf morphological traits (leaf surface area, leaf ratio and leaf base angle) from 10 Icelandic Cochlearia populations (names abbreviated according to Appendix Table A3).
Each point represents the mean score from five leaves from one individual plant. The direction of the arrows from the origin indicates above-mean and increasing values for the morphological traits. Points close to the extremity of an arrow indicate higher values. Populations with 2n = 14 are illustrated as squares, and populations with 2n = 12 as triangles. N = 4 (except for EIR, N = 3).
3.3 RAD-sequencing
Sequencing of the paired-end RAD-seq library from 82 Cochlearia samples (individuals and replicates) yielded a total of 348,745,226 sequence reads. After barcode sorting and filtering using PROCESS_RADTAGS in STACKS, 196,579,422 forward reads were retained.
Population structure analysis
When using STRUCTURE HARVESTER to select the number of genetic clusters from the STRUCTURE analysis, K = 3 had the highest and K = 5 the second highest delta K value (Appendix Fig. A3). K = 5 had, however, a higher value than K = 3 in the likelihood of K graph (Appendix, Fig. A4). These results were checked and confirmed in CLUMPAK (not shown). Both K = 3 and K = 5 were visualized using DISTRUCT.
When three genetic clusters were selected (K = 3), the Icelandic populations were assigned to two of these clusters, the orange and the blue (Fig. 7). Coastal populations from southern Iceland (DYR, HJO, ING and SUR) were assigned to the orange cluster together with two of the most southerly located populations along the eastern coast (DJU and HVA, Fig. 1). The alpine 2n = 14 populations (EIR and GIL) were assigned to the blue cluster together with populations from Svalbard (HOP, TJU, LOM and FLA), except for one individual from GIL that was admixed between the blue and the orange cluster. The remaining populations (all coastal) showed varying degrees of admixture between these two clusters. The coastal 2n = 14 populations (STK and HAF) showed a higher percentage assignment to the blue cluster than did the 2n = 12 populations (OLF, BAE, STR) and the two populations of unknown chromosome number (LAT and HFN), which in turn showed a higher percentage assignment to the orange cluster. Southwestern European populations were assigned to a cluster of their own (yellow). Two of the C. pyrenaica populations (PYR2 and PYR4; consisting of one individual each) showed, however, some minor admixture with the orange and blue clusters.
When five genetic clusters were selected (K = 5), the Icelandic populations showed further geographical clustering (Fig. 8). The large non-admixed orange cluster from the previous plot (K = 3) was here split into one cluster constituting the southern populations (blue) and one cluster constituting the eastern populations (pink). The two northeastern populations (HFN and STR) were now admixed between three clusters, assigning mainly to the eastern (pink) cluster, but additionally to the alpine/Svalbard (brown) cluster and to the southern (blue) cluster. The coastal populations from western and northern Iceland were assigned mainly to the green cluster, which contained populations with different chromosome numbers (STK and HAF with 2n = 14; OLF and BAE with 2n = 12, and LAT with unknown chromosome number). The three latter populations displayed admixture with the southern (blue) cluster. The admixed individual in the alpine GIL population also shared genes with the southern populations. The southwestern European populations still constituted a cluster of their own (purple).
Tree and network analysis
In the neighbor-joining network constructed in SPLITSTREE, all individuals (except the outlier individual in the alpine GIL population) were grouped according to populations (Fig. 9), displaying the same geographical structure as was found in the STRUCTURE and PCA analyses.
Figure 7: STRUCTURE analysis of 83 Cochlearia individuals, based on 1500 SNPs obtained from RAD-seq data. Each vertical bar represents an individual. Populations are separated by vertical black lines, and names abbreviated according to Appendix Table A3 and A4. The number of colors represents the number of genetic clusters (K = 3).
Figure 8: STRUCTURE analysis of 83 Cochlearia individuals, based on 1500 SNPs obtained from RAD-seq data. Each vertical bar represents an individual. Populations are separated by vertical black lines, and names abbreviated according to Appendix Table A3 and A4. The number of colors represents the number of genetic clusters (K = 5). Populations are ordered according to geographical area, as depicted in the white boxes below.
There was a clear split separating southwestern European populations from the Svalbard and Icelandic populations. The Svalbard and Icelandic alpine populations were also separated from the Icelandic coastal populations by a clear split. A minor diagonal split further divided the eastern Icelandic populations from the western, southern and northern populations.
Figure 9: Neighbor-joining network constructed in SPLITSTREE for 83 Cochlearia individuals based on 1500 SNPs obtained from RAD-seq data and using the uncorrected_P distance measure. Different symbols (squares or triangles) indicate chromosome number. Individuals with unknown chromosome number are indicated by a circle. Populations are further colored according to geographical areas, and names abbreviated according to Appendix Table A3 and A4.
Bayesian phylogeny
The topology of the Bayesian majority-rule consensus tree was consistent with the single most-parsimonious tree (length = 1423) that was found (Fig. 10, the most-parsimonious tree not shown). When rooted with the southwestern European population AST, the Icelandic alpine and Svalbard populations constituted a well-supported monophyletic group where the alpine populations (brown) were sister group to the Svalbard populations (orange), which again constituted a well-supported sister group to the coastal Icelandic populations. Within the Icelandic coastal populations, the eastern populations (pink) were sister group to a clade consisting of southern (light blue), northern (grey) and western (green) populations. Bootstrap support for the S-N-W clade as monophyletic was, however, low (61), and further relationships within this clade generally had low support.
Figure 10: Bayesian majority-rule consensus tree for 24 Cochlearia populations (names abbreviated according to Appendix Table A3 and A4) based on 2482 SNPs obtained from RAD-seq data. Posterior probability values are shown in black above branches. Bootstrap values from the parsimony bootstrap analysis are shown in grey below branches. Color coding of populations is as follows: Spanish = dark blue, French = purple, Svalbard = orange; Icelandic regions: alpine = brown, eastern = pink, southern = light blue, western = green, northern = grey.