Assessing the probability of detection of horizontal gene transfer events in bacterial populations
Jeffrey P. Townsend1, Thomas Bøhn2,3 and Kaare Magne Nielsen2,3*
1Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
2GenØk-Centre for Biosafety, The Science Park, Tromsø, Norway
3Department of Pharmacy, Faculty of Health Sciences, University of Tromsø, Tromsø, Norway
Edited by:
Rustam I. Aminov, University of Aberdeen, UK
Reviewed by:
Morten Otto Alexander Sommer, Technical University of Denmark, Denmark
Lilia Macovei, The Forsyth Institute, USA
Douda Bensasson, University of Manchester, UK
*Correspondence:
Kaare Magne Nielsen, Department of Pharmacy, Research Group in Microbiology, Molecular and Pharmaco-Epidemiology, Breivika, 9037 Tromso, Tromso, Norway e-mail: [email protected]
Experimental approaches to identify horizontal gene transfer (HGT) events of non-mobile DNA in bacteria have typically relied on detection of the initial transformants or their immediate offspring. However, rare HGT events occurring in large and structured populations are unlikely to be detected in a short time frame. Population genetic modeling of the growth dynamics of bacterial genotypes is therefore necessary to account for natural selection and genetic drift during the time lag and to predict realistic time frames for detection with a given sampling design. Here we draw on statistical approaches to population genetic theory to construct a cohesive probabilistic framework for investigation of HGT of exogenous DNA into bacteria. In particular, the stochastic timing of rare HGT events is accounted for. Integrating over all possible event timings, we provide an equation for the probability of detection, given that HGT actually occurred. Furthermore, we identify the key variables determining the probability of detecting HGT events in four different case scenarios that are representative of bacterial populations in various environments. Our theoretical analysis provides insight into the temporal aspects of dissemination of genetic material, such as antibiotic resistance genes or transgenes present in genetically modified organisms. Due to the long time scales involved and the exponential growth of bacteria with differing fitness, quantitative analyses incorporating bacterial generation time and levels of selection, such as the one presented here, will be a necessary component of any future experimental design and analysis of HGT as it occurs in natural settings.
Keywords: lateral or horizontal gene transfer, DNA uptake, modeling, monitoring, sampling, antibiotic resistance, GMO, biosafety
INTRODUCTION
Bacteria in natural populations are known to import and integrate exogenous genetic material of diverse, often unidentified, origins (Eisen, 2000; Ochman et al., 2000; Lawrence, 2002; Nakamura et al., 2004; Didelot and Maiden, 2010). Bacterial genomes can be exposed not only to the multitude of sources of exogenous DNA present in their natural environments (Levy-Booth et al., 2007; Nielsen et al., 2007; Pontiroli et al., 2007; Pietramellara et al., 2009; Rizzi et al., 2012), but also to introduced sources of novel DNA such as the fraction of recombinant DNA present in genetically modified organisms (GMOs). Such exposure can potentially lead to horizontal gene transfer (HGT) events of GMO recombinant DNA, dependent on the multitude of parameters that govern HGT processes in various environments (Bertolla and Simonet, 1999; Bensasson et al., 2004). However, for long-term persistence of infrequently acquired genetic material in new bacterial hosts, a conferred selective advantage is considered necessary (Feil and Spratt, 2001; Berg and Kurland, 2002; Johnsen et al., 2009; Kuo and Ochman, 2010). Experimental investigations have shown that most HGT events that integrate into the bacterial chromosome are deleterious (Elena et al., 1998; Remold and Lenski, 2004). Thus, in terms of the persistence of its signature and its effects on fitness, HGT processes resemble routine mutational processes that take place at similarly low frequencies in bacteria and that are eventually lost from the population (Kimura and Ohta, 1969; Jorgensen and Kurland, 1987; Lawrence et al., 2001; Mira et al., 2001; Johnsen et al., 2011). However, rare HGT events and mutations can be positively selected under particular conditions and are the sources of bacterial adaptation and evolution (Imhof and Schlötterer, 2001; Townsend et al., 2003; Orr, 2005; Barret et al., 2006). HGT is particularly well known for playing a central role in the evolution of resistance to antibacterial agents (Bergstrom et al., 2000; Heinemann and Traavik, 2004; Aminov and Mackie, 2007; Aminov, 2010, 2011).
The detection of HGT events in a given bacterial genome can be performed retrospectively through bioinformatics-based comparative analyses (Ochman et al., 2000; Spratt et al., 2001; Nakamura et al., 2004; Didelot and Maiden, 2010). Alternatively, events may be detected via focused experimental efforts on defined bacterial populations under controlled conditions in the laboratory or monitoring efforts on subsamples taken from bacterial populations present in various environments, e.g., from soil, water, wounds, or gastrointestinal tracts (GITs; Nielsen and Townsend, 2004; Thomas and Nielsen, 2005; Pontiroli et al., 2009; Aminov, 2011). The latter approach can enable the identification of HGT events as they occur in the context of complex interactions of diverse bacterial communities. Its main limitation is sensitivity due to restricted sampling capacity of large bacterial populations, other methodological limitations, and cost of analysis. Representative analysis of HGT events in bacterial communities also depends on knowledge of the structure and population dynamics of the population and the sequence of the DNA transferred. Detection strategies frequently rely on hidden or implicit assumptions regarding the distribution and proportion of the individual cells in the sampled larger bacterial population that would carry the transferred DNA sequences (Keese, 2008; Heinemann et al., 2011).
Large-scale cultivation of genetically modified plants (GM-plants) results in multitudinous opportunities for bacterial exposure to recombinant DNA and therefore opportunities for unintended horizontal dissemination of transgenes (EFSA, 2004, 2009; Nielsen et al., 2005; Levy-Booth et al., 2007; Wögerbauer, 2007; Pietramellara et al., 2009; Brigulla and Wackernagel, 2010). In laboratory settings, experimental studies of single bacterial species have demonstrated that bacteria can take up DNA fragments from plants and integrate them into bacterial genomes under highly optimized conditions (e.g., Gebhard and Smalla, 1998; De Vries et al., 2001; Kay et al., 2002; Ceccherini et al., 2003). In contrast, in natural settings, sampling-based studies of agricultural soils, run-off water, and GIT contents have found spread of transgenes from GM-plants, but negative or inconclusive evidence for HGT (Gebhard and Smalla, 1999; Netherwood et al., 2004; Mohr and Tebbe, 2007; Demanèche et al., 2008; Douville et al., 2009).
Most research on HGT from GM-plants to bacteria has been performed via an assay after a limited time period following transgene exposure, perhaps in part because only limited explicit considerations of the population dynamics of HGT events have been presented to guide sampling design and data analysis (Heinemann and Traavik, 2004; Nielsen and Townsend, 2004; Nielsen et al., 2005). Given the low mechanistic probability of occurrence, horizontally transferred non-mobile DNA will initially be present at an exceedingly low frequency in the overall population. It may therefore take months, years, or even longer for the few initial transformants to divide and numerically out-compete non-transformed cells of the population to reach frequencies that can be efficiently detected by sampling efforts. The generation time of bacterial populations is therefore of high importance for detection efforts. Cell division time varies with species and environments and can be less than 1 h in nutrient-rich environments such as the GIT and up to several weeks in nutrient-limited environments such as soil.
The time lag between initial occurrence and potential detection will be present even though the relevant HGT events lead to positive selection of transformants (Nielsen and Townsend, 2001, 2004; Heinemann and Traavik, 2004; Pettersen et al., 2005). Quantifying this time lag and determining the relationship between HGT frequencies and probability of detection requires mathematical models with dependency on several key parameters: HGT frequencies, changes in relative fitness of the transformants, bacterial population sizes, and generation times in nature. A few studies have accordingly begun to characterize the effects of natural selection and the probability of fixation of HGT events in bacterial populations (Nielsen and Townsend, 2001, 2004; Pettersen et al., 2005).
Here we integrate previous theory into a cohesive probabilistic framework that addresses current methodological shortcomings in the detection of HGT events and guides experimental design of future sampling of bacterial populations. Our analysis yields a simple formulation for the probability of detection given that a HGT actually occurred, and facilitates computation of the statistical power of an experimental sampling design.
We apply the model to four different scenarios that are relevant for experimental monitoring of complex bacterial communities, accounting for both the adaptive dynamics of natural selection and the unknown timing of HGT events. In scenarios 1 and 2, the effects of variable DNA exposure are considered (i.e., exposed sub-population versus the total population of bacteria). Sampling occurs at the end of the DNA exposure period. In scenarios 3 and 4, the sampling is delayed until sometime after the DNA exposure of the bacterial recipients has ended. The total population size ($N$) and the strength of selection ($m$) vary among the scenarios ($N = 10^{6}$–$10^{12}$ and $m = 10^{-10}$–$1$). The $m$ parameter represents the relative cost or advantage conferred by the HGT event to the transformant bacterium compared to untransformed members of the same population. In nature, $m$ values would range from the reciprocal of the population size (weak positive selection) to near infinity (strong positive selection). The latter would for instance be caused by antibiotic treatment leading to death of all susceptible non-transformed cells. However, for most traits much lower values of $m$ are expected. The $m$ value of a given trait is not a constant and will depend on the environmental conditions. For instance, an antibiotic resistance trait can be highly advantageous in the presence of antibiotics (high positive $m$) but confer a fitness cost in the absence of antibiotics (negative $m$ value; cf. Johnsen et al., 2011).
MODELING
Immediately following an HGT event into a large bacterial population, the lineage of bacterial cells carrying the novel transferred gene is highly vulnerable to extinction due to natural stochasticity in cell survival over the first generations (Fisher, 1922, 1930; Haldane, 1927; Johnson and Gerrish, 2002; Pettersen et al., 2005). Subsequently, after the transformant population has established at higher numbers, it can be assumed to follow a fairly deterministic path, given continued directional selection. For a selected variant in transit to fixation with Malthusian relative fitness $m$ per generation over $t_g$ generations, the current frequency $p$ of a mutant starting at frequency $p_0$ can be modeled deterministically as

$$\frac{p}{1-p} = \frac{p_0}{1-p_0}\, e^{m t_g} \tag{1}$$

(Hartl and Clark, 1997; Nielsen and Townsend, 2004). With a single HGT event, the frequency of the transformant in a haploid population becomes 1/N where N is the overall num- ber of bacterial cells in the population of interest. Thus, for the frequency of a transformant (p) subsequent to a HGT event,

$$\frac{p}{1-p} = \frac{1}{N-1}\, e^{m t_g} \tag{2}$$

(Nielsen and Townsend, 2004). Solved for $p$, Eq. 2 yields

$$p = \frac{e^{m t_g}}{N-1+e^{m t_g}}. \tag{3}$$

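As a minimal sketch of Eq. 3, the following Python function evaluates the deterministic transformant frequency. The function name `transformant_frequency` is hypothetical and not part of the original Mathematica notebook; the sketch assumes $m \ge 0$, as in the scenarios considered here.

```python
import math


def transformant_frequency(m: float, t_g: float, N: float) -> float:
    """Eq. 3: frequency of a transformant lineage t_g generations after a
    single HGT event in a haploid population of N cells (assumes m >= 0)."""
    # e^{m t} / (N - 1 + e^{m t}) is algebraically 1 / (1 + (N - 1) e^{-m t});
    # the second form cannot overflow when m * t_g >= 0.
    return 1.0 / (1.0 + (N - 1.0) * math.exp(-m * t_g))


if __name__ == "__main__":
    N = 1e12
    for t_g in (0, 20, 50, 100):
        print(t_g, transformant_frequency(1.0, t_g, N))
```

At $t_g = 0$ the function returns $1/N$, the frequency of a single new transformant, and it approaches 1 as $m t_g$ grows.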
Because HGT events are relatively rare and presumably independent, we assume that the time delay until a HGT occurs is exponentially distributed, parameterized by a rate that incorporates the number of bacteria exposed, $x$, the rate of HGT per exposed bacterium, $r$, and the time, $t_x$, during which exposure may occur (see Nielsen and Townsend, 2001, for a detailed description of these factors). Accordingly, the time to the next HGT, $T_x$, would be distributed as

$$f_{T_x}(t_x) = r x\, e^{-r x t_x}. \tag{4}$$

The probability of fixation of a new variant gene in a haploid population has been characterized as

$$\frac{1-e^{-2m}}{1-e^{-2Nm}} \tag{5}$$

(Kimura, 1957, 1962; Moran, 1961; Gillespie, 1974; see Patwa and Wahl, 2008, for a review of alternate cases).
Following the exponentially distributed occurrence rate (Eq. 4), filtered by the fixation process (Eq. 5), the timing until the occurrence, $T$, of the first HGT that is to be eventually fixed in the population, would be distributed as

$$f_T(t) = \frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x\, \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right). \tag{6}$$

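The rate in Eq. 6 is the raw HGT rate $rx$ thinned by the fixation probability of Eq. 5. A minimal Python sketch of that calculation follows. The function names are hypothetical, the neutral limit $1/N$ at $m = 0$ is added here for completeness, and only $m \ge 0$ is supported.

```python
import math


def fixation_probability(m: float, N: float) -> float:
    """Eq. 5: (1 - e^{-2m}) / (1 - e^{-2Nm}), for m >= 0."""
    if m == 0.0:
        return 1.0 / N  # limit of Eq. 5 as m -> 0 (assumption: neutral case)
    # expm1(a) = e^a - 1 stays accurate for very small |a|; signs cancel.
    return math.expm1(-2.0 * m) / math.expm1(-2.0 * N * m)


def successful_hgt_rate(m: float, N: float, r: float, x: float) -> float:
    """Rate parameter of Eq. 6: fixation probability times r * x."""
    return fixation_probability(m, N) * r * x


def prob_successful_hgt_within(m, N, r, x, t_x):
    """Cumulative form of Eq. 6: P(T <= t_x) = 1 - exp(-rate * t_x)."""
    return -math.expm1(-successful_hgt_rate(m, N, r, x) * t_x)


if __name__ == "__main__":
    rate = successful_hgt_rate(m=1e-3, N=1e12, r=1e-10, x=1e8)
    print("mean wait (generations):", 1.0 / rate)
    print("P(success within 1e5):", prob_successful_hgt_within(1e-3, 1e12, 1e-10, 1e8, 1e5))
```

For the illustrative values in the usage lines, the mean waiting time is on the order of $5 \times 10^{4}$ generations, which shows how strongly the fixation filter lengthens the expected delay relative to $1/(rx)$.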
Given that an HGT occurs that is on its way to fixation, what is the probability that such a transfer will be detected? This probability depends in part on the sample size of the monitoring effort, $n$. Here, $n$ is treated as the number of bacteria in the environment sampled in a perfect assay for possession of the HGT event. If the frequency of the primary transformant and its offspring in the population at a given time is $p$, then the probability of detection is

$$1-(1-p)^{n}. \tag{7}$$

Because the frequency $p$ of a HGT event/transformant that is under strong positive selection deterministically increases with time until it is fixed in the population (Figure 1), the probability of detection depends on the amount of time $t$ since the first HGT event occurred, which depends on the time of first exposure to DNA of concern, $T$. The later the samples are taken, the greater the probability that a selected HGT event on its way to fixation will be detected.
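The dependence of detection on elapsed time can be made concrete by inverting Eq. 7 for the frequency needed to reach a target detection probability, and then inverting Eq. 2 for the number of generations needed to reach that frequency. The sketch below is illustrative only: the function names and the 0.9 default target are assumptions, the HGT event is taken to have occurred at generation zero, and $m > 0$ is required.

```python
import math


def detection_probability(p: float, n: int) -> float:
    """Eq. 7: 1 - (1 - p)^n, evaluated stably for very small p."""
    if p >= 1.0:
        return 1.0
    return -math.expm1(n * math.log1p(-p))


def frequency_for_target(target: float, n: int) -> float:
    """Invert Eq. 7: the frequency p at which detection equals `target`."""
    return -math.expm1(math.log1p(-target) / n)


def generations_until_detectable(m, N, n, target=0.9):
    """Invert Eq. 2 at the target frequency: t_g = ln[(N - 1) p / (1 - p)] / m.
    A negative result means the lineage is already detectable at t_g = 0."""
    p_star = frequency_for_target(target, n)
    return math.log(p_star * (N - 1.0) / (1.0 - p_star)) / m


if __name__ == "__main__":
    for m in (1.0, 1e-2, 1e-4):
        print(m, generations_until_detectable(m, N=1e12, n=10_000))
```

With $N = 10^{12}$ and $n = 10{,}000$, the required time is roughly $19/m$ generations, so under these assumptions a transformant with $m = 10^{-4}$ needs on the order of $2 \times 10^{5}$ generations to reach a 90% detection probability.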
The probability of detection for a HGT event on its way to fixation with selection coefficient $m$ at time $t_g$ after the original transfer event is derived by substituting Eq. 3 for $p$ into Eq. 7 (cf. Nielsen and Townsend, 2004). Assuming the value of Eq. 3 is very small (i.e., population size is large and selection coefficient is sufficiently small), a useful approximation for the probability of detection of a HGT event on its way to fixation is

$$1-\left(1-\frac{e^{m t_g}}{N-1+e^{m t_g}}\right)^{n} \approx 1-\left(1-\frac{n\,e^{m t_g}}{N-1+e^{m t_g}}\right) = \frac{n\,e^{m t_g}}{N-1+e^{m t_g}}. \tag{8}$$

However, for practical implementation, the probability term from Eq. 3 may not be known to be small. Furthermore, the unknown timing of the successful HGT is a key factor in the probability of detection. Therefore it would be best to integrate over all possible timings in order to calculate a representative probability of detection of HGT events. Noting that in this case $t_g = t_s - t$, this integration, from Eqs 4, 5, and 8, is
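
$$\int_{0}^{t_x} \frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x\, \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right) \times \left[1-\left(1-\frac{e^{m(t_s-t)}}{N-1+e^{m(t_s-t)}}\right)^{n}\right] dt, \tag{9}$$
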
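or, moving factors that do not depend upon time $t$ out of the integral,

$$\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x \int_{0}^{t_x} \left[1-\left(1-\frac{e^{m(t_s-t)}}{N-1+e^{m(t_s-t)}}\right)^{n}\right] \times \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right) dt. \tag{10}$$
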
Equation 10 yields a prediction of the probability of occurrence and detection of a HGT event, and may be parameterized across a range of rates of HGT.
For experimental design purposes (or for prediction for policy purposes), it may be important to calculate not just the full probability of detection, but also the restricted, higher probability of detection given that a successful HGT has occurred. This calculation can be achieved by dividing the result of Eq. 10 by the probability of any successful HGT event over the time $t_x$,

$$\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x \int_{0}^{t_x} \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right) dt \approx 1-e^{-\left(1-e^{-2m}\right) r x t_x}. \tag{11}$$

The approximation is valid provided $N$ is large compared to $1/m$, so that $e^{-2Nm}$ is negligible. Setting this approximation aside for generality, the larger probability of detection given that a successful HGT has occurred is then
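
$$\frac{\displaystyle\int_{0}^{t_x} \left[1-\left(1-\frac{e^{m(t_s-t)}}{N-1+e^{m(t_s-t)}}\right)^{n}\right] \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right) dt}{\displaystyle\int_{0}^{t_x} \exp\!\left(-\frac{1-e^{-2m}}{1-e^{-2Nm}}\, r x t\right) dt}. \tag{12}$$
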
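Equations 10 and 12 have no simple closed form, so they are evaluated numerically. The following Python sketch is one possible way to do so and is not the authors' Mathematica notebook: all function names are hypothetical, the integral uses a fixed-step composite Simpson rule (an adaptive quadrature routine would be preferable when the integrand changes sharply, e.g., for large $m$ and long $t_x$), and only $m \ge 0$ is supported.

```python
import math


def fixation_probability(m, N):
    """Eq. 5, with the neutral limit 1/N at m = 0 (added assumption)."""
    return 1.0 / N if m == 0.0 else math.expm1(-2.0 * m) / math.expm1(-2.0 * N * m)


def detection_given_age(m, N, n, age):
    """Eq. 7 with p from Eq. 3, for a lineage `age` generations old."""
    p = 1.0 / (1.0 + (N - 1.0) * math.exp(-m * age))
    return 1.0 if p >= 1.0 else -math.expm1(n * math.log1p(-p))


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def detection_probabilities(m, N, r, x, n, t_x, t_s, steps=2000):
    """Return (Eq. 10, Eq. 12): the probability of occurrence and detection,
    and the probability of detection given a successful HGT."""
    lam = fixation_probability(m, N) * r * x  # rate parameter of Eq. 6
    weighted = simpson(
        lambda t: detection_given_age(m, N, n, t_s - t) * math.exp(-lam * t),
        0.0, t_x, steps,
    )
    # Denominator of Eq. 12 in closed form: (1 - e^{-lam t_x}) / lam.
    denom = -math.expm1(-lam * t_x) / lam if lam > 0.0 else t_x
    return lam * weighted, weighted / denom


if __name__ == "__main__":
    # Illustrative values in the spirit of the delayed-sampling scenarios.
    for t_s in (20, 50, 100):
        print(t_s, detection_probabilities(0.5, 1e12, 1e-10, 1e6, 10_000, 20, t_s))
```

When the rate $\lambda$ is very small, the exponential weight is nearly flat over the exposure period, so Eq. 12 reduces to the average of the detection term over uniformly distributed event times.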
MATERIALS AND METHODS
We applied this model to estimate the probability of successfully finding HGT events (e.g., antibiotic resistance genes or transgenes) in bacterial populations under different scenarios representative of various environmental conditions. The total population sizes ranged from $10^{6}$ (small) to $10^{12}$ (large). We assume only a fraction of the bacterial population that was sampled was exposed to novel genetic material (0.0001–1% in large populations, and 0.001–10% in small populations), resulting in HGT rates (and hence, transformant rates) ranging from $10^{-14}$ to $10^{-5}$. Moreover, as explained above, we considered only transformants that have relative fitness gains, as expressed by a positive selection coefficient, including weak positive selection ($m = 10^{-10}$–$10^{-3}$) or strong positive selection ($m = 10^{-3}$–$1$). Our analysis excludes secondary transmissions, a process that may need explicit consideration in cases of plasmid transfer (Landis et al., 2000).

[Figure 1: 3 × 3 grid of panels plotting P (detection) against Selection (m), $10^{-10}$ to $10^{-3}$, and HGT rate (r), $10^{-14}$ to $10^{-5}$. Columns: $t_x = t_s = 20$, $10^{3}$, $10^{5}$. Rows: $x = 10^{6}$, $10^{8}$, $10^{10}$.]

FIGURE 1 | Scenario 1: HGT in large populations, no sampling delay; weak positive selection. Probabilities of detection of transformants in a large bacterial population, $N = 10^{12}$, with HGT rates ranging from $r = 10^{-14}$ to $10^{-5}$, and weak positive selection of transformants ranging from $m = 10^{-10}$ to $10^{-3}$. The proportion of DNA-exposed bacteria is low, medium, and high ($x = 10^{6}$, $10^{8}$, and $10^{10}$, respectively, out of the $10^{12}$ total bacterial population, from top to bottom), and the time period of DNA exposure is the same as the time to sampling: $t_x = t_s = 20$, $10^{3}$, and $10^{5}$, from left to right. Sample size $n = 10{,}000$ bacteria.
Four different environmental scenarios were examined that broadly represent the population dynamics of HGT events in bacterial populations. The scenarios encompassed: (i) large and small bacterial populations, (ii) strong and weak selection of the HGT events (transformants), and (iii) immediate or delayed sampling, i.e., whether the sampling of the larger bacterial population was performed at the end of the DNA exposure, or delayed in time until long after the DNA exposure had ended. Within these scenarios, we varied the HGT rate, the selection coefficient, the ratio of exposed to total population of bacteria, the time period of exposure, and the time until the bacteria were sampled in the field (Table 1).
Since approx. 10,000 bacteria represents the upper limit of the number of individual isolates that can be practically assayed in a research laboratory (Nielsen and Townsend, 2004), we assume this sample size (n=10,000) for all our scenario calculations, even though effective sample sizes in actual studies to date have been
smaller. In scenarios where several samples were taken from a field at several time points, sample size will be proportionally reduced at each sampling point. This ensures comparability among experimental designs.
In scenarios 1 and 2, the focus is on the effects of variable DNA exposure level (exposed sub-population versus the total population of bacteria) and on the strength of selection. We keep the time span of exposure equal to the time span before sampling, i.e., sampling occurs at the end of the DNA exposure ($t_x = t_s$). In scenarios 3 and 4, the sampling is delayed for considerable amounts of time after the DNA exposure of the bacterial recipients has ended (i.e., $t_x < t_s$).
All calculations were performed and graphics were drawn in the Mathematica 4.1 software (Wolfram Research, IL, USA). The Mathematica notebook containing these calculations is available in the Appendix.
RESULTS
SCENARIO 1. DETECTION OF BACTERIAL TRANSFORMANTS IN LARGE POPULATIONS
Scenario 1 represents a large bacterial population (e.g., abundant members of the soil bacterial community or the GIT of an animal population). Of the total population, only a sub-fraction of 0.0001, 0.01, or 1% is actually exposed to DNA (e.g., due to limited release/exposure of DNA from the defined source and/or DNA degradation in soil or the GIT; Nielsen et al., 2007; Nordgård et al., 2007; Rizzi et al., 2012). Those bacteria exposed can acquire DNA at rates ranging from extremely low (below what is usually experimentally measurable in the laboratory) to very high. Detection is likely only when sampling is performed after a long period of exposure (Figures 1 and 2). The strength of directional selection is of considerable importance. However, the determining factor is the time of sampling after onset of exposure. Achieving a 90% or greater probability of detection, given that transfer has occurred, requires a selection coefficient greater than $m = 10^{-4}$ (Figure 1). This observation suggests a time interval from the onset of DNA exposure until detection is possible of 11 years in the GIT to up to 3,000 years in soil (given $10^{5}$ bacterial generations, with generation times of 1 h to 2 weeks, respectively). Figures 1 and 2 illustrate that an increase in the proportion of exposed bacteria is of little importance when compared to prolonging the time period of DNA exposure and sampling. Given enough time, even a weakly but positively selected HGT event (e.g., antibiotic resistance gene or transgene) resulting from DNA exposure of only a small fraction of the total population and with a low HGT rate is likely to establish in the bacterial population.

Table 1 | Parameters and their ranges used in this study.
Parameter | Symbol | Range | Comments/reference
Total population size | $N$ | $10^{6}$–$10^{12}$ | Overall size of population that is susceptible to HGT in the exposed environment; note this bacterial population may therefore not be limited to a particular species.
Number of exposed bacteria, in large (and small) populations | $x$ | $10^{6}$–$10^{10}$ ($10$–$10^{5}$) | The number of the overall susceptible population that will be exposed to the donor DNA source; a smaller fraction of those exposed is transformed (see HGT rate below).
Selection coefficient | $m$ | $10^{-10}$–$1$ | Relative measure of Malthusian fitness in populations with overlapping generations.
Time to sampling | $t_s$ | $20$–$10^{5}$ | The time (in bacterial generations) from the beginning of DNA exposure to the time of sampling.
Time of exposure | $t_x$ | $20$–$10^{5}$ | Time of exposure to DNA source (in bacterial generations).
HGT rate | $r$ | $10^{-14}$–$10^{-5}$ | Frequency of gene transfer into the bacterial population.
Sample size | $n$ | 10,000 | Nielsen and Townsend (2004).

[Figure 2: 3 × 3 grid of panels plotting P (detection) against Selection (m), $10^{-3}$ to $1$, and HGT rate (r), $10^{-14}$ to $10^{-5}$. Columns: $t_x = t_s = 20$, $10^{2}$, $10^{3}$. Rows: $x = 10^{6}$, $10^{8}$, $10^{10}$.]

FIGURE 2 | Scenario 1: HGT in large populations, no sampling delay; strong positive selection. Probability of detection of transformants in a large bacterial population, $N = 10^{12}$, with a HGT rate ranging from $r = 10^{-14}$ to $10^{-5}$, and strong positive selection on transformants ranging from $m = 10^{-3}$ to $1$. The number of bacteria exposed to DNA is low, medium, and high ($x = 10^{6}$, $10^{8}$, and $10^{10}$, respectively, out of the $10^{12}$ total bacterial population, from top to bottom), and the time period of DNA exposure is the same as the time to sampling: $t_x = t_s = 20$, $10^{2}$, and $10^{3}$, from left to right. Sample size $n = 10{,}000$ bacteria.
SCENARIO 2. DETECTION OF BACTERIAL TRANSFORMANTS IN SMALL POPULATIONS
Scenario 2 considers a small bacterial population. Such a scenario can be representative of fluctuating colonization or infection patterns such as microcolonies on plant, skin, or soil surfaces (e.g., Kinkel et al., 1995; Morris et al., 1997; Monier and Lindow, 2004), or alternatively situations where only a subset of the species/strains in the overall DNA-exposed microbial community are capable of acquiring DNA. In this scenario, a sub-fraction of 0.001, 0.1, or 10% of the overall capable population is exposed to DNA and can acquire DNA at rates ranging from extremely low to high frequencies ($r = 10^{-14}$–$10^{-5}$).
When selection is weak, the HGT events and, hence, resulting transformants, are likely to be detected only after relatively long exposure times, i.e., more than 1,000 generations, and only when the fraction of exposed bacteria is about 0.1% or higher (Figure 3).
In situations where positive selection is stronger, the HGT event is detectable in the short term, i.e., after 20 generations, given that the fraction of bacteria that are exposed is high (Figure 4, bottom panel). A combination of an intermediate fraction of bacteria exposed (0.1%) and an intermediate time for DNA exposure (1,000 generations) gives a relatively high probability of detecting HGT events (Figure 4, middle panel). Even in cases where the fraction of bacteria exposed to the DNA is very low, long-term exposure ($t_x > 10^{5}$ generations) and strong positive selection will lead to establishment of transformants, i.e., at detectable levels (Figure 4, upper panel). The longest exposure period examined, $10^{5}$ generations, represents a continual exposure period of 10 years (with a bacterial division time of 1 h or less) to more than 3,000 years (with a bacterial division time of 11 days or more).
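The conversion between bacterial generations and calendar time used in Scenarios 1 and 2 is simple arithmetic, sketched below in Python. The helper name `generations_to_years` and the 365.25-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: average calendar year


def generations_to_years(generations: float, generation_time_hours: float) -> float:
    """Calendar time spanned by a number of bacterial generations."""
    return generations * generation_time_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for label, hours in (("1 h", 1.0), ("11 days", 11 * 24.0), ("2 weeks", 14 * 24.0)):
        print(label, round(generations_to_years(1e5, hours)), "years")
```

Under these assumptions, $10^{5}$ generations correspond to about 11 years at a 1 h division time, about 3,000 years at 11 days, and about 3,800 years at 2 weeks.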
THE EFFECT OF DELAYED SAMPLING AFTER SHORT-TERM TRANSIENT EXPOSURE TO DNA
In the following scenarios (three and four, weak and strong selection, respectively), the bacteria are exposed to DNA for only a short period of time (20–100 generations, e.g., representing a time period of less than a day to a few years, depending on bacterial growth rate). In all cases, the bacterial population is sampled after exposure. That is, a time delay before sampling is introduced after the end of the exposure period that provides additional time for directional selection (of a range of intensities) to act on the transformed cells. These scenarios illustrate situations where the sampled microbial community (e.g., agricultural soil, GIT) is only temporarily exposed to the DNA source in question (e.g., soil or GIT bacteria by seasonal crop cultivation or consumption patterns). We examined large populations ($N = 10^{12}$) only, and applied different parameter values for the weak and strong selection scenarios. For the weak positive selection scenario, the exposure time was 100 generations, the exposed population was of size $x = 10^{8}$, and the time lag before sampling ranged from $10^{4}$ to $10^{5}$ generations. For the strong selection scenario, the DNA exposure time was extremely short (20 bacterial generations), the exposed population was $x = 10^{6}$, and the time lag before sampling was none (i.e., sampling at the end of exposure), 30 bacterial generations, or 80 bacterial generations.
SCENARIO 3. DELAYED SAMPLING – LARGE POPULATIONS AND WEAK POSITIVE SELECTION
In situations where potential transformants experience only weak positive selection ($m = 10^{-10}$–$10^{-3}$), with 0.01% of the bacterial population exposed to DNA, and an exposure time of 100 generations, no HGT events could be detected either at the end of the exposure or after a delayed sampling ($10^{4}$ generations after DNA exposure; Figure 5, left). However, a further 5- to 10-fold increase in the time delay before sampling (to $5 \times 10^{4}$ and $10^{5}$ generations) yielded increasing probabilities of detecting the HGT events (Figure 5, middle and right). Thus, theoretically, in environmental situations where the bacterial generation time is very short (e.g., in a mammalian gut system), HGT events arising from limited, transient DNA exposure can be detected, provided they are positively selected and have had the necessary time to increase in relative numbers within the overall population. However, even the most rapidly dividing bacterial populations would need more than 10 years to comprise $5 \times 10^{4}$ and $10^{5}$ generations (the 10-year figure would assume a bacterial division time of 30 min).
Supposing this scenario represented an environment with intermittent antibiotic treatments, the length of the time period before transformants become detectable would be sensitive to any inconstancy of selection. The time period before the transformant population either increases in proportion to detectability or is lost from the population would therefore be different, and typically longer, in a situation with more variable selection dynamics.
SCENARIO 4. DELAYED SAMPLING – LARGE POPULATIONS AND STRONG SELECTION
Under strong positive selection, even HGT events occurring as a consequence of exposure of a very low overall proportion of the population (here 0.0001%) over a short period of time ($t_x = 20$ generations) can be detected (Figure 6). The time of sampling nevertheless remains a significant factor in the probability of detection. Sampling at the end of the exposure ($t_s = 20$) yields a low probability of detection (Figure 6, left). In contrast, introducing a time lag before sampling, here 30 and 80 generations after the end of exposure ($t_s = 50$ and 100, respectively), results in a sharp increase in the likelihood of detection (Figure 6, middle and right).
The starkness of this result may at first seem surprising; however, positive selection that increases the frequency of the transformed bacteria is the main characteristic that makes detection possible; other factors in this scenario are of negligible importance.
Strong directional selection makes such HGT events less affected
[Figure 3: 3 × 3 grid of panels plotting P (detection) against Selection (m), $10^{-10}$ to $10^{-3}$, and HGT rate (r), $10^{-14}$ to $10^{-5}$. Columns: $t_x = t_s = 20$, $10^{3}$, $10^{5}$. Rows: $x = 10$, $10^{3}$, $10^{5}$.]

FIGURE 3 | Scenario 2: HGT in small populations, no sampling delay; weak positive selection. Probability of detection of transformants in a small bacterial population, $N = 10^{6}$, with HGT rates ranging from $r = 10^{-14}$ to $10^{-5}$, and weak positive selection on transformants ranging from $m = 10^{-10}$ to $10^{-3}$. The number of bacteria exposed to the DNA is low, medium, and high ($x = 10$, $10^{3}$, and $10^{5}$, respectively, out of the $10^{6}$ total bacterial population, from top to bottom), and the time period of exposure is the same as the time to sampling: $t_x = t_s = 20$, $10^{3}$, and $10^{5}$, from left to right. Sample size $n = 10{,}000$ bacteria.
[Figure 4: 3 × 3 grid of panels plotting P (detection) against Selection (m), $10^{-3}$ to $1$, and HGT rate (r), $10^{-14}$ to $10^{-5}$. Columns: $t_x = t_s = 20$, $10^{3}$, $10^{5}$. Rows: $x = 10$, $10^{3}$, $10^{5}$.]

FIGURE 4 | Scenario 2: HGT in small populations, no sampling delay; strong positive selection. Probability of detection of transformants in a small bacterial population, $N = 10^{6}$, with HGT rates ranging from $r = 10^{-14}$ to $10^{-5}$, and strong positive selection on transformants ranging from $m = 10^{-3}$ to $1$. The number of bacteria exposed to DNA is low, medium, and high ($x = 10$, $10^{3}$, and $10^{5}$, respectively, out of the $10^{6}$ total bacterial population, from top to bottom), and the time period of exposure is the same as the time to sampling: $t_x = t_s = 20$, $10^{3}$, and $10^{5}$, from left to right. Sample size $n = 10{,}000$ bacteria.