

Similar reasoning applies to the rest of the coefficients; further details may be found in reference [51].

It is now fairly easy to see the distinction between inbreeding and the subpopulation correction (coancestry). Whereas the former influences the IBD patterns as illustrated above, the latter affects the $g_i$ in (3) by adjusting the allele frequencies. As in the introduction, inbreeding as discussed above concerns models within pedigrees, while coancestry, as discussed in Section 1.1.6, requires models for population effects.

1.1.8 Linkage

Genetic linkage is the phenomenon occurring within a pedigree when alleles at different loci are inherited dependently. The cause is generally attributed to the physical proximity of loci on the same chromosome. This is true only with some modification: linkage may be quite different for two loci separated by, say, 1000 bases on one chromosome and two loci separated by the same distance on another chromosome, i.e. it depends on more than physical distance alone. One measure of genetic distance is the centiMorgan (cM), where 1 cM is very roughly equal to 1 million bases (Mb). Even more commonly, linkage is expressed in terms of the recombination fraction (crossover rate), r, which is the probability that a crossover occurs between two loci in any given meiosis (strictly, the probability of an odd number of crossovers between them). The relation between cM and recombination fraction can be obtained from a mapping function. For instance, Haldane’s mapping function specifies

\[
r = \frac{1}{2}\left(1 - e^{-2d/100}\right)
\]

relating the recombination fraction, r, to the genetic distance, d, measured in cM. The formula relies on the assumption that crossovers along a chromosome follow a Poisson process. The assumption is reasonable for calculation purposes, although it does not account for interference, i.e. the occurrence of one crossover affecting the probability of a subsequent crossover nearby.
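The mapping function can be illustrated with a minimal sketch. The function names `haldane_r` and `haldane_d` are hypothetical and chosen for this example only; the second function is simply the algebraic inverse of the formula, d = -50 ln(1 - 2r).

```python
import math


def haldane_r(d_cm: float) -> float:
    """Recombination fraction r for a genetic distance d_cm (in cM), Haldane's function."""
    if d_cm < 0:
        raise ValueError("genetic distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def haldane_d(r: float) -> float:
    """Genetic distance in cM for a recombination fraction r, with 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Short distances give r close to d/100; long distances approach r = 0.5.
    for d in (1, 10, 50, 100, 300):
        print(f"d = {d:>3} cM -> r = {haldane_r(d):.4f}")
```

For small distances r is approximately d/100, while r approaches 0.5 (independent inheritance) as d grows.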

To obtain a measure of the linkage between two markers, we typically analyze large extended pedigrees where haplotypes and their inheritance as units can be traced throughout the tree. From a statistical point of view, linkage only affects transition probabilities within a pedigree, and we generally require at least two meioses to observe an effect (see footnote 5). As a consequence, random match probabilities will never be affected by linkage, unless the alternative hypothesis is, for instance, “My brother did it” [52]. In medical genetics, linkage is commonly used as a first step to screen for potential genes. It is a natural approach, as linkage extends quite far, in theory all along the chromosome, while other means may subsequently be used to get a more exact position.

Although the use of linked markers has been described for relationship estimation, see e.g. Thompson [53], the forensic genetics field has been hesitant to use them. This could be because no user-friendly implementations have existed. In addition, linked markers introduce more parameters and require more complex models. Nevertheless, they may provide crucial information in some relationship cases [54-56].

Gill et al. demonstrated that linkage should be considered whenever two or more meioses separate two typed individuals in a pedigree [57]. Furthermore, Kling et al. provide simulations illustrating the effect on some common relationship scenarios [58]. One frequently illustrated scenario involves the relationship hypotheses

$H_{UNC}$: The two individuals are related as uncle/nephew
$H_{HS}$: The two individuals are related as half siblings

Footnote 5: It should be noted that this is a very crude rule.

Consider two individuals P1 and P2 with genotypes 17,19 and 19,21, respectively, at one genetic marker and 14,15 and 15,17, respectively, at a second marker. Using two unlinked autosomal markers we may use equation (2) and obtain LR=1, as both relationship hypotheses have the same IBD probabilities, i.e. $k_0$, $k_1$ and $k_2$ are equal for both relationships. In contrast, treating the same two markers as linked gives the formula

\[
LR = \frac{P(\text{Data} \mid H_{HS})}{P(\text{Data} \mid H_{UNC})}
= \frac{0.5\, g_{0,1}\left[\left(r^2 + (1-r)^2\right) g_{0,2} + 2r(1-r)\, g_{1,2}\right] + 0.5\, g_{1,1}\left[2r(1-r)\, g_{0,2} + \left(r^2 + (1-r)^2\right) g_{1,2}\right]}
{0.5\, g_{0,1}\left[\left((1-r)^3 + 3r^2(1-r)\right) g_{0,2} + \left(3r(1-r)^2 + r^3\right) g_{1,2}\right] + 0.5\, g_{1,1}\left[\left(3r(1-r)^2 + r^3\right) g_{0,2} + \left((1-r)^3 + 3r^2(1-r)\right) g_{1,2}\right]}
\]

where $g_{i,j} = P_j(\text{Data} \mid \text{IBD}=i)$ are functions of the allele frequencies given that i alleles are IBD at locus j. The terms including r may look complicated but follow from the fact that for half siblings we have two meioses, while for uncle-nephew we have three. The first term is explained by the probability that zero alleles are IBD at the second marker given that zero alleles are IBD at the first marker, which can be the consequence of two recombinations or none, $r^2 + (1-r)^2$. Further, evaluating the $g_{i,j}$ we see that the LR is a function of r, p(19) and p(15), i.e. the frequencies of the shared allele at each locus. Figure 8 illustrates the LR as a function of r for some fixed values of p(19) and p(15). The recombination rate clearly has an impact on the results, since the number of meioses differs between the two hypotheses, although for the current data the effect is fairly small.

Figure 8. The LR in a case where the disputed relationships are half siblings and uncle-nephew. The recombination rate (r) is on the X-axis.
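The dependence of the LR on r can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the implementation used to produce Figure 8. It assumes the standard pairwise genotype probabilities for two heterozygotes sharing exactly one allele (proportional to 4p for IBD=0 and to 1 for IBD=1, where p is the frequency of the shared allele; the common factor cancels in the LR), and it uses the transition terms as described in the text: two meioses for half siblings and three for uncle-nephew. All function names and the allele frequencies 0.1 and 0.2 are hypothetical.

```python
def g_values(p_shared: float) -> tuple[float, float]:
    """(g0, g1) for genotypes a,b and b,c sharing allele b, up to a common factor.

    Assumption: P(data | IBD=0) = 4*pa*pb^2*pc and P(data | IBD=1) = pa*pb*pc,
    so after dividing by pa*pb*pc only the shared-allele frequency remains.
    """
    return 4.0 * p_shared, 1.0


def two_locus_likelihood(stay: float, locus1, locus2) -> float:
    """P(Data | H) for k0 = k1 = 0.5, where `stay` is the probability that the
    IBD state is the same at both loci."""
    change = 1.0 - stay
    g01, g11 = locus1
    g02, g12 = locus2
    return (0.5 * g01 * (stay * g02 + change * g12)
            + 0.5 * g11 * (change * g02 + stay * g12))


def lr_hs_vs_unc(r: float, p19: float, p15: float) -> float:
    """LR for half siblings versus uncle-nephew at recombination fraction r."""
    locus1, locus2 = g_values(p19), g_values(p15)
    stay_hs = r**2 + (1 - r)**2               # even number of recombinations in two meioses
    stay_unc = (1 - r)**3 + 3 * r**2 * (1 - r)  # even number in three meioses, as in the text
    return (two_locus_likelihood(stay_hs, locus1, locus2)
            / two_locus_likelihood(stay_unc, locus1, locus2))


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only for illustration.
    for r in (0.0, 0.05, 0.1, 0.2, 0.3, 0.5):
        print(f"r = {r:.2f} -> LR = {lr_hs_vs_unc(r, p19=0.1, p15=0.2):.4f}")
```

With r = 0.5 the two markers are unlinked and the sketch returns LR = 1, in agreement with the unlinked calculation.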

Using linked markers has, as previously indicated, generally been considered an obstacle in forensic genetics, although it can actually be turned into a great advantage. As noted by Thompson [53], dependency tends to reduce the information contributed by each individual marker, but given that the alternative is to exclude linked markers from the calculations, including them is always the better option, provided a model for the dependency is available. Their use will most probably become even more important in the future with the arrival of next-generation sequencing technologies, which will inevitably lead to a greater number of markers and, as a consequence, more dependency between them.