Likelihood Ratios - University of Vermontbiology/Classes/296D/21_Likelihood.pdf · Likelihood...
Likelihood Ratios
Chapter 21
A Match
Three possible outcomes of comparing two DNA profiles:
1. Inconclusive - unknown
– There is not enough data to determine
2. Exclusion – non-match
– The profiles are too different to possibly be the same individual
3. Inclusion – if the DNA profiles match
– Probability of seeing this match at random is calculated
Court Case
• Match is observed between the evidence and the suspect’s reference sample:
– ‘Q’ matches ‘K’
• Defense:
– Samples match completely by chance
– Q could be from another individual
• Prosecution:
– Q and K match because they both come from the suspect
Random Match
Three ways to report the probability of a random match
– Possibility that an unrelated individual could have the exact same DNA profile
1. Genotype frequency estimate
• How often will this genotype be seen?
2. Likelihood ratios
3. Source attribution
1. Frequency Estimates
• Look up allele frequency estimates for each locus
– In appropriate population database
• Multiply genotype frequencies across all loci
• Using the product rule
– Assuming that all loci are unlinked
• Will calculate an estimate of the frequency of the specific DNA profile seen
Genotype Frequencies
If the locus is homozygous:
• Look up the allele frequency of that allele (p)
• $p^2$ = genotype frequency
If the locus is heterozygous:
• Look up the allele frequencies for both alleles (p and q)
• 2pq = genotype frequency
Hardy-Weinberg Calculation
Why these calculations?
• Received one p allele from mother
• One p allele from father
• (p)(p) = $p^2$ = genotype frequency
• Either received one allele (p) from mother and other allele (q) from father
• Or reverse could be true: q from mother, p from father
• (p)(q) + (q)(p) = 2pq = genotype frequency
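The two Hardy-Weinberg cases can be sketched in a few lines of Python. This is a minimal illustration only: the function name `genotype_frequency` and the allele frequencies in the usage lines are made up for the example and do not come from any population database.

```python
from typing import Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency at a single locus.

    p: frequency of the first allele.
    q: frequency of the second allele if the locus is heterozygous,
       or None if the locus is homozygous.
    """
    if q is None:
        # Homozygous: one p allele from each parent, (p)(p) = p^2
        return p * p
    # Heterozygous: (p)(q) + (q)(p) = 2pq
    return 2 * p * q


# Hypothetical allele frequencies, for illustration only
print(genotype_frequency(0.2))        # homozygous locus
print(genotype_frequency(0.2, 0.1))   # heterozygous locus
```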
Population databases
• The allele frequencies are taken from the population database that matches the suspect
• Table 21.1 shows the difference:
– Caucasian vs. Hispanic database
• Caucasian – chance of seeing these three genotypes = 1 in 17,000
• Hispanic – chance of seeing = 1 in 31,000
More Loci – More rare
• As more and more loci are added, the DNA profile becomes extremely rare
• The more loci at which Q and K match, the less likely it becomes that an unrelated individual could have contributed this DNA to the crime scene
• Therefore the case comes to two options:
– Either the suspect contributed the evidence
– Or a set of extremely unlikely coincidences occurred
Fallacies
• A fallacy is a misconception resulting from incorrect reasoning
• Prosecutor’s Fallacy:
– Overstates the DNA evidence
– “There is only a 1 in 17,000 chance that the defendant is not guilty”
• Defense Attorney’s Fallacy:
– The idea that anyone else with the same DNA profile had an equal chance of committing this crime
Fallacies
• A fallacy is a misconception resulting from incorrect reasoning
• The only way to interpret DNA evidence is as follows:
– “The probability of seeing this DNA profile within the same population is 1 in 17,000”
• The DNA evidence says nothing about:
– Access to the crime scene, alibi, motive, etc.
– Not everyone in the world has this same set of conditions
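One way to see why a profile frequency is not a probability of innocence is to ask how many people in a population would be expected to share the profile. The sketch below is illustrative only: `expected_matches` is a hypothetical helper, and the population of 1,000,000 is an assumed figure, not one taken from the slides.

```python
def expected_matches(population_size: int, profile_frequency: float) -> float:
    """Expected number of people in a population who share a DNA profile."""
    return population_size * profile_frequency


# 1 in 17,000 is the profile frequency quoted on the slide;
# the population of 1,000,000 is an assumption for illustration.
n = expected_matches(1_000_000, 1 / 17_000)
print(f"Expected matching individuals: {n:.1f}")
```

Under these assumed numbers, roughly 59 people would be expected to match, which is why access, alibi, and motive still matter.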
Product Rule
• As long as the loci are not genetically linked to each other
– This is tested and proven before the markers are used
• Then all the genotype frequencies can be multiplied together to determine the DNA profile frequency
• The product rule is what gives the power to the DNA profile:
– Because the chance of seeing all these genotypes together is so small
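The product rule itself is just a multiplication across loci. A minimal sketch, assuming the loci are unlinked; `profile_frequency` is a hypothetical helper and the three genotype frequencies are invented for illustration.

```python
import math
from typing import Iterable


def profile_frequency(genotype_frequencies: Iterable[float]) -> float:
    """Multiply per-locus genotype frequencies (product rule).

    Only valid when the loci are unlinked, so that the
    genotypes at different loci are independent.
    """
    return math.prod(genotype_frequencies)


# Hypothetical genotype frequencies at three loci
freq = profile_frequency([0.04, 0.10, 0.05])
print(f"Profile frequency: {freq:.2e} (about 1 in {1 / freq:,.0f})")
```

Each extra locus multiplies in another number below 1, which is why the profile frequency shrinks so quickly as loci are added.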
Product Rule
• Table 21.2 shows the DNA profile frequency using the most common genotypes in three different population databases
• $1.20 \times 10^{-15}$
• $6.04 \times 10^{-17}$
• $5.57 \times 10^{-17}$
• All three are more rare than 1 in 6 Billion
Which Population Database?
• Use the population database that matches the suspect’s background
• Or the one indicated by a witness or video of the suspect, if available
• If not, then use the range across all databases
• Box 21.1
• Range of profile frequencies across all population databases
– Still get less than 1 in 6 Billion
Adjusting for Substructure
• Have to adjust for the possibility of inbreeding and having related individuals
• Discussed before how you incorporate theta (θ) into the empirically derived measure of genotype frequency
• Correction factor is used:
– θ = 0.01 for general population of US
– θ = 0.03 for isolated populations
• Table 21.4
– Apply to genotype frequencies
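The slides do not spell out how θ enters the calculation. One widely used form, assumed here rather than taken from the slides, adjusts only homozygous genotypes: $p^2 + p(1-p)\theta$, leaving heterozygotes at 2pq. The sketch below follows that assumption; both function names and the allele frequency 0.1 are hypothetical.

```python
def homozygote_frequency(p: float, theta: float = 0.01) -> float:
    """Homozygote frequency with a substructure correction.

    Assumed form: p^2 + p(1 - p) * theta.
    With theta = 0 this reduces to the plain Hardy-Weinberg p^2.
    """
    return p * p + p * (1 - p) * theta


def heterozygote_frequency(p: float, q: float) -> float:
    """Heterozygote frequency, left uncorrected (2pq) under this assumption."""
    return 2 * p * q


# Hypothetical allele frequency p = 0.1 with the two theta values from the slide
for theta in (0.0, 0.01, 0.03):
    print(theta, homozygote_frequency(0.1, theta))
```

A larger θ raises the homozygote frequency, so the corrected profile looks more common, which is the conservative direction.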
Related Suspect?
• What if the suspect might be related to the true perpetrator?
• Have to calculate everything differently
• Because if they are related then their genotypes are not independent
• Relatives are expected to have more genotypes in common than random individuals
Related Suspect
• Table 21.7 shows example calculations
• Genotype frequencies depend on amount of relation between suspect and true perpetrator
• Defense has to have reason to show that related individual might have committed crime
• Just know that the more closely related they are, the less information can be obtained from DNA
General Match
Five possible relationships between suspect and true perpetrator:
1. Suspect committed crime
2. Suspect’s sibling did it
3. Suspect’s relative did it
4. Someone within suspect’s racial group
5. Someone outside suspect’s racial group
General Match
• Use most conservative estimates
Relationship            Match probability
Sibling                 1 in 10,000
Parent/child            1 in 1 million
Half-siblings/uncle     1 in 10 million
First cousin            1 in 100 million
Unrelated               1 in 1 billion
• Favors the defendant
2. Likelihood Ratio
Compare the probabilities of two mutually exclusive hypotheses:
• Prosecution:
– This DNA matches because it is the suspect’s DNA
• Defense:
– This DNA just happens to match by coincidence
• Set up two probabilities as a ratio
• Likelihood ratio = Hp/Hd
Likelihood Ratio
• LR = Hp/Hd
• Since prosecution’s hypothesis is that the suspect committed the crime:
– Hp = 1 (100% probability)
• Hypothesis of the defense is probability that someone else could have same DNA:
– Hd = Genotype frequency in population
• LR – Inverse relationship to the genotype frequency
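With Hp = 1 and Hd equal to the genotype frequency, the likelihood ratio is simply the reciprocal of that frequency. A minimal sketch; `likelihood_ratio` is a hypothetical helper and the frequency of 1 in 1,000 is an invented example.

```python
def likelihood_ratio(genotype_frequency: float, hp: float = 1.0) -> float:
    """LR = Hp / Hd, with Hd taken as the genotype frequency in the population."""
    if not 0 < genotype_frequency <= 1:
        raise ValueError("genotype_frequency must be in (0, 1]")
    return hp / genotype_frequency


# Hypothetical genotype frequency of 1 in 1,000
print(likelihood_ratio(0.001))
```

Halving the genotype frequency doubles the LR, which is the inverse relationship the slide refers to.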
Likelihood Ratios
• LR greater than 1 = evidence supports prosecution
• LR less than 1 = evidence supports defense
• The larger the LR gets, the more evidence for the prosecution
• The rarer the DNA profile becomes, the larger the ratio will become:
– Genotype more markers
Likelihood Ratios
Likelihood Ratio     Evidence Provides
1 to 10              Limited support
1 to 100             Moderate support
1 to 1000            Strong support
1 to >1000           Very Strong support
• 13 loci DNA profile frequency is around 1 in $8 \times 10^{14}$ (about $1.2 \times 10^{-15}$)
• LR = 1 to $8 \times 10^{14}$
• Extremely strong support for prosecution
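The verbal scale can be written as a simple lookup. This is a sketch under stated assumptions: the rows are read as bands with upper bounds of 10, 100, and 1000, the handling of exact boundary values and of LR = 1 is a choice made here, and `verbal_support` is a hypothetical name.

```python
def verbal_support(lr: float) -> str:
    """Map a likelihood ratio to the verbal scale from the slide."""
    if lr <= 0:
        raise ValueError("likelihood ratio must be positive")
    if lr < 1:
        return "supports defense"
    if lr == 1:
        return "neutral"  # assumption: the slide does not cover LR = 1
    if lr <= 10:
        return "limited support"
    if lr <= 100:
        return "moderate support"
    if lr <= 1000:
        return "strong support"
    return "very strong support"


# LR of 8 x 10^14 is the 13-locus figure quoted on the slide
print(verbal_support(8e14))
```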
3. Source Attribution
• DNA profiles can be rarer than 1 in the world’s population
• So the FBI calculates the “Source attribution” for a specific DNA profile
• This is the Confidence Level of attributing the source of the DNA to the suspect
• The rarer the DNA profile, the more confidence one has that the suspect is the source of that DNA
3. Source Attribution
• Page 513 shows the math involved
• Don’t worry about specifics
• A random match probability of $3.35 \times 10^{-11}$ will confer 99% confidence that the suspect is the source of the DNA
• Can calculate this for the population of interest:
– 300 Million for entire USA
– 6 Billion for entire world population
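The slides leave the math to the textbook. One commonly cited form of the threshold, assumed here and not quoted from the slides, is $p_x \le 1 - (1 - \alpha)^{1/N}$, where N is the population size and $1 - \alpha$ is the confidence level. The sketch below uses that form; `source_attribution_threshold` is a hypothetical name.

```python
import math


def source_attribution_threshold(population_size: float,
                                 confidence: float = 0.99) -> float:
    """Largest random match probability that still gives the stated confidence.

    Assumed form: 1 - (1 - alpha) ** (1 / N), with alpha = 1 - confidence.
    log1p/expm1 are used because the result is extremely close to zero.
    """
    alpha = 1.0 - confidence
    return -math.expm1(math.log1p(-alpha) / population_size)


# Populations of interest named on the slide
print(f"USA   (300 million): {source_attribution_threshold(300e6):.3g}")
print(f"World (6 billion):   {source_attribution_threshold(6e9):.3g}")
```

With N = 300 million and 99% confidence this form gives about 3.35 x 10^-11, which agrees with the figure on the slide.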
3. Source Attribution
• Appropriate source attribution statement will be as follows:
– “In the absence of identical twins or close relatives it can be concluded with scientific certainty that the DNA profile (Q) and (K) came from the same individual”
• You can also include what the scientific certainty is:
– 99% confidence
– 95%, 90% etc.
3. Source Attribution
• If a close relative may have had access to the crime scene
• Then the law states that there is probable cause for obtaining a reference sample from the suspected relative
• A DNA profile is produced and compared to the crime scene sample and suspect’s sample
– Evidence is dealt with accordingly
Lineage Markers
• Y-chromosome and mtDNA markers are inherited directly
• Therefore cannot use product rule
• Use counting method instead:
• p = X / N, where X = # of times this exact profile is seen and N = # of total profiles
• Depends on the population the suspect belongs to
• Also on the population database used
• The larger the database, the more accurate the estimate will be
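The counting method is a single division. A minimal sketch; `counting_method_frequency` is a hypothetical name, and the counts in the usage lines are invented to show how the estimate depends on database size.

```python
def counting_method_frequency(times_observed: int, database_size: int) -> float:
    """Counting method: p = X / N.

    times_observed: X, the number of times this exact profile is in the database.
    database_size:  N, the total number of profiles in the database.
    """
    if database_size <= 0:
        raise ValueError("database_size must be positive")
    if not 0 <= times_observed <= database_size:
        raise ValueError("times_observed must be between 0 and database_size")
    return times_observed / database_size


# Hypothetical counts: the same 3 observations in two databases of different size
print(counting_method_frequency(3, 1_000))
print(counting_method_frequency(3, 10_000))
```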
Figure 19.2, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press

(A) Hypothesis Testing Decisions
Decision based on          Truth about the population
sample examined            H0 True               H1 True
Accept H0                  Correct decision      Type II error
Reject H0 (Accept H1)      Type I error          Correct decision

(B) Example
                           Defendant
Courtroom Verdict          Not Guilty            Guilty
Not Guilty                 Correct decision      Wrongfully acquitted
Guilty                     Wrongfully accused    Correct decision
Any Questions?
Read Chapter 22