Forensic DNA Statistics: Still Controversial In Some Cases

BY WILLIAM C. THOMPSON, LAURENCE D. MUELLER, AND DAN E. KRANE

THE CHAMPION, December 2012 (WWW.NACDL.ORG)


Source: darwin.bio.uci.edu/~mueller/pdf/forensic_dna_dec_p12-23.pdf

Although forensic DNA testing is well established, experts sometimes disagree about the interpretation and statistical characterization of test results.

This article will describe the key controversies and will explain what lawyers need to know to recognize and deal with controversial types of DNA evidence.

When labs try to “type” samples that contain too little DNA, or DNA that is too degraded, the results of the DNA test can be unreliable. The test may fail to detect certain genetic characteristics (called alleles) of people who contributed DNA to the sample — a phenomenon called allelic drop out; the tests may falsely detect characteristics that did not come from contributors — a phenomenon called allelic drop in; and the test results may be distorted in other ways that complicate interpretation.

Labs try to make allowances for these distortions when they deal with limited and degraded samples. Because two samples from the same person may (under these conditions) produce slightly different DNA profiles, labs must use lenient standards for declaring that a suspect “matches” or is “included as a possible contributor” to an evidentiary sample. They do not require that the suspect’s profile correspond exactly to the evidentiary profile — some differences are allowed in order to account for drop out, drop in, and other types of distortion. But this leniency in matching increases the probability that an innocent suspect, who was not a contributor, will be incriminated — and therefore requires an adjustment to the statistics that describe the value of the DNA match. The controversy we will examine concerns how big this adjustment should be, and how labs should make it. Before delving into that debate, however, we will provide some background information on DNA evidence.

I. Admissibility of DNA Statistics

DNA evidence is typically accompanied by impressive statistics that purport to show the meaning (and power) of a DNA match. The most common statistic is the random match probability (RMP), which is an estimate of the probability that a randomly selected person from some reference population (e.g., Caucasians, African-Americans, Hispanics) would “match” or be “included as a potential contributor” to an evidentiary sample. These statistics can be extremely impressive — random match probabilities of one in billions, trillions, or even quadrillions and quintillions are typical, although more modest estimates of one in thousands, hundreds, or even less, are sometimes offered in cases where limited DNA profile information was detected in the evidentiary sample. In cases involving mixed DNA samples (samples with more than one contributor) labs sometimes compute another statistic called a combined probability of inclusion or exclusion (CPI or CPE), which reflects the relative probability of obtaining the observed results if the defendant was (and was not) a contributor.

The scientific community has long recognized the need for statistics in connection with DNA evidence. In 1992, the National Research Council, in its first report on DNA evidence, declared:

To say that two patterns match, without providing any scientifically valid estimate (or at least, an upper bound) of the frequency with which such matches might occur by chance, is meaningless. … DNA “inclusions” cannot be interpreted without knowledge of how often a match might be expected to occur in the general population.1

The Scientific Working Group on DNA Analysis Methods (SWGDAM), a group of forensic scientists chosen by the FBI to propose guidelines for DNA testing, has also declared that statistics are an essential part of forensic DNA analysis. Its most recent guidelines (approved in January 2010) include the following: “The laboratory must perform statistical analysis in support of any inclusion that is determined to be relevant in the context of the case, irrespective of the number of alleles detected and the quantitative value of the statistical analysis.” (Interpretation Guideline 4.1).2

Courts in most jurisdictions will not admit DNA evidence unless it is accompanied by statistics to give meaning to the finding of a match. As one court explained, “[w]ithout the probability assessment, the jury does not know what to make of the fact that the patterns match: the jury does not know whether the patterns are as common as pictures with two eyes, or as unique as the Mona Lisa.”3 In many jurisdictions the law also requires, as a condition of admissibility, that the statistics presented in connection with the DNA evidence be accurate and reliable. In other words, the validity of the statistics is an issue going to the admissibility of DNA evidence, not just its weight.4

Since the mid-1990s, proponents of DNA evidence have generally had little difficulty establishing the admissibility of DNA statistics. As we will show, however, most laboratories are using statistical methods that are valid only in cases where the test results are reliable and accurate — which means cases where it can safely be assumed that all alleles (genetic characteristics) of the individuals who contributed DNA to the evidentiary samples have been detected. A growing number of laboratories are expanding their use of DNA testing — using it to examine limited and degraded samples where it is not possible to obtain complete results. In those cases, the standard statistical methods, which have long been accepted for the typical case, can be highly misleading.

II. Computing DNA Statistics in Cases With Ample Quantities of DNA and Clear Results

Forensic DNA testing in the United States is generally done using commercial test kits that examine at least 12 locations (loci) on the human genome where there are STRs (short tandem repeats). At each STR a person will have two “alleles” (genetic markers), one inherited from each parent. If two DNA samples are from the same person, they will have the same alleles at each locus examined; if two samples are from different people, they will almost always have different alleles at some of the loci. The goal of forensic DNA testing is to detect the alleles present at the tested loci in evidentiary samples so that they can be compared with the alleles detected at the same loci in reference samples from possible contributors.5
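As a minimal sketch of the comparison just described (using hypothetical three-locus genotypes, not real casework data), a suspect is included only if his alleles match the evidentiary alleles at every examined locus:

```python
# Hypothetical three-locus profiles; locus names follow those used in Figure 1.
evidence = {"D3S1358": {14, 15}, "vWA": {17, 18}, "FGA": {23, 24}}

suspects = {
    "Suspect 1": {"D3S1358": {15, 16}, "vWA": {17, 18}, "FGA": {23, 24}},
    "Suspect 3": {"D3S1358": {14, 15}, "vWA": {17, 18}, "FGA": {23, 24}},
}

def included(reference, evidentiary):
    """Conventional (strict) matching: identical alleles at every locus."""
    return all(reference[locus] == evidentiary[locus] for locus in evidentiary)

for name, profile in suspects.items():
    print(name, "is", "included" if included(profile, evidence) else "excluded")
```

Note that this strict rule is the one that applies only when the test can be trusted to have detected every allele; the later sections describe what happens when that assumption fails.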

DNA tests produce computer-generated charts (called electropherograms) in which “peaks” represent the alleles. Figure 1 shows results for five different samples — the top chart shows the alleles found in blood from a crime scene; the lower four charts show the alleles found in reference samples from four suspects. The charts show results for just three of the loci normally examined by the test — these three loci are called D3S1358, vWA, and FGA. At each locus the computer assigns a number to each peak (based on its position, which in turn is determined by its size) to indicate which of the possible alleles it represents.

Figure 1: Electropherograms showing DNA profiles of five samples at three loci

A quick look at the charts will show that only Suspect 3 has the same alleles as the blood from the crime scene — hence, based on these results, Suspect 3 would be “included” as a possible source of the blood, while the other three suspects would be excluded. To know how much weight to assign to such a “match,” labs typically compute the random match probability — that is, the probability that a random person would “match” the alleles found in the blood sample. Table 1 shows how this is done. A lab analyst consults a database to determine the frequency of each of the matching alleles in a reference population that reflects a pool of alternative suspects. (Table 1 shows the frequency of each allele among U.S. Caucasians.) These frequencies are then multiplied together in a particular way. If there are two alleles at a locus, the frequency for the pair (called a genotype) can be determined with the simple formula: 2 x p x q, where p and q are the frequencies of the individual alleles. For locus D3S1358, for example, the allele frequencies are 0.103 and 0.262 (which means that 10.3 percent of the alleles observed for this locus are “14” and 26.2 percent are “15”), so the frequency of the pair of alleles (genotype) is 0.103 x 0.262 x 2 = 0.054, which means that among U.S. Caucasians approximately 1 person in 18.5 would have this genotype. If there is only one allele at a locus (which would occur if the person inherited the same allele from both parents), then the frequency for such a genotype is simply p², where p is the frequency of that allele. The frequencies of the genotypes at each locus are then all multiplied together to produce a frequency estimate for the overall profile.6 The overall frequency (among U.S. Caucasians) of the three-locus DNA profile seen in the blood from the crime scene (and also in Suspect 3) is approximately 0.000218, which means that approximately one person in 4,592 would be expected to have this particular three-locus profile.
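The arithmetic just described can be sketched in a few lines, using the Table 1 frequencies; rounding each genotype frequency to three decimal places mirrors the figures reported in the article:

```python
def genotype_frequency(p, q=None):
    """2pq for a heterozygote; p squared for a homozygote."""
    return p * p if q is None else 2 * p * q

# Allele frequencies for the matching alleles, from Table 1 (U.S. Caucasians).
loci = {
    "D3S1358": (0.103, 0.262),  # alleles 14, 15
    "vWA": (0.281, 0.200),      # alleles 17, 18
    "FGA": (0.134, 0.136),      # alleles 23, 24
}

rmp = 1.0
for locus, (p, q) in loci.items():
    g = round(genotype_frequency(p, q), 3)  # per-locus genotype frequency
    print(f"{locus}: genotype frequency {g} (about 1 in {1 / g:.1f})")
    rmp *= g  # product rule across independent loci

print(f"Overall profile frequency: {rmp:.7f} (about 1 in {1 / rmp:,.0f})")
```

This reproduces the roughly one-in-4,600 figure for the three-locus profile; with all 12 or more loci of a full profile, the same product rule yields the one-in-billions numbers described earlier.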

The profile frequency is often called the random match probability (RMP) because it delivers the answer to a very specific question: the probability that a randomly chosen, unrelated person from the reference population would happen to have the same DNA profile found in an evidence sample. The DNA test places a matching suspect, like Suspect 3, in a category of individuals who might have been the donor of this sample; the profile frequency (RMP) tells us something about the size of this group (in our example, it includes 1 in 4,592 Caucasians, which is a lot of people given that more than 100 million Caucasians live in the United States). The profile frequency does not tell us the probability that Suspect 3 was (or was not) the source of the blood at the crime scene because the DNA evidence cannot tell us whether Suspect 3 was more or less likely to be the donor than any of the other people who would also match. But it does tell us how broad a net was cast by the test — it was broad enough to incriminate about one person in 4,592, just by chance. Hence, in a case like the one illustrated here, it is a useful statistic for characterizing the value of a DNA match and is widely accepted as valid and appropriate by scientific experts. Unfortunately, not all cases are as clear and easy to interpret as this one — for more complex cases, the answer to the question asked by the random match probability (RMP) can be highly misleading.

III. Computing Statistics in Cases With Incomplete or Unreliable Data

In 2001, Bruce Budowle, then a senior scientist at the FBI Crime Laboratory, and some of his colleagues, issued a warning to the DNA testing community about the way they were beginning to use (and abuse) the new STR technology:

Because of the successes encountered with STR typing, it was inevitable that some individuals would endeavor to type samples containing very minute amounts of DNA. … When few copies of DNA template are present, stochastic amplification may occur, resulting in either a substantial imbalance of two alleles at a given heterozygous locus or allelic drop out.7

Budowle’s concern, in short, was that labs were beginning to use STR technology in ways that were unreliable — examining samples so limited or so degraded that the tests could not be counted on to produce a complete and accurate profile.

It is now well accepted in the scientific community that STR tests become unreliable when used to type samples containing too little DNA. Such limited samples are sometimes called low-copy number (LCN) samples because they contain a low number of copies of the targeted DNA. When such samples are examined the STR test may pick up more copies of one allele than another at a particular locus simply as a result of sampling error. This phenomenon, known as a stochastic effect, can distort the resulting DNA profiles in various ways. For example, some alleles simply may not be sampled (allelic drop out); it can also cause two alleles from the same person to have widely different peak heights, which falsely suggests that they came from two different people. Another problem with LCN samples is the detection of spurious alleles due to a phenomenon called “allelic drop in” as well as increased incidence of a complicating artifact known as “stutter.” As Budowle et al. explained in 2001, “[m]ixture analyses and confirmation of a mixture are not reliable with LCN typing, because … imbalance of heterozygote alleles, increased production of stutter products, and allele drop in can occur.”
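A toy simulation (ours, not a lab protocol) illustrates why these stochastic effects appear: when only a handful of template molecules from a heterozygote are sampled for amplification, chance alone can leave one allele entirely unsampled.

```python
import random

random.seed(1)

def sample_templates(n_copies):
    """Randomly draw n template molecules from a 50/50 pool of alleles A and B."""
    draws = [random.choice("AB") for _ in range(n_copies)]
    return draws.count("A"), draws.count("B")

# At high template amounts drop out is essentially impossible; at very low
# template amounts, sampling error alone produces it at an appreciable rate.
for n in (200, 20, 6):
    drop_outs = sum(1 for _ in range(10_000) if 0 in sample_templates(n))
    print(f"{n} template copies: drop out in {drop_outs / 10_000:.1%} of trials")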

Although these underlying problems are widely recognized, some forensic laboratories are pushing the limits of STR testing and ignoring Budowle’s warning. Increasingly, STR testing is used on aged and extremely limited samples, such as “touch DNA” samples, that are likely to yield quantities of DNA in the range where stochastic effects occur. Analysts try to take these effects into account by being more lenient about their standards for declaring a match. They do not require that a suspect’s DNA profile correspond exactly to an evidentiary sample because stochastic effects, or other phenomena associated with low copy number testing, might cause discrepancies. This leniency in matching is necessary to avoid false exclusions, but it makes it difficult if not impossible to estimate the statistical meaning of a “match.”

Table 1: Allele, genotype, and profile frequencies (among U.S. Caucasians) for the blood from the crime scene

Locus                  D3S1358       vWA           FGA
Alleles                14, 15        17, 18        23, 24
Allele frequencies     .103, .262    .281, .200    .134, .136
Genotype frequencies   0.054         0.112         0.036
                       (1 in 18.5)   (1 in 8.9)    (1 in 27.8)
Overall profile frequency (random match probability): 0.0002177, or 1 in 4,592

Consider, for example, the DNA profiles that are compared in Figure 2. The chart on top shows the profile of an evidentiary sample from a sexual assault case in which very little DNA was recovered. The chart on the bottom shows the profile of the defendant. If this were a conventional case, the defendant would be excluded as a possible contributor because his profile obviously differs. One of his alleles at locus D13S317 (allele “14”) was not found in the evidentiary sample. But he was not excluded — in fact, he is now in prison on the strength of this evidence. The lab attributed the discrepancy to “allelic drop out” and concluded that he was a possible contributor notwithstanding the discrepancy.8

This lenient approach to declaring an “inclusion” or “match” is problematic in several ways. First, it depends in part on circular reasoning. There is no way to verify that a discrepancy like the one illustrated in Figure 2 was, in fact, caused by “allelic drop out.” An alternative possibility is that the true contributor of the sample is a homozygote (a person who inherited allele 8 from both of his parents and has no other allele to contribute). The conclusion that allelic drop out occurred here is inferred, in part, from the very fact that it purports to explain — the discrepancy between the evidentiary sample and the defendant’s profile. At best this type of analysis creates uncertainty about the value of DNA evidence; at worst it undermines that value entirely.

The lenient approach to interpretation also creates a second problem — it undermines the objectivity of DNA evidence. The results of a comparison like that shown in Figure 2 depend on an analyst’s subjective judgment about whether drop out did or did not occur, which creates room for expert disagreement. A recent study by Itiel Dror and Greg Hampikian illustrates the degree of subjectivity (and disagreement) that is possible in DNA interpretation.9 They asked 17 qualified DNA analysts from accredited laboratories to evaluate independently the DNA evidence that had been used to prove that a Georgia man participated in a gang rape. The analysts were given the DNA profile of the Georgia man, and were given the DNA test results obtained from a sample collected from the rape victim, but were not told anything about the underlying facts of the case (other than scientific details needed to interpret the test results). The analysts were asked to judge, based on the scientific results alone, whether the Georgia man should be included or excluded as a possible contributor to the mixed DNA sample from the victim. Twelve of the analysts said the Georgia man should be excluded, four judged the evidence to be inconclusive, and only one agreed with the interpretation that had caused the Georgia man to be convicted and sent to prison — i.e., that he was included as a possible contributor to the DNA mixture. The authors found it “interesting that even using the ‘gold standard’ DNA, different examiners reach conflicting conclusions based on identical evidentiary data.”10

Dror and Hampikian also highlighted a third problem with the lenient approach to interpretation required when typing low copy number samples — it creates a potential for bias in the interpretation of DNA tests. Noting that the analyst who testified in the Georgia case had been exposed to investigative facts suggesting that the Georgia man was guilty, Dror and Hampikian suggested that this “domain irrelevant information may have biased” the analyst’s conclusions.11

Other commentators have also documented the potential for bias in the interpretation of DNA test results.12

Figure 2: DNA profile of an evidentiary sample in a sexual assault case and DNA profile of a suspect (The height of each peak in relative fluorescent units — RFU — is shown in the boxes below the labels for the peaks. The top panel is the evidentiary specimen and the bottom panel is the defendant’s reference profile; the loci shown are D5S818, D13S317, and D7S820.)

We will focus here, however, on a fourth problem with the lenient approach to DNA interpretation illustrated in Figure 2 — it makes it difficult, if not impossible, to assess the statistical meaning of the resulting DNA “match.” The standard approach of computing the frequency (in various reference populations) of people who have the same profile as the evidentiary sample will not work here. The frequency of the evidentiary profile is irrelevant when analysts are willing to “include” as possible contributors people (like the defendant) who do not have that profile. When analysts widen the net to include people whose profiles do not match the evidentiary profile perfectly, the frequency of the evidentiary profile no longer reflects — and will necessarily understate — the likelihood that an innocent person will be included by chance.

A few labs have tried (incorrectly) to deal with this problem by computing the frequency, in a reference population, of people who have the alleles that the defendant shares with the evidentiary sample. In other words, they try to compute the frequency of people who would “match” the evidentiary sample the way that the defendant matches. For the case illustrated in Figure 2, they would compute the frequency of people who have the defendant’s genotype at locus D5S818 and locus D7S820, and who have the “8” allele (in combination with any other allele) at locus D13S317. But the net cast by this test has the potential to “include” a far broader group of people than those defined by this method. Suppose, for example, that the defendant had genotype 13,13 rather than 8,13 at locus D7S820. Would he have been excluded? We strongly doubt it. We think that in that case the analyst might simply have concluded that the “8” allele observed at that locus was caused by “allelic drop in” or by DNA of a second contributor. Analysts are often reluctant to “exclude” a suspect unless there is no plausible way to account for discrepancies between his profile and the evidentiary sample.13 When the analyst can invoke allelic drop out, drop in, secondary contributors, and various other explanatory mechanisms to account for discrepancies, however, the number of profiles that might plausibly be consistent with the evidentiary sample can expand rapidly. This means the net cast by the test is far broader than the category of individuals who match the way the defendant matches.
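The net-widening effect described above can be quantified with a small simulation. The allele frequencies below are hypothetical, and the “lenient” rule (any non-matching allele can be blamed on drop out) is a deliberate simplification of the reasoning the article describes:

```python
import random

random.seed(0)

# Hypothetical allele frequencies at a single locus; the evidence shows only
# allele "8" (as at D13S317 in Figure 2). Illustrative numbers, not real data.
freqs = {"8": 0.10, "11": 0.30, "12": 0.35, "13": 0.25}
alleles, weights = zip(*freqs.items())

def random_genotype():
    return (random.choices(alleles, weights)[0], random.choices(alleles, weights)[0])

N = 100_000
strict = lenient = 0
for _ in range(N):
    g = random_genotype()
    if g == ("8", "8"):  # strict: genotype must match the evidence exactly
        strict += 1
    if "8" in g:         # lenient: any missing allele is attributed to drop out
        lenient += 1

print(f"Included under strict matching:  {strict / N:.3f}")
print(f"Included under lenient matching: {lenient / N:.3f}")
```

Under these assumptions the lenient rule includes roughly 19 percent of random noncontributors at this locus, versus 1 percent under the strict rule — a nineteen-fold widening of the net at a single locus, which the evidentiary-profile frequency never reflects.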

Another approach labs have taken to address this problem is to base statistical computations only on loci where all of the suspect’s alleles were detected in the evidence, ignoring (for statistical purposes) any locus where the suspect does not match. But two prominent experts on forensic statistics have recently published an article calling this approach unsupportable.14 Using statistical modeling, they showed that “this approach may produce apparently strong evidence against a surprisingly large fraction of noncontributors.”15 It allows labs to ignore potentially exculpatory results (the loci where the suspect does not perfectly match) while counting against him those where he does match, and this cherry-picking of data greatly expands the net cast by the test in ways that are not adequately reflected in inclusion statistics. Such an approach answers a substantially different question than the one addressed by the random match probability. It essentially asks: “What is the chance that a randomly chosen individual from a given population would match the defendant if we focus only on evidence consistent with a match and ignore evidence to the contrary?” When analysts present such statistics to the jury, however, they typically (and misleadingly) describe them as “random match probabilities.”

The fundamental problem facing those who try to design statistical procedures for such cases is that no one knows how broad the net cast by the test really is. Estimating the percentage of the population who would be “included” as a possible contributor is like estimating the length of a rubber band. Just as a rubber band may be longer or shorter, depending on how far one is willing to stretch it, the size of the “included” population may be larger or smaller, depending on how leniently or strictly the analyst defines the criteria for an “inclusion.” Because it is unclear just how far the laboratory might stretch to “include” a suspect, the true size of the “included” population cannot be determined.

In some cases, forensic laboratories try to deal with this problem by using statistics known as likelihood ratios rather than frequencies and RMPs. To compute a likelihood ratio one estimates how probable the observed results (in the evidentiary sample) would be if the defendant was (and was not) a contributor. Because likelihood ratios focus on the probability of obtaining the exact results observed (under different hypotheses about how they arose), they avoid the difficulty of estimating the size of the “included” group. But they run smack into a related problem. To compute an accurate likelihood ratio, in a case like that shown in Figure 2, one must know the probability that allelic drop out (and drop in) occurred. Without knowledge of the drop out probability, one cannot know the probability of obtaining the observed results if the defendant was a contributor, which means one cannot compute a likelihood ratio (at least not accurately). It is well understood that as the quantity of DNA in a sample decreases, the probability of drop out and drop in increases. But, as the quantity of DNA decreases, the ability to reliably estimate its quantity also decreases. So estimates of the probability of drop out or drop in are often little more than guesses, which we find unacceptable given that the results of the DNA test — whether it is reported as a powerful incrimination or a definitive exclusion — may depend on what guess the expert happens to make.
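To see how strongly a likelihood ratio can depend on the assumed drop-out probability, consider a simplified model of the Figure 2 situation: the evidence shows a single “8” peak at a locus, and the suspect’s genotype there is 8,14. The allele frequency and the per-allele-copy drop-out model below are our own illustrative assumptions, not figures from the case:

```python
p8 = 0.10  # assumed population frequency of allele 8 (hypothetical)

def likelihood_ratio(d):
    """LR for observing only allele "8", given per-copy drop-out probability d."""
    # Numerator: suspect (8,14) contributed -> 8 detected, 14 dropped out.
    numerator = (1 - d) * d
    # Denominator: an unknown person contributed and only "8" was observed:
    #   homozygote (8,8): at least one of the two copies detected, or
    #   heterozygote (8,x): the 8 detected and the other allele dropped out.
    denominator = p8**2 * (1 - d**2) + 2 * p8 * (1 - p8) * (1 - d) * d
    return numerator / denominator

for d in (0.01, 0.10, 0.50):
    print(f"assumed drop-out probability {d:.2f}: LR = {likelihood_ratio(d):.1f}")
```

With these assumptions the same evidence swings from mildly exculpatory (LR below 1 when drop out is assumed rare) to moderately incriminating (LR near 5 when drop out is assumed common) — which is exactly why a guessed drop-out probability is so troubling.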

In 2006, a DNA Commission of the International Society of Forensic Genetics proposed a statistical method that takes drop out probabilities into account when computing a likelihood ratio. Unfortunately, this approach is difficult to apply even in simple cases due to uncertainty about the drop out probability.16 Matters quickly become even more complicated when formulae incorporating drop out probabilities are applied to mixtures. The Commission acknowledged that “[e]xpansion of these concepts to mixtures is complex and that is why they are not generally used.” But that, of course, leaves forensic DNA testing in the unsatisfactory position of having no generally accepted method for computing statistics in such cases.

A more defensible way to deal with problems arising from stochastic effects is for the lab to ignore for statistical purposes any locus where it is suspected that stochastic effects (leading to drop out) may have occurred, whether or not the suspect “matches” at that locus. One clue to whether the sample in question is subject to stochastic effects is the height of the “peaks” seen in the electropherogram. Peak heights are measured in “relative fluorescent units” (RFU) and their height can be determined by reference to the vertical indices on an electropherogram. As a general policy, some labs ignore for statistical purposes any results from a locus where there are peaks below a “stochastic threshold,” which is often set somewhere between 50 and 150 RFU. That means that (for statistical purposes) they rely only on loci where they are willing to assume that all of the contributors’ alleles have been detected.
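A sketch of that policy follows (the peak heights are invented for illustration, and the 100 RFU cutoff is one choice within the 50 to 150 RFU range mentioned above):

```python
STOCHASTIC_THRESHOLD = 100  # RFU; labs commonly choose a value between 50 and 150

profile = {  # locus -> detected peak heights in RFU (hypothetical values)
    "D5S818": [812, 790],
    "D13S317": [95],        # single low peak: drop out cannot be ruled out here
    "D7S820": [640, 58],    # one peak falls below the threshold
}

# Keep only loci where every peak clears the threshold, i.e., where it is
# reasonable to assume that all of the contributors' alleles were detected.
usable = {
    locus: peaks
    for locus, peaks in profile.items()
    if all(height >= STOCHASTIC_THRESHOLD for height in peaks)
}
print("Loci used for statistics:", sorted(usable))
```

Only D5S818 survives the filter in this example; the statistic is then computed from such loci alone, regardless of whether the suspect “matches” at the discarded loci.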

“Stochastic thresholds” are not a perfect solution to the problem posed by unreliable DNA data because they may lead the lab to ignore potentially exculpatory data, such as the mismatch seen at locus D13S317 in Figure 2. And experts disagree about what the “stochastic threshold” should be. But this approach is less likely than other approaches to produce results that are unfairly biased against a suspect. Unfortunately, the conservatism of this approach is the very reason that many laboratories reject it; this approach reduces the impressiveness of the statistics that can be presented in connection with a DNA match.

When there is no valid way to compute a statistic, or when the only valid method produces statistics that are modest and unimpressive, labs sometimes elect to present DNA evidence without statistics. The jury is told that the defendant’s profile “matches” the evidentiary profile, or that the defendant “cannot be excluded,” but is given no statistics that would allow them to assess the strength of this evidence. This approach violates the SWGDAM guidelines as well as the legal requirement in many jurisdictions that DNA evidence is inadmissible without statistics. Defense lawyers sometimes raise no objection, however, thinking that without statistics the DNA evidence will do less harm, and fearing that an objection may prompt the government to generate a statistic that will prove more incriminating. The danger of this approach is that even in the absence of statistics the jury will assume that the DNA evidence is highly probative, which could easily cause the jury to over-value the problematic kind of DNA evidence we are discussing here.

Glossary

Allele (peak): One of two or more alternative forms of a gene; a peak appears on an electropherogram for each allele that is detected.

Bayes’ Rule: A mathematical equation that describes how subjective estimates of probability should be revised in light of new evidence.

Degradation: The chemical or physical breaking down of DNA.

Domain irrelevant information: Information that should have no bearing on an expert’s interpretation of scientific data; generally from outside of an expert’s area of expertise.

Drop in: Detection of an allele that is not from a contributor to an evidence sample, usually due to low levels of contamination.

Drop out: Failure to detect an allele that is actually present in a sample, usually due to small amounts of starting material.

Electropherogram: The output of a genetic analyzer, typically displayed as a graph where individual peaks correspond to the presence of alleles detected in a tested sample.

Likelihood ratio: A statistic reflecting the relative probability of a particular finding under alternative theories about its origin.

Locus (pl. loci): The physical location of a gene on a chromosome.

Low-copy number (LCN)/low-template (LT) DNA: DNA test results at or below the stochastic threshold.

Monte Carlo-Markov Chain (MCMC) modeling: A computer-intensive statistical method that proposes millions of possible scenarios that might have produced the observed results, computes the probability of the observed results under each scenario, and uses the resulting distributions (and Bayes’ Rule) to determine which scenarios best explain the observed results.

Random match probability (RMP): The probability that a randomly chosen unrelated individual would have a DNA profile that cannot be distinguished from that observed in an evidence sample.

IV. Computers to the Rescue? The TrueAllele® Casework System

A Pittsburgh company called Cybergenetics has been marketing an automated system for interpreting DNA evidence that relies on high-powered computers, and a form of Bayesian analysis called Markov chain Monte Carlo (MCMC) modeling, to draw conclusions about the profiles of possible contributors to evidentiary DNA samples. Promoters of this system claim that it is an objective and scientifically valid method for assessing the statistical value of DNA evidence. They claim it can be used in all types of cases, including problematic cases in which sample limitations render the test results less than perfectly reliable. Better yet, the system often produces statistics that are even more impressively incriminating than the statistics produced by conventional methods. This sales pitch appears to be working. The company has sold its system (which consists of software and associated computer hardware) to several forensic laboratories for use in routine casework. It has also helped forensic laboratories come up with statistical estimates in a number of specific cases. In a few instances it has helped defense lawyers by reanalyzing evidence in order to see whether TrueAllele® agreed with the interpretation offered by a human forensic analyst. Because evidence generated by this system is increasingly appearing in courtrooms, it is important that lawyers understand its strengths and limitations.

The system relies on a form of statistical modeling (called MCMC) that has been widely used in the field of statistics to model complex situations. Although the application of this technique to forensic DNA testing is novel, the underlying approach has been used successfully elsewhere. The computer is programmed to make certain assumptions about how forensic DNA tests work, as well as how they fail to work. The assumptions allow the computer to predict how an electropherogram will look when the sample tested has DNA from an individual (or individuals) with specific profiles, how the pattern of peaks should change with variations in the quantity of DNA from each contributor, with variations in the degree to which the DNA is degraded, etc. The assumptions cover such issues as when allelic drop out and drop in would be expected to occur, and how probable these phenomena are under various conditions. Based on the assumptions that are programmed into the system, it can predict the “output” of a forensic DNA test for any given set of “inputs.” In other words, it can predict the probability that a forensic DNA test will produce electropherograms showing a particular pattern of peaks, given that the sample tested contained DNA of an individual (or individuals) who have specific DNA profiles, and given various other assumptions about the quantity and quality of the samples tested. Whether the predictions it makes are accurate is a matter we will consider in a moment; but there is no doubt that the computer can make such predictions in a consistent manner.

In order to analyze an evidentiary sample, such as the one shown in Figure 2, the computer is programmed to propose millions of possible scenarios that might have produced the observed results and then to compute the probability of the observed results under each scenario. The scenarios include all possible combinations of explanatory variables (i.e., genotypes of contributors, mixture proportion, degradation, stutter, etc.). Each scenario is, effectively, a hypothesis about a set of factors that might explain the results; it is evaluated relative to other hypotheses in terms of how well it fits the observed data.

Most hypotheses are effectively ruled out because they cannot explain the observed results. As more and more hypotheses are tested, however, the system identifies a limited number of hypotheses that might possibly explain the data. If all of those hypotheses require that the contributor have the same genotype, then the system assigns a probability of one (certainty) to that genotype. To give a simple example, if such a system assumed a single contributor to the evidentiary sample shown in Figure 2, and tried out various hypotheses about how that single contributor could have produced the observed results at locus D5S818, it would conclude that the contributor must have genotype 8, 12; any other hypothesis would not fit the data. Hence, it would assign that genotype a probability of one.
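The hypothesis-elimination step just described can be sketched in a few lines. This toy version (our illustration, not TrueAllele®'s code) assumes a single contributor and no drop out or drop in, so a genotype "explains" the data only if its alleles are exactly the alleles observed.

```python
from itertools import combinations_with_replacement

def consistent_genotypes(observed_alleles, candidate_alleles):
    """List every single-contributor genotype whose alleles exactly match
    the observed alleles (no drop out or drop in allowed)."""
    observed = set(observed_alleles)
    return [g for g in combinations_with_replacement(sorted(candidate_alleles), 2)
            if set(g) == observed]

# At D5S818 the evidence shows alleles 8 and 12; try every genotype built
# from a handful of candidate alleles at this locus (candidates are illustrative).
print(consistent_genotypes({8, 12}, {7, 8, 11, 12, 13}))   # [(8, 12)]
```

Only genotype 8, 12 survives, which is why the system would assign it a probability of one under these assumptions.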

For a locus like D13S317, the analysis would be more complicated. The hypothesis that there was a single contributor with genotype 8, 8 would fit the data, but to the extent allelic drop out is possible, the data could also be explained by a single contributor with genotype 8, x, where “x” is any other allele at this locus. In that case, the system would compute the probability of each of the possible genotypes by using a formula known as Bayes' Rule to combine data on the frequency of the genotype in the population with the system's estimates of the probability of obtaining the observed results if the contributor had that genotype. For example, the system would compute the probability that the contributor had genotype 8, 14 (and therefore matched the defendant) by considering both the underlying frequency of this genotype in the population and the probability of obtaining the observed results (an 8 allele with a peak height of 80) if the single contributor had genotype 8, 14. To do this, the computer would obviously need to estimate the probability that drop out (and/or drop in) occurred in this case. It would make this estimate based on pre-programmed assumptions about such matters as the relationship between peak height and drop out (and drop in: how sure can we be that the contributor at the D5S818 locus was 8, 12 and not really 8, 8 or 12, 12?). As suggested above, the accuracy of these assumptions will be a key issue when evaluating TrueAllele® and similar systems.
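The Bayes' Rule step can be made concrete with a hedged sketch. The allele frequencies and the drop-out probability below are invented for illustration; they are not TrueAllele®'s values or real population data.

```python
from itertools import combinations_with_replacement

P_DROPOUT = 0.3                                          # assumed chance a true allele went undetected
allele_freq = {8: 0.30, 11: 0.25, 12: 0.25, 14: 0.20}    # hypothetical frequencies at this locus
OBSERVED = {8}                                           # only an 8 allele appeared (as at D13S317)

def prior(genotype):
    # Hardy-Weinberg genotype frequency: p^2 for homozygotes, 2pq otherwise.
    a, b = genotype
    return allele_freq[a] ** 2 if a == b else 2 * allele_freq[a] * allele_freq[b]

def likelihood(genotype):
    # P(seeing exactly the OBSERVED alleles | this genotype), allowing drop out
    # but not drop in.
    if not OBSERVED <= set(genotype):
        return 0.0   # an observed allele absent from the genotype would require drop in
    p = 1.0
    for allele in set(genotype):
        p *= (1 - P_DROPOUT) if allele in OBSERVED else P_DROPOUT
    return p

genotypes = list(combinations_with_replacement(sorted(allele_freq), 2))
weights = {g: prior(g) * likelihood(g) for g in genotypes}
total = sum(weights.values())
posterior = {g: w / total for g, w in weights.items() if w > 0}

for g, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(g, round(p, 3))
# Genotype 8, 8 is most probable; the defendant's 8, 14 gets probability 1/6
# under these invented numbers.
```

Changing the assumed drop-out probability changes every posterior value, which is exactly why the scientific foundation for those pre-programmed assumptions matters so much.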

Glossary (continued)

Scientific Working Group on DNA Analysis Methods (SWGDAM): A group of forensic scientists from Canadian and U.S. crime laboratories appointed by the director of the FBI to provide guidance on crime laboratory policies and practices.

Stochastic effects: Random fluctuations in testing results that can adversely influence DNA profile interpretation (e.g., exaggerated peak height imbalance, exaggerated stutter, allelic drop out, and allelic drop in).

STR (short tandem repeat) testing: Testing of loci where alleles differ in the number of times that a string of four nucleotides is tandemly repeated.

Stutter: A spurious peak that is typically one repeat unit less (or more) in size than a true allele. Stutter arises during DNA amplification because of strand slippage.

Greater complications arise if the system is instructed to assume there could have been more than one contributor. If there are two contributors, then many additional hypotheses become plausible candidates for explaining the data. The results at locus D5S818, for example, could be explained if one contributor has genotype 8, 8 and the other contributor has genotype 12, 12; but might also be explained by two contributors who both have genotype 8, 12. Or perhaps one contributor has genotype 8, 12 and the other has genotype 8, x, where x dropped out, and so on. To compute the probability of the genotypes, the computer would need to consider every possible combination of explanatory variables and compute the probability of the observed results under each possible scenario for every possible combination of all of the other explanatory variables. This requires the computer to make enormous numbers of calculations, which is why the method requires high-powered computers and analysis of a case often requires many hours (or even days) of computer time.
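The multiplication of two-contributor hypotheses can be seen even in a toy enumeration (ours, for illustration). With no drop out or drop in allowed, four distinct genotype pairings already explain the alleles 8 and 12 at D5S818; allowing drop out of some allele x would add many more.

```python
from itertools import combinations_with_replacement

def explains(pair, observed):
    # True when the combined alleles of the two genotypes equal the observed set.
    combined = set(pair[0]) | set(pair[1])
    return combined == observed

# Genotypes built only from the observed alleles 8 and 12 (no drop out/in allowed):
candidates = list(combinations_with_replacement([8, 12], 2))
pairs = [p for p in combinations_with_replacement(candidates, 2)
         if explains(p, {8, 12})]
for p in pairs:
    print(p)
# Four pairings fit, including (8,8)+(12,12) and (8,12)+(8,12), matching the
# scenarios described in the text.
```

In the real system each of these pairings must also be evaluated against every combination of mixture proportion, degradation, stutter, and so on, which is where the hours of computer time go.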

Once the system has computed the probabilities of the various possible genotypes at each locus for the evidentiary sample, those results are compared with the suspect's genotypes. If any of the suspect's genotypes has been assigned a probability of zero, then the suspect is excluded. If all of the suspect's genotypes have a probability greater than zero, then the suspect is included as a possible contributor. The statistic used to describe the value of this “inclusion” is a likelihood ratio that indicates how much more probable the observed results would be (based on TrueAllele®'s calculations) if the contributor was the suspect than if the contributor was a random person.17
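One common way to express such a likelihood ratio, sketched below with invented numbers (the 1/6 posterior and 12 percent frequency are illustrative, not from any case), is the probability assigned to the suspect's genotype divided by that genotype's frequency in the population.

```python
import math

def likelihood_ratio(posterior_prob, population_freq):
    # How much more probable the suspect's genotype is under the modeling
    # than its background frequency alone would make it.
    return posterior_prob / population_freq

lr = likelihood_ratio(1/6, 0.12)   # hypothetical single-locus numbers
print(round(lr, 2))                # 1.39

# Per-locus ratios multiply across independent loci to give the overall figure:
overall = math.prod([1.39, 18.0, 250.0])   # hypothetical per-locus ratios
print(round(overall))                      # 6255
```

This multiplication across loci is how modest per-locus numbers compound into the billions-to-one figures juries hear.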

Is TrueAllele® a Valid Method?

As with any type of computer modeling, the accuracy of TrueAllele® depends, in part, on the accuracy of the underlying assumptions. The system is designed to take into account the possibility of phenomena like allelic drop out, drop in, and “stutter,” but does it really know the probability that those phenomena occurred in a particular case? One expert who commented on automated systems of this type noted that a key problem is that knowledge of such matters is limited.18 So a central issue, when considering the admissibility of this method in court, will be the scientific foundation for the assumptions on which the model relies. Before accepting evidence produced by such a system, courts should demand to see a careful program of validation that demonstrates the system can accurately classify mixtures of known samples under conditions comparable to those that arise in actual forensic cases. The fact that an automated system can produce answers to the questions one puts to it is no assurance that the answers are correct. While automated systems appear promising, their ability to handle “hard cases” like those discussed in this article remains to be fully evaluated.

Those promoting TrueAllele® have published a number of articles that explain the theoretical background of the method and describe the validation performed to date.19 The validation research focuses on two aspects of the system's performance: “efficacy” and reproducibility. “Efficacy” is measured by comparing the likelihood ratios computed by TrueAllele® to likelihood ratios derived from conventional methods like the CPI and CPE. These studies show that TrueAllele® generally produces much larger likelihood ratios than conventional methods. Its “efficacy advantage” stems from the fact that it considers more information when making calculations than the conventional methods do. The conventional methods generally consider only whether an allele is present or absent in a sample; TrueAllele® also considers the height of the underlying peak and the presence or absence of technical artifacts that often accompany actual alleles.


Greater efficacy is not, however, the same thing as greater accuracy. A gasoline gauge that tells you there are 100 gallons in your tank would, by this definition, have more “efficacy” than a gauge that tells you there are only 10 gallons. Before deciding which gauge to rely on, however, you would want to know which one produced an accurate reading of the amount of gas actually in the tank; in other words, you would need to know which gauge is valid and accurate. Although studies have shown that TrueAllele® produces more impressive numbers than other methods, these studies do not address the question of whether those numbers are valid and accurate.

A second line of validation research focuses on whether the results of TrueAllele® are reproducible. In other words, does the system produce the same likelihood ratio each time it is run on a particular sample? The answer is “no.” Because there are random elements in the way the system does its modeling, such as the random choice of which hypotheses to consider, and in which order, no two computer runs will be exactly the same. Proponents argue that the variation in the answers produced by the system is small enough not to matter although, as discussed below, that assertion has been challenged in recent litigation.

One limitation of all the published TrueAllele® validation studies is that the number of samples tested was relatively small. Some studies included forensic samples from laboratory casework, which have the advantage of being realistic, but are somewhat problematic for validation because the exact number of contributors and their profiles cannot be known with certainty. We would like to see additional studies that create and test samples from known contributors where the samples have sufficiently low levels of DNA that allelic drop out is likely to occur. Although the existing validation demonstrates that TrueAllele® can, in some cases, make inferences about the genotypes of contributors that human analysts would have difficulty making, we think the existing validation is insufficient to prove that TrueAllele® can consistently make correct genotype inferences in challenging, problematic cases such as mixture cases with unequal contributions from the contributors, limited quantities of DNA, degradation due to environmental insult, etc. And the published validation studies have not yet answered the more difficult question of whether the likelihood ratio estimates produced by the system are appropriate. Consequently, grounds may well exist to challenge the admissibility of results from TrueAllele® under either the Frye or Daubert standards, although (to our knowledge) no such challenges have yet been mounted.

Issues have also arisen concerning the way the TrueAllele® system has been used in particular cases and the way in which its results have been reported. For example, in a case in Belfast, Northern Ireland,20 testimony revealed that the company ran the software four separate times on a particular electropherogram and produced four different likelihood ratios for incriminating the defendant: 389 million, 1.9 billion, 6.03 billion, and 17.8 billion.21 These varying results illustrate the issue of reproducibility discussed above: they show that there is an element of uncertainty (a margin of error) in the likelihood ratios generated by the system. How this uncertainty is handled, when reporting the results to the jury, is an issue on which experts may well differ. Cybergenetics reported that the likelihood ratio for the electropherogram was six billion, a number that the company's president defended on grounds that it was the center of the range of values produced in the four runs. A defense expert (Laurence Mueller, co-author of this article) contended that the company should have taken account of the margin of error by computing a 95 percent confidence interval based on the four estimates and reporting its lower bound, which would have yielded an estimate of 214 million (lower than the reported value by a factor of 28). While all the numbers for this particular evidence sample were highly incriminating, a difference of this size might well be consequential in a cold-hit case in which there is little other evidence to incriminate the defendant.
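A calculation along the lines the defense expert proposed can be reconstructed as follows. This is our sketch of one standard approach (a Student's t interval for the mean on the log scale), not necessarily the exact computation performed in the case, but it lands close to the 214 million figure.

```python
import math

runs = [389e6, 1.9e9, 6.03e9, 17.8e9]     # the four reported likelihood ratios
logs = [math.log10(x) for x in runs]       # work on the log scale

n = len(logs)
mean = sum(logs) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (n - 1))
se = sd / math.sqrt(n)

T_CRIT_95_DF3 = 3.182    # two-sided 95% Student's t critical value, 3 degrees of freedom

lower = 10 ** (mean - T_CRIT_95_DF3 * se)
print(f"lower bound of 95% CI: {lower:.2e}")   # roughly 2e8, near the 214 million figure
```

The spread of four runs on the same electropherogram, from 389 million to 17.8 billion, is what makes the choice of summary statistic consequential.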

More important, perhaps, was another issue that arose in the same case. A critical evidentiary sample in the case, a swab from a cell phone, was amplified three times in an effort to produce complete results. Each time the result was imperfect: each electropherogram showed a slightly different set of alleles, suggesting that drop out and perhaps drop in had occurred. When TrueAllele® was run on the results of all three amplifications (treating them as samples of the same underlying set of alleles), TrueAllele® produced a likelihood ratio for incriminating the defendant of 24,000. Cybergenetics elected not to report this likelihood ratio; it instead reported the likelihood ratio of six billion, which was based on one of the three amplifications. When questioned about this decision, the company president stated that he thought that the particular amplification he had chosen to rely upon when reporting statistics was the most informative of the three, and that relying on the three amplifications together was less informative than relying on the one he deemed the best. This decision was criticized by defense experts, who suggested that the company had cherry-picked its data, deciding that one of the amplifications was more “informative” because it was more incriminating for the defendant. This dispute suggests that even the use of a purportedly objective computer-based method for interpretation and statistical computation will not end questions about the potential for analytic bias in DNA testing. It is unclear, however, whether this criticism had any impact on the judge who was trying the case. He convicted defendant Shiver but acquitted Duffy after a bench trial.22

To reduce the potential for analytical bias, a defense lawyer in one case asked Cybergenetics to analyze the evidentiary samples and derive the DNA profile (or profiles) of possible contributors without knowing the profiles of the defendant or any other suspects. To its credit, the company agreed to perform the testing in this “blind” manner. The company was told the profiles of the defendant and other possible contributors (so that the company could draw conclusions about the statistical value of “inclusions”) only after it had provided an initial report regarding the genotypes of possible contributors. The dispute in the Belfast case about whether the company had cherry-picked its data might well have been eliminated had the company chosen which evidentiary samples were most “informative” without having access to information about the profiles of any suspects. In our opinion, lawyers who choose to have samples analyzed by an automated system (or, indeed, by any independent form of review) would be well advised to ask that it be done in a manner that keeps the analyst “blind” to the profiles of suspects until after the evidentiary samples are fully interpreted.

V. Identifying Problematic Cases

One of the goals in writing this article was to alert lawyers to the special problems that arise in cases in which sample limitations affect the reliability of DNA profiles. In such cases there is a special need for expert assistance to help lawyers evaluate the statistical significance of the evidence, and grounds may well exist for challenging the admissibility of some findings. But how can lawyers tell whether the DNA evidence in a case they are handling falls into the problematic


category being discussed here? There are several important clues:

• LCN cases. If the laboratory identifies the case as one involving “low copy number” (LCN) testing (sometimes also called “low template,” “trace,” or “touch” DNA testing), then the lawyer can be certain that it raises the statistical issues discussed in this article. It is important to understand, however, that not all cases raising these issues are identified as LCN cases by forensic laboratories. Laboratories sometimes apply the label LCN only when they employ special testing procedures designed to increase the sensitivity of DNA testing (in order to find results with samples that are otherwise too limited for STR testing). But laboratories frequently encounter problems like allelic drop out even when they do not employ these procedures, so one should not assume that the problems discussed here are limited to LCN cases.

• The samples tested contain small quantities of DNA. Forensic STR tests are designed to work best when they test samples that yield 0.5 to 1 ng (nanogram) of DNA for analysis. When preparing samples for STR testing, labs typically perform an operation known as quantitation in order to estimate the amount of DNA available. They then try to concentrate or dilute the samples to the proper level for STR testing. Sometimes, however, the estimated quantity of DNA in the samples is too low to achieve the ideal concentration. In such cases labs often go ahead with the test anyway and hope for the best. It is well known, however, that stochastic effects become more likely as the concentration of DNA decreases. If it is below 250 pg (picograms), and certainly if it is below 100 pg, which is one-tenth of a nanogram, then most experts agree that stochastic effects are almost certain to be in play. The moral of this story for lawyers is that it is important to know the lab's estimates of the quantities of DNA in the samples tested, which can be found in the laboratory case file or bench notes. But these estimates are not always accurate (especially as they get into the 100 pg range and below), so it is important to attend to the other clues listed here as well.

• Defendant is identified as a possible “minor contributor” to a mixed sample. In mixed samples there sometimes is enough DNA to produce a reliable profile for the major contributor, but not enough to produce reliable profiles for minor contributors. Suppose, for example, that a sample containing 1 ng is a 10:1 mixture of DNA from a major and minor contributor. The quantity of DNA from the minor contributor will be less than 100 pg, which means that it will almost certainly be plagued by stochastic effects.

• One or more alleles of alleged contributors were not detected in the evidentiary sample. If the laboratory must invoke the theory of allelic drop out (and/or drop in) in order to avoid excluding putative contributors, then the lawyer can be certain that the case falls in the problematic category discussed here.

• Low peak heights. As noted above, the height of the peaks in electropherograms usually corresponds to the quantity of DNA present from a particular contributor. As the peaks associated with that contributor get shorter, the chance of stochastic effects increases. Stochastic effects are most common when peak heights are below 150 RFU.
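Two of these clues reduce to simple arithmetic. The sketch below (our illustration; the cut-offs are the rules of thumb quoted above, not universal standards) triages a sample by template quantity and works through the minor-contributor calculation from the mixture clue.

```python
def stochastic_risk(template_pg):
    # Cut-offs echo the 250 pg / 100 pg rules of thumb in the text;
    # labs differ on the exact values.
    if template_pg < 100:
        return "stochastic effects almost certain"
    if template_pg < 250:
        return "stochastic effects likely"
    return "within normal range"

# Minor-contributor arithmetic: a 1 ng sample that is a 10:1 mixture leaves
# the minor contributor with less than 100 pg of template.
total_pg = 1000
minor_pg = total_pg / 11          # 10 parts major + 1 part minor
print(round(minor_pg), stochastic_risk(minor_pg))   # 91 stochastic effects almost certain
```

So even a sample with an apparently healthy total quantity of DNA can put a minor contributor squarely in the problematic range.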

Lawyers who see any of these clues in a case they are handling should take special care in evaluating the DNA evidence, and particularly the associated statistics. Expert assistance may well be required to make an adequate evaluation, and consideration should be given to challenging the admissibility of problematic testimony as well as attacking it at trial. Although DNA evidence has long been regarded as the gold standard of forensic science, lawyers should not assume that the tests were performed or interpreted properly in every case.

VI. Admissibility Challenges and Regulatory Reform

In 2009 the National Research Council issued a scathing report about the state of forensic science in the United States. It found that entire disciplines rest on deficient scientific foundations, that procedures routinely used for interpretation are lacking in rigor, that analysts take inadequate measures to avoid error and bias, and that they testify with unwarranted certainty. The report noted that the legal system has been “utterly ineffective” in dealing with these problems. It recommended the creation of a new federal regulatory agency, the National Institute of Forensic Science (NIFS), to establish best practice standards and oversee the field.

The problems with forensic DNA statistics discussed in this article provide an excellent example of why NIFS, or some similar body, is needed. The legal system has not been effective, thus far, in dealing with these problems. DNA analysts are increasingly presenting problematic DNA evidence of the kind discussed here and are characterizing it in a variety of different ways, which are often clearly wrong and misleading. The existing regulatory structure, which consists of accrediting agencies like ASCLD-LAB and advisory bodies like SWGDAM, has not addressed these problems in an adequate manner. In an ideal system, a body like NIFS would step in and provide authoritative guidance about how (and when) analysts should present evidence in problematic cases, and what statistics should be used to characterize such evidence. But efforts to create a national regulatory body like NIFS have foundered in Congress, due, in large part, to opposition from law enforcement groups, particularly prosecutors. It seems that they prefer the status quo, in which labs can present problematic forensic evidence in ways that exaggerate its value, to a regulatory regime with the power to address such problems.

In the absence of effective self-regulation by forensic scientists, the only check on misleading forensic science testimony is provided by criminal defense lawyers. Their task is extremely challenging. Judges and jurors perceive DNA evidence to be virtually infallible: the gold standard of forensic science. At this point in history it will not be easy to persuade judges even to hear challenges to the admissibility of DNA evidence, let alone exclude it. But legal grounds for successful challenge surely exist in some cases, and a great deal is riding on the ability of defense lawyers to bring such challenges successfully where they are warranted. Successful challenges not only protect the rights of accused individuals, they may well persuade forensic scientists to be more cautious about their interpretations in other cases; they might even persuade policy-makers to rethink the need for effective forensic science regulation.23 It is vitally important that defense lawyers advocate effectively on these issues.

Notes

1. NATIONAL RESEARCH COUNCIL, DNA TECHNOLOGY IN FORENSIC SCIENCE 74-75 (1992).

2. http://www.fbi.gov/about-us/lab/codis/swgdam-interpretation-guidelines.

3. United States v. Yee, 134 F.R.D. 161, 181 (N.D. Ohio 1991), aff'd sub nom. United States v. Bonds, 12 F.3d 540 (1993).

4. See People v. Venegas, 18 Cal.4th 47 (1998) (FBI DNA test inadmissible due to failure of lab to follow correct statistical procedures when computing match probabilities); W. Thompson, Evaluating the Admissibility of New Genetic Identification Tests: Lessons From the 'DNA War,' 84 J. CRIM. L. & CRIMINOLOGY 22, 84-86 (1993).

5. See generally J. BUTLER, FORENSIC DNA TYPING: BIOLOGY, TECHNOLOGY AND GENETICS OF STR MARKERS (2005); W.C. Thompson, S. Ford, T. Doom, M. Raymer & D. Krane, Evaluating Forensic DNA Evidence: Essential Elements of a Competent Defense Review: Part 1, 27(3) THE CHAMPION 16-25 (2003); W.C. Thompson, S. Ford, T. Doom, M. Raymer & D. Krane, Evaluating Forensic DNA Evidence: Essential Elements of a Competent Defense Review: Part 2, 27(4) THE CHAMPION 24-28 (2003).

6. In some instances the labs also introduce a correction factor called "theta" into the computation to account for subgroup variation in population frequencies, but this correction has a relatively minor effect on the resulting numbers. For the sake of simplicity, we will ignore it here.

7. B. Budowle et al., Low Copy Number — Consideration and Caution, Proc. 12th International Symposium on Human Identification (2001) (http://www.promega.com/~/media/files/resources/conference%20proceedings/ishi%2012/oral%20presentations/budowle.pdf?la=en).

8. For purposes of illustration, we present here the results for only three of the nine loci examined by the laboratory in this case. There were discrepancies between the defendant's profile and the evidentiary profile at some of the other loci as well, but all of the discrepancies were attributed to the unreliability of the STR test for typing a sample with the limited quantity of DNA found in the evidentiary sample. The short peaks that appear in the defendant's reference sample (for example, peaks "7" and "11" at locus D5S818) are examples of the phenomenon known as "stutter."

9. Itiel E. Dror & Greg Hampikian, Subjectivity and Bias in Forensic DNA Mixture Interpretation, 51 SCIENCE AND JUSTICE 204-208 (2011).

10. Id.

11. Id.

12. W.C. Thompson, Painting the Target Around the Matching Profile: The Texas Sharpshooter Fallacy in Forensic DNA Interpretation, 8 LAW, PROBABILITY AND RISK 257-276 (2009), available at http://lpr.oxfordjournals.org/cgi/reprint/mgp013?ijkey=ehrDeU74Gi6ERv1&keytype=ref.

13. Id.

14. J.M. Curran & J. Buckleton, Inclusion Probabilities and Dropout, 55 J. FORENSIC SCI. 1171-73 (2010).

15. Id. at 1171.

16. P. Gill, C.H. Brenner, J.S. Buckleton, et al., DNA Commission of the International Society of Forensic Genetics: Recommendations for the Interpretation of Mixtures, 160 FORENSIC SCI. INT'L 90-101 (2006). The International Commission noted that many laboratories have carried out experiments that provide information relevant to estimating drop out probabilities. In general, the probability of drop out of one allele in a genotype increases as the peak height of the other allele decreases; in other words, allelic drop out is more common when peak heights are low due to limited quantities of DNA. Empirical studies may allow determination of a threshold peak height above which drop out is unlikely to occur, but estimation of drop out probabilities when the observed peaks are below this threshold inevitably requires a considerable element of guesswork, which means the ultimate statistics will always depend, in part, on the subjective judgment of experts on questions about which experts are likely to differ.

17. To compute the likelihood ratio, the system simply divides its overall estimate of the probability that the contributor had the defendant's profile by the probability that a random person would have this profile. The likelihood ratio thus corresponds to the degree to which the DNA analysis increases certainty that the suspect was a contributor.

18. J.M. Curran, A MCMC Method for Resolving Two Person Mixtures, 48 SCIENCE & JUSTICE 168-177 (2008).

19. See M.W. Perlin & B. Szabady, Linear Mixture Analysis: A Mathematical Approach to Resolving Mixed DNA Samples, 46 J. FORENSIC SCI. 1372-1378 (2001); M.W. Perlin & A. Sinelnikov, An Information Gap in DNA Evidence Interpretation, 4 PLOS ONE e8327 (2009); M.W. Perlin, J.B. Kadane & R.W. Cotton, Match Likelihood Ratio for Uncertain Genotypes, 8(3) LAW, PROBABILITY AND RISK 289-302 (2009); M.W. Perlin, Simple Reporting of Complex DNA Evidence: Automated Computer Interpretation, Promega's Fourteenth International Symposium on Human Identification (2003); M.W. Perlin, M.M. Legler, C.E. Spencer, J.L. Smith, W.P. Allan, J.L. Belrose & B.W. Duceman, Validating TrueAllele® DNA Mixture Interpretation, 56 J. FORENSIC SCI. 1430-1447 (2011).

20. Regina v. Duffy and Shivers, 2012 NICC (available at http://www.courtsni.gov.uk/en-GB/Judicial%20Decisions/PublishedByYear/Documents/2012/[2012]%20NICC%201/j_j_HAR8394Final2.htm).

21. Laurence Mueller and Dan Krane, who are co-authors of this article, served as experts for the defendant.

22. Regina v. Duffy and Shivers, supra.

23. See W.C. Thompson, The National Research Council's Plan to Strengthen Forensic Science: Does the Path Forward Run Through the Courts? 50 JURIMETRICS JOURNAL 35-51 (2009) (arguing that policy-makers are unlikely to take seriously the need for forensic science reform unless, or until, courts begin excluding problematic forensic science evidence).
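Note 16's observation that the probability of allelic drop out rises as the surviving allele's peak height falls is often captured empirically with a simple fitted curve. The sketch below is purely illustrative, not any laboratory's validated model: the logistic form and both coefficients (`intercept`, `slope`) are invented for demonstration, and a real lab would fit such parameters to its own validation data.

```python
import math

def dropout_probability(peak_height_rfu, intercept=3.0, slope=-0.02):
    """Hypothetical logistic model of the probability that a sister allele
    has dropped out, given the surviving allele's peak height (in RFU).
    Coefficients are invented for illustration only."""
    z = intercept + slope * peak_height_rfu
    return 1.0 / (1.0 + math.exp(-z))

# Drop out is likely for very short peaks and unlikely above some threshold.
print(round(dropout_probability(50), 3))   # prints 0.881 (short peak: high risk)
print(round(dropout_probability(400), 3))  # prints 0.007 (tall peak: negligible risk)
```

The point of the sketch is the shape of the relationship, not the numbers: below the empirically determined threshold, any estimate rests on the kind of expert guesswork the note describes.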
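The likelihood-ratio arithmetic described in note 17 is simple enough to show directly. This sketch is illustrative only: the function name and both probability values are hypothetical, and estimating the numerator in real casework (which must account for drop out and drop in) is the very point of controversy discussed in this article.

```python
def likelihood_ratio(p_profile_if_contributor, p_profile_if_random):
    """Divide the estimated probability that a true contributor would show
    the defendant's profile by the probability that a random, unrelated
    person would show it (the random match probability)."""
    return p_profile_if_contributor / p_profile_if_random

# Hypothetical example: an analyst estimates a 0.8 probability that a true
# contributor would yield this profile (allowing for drop out), against a
# one-in-a-million random match probability.
lr = likelihood_ratio(0.8, 1e-6)
print(f"LR = {lr:,.0f}")  # prints "LR = 800,000"
```

Read as note 17 suggests: the evidence is 800,000 times more probable if the suspect contributed than if a random person did.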

WWW.NACDL.ORG | THE CHAMPION | DECEMBER 2012

About the Authors

William C. Thompson is a Professor in the Department of Criminology, Law & Society and at the School of Law, University of California, Irvine. He is a former co-chair of NACDL's Forensic Evidence Committee.

Professor William C. Thompson
Department of Criminology, Law & Society
School of Social Ecology
University of California, Irvine
Irvine, CA 92697
949-824-6156
[email protected]

Laurence D. Mueller is a Professor and Chair of the Department of Ecology and Evolutionary Biology at the University of California, Irvine.

Professor Laurence D. Mueller
Department of Ecology & Evolutionary Biology
University of California, Irvine
Irvine, CA 92697
949-824-4744
[email protected]

Dan E. Krane is a Professor in the Department of Biology at Wright State University in Dayton, Ohio.

Professor Dan E. Krane
Wright State University
3640 Colonel Glenn Hwy.
Dayton, OH 45435
937-775-3333
[email protected]
