Statistical methods for interpreting microarray data
1
Statistical methods for interpreting microarray data
Terry Speed
Department of Statistics, UC Berkeley
Walter & Eliza Hall Institute of Medical Research
Workshop on Molecular and Statistical Genomic Epidemiology
Paris, May 9-11, 2005
2
My plan today: two illustrations (of Statistical methods for…)
Low level analysis: calling genotypes from Affymetrix SNP chip data.
Similar projects are underway for analysing chip data for DNA copy number determination, DNA resequencing, whole genome tiling arrays for global expression and ChIP-chip studies, and whole genome exon arrays for exon and gene-level expression. Also QA/QC.
Higher level analysis: one experiment to identify genes involved in the host response to Leishmania major, not atypical of the special experiments we do.
3
No time to mention today
Middle level analysis, many examples, e.g. the summarization and ranking of genes using microarray time course data.
4
The Affymetrix SNP Chip
[Figure: the chip measures 1.28 cm by 1.28 cm and carries > 100,000 features per array; each feature measures 8 µm by 8 µm and holds > 1 million identical 25 bp probes.]
5
Affymetrix SNP chip terminology
Genomic DNA: TAGCCATCGGTANGTACTCAATGAT (N marks a SNP)
Perfect Match probe for Allele A: ATCGGTAGCCATTCATGAGTTACTA
Perfect Match probe for Allele B: ATCGGTAGCCATCCATGAGTTACTA
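The relationship between the genomic sequence and the two perfect match probes can be checked mechanically: each probe shown is the base-by-base complement of the genomic 25-mer with the SNP base N set to one allele. The following is a minimal Python sketch, assuming (from the A / G annotation on the tiling-strategy slides) that allele A corresponds to genomic A and allele B to genomic G. The function and constant names are illustrative only and are not Affymetrix terminology.

```python
# Base-by-base complement, written in the same left-to-right order as the slide.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

GENOMIC = "TAGCCATCGGTANGTACTCAATGAT"  # N marks the SNP position


def perfect_match_probe(genomic: str, allele_base: str) -> str:
    """Substitute the allele base at N, then complement every base."""
    return genomic.replace("N", allele_base).translate(COMPLEMENT)


pm_a = perfect_match_probe(GENOMIC, "A")  # assumed genomic base for allele A
pm_b = perfect_match_probe(GENOMIC, "G")  # assumed genomic base for allele B
print(pm_a)
print(pm_b)
```

The two printed strings can be compared directly with the probe sequences listed on the slide; they differ only at the central (13th) base.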
6
Affymetrix SNP probe tiling strategy, 1
[Figure: SNP tiling strategy. Genomic sequence TAGCCATCGGTA N GTACTCAATGATCAGCT with the SNP (A / G) at the 0 position, tiled by the central probe quartet: a PM and an MM probe for allele A and a PM and an MM probe for allele B.]
7
Affymetrix SNP probe tiling strategy, 2
[Figure: genomic sequence TAGCCATCGGTA N GTACTCAATGATCAGCT with the SNP (A / G) at the +4 position, tiled by the +4 offset probe quartet: a PM and an MM probe for allele A (+4) and a PM and an MM probe for allele B (+4).]
8
Affymetrix SNP probe tiling strategy, 3
[Figure: seven probe quartets, numbered 7 to 1: a central quartet and six offset quartets. Each quartet consists of the probes PMA, MMA, PMB and MMB.]
Repeated on the opposite strand: 56 probes in all (7 quartets × 4 probes × 2 strands). More recently, 40 (5 quartets × 4 probes × 2 strands): just 4 offset quartets instead of 6.
9
Affymetrix SNP identification
Fake (idealized) image for 3 samples on one SNP
AA AB BB
The current vendor-supplied genotype-calling algorithm, DM, seeks the best-fitting pattern of the above kind, including no call (NC). It is a mix of normal likelihood-based model selection and a Wilcoxon test. There is no training, and it is single-chip.
10
DM (no NCs) vs HapMap
DM \ HapMap         AA         AB         BB        NC
AA             339,502      1,249          9     1,420
AB                 457    355,168        544     1,745
BB                  25      1,132    327,415     1,452
11,446 SNPs, 90 samples
99.67% concordance (both called)
3,416 discordant calls
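The summary figures on this slide follow directly from the table. A short Python sketch of the arithmetic, using the counts above and leaving out the NC column because concordance is reported only for sample/SNP pairs called by both sources. The assignment of rows to DM and columns to HapMap is an assumption read off the table header; it does not affect the totals.

```python
import numpy as np

# Rows: DM calls (AA, AB, BB); columns: HapMap calls (AA, AB, BB).
# The HapMap no-call (NC) column is omitted: only pairs called by both count.
counts = np.array([
    [339_502,   1_249,       9],
    [    457, 355_168,     544],
    [     25,   1_132, 327_415],
])

both_called = counts.sum()          # every pair with a call from both sources
concordant = np.trace(counts)       # diagonal: the two calls agree
discordant = both_called - concordant
concordance = concordant / both_called

print(f"discordant calls: {discordant}")
print(f"concordance: {concordance:.2%}")
```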
11
Why attempt an improvement over DM?
• Perhaps the error rate is too high?
• There is reason to believe it can be improved by
  a) using the training/test set paradigm;
  b) carrying out multi-chip analyses, which identify and exploit probe behaviour; and
  c) exploiting the massive parallelism across SNPs.
• The 100K SNPs were selected from a much larger screening set using DM. For the 500K and >1M SNP chips, a higher yield is desirable, and perhaps a better genotype-calling algorithm could achieve this.
12
Robust Linear Model with the Mahalanobis distance classifier
• RLMM, pronounced “REALM”
• Based on an RMA-like model
  – Uses PM only
  – Linear additive multi-chip model on log scale
  – A- and B-probe and chip effects
  – Robustly estimated parameters
• Classification using Mahalanobis’ distance
13
RLMM: single SNP, multi-chip model
For SNP n we fit the following models for the A- and B-probes to quantile-normalized PM values $y_{i,A,j}^{(n)}$ and $y_{i,B,j}^{(n)}$:
$$\log_2\big(y_{i,A,j}^{(n)}\big) = \theta_{A,i}^{(n)} + \beta_{A,j}^{(n)} + \varepsilon_{ij},$$
$$\log_2\big(y_{i,B,j}^{(n)}\big) = \theta_{B,i}^{(n)} + \beta_{B,j}^{(n)} + \varepsilon_{ij},$$
where $\theta_{A,i}^{(n)}$ and $\theta_{B,i}^{(n)}$ are the A- and B-effects for sample i, and $\beta_{A,j}^{(n)}$ and $\beta_{B,j}^{(n)}$ are the relative probe affinities, subject to $\sum_j \beta_{A,j}^{(n)} = \sum_j \beta_{B,j}^{(n)} = 0$.
As errors are likely to be contaminated due to outlier probes, we use a robust linear model to estimate the parameters.
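To make the additive model concrete, here is a minimal Python sketch that fits it for one allele of one SNP. The slide does not say which robust estimator is used, so median polish (the fitting procedure used in RMA) stands in here as an assumed choice. The function name `median_polish_fit` and the toy values are illustrative, not part of RLMM.

```python
import numpy as np


def median_polish_fit(log_y, n_iter=10):
    """Fit log_y[i, j] ~ theta[i] + beta[j] by median polish, with sum(beta) = 0.

    Rows are samples (chips) i, columns are probes j.
    """
    resid = np.array(log_y, dtype=float)
    theta = np.zeros(resid.shape[0])
    beta = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        row_med = np.median(resid, axis=1)   # sweep out sample effects
        theta += row_med
        resid -= row_med[:, None]
        col_med = np.median(resid, axis=0)   # sweep out probe effects
        beta += col_med
        resid -= col_med[None, :]
    shift = beta.mean()                      # impose the sum-to-zero constraint
    beta -= shift
    theta += shift
    return theta, beta, resid


# Toy data (made up): 3 samples, 4 probes, exactly additive plus one outlier probe.
true_theta = np.array([10.0, 11.0, 12.5])
true_beta = np.array([-0.5, 0.2, 0.3, 0.0])   # sums to zero
log_y = true_theta[:, None] + true_beta[None, :]
log_y[0, 1] += 5.0                            # contaminated probe on sample 0

theta, beta, resid = median_polish_fit(log_y)
print(theta)
print(beta)
```

In this toy case the outlier ends up in the residual for sample 0, probe 1, rather than distorting the sample effect, which is the behaviour a robust fit is meant to provide.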
14
RLMM: outline of the algorithm
1. Quantile normalize PM intensities across chips.
2. For SNP n, obtain estimates of $(\theta_{A,i}^{(n)}, \theta_{B,i}^{(n)})$ for each sample i in the training set using the previous model.
3. Estimate the mean vectors $(\mu_{AA}^{(n)}, \mu_{AB}^{(n)}, \mu_{BB}^{(n)})$ and covariance matrices $(\Sigma_{AA}^{(n)}, \Sigma_{AB}^{(n)}, \Sigma_{BB}^{(n)})$ of the 2-dimensional vectors $(\theta_{A,i}^{(n)}, \theta_{B,i}^{(n)})$ using samples from the AA, AB and BB groups in the training set.
4. Obtain estimates $(\theta_{A,i}^{(n)}, \theta_{B,i}^{(n)})$ for each sample i in the test set.
5. Classify each sample in the test set to the genotype group closest to it in Mahalanobis’ distance.
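Step 1 of the outline can be sketched as follows. This is a generic quantile normalization in Python, assuming a matrix with one row per probe and one column per chip; ties are broken arbitrarily, which is a simplification, and the function name is illustrative.

```python
import numpy as np


def quantile_normalize(x):
    """Give every chip (column) the same empirical distribution.

    Each column's k-th smallest value is replaced by the mean of the
    k-th smallest values across all columns.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                  # per-chip ranking of probes
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        out[order[:, j], j] = mean_quantiles       # put shared quantiles back in rank order
    return out


# Toy intensities (made up): 4 probes on 3 chips.
pm = np.array([[5.0, 4.0, 3.0],
               [2.0, 1.0, 4.0],
               [3.0, 6.0, 7.0],
               [4.0, 2.0, 8.0]])
pm_norm = quantile_normalize(pm)
print(pm_norm)
```

After normalization the sorted values of every column are identical, while the within-chip ordering of the probes is unchanged.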
15
Mahalanobis’ distance
Introduced by P.C. Mahalanobis in 1936.
A Euclidean-type metric which takes into account the variances and covariances, here $\Sigma_g$, between the components $\theta_A$ and $\theta_B$ of $\theta = (\theta_A, \theta_B)$:
$$D_g^2(\theta) = (\theta - \mu_g)' \, \Sigma_g^{-1} \, (\theta - \mu_g),$$
where $D_g^2(\theta)$ is the generalized squared distance of the $\theta$ vector from the mean $\mu_g$ of genotype group g = AA, AB or BB.
We choose the g with smallest $D_g^2(\theta)$.
Note: we are not using ^’s to designate estimates, trusting to context.
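The classification rule can be sketched in a few lines of Python. Group means and covariance matrices are estimated from labelled training pairs, and a new pair is assigned to the group with the smallest squared Mahalanobis distance. The training points below are simulated around made-up cluster centres purely for illustration; the function names are not from RLMM.

```python
import numpy as np


def fit_groups(theta, labels):
    """Mean vector and covariance matrix of the (theta_A, theta_B) pairs per genotype."""
    labels = np.array(labels)
    groups = {}
    for g in sorted(set(labels.tolist())):
        pts = theta[labels == g]
        groups[g] = (pts.mean(axis=0), np.cov(pts, rowvar=False))
    return groups


def mahalanobis_sq(theta, mu, sigma):
    """Squared distance (theta - mu)' sigma^{-1} (theta - mu)."""
    d = theta - mu
    return float(d @ np.linalg.solve(sigma, d))


def classify(theta, groups):
    """Return the genotype whose centre is closest in Mahalanobis' distance."""
    return min(groups, key=lambda g: mahalanobis_sq(theta, *groups[g]))


# Simulated training set: 20 samples per genotype around made-up centres.
rng = np.random.default_rng(0)
centres = {"AA": (12.0, 8.0), "AB": (11.0, 11.0), "BB": (8.0, 12.0)}
labels = [g for g in centres for _ in range(20)]
theta_train = np.array([centres[g] for g in labels]) + rng.normal(0.0, 0.2, (60, 2))

groups = fit_groups(theta_train, labels)
print(classify(np.array([11.8, 8.3]), groups))
```

With an identity covariance matrix the distance reduces to the ordinary squared Euclidean distance, which is the sense in which the slide calls it a Euclidean-type metric.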
16
From raw intensities to θ values: AA
SNP 5 data from 13 AA samples (horizontally)
[Figure: PM intensities for the probe groups PMA+, PMA-, PMB+ and PMB-; annotations mark a relatively low (high) intensity probe and a relatively dim chip.]
17
From raw intensities to θ values: AB
SNP 5 data from 39 AB samples
[Figure: PM intensities for the probe groups PMA+, PMA-, PMB+ and PMB-.]
18
From raw intensities to θ values: BB
SNP 5 data from 75 BB samples
[Figure: PM intensities for the probe groups PMA+, PMA-, PMB+ and PMB-.]
19
SNP 5: θ- and residual plots
[Figure: θ-plot with genotype clusters AA, AB and BB, and the corresponding residual plot.]
Every sample has its $(\theta_A, \theta_B)$ pair: plot them! Do likewise for the residuals in the fitted model.
Residuals are useful for QC; here skewed because of + strand failure.
Similar plots are used by AB, Chemicon and Illumina.
New sample points are assigned to the “closest” genotype.
20
SNP 200655: θ- and residual plots
Plots for a more satisfactory SNP.
21
A-1706313 (DM NoCalls = 10%) A-1659973 (NoCalls = 23%)
A-1726964 (NoCalls = 19%) A-1657538 (DM NoCalls = 6%)
Here are four SNPs with some harder calls: the genotype groups are closer together and internally more straggly.
The DM default makes NCs on these.
22
Empirical Bayes Multi-SNP model
Averaging of genotype centers $(\mu_{AA}^{(n)}, \mu_{AB}^{(n)}, \mu_{BB}^{(n)})$ and covariance matrices $(\Sigma_{AA}^{(n)}, \Sigma_{AB}^{(n)}, \Sigma_{BB}^{(n)})$ across SNPs n leads to
• an empirically estimated conjugate Gaussian prior,
• giving prior estimates of genotype means and covariance matrices for all SNPs,
• which, when combined with the data for a particular SNP, gives
• better estimates of genotype group means and covariance matrices, and hence better genotypic assignments for that SNP.
Main benefit: better genotype prediction when there are few or no training samples with a given genotype.
23
RLMM (no NCs) vs HapMap
RLMM \ HapMap       AA         AB         BB        NC
AA             339,756        476         12     1,440
AB                 196    356,575        184     1,699
BB                  32        498    327,772     1,478
11,446 SNPs, 90 samples, LOOCV
99.86% concordance (both called)
1,398 discordant calls
24
Availability
A version of RLMM will go into the open source R-based Bioconductor package before the end of this summer.
25
Leishmaniasis
26
BALB/c C57BL/6
27
L. major response loci in mice
• lmr1 Chromosome 17
  – MHC region
  – BALB/c susceptible
• lmr2 Chromosome 9
  – BALB/c susceptible
• lmr3 X Chromosome
  – C57BL/6 susceptible in the presence of BALB/c homozygosity at lmr1
28
lmr1, lmr2, and lmr3 affect the course of disease
[Figure: average lesion score (0 to 5) against week post infection (2 to 12) for B/c.lmr3, BALB/c, B/c.lmr1 and B/c.lmr2; * p < 0.05.]
29
Compound congenics
C.lmr1/2
• BALB/c background
• lmr1 and lmr2 from C57BL/6
• Predict: more resistant than BALB/c
B6.lmr1/2
• C57BL/6 background
• lmr1 and lmr2 from BALB/c
• Predict: more susceptible than C57BL/6
30
Course of infection in strains congenic for lmr loci
[Figure: average lesion score (0 to 5) against weeks post infection (2 to 12) for B/c.lmr3, BALB/c, B/c.lmr1, B/c.lmr2, B/c.lmr1/2, B6.lmr1/2, B6.lmr1, B6.lmr2 and C57BL/6.]
31
Summary of challenge infections
• All three loci confirmed to play a role in response to L. major infection
• Having all three resistance alleles (C.lmr1/2/3) or all three susceptibility alleles (B6.lmr1/2/3) does NOT recapitulate the parental phenotype in every mouse
• There are possibly other genes involved
32
Infected macrophages
C57BL/6 B6.lmr1/2
33
Design of microarray experiment
[Figure: eight boxes: C57BL/6 uninfected, C57BL/6 infected, B6.lmr1/2 uninfected, B6.lmr1/2 infected, BALB/c uninfected, BALB/c infected, B/c.lmr1/2 uninfected and B/c.lmr1/2 infected.]
Boxes indicate bone marrow derived macrophage samples arrayed on Affymetrix chips; red arrows indicate comparisons of interest.
34
Uninfected B6.lmr1/2 vs uninfected C57BL/6
• 83 genes t* > 5
  – Antigen presentation
  – Receptors
  – Cell surface
  – Chemokines
  – Inflammatory response
  – Cytoskeleton
  – Extracellular matrix
• 9 genes in C57BL/6
  – Cell cycle
  – Mitochondrial
  – Signal transduction
  – Transcription factors
*Analysis carried out with RMA and limma, t here denoting moderated Student t-statistic; qq-plots also used.
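For readers unfamiliar with the moderated t-statistic, the Python sketch below shows how it differs from the ordinary two-sample t: the gene's own variance is shrunk toward a prior variance before the statistic is formed. In limma the prior degrees of freedom and prior variance are estimated from all genes; here they are simply passed in as `d0` and `s0_sq`, and the expression values are made up.

```python
import numpy as np


def moderated_t(x, y, d0, s0_sq):
    """Two-group moderated t for one gene.

    The pooled variance s^2 (d residual degrees of freedom) is replaced by
    (d0 * s0_sq + d * s^2) / (d0 + d); d0 = 0 gives the ordinary t-statistic.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    d = n1 + n2 - 2
    s_sq = (((x - x.mean()) ** 2).sum() + ((y - y.mean()) ** 2).sum()) / d
    s_tilde_sq = (d0 * s0_sq + d * s_sq) / (d0 + d)
    return (x.mean() - y.mean()) / np.sqrt(s_tilde_sq * (1.0 / n1 + 1.0 / n2))


# Made-up log2 expression values for one gene in two groups.
group_1 = [1.0, 2.0, 3.0]
group_2 = [4.0, 5.0, 6.0]
t_ordinary = moderated_t(group_1, group_2, d0=0.0, s0_sq=1.0)
t_moderated = moderated_t(group_1, group_2, d0=4.0, s0_sq=3.0)
print(t_ordinary, t_moderated)
```

Shrinking the variance guards against genes that reach a large t only because their sample variance happens to be very small, which matters when there are few arrays per group.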
35
Genes differently differentially expressed*
• Over 20 genes common to both arms of the experiment
• Some immunological genes and others
• Genes involved in tissue remodelling, wound repair and extracellular matrix deposition
  – Metalloproteinases
  – Cytokines involved in extracellular matrix deposition
  – Collagens
Hypothesis: wound repair is important
*Again analysis done in limma, this time a 2×2 factorial analysis.
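In a 2×2 factorial analysis, "differently differentially expressed" corresponds to the interaction term: the infection effect in the congenic strain minus the infection effect in the parental strain. The Python sketch below fits that model for a single gene by least squares. The 0/1 coding and the expression values are invented for illustration, and limma's moderated statistics are not reproduced here.

```python
import numpy as np

# One gene, two replicate arrays per cell of the 2x2 design (values invented).
strain = np.array([0, 0, 0, 0, 1, 1, 1, 1])       # 0 = parental, 1 = congenic
infected = np.array([0, 0, 1, 1, 0, 0, 1, 1])     # 0 = uninfected, 1 = infected
log_expr = np.array([7.0, 7.2, 8.0, 8.2, 7.1, 7.3, 9.9, 10.1])

# Design matrix: intercept, strain, infection, strain-by-infection interaction.
X = np.column_stack([np.ones(8), strain, infected, strain * infected])
coef, *_ = np.linalg.lstsq(X, log_expr, rcond=None)
interaction = coef[3]

# The same quantity as a difference of differences of cell means.
cell_mean = lambda s, i: log_expr[(strain == s) & (infected == i)].mean()
diff_of_diffs = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(interaction, diff_of_diffs)
```

A gene with a non-zero interaction responds to infection differently in the two strains, which is what the slide title describes.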
36
Is a lesion a wound which fails to heal?
37
Rate of wound healing
[Figure: lesion size (mm, 0 to 2.5) against days (0 to 11) for BALBc, C.lmr1/2, C57BL6 and BL6.lmr1/2.]
38
Collagen bundles in congenics
C57BL/6 B6.Clmr1/2 C.B6lmr1/2 BALB/c
Uninf. punch biopsies
L. major infected
39
Conclusions
• lmr1, lmr2 and lmr3 affect progression of disease
• Expression of Th1/Th2 cytokines is not mediated by lmr1, lmr2, or lmr3 loci at any time during infection (not shown)
• Early difference in cytokine response not seen (not shown)
• Microarray analysis of macrophages has identified genes involved in wound healing as being important.
• Wound healing experiments show that collagen deposition is indeed different between congenics and parentals.
40
Acknowledgements
Nusrat Rabbee, UCB
Simon Cawley, Affymetrix
Simon Foote
Emanuela Handman
Colleen Elso
Lynden Roberts
Anuratha Sakthiandeswaren
Joan Curtis
Denise Bullen
Beena Kumar
Lynn Buckingham
Fleur Rodda
Claire, Kerry and Melissa (Kew)
Tracey Baldwin
Funding: HHMI, NIH, NHMRC, Gene CRC, NSF
Gordon Smyth
Russell Thompson
Ken Simpson
All WEHI
42
DM vs HapMap (no NCs)
DM \ HapMap         AA         AB         BB        NC
AA             339,502      1,249          9     1,420
AB                 457    355,168        544     1,745
BB                  25      1,132    327,415     1,452
11,446 SNPs, 90 samples
99.67% concordance (valid calls)
3,416 discordant calls
43
Comparison Table
RLMM vs DM
(n=11,446 SNPs)
99.7% concordance
Total discordant calls: 2866
RLMM \ DM           AA         AB         BB
AA             341,211        945         24
AB                 445    356,899        592
BB                  28        832    329,164