Genome-wide association and HLA region fine-mapping ... · Genome-wide association and HLA region...
Transcript of Genome-wide association and HLA region fine-mapping ... · Genome-wide association and HLA region...
1
Genome-wideassociationandHLAregionfine-mappingstudiesidentify
susceptibilitylociformultiplecommoninfections
ChaoTian,1*BethannSHromatka,1AmyKKiefer,1NicholasEriksson,1JoyceYTung,1DavidA.
Hinds,1**123andMe,MountainView,California,UnitedStatesofAmerica.
*CorrespondingauthorsEmailaddress:*CT:[email protected]**DAH:[email protected]
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
2
ABSTRACT
Weperformed23genome-wideassociationstudiesforcommoninfections,including
chickenpox,shingles,coldsores,mononucleosis,mumps,hepatitisB,plantarwarts,
positivetuberculosistestresults,strepthroat,scarletfever,pneumonia,bacterial
meningitis,yeastinfections,urinarytractinfections,tonsillectomy,childhoodear
infections,myringotomy,measles,hepatitisA,rheumaticfever,commoncolds,rubellaand
chronicsinusinfection,inmorethan200,000individualsofEuropeanancestry.Forthe
firsttime,genome-widesignificantassociations(P<5x10-8)wereidentifiedformany
commoninfections.Theassociationsweremappedtogeneswithkeyrolesinacquiredand
innateimmunity(HLA,IFNA21,FUT2,ST3GAL4,ABO,IFNL4,LCE3E,DSG1,LTBR,MTMR3,
TNFRSF13B,TNFSF13B,NFKB1,CD40)andinregulationofembryonicdevelopmental
process(TBX1,FGF,FOXA1andFOXN1).Severalmissensemutationswerealsoidentified
(inLCE5A,DSG1,FUT2,TBX1,CDHR3,PLG,TNFRSF13B,FOXA1,SH2B3,ST5andFOXN1).
MissensemutationsinFUT2andTBX1wereimplicatedinmultipleinfections.Weapplied
fine-mappinganalysistodissectassociationsinthehumanleukocyteantigenregion,which
suggestedimportantrolesofspecificaminoacidpolymorphismsintheantigen-binding
clefts.Ourfindingsprovideanimportantsteptowarddissectingthehostgenetic
architectureofresponsetocommoninfections.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
3
Introduction
Infectiousdiseases,thesecondleadingcauseofdeathworldwide,representpersistent
challengestohumanhealthduetoincreasingresistancetoestablishedtreatments,lackof
life-savingvaccinesandmedicationsindevelopingcountries,andincreasingdistribution1.
Studieshavelinkedsusceptibilitytoinfectiousagentstocancers,autoimmunediseases,
anddrughypersensitivity.Humanpapillomavirusesareassociatedwithmultiplecancers2;
rubellaandmumpsinfectionhavebeenlinkedtodevelopmentoftype1diabetes(T1D)in
children3;andriskofdevelopingmultiplesclerosis(MS)wasgreateramongpatientswith
herpeszoster(shingles)thaninmatchedcontrols4.Reactivationofchronicpersistent
humanherpesviruseshasbeenlinkedtodrug-inducedhypersensitivity5.Thus,infectious
diseaseshaveaprofoundimpactonourhealth,bothdirectlyaswellasthrough
connectionswithotherdiseases.Nevertheless,geneticstudiesofcommoninfectious
diseaseslagsomewhatbehindthoseofothermajorcomplexdiseases.Fewgenome-wide
associationstudies(GWAS)havebeenundertakenforcommoninfectiousdiseasesoflower
mortalityorforinfectiousdiseasesforwhichvaccinesareavailable6.
Inthisstudy,weconducted23imputedGWASswithmorethan200,000European
researchparticipantswhoweregenotypedonthe23andMeplatformandwereaskedto
reportontheirhistoryofinfections(Table1andSupplementarynote).Classicalhuman
leukocyteantigen(HLA)locihavebeenprototypicalcandidategeneticsusceptibilityto
multipleinfectiousdiseases7.WefurtherimputedandtestedtheHLAclassicalallelesand
aminoacidpolymorphismstodissectindependentHLAsignalsformultiplecommon
infectionsthathavesignificantHLAassociationsinourGWASs.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
4
Results
Intotal,59genome-widesignificant(GWS)regions(P<5x10-8)werediscovered(Table
2,seealsoSupplementaryFigure1andSupplementaryFigure2forGWASManhattanplots
andregionalplots).TheHLAregionissignificantlyassociatedwith13oftheinfectionsthat
westudiedandwedissectedindependentsignalsineachassociation(Figure1andTable3,
seealsoSupplementaryFigure3forHLAregionalplots).Inthelatersections,wedescribe
theresultforeachphenotypeindetail.
Geneticcorrelations
ThepairwiseLDscorecorrelationonGWASstatistics(Figure2,SupplementaryTable2)
showedsignificantpositivegeneticcorrelationsbetweenstrepthroatandtonsillectomy(rg
=0.85,se=0.09,P=1.89x10-21),betweenchildhoodearinfectionandmyringotomy(rg=
0.88,se=0.14,P=1.27x10-9),andbetweenchickenpoxandshingles(rg=0.56,se=0.16,P
=4x10-4),whicharepairsofrelatedphenotypes.Ahighlysignificantgeneticcorrelation
wasfoundbetweentonsillectomyandchildhoodearinfections(rg=0.42,se=0.05,P=8.19
x10-15).Tonsillitisandearinfectionsaresometimescomorbidsymptoms.Aprevioustwin
studysuggestedasubstantialoverlapingeneticfactorsinfluencingvariationsinliabilityto
earinfectionandtonsillitis8.InourGWASs,weidentifiedthemissensemutation(N397H)
inTBX1asasignificantsusceptibilitylocusforboth.Significantpositivecorrelationswere
alsoobservedforpneumoniawithchronicsinusinfections(rg=0.72,se=0.19,P=1.0x
10-4),childhoodearinfections(rg=0.39,se=0.07,P=1.47x10-8),coldslastyear(rg=
0.57,se=0.12,P=2.66x10-6)andstrepthroats(rg=0.69,se=0.13,P=1.0x10-7).These
areallupperrespiratoryinfections.Thesamevirusesthatcausecoldsandsorethroat(if
theseinfectthethroat,sinusesandupperrespiratorytract)canalsocausepneumonia(if
thesereachthelungs).Wealsosawahighgeneticcorrelationbetweenurinarytract
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
5
infection(UTI)andyeastinfections(rg=0.68,se=0.08,P=4.30x10-16).Therewasa
phenotypiccorrelation(Pearsonr=0.41,P<2.2x10-16)amongwomenwhoreportedboth
UTIandyeastinfectionsinourcohort.Thesetwoinfectionscanhavesimilarsymptoms
andbothcausediscomfortinvaginalarea,butdifferentinfectiousagentscausethem.The
geneticcorrelationfromLDscoreregressionremovesthecorrelationinGWASsummary
statisticsduetothecorrelationinphenotypesandthussuggestsasharedhostgenetic
susceptibilitytoUTIandyeastinfectionsinfemales.
Chickenpox(Varicella-zostervirus,Herpesvirusesfamily)
Chickenpox,characterizedbyred,itchybumpsontheskin,isahighlycontagiousdisease
causedbyprimaryinfectionwithvaricella-zostervirus(VZV).Aftertheinitialchickenpox
infection,thevirusremainsdormantinthenervoussystem,thoughinapproximately20%
ofpeopleitreactivatesandmanifestsasshingles(seebelow).Weidentifiedindependent
associationsinHLA-AandHLA–BintheclassIregion.OnconditionalanalysiswithinHLA-
A,theaminoacidpolymorphismHLA-A-Gly107(P=3.77x10-10)accountedfortheHLA
alleleassociation(HLA-A*02:01,P=1.08x10-9,conditionalp-valueis0.90)inthisinterval.
AlthoughHLA-A-Gly107isnotlocatedinthepeptide-bindingcleft,itisinhighLDwith
HLA-A-Asp74,HLA-A-Gly62andHLA-A-V95(r2>0.95),whichareallinthepeptide-binding
cleft.InHLA-B,onlyrs9266089(P=1.00x10-10)metthethresholdforGWS.
Shingles(Varicella-zostervirus,Herpesvirusesfamily)
Shingles,alsoknownasherpeszoster,ischaracterizedbyapainfulband-shapedrash9
andiscausedbyreactivationoftheVZV(seeabove).Weidentifiedmultipleindependent
HLAsignalsforshingles.WithinHLA-A,theaminoacidpolymorphismHLA-A-Arg97(P=
2.75x10-22),locatedinthepeptide-bindinggroove,accountedformostoftheSNPeffect
(rs2523815,P=1.83x10-22,conditionalp-valueis0.04)andtheHLAalleleeffect(HLA-
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
6
A*02:01,P=8.91x10-20,conditionalp-valueis0.06).HLA-A*02:01isimplicatedinboth
ourchickenpoxandshinglesGWASs.Theaminoacidassociationsaredifferent,butHLA-A-
Arg97isinhighLDwithHLA-A-Gly107(r2=0.73).WithinHLA-B,rs2523591(P=1.74x
10-27)hadastrongerassociationthanHLAclassicalvariants.Therewasasignificant
residualeffectforrs2523591afterconditioningonHLAallele(conditionalp-valueis1.71x
10-9)oraminoacid(conditionalp-valueis3.13x10-5)associationsinHLA-Bregion.The
SNPrs2523591isinLDwithrs9266089(r2=0.24,D’=0.98),whichisaGWSsignalin
chickenpoxGWAS,suggestingpotentialsharedgeneticfactorsforchickenpoxandshingles.
AfteradjustingforthesignalsinHLA-AandHLA-Bregions,weobservedindependent
secondarysignalsintheclassIIIregion(rs41316748,P=1.44x10-12)andintheclassII
regionindexbyDRB1-PheSerHis13(P=5.9x10-10),whichisinthepeptide-bindingcleftof
HLA-DRB1.
WeidentifiedavariantupstreamofIFNA21(rs7047299,P=1.67x10-8)asaGWS
associationwithshingles.Noneofthevariantswithin500kbandinmoderateLD(r2>0.6)
withtheindexSNPrs7047299werecoding,norweretheyreportedasexpression
quantitativetraitloci(eQTL).IFNA21,amemberofthealphainterferongeneclusterat
band9p21,encodedtypeIinterferonandismainlyinvolvedininnateimmuneresponse
againstviralinfection.Ithavebeenshowntobeinvolvedinthepathogenesisofrubella10,
andmayalsoinfluencesusceptibilitytoasthmaandatopy11.
Coldsores(type1herpessimplexvirus,Herpesvirusesfamily)
OverhalfoftheUSpopulationsuffersfromcoldsores(herpessimplexlabialis),which
aremostcommonlycausedbytheherpessimplexvirustypeI(HSV-1)12.Oncesomeone
hasbeeninfected,thevirususuallycannotbeeliminatedandliesdormantinnervecells
whereitmayreactivateyearslater.AlthoughmostpeopleareinfectedwithHSVatsome
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
7
pointintheirlives,itisnotclearwhyonlysomepeoplesufferfromcoldsoreoutbreaksand
reactivation13.WeidentifiedtwoindependentGWSSNPassociationsintheHLAclassI
region.Oneis2kbupstreamofPOU5F1indexedbyrs885950(P=7.47x10-13)andthe
otherisupstreamofHCP5indexedbyrs4360170(P=3.41x10-9).WealsofoundaGWS
associationwithHLA-B-ThrGly45inthepeptide-bindingcleftoftheHLA-Bprotein(P=
4.91x10-12).UponconditioningonHLA-B-ThrGly45,thetwoSNPshadsignificantresidual
effects(conditionalp-valuesare6.28x10-9and1.32x10-5).However,theeffectofHLA-B-
ThrGly45waslargelyremoved(conditionalp-valueis0.001)afterconditioningonthetwo
SNPs.
Mononucleosis(Epstein-Barrvirus,Herpesvirusfamily)
Over90%oftheworld’sadultpopulationischronicallyinfectedwithEpstein-Barrvirus
(EBV).Primaryinfectionatchildhoodisusuallyasymptomaticorwithonlymildsymptoms.
Primaryinfectionlaterinlifeisoftenaccompaniedbyinfectiousmononucleosis(IM)14,
commonlyreferredtoas“mono”andcharacterizedbyfever,tonsillitis,andfatigue.A
longitudinalstudyofIMinacohortof2,823,583Danishchildrenfoundevidenceoffamilial
aggregationofIMthatwarrantsGWASonIMdiseaseetiology14.WeidentifiedtheHLA
classIregionupstreamofHCP5asaGWSassociation(indexedbyrs2596465,P=2.48x10-
9).NoGWSHLAalleleoraminoacidvariantwasidentified.
Mumps(mumpsvirus)
Mumpsisanillnesscausedbythemumpsvirus.Thereisnodiseasemodifying
treatmentformumpsinfection.Itisararediseaseindevelopedcountriessincethe
introductionofroutinevaccination;however,outbreakscanstilloccurandmumpsremains
asignificanthealththreatindevelopingcountries15.WeidentifiedfourindependentGWS
associationswithmumps.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
8
VariationinHLA-A,intheclassIregion,wassignificantlyassociatedwithmumps.The
aminoacidpolymorphismHLA-A-Gln43(P=2.51x10-17)accountedformostoftheSNP
association(rs114193679,P=2.23x10-17,conditionalp-valueis0.19)andtheHLAallele
association(HLA-A*02:05,P=1.73x10-17,conditionalp-valueis0.26)intheHLA-Aregion.
WeidentifiedFUT2(indexedbyrs516316,P=9.63x10-72)andST3GAL4(indexedby
rs3862630,P=1.21x10-8)ashighlysignificantassociationswithmumps.Bothbelongto
theglycosphingolipid(GSL)biosynthesispathway,whichwasthemostsignificant
MAGENTA-analyzedpathway(Table4:P=1x10-4andFDR=0.003)identifiedusing
mumpsGWASdata.ThispathwayalsoincludestheABOgene(indexedbyrs8176643,P=
5.81x10-5).Glycosphingolipids(GSLs),foundinthecellmembranesofbacteriaalltheway
tohumans,playimportantbiologicalrolesinmembranestructure,host-pathogen
interactions,cell-cellrecognition,andmodulationofmembraneproteinfunction.Genetic
variationinthehumanFUT2genedetermineswhetherABObloodgroupantigensare
secretedintobodyfluids16.Detailedfunctionalannotationsofthegenesidentifiedinour
mumpsGWASareprovidedinSupplementarytable3.Thehigh-riskallelers516316(C)is
incompleteLDwithrs601338(A),anonsensevariantinFUT2thatencodesthe“non-
secretor”(se)allele,whichhasbeenreportedtoprovideresistancetoNorovirus17and
susceptibilitytoCrohn’sdisease18,19andT1D20.Thesurfaceglycoproteinofmumpsvirus,
hemagglutinin-neuraminidase,attachestosialicacidreceptorsandpromotesfusionand
viralentryintohostcells21.ABOantigenscanmodulatetheinteractionbetweenpathogens
andcellsurfacesialicacidreceptors22;thispresentsaplausiblemechanismbywhich
secretorstatuscouldmodifysusceptibilitytomumpsinfections.ST3GAL4encodesa
memberoftheglycosyltransferase29family,agroupofenzymesinvolvedinprotein
glycosylation.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
9
Wealsoidentifiedasignificantassociationbetweenmumpsandavarianton
chromosome14(rs11160318,P=4.56x10-12)thatislocatedintheintergenicregion
upstreamofBDKRB2.TheroleofBDKRB2inmumpssusceptibilityisnotclear,butthe
BDKRB2producthashighaffinityforintactkinins,whichmediateawidespectrumof
biologicaleffectsincludinginflammation,pain,vasodilation,andsmoothmuscle
contractionandrelaxation23.
HepatitisB(hepatitisBvirus)
ChronichepatitisBvirus(HBV)infectionisachallengingglobalhealthproblem.Several
GWASshavebeencarriedouttoidentifygeneticlociinvolvedinHBVsusceptibilityand
immuneresponsetohepatitisBvaccine,andtheHLA-DRB1-HLA-DQB1regionhasbeen
consistentlyshowntobeGWS24–26.InourGWASofHBVinfectionweidentifiedaGWS
associationwithrs9268652(P=3.14x10-9,inHLA-DRA),consistentwiththeassociation
oftheHLAclassIIregionwithHBVsusceptibility.Finemappinganalysisfailedtoidentify
GWSassociationswithHLAclassicalvariants,butDQB1*06:02wasmoderatelyassociated
withhepatitisB(P=5.47x10-6).HLA-DQB1*06:02isprotectiveagainstHBVinourdata,
whichisconsistentwithpreviousstudiesthattheDRB1*15:01-DQB1*06:02haplotypeis
associatedwithprotectionfromtype1autoimmunehepatitisanddevelopmentof
hepatocellularcarcinoma27.AlthoughchronicinfectionwiththeHBVhasbeenlinkedtothe
developmentofhepatocellularcarcinomaformorethan30years28,themechanismby
whichHBVinfectionleadstohepatocellularcarcinomaisunclear.
HepatitisA(hepatitisAvirus)
InfectionwithhepatitisAvirus(HAV)canbeasymptomaticorresultinanacuteillness
characterizedbyflu-likesymptomsandjaundice.Seriouscomplicationsarerareandthe
hepatitisAvaccineiseffectiveforprevention29.Nolargecohortstudyhasbeendonefor
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
10
HAVinfection.OurGWASshowedasuggestiveassociationwithIFNL4,indexedby
rs66531907(P=5.7x10-8,OR=1.23,1kbupstreamofIFNL4).IFNL4islocatedupstream
ofIFNL3(alsoknownasIL28B).Thehigh-riskallelers66531907(C)inourhepatitisA
GWASisinalmostcompleteLDwithrs8099917(T)(r2=0.992),whichhasbeenassociated
withprogressiontochronichepatitisCinfection(HCV)andpoorresponsetoHCVtherapy
inmultiplestudies30.
Plantarwarts(humanpapillomavirus)
Thehumanpapillomavirus(HPV)causesplantarwarts,whichoccuronthesolesofthe
feet.SomestrainsofHPVarealsoassociatedwithcertainformsofcancer2.Weidentified
twoGWSassociationswithplantarwart.OurfirstassociationwaswiththeHLA-DRB1in
classIIregion.TheaminoacidpolymorphismDRB1_PheSerTyr13(P=1.84x10-39),inthe
peptidebindingcleftsofHLA-DRB1,washighlysignificant.Conditioningonit,theeffectsof
HLA-DRB1*15:01(P=4.48e-14,conditionalp-valueis0.36),HLA-DRB1*04:01(P=4.88x
10-18,conditionalp-valueis0.001),andrs9272050(P=9.66x10-31,conditionalp-valueis
0.06)werelargelyremoved,butasmallresidualeffectremainedforrs28752534(P=1.50
x10-30,conditionalp-valueis2.15x10-8).
WealsoidentifiedaGWSassociationwithrs6692209(P=5.25x10-9)nearLCE3Einthe
EpidermalDifferentiationComplex(EDC)on1q21.TheEDCcomprisesaremarkable
densityofgenefamiliesthatdeterminedifferentiationofthehumanepidermis31.Thehigh-
riskallelers6692209(T)isinhighLD(r2=0.86)withrs2105117(A)(P=2.98x10-8inour
GWAS),amissensevariantinLCE5A.Variantsinlatecornifiedenvelope(LCE)geneshave
beenimplicatedinGWASsofatopicdermatitis(rs3126085-A,P=6x10-12,r2=0.047with
rs6692209)32andpsoriasis(rs4085613-A,P=7x10-30,r2=0.002withrs6692209)33,
althoughtheindexSNPsidentifiedinthoseGWASswereinlowLDwithouridentified
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
11
region.Theskinistheprimaryinterfacewiththeexternalenvironmentandprovidesa
criticalfirstbarrieroftheinnateimmunedefensetoinfections34.Whilestudieshavelinked
theepidermaldifferentiationgenestovariousskindiseases,theroleoftheLCEproteinsin
epidermalbiologyhasnotbeenextensivelystudied.OnestudysuggestedthatLCEproteins
ofgroups1,2,5and6areinvolvedinnormalskinbarrierfunction,whileLCE3genes
encodeproteinsthatareinvolvedinbarrierrepairafterinjuryorinflammation35.
PositiveTBtest(Mycobacteriumtuberculosis)
Tuberculosisisacommonand,inmanycases,lethalinfectiousdiseasecausedbyvarious
strainsofmycobacteria,usuallyMycobacteriumtuberculosis.Achestx-rayandasampleof
sputumareneededforadiagnosisofactiveTBdisease36.ThephenotypeusedinourGWAS
wasdefinedbyaskingcustomerswhethertheyhadhadapositiveresponse(redbumps)to
atuberculosis(TB)skintest.ApositiveTBtestindicatesthatapersonhasatsometime
beenexposedtoandinfectedwithTBbacteria.AnegativeTBtestindicatesthatlatentTB
infectionorTBdiseaseisunlikely.InourGWASofpositiveTBtests,thecaseshaveeither
activeTBdisease,latenttuberculosisinfections,orwerevaccinatedagainstTB,while
controlsmayormaynothavebeenexposedtoTBbacteriaandhaveneverbeeninfected.
WeidentifiedmultipleindependentHLAassociations.InthemostsignificantHLA-DRA-
DQA1region,conditioningonHLA-DRB1-Leu67(P=1.14x10-29)andHLA-DQA1-His129
(P=1.56x10-13)removedtheeffectofDQA1*01:02(P=7.04x10-22,conditionalp-valueis
0.54),DQA1*03:01(P=3.99x10-10,conditionalp-valueis0.36),andrs3135359(P=8.82x
10-21,conditionalp-valueis0.06);however,theeffectofrs2894257(P=8.16x10-36,
conditionalp-valueis1.2x10-7)remainedsignificant.Uponconditioningonthesignalsin
theHLA-DRA-DQA1region,secondarysignalsweredetectedintheclassI(HLA-B*08:01,
P=8.43x10-14)andclassIIIregions(rs148844907inC6orf47,P=2.21x10-12).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
12
Strepthroat(groupAstreptococcusbacteria)
Strepthroat(alsocalledstreptococcaltonsillitisorstreptococcalpharyngitis)is
characterizedbyfever,headache,severesorethroat,swollentonsilsandlymphnodes,and
iscausedbygroupAstreptococcus(GAS)bacteria37.Itisroutinelytreatedwithantibiotics.
Thesymptomsofuntreatedstrepthroattypicallyresolvewithinafewdays,butcan
sometimescausecomplicationssuchasrheumaticfeverandkidneyinflammation.We
foundaHLA-BclassIregion(indexSNPrs1055821,P=7.69x10-11)asaGWSassociation
withstrepthroat.HLA-B*57:01wasalmostGWS(P=5.06x10-8).HLA-DRB1_CysTyr30in
classIIregionalsoshowedasignificanteffect(P=2.14x10-9).Conditioningonrs1055821,
theeffectofHLA-B*57:01wasremoved(conditionalp-valueis0.84)andtheeffectofHLA-
DRB1_CysTyr30wassubstantiallydecreased(conditionalp-valueis0.005).
WealsofoundanGWSassociationwithasmallinsertion,rs35395352,in1p36.23(P=
3.90x10-8,effect=0.03).Noneofthevariantswithin500kbandinmoderateLD(r2>0.6)
withthisSNPwerecoding.However,twovariantshavebeenreportedascis-eQTLs:
rs7548511(P=7.84x10-8,effect=0.03,riskalleleA,r2=0.96withrs35395352)isan
eQTLforENO1(eQTLp-valueis8x10-5)inlymphoblastoid38andrs11121129(P=9.62x
10-8,effect=0.03,riskalleleG,r2=0.96withrs35395352)isaneQTLforTNFRSF9(eQTL
p-valueis4.4x10-4)inlymphoblastoid39.TheproductofENO1,Alphaenolase,hasbeen
identifiedasanautoantigeninHashimoto’sencephalopathy40andaputativeautoantigenin
severeasthma41andBehcet’sdisease42.TheTNFRSF9-encodedproteinisamemberofthe
TNF-receptorsuperfamily,whichcontributestotheclonalexpansion,survival,and
developmentofTcells.Thesame1p36.23regionhasbeenimplicatedinGWASsof
psoriasis(rs11121129,riskalleleA,P=1.7x10-8,r2=0.96withrs35395352)43.
Interestingly,thestrepthroatriskallelers11121129-Aisprotectiveagainstpsoriasis.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
13
Studiesfoundthatexacerbationofchronicpsoriasiscanbeassociatedwithstreptococcal
throatinfections,andtheT-cellsgeneratedbypalatinetonsilscanrecognizeskinkeratin
determinantsinpatients’blood44.
Scarletfever(groupAstreptococcusbacteria)
Scarletfever(alsoknownasscarlatina)iscausedbygroupAstreptococcus(GAS)andis
characterizedbyabrightredrashthatcoversmostofthebody,sorethroat,andfever.It
wasamajorcauseofdeathbeforethediscoveryofeffectiveantibiotics.WefoundaGWS
associationbetweenscarletfeverandtheHLAclassIIregion.Theaminoacid
polymorphismHLA-DQB1-Gly45(P=6.35x10-14),locatedinthepeptide-bindingcleftof
HLA-DQB1,accountedfortheSNPassociation(rs36205178,P=9.49x10-14,conditionalp-
valuesis0.83)andtheHLAalleleassociation(HLA-DQB1*03:01,P=2.98x10-14,
conditionalp-valuesis0.14)inthisregion.Althoughscarletfeverdevelopsinsomepeople
withstrepthroat,wedidnotobserveanycrossdiseaseeffectbetween‘scarletfever’and
‘strepthroat’inouranalysis.
Pneumonia(multipleorigins)
Pneumoniaischaracterizedbyinflammationofthealveoliinthelungsandcanbe
causedbyavarietyoforganisms,includingbacteria,viruses,andfungi.Themostcommon
causeofbacterialpneumoniainadults–isolatedinnearly50%ofcommunity-acquired
cases--isStreptococcuspneumonia45.WefoundaGWASassociationbetweenpneumonia
andaHLAclassIregionindexedbyrs3131623(P=1.99x10-15).Conditioningon
rs3131623,theresidualeffectsofHLA-B-SerTrpAsn97(P=2.37x10-12,conditionalp-value
is9.33x10-4)andHLA-DRB1-LS11(P=4.10x10-11,conditionalp-valueis3.89x10-6),both
inthepeptide-bindingcleft,remainedsignificant.Therewasalsoaresidualeffectleftfor
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
14
rs3131623(conditionalp-valueis6.94x10-5)afterconditioningonthetwoaminoacid
associations.
Bacterialmeningitis(Streptococcuspneumoniae,GroupBStreptococcusbacteria
andmanyotherbacteria)
Bacterialmeningitisisasevereinfectioncausingsubstantialneurologicalmorbidityand
mortalityworldwide.Streptococcuspneumoniaeistheleadingcauseofbacterial
meningitisandisassociatedwitha30%mortalityrate46.WeidentifiedaGWSassociation
withCA10(carbonicanhydraseX)indexedbyrs1392935(P=1.19x10-8,intronicofCA10).
Noneofthevariantswithin500kbandinmoderateLD(r2>0.6)withrs1392935were
coding,norweretheyreportedasaneQTL.However,rs1392935fallswithinanenhancer
thatisdefinedinthefetalbrain.TheproteinencodedbyCA10isanacatalyticmemberof
thealpha-carbonicanhydrasesubgroupanditisthoughttoplayaroleinthecentral
nervoussystem,especiallyinbraindevelopment47.
Yeastinfection(Candidaspp.)
Ayeastinfection,alsocalledcandidiasis,isusuallyalocalizedfungalinfectionoftheskin
ormucosalmembranes,typicallyaffectingtheoralcavity,esophagus,gastrointestinaltract,
urinarybladder,toenails,orgenitalia48.Mostyeastinfectionsarecausedbyspeciesof
Candida,oftenCandidaalbicans.WefoundGWSassociationsbetweenyeastinfectionsand
variantsinDSG1(rs200520431,P=1.87x10-16).TheindexSNPrs200520431isintronic
andinhighLD(r2>0.8)withmultiplemissensemutations(rs8091003,rs8091117,
rs16961689,rs61730306,rs34302455)inDSG1.TheDSG1geneproductisacalcium-
bindingtransmembraneglycoproteincomponentofdesmosomesinvertebrateepithelial
cells.Itconnectsthecellsurfacetothekeratincytoskeletonandplaysakeyrolein
maintainingepidermalintegrityandbarrierfunction49.Thisglycoproteinhasbeen
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
15
identifiedastheautoantigenoftheautoimmuneskinblisteringdiseasepemphigus
foliaceus50andhomozygousmutationsinDSG1havebeenshowedtoresultinsevere
dermatitis,multipleallergies,andmetabolicwastingsyndrome49.Avariantdownstreamof
PRKCH(indexedbyrs2251260,P=3.46x10-10)wasalsosignificantlyassociatedwith
yeastinfections.Noneofthevariantswithin500kbandinmoderateLD(r2>0.6)withthe
indexSNPrs2251260werecoding,norweretheyreportedasaneQTL.However,
rs2251260fallswithinstrongenhancersdefinedinhepatocellularcarcinoma.PRKCHisa
memberofafamilyofserine-andthreonine-specificproteinkinaseandispredominantly
expressedinepithelialtissues.ThePRKCHproteinkinasecanregulatekeratinocyte
differentiation51.Wealsofoundasignificantassociationwithavariantinthe14q32.2gene
desert(rs7161578-T,p=4.04x10-8).TheindexSNPisinhighLDwithmanyenhancer
sequencesdefinedinepidermalkeratinocytes.Avariantinthesameregion(rs7152623-A,
p=3x10-15,r2=(+)0.98withrs7161578-T)wasimplicatedinaorticstiffness52.
UrinaryTractInfection(Escherichiacoli,Staphylococcussaprophyticus,rarely
viralorfungal)
About80%-85%ofurinarytractinfections(UTIs)arecausedbyEscherichiacoli(E.
coli),and5-10%arecausedbyStaphylococcussaprophyticus53.Veryrarely,virusesor
fungimaycauseUTIs54.UTIsoccurmorecommonlyinwomenthanmen55andinour
dataset,significantlymorefemalethanmalecustomersreportedhavingtwoormoreUTIs.
WefoundanassociationbetweenUTIsandvariantsinPSCA(indexedbyrs2976388,P=
3.27x10-10).TheindexSNPrs2976388fallswithinstrongenhancersdefinedinepidermal
keratinocytes.ItisalsoinhighLD(r2=0.95)withrs2294008(P=1.01x10-9),whichisin
the5’-UTRpromotersequenceofPSCAandcancausechangesintranscriptionalrepressor
CTCFmotif.ThePSCAgenevariantsconferredriskofUTIonlyinfemalewhenwefurther
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
16
testedtheeffectinfemaleandmalecohortseparately(P<1e-11infemale,andP>0.1in
male).PriorGWASshavefoundassociationsbetweenthePSCAgeneandduodenalulcer
(rs2294008-C,p=2e-33)andbladdercancer(rs2294008-T,p=4x10-11)56,57.PSCA
encodesaglycosylphosphatidylinositol-anchoredcellmembraneglycoproteinofunknown
function.Thisglycoproteinwasinitiallyidentifiedasaprostate-specificcell-surfacemarker
58andisoverexpressedinalargeproportionofprostatecancers,butalsodetectedin
bladderandpancreaticcancers57.
WealsoidentifiedavariantintronicofFRMD5(rs146906133,P=2.02x10-8)thatisin
LDwithrs138763871(r2=0.8,p=1.75x10-6inourGWAS),amissensevariantinSTRC
(R1521W).TheroleofFRMD5orSTRCinUTIsisunclear.
Tonsillectomy(multipleorigins)
Tonsilsareconsideredthefirstlineofdefenseagainstrespiratoryinfections.Previous
twinstudieshavesuggestedasubstantialgeneticpredispositionforrecurrenttonsillitis59.
Tonsillectomyisasurgerytoremovethetonsilstypicallyduetorepeatedoccurrenceof
tonsillitisortonsilhypertrophy.OurtonsillectomyGWASrevealed35independent
chromosomeregionsreachingGWS.WeobservedstrongHLAassociationsatHLA-Binthe
classIregion.TheaminoacidpolymorphismHLA-B-Val97(P=1.44x10-21),inthepeptide-
bindingcleftoftheHLA-Bprotein,accountedformostoftheSNPeffect(rs41543314,P=
5.36x10-21,conditionalpvalueis0.06)andHLAalleleeffect(HLA-B*57:01,P=6.29x10-
22,conditionalpvalueis0.05)inthisinterval.Wealsonotedadditionalindependentsignals
intheclassIIandclassIIIregions.WithinHLA-DRB1intheclassIIregion,theeffectof
rs140177540(P=5.02x10-15,conditionalp-valueis0.001)waslargelyremovedafter
conditioningonDRB1_LeuValGly11(P=1.02x10-12)inthepeptide-bindingcleftofthe
HLA-DRB1protein.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
17
Wedetected34additionalGWSassociations(Table2andSupplementaryTable4),
includingsignalsfromgenesencodingtumornecrosisfactors/receptorssuperfamily
members(LTBR,TNFRSF13B,TNFSF13B,CD40,TNFAIP3),myotubularin(MTMR3),
chemokines(CXCL13),mitogen-activatedproteinkinases/regulators(IGFBP3,SPRED2,
DUSP10,ST5,MAPK3),SRChomologydomainbindingproteins(SBK1,ST5,SH2B3),
disintegrins(ADAM23,ADAMTS10),varioustranscriptionfactors(NFKB1,HOXA2,FOXA1,
IKZF1,MDFIC,TFEC,RERE,TBX1,FOXN1),andsignalingmolecules(WNT2,GNA12,ITSN1).
Manyofthesegenesmaptopathwaysinvolvedinimmuneandinflammatoryprocesses,
whileothersareimportantregulatorsofembryonicdevelopment(e.g.mostofthe
transcriptionfactors),andafewarepotentiallyinvolvedinplateletproductionor
hemostasis(GNA12,SH2B3,RAP1GAP2,ADAM23,ADAMTS10).Whenwetestedthe
significantlociforenrichmentincanonicalpathways,thetopover-representedpathway
was‘IntestinalimmunenetworkforIgAproduction’(Table5,P=2x10-6andFDR=3.5e-3).
ThegenesthatcontributedsignificantlyinthispathwayincludeLTBR(rs10849448,P=
2.35x10-35),TNFRSF13B(rs34557412,missensevariant,P=2.66x10-21),TNFSF13B
(rs200748895,P=1.20x10-17)andCD40(rs6032664,P=7.31x10-12).IgAisanantibody
thatplaysacriticalroleinmucosalimmunity.Tonsilsbelongtonasopharyngeal-associated
lymphoidtissuesandthegenerationofBcellsisconsideredoneofthemajortonsillar
functions;secretorydimericIgAproducedbytheBcellsiscapableofpreventing
absorptionandpenetrationofbacteriaand/orvirusesintotheupperrespiratorytract
mucosa60.TonsillectomyhasbeenshowntosignificantlydecreaselevelsofserumIgAand
salivarysecretoryIgAlevels61.Otheridentifiedsignificantsignalingpathwaysinclude
‘TACIandBCMAstimulationofBcellimmunesystem’(P=9x10-4andFDR=0.078),
‘ceramidesignalingpathway’(P=3x10-4andFDR=0.08)and‘Glucocorticoidreceptor
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
18
regulatorynetwork’(P=5x10-4andFDR=0.099).ThecontributinggenesareTNFSF13B
(rs200748895,P=1.2x10-17),TNFRSF13B(rs34557412,P=2.66x10-21),NFKB1
(rs23052,P=4.54x10-14)andMAPK3(rs12931792,P=4.06x10-8).
Althoughtonsillectomyhasbeenperformedforover100years,possibleimmunological
effectsofthisprocedureremaincontroversial.Manyreportshavedemonstratedthatin
certainpatientstonsillectomyisaneffectivetherapyforpsoriasis44andrheumatoid
arthritis(RA)62;therationaleforthiseffectisunknown.The‘Intestinalimmunenetwork
forIgAproduction’pathwaywasalsoidentifiedasthemostsignificantpathwayinarecent
GWASofIgAN63.Inthatstudy,theyalsolinkedtheintestinalmucosalinflammatory
disordersandinflammatoryboweldisease(IBD)withriskofIgAN.Ourtonsillectomy
GWASshowedsignificantoverlapoftheidentifiedlociwiththoseautoimmuneand
inflammatorydisorders64.WefoundfiverisklocithatarealsoassociatedwithRA(HLA,
TNFIP3,CD40,SPRED2andSH2B3);fourrisklocithatarealsoassociatedwithIBD(HLA,
MTMR3,TNFAIP3andCD40)andulcerativecolitis(HLA,GNA12,MAPK3andNFKB1);three
risklocithatarealsoassociatedwithpsoriasis(HLA,SLC12ABand1p36.23);andtworisk
loci(HLAandMTMR3)thataresharedwithIgAN.Weobservedbothconcordantand
opposingeffectscomparedtotheseimmune-mediateddiseases.Ourresultsmayhelpto
elucidatetheconnectionbetweentonsillectomyandthesediseasesandmayprovide
insightintoclinicalmarkersthatcouldbeusedasindicatorsoftonsillectomyasatherapy
forthesediseases.
Childhoodearinfection(multipleorigins)
Althoughearinfectionscanhaveabacterialorigin,theyareoftensecondarytoanupper
respiratoryinfectionsuchasacoldorsorethroat,whichcanbecausedbyvirusesor
bacteria65.Childrenaremorelikelythanadultstohaveearinfectionsbecausetheir
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
19
Eustachiantubesarelesseffectiveatdrainingfluidandtheirimmunesystemsarenotfully
developed.InourGWAS,weidentified14regionsthatweresignificantlyassociatedwith
childhoodearinfection.SignalsintheHLAregionmappedtoHLA-DRB1inclassIIregion.
OurstrongestassociationwithinHLA-DRB1waswithHLA-DRB1-Gln96inthepeptide-
bindinggroove.AfterconditioningonHLA-DRB1-Gln96(P=2.11x10-12),theeffectsof
rs4329147(P=9.55x10-12,conditionalpvalueis0.27)andDQB1*06:02(P=5.79x10-10,
conditionalpvalueis0.85)wereeliminated.
Wefound13additionalGWSsignals(Table2andSupplementaryTable5).Although
MAGENTApathwayanalysisdidnotidentifysignificantpathways(Table6),variantsin
FUT2andABO,whichwereinvolvedintheglycosphingolipidbiosynthesispathwayand
wereimplicatedinourmumpsGWAS,werealsosignificantlyassociatedwithchildhoodear
infection.Theriskallelers681343(C)(P=3.51x10-30)isasynonymousmutationinFUT2
andisinalmostcompleteLDwithrs601338(G)(r2=0.9993),the‘secretor’(se)alleleis
associatedwithhigherriskofchildhoodearinfectioninourdata.Thisisconsistentwitha
previousreportinwhichsecretorswereover-representedamongpatientswithupper
respiratoryinfections66.Oursecondmostsignificantassociationwaswithavariantin
TBX1(rs1978060,P=1.17x10-19).Thelow-riskallelers1978060(A)isinLD(r2=0.45)
withrs72646967(C),amissense(N397H)mutationinTBX1thatwasalsoimplicatedin
ourtonsillectomyGWAS.T-boxgenesencodetranscriptionfactorsthatplayimportant
rolesintissueandorganformationduringembryonicdevelopment.Inmice,TBX1haplo-
insufficiencyintheDiGeorgesyndrome(22q11.2deletionsyndrome)regionhasbeen
showedtodisruptthedevelopmentofthepharyngealarcharteries67andthemiddleand
outerear68,69.Wealsoidentifiedothergenesinvolvedindevelopmentalprocesses,
includingMKX(rs2808290,P=5.09x10-16),FGF3(rs72931768,P=2.63x10-9)andAUTS2
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
20
(rs35213789,P=3.75x10-9).TheMKXproductisanIRXfamily-relatedhomeobox
transcriptionfactorandhasbeenshowntoplayacriticalroleintendondifferentiationby
regulatingtypeIcollagenproductionintendoncells70.FGF3isamemberofthefibroblast
growthfactorfamily,whichisinvolvedinavarietyofbiologicalprocessesincluding
embryonicdevelopment,cellgrowth,morphogenesis,tissuerepairandinvasion.Studyof
similargenesinmouseandchickensuggestedtheroleininnerearformation71..AUTS2has
beenimplicatedinneurodevelopmentandisacandidategeneforautismspectrum
disordersanddevelopmentaldelay72,73.Finally,weidentifiedmissensemutationsinCDHR3
(rs114947103,P=5.40x10-9)andPLG(rs73015965,P=3.78x10-8).TheindexSNP
rs114947103isinalmostcompleteLDwithrs6967330(r2=0.97),amissensevariantin
CDHR3,thatwasrecentlyidentifiedinaGWASasasusceptibilitylocusforasthma74.The
biologicalroleofCDHR3isnotclear,butitishighlyexpressedinairwayepitheliumand
belongstothecadherinfamilythatisinvolvedincelladhesionandepithelialpolarity74.
MutationsinthePLGgenecouldcausecongenitalplasminogendeficiency,whichresultsin
inflamedgrowthsonthemucousmembranes.StudiesinmicehaveshowedthatPLGplays
anessentialroleinprotectingagainstthespontaneousdevelopmentofchronicotitismedia
(middleearinfection)andhavealsosuggestedthepossibilityofusingPLGforclinical
therapyofcertaintypesofotitismedia75.
Myringotomy(multiplecauses)
Myringotomyisasurgicalprocedureinwhichatinyincisioniscreatedintheeardrumto
allowthereleaseofmiddle-earfluid.Beforetheinventionofantibiotics,myringotomy
withouttubeplacementwasacommontreatmentforsevereorfrequentacuteotitis
media76.ThesamevariantinTBX1thatwasassociatedwithtonsillectomyandchildhood
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
21
earinfection(seeearlier)wasalsofoundtobesignificantlyassociatedwithmyringotomy
(rs1978060,P=3.34x10-10).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
22
Discussion
Weidentifiedthegreatestnumberofindependentassociationsfortonsillectomy(n=35)
andchildhoodearinfection(n=14),tworelativelynonspecificphenotypes.Formanyofthe
infections,weonlyidentifiedHLAassociations.Thesefindingsmaybeduetoourrelatively
largersamplesizesforthetwotraitsandmayalsobeduetothefactthattonsillectomyand
childhoodearinfectionsareheterogeneousphenotypes,influencedbyavarietyof
pathogensandalsobyanatomicalabnormalities.Peopleundergotonsillectomyfor
differentreasons,includingrecurrenttonsillitis,obstructivesleepapnea,andnasalairway
obstruction77.Thecausesofchildhoodearinfectionsarealsodiverse,involvingdifferent
typesofbacteriaandviruses,andeardevelopmentaldefects78.VariantsintheTBX1gene,
whichisessentialforinnereardevelopment68,69,wereassociatedwithbothtonsillectomy
andinnerearinfections.Manygenesthatareinvolvedinembryoticdevelopmentwerealso
identifiedinourtonsillectomyandchildhoodearinfectionGWASs.Thevulnerabilityto
tonsillectomyandchildhoodearinfectionsmayalsobelargelyinherited.Therewere
earlierstudiescalculatedthatheredityaccountforover70%ofsusceptibilitytorecurrent
earinfectionsinchildren79,80.Previoustwinstudieshadsuggestedasubstantialgenetic
predispositionforrecurrenttonsillitis59andsleep-disorderedbreathing81.Accordingto
theLDscoreregressiononourGWASdata,tonsillectomyandchildhoodearinfectionshave
relativelyhigherheritabilityontheobservedscalethanmostoftheotherinfectionswe
studies.Wedidnotcalculatetheliabilityscaleheritabilityduetothelackofpopulation
prevalenceinformation.
IntheHLAregion,wefoundthatviraldiseases—e.g.,chickenpox,shingles,coldsores,
mononucleosis(allcausedbyherpesvirus)aswellasmumps(causedbymumpsvirus)—
weremainlyassociatedwithvariationinclassImolecules(Figure1).Thebacterial
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
23
diseases—specificallyhavingapositiveTBtest(causedbyMycobacteriumtuberculosis),
scarletfever(causedbyGAS),andchildhoodearinfection(mostlycausedbybacteria)—
weremainlyassociatedwithvariationinclassIImolecules.Tonsillectomyandpneumonia,
causedbyeitherbacteriaorviruses,wereassociatedwithbothclassIandclassII
molecules.Theseobservationsareconsistentwithpreviousknowledgeaboutantigen
presentation;virusesmostlyreplicatewithinnucleatedcellsinthecytosolandproduce
endogenousantigensthatarepresentedbyclassIMHCmolecules,whilebacteriagrow
extracellularlyandaretakenupbyendosomalcompartmentswheretheyareprocessedfor
presentationbyclassIIMHCmolecules82.Thesetwointracellularpathwaysofprotein
processingmaynotbecompletelyseparate.ActivationofCD8+,HLAclassI-restrictedT-
cellsbyexogenousantigenshasbeenreported83andHLAclassII-restrictedCTLsthat
recognizeendogenouslysynthesizedantigen,suchasHBVenvelopeantigens,havealso
beendescribed84,85.
Somediseases,however,donotfollowthecommonprinciplesofantigenpresentation.
Plantarwarts(causedbyHPV)aredifferentfromothervirus-causingdiseases.HPV
infectionsareexclusivelyintraepithelialandthereisnoviraemiaorcytolysisinthe
infectioncycle86.TheprimaryresponsetoHPVantigensismorelikelytobeinitiatedbythe
antigen-presentingcells(APCs)ofsquamousepithelia,theLangerhanscells(LCs).LCs
captureantigensbymacropinocytosisandreceptor-mediatedendocytosis,andthen
initiateclassIIprocessingofexogenousantigens87,whichexplainswhyourplantarwarts
GWASmainlyidentifiedassociationswithMHCclassIImolecules.Inothercases,our
resultssuggestevenmorecomplexinteractions.Strepthroat(causedbyGASbacteria)has
bothMHCclassIandclassIIassociations.Someofourreportedstrepthroatcasesmay
representsecondarybacterialinfectionsinresponsetoaviralupperrespiratoryinfection,
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
24
meaningthatthediscoveredMHCclassIassociationsareactuallywiththevirusinfection
thatinitiatedtheprocess88.Sincemostsorethroatsarecausedbyviruses89,another
possibilityisthatsomeofthereportedstrepthroatcasesactuallyhadamisdiagnosedor
misreportedviralinfection.AfinalpossibilityisthatHLAclassImoleculescanbind
bacterialpeptidesderivedfromexogenousproteinsthatareinternalizedbyendocytosisor
phagocytosis,aprocesscalledcross-presentation85,90.
WefailedtodetectanyGWSassociationforcertaininfectiousdiseases,includingrubella,
measles,chronicsinusinfection,thecommoncold,orrheumaticfever.ItisknownthatHLA
affectsresponsestomeaslesandrubellavaccinations91,92andmaythusalsobeassociated
withsusceptibility.Onelimitationofourstudythatmightaffectfindingswithrespectto
measlesandrubellaisourlackofdataonvaccinationstatus.Sincethereisnowacommon
vaccinationformeaslesandrubella,misclassifyingvaccinatedpeopleascontrolswould
theoreticallyreduceourpowertodetectassociations.Thislimitationmightalsoaffectthe
powerofourshinglesGWAS.ThecontrolsusedfortheshinglesGWASwerenotfilteredfor
apositivehistoryofchickenpox,thoughhavingchickenpoxisaprerequisiteforshingles.
Anotherreasonmightbetheheterogeneityofthephenotype.Coldsarecausedbyseveral
differentviruses(mostlyrhinovirusesandcoronaviruses),andlargersamplesizesmaybe
requiredtoidentifyallelesonlyassociatedwithspecificpathogens.
Althoughadditionalstudieswillberequiredtovalidateourassociations,ourfindings
areanimportantsteptowardsdissectingthehostgeneticcontributiontovariationin
responsetoinfections.Researchinsightsintoinfectiousdiseaseswillhelptoderivenew
diagnosticapproachesandperhapsnewtherapiesandpreventions.Thisstudymayalso
helpustounderstandsomeoftheautoimmunedisordersthatareassociatedwithvarious
infectiontriggers.Manyoftheidentifiedassociationswerealsofoundinautoimmune
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
25
diseases.Onepostulatedconnectionbetweeninfectiousdiseasesandimmunological
disordersisthattheimmunecellsthatareactivatedinresponsetoapathogenepitopeare
alsocross-reactivetoselfandleadtodirectdamageandfurtheractivationofotherarmsof
theimmunesystem93.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
26
Methods
Subjects
WeconductedGWASsof23infectiousdiseases.Allparticipantsweredrawnfromthe
customerbaseof23andMe,Inc.,apersonalgeneticscompany.Allindividualsincludedin
theanalysesprovidedinformedconsentandansweredsurveysonlineaccordingtoour
humansubjectsprotocol,whichwasreviewedandapprovedbyEthical&Independent
ReviewServices,aprivateinstitutionalreviewboard(http://www.eandireview.com).
AllparticipantswereofprimarilyEuropeanancestry(>97%Europeanancestry)as
determinedthroughananalysisoflocalancestry94.Amaximumsetofunrelatedindividuals
waschosenusingasegmentalidentity-by-descent(IBD)estimationalgorithm95.
Individualsweredefinedasrelatediftheysharedmorethan700cMofIBD,including
regionswherethetwoindividualsshareeitheroneorbothgenomicsegmentsIBD.This
levelofrelatedness(involving~20%ofthehumangenome)correspondsapproximatelyto
theminimalexpectedsharingbetweenfirstcousinsinanoutbredpopulation.
Table1showsourdiscoverysamplesizes,drawnfrommorethan650,000genotyped
customerswhoreportedviaweb-basedquestionnaireswhethertheyhadbeendiagnosed
withoneormoreinfectiousdiseases.(Formoredetailonsurveyquestionsusedtocollect
thephenotypedata,seetheSupplementaryNotes).
GenotypingandSNPimputation
DNAextractionandgenotypingwereperformedonsalivasamplesbyNationalGenetics
Institute(NGI),aCLIA-certifiedlaboratorythatisasubsidiaryofLaboratoryCorporationof
America.Samplesweregenotypedononeoffourgenotypingplatforms.TheV1andV2
platformswerevariantsoftheIlluminaHumanHap550+BeadChipandcontainedatotalof
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
27
about560,000SNPs,includingabout25,000customSNPsselectedby23andMe.TheV3
platformwasbasedontheIlluminaOmniExpress+BeadChipandcontainedatotalofabout
950,000SNPsandcustomcontenttoimprovetheoverlapwithourV2array.TheV4
platformincurrentuseisafullycustomarrayofabout950,000SNPsandincludesalower
redundancysubsetofV2andV3SNPswithadditionalcoverageoflower-frequencycoding
variation.Samplesthatfailedtoreach98.5%callratewerere-analyzed.Individualswhose
analysesfailedrepeatedlywerere-contactedby23andMecustomerservicetoprovide
additionalsamples,asisdoneforall23andMecustomers.
ParticipantgenotypedatawereimputedusingtheMarch2012“v3”releaseofthe1000
GenomesProjectreferencehaplotypes96.Wephasedandimputeddataforeachgenotyping
platformseparately.First,weusedBEAGLE97(version3.3.1)tophasebatchesof8,000to
9,000individualsacrosschromosomalsegmentsofnomorethan10,000genotypedSNPs,
withoverlapsof200SNPs.WeexcludedSNPswithminorallelefrequency(MAF)<0.001,
Hardy-WeinbergequilibriumP<1×10−20,callrate<95%,orlargeallelefrequency
discrepanciescomparedtotheEuropean1000GenomesProjectreferencedata.Frequency
discrepancieswereidentifiedbycomputinga2x2tableofallelecountsforEuropean1000
Genomessamplesand2000randomlysampled23andMecustomerswithEuropean
ancestry,andidentifyingSNPswithachisquaredP<10−15.Weimputedeachphased
segmentagainstall-ethnicity1000Genomeshaplotypes(excludingmonomorphicand
singletonsites)usingMinimac298withfiveroundsand200statesforparameter
estimation.
Forthenon-pseudoautosomalregionoftheXchromosome,malesandfemaleswere
phasedtogetherinsegments,treatingthemalesasalreadyphased;thepseudoautosomal
regionswerephasedseparately.Wethenimputedmalesandfemalestogetherusing
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
28
Minimacaswiththeautosomes,treatingmalesashomozygouspseudo-diploidsforthe
non-pseudoautosomalregion.
GWASanalysis
Forcasecontrolcomparisons,wetestedforassociationusinglogisticregression,
assumingadditivealleliceffects.Forquantitativetraits,associationtestswereperformed
usinglinearregression.Fortestsusingimputeddata,weusetheimputeddosagesrather
thanbest-guessgenotypes.Weincludedcovariatesforage,gender,andthetopfive
principalcomponentstoaccountforresidualpopulationstructure.Theassociationtestp-
valuewascomputedusingalikelihoodratiotest,whichinourexperienceisbetterbehaved
thanaWaldtestontheregressioncoefficient.ResultsfortheXchromosomewere
computedsimilarly,withmencodedasiftheywerehomozygousdiploidfortheobserved
allele.
GeneticcorrelationsWecomputedLDscoresaspreviouslydescribed99usingthesamplesinthe1000
GenomesProjectreferencepanel(phase3verion5a)withaMAFcutoffof5%.We
calculatedthepairwisegeneticcorrelation(rg)ontheGWASsummarystatistics(including
MHCregion)usingLDscoreregression99.Theclustermapofgeneticcorrelation,baseon
thelowerboundofthe95%confidentintervalofcalculatedrg,wasplottedusing’Nearest
PointAlgorithm’implementedinpython“Searborn”package
(https://stanford.edu/~mwaskom/software/seaborn/index.html).Wefoundno
significantnegativecorrelationsinourdata,thuswesetallnegativevaluestozero.
SNPFunctionAnnotation
ToexplorewhetheranyofthesignificantSNPsidentifiedmightlinktofunctional
mutationsorhavepotentialregulatoryfunctions,weusedtheonlinetoolsHaploReg
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
29
(http://www.broadinstitute.org/mammals/haploreg/haploreg.php)toconfirmthe
locationofeachSNPinrelationtoannotatedprotein-codinggenesand/ornon-coding
regulatoryelements.Wequeriedonlythevariantsthatwerewithin500kbofandin
moderateLD(r2>0.6)withtheindexSNP.
ImputationofHLAclassicalallelesandrespectiveaminoacidvariations
HLAimputationwasperformedwithHIBAG100,anattributebaggingbasedstatistical
methodthatcomesasafreelyavailableRpackageandincludesapre-fitclassifier.This
classifierwastrainedfromalargedatabaseofindividualswithknownHLAallelesandSNP
variationwithintheHLAregion.Over98%ofthetaggingSNPsusedinHIBAGwere
genotypedandpassedqualitycontrolon23andMe’splatform.Weimputedallelicdosage
forHLA-A,B,C,DPB1,DQA1,DQB1,andDRB1lociatfour-digitresolution.Weusedthe
defaultsettingsofHIBAGandtheruntimefor100,000sampleswasabout10hoursonour
cluster.
UsinganapproachsuggestedbyP.deBakker101,wedownloadedthefilesthatmapHLA
allelestoaminoacidsequencesfromhttps://www.broadinstitute.org/mpg/snp2hla/and
mappedourimputedHLAallelesatfour-digitresolutiontothecorrespondingaminoacid
sequences;inthiswaywetranslatedtheimputedHLAallelicdosagesdirectlytoaminoacid
dosages.Weencodedallaminoacidvariantsinthe23andMeEuropeansamplesasbiallelic
markers,whichfacilitateddownstreamanalysis.Forexample,position45ofHLA-Bprotein
hadfivedifferentalleles(E:Glu,M:Met,T:Thr,K:Lys,G:Gly),wefirstencodedtheposition
usingfivebinarymarkers,eachcorrespondingtothepresenceorabsenceofeachallele
(e.g.HLA-B-Gly45tagsGlyatposition45ofHLA-Bprotein).Forpositionshavingthreeor
morealleles,wealsocreatedmarkersthattagmultiplealleles,eachcorrespondingtothe
presenceorabsenceofthemultiplealleles(e.g.HLA-B-ThrGly45tagsThrandGlyat
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
30
position45ofHLA-Bprotein).Thus,wecreatedbinaryindicatorsforallpossible
combinationsofaminoacidvariants.Weusethisnamingconventionforaminoacid
polymorphismsthroughoutthispaper.
Weimputed292classicalHLAallelesatfour-digitresolutionand2395bi-allelicamino
acidpolymorphisms.SimilartotheSNPimputation,wemeasuredimputationqualityusing
r2,whichistheratiooftheempiricallyobservedvarianceofthealleledosagetothe
expectedvarianceassumingHardy-Weinbergequilibrium.Theimputationquality(r2)of
thetopassociatedHLAallelesandaminoacidsareinSupplementaryTable6.
HLAregionfinemapping
TotestassociationsbetweenimputedHLAallele/aminoaciddosagesandphenotypes,
weperformedlogisticorlinearregressionusingthesamesetofcovariatesusedintheSNP-
basedGWASforthatphenotype.Weappliedaforwardstepwisestrategy,withineachtype
ofvariant,toestablishstatisticallyindependentsignalsintheHLAregion.Withineach
varianttype(e.g.SNP,HLAallele,andHLAaminoacid),wefirstidentifiedthemoststrongly
associatedsignals(lowestp-value)foreachdiseaseandperformedforwarditerative
conditionalregressiontoidentifyotherindependentsignalsiftheconditionalp-valuewas
<5x10-8.Allanalysescontrolledforsexandfiveprincipalcomponentsofgeneticancestry.
Thep-valueswerecalculatedusingalikelihoodratiotest.Theiterativeconditional
regressiondissectedHLAsignalintoindependentHLAassociations.Withineachidentified
HLAlocus(HLA-A,B,C,DPB1,DQA1,DQB1,andDRB1),Wefurthercarriedoutreciprocal
analyses,whicharetheconditionalanalysesacrossvariantstypes,toseeiftheassociation
canbeattributedtotheaminoacidpolymorphismwithineachHLAlocus.
Pathwayanalysis
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
31
Inordertobetterunderstandhowmultiplegenesinthesamepathwaymaycontribute
tocertaininfections,weperformedpathwayanalysisusingMeta-AnalysisGene-set
EnrichmentofvariaNTAssociations(MAGENTA)102,whichtestsforenrichmentofgenetic
associationsinpredefinedbiologicalprocessesorsetsoffunctionallyrelatedgenes,using
GWASresultsasinput.Weusedgenesetsof1320canonicalpathwaysfromtheMolecular
SignaturesDatabase(MsigDB)compiledbydomainexperts103anddefaultsettingsof
MAGENTAtool.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
32
Acknowledgements:
Wethankthecustomersof23andMeforparticipatinginthisresearchandthe
employeesof23andMefortheircontributionstothiswork.
Conflictofinterest:
Allauthorslistedwere/areemployedby23andMe,andownstockorstockoptionsin
23andMe.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
33
REFERENCE:
1. Becker,K.,Hu,Y.&Biller-Andorno,N.Infectiousdiseases-aglobalchallenge.Int.J.
Med.Microbiol.IJMM296,179–185(2006).
2. Mighty,K.K.&Laimins,L.A.Theroleofhumanpapillomavirusesinoncogenesis.
RecentResultsCancerRes.FortschritteKrebsforsch.Prog.DansRech.SurCancer193,
135–148(2014).
3. Filippi,C.M.&vonHerrath,M.G.Viraltriggerfortype1diabetes:prosandcons.
Diabetes57,2863–2871(2008).
4. Kang,J.-H.,Sheu,J.-J.,Kao,S.&Lin,H.-C.Increasedriskofmultiplesclerosisfollowing
herpeszoster:anationwide,population-basedstudy.J.Infect.Dis.204,188–192
(2011).
5. Pavlos,R.,Mallal,S.,Ostrov,D.,Pompeu,Y.&Phillips,E.Fever,rash,andsystemic
symptoms:understandingtheroleofvirusandHLAinseverecutaneousdrugallergy.
J.AllergyClin.Immunol.Pract.2,21–33(2014).
6. Blackwell,J.M.,Jamieson,S.E.&Burgner,D.HLAandinfectiousdiseases.Clin.
Microbiol.Rev.22,370–385,TableofContents(2009).
7. Cooke,G.S.&Hill,A.V.S.Geneticsofsusceptibitlitytohumaninfectiousdisease.Nat.
Rev.Genet.2,967–977(2001).
8. Kvestad,E.etal.Recurrentotitismediaandtonsillitis:commondisease
predisposition.Int.J.Pediatr.Otorhinolaryngol.70,1561–1568(2006).
9. Weinberg,J.M.Herpeszoster:epidemiology,naturalhistory,andcommon
complications.J.Am.Acad.Dermatol.57,S130–135(2007).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
34
10. Mo,X.etal.Microarrayanalysesofdifferentiallyexpressedhumangenesand
biologicalprocessesinECV304cellsinfectedwithrubellavirus.J.Med.Virol.79,
1783–1791(2007).
11. Chan,A.etal.VariationinthetypeIinterferongeneclusteron9p21influences
susceptibilitytoasthmaandatopy.GenesImmun.7,169–178(2006).
12. Xu,F.etal.Trendsinherpessimplexvirustype1andtype2seroprevalenceinthe
UnitedStates.JAMAJ.Am.Med.Assoc.296,964–973(2006).
13. Kriesel,J.D.etal.C21orf91genotypescorrelatewithherpessimplexlabialis(cold
sore)frequency:descriptionofacoldsoresusceptibilitygene.J.Infect.Dis.204,1654–
1662(2011).
14. Rostgaard,K.,Wohlfahrt,J.&Hjalgrim,H.Ageneticbasisforinfectiousmononucleosis:
evidencefromafamilystudyofhospitalizedcasesinDenmark.Clin.Infect.Dis.Off.
Publ.Infect.Dis.Soc.Am.58,1684–1689(2014).
15. Harrison’sprinciplesofinternalmedicine/editors,AnthonyS.Fauci...[etal.].(McGraw-
HillMedical,2008).
16. Kelly,R.J.,Rouquier,S.,Giorgi,D.,Lennon,G.G.&Lowe,J.B.Sequenceandexpression
ofacandidateforthehumanSecretorbloodgroupalpha(1,2)fucosyltransferasegene
(FUT2).Homozygosityforanenzyme-inactivatingnonsensemutationcommonly
correlateswiththenon-secretorphenotype.J.Biol.Chem.270,4640–4649(1995).
17. Larsson,M.M.etal.Antibodyprevalenceandtitertonorovirus(genogroupII)
correlatewithsecretor(FUT2)butnotwithABOphenotypeorLewis(FUT3)
genotype.J.Infect.Dis.194,1422–1427(2006).
18. McGovern,D.P.B.etal.Fucosyltransferase2(FUT2)non-secretorstatusisassociated
withCrohn’sdisease.Hum.Mol.Genet.19,3468–3476(2010).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
35
19. Franke,A.etal.Genome-widemeta-analysisincreasesto71thenumberofconfirmed
Crohn’sdiseasesusceptibilityloci.Nat.Genet.42,1118–1125(2010).
20. Smyth,D.J.etal.FUT2nonsecretorstatuslinkstype1diabetessusceptibilityand
resistancetoinfection.Diabetes60,3081–3084(2011).
21. Zaitsev,V.etal.SecondsialicacidbindingsiteinNewcastlediseasevirus
hemagglutinin-neuraminidase:implicationsforfusion.J.Virol.78,3733–3741(2004).
22. Cohen,M.,Hurtado-Ziola,N.&Varki,A.ABObloodgroupglycansmodulatesialicacid
recognitiononerythrocytes.Blood114,3668–3676(2009).
23. Ma,J.X.etal.Structureandchromosomallocalizationofthegene(BDKRB2)encoding
humanbradykininB2receptor.Genomics23,362–369(1994).
24. Pan,L.etal.Agenome-wideassociationstudyidentifiespolymorphismsintheHLA-
DRregionassociatedwithnon-responsetohepatitisBvaccinationinChineseHan
populations.Hum.Mol.Genet.23,2210–2219(2014).
25. Hu,Z.etal.NewlociassociatedwithchronichepatitisBvirusinfectioninHanChinese.
Nat.Genet.45,1499–1503(2013).
26. Kamatani,Y.etal.Agenome-wideassociationstudyidentifiesvariantsintheHLA-DP
locusassociatedwithchronichepatitisBinAsians.Nat.Genet.41,591–595(2009).
27. Umemura,T.etal.HumanleukocyteantigenclassIIhaplotypesaffectclinical
characteristicsandprogressionoftype1autoimmunehepatitisinJapan.PloSOne9,
e100565(2014).
28. Sherlock,S.,Fox,R.A.,Niazi,S.P.&Scheuer,P.J.Chronicliverdiseaseandprimary
liver-cellcancerwithhepatitis-associated(Australia)antigeninserum.Lancet1,
1243–1247(1970).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
36
29. Vogt,T.M.,Wise,M.E.,Bell,B.P.&Finelli,L.DeclininghepatitisAmortalityinthe
UnitedStatesduringtheeraofhepatitisAvaccination.J.Infect.Dis.197,1282–1288
(2008).
30. Rauch,A.etal.GeneticvariationinIL28BisassociatedwithchronichepatitisCand
treatmentfailure:agenome-wideassociationstudy.Gastroenterology138,1338–
1345,1345.e1–7(2010).
31. Mischke,D.,Korge,B.P.,Marenholz,I.,Volz,A.&Ziegler,A.Genesencodingstructural
proteinsofepidermalcornificationandS100calcium-bindingproteinsformagene
complex(‘epidermaldifferentiationcomplex’)onhumanchromosome1q21.J.Invest.
Dermatol.106,989–992(1996).
32. Sun,L.-D.etal.Genome-wideassociationstudyidentifiestwonewsusceptibilityloci
foratopicdermatitisintheChineseHanpopulation.Nat.Genet.43,690–694(2011).
33. Zhang,X.-J.etal.Psoriasisgenome-wideassociationstudyidentifiessusceptibility
variantswithinLCEgeneclusterat1q21.Nat.Genet.41,205–210(2009).
34. Sugita,K.etal.Innateimmunitymediatedbyepidermalkeratinocytespromotes
acquiredimmunityinvolvingLangerhanscellsandTcellsintheskin.Clin.Exp.
Immunol.147,176–183(2007).
35. Bergboer,J.G.M.etal.Psoriasisriskgenesofthelatecornifiedenvelope-3groupare
distinctlyexpressedcomparedwithgenesofotherLCEgroups.Am.J.Pathol.178,
1470–1477(2011).
36. LearntheSignsandSymptomsofTBDisease.
37. Tart,A.H.,Walker,M.J.&Musser,J.M.NewunderstandingofthegroupA
Streptococcuspathogenesiscycle.TrendsMicrobiol.15,318–325(2007).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
37
38. Montgomery,S.B.etal.Transcriptomegeneticsusingsecondgenerationsequencing
inaCaucasianpopulation.Nature464,773–777(2010).
39. Dimas,A.S.etal.Commonregulatoryvariationimpactsgeneexpressioninacelltype-
dependentmanner.Science325,1246–1250(2009).
40. Yoneda,M.etal.Highprevalenceofserumautoantibodiesagainsttheaminoterminal
ofalpha-enolaseinHashimoto’sencephalopathy.J.Neuroimmunol.185,195–200
(2007).
41. Nahm,D.-H.etal.Identificationofalpha-enolaseasanautoantigenassociatedwith
severeasthma.J.AllergyClin.Immunol.118,376–381(2006).
42. Lee,K.H.etal.Humanalpha-enolasefromendothelialcellsasatargetantigenofanti-
endothelialcellantibodyinBehçet’sdisease.ArthritisRheum.48,2025–2035(2003).
43. Tsoi,L.C.etal.Identificationof15newpsoriasissusceptibilitylocihighlightstherole
ofinnateimmunity.Nat.Genet.44,1341–1348(2012).
44. Thorleifsdottir,R.H.etal.Improvementofpsoriasisaftertonsillectomyisassociated
withadecreaseinthefrequencyofcirculatingTcellsthatrecognizestreptococcal
determinantsandhomologousskindeterminants.J.Immunol.Baltim.Md1950188,
5160–5165(2012).
45. PneumococcalDisease.
46. Castelblanco,R.L.,Lee,M.&Hasbun,R.Epidemiologyofbacterialmeningitisinthe
USAfrom1997to2010:apopulation-basedobservationalstudy.LancetInfect.Dis.14,
813–819(2014).
47. Taniuchi,K.etal.Developmentalexpressionofcarbonicanhydrase-relatedproteins
VIII,X,andXIinthehumanbrain.Neuroscience112,93–99(2002).
48. Medicalmicrobiology.(UniversityofTexasMedicalBranchatGalveston,1996).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
38
49. Samuelov,L.etal.Desmoglein1deficiencyresultsinseveredermatitis,multiple
allergiesandmetabolicwasting.Nat.Genet.45,1244–1248(2013).
50. Amagai,M.&Stanley,J.R.Desmogleinasatargetinskindiseaseandbeyond.J.Invest.
Dermatol.132,776–784(2012).
51. Kashiwagi,M.,Ohba,M.,Chida,K.&Kuroki,T.ProteinkinaseCeta(PKCeta):its
involvementinkeratinocytedifferentiation.J.Biochem.(Tokyo)132,853–857(2002).
52. Mitchell,G.F.etal.Commongeneticvariationinthe3’-BCL11Bgenedesertis
associatedwithcarotid-femoralpulsewavevelocityandexcesscardiovasculardisease
risk:theAortaGenConsortium.Circ.Cardiovasc.Genet.5,81–90(2012).
53. Nicolle,L.E.Uncomplicatedurinarytractinfectioninadultsincludinguncomplicated
pyelonephritis.Urol.Clin.NorthAm.35,1–12,v(2008).
54. Amdekar,S.,Singh,V.&Singh,D.D.Probiotictherapy:immunomodulatingapproach
towardurinarytractinfection.Curr.Microbiol.63,484–490(2011).
55. Valiquette,L.Urinarytractinfectionsinwomen.Can.J.Urol.8Suppl1,6–12(2001).
56. Tanikawa,C.etal.Agenome-wideassociationstudyidentifiestwosusceptibilityloci
forduodenalulcerintheJapanesepopulation.Nat.Genet.44,430–434,S1–2(2012).
57. Wu,X.etal.GeneticvariationintheprostatestemcellantigengenePSCAconfers
susceptibilitytourinarybladdercancer.Nat.Genet.41,991–995(2009).
58. Reiter,R.E.etal.Prostatestemcellantigen:acellsurfacemarkeroverexpressedin
prostatecancer.Proc.Natl.Acad.Sci.U.S.A.95,1735–1740(1998).
59. Kvestad,E.etal.Heritabilityofrecurrenttonsillitis.Arch.Otolaryngol.HeadNeckSurg.
131,383–387(2005).
60. Bernstein,J.M.Mucosalimmunologyoftheupperrespiratorytract.Respir.Int.Rev.
Thorac.Dis.59Suppl3,3–13(1992).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
39
61. D’Amelio,R.,Palmisano,L.,LeMoli,S.,Seminara,R.&Aiuti,F.SerumandsalivaryIgA
levelsinnormalsubjects:comparisonbetweentonsillectomizedandnon-
tonsillectomizedsubjects.Int.Arch.AllergyAppl.Immunol.68,256–259(1982).
62. Kawano,M.etal.Simultaneous,clonallyidenticalTcellexpansionintonsiland
synoviuminapatientwithrheumatoidarthritisandchronictonsillitis.Arthritis
Rheum.48,2483–2488(2003).
63. Gharavi,A.G.etal.Genome-wideassociationstudyidentifiessusceptibilitylociforIgA
nephropathy.Nat.Genet.43,321–327(2011).
64. Welter,D.etal.TheNHGRIGWASCatalog,acuratedresourceofSNP-trait
associations.NucleicAcidsRes.42,D1001–1006(2014).
65. Heikkinen,T.&Chonmaitree,T.Importanceofrespiratoryvirusesinacuteotitis
media.Clin.Microbiol.Rev.16,230–241(2003).
66. Raza,M.W.etal.Associationbetweensecretorstatusandrespiratoryviralillness.BMJ
303,815–818(1991).
67. Lindsay,E.A.etal.Tbx1haploinsufficienyintheDiGeorgesyndromeregioncauses
aorticarchdefectsinmice.Nature410,97–101(2001).
68. Raft,S.,Nowotschin,S.,Liao,J.&Morrow,B.E.Suppressionofneuralfateandcontrol
ofinnerearmorphogenesisbyTbx1.Dev.Camb.Engl.131,1801–1812(2004).
69. Vitelli,F.etal.TBX1isrequiredforinnerearmorphogenesis.Hum.Mol.Genet.12,
2041–2048(2003).
70. Ito,Y.etal.TheMohawkhomeoboxgeneisacriticalregulatoroftendon
differentiation.Proc.Natl.Acad.Sci.U.S.A.107,10538–10542(2010).
71. Sensi,A.etal.LAMMsyndromewithmiddleeardysplasiaassociatedwithcompound
heterozygosityforFGF3mutations.Am.J.Med.Genet.A.155A,1096–1101(2011).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
40
72. Oksenberg,N.&Ahituv,N.TheroleofAUTS2inneurodevelopmentandhuman
evolution.TrendsGenet.TIG29,600–608(2013).
73. Jolley,A.etal.Denovointragenicdeletionoftheautismsusceptibilitycandidate2
(AUTS2)geneinapatientwithdevelopmentaldelay:acasereportandliterature
review.Am.J.Med.Genet.A.161A,1508–1512(2013).
74. Bønnelykke,K.etal.Agenome-wideassociationstudyidentifiesCDHR3asa
susceptibilitylocusforearlychildhoodasthmawithsevereexacerbations.Nat.Genet.
46,51–55(2014).
75. Eriksson,P.-O.,Li,J.,Ny,T.&Hellström,S.Spontaneousdevelopmentofotitismediain
plasminogen-deficientmice.Int.J.Med.Microbiol.IJMM296,501–509(2006).
76. Smith,N.&Greinwald,J.Totubeornottotube:indicationsformyringotomywithtube
placement.Curr.Opin.Otolaryngol.HeadNeckSurg.19,363–366(2011).
77. Baugh,R.F.etal.Clinicalpracticeguideline:tonsillectomyinchildren.Otolaryngol.--
HeadNeckSurg.Off.J.Am.Acad.Otolaryngol.-HeadNeckSurg.144,S1–30(2011).
78. EarInfectionsinChildren.
79. Casselbrant,M.L.etal.Theheritabilityofotitismedia:atwinandtripletstudy.JAMA
282,2125–2130(1999).
80. Kvaerner,K.J.,Tambs,K.,Harris,J.R.&Magnus,P.Distributionandheritabilityof
recurrentearinfections.Ann.Otol.Rhinol.Laryngol.106,624–632(1997).
81. Carmelli,D.,Colrain,I.M.,Swan,G.E.&Bliwise,D.L.Geneticandenvironmental
influencesinsleep-disorderedbreathinginoldermaletwins.Sleep27,917–922
(2004).
82. Blum,J.S.,Wearsch,P.A.&Cresswell,P.Pathwaysofantigenprocessing.Annu.Rev.
Immunol.31,443–473(2013).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
41
83. Rock,K.L.,Gamble,S.&Rothstein,L.PresentationofexogenousantigenwithclassI
majorhistocompatibilitycomplexmolecules.Science249,918–921(1990).
84. Penna,A.etal.HepatitisBvirus(HBV)-specificcytotoxicT-cell(CTL)responsein
humans:characterizationofHLAclassII-restrictedCTLsthatrecognizeendogenously
synthesizedHBVenvelopeantigens.J.Virol.66,1193–1198(1992).
85. Nuchtern,J.G.,Biddison,W.E.&Klausner,R.D.ClassIIMHCmoleculescanusethe
endogenouspathwayofantigenpresentation.Nature343,74–76(1990).
86. Stanley,M.HPV-immuneresponsetoinfectionandvaccination.Infect.Agent.Cancer
5,19(2010).
87. Stanley,M.Immuneresponsestohumanpapillomavirus.Vaccine24Suppl1,S16–22
(2006).
88. Metzger,D.W.&Sun,K.Immunedysfunctionandbacterialcoinfectionsfollowing
influenza.J.Immunol.Baltim.Md1950191,2047–2052(2013).
89. SoreThroat.
90. Rock,K.L.&Shen,L.Cross-presentation:underlyingmechanismsandroleinimmune
surveillance.Immunol.Rev.207,166–183(2005).
91. Haralambieva,I.H.etal.Genome-widecharacterizationoftranscriptionalpatternsin
highandlowantibodyresponderstorubellavaccination.PloSOne8,e62149(2013).
92. Ovsyannikova,I.G.,Pankratz,V.S.,Vierkant,R.A.,Jacobson,R.M.&Poland,G.A.
ConsistencyofHLAassociationsbetweentwoindependentmeaslesvaccinecohorts:a
replicationstudy.Vaccine30,2146–2152(2012).
93. Ercolini,A.M.&Miller,S.D.Theroleofinfectionsinautoimmunedisease.Clin.Exp.
Immunol.155,1–15(2009).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
42
94. Durand,E.Y.,Do,C.B.,Mountain,J.L.&Macpherson,J.M.AncestryComposition:A
Novel,EfficientPipelineforAncestryDeconvolution.(2014).
95. Henn,B.M.etal.Crypticdistantrelativesarecommoninbothisolatedand
cosmopolitangeneticsamples.PloSOne7,e34267(2012).
96. 1000GenomesProjectConsortiumetal.Amapofhumangenomevariationfrom
population-scalesequencing.Nature467,1061–1073(2010).
97. Browning,S.R.&Browning,B.L.Rapidandaccuratehaplotypephasingandmissing-
datainferenceforwhole-genomeassociationstudiesbyuseoflocalizedhaplotype
clustering.Am.J.Hum.Genet.81,1084–1097(2007).
98. Fuchsberger,C.,Abecasis,G.R.&Hinds,D.A.minimac2:fastergenotypeimputation.
Bioinforma.Oxf.Engl.(2014).doi:10.1093/bioinformatics/btu704
99. Bulik-Sullivan,B.K.etal.LDScoreregressiondistinguishesconfoundingfrom
polygenicityingenome-wideassociationstudies.Nat.Genet.47,291–295(2015).
100.Zheng,X.etal.HIBAG--HLAgenotypeimputationwithattributebagging.
PharmacogenomicsJ.14,192–200(2014).
101.Jia,X.etal.Imputingaminoacidpolymorphismsinhumanleukocyteantigens.PloS
One8,e64683(2013).
102.Segrè,A.V.etal.Commoninheritedvariationinmitochondrialgenesisnotenriched
forassociationswithtype2diabetesorrelatedglycemictraits.PLoSGenet.6,(2010).
103.Subramanian,A.etal.Genesetenrichmentanalysis:aknowledge-basedapproachfor
interpretinggenome-wideexpressionprofiles.Proc.Natl.Acad.Sci.U.S.A.102,
15545–15550(2005).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
43
FigureTitlesandLegends
Figure1.SummaryofIndependentHLASignals
Thestrongestassociatedsignals(red)andputativeindependentsecondarysignals
(blue)areshownforeachdiseasealongwiththeirlocationwithintheregion.Independent
secondarysignalsweredefinedasthosewithresidualconditionalassociationusingthe
samesignificantthresholdastheprimaryassociationsignalmentionedinthemaintext.
Figure2.Heatmapshowingthegeneticcorrelations
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
44
Eachsquare[i,j]showsthelowerboundofthe95%confidenceintervalof
estimatedgeneticcorrelation(rg)betweentraitiandtraitj,whereiindexesrowandj
indexescolumns.Darkercolorsrepresentlargergeneticcorrelations.Allnegativevaluesto
zero.Phenotypeswereclusteredby’NearestPointAlgorithm’implementedinpython
“Seaborn”package.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
45
TablesTable1D
iscoverycohortcharacteristics
PhenotypeswithsignificantG
WASfindings
Chickenpox
ShinglesColdsores
Mononucleosis
Mum
psHepatitisB
Cases
107,769
16,711
25,108
17,457
31,227
1,425
Control
15,982
118,152
63,332
68,446
54,153
218,180
Plantarw
artsPositiveTB
Strepthroat aScarletfever
Pneumonia
Bacterialmeningitis
Cases
24,994
4,426
[qt]5
2,487
6,812
40,600
842
Control
37,451
84,290
22,017
113,837
90,039
82,778
Yeastinfections a
UTIfrequency
aTonsillectom
yChildhoodearinfection
Myringotom
y
Cases
[qt]5
2,218
[qt]3
5,000
60,098
46,936
4,138
Control
10,235
33,478
113,323
74,874
85,089
Phenotypesw
ithnosignificantGWASfindings
Measles
HepatitisA
Rheumaticfever
#ofColdslastyear aRubella
ChronicsinusinfectionCases
38,219
2,442
1,115
[qt]4
3,826
12,000
5,291
Control
47,279
217,137
88,076
15,720
71,597
79,622
a.Strepthroat,yeastin
fections,urinarytractin
fection(UTI)frequencyandnumberofcoldslastyearwerequantita
tivetraits([qt]).
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
46
Table2:Genome-w
ideSignificantAssociationsforEachDisease
PhenotypeCytoband
GenecontextVariants
PositionAlleles
aFreq
OR/effect b
(95%CI)
P-valueChickenpox
6p21.33
[HLA]
rs9266089
31320509
G/A
0.85
1.12
(1.1-1.14)
1.00E-10
Shingles6p21.33
[HLA]
rs2523591
31326960
G/A
0.58
1.14
(1.13-1.16)
1.74E-27
9p21.3
[IFNA21]
rs7047299
21167465
A/G
0.56
1.07
(1.06-1.09)
1.67E-08
Coldsores6p21.33
[HLA]
rs885950
31140152
C/A
0.42
1.08
(1.07-1.09)
7.47E-13
Mononucleosis
6p21.33
[HLA]
rs2596465
31412948
T/C
0.47
1.08
(1.06-1.09)
2.48E-09
Mum
ps
19q13.33
[FUT2]
rs516316
49206145
C/G
0.48
1.25
(1.24-1.27)
9.63E-72
6p22.1
[HLA]
rs114193679
29925103
G/C
0.99
1.72
(1.62-1.84)
2.23E-17
14q32.2
C14orf132--[]--B
DKRB2
rs11160318
96659919
G/A
0.68
1.1
(1.08-1.11)
4.56E-12
11q24.2
[ST3GAL4]
rs3862630
126243649
T/C
0.12
1.13
(1.1-1.15)
1.21E-08
HepatitisA
19q13.2
[IFNL4]
rs66531907
39740675
A/C
0.21
1.23
(1.14-1.33)
5.70E-08
HepatitisB
6p21.32
[HLA]
rs9268652
32409056
G/A
0.74
1.32
(1.25-1.38)
3.14E-09
Plantarwarts
6p21.32
[HLA]
rs9272050
32599071
G/A
0.38
1.19
(1.17-1.21)
9.66E-31
1q21.3
CRCT1--[]--L
CE3E
rs6692209
152506444
T/C
0.62
1.08
(1.06-1.09)
5.25E-09
PositiveTBTest
6p21.32
[HLA]
rs2894257
32433276
C/G
0.53
1.36
(1.33-1.39)
8.16E-36
Strepthroat6p21.33
[HLA]
rs1055821
31321919
T/G
0.06
0.08
(0.06-0.09)
7.69E-11
1p36.23
ERRFI1---[]---S
LC45A1
rs35395352
8266829
D/I
0.72
0.03
(0.03-0.04)
3.90E-08
Scarletfever6p21.32
[HLA]
rs36205178
32627938
C/T
0.81
1.24
(1.21-1.28)
9.49E-14
Pneumonia
6p21.33
[HLA]
rs3131623
31418810
T/A
0.85
1.1
(1.09-1.12)
1.99E-15
Bacterialmeningitis
17q21.33
[CA10]
rs1392935
49939129
G/A
0.91
1.78
(1.6-1.99)
1.19E-08
Yeastinfections
18q12.1
[DSG1]
rs200520431
28918197
D/I
0.07
0.11
(0.09-0.12)
1.87E-16
14q23.1
PRKCH
rs2251260
62023767
T/C
0.17
0.05
(0.04-0.06)
3.46E-10
14q32.2
[]---C14orf177
rs7161578
98586162
T/C
0.39
0.04
(0.03-0.04)
4.04E-08
UTIfrequency
8q24.3
JRK-[]-P
SCA
rs2976388
143760256
G/A
0.56
0.04
(0.04-0.05)
3.27E-10
15q15.3
[FRMD5]
rs146906133
44262951
T/C
1.00
0.38
(0.32-0.45)
2.02E-08
Tonsillectomy
12p13.31
[LTBR]
rs10849448
6493351
A/G
0.25
1.13
(1.12-1.14)
2.35E-35
22q12.2
[MTMR3]
rs201112509
30324654
I/D
0.69
1.12
(1.11-1.13)
3.39E-34
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
47
17p11.2
[TNFRSF13B]
rs34557412
16852187
G/A
0.01
1.59
(1.52-1.67)
2.66E-21
6p21.33
[HLA]
rs41543314
31322690
G/A
0.03
1.23
(1.2-1.26)
5.36E-21
13q33.3
[TNFSF13B]
rs200748895
108960380
D/I
0.03
1.26
(1.23-1.29)
1.20E-17
7p12.3
IGFBP3---[]
rs80077929
46094089
T/C
0.11
1.11
(1.1-1.13)
1.74E-15
4q24
[NFKB1]
rs230523
103462038
T/C
0.67
1.07
(1.06-1.08)
4.54E-14
7p15.2
HOXA1-[]-H
OXA2
rs6668
27138183
T/C
0.37
1.06
(1.06-1.07)
1.71E-13
14q21.1
MIPOL1-[]--F
OXA1
rs148131694
38025098
T/C
0.44
1.07
(1.06-1.08)
7.86E-13
4q24
[TET2]
rs1391439
106151642
G/A
0.40
1.06
(1.05-1.07)
2.76E-12
2p14
[SPRED2]
rs201473667
65562862
D/I
0.47
1.06
(1.05-1.07)
5.39E-12
20q13.12
NCOA5--[]-C
D40
rs6032664
44739419
A/T
0.26
1.07
(1.06-1.08)
7.31E-12
7p12.2
C7orf72---[]--IK
ZF1
rs876037
50308692
T/A
0.68
1.06
(1.05-1.07)
1.12E-11
13q21.33
[KLHL1]
rs9542155
70579527
T/C
0.32
1.06
(1.05-1.07)
1.92E-11
7p22.2
[GNA12]
rs2644312
2841164
G/A
0.71
1.06
(1.05-1.07)
2.73E-11
3q21.2
[SLC12A8]
rs1980080
124911841
C/T
0.33
1.06
(1.05-1.07)
3.16E-11
16p11.2
[SBK1]
rs141876325
28330968
I/D
0.28
1.06
(1.05-1.07)
4.72E-11
1q41
DUSP10---[]---H
HIPL2
rs12126292
222165909
G/T
0.12
1.09
(1.07-1.1)
7.21E-11
4p15.2
LGI2--[]-S
EPSECS
rs10939037
25114601
A/G
0.56
1.05
(1.05-1.06)
1.42E-10
12q24.12
[SH2B3]
rs3184504
111884608
T/C
0.50
1.05
(1.04-1.06)
3.07E-10
4q21.1
[CXCL13]
rs7685785
78440425
C/T
0.89
1.09
(1.07-1.1)
4.52E-10
7q31.2
MDFIC---[]---T
FEC
rs2023703
114944919
C/A
0.33
1.05
(1.05-1.06)
7.79E-10
19p13.2
ADAMTS10---[]--A
CTL9
rs2918308
8789912
A/C
0.84
1.07
(1.06-1.08)
1.10E-09
17p13.3
[RAP1GAP2]
rs67968065
2729247
D/I
0.52
1.05
(1.04-1.06)
2.08E-09
21q22.11
[ITSN1]
rs200746495
35052995
D/I
0.87
1.07
(1.06-1.09)
4.45E-09
2q33.3
[ADAM23]
rs1448903
207308961
G/A
0.09
1.08
(1.07-1.1)
5.13E-09
1p36.23
RERE--[]--E
NO1
rs12068123
8894346
G/A
0.45
1.05
(1.04-1.06)
5.60E-09
11p15.4
[ST5]
rs11042055
8756856
A/G
0.43
1.05
(1.04-1.06)
6.96E-09
20q13.33
[SAMD10]
rs41278232
62606617
G/A
0.89
1.08
(1.06-1.09)
7.54E-09
9q34.2
ABO-[]--S
URF6
rs635634
136155000
T/C
0.20
1.06
(1.05-1.07)
8.47E-09
17q11.2
[FOXN1]
rs62066768
26858395
A/G
0.08
1.09
(1.07-1.11)
1.09E-08
6q23.3
OLIG3---[]---T
NFAIP3
rs11757201
138003822
C/G
0.20
1.06
(1.05-1.07)
2.97E-08
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
48
16p11.2
MAPK3--[]--C
ORO1A
rs12931792
30156398
G/A
0.54
1.06
(1.05-1.07)
4.06E-08
7q31.2
[WNT2]
rs200608253
116917244
I/D
0.52
1.05
(1.04-1.05)
4.07E-08
22q11.21
[TBX1]
rs41298830
19752943
A/C
0.77
1.06
(1.05-1.07)
4.95E-08
ChildhoodEarInfections
19q13.33
[FUT2]
rs681343
49206462
C/T
0.52
1.11
(1.1-1.12)
3.51E-30
22q11.21
[TBX1]
rs1978060
19749525
G/A
0.60
1.09
(1.08-1.1)
1.17E-19
10p12.1
RAB18--[]--M
KX
rs2808290
27900882
C/T
0.51
1.07
(1.07-1.08)
5.09E-16
15q26.2
SPATA8---[]---L
INC00923
rs7174062
97452829
G/A
0.73
1.08
(1.07-1.09)
3.49E-14
6p21.32
[HLA]
rs4329147
32566577
T/C
0.83
1.11
(1.09-1.13)
9.55E-12
9q34.2
[ABO]
rs8176643
136149095
D/I
0.36
1.06
(1.05-1.07)
3.67E-11
2p16.1
[EFEMP1]
rs1802575
56093204
G/C
0.87
1.09
(1.07-1.1)
1.47E-10
2p24.1
NT5C1B-RDH14---[]---
OSR1
rs5829676
19274784
D/I
0.40
1.06
(1.05-1.07)
1.78E-10
11q13.3
FGF3---[]--A
NO1
rs72931768
69829717
G/C
0.88
1.09
(1.07-1.1)
2.63E-09
7q11.22
[AUTS2]
rs35213789
69268012
C/T
0.74
1.06
(1.05-1.07)
3.75E-09
7q22.3
[CDHR3]
rs114947103
105658927
C/T
0.18
1.07
(1.06-1.08)
5.40E-09
8q22.2
NIPAL2--[]--K
CNS2
rs13281988
99372329
C/G
0.31
1.06
(1.05-1.07)
9.84E-09
3p21.31
[BSN]
rs67035515
49656530
I/D
0.18
1.07
(1.05-1.08)
1.56E-08
6q26
[PLG]
rs73015965
161127501
G/A
0.01
1.43
(1.34-1.53)
3.78E-08
Myringotom
y22q11.21
[TBX1]
rs1978060
19749525
G/A
0.60
1.17
(1.14-1.2)
3.34E-10
a. TheallelesforSNParerepresentedashigh-riskallele/low-riskallele;
b. Strepthroat,p
lantarwartsandUTIfrequencyarequantita
tivetraits,forwhichwereportedtheeffectsizeinsteadofth
eoddsratio.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
49
Table3:HLAFine-m
appingResultsforEachDisease
PhenotypeGene
CytobandVariants
bPosition
Allelesc
FreqOR/
Effect d(95%
CI)Con.
P-valueP-value
Chickenpox[HLA-A]
6p22.1
A_Gly107
29911093
GW
0.71
1.09
(1.07-1.1)
3.77E-10
3.77E-10
6p22.1
A*02:01
29911953
NA
0.27
0.92
(0.91-0.93)
1.08E-09
1.08E-09
[HLA-B]
6p21.33
rs9266089
31320509
G/A
0.85
1.12
(1.1-1.14)
1.00E-10
1.00E-10
Shingles
[HLA-A]
6p22.1
rs2523815
29840243
A/G
0.64
1.12
(1.11-1.14)
6.17E-20
1.83E-22
6p22.1
A_Arg97
29911063
IRM
0.36
0.88
(0.87-0.9)
2.75E-22
2.75E-22
6p22.1
A*02:01
29911953
NA
0.27
0.88
(0.87-0.89)
1.72E-12
8.91E-20
[HLA-B]
6p21.33
B*44:02
31323318
NA
0.09
0.81
(0.79-0.82)
3.54E-23
3.54E-23
6p21.33
B*57:01
31323318
NA
0.04
1.22
(1.18-1.25)
1.03E-08
1.81E-10
6p21.33
B_Trp147
31324051
WL
0.95
0.92
(0.89-0.94)
6.67E-10
1.24E-03
6p21.33
B_GluMet45
31324602
EMTKG
0.52
1.11
(1.1-1.13)
9.51E-16
1.65E-18
6p21.33
rs2523591
31326960
G/A
0.58
1.14
(1.13-1.16)
1.74E-27
1.74E-27
[TNXB]
6p21.33
rs41316748
32019512
C/T
0.04
1.19
(1.15-1.23)
1.02E-08
1.44E-12
6p21.32
DRB1*11:01
32552079
NA
0.06
0.79
(0.77-0.82)
9.84E-10
2.96E-11
6p21.32
DRB1_PheSerHis13
32552131
FSHYGR
0.66
0.92
(0.91-0.94)
1.22E-09
5.90E-10
Coldsores
POU5F1-[]--
HCG27
6p21.33
rs885950
31140152
C/A
0.42
1.08
(1.07-1.09)
7.47E-13
7.47E-13
[HLA-B]
6p21.33
B_ThrGly45
31324602
EMTKG
0.23
0.91
(0.9-0.92)
4.91E-12
4.91E-12
MICA--[]--
MICB
6p21.33
rs4360170
31430359
A/G
0.91
1.13
(1.11-1.16)
3.38E-09
3.41E-12
Mononucleosis
MICA--[]--
MICB
6p21.33
rs2596465
31412948
T/C
0.47
1.08
(1.06-1.09)
2.48E-09
2.48E-09
Mum
ps[HLA-A]
6p22.1
A_Gln43
29910660
QR
0.99
1.71
(1.6-1.82)
2.51E-17
2.51E-17
6p22.1
A*02:05
29911953
NA
0.01
0.56
(0.52-0.6)
1.73E-17
1.73E-17
6p22.1
rs114193679
29925103
G/C
0.99
1.72
(1.62-1.84)
2.23E-17
2.23E-17
HepatitisB
[HLA-DRA]
6p21.32
rs9268652
32409056
G/A
0.74
1.32
(1.25-1.38)
3.14E-09
3.14E-09
Plantarwarts
[HLA-DRB1]
6p21.32
DRB1*04:01
32552079
NA
0.08
1.21
(1.19-1.24)
4.88E-18
4.88E-18
6p21.32
DRB1*15:01
32552079
NA
0.13
1.14
(1.12-1.15)
3.50E-19
4.48E-14
6p21.32
DRB1_PheSerTyr13
32552131
FSHYGR
0.63
0.85
(0.84-0.86)
1.84E-39
1.84E-39
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
50
6p21.32
rs28752534
32585068
T/C
0.64
1.11
(1.1-1.13)
9.28E-09
1.50E-30
6p21.32
rs9272050
32599071
G/A
0.38
1.19
(1.17-1.21)
9.65E-31
9.66E-31
PositiveTBTest
[HLA-B]
6p21.33
B*08:01
31323318
NA
0.11
1.29
(1.24-1.33)
1.73E-11
8.43E-14
[C6orf47]
6p21.33
rs148844907
31628397
A/T
0.01
1.92
(1.74-2.12)
4.95E-10
2.21E-12
[HLA-DRA-
DQA1]
6p21.32
rs3135359
32390578
C/T
0.71
1.21
(1.18-1.24)
2.63E-13
8.82E-21
6p21.32
rs2894257
32433276
C/G
0.53
1.36
(1.33-1.39)
8.16E-36
8.16E-36
6p21.32
DRB1_Leu67
32551969
LIF
0.41
1.3
(1.27-1.33)
1.14E-29
1.14E-29
6p21.32
DQA1*01:02
32608660
NA
0.20
0.75
(0.73-0.78)
7.04E-22
7.04E-22
6p21.32
DQA1*03:01
32608660
NA
0.10
1.27
(1.22-1.32)
1.59E-09
3.99E-10
6p21.32
DQA1_His129
32609872
QHx
0.66
1.19
(1.16-1.22)
2.40E-13
1.56E-13
Strepthroat[HLA-B]
6p21.33
rs1055821
31321919
T/G
0.06
0.08
(0.06-0.09)
7.69E-11
7.69E-11
[HLA-DRB1]
6p21.32
DRB1_CysTyr30
32552080
CYLGRH
0.83
-0.04
(-0.05--
0.04)
2.14E-09
2.14E-09
Scarletfever[HLA-DQB1]
6p21.32
rs36205178
32627938
C/T
0.81
1.24
(1.21-1.28)
9.49E-14
9.49E-14
6p21.32
DQB1*03:01
32630853
NA
0.20
0.83
(0.81-0.86)
2.98E-14
2.98E-14
6p21.32
DQB1_Gly45
32632724
GE
0.80
1.19
(1.16-1.22)
6.35E-14
6.35E-14
Pneumonia
[HLA-B]
6p21.33
B*08:01
31323318
NA
0.11
0.91
(0.9-0.92)
2.87E-11
2.87E-11
6p21.33
B_SerTrpAsn97
31324201
STWRNV
0.34
0.94
(0.93-0.95)
2.37E-12
2.37E-12
MICA--[]--
MICB
6p21.33
rs3131623
31418810
T/A
0.85
1.1
(1.09-1.12)
1.99E-15
1.99E-15
[HLA-DRB1]
6p21.32
DRB1_LeuSer11
32552137
LSVGDP
0.52
0.94
(0.94-0.95)
1.42E-09
4.10E-11
Tonsillectomy
[HLA-B]
6p21.33
rs41543314
31322690
G/A
0.03
1.23
(1.2-1.26)
5.36E-21
5.36E-21
6p21.33
B*57:01
31323318
NA
0.04
1.22
(1.19-1.24)
6.29E-22
6.29E-22
6p21.33
B_Val97
31324201
STWRNV
0.04
1.21
(1.19-1.24)
1.44E-21
1.44E-21
[LST1,LST1]
6p21.33
rs1052248
31556581
A/T
0.28
1.05
(1.04-1.06)
3.69E-08
3.15E-17
[HLA-DRB1]
6p21.32
DRB1_LeuValGly11
32552137
LSVGDP
0.42
1.06
(1.05-1.07)
1.67E-09
1.02E-12
6p21.32
rs140177540
32569453
D/I
0.47
1.06
(1.05-1.07)
2.92E-12
5.02E-15
GRM4--[]-
HMGA1
6p21.31
rs201220830
34195176
D/I
0.05
1.11
(1.09-1.14)
1.32E-08
3.65E-08
ChildhoodEarInfections
[HLA-DRB1]
6p21.32
DRB1_Gln96
32549612
EHYxQ
0.17
0.92
(0.91-0.93)
2.11E-12
2.11E-12
6p21.32
rs4329147
32566577
T/C
0.83
1.11
(1.09-1.13)
9.55E-12
9.55E-12
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
51
6p21.32
DQB1*06:02
32630853
NA
0.13
0.92
(0.91-0.94)
5.79E-10
5.79E-10
a. Weidentify
independentassociationsforeachcategoryofvariants:SNP,HLA(classicalallele)andAA(HLAaminoacid),separatelyusing
iterativeconditio
nalregression.Foraminoacidpolymorphism,th
elabelspecifie
stheaminoacidandthepositio
n.Forexample,‘A-Val95’
meansaminoacidValatpositio
n95ofth
eHLA-Aprotein.W
eusethisnamingconventionthroughoutth
epaper,whichiscomposedfrom
thenameofth
eHLAprotein(A),followedbythethree-lettercodefortheaminoacid(Val)anditslocationintheprotein(95).Theamino
acidpolymorphismishighlightedas‘red’ifth
esignalin
thatgeneregioncanbeattributetoitb
yreciprocalanalysis.
b. TheallelesforSNParerepresentedasriskallele/non-riskallele;th
ealleleforHLAclassicalalleleisrecordedas‘NA’,w
hichmeansit
eitherhasordoesnothasthetestedallele;th
eallelesforHLAaminoacidpolymorphismsareallth
epossibleaminoacids(one-letter
code)atth
atpositio
n.
c. Strepthroat,p
lantarwartsandUTIfrequencyarequantita
tivetraits,forwhichwereportedtheeffectsizeinsteadofth
eoddsratio.
d. The‘Cond.P-value’columncontainsconditio
nalp-valuesthatareafterite
rativeconditio
nalregressionwithineachcategoryofvariants
(SNP,im
putedHLAallelesandim
putedHLAaminoacidpolymorphisms).The‘P-value’columncontainstheunconditio
nalassociation
testp-values.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
52
Table4:MEGAN
TAPathwayAnalysisonM
umpsGW
AS
Pathway
Pathway.
PvaluePathw
ay.FD
RGene
Gene.Pvalue
SNP
SNP.Pvalue
Glycosphingolipidbiosynthesis
1.00E-04
0.0029
FUT1
0.00E+00
rs584768
2.38E-62
FUT2
0.00E+00
rs516316
9.63E-72
ST3GAL4
1.12E-08
rs3862630
1.21E-08
B3GNT1
4.20E-02
rs7947391
2.42E-04
GCNT2
4.34E-02
rs147839826
7.16E-05
B3GNT2
4.46E-02
rs116508040
2.60E-04
ABO
4.97E-02
9:136149709:AC_A
5.81E-05
Themostsignific
antpathway(genesets)withagenesetenrichmentFDR<0.01arepresented.Thegenesthatcontaingenome-wide
signific
antSNPsintheGWASanalysisarehighlightedred.
Table5:MEGAN
TAPathwayAnalysisonTonsillectom
yGWAS
Pathway
Pathway.
PvaluePathw
ay.FD
RGene
Gene.Pvalue
SNP
SNP.Pvalue
Intestinalimmune
networkforIgA
production2.00E-06
3.50E-03
LTBR
0.00E+00
rs10849448
2.35E-35
TNFSF13B
0.00E+00
rs200748895
1.20E-17
TNFRSF13B
0.00E+00
rs34557412
2.66E-21
CD40
3.19E-10
rs6032664
7.31E-12
ICOS
1.40E-03
rs231779
1.52E-06
CXCR4
4.75E-03
rs2680880
6.91E-06
CCL27
7.23E-03
rs12350154
6.71E-06
TNFRSF17
8.69E-03
rs456178
1.13E-05
IL6
9.25E-03
rs57561814
7.45E-06
IL15RA
1.63E-02
rs11256464
6.99E-06
TACIandBCMA
stimulationofB
9.00E-04
7.80E-03
TNFSF13B
0.00E+00
rs200748895
1.20E-17
TNFRSF13B
0.00E+00
rs34557412
2.66E-21
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
53
cellimmune
responses.NFKB1
9.08E-14
rs230523
4.54E-14
TNFRSF17
8.69E-03
rs456178
1.13E-05
TRAF2
1.21E-02
rs17250694
1.56E-05
Ceramide
signalingpathway
3.00E-04
7.95E-03
TNFRSF1A
0.00E+00
rs10849448
2.35E-35
NFKB1
9.08E-14
rs230523
4.54E-14
MAPK3
3.78E-06
rs12931792
4.06E-08
MAP2K2
4.75E-04
rs350906
5.34E-07
MYC
1.13E-03
rs13279820
1.44E-06
TRAF2
1.21E-02
rs17250694
1.56E-05
MADD
3.01E-02
rs2856661
3.76E-05
CTSD
3.13E-02
rs7929314
3.17E-05
CASP8
3.15E-02
rs190209406
3.72E-05
Glucocorticoidreceptorregulatorynetw
ork
5.00E-04
9.93E-03
NFKB1
9.08E-14
rs230523
4.54E-14
MAPK3
3.78E-06
rs12931792
4.06E-08
NCOA2
9.50E-04
rs1481046
5.65E-07
SELE
1.66E-03
rs6691453
5.38E-07
BGLAP
1.94E-03
rs2075165
2.52E-06
IL6
9.25E-03
rs57561814
7.45E-06
MAPK11
1.20E-02
rs4076566
1.23E-05
SMARCA4
1.41E-02
rs9797720
1.58E-05
SPI1
2.48E-02
rs11604680
3.38E-05
FKBP5
3.06E-02
rs74853132
1.68E-05
ICAM1
3.12E-02
rs74956615
4.07E-05
PCK2
3.34E-02
14:24604399:AAC_A
5.33E-05
Themostsignific
antpathway(genesets)withagenesetenrichmentFDR<0.01arepresented.Thegenesthatcontaingenome-wide
signific
antSNPsintheGWASanalysisarehighlightedred.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint
54
Table6:MEGAN
TAPathwayAnalysisonChildhoodEarInfectionsGW
AS
Pathway
Pathway.
PvaluePathw
ay.FD
RGene
Gene.Pvalue
SNP
SNP.Pvalue
Catalysisofthetransferof
organicanionsfrom
onesideofamem
branetotheother
9.00E-04
0.0062
SLCO1A2
1.44E-03
rs10841784
2.76E-06
SLCO1B1
3.45E-03
rs2417971
5.67E-06
ABCC3
4.74E-03
rs186049592
1.82E-05
SLC22A6
3.43E-02
rs67754977
1.20E-04
SLC22A8
3.80E-02
rs67754977
1.20E-04
Themostsignific
antpathway(genesets)withagenesetenrichmentFDR<0.01arepresented.Thegenesthatcontaingenome-wide
signific
antSNPsintheGWASanalysisarehighlightedred.
.CC-BY-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted September 9, 2016. . https://doi.org/10.1101/073056doi: bioRxiv preprint