ALFRED: the ALlele FREquency Database

Post on 21-Dec-2015


Kenneth K. Kidd and the ALFRED Team

Department of Genetics and Center for Medical Informatics, Yale University School of Medicine

Supported by the U.S. National Science Foundation

Introduction

What is ALFRED?

Why is it necessary?

What are we doing now?

What is in ALFRED?

How to access ALFRED

What is ALFRED?

ALFRED, the ALlele FREquency Database, is designed to integrate into a single source information on the frequencies of human DNA sequence variants.

ALFRED Home Page: http://alfred.med.yale.edu/alfred/

ALFRED is designed to allow reference of frequencies to:

A specific typing protocol for a specific polymorphism at a specific locus.

A specific sampling of an ethnic group.

Cross reference to the literature for other publications of frequencies based on the same sample or ethnic group.

Why is it necessary?

ALFRED is designed to serve as a central repository of frequencies for variation in the human genome, curated and cross-referenced to molecular and ethnographic databases, by assembling in one place data that are dispersed very widely in the scientific literature.

Why is it necessary?

ALFRED is web-based and publicly available, with easy-to-download data, and thus serves as a resource for many types of research projects.

With its graphic displays of data, ALFRED can also serve as an educational resource for physical anthropology and human population genetics.

What are we doing now? Data content - Quality control

Criteria: a minimum typed sample size of 20 individuals, minimization of missing data, and time-stamped frequency data (i.e., different versions of frequency data are available).
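The criteria above could be checked along the following lines. This is only a minimal sketch in Python, not ALFRED's actual code: the record layout, the field names, and the helper functions are all hypothetical. The slide gives no threshold for missing data, so the sketch only reports the missing fraction rather than enforcing a cutoff.

```python
from dataclasses import dataclass
from datetime import date

# Minimum typed sample size stated in the criteria above.
MIN_TYPED_INDIVIDUALS = 20


@dataclass(frozen=True)
class FrequencyRecord:
    """One frequency entry. All field names are hypothetical."""
    site: str
    sample: str
    typed_individuals: int    # individuals successfully typed at this site
    sampled_individuals: int  # individuals in the sample as a whole
    entered_on: date          # time stamp identifying this version of the data


def meets_sample_size(record: FrequencyRecord) -> bool:
    """True when at least 20 individuals were typed."""
    return record.typed_individuals >= MIN_TYPED_INDIVIDUALS


def missing_fraction(record: FrequencyRecord) -> float:
    """Fraction of sampled individuals with no typing result."""
    if record.sampled_individuals == 0:
        return 1.0
    return 1.0 - record.typed_individuals / record.sampled_individuals


def latest_version(records: list[FrequencyRecord]) -> FrequencyRecord:
    """Most recently time-stamped version; raises ValueError on an empty list."""
    return max(records, key=lambda r: r.entered_on)
```

Keeping every time-stamped version and selecting the newest on demand is one simple way to make older versions of the frequency data remain available.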

What are we doing now? Data integration and accumulation

ALFRED curators are currently uploading allele frequency data from the published literature across peer-reviewed physical anthropology and population genetics journals.

What are we doing now? Data management

ALFRED programmers are currently working on the migration of ALFRED from Access to Oracle in order to handle the rapidly growing database.

What is in ALFRED?

ALFRED stores allele frequencies and information on a wide range of loci, polymorphic sites, populations, and samples.
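One way to picture how these kinds of information relate is the small Python data model below. It is a sketch under assumptions, not ALFRED's schema: the class and field names are hypothetical, and the nesting (a site belongs to a locus, a sample belongs to a population, a frequency ties an allele to one site in one sample) is inferred from the slide titles rather than stated by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    symbol: str      # e.g. "COMT"
    chromosome: str  # e.g. "22"


@dataclass(frozen=True)
class PolymorphicSite:
    locus: Locus
    name: str        # label for the site or haplotype (hypothetical field)


@dataclass(frozen=True)
class Population:
    name: str        # e.g. "Maya, Yucatan"
    region: str      # e.g. "North America"


@dataclass(frozen=True)
class Sample:
    population: Population
    label: str       # identifies one specific sampling of the population
    individuals: int


@dataclass(frozen=True)
class AlleleFrequency:
    site: PolymorphicSite
    sample: Sample
    allele: str
    frequency: float


def frequencies_sum_to_one(entries: list[AlleleFrequency],
                           tolerance: float = 1e-6) -> bool:
    """Check that the allele frequencies for one site in one sample sum to 1."""
    if not entries:
        return False
    # All entries must describe the same site in the same sample.
    if len({(e.site, e.sample) for e in entries}) != 1:
        raise ValueError("entries mix different sites or samples")
    return abs(sum(e.frequency for e in entries) - 1.0) <= tolerance
```

Tying each frequency to a sample rather than directly to a population keeps separate samplings of the same group distinguishable.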

Table Summary Numbers - As of April 9, 2002

Loci - Example: Chromosome 22

Locus - Example: Catechol-O-Methyl Transferase

Definition of the Polymorphism

A clear protocol:

PCR primers and product sizes for In/Dels and STRPs

PCR primers and fragment sizes after enzyme digestion for RSPs

Unambiguous definition of varying nucleotides based on flanking sequence.

Polymorphisms - Example: COMT, 3-site haplotype

Allele Frequencies - Example: COMT, 3-site haplotype

Populations - Example: North America

Populations - Example: Maya, Yucatan

Samples - Example: Maya, Yucatan

Frequency data retrieval - Search

Frequency Variation for Four SNPs in 33 Populations

[Figure: frequency, on an axis from 0 to 1, for DRD2 TaqI "D", RET TaqI, APOB XbaI, and RBP3 MspI, with populations grouped as Africa, Europe/Middle East, East Asia, North America, South America, P., and S.]

ALFRED System Implementation

Microsoft Access (migration to Oracle)

Microsoft NT Server with Internet Information Server (IIS)

Scripts written in server-side ASP (VBScript)
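The components listed above form a familiar arrangement: a server-side script receives a search request, queries the database, and returns rows for display. The Python sketch below illustrates only that lookup step. It is not ALFRED's code: Python's built-in sqlite3 module stands in for ASP and the Access or Oracle back end, and the table name, column names, and functions are hypothetical.

```python
import sqlite3

# Hypothetical single-table layout, used only for illustration.
SCHEMA = """
CREATE TABLE frequency (
    site       TEXT NOT NULL,
    population TEXT NOT NULL,
    allele     TEXT NOT NULL,
    frequency  REAL NOT NULL
);
"""


def open_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def frequencies_for_site(conn: sqlite3.Connection, site: str) -> list[tuple]:
    """Return (population, allele, frequency) rows for one site.

    The site name is passed as a bound parameter, never pasted into the SQL
    text, so user input from a search form cannot alter the query.
    """
    cursor = conn.execute(
        "SELECT population, allele, frequency "
        "FROM frequency WHERE site = ? "
        "ORDER BY population, allele",
        (site,),
    )
    return cursor.fetchall()
```

Because the query goes through a generic database interface, the same lookup logic could in principle survive a change of back end, which is relevant to a migration like the one from Access to Oracle.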

ALFRED System Overview

[Diagram components: Client Browser; Web Server (ASP); NT Server; ODBC; ALFRED; PhenoDB; External Data Resources; HAPLO Program. Input Data Sources: Kidd Lab Data, Collaborators, Others (e.g. literature).]

ALFRED: The ALlele FREquency Database from Kidd Lab

http://alfred.med.yale.edu/alfred/index.asp

Suggestions and comments are welcome.

The ALFRED Team

Senior Faculty
Kenneth K. Kidd, Ph.D., Professor of Genetics and Psychiatry (ALFRED P.I.)
Perry Miller, M.D., Ph.D., Director of Center for Medical Informatics

Curators
Chen-Chen Yeh, M.S., Research Associate
Rebekah Heinzen, B.A., Research Assistant

Programmers
Michael V. Osier, Ph.D. Candidate, Graduate Student
Haseena Rajeevan, Ph.D., Systems Programmer
Nicholas P. Tosches, M.D., Associate Research Scientist
Lyudmila Druskin, M.D., Postdoctoral Fellow and Associate

Consultants
Andrew J. Pakstis, Ph.D., Research Scientist
Judith R. Kidd, Ph.D., Research Scientist
Kei-Hoi Cheung, Ph.D., Assistant Professor