Gene Functional Annotation Tool: DAVID
Database for Annotation, Visualization and Integrated Discovery (DAVID)
Functional Annotation Tool: Gene Ontology, Protein interaction, Protein domain, Pathway, Disease
Gene ID Conversion, Gene Functional Classification
DAVID workflow
Upload the gene list to the website
Gene Name Batch Viewer / Gene Functional Classification / Functional Annotation Tool
Select categories for analysis
Get the results
Upload the gene list
Identifier types: AFFYMETRIX_3PRIME_IVT_ID, AFFYMETRIX_EXON_GENE_ID, AFFYMETRIX_SNP_ID, AGILENT_CHIP_ID, AGILENT_ID, AGILENT_OLIGO_ID, ENSEMBL_GENE_ID, ENSEMBL_TRANSCRIPT_ID, ENTREZ_GENE_ID, FLYBASE_GENE_ID, FLYBASE_TRANSCRIPT_ID, GENBANK_ACCESSION, GENOMIC_GI_ACCESSION, GENPEPT_ACCESSION, ILLUMINA_ID, IPI_ID, MGI_ID, OFFICIAL_GENE_SYMBOL, PFAM_ID, PIR_ID, PROTEIN_GI_ACCESSION, REFSEQ_GENOMIC, REFSEQ_MRNA, REFSEQ_PROTEIN, REFSEQ_RNA, RGD_ID, SGD_ID, TAIR_ID, UCSC_GENE_ID, UNIGENE, UNIPROT_ACCESSION, UNIPROT_ID, UNIREF100_ID, WORMBASE_GENE_ID, WORMPEP_ID, ZFIN_ID, Not Sure
1. Confirm the species
2. After selecting, click Use
3.
Functional Annotation Tool
DAVID Gene ID: an internal ID generated from the "DAVID Gene Concept" in the DAVID system. One DAVID gene ID represents one unique gene cluster belonging to a single gene entry.
Input gene list: 817
Mapped to DAVID database: 754
DAVID IDs: 734
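The drop from 817 input IDs to 734 DAVID IDs comes from two effects: some input IDs do not map at all, and several mapped IDs collapse into the same DAVID gene ID. A minimal sketch of that bookkeeping follows. The `id_to_david` lookup table and every ID in it are hypothetical; the real mapping lives inside DAVID and is not reproduced here.

```python
def summarize_mapping(input_ids, id_to_david):
    """Return (input IDs, mapped IDs, unique DAVID gene IDs) as counts."""
    mapped = [i for i in input_ids if i in id_to_david]
    david_ids = {id_to_david[i] for i in mapped}  # a set removes redundant IDs
    return len(input_ids), len(mapped), len(david_ids)


# Hypothetical toy data: two probes point at the same gene, one probe is unknown.
id_to_david = {"probe_1": "D100", "probe_2": "D100", "probe_3": "D200"}
input_ids = ["probe_1", "probe_2", "probe_3", "probe_4"]

print(summarize_mapping(input_ids, id_to_david))
```

The three numbers correspond to the three counts reported on the slide, only at toy scale.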
1.
Genes from your list involved in this annotation category
2.
4. Single chart report for this annotation category only.
3. 99 / 734
Functional Annotation Chart
The Chart Report is an annotation-term-focused view that lists annotation terms and their associated genes under study. To avoid over-counting duplicated genes, the Fisher Exact statistics are calculated on the corresponding DAVID gene IDs, from which all redundancies in the original IDs have been removed. Every result in the Chart Report has to pass the thresholds in the Chart Option section (by default, Max.Prob. <= 0.1 and Min.Count >= 2), so that only statistically significant terms are displayed.
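The effect of the two Chart Option thresholds can be sketched as a simple row filter. The row layout (`term`, `count`, `pvalue`) and the example values are assumptions made for illustration; only the default cutoffs come from the slide.

```python
def filter_chart(rows, max_prob=0.1, min_count=2):
    """Keep rows that pass both thresholds (Max.Prob. and Min.Count)."""
    return [r for r in rows
            if r["pvalue"] <= max_prob and r["count"] >= min_count]


# Hypothetical chart rows, not real DAVID output.
rows = [
    {"term": "term_A", "count": 5, "pvalue": 0.003},
    {"term": "term_B", "count": 1, "pvalue": 0.001},  # fails Min.Count
    {"term": "term_C", "count": 4, "pvalue": 0.4},    # fails Max.Prob.
]

default_view = [r["term"] for r in filter_chart(rows)]
all_terms = [r["term"] for r in filter_chart(rows, max_prob=1.0)]
print(default_view, all_terms)
```

Raising `max_prob` to 1.0 keeps every term that still meets the count threshold, whether or not its p-value is significant.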
Functional Annotation Chart
a modified Fisher Exact P-Value (EASE Score)
Number of results shown per page
List Total (LT) - number of genes in the gene list mapping to the category of which the term is a member
Population Hits (PH) - number of genes in the background gene list mapping to a specific term
Population Total (PT) - number of genes in the background gene list mapping to the category
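To show how these quantities feed into the EASE Score, here is a rough sketch built on a one-sided hypergeometric tail. It assumes the modification consists of counting one fewer list gene for the term, and it simplifies the contingency table by keeping the other totals unchanged, so it approximates the idea and does not reproduce DAVID's exact computation. The name `list_hits` (genes in the list annotated with the term) and the example numbers are illustrative.

```python
from math import comb


def right_tail(k, draws, successes, total):
    """P(X >= k) when `draws` genes are taken from `total` genes,
    of which `successes` carry the term (hypergeometric tail)."""
    k = max(k, 0)
    top = min(draws, successes)
    ways = sum(comb(successes, i) * comb(total - successes, draws - i)
               for i in range(k, top + 1))
    return ways / comb(total, draws)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Tail probability computed as if one fewer list gene hit the term."""
    return right_tail(list_hits - 1, list_total, pop_hits, pop_total)


# Illustrative numbers: 3 of 300 list genes vs. 40 of 30000 background genes.
fisher_p = right_tail(3, 300, 40, 30000)
ease_p = ease_score(3, 300, 40, 30000)
print(f"Fisher p = {fisher_p:.4f}, EASE = {ease_p:.4f}")
```

Because one hit is removed before testing, the EASE value is always at least as large as the plain Fisher p-value, which penalizes terms supported by only a few genes.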
RT (Related Term): Related Term Search can identify other similar terms.
RT (Related Term)
Any given gene is associated with a set of annotation terms. If genes share a similar set of those terms, they are most likely involved in similar biological mechanisms. The algorithm adopts kappa statistics to quantitatively measure the degree of agreement in how genes share similar annotation terms. The kappa result ranges from 0 to 1; the higher the value, the stronger the agreement. The terms can be any biological process or term from any of the functional categories listed in DAVID.
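The agreement measure described above can be sketched with a plain Cohen's kappa over binary "has this term / does not have this term" profiles. This is an illustrative version under that assumption, not DAVID's implementation; the term names are made up.

```python
def kappa(terms_a, terms_b, all_terms):
    """Cohen's kappa between two genes' annotation sets over `all_terms`."""
    n = len(all_terms)
    both = sum(1 for t in all_terms if t in terms_a and t in terms_b)
    neither = sum(1 for t in all_terms
                  if t not in terms_a and t not in terms_b)
    observed = (both + neither) / n            # observed agreement
    frac_a = sum(1 for t in all_terms if t in terms_a) / n
    frac_b = sum(1 for t in all_terms if t in terms_b) / n
    expected = frac_a * frac_b + (1 - frac_a) * (1 - frac_b)  # chance agreement
    if expected == 1.0:                        # both profiles are constant
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical universe of 10 terms; the two genes share 4 of their 5 terms.
all_terms = [f"term_{i}" for i in range(10)]
gene_x = set(all_terms[:5])
gene_y = set(all_terms[:4]) | {all_terms[5]}

print(round(kappa(gene_x, gene_y, all_terms), 3))
```

Plain Cohen's kappa can fall below 0 when two profiles agree less often than chance; how such cases are handled in DAVID is not stated in these slides.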
COG_ONTOLOGY refers to an ontology from NCBI's COG database, the database of Clusters of Orthologous Groups of proteins (COGs): a tool for genome-scale analysis of protein functions and evolution.
SP_PIR_KEYWORDS are keywords defined by SwissProt/UniProt and PIR (Protein Information Resource).
UP_SEQ_FEATURE refers to the annotation category UniProt Sequence Feature, found in the reports on the UniProt site.
Annotation Category - Functional Categories
Annotation Category – Protein domain & Protein Interaction
Protein structure
GO terms are categorized into 3 groups:
BP - Biological Process
MF - Molecular Function
CC - Cellular Component
GOTERM_BP_1 -> GO terms under Biological Process (BP) at Level 1.
GOTERM_BP_ALL -> GO terms under Biological Process (BP) at all possible levels.
GOTERM_BP_FAT - Essentially, this test examines the significance of enriched annotations.
GO FAT filters out very broad GO terms based on a measured specificity of each term (not level specificity).
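The naming pattern of these categories can be decoded mechanically. The helper below is a hypothetical convenience function, not part of DAVID; it assumes every GO category name follows the `GOTERM_<branch>_<level>` shape seen in the examples above.

```python
import re

GO_BRANCHES = {
    "BP": "Biological Process",
    "MF": "Molecular Function",
    "CC": "Cellular Component",
}


def parse_goterm_category(name):
    """Split a name such as 'GOTERM_BP_1' into (branch name, level)."""
    match = re.fullmatch(r"GOTERM_(BP|MF|CC)_(\d+|ALL|FAT)", name)
    if match is None:
        raise ValueError(f"not a recognized GO category name: {name}")
    branch, level = match.groups()
    # Numeric levels become ints; 'ALL' and 'FAT' stay as strings.
    parsed_level = int(level) if level.isdigit() else level
    return GO_BRANCHES[branch], parsed_level


print(parse_goterm_category("GOTERM_BP_1"))
print(parse_goterm_category("GOTERM_MF_FAT"))
```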
Annotation Category - Gene Ontology
Annotation Category - Pathways
Biocarta
KEGG
Select 11 categories
11 categories in total
Combined View Annotation
Functional Annotation Cluster
Functional Annotation Clustering
Due to the redundant nature of annotations, the Functional Annotation Chart presents similar or relevant annotations repeatedly, which dilutes the biological focus of the report. To reduce this redundancy, the newly developed Functional Annotation Clustering report groups and displays similar annotations together, which makes the biology clearer and more focused than in the traditional chart report.
• Functional Annotation Clustering integrates the same techniques: Kappa statistics to measure the degree of common genes between two annotations, and fuzzy heuristic clustering to classify groups of similar annotations according to their kappa values.
All genes involved in this annotation cluster
EASE score (modified Fisher exact test)
Heat map
Adjust the Kappa statistics parameters
Adjust the fuzzy heuristic clustering parameters
P_value
Enrichment Score = [ -log(P_value_1) - log(P_value_2) - ... - log(P_value_n) ] / n
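As a worked example of the formula, the score is the average of the negative log p-values of the cluster's terms. The sketch assumes a base-10 logarithm, which the slide does not state, and uses made-up p-values.

```python
from math import log10


def enrichment_score(p_values):
    """Mean of -log10(p) over all terms in one annotation cluster."""
    return sum(-log10(p) for p in p_values) / len(p_values)


# Hypothetical cluster with three terms: (2 + 3 + 1) / 3
score = enrichment_score([0.01, 0.001, 0.1])
print(round(score, 3))
```

Under this assumption, a cluster whose terms all have p = 0.05 would score about 1.3, so higher scores indicate clusters whose terms are more strongly enriched.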
Initial Group Members (any value >= 2; default = 4): the minimum number of genes in a seeding group, which affects the minimum size of each functional group in the final result. In general, a lower value attempts to include more genes in functional groups and, in particular, generates many small groups.
Final Group Members (any value >= 2; default = 4): the minimum number of genes in a final group after the "cleanup" procedure. In general, a lower value attempts to include more genes in functional groups and, in particular, generates many small groups. It works together with the previous parameter to control the minimum size of functional groups. For a final cluster, it is the number of terms the cluster must contain to be presented in the output.
Multi-linkage Threshold (any value between 0% and 100%; default = 50%): controls how seeding groups merge with each other, i.e. two groups that share more than this percentage of their gene members become one group. A higher percentage generally gives sharper separation, i.e. it generates more final functional groups with more tightly associated genes in each group. In addition, changing this parameter does not move extra genes into the unclustered group.
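The interplay of the Multi-linkage Threshold and Final Group Members can be sketched as a greedy merge followed by a size filter. This is a simplified stand-in for the fuzzy heuristic clustering, not DAVID's algorithm: it assumes the shared percentage is measured against the smaller of the two groups, and all gene names are invented.

```python
def merge_seed_groups(seeds, linkage=0.5, final_min=4):
    """Merge seed groups whose overlap exceeds `linkage`, then drop small ones."""
    groups = [set(s) for s in seeds]
    merged = True
    while merged:                      # repeat until no pair can be merged
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                shared = len(groups[i] & groups[j])
                smaller = min(len(groups[i]), len(groups[j]))
                if smaller and shared / smaller > linkage:
                    groups[i] |= groups[j]
                    del groups[j]
                    merged = True
                    break
            if merged:
                break
    return [g for g in groups if len(g) >= final_min]


# Hypothetical seeding groups: the first two share 3 of 4 genes (75%).
seeds = [{"a", "b", "c", "d"}, {"b", "c", "d", "e"}, {"w", "x", "y", "z"}]

print(len(merge_seed_groups(seeds, linkage=0.5)))
print(len(merge_seed_groups(seeds, linkage=0.8)))
```

With the 50% threshold the two overlapping seeds fuse into one group, while the 80% threshold keeps them apart, which matches the slide's remark that a higher percentage yields more, tighter groups.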
• If you run both functions with default settings, their results will not overlap completely. In general, the clustering result may contain more terms than the chart. In clustering, some 'non-significant' terms can be included through their link to 'significant' neighbors (co-members in one cluster).
• If you want to cross-link the two reports completely, run the chart report with the p-value cutoff set to "1" (ground level). You will then get all possible terms, with significant or insignificant p-values.
Chart vs Cluster
Upload the gene list to the website
Gene Name Batch Viewer / Gene Functional Classification / Functional Annotation Tool
Select categories for analysis
Get the results
Other Tools in DAVID
Gene Name Batch Viewer
Gene Functional Classification Tool
Term report
Gene Functional Classification Tool - Create sublist
Gene ID Conversion Tool
Thank you for your attention