RBP1 Splicing Regulation in Drosophila Melanogaster



Page 1: RBP1 Splicing Regulation in  Drosophila Melanogaster

RBP1 Splicing Regulation in Drosophila Melanogaster

03-711 - Fall 2005

Jacob Joseph, Ahmet Bakan, Amina Abdulla

This presentation available at http://www.jjoseph.org/biology/

Page 2: RBP1 Splicing Regulation in  Drosophila Melanogaster

Alternative Splicing in Drosophila

Page 3: RBP1 Splicing Regulation in  Drosophila Melanogaster

RBP1 Regulation

Involved in dsx splicing and Rbp1 auto-regulation
Suspected in many other related pathways

Page 4: RBP1 Splicing Regulation in  Drosophila Melanogaster

Genome Data

Sequence of all introns of known splice variants
Two annotated genomes available
  D. Melanogaster
  D. Pseudoobscura
As the gene names for D. Mel. and D. Pseu. differ, a list of gene orthologs was also obtained

Page 5: RBP1 Splicing Regulation in  Drosophila Melanogaster

Computational Approach

Create profile HMM for each motif (B-B, B-A)
Select the end of every intron (~50 bases)
Perform an HMM search for each intron segment, in both D. Mel. and D. Pseu.
Keep matches found in both species
Keep matches at the end of introns (~15 bases)
Return alignment of both species
Examine biological similarity of matches
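The selection and cross-species filtering steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the set-based representation of hits, and the gene `CG0000`/`GA99999` pair are hypothetical, and the `fru`/`GA12896` and `ps`/`GA20847` pairs are assumed to be orthologs because they appear together on the results slides.

```python
from typing import Dict, List, Set, Tuple

TAIL_LEN = 50  # bases kept from the 3' end of each intron (the slide says ~50)


def intron_tail(intron_seq: str, tail_len: int = TAIL_LEN) -> str:
    """Return the last `tail_len` bases of an intron (the whole intron if shorter)."""
    return intron_seq[-tail_len:]


def conserved_hits(
    mel_hits: Set[str], pse_hits: Set[str], orthologs: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Keep D. Mel. genes with a motif hit whose D. Pseu. ortholog also has a hit."""
    return sorted(
        (mel, orthologs[mel])
        for mel in mel_hits
        if orthologs.get(mel) in pse_hits  # genes without a listed ortholog are dropped
    )


# Toy input: gene-level hit sets and an ortholog table (CG0000/GA99999 are made up).
orthologs = {"fru": "GA12896", "ps": "GA20847", "CG0000": "GA99999"}
mel_hits = {"fru", "ps", "CG0000"}
pse_hits = {"GA12896", "GA20847"}
print(conserved_hits(mel_hits, pse_hits, orthologs))
```

A gene that matches in only one species, or that has no entry in the ortholog table, is discarded here, which is the source of the false negatives discussed later in the talk.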

Page 6: RBP1 Splicing Regulation in  Drosophila Melanogaster

Data Summary

Page 7: RBP1 Splicing Regulation in  Drosophila Melanogaster

Profile Hidden Markov Model (HMM) and HMMer

We needed an HMM profiler and search program.
Uses a revised version of the Krogh/Haussler model, called Plan 7
Not limited to global alignment

Page 8: RBP1 Splicing Regulation in  Drosophila Melanogaster

HMMer Advantages

Possible Alignments
  Classic global alignment
  Classic local alignment
  Global Profile, Local Sequence alignment
  Fully local “multihit” alignment
Scoring
  Raw alignment score
  E-value, showing the significance of the alignment

Page 9: RBP1 Splicing Regulation in  Drosophila Melanogaster

HMMer

Create HMM for multiple alignment of each B-B and B-A motif
Genome is scanned for high scoring matches
Only hits within a distance of 15 base pairs of the 3’ splice site are considered
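To make the build, scan, and filter steps concrete, here is a simplified stand-in for the HMMer search. It is not HMMer and not a Plan 7 model: it uses a match-state-only log-odds profile with no insert or delete states. The uniform background, the pseudocount, and the reading of "within 15 base pairs" as the gap between the motif end and the intron end are assumptions, and the three training strings are B-A hits taken from the results slide purely as toy data.

```python
import math
from typing import Dict, List, Tuple

BACKGROUND = 0.25   # assumed uniform base frequencies
PSEUDOCOUNT = 1.0   # assumed smoothing so unseen bases keep a finite score


def build_profile(aligned: List[str]) -> List[Dict[str, float]]:
    """Per-column log-odds scores from equal-length, ungapped motif instances."""
    profile = []
    for col in range(len(aligned[0])):
        counts = {base: PSEUDOCOUNT for base in "ACGT"}
        for seq in aligned:
            counts[seq[col].upper()] += 1
        total = sum(counts.values())
        profile.append({b: math.log2(counts[b] / total / BACKGROUND) for b in "ACGT"})
    return profile


def scan(tail: str, profile: List[Dict[str, float]], max_dist: int = 15) -> List[Tuple[int, int, float]]:
    """Score every window of an A/C/G/T-only intron tail.

    Returns (start, end, score) with 1-based inclusive coordinates, best first,
    keeping only windows that end within `max_dist` bases of the 3' end.
    """
    width = len(profile)
    hits = []
    for i in range(len(tail) - width + 1):
        end = i + width
        if len(tail) - end > max_dist:
            continue  # too far from the 3' splice site
        score = sum(col[base] for col, base in zip(profile, tail[i:end].upper()))
        hits.append((i + 1, end, score))
    return sorted(hits, key=lambda h: h[2], reverse=True)


profile = build_profile(["GTCGACAATTGTT", "TTCTACTATTTTT", "TTCCACAATTTTT"])
tail = "ctgttgaatcacttggaaagcaatcaGTCGACAATTGTTtacttttacag"
hits = scan(tail, profile)
print(hits[0][:2])  # coordinates of the best-scoring window
```

Because the first training string is itself part of this tail, the best window lands on it; a real run would train on a curated motif alignment and report HMMer scores and E-values instead.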

Page 10: RBP1 Splicing Regulation in  Drosophila Melanogaster

Results: B-A Motif

CG30271-RC-in_5 (27 - 39), GA15740-in_5 (27 - 39) score: -6
ctgttgaatcacttggaaagcaatcaGTCGACAATTGTTtacttttacag
|  |||||||||| |||||||||||||||||||||||||||||||||||
cctttgaatcactcggaaagcaatcaGTCGACAATTGTTtacttttacag

CG30020-RA-in_3 (25 - 37), GA15581-in_9 (24 - 36) score: -8
ccgtcccagtgacttacaatacgaTTCTACTATTTTTtgtacgcttacag
      |  |   |       | | ||||| |||| |        |
 taaggctcttcatactttatcaaATCTACAATTTCTcaatgtaattgcag

Klp3A-RA-in_3 (31 - 43), GA21186-in_3 (26 - 38) score: -9
ttgaagttcgaaaactcctgaaactaattgTTCCACAATTTTTttttatt
       |  ||  ||     ||       ||| ||  ||||| | |
     tgttcaattcttaaataaaaccaatTTCGACTCTTTTTctcttctttcag

na-RB-in_0 (33 - 45), GA13546-in_2 (25 - 37) score: -9
tctggtgcactgagagaaatgccatctacttcATCGATACTCTTTtgcag
          | |                   ||  | | ||  || |
        tgtaaacactcgttgcaaacacaaATTTACAATCAATttccatgttttat

CG30428-RA-in_2 (33 - 45), GA15840-in_1 (25 - 37) score: -9
ggtaaggaagcgtaaaaataaattctttttttATCACCAATATTTttcag
        |   ||  ||           |||||   |||| |||||
        aaaatatcaagccgaaacaaatttATGTACAATTTTTtttttatggaaag

CG2199-RB-in_0 (36 - 48), GA15296-in_0 (33 - 45) score: -10
ttgctactgccattataggtagtttaaaaactgttTTCTACACTCTTTct
              | |  | |       |   ||    ||||| | |
   aacaaaaacaaaaatatggccctctgataattGGGGACACTTTATttcag

Page 11: RBP1 Splicing Regulation in  Drosophila Melanogaster

Results: B-B Motif

ps-RD-in_4 (31 - 42), GA20847-in_4 (31 - 42) score: -11
catttaatatcttgaaaatatttaacataaATCTGATGCAAAtattccag
  |   ||   |  || ||||||||||||||||||||||||||||||||
attactattcttaaaatatatttaacataaATCTGATGCAAAtattccag

fru-RE-in_6 (26 - 37), GA12896-in_5 (24 - 35) score: -13
cccacccccacagtgatgacgcctaATATGAACCAAGcaaatgtttgcag
    |      |   |     |  | | ||| | ||  | |    |  |
  tgctaaataaaccaaattccaaaCTCTGATCAAAAaataccgataaaaag

Ptp52F-RA-in_0 (38 - 49), GA14851-in_14 (34 - 45) score: -13
tactctttgaaaaataagcatatggatgtcactgataATATGATATTAAt
      |  |         |   | ||   |     |||   ||   ||
    tctaaatcgtattcaaatcgaattgaaacataaATCGAATCCAAAaacag

CG9455-RA-in_0 (32 - 43), GA21800-in_0 (27 - 38) score: -13
aatagtggctttgttttaataacaatgtaatATCTGATATTTAttctcag
       |           |   |   |   | ||||| |    | |
     cagagcgtgccccgtctgatgatccgAACTGATCTGATgtttttcggtag

CG8709-RA-in_2 (34 - 45), GA21271-in_9 (34 - 45) score: -13
acaaatcttaggaaataccaaagttgttctacgATCTTATCTATGgagtc
 |         |  |     |  |    | || || | ||||||
gccccatcagtgtcagtggcagctgaccccaccATTTGATCTATTtgcag

CG7966-RA-in_0 (37 - 48), GA20727-in_4 (26 - 37) score: -13
tatatgtacacattgtactgcaaacacatgccctgaATCTTTGATAAAga
            |  |     ||| |     | ||||||  |  ||||
           gtgttgaatgaaagaatacacttgaATCGGTTCTAAAttgcatcgcacag
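Each entry above pairs two 50-base intron tails and gives the motif coordinates in parentheses. As a rough sketch of how such a pair could be lined up on the motif start and marked for identical bases, the hypothetical helper below shifts the sequence whose motif starts earlier and draws a bar under every matching column. It is an assumption about the display logic, not the authors' script, and it marks every identical column including the last one.

```python
def align_on_motif(seq1: str, start1: int, seq2: str, start2: int) -> str:
    """Pad one sequence so both motif starts (1-based) share a column,
    then mark columns where the two bases are identical (case-insensitive)."""
    shift1 = max(start2 - start1, 0)
    shift2 = max(start1 - start2, 0)
    row1 = " " * shift1 + seq1
    row2 = " " * shift2 + seq2
    width = max(len(row1), len(row2))
    row1, row2 = row1.ljust(width), row2.ljust(width)
    marks = "".join(
        "|" if a != " " and a.lower() == b.lower() else " "
        for a, b in zip(row1, row2)
    )
    return "\n".join(line.rstrip() for line in (row1, marks, row2))


# The fru entry from the B-B results: motifs start at 26 (D. Mel.) and 24 (D. Pseu.).
fru_mel = "cccacccccacagtgatgacgcctaATATGAACCAAGcaaatgtttgcag"
fru_pse = "tgctaaataaaccaaattccaaaCTCTGATCAAAAaataccgataaaaag"
alignment = align_on_motif(fru_mel, 26, fru_pse, 24)
print(alignment)
```

Aligning on the motif start rather than on the intron end keeps the uppercase motif columns stacked even when the hit sits at a different offset in each species.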

Page 12: RBP1 Splicing Regulation in  Drosophila Melanogaster

Biomolecular Activity: B-A

Page 13: RBP1 Splicing Regulation in  Drosophila Melanogaster

Biomolecular Activity: B-B

Page 14: RBP1 Splicing Regulation in  Drosophila Melanogaster

Biomolecular activity analysis

The fru gene, regulated by the tra and tra2 genes, is expressed at the same time as the dsx gene, which helps validate our results.
Expected presence of sxl and tra genes.
Functional Similarity:
  B-A motif: SNF4Agamma, rdgc, qtc.
  B-B motif: ps, ptp, CG9455.

Page 15: RBP1 Splicing Regulation in  Drosophila Melanogaster

Difficulties & Future Directions

Support Vector Machines were applied
Lack of significant training data.
Lack of direct experimental data for cross-validation.
Since the current D. Pseu. genome has far fewer intron sequences, reliance upon orthologs introduces many false negatives.

Page 16: RBP1 Splicing Regulation in  Drosophila Melanogaster

Alternate Approach: Support Vector Machines (SVM)

Used for data classification
Creates a hyperplane that separates data into two classes with maximum margin
Appropriate for multidimensional classification problems
Examples
  Article classification
  Protein classification
Critical points
  Feature selection
  Training
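The maximum-margin idea can be illustrated without a full SVM solver. The sketch below only measures the geometric margin of a given hyperplane, so two candidate separators can be compared; it does not train anything. The four data points and both hyperplanes are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


def margin(w: Sequence[float], b: float, data: List[Point]) -> float:
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Positive when every point is on its correct side; negative if any is misclassified.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in data
    )


# Hypothetical 2-D data: negatives at x = 0 and 1, positives at x = 3 and 4.
data = [((0.0, 0.0), -1), ((1.0, 0.0), -1), ((3.0, 0.0), +1), ((4.0, 0.0), +1)]

midway = margin((1.0, 0.0), -2.0, data)    # hyperplane x = 2, halfway between the classes
off_centre = margin((1.0, 0.0), -1.5, data)  # hyperplane x = 1.5, closer to the negatives
print(midway, off_centre)
```

Both hyperplanes separate the classes, but the midway one leaves twice the clearance, and that is the one a maximum-margin classifier would prefer.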

Page 17: RBP1 Splicing Regulation in  Drosophila Melanogaster

HMM and SVM

HMMer is used to generate features
The whole genome is searched for the A and B consensus sequences
Search results for each intron are combined to create features
Features
  Scores of the two motifs in the upstream region (2)
  Distance of the motifs to the splice site (1)
  Length of consensus sequence overlap (1)
  Length of motif (1)
  Does consensus sequence B precede A? (1)
Number of features = 6
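One way the six features might be assembled from a pair of search hits is sketched below. The slide does not define the features precisely, so several readings are assumptions: "distance" is taken as the gap between the later motif end and the 3' end of the segment, "length of motif" as the combined span of both hits, and the `Hit` class and all example numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    start: int    # 1-based position of the first motif base in the intron segment
    end: int      # 1-based position of the last motif base
    score: float  # search score reported for this hit


def feature_vector(hit_a: Hit, hit_b: Hit, segment_len: int) -> List[float]:
    """Build the 6-element feature vector for one intron segment."""
    # Bases shared by the A and B hits (0 if they do not overlap).
    overlap = max(0, min(hit_a.end, hit_b.end) - max(hit_a.start, hit_b.start) + 1)
    # Combined span covered by both hits (assumed meaning of "length of motif").
    span = max(hit_a.end, hit_b.end) - min(hit_a.start, hit_b.start) + 1
    # Gap between the later motif end and the 3' end of the segment.
    distance = segment_len - max(hit_a.end, hit_b.end)
    b_first = 1.0 if hit_b.start < hit_a.start else 0.0
    return [hit_a.score, hit_b.score, float(distance), float(overlap), float(span), b_first]


# Made-up hits in a 50-base segment: B at 27-33, A at 32-38.
features = feature_vector(Hit(32, 38, -4.0), Hit(27, 33, -5.5), segment_len=50)
print(features)
```

Each intron segment then contributes one such vector to the SVM, with the label supplied by whatever experimental evidence is available for training.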

Page 18: RBP1 Splicing Regulation in  Drosophila Melanogaster

Summary

Profile HMM used for modeling
Comparative analysis with the D. Pseu. genome
High scoring alignments for both motifs further analyzed for biomolecular activity
The existence of fru and other close matches helps to validate our results