Many to 1 Gene Associations

Page 1: Many to 1 Gene Associations

The following slides show a few examples of gene predictions by one annotation group that overlap one or more genes from another group.

Some of the examples that follow also illustrate issues related to:
- differences in annotation type (e.g., pseudogene versus gene), and
- confusing nomenclature (e.g., different genes assigned the same official gene name).
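
One way to look for such cases computationally is to compare the coordinates of gene models from two annotation groups and report any model that overlaps two or more models from the other group. The sketch below is only an illustration under stated assumptions, not the procedure used to prepare these slides: the `Gene` record, the function names, and all IDs and coordinates in the usage note are hypothetical, coordinates are treated as inclusive, and strand is ignored.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene-model record (inclusive coordinates)."""
    gene_id: str
    chrom: str
    start: int
    end: int


def overlaps(a: Gene, b: Gene) -> bool:
    """True if both models share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def many_to_one(group_a, group_b):
    """Return {id in group_a: [ids in group_b]} for every group_a model
    that overlaps two or more group_b models."""
    hits = defaultdict(list)
    for a in group_a:
        for b in group_b:
            if overlaps(a, b):
                hits[a.gene_id].append(b.gene_id)
    return {gid: partners for gid, partners in hits.items() if len(partners) >= 2}
```

For example, a single model `Gene("VEGA_1", "6", 100, 900)` compared against `Gene("NCBI_1", "6", 100, 400)` and `Gene("NCBI_2", "6", 600, 900)` would be reported as overlapping both. The nested loop is quadratic; for genome-scale sets a sorted sweep or an interval tree would be a more practical choice.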

Page 2: Many to 1 Gene Associations

2:110788585..110968584 One gene or two?

Orientation issue for OTT15152?
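
Each slide is labelled with a region written as chromosome:start..end. A small parser for that notation might look like the sketch below. The `Region` type and `parse_region` function are hypothetical names introduced here, and treating both ends as inclusive is an assumption (it does give round window sizes for the regions on these slides).

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical record for a 'chrom:start..end' label (inclusive ends assumed)."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates: both the start and end base are counted.
        return self.end - self.start + 1


_REGION_RE = re.compile(r"^\s*([0-9A-Za-z]+):(\d+)\.\.(\d+)\s*$")


def parse_region(text: str) -> Region:
    """Parse a label such as '2:110788585..110968584' into a Region."""
    match = _REGION_RE.match(text)
    if match is None:
        raise ValueError(f"not a chrom:start..end region: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Region(chrom, start, end)


region = parse_region("2:110788585..110968584")
print(region.chrom, region.length)  # 2 180000
```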

Page 3: Many to 1 Gene Associations

8:4238129..4254528 One gene or two?

Page 4: Many to 1 Gene Associations

3:105659594..105759593 One gene or two?

Page 5: Many to 1 Gene Associations

11:69491920..69516919 One gene or two or three?

Page 6: Many to 1 Gene Associations

5:106920574..107155573 One gene or two or three?

Page 7: Many to 1 Gene Associations

6:145313224..145563223 One gene or two?

The VEGA gene model seems to unite two separate gene models in NCBI.

Page 8: Many to 1 Gene Associations

9:15109186..15189185 One gene or two?

Page 9: Many to 1 Gene Associations

7:127057560..127247315 One gene or two?

Page 10: Many to 1 Gene Associations

7:52670474..52680473 One gene or two?

Page 11: Many to 1 Gene Associations

4:146600055..146731054 One gene or two?

Page 12: Many to 1 Gene Associations

2:37243166..37343165 n:m

ENSMUSG00000050714 and ENSMUSG00000066798 overlap OTTMUSG00000012648 and OTTMUSG00000012652.

Page 13: Many to 1 Gene Associations

2:155895575..155939706 n:m

ENSMUSG00000074643 overlaps ENSMUSG00000038171, OTTMUSG00000016087, OTTMUSG00000016088, and OTTMUSG00000019746.

Page 14: Many to 1 Gene Associations

6:113326848..113366847 n:m

OTTMUSG00000017554 overlaps OTTMUSG00000016376 and EG68089 and EG101100.
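
The n:m label on these slides can be derived from a list of pairwise overlaps by grouping the gene IDs into connected clusters and counting how many IDs from each annotation group fall into each cluster. The following is a minimal sketch of that idea, assuming the overlaps are already available as (group A ID, group B ID) pairs; the function name and the placeholder IDs in the usage note are hypothetical.

```python
from collections import defaultdict


def classify_clusters(pairs):
    """Group overlapping IDs into clusters and label each 1:1, 1:m, n:1 or n:m.

    pairs: iterable of (id_from_group_a, id_from_group_b) overlaps.
    Returns a list of (sorted_a_ids, sorted_b_ids, label) tuples.
    """
    # Build an undirected bipartite graph; nodes are tagged with their group.
    adjacency = defaultdict(set)
    for a_id, b_id in pairs:
        node_a, node_b = ("A", a_id), ("B", b_id)
        adjacency[node_a].add(node_b)
        adjacency[node_b].add(node_a)

    seen = set()
    clusters = []
    for node in sorted(adjacency):
        if node in seen:
            continue
        # Depth-first walk to collect one connected component.
        stack, component = [node], []
        seen.add(node)
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        a_ids = sorted(i for side, i in component if side == "A")
        b_ids = sorted(i for side, i in component if side == "B")
        label = ("1" if len(a_ids) == 1 else "n") + ":" + ("1" if len(b_ids) == 1 else "m")
        clusters.append((a_ids, b_ids, label))
    return clusters
```

With the placeholder pairs `("ENS_a", "OTT_x")`, `("ENS_a", "OTT_y")` and `("ENS_b", "OTT_y")`, all four IDs end up in one cluster labelled n:m, even though ENS_b and OTT_x never overlap directly.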

Page 15: Many to 1 Gene Associations

8:47538900..47638899 Are EG667337 and EG14081 different genes?

Page 16: Many to 1 Gene Associations

7:126992968..127042967 Are EG233805 and EG1000043396 different genes?

Page 17: Many to 1 Gene Associations

6:122655579..122665578 Are EG71950 and EG100038891 different genes?

Page 18: Many to 1 Gene Associations

16:84828048..84836547 Are EG11957 and EG100039950 different genes?

Page 19: Many to 1 Gene Associations

7:87385985..87410984 Are EG61000042379 and EG269954 different genes?

Page 20: Many to 1 Gene Associations

9:43622106..43900000 Are EG61000042548 and EG21838 different genes?

Page 21: Many to 1 Gene Associations

13:22073239..22080498 Are OTT00466 and OTT13227 different genes?

Page 22: Many to 1 Gene Associations

4:122937497..122988738 Are OTT08975 and OTT08978 different genes?

Page 23: Many to 1 Gene Associations

3:94933437..94938148 Are OTT22306 and OTT19657 different genes?

Page 24: Many to 1 Gene Associations

3:107728458..107736457 Are OTT25890 and OTT07101 the same gene?

Page 25: Many to 1 Gene Associations

1:172123537..172148536 Are OTT21542 and OTT21543 different genes?

Page 26: Many to 1 Gene Associations

1:173164903..173177002 Are OTT21571 and OTT21573 different genes?

Page 27: Many to 1 Gene Associations

2:90744183..90753182 Are OTT14319 and OTT14315 different genes?

Page 28: Many to 1 Gene Associations

4:42236997..42261996 Are ENS78738 and ENS78736 different genes?

Are the predicted genes new members of the chemokine (C-C motif) ligand family?

In Ensembl, multiple gene predictions are assigned to the same gene symbol/MGI ID.
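
Cases like this, where several predictions carry one symbol, can be listed by grouping gene IDs by their assigned symbol. The sketch below assumes the annotation is available as (gene ID, symbol) records; the function name and the placeholder values in the usage note are hypothetical and do not come from Ensembl.

```python
from collections import defaultdict


def shared_symbols(records):
    """Return {symbol: sorted gene IDs} for symbols assigned to more than one gene ID.

    records: iterable of (gene_id, symbol) tuples.
    """
    ids_by_symbol = defaultdict(set)
    for gene_id, symbol in records:
        ids_by_symbol[symbol].add(gene_id)
    return {
        symbol: sorted(gene_ids)
        for symbol, gene_ids in ids_by_symbol.items()
        if len(gene_ids) > 1
    }
```

For instance, the records `("pred_1", "SymA")`, `("pred_2", "SymA")` and `("pred_3", "SymB")` would report only SymA, with both of its prediction IDs.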

Page 29: Many to 1 Gene Associations

15:79611961..79691960 One gene or two or three?

Are Nptxr and Cbx6 overlapping?

Page 30: Many to 1 Gene Associations

2:120535197..120698446 One gene or two?

Are Cdan1 and Ttbk2 overlapping?

Page 31: Many to 1 Gene Associations

X:9598695..9848694 One gene or two?

Are Srpx and Rpgr overlapping?

Page 32: Many to 1 Gene Associations

2:181092767..181132366 One gene or two?

Are Zgpat and Lime1 overlapping?

Page 33: Many to 1 Gene Associations

5:31435474..31485473 One gene or two?

Are Mpv17 and Gtf3c2 overlapping?

Page 34: Many to 1 Gene Associations

16:96582252..96792251 One gene or two?

Are Pcp4 and Igsf5 two different genes?

Page 35: Many to 1 Gene Associations

In Ensembl, it currently looks as though Pcp4 and Igsf5 are considered synonyms for the same gene.

Page 36: Many to 1 Gene Associations

6:87895874..87954921 One gene or two?

The NCBI gene is a pseudogene; the Ensembl gene is a protein-coding gene.

Pseudogene

Protein coding gene
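
Annotation-type disagreements of this kind can be flagged automatically once overlapping models have been paired. The sketch below assumes each pair is given as ((ID, type label), (ID, type label)); the function names, the label normalisation rule, and the placeholder IDs in the usage note are hypothetical, and real providers may use type vocabularies that need a proper mapping table instead.

```python
def normalize_biotype(label: str) -> str:
    """Reduce labels such as 'Protein coding gene' or 'protein-coding' to 'protein_coding'."""
    words = label.strip().lower().replace("-", " ").split()
    # Drop a trailing 'gene' so 'protein coding gene' matches 'protein coding'.
    if len(words) > 1 and words[-1] == "gene":
        words = words[:-1]
    return "_".join(words)


def biotype_conflicts(pairs):
    """Return (id_a, type_a, id_b, type_b) for paired models whose types differ.

    pairs: iterable of ((id_a, label_a), (id_b, label_b)).
    """
    conflicts = []
    for (id_a, label_a), (id_b, label_b) in pairs:
        type_a, type_b = normalize_biotype(label_a), normalize_biotype(label_b)
        if type_a != type_b:
            conflicts.append((id_a, type_a, id_b, type_b))
    return conflicts
```

A pair such as `(("NCBI_x", "Pseudogene"), ("ENS_x", "Protein coding gene"))` would be reported as a conflict, while two differently spelled protein-coding labels would not.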

Page 37: Many to 1 Gene Associations

13:75781991..75782990

Pseudogene

Protein coding gene

Protein coding gene

Page 38: Many to 1 Gene Associations

14:3046445..3080444

Pseudogene

Protein coding gene

Page 39: Many to 1 Gene Associations

6:128882645..128993644 Retrotransposed vs pseudogene

Pseudogene

Pseudogene

Retrotransposed

Page 40: Many to 1 Gene Associations

Gene Family Challenges

Gene families present many challenges for determining equivalency among gene predictions and for nomenclature. Examples from four gene families are shown in the following slides:

- killer cell lectin-like receptor (Klra) family
- UDP glucuronosyltransferase 1 family
- cysteine-rich perinuclear theca
- C-type lectin domain family 2

Page 41: Many to 1 Gene Associations

6:129837719..130337718 killer cell lectin-like receptor (Klra) family

Page 42: Many to 1 Gene Associations

6:130198815..130298814 killer cell lectin-like receptor (Klra) family

Gene identity crisis!

Pseudogene

Protein coding gene

Protein coding gene

Page 43: Many to 1 Gene Associations

6:130275414..130375413

1. Overlapping NCBI annotation
2. Overlapping features of different types

Pseudogene

Protein coding gene

Page 44: Many to 1 Gene Associations

1:89943192..90125441 UDP glucuronosyltransferase 1 family

Ensembl maintains a single gene ID for all members of the family.

Page 45: Many to 1 Gene Associations

9:24428665..24431164 cysteine-rich perinuclear theca

Gene identity crisis!

Page 46: Many to 1 Gene Associations

6:128882645..128993644 C-type lectin domain family 2

Ensembl and VEGA predict only a single gene with multiple transcripts rather than two genes, Clec2g and Clec2f.

Page 47: Many to 1 Gene Associations

Unique to MGI

MGI does not have a high-throughput computational genome annotation pipeline. However, we integrated the results of high-throughput cDNA sequencing projects into the database before the mouse genome became available. Many of these genes have remained unique to MGI.

The following slides illustrate several cases where MGI has a gene that has not been predicted by one of the three major annotation groups.

Most of these MGI-unique genes are from the RIKEN cDNA sequencing initiative. Many of them likely represent non-protein-coding genes.
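
Given a table of which MGI genes have an equivalent prediction from each annotation group, the MGI-unique set is a simple set difference. The sketch below assumes such a table already exists as a dictionary keyed by group name; the function name, the dictionary layout, and the placeholder IDs in the usage note are hypothetical.

```python
def unique_to_mgi(mgi_ids, associations):
    """Return the sorted MGI IDs that no annotation group has an equivalent for.

    mgi_ids: iterable of all MGI gene IDs.
    associations: dict mapping a group name (e.g. 'NCBI', 'Ensembl', 'VEGA')
        to the set of MGI IDs for which that group has an equivalent prediction.
    """
    # Union of every MGI ID covered by at least one group.
    covered = set().union(*associations.values())
    return sorted(set(mgi_ids) - covered)
```

For example, with MGI IDs `mgi_1`, `mgi_2` and `mgi_3`, where NCBI covers `mgi_1` and Ensembl and VEGA both cover `mgi_2`, only `mgi_3` would be reported as unique to MGI.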

Page 48: Many to 1 Gene Associations

11:79796866..79857365

Page 49: Many to 1 Gene Associations

9:106742778..106752777 Unique to MGI

Page 50: Many to 1 Gene Associations

11:69491920..69516919