GSR The Gene Sequence evolution model with iid Rate variation over tree Örjan Åkerborg, KTH Lars Arvestad, KTH Jens Lagergren, KTH Bengt Sennblad
Date posted: 21-Dec-2015

Page 1:

GSR: The Gene Sequence evolution model

with iid Rate variation over tree

Örjan Åkerborg, KTH

Lars Arvestad, KTH

Jens Lagergren, KTH

Bengt Sennblad

Page 2:

What?

• Gene evolution through duplication and loss

• Sequence evolution
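
As a rough illustration of gene evolution through duplication and loss, the number of gene copies along a single lineage can be simulated as a linear birth-death process. This is only a minimal sketch: the function name, the parameter names, and the rate values are hypothetical and are not taken from the slides.

```python
import random

def simulate_gene_copies(dup_rate, loss_rate, duration, rng, start_copies=1):
    """Simulate the number of gene copies after `duration` time units.

    Each copy independently duplicates at rate `dup_rate` and is lost
    at rate `loss_rate` (a linear birth-death process).
    """
    copies = start_copies
    t = 0.0
    while copies > 0:
        total_rate = copies * (dup_rate + loss_rate)
        if total_rate == 0.0:
            break  # no events can occur
        # Waiting time to the next event is exponential in the total rate.
        t += rng.expovariate(total_rate)
        if t > duration:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < dup_rate / (dup_rate + loss_rate):
            copies += 1  # duplication
        else:
            copies -= 1  # loss
    return copies

# Illustrative rates only; the expected copy number is exp((0.5 - 0.3) * 2.0).
rng = random.Random(1)
counts = [simulate_gene_copies(0.5, 0.3, 2.0, rng) for _ in range(1000)]
print(sum(counts) / len(counts))
```

A count of zero means the gene family went extinct on that lineage, which is the "loss" side of the process.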

Page 3:

Why?

• Base reconciliation analysis directly on data
  – Avoids information loss
  – Addresses uncertainty better

• Gene tree reconstruction should mirror the process that generates gene trees

Page 4:

When?

• Arvestad et al. 2003
  – MrBayes + GEM
  – Flawed model

Page 5:

60s ribosomal data

Page 6:

When?

• Arvestad et al. 2003
  – MrBayes + GEM
  – Flawed model

• Arvestad et al. 2004
  – Integrated GEM + Substitution model
  – Mathematically correct model
  – Molecular clock
  – Sampling algorithm: slow

Page 7:

MHC revisited

Page 8:

When?

• Arvestad et al. 2003
  – MrBayes + GEM
  – Flawed model

• Arvestad et al. 2004
  – Integrated GEM + Substitution model
  – Mathematically correct model
  – Molecular clock
  – Sampling algorithm: slow

• Åkerborg et al. Submitted (GSR)
  – Integrated GEM + SRT model

Page 9:

How?

• GSR
  – GEM
    • Reconciled trees – duplication and loss
  – SRT
    • Relaxed clock model (iid)
    • Substitution model
  – Fast algorithm
    • Discretized time space
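
The relaxed clock component can be sketched as follows: each edge of the gene tree gets its own substitution rate, drawn independently from the same distribution, and the branch length is that rate multiplied by the time spanned by the edge. The slide only says "iid", so the gamma distribution, its parameterization by mean and coefficient of variation, and all names and numbers below are assumptions made for illustration.

```python
import random

def draw_branch_lengths(edge_times, mean_rate, cv, rng):
    """Return {edge: length} with length = rate * time, rates drawn iid.

    Assumption: rates follow a gamma distribution parameterized by its
    mean and coefficient of variation (cv); the slides only say "iid".
    """
    if mean_rate <= 0 or cv <= 0:
        raise ValueError("mean_rate and cv must be positive")
    shape = 1.0 / (cv * cv)      # gamma shape from the coefficient of variation
    scale = mean_rate / shape    # chosen so that shape * scale == mean_rate
    lengths = {}
    for edge, time in edge_times.items():
        rate = rng.gammavariate(shape, scale)  # independent draw per edge
        lengths[edge] = rate * time
    return lengths

# Hypothetical edge durations (in time units) for a three-edge gene tree.
edge_times = {"root->a": 1.0, "root->b": 1.0, "a->a1": 0.4}
rng = random.Random(7)
print(draw_branch_lengths(edge_times, mean_rate=0.5, cv=0.3, rng=rng))
```

With a strict molecular clock every edge would share one rate; drawing a separate rate per edge is what lets sister edges of equal duration end up with different lengths.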

Page 10:

Self-consistency

Trees, T

Sequence data, F

Pr[D,T]

Page 11:

Self-consistency – the X%-test

Page 12:

Application to Yeast data

• Compare to previous results
  – YGOB
  – Orthogroups (SYNERGY)

• Both synteny-based

Page 13:

Synteny -- gene order

Page 14:

Application to Yeast data

• Compare to previous results
  – YGOB
  – Orthogroups (SYNERGY)

• Both synteny-based
  – Whole genome duplication
    • Challenge!

• Genome-wide analysis!
  – 4809 gene families (orthogroups)

Page 15:

Comparison with YGOB results

Page 16:

Molecular clock?

Page 17:

Comparison with SYNERGY results

Page 18:

Sequence vs. synteny data

• No sequence difference
  – 36% of data sets are >85% similar

• Strong divergence
  – 25% of data sets are <40% similar
  – Long-branch attraction

• Conflicting sequence-synteny signal
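
The similarity thresholds above can be made concrete with a small sketch that computes percent identity between two aligned sequences and bins the result. The slides do not say how similarity was measured, so the measure used here (identical positions over gap-free columns), the function names, and the bin labels are assumptions; only the 85% and 40% cut-offs come from the slide.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of gap-free alignment columns where both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def classify(identity, high=85.0, low=40.0):
    """Bin a similarity value using the cut-offs quoted on the slide."""
    if identity > high:
        return "nearly identical"   # little sequence signal to resolve the tree
    if identity < low:
        return "strongly diverged"  # risk of long-branch attraction
    return "intermediate"

# Toy aligned sequences, for illustration only.
identity = percent_identity("ACGTACGT", "ACGTACGA")
print(identity, classify(identity))
```

Families in either extreme bin are the ones where a sequence-based reconstruction is most likely to disagree with a synteny-based one.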

Page 19:

Sequence vs. synteny data

Page 20:

Orthogroup 3176

42.4 %

Single best SYNERGY tree

21.6 %

31.6 %

Page 21:

Summary

• primeGSR
  – Integrated model
    • Reconciliation – gene duplication and loss
    • Relaxed clock
    • Sequence evolution
  – Efficient algorithms
  – Improved gene tree reconstruction

• Future prospects
  – Divergence time estimates (MAP)
  – Species tree reconstruction
  – Include synteny