Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

131
Podiplomski študij Biomedicine Genetika: Genetski koncepti (2009-010) Uravnavanje izražanja genov, transkriptomika in drugi pristopi po-genomskega obdobja Prof. dr. Damjana Rozman [email protected] http://cfgbc.mf.uni-lj.si/ 1. Osnove uravnavanje izražanja genov. 2. Študije genoma in transkriptoma z DNA mikromrežami (čipi). 3. Študije genoma in transkriptoma z novo generacijo sekveniranja. 4. Pomen informatike pri študijah “omov”. 5. Primeri uporabe “omov”: kemogenomika, toksikogenomika, farmakogenomika.

Transcript of Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Page 1: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Podiplomski študij Biomedicine

Genetika: Genetski koncepti (2009-010)Uravnavanje izražanja genov, transkriptomika in

drugi pristopi po-genomskega obdobja

Prof. dr. Damjana Rozman [email protected]

http://cfgbc.mf.uni-lj.si/

1. Osnove uravnavanje izražanja genov.

2. Študije genoma in transkriptoma z DNA mikromrežami (čipi).

3. Študije genoma in transkriptoma z novo generacijo sekveniranja.

4. Pomen informatike pri študijah “omov”.

5. Primeri uporabe “omov”: kemogenomika, toksikogenomika, farmakogenomika.

Page 2: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

2008

Page 3: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Različne ravni uravnavanja

Page 4: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Prenos signala 30 nm vlakno

10 nm vlakno

Premik nukleosomov stkivno-specifičnimi

faktorji

razdružitev nukleosomov

Prostorskorazporejanje

DNA

Preiniciacijskikompleks

polII

Molekularni dogodki pri izražanju genov

Aktivatorji prepisovanja

Page 5: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Lanctôt et al. Nature Reviews Genetics 8, 104–115 (February 2007) | doi:10.1038/nrg2041

Chromatin is organized in distinct CTs. Nuclear topography remains a subject of debate, expecially with regard to the extent of chromatin loops expanding into the IC and intermingling between neighbouring CTs and chromosomal subdomains.

The chromosome territory–interchromatin compartment (CT–IC) model

Page 6: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Lanctôt et al. Nature Reviews Genetics 8, 104–115 (February 2007) | doi:10.1038/nrg2041

Chromatin) might become repositioned over time;

Subchromosomal 1 Mb domains (corresponding to DNA replication units that are labelled during early S phase32) are depicted as filled circles.

The lighter-green circles indicate gene-poor chromatin;

Dark-coloured circles indicate gene-dense chromatin.

Areas that contain transcriptionally active genesare indicated by dark backgrounddark background shading.

Chromatin mobility is constrained in several different ways.

The chromatin mobility is constrained by the tendency of transcriptionally active and transcriptionally inactive chromatin regions to cluster in three-dimensional space.

Three hypothetical chromosome territories

Page 7: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Lanctôt et al. Nature Reviews Genetics 8, 104–115 (February 2007) | doi:10.1038/nrg2041

a) Movement of chromatin from one compartment to another leads to changes in expression of the corresponding genomic regions.

Activation is triggered by repositioning of the gene loci to an activating compartment, away from silencing compartments.

b) Compartments are transient self-organizing entities.

Gene activation leads to dissolution of the silencing compartments, changes in gene positioning and de novoassembly of an activating compartment.

Chromatin mobility allows dynamic interactions between genomic loci and between loci and other nuclear structures.

Page 8: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Uravnavanje izražanja genov

•RNA polimeraze, osnovni aparat in cikel transkripcije

•Mehanizmi prepisovanja in aktivacije

•Kontrola prepisovanja preko spremembe strukture kromatina

•Epigenetska kontrola

• Kombinirani dogodki pri prenosu signala

•Metode za študij interakcij DNA-proteini

Page 9: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Transkripcijski aparat

•Osnovni aparat – RNA polimeraza

•Regulatorji (aktivatorji, represorji; trans-delujoči faktorji)

Cis-elementi

•Cis-regulatorni moduli (CRM); vzpodbujevalci (enhacers), utiševalci (silencers)

http://the_brain.bwh.harvard.edu/Lever/Lever.png

Page 10: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

RP Zinzen et al. Nature 462, 65-70 (2009) doi:10.1038/nature08531

Generating a high-resolution atlas of mesodermal CRMs.

Page 11: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Bakterije– 1 RNA polimeraza

Evkarionti– RNA pol I (45s pre-rRNA )

-RNA pol II (mRNA kodirani geni, miRNA)

-RNA pol III (tRNA, 5S rRNA)

Page 12: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

“Tovarna” RNA polimeraze II

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 13: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Komunikacija med CRM in promotorjem

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 14: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

B. Lewin, GenesVII, 2000

Tipični pol II gen ima promotor, ki zavzema področje cca 200 bp navzgor od mesta pričetka prepisovanja. Lahko vsebuje tudi pospeševalec(enhancer), ki je oddaljen poljubno (???)

Page 15: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Struktura gena razreda pol II

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 16: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Core promoter-selective RNA polymerase II transcription

•Gross P, Oelgeschlager T, Transcription Laboratory, Marie Curie Research Institute, The Chart, Oxted, Surrey RH8 OTL, UK.

The core promoter:-minimal DNA region that is sufficient to direct low levels of activator-independent (basal) transcription by RNAP II in vitro; -typically extends approx. 40 bp up- and downstream of the start site of transcription;-can contain several distinct core promoter sequence elements;-contains TATA box or INR (initiator) element.

Computational analysis of metazoan genomes suggests that the prevalence of the TATA box has been overestimated in the past and that the majority of human genes are TATA-less.

While TATA-mediated transcription initiation has been studied in great detail and is very well understood, very little is known about the factors and mechanisms involved in the function of the INR and other core promoter elements.

Biochem Soc Symp. 2006;(73):225-36.

Page 17: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

3, 821-831 (November 2003)

TATA-less promoter – example – CYP19 (aromatase)

Page 18: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The human CYP19 (P450arom) gene is located in the chromosome 15q21.2 region and is comprised of a 30 kb coding region and a 93 kb regulatory region.

The unusually large regulatory region contains 10 tissue-specific promoters that are alternatively used in various cell types. Each promoter is regulated by a distinct set of regulatory sequences in DNA and transcription factors that bind to these specific sequences.

The most recently characterized promoter Is TATA-less and accounts for the transcription of 29-54% of P450arom mRNAs in breast cancer tissues and contains endothelial-type cis-acting elements that interact with endothelial-type transcription factors, e.g. GATA-2.

This promoter may upregulate aromatase expression in vascular endothelial cells. The in vivo cellular distribution and physiologic roles of this promoter in healthy tissues is not known.

TATA-less promoter - example

J Steroid Biochem Mol Biol. 2003 Sep;86(3-5):219-24

Page 19: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

TATA-less promoters

Molina and Grotewold BMC Genomics 2005 6:25

Page 20: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Transkripcijski cikel

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 21: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Transkripcijski mehurček

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 22: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

RNA polimeraza se lahko premika v obe smeri

A.J. Courey, Mechanisms in transcriptional regulation, 2008)

Page 23: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Cramer et al., Science 288, 640-645 (2000)

Kodirajoča veriga

Matrična veriga

Vezavno mesto za DNA-RNA hibrid

Page 24: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Bushnell et al., Science 303, 983-988 (2004)

Model of the pol II transcription initiation complex with the addition of electron microscope structures of TFIIE, TFIIF, and TFIIH.

Page 25: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

•TFIID: velik multiproteinski kompleks, ki prične sestavljanje transkripcijskega aparata. Vsebuje TBP in TAFe

•TFIIB: interagira z TBP in z DNA (analog bakterijskih σ faktorjev)

•TFIIF: udeležen pri iniciaciji kot pri podaljševanju verige.

•TFIIE: veže TFIIH

•TFIIH ima kinazno aktivnost, je helikaza, sodeluje tudi pri popravljanja DNA. Predstavlja povezavo med prepisovanjem in popravljanjem DNA in uravnavanjem celičnega cikla.

•TFIIA - po nekaterih izsledkih ne spada med splošne transkripcijske faktorje temvečmed specializirane TFIID koaktivatorje. V pogojih in vitro veže TBP-DNA kompleks in povzroči širjenje DNA-protein kontaktov, obstajajo pa tudi dokazi, da ima pomen le pri kontaktiranju z gensko-specifičnimi transkripcijskimi faktorji.

Splošni transkripcijski faktorji RNA polimeraze II

Page 26: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Kromatin in prepisovanje

http://www.wadsworth.org/images/bio_icons/tn_transchromatin.jpg

Page 27: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

GenesVII

N-terminalni repki histonov so gibljivi in obrnjeni navzven. Modifikacija histonskih repkov vodi do strukturnih sprememb, potrebnih pri pričetku prepisovanja

Page 28: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 29: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

www.med.ufl.edu/biochem/ keithr/fig1pt2.html

Page 30: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Lanctôt et al. Nature Reviews Genetics 8, 104–115 (February 2007) | doi:10.1038/nrg2041

Chromatin) might become repositioned over time;

Subchromosomal 1 Mb domains (corresponding to DNA replication units that are labelled during early S phase32) are depicted as filled circles.

The lighter-green circles indicate gene-poor chromatin;

Dark-coloured circles indicate gene-dense chromatin.

Areas that contain transcriptionally active genesare indicated by dark backgrounddark background shading.

Chromatin mobility is constrained in several different ways.

The chromatin mobility is constrained by the tendency of transcriptionally active and transcriptionally inactive chromatin regions to cluster in three-dimensional space.

Three hypothetical chromosome territories

Page 31: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Cook P.R., Science 284, 1790-1795 (1999)

Nascentni transkripti označeni z Br-UTP, Br-RNA imunobarvana s fluorescin izotiocianatom (zelena), nukleinske kislineoznačene s TO-TO-3 (rdeča)

Millerjeva razporeditev –rRNA transkripcijska enota (pol I) s ~125 prepisi

Pol II transkripcijskaenota z enim prepisom

Nukleolarnatovarna Nukleoplazemska

tovarna

Transkripcijske (nukleolarne)Tovarne v celicah HeLa

Page 32: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Cook P.R., Science 284, 1790-1795 (1999)

Transkripcijski cikel

RNA II polimeraza ima večjo potezno moč kot kinezin in miozin.Energijo hidrolize ATP pretvori v mehansko in kinetično energijo

Page 33: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

RNA polimeraze, osnovni aparat in cikel transkripcije– povzetek

•Cikel transkripcije vsebuje začetek (iniciacijo), podaljševanje (elongacijo) in zaključek (terminacijo)

•Transkripcijo katalizirajo evolucijsko ohranjene RNA polimeraze. Evkariontiimajo RNA poI I, II in III. Potrebni pa so še drugi proteini (RNA polimeaznikompleks)

•Jedro RNA polimeraze vsebuje par “klešč” (clamp), ki tvorijo režo aktivnegacentra encima za sintezo RNA. V stopnji podaljševanja DNA doseže aktivnomesto skozi kanal med kleščama, se denaturira (mehurček), rastoča RNA veriga pa tvori heterodupleks z matrično verigo DNA. NTP-ji dosežejoaktivno mesto skozi sekundarni kanal.

•Če polimeraza naleti na oviro, se lahko premakne nazaj (back-tracking).

•TBP je univerzalni transkripcijski faktot, ki prepozna TATA, potreben pa je tudi za prepisovanje s promotorjev brez TATA.

Page 34: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

•Cis-regulatorni modul (CRM) je področje DNA, kamor se vežejoštevilni transkripcijski faktorji in uravnavajo izražanje bližnjih genov . Posamezni gen ima pri živalih lahko do 20 CRM.

•CRM se običajno nahajajo v 5’ neprevedenih regijah gena ali v intronih.

•Prepisovanje genov se dogaja v nukleolarnih tovarnah, ki predstavljajotranskripcijsko aktivna področja jedra.

•Acetilacija (deacetilacija) uravnava kondenzacijo kromatina. Zaprepisovanje primeren je acetiliran kromatin.

CRM in kromatin - povzetek

Page 35: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The suffix “-ome-” originated as a back-formation from “genome”, a word formed in analogy with “chromosome”.

Because “genome” refers to the complete genetic makeup of an organism, some people have made the inference that there exists some root, *“-ome-”, of Greek origin referring to wholeness or to completion.

Because of the success of large-scale quantitative biology projects such as genome sequencing, the suffix "-ome-" has migrated to a host of other contexts. Bioinformaticians and molecular biologists figured amongst the first scientists to start to apply the "-ome" suffix widely.

http://en.wikipedia.org/

2. Študije transkriptoma z DNA mikromrežami (čipi),

povezave z drugimi “omi”

Page 36: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

employees.csbsju.edu/.../ olbindtransciption.html

Page 37: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The transcriptome, the mRNA complement of an entire organism, tissue type, or cell; with its associated field transcriptomics

The metabolome, the totality of metabolites in an organism; with its associated field metabolomics

The metallome, the totality of metal and metalloid species; with its associated field metallomics

The lipidome, the totality of lipids; with its associated field Lipidomics

The glycome, the totality of glycans, carbohydrate structures of an organism, a cell or tissue type. Glycomics: The associated field of study.

The interactome, the totality of the molecular interactions in an organism a once proposed field of interactomics has generally become known as systems biology

The spliceome (see spliceosome), the totality of the alternative splicing protein isoforms; with its associated field spliceomics.

The ORFeome refers to the totality of DNA sequences that begin with the initiation codon ATG, end with a nonsense codon, and contain no stop codon. Such sequences may therefore encode part or all of a protein.

The Phenome - the organism itself. The Phenome is to the genome what the phenotype is to the genotype. Also, the complete list of phenotypic mutants available for a species.

The Exposome - the collection of an individual's environmental exposures.

The “ome” worldhttp://en.wikipedia.org/

Page 38: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 39: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Genomics

Genomics is the study of an organism's entire genome.

The field includes intensive efforts to determine the entire DNA sequence of organisms and fine-scale genetic mapping efforts.

http://www.wiley.com/legacy/college/boyer/0470003790/cutting_edge/shotgun_seq/computer.gif

http://en.wikipedia.org/

Page 40: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Študije transkriptoma z DNA čipi

• Analiza genoma (genotpizacija enojnih nukleotidnih polimorfizmov SNP, analiza variance števila kopij CNV oz. CGH, resekvenciranje).

• Analiza transkriptoma (ekspresijsko profiliranje: 3' ekspresijski, eksonski ali genski čipi, čipi za sledenje izražanja miRNA).

• Študije uravnavanja izražanja genov (kromatinska imunoprecipitacija na čipu ChIP-on-Chip, mapiranje prepisov).

Page 41: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

A DNA microarray (also commonly known as gene chip, DNA chip, or biochip) is a collection of microscopic DNA spots attached to a solid surface, such as glass, plastic or silicon chip forming an array for the purpose of monitoring of expression levels for thousands of genes simultaneously.

The affixed DNA segments are known as probes (although some sources will use different nomenclature), thousands of which can be used in a single DNA microarray.

Microarray technology evolved from Southern blotting, where fragmented DNA is attached to a substrate and then probed with a known gene or fragment.

Measuring gene expression using microarrays is relevant to many areas of biology and medicine, such as studying treatments, disease, and developmental stages. For example, microarrays can be used to identify disease genes by comparing gene expression in disease and normal cells.

A DNA microarray http://en.wikipedia.org/

Page 42: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

High-density tiled microarrays for transcript mapping.

Genome-wide, exon-level analysis on a single array —a survey of alternative splicing and gene expression.

Robust, simple representation focusing on the 3' ends.

Different parts of the gene as probes on the array

http://www.affymetrix.com/products/arrays/index.affx

Page 43: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Parts of a gene represented in different types of microarrays

Page 44: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.ieee.org/portal/cms_docs_sscs/sscs/07Fall/HamFig1s.jpg

Hybridization on the microarray

Page 45: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 46: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Analisys of genome - Genotyping microarrays

DNA microarrays can be used to read the sequence of a genome in particular positions.

SNP microarrays are a particular type of DNA microarrays that are used to identify genetic variation in individuals and across populations.

Short oligonucleotide arrays can be used to identify the single nucleotide polymorphisms (SNPs) that are thought to be responsible for genetic variation and the source of susceptibility to genetically caused diseases.

Amplifications and deletions can also be detected using comparative genomic hybridization in conjuction with microarrays.

Resequencing arrays have also been developed to sequence portions of the genome in individuals. These arrays may be used to evaluate germlinemutations in individuals, or somatic mutations in cancer.Genome tiling arrays include overlapping oligonucleotides designed to blanket an entire genomic region of interest. Many companies have successfully designed tiling arrays that cover whole human chromosomes.

Page 47: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The complete human genome sequence announced in June 2000 is a "representative" genome sequence based on the DNA of just a few individuals.

Over the longer term, scientists will study DNA from many different people to identify where and what variations between individual genomes exist. Sequencing a genome is such a Herculean task that capturing its person-to-person variability on the first pass would be next to impossible.

www.genomenewsnetwork.org/.../ Chp4_1.shtml

If every human genome is different, what does it mean to sequence "the" human genome?

Since every person's genome is unique, no one person is any more or less "representative" than any other and it hardly matters whose genome is sequenced first.

The vast majority of the genome's sequence is the same from one person to the next, with the same genes in the same places.

Page 48: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

DNA polymorphisms

• Single Base Pair Differences,

– Single Nucleotide Polymorphisms (SNPs),

• Microsatellites (short sequence repeats),

• Minisatellites (long sequence repeats),

• Deletions,

• Duplications.

Page 49: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Affymetrix uses a unique combination of photolithography and combinatorialchemistry to manufacture GeneChip® Arrays.

High density Afymetrix microarrays

Page 50: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.mun.ca/biology/scarr/DNA_VDA_chip_schematic.jpg

SNP chipsTo determine which alleles are present, genomic DNA from an individual is isolated, fragmented, tagged with a fluorescent dye, and applied to the chip.

The genomic DNA fragments anneal only to these oligos to which they are perfectly complementary.

A computer reads the position of the two fluorescent tags and identifies the individual as a C / T heterozygote.

The single spots in the other three columns indicate that the individual is homozygous at the three corresponding SNP positions.

Page 51: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

>1,500 SNPs arrayed

Genotyping

Page 52: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Array comparative genomic hybridization (CGH) is a tool for the assessment of DNA copy number variation (CNV) within any given DNA sample.

The technique is derived from the concept of conventional CGH where patient and reference DNA (the latter derived from a normal male/female) is differentially labelled and co-hybridized to a normal metaphase spread.

Comparative genomic hybridization

Page 53: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Some diseases are associated with sections of chromosomes that are erroneously replicated or deleted.

The technique of array-based genomic hybridisation (array-comparative genomic hybridization CGH or copy number variation CNV) allows high-throughput, rapid identification and mapping of these genomic DNA copy number changes, with higher resolution than is possible using traditional, non-array methods.

http://www.nature.com/labinvest/journal/v85/n9/images/3700312f2.jpg

Page 54: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.genomictree.com/images/bioimg09.jpg

Expression profiling and CGH

Page 55: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Array-based Comparative genome hybridization (CGH)

http://www.sanger.ac.uk/HGP/Cytogenetics/gfx/cgh_300.jpg

In array-based CGH, large insert clones like BACs , containing human DNA, replace the metaphase spread as target.

The clones are spotted in three- four- or fivefold on a microscope slide. Depending on the contents ofthe array, the collection of clones that is printed on the slide, specific chromosome parts or even the entire genome can be analyzed in one experiment.

Page 56: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Resequencing Arrays

Tiling arrays - tools for resequencing, based on the concept of interrogating the genome in a systematic, unbiased fashion.

Before tiling arrays came along, other methods were used to predict genes that were unknown, including expressed-sequence tag (EST) sequencing, comparison with known protein-coding sequences, and ab initio prediction.

Tiling arrays are designed to address this problem by taking an empirical measurement across the entire genomic DNA. This allows a researcher to probe for to probe for transcribed sequences without a predetermined concept of thier location

Using the tiling chips it is possible to perform the most comprehensive analysis of the whole genome combined with transcript mapping. The probes are designed (tiled) at an average resolution of 35-bp throughout the whole genome, thus enabling a more accurate view of the transcription and discovery of novel RNA transcripts.

Page 57: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 58: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Analiza transkriptoma

Page 59: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Principle of expression profilingčitalec

označevanje

Obratno prepisovanje

Emisija

Prekrivanje

Računalniška analiza

Laser 1Test Referenca

Laser 2

ko-hibridizacija

Ekspresijsko profiliranje z DNA mikromrežami – dvojno označevanje

Geni zastopani scDNA ali z oligonukleotidi

Robot-nanašalec(spotter)

mikromreže s cDNA in dolgimi oligonukleotidi s 3’-konca

Page 60: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

GeneChip Exon Arrays provide the most comprehensive coverage of the genome and enable two complementary levels of analysis: gene expression and alternative splicing.

Multiple probes per exon enable "exon-level" analysis and allow you to distinguish between different isoforms of a gene. T

his exon-level analysis on a whole-genome scale opens the door to detecting specific alternative splicing events that may play a central role in disease mechanism and etiology.

You can also perform "gene-level" analysis, in which multiple probes on different exons are summarized into a single expression value of all transcripts from the same gene, enabling similar analysis workflows to 3'-gene expression.

Exon arrays

Page 61: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Complex organisation and structure of the ghrelin antisense strand gene GHRLOS, a candidate non-coding RNA geneSeim I et al., BMC Molecular Biology 2008, 9:95

Exon arrays – example I

Page 62: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Human Gene Arrays (April 2007), interrogate approximately 29,000 well annotated genes by using an average of 26 probes per gene.

Probes are designed to cover the entire transcript length, unlike the 3’ focused arrays which contain probes targeting the transcript’s 3’ end.

Probes on the array are perfect match (PM) only; mismatch probes found on the 3’focused arrays have been replaced on the Gene Arrays with a set of approximately 20,000 generic background probes.

Gene arrays

Page 63: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

miRNA Profiling

MicroRNAs (miRNAs) are 21-23 nucleotide long non-coding RNA molecules that have been shown to regulate the stability or translational efficiency of target messenger RNAs.

They have been shown to regulate diverse processes such as Fragile X , cancer proliferation and fat metabolism.

At present over 530 different miRNA sequences have been validated in the human (Sanger Institute mirBase database) although recent research suggests that the number of unique miRNAs could exceed 800.

It is also thought that miRNA genes regulate protein production for 10% or more of all human genes making them an increasingly important area of research

Page 64: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

miRNA isolation

The starting material for Agilent miRNA Arrays is total RNA (not small RNA) by isolated with i.e. TRIZOL still containing small RNA molecules.

As such, they should not have been cleaned up using columns as this will invariably remove the small RNAs which the arrays are trying to measure.

Just 100ng of total RNA is required for labelling. Once isolated, samples can be quantified using the Nanodrop Spectrophotometer.

Because miRNAs are so short, they are less prone to degradation then mRNAs.

Screen capture of Agilent 2100 Bioanalyzer electropherograms of a TRIZOL Reagent isolated total RNA. Note the pronounced peak of RNA at <200nt, typically not seen in column cleaned-up samples. Source: Microarray Centre

Page 65: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Interactome by Chromatine immunoprecipitation on a chip (CHIP-Chip)

Page 66: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The "ChIP to Chip" Assay

Page 67: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Detection of Nkx2-5 binding to the Actc1 promoter and intron 1 (A) and the Nppa promoter (B) using the Integrated Genome Browser

Chip-chip – example I

www.nimr.mrc.ac.uk/devbiol/mohun/cardiacdiff/

Page 68: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 69: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

DNA čipi so zbirka mikroskopskih “DNA točk”, cDNA ali oligonukleotidov, pritrjenihna trdo podlago.

Uporabljajo se za:

• Analizo genomske DNA (genotipizacija, SNP analiza, CHG, sekvenciranje).

• Analizo izražanja genov (ekspresijsko profiliranje) , kjer probe lahkopredstavljajo 3'–konce genov, eksone ali različne dele genov. Posebni čipi pa so za sledenje izražanja miRNA.

• Študije uravnavanja izražanja genov (določanje vezavnih mesttranskripcijskih faktorjev (Chip-chip), metilacija kromatina), kjer probe predstavljajo 5’-neprevedene dele in nerepetitivne dele genov.

Povzetek

Page 70: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Nova generacija sekvenciranja

www.gatc.co.uk/cgi-bin/wPrintpreview.cgi?sour...

Page 71: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Thu Nov 5, 2009 7:24pm EST

Want to know your entire DNA sequence? A California company has done it for as little as $1,700.

Privately held Complete Genomics says it can do a better quality, usable genome map for about $4,400 -- compared with the $100 million the Human Genome Project spent to complete the first sequencing of the human genome in 2000.

"Whole-genome sequencing costs have dropped from the more than $100 million cost of the first human genomes to the point where individual labs have generated genome sequences in a matter of months for material costs of as low as $48,000," the company's Radoje Drmanac and colleagues reported in the journal Science.

"This high-quality, cost-effective approach to genome sequencing will allow researchers to study complete genomes from hundreds of patients with a disease to advance the understanding of the genetic causes of that disease, with an end to preventing and treating common human ailments," said Cliff Reid, chief executive officer of Complete Genomics.

Two of the people whose DNA was mapped had taken part in an international sequencing project called the International HapMap project -- a man of European descent and a Yoruban female. The third came from a white man taking part in the Personal Genome Project, an online registry in which people are asked to donate both their DNA and a little money.

Genome sequencing is still early stage science. While researchers can get the code, figuring out what it means is a different matter.Genomics pioneer Craig Venter had his own genome sequenced -- at a cost of "several million" dollars -- and found the analysis could only show he was likely to have blue eyes, for instance. Venter does have blue eyes.

Last month Pauline Ng of the J. Craig Venter Institute in San Diego and Sarah Murray of Scripps Translational Science Institute in La Jolla, California, tested kits provided by California-based firms Navigenics Inc, a private company, and 23andMe, backed by Google Inc. They found they varied in predicting disease risk.

Complete Genomics and The Institute for Systems Biology said earlier this week they plan to sequence the genomes of 100 people to try and find insights into Huntington's disease. Scientists also use a technique called genome-wide association to try to find genes that no one suspected were involved in various diseases.(Reporting by Maggie Fox; Editing by Eric Wals)

Page 72: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Nova (naslednja) generacija sekvenciranja

- Sekvenciranje človeškega genoma je trajalo več let, z uporabo cca 20 kb BAC klonov, ki so vsebovali cca 100 kb dolge tarčne fragmente, in 8-kratnega pokrivanja vsakega dela tarče. Analiza s kapilarno elektroforezo.

-Nadaljnji razvoj sekvenciranja je temeljil na sočasnem sekvenciranju celotnega genoma (WHS, angl. whole genome sequencing), ki je bil vstavljen v vektorje. Metoda je hitrejša, pušča pa velike prazninev zelo polimorfnih ali repetitivnih genomih. Analiza je potekala s kapilarno elektroforezo.

-Naslednja generacija sekvenciranja (2004) – visokozmogljivostno paralelno čitanje odsekov DNA naravni celega genoma preko PCR pomnoževanja enoverižnih fragmentov genomske knjižnice.

Page 73: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Next-generation of DNA and RNA sequencing methods

- The bead-amplification sequencing (Roche/454FLX)

-Sequencing by synthesis (Illumina/Solexa Genome analyzer)

-Sequencing by ligation (Applied Biosystems SOLID System)-Helicos Helioscope (2008)

-Pacific Biosciences SMRT (2010)

Common features:-A compex interplay of enzymology, chemistry, software, hardware, optics engineering…)

-A streamline of sample preparation prior to sequencing (time saving)

-Preparation of fragment libraries of the DNA of interest by annealing for platform-specific linkers and amplification

-Amplification of single stranded fragment library and performing sequencing on amplified fragments

-Single molecule sequencing just arrived or is under development

Page 74: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Comparison between capillary sequencers and novel generation sequencers-A huge difference in the throughput of a single run: 96 capillaries of about 750 bp compared to several thousand (Roche) to tens of millions (Illumina, ABi) shorter reads.

-Time runs are longer that by next generation sequencers (8h – 10days)

- Longer run times are required to image the massive parallel sequencing reactions

-Due to the streamline preparation and long reading times a single human operator can operate several machines of next generation at the full capacity.

ABI Solexa

Page 75: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Classical Sanger dideoxy sequencing method

campus.queens.edu/faculty/jannr/molecular/

Dideoxynucleotide sequencing represents only one method of sequencing DNA. It is commonly called Sanger sequencing since Sanger devised the method. This technique utilizes 2',3'-dideoxynucleotide triphospates (ddNTPs), molecules that differ from deoxynucleotides by the having a hydrogen atom attached to the 3' carbon rather than an OH group. These molecules terminate DNA chain elongation because they cannot form a phosphodiester bond with the next deoxynucleotide.

Page 76: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

General concepts for clonal-array generation and sequencing

a | Bead-chips. Genomic DNA is fragmented and adaptors are ligated to create an insert library that is flanked by two universal priming sites. Because of the random fragmentation, the complexity of this signature sequence library is equivalent to the genome. This library is cloned on beads using emulsion PCR technology. A water-in-oil emulsion is created from a PCR mix that contains a limiting dilution of DNA and beads. The emulsion creates micro-compartments with, on average, a single bead and single DNA template each. After PCR, beads with clones are affinity selected and assembled onto a planar substrate. A subsequent cycle-sequencing reaction is used to read out the sequence on the clones.

b | Sequencing by synthesis (SBS). A common anchor primer is annealed to a constant sequence (universal priming site) that is contained within the library clones that are located on the polony (clonalbead) array (the orientation of the immobilized target might vary depending on the platform that is used). The sequence is read out by polymerase extension in a base-by-base fashion using either reversible terminators or sequential nucleotide addition (pyrosequencing). After incorporation of a single base or base type, the incorporated base is identified by fluorescence (laser) or chemiluminescence (no laser required).

c | Sequencing by ligation. The polony array set-up is similar to SBS in which a common primer is annealed to an arrayed polony library and used to read out the sequence through a stepwise ligation of random oligomers. The labelled oligomers are designed to have random bases inserted at every site except the query site. The query site has one of four base substitutions, each matched to a particular fluorescent label on the oligonucleotide. After read-out of each ligation event, the primer and the ligatedoligomer are stripped, a new primer reannealed and the process repeated with an oligomer that contains a query base at a different position.

Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901

Page 77: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901

General concepts for clonal-array generation and sequencing

Bead chips

Sequencing bysynthesis

Sequencing byLigation

Applied biosystems

Rochepyrosequencing

Solexa / illumina

Page 78: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Roche/454 FLX Pyrosequencer

Library fragments are mixed with agarosebeads with oligos complementary to adapter sequences on the library.

Each bead is associated with a single fragment.

Each fragment-bead complex is isolated into individual oil:water micelles with PCR mixture.

Thermal cycling of this emulsion PCR of the micelles produces amplified unique sequences on the bead surface.

“En mass” sequencing of PCR products on picotiter plates (PTP) with single beads in each picowell.

Enzyme/substrate containing beads for the pyrosequencing reaction are added to wells that act as floww cells for addition of individual pure nucleotide solutions. The CCD camera records the light emitted at each bead.

Page 79: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Principles of Pyrosequencing

A sequencing primer is hybridized to a single-stranded PCR amplicon that serves as a template.

Mixtures incubated with the enzymes, DNA polymerase, ATP sulfurylase, luciferase, and apyrase as well as the substrates, adenosine 5' phosphosulfate (APS), and luciferin.

ATP sulfurylase converts PPi to ATP in the presence of adenosine 5' phosphosulfate (APS).

ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP.

Apyrase, a nucleotide-degrading enzyme, continuously degrades unincorporated nucleotides and ATP. When degradation is complete, another nucleotide is added

pyrogram

Page 80: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Watson and Cricks pyrosequencing readout

2008

Page 81: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 82: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Timeline of the pyrosequencing development

October 2005 Release of the Genome Sequencer 20, the first next-generation sequencing system on the market

October 2005 Collaboration agreement signed with Roche Diagnostics .

January 2007 Release of the Genome Sequencer FLX System

March 2007 Roche Diagnostics completes integration with 454 Life Sciences

May 2007 Complete sequence of Jim Watson published in Nature. First genome to be sequenced for less than $1 million.

November 2007 Announcement of the 100th peer-reviewed publication enabled by 454 Sequencing

June 2008 454 Joins the 1000 Genome Project, an international effort to build the most detailed map to date of human genetic variation as a tool for medical research

September 2008 Announcement of the 250th peer-reviewed publication enabled by 454 Sequencing

October 2008 Release of Genome Sequencer FLX Titanium Series reagents, featuring 1 million reads at 400 base pairs in length

Page 83: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

•Is based on arrays of randomly assembled glass (silica) beads;The beads have oligonucleotides covalently attached to the surface;Each bead has about one million oligos on its surface;All oligos on each bead have the same sequence.

•Attached DNA fragments are extended and bridge amplified to create an ultra-high density sequencing flow cell with 80-100 million clusters, each containing ~1,000 copies of the same template. These templates are sequenced using a robust four-color DNA sequencing-by-synthesis technology that employs reversible terminators with removable fluorescent dyes. This novel approach ensures high accuracy and true base-by-base sequencing, eliminating sequence-context specific errors and enabling sequencing through homopolymers and repetitive sequences.

The beads are randomly assembled on the arrays, and the location of a particular probe is initially unknown;a process called decoding is used to find the location of each bead;

sequencing technology

Page 84: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Illumina sequencing by synthesis

Page 85: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Illumina sequence - decoding

Page 86: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 87: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901

Approaches to create highly parallel genotyping assays by novel generation sequencing

Page 88: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901

Page 89: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Nova generacija sekvenciranja - povzetek

• Nova generacija visokozmogljivostnega sekvenciranje omogoča razpoznavanje zaporedijDNA na ravni celega genoma, z resolucijo posameznega baznega para.

• Iz vsakega vzorca se pripravi z adaptorji ligirana knjižnica, ki vsebuje vse v vzorcuprisotne fragmente DNA ali RNA (cDNA).

• Vse platforme bazirajo na ligaciji adaptorjev in pomnoževanju, imajo pa različne pristopesekvenciranja:

-Pirosekvenciranje (Roche-Nimblegen)

-Sekvenciranje s sintezo (Illumina-Solexa)

-Sekvenciranje z ligacijo (ABI)

•Razvijajo se tudi metode, ki pred sekvenciranjem ne potrebujejo pomnoževanja.

• Aplikacije so enake kot pri klasičnih mikromrežah (ekspresijsko profiliranje oz. SAGE, genotipizacija SNP in CNV, kroamtinska imunoprecipitacija, metilacija kromatina, itd.).

•Prednost pred klasičnimi mikromrežami je v preprosti pripravi vzorca in zmožnostiprocesiranja velikega števila vzorcev v kratkem času.

•Procesiranje velikega števila vzorcev na eni ali več aparaturah lahko upravlja en človek.

Page 90: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Bioinformatika v raziskavah “omov”

http://bioinformatics.ubc.ca/about/what_is_bioinformatics/images/computer.gif

www.gwumc.edu

Page 91: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Technical Issues Involved in Obtaining Reliable Data from Microarray Experiments – Standardization and beyond

Page 92: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

BioinformaticsWikipedia

Making sense of the huge amounts of DNA data produced by gene sequencing projects.

Bioinformatics and computational biology involve the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biologicalproblems.

Research in computational biology often overlaps with systems biology.

Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling.

The terms bioinformatics and computational biology are often used interchangeably, although the former typically focuses on algorithm development and specific computational methods, while the latter focuses more on hypothesis testing and discovery in the biological domain.

Page 93: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Microarrays and bioinformatics

StandardizationThe lack of standardization in arrays presents an interoperability problem in bioinformatics, which hinders the exchange of array data.

Various projects are attempting to facilitate the exchange and analysis of data produced with non-proprietary chips.

The "Minimum Information About a Microarray Experiment" (MIAME) XML based standard for describing a microarray experiment is being adopted by many journals as a requirement for the submission of papers incorporating microarray results.

http://www.mged.org/Workgroups/MIAME/miame.html

Page 94: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 95: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Statistical analysis

The analysis of DNA microarrays poses a large number of statistical problems, including the normalisation of the data.

From a hypothesis-testing perspective, the large number of genes present on a single array means that the experimenter must take into account a multiple testing problem: even if each gene is extremely unlikely to randomly yield a result of interest, the combination of all the genes is likely to show at least one or a few occurrences of this result which are false positives.

Page 96: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

irfgc.irri.org/cropbioportal/index.php?option...

Steps in microarray experiments

Page 97: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Gene ontology is a controlled vocabulary used to describe the biology of a gene product in any organism. There are 3 independent sets of vocabularies, or ontologies, that describe the:-molecular function of a gene product,- the biological process in which the gene product participates, -and the cellular component where the gene product can be found.

Data mining includes gene ontology

Page 98: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Experimental design is crucial for a succesfull “omic” experiment

A proper experimental design is crucial for obtaining useful conclusions from a project. The choice of design ideally includes an assessment of the biological variation, the technical variation, the cost and duration of the experiment, and the availability of biological material. The experimental design can also depend on the methods that will be used to analyse the data afterwards. In certain cases, the parameters needed to find the optimal design must be obtained by a pilot experiment.

A related problem is the comparison of different competing experimental methods or devices. Here, a proper test design is crucial as well to be able to make a firm conclusion in favor of one or the other method.

lunabiosciences.com/experimentaldesign.htmlDere E. et al., BMC Genomics 2006, 7:80doi

Page 99: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

T. Rezen et al., BMC Genomics 2008, 9:76

Changes in cholesterol homeostasis and drug metabolism caused by different factors in mouse liver

Experimenal design – an example

Page 100: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Primary data analysis – experimental example

Images of the Sterolgene v0 microarrays were analyzed by Array-Pro Analyzer 4.5 (Media Cybernetics, Bethesda, MD, USA).

The median feature and local background intensities were extracted together with the estimates of their standard deviation. Only features with foreground to background ratio higher than 1.5 and coefficient of variation (CV, ratio between standard deviation of the background and the median feature intensity) lower than 0.5 in both channels were used for further analysis.Log2 ratios were normalized using LOWESS fit to spike in control RNAs according to their average intensity. Two types of spike in controls were used: custom-made (Firefly luciferase) and commercial ArrayControl Spikes (Ambion, Austin, TX, USA). In phenobarbital and cholesterol-feeding experiments data were additionally standardized (median-centered and scaled by median absolute deviation) in order to reduce inter-array variability. All data analysis were done in Orange software [37]

Images of Agilent microarrays were analyzed using Array-Pro Analyzer 4.5 (Media Cybernetics, Bethesda, MD, USA). Features, with CV>0.39, were filtered out and data were normalized using LOWESS fit to all genes according to their average intensity. Filtration and normalization was done in BASE softwar].

Affymetrix data were normalized by the Robust Multichip Average (RMA) algorithm. After transformation to non-logarithmic data, the expression estimates were scaled to the average expression levels in the control group analyzed using GeneSpringsoftware (Silicon Genetics, Redwood, USA). Classification of the differentially expressed genes was done in Orange using single-factor ANOVA or two tailed Student’s t-test. For Sterolgene microarrays probability of type I error αS=0.05 or αS=0.1 was used. For Agilent microarrays complementary probability of type I error was calculated according to Bonferroni correction for multiple testing (αA=αS*nS/nA), but final comparisons were made using more relaxed criteria αA=0.001 and αA=0.01. For Affymetrix microarrays complementary probability αAf=0.00043 was calculated, but also a more relaxed criterion αAf=0.001 was used.

Additional data analyses were done only using common genes between platforms, which were matched using unigene, refseqor gene symbol. On this gene list another classification of differentially expressed genes was done in Orange using ANOVA for Affymetrix and Agilent platforms. A probability for type I error was selected as in a complementary Sterolgene experiment (α=0.1 in Affymetrix analyses and α=0.05 in Agilent analyses). Pearson’s product moment correlation coefficient and a scatterplotbetween log2 ratios from common genes were calculated in SPSS 14.0. All data have been submitted to GEO (Gene expression omnibus) under accession codes: GSE6271 (Affymetrix data), GSE6317 (Agilent data), GSE6447 (Sterolgene phenobarbital and high cholesterol diet data), and GSE6423 (Sterolgene fasting and inflammation data).

T. Rezen et al., BMC Genomics 2008, 9:76

Page 101: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Pomen bioinformatike v raziskavh “omov” - povzetek

- Matematično-informatični pristopi v bioloških poskusih pogenomske dobe (bioinformatika) omogočajo razbiranje biološkega smisla v enormni količini podatkov.

-Bioinformatika (računska biologija, matematična biologija) za reševanje bioloških problemovuporablja metode uporabne matematike. Informatike, statistike in računalniških znanosti.

-Načrtovanje bioloških poskusov zahteva sodelovanje eksperimentatorjev in informatikov žeod vsega začetka.

-Zasnova poskusa zahteva definicijo biolškega vprašanja, izbiro platforme za analizo, določitev števila bioloških in tehničnih replik in načrt serije poskusov (hibridizacij ali sekvenciranj).

-Po tehnični izvedbi poskusa sledita statistična in informatična obdelava ter rudarjenjepodatkov.

- Statistično-informatična obdelava obsega ekstrakcijo intenzitet signala, normalizacijopodatkov, ter različne statistične teste, da pridobimo listo diferencialno izraženih genov ali zaporedij DNA.

- Sekundarna informatična analiza in rudarjenje podatkov obsegata gručanje in razpoznavanje vzorcev, študije seznama genov z genskimi ontologijami, iskanje regulatornihvzorcev, kot tudi načrtovanje validacijskih eksperimentov.

-Matematično-informatične metode za povezovanje podatkov različnih “omov” (npr. transkriptom, proteom, metabolom) so še v razvoju.

Page 102: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Kemogenomika, toksikogenomika in farmakogenomika

– primeri uporabe “omskih” tehnologij v praksi

Page 103: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

What’s the difference?

• Chemogenomics – screening a compound library to detect drug candidates

• Toxicogenomics – investigates the effect of a chemical substance (drug) on the genome

• Pharmacogenomics – investigates inherited basis for difference in drugresponse among individuals

Page 104: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...
Page 105: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

A blockbuster drug is a drug generating more than $1 billion of revenue for its owner each year. The search for blockbusters has been the foundation of the R&D strategy adopted by big pharmaceutical companies, but this looks set to change. New advances in genomics, and the promise of personalized medicine, are likely to fragment the pharmaceutical market.

Page 106: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Chemogenomics: structuring the drug discovery process to gene familiesC.J. Harris and A. P. Stevens,Drug Discov Today., 11, 880-888, 2006.

In the post-genomic era, if all proteins in a gene family can putatively be identified, how can drug discovery effectively tackle so many novel targets that might lack structural and small-molecule inhibitory data?

In response, chemogenomics, a new approach that guides drug discovery based on gene families, has been developed.

By integrating all information available within a protein family (sequence, SAR data, protein structure), chemogenomics can efficiently enable cross-SAR exploitation, directed compound selection and early identification of optimum selectivity panel members.

Page 107: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Aims of chemogenomics - the “druggable” genome

Chemogenomics aims to identify systematically all ligands and modulators for all the gene products expressed and allows the accelerated exploration of their biological function.

The subject brings together diverse disciplines including chemistry, genetics, chemo- and bioinformatics, structural biology, and biological screening in phenotypic and target-based assays.

http://www.nature.com/nbt/journal/v21/n1/images/nbt0103-31-F1.gif

Page 108: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Expert Review of ProteomicsJune 2007, Vol. 4, No. 3, Pages 411-419

Chemogenomics approaches to novel target discoveryL Alex Gaither

Chemogenomics involves the combination of a compound’s effect on biological targets together with modern genomics technologies. The merger of these two methodologies is creating a new way to screen for compound–target interactions, as well as map chemical and biological space in a parallel fashion. The challenge associated with mining complex databases has initiated the development of many novel in silico tools to profile and analyze data in a systematic way. The ability to analyze the combinatorial effects of chemical libraries on biological systems will aid the discovery of new therapeutic entities.

Chemogenomics provides a tool for the rapid validation of novel targeted therapeutics, where a specific molecular target is modulated by a small molecule. Along with targeted therapies comes the ability to discovery pathway nodes where a single molecular target might be an essential component of more than one disease.

Several disease areas will benefit directly from the chemogenomics approach, the most advanced being cancer. A genetic loss-of-function screen can be modulated in the presence of a compound to search for genes or pathways involved in the compound’s activity. Several recent papers highlight how chemogenomics is changing with RNA interference-based screening and shaping the discovery of new targeted therapies. Together, chemical and RNA interference-based screens open the door for a new way to discovery disease-associated genes and novel targeted therapies.

Page 109: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Steps of chemogenomics

Rognan D. Chemogenomic approaches to rational drug design.Br J Pharmacol. 2007, 152:38-52. Epub 2007 May 29. Review.

1. Formation/existance of a compound library

2. Chosen representative biological system

3. A reliable readout (i.e. gene expression, protein expression, HTP binding or functional assays, etc.)

Page 110: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Analysing chemogenomic data is a never ending learning process aimed at completing a 2D-matrix, where columns represent targets/genes and rows chemical compounds. Reported values are binding constants or functional effects.

Assumptions of chemogenomic-based approach:

1. compounds sharing some chemical similarity should also share targets,

2. targets sharing similar ligands should share similar patterns (binding sites).

Defining the ligand space by descriptors (molecular weight, atom and bound counts, topological fingerprints, shape, field, spectra, etc.)Defining the target space by descriptors (databases for sequence, patternssecondary structure fold, atomic coordinates, binding site, etc.)

Page 111: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Ligand space: molecular descriptors

Rognan D. J Pharmacol. 2007, 152:38-52. Epub 2007 May 29. Review.

Page 112: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Representation of ligand and target space correlations - example

Page 113: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Correlation matrices - networks

Page 114: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

In silico target fishing approaches

Machine learning algorithm using Bayesian statistics to predict target profiles from connectivity fingerprints of compounds from the biologically annotated database.

Page 115: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The key to improving drug discovery is to manage the whole value chain of research.

Information needs to flow from fundamental analysis of genomes, proteins andchemistry through to managing clinical trials and drug submissions.

Integrating so many fields of life sciences presents a massive task of integrating knowledge and information.

Needed are bridges which help researchers in one area to communicate with others.

Page 116: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.nature.com/embor/journal/v5/n9/images/7400236-f2.jpg

Page 117: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

The TIP Database - an example

The Target Informatics Platform's content includes more than 80,000 high resolution protein structures with reliably annotated small molecule binding sites, including >44,000 comparative models of human proteins, covering every major drug target family including proteases, kinases, phosphatases, phosphodiesterases, nuclear receptors, and GPCRs. In addition to TIP's high quality structural annotation, it is the only database of its kind that stores all similarity relationships between every sequence, structure, and binding site, making it an incredibly powerful system for structural and comparative proteomics.

Page 118: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Challenges in Chemogenomics

• “Conceptual” challenges– Multidisciplinary project teams– Data from chemistry, biology, genomics, …

• Data challenges– Disparate data sources, formats and identifiers– Multiple data types – gene expression, chemical structures, clinical

chemistry, histopathology, receptor binding assays, …– Incomplete and missing data

• Analysis challenges– Should the data be normalized?– Does it make sense to cluster the data?– How to relate chemical structures and gene expression data?

Identify correlations between chemical properties and biological function

Brian Prather, Ph.D., Application Specialist, Spotfire, Third Virtual Conference on Genomics and BioinformaticsBrian Prather, Ph.D., Application Specialist, Spotfire, Third Virtual Conference on Genomics and Bioinformatics

Page 119: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

“Conceptual” challenges

Chemistry Genomics

Shared Analysis

Affymetrix

Biology

ToxicologyCAS

ISIS

ABase CodeLink

Multidisciplinary Drug Compound Candidate Project Team

Information Technology

Page 120: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Toxicogenomicshttp://en.wikipedia.org/

is a form of analysis by which the activity of a particular toxin or chemical substance on living tissue can be identified based upon a profiling of its known effects on genetic material.

Toxicogenomics may also be of use as a preventative measure to predict adverse "side", i.e. toxic, effects, of pharmaceutical drugs on susceptible individuals. This involves using genomic techniques such as gene expression level profiling and single-nucleotide polymorphism analysis of the genetic variation of individuals.

Studies of those types are then correlated to adverse toxicological effects in clinical trials so that suitable diagnostic markers (measurable signs) for these adverse effects can be developed.

There are many well-publicized cases in which popular drugs were pulled from the market because of toxic effects experienced by a small percentage of patients, with a cost of many billions of dollars to the companies responsible.

Page 121: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Some drugs recently withdrawn form the market

Rofecoxib developed by Merck & Co. to treat osteoarthritis, acute pain conditions, and dysmenorrhoea. Rofecoxib was approved as safe and effective by the FDA in 1999 and was marketed under the brand name Vioxx, Ceoxx and Ceeoxx. Worldwide, over 80 million people were prescribed rofecoxib at some time. In 2004, Merck voluntarily withdrew rofecoxib from the market because of concerns about increased risk of heart attack and stroke associated with long-term, high-dosage use. Rofecoxib was one of the most widely used drugs ever to be withdrawn from the market.

Fen-phen was an anti-obesity medication which consisted of two drugs fenfluramine and phentermine. After reports of valvular heart disease and pulmonary hypertension, primarily in women who had been undergoing treatment with Fen-phen, the FDA requested its withdrawal from the market in September 1997. As of 2004 Fen-phen is no longer widely available. In More than 50,000 product liabilitylawsuits had been filed by alleged Fen-phen victims. Estimates of total liability run as high as $14 billion.

Cerivastatin (Baycol®, Lipobay®) is a synthetic member of the class of statins, used to lower cholesterol and prevent cardiovascular disease. It was withdrawn from the market in 2001 because of the high rate of serious side-effects.Cerivastatin was marketed by the Bayer A.G. During post-marketing surveillance, 52 deaths were reported in patients using cerivastatin, mainly from rhabdomyolysis and its resultant renal failure. Risks were higher in patients using fibrates and in patients using the high (0.8 mg/day) dose of cerivastatin. Another 385 nonfatal cases of rhabdomyolysis were reported. This put the risk of this (rare) complication at 5-10 times that of the other statins.

Page 122: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://fig.cox.miami.edu/~cmallery/150/gene/toxicogenomics.gif

Toxicogenomic approaches

Page 123: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.kitox.re.kr/eng/images/study/img014.gif

Page 124: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Pharmacogenomicshttp://en.wikipedia.org/

is the branch of pharmacology which deals with the influence of genetic variation on drug response in patients by correlating gene expression or single-nucleotide polymorphisms with a drug's efficacy or toxicity.

Pharmacogenomics aims to develop rational means to optimise drug therapy, with respect to the patients' genotype, to ensure maximum efficacy with minimal adverse effects.

Such approaches promise the advent of "personalized medicine", in which drugs and drug combinations are optimised for each individual's unique genetic makeup.

Pharmacogenomics is the whole genome application of pharmacogenetics, which examines the single gene interactions with drugs.

Page 125: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Adverse drug reactions

An adverse drug reaction (abbreviated ADR) is an expression that describes the unwanted, negative consequences associated with the use of given medications. An ADR is a particular type of adverse effect. The meaning of this expression differs from the meaning of "side effect", as this last expression might also imply that the effects can be beneficial. The study of ADRs is the concern of the field known as pharmacovigilance.

In 1994, an estimated 2,216,000 hospital patients in the US, suffered a serious ADR, leading to approximately 106,000 patient deaths, making ADR fatalities the fourth to sixth leading cause of death in the US in 1994.

The current regime of “one dose fits all” is not ideal for patients and is not cost-effective for the health service.

Page 126: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Why is every human genome different?

Where are genome variations found?

What kinds of genome variations are there?

Page 127: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Polymorphisms

• Single Base Pair Differences,

– Single Nucleotide Polymorphisms (SNPs),

• Microsatellites (short sequence repeats),

• Minisatellites (long sequence repeats),

• Deletions,

• Duplications.

Page 128: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Pharmacogenomics principle: To identify patients at risk for toxicity or reduced response to therapy for optimal medication and/or dose selection

Pharmacogenomic principle

Page 129: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Pharmacogenomic approaches

Page 130: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

http://www.mun.ca/biology/scarr/DNA_VDA_chip_schematic.jpg

SNP chipsTo determine which alleles are present, genomic DNA from an individual is isolated, fragmented, tagged with a fluorescent dye, and applied to the chip.

The genomic DNA fragments anneal only to those oligos to which they are perfectly complementary.

A computer reads the position of the two fluorescent tags and identifies the individual as a C / T heterozygote.

The single spots in the other three columns indicate that the individual is homozygous at the three corresponding SNP positions.

Page 131: Uravnavanje izražanja genov, transkriptomika in drugi pristopi po ...

Kemogenomika, toksikogenomika, farmakogenomika - povzetek

Kemogenomika– išče kandidatna zdravila z rešetanjem knjižnic kemičnihspojin

Toksikogenomika– preiskuje potencialni toksični vpliv kemičnih spojin(potencialnih zdravil) na genom

Farmakogenomika – preiskuje vpliv genetskega ozadja na različen odzivposameznika na zdravila.

Določitev korelacije med kemičnimi lastnostmi spojine in biološkofunkcijo predtavlja še vedno velik izziv. Zahteva multidisciplinarneraziskovalne skupine, ki lahko pridobivajo, analizirajo in povežejopodatke s področja kemije, biologije in “omik”.