Bioinformatics Symposium 2005

By Dr. Azman Firdaus Shafii, Founder Chairman & CEO (dafs@aldrix.net)

Bioinformatics Symposium 2005: IT Comes Alive

Theaterette, HELP University College, Tuesday, 26th July 2005

OPEN SOURCE SYSTEMS SDN BHD: Innovation through ICT and Life Sciences

www.aldrix.net


Page 2:

Biotechnology is 10,000 years old now

Leavening bread and beer fermentation (4,000 BC, Egypt, Mesopotamia)

Production of cheese and wine (4,000 – 2,000 BC, Sumeria, Egypt, China)

First antibiotic (mouldy soybean curds to treat boils) (500 BC, China)

and so on...

Open Source Systems Sdn. Bhd. Page 2

Page 3:

The Central Dogma in Biology is still

DNA → RNA → PROTEIN → STRUCTURE → FUNCTION

DNA → RNA: Transcription (RNA → DNA: Reverse Transcription)

RNA → PROTEIN: Translation

PROTEIN → STRUCTURE: Folding, Post-Translational Modifications?

Hey Folks, looks like

DNA = cellular "lords"

Proteins = cellular "worker-slaves" !!
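The flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is treated as the coding strand, the standard genetic code is used, and folding and post-translational modifications are not modelled at all.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """DNA (coding strand) -> mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def reverse_transcribe(rna):
    """mRNA -> DNA: replace U with T."""
    return rna.upper().replace("U", "T")


def translate(rna):
    """mRNA -> protein (one-letter codes), stopping at the first stop codon.

    Any incomplete codon at the end of the sequence is ignored.
    """
    dna = reverse_transcribe(rna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, "->", translate(mrna))
```

A sequence containing letters other than A, C, G, T/U raises a `KeyError` here; a real tool would handle ambiguity codes.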

Page 4:

Pharmaceutical Industry is 160 years old now

• USD550B market now, USD700B in 2008, 40% still in USA.

• Thousands of firms (i.e. fragmented) including generic drug makers, CROs, wholesalers, retailers.

• On top sit the "Big Pharmas", the Big 12, with HQs in USA/EU and half of total retail sales; Pfizer is the leader at 10% market share.

• Big Pharmas are very profitable (25% profit margins) but now under cost and new-product pressures. Can't rely solely on the "blockbuster drugs" pipeline anymore.

• Drug discovery and development to market cost (including discounted financial opportunity costs)? Average USD800 Million.

• USA Healthcare Spending = USD1.8T, ca. 15% of GDP

Page 5:

The Price of Fighting Cancer

The estimated costs of some of a new wave of cancer drugs (fastest growing sector), which aim at the disease without the side effects of traditional chemotherapy.

Cancer Drug   Manufacturer            Approved for Use   Type of Cancer Treated   Est. Annual Cost per Patient
Erbitux       ImClone/Bristol-Myers   2004               Colorectal               $111,000
Avastin       Genentech               2004               Colorectal               $54,000
Herceptin     Genentech               1998               Breast                   $38,000
Tarceva       Genentech               2004               Lung                     $35,000

Note: Shares of Genentech have quadrupled in the last 2 years!

Drug price = ƒ {R&D costs, manufacturing, value to patients, etc.}

Source: Sanford C. Bernstein & Co. (NYT July 12, 2005)

Page 6:

Bioinformatics in Drug Discovery/Development Process

[Figure: chart relating technologies and research areas to the stages of drug development. Labels recoverable from the figure are listed below.]

Development Stage: Drug Discovery, Preclinical, Clinical (I, II, III, IV), Approval

Research Area: Genomics, Functional Genomics, Proteomics, Cell Assays, Combinatorial Chemistry, Screening, Animal Models, Pharmacogenomics, Human Trials

Other labels: Genes, Drug Targets, Drug Leads, Drug Tests, Drug

Technology: Bioinformatics; Positional Cloning; Parallel Sequencing; 2-D Gel, Mass Spec.; Cellular Assays; Model Organism, Gene Knock-outs; Small Molecules; Genotyping, Phenotyping, SNPs Markers; High-Throughput and Ultra-High Throughput Screening; Structural Drug Design; Molecular Informatics; Differential Display, Expression Patterns, Reporter Gene Technologies; Chip Technologies, DNA Chips, Protein Chips, Microarrays

Source: IBM Life Science, 2002

Page 7:

Pharmaceutical R&D = USD55B in 2005

Traditionally, 10 – 12 years to develop 1 drug

Drugs to the public: Maximise efficacy, Minimise toxicity

10,000 Molecules Screened

250 Selected for Pre-clinical Testing

10 Ready for Clinical Testing

1 FDA/Regulator Approval
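The attrition funnel on this slide can be turned into stage-by-stage survival rates with a little arithmetic. The sketch below uses only the four counts from the slide; the function and variable names are hypothetical, chosen for illustration.

```python
# Counts taken from the slide: molecules surviving to each stage.
FUNNEL = [
    ("Molecules screened", 10_000),
    ("Selected for pre-clinical testing", 250),
    ("Ready for clinical testing", 10),
    ("FDA/regulator approval", 1),
]


def stage_survival(funnel):
    """Fraction of molecules that pass from each stage to the next."""
    return [
        (name, count / previous_count)
        for (_, previous_count), (name, count) in zip(funnel, funnel[1:])
    ]


def overall_survival(funnel):
    """Fraction of screened molecules that end up approved."""
    return funnel[-1][1] / funnel[0][1]


if __name__ == "__main__":
    for name, rate in stage_survival(FUNNEL):
        print(f"{name}: {rate:.1%} of the previous stage")
    print(f"Overall: {overall_survival(FUNNEL):.2%} of screened molecules")
```

On these figures, 2.5% of screened molecules reach pre-clinical testing, 4% of those reach clinical testing, 10% of those are approved, and the overall yield is 1 in 10,000.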

Page 8:

Trends

• Fierce competition from generics (I,C)

• Reduce D3 pipeline time frame to 7 years

• Collaborate/acquire smaller biotech companies (1/3 of new molecules originate here now, 1/3 more from university labs)

• Maximise use of bioinformatics (and other "matics") including computational biology across the 60 departments involved

• R&D outsourcing (some parts, 70:30 say) to CROs in lower cost BRIC(S) countries

• Early days in pharmacogenomics or "personalised medicine"

Page 9:

Modern Biotechnology Industry is 30 years old
(with the founding of Genentech, USA, on April 7, 1976)

Years to IPO: 3 – 5 years

Years to First Product: 6 – 12 years

Years to Profit: 9 – 15 years

Number of biotech companies? (Ernst & Young, 2005)

Country          Companies
USA              1430
Canada           470
Germany          360
United Kingdom   320
Australia        220
China            150
India            100
Malaysia         20-50?

Worldwide: 4,416 companies, 641 PLCs, 78% USA.

Today, it is a USD55B industry by revenues

Market capitalisation (USD330B, 2004 in USA alone) but be careful about hypes and bubbles.

Use your common sense, do not be taken in by Wall Street analysts. Understand what are pendulum swings, market structural shifts, fundamentally disruptive

Page 10:

Conclusion:

Biotech is more risky than ICT, has a long gestation period. That's why in many countries where there is no venture capital industry, the government must take the initial lead.

Malaysia's National Biotech Policy was unveiled on April 28, 2005. What is the order of priority?

Green Biotech? Agri / Food

Red Biotech? Healthcare / Pharma

White Biotech? Industrial (including biofuels, vitamins, environmental remediation, etc.)

Will impact sources and sinks of fund flows, deal flows.

Page 11:

ESSENTIALLY, BIOINFORMATICS HAS 3 COMPONENTS

• Creation of DATABASES (e.g. GenBank, EMBL, DDBJ for genomes, updated daily) allowing storage and management of large biological data sets (e.g. Sanger Centre's 70 ABI sequencing units generate 60M bases of raw data daily to become 600M annotated finished sequences per year). Biologists want manually curated, biologically validated annotation. (SWISS-PROT has 90 annotators, mostly females!).

• Development of ALGORITHMS and STATISTICS to determine relationships among members of large biological data sets

• Use of the TOOLS for the ANALYSIS and INTERPRETATION of various types of biological data, including DNA, RNA and protein sequences, protein structures, gene expression profiles and biochemical pathways.

EMBOSS, BLAST, STADEN, CLUSTALW
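As an illustration of the "algorithms" component, the sketch below scores the relationship between two sequences with a textbook global alignment (Needleman-Wunsch dynamic programming). It is a teaching sketch, not how BLAST or ClustalW are implemented; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b (Needleman-Wunsch).

    Only the score is computed, using two rows of the dynamic-programming
    table; recovering the alignment itself would need a traceback.
    """
    # Row 0: aligning an empty prefix of a against prefixes of b costs gaps.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]  # column 0: prefix of a against empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("GATTACA", "GCATGCA"))
```

Higher scores suggest closer relationships under the chosen scoring scheme; production tools use substitution matrices and heuristics to make this practical on large databases.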

Page 12:

Finding the Right Data

Name                Address                Description
Ensembl             www.ensembl.org        The Human Genome
GenBank/DDBJ/EMBL   www.ncbi.nlm.nih.gov   Nucleotide sequence
PubMed              www.ncbi.nlm.nih.gov   Literature references
NR                  www.ncbi.nlm.nih.gov   Protein sequences
SWISS-PROT          www.expasy.ch          Annotated protein sequences
InterProScan        www.ebi.ac.uk          Protein domains
OMIM                www.ncbi.nlm.nih.gov   Genetic diseases
Enzymes             www.chem.qmul.ac.uk    Enzymes
PDB                 www.rcsb.org/pdb       Protein structures
KEGG                www.genome.ad.jp       Metabolic pathways

Source: Bioinformatics For Dummies (2003)

Analyzing your DNA/RNA Sequence

Name                     Address                                  Description
Webcutter                www.firstmarket.com/cutter               Restriction map
PCR                      biotools.umassmed.edu/bioapps            PCR primers design
Assembly                 bio.ifom-firc.it/ASSEMBLY/assemble.html  Simple DNA assembling for small sequences
GenomeScan               genes.mit.edu/genomescan/                Gene discovery
blastn, tblastn, blastx  www.ncbi.nlm.nih.gov                     Database search
The Genome Browser       genome.cse.ucsc.edu                      Browse the ultimate data!
Mfold                    www.bioinfo.rpi.edu                      RNA structure prediction

Analyzing Your Protein Sequence

Name           Address                          Description
BLAST          www.ncbi.nlm.nih.gov             Database homology search
SRS            srs.ebi.ac.uk                    Database search
Entrez         www.ncbi.nlm.nih.gov             Database search
InterProScan   www.ebi.ac.uk                    Find protein domains
ExPASy         www.expasy.ch                    Analyze a protein
Dotlet         www.ch.embnet.org                Make a dot plot
ClustalW       www.ebi.ac.uk                    Multiple sequence alignment
T-Coffee       igs-server.cnrs-mrs.fr/Tcoffee   Evaluate multiple alignment
Jalview        www.es.embnet.org                Multiple alignment editor
PSIPRED        bioinf.cs.ucl.ac.uk/psipred      Secondary structure prediction
Cn3D           www.ncbi.nlm.nih.gov/Structure   Display and spin 3-D structures
Phylip         bioweb.pasteur.fr/into-uk.html   Tree reconstruction
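To give a feel for what a restriction-map tool such as the Webcutter entry above does, here is a minimal sketch that scans a DNA sequence for the recognition sites of three well-known enzymes. It is not Webcutter's implementation: the function name is hypothetical, only three enzymes are included, and it reports where each recognition site starts (0-based) rather than the exact cut position.

```python
# Recognition sequences of three common restriction enzymes.
# All three are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, enzymes=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions of its recognition site]}."""
    seq = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping sites are not missed.
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    for enzyme, positions in find_sites("AAGAATTCGGGATCC").items():
        print(enzyme, positions)
```

The web tools listed in the tables cover hundreds of enzymes and also report fragment sizes; this sketch only shows the core pattern search.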

Page 13:

Role of Bioinformatics
(This role is already quite developed in Engineering and Finance/Banking, where component parts, or building blocks, are already known and their respective functions well understood)

DATA

• Stored fact
• Inactive (they exist)
• Technology-based
• Gathered from various sources

INFORMATION

• Presented fact
• Active (enabler)
• Business-based
• Transformed from data

Data → Information → Knowledge

Analyses, Modelling, Experiments (in-vivo, in-vitro, population, etc.)

LIMS

The Typical Bioinformatics Equation:

Know Sequence, What's the Consequence?

Adapted from Dr Hwa A Lim's "Genetically Yours", World Scientific Publisher, 2002.

Page 14:

THE "NEW BIOLOGY" INFORMATION-DELIVERY MODEL

[Figure: layered architecture diagram. Labels recoverable from the figure are listed below.]

Layers: Presentation Layer; Computational & Data Management Computing Layer; Information Management Layer

Components: Laptop; PC; Multimedia H/P; PDA; Visualization & Hi-performance Workstation; Global WAN + Great Global Grid (Grid Computing Framework); Storage Management; Database Management; Parallel Linux Clusters*; Server A

Data: Biological, Chemical, Proteomic, Genetic, Clinical, Financial, Admin, Diagnostic, Treatment, Pathways

Adapted from EMC

Note: Use of scripting languages (Perl, Python, R, Java, etc.), XML, Linux and other Open Source Bioinformatics Software is widespread

SMP* clusters/dedicated irons (e.g. IBM BlueGene/L): 65,000 CPUs, 132TF on LINPACK

* An OSS White Paper on High Performance Computing is downloadable from http://www.aldrix.net

Page 15:

Grand Challenge I : Genomics to Biology
Elucidating the structure and function of genomes

I-1 Comprehensively identify the structural and functional components encoded in the human genome.

I-2 Elucidate the organisation of genetic networks and protein pathways and establish how they contribute to cellular and organismal phenotypes.

I-3 Develop a detailed understanding of the heritable variation in the human genome.

I-4 Understand evolutionary variation across species and the mechanisms underlying it.

I-5 Develop policy options that facilitate the widespread use of genome information in both research and clinical settings.

NHGRI, 2003

Page 16:

Bioinformatics in Drug Discovery / Development Process (Example)

[Figure: flow diagram linking databases and computational methods to the stages of drug discovery and development. Labels recoverable from the figure are listed below.]

Process stages: Disease selection; Disease model; Target gene; Target receptor; LTS; HTS; Lead Identification & optimization; Rational drug design; Empirical Medicine; Pre-clinical trials; Clinical Trials (P-I / II / III / IV)

Data and databases: DNA information (Nucleotide sequence of genes); DNA sequences & maps; Genomics database; Protein data; Cell biology database; In vivo Disease biology; Animal genetic diseases; Physiology database; Medical research database; Preclinical and experimental Data; Clinical Data

Methods and disciplines: Gene Expression Analysis; Protein structure Prediction / Analysis; Protein-protein interaction; Protein-molecule interaction; Computational Chemistry; Computational physical chemistry; Medicinal chemistry; Molecular diversity; chemistry structure; Genomics / Proteomics; Structural Genomics; Functional Genomics; Proteomics; Clinical Biology

Source: IBM Life Science, 2002

Page 17:

Grand Challenge II : Genomics to Health
Translating genome-based knowledge into health benefits

II-1 Develop robust strategies for identifying the genetic contributions to disease and drug response.

II-2 Develop strategies to identify gene variants that contribute to good health and resistance to disease.

II-3 Develop genome-based approaches to prediction of disease susceptibility and drug response, early detection of illness, and molecular taxonomy of disease states.

II-4 Use new understanding of genes and pathways to develop powerful new therapeutic approaches to disease.

II-5 Investigate how genetic risk information is conveyed in clinical settings, how that information influences health strategies and behaviours, and how these affect health outcomes and costs.

II-6 Develop genome-based tools that improve the health of all.

NHGRI, 2003

Page 18:

Grand Challenge III : Genomics to Society
Promoting the use of genomics to maximise benefits and minimise harms

III-1 Develop policy options for the use of genomics in medical and non-medical settings.

III-2 Understand the relationship between genomics, race, and ethnicity and the consequences of uncovering these relationships.

III-3 Understand the consequences of uncovering the genomic contributions to human traits and behaviours.

III-4 Assess how to define the ethical boundaries for uses of genomics.

NHGRI, 2003

Page 19:

Biodiversity: Hype or Real?

AstraZeneca (7th ranked Big Pharma @ USD22B revenues, USD70B market capitalisation) thinks it's for real. Has invested A$100M looking for drug compounds among Australia's flora and fauna in partnership with locals (e.g. Griffith University, Queensland). Evaluated 41,000 samples of plants and micro-organisms.

Malaysia? Early stage only. Our RM2B natural products/herbal* industries will probably focus on nutraceuticals and cosmeceuticals first. We will probably enter the global pharmaceuticals sector via bio-generics, with or without international partners.

* Consider Pegaga (Centella asiatica):

@ Pasar malam                                                    1 kg = RM4.00
Dried, extracts                                                  1 kg = RM100.00
Upon standardisation                                             1 kg = RM380.00
Upon standardisation to pharmaceutical/cosmetics industry grade  1 kg = RM3,800.00!

(Source: Rajen, [email protected])
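The Pegaga figures above imply a steep value chain. A small sketch of the arithmetic, using only the four prices quoted on the slide (the stage labels and function name are hypothetical shorthand):

```python
# Price per kg in RM at each stage, as quoted on the slide.
PRICES_RM_PER_KG = [
    ("Pasar malam", 4.00),
    ("Dried, extracts", 100.00),
    ("Upon standardisation", 380.00),
    ("Pharmaceutical/cosmetics grade", 3800.00),
]


def value_multipliers(prices):
    """For each stage: (stage, multiple of raw price, multiple of previous stage)."""
    base_price = prices[0][1]
    previous_price = base_price
    rows = []
    for stage, price in prices:
        rows.append((stage, price / base_price, price / previous_price))
        previous_price = price
    return rows


if __name__ == "__main__":
    for stage, vs_raw, vs_previous in value_multipliers(PRICES_RM_PER_KG):
        print(f"{stage}: {vs_raw:g}x raw price, {vs_previous:g}x previous stage")
```

On these numbers, drying and extraction multiplies the price 25-fold, standardisation takes it to 95 times the raw price, and pharmaceutical/cosmetics grade reaches 950 times the pasar malam price.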

Page 20:

The Last Page?

Biotech/bioinformatics are K-intensive. Basic sciences must be solid. But IMAGINATION will take you to breakthroughs. Need specialised skills to innovate and create value. Worthwhile doing your post-graduate degrees (MSc, PhD, post-doctoral, overseas stints, etc.). Computer Science people must cultivate a love/interest in molecular stuff (& vice-versa). OSS and others are recruiting!

Worldwide, nations are building clusters (Michael Porter's diamond).

For (bio) clusters to be impactful (takes at least 10 years), need all 5 synergistic elements to be in place.

• Earth - universities, RIs

• Fire - entrepreneurship

• Water - finance esp. venture capital

• Wood - networking (government, industry, academia)

• Gold - stock market (value realisation, IPs are protected)

Career development? Look for scenarios that are likely to have an impact on our lives. History is littered with corpses that only had a single view of the future.

Page 21:

Definitely the Last Page

Malaysia – just starting, have reasonable agro-industrial base

• HR capital constraints (USA produces 10,000 PhDs in Science and Engineering per year!)

• Bioinformatics skills take time to hone. Best to interact closely with wet-lab people or bench scientists. Grads must be flexible and willing to adapt/evolve. Is it our brain that's lacking bandwidth?

• Regional competitors abound, what's our competitive advantage?

• But national commitment is there (will it be as impactful as our plantations and ICT industries?)

• Have right people at right places, with right tools and adequate resources, and plenty of imagination.

Remember: whatever we do, it's all relative to the competition!

Thank You