Evolutionary arguments in medical genomics
International Life Sciences Workshop “Decision-Making in Biomedical Science – Meet Experts”
September 12 – 16 | 2014 Potsdam | Germany
Evolutionary arguments in medical genomics
Nikita N. Khromov-Borisov
Pavlov First Saint Petersburg State Medical University Saint Petersburg, Russia
[email protected] +7 952-204-89-49; +7 921-449-29-05
http://independent.academia.edu/NikitaKhromovBorisov https://www.researchgate.net/profile/Nikita_Khromov-Borisov?ev=hdr_xprf
1
Slides are freely available to all
Nikita N. Khromov-Borisov Department of Physics, Mathematics and Informatics
Pavlov First Saint Petersburg State Medical University
+7-952-204-89-49; +7-921-449-29-05 http://independent.academia.edu/NikitaKhromovBorisov
2
Key words:
• genetics of predisposition,
• genetic polymorphism,
• genetic association,
• evolutionary medical genomics,
• neutral evolution,
• genetic load,
• opposite pleiotropy,
• homeostasis,
• reproducibility,
• predictive values,
• Bayesian graphs,
3
Albert Einstein
• ‘‘We can’t solve problems by using the same kind of thinking we used when we created them’’
• Cited by: Heng H.H.Q. The genome-centric concept: resynthesis of evolutionary theory. BioEssays, 2009; 31: 512–525.
4
Methodology of restrictions and limitations
• Most fundamental scientific principles are in fact exclusions (“taboos”) and the progress of science is associated with the recognition of the importance of some principal restrictions and/or limitations.
• It is impossible to build a perpetuum mobile (perpetual motion machine).
• It is impossible to move at superluminal speed.
• It is impossible to heat a hot body with a colder one.
• Two identical fermions (e.g. two electrons) cannot occupy the same quantum state simultaneously (Pauli exclusion principle).
• Proteins cannot serve as templates: replication, transcription and/or translation of proteins is impossible (central dogma of molecular biology). Etc.
5
Science is not omnipotent
• Not every result of basic research leads immediately to practical application.
• Some results only reveal insurmountable uncertainty and fundamental limits to our abilities.
• This applies in particular to attempts to use genetic testing to diagnose a predisposition to multifactorial diseases and syndromes, or an aptitude for certain activities (e.g., for achievement in sport or for a specific profession).
6
Post genomic era
• “We bought, at the price of a dollar per letter, a huge book without a table of contents.”
• Eric Lander
• After sequencing the human genome we found ourselves in the position of a player on the Russian TV game show “Field of Dreams” who has guessed all the letters but is unable to read the word.
7
• In contrast to rare Mendelian diseases, extensive family-based linkage analysis in the 1990s was largely unsuccessful in uncovering the basis of common diseases that afflict most of the population.
• These diseases are polygenic, and there were no systematic methods for identifying underlying genes.
• As of 2000, only about a dozen genetic variants (outside the HLA locus) had been reproducibly associated with common disorders.
• A decade later, more than 1,100 loci affecting more than 165 diseases and traits have been associated with common traits and diseases, nearly all since 2007.
8
Genetics of predispositions
• The popular genetic association studies (GAS), that is, studies of genetic susceptibility to disease or of aptitude for a particular type of activity (e.g., high achievement in sport), can appropriately be called the genetics of predispositions.
9
Genetics of predispositions
• Perhaps the most fundamental outcome of the genetics of predispositions is a body of contradictory results and contradictory interpretations.
• The main reason for these contradictions is the poor reproducibility and extremely low predictive ability of the findings.
• Therefore we should not trust assertions about the alleged practical (clinical) value of these countless studies.
• Such assertions are unfounded unless confirmed repeatedly in independent studies.
• And even when the findings are reproduced, their practical (clinical) usefulness still has to be demonstrated.
10
Sources of uncertainty of genetic chiromancy and predictions
11
Sample sizes in physics, chemistry, biology and medicine
• Physicists and chemists work with samples of substances that contain 6∙10²³ particles (atoms or molecules) per mole of pure substance.
• Even 1 nanomole of a given substance contains about 6∙10¹⁴ such particles.
• These particles may be regarded as practically identical.
• However, we should not forget that even at the atomic level a given chemical element has several isotopes.
• And some of them are radioactive.
• In medicine, researchers are limited by the size of the world population, which is less than 10¹⁰ (7.26∙10⁹).
• And human populations are very heterogeneous.
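The orders of magnitude on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic: the Avogadro constant is the standard SI value, the population figure is the one quoted on the slide, and the variable names are illustrative.

```python
AVOGADRO = 6.02214076e23      # particles per mole (SI defined value)
WORLD_POPULATION = 7.26e9     # figure quoted on the slide

# Number of particles in one nanomole (1e-9 mole) of a pure substance.
particles_per_nanomole = AVOGADRO * 1e-9

# How many times larger a one-nanomole "sample" is than all of humankind.
ratio = particles_per_nanomole / WORLD_POPULATION

print(f"particles in 1 nanomole: {particles_per_nanomole:.2e}")
print(f"nanomole / world population: {ratio:.0f}")
```

Even a nanomole-sized sample of a chemical exceeds the largest conceivable human sample by more than four orders of magnitude.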
12
Principal contradiction
• Almost all people are dissimilar, even monozygotic (“identical”) twins (CNV, immunoglobulins, fingerprints).
• This fact is surely one of the main sources of the low reproducibility and predictive ability of results in biomedicine.
• Thus the genetic uniqueness of each person contradicts the statistical methodology, which requires analyzing large numbers (hundreds and thousands) of identical persons to reach firm conclusions.
13
• Thanks to the genome sequence projects, the number of genes in the human has been calculated to be a modest 31,897 (http://eugenes.org/), a relatively small number when one considers that yeast has 7,547, thale cress (Arabidopsis thaliana) contains 29,388 ORFs (open reading frames), and a measly little worm (Caenorhabditis elegans) has 23,399.
• Indeed, Mus musculus (the mouse) has 6,000 genes more than humans, and one may wonder if we really are that complex after all!
14
Variety of determinants of complex phenotype exemplified by the heart hypertrophy
Marian A.J. Molecular genetic studies of complex phenotypes. Translational Research, 2012; 159: 64–79
15
A great variety of genetic polymorphism
• SNP — single nucleotide polymorphism,
• miRNA — RNA interference and micro-RNA,
• piRNA — piwi-interacting RNA,
• lncRNA — long non-coding RNA,
• tmRNA — transfer-messenger RNA,
• eccDNA — extrachromosomal circular DNA,
• microDNA — short eccDNA,
• CNV — copy number variations.
• The association of the last 7 elements, all more recently discovered, with diseases appears to be stronger than that of SNPs.
• They differ even between monozygotic twins and can vary between individual cells of the same tissue (e.g., in the neurons of the brain).
• Let us also not forget the individual variation in the immunoglobulin genes.
16
The number of allele (DNA sequence variant) combinations is practically incalculable
• The number of DNA sequence variations (alleles) in the human genome is astronomical.
• According to NCBI dbSNP build 141 (May 21, 2014), there are 43,737,321 SNPs (single nucleotide polymorphisms).
• The number of their combinations cannot be counted; obviously, it is far greater than the number of people on Earth (7.26 billion) and perhaps even than the number of atoms in the Universe (10⁶⁷).
• Therefore it is impossible in principle to prove that a given unique genotype predisposes to a given disease or to a certain propensity.
• For that we would need a large sample of subjects with the given genotype, but the genotype is unique.
• For instance, in forensic genetics 15 STR loci, each with about 10 different repeat numbers (alleles), are sufficient to identify any unrelated person.
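The combinatorial claim on this slide can be made concrete with a rough count. The sketch below assumes that every SNP is biallelic (three possible genotypes per site) and that loci combine independently; both are simplifications introduced here, not statements from the slide. The function names are illustrative.

```python
import math

N_SNPS = 43_737_321            # dbSNP build 141 count quoted on the slide
GENOTYPES_PER_SNP = 3          # assumption: biallelic SNP, genotypes AA, Aa, aa
WORLD_POPULATION = 7.26e9      # figure quoted on the slide
LOG10_ATOMS = 67               # exponent quoted on the slide; commonly quoted
                               # estimates are nearer 80, the conclusion is unchanged

def log10_combinations(n_loci: int, variants_per_locus: int) -> float:
    """log10 of variants_per_locus ** n_loci, computed without overflow."""
    return n_loci * math.log10(variants_per_locus)

def loci_needed(target_log10: float, variants_per_locus: int) -> int:
    """Smallest number of independent loci whose combinations exceed 10**target_log10."""
    return math.floor(target_log10 / math.log10(variants_per_locus)) + 1

# Decimal digits in the number of multi-SNP genotype combinations.
print(f"log10(combinations over all SNPs) = {log10_combinations(N_SNPS, GENOTYPES_PER_SNP):.3e}")

# How few SNPs already suffice to exceed the two reference numbers.
print("SNPs to exceed world population:", loci_needed(math.log10(WORLD_POPULATION), GENOTYPES_PER_SNP))
print("SNPs to exceed 10^67:", loci_needed(LOG10_ATOMS, GENOTYPES_PER_SNP))

# Forensic example: 15 STR loci with about 10 alleles each.
# A diploid locus with k alleles has k*(k+1)/2 unordered genotypes.
STR_ALLELES = 10
str_genotypes = STR_ALLELES * (STR_ALLELES + 1) // 2   # 55
print(f"log10(STR profiles) = {log10_combinations(15, str_genotypes):.1f}")
```

Under these assumptions 21 SNPs already give more genotype combinations than there are people, and 141 SNPs exceed 10⁶⁷, which is why a sample of subjects sharing one complete genotype cannot be assembled.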
17
Situation when genetic uniqueness is practically useful: Genetic passport
18
Theodosius Dobzhansky, 1973
• Nothing in biology makes sense except in the light of evolution.
• The American Biology Teacher, 1973; 35: 125-129.
19
Pierre Teilhard de Chardin
• “Evolution is a light which illuminates all facts, a trajectory which all lines of thought must follow - this is what evolution is”.
• Pierre Teilhard de Chardin - one of the greatest thinkers of our time.
• Teilhard was a creationist, but one who understood that Creation is realized in this world through evolution.
20
Peter Brian Medawar
• “For a biologist, the alternative to thinking in evolutionary terms is not to think at all”
• Medawar P., Medawar J.S, The Life Science: Current Ideas in Biology, London: Wildwood House, 1977
21
• Swynghedauw B. Nothing in medicine makes sense except in the light of evolution: A Review. In: P. Pontarotti (ed.), Evolutionary Biology from Concept to Application, Springer-Verlag Berlin Heidelberg, 2008; pp. 197-207.
• Varki A. Nothing in medicine makes sense, except in the light of evolution. J. Mol. Med., 2012; 90:481–494
• “Understanding human evolution, where we came from, is very important to understanding who we are and where we’re going.”
22
• Kalinowski S.T., Leonard M.J., Andrews T.M.
• Nothing in evolution makes sense except in the light of DNA
• CBE—Life Sciences Education, 2010; 9: 87–97,
• Natural selection is an inherently difficult process for students to grasp.
• “It is almost as if the human brain were specifically designed to misunderstand Darwinism.” Dawkins (1986)
23
Weiss K.M., Buchanan A.V., Lambert B.W. The Red Queen and Her King: Cooperation at all Levels of Life.
Yearbook of Physical Anthropology, 2011; 54: 3–18.
• BEYOND EVOLUTIONARY THEORY
• An excessive focus on evolution as the only thing that ‘‘makes sense in biology’’ to quote Dobzhansky’s famous assertion, draws attention away from things that characterize much more of life, much more of the time.
• What goes on in the lives of cells, and the organisms they comprise, can not only help us understand what happens on the evolutionary time scale, but also relates to a number of other questions that evolutionary theory does not address.
• Dobzhansky’s assertion was good for resisting creationism in schools, but as biology it is manifestly.
24
One controversial evolutionary argument concerned the AB0 blood group system
25
AB0 and diseases
• Only the associations of AB0 blood groups with malignant neoplasms, thrombosis, peptic ulcers, bleeding, and bacterial and viral infections are still regarded as statistically “proven”.
• Alas, these associations have no clinical (practical) importance because their odds ratios (OR) are low and do not exceed OR = 1.5.
26
Associations between AB0 blood groups and diseases, which are considered to be statistically “proven”
Medical condition A > 0 0 > A B/AB > A/0 OR
Malignancy X 1.2 – 1.3
Thrombosis X
Peptic ulcers X 1.2 – 1.4
Bleeding X 1.5
E. coli / Salmonella X
27
Edgren G., Hjalgrim H., Rostgaard K., Norda R., Wikman A., Melbye M., Nyrén O. Risk of gastric cancer and peptic ulcers in relation to AB0 blood type: a cohort study. Am. J. Epidemiol., 2010; 172: 1280–1285
Blood    Donors                              Cause     Cases with gastric cancer
group    N        f (99% CI)                           N     f (99% CI)
A        478633   0.440 (0.438–0.441)        Deficit   331   0.47 (0.41–0.52)
AB       57904    0.0532 (0.0526–0.0539)     Excess    45    0.067 (0.041–0.102)
B        122819   0.113 (0.112–0.114)        Deficit   66    0.10 (0.07–0.14)
0        428978   0.394 (0.393–0.396)        Excess    246   0.37 (0.31–0.43)
HWE, Pval         9∙10⁻⁸³                                    0.12
Homogeneity, Pval: 0.034
BF01: 68.5
28
The authors argue that this large Danish-Swedish cohort study confirms the “association” between blood group A and gastric cancer. In fact, the control group shows a deviation from Hardy–Weinberg equilibrium (HWE) due to a deficiency of allele A. Moreover, the difference between the groups is statistically minor and clinically negligible: OR = 1.15.
Rubanovich A.V., Khromov-Borisov N.N. Theoretical Analysis of the Predictability Indices of the Binary Genetic Tests. Russian Journal of Genetics: Applied Research, 2014, Vol. 4, No. 2, pp. 146–158.
• It is customary to interpret OR values ≤ 1.5 as virtually worthless, from 1.5 to 3.5 as very low, from 3.5 to 9.0 as low, from 9.0 to 32 as moderate, from 32 to 100 as high, and > 100 as very high.
• Our theoretical study shows that when OR < 2.2 a marker inevitably has low predictive performance in all respects, whatever the frequencies of the disease and of the marker.
• A marker can be a good classifier if OR > 5.4, provided that its population frequency is sufficiently high (pM > 0.3).
• In practice, this means that the lower bound of the confidence interval for the estimated OR must satisfy these inequalities.
• Earlier, similar critical levels for observed effects in the genetics of predispositions were proposed for the relative risk (RR < 2 and RR > 5, respectively).
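The thresholds above can be applied mechanically to any 2×2 table. The following sketch is not the authors' procedure: it uses the textbook Wald interval for the log odds ratio, assumes the OR is oriented so that OR ≥ 1, and introduces the function names and the label "intermediate" purely for illustration. As an example it takes the blood group A counts from the gastric cancer table; the crude OR from those counts comes out at roughly 1.2 and need not coincide with the value of 1.15 quoted on the slide.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval for a 2x2 table.

    a: cases with the marker      b: cases without it
    c: controls with the marker   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def verbal_grade(odds_ratio):
    """Verbal scale from the slide; assumes the OR is oriented so that OR >= 1."""
    scale = [(1.5, "virtually worthless"), (3.5, "very low"), (9.0, "low"),
             (32.0, "moderate"), (100.0, "high")]
    for upper_limit, label in scale:
        if odds_ratio <= upper_limit:
            return label
    return "very high"

def marker_verdict(ci_lower, marker_frequency):
    """Apply the 2.2 and 5.4 thresholds to the lower confidence bound."""
    if ci_lower < 2.2:
        return "low predictive performance"
    if ci_lower > 5.4 and marker_frequency > 0.3:
        return "can be a good classifier"
    return "intermediate"   # hypothetical label for everything in between

# Blood group A versus all other groups, counts transcribed from the slide.
cases_a, cases_other = 331, 45 + 66 + 246
donors_a, donors_other = 478633, 57904 + 122819 + 428978

or_a, lo_a, hi_a = odds_ratio_ci(cases_a, cases_other, donors_a, donors_other)
freq_a = donors_a / (donors_a + donors_other)
print(f"OR = {or_a:.2f}, 95% CI {lo_a:.2f} to {hi_a:.2f}")
print(verbal_grade(or_a), "|", marker_verdict(lo_a, freq_a))
```

Judging a marker by the lower confidence bound rather than by the point estimate follows the fourth bullet above: a large estimated OR with a wide interval does not pass.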
29
Zhang B., Beeghly-Fadiel A., Long J., Zheng W. Genetic Variants Associated with Breast Cancer Risk: Comprehensive Field Synopsis, Meta-Analysis, and Epidemiologic Evidence. Lancet Oncol., 2011; 12(5): 477–488
• More than 1,000 candidate-gene association studies in breast cancer have been published in the last two decades, evaluating more than 7,000 genetic variants.
• While some of these variants may represent true associations with breast cancer risk, many more are false-positive associations that fail to replicate in additional study populations.
• 51 variants in 40 genes showed statistically significant associations with breast cancer risk.
• Cumulative epidemiologic evidence for an association with breast cancer risk was graded as strong for 10 variants in six genes (ATM, CASP8, CHEK2, CTLA4, NBN, and TP53),
• moderate for four variants in four genes (ATM, CYP19A1, TERT, and XRCC3), and
• weak for 37 additional variants.
• Additionally, in meta-analyses that included a minimum of 10,000 cases and 10,000 controls, convincing evidence of no association with breast cancer risk was identified for 45 variants in 37 genes.
30
High- and moderate-penetrance breast cancer susceptibility genes
Gene     Variants                             Relative risk, RR   Population frequency (%)
BRCA1    Multiple mutations                   >10                 0.1
BRCA2    Multiple mutations                   >10                 0.1
TP53     Multiple mutations                   >10                 <0.1
PTEN     Multiple mutations                   >10                 <0.1
ATM      Truncating and missense mutations    2–4                 <0.5
CHEK2    1100delC                             2–5                 0.7
BRIP1    Truncating mutations                 2–3                 0.1
PALB2    Truncating mutations                 2–5                 <0.1
31
Sources of uncertainty of genetic chiromancy and predictions
32
P = G + E + (G x E) + (Gj x Gk) + …
• We should remember the fundamental statement:
• Phenotype (P) is the product of the interaction between genotype (G) and environment (E).
• Such interaction can be simple (linear, additive) or sophisticated (nonlinear, multiplicative, compensatory, neutralizing, opposite, etc.).
• Some interactions are cryptic: they do not show up under normal conditions and are therefore hard to reveal.
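A toy calculation shows how an interaction term in P = G + E + (G x E) can stay cryptic. The sketch below keeps one locus and one exposure; the coefficient values, the coding of G as an allele count (0, 1, 2) and of E as 0 for normal conditions and 1 for stress are all assumptions made for illustration.

```python
def phenotype(g, e, b_g, b_e, b_ge):
    """P = b_g*G + b_e*E + b_ge*(G x E) for one locus and one exposure."""
    return b_g * g + b_e * e + b_ge * g * e

# Hypothetical coefficients: no main genetic effect, only a G x E term.
CRYPTIC = dict(b_g=0.0, b_e=1.0, b_ge=2.0)

for e, label in [(0, "normal conditions"), (1, "stress")]:
    # Phenotypes of the three genotypes (0, 1 or 2 copies of the allele).
    values = [phenotype(g, e, **CRYPTIC) for g in (0, 1, 2)]
    print(label, values)
```

Under normal conditions the three genotypes are indistinguishable (all values are 0.0), whereas under stress they separate (1.0, 3.0, 5.0). A study sampling only unexposed subjects would find no genetic effect at all.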
33
Environment – internal and external: «Exposome»
Wild C.P. The exposome: from concept to utility. International Journal of Epidemiology, 2012; 41: 24–32
34
Padmanabhan S., Newton-Cheh C., Dominiczak A.F. Genetic basis of blood pressure and hypertension. Trends in Genetics, 2012; 28(8): 1–12
35
(a) Hypertension is caused by one mutation and occurs in discrete subpopulations. (b) There is no clear distinction between hypertension and normotension; hypertension is an extreme variant of the continuum and has a polygenic nature.
Heritability and the environment
Components of genetic and environmental variability are clearly distinct
Continuity as an interpenetration of genetic and environmental variability
36
There are no common disorders — just the extremes of quantitative traits
• We predict that research on polygenic liabilities will eventually lead to a focus on quantitative dimensions rather than qualitative disorders.
• The extremes of the distribution are important medically and socially, but we see no scientific advantage in reifying diagnostic constructs that have evolved historically on the basis of symptoms rather than aetiology.
• A more provocative way to restate our argument is that from the perspective of polygenic liability, there are no common disorders — just the extremes of quantitative traits.
• Plomin R., Haworth C.M.A., Davis O.S.P. Common disorders are quantitative traits. Nature Rev. Genet., 2009. – Vol. 10. – P. 872-878.
37
Alexey Matveyevich Olovnikov
• Aging as a universal chronic “disease of quantitative traits”: cell aging and RNA-dependent ion-modulated gene expression.
• Biomedical journal Medline.ru
• Volume 4, article 28 (p. 31)
• February 2003
38
Aging is a universal genetic “disease of quantitative traits”
• During the aging of humans and animals, no expression of fundamentally new macromolecules is observed.
• All that occurs during aging is a quantitative, not a qualitative, change in various traits, whose number is enormous.
• If aging is really a disease of quantitative traits, it is appropriate to direct biogerontological research at the key molecular mechanisms that underlie the regulation of quantitative traits in eukaryotes.
39
• The larger the number of factors, both genetic and environmental, influencing a given trait (disease or propensity), the greater the unpredictability of its manifestation.
• The same disease can be determined by different versions of different genes.
• The same gene may be involved in the development of various diseases and syndromes.
• Some versions of a given gene (DNA sequence variants, alleles) may predispose to one disease, and other versions of it to another disease.
40
One disease can be influenced by many genes
G-1 G-2 G-3 . . . G-k
Disease
One gene can affect many diseases
Gene
D-1 D-2 D-3 . . . D-k
41
• Each gene affects many traits, and each trait is determined by many genes.
• Sources of uncertainty and unpredictability:
• Reduced penetrance,
• Variable expressivity,
• Pleiotropy.
• The terms penetrance and expressivity were introduced by Oskar Vogt and by Elena (Helena) and Nikolai Timofeev-Resovsky in 1926.
• Vogt O. Psychiatrisch wichtige Tatsachen der zoologisch-botanischen Systematik. Zeitschrift für die gesamte Neurologie und Psychiatrie, 1926; 101:805-32
• Timofeeff-Ressovsky H.A., Timofeeff-Ressovsky N.W. Über das phänotypische Manifestieren des Genotyps. II. Über idio-somatische Variationsgruppen bei Drosophila funebris. Wilhelm Roux’ Archiv für Entwicklungsmechanik der Organismen, 1926; 108: 148-70
42
Variable expressivity - Syndactyly
Complete Partial
Polydactyly – a character with incomplete penetrance as well as variable expressivity
Reduced penetrance
• Penetrance refers to the proportion of people with a particular genetic change (such as a mutation in a specific gene) who exhibit signs and symptoms of a genetic disorder.
• If some people with the mutation do not develop features of the disorder, the condition is said to have reduced (or incomplete) penetrance.
• Reduced penetrance often occurs with familial cancer syndromes.
• For example, many people with a mutation in the BRCA1 or BRCA2 genes will develop cancer during their lifetime, but some people will not.
• Doctors cannot predict which people with these mutations will develop cancer or when the tumors will develop.
• This phenomenon can make it challenging for genetics professionals to interpret a person’s family medical history and predict the risk of passing a genetic condition to future generations.
45
Variable expressivity
• Variable expressivity refers to the range of signs and symptoms that can occur in different people with the same genetic condition.
• For example, the features of Marfan syndrome vary widely— some people have only mild symptoms (such as being tall and thin with long, slender fingers), while others also experience life-threatening complications involving the heart and blood vessels.
• Although the features are highly variable, most people with this disorder have a mutation in the same gene (FBN1 – fibrillin-1).
• If a genetic condition has highly variable signs and symptoms, it may be challenging to diagnose.
46
Marfan syndrome
Penetrance and expressivity
48
Pleiotropy
• Pleiotropy occurs when one gene influences multiple, seemingly unrelated phenotypic traits.
• Pleiotropic gene action can limit the rate of multivariate evolution when natural selection, sexual selection or artificial selection on one trait favours one specific version of the gene (allele), while selection on other traits favors a different allele.
• The underlying mechanism of pleiotropy in most cases is the effect of a gene on metabolic pathways that contribute to different phenotypes.
49
Pleiotropy
• One of the most widely cited examples of pleiotropy in humans is phenylketonuria (PKU).
• Phenylketonuria (PKU) is an autosomal recessive metabolic genetic disorder characterized by mutations in the gene for the hepatic enzyme phenylalanine hydroxylase (PAH).
• A defect in the single gene (PAH) that codes for this enzyme therefore results in the multiple phenotypes associated with PKU, including mental retardation, tumors, eczema, mousy odor and pigment defects that make affected individuals lighter skinned.
50
Antagonistic Pleiotropy
• For example, in humans, the p53 gene directs damaged cells to stop reproducing, thereby resulting in cell death.
• This gene helps avert cancer by preventing cells with DNA damage from dividing, but it can also suppress the division of stem cells, which allow the body to renew and replace deteriorating tissues during aging.
• This situation is therefore an example of antagonistic pleiotropy, in which the expression of a single gene causes competing effects, some of which are beneficial and some of which are detrimental to the fitness of an organism.
51
Pleiotropy and homeostasis
• Pleiotropy is one of the main mechanisms for maintaining homeostasis especially when it is mutually opposite (“compromise” or “compensatory”) and / or antagonistic.
• Homeostasis is a central principle of living systems; it is the relatively stable state of equilibrium, or the tendency toward such a state, between different but interdependent elements and subsystems of an organism.
52
Example: APOE
• The case of APOE provides a familiar example of a common variant with well-established cross-phenotype effects.
• The APOE*ε4 allele is a known risk factor for both atherosclerotic heart disease and Alzheimer’s disease but has also been shown to exert a protective effect on risk of age-related macular degeneration.
53
Example: The multiplicity of physiological functions of ACE
• Angiotensin-converting enzyme (ACE) not only regulates blood pressure; it also participates in fertilization, in the formation of immune cells and in the development of atherosclerosis.
• Its high expression in macrophages, immune cells, prevents the formation of malignant tumors.
• Therefore, the use of ACE inhibitors can provoke cancer and Alzheimer's disease.
54
Nawaz S.K., Hasnain S. Pleiotropic effects of ACE polymorphism. Biochemia Medica, 2009; 19(1): 36–49.
Association present Association absent Controversial
Diabetic nephropathy Type 2 diabetes Hypertension
Atherosclerosis Diabetic retinopathy Coronary heart disease and stroke
Alzheimer disease Allele D is “preventing”
Gastric cancer Colorectal cancer
Parkinson’s disease Systemic lupus erythematosus Longevity
Breast cancer
Oral cancer
Treatment of osteoporosis
Diseases allegedly “associated” with the indel dimorphism in the ACE gene
55
The polymorphism is due to the insertion (I allele) or deletion (D allele) of a 287 bp fragment in intron 16 of the ACE gene on chromosome 17.
Associations highlighted in brown are those that continue to be regarded in Russia as firmly established.
Ubiquitous VDR and ESR - receptors of “vitamin” D and estrogen
• VDR activity extends far beyond the metabolism of calcium and parathyroid hormone (PTH).
• It participates in the transcription of 900 genes, some of which are key to health, such as MTSS1 (metastasis suppressor), as well as key components of innate immunity (cathelicidin antimicrobial peptide, beta-defensins, TLR2 - toll-like receptor, etc. ).
• The role of VDR in innate immunity is unique to humans.
• No other animal model (e.g., the mouse) has evolved such a function for this receptor.
• The estrogen receptor ESR is directly or indirectly responsible for the expression of 6,000 genes, i.e. 19% of the entire genome.
56
ESR
57
Sivakumaran S., Agakov F., Theodoratou E., Prendergast J.G., Zgaga L., Manolio T., Rudan I., McKeigue P., Wilson J.F., Campbell H. Abundant Pleiotropy in Human Complex Diseases and Traits. Am. J. Hum. Genet., 2011; 89: 607–618
Demonstration of «abundant» pleiotropic action of genes associated with Crohn’s disease
58
Conclusion
• Click the filly on the nose - it will wag its tail
• Koz’ma Petrovich Prutkov
59
Microbiome and metagenome
• It is estimated that the human intestinal microflora is composed of between 10¹³ and 10¹⁴ microorganisms (comparable with the number of our own cells), comprising >1000 bacterial species.
• The metagenome of this so-called microbiome has at least 100 times as many genes as our own genome.
• The microbiome can be thought of as an additional organ, which is estimated to weigh 1 kg in an adult human and is mutualistic to humans (and the commensal microflora that inhabit the host).
60
Influence of the microbiota on the physiology of the human body
61
Metabolome
• As a consequence, the microbiome provides the human host with additional metabolic functions, described as the “metabolome”.
• It:
• (i) provides a barrier for colonization of pathogens;
• (ii) exerts fermentation of non-digestible fibers, salvage of energy and synthesis of vitamin K; and
• (iii) stimulates the development of the immune system.
• It was stated earlier that, from human genome sequence analyses, there are predicted to be 2645 metabolites in the human metabolic network.
• This number will inevitably need revising keeping in mind prokaryotic-derived metabolites in humans.
62
• In summary, there are 10¹⁴ total gastrointestinal tract (GI) bacteria….
• If we assume that one mutation in every 10⁸ bacterial divisions is a viable mutation, 10¹⁴ total bacteria in the GI tract theoretically will produce 10⁶ newly mutated viable bacteria at every division cycle.
• It is estimated that the bacteria in the GI tract divide every 20 minutes.
• This generation of large numbers of newly mutated bacteria at every division cycle allows the indigenous GI microflora to adapt rapidly to GI environmental changes.
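The arithmetic behind these figures is easy to reproduce. The sketch below uses only the numbers quoted on the slide and additionally assumes, for the daily total, that the population stays at 10¹⁴ cells and that every cell divides once per 20-minute cycle; the variable names are illustrative.

```python
TOTAL_BACTERIA = 10**14                  # GI bacteria, from the slide
DIVISIONS_PER_VIABLE_MUTATION = 10**8    # one viable mutation per 10^8 divisions
MINUTES_PER_DIVISION = 20                # division interval, from the slide

# Newly mutated viable bacteria produced in one division cycle.
mutants_per_cycle = TOTAL_BACTERIA // DIVISIONS_PER_VIABLE_MUTATION

# Division cycles per day and the resulting daily number of new mutants
# (assumes a constant population size, an added simplification).
cycles_per_day = (24 * 60) // MINUTES_PER_DIVISION
mutants_per_day = mutants_per_cycle * cycles_per_day

print(mutants_per_cycle)   # 1000000
print(cycles_per_day)      # 72
print(mutants_per_day)     # 72000000
```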
63
We may be born 100% human but will die 90% bacterial—a truly complex organism!
• The microbiome (the intestinal microflora as well as those bacteria found on the skin, in airways, and in the urogenital tract) may play an important role in maintaining human health.
• This interaction of the microbiome with humans suggests that the human be considered as a superorganism, where we are in fact a human-microbe hybrid.
• The womb is sterile, and so babies are born with gnotobiotic gastrointestinal tracts.
• The gut bacteria are mainly acquired from the mother, and colonization is largely complete within the first year of life.
64
Unpredictability of genetic predispositions
• Knowing the genotype we cannot unambiguously predict the phenotype, and vice versa:
• Knowing the phenotype it is not possible to predict unequivocally the genotype.
• The validity of an uncertainty principle in genetics is extremely clear:
• even knowing a person's genome sequence, we will never be able to predict many of his or her features, e.g., waist circumference.
65
Janssens A.C.J.W., van Duijn C.M. An epidemiological perspective on the future of direct-to-consumer personal genome testing.
Investigative Genetics, 2010; 1:10
66
Amazing phenomena of genetic predisposition testing
• Fuzzy phenotypes.
• Winner's curse.
• Discordant conclusions.
• Genotyping errors.
• Mania of secrecy.
• Multi- and oppositely directed pleiotropy.
• Inadequate statistical analysis.
• Publication bias.
67
Fuzzy (poorly distinguishable, not alternative) phenotypes – the shadow of Lamarck
• The low efficiency of results in the genetics of predispositions begins with the uncertainty in defining (diagnosing) the trait under study.
• For example, how does one distinguish an “athlete” from a “smug”?
• Persons leading a sedentary lifestyle are often sampled as the control group of non-athletes.
• But if you think about it, this is pure Lamarckism, with its assertion that the “exercise” or “non-exercise” of an organ influences its evolutionary destiny.
68
Winner's curse
• Too often, initially promising discoveries that cause great excitement are not reproduced in subsequent studies.
• This phenomenon is called the “winner's curse”.
69
Discordant conclusions
70
Ng P.C., Murray S.S., Levy S., Venter J.C. An agenda for personalized medicine. Nature, 2009; 461: 724-726.
About one half of the conclusions presented by two companies (23andMe and Navigenics) contradict one another.
Imai K., Kricka L.J., Fortina P. Concordance Study of 3 Direct-to-Consumer Genetic-Testing Services. Clin. Chem., 2011; 57(3): 518–521
Relative disease risk assigned by 3 DTC services for a series of diseases evaluated by all 3 services. Values in parentheses indicate the number of SNPs analyzed.
71
Kalf R.R.J., Mihaescu R., Kundu S., de Knijff P., Green R.C., Janssens A.C.J.W. Variations in predicted risks in personal genome testing for common complex diseases. Genetics in Medicine, 2014; 16(1): 85-91
• Predicted risks vary significantly between companies owing to differences in the sets of SNPs used, in the assumed average population risks, and in the formulas used to calculate the risks.
72
Comparison of risks for three multifactorial diseases predicted by 23andMe, deCODEme and Navigenics
73
Comparison of risks for three multifactorial diseases predicted by 23andMe, deCODEme and Navigenics
74
Adams S.D., Evans J.P., Aylsworth A.S. Direct-to-Consumer Genomic Testing Offers Little Clinical Utility but Appears to Cause Minimal Harm. N. C. Med. J., 2013; 74(6): 494-499
75
76
77
CURRENT ONCOLOGY, 2009; 16(1): 56-58
78
Conspiracy mania
79
Pharmacogenomics
• Lenzini P., Wadelius M., Kimmel S., Anderson J.L., Jorgensen A.L., Pirmohamed M., Caldwell M.D., Limdi N., Burmester J.K., Dowd M.B., Angchaisuksiri P., Bass A.R., Chen J., Eriksson N., Rane A., Lindh J.D., Carlquist J.F., Horne B.D., Grice G., Milligan P.E., Eby C., Shin J., Kim H., Kurnik D., Stein C.M., McMillin G., Pendleton R.C., Berg R.L., Deloukas P., Gage B.F.
• Integration of genetic, clinical, and INR data to refine warfarin dosing.
• Clin. Pharmacol. Ther., 2010; 87(5): 572-578.
80
Calculation of individual warfarin dose
• Pharmacogenetic dose (mg/week) =
• EXP [3.10894 − 0.00767 × age − 0.51611 × ln(INR) − 0.23032 × VKORC1-1639 G>A − 0.14745 × CYP2C9*2 − 0.3077 × CYP2C9*3 + 0.24597 × BSA + 0.26729 × target INR − 0.09644 × African origin − 0.2059 × stroke − 0.11216 × diabetes − 0.1035 × amiodarone use − 0.19275 × fluvastatin use + 0.0169 × dose−2 + 0.02018 × dose−3 + 0.01065 × dose−4].
• Clinical dose (mg/week) =
• EXP [2.81602 − 0.76679 × ln(INR) − 0.0059 × age + 0.27815 × target INR − 0.16759 × diabetes + 0.17675 × BSA − 0.22844 × stroke − 0.25487 × fluvastatin use + 0.07123 × African origin − 0.11137 × amiodarone use + 0.03471 × dose−2 + 0.03047 × dose−3 + 0.01929 × dose−4].
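The two formulas can be transcribed directly into code to compare their predictions for the same patient. This is a sketch of the slide's arithmetic, not a dosing tool. The coefficients are copied from the slide. Everything about variable coding is an assumption made here: genotypes as allele counts (0, 1, 2), clinical factors as 0/1 indicators, age in years, BSA in m², and dose−2, dose−3, dose−4 read as the warfarin doses (mg) given 2, 3 and 4 days before the INR measurement. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Patient:
    age: float                 # years (assumed unit)
    inr: float                 # measured INR, must be > 0
    target_inr: float
    bsa: float                 # body surface area, m^2 (assumed unit)
    vkorc1_a: int = 0          # assumed coding: number of -1639 A alleles
    cyp2c9_2: int = 0          # assumed coding: number of CYP2C9*2 alleles
    cyp2c9_3: int = 0          # assumed coding: number of CYP2C9*3 alleles
    african_origin: int = 0    # indicators below assumed coded 0/1
    stroke: int = 0
    diabetes: int = 0
    amiodarone: int = 0
    fluvastatin: int = 0
    dose_2: float = 0.0        # assumed: dose (mg) 2 days before the INR
    dose_3: float = 0.0        # assumed: dose (mg) 3 days before the INR
    dose_4: float = 0.0        # assumed: dose (mg) 4 days before the INR

def pharmacogenetic_dose(p: Patient) -> float:
    """Weekly dose (mg/week) from the pharmacogenetic formula on the slide."""
    x = (3.10894 - 0.00767 * p.age - 0.51611 * math.log(p.inr)
         - 0.23032 * p.vkorc1_a - 0.14745 * p.cyp2c9_2 - 0.3077 * p.cyp2c9_3
         + 0.24597 * p.bsa + 0.26729 * p.target_inr
         - 0.09644 * p.african_origin - 0.2059 * p.stroke
         - 0.11216 * p.diabetes - 0.1035 * p.amiodarone
         - 0.19275 * p.fluvastatin
         + 0.0169 * p.dose_2 + 0.02018 * p.dose_3 + 0.01065 * p.dose_4)
    return math.exp(x)

def clinical_dose(p: Patient) -> float:
    """Weekly dose (mg/week) from the clinical formula on the slide (no genotypes)."""
    x = (2.81602 - 0.76679 * math.log(p.inr) - 0.0059 * p.age
         + 0.27815 * p.target_inr - 0.16759 * p.diabetes + 0.17675 * p.bsa
         - 0.22844 * p.stroke - 0.25487 * p.fluvastatin
         + 0.07123 * p.african_origin - 0.11137 * p.amiodarone
         + 0.03471 * p.dose_2 + 0.03047 * p.dose_3 + 0.01929 * p.dose_4)
    return math.exp(x)

# Hypothetical patient used only to exercise the two formulas.
example = Patient(age=60, inr=1.5, target_inr=2.5, bsa=1.9,
                  vkorc1_a=1, dose_2=5, dose_3=5, dose_4=5)
print(f"pharmacogenetic: {pharmacogenetic_dose(example):.1f} mg/week")
print(f"clinical:        {clinical_dose(example):.1f} mg/week")
```

Running both functions over the same set of patients and plotting one prediction against the other reproduces the kind of comparison the next slide refers to.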
81
82
Pharmacogenetically predicted warfarin dose
Clinically predicted warfarin dose
FDA warns and accuses: genetic testing is not scientifically justified
• The US Food and Drug Administration (FDA) has sent a WARNING LETTER to 17 companies engaged in genetic testing (23andMe, Navigenics, deCODEme, EasyDNA and others), ordering them to cease these activities because of the lack of scientific evidence and the inability to predict the risk of diseases accurately.
• http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/default.htm
23.04.2014
83
Honesty: article retraction due to genotyping errors
• Sebastiani P., Solovieff P.N., Puca A., Hartley S.W., Melista E., Andersen S., Dworkis D.A., Wilk J.B., Myers R.H., Steinberg M.H., Montano M., Baldwin C.T., Perls T.T. Retraction. Science, 2011; 333: 404
• After online publication of our Report “Genetic Signatures of Exceptional Longevity in Humans“ in Science Express, July 1, 2010, we discovered that technical errors and an inadequate quality control protocol had introduced errors in our results.
• We are voluntarily retracting the original manuscript and are pursuing alternative publication of the corrected results.
• We will be happy to discuss our amended findings as soon as they are published.
84
Growth of the number of publications on genotyping error
85
Fang F.C., Steen R.G., Casadevall A. Misconduct accounts for the majority of retracted scientific publications. PNAS, 2012; 109(42): 17028–17033.
86
• Too many sloppy mistakes are creeping into scientific papers.
• Lab heads must look more rigorously at the data — and at themselves.
87
Yong E. In the wake of high-profile controversies, psychologists are facing up to problems with replication. Nature, 2012; 485: 298-300
• Publication bias.
• A literature analysis across disciplines reveals a tendency to publish only ‘positive’ studies – those that support the tested hypothesis.
• Psychiatry and psychology are the worst offenders.
88
Two fundamental questions
• From an evolutionary point of view, the genetics of predispositions has to answer two basic questions:
• 1. Is the natural genetic polymorphism revealed by modern genomics the result of neutral evolution, or is it an accumulated genetic (mutation) load that determines susceptibility to common diseases and that, inexplicably, natural selection has not culled in time?
• 2. Are the joint effects of different predisposing alleles synergistic, or at least additive, when combined in a single genotype, or do they neutralize one another?
89
• Evolutionary and population arguments help us to understand that the «genetics of predispositions» studies natural balanced genetic polymorphism: not newly arisen alterations of genes (mutations), but alleles that have passed natural selection and become established in human populations. What is investigated is not anomalies or pathological (pathogenic) variants of the genome, but its countless natural, «normal» variants.
• Thus the answer to the first of the two questions is:
• Evolutionary medical genomics testifies that the vast majority of polymorphic variants of genes (alleles) that are observed in the genomes of modern human populations are selectively neutral.
90
• Indeed, the coding (i.e. functionally important) regions of the human genome show a much lower degree of variation than the non-coding regions, whose function is unknown.
• The absolute number of synonymous variants exceeds the number of nonsynonymous (missense) variants, even though the number of positions at which nonsynonymous variants can occur is 3 times higher than the number of positions at which synonymous mutations are possible.
• The proportion of synonymous variants is 4 times greater than that of nonsynonymous variants (80% and 20%, respectively).
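The two figures quoted above (a 1 : 3 ratio of available positions and an 80% : 20% split of observed variants) can be combined into a single per-position comparison. The Python sketch below is a back-of-envelope calculation only. It assumes that every position has the same chance of mutating, which the slide does not state, and it is not a formal dN/dS estimate.

```python
# Shares of observed coding variants, as quoted on the slide.
share_syn, share_nonsyn = 0.80, 0.20
# Relative numbers of positions where each kind of change can occur (1 : 3).
sites_syn, sites_nonsyn = 1.0, 3.0

# Observed variants per available position, in arbitrary units.
density_syn = share_syn / sites_syn
density_nonsyn = share_nonsyn / sites_nonsyn

ratio = density_nonsyn / density_syn
print(f"nonsynonymous relative to synonymous, per position: {ratio:.3f}")
print(f"fold depletion of nonsynonymous variants: {1 / ratio:.0f}")
```

Under this crude assumption nonsynonymous variants are roughly 12 times rarer per available position than synonymous ones, which is the pattern expected if most amino-acid changes are removed by selection while the variants that remain are close to neutral.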
91
• In general, neutralist evolutionary views lead to the conclusion that historical adaptive evolutionary events are not the source of diseases.
• On the contrary, evolution is a source of stability and the reason why human beings exist so successfully in widely varying conditions.
• Neutrality and balance explain why predisposing genotypes are found both in persons with the disease (patients) and in persons without it (“healthy”); the only difference is in their frequencies in the two groups.
• That is, the presence of predisposing alleles in a person's genotype certainly does not mean that the disease (or other propensity) is inevitably present now or will inevitably occur in the future.
92
• The second principal question:
• Are the effects of different predisposing alleles synergistic, or at least additive, when combined in a single genotype, or do they neutralize one another?
• The answer is:
• In many cases, the effects of various predisposing alleles are mutually neutralized through the mechanisms of opposite (antagonistic) pleiotropy and homeostasis.
93
Frequency distributions, in two samples of persons with extreme values of blood pressure, of the number of alleles (out of 35) predisposing to hypertension
94
The distributions are almost completely overlapping.
Paynter N.P., Chasman D.I., Pare G., Buring J.E., Cook N.R., Miletich J.P., Ridker P.M. Association between a literature-based genetic risk score and cardiovascular events in 19,313 women. JAMA, 2010; 303(7): 631–637.
The genetic risk score (GRS) does not improve risk prediction for cardiovascular disease.
95
Thromboembolism, 11 markers (Kapustin S.I., 2007)
96
Figure: distribution of the number of predisposing alleles (x-axis: number of predisposing alleles, 0–12; y-axis: number of persons, 0–125).
593 patients with venous thrombosis, Control group of 225 persons
Pval = 0.046 (Mann-Whitney test); Pval = 0.52 (χ2); BF01 = 105
Bayes’ theorem in action
97
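\[ \mathrm{PPV}_1 = P(D^+ \mid g_1) = \frac{P(D^+)\,P(g_1 \mid D^+)}{P(D^+)\,P(g_1 \mid D^+) + P(D^-)\,P(g_1 \mid D^-)} \]
\[ \mathrm{PPV}_{1,2} = P(D^+ \mid g_1, g_2) = \frac{P(D^+)\,P(g_1 \mid D^+)\,P(g_2 \mid D^+)}{P(D^+)\,P(g_1 \mid D^+)\,P(g_2 \mid D^+) + P(D^-)\,P(g_1 \mid D^-)\,P(g_2 \mid D^-)} \]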
To calculate predictive probabilities such as PPV, we have to know the pretest (prior) probability of the presence of the disease, P(D+), which is called the prevalence (Prev), and the “counter-prevalence” coPrev = P(D-) = 1 – P(D+).
Artificial, but rather typical example
• Let the prevalence of a disease with a hereditary predisposition component (D) in a population be
• P(D+) = 1%.
• Assume that the proportion of carriers of the genotype (allele or haplotype) g1 predisposing to this disease is, among affected individuals,
• P(g1|D+) = 20%,
• which is twice the proportion among individuals without the disease:
• P(g1|D-) = 10%.
• This corresponds to a ratio of carrier frequencies of 2 (risk ratio RR ≈ 2) and an odds ratio OR = (0.20/0.80)/(0.10/0.90) = 2.25.
• According to Bayes' formula, the probability of the disease in persons with genotype g1 is approximately 2-fold higher than its prevalence:
• PPV1 = P(D+|g1) = 1.98% ≈ 2%.
• Obviously, on the basis of such a small value it is hard to get convincing predictions about the presence or development of this disease in a particular individual.
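The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch of Bayes' formula for a single marker, using only the three probabilities given in the example; the function names are illustrative.

```python
def ppv(prevalence, p_g_given_d, p_g_given_not_d):
    """Posterior probability of disease in a carrier, P(D+|g), by Bayes' formula."""
    numerator = prevalence * p_g_given_d
    denominator = numerator + (1 - prevalence) * p_g_given_not_d
    return numerator / denominator


def carrier_odds_ratio(p_g_given_d, p_g_given_not_d):
    """Odds ratio computed from carrier frequencies in affected and unaffected persons."""
    odds_affected = p_g_given_d / (1 - p_g_given_d)
    odds_unaffected = p_g_given_not_d / (1 - p_g_given_not_d)
    return odds_affected / odds_unaffected


ppv1 = ppv(prevalence=0.01, p_g_given_d=0.20, p_g_given_not_d=0.10)
print(f"PPV1 = {ppv1:.4f}")                          # PPV1 = 0.0198
print(f"fold increase over prevalence = {ppv1 / 0.01:.2f}")
print(f"odds ratio = {carrier_odds_ratio(0.20, 0.10):.2f}")
```

A carrier's probability of disease rises from 1% to about 2%, so about 98 of every 100 carriers remain unaffected.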
98
Predictive values of several predisposing genetic markers
Number of predisposing loci in a genotype | PPV | Proportion of carriers of the predisposing genotype in the population
1 | 0.020 | 0.1
2 | 0.039 | 0.01
3 | 0.075 | 0.001
4 | 0.14 | 0.0001
5 | 0.24 | 10^-5
6 | 0.39 | 10^-6
7 | 0.71 | 10^-7
8 | 0.84 | 10^-8
9 | 0.91 | 10^-9
10 | 0.95 | 10^-10
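The table can be approximated by extending Bayes' formula to several markers. The Python sketch below assumes that every marker has the same carrier frequencies as g1 in the example (20% in affected and 10% in unaffected persons), that the prevalence is 1%, and that the markers are independent given disease status. The slide does not state these inputs for the table, so they are assumptions, and the function names are illustrative.

```python
def ppv_n_markers(prevalence, p_g_given_d, p_g_given_not_d, n):
    """P(D+ | n predisposing genotypes), assuming conditionally independent markers."""
    prior_odds = prevalence / (1 - prevalence)
    likelihood_ratio = p_g_given_d / p_g_given_not_d
    posterior_odds = prior_odds * likelihood_ratio ** n
    return posterior_odds / (1 + posterior_odds)


def carrier_fraction(prevalence, p_g_given_d, p_g_given_not_d, n):
    """Share of the population carrying all n predisposing genotypes."""
    return (prevalence * p_g_given_d ** n
            + (1 - prevalence) * p_g_given_not_d ** n)


for n in range(1, 11):
    value = ppv_n_markers(0.01, 0.20, 0.10, n)
    share = carrier_fraction(0.01, 0.20, 0.10, n)
    print(f"{n:2d}  PPV = {value:.3f}  carriers = {share:.1e}")
```

Under these assumptions the sketch reproduces the tabulated PPV for 1 to 6 loci. For 7 or more loci it gives lower values than the table (about 0.56 for 7 loci), so the last rows of the table may rest on inputs that are not stated on the slide. The trade-off is the same either way: each extra locus roughly doubles the posterior odds of disease and cuts the share of carriers about tenfold.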
99
Odds ratios (OR) for type 2 diabetes as a function of the number of predisposing alleles in the genotype
100
Persons at high genetic risk are very rare
• Indeed, the larger the number of predisposing alleles in the genotype, the higher the risk of disease.
• Theoretically it is possible to identify people with a very high risk of the disease, but in practice they will be extremely rare.
• And it is difficult, if not impossible, to prove that a given rare combination of predisposing alleles is responsible for the presence of a given disease in a given person.
• In the vast majority of persons the risk of disease is only slightly higher than the average risk in the population.
101
Hirschhorn J.N. Genomewide Association Studies — Illuminating Biologic Pathways N. Engl. J. Med., 2009; 360(17): 1699-1701.
• Many newly identified loci do not implicate genes with known functions.
• It is hardly surprising that we do not yet understand the biologic import of every recently associated locus: the associations sometimes do not point unambiguously to a particular gene, and even genes that are clearly implicated are often unannotated with respect to function.
• With regard to prediction, the common variants described by genomewide association studies almost universally have modest predictive power, and for most diseases and traits, these variants in combination explain only a small fraction of heritability.
• The success of genomewide association studies is not tied to prediction.
102
• The number of markers associated with sport performance is estimated as about 200.
• Let us imagine that we are able to gather in one genome almost all known alleles predisposing to a particular sport.
• Obviously, because intergenic and environmental interactions are non-additive, the athletic performance of a person with such a genotype will not be proportional to the number of predisposing alleles.
• It seems unlikely that combining 200 predisposing alleles in one genome will result in a 200-fold increase in the sports performance of such a person.
• And, given the oppositely directed pleiotropic action of these alleles, might our Superman turn out to be a “superidiot”?
103
Yannis Pitsiladis University of Glasgow.
• “Currently, the predictive ability of sports genetics is zero.
• There is no direct evidence for the existence of genetic indicators of the success of athletes.
• The effectiveness of an athlete depends primarily on the socio-economic, cultural and environmental factors.
• So a stopwatch predicts a runner's athletic performance much better than the whole of genetics.”
• http://news.menshealth.com/why-kenyans-keep-winning-marathons/2011/06/03/
104
The predictive ability of clinico-genetic certification
105
The main quality measures of a diagnostic test with binary outcomes
• Even if a genetic association is statistically highly significant, the question remains of how useful this information is for risk assessment and prediction of disease.
• For these purposes it is necessary to measure well-known indicators of diagnostic test quality, such as the sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR[+]) and negative likelihood ratio (LR[-]).
106
The main indices of the detection capability of the index diagnostic test
Index test result | Gold standard: disease is present, D+ | Gold standard: disease is absent, D-
Positive, T+ | Sensitivity: probability of a positive result in a person with the disease, Se = P(T+|D+) | Counter-specificity: probability of a positive result in a person without the disease, coSp = P(T+|D-) = 1 – Sp
Negative, T- | Counter-sensitivity: probability of a negative result in a person with the disease, coSe = P(T-|D+) = 1 – Se | Specificity: probability of a negative result in a person without the disease, Sp = P(T-|D-)
107
The main indices of the prediction capability of the index diagnostic test
Index test result | Gold standard: disease is present, D+ | Gold standard: disease is absent, D-
Positive, T+ | Positive predictive value: probability of the presence of the disease in a person with a positive result, PPV = P(D+|T+) | Positive counter-predictive value: probability of the absence of the disease in a person with a positive result, coPPV = P(D-|T+) = 1 – PPV
Negative, T- | Negative counter-predictive value: probability of the presence of the disease in a person with a negative result, coNPV = P(D+|T-) = 1 – NPV | Negative predictive value: probability of the absence of the disease in a person with a negative result, NPV = P(D-|T-)
108
N.B. – nota bene
•P(D+|T+) ≠ P(T+|D+)
109
Probabilistic indices of detection and prediction capabilities of the diagnostic test
Detection indices, P(T|D):
Se = P(T+|D+), sensitivity
coSp = 1 – Sp = P(T+|D-), counter-specificity
coSe = 1 – Se = P(T-|D+), counter-sensitivity
Sp = P(T-|D-), specificity
P(T|D) ≠ P(D|T)
Prediction indices, P(D|T):
PPV = P(D+|T+), positive predictivity
coPPV = 1 – PPV = P(D-|T+), positive counter-predictivity
coNPV = 1 – NPV = P(D+|T-), negative counter-predictivity
NPV = P(D-|T-), negative predictivity
110
Lotufo P.A., Chae C.U., Ajani U.A., Hennekens C.H., Manson J.A.E. Male pattern baldness and coronary heart disease: The Physician's Health Study. Arch. Intern. Medicine, 2000; 160: 165-171.
Values are given as point estimate (lower bound–upper bound).
Alopecia | CAD: yes, D+ | CAD: no, D- | Total | Predictiveness
Yes, M+ | 127; proportion with CAD 0.094 (0.070–0.123) | 1224 | 1351 | PPV = P(D+|M+) = 0.10 (0.07–0.12)
No, M- | 548; proportion with CAD 0.067 (0.058–0.077) | 7611 | 8159 | NPV = P(D-|M-) = 0.93 (0.92–0.94)
Total | 675 | 8835 | 9510 | PrevD = P(D+) = 0.07 (0.06–0.08)
Detectability | Se = P(M+|D+) = 0.19 (0.14–0.24) | Sp = P(M-|D-) = 0.86 (0.85–0.87) | PrevM = P(M+) = 0.14 (0.13–0.15) |
LR[+] = P(M+|D+)/P(M+|D-) = 1.36 (1.02–1.77)
LR[-] = P(M-|D-)/P(M-|D+) = 1.06 (1.00–1.14)
AUC = (Se + Sp)/2 = 0.53 (0.50–0.56)
Homogeneity: Pval = 0.00058 ≈ 6∙10^-4; BF10 = 18.9
Association: OR = 1.45 (1.02–2.01)
111
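The point estimates in this table follow directly from the four cell counts. The Python sketch below computes them from a 2×2 table; the function name and dictionary keys are illustrative. Note that LR[-] follows the convention used on these slides, P(T-|D-)/P(T-|D+), which is the reciprocal of the more common (1 – Se)/Sp. Interval bounds are not computed, and small differences from the slide (for example in OR) may reflect a different estimator.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Detection and prediction indices of a binary test from 2x2 counts.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    Counts of zero in fp or fn would make some ratios undefined.
    """
    total = tp + fp + fn + tn
    se = tp / (tp + fn)            # sensitivity, P(T+|D+)
    sp = tn / (tn + fp)            # specificity, P(T-|D-)
    return {
        "Se": se,
        "Sp": sp,
        "PPV": tp / (tp + fp),     # P(D+|T+)
        "NPV": tn / (tn + fn),     # P(D-|T-)
        "Prev": (tp + fn) / total, # P(D+)
        "LR+": se / (1 - sp),      # P(T+|D+) / P(T+|D-)
        "LR-": sp / (1 - se),      # slide convention: P(T-|D-) / P(T-|D+)
        "AUC": (se + sp) / 2,
        "OR": (tp * tn) / (fp * fn),
    }


# Baldness and CAD counts from the table above.
baldness = diagnostic_indices(tp=127, fp=1224, fn=548, tn=7611)
for name, value in baldness.items():
    print(f"{name:4s} = {value:.3f}")
```

With these counts the PPV (about 0.094) barely exceeds the prevalence (about 0.071), which is what a likelihood ratio close to 1 implies.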
Better to see once
• It is useful to visualize the results of statistical quality control of diagnostic tests as predictive graphs.
• Examples are shown in figures below.
• The spreadsheet created by R.G. Newcombe PPVNPV.xls may be used
• http://medicine.cf.ac.uk/media/filer_public/2012/11/01/PPVNPV.xls
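The points of such a predictive graph can also be generated without a spreadsheet. The Python sketch below tabulates PPV and NPV against prevalence for a fixed sensitivity and specificity, using Bayes' formula. It is not a reimplementation of PPVNPV.xls; the prevalence grid is arbitrary, and the baldness counts from the Physician's Health Study table are used only as an example input.

```python
def predictive_values(se, sp, prevalence):
    """Return (PPV, NPV) for a test with given Se and Sp at a given prevalence."""
    ppv = se * prevalence / (se * prevalence + (1 - sp) * (1 - prevalence))
    npv = sp * (1 - prevalence) / (sp * (1 - prevalence) + (1 - se) * prevalence)
    return ppv, npv


# Example input: baldness as a "test" for CAD (127/1224/548/7611).
se, sp = 127 / 675, 7611 / 8835

print("prevalence   PPV     NPV")
for prevalence in (0.01, 0.05, 0.10, 0.25, 0.50):
    ppv, npv = predictive_values(se, sp, prevalence)
    print(f"{prevalence:10.2f}  {ppv:.3f}  {npv:.3f}")
```

For a weak marker the PPV curve stays close to the diagonal PPV = prevalence over the whole range; a test with a large LR[+] lifts the curve well above the diagonal.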
112
Lotufo P.A., Chae C.U., Ajani U.A., Hennekens C.H., Manson J.A.E., Male pattern baldness and coronary heart disease: The Physician's Health Study, Arch. Intern. Med. 2000; 160(2): 165-71.
Simon A., Worthen D.M., Mitas J.A. An evaluation of iridology. JAMA, 1979; 242: 1385-1389.
Alopecia and CAD: 127/1224/548/7611; LR[+] = 1.4 (1.0–1.8); LR[-] = 1.06 (1.0–1.1)
Iridology and renal failure: 29/59/19/36; LR[+] = 1.0 (0.7–1.4); LR[-] = 1.0 (0.7–1.5)
113
• Since the time of Hippocrates (c. 460 – c. 370 BC) it has been known that eunuchs do not go bald if they became eunuchs before the age of 25.
• It is unlikely that any doctor would, on the basis of these data, advise young men to have children before the age of 25 and then become eunuchs so as not to go bald and thus reduce their risk of developing coronary heart disease by 2%.
• Nevertheless, this closely resembles the recommendations of medical geneticists, which are too often based on clinically insignificant values of the predictive indices of genetic testing.
• Odds ratios in these studies rarely exceed 2 (OR > 2).
114
Druzhevskaya A.M., Ahmetov I.I., Astratenkova I.V., Rogozkin V.A. Association of the ACTN3 R577X polymorphism with power athlete status in Russians. Eur. J. Appl. Physiol., 2008; 103: 631–634.
Kundas L.A., Zhur K.V., Byshnev N.I. et al. Analysis of molecular genetic markers responsible for tolerance of physical exertion in rowers. Molecular and Applied Genetics: collected scientific papers, Institute of Genetics and Cytology of the NAS of Belarus (editor-in-chief A.V. Kilchevsky). Minsk: Institute of Genetics and Cytology of the NAS of Belarus, 2013; 14: 101-105.
Gene ACTN3 and elite athletes: 455/1027/31/170; LR[+] = 1.1 (1.0–1.1); LR[-] = 2.2 (1.4–3.7)
Gene PPARG and elite rowers: 3/3/21/147; LR[+] = 5.8 (0.9–40); LR[-] = 1.2 (1.1–1.6)
115
Mayeux R., Saunders A.M., Shea S., et al. Utility of the apolipoprotein E genotype in the diagnosis of Alzheimer’s disease. N. Engl. J. Med., 1998; 338: 506-511.
Mäki M., Mustalahti K., Kokkonen J., et al. Prevalence of celiac disease among children in Finland. N. Engl. J. Med., 2003; 348: 2517-2524.
Gene APOE and Alzheimer's disease: 1142/133/622/285; LR[+] = 2.0 (1.7–2.5); LR[-] = 1.9 (1.7–2.2)
HLA haplotypes and celiac disease: 54/1357/2/2214; LR[+] = 2.5 (2.7–2.7 [sic]); LR[-] = 12 (4.1–103)
116
Banks E., Reeves G., Beral V. et. al. Influence of personal characteristics of individual women on sensitivity and specificity of mammography in the Million Women Study: cohort study. BMJ, 2004; 329(7464): 477-479.
Delaney K.P., Branson B.M., Uniyal A. et al. Performance of an oral fluid rapid HIV-1/2 test: experience from four CDC studies. AIDS, 2006; 20: 1655–1660.
Mammography and breast cancer: 629/3885/97/117744; LR[+] = 27 (26–29); LR[-] = 7.2 (5.7–9.3)
OraQuick® rapid test for HIV: 327/12/1/12010; LR[+] = 919 (495–2141); LR[-] = 165 (45–3171)
117
Temptations which should be disposed of
• (1) Catastrophism (or “thrillerism”): hypnotizing ourselves and others into believing that our genome is a dump of hazardous alleles.
• (2) Geneticism (genetic determinism): a blind, fanatical belief in the omnipotence of genes, as in the statement “Genetics is the basis of medicine”.
• (3) Eugenics: an underlying desire to improve human nature and to select a breed of “good” or “right” people, an “elite”, such as athletes.
• (4) The commercialization of basic science, which, God forbid, may descend into criminality.
• Fundamental science loses its chastity and becomes mercenary (corrupt).
• It is being pushed onto this slippery slope (“onto the street”) by science administrators who require science to be self-supporting.
118
Summary
• Poor reproducibility and low predictive values of results in the genetics of predispositions (genetic association studies) have become a systemic problem.
• Results of the statistical quality control of genetic tests should be supported with post-test (posterior) predictive probabilities (PPV and NPV) and likelihood ratios (LR[+] and LR[-]).
• The predictive values of the vast majority of genetic markers differ little from the population prevalence of the disease.
• This means that such tests, despite the high statistical significance of their results, do not establish a clinically important association between the disease and the biomarker.
• As a result, in most cases the recommendations of medical geneticists are based on clinically negligible (though statistically significant) detection and prediction capabilities of genetic markers.
119
Some practical conclusions
• Genetics is the science of heredity, and heredity is the fundamental property of organisms to pass their traits and peculiarities on to their offspring.
• Therefore, results of genetic association studies should be confirmed by studies of genetic predisposition across at least two generations of relatives, i.e. it is necessary to analyze families, pedigrees and twins.
• Before engaging in genomics, the recording of genealogies should first be introduced into clinical practice.
• It is cheaper and more efficient.
• There exists a simple and effective statistical test, the TDT (Transmission Disequilibrium Test).
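The TDT compares, among heterozygous parents of affected children, how often a candidate allele is transmitted to the child and how often it is not. The Python sketch below assumes the usual McNemar-type form of the statistic, (b − c)²/(b + c), referred to a chi-square distribution with 1 degree of freedom. The function name and the counts are hypothetical and serve only to illustrate the calculation.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test from two counts.

    transmitted:   times a heterozygous parent passed the candidate allele
                   to an affected child
    untransmitted: times a heterozygous parent passed the other allele
    Returns (chi-square statistic with 1 d.f., two-sided P value).
    """
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Upper tail of the chi-square distribution with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt(60, 40)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

Because each parent serves as his or her own control, the test is not distorted by differences in allele frequencies between populations, which is one reason family data can confirm or refute a case-control association.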
120
Remember family history is still important even with molecular characterization
121
Genetics of common complex diseases
• Despite the unequivocally strong statistical associations between genetic variants and complex diseases, their low sensitivity and specificity afford limited clinical value for disease predisposition testing.
• Among myriad technological advances and gene discoveries, a simple family history continues to be advocated as a tool for identification of common disease risk.
• For very common conditions with high heritability, such as cardiovascular disease, family history is a much stronger predictor of disease than any single or combination of genetic/genomic markers.
• One model suggests that neither family history nor genetic testing should be used as a standalone but that the real power for disease prediction, risk assessment, and differential diagnosis comes from their combined use.
122
Statistical audit and free access to the data
• Statistical review of papers submitted for publication in biomedical journals is greatly needed.
• Several editorial boards of scientific journals have invited experts in statistics.
• Reviewers should be made responsible for checking the correctness of the calculations.
• For this, the initial (raw) data must be made open, as is done, for example, in the journals Science and Forensic Science International: Genetics.
• Once results have been published, the original data are no longer the intellectual property of the authors and should be accessible to specialists.
123
Watson’s bell tolls for Oncogenomics http://www.utsandiego.com/news/2013/mar/21/nobel-watson-DNA-irish
• "You could sequence 150,000 people with cancer and it's not going to cure anyone.
• It might give you a few leads, but it's not, to me, the solution.
• The solution is good chemistry.
• And that's what's lacking.
• We have a world of cancer biology trained to think genes.
• They don't think chemistry at all."
124
• Evolutionary medical genomics, whether we realize it or not, is the foundation of the genetics of predispositions, whose main goal should be not the personalized prediction of disease risk but the development of strategies for treatment and prevention based on knowledge of a disease's genetic and evolutionary history and molecular mechanisms.
125
My Teacher: Mikhail Efimovich Lobashev (11.11.1907 – 04.01.1971)
• In science you can do anything,
• just don’t forget about the consequences and the responsibility.
126
Thank you for your attention! Slides are freely available to all
Nikita N. Khromov-Borisov Department of Physics, Mathematics and Informatics
Pavlov First Saint Petersburg State Medical University
8-952-204-89-49 (Tele2); 8-921-449-29-05 (MegaFon)
http://independent.academia.edu/NikitaKhromovBorisov
127
Reference
• Rubanovich A.V., Khromov-Borisov N.N.
• Theoretical Analysis of the Predictability Indices of the Binary Genetic Tests.
• Russian Journal of Genetics: Applied Research,
• 2014; 4(2): 146–158.
128