An Overview of Bioinformatics-Final3
8/3/2019 An Overview of Bioinformatics-Final3
UNIT 8. AN OVERVIEW OF BIOINFORMATICS
I. TEXT
1. Introduction
Biology is in the middle of a major paradigm shift driven by computing technology. Although it is already an informational science in many respects, the field has been rapidly becoming much more computational and analytical. Rapid progress in genetics and biochemistry research, combined with the tools provided by modern biotechnology, has generated massive volumes of genetic and protein sequence data.
Bioinformatics has been defined as a means for analysing, comparing, graphically displaying, modeling, storing, systemising, searching, and ultimately distributing biological information, which includes sequences, structures, function, and phylogeny. Thus bioinformatics may be defined as a discipline that generates computational tools, databases, and methods to support genomic and postgenomic research. It comprises the study of DNA structure and function, gene and protein expression, protein production, structure and function, genetic regulatory systems, and clinical applications. Bioinformatics draws on expertise from Computer Science, Mathematics, Statistics, Medicine, and Biology.
2. Knowledge Base in Biology
Over the last 10 years or so, numerous innovations have come to light, and the consequence is a new biological research paradigm, one that is information-heavy and computer-driven. As genetic information is organized into computerized databases whose sizes are steadily growing, molecular biologists need effective and efficient computational tools to store and retrieve the cognate information (such as bibliographic or biological information) from the databases, to analyze the sequence patterns they contain, and to extract the biological knowledge the sequences hold. There is also a strong need for mathematical methods and computational techniques for challenging tasks such as predicting the three-dimensional structure of the molecules the sequences represent and constructing evolutionary trees from the sequence data. These tools will also be used to learn basic facts about biology, such as which sequences of DNA code for proteins and which do not, leading to a greater understanding of genes and how they influence diseases.
Biology employs a digital language to represent its information, using an alphabet of four basic letters (A, C, G, T). All the chromosomes in an organism's cells are represented and identified using these letters. The demanding challenge is to determine how this digital language of the chromosomes is converted into the three-dimensional and sometimes four-dimensional languages of living and breathing organisms.
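Because a DNA sequence is written with the four letters A, C, G and T, it can be handled as an ordinary text string. The short Python sketch below is only an illustration: the function names and the example sequence are invented for this note and do not come from any particular bioinformatics library.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def base_counts(seq):
    """Count how often each letter occurs in the sequence."""
    return Counter(seq.upper())


def gc_content(seq):
    """Return the fraction of letters that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    example = "ATGCGTAC"  # hypothetical sequence, chosen only for illustration
    print(base_counts(example))
    print(gc_content(example))
    print(reverse_complement(example))
```

Real sequence files also contain ambiguity codes such as N, which this sketch does not handle.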
3. Information Technology in Biology
Because performing all of the above-mentioned tasks manually is nearly impossible, given the massive volumes of biological data and the precision the work demands, it became mandatory to use computers for these purposes. Bioinformatics therefore deals with designing and deploying efficient software tools that accomplish these tasks quickly and precisely. Bridging the gap between the real world of biology and the precise, logical nature of computers requires an interdisciplinary perspective.
4. Software and Hardware Advancements in Biology
The tools of computer science, statistics, and mathematics are critical for studying biology as an informational science.
Recent advances include improved DNA sequencing methods, new approaches to identifying protein structure, and revolutionary methods for monitoring the expression of many genes in parallel. The design of techniques able to deal with different sources of incomplete and noisy data has become another crucial goal for the bioinformatics community. In addition, there is a need to implement computational solutions based on theoretical frameworks that allow scientists to perform complex inferences about the phenomena under study.
Genomics has recently triggered the development of high-throughput instrumentation for DNA sequencing, DNA arrays, genotyping, proteomics, etc. These instruments have catalyzed a new type of science for biology termed discovery science.
5. Human Genome Project - An Introduction
The Human Genome Project has encouraged a series of paradigm changes leading to the view that biology is an informational science. The draft of the human genome has given us a genetic parts list of what is necessary for building a human: approximately 35,000 genes (the estimate at the time of the draft; later analyses put the number of protein-coding genes closer to 20,000), their regulatory regions, a lexicon of motifs that are the building-block components of proteins and genes, and access to the human variability that makes us each different from one another.
6. Genomes - Discovering Methodology and Study
Discovery science defines all of the elements in a biological system: for example, the sequence of the genome, or the identification and quantitation of all of the mRNAs or proteins in a particular cell type (respectively the genome, the transcriptome, and the proteome). Discovery science creates databases of information, in contrast to the more classical hypothesis-driven science that formulates hypotheses and attempts to test them. The high-throughput tools both provide the means for discovery science and can assay how global information sets, for example transcriptomes or proteomes, change as systems are perturbed.
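One common way to describe how a transcriptome changes under a perturbation is the log2 fold change of each gene's expression level. The sketch below is a minimal illustration under stated assumptions: the gene names and counts are invented, and the pseudocount of 1 and the threshold of 1.0 are arbitrary choices rather than recommended settings.

```python
import math


def log2_fold_changes(control, perturbed, pseudocount=1.0):
    """Return log2((perturbed + pseudocount) / (control + pseudocount)) per gene.

    Only genes measured in both conditions are compared. The pseudocount
    avoids division by zero for genes with no detected expression.
    """
    shared_genes = control.keys() & perturbed.keys()
    return {
        gene: math.log2((perturbed[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in shared_genes
    }


def responders(changes, threshold=1.0):
    """List genes whose expression changed at least 2**threshold-fold, up or down."""
    return sorted(gene for gene, change in changes.items() if abs(change) >= threshold)


if __name__ == "__main__":
    # Hypothetical expression counts before and after a perturbation.
    control = {"geneA": 99, "geneB": 49, "geneC": 7}
    perturbed = {"geneA": 399, "geneB": 49, "geneC": 1}
    changes = log2_fold_changes(control, perturbed)
    print(changes)
    print(responders(changes))
```

A positive value means the gene became more abundant after the perturbation, a negative value means it became less abundant, and a value near zero means little change.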
The genomes of the model organisms (yeast, worm, fly, etc.) have demonstrated the fundamental conservation of the basic informational pathways among all living organisms. Hence systems can be perturbed in model organisms to gain insight into their functioning, and these data will provide fundamental insights into human biology. From the genome, the information pathways and networks can be extracted to begin understanding their logic of life. Furthermore, different genomes can be compared to identify similarities and differences in the strategies for the logic of life, and these comparisons provide fundamental insights into development, physiology, and evolution. The first eukaryotic genome to be fully sequenced and annotated was that of Saccharomyces cerevisiae, which greatly helps in developing biological and computational tools for genomic and postgenomic research.
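To give a feeling for what "comparing" two sequences can mean computationally, the sketch below measures how many short subsequences (k-mers) two DNA strings share. This is a deliberately simple, alignment-free stand-in, not the method used by real genome comparison tools; the function names, the example sequences, and the choice k = 3 are assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in the sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k=3):
    """Shared k-mers divided by all distinct k-mers seen in either sequence.

    The result is 1.0 when both sequences contain exactly the same k-mers
    and 0.0 when they share none.
    """
    kmers_a, kmers_b = kmers(seq_a, k), kmers(seq_b, k)
    if not kmers_a and not kmers_b:
        return 1.0  # two sequences too short to contain any k-mer
    return len(kmers_a & kmers_b) / len(kmers_a | kmers_b)


if __name__ == "__main__":
    # Hypothetical sequences that differ only in their last letter.
    print(jaccard_similarity("ATGCGT", "ATGCGA"))
```

Sequences that are closely related tend to share many k-mers, so a higher score suggests greater similarity.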
In the era of automated DNA sequencing and revolutionary advances in DNA sequence analysis, the attention of many researchers is now shifting away from the study of single genes or small gene clusters to whole-genome analyses. Knowing the complete sequence of a genome is only the first step in understanding how the wealth of information contained within the genes is transcribed and ultimately translated into functional proteins. In the postgenomic era, functional genomic and proteomic studies help to obtain an image of the dynamic cell.
7. Systems Biology
Biology is a highly informational science. There are two main types of biological information:
- The information of genes or proteins, which are the molecular machines of life.
- The information of the regulatory networks that coordinate and specify the expression patterns of the genes and proteins.
All biological information is hierarchical. DNA is transcribed into mRNA, which in turn is translated into protein. Proteins take part in protein interactions, which create informational pathways. These pathways form informational networks, which in turn make up cells. Cells form networks of cells, and an individual is a collection of cells. A host of individuals forms a population, and a variety of populations forms an ecology. This hierarchy poses a primary challenge for researchers and scientists: to create tools and mechanisms that capture these different levels of biological information and integrate them to gain insight into how living systems function.
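The first two levels of this hierarchy, DNA to mRNA to protein, can be imitated with simple string operations. The sketch below assumes the input is the coding strand of the DNA, so that transcription amounts to replacing T with U. The codon table is a small excerpt of the standard genetic code (the full table has 64 entries), and codons missing from the excerpt are reported as "X". The function names and the example sequence are invented for illustration.

```python
# Excerpt of the standard genetic code; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",              # methionine, the usual start codon
    "UUU": "F", "UUC": "F",  # phenylalanine
    "GGU": "G", "GGC": "G",  # glycine
    "GCU": "A",              # alanine
    "AAA": "K",              # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_strand):
    """Write the mRNA for a DNA coding strand by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon is reached."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[start:start + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGTTTGGCAAATAA"  # hypothetical coding sequence
    mrna = transcribe(dna)
    print(mrna)
    print(translate(mrna))
```

The protein is written with one-letter amino acid codes, which is the form in which protein sequences are usually stored in databases.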
All of these paradigm shifts lead to the view that the major challenges for biology and medicine in this new century will be the study of complex systems and the approach necessary for studying these biological complexities. One viable approach is the following:
i. Identify all elements in the system, such as the sequence of the genome, with currently available discovery tools.
ii. Use current knowledge of the system to formulate a model predicting its behavior.
iii. Perturb the system in a model organism using biological, genetic, or environmental perturbations; capture information at all relevant levels, such as DNA, mRNA, protein, protein interactions, etc.; and integrate the collected information.
iv. Compare theoretical predictions and experimental data, carry out additional perturbations to bring theory and experiment into closer apposition, and integrate new data into the model.
v. Iterate steps iii) and iv) until the mathematical model can predict the structure of the system and its systems or emergent properties given particular perturbations.
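The loop of modeling, perturbing, comparing, and refining described in the steps above can be sketched with a toy example. Everything here is hypothetical: the "system" is an invented linear response with a hidden gain of 2.5, the model has a single parameter, and the update rule is a simple proportional correction chosen for illustration. Real systems biology models are far richer.

```python
def run_experiment(perturbation, true_gain=2.5):
    """Stand-in for a laboratory measurement of a hypothetical linear system."""
    return true_gain * perturbation


def refine_model(perturbations, initial_gain=1.0, learning_rate=0.5,
                 tolerance=1e-6, max_rounds=100):
    """Repeat perturb, compare, and update until predictions match observations.

    Returns the fitted gain and the number of rounds that were needed.
    """
    if any(p == 0 for p in perturbations):
        raise ValueError("perturbations must be non-zero")
    gain = initial_gain  # step ii: the current model of the system
    for round_number in range(1, max_rounds + 1):
        worst_error = 0.0
        for perturbation in perturbations:
            predicted = gain * perturbation          # model prediction
            observed = run_experiment(perturbation)  # step iii: perturb and measure
            error = observed - predicted             # step iv: compare
            worst_error = max(worst_error, abs(error))
            gain += learning_rate * error / perturbation  # fold new data into the model
        if worst_error < tolerance:                  # step v: stop once predictions agree
            return gain, round_number
    return gain, max_rounds


if __name__ == "__main__":
    fitted_gain, rounds = refine_model([1.0, 2.0, 4.0])
    print(fitted_gain, rounds)
```

Each round moves the model's gain closer to the hidden value, so the gap between prediction and experiment shrinks until it falls below the tolerance.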
8. Systems Biology - Challenges Ahead
- The integration of technology, biology, and computation.
- The integration of the various levels of biological information and the modeling.
- The proper annotation of biological information and its storage and integration in databases.
- The inclusion of other molecules, large and small, in the systems approach.
The integration imperatives of systems biology present many challenges to industry and academia.
9. Conclusion
With the confluence of biology and computer science, the computer applications of molecular biology are drawing greater attention from life science researchers and scientists these days. As it becomes imperative for biologists to seek the help of information technology professionals to meet the ever-growing computational requirements of a host of exciting and pressing biological problems, the synergy between modern biology and computer science is set to blossom in the days to come. Thus the research scope for mathematical techniques and algorithms, coupled with programming languages and software development and deployment tools, is set to get a real boost. In addition, information technologies such as databases, middleware, graphical user interface (GUI) design, distributed object computing, storage area networks (SAN), data compression, networking and communication, and remote management are all set to play a critical role in advancing the goals for which the bioinformatics field came into existence.
10. Biological Database Links
NCBI Home (http://www.ncbi.nlm.nih.gov/)
Established in 1988 as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information - all for the better understanding of molecular processes affecting human health and disease.
Entrez Search and Retrieval System
Entrez Programming Utilities are tools that provide access to Entrez data outside of the regular web query interface and may be helpful for retrieving search results for future use in another environment.
KEGG: Kyoto Encyclopedia of Genes and Genomes
A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behaviors from genomic information. Towards this end we have been developing a bioinformatics resource named KEGG, Kyoto Encyclopedia of Genes and Genomes, as part of the research projects in the Kanehisa Laboratory of Kyoto University Bioinformatics Center.
TIGR Gene Indices
The TIGR Gene Index Project is supported in part by funding from the US Department of Energy, Grant #DE-FG02-99ER62852, and the US National Science Foundation, Grant #DBI-9983070. Additional funds are provided by the US National Science Foundation through grants #DBI-9813392 and #DBI-9975866.
Gramene: A Comparative Mapping Resource for Grains
Gramene is a curated, open-source, Web-accessible data resource for comparative genome analysis in the grasses. Our goal is to facilitate the study of cross-species homology relationships using information derived from public projects involved in genomic and EST sequencing, protein structure and function analysis, genetic and physical mapping, interpretation of biochemical pathways, gene and QTL localization, and descriptions of phenotypic characters and mutations.
MaizeDB
The goals of this project are to provide a central repository for public maize information and present it in a way that creates intuitive biological connections for the researcher with minimal effort, as well as to provide a series of computational tools that directly address the questions of the biologist in an easy-to-use form.
Barley Genomics
Areas of research: Barley Genome Mapping, Map-Based Cloning, Molecular Breeding, Mutant Isolation & Characterization, Functional Genomics, BAC Address Calculator, Developmental Mutants.
Links for the preceding entries:
http://www.ncbi.nlm.nih.gov/Entrez/
http://www.roseindia.net/bioinformatics/biologicaldatabases.shtml
http://www.genome.ad.jp/kegg/
http://www.tigr.org/tdb/tgi.shtml
http://www.gramene.org/
http://www.maizegdb.org/
http://barleygenomics.wsu.edu/
EMBL European Bioinformatics Institute
The European Bioinformatics Institute (EBI) is a non-profit academic organisation that forms part of the European Molecular Biology Laboratory (EMBL). The EBI is a centre for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid, protein sequences and macromolecular structures.
A Catalog of Genes for Plant Glycerol Lipid Biosynthesis
The current version of this catalog contains more than 2600 sequence files, many of them with annotation and results of our analysis. This version is updated as of Aug. 1999 and includes essentially all publicly available genomic, cDNA, EST and GSS sequences for 62 plant polypeptides involved in lipid metabolism in higher plant species. An important feature of the catalog is the set of multiple alignments of amino acid sequences deduced from genomic and EST sequences. This version of the dataset accounts for approximately 70% of the Arabidopsis genome.
Grain Genes: A Small Grains and Sugarcane Database
GBrowse, developed by the GMOD group, is a Genome Browser that provides a wealth of genome annotation for maps in the GrainGenes collection. Users can easily manipulate the view of the chromosome and type of data displayed.
PathDB Pathways
PathDB is a beta-level research tool for scientists interested in analyzing their experimental or computational data in the context of biological pathways and networks.
Enzymes and Metabolic Pathways Database
The Enzymes and Metabolic Pathways database, EMP, is a unique and most comprehensive electronic source of biochemical data. It covers all aspects of enzymology and metabolism and represents the whole factual content of original journal publications.
Boehringer Mannheim Biochemical Pathways
Roche Applied Science: LightCycler, MagNA Pure LC, Lumi-Imager, PCR
ExPASy Molecular Biology Server
The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.
Nucleic Acids Research: 2000 Biological Database Issue
Nucleic Acids Research (NAR) publishes the results of leading-edge research into physical, chemical, biochemical and biological aspects of nucleic acids and proteins involved in nucleic acid metabolism and/or interactions. It enables the rapid publication of papers under the following categories: chemistry, computational biology, genomics, molecular biology, RNA and structural biology. A Survey and Summary section provides a format for brief reviews. The first issue of each year is devoted to biological databases, and an issue in July is devoted to papers describing web-based software resources of value to the biological community.
Links for the preceding entries:
http://www2.ebi.ac.uk/
http://www.canr.msu.edu/lgc
http://wheat.pw.usda.gov/
http://www.gmod.org/
http://www.ncgr.org/pathdb/index.html
http://www.roseindia.net/bioinformatics/biologicaldatabases.shtml
http://www.empproject.com/about
http://biochem.boehringer-mannheim.com/prodinfo_fst.htm?/techserv/metmap.htm
http://www.expasy.ch/
http://nar.oupjournals.org/cgi/content/full/28/1/1/DC1
Yeast Protein Database HOME PAGE
Six database volumes of biological information about proteins comprise Incyte's Proteome BioKnowledge Library. Each volume focuses on a different organism important in pharmaceutical research.
Saccharomyces Genome Database
SGD™ is a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast.
The Breast Cancer Gene Database
A database of genes involved in breast cancer. It is similar to the Tumor Gene Database (below) but limited in scope to those genes involved in human breast cancer and thus will be able to go into greater depth. The criteria for a gene to be included in this database are that it has been shown to be involved in human breast cancer (rather than an animal model) and that there is some evidence that it plays a functional role in the induction or progression of breast cancer.
The Mammary Transgene Interactive Database
This is an interactive database of literature on research designed to target transgene proteins to the mammary gland. Current emphasis is on biotechnology applications. Addition of tumor model and developmental model literature is planned.
The Small RNA database
Small RNAs are broadly defined as the RNAs not directly involved in protein synthesis. These are grouped under three categories: 1) Capped small RNAs; 2) Noncapped small RNAs; and 3) Viral small RNAs. Sequences and references are included, and you can do WAIS searching with a keyword.
The Tumor Gene Database
A database of genes associated with tumorigenesis and cellular transformation. This database includes oncogenes, proto-oncogenes, tumor suppressor genes/anti-oncogenes, regulators and substrates of the above, regions believed to contain such genes (such as tumor-associated chromosomal break points and viral integration sites), and other genes and chromosomal regions that seem relevant.
Links for the preceding entries:
http://www.proteome.com/YPDhome.html
http://genome-www.stanford.edu/Saccharomyces/
http://condor.bcm.tmc.edu/ermb/bcgd/bcgd.html
http://mbcr.bcm.tmc.edu/ermb/mtdb/mtdb.html
http://mbcr.bcm.tmc.edu/smallRNA/smallrna.html
http://condor.bcm.tmc.edu/ermb/tgdb/tgdb.html
II. VOCABULARY
Accomplish [ə'kɔmpli∫] v. To complete, to finish
Analyse ['ænəlaiz] v. (= analyze) To analyze
Analytical [,ænə'litikl] adj. (= analytic) Of or relating to analysis
Annotate ['ænouteit] v. To add explanatory notes, to comment on
Application [,æpli'kei∫n] n. Application, application program
Assay [ə'sei] n., v. Test, trial, analysis; to test, to analyze
Bibliographic [,bibliə'græfik] adj. Relating to a bibliography or catalog of references
Challenge ['t∫ælindʒ] n. Challenge, trial, difficult task
Clinical ['klinikl] adj. Relating to the examination or treatment of patients
Cognate ['kɔgneit] adj. Of the same origin, closely related, of the same nature
Compare [kəm'peə] v. To compare
Computational [,kɔmpju:'tei∫nl] adj. Using computers; relating to computer science
Co-ordinate [kou'ɔ:dineit] v. To coordinate, to arrange
Crucial ['kru:∫l] adj. Essential, vital, decisive
Database ['deitəbeis] n. Database
Deal (with) v. To be concerned with, to involve
Demonstrate ['demənstreit] v. To prove, to explain
Deploy [di'plɔi] v. To deploy, to put into operation
Designing [di'zainiŋ] n. Designing, drawing up
Disseminate [di'semineit] v. To spread, to circulate widely
Digital ['didʒitl] adj. Relating to digits; digital
Discipline ['disiplin] n. Branch of knowledge, subject of study, rule
Display [dis'plei] v., n. To display, to show; a display
Distribute [dis'tribju:t] v. To distribute, to spread, to classify
Emergent [i'mə:dʒənt] adj. Prominent, clearly apparent
Identification [ai,dentifi'kei∫n] n. Identification, determination
Imperative [im'perətiv] adj. Urgent, pressing
Implement ['impliment] v. To carry out, to put fully into effect
Inference ['infərəns] n. Inference, conclusion
Innovation [,inou'vei∫n] n. Innovation, renewal
Insight ['insait] n. Deep understanding
Integrate ['intigreit] v. To combine, to harmonize, to unify
Hypothesis [hai'pɔθisis] n. Hypothesis, theory
Expertise [,ekspə'ti:z] n. Proficiency, expert skill
Extract ['ekstrækt - iks'trækt] v. To extract, to draw out
Framework ['freimwə:k] n. Framework; organizational structure; core
Mandatory ['mændətəri] adj. Compulsory
Manually ['mænjuəli] adv. By hand
Middleware ['midlweə] n. Computer software that connects software components or applications; an intermediary connecting layer
Model ['mɔdl] v., n. To model; a model, a pattern
Monitor ['mɔnitə] v. To monitor, to oversee
Network ['netwə:k] n. Network, system
Paradigm ['pærədaim] n. The theoretical framework (of a science); foundation
Perturb [pə'tə:b] v. To disturb, to throw into disorder
Phylogeny [fai'lɔdʒəni] n. (= phylogenesis [,failou'dʒenisis]) The evolutionary origin and development of species
Predict [pri'dikt] v. To predict, to forecast
Quantitation [,kwɔnti'tei∫n] n. Determination of quantity, quantification
Regulatory ['regjulətəri] adj. (cf. regulator ['regjuleitə]) Controlling, regulating
Retrieve [ri'tri:v] v. To get back, to recover, to call up (stored information)
Search [sə:t∫] v., n. To search; a search
Specify ['spesifai] v. To state clearly, to specify
Statistics [stə'tistiks] n. Statistics (the science)
Storing ['stɔ:riŋ] n. (= store, repository) Storage, keeping in reserve; a store
Systemize ['sistəmaiz] v. (= systematize ['sistəmətaiz]) To systematize, to arrange according to a system
Task [tɑ:sk] n. Task, duty
Theoretical [,θiə'retikl] adj. Relating to theory
Throughput ['θru:put] n. (= output or production) Output, productivity
Variability [,veəriə'biləti] n. Variability, changeability
III. READING COMPREHENSION QUESTIONS
1. What is bioinformatics?
2. What can molecular biologists do with effective and efficient computational tools nowadays?
3. Why is it mandatory to use computers in modern life science studies?
4. How is biological information classified?
5. Which biological database link do you like most? Why?
IV. GRAMMAR: SENTENCE COMBINING SKILLS
The Need to Combine Sentences
Sentences have to be combined to avoid the monotony that would surely result if all sentences were brief and of equal length. Part of the writer's task is to employ whatever music is available to him or her in language, and part of language's music lies within the rhythms of varied sentence length and structure. Even poets who write within the formal limits and sameness of an iambic pentameter beat will sometimes strike a chord against that beat and vary the structure of their clauses and sentence length, thus keeping the text alive and the reader awake. This section will explore some of the techniques we ordinary writers use to combine sentences.
Compounding Sentences
A compound sentence consists of two or more independent clauses. That means that there are at least two units of thought within the sentence, either one of which can stand by itself as its own sentence. The clauses of a compound sentence are either separated by a semicolon (relatively rare) or connected by a coordinating conjunction (which is, more often than not, preceded by a comma). And the two most common coordinating conjunctions are "and" and "but". (The others are "or", "for", "yet", and "so".) This is the simplest technique we have for combining ideas:
Meriwether Lewis is justly famous for his expedition into the territory of
the Louisiana Purchase and beyond, but few people know of his
contributions to natural science.
Lewis had been well trained by scientists in Philadelphia prior to his
expedition, and he was a curious man by nature.
Notice that the "and" does little more than link one idea to another; the "but" also links, but it does more work in terms of establishing an interesting relationship between ideas. The "and" is part of the immediate language arsenal of children and of dreams: one thing simply comes after another and the logical relationship between the ideas is not always evident or important. The word "but" (and the other coordinators) is at a slightly higher level of argument.
(Please review the rules of comma usage when you combine two independent clauses with a coordinating conjunction.)
Compounding Sentence Elements
Within a sentence, ideas can be connected by compounding various sentence elements: subjects, verbs, objects or whole predicates, modifiers, etc. Notice that when two such elements of a sentence are compounded with a coordinating conjunction (as opposed to the two independent clauses of a compound sentence), the conjunction is usually adequate and no comma is required.
Subjects: When two or more subjects are doing parallel things, they can often be
combined as a compounded subject.
Working together, President Jefferson and Meriwether Lewis
convinced Congress to raise money for the expedition.
Objects: When the subject(s) is/are acting upon two or more things in parallel, the
objects can be combined.
President Jefferson believed that the headwaters of the Missouri reached
all the way to the Canadian border.
He also believed that meant he could claim all that land for the United
States.
President Jefferson believed that the headwaters of the Missouri might
reach all the way to the Canadian border and that he could claim all that
land for the United States.
Notice that the objects must be parallel in construction: Jefferson believed that
this was true and that was true. If the objects are not parallel (Jefferson was convinced of
two things: that the Missouri reached all the way to the Canadian border and wanted to
begin the expedition during his term in office.) the sentence can go awry. (Please review
the principles of parallelism.)
Verbs and verbals: When the subject(s) is/are doing two things at once, ideas can
sometimes be combined by compounding verbs and verb forms.
He studied the biological and natural sciences.
He learned how to categorize and draw animals accurately.
He studied the biological and natural sciences and learned how to
categorize and draw animals accurately.
Notice that there is no comma preceding the "and learned" connecting the
compounded elements above.
In Philadelphia, Lewis learned to chart the movement of the stars.
He also learned to analyze their movements with mathematical
precision.
In Philadelphia, Lewis learned to chart and analyze the movement of
the stars with mathematical precision.
OR: In Philadelphia, Lewis learned to chart the stars and analyze their movements with mathematical precision.
(Notice in this second version that we don't have to repeat the "to" of the infinitive to
maintain parallel form.)
Modifiers: Whenever it is appropriate, modifiers such as prepositional phrases can be
compounded.
Lewis and Clark recruited some of their adventurers from river-town
bars.
They also used recruits from various military outposts.
Lewis and Clark recruited their adventurers from river-town bars and
various military outposts.
Notice that we do not need to repeat the preposition "from" to make the ideas successfully parallel in form.
Subordinating One Clause to Another
The act of coordinating clauses simply links ideas; subordinating one clause to another establishes a more complex relationship between ideas, showing that one idea depends on another in some way: a chronological development, a cause-and-effect relationship, a conditional relationship, etc.
William Clark was not officially granted the rank of captain prior to the
expedition's departure.
Captain Lewis more or less ignored this technicality and treated Clark as
his equal in authority and rank.
Although William Clark was not officially granted the rank of captain
prior to the expedition's departure, Captain Lewis more or less ignored this
technicality and treated Clark as his equal in authority and rank.
The explorers approached the headwaters of the Missouri.
They discovered, to their horror, that the Rocky Mountain range stood
between them and their goal, a passage to the Pacific.
As the explorers approached the headwaters of the Missouri, they
discovered, to their horror, that the Rocky Mountain range stood between
them and their goal, a passage to the Pacific.
When we use subordination of clauses to combine ideas, the rules of punctuation are very important. It might be a good idea to review the definition of clauses at this point and the uses of the comma in setting off introductory and parenthetical elements.
Using Appositives to Connect Ideas
The appositive is probably the most efficient technique we have for combining ideas. An appositive or appositive phrase is a renaming, a re-identification, of something earlier in the text. You can think of an appositive as a modifying clause from which the clausal machinery (usually a relative pronoun and a linking verb) has been removed. An appositive is often, but not always, a parenthetical element which requires a pair of commas to set it off from the rest of the sentence.
Sacagawea, who was one of the Indian wives of Charbonneau, who
was a French fur-trader, accompanied the expedition as a translator.
A pregnant, fifteen-year-old Indian woman, Sacagawea, one of the wives of the French fur-trader Charbonneau, accompanied the expedition as a translator.
Notice that in the second sentence, above, Sacagawea's name is a parenthetical
element (structurally, the sentence adequately identifies her as "a pregnant, fifteen-year-
old Indian woman"), and thus her name is set off by commas; Charbonneau's name,
however, is essential to the meaning of the sentence (otherwise, which fur-trader are we
talking about?) and is not set off by a pair of commas.
Using Participial Phrases to Connect Ideas
A writer can integrate the idea of one sentence into a larger structure by turning that idea into a modifying phrase.
Captain Lewis allowed his men to make important decisions in a
democratic manner.
This democratic attitude fostered a spirit of togetherness and
commitment on the part of Lewis's fellow explorers.
Allowing his men to make important decisions in a democratic manner, Lewis fostered a spirit of togetherness and commitment among his fellow explorers.
In the sentence above, the participial phrase modifies the subject of the sentence, Lewis. Phrases like this are usually set off from the rest of the sentence with a comma.
The expeditionary force was completely out of touch with their families
for over two years.
They put their faith entirely in Lewis and Clark's leadership.
They never once rebelled against their authority.
Completely out of touch with their families for over two years,
the men of the expedition put their faith in Lewis and Clark's leadership
and never once rebelled against their authority.
Using Absolute Phrases to Connect Ideas
Perhaps the most elegant and most misunderstood method of combining ideas is the absolute phrase. This phrase, which is often found at the beginning of a sentence, is made up of a noun (the phrase's "subject") followed, more often than not, by a participle. Other modifiers might also be part of the phrase. There is no true verb in an absolute phrase, however, and it is always treated as a parenthetical element, an introductory modifier, which is set off by a comma.
The absolute phrase might be confused with a participial phrase, and the difference between them is structurally slight but significant. The participial phrase does not contain the subject-participle relationship of the absolute phrase; it modifies the subject of the independent clause that follows. The absolute phrase, on the other hand, is said to modify the entire clause that follows. In the first combined sentence below, for instance, the absolute phrase modifies the subject Lewis, but it also modifies the verb, telling us "under what conditions" or "in what way" or "how" he disappointed the world. The absolute phrase thus modifies the entire subsequent clause and should not be confused with a dangling participle, which must modify the subject which immediately follows.
Lewis's fame and fortune was virtually guaranteed by his exploits.
Lewis disappointed the entire world by inexplicably failing to publish
his journals.
His fame and fortune virtually guaranteed by his exploits, Lewis disappointed the entire world by inexplicably failing to publish his journals.
Lewis's long journey was finally completed.
His men in the Corps of Discovery were dispersed.
Lewis died a few years later on his way back to Washington, D.C.,
completely alone.
His long journey completed and his men in the Corps of Discovery dispersed, Lewis died a few years later on his way back to Washington, D.C., completely alone.
V. PRACTICE
Combine the sentences in each group into one:
1. Over the past few decades, major advances in the field of molecular biology have led to an explosive growth in biological information.
- The advances in genomic technologies have also contributed to the explosive growth in biological information.
- The explosive growth in biological information has been generated by the scientific community.
2. A biological database is a large body of persistent data.
- A biological database is an organized body of persistent data.
- A biological database is usually associated with computerized software.
- The software is designed to update and query components of the data stored within the system.
- The software is also designed to retrieve components of the data stored within the system.
3. A simple database might be a single file.
- The file contains many records of information.
- Each of the records includes the same set of information.
4. Molecular modeling may not be as accurate at determining a protein's structure as experimental methods.
- Molecular modeling is still extremely helpful in proposing and testing various biological hypotheses.