An Overview of Bioinformatics-Final3


  • 8/3/2019 An Overview of Bioinformatics-Final3


    UNIT 8. AN OVERVIEW OF BIOINFORMATICS

    I. TEXT

    1. Introduction

    Biology is in the middle of a major paradigm shift driven by computing technology. Although it is already an informational science in many respects, the field has been rapidly becoming much more computational and analytical. Rapid progress in genetics and biochemistry research, combined with the tools provided by modern biotechnology, has generated massive volumes of genetic and protein sequence data.

    Bioinformatics has been defined as a means for analysing, comparing, graphically displaying, modeling, storing, systemising, searching, and ultimately distributing biological information, which includes sequences, structures, function, and phylogeny. Thus bioinformatics may be defined as a discipline that generates computational tools, databases, and methods to support genomic and postgenomic research. It comprises the study of DNA structure and function, gene and protein expression, protein production, structure and function, genetic regulatory systems, and clinical applications. Bioinformatics needs expertise from Computer Science, Mathematics, Statistics, Medicine, and Biology.

    2. Knowledge Base in Biology

    In the last 10 years or so, numerous innovations have seen the light, and the consequence is the development of a new biological research paradigm, one that is information-heavy and computer-driven. As genetic information is being stored in computerized databases whose sizes are steadily growing, molecular biologists need effective and efficient computational tools to store and retrieve the cognate information, such as bibliographic or biological information, from the databases, to analyze the sequence patterns they contain, and to extract the biological knowledge the sequences hold. On the other hand, there is a strong need for mathematical methods and computational techniques for challenging computational tasks such as predicting the three-dimensional structure of the molecules the sequences represent and constructing evolutionary trees from the sequence data. These tools will also be used to learn basic facts about biology, such as which sequences of DNA are used to code for proteins and which other combinations of DNA are not used for protein synthesis, leading to a greater understanding of genes and how they influence diseases.
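    As a hedged illustration of the tree-building task mentioned above: distance-based methods for constructing evolutionary trees typically start from a table of pairwise distances between aligned sequences. The sketch below computes one of the simplest such measures, the proportion of positions at which two aligned sequences differ (often called the p-distance). The function names and the three toy sequences are assumptions made for illustration; they do not come from this unit or from any particular tool.

```python
def p_distance(seq_a, seq_b):
    """Return the fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return mismatches / len(seq_a)


def distance_matrix(sequences):
    """Build a dict mapping (name_1, name_2) to the p-distance of that pair."""
    names = list(sequences)
    return {
        (first, second): p_distance(sequences[first], sequences[second])
        for first in names
        for second in names
    }


# Hypothetical, already-aligned toy sequences.
toy_sequences = {
    "species_a": "ACGTACGT",
    "species_b": "ACGTACGA",
    "species_c": "TCGTTCGA",
}
matrix = distance_matrix(toy_sequences)
```

    A tree-building method would then group the sequences with the smallest distances first; that step is left out here to keep the sketch short.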

    Biology employs a digital language for representing its information, using an alphabet of four basic letters (A, C, G, T). All the chromosomes in an organism's cell are represented and identified using these letters. The demanding challenge here is to determine how this digital language of the chromosomes is converted into the three-dimensional and sometimes four-dimensional languages of living and breathing organisms.
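    To make the idea of a four-letter digital language concrete, the sketch below stores each base in two bits and counts how often each base occurs. The particular bit assignment (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for this example; real sequence formats may use other encodings.

```python
from collections import Counter

# Assumed 2-bit code for the four DNA letters (one of several possible choices).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = "ACGT"


def pack(sequence):
    """Pack a DNA string into a single integer, two bits per base."""
    value = 0
    for base in sequence.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value, length):
    """Recover a DNA string of the given length from a packed integer."""
    bases = []
    for shift in range(2 * (length - 1), -1, -2):
        bases.append(BITS_TO_BASE[(value >> shift) & 0b11])
    return "".join(bases)


def base_counts(sequence):
    """Count the occurrences of each letter in a DNA string."""
    return Counter(sequence.upper())


packed = pack("GATTACA")
restored = unpack(packed, 7)
counts = base_counts("GATTACA")
```

    Because only four letters are needed, a sequence of n bases fits in 2n bits, which is one reason sequence data lends itself so readily to computer storage and processing.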


    3. Information Technology in Biology

    As it was found that performing all the above-mentioned tasks manually is nearly impossible due to the massive volumes of biological data and the precision the work requires, it became mandatory to use computers for these purposes. Thus the subject of bioinformatics deals with designing and deploying efficient software tools for accomplishing the above-mentioned tasks in a fast and precise manner. Bridging the gap between the real world of biology and the precise, logical nature of computers requires an interdisciplinary perspective.

    4. Software and Hardware Advancements in Biology

    The tools of computer science, statistics, and mathematics are critical for studying biology as an informational science.

    Recent advances include improved DNA sequencing methods, new approaches to identifying protein structure, and revolutionary methods for monitoring the expression of many genes in parallel. The design of techniques able to deal with different sources of incomplete and noisy data has become another crucial goal for the bioinformatics community. In addition, there is a need to implement computational solutions based on theoretical frameworks to allow scientists to perform complex inferences about the phenomena under study.

    Genomics in the recent past has triggered the development of high-throughput instrumentation for DNA sequencing, DNA arrays, genotyping, proteomics, etc. These instruments have catalyzed a new type of science for biology termed discovery science.

    5. Human Genome Project - An Introduction

    The Human Genome Project has encouraged a series of paradigm changes leading to the view that biology is an informational science. The draft of the human genome has given us a genetic parts list of what is necessary for building a human: approximately 35,000 genes (an early estimate; later analyses put the number of protein-coding genes closer to 20,000), their regulatory regions, a lexicon of motifs that are the building-block components of proteins and genes, and access to the human variability that makes us each different from one another.

    6. Genomes - Discovering Methodology and Study

    Discovery science defines all of the elements in a biological system: for example, the sequence of the genome, or the identification and quantitation of all of the mRNAs or proteins in a particular cell type (respectively, the genome, the transcriptome, and the proteome). Discovery science creates databases of information, in contrast to the more classical hypothesis-driven science that formulates hypotheses and attempts to test them. The high-throughput tools both provide the means for discovery science and can assay how global information sets, for example transcriptomes or proteomes, change as systems are perturbed.

    The genomes of the model organisms (yeast, worm, fly, etc.) have demonstrated the fundamental conservation among all living organisms of the basic informational pathways. Hence systems can be perturbed in model organisms to gain insight into their functioning, and these data will provide fundamental insights into human biology. From the genome, the information pathways and networks can be extracted to begin understanding their logic of life. Furthermore, different genomes can be compared to identify similarities and differences in the strategies for the logic of life, and these provide fundamental insights into development, physiology and evolution. The first eukaryotic genome to be fully sequenced and annotated was that of Saccharomyces cerevisiae. This greatly helps in developing biological and computational tools for genomic and postgenomic research.

    In the era of automated DNA sequencing and revolutionary advances in DNA sequence analysis, the attention of many researchers is now shifting away from the study of single genes or small gene clusters to whole-genome analyses. Knowing the complete sequence of a genome is only the first step in understanding how the wealth of information contained within the genes is transcribed and ultimately translated into functional proteins. In the postgenomic era, functional genomic and proteomic studies help to obtain an image of the dynamic cell.

    7. Systems Biology

    Biology is a highly informational science. There are mainly two types of biological information:

    - The information of genes or proteins, which are the molecular machines of life.
    - The information of the regulatory networks that coordinate and specify the expression patterns of the genes and proteins.

    All biological information is hierarchical. DNA is first transcribed into mRNA, which in turn is translated into protein. Proteins engage in protein interactions, which create informational pathways. These pathways form informational networks, which in turn make up cells. Cells then form networks of cells. An individual is a collection of cells; a host of individuals forms a population, and a variety of populations forms an ecology. This hierarchy poses a primary challenge for researchers and scientists: to create tools and mechanisms that capture and integrate these different levels of biological information in order to gain insight into how they function.
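    The first two levels of this hierarchy (DNA to mRNA, mRNA to protein) can be sketched in a few lines of code. The example below is a simplified illustration under stated assumptions: the input is taken to be the coding strand of the DNA, so transcription is modeled as replacing T with U; translation uses the standard genetic code, starts at the first base, and stops at the first stop codon. The function names and the sample sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Model transcription of a coding strand: the mRNA has U where the DNA has T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA into one-letter amino acids, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGA")   # hypothetical nine-base coding sequence
protein = translate(mrna)
```

    Real genes involve further steps that this sketch ignores, such as locating the correct reading frame and, in eukaryotes, removing introns.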

    All of these paradigm shifts lead to the view that the major challenges for biology and medicine in this new century will be the study of complex systems and the approach necessary for studying these biological complexities. One viable approach is the following:


    i. Identify all elements in the system, such as the sequences of genomes, with currently available discovery tools.

    ii. Use current knowledge of the system to formulate a model predicting its behavior.

    iii. Perturb the system in a model organism using biological, genetic or environmental perturbations, capture information at all relevant levels, such as DNA, mRNA, protein, protein interactions, etc., and integrate the collected information.

    iv. Compare theoretical predictions and experimental data, carry out additional perturbations to bring theory and experiment into closer agreement, and integrate the new data into the model.

    v. Iterate steps iii) and iv) until the mathematical model can predict the structure of the system and its systems or emergent properties given particular perturbations.
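    The predict, perturb, compare and refine cycle in steps ii) to v) can be sketched with a deliberately tiny toy model. Everything in the example is an assumption made for illustration: the "biological system" is a stand-in function whose response is proportional to a (non-zero) dose, the model has a single parameter, and the update rule simply moves that parameter part of the way toward the value each experiment suggests. Real systems-biology models are far richer than this.

```python
def run_experiment(dose):
    """Stand-in for a laboratory measurement of the perturbed system (toy data)."""
    hidden_sensitivity = 2.5  # unknown to the model; assumed for this example
    return hidden_sensitivity * dose


def refine_model(doses, tolerance=1e-3, step=0.5, max_rounds=100):
    """Iterate perturbation and comparison until predictions match observations.

    Doses must be non-zero because the update divides by the dose.
    """
    sensitivity = 1.0  # step ii: initial model built from current knowledge
    for round_number in range(1, max_rounds + 1):
        worst_error = 0.0
        for dose in doses:
            predicted = sensitivity * dose          # model prediction
            observed = run_experiment(dose)         # step iii: perturb and measure
            error = observed - predicted            # step iv: compare
            worst_error = max(worst_error, abs(error))
            sensitivity += step * error / dose      # fold the new data into the model
        if worst_error < tolerance:                 # step v: stop once predictions hold
            return sensitivity, round_number
    return sensitivity, max_rounds


fitted_sensitivity, rounds_used = refine_model([1.0, 2.0, 4.0])
```

    The loop ends when the largest gap between prediction and observation in a full round of perturbations falls below the chosen tolerance, which mirrors the stopping condition described in step v.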

    8. Systems Biology - Challenges Ahead

    - The integration of technology, biology, and computation.
    - The integration of the various levels of biological information and their modeling.
    - The proper annotation of biological information and its storage and integration in databases.
    - The inclusion of other molecules, large and small, in the systems approach.
    - The integration imperatives of systems biology present many challenges to industry and academia.

    9. Conclusion

    With the confluence of biology and computer science, the computer applications of molecular biology are drawing greater attention among life science researchers and scientists these days. As it becomes imperative for biologists to seek the help of information technology professionals to meet the ever-growing computational requirements of a host of exciting and demanding biological problems, the synergy between modern biology and computer science is set to blossom in the days to come. Thus the research scope for mathematical techniques and algorithms, coupled with software programming languages and software development and deployment tools, is set to get a real boost. In addition, information technologies such as databases, middleware, graphical user interface (GUI) design, distributed object computing, storage area networks (SAN), data compression, networking and communication, and remote management are all set to play a critical role in advancing the goals for which the bioinformatics field came into existence.

    10. Biological Database Links

    NCBI Home
    http://www.ncbi.nlm.nih.gov/
    Established in 1988 as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease.

    Entrez Search and Retrieval System
    http://www.ncbi.nlm.nih.gov/Entrez/
    Entrez Programming Utilities are tools that provide access to Entrez data outside of the regular web query interface and may be helpful for retrieving search results for future use in another environment.

    KEGG: Kyoto Encyclopedia of Genes and Genomes
    http://www.genome.ad.jp/kegg/
    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behaviors from genomic information. Towards this end we have been developing a bioinformatics resource named KEGG, Kyoto Encyclopedia of Genes and Genomes, as part of the research projects in the Kanehisa Laboratory of Kyoto University Bioinformatics Center.

    TIGR Gene Indices
    http://www.tigr.org/tdb/tgi.shtml
    The TIGR Gene Index Project is supported in part by funding from the US Department of Energy, Grant #DE-FG02-99ER62852, and the US National Science Foundation, Grant #DBI-9983070. Additional funds are provided by the US National Science Foundation through grants #DBI-9813392 and #DBI-9975866.

    Gramene: A Comparative Mapping Resource for Grains
    http://www.gramene.org/
    Gramene is a curated, open-source, Web-accessible data resource for comparative genome analysis in the grasses. Our goal is to facilitate the study of cross-species homology relationships using information derived from public projects involved in genomic and EST sequencing, protein structure and function analysis, genetic and physical mapping, interpretation of biochemical pathways, gene and QTL localization, and descriptions of phenotypic characters and mutations.

    MaizeDB
    http://www.maizegdb.org/
    The goals of this project are to provide a central repository for public maize information and present it in a way that creates intuitive biological connections for the researcher with minimal effort, as well as to provide a series of computational tools that directly address the questions of the biologist in an easy-to-use form.

    Barley Genomics
    http://barleygenomics.wsu.edu/
    Areas of research: Barley Genome Mapping, Map-Based Cloning, Molecular Breeding, Mutant Isolation & Characterization, Functional Genomics, BAC Address Calculator, Developmental Mutants.

    EMBL European Bioinformatics Institute
    http://www2.ebi.ac.uk/
    The European Bioinformatics Institute (EBI) is a non-profit academic organisation that forms part of the European Molecular Biology Laboratory (EMBL). The EBI is a centre for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid sequences, protein sequences and macromolecular structures.

    A Catalog of Genes for Plant Glycerol Lipid Biosynthesis
    http://www.canr.msu.edu/lgc
    The current version of this catalog contains more than 2600 sequence files, many of them with annotation and results of our analysis. This version is updated as of Aug. 1999 and includes essentially all publicly available genomic, cDNA, EST and GSS sequences for 62 plant polypeptides involved in lipid metabolism in higher plant species. An important feature of the catalog is the multiple alignments of amino acid sequences deduced from genomic and EST sequences. This version of the dataset accounts for approximately 70% of the Arabidopsis genome.

    Grain Genes: A Small Grains and Sugarcane Database
    http://wheat.pw.usda.gov/
    GBrowse, developed by the GMOD group (http://www.gmod.org/), is a Genome Browser that provides a wealth of genome annotation for maps in the GrainGenes collection. Users can easily manipulate the view of the chromosome and the type of data displayed.

    PathDB Pathways
    http://www.ncgr.org/pathdb/index.html
    PathDB is a beta-level research tool for scientists interested in analyzing their experimental or computational data in the context of biological pathways and networks.

    Enzymes and Metabolic Pathways Database
    http://www.empproject.com/about
    The Enzymes and Metabolic Pathways database, EMP, is a unique and comprehensive electronic source of biochemical data. It covers all aspects of enzymology and metabolism and represents the whole factual content of original journal publications.

    Boehringer Mannheim Biochemical Pathways
    http://biochem.boehringer-mannheim.com/prodinfo_fst.htm?/techserv/metmap.htm
    Roche Applied Science: LightCycler, MagNA Pure LC, Lumi-Imager, PCR

    ExPASy Molecular Biology Server
    http://www.expasy.ch/
    The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.

    Nucleic Acids Research: 2000 Biological Database Issue
    http://nar.oupjournals.org/cgi/content/full/28/1/1/DC1
    Nucleic Acids Research (NAR) publishes the results of leading-edge research into physical, chemical, biochemical and biological aspects of nucleic acids and proteins involved in nucleic acid metabolism and/or interactions. It enables the rapid publication of papers under the following categories: chemistry, computational biology, genomics, molecular biology, RNA and structural biology. A Survey and Summary section provides a format for brief reviews. The first issue of each year is devoted to biological databases, and an issue in July is devoted to papers describing web-based software resources of value to the biological community.

    Yeast Protein Database Home Page
    http://www.proteome.com/YPDhome.html
    Six database volumes of biological information about proteins comprise Incyte's Proteome BioKnowledge Library. Each volume focuses on a different organism important in pharmaceutical research.

    Saccharomyces Genome Database
    http://genome-www.stanford.edu/Saccharomyces/
    SGD™ is a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast.

    The Breast Cancer Gene Database
    http://condor.bcm.tmc.edu/ermb/bcgd/bcgd.html
    A database of genes involved in breast cancer. It is similar to the Tumor Gene Database (below) but limited in scope to those genes involved in human breast cancer, and thus will be able to go into greater depth. The criteria for a gene to be included in this database are that it has been shown to be involved in human breast cancer (rather than an animal model) and that there is some evidence that it plays a functional role in the induction or progression of breast cancer.

    The Mammary Transgene Interactive Database
    http://mbcr.bcm.tmc.edu/ermb/mtdb/mtdb.html
    This is an interactive database of literature on research designed to target transgene proteins to the mammary gland. Current emphasis is on biotechnology applications. Addition of tumor model and developmental model literature is planned.

    The Small RNA Database
    http://mbcr.bcm.tmc.edu/smallRNA/smallrna.html
    Small RNAs are broadly defined as the RNAs not directly involved in protein synthesis. These are grouped under three categories: 1) capped small RNAs; 2) noncapped small RNAs; and 3) viral small RNAs. Sequences and references are included, and you can do WAIS searching with a keyword.

    The Tumor Gene Database
    http://condor.bcm.tmc.edu/ermb/tgdb/tgdb.html
    A database of genes associated with tumorigenesis and cellular transformation. This database includes oncogenes, proto-oncogenes, tumor suppressor genes/anti-oncogenes, regulators and substrates of the above, regions believed to contain such genes, such as tumor-associated chromosomal break points and viral integration sites, and other genes and chromosomal regions that seem relevant.
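    As a hedged sketch of what access "outside of the regular web query interface" can look like for the Entrez Programming Utilities listed above, the example below only builds a search URL and does not contact any server. The base address and the parameter names (db, term, retmax, retmode) follow NCBI's public E-utilities documentation rather than this unit; the helper name build_esearch_url and the sample query are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of the E-utilities service, as given in NCBI's own documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(database, term, max_results=20):
    """Return an ESearch URL that asks one Entrez database for matching record IDs."""
    params = {
        "db": database,         # which Entrez database to search
        "term": term,           # the query text
        "retmax": max_results,  # upper limit on the number of IDs returned
        "retmode": "json",      # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query: nucleotide records for baker's yeast.
search_url = build_esearch_url("nucleotide", "Saccharomyces cerevisiae[Organism]", 5)
print(search_url)
```

    The resulting address could be opened in a browser or fetched by a script; anyone doing so regularly should first read NCBI's usage guidelines, which this sketch does not cover.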

    II. Vocabulary

    Accomplish [ə'kɔmpliʃ] v. To complete, to finish
    Analyse ['ænəlaiz] v. (= analyze) To examine in detail, to break down into parts
    Analytical [,ænə'litikəl] adj. (= analytic) Of or relating to analysis
    Annotate ['ænouteit] v. To add explanatory notes, to comment on
    Application [,æpli'keiʃn] n. Use, practical application; an application program
    Assay [ə'sei] n., v. A test, a trial, an analysis; to test, to analyze
    Bibliographic [,bibliə'græfik] adj. Relating to a bibliography or catalogue of references
    Challenge ['tʃælindʒ] n. A test of ability, a difficult task
    Clinical ['klinikl] adj. Relating to the examination or treatment of patients
    Cognate ['kɔgneit] adj. Having the same origin, closely related, of the same nature
    Compare [kəm'peə] v. To examine for similarities and differences
    Computational [,kɔmpju:'teiʃənl] adj. Using computers; relating to computer science
    Co-ordinate [kou'ɔ:dineit] v. To make things work together, to arrange
    Crucial ['kru:ʃl] adj. Essential, vital, decisive
    Database ['deitəbeis] n. An organized collection of data
    Deal (with) v. To be concerned with, to involve
    Demonstrate ['demənstreit] v. To prove, to explain
    Deploy [di'plɔi] v. To put into use, to spread out (forces)
    Designing [di'zainiŋ] n. The act of designing or drawing up
    Disseminate [di'semineit] v. To spread widely, to scatter
    Digital ['didʒitl] adj. Relating to digits; expressed in numbers
    Discipline ['disiplin] n. A branch of knowledge, a subject of study; a rule
    Display [dis'plei] v., n. To show, to exhibit; a showing
    Distribute [dis'tribju:t] v. To hand out, to allocate, to classify
    Emergent [i'mə:dʒənt] adj. Standing out, becoming clearly apparent
    Identification [ai,dentifi'keiʃn] n. Recognition, determination of identity
    Imperative [im'perətiv] adj. Urgent, pressing
    Implement ['implimənt] v. To carry out, to put fully into effect
    Inference ['infərəns] n. A deduction, a conclusion
    Innovation [,inou'veiʃn] n. Renewal; a new method or idea
    Insight ['insait] n. Deep understanding
    Integrate ['intigreit] v. To combine, to blend, to unify
    Hypothesis [hai'pɔθisis] n. A proposed explanation, a theory
    Expertise [,ekspə'ti:z] n. Proficiency, expert skill
    Extract ['ekstrækt] n., [iks'trækt] v. To take out, to draw out
    Framework ['freimwə:k] n. A structure; an organizing scheme; a core
    Mandatory ['mændətəri] adj. Compulsory, required
    Manually ['mænjuəli] adv. By hand
    Middleware ['midlweə] n. Computer software that connects software components or applications; an intermediate connecting layer
    Model ['mɔdl] v., n. To build a model of; a model, a pattern
    Monitor ['mɔnitə] v. To supervise, to keep watch over
    Network ['netwə:k] n. A web of connections, a system
    Paradigm ['pærədaim] n. The theoretical framework (of a science); a foundation
    Perturb [pə'tə:b] v. To disturb, to throw into disorder
    Phylogeny [fai'lɔdʒini] n. (= phylogenesis [,failou'dʒenisis]) The evolutionary origin and development of species
    Predict [pri'dikt] v. To foretell, to forecast
    Quantitation [,kwɔnti'teiʃn] n. Determination of an amount, measurement of quantity
    Regulatory ['regjulətəri] adj. (related: regulator ['regjuleitə]) Controlling, regulating
    Retrieve [ri'tri:v] v. To get back, to recover, to call up (stored information)
    Search [sə:tʃ] v., n. To look for; the act of looking for
    Specify ['spesifai] v. To state precisely, to indicate clearly
    Statistics [stə'tistiks] n. The science of collecting and analyzing numerical data
    Storing ['stɔ:riŋ] n. (= store, repository) Keeping in reserve, storage; a storehouse
    Systemize ['sistəmaiz] v. (= systematize ['sistəmətaiz]) To arrange according to a system
    Task [tɑ:sk] n. A piece of work, a duty
    Theoretical [,θiə'retikl] adj. Relating to or based on theory
    Throughput ['θru:put] n. (= output or production) Output, productivity
    Variability [,veəriə'biləti] n. The quality of being variable or changeable


    III. READING COMPREHENSION QUESTIONS

    1. What is bioinformatics?
    2. What can molecular biologists do with effective and efficient computational tools nowadays?
    3. Why is it mandatory to use computers in modern life science studies?
    4. How is biological information classified?
    5. Which biological database link do you like most? Why?

    IV. GRAMMAR: SENTENCE COMBINING SKILLS

    The Need to Combine Sentences

    Sentences have to be combined to avoid the monotony that would surely result if all sentences were brief and of equal length. Part of the writer's task is to employ whatever music is available to him or her in language, and part of language's music lies within the rhythms of varied sentence length and structure. Even poets who write within the formal limits and sameness of an iambic pentameter beat will sometimes strike a chord against that beat and vary the structure of their clauses and sentence length, thus keeping the text alive and the reader awake. This section will explore some of the techniques we ordinary writers use to combine sentences.

    Compounding Sentences

    A compound sentence consists of two or more independent clauses. That means that there are at least two units of thought within the sentence, either one of which can stand by itself as its own sentence. The clauses of a compound sentence are either separated by a semicolon (relatively rare) or connected by a coordinating conjunction (which is, more often than not, preceded by a comma). The two most common coordinating conjunctions are "and" and "but". (The others are "or", "for", "nor", "yet", and "so".) This is the simplest technique we have for combining ideas:

    Meriwether Lewis is justly famous for his expedition into the territory of the Louisiana Purchase and beyond, but few people know of his contributions to natural science.

    Lewis had been well trained by scientists in Philadelphia prior to his expedition, and he was a curious man by nature.

    Notice that the "and" does little more than link one idea to another; the "but" also links, but it does more work in terms of establishing an interesting relationship between ideas. The "and" is part of the immediate language arsenal of children and of dreams: one thing simply comes after another, and the logical relationship between the ideas is not always evident or important. The word "but" (and the other coordinators) is at a slightly higher level of argument.


    (Please review the rules of comma usage when you combine two independent clauses with a coordinating conjunction.)

    Compounding Sentence Elements

    Within a sentence, ideas can be connected by compounding various sentence elements: subjects, verbs, objects or whole predicates, modifiers, etc. Notice that when two such elements of a sentence are compounded with a coordinating conjunction (as opposed to the two independent clauses of a compound sentence), the conjunction is usually adequate and no comma is required.

    Subjects: When two or more subjects are doing parallel things, they can often be combined as a compounded subject.

    Working together, President Jefferson and Meriwether Lewis convinced Congress to raise money for the expedition.

    Objects: When the subject(s) is/are acting upon two or more things in parallel, the objects can be combined.

    President Jefferson believed that the headwaters of the Missouri reached all the way to the Canadian border.

    He also believed that meant he could claim all that land for the United States.

    President Jefferson believed that the headwaters of the Missouri might reach all the way to the Canadian border and that he could claim all that land for the United States.

    Notice that the objects must be parallel in construction: Jefferson believed that this was true and that that was true. If the objects are not parallel (Jefferson was convinced of two things: that the Missouri reached all the way to the Canadian border and wanted to begin the expedition during his term in office.) the sentence can go awry. (Please review the principles of parallelism.)

    Verbs and verbals: When the subject(s) is/are doing two things at once, ideas can sometimes be combined by compounding verbs and verb forms.

    He studied the biological and natural sciences.

    He learned how to categorize and draw animals accurately.

    He studied the biological and natural sciences and learned how to categorize and draw animals accurately.

    Notice that there is no comma preceding the "and learned" connecting the compounded elements above.


    In Philadelphia, Lewis learned to chart the movement of the stars.

    He also learned to analyze their movements with mathematical precision.

    In Philadelphia, Lewis learned to chart and analyze the movement of the stars with mathematical precision.

    OR In Philadelphia, Lewis learned to chart the stars and analyze their movements with mathematical precision.

    (Notice in this second version that we don't have to repeat the "to" of the infinitive to maintain parallel form.)

    Modifiers: Whenever it is appropriate, modifiers such as prepositional phrases can be compounded.

    Lewis and Clark recruited some of their adventurers from river-town bars.

    They also used recruits from various military outposts.

    Lewis and Clark recruited their adventurers from river-town bars and various military outposts.

    Notice that we do not need to repeat the preposition "from" to make the ideas successfully parallel in form.

    Subordinating One Clause to Another

    The act of coordinating clauses simply links ideas; subordinating one clause to another establishes a more complex relationship between ideas, showing that one idea depends on another in some way: a chronological development, a cause-and-effect relationship, a conditional relationship, etc.

    William Clark was not officially granted the rank of captain prior to the expedition's departure.

    Captain Lewis more or less ignored this technicality and treated Clark as his equal in authority and rank.

    Although William Clark was not officially granted the rank of captain prior to the expedition's departure, Captain Lewis more or less ignored this technicality and treated Clark as his equal in authority and rank.

    The explorers approached the headwaters of the Missouri.

    They discovered, to their horror, that the Rocky Mountain range stood between them and their goal, a passage to the Pacific.

    As the explorers approached the headwaters of the Missouri, they discovered, to their horror, that the Rocky Mountain range stood between them and their goal, a passage to the Pacific.


    When we use subordination of clauses to combine ideas, the rules of punctuation are very important. It might be a good idea to review the definition of clauses at this point and the uses of the comma in setting off introductory and parenthetical elements.

    Using Appositives to Connect Ideas

    The appositive is probably the most efficient technique we have for combining ideas. An appositive or appositive phrase is a renaming, a re-identification, of something earlier in the text. You can think of an appositive as a modifying clause from which the clausal machinery (usually a relative pronoun and a linking verb) has been removed. An appositive is often, but not always, a parenthetical element which requires a pair of commas to set it off from the rest of the sentence.

    Sacagawea, who was one of the Indian wives of Charbonneau, who was a French fur-trader, accompanied the expedition as a translator.

    A pregnant, fifteen-year-old Indian woman, Sacagawea, one of the wives of the French fur-trader Charbonneau, accompanied the expedition as a translator.

    Notice that in the second sentence, above, Sacagawea's name is a parenthetical element (structurally, the sentence adequately identifies her as "a pregnant, fifteen-year-old Indian woman"), and thus her name is set off by commas; Charbonneau's name, however, is essential to the meaning of the sentence (otherwise, which fur-trader are we talking about?) and is not set off by a pair of commas.

    Using Participial Phrases to Connect Ideas

    A writer can integrate the idea of one sentence into a larger structure by turning that idea into a modifying phrase.

    Captain Lewis allowed his men to make important decisions in a

    democratic manner.

    This democratic attitude fostered a spirit of togetherness and

    commitment on the part of Lewis's fellow explorers.

    Allowing his men to make important decisions in a democratic manner, Lewis fostered a spirit of togetherness and commitment among

    his fellow explorers.

    In the sentence above, the participial phrase modifies the subject of the sentence, Lewis. Phrases like this are usually set off from the rest of the sentence with a comma.

    The expeditionary force was completely out of touch with their families

    for over two years.

    They put their faith entirely in Lewis and Clark's leadership.

    They never once rebelled against their authority.


    Completely out of touch with their families for over two years,

    the men of the expedition put their faith in Lewis and Clark's leadership

    and never once rebelled against their authority.

    Using Absolute Phrases to Connect Ideas

    Perhaps the most elegant and most misunderstood method of combining ideas is the absolute phrase. This phrase, which is often found at the beginning of a sentence, is made up of a noun (the phrase's "subject") followed, more often than not, by a participle. Other modifiers might also be part of the phrase. There is no true verb in an absolute phrase, however, and it is always treated as a parenthetical element, an introductory modifier, which is set off by a comma.

    The absolute phrase might be confused with a participial phrase, and the difference between them is structurally slight but significant. The participial phrase does not contain the subject-participle relationship of the absolute phrase; it modifies the subject of the independent clause that follows. The absolute phrase, on the other hand, is said to modify the entire clause that follows. In the first combined sentence below, for instance, the absolute phrase modifies the subject Lewis, but it also modifies the verb, telling us "under what conditions" or "in what way" or "how" he disappointed the world. The absolute phrase thus modifies the entire subsequent clause and should not be confused with a dangling participle, which must modify the subject which immediately follows.

    Lewis's fame and fortune was virtually guaranteed by his exploits.

    Lewis disappointed the entire world by inexplicably failing to publish

    his journals.

    His fame and fortune virtually guaranteed by his exploits, Lewis disappointed the entire world by inexplicably failing to publish his journals.

    Lewis's long journey was finally completed.

    His men in the Corps of Discovery were dispersed.

    Lewis died a few years later on his way back to Washington, D.C.,

    completely alone.

    His long journey completed and his men in the Corps of Discovery dispersed, Lewis died a few years later on his way back to

    Washington, D.C., completely alone.


    V. PRACTICE

    Combining sentences into one:

    1. Over the past few decades, major advances in the field of molecular biology

    have led to an explosive growth in biological information.

    - The advances in genomic technologies have also contributed to the explosive growth in biological information.

    - The explosive growth in biological information has been generated by the scientific community.

    2. A biological database is a large body of persistent data.

    - A biological database is an organized body of persistent data.

    - A biological database is usually associated with computerized software.

    - The software is designed to update and query components of the data stored within the system.

    - The software is also designed to retrieve components of the data stored within the system.

    3. A simple database might be a single file.

    - The file contains many records of information.

    - Each of the records includes the same set of information.

    4. Molecular modeling may not be as accurate at determining a protein's structure as experimental methods.

    - Molecular modeling is still extremely helpful in proposing and testing various biological hypotheses.
