Expa Bioinformatics 2004 Bell Bioinformatics Bti228


    BIOINFORMATICS

    expa: a program for calculating extreme

    pathways in biochemical reaction networks

    Steven L. Bell, Bernhard Ø. Palsson

    Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA

    ABSTRACT

    Summary: The set of extreme pathways, a generating set for all possible steady-state flux maps in a biochemical reaction network, can be computed from the stoichiometric matrix, an incidence-like matrix reflecting the network topology. Here, we describe the implementation of a well-known algorithm to compute these pathways and give a summary of the features of the available software.

    Availability: The C code, along with a Windows executable and sample network reaction files, is available at http://systemsbiology.ucsd.edu

    Contact: [email protected]

    1 INTRODUCTION

    The abundance of genomic data available today allows for construction of genome-scale metabolic networks for many organisms. The topology of the type of networks considered here is determined by an $m \times n$ stoichiometric matrix, $S$, whose rows and columns represent the system's metabolites and reactions, respectively. The dynamics of the system are given by $\dot{x}(t) = Sv$, where $x$ is the $m$-dimensional vector of metabolite concentrations, the dot denotes the time derivative, and $v$ is a vector of fluxes which we assume is independent of concentrations and time.

    Under the assumption that the system is in steady state, we have that $Sv = 0$, and to obtain biologically feasible solutions to this equation, we also impose the condition that $v \ge 0$ (reversible reactions are split into two irreversible reactions, one for each direction). The set of solutions satisfying these constraints is a so-called convex cone which can be generated by a finite (and unique up to a multiple) number of vectors, i.e., each biologically feasible flux vector (when the system is in steady state) can be expressed as a non-negative linear combination of these extreme pathways (5).

    The extreme pathways are the edges of the convex cone, or

    more precisely, they are conically independent, i.e., no such

    vector can be expressed as a non-negative linear combination

    of any other vectors in the cone. The properties and uses of

    extreme pathways have been recently reviewed (2).
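
    The two feasibility conditions above, $Sv = 0$ and $v \ge 0$, can be checked directly for a given flux vector. The following is a minimal sketch, not code from expa: the function name is_feasible_flux, the row-major layout, and the use of integer fluxes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of expa): returns 1 if v is a feasible
 * steady-state flux vector for the m x n stoichiometric matrix S
 * (stored row-major), i.e. S v = 0 and v >= 0; returns 0 otherwise. */
int is_feasible_flux(const int *S, size_t m, size_t n, const int *v)
{
    assert(S != NULL && v != NULL);

    /* Irreversibility: every flux must be non-negative. */
    for (size_t j = 0; j < n; j++)
        if (v[j] < 0)
            return 0;

    /* Steady state: each metabolite's net production must be zero. */
    for (size_t i = 0; i < m; i++) {
        long sum = 0;
        for (size_t j = 0; j < n; j++)
            sum += (long)S[i * n + j] * v[j];
        if (sum != 0)
            return 0;
    }
    return 1;
}
```

    For a hypothetical linear chain (uptake of A, conversion of A to B, secretion of B) with S = [1 -1 0; 0 1 -1], the vector v = (1, 1, 1) passes the check, while v = (1, 2, 1) fails because A and B are not balanced.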

    to whom correspondence should be addressed

    2 DESCRIPTION OF THE ALGORITHM

    Given a metabolic network, where the metabolites are represented by the nodes and the edges represent the associated reactions, we compute the extreme pathways using an algorithm presented in (5) (see also (6)). The algorithm uses matrix operations similar to those used in the well-known Gaussian elimination algorithm.

    For the sake of brevity, we here give a simplified version of the extreme pathway algorithm. The algorithm may be described by a sequence of tableaux $T^0, T^1, \ldots, T^m$, where the initial tableau is given by $T^0 = [\,I \;\; S'\,]$, and the final tableau is $T^m = [\,P \;\; 0\,]$. Here, $I$ is the $n \times n$ identity matrix, prime denotes transpose, $P$ is a matrix with $n$ columns whose rows are the extreme pathways, and $0$ is a matrix with $m$ columns and all entries equal to zero (determining the number of rows in the final tableau is an open problem).

    Converting the right-hand matrix $S'$ to the zero matrix is done column by column using elementary row operations, each tableau corresponding to a column, as follows. For each $1 \le i \le m$, the tableau $T^i$ is obtained from $T^{i-1}$ by first choosing a pivoting column of the right-hand matrix (originating from $S'$) to zero out, column $j$, say. Suppose there are $pos$ positive, $neg$ negative, and $z$ zero elements in column $j$. First, the $z$ rows of $T^{i-1}$ containing a zero in column $j$ are copied to $T^i$. Then each of the $pos$ rows is combined (using an elementary row operation) with each of the $neg$ rows to produce a zero in column $j$ of $T^i$. More precisely, if $T^{i-1}_{s,j} > 0$ and $T^{i-1}_{t,j} < 0$ for some $s$ and $t$, then

    $$|T^{i-1}_{t,j}|\, T^{i-1}_{s} + |T^{i-1}_{s,j}|\, T^{i-1}_{t}$$

    is the new row to be added to $T^i$. (Here, $T^{i-1}_{s}$ denotes the $s$th row, $T^{i-1}_{s,j}$ is the $(s, j)$-element in the tableau $T^{i-1}$, and $|x|$ is the absolute value of $x$.) Finally, only rows which are conically independent are retained in $T^i$: for any two rows $x$, $y$, if $A(x) \subseteq A(y)$, then row $x$ is deleted from the tableau, where $A(x) = \{i : x_i = 0\}$, the indices of the zero components of $x$. Hence, the number of rows in $T^i$ is at most $z + neg \times pos$.
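
    One step of the simplified algorithm above can be sketched on a small dense tableau. This is an illustration under stated assumptions, not the expa source: the names tableau_step and zeros_contained are hypothetical, the tableau is stored row-major as long integers rather than in sparse form, and the zero-set test is applied literally to whole rows as in the simplified description. When two rows have identical zero sets, the sketch keeps one of them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if every zero entry of x is also zero in y, i.e. A(x) is a
 * subset of A(y) in the notation of the text. */
static int zeros_contained(const long *x, const long *y, size_t cols)
{
    for (size_t k = 0; k < cols; k++)
        if (x[k] == 0 && y[k] != 0)
            return 0;
    return 1;
}

/* Hypothetical sketch: builds the next tableau in `next` from `prev` by
 * zeroing out column j. Both are row-major with `cols` columns; `next`
 * must have room for z + pos*neg rows. Returns the number of rows kept. */
size_t tableau_step(const long *prev, size_t rows, size_t cols, size_t j,
                    long *next)
{
    size_t count = 0, kept = 0;
    assert(j < cols);

    /* 1. Copy the z rows that already have a zero in column j. */
    for (size_t s = 0; s < rows; s++)
        if (prev[s * cols + j] == 0) {
            memcpy(&next[count * cols], &prev[s * cols], cols * sizeof(long));
            count++;
        }

    /* 2. Combine each positive row s with each negative row t:
     *    new = |T[t][j]| * T[s] + |T[s][j]| * T[t]. */
    for (size_t s = 0; s < rows; s++) {
        long a = prev[s * cols + j];
        if (a <= 0)
            continue;
        for (size_t t = 0; t < rows; t++) {
            long b = prev[t * cols + j];
            if (b >= 0)
                continue;
            for (size_t k = 0; k < cols; k++)
                next[count * cols + k] =
                    labs(b) * prev[s * cols + k] + labs(a) * prev[t * cols + k];
            count++;
        }
    }

    /* 3. Delete row x if A(x) is contained in A(y) for another live row y. */
    char *dead = calloc(count ? count : 1, 1);
    assert(dead != NULL);
    for (size_t x = 0; x < count; x++)
        for (size_t y = 0; y < count; y++)
            if (x != y && !dead[y] &&
                zeros_contained(&next[x * cols], &next[y * cols], cols)) {
                dead[x] = 1;
                break;
            }

    /* Compact the surviving rows to the front of `next`. */
    for (size_t x = 0; x < count; x++)
        if (!dead[x]) {
            if (kept != x)
                memmove(&next[kept * cols], &next[x * cols],
                        cols * sizeof(long));
            kept++;
        }
    free(dead);
    return kept;
}
```

    For the hypothetical three-reaction chain with S = [1 -1 0; 0 1 -1], the initial tableau has rows (1 0 0 | 1 0), (0 1 0 | -1 1), and (0 0 1 | 0 -1). Applying the step to the two right-hand columns in turn leaves the single row (1 1 1 | 0 0), i.e. one extreme pathway using all three reactions.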

    3 IMPLEMENTATION

    The tableaux are implemented as matrices (two-dimensional

    arrays) using pointers to pointers as described in Appendix B

    Bioinformatics © Oxford University Press 2004; all rights reserved.

    Bioinformatics Advance Access published December 21, 2004


    of (4). This means that the rows of the matrices are not necessarily stored in contiguous locations in memory (since rows are added and deleted each iteration, it would be inefficient to store the matrices in contiguous chunks of memory). For each iteration $i = 1, 2, \ldots, m$, two matrices are used, one containing the tableau from the previous iteration, $T^{i-1}$, and the other for constructing the next tableau, $T^i$. These matrices are in sparse form, i.e., only the non-zero elements are stored in memory. The column indices corresponding to the non-zero elements in each row of the matrices are accounted for by bitmap representations of the matrices (similar to the ones used in (3)). These bitmaps are also implemented as matrices of the type described above, but here each row consists of $w$ words, where $w = \lceil (m + n)/\mathrm{sizeof(word)} \rceil$, i.e., each row is represented by a pointer which points to a location in memory of size $w$ words.
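
    The storage scheme described above can be sketched as follows. This is an assumption-laden illustration rather than the expa data structure: the type names word_t and bitmat_t and all function names are hypothetical, and sizeof(word) in the text is taken to mean the word size in bits (in C, sizeof itself returns bytes, hence the multiplication by CHAR_BIT).

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef unsigned long word_t;                  /* hypothetical word type */
#define WORD_BITS (sizeof(word_t) * CHAR_BIT)  /* bits per word */

/* Words needed to hold one bit per tableau column (m + n bits). */
size_t words_per_row(size_t m, size_t n)
{
    return (m + n + WORD_BITS - 1) / WORD_BITS;   /* ceiling division */
}

/* Bitmap stored as an array of row pointers, so that rows can be added
 * and deleted individually without moving the other rows in memory. */
typedef struct {
    word_t **row;           /* row[r] points to w words */
    size_t nrows, cap, w;
} bitmat_t;

int bitmat_init(bitmat_t *b, size_t m, size_t n, size_t cap)
{
    assert(b != NULL && cap > 0);
    b->row = malloc(cap * sizeof *b->row);
    b->nrows = 0;
    b->cap = cap;
    b->w = words_per_row(m, n);
    return b->row ? 0 : -1;
}

/* Appends a zeroed row; returns it, or NULL if full or out of memory. */
word_t *bitmat_add_row(bitmat_t *b)
{
    if (b->nrows == b->cap)
        return NULL;                    /* growth omitted for brevity */
    word_t *r = calloc(b->w, sizeof *r);
    if (r != NULL)
        b->row[b->nrows++] = r;
    return r;
}

/* Deletes row r by moving the last row pointer into its slot. */
void bitmat_delete_row(bitmat_t *b, size_t r)
{
    assert(r < b->nrows);
    free(b->row[r]);
    b->row[r] = b->row[--b->nrows];     /* only a pointer moves */
}

/* Marks column `col` as holding a non-zero entry. */
void bitmat_set(word_t *row, size_t col)
{
    row[col / WORD_BITS] |= (word_t)1 << (col % WORD_BITS);
}

int bitmat_get(const word_t *row, size_t col)
{
    return (int)((row[col / WORD_BITS] >> (col % WORD_BITS)) & 1u);
}

void bitmat_free(bitmat_t *b)
{
    for (size_t r = 0; r < b->nrows; r++)
        free(b->row[r]);
    free(b->row);
    b->row = NULL;
    b->nrows = 0;
}
```

    Because deletion only moves one pointer, removing a row costs the same regardless of the row length, which is the motivation the text gives for avoiding one contiguous block. Note that this deletion does not preserve row order.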

    For the following, assume (for simplicity) that the expression inside the ceiling operator in the definition of $w$ is an integer. The conical independence check is done with AND logical bit operations using bit representations of the rows (if $x$ and $y$ are bit rows and (~x[i]) & y[i] is nonzero for some $0 \le i \le w - 1$, then $A(x) \not\subseteq A(y)$, where ~ denotes bit negation). Each of the candidate $pos \times neg$ rows (if $\alpha$ and $\beta$ are rows of opposite signs, then a new row, $\gamma$, is constructed by doing $\gamma = \alpha \mid \beta$, where $\mid$ denotes bitwise OR and the operation is performed word by word) is checked against the existing rows of the next bit matrix, and conditional on the outcome of the test, its bit representation is added to the bit array and a corresponding sparse row is added to the next sparse matrix (at the start of the iteration the next matrix consists of the $z$ zero rows determined by the current column).

    The pivoting column chosen in each iteration is one whose quantity $pos \times neg$ is a minimum (of the columns not yet processed). This choice minimizes the amount of work done when constructing the next tableau and may be thought of as a local (or greedy) optimization strategy (an interesting open problem is how to choose pivoting columns based on some global optimization criterion; see the poster from RECOMB04 on our web page for some numerical comparisons of different schemes for picking columns).

    The conical independence check described above is by far the most computationally intensive part of the algorithm. To decrease the number of checks necessary, it may be possible to partition the rows into equivalence classes based on the zero index sets $A(\cdot)$ so that only a subset of the $neg \times pos$ rows need be checked for a particular candidate row (3). Such a partition is, in general, network dependent, and the number of classes must be large enough to outweigh the additional expense of implementing the partitioning scheme. We have, as of yet, not found such a scheme.
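
    The bit operations and the greedy column choice described in this section can be sketched as below. This is a hedged illustration, not the expa code: all names are hypothetical, a set bit is assumed to mark a non-zero entry, and the per-column counts of positive and negative entries are assumed to have been computed elsewhere.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long word_t;   /* hypothetical word type */

/* Candidate bit row: gamma = alpha | beta, performed word by word. */
void bitrow_or(word_t *gamma, const word_t *alpha, const word_t *beta,
               size_t w)
{
    for (size_t i = 0; i < w; i++)
        gamma[i] = alpha[i] | beta[i];
}

/* Returns 1 if A(x) is a subset of A(y). A non-zero (~x[i]) & y[i] means
 * some column is zero in x but non-zero in y, so the inclusion fails. */
int zero_set_subset(const word_t *x, const word_t *y, size_t w)
{
    for (size_t i = 0; i < w; i++)
        if ((~x[i]) & y[i])
            return 0;
    return 1;
}

/* Returns 1 if the candidate passes the test against every existing row
 * of the next bit matrix, i.e. A(cand) is contained in no A(rows[r]). */
int candidate_is_independent(const word_t *cand, word_t *const *rows,
                             size_t nrows, size_t w)
{
    assert(cand != NULL);
    for (size_t r = 0; r < nrows; r++)
        if (zero_set_subset(cand, rows[r], w))
            return 0;
    return 1;
}

/* Greedy pivot choice: index of the unprocessed column with the smallest
 * pos * neg product, or m if every column has been processed. */
size_t choose_pivot(const size_t *pos, const size_t *neg, const char *done,
                    size_t m)
{
    size_t best = m;
    for (size_t j = 0; j < m; j++) {
        if (done[j])
            continue;
        if (best == m || pos[j] * neg[j] < pos[best] * neg[best])
            best = j;
    }
    return best;
}
```

    Unused padding bits in the last word are harmless as long as they are zero in every row, since negating x sets them but the AND with y clears them again. One caveat: the OR of two bit rows marks the pivot column as non-zero even though the combined numeric row is zero there, so a full implementation would presumably clear or ignore the bits of processed columns; the text does not say how expa handles this.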

    There have been other implementations of the extreme pathway algorithm described above (see, for example, (1) and (3)), but to our knowledge none where the source code has been freely available.

    4 FEATURES

    The open-source software comes with a command line interface, where the user is provided with input options and a help menu by typing expa with no arguments or expa -h, respectively. To calculate the extreme pathways, the user has the option of specifying the network topology in the form of the stoichiometric matrix or the corresponding metabolic reactions.

    The matrix file is an ASCII file where each row of the matrix constitutes a line (ended by a newline character) and each matrix entry is separated by white space. In addition, the user must supply the dimensions of the matrix and the number and types of the so-called exchange fluxes (see (5)). Being able to use the stoichiometric matrix as input is useful if preprocessing of the matrix is desired (for example, permuting or removing columns).
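
    A file in the matrix format just described could be read along the following lines. This is only a sketch under assumptions: the function name read_matrix is hypothetical, the entries are assumed to be integers, and the dimensions are passed as arguments because the text does not specify how expa receives them (or the exchange flux information, which is omitted here).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reader (not the expa parser): reads an m x n matrix of
 * white-space separated integers from fp and returns it as a row-major
 * array that the caller must free, or NULL on a short or malformed file.
 * fscanf treats newlines as white space, so the one-row-per-line layout
 * is accepted but not enforced. */
long *read_matrix(FILE *fp, size_t m, size_t n)
{
    assert(fp != NULL && m > 0 && n > 0);

    long *S = malloc(m * n * sizeof *S);
    if (S == NULL)
        return NULL;

    for (size_t k = 0; k < m * n; k++)
        if (fscanf(fp, "%ld", &S[k]) != 1) {
            free(S);            /* fewer than m * n readable entries */
            return NULL;
        }
    return S;
}
```

    Entry (i, j) of the matrix is then S[i * n + j]. Permuting or removing columns before the computation, as mentioned above, amounts to rewriting this array (or the file) before it is handed to the pathway calculation.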

    The reaction file is also an ASCII file and contains all the

    reactions in the metabolic network, one on each line of the

    file. The exact form of the entries is described in the README

    file on our website, and several sample files are provided

    there as well.

    The extreme pathways are output to a file named

    Paths.txt in matrix form, where each row is a pathway.

    ACKNOWLEDGEMENT

    Financial support for this work was provided by a grant from the National Institutes of Health (GM68837).

    REFERENCES

    [1] Klamt, S., Stelling, J., Ginkel, M., and Gilles, E. D. (2003) FluxAnalyzer: exploring structure, pathways, and flux distributions in metabolic networks on interactive flux maps. Bioinformatics 19(2):261–269.

    [2] Papin, J. A., Price, N. D., Wiback, S. J., Fell, D. A., and Palsson, B. Ø. (2003) Metabolic pathways in the post-genome era. Trends in Biochemical Sciences 28:250–258.

    [3] Samatova, N. F., Geist, A., Ostrouchov, G., and Melechko, A. (2002) Parallel out-of-core algorithm for genome-scale enumeration of metabolic systemic pathways. In: Proceedings of the First IEEE Workshop on High Performance Computational Biology (HICOMB2002), Ft. Lauderdale, FL.

    [4] Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992) Numerical Recipes in C: The Art of Scientific Computing, Second Edition, Cambridge University Press.

    [5] Schilling, C. H., Letscher, D., and Palsson, B. Ø. (2000) Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. Journal of Theoretical Biology 203:249–283.

    [6] Schuster, R. and Schuster, S. (1993) Refined algorithm and computer program for calculating all non-negative fluxes admissible in steady states of biochemical reaction systems with or without some flux rates fixed. Computer Applications in the Biosciences 9:79–85.
