

Recurrence Quantification Analysis Reveals Interaction Partners in Paramyxoviridae Envelope Glycoproteins

Alessandro Giuliani1* and Maurizio Tomasi2

1Comparative Toxicology and Ecotoxicology Laboratory, Istituto Superiore di Sanità, Roma, Italy
2Cell Biology Laboratory, Istituto Superiore di Sanità, Roma, Italy

ABSTRACT The paramyxovirus envelope fuses with the host cell membrane by cooperative interaction of two transmembrane glycoproteins: the hemagglutinin neuraminidase (HN) and the fusion (F) glycoprotein. The interaction appears to be finely regulated, as both proteins must derive from the same viral species to obtain a functional interaction. Because HN and F do not form stable complexes, this interaction is poorly characterized. This article demonstrates that a modification of a classical bioinformatic method based on the co-evolution of interacting partners can detect the specificity of the HN and F interaction. The proposed approach relies on a relatively new nonlinear signal analysis technique, recurrence quantification analysis (RQA), applied to the hydrophobicity sequences of viral proteins. This technique is able to shed light on the interaction between HN and F proteins in the virus–cell fusion and, more generally, permits the quantitative comparison of nonhomologous protein systems. In contrast, the same co-evolution approach, based on the classical sequence alignment procedure, was unable to discriminate interacting partners from the general strict correlation existing between the evolution of viral proteins as a whole. The cooperation between HN and F in the fusion process is thus demonstrated from a bioinformatic, purely sequence-dependent, perspective. Proteins 2002;46:171–176. © 2001 Wiley-Liss, Inc.

Key words: signal analysis; bioinformatics; nonlinear methods; membrane fusion; protein–protein interaction

INTRODUCTION

Specific protein–protein interactions are the trigger for many biological processes. For example, in membrane fusion, a protein–protein interaction underlies lipid mixing between two membranes, limiting the fusion only at the docking point.1 The dynamics of the process are mirrored by the dynamics of protein–protein interaction, which, in the case of cell vesicle membrane fusion, appears to involve different folding intermediates.2

Although several indications demonstrate that a protein–protein interaction promotes fusion between the envelope of Paramyxoviridae members and the host cell membrane in a pH-independent manner, the interacting proteins do not appear to form stable complexes.3 A clear understanding at the molecular level is needed. The goal of the present work is to test a computational approach for its ability to detect peculiarities of the interaction event.

The viral envelope fuses with the host membrane in a pH-independent manner through concerted action of the HN and F glycoproteins. HN exhibits hemagglutinin neuraminidase activity, and F is essential for fusion activity. There is considerable evidence3–6 demonstrating that the fusion involves cooperation between the two envelope glycoproteins. This cooperation is very specific, as both HN and F must belong to the same virus species to obtain an efficient fusion process.6 Moreover, the proteins do not form a stable complex, suggesting that the communicative structures are established transiently.6 This makes the HN–F interaction an interesting case for testing bioinformatic methods designed to detect protein–protein interactions.

An impressive research effort is under way to tackle the problem of protein–protein interaction from a bioinformatic perspective7–9 in an effort to provide reliable interaction forecasts directly from sequence data. One of the most promising bioinformatic approaches to the problem of protein–protein interaction is the one recently formalized by Goh et al.,10 based on the co-evolution of interaction partners. The idea, widely accepted in the biological literature and used productively in detecting protein–protein interaction sites,11 is that the correlation in amino acid substitutions between two interacting proteins should be much higher than the one that holds between a noninteracting pair. This simple but powerful idea is inferred from the consideration that the range of allowed mutations relative to one element of the interacting pair is constrained by the need to maintain an efficient interaction with the partner.10,11 This constraint introduces an order parameter in the mutational drift of the two partners, so their mutation patterns correlate to a larger extent than do those of two noninteracting systems.10 Such an approach implies the existence of a relevant amount of noise (genetic drift) decorrelating the evolution of each single protein system and thus masking the substantial unitary character of phylogenesis. Under these conditions, the interacting pairs will be detected by the higher level of correlation exhibited by their relative phylogenetic trees. The species-specificity demonstrated by the HN–F interaction adds relevance to the Goh approach in this particular case, pointing to the presence of physiological constraints that order the HN and F co-evolution. In contrast, the virtual lack of genetic drift typical of highly integrated systems such as viruses undermines the basic assumption of the co-evolution approach and opens the way to other methods not directly dependent on homologous sequence alignment strategies. More specifically, we tested recurrence quantification analysis (RQA), a nonlinear signal analysis method previously applied to the analysis of protein sequences,12,13 for its ability to single out the HN–F interaction.

*Correspondence to: Alessandro Giuliani, Laboratorio di Tossicologia Comparata ed Ecotossicologia, Viale Regina Elena 299, 00161 Roma, Italy. E-mail: [email protected]

Received 8 June 2001; Accepted 4 October 2001

Published online XX Month 2001

PROTEINS: Structure, Function, and Genetics 46:171–176 (2002)

© 2001 WILEY-LISS, INC. DOI 10.1002/prot.10044

We devised a simple “triangulation” approach, in which the interaction between HN and F had to be discriminated by purely bioinformatic, sequence-dependent methods through comparison with another protein system used as a negative control: the NP protein, a nucleocapsid protein that does not directly interact with either HN or F. For this purpose, we selected a set of 10 virus species belonging to the Paramyxoviridae family that represent the entire set for which the data on HN, F, and NP are available. Seven of these species are paramyxoviruses, and three are morbilliviruses. We discarded the species pertaining to the Pneumovirus genus, because the HN glycoprotein in their envelope structure is replaced by a smaller binding protein called G.

To the selected sequences, we applied both the Goh approach based on sequence alignment and the RQA based on protein hydrophobicity ordering along the sequence.12,13

The classical approach failed to single out the interacting pair because of the lack of random mutational drift in viruses and the consequent strong co-evolution of all viral proteins, regardless of their direct interaction. In contrast, the RQA method14–16 is independent of protein homology, eliminating the need to base the correlation on between-virus distance matrices while permitting direct comparison of the two protein systems for the presence of common hydrophobicity patterns. Therefore, RQA was able to identify the interacting pair and give a basis to the observed specificity in terms of a strict linear relation between self-consistent numerical descriptors of the hydrophobic ordering of the two interacting systems.

MATERIALS AND METHODS

Protein Data Set

The viral protein sequences were obtained by using the Entrez retrieval system for protein and nucleotide databases of the National Library of Medicine. The virus species, strains, and accessions for each protein are presented in Table I.

Best Alignment Similarities

To assess protein alignment similarities, we used the algorithm developed by Huang and Miller17 through the ExPASy Molecular Biology Server SIM-Alignment Tool. The matrices of between-sequence distances were derived by the multiple alignment algorithm ClustalW, developed by Thompson et al.,18 using the DeCypher server of TimeLogic Corporation (Crystal Bay, NV). The between-virus 10 × 10 pairwise distance matrices relative to the HN, F, and NP proteins, i.e., dist (HN), dist (F), and dist (NP), were then compared by means of the Pearson correlation coefficient (r).
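The comparison of two between-virus distance matrices described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the helper names `upper_triangle` and `pearson_r` are hypothetical, and the two toy matrices are invented values standing in for real alignment distances.

```python
import math

def upper_triangle(matrix):
    """Return the N*(N-1)/2 distinct off-diagonal entries of a symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy 4 x 4 distance matrices (made-up values, for illustration only).
dist_a = [[0, 1, 4, 6], [1, 0, 3, 5], [4, 3, 0, 2], [6, 5, 2, 0]]
dist_b = [[0, 2, 8, 12], [2, 0, 6, 10], [8, 6, 0, 4], [12, 10, 4, 0]]

# Only the distinct species pairs enter the correlation.
r = pearson_r(upper_triangle(dist_a), upper_triangle(dist_b))
```

Only the upper triangle is used because the matrices are symmetric with a zero diagonal, so each species pair contributes exactly one point to the correlation.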

Recurrence Quantification Analysis

RQA is a nonlinear signal analysis technique14,15 especially suited to short, nonstationary series,16 such as protein sequences. The protein sequence is coded in terms of the relative hydrophobicity of successive amino acid residues according to the Schneider and Wrede scale.12,13 The resulting discrete numerical series is then submitted to four-dimensional embedding by the method of delays.13

The embedding procedure consists of building an n-column matrix (in our case, n = 4) out of the original linear array by shifting the series by a fixed lag. Thus, a linear series X0 is transformed into a multivariate matrix Xn with the X0, X1, X2, X3 series as columns, corresponding to the original (X0) sequence and its copies progressively shifted one amino acid at a time.
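The embedding step can be illustrated with a short sketch. It assumes a lag of one residue, which is how the "amino acid-by-amino acid" shifting is read here. The function name `embed` and the toy numeric series are hypothetical, and the series does not use actual Schneider and Wrede hydrophobicity values.

```python
def embed(series, dim=4, lag=1):
    """Build the embedding matrix by the method of delays.

    Row i is (x[i], x[i + lag], ..., x[i + (dim - 1) * lag]), so the columns
    are the original series and its progressively shifted copies.
    """
    n_rows = len(series) - (dim - 1) * lag
    if n_rows < 1:
        raise ValueError("series too short for this embedding")
    return [[series[i + k * lag] for k in range(dim)] for i in range(n_rows)]

# Toy numeric series standing in for a hydrophobicity-coded sequence.
toy_series = [0.1, 0.5, -0.3, 0.8, 0.2, -0.6]
em = embed(toy_series)  # 3 rows of 4 consecutive values each
```

A series of length L yields L − 3 rows with these settings, which is why a larger embedding dimension shortens the usable series.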

The rows of the embedding matrix (EM) correspond to successive four-residue windows (the embedding dimension) along the sequence. The choice of the embedding dimension was dictated by a balance between the need for a window large enough to keep track of between-residue interactions and the need to retain a sufficiently long series.

Moreover, the four-residue window was demonstrated,19 through the application of a formalism derived from information theory, to constitute an upper limit for the information content of protein sequences.

TABLE I. Virus Species, Strain, and Accessions

Sendai virus (Sendai), strain Z
Accessions: HN: P04853; F: P04855; NP: VHNZSV

Human parainfluenza 1 virus (parainfluenza 1), strain Washington/1957
Accessions: HN: P16071; F: P12605 (strain C39); NP: P26590

Human parainfluenza 3 virus (parainfluenza 3), strain NIH 47885
Accessions: HN: P08492; F: P06828; NP: P06159

Newcastle disease virus (NDV), strain B1
Accession to complete genome: AF309418

Mumps virus (mumps), strain Glouc1/UK96
Accession to complete genome: AF280799

Simian virus 5 (SV-5), strain W3
Accessions: HN: P04850; F: P04849; NP: Q88435

Simian virus 41 (SV-41), strain Toshiba/Chanock
Accessions: HN: P25180; F: P25181; NP: P27018

Measles virus (measles), strain AIK-C
Accession to complete genome: AF266286

Phocine distemper virus (Phocine), isolate DK88-4A
Accessions: HN: P28882; F: P28886; NP: P35944

Rinderpest virus (Rinderpest), strain L
Accessions: HN: P09460; F: P10864; NP: P37708

RQA is based on the computation of the Euclidean distance matrix (DM) between the rows (epochs) of the EM.13 The DM was formally demonstrated to keep track of all the information (in terms of autocorrelation structure) embedded in the original signal,20 without constraints linked to the specific nature of the signal at hand (e.g., linearity, stationarity). The DM is first filtered by darkening the pixels located at those (i, j) coordinates (in our case, four-residue segments centered on the ith and jth amino acids) for which the distance between the ith and jth segments is lower than a predefined radius,13 in this case set to 3, according to previous experience with RQA applications12,13,21 in protein science. In any case, the chosen radius lies at the center of a range within which a change in radius gives rise to a corresponding linear change in the resulting RQA measures.

The features of the distance function make the plot symmetric ($DM_{i,j} = DM_{j,i}$), with a darkened main diagonal corresponding to the identity line ($DM_{i,j} = 0$ when $j = i$). The darkened (recurrent) points single out recurrences within the series, and the plot can be considered a global picture of the autocorrelation structure of the hydrophobicity distribution along the chain.12,13,21 Figure 1 presents HN and F recurrence plots of the Sendai virus to illustrate the output of the procedure.
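The thresholding of the distance matrix can be sketched as below. This is an illustrative reading of the procedure rather than the authors' software: the function name `recurrence_matrix` is hypothetical, and the sketch assumes the radius is applied to raw Euclidean distances, whereas the text does not state whether distances are rescaled before thresholding.

```python
import math

def recurrence_matrix(rows, radius=3.0):
    """Mark (i, j) as recurrent when rows i and j lie closer than `radius`.

    `rows` are the rows of an embedding matrix; the result is a symmetric
    boolean matrix whose main diagonal is always True (distance zero).
    """
    n = len(rows)
    rec = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            close = math.dist(rows[i], rows[j]) < radius
            rec[i][j] = rec[j][i] = close  # symmetry of the distance function
    return rec

# Toy embedded rows (made-up values): the first two are close, the third is far.
toy_rows = [[0, 0, 0, 0], [1, 1, 1, 1], [5, 5, 5, 5]]
rec = recurrence_matrix(toy_rows)
```

Filling both (i, j) and (j, i) from a single distance makes the symmetry of the plot explicit and halves the number of distance evaluations.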

Fig. 1. Recurrence plots relative to the HN and F proteins of Sendai virus. Dots correspond to the recurrences, i.e., to pairs of segments along the chain whose Euclidean distance in terms of hydrophobicity distribution is lower than the threshold. The main diagonal corresponds to the line of identity. HN and F have 575 and 565 amino acid residues, respectively; their REC and DET values in Sendai virus are 3.58 and 40.54, respectively, for HN and 4.93 and 42.98 for F.

In addition to the global impression given by the graphic appearance of the plot, the indexes developed by Webber and Zbilut14 provide a quantitative description of the recurrence structure of the plot. This quantitative description was demonstrated to be consistent with the graphic appearance of the plot by means of a psychometric procedure.21 Five quantitative indexes describe the recurrence plots.13 In the present work, we used the simplest and most basic of these indexes: percentage recurrence (REC), which corresponds to the fraction of the plot filled by recurrent points, i.e., the fraction of recurrent points over all the possible pairs of segments or, equivalently, the fraction of pairwise distances below the chosen radius among all the computed distances. Other indexes are based on the relative position of the recurrent points. Among these indexes, percent determinism (DET) is particularly relevant, corresponding to the percentage of recurrent points that appear consecutively on the chain. The use of REC in our application was dictated by the high correlation scored by this index for the interacting pair. The other RQA parameters also highlighted significant correlations discriminating interacting and noninteracting pairs, but to a slightly lesser extent than was achieved with REC.
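The two indexes can be made concrete with a small sketch operating on a boolean recurrence matrix. The function names are hypothetical. Counting only pairs above the line of identity, and requiring a minimum diagonal run of two points for DET, are assumptions made for this illustration; the text does not specify either choice.

```python
def percent_recurrence(rec):
    """REC: recurrent points as a percentage of all distinct pairs (i < j)."""
    n = len(rec)
    if n < 2:
        raise ValueError("need at least two segments")
    pairs = n * (n - 1) // 2
    hits = sum(rec[i][j] for i in range(n) for j in range(i + 1, n))
    return 100.0 * hits / pairs

def percent_determinism(rec, min_len=2):
    """DET: percentage of recurrent points (i < j) on diagonal runs >= min_len."""
    n = len(rec)
    total = in_lines = 0
    for k in range(1, n):              # k-th diagonal above the line of identity
        run = 0
        for i in range(n - k + 1):     # the extra final step flushes the last run
            on = i < n - k and rec[i][i + k]
            if on:
                run += 1
                total += 1
            else:
                if run >= min_len:
                    in_lines += run
                run = 0
    return 100.0 * in_lines / total if total else 0.0

# Toy 5 x 5 plot: a diagonal run of three points plus one isolated point.
n = 5
toy = [[i == j for j in range(n)] for i in range(n)]
for i, j in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    toy[i][j] = toy[j][i] = True

rec_pct = percent_recurrence(toy)   # 4 recurrent pairs out of 10
det_pct = percent_determinism(toy)  # 3 of the 4 recurrent points lie on a run
```

In the toy plot, REC is 40% and DET is 75%, which shows how DET depends on the arrangement of recurrent points and not only on their number.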

RESULTS AND DISCUSSION

The Goh approach based on sequence alignment yielded the results depicted in Figure 2. All the distance matrix pairs analyzed scored an extremely high mutual correlation, independent of the direct interaction of the corresponding protein systems. This result indicates that the viral proteins exhibit a strong level of co-evolution, likely because of the strict integration of different functions typical of viruses, which leaves no room for mutational drift and thus does not fulfill the basic requirement of the Goh approach.

In contrast to protein alignment methods, whose elementary quantitative result is defined at the level of pairwise sequence comparison, RQA allows for a self-consistent numerical characterization of single protein sequences, here the REC value (Table II). The mutual correlation among HN, F, and NP can then be computed directly at the level of single proteins, without passing through the intermediate step of between-species distance matrices. These correlations are shown in Figure 3. In this case, the interacting pair is clearly discriminated from the two pairs of partners that do not interact directly: whereas HN and F show a strong correlation between their REC values, neither HN and NP nor F and NP show any statistically significant correlation between their hydrophobic structuring along the chain. This result remained substantially invariant when hydrophobicity was calculated by means of different scales and when RQA parameters other than REC were used to quantify the degree of recurrent structuring of the sequences investigated.

The presence of a strong correlation between the REC values of the two elements of the interacting pair points to the presence of structural requirements to be fulfilled in the interaction process and represents an empirical counterpart to the species-specificity of the process.
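The protein-level correlations can be recomputed from the REC values listed in Table II. The sketch below is an illustration only, not the authors' analysis code: the variable names and the `pearson_r` helper are hypothetical, and the values are copied from Table II in the order Sendai to Rinderpest.

```python
import math

# REC values copied from Table II (Sendai, Parainfluenza 1, Parainfluenza 3,
# NDV, Mumps, SV-5, SV-41, Measles, Phocine, Rinderpest).
REC_NP = [3.74, 3.68, 2.75, 3.26, 1.86, 2.84, 2.13, 2.70, 2.79, 2.89]
REC_HN = [3.58, 2.86, 2.83, 4.55, 5.10, 5.70, 5.67, 2.92, 3.02, 4.05]
REC_F = [4.93, 4.61, 3.59, 6.12, 5.42, 7.83, 6.77, 4.20, 4.15, 4.91]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r_hn_f = pearson_r(REC_HN, REC_F)    # interacting pair
r_hn_np = pearson_r(REC_HN, REC_NP)  # negative control
r_f_np = pearson_r(REC_F, REC_NP)    # negative control

for label, r in [("HN vs F", r_hn_f), ("HN vs NP", r_hn_np), ("F vs NP", r_f_np)]:
    print(f"{label}: r = {r:.2f}")
```

With ten virus species, the conventional two-tailed 5% critical value of |r| is about 0.63, so only a coefficient above that magnitude would count as statistically significant in this setting.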

CONCLUSIONS

Sequence alignment methods fail to identify the interacting partners in a virus system. This is probably because of the need to compare homologous systems, and thus to investigate the co-evolution between different protein systems indirectly by calculating the correlation coefficient between the phylogenetic trees (expressed as pairwise distance matrices) based on different proteins. This implies the need for a certain amount of genetic noise to differentiate among the various systems, so that the co-evolution of interacting pairs can emerge.

Fig. 2. Points correspond to the N*(N − 1)/2 distinct elements of the distance matrices (distinct species pair comparisons). a: dist (F) vs dist (HN). b: dist (HN) vs dist (NP). c: dist (F) vs dist (NP). Because of the virtual lack of random mutational drift in viruses, all three distance matrices are strongly correlated.

The lack of redundancy in the use of genetic material typical of viruses (e.g., the use of the same DNA sequence to code for different proteins by simply altering the reading frame)22 virtually eliminates random mutational drift, dramatically lowering the sensitivity of the Goh approach in detecting interacting pairs.

RQA provides a holistic representation of protein sequences in terms of degree of structuring,15 independent of the length of the sequence and, more importantly, retaining its meaning for completely unrelated sequences. Thus, the correlation between proteins in terms of RQA parameters points to similarities in periodic structures present along the chain more than to similarities in the use of particular motifs or residues, as was clearly evidenced in a linguistic application of the method.23 This implies that the RQA description is not suitable for phylogenetic analysis but gives a more structure-oriented description of the studied systems. This fact has been established empirically in previous reports.13,21 Moreover, an algorithm strictly related to RQA (singular value decomposition) proved very efficient in detecting receptor–peptide binding pairs25 on the basis of similar patterns of hydrophobicity.

Beyond the particular case of the lack of genetic noise, we believe that RQA outperforms the sequence alignment methods in all situations in which the periodicities of hydrophobicity distribution play a significant role in the interaction process. This is the case for the HN and F systems: because both are transmembrane proteins, hydrophobicity patterning along the chain is expected to play a functionally relevant role in their interaction.

In our case, HN and F undergo highly coordinated changes in their periodic structures across the viruses examined (Fig. 3). This observation indicates the necessity of a definite ratio between the recurrence structures of the two proteins in order to maintain the general functionality of the system. This is probably the structural counterpart of the observed specificity of the interaction.

It is worth noting that the RQA approach to protein–protein interaction, while eliminating the need for sequence alignment, maintains the essential idea of the Goh approach, i.e., the assumption that two interacting proteins have mutual structural constraints that lead to an increase in coordinated variation.

RQA and, in general, signal analysis techniques such as wavelets26 or singular value decomposition25 offer a global and coarse-grained view of the primary structures of proteins, as opposed to the local and detailed view given by sequence alignment methods. This global view could be very important in investigating the nonlocal effects of single mutations on protein structures, such as those envisaged by Sinha and Nussinov,27 and, more generally, in sketching empirical sequence–structure–activity studies of protein families.13

TABLE II. Percentage Recurrence Values Relative to the Three Investigated Protein Systems for All Selected Virus Strains

Virus              REC NP   REC HN   REC F
Sendai             3.74     3.58     4.93
Parainfluenza 1    3.68     2.86     4.61
Parainfluenza 3    2.75     2.83     3.59
NDV                3.26     4.55     6.12
Mumps              1.86     5.1      5.42
SV-5               2.84     5.7      7.83
SV-41              2.13     5.67     6.77
Measles            2.7      2.92     4.2
Phocine            2.79     3.02     4.15
Rinderpest         2.89     4.05     4.91

REC NP, recurrence, nucleocapsid protein; REC HN, recurrence, hemagglutinin neuraminidase; REC F, recurrence, fusion protein; NDV, Newcastle disease virus; SV-5, SV-41, simian virus 5 and 41.

Fig. 3. Linear correlations between the amounts of recurrence of the three proteins, rec (HN), rec (F), and rec (NP), were evaluated by means of the Pearson correlation coefficient (r). a: rec (HN) vs rec (F). b: rec (HN) vs rec (NP). c: rec (F) vs rec (NP). Only the F–HN interacting pair shows a statistically significant correlation coefficient between the relative amounts of recurrence.

ACKNOWLEDGMENTS

The authors thank Dr. Clara Frontali for helpful discussion stressing the peculiarity of viral evolution. The protein sequences were obtained from databases at NCBI (www.ncbi.nlm.nih.gov:80/database/index.html). Similarity search results were provided by the DeCypher Genomic Analysis Accelerator, courtesy of TimeLogic Corporation (www.timelogic.com). Contact [email protected] for further information (recurrence quantification software can be freely downloaded at http://homepages.luc.edu/~cwebber/).

REFERENCES

1. Hernandez LD, Hoffman LR, Wolfsberg T, White JM. Virus–cell and cell–cell fusion. Annu Rev Cell Dev Biol 1996;12:627–661.

2. Chen YA, Scheller RH. SNARE-mediated membrane fusion. Nature Rev Mol Cell Biol 2001;2:98–106.

3. Tong S, Compans RW. Alternative mechanisms of interaction between homotypic and heterotypic parainfluenza virus HN and F proteins. J Gen Virol 1999;80:107–115.

4. Dallocchio F, Tomasi M, Bellini T. Inhibition of Sendai virus hemagglutinin neuraminidase by the fusion protein. Biochem Biophys Res Commun 1994;201:988–993.

5. Langedijk JPM, Daus FJ, van Oirschot JT. Sequence and structure alignment of paramyxoviridae attachment proteins and discovery of enzymatic activity for a morbillivirus hemagglutinin. J Virol 1997;71:6155–6167.

6. Yao Q, Hu X, Compans RW. Association of the parainfluenza virus fusion and hemagglutinin-neuraminidase glycoproteins on cell surfaces. J Virol 1997;71:650–661.

7. Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci USA 1999;96:4285–4288.

8. Schwikowski B, Uetz P, Fields S. A network of protein–protein interactions in yeast. Nature Biotechnol 2000;12:1242–1243.

9. Marcotte EM, Xenarios I, Eisenberg D. Mining literature for protein–protein interactions. Bioinformatics 2001;4:359–363.

10. Goh CS, Bogan A, Joachimiak M, Walther D, Cohen FE. Co-evolution of proteins with their interaction partners. J Mol Biol 2000;299:283–293.

11. Pazos F, Helmer-Citterich M, Ausiello G, Valencia A. Correlated mutations contain information about protein–protein interaction. J Mol Biol 1997;271:511–523.

12. Zbilut JP, Giuliani A, Webber CL, Colosimo A. Recurrence quantification analysis in structure–function relationships of proteins: overview of a general methodology applied to the case of TEM-1 beta-lactamase. Protein Eng 1998;11:87–93.

13. Giuliani A, Benigni R, Sirabella P, Zbilut JP, Colosimo A. Nonlinear methods in the analysis of protein sequences: a case study in rubredoxins. Biophys J 2000;78:136–148.

14. Webber CL, Zbilut JP. Dynamical assessment of physiological systems and states using recurrence plot strategies. J Appl Physiol 1994;76:965–973.

15. Zbilut JP, Giuliani A, Webber CL. Recurrence quantification analysis and principal components in the detection of short complex signals. Phys Lett A 1998;237:131–135.

16. Giuliani A, Piccirillo G, Marigliano V, Colosimo A. A nonlinear explanation of aging-induced changes in heartbeat dynamics. Am J Physiol 1998;275:H1455–H1461.

17. Huang X, Miller W. A time-efficient, linear-space local similarity algorithm. Adv Appl Math 1991;12:337–357.

18. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673–4680.

19. Strait BJ, Dewey TG. The Shannon information entropy of protein sequences. Biophys J 1996;71:148–155.

20. Rao CR, Suryawanshi S. Statistical analysis of shape of objects based on landmark data. Proc Natl Acad Sci USA 1996;93:12132–12136.

21. Giuliani A, Sirabella P, Benigni R, Colosimo A. Mapping protein sequence spaces by recurrence quantification analysis: a case study on chimeric structures. Protein Eng 2000;13:671–678.

22. Baranowski E, Ruiz-Jarabo CM, Domingo E. Evolution of cell recognition by viruses. Science 2001;292:1102–1105.

23. Orsucci F, Walter K, Giuliani A, Webber CL, Zbilut JP. Orthographic structuring of human speech and texts: linguistic application of recurrence quantification analysis. Int J Chaos Theory Appl 1999;6:21–28.

24. Zbilut JP, Giuliani A, Webber CL. Detecting deterministic signals in exceptionally noisy environments using cross-recurrence quantification. Phys Lett A 1998;246:122–128.

25. Mandell AJ, Selz K, Shlesinger MF. Mode matches and their locations in the hydrophobic free energy sequences of peptide ligands and their receptor eigenfunctions. Proc Natl Acad Sci USA 1997;94:13576–13581.

26. Lio P, Vannucci M. Wavelet change-point prediction of transmembrane proteins. Bioinformatics 2000;16:376–382.

27. Sinha N, Nussinov R. Point mutations and sequence variability in proteins: redistributions of pre-existing populations. Proc Natl Acad Sci USA 2001;98:3139–3144.
