Practical implementation of nonlinear time series methods: The TISEAN package

Rainer Hegger and Holger Kantz a)
Max Planck Institute for Physics of Complex Systems, Nöthnitzer Str. 38, D-01187 Dresden, Germany

Thomas Schreiber
Department of Physics, University of Wuppertal, D-42097 Wuppertal, Germany

(Received 6 October 1998; accepted for publication 13 January 1999)

We describe the implementation of methods of nonlinear time series analysis which are based on the paradigm of deterministic chaos. A variety of algorithms for data representation, prediction, noise reduction, dimension and Lyapunov estimation, and nonlinearity testing are discussed with particular emphasis on issues of implementation and choice of parameters. Computer programs that implement the resulting strategies are publicly available as the TISEAN software package. The use of each algorithm will be illustrated with a typical application. As to the theoretical background, we will essentially give pointers to the literature. © 1999 American Institute of Physics. [S1054-1500(99)00802-2]

Nonlinear time series analysis is becoming a more and more reliable tool for the study of complicated dynamics from measurements. The concept of low-dimensional chaos has proven to be fruitful in the understanding of many complex phenomena despite the fact that very few natural systems have actually been found to be low-dimensional deterministic in the sense of the theory. In order to evaluate the long term usefulness of the nonlinear time series approach as inspired by chaos theory, it will be important that the corresponding methods become more widely accessible. This paper, while not a proper review on nonlinear time series analysis, tries to make a contribution to this process by describing the actual implementation of the algorithms, and their proper usage. Most of the methods require the choice of certain parameters for each specific time series application.
We will try to give guidance in this respect. The scope and selection of topics in this article, as well as the implementational choices that have been made, correspond to the contents of the software package TISEAN, which is publicly available from http://www.mpipks-dresden.mpg.de/~tisean. In fact, this paper can be seen as an extended manual for the TISEAN programs. It fills the gap between the technical documentation and the existing literature, providing the necessary entry points for a more thorough study of the theoretical background.

I. INTRODUCTION

Deterministic chaos as a fundamental concept is by now well established and described in a rich literature. The mere fact that simple deterministic systems generically exhibit complicated temporal behavior in the presence of nonlinearity has influenced thinking and intuition in many fields. However, it has been questioned whether the relevance of chaos for the understanding of the time evolving world goes beyond that of a purely philosophical paradigm. Accordingly, major research efforts are dedicated to two related questions. The first question is if chaos theory can be used to gain a better understanding and interpretation of observed complex dynamical behavior. The second is if chaos theory can give an advantage in predicting or controlling such a time evolution. Time evolution as a system property can be measured by recording a time series. Thus, nonlinear time series methods will be the key to the answers of the above questions. This paper is intended to encourage the explorative use of such methods by a section of the scientific community which is not limited to chaos theorists. A range of algorithms has been made available in the form of computer programs by the TISEAN project.1 Since this is fairly new territory, unguided use of the algorithms bears considerable risk of wrong interpretation and unintelligible or spurious results.
In the present paper, the essential ideas behind the algorithms are summarized and pointers to the existing literature are given. To avoid excessive redundancy with the textbook2 and the recent review,3 the derivation of the methods will be kept to a minimum. On the other hand, the choices that have been made in the implementation of the programs are discussed more thoroughly, even if this may seem quite technical at times. We will also point to possible alternatives to the TISEAN implementation.

Let us at this point mention a number of general references on the subject of nonlinear dynamics. At an introductory level, the book by Kaplan and Glass4 is aimed at an interdisciplinary audience and provides a good intuitive understanding of the fundamentals of dynamics. The theoretical framework is thoroughly described by Ott,5 but also in the older books by Bergé et al.6 and by Schuster.7 More advanced material is contained in the work by Katok and Hasselblatt.8 A collection of research articles compiled by Ott et al.9 covers some of the more applied aspects of chaos, like synchronization, control, and time series analysis.

a) Electronic mail: [email protected]

CHAOS, Vol. 9, No. 2, June 1999, p. 413. 1054-1500/99/9(2)/413/23/$15.00. © 1999 American Institute of Physics.


Nonlinear time series analysis based on this theoretical paradigm is described in two recent monographs, one by Abarbanel10 and one by Kantz and Schreiber.2 While the former volume usually assumes chaoticity, the latter book puts some emphasis on practical applications to time series that are not manifestly found, nor simply assumed to be, deterministic chaotic. This is the rationale we will also adopt in the present paper. A number of older articles can be seen as reviews, including Grassberger et al.,11 Abarbanel et al.,12 as well as Kugiumtzis et al.13,14 The application of nonlinear time series analysis to real world measurements, where determinism is unlikely to be present in a stronger sense, is reviewed in Schreiber.3 Apart from these works, a number of conference proceedings volumes are devoted to chaotic time series, including Refs. 15–19.

A. Philosophy of the TISEAN implementation

A number of different people have been credited with the saying that every complicated question has a simple answer which is wrong. Analyzing a time series with a nonlinear approach is definitely a complicated problem. Simple answers have been repeatedly offered in the literature, quoting numerical values for attractor dimensions for any conceivable system. The present implementation reflects our skepticism against such simple answers, which are the inevitable result of using black box algorithms. Thus, for example, none of the "dimension" programs will actually print a number which can be quoted as the estimated attractor dimension. Instead, the correlation sum is computed and basic tools are provided for its interpretation. It is up to the scientist who does the analysis to put these results into their proper context and to infer what information she or he may find useful and plausible. We should stress that this is not simply a question of error bars. Error bars do not tell about systematic errors, and neither do they tell if the underlying assumptions are justified.

The TISEAN project has emerged from the work of the involved research groups over several years. Some of the programs are in fact based on the code published in Ref. 2. Nevertheless, we still like to see it as a starting point rather than a conclusive step. First of all, nonlinear time series analysis is still a rapidly evolving field, in particular with respect to applications. This implies that the selection of topics in this article and the selection of algorithms implemented in TISEAN are highly biased towards what we know now and have found useful so far. But even the well established concepts like dimension estimation and noise reduction leave considerable room for alternatives to the present implementation. Sometimes this resulted in two or more concurrent and almost redundant programs entering the package. We have deliberately not eliminated these redundancies, since the user may benefit from having a choice. In any case it is healthy to know that for most of the algorithms the final word has not been spoken yet, nor is it ever likely to be.

While the TISEAN package does contain a number of tools for linear time series analysis (spectrum, autocorrelations, histograms, etc.), these are only suitable for a quick inspection of the data. Spectral or even ARMA estimation are industries in themselves and we refer the reader, and the user of TISEAN, to the existing literature and available statistics software for an optimal, up-to-date implementation of these important methods.

Some users will miss a convenient graphical interface to the programs. We felt that at this point the extra implementational effort would not be justified by the expected additional functionality of the package. Work is in progress, however, to provide interfaces to higher level mathematics (or statistics) software.

B. General computational issues

The natural basis on which to formulate nonlinear time series algorithms from chaos theory is a multi-dimensional phase space, rather than the time or the frequency domain. It will be essential for the global dynamics in this phase space to be nonlinear in order to fulfill the constraints of nontriviality and boundedness. Only in particular cases will this nonlinear structure be easily representable by a global nonlinear function. Instead, all properties will be expressed in terms of local quantities, often combined into suitable global averages. All local information will be gained from neighborhood relations of various kinds between time series elements. Thus, a recurrent computational issue will be that of defining local neighborhoods in phase space. Finding neighbors in multidimensional space is a common problem of computational geometry. Multidimensional tree structures are widely used and have attractive theoretical properties. Finding all neighbors of a point in a set of N vectors takes O(log N) operations, thus the total operation count is O(N log N). A fast alternative that is particularly efficient for relatively low-dimensional structures embedded in multidimensional spaces is given by box-assisted neighbor search methods, which can push the operation count down to O(N) under certain assumptions. Both approaches are reviewed in Ref. 20 with particular emphasis on time series applications. In the TISEAN project, a fast neighbor search is done using a box-assisted approach, as described in Ref. 2.

No matter in what space dimension we are working, we will define candidates for nearest neighbors in two dimensions using a grid of evenly spaced boxes. With a grid spacing ε, all neighbors of a vector x closer than ε must be located in the box containing x or in the adjacent boxes. But not all points in the adjacent boxes are neighbors; they may be up to 2ε away in two dimensions and arbitrarily far apart in higher dimensions. The neighbor search is thus a two stage process. First the box-assisted data base has to be filled, and then for each point a list of neighbors can be requested. There are a few instances where it is advisable to abandon the fast neighbor search strategy. One example is the program noise, which performs nonlinear noise filtering on a data stream. It is supposed to start filtering soon after the first points have been recorded, so the neighbor data base cannot be constructed at the beginning. Another exception is if quite short (<500 points, say), high-dimensional data are processed. Then the overhead for the neighbor search should be avoided and instead an optimized straight O(N²) method should be used, as is done in c2naive.
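The two-stage scheme described above can be sketched in a few lines. The following is an illustrative pure-Python toy, not TISEAN source code; the function names and sample points are ours. Stage one files every point into a box of side ε; stage two collects candidates only from the 3×3 block of adjacent boxes and rejects, by an explicit distance test, those candidates that lie in an adjacent box but are farther than ε away (they can be up to 2ε away in two dimensions):

```python
import math
from collections import defaultdict

def build_grid(points, eps):
    """Stage 1: file every 2-D point into a square box of side eps."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(idx)
    return grid

def neighbors(points, grid, i, eps):
    """Stage 2: candidates come only from the 3x3 block of boxes around
    point i; the distance test rejects candidates up to 2*eps away."""
    x, y = points[i]
    bx, by = math.floor(x / eps), math.floor(y / eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((bx + dx, by + dy), []):
                if j != i and math.hypot(points[j][0] - x, points[j][1] - y) < eps:
                    found.append(j)
    return found

points = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (0.18, 0.1)]
grid = build_grid(points, eps=0.1)
print(sorted(neighbors(points, grid, 0, eps=0.1)))  # [1, 3]
```

A production version, like the one in the TISEAN library, would additionally wrap the grid periodically and handle higher embedding dimensions by projecting onto two coordinates; the sketch above only shows the core idea.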

For portability, all programs expect time series data in column format represented by ASCII numbers. The column to be processed can be specified on the command line. Although somewhat wasteful for storing data, ASCII numbers can be produced and read by most other software. All parameters can be set by adding options to the command which, in many programs, just replace the default values. Obviously, relying on default settings is particularly dangerous in such a subtle field. Since almost all routines can read from standard input and write to standard output, programs can be part of pipelines. For example, they can be called as filters from inside graphics software or other software tools which are able to execute shell commands. Also, data conversion and compression can be done "on the fly" this way. The reader will realize that we are speaking of UNIX or LINUX platforms, which seem to be the most appropriate environment. It is, however, expected that most of the programs will be ported to other environments in the near future.

For those readers familiar with the programs published in Ref. 2, we should mention that these form the basis for a number of those TISEAN programs written in FORTRAN. The C programs, even if they do similar things, are fairly independent implementations. All C and C++ programs now use dynamic allocation of storage, for example.

II. PHASE SPACE REPRESENTATION

Deterministic dynamical systems describe the time evolution of a system in some phase space \Gamma \subset \mathbb{R}^d. They can be expressed, for example, by ordinary differential equations,

\dot{x}(t) = F(x(t)),   (1)

or in discrete time t = n \Delta t by maps of the form

x_{n+1} = f(x_n).   (2)

A time series can then be thought of as a sequence of observations {s_n = s(x_n)} performed with some measurement function s(·). Since the (usually scalar) sequence {s_n} in itself does not properly represent the (multi-dimensional) phase space of the dynamical system, one has to employ some technique to unfold the multi-dimensional structure using the available data.

A. Delay coordinates

The most important phase space reconstruction technique is the method of delays. Vectors in a new space, the embedding space, are formed from time delayed values of the scalar measurements:

s_n = (s_{n-(m-1)\tau}, s_{n-(m-2)\tau}, \ldots, s_n).   (3)

The number m of elements is called the embedding dimension; the time \tau is generally referred to as the delay or lag. Celebrated embedding theorems by Takens21 and by Sauer et al.22 state that if the sequence {s_n} does indeed consist of scalar measurements of the state of a dynamical system, then, under certain genericity assumptions, the time delay embedding provides a one-to-one image of the original set {x}, provided m is large enough.
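Equation (3) can be implemented in a few lines. The following sketch (ours, not the TISEAN delay or embed source) also illustrates that only N − (m−1)τ embedding vectors can be formed from N scalar measurements:

```python
def delay_embed(s, m, tau):
    """Delay vectors s_n = (s_{n-(m-1)tau}, ..., s_n), as in Eq. (3)."""
    n0 = (m - 1) * tau  # first index for which a full vector exists
    return [tuple(s[n - n0 + k * tau] for k in range(m))
            for n in range(n0, len(s))]

s = list(range(10))                  # stand-in for N = 10 scalar measurements
vecs = delay_embed(s, m=3, tau=2)
print(len(vecs), vecs[0], vecs[-1])  # 6 (0, 2, 4) (5, 7, 9)
```

Here N = 10, m = 3, τ = 2 leaves 10 − (3−1)·2 = 6 vectors, which is the count that enters the normalization of averaged quantities mentioned below.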

Time delay embeddings are used in almost all methods described in this paper. The implementation is straightforward and does not require further explanation. If N scalar measurements are available, the number of embedding vectors is only N − (m−1)τ. This has to be kept in mind for the correct normalization of averaged quantities. There is a large literature on the "optimal" choice of the embedding parameters m and τ. It turns out, however, that what constitutes the optimal choice largely depends on the application. We will therefore discuss the choice of embedding parameters occasionally together with other algorithms below.

A stand-alone version of the delay procedure (delay, embed) is an important tool for the visual inspection of data, even though visualization is restricted to two dimensions, or at most to two-dimensional projections of three-dimensional renderings. A good unfolding already in two dimensions may give some guidance about a good choice of the delay time for higher-dimensional embeddings. As an example, let us show two different two-dimensional delay coordinate representations of a human magneto-cardiogram (Fig. 1). Note that we neither assume nor claim that the magneto- (or electro-) cardiogram is deterministic or even chaotic. Although in the particular case of cardiac recordings the use of time delay embeddings can be motivated theoretically,23 here we only want to use the embedding technique as a visualization tool.

B. Embedding parameters

A reasonable choice of the delay gains importance through the fact that we always have to deal with a finite amount of noisy data. Both noise and finiteness will prevent us from having access to infinitesimal length scales, so that the structure we want to exploit should persist up to the largest possible length scales. Depending on the type of structure we want to explore we have to choose a suitable time delay. Most obviously, delay unity for highly sampled flow data will yield delay vectors which are all concentrated around the diagonal in the embedding space, and thus the structure perpendicular to the diagonal is almost invisible. In Ref. 24 the terms redundancy and irrelevance were used to characterize the problem: Small delays yield strongly correlated vector elements, large delays lead to vectors whose components are (almost) uncorrelated and the data are thus (seemingly) randomly distributed in the embedding space. Quite a number of papers have been published on the proper choice of the delay and embedding dimension. We have argued repeatedly11,2,3 that an "optimal" embedding can, if at all, only be defined relative to a specific purpose for which the embedding is used. Nevertheless, some quantitative tools are available to guide the choice.

FIG. 1. Time delay representation of a human magneto-cardiogram. In the upper panel, a short delay time of 10 ms is used to resolve the fast waveform corresponding to the contraction of the ventricle. In the lower panel, the slower recovery phase of the ventricle (small loop) is better resolved due to the use of a slightly longer delay of 40 ms. Such a plot can conveniently be produced by a graphic tool such as gnuplot without generating extra data files.

The usual autocorrelation function (autocor, corr) and the time delayed mutual information (mutual), as well as visual inspection of delay representations with various lags, provide important information about reasonable delay times, while the false neighbors statistic (false_nearest) can give guidance about the proper embedding dimension. Again, "optimal" parameters cannot be thus established except in the context of a specific application.
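As an illustration of the autocorrelation heuristic, a common rule of thumb (not a TISEAN listing; function names are ours) takes the lag where the autocorrelation first crosses zero, or drops below 1/e, as a candidate delay:

```python
import math

def autocorrelation(s, max_lag):
    """Autocorrelation function, normalized so that acf[0] = 1."""
    n = len(s)
    mean = sum(s) / n
    var = sum((v - mean) ** 2 for v in s)
    return [sum((s[t] - mean) * (s[t + lag] - mean) for t in range(n - lag)) / var
            for lag in range(max_lag + 1)]

s = [math.sin(0.1 * t) for t in range(1000)]   # period 2*pi/0.1 ~ 63 samples
acf = autocorrelation(s, 40)
first_zero = next(lag for lag, c in enumerate(acf) if c <= 0.0)
print(first_zero)   # close to a quarter period (~16 samples)
```

For a sinusoid the first zero of the autocorrelation sits near a quarter period, which is also where a two-dimensional delay plot of the signal is maximally unfolded. For nonlinear signals, this linear statistic should be cross-checked against the mutual information discussed next in the text.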

1. Mutual information

The time delayed mutual information was suggested by Fraser and Swinney25 as a tool to determine a reasonable delay: Unlike the autocorrelation function, the mutual information also takes into account nonlinear correlations. One has to compute

S = -\sum_{ij} p_{ij}(\tau) \ln \frac{p_{ij}(\tau)}{p_i p_j},   (4)

where, for some partition on the real numbers, p_i is the probability to find a time series value in the i-th interval, and p_{ij}(\tau) is the joint probability that an observation falls into the i-th interval and the observation a time \tau later falls into the j-th. In theory this expression has no systematic dependence on the size of the partition elements and can be quite easily computed. There exist good arguments that if the time delayed mutual information exhibits a marked minimum at a certain value of \tau, then this is a good candidate for a reasonable time delay. However, these arguments have to be modified when the embedding dimension exceeds two. Moreover, as will become transparent in the following sections, not all applications work optimally with the same delay. Our routine mutual uses Eq. (4), where the number of boxes of identical size and the maximal delay time have to be supplied. The adaptive algorithm used in Ref. 25 is more data intensive. Since we are not really interested in absolute values of the mutual information here but rather in the first minimum, the minimal implementation given here seems to be sufficient. The related generalized mutual information of order two can be defined using the correlation sum concept (Sec. VII, Refs. 26, 27). An estimation of the correlation entropy is explained in Sec. VII A.
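A histogram estimate of Eq. (4) with boxes of identical size, in the spirit of the routine mutual, can be sketched as follows. This is a simplified illustration, not the TISEAN source; all names are ours, and the marginals are estimated over the overlapping index ranges, which is one of several reasonable conventions:

```python
import math

def mutual_information(s, tau, nbins):
    """Histogram estimate of Eq. (4) with nbins boxes of identical size."""
    lo, hi = min(s), max(s)
    width = (hi - lo) / nbins or 1.0      # guard against a constant series
    def box(v):
        return min(int((v - lo) / width), nbins - 1)
    n = len(s) - tau
    pij, pi, pj = {}, [0.0] * nbins, [0.0] * nbins
    for t in range(n):
        i, j = box(s[t]), box(s[t + tau])
        pij[(i, j)] = pij.get((i, j), 0.0) + 1.0 / n
        pi[i] += 1.0 / n
        pj[j] += 1.0 / n
    return sum(p * math.log(p / (pi[i] * pj[j])) for (i, j), p in pij.items())

# period-4 signal: s_t determines s_{t+4} exactly, while s_t = 1 leaves two
# possible successors at tau = 1, so I(tau=4) exceeds I(tau=1)
s = [0.0, 1.0, 2.0, 1.0] * 64
print(round(mutual_information(s, tau=4, nbins=4), 3))  # 1.04 (= 1.5 ln 2)
print(round(mutual_information(s, tau=1, nbins=4), 3))  # 0.693 (~ ln 2)
```

For delay selection one would scan τ from 1 up to the maximal delay and look for the first marked minimum of this quantity.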


2. False nearest neighbors

A method to determine the minimal sufficient embedding dimension m was proposed by Kennel et al.28 It is called the false nearest neighbor method. The idea is quite intuitive. Suppose the minimal embedding dimension for a given time series {s_i} is m_0. This means that in an m_0-dimensional delay space the reconstructed attractor is a one-to-one image of the attractor in the original phase space. In particular, the topological properties are preserved. Thus the neighbors of a given point are mapped onto neighbors in the delay space. Due to the assumed smoothness of the dynamics, neighborhoods of the points are mapped onto neighborhoods again. Of course the shape and the diameter of the neighborhoods are changed according to the Lyapunov exponents. But suppose now you embed in an m-dimensional space with m < m_0. Due to this projection the topological structure is no longer preserved. Points are projected into neighborhoods of other points to which they would not belong in higher dimensions. These points are called false neighbors. If now the dynamics is applied, these false neighbors are not typically mapped into the image of the neighborhood, but somewhere else, so that the average "diameter" becomes quite large.

The idea of the algorithm false_nearest is the following. For each point \vec{s}_i in the time series, look for its nearest neighbor \vec{s}_j in an m-dimensional space. Calculate the distance \|\vec{s}_i - \vec{s}_j\|. Iterate both points and compute

R_i = \frac{|s_{i+1} - s_{j+1}|}{\|\vec{s}_i - \vec{s}_j\|}.   (5)

If R_i exceeds a given heuristic threshold R_t, this point is marked as having a false nearest neighbor.28 The criterion that the embedding dimension is high enough is that the fraction of points for which R_i > R_t is zero, or at least sufficiently small. Three examples are shown in Fig. 2: one for the Lorenz system (crosses), one for the Hénon system (filled circles), and one for a Hénon time series corrupted by 10% of Gaussian white noise (open circles). One clearly sees that, as expected, m = 2 is sufficient for the Hénon and m = 3 for the Lorenz system, whereas the signature is less clear in the noisy case.
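A brute-force sketch of this criterion might look as follows. It makes several simplifying assumptions that the TISEAN program does not: unit delay, maximum norm, an O(N²) neighbor search instead of the fast one, and no minimal temporal separation of neighbors. The Hénon map serves as test data, as in Fig. 2, and all names are ours:

```python
def henon_series(n, a=1.4, b=0.3):
    """Scalar time series from the Henon map, transient discarded."""
    x, y, out = 0.1, 0.1, []
    for i in range(n + 100):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= 100:
            out.append(x)
    return out

def fnn_fraction(s, m, rt=10.0):
    """Fraction of delay vectors (delay 1, max norm) whose nearest
    neighbor is false according to the ratio R_i of Eq. (5)."""
    vecs = [s[n:n + m] for n in range(len(s) - m)]
    dist = lambda u, v: max(abs(a - b) for a, b in zip(u, v))
    false = total = 0
    for i in range(len(vecs)):
        # brute-force nearest neighbor; TISEAN uses a box-assisted search
        j = min((k for k in range(len(vecs)) if k != i),
                key=lambda k: dist(vecs[i], vecs[k]))
        d = dist(vecs[i], vecs[j])
        if d == 0.0:
            continue
        total += 1
        if abs(s[i + m] - s[j + m]) / d > rt:
            false += 1
    return false / total

s = henon_series(400)
print(fnn_fraction(s, 1) > fnn_fraction(s, 2))  # True: the fraction drops at m=2
```

For the noise-free Hénon map the fraction essentially vanishes at m = 2, mirroring the filled circles in Fig. 2; the simplifications above make the absolute numbers differ from the published curves.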

The introduction of the false nearest neighbors concept and other ad hoc instruments was partly a reaction to the finding that many results obtained for the genuine invariants, like the correlation dimension, have been spurious due to caveats of the estimation procedure. In the latter case, serial correlations and small sample fluctuations can easily be mistaken for nonlinear determinism. It turns out, however, that the ad hoc quantities basically suffer from the same problems, which can be cured by the same precautions. The implementation false_nearest therefore allows us to specify a minimal temporal separation of valid neighbors.

FIG. 2. The fraction of false nearest neighbors as a function of the embedding dimension for noise free Lorenz (crosses) and Hénon (filled circles) time series, as well as a Hénon time series (open circles) corrupted by 10% of noise.

Other software for the analysis of false nearest neighbors is available in source form from Kennel,29 or, if you prefer to pay for a license, from Ref. 30.

C. Principal components

It has been shown in Ref. 22 that the embedding technique can be generalized to a wide class of smooth transformations applied to a time delay embedding. In particular, if we introduce time delay coordinates {s_n}, then almost every linear transformation of sufficient rank again leads to an embedding. A specific choice of linear transformation is known as principal component analysis, singular value decomposition, empirical orthogonal functions, Karhunen–Loève decomposition, and probably a few other names. The technique is fairly widely used, for example to reduce multivariate data to a few major modes. There is a large literature, including textbooks like that by Jolliffe.31 In the context of nonlinear signal processing, the technique has been advocated, among others, by Broomhead and King.32

The idea is to introduce a new set of orthonormal basis vectors in embedding space such that projections onto a given number of these directions preserve the maximal fraction of the variance of the original vectors. In other words, the error in making the projection is minimized for a given number of directions. Solving this minimization problem31 leads to an eigenvalue problem. The desired principal directions can be obtained as the eigenvectors of the symmetric autocovariance matrix that correspond to its largest eigenvalues. The alternative and formally equivalent approach via the trajectory matrix is used in Ref. 32. The latter is numerically more stable but involves the singular value decomposition of an N×m matrix for N data points embedded in m dimensions, which can easily exceed computational resources for time series of even moderate length.33
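The autocovariance route can be sketched as follows. This is an illustration of the eigenvalue approach, not the svd or pc source code; NumPy is assumed to be available and all names are ours:

```python
import numpy as np

def principal_components(s, m, tau, q):
    """Diagonalize the m x m autocovariance matrix of the delay vectors
    and project onto the q eigenvectors with the largest eigenvalues."""
    n = len(s) - (m - 1) * tau
    X = np.array([s[i:i + (m - 1) * tau + 1:tau] for i in range(n)], dtype=float)
    X -= X.mean(axis=0)
    C = X.T @ X / n                      # symmetric autocovariance matrix
    evals, evecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:q]  # keep the q largest
    return X @ evecs[:, order], evals[order]

rng = np.random.default_rng(0)
t = np.arange(2000) * 0.05
s = np.sin(t) + 0.1 * rng.normal(size=t.size)   # noisy oscillation
proj, ev = principal_components(s, m=10, tau=1, q=2)
print(proj.shape)   # (1991, 2): one 2-D projected point per delay vector
```

The short delay combined with a generous embedding dimension (here m = 10, τ = 1) is exactly the regime recommended in the text: the leading components average over the noise, and plotting the two columns of proj against each other gives the kind of picture shown in Fig. 3.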

In almost all the algorithms described below, simple time delay embeddings can be substituted by principal components. In the TISEAN project (routines svd, pc), principal components are only provided as a stand-alone visualization tool and for linear filtering,34 see Sec. II E below. In any case, one first has to choose an initial time delay embedding and then a number of principal components to be kept. For the purpose of visualization, the latter is immediately restricted to two or at most three. In order to take advantage of the noise averaging effect of the principal component scheme, it is advisable to choose a much shorter delay than one would for an ordinary time delay embedding, while at the same time increasing the embedding dimension. Experimentation is recommended. Figure 3 shows the contribution of the first two principal components to the magnetocardiogram shown in Fig. 1.
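The computation behind this scheme can be illustrated with a minimal numpy sketch (this is our own stand-in, not the TISEAN svd or pc routines; the names delay_embed and principal_components are ours, and a noisy sine serves as artificial data):

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Delay-embedding matrix: row n is (x[n], x[n+tau], ..., x[n+(m-1)tau])."""
    N = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau : i * tau + N] for i in range(m)])

def principal_components(x, m, tau=1, q=2):
    """Project delay vectors onto the q leading eigenvectors of the
    autocovariance matrix (principal component analysis)."""
    X = delay_embed(x, m, tau)
    X = X - X.mean(axis=0)
    C = X.T @ X / len(X)               # m x m autocovariance matrix
    evals, evecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]    # re-sort descending
    evals, evecs = evals[order], evecs[:, order]
    return X @ evecs[:, :q], evals

# Example: a densely sampled noisy sine; two components capture most variance
t = np.arange(2000)
x = np.sin(2 * np.pi * t / 50) + 0.1 * np.random.default_rng(0).normal(size=2000)
pc, evals = principal_components(x, m=20, tau=1, q=2)
frac = evals[:2].sum() / evals.sum()   # variance fraction of the first two
```

For the magnetocardiogram of Fig. 3 the analogous call would use m=20 and unit delay; the variance fraction plays the role of the 70% quoted in the caption.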

D. Poincaré sections

Highly sampled data representing the continuous time of a differential equation are called flow data. They are characterized by the fact that errors in the direction tangent to the trajectory do neither shrink nor increase exponentially (the so-called marginally stable direction) and thus possess one Lyapunov exponent which is zero, since any perturbation in this direction can be compensated by a simple shift of the time. Since in many data analysis tasks this direction is of low interest, one might wish to eliminate it. The theoretical concept to do so is called the Poincaré section. After having chosen an (m−1)-dimensional hyperplane in the m-dimensional (embedding) space, one creates a compressed time series of only the intersections of the time continuous trajectory with this hyperplane in a predefined orientation. These data are then vector valued discrete time map like data. One can consider the projection of these (m−1)-dimensional vectors onto the real numbers as another measurement function (e.g., by recording the value of s_n when s_n passes the Poincaré surface), so that one can create a new scalar time series if desirable. The program poincare constructs a sequence of vectors from a scalar flow-like data set, if one specifies the hyperplane, the orientation, and the embedding parameters. The intersections of the discretely sampled trajectory with the Poincaré plane are computed by a third order interpolation (see Fig. 4).

The placement of the Poincaré surface is of high relevance for the usefulness of the result. An optimal surface maximizes the number of intersections, i.e., minimizes the time intervals between them, if at the same time the attractor remains connected. One avoids the trials and errors related to that if one defines the surface by the zero crossing of the temporal derivative of the signal, which is synonymous with collecting all maxima or all minima, respectively. This is done by extrema. However, this method suffers more from noise, since for small time derivatives (i.e., close to the extrema) additional extrema can be produced by perturbations. Another aspect for the choice of the surface of section is that one should try to maximize the variance of the data inside the section, since their absolute noise level is independent of

FIG. 3. The phase space representation of a human magnetocardiogram using the two largest principal components. An initial embedding was chosen in m=20 dimensions with a delay of τ=7 ms. The two components shown cover 70% of the variance of the initial embedding vectors.


418 Chaos, Vol. 9, No. 2, 1999 Hegger, Kantz, and Schreiber


the section. One last remark: Time intervals between intersections are phase space observables as well36 and the embedding theorems are thus valid. For a time series with pronounced spikes, one often likes to study the sequence of interspike time intervals, e.g., in cardiology the RR-intervals. If these time intervals are constructed in a way to yield time intervals of a Poincaré map, they are suited to reflect the deterministic structure (if any). For complications see Ref. 36.

For a periodically driven nonautonomous system, the best surface of section is usually given by a fixed phase of the driving term, which is also called a stroboscopic view. Here again the selection of the phase should be guided by the variance of the signal inside the section.
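The extrema-based section can be sketched in a few lines (this is our simplified stand-in, not the TISEAN extrema routine; the paper's poincare program uses third-order interpolation, while the sketch below settles for a parabola through the three samples around each maximum, and mindist plays the role of the -t option):

```python
import numpy as np

def section_by_maxima(x, mindist=3):
    """Return (times, values) of the local maxima of a sampled flow signal.
    A parabola through the three samples around each maximum gives a
    sub-sample estimate of its position and height. Maxima closer than
    `mindist` samples to the previous one are discarded as noise-induced."""
    times, values = [], []
    last = -np.inf
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:            # discrete local maximum
            if i - last < mindist:                 # too close to previous one
                continue
            a = 0.5 * (x[i - 1] + x[i + 1]) - x[i]   # parabola curvature
            b = 0.5 * (x[i + 1] - x[i - 1])          # parabola slope
            dt = -b / (2 * a) if a != 0 else 0.0     # vertex offset in (-1, 1)
            times.append(i + dt)
            values.append(x[i] + b * dt + a * dt * dt)
            last = i
    return np.array(times), np.array(values)

# Example: the maxima of a sine recur at its period and at height ~1
t = np.arange(0, 1000)
x = np.sin(2 * np.pi * t / 37.3)
tm, vm = section_by_maxima(x)
```

A delay plot of vm against its own past, as in Fig. 4, would then reveal the structure of the underlying map.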

E. SVD filters

There are at least two reasons to apply an SVD filter to time series data: Either, if one is working with flow data, one can implicitly determine the optimal time delay, or, when deriving a stroboscopic map from synchronously sampled data of a periodically driven system, one might use the redundancy to optimize the signal to noise ratio.

In both applications the mathematics is the same: One constructs the covariance matrix of all data vectors (e.g., in an m-dimensional time delay embedding space),

$C_{ij} = \langle s_{n-m+i}\, s_{n-m+j} \rangle - \langle s_{n-m+i} \rangle \langle s_{n-m+j} \rangle,$  (6)

and computes its singular vectors. Then one projects onto the m-dimensional vectors corresponding to the q largest singular values. To work with flow data, q should be at least the

FIG. 4. A Poincaré surface of section using extrema: A two-dimensional delay plot of the sequence of maxima (top) and of the time intervals between successive maxima (bottom). Without employing the option -t time, where time is the number of time steps after the last extremum during which no further extrema are searched for (here: 3), one finds some fake extrema due to noise showing up close to the diagonal of the delay representation. Data: Time series of the output power of a CO2 laser (Ref. 35).


correct embedding dimension, and m considerably larger (e.g., m=2q or larger). The result is a vector valued time series, and in Ref. 22 the relation of these components to temporal derivatives on the one hand and to Fourier components on the other hand was discussed. If, in the nonautonomous case, one wants to compress flow data to map data, one sets q=1. In this case, the redundancy of the flow is implicitly used for noise reduction of the map data. The routine svd can be used for both purposes.
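The projection described by Eq. (6) can be sketched as follows (svd_filter is our own minimal stand-in for the svd routine; it builds the covariance matrix of all delay vectors, keeps the q leading eigendirections, and reconstructs rank-q filtered vectors):

```python
import numpy as np

def svd_filter(x, m, q):
    """Project m-dimensional delay vectors onto the q leading eigendirections
    of their covariance matrix, Eq. (6), and return the filtered vectors."""
    N = len(x) - m + 1
    X = np.column_stack([x[i : i + N] for i in range(m)])
    mean = X.mean(axis=0)
    C = np.cov(X.T, bias=True)            # the matrix C_ij of Eq. (6)
    evals, evecs = np.linalg.eigh(C)
    P = evecs[:, -q:]                     # q leading directions
    return (X - mean) @ P @ P.T + mean    # rank-q projection in embedding space

rng = np.random.default_rng(1)
t = np.arange(2000)
clean = np.sin(2 * np.pi * t / 40)
noisy = clean + 0.3 * rng.normal(size=t.size)
Xf = svd_filter(noisy, m=20, q=2)
# compare the middle coordinate of the filtered vectors with the clean signal
filtered = Xf[:, 10]
err_before = np.sqrt(np.mean((noisy[10:10 + len(Xf)] - clean[10:10 + len(Xf)])**2))
err_after = np.sqrt(np.mean((filtered - clean[10:10 + len(Xf)])**2))
```

Since a sinusoid spans exactly two directions in delay space, q=2 here keeps the signal while discarding most of the noise variance, illustrating the signal-to-noise argument made above.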

III. VISUALIZATION, NONSTATIONARITY

A. Recurrence plots

Recurrence plots are a useful tool to identify structure in a data set in a time resolved way qualitatively. This can be intermittency (which one detects also by direct inspection), the temporary vicinity of a chaotic trajectory to an unstable periodic orbit, or nonstationarity. They were introduced in Ref. 37 and investigated in much detail in Ref. 38, where you find many hints on how to interpret the results. Our routine recurr simply scans the time series and marks each pair of time indices (i, j) with a black dot whose corresponding pair of delay vectors has distance ≤ ε. Thus in the (i, j)-plane, black dots indicate closeness. In an ergodic situation, the dots should cover the plane uniformly on average, whereas nonstationarity expresses itself by an overall tendency of the dots to be close to the diagonal. Of course, a return to a dynamical situation the system was in before becomes evident by a black region far away from the diagonal. In Fig. 5, a recurrence plot is used to detect transient behavior at the beginning of a longer recording.

For the purpose of stationarity testing, the recurrence plot is not particularly sensitive to the choice of embedding. The contrast of the resulting images can be selected by the distance ε and the percentage of dots that should be actually plotted. Various software involving the color rendering and quantification of recurrence plots is offered in DOS executable form by Webber.40 The interpretation of the often intriguing patterns beyond the detection and study of nonstationarity is still an open question. For suggestions for the study of nonstationary signals see Ref. 3 and references given there.

FIG. 5. The recurrence plot for Poincaré section data from a vibrating string experiment (Ref. 39). Above the diagonal an embedding in two dimensions was used, while below the diagonal scalar time series values were compared. In both cases the lighter shaded region at the beginning of the recording indicates that these data are dynamically distinct from the rest. In this particular case this was due to adjustments in the measurement apparatus.
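The raw content of such a plot is just a Boolean closeness matrix, which can be sketched as follows (recurrence_matrix is our name, not the recurr routine; a slowly drifting test signal illustrates how nonstationarity concentrates the dots near the diagonal):

```python
import numpy as np

def recurrence_matrix(x, m, tau, eps):
    """Boolean matrix R[i, j] = True where delay vectors i and j are closer
    than eps in the maximum norm: the raw content of a recurrence plot."""
    N = len(x) - (m - 1) * tau
    X = np.column_stack([x[i * tau : i * tau + N] for i in range(m)])
    # pairwise maximum-norm distances between all delay vectors
    D = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=2)
    return D <= eps

# nonstationary example: a slow drift pushes recurrences toward the diagonal
rng = np.random.default_rng(2)
drift = np.linspace(0.0, 5.0, 500)
x = drift + 0.2 * rng.normal(size=500)
R = recurrence_matrix(x, m=1, tau=1, eps=0.5)
# fraction of recurrences close to / far from the diagonal
i, j = np.indices(R.shape)
near = R[np.abs(i - j) < 25].mean()
far = R[np.abs(i - j) > 250].mean()
```

For a stationary ergodic signal, near and far would be comparable; the drift makes recurrences far from the diagonal essentially disappear.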

B. Space–time separation plot

While the recurrence plot shows absolute times, the space–time separation plot introduced by Provenzale et al.41

integrates along parallels to the diagonal and thus only shows relative times. One usually draws lines of constant probability per time unit of a point to be an ε-neighbor of the current point, when its time distance is δt. This helps identifying temporal correlations inside the time series and is relevant to estimate a reasonable delay time, and, more importantly, the Theiler window w in dimension and Lyapunov analysis (see Sec. VII). Said in different words, it shows how large the temporal distance between points should be so that we can assume that they form independent samples according to the invariant measure. The corresponding routine of the TISEAN

package is stp; see Fig. 6.
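The quantity behind such a plot can be sketched as follows (space_time_separation is our name, not the stp routine; for each temporal separation δt it extracts quantiles of the spatial distance distribution, i.e., the contour lines of the plot):

```python
import numpy as np

def space_time_separation(x, m, tau, max_dt, probs=(0.1, 0.5, 0.9)):
    """For each temporal separation dt, return the eps values below which a
    given fraction of point pairs lies: the contour lines of the plot."""
    N = len(x) - (m - 1) * tau
    X = np.column_stack([x[i * tau : i * tau + N] for i in range(m)])
    curves = np.empty((max_dt, len(probs)))
    for dt in range(1, max_dt + 1):
        d = np.max(np.abs(X[dt:] - X[:-dt]), axis=1)  # distances at separation dt
        curves[dt - 1] = np.quantile(d, probs)
    return curves

# strongly correlated flow data: a slow sine; temporally close points stay close
t = np.arange(3000)
x = np.sin(2 * np.pi * t / 200)
curves = space_time_separation(x, m=2, tau=1, max_dt=50)
```

The rise of the contour lines with δt mirrors the "clear correlations" visible in Fig. 6; a Theiler window would be chosen beyond the initial rise.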

IV. NONLINEAR PREDICTION

To think about predictability in time series data is worthwhile even if one is not interested in forecasts at all. Predictability is one way in which correlations between data express themselves. These can be linear correlations, nonlinear correlations, or even deterministic constraints. Questions related to those relevant for predictions will reappear with noise reduction and in surrogate data tests, but also for the computation of Lyapunov exponents from data. Prediction is discussed in most of the general nonlinear time series references; in particular, a nice collection of articles can be found in Ref. 17.

A. Model validation

Before entering the methods, we have to discuss how to assess the results. The most obvious quantity for the quantification of predictability is the average forecast error, i.e., the root of the mean squared (rms) deviation of the individual prediction from the actual future value. If it is computed on those values which were also used to construct the model (or to perform the predictions), it is called the in-sample error. It is always advisable to save some data for an out-of-sample

FIG. 6. A space–time separation plot of the CO2 laser data. Shown are lines of constant probability density of a point to be an ε-neighbor of the current point if its temporal distance is δt. Probability densities are 1/10 to 1 with increments of 1/10 from bottom to top. Clear correlations are visible.


test. If the out-of-sample error is considerably larger than the in-sample error, the data are either nonstationary or one has overfitted the data, i.e., the fit extracted structure from random fluctuations. A model with fewer parameters will then serve better. In cases where the data base is poor, one can apply complete cross-validation or take-one-out statistics, i.e., one constructs as many models as one performs forecasts, and in each case ignores the point one wants to predict. By construction, this method is realized in the local approaches, but not in the global ones.

The most significant, but least quantitative, way of model validation is to iterate the model and to compare this synthetic time series to the experimental data. One starts from an observed delay vector as an initial condition and performs a forecast. Its outcome is combined with all but the last components of the initial vector to a new delay vector, and the next forecast is performed. After n > m iterations, the n-th delay vector contains only values generated by the model, and no observations any more. In terms of an n-step prediction, the outcome will be terribly bad, since due to the sensitive dependence on initial conditions even an ideal model will create a diverging trajectory due to inaccuracies in the measurement of the initial condition. However, for the model to be reasonable, the resulting attractor should be as similar to the observed data as possible (e.g., in a delay plot), although it is not easy to define the similarity quantitatively.

B. Simple nonlinear prediction

Conventional linear prediction schemes average over all locations in phase space when they extract the correlations they exploit for predictability. Tong42 promoted an extension that fits different linear models if the current state is below or above a given threshold (TAR, Threshold Autoregressive Model). If we expect more than a slight nonlinear component to be present, it is preferable to make the approximation as local in phase space as possible. There have been many similar suggestions in the literature on how to exploit local structure; see, e.g., Refs. 43–46. The simplest approach is to make the approximation local but only keep the zeroth order, that is, approximate the dynamics locally by a constant. In the TISEAN package we include such a robust and simple method: In a delay embedding space, all neighbors of s_n are sought, if we want to predict the measurements at time n+k. The forecast is then simply

$\hat{s}_{n+k} = \frac{1}{|\mathcal{U}_n|} \sum_{\mathbf{s}_j \in \mathcal{U}_n} s_{j+k},$  (7)

i.e., the average over the "futures" of the neighbors. The average forecast errors obtained with the routine zeroth (predict would give similar results) for the laser output data used in Fig. 4, as a function of the number k of steps ahead the predictions are made, are shown in Fig. 7. One can also iterate the predictions by using the time series as a data base.

Apart from the embedding parameters, all that has to be specified for zeroth order predictions is the size of the neighborhoods. Since the diffusive motion below the noise level cannot be predicted anyway, it makes sense to select neighborhoods which are at least as large as the noise level, maybe two or three times larger. For a fairly clean time series, this guideline may result in neighborhoods with very few points. Therefore zeroth also permits us to specify the minimal number of neighbors on which to base the predictions.
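Equation (7) and the minimal-neighbor safeguard can be sketched in numpy as follows (zeroth_order_predict is our stand-in for the zeroth routine; it enlarges ε whenever the neighborhood is too empty, mimicking the option just described):

```python
import numpy as np

def zeroth_order_predict(x, n, m, eps, k=1, min_neighbors=1):
    """Locally constant forecast of x[n+k], Eq. (7): average the futures of
    all delay vectors within eps (max norm) of the delay vector at time n.
    eps is doubled until at least min_neighbors neighbors are found."""
    X = np.column_stack([x[i : len(x) - m + 1 + i] for i in range(m)])
    target = X[n - m + 1]                    # delay vector ending at x[n]
    cand = X[: len(x) - k - m + 1]           # vectors whose future exists
    while True:
        d = np.max(np.abs(cand - target), axis=1)
        idx = np.flatnonzero(d <= eps)
        idx = idx[idx != n - m + 1]          # exclude the point itself
        if len(idx) >= min_neighbors:
            break
        eps *= 2.0                           # enlarge an empty neighborhood
    return np.mean(x[idx + m - 1 + k])       # average of the neighbors' futures

# deterministic test signal: the logistic map x -> 4x(1-x)
x = np.empty(1000); x[0] = 0.3
for i in range(999):
    x[i + 1] = 4 * x[i] * (1 - x[i])
pred = zeroth_order_predict(x, n=500, m=1, eps=0.01, k=1)
```

For this clean, deterministic series the forecast error is bounded by the neighborhood size times the maximal slope of the map, which is why even this crude scheme predicts well.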

A relevant modification of this method is to extend the neighborhood U to infinity, but to introduce a distance dependent weight,

$\hat{s}_{n+k} = \frac{\sum_{j \neq n} s_{j+k}\, w(|\mathbf{s}_n - \mathbf{s}_j|)}{\sum_{j \neq n} w(|\mathbf{s}_n - \mathbf{s}_j|)},$  (8)

where w is called the kernel. For w(z) = Θ(ε − z), where Θ is the Heaviside step function, we recover Eq. (7).
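A sketch of the kernel predictor of Eq. (8), with the Gaussian kernel that will reappear for upo below (kernel_predict is our own name, not a TISEAN routine):

```python
import numpy as np

def kernel_predict(x, n, m, h, k=1):
    """Kernel forecast, Eq. (8): distance-weighted average of all futures,
    with the Gaussian kernel w(z) = exp(-z^2 / 2h^2)."""
    X = np.column_stack([x[i : len(x) - m + 1 + i] for i in range(m)])
    target = X[n - m + 1]
    cand = X[: len(x) - k - m + 1]
    d = np.linalg.norm(cand - target, axis=1)
    w = np.exp(-d**2 / (2 * h**2))
    w[n - m + 1] = 0.0                       # the sum in Eq. (8) runs over j != n
    futures = x[np.arange(len(cand)) + m - 1 + k]
    return np.sum(w * futures) / np.sum(w)

# same logistic-map test signal as before
x = np.empty(1000); x[0] = 0.3
for i in range(999):
    x[i + 1] = 4 * x[i] * (1 - x[i])
pred = kernel_predict(x, n=500, m=1, h=0.02)
```

Unlike the hard neighborhood of Eq. (7), the resulting predictor is a smooth function of the query point, which is exactly the property exploited in Sec. IV C.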

C. Finding unstable periodic orbits

As an application of simple nonlinear phase space prediction, let us discuss a method to locate unstable periodic orbits embedded in a chaotic attractor. This is not the place to review the existing methods to solve this problem; some references include Refs. 47–50. The TISEAN package contains a routine that implements the requirement that a period p orbit {s_n, n = 1, ..., p} of a dynamical system like Eq. (2) acting on delay vectors satisfies

$\mathbf{s}_{n+1} = \mathbf{f}(\mathbf{s}_n), \quad n = 1, \dots, p, \quad \mathbf{s}_{p+1} \equiv \mathbf{s}_1.$  (9)

With unit delay, the p delay vectors contain p different scalar entries, and Eq. (9) defines a root of a system of p nonlinear equations in p dimensions. Multidimensional root finding is not a simple problem. The standard Newton method has to be augmented by special tricks in order to converge globally. Some such tricks, in particular means to select different solutions of Eq. (9), are implemented in Ref. 50. Similar to the problems encountered in nonlinear noise reduction, solving Eq. (9) exactly is particularly problematic since f(·) is unknown and must be estimated from the data. In Ref. 49, approximate solutions are found by performing just one iteration of the Newton method for each available time series point. We prefer to look for a least squares solution by minimizing

FIG. 7. Predictions k time steps ahead (no iterated predictions) using the program zeroth. Top curve: embedding dimension two is insufficient, since these flow data fill a (2+ε)-dimensional attractor. Second from top: Although embedding dimension four should in theory be a good embedding, τ=1 suppresses structure perpendicular to the diagonal so that the predictions are as bad as in m=2. Lower curves: m=3 and 4 with a delay of about 4–8 time units serve well.


$\sum_{n=1}^{p} \lVert \mathbf{s}_{n+1} - \mathbf{f}(\mathbf{s}_n) \rVert^2, \quad \mathbf{s}_{p+1} \equiv \mathbf{s}_1$  (10)

instead. The routine upo uses a standard Levenberg–Marquardt algorithm to minimize (10). For this it is necessary that f(·) is smooth. Therefore we cannot use the simple nonlinear predictor based on locally constant approximations and we have to use a smooth kernel version, Eq. (8), instead. With w(z) = exp(−z²/2h²), the kernel bandwidth h determines the degree of smoothness of f(·). Trying to start the minimization with all available time series segments will produce a number of false minima, depending on the value of h. These have to be distinguished from the true solutions by inspection. On the other hand, we can reach solutions of Eq. (9) which are not closely visited in the time series at all, an important advantage over close return methods.47

It should be noted that, depending on h, we may always find good minima of (10), even if no solution of Eq. (9), or not even a truly deterministic dynamics, exists. Thus the finding of unstable periodic orbits in itself is not a strong indicator of determinism. We may, however, use the cycle locations or stabilities as a discriminating statistics in a test for nonlinearity; see Sec. VIII. While the orbits themselves are found quite easily, it is surprisingly difficult to obtain reliable estimates of their stability in the presence of noise. In upo, a small perturbation is iterated along the orbit and the unstable eigenvalue is determined by the rate of its separation from the periodic orbit.
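To convey the idea without the Levenberg–Marquardt machinery of upo, the following sketch locates the period-one orbit (the fixed point) of the Hénon map from data alone: the unknown dynamics f(·) is replaced by a Gaussian-kernel estimate as in Eq. (8), and a Newton iteration with a finite-difference Jacobian solves Eq. (9) for p=1. All names are ours, the data are noise-free, and the loose tolerances make this a best-case illustration rather than a robust implementation:

```python
import numpy as np

# Hénon time series in delay form: s_{n+1} = 1 - 1.4 s_n^2 + 0.3 s_{n-1}
N = 5000
s = np.empty(N); s[0] = s[1] = 0.1
for i in range(2, N):
    s[i] = 1.0 - 1.4 * s[i - 1]**2 + 0.3 * s[i - 2]

h = 0.05                                    # kernel bandwidth
past = np.column_stack([s[:-2], s[1:-1]])   # delay vectors (s_{n-1}, s_n)
futu = s[2:]                                # their futures s_{n+1}

def f_hat(v):
    """Kernel estimate of the next delay vector F(v) = (v[1], f(v))."""
    w = np.exp(-np.sum((past - v)**2, axis=1) / (2 * h**2))
    return np.array([v[1], np.sum(w * futu) / np.sum(w)])

def newton_fixed_point(v, steps=30, d=1e-5):
    """Newton iteration for F(v) = v with a finite-difference Jacobian."""
    for _ in range(steps):
        g = f_hat(v) - v
        J = np.empty((2, 2))
        for j in range(2):                   # finite-difference Jacobian of g
            e = np.zeros(2); e[j] = d
            J[:, j] = (f_hat(v + e) - f_hat(v - e)) / (2 * d) - np.eye(2)[:, j]
        v = v - np.linalg.solve(J, g)
    return v

# start from the data point with positive coordinates closest to the diagonal
pos = past[(past[:, 0] > 0) & (past[:, 1] > 0)]
v0 = pos[np.argmin(np.abs(pos[:, 0] - pos[:, 1]))]
v_star = newton_fixed_point(v0.copy())
residual = np.linalg.norm(f_hat(v_star) - v_star)
```

The exact Hénon fixed point lies at x ≈ 0.631; the Newton iteration converges to the fixed point of the estimated map, which is displaced from it only by the kernel bias.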

The user of upo has to specify the embedding dimension, the period (which may also be smaller) and the kernel bandwidth. For efficiency, one may choose to skip trials with very similar points. Orbits are counted as distinct only when they differ by a specified amount. The routine finds the orbits, their expanding eigenvalue, and possible sub-periods. Figure 8 shows the determination of all period six orbits from 1000 iterates of the Hénon map, contaminated by 10% Gaussian white noise.

D. Locally linear prediction

If there is a good reason to assume that the relation s_{n+1} = f(s_n) is fulfilled by the experimental data in good approximation (say, within 5%) for some unknown f and that

FIG. 8. Orbits of period six, or a sub-period thereof, of the Hénon map, determined from noisy data. The Hénon attractor does not have a period three orbit.


f is smooth, predictions can be improved by fitting local linear models. They can be considered as the local Taylor expansion of the unknown f, and are easily determined by minimizing

$\sigma^2 = \sum_{\mathbf{s}_j \in \mathcal{U}_n} \left( s_{j+1} - \mathbf{a}_n \cdot \mathbf{s}_j - b_n \right)^2,$  (11)

with respect to a_n and b_n, where U_n is the ε-neighborhood of s_n, excluding s_n itself, as before. Then the prediction is s_{n+1} = a_n · s_n + b_n. The minimization problem can be solved through a set of coupled linear equations, a standard linear algebra problem. This scheme is implemented in onestep. For moderate noise levels and time series lengths this can give a reasonable improvement over zeroth and predict. Moreover, as discussed in Sec. VI, these linear maps are needed for the computation of the Lyapunov spectrum. A locally linear approximation was introduced in Refs. 45, 46. We should note that the straight least squares solution of Eq. (11) is not always optimal, and a number of strategies are available to regularize the problem if the matrix becomes nearly singular and to remove the bias due to the errors in the "independent" variables. These strategies have in common that any possible improvement is bought with a considerable complication of the procedure, requiring subtle parameter adjustments. We refer the reader to Refs. 51, 52 for advanced material.
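A compact numpy version of the fit of Eq. (11) (local_linear_predict is our simplified stand-in for onestep, without any of the regularization strategies just mentioned):

```python
import numpy as np

def local_linear_predict(x, n, m, eps, k=1):
    """Fit the local linear model of Eq. (11) by least squares over the
    eps-neighborhood of the delay vector at time n, then predict x[n+k]."""
    X = np.column_stack([x[i : len(x) - m + 1 + i] for i in range(m)])
    target = X[n - m + 1]
    cand = X[: len(x) - k - m + 1]
    d = np.max(np.abs(cand - target), axis=1)
    idx = np.flatnonzero((d <= eps) & (np.arange(len(cand)) != n - m + 1))
    A = np.column_stack([cand[idx], np.ones(len(idx))])  # rows [s_j, 1]
    y = x[idx + m - 1 + k]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)         # (a_n, b_n)
    return target @ coef[:-1] + coef[-1]

# logistic-map test signal; locally the quadratic map is nearly linear
x = np.empty(2000); x[0] = 0.3
for i in range(1999):
    x[i + 1] = 4 * x[i] * (1 - x[i])
pred = local_linear_predict(x, n=1000, m=1, eps=0.05)
```

Because the residual error of a local linear fit scales with the curvature of f over the neighborhood, shrinking ε improves the forecast until the noise floor or neighbor scarcity takes over.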

In Fig. 9 we show iterated predictions of the Poincaré map data from the CO2 laser (Fig. 4) in a delay representation (using nstep in two dimensions). The resulting data not only have the correct marginal distribution and power spectrum, but also form a perfect skeleton of the original noisy attractor. There are of course artifacts due to noise and the roughness of this approach, but there are good reasons to assume that the line-like substructure reflects fractality of the unperturbed system.

Casdagli53 suggested the use of local linear models as a test for nonlinearity: He computed the average forecast error as a function of the neighborhood size on which the fit for a_n and b_n is performed. If the optimum occurs at large neighborhood sizes, the data are (in this embedding space) best described by a linear stochastic process, whereas an optimum at rather small sizes supports the idea of the existence of a nonlinear almost deterministic equation of motion. This protocol is implemented in the routine ll-ar; see Fig. 10.

FIG. 9. A time delay representation of 5000 iterations of the local linear predictor nstep in two dimensions, starting from the last delay vector of Fig. 4.


E. Global function fits

The local linear fits are very flexible, but can go wrong on parts of the phase space where the points do not span all available space dimensions and where the inverse of the matrix involved in the solution of the minimization does not exist. Moreover, very often a large set of different linear maps is unsatisfying. Therefore many authors suggested fitting global nonlinear functions to the data, i.e., to solve

$\sigma^2 = \sum_n \left( s_{n+1} - f_{\mathbf{p}}(\mathbf{s}_n) \right)^2,$  (12)

where f_p is now a nonlinear function in closed form with parameters p, with respect to which the minimization is done. Polynomials, radial basis functions, neural nets, orthogonal polynomials, and many other approaches have been used for this purpose. The results depend on how far the chosen ansatz f_p is suited to model the unknown nonlinear function, and on how well the data are deterministic at all. We included the routines rbf and polynom in the TISEAN package, where f_p is modeled by radial basis functions54,55 and polynomials,56 respectively. The advantage of these two models is that the parameters p occur linearly in the function f and can thus be determined by simple linear algebra, and the solution is unique. Both features are lost for models where the parameters enter nonlinearly.

In order to make global nonlinear predictions, one has to supply the embedding dimension and time delay as usual. Further, for polynom the order of the polynomial has to be given. The program returns the coefficients of the model. In rbf one has to specify the number of basis functions to be distributed on the data. The width of the radial basis functions (Lorentzians in our program) is another parameter, but since the minimization is so fast, the program runs many trial values and returns parameters for the best. Figure 11 shows the result of a fit to the CO2 laser time series (Fig. 4) with radial basis functions.
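A sketch of the radial basis function fit of Eq. (12) (rbf_fit is our simplified stand-in for rbf: Lorentzian bumps centered on randomly chosen data points plus a constant, with the linear-in-parameters minimization solved by ordinary least squares; unlike the TISEAN routine, no search over trial widths is made):

```python
import numpy as np

def rbf_fit(x, m, n_centers, width, rng):
    """Global fit of Eq. (12): f_p(s) = sum_c p_c * phi(|s - center_c|) + p_0.
    The parameters enter linearly, so least squares solves the minimization."""
    X = np.column_stack([x[i : len(x) - m + i] for i in range(m)])  # delay vectors
    y = x[m:]                                                       # their futures
    centers = X[rng.choice(len(X), n_centers, replace=False)]
    def design(V):
        D = np.linalg.norm(V[:, None, :] - centers[None, :, :], axis=2)
        Phi = 1.0 / (1.0 + (D / width)**2)        # Lorentzian basis functions
        return np.column_stack([Phi, np.ones(len(V))])
    p, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda V: design(np.atleast_2d(V)) @ p

rng = np.random.default_rng(3)
x = np.empty(1000); x[0] = 0.3
for i in range(999):
    x[i + 1] = 4 * x[i] * (1 - x[i])
model = rbf_fit(x, m=1, n_centers=20, width=0.2, rng=rng)
resid = model(x[:-1, None]) - x[1:]
rms = np.sqrt(np.mean(resid**2))
```

For the smooth logistic nonlinearity a handful of Lorentzians suffices; the in-sample rms error drops far below the data amplitude, and the fitted model can then be iterated as discussed next.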

If global models are desired in order to infer the structure and properties of the underlying system, they should be tested by iterating them. The prediction errors, although

FIG. 10. The Casdagli test for nonlinearity: The rms prediction error of local linear models as a function of the neighborhood size ε. Lower curve: The CO2 laser data. These data are obviously highly deterministic in m=4 dimensions and with lag τ=6. Central curve: The breath rate data shown in Fig. 12 with m=4 and τ=1. Determinism is weaker (presumably due to a much higher noise level), but still the nonlinear structure is dominant. Upper curve: Numerically generated data of an AR(5) process, a linearly correlated random process (m=5, τ=1).


small in size, could be systematic and thus repel the iterated trajectory from the range where the original data are located. It can be useful to study the dependence of the size or the sign of the prediction errors on the position in the embedding space, since systematic errors can be reduced by a different model. Global models are attractive because they yield closed expressions for the full dynamics. One must not forget, however, that these models describe the observed process only in regions of the space which have been visited by the data. Outside this area, the shape of the model depends exclusively on the chosen ansatz. In particular, polynomials diverge outside the range of the data and hence can be unstable under iteration.

V. NONLINEAR NOISE REDUCTION

Filtering of signals from nonlinear systems requires the use of special methods since the usual spectral or other linear filters may interact unfavorably with the nonlinear structure. Irregular signals from nonlinear sources exhibit genuine broad band spectra and there is no justification to identify any continuous component in the spectrum as noise. Nonlinear noise reduction does not rely on frequency information in order to define the distinction between signal and noise. Instead, structure in the reconstructed phase space will be exploited. General serial dependencies among the measurements {s_n} will cause the delay vectors {s_n} to fill the available m-dimensional embedding space in an inhomogeneous way. Linearly correlated Gaussian random variables will for example be distributed according to an anisotropic multivariate Gaussian distribution. Linear geometric filtering in phase space seeks to identify the principal directions of this distribution and project onto them; see Sec. II E. Nonlinear noise reduction takes into account that nonlinear signals will form curved structures in delay space. In particular, noisy deterministic signals form smeared-out lower-dimensional manifolds. Nonlinear phase space filtering seeks to identify such structures and project onto them in order to reduce noise.

There is a rich literature on nonlinear noise reduction methods. Two articles of review character are available; one by Kostelich and Schreiber,57 and one by Davies.58 We refer the reader to these articles for further references and for the discussion of approaches not described in the present article. Here we want to concentrate on two approaches that represent the geometric structure in phase space by a local approximation. The first and simplest does so to constant order; the more sophisticated uses local linear subspaces plus curvature corrections.

FIG. 11. The attractor obtained by iterating the model that has been obtained by a fit with 40 radial basis functions in two dimensions to the time series shown in Fig. 4. Compare also Fig. 9.

A. Simple nonlinear noise reduction

The simplest nonlinear noise reduction algorithm we know of replaces the central coordinate of each embedding vector by the local average of this coordinate. This amounts to a locally constant approximation of the dynamics and is based on the assumption that the dynamics is continuous. The algorithm is described in Ref. 59; a similar approach is proposed in Ref. 43. In an unstable, for example chaotic, system, it is essential not to replace the first and last coordinates of the embedding vectors by local averages. Due to the instability, initial errors in these coordinates are magnified instead of being averaged out.

This noise reduction scheme is implemented quite easily. First an embedding has to be chosen. Except for extremely oversampled data, it is advantageous to choose a short time delay. The program lazy always uses unit delay. The embedding dimension m should be chosen somewhat higher than that required by the embedding theorems. Then for each embedding vector s_n, a neighborhood U_ε(n) is formed in phase space containing all points s_{n'} such that ‖s_n − s_{n'}‖ < ε. The radius of the neighborhoods ε should be taken large enough in order to cover the noise extent, but still smaller than a typical curvature radius. These conditions cannot always be fulfilled simultaneously, in which case one has to repeat the process with several choices and carefully evaluate the results. If the noise level is substantially smaller than the typical radius of curvature, neighborhoods of radius about 2–3 times the noise level gave the best results with artificial data. For each embedding vector s_n = (s_{n−(m−1)}, ..., s_n) (the delay time has been set to unity), a corrected middle coordinate is computed by averaging over the neighborhood U_ε(n):

$\hat{s}_{n-m/2} = \frac{1}{|\mathcal{U}_\varepsilon(n)|} \sum_{\mathbf{s}_{n'} \in \mathcal{U}_\varepsilon(n)} s_{n'-m/2}.$  (13)

After one complete sweep through the time series, all measurements s_n are replaced by the corrected values. Of course, for the first and last (m−1)/2 points (if m is odd), no correction is available. The average correction can be taken as a new neighborhood radius for the next iteration. Note that the neighborhood of each point at least contains the point itself. If that is the only member, the average, Eq. (13), is simply the uncorrected measurement and no change is made. Thus one can safely perform multiple iterations with decreasing values of ε until no further change is made.
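One sweep of Eq. (13) can be sketched as follows (simple_noise_reduction is our stand-in for the lazy routine; a noisy logistic map series serves as artificial data with a known clean reference, so the error before and after can be measured directly):

```python
import numpy as np

def simple_noise_reduction(x, m, eps):
    """One sweep of the locally constant scheme, Eq. (13): replace the middle
    coordinate of each m-dimensional unit-delay embedding vector by its
    average over the eps-neighborhood (max norm). m should be odd."""
    N = len(x) - m + 1
    X = np.column_stack([x[i : i + N] for i in range(m)])
    mid = m // 2
    y = x.copy()
    for n in range(N):
        d = np.max(np.abs(X - X[n]), axis=1)
        nbrs = np.flatnonzero(d <= eps)          # always contains n itself
        y[n + mid] = np.mean(X[nbrs, mid])
    return y

rng = np.random.default_rng(4)
clean = np.empty(2000); clean[0] = 0.3
for i in range(1999):
    clean[i + 1] = 4 * clean[i] * (1 - clean[i])
noisy = clean + 0.02 * rng.normal(size=2000)
cleaned = simple_noise_reduction(noisy, m=7, eps=0.1)
e0 = np.sqrt(np.mean((noisy[3:-3] - clean[3:-3])**2))
e1 = np.sqrt(np.mean((cleaned[3:-3] - clean[3:-3])**2))
```

As stated above, a point whose neighborhood contains only itself is left unchanged, so repeated sweeps with shrinking ε are safe.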

Let us illustrate the use of this scheme with an example, a recording of the air flow through the nose of a human as an indicator of breath activity. (The data is part of data set B of the Santa Fe time series contest held in 1991/92;17 see Rigney et al.60 for a description.) The result of simple nonlinear noise reduction is shown in Fig. 12.


B. Locally projective nonlinear noise reduction

A more sophisticated method makes use of the hypotheses that the measured data is composed of the output of a low-dimensional dynamical system and of random or high-dimensional noise. This means that in an arbitrarily high-dimensional embedding space the deterministic part of the data would lie on a low-dimensional manifold, while the effect of the noise is to spread the data off this manifold. If we suppose that the amplitude of the noise is sufficiently small, we can expect to find the data distributed closely around this manifold. The idea of the projective nonlinear noise reduction scheme is to identify the manifold and to project the data onto it. The strategies described here go back to Ref. 61. A realistic case study is detailed in Ref. 62.

Suppose the dynamical system, Eq. (1) or Eq. (2), forms a q-dimensional manifold M containing the trajectory. According to the embedding theorems, there exists a one-to-one image of the attractor in the embedding space, if the embedding dimension is sufficiently high. Thus, if the measured time series were not corrupted with noise, all the embedding vectors s_n would lie inside another manifold M in the embedding space. Due to the noise, this condition is no longer fulfilled. The idea of the locally projective noise reduction scheme is that for each s_n there exists a correction Θ_n, with ‖Θ_n‖ small, in such a way that s_n − Θ_n ∈ M and that Θ_n is orthogonal on M. Of course, a projection to the manifold can only be a reasonable concept if the vectors are embedded in spaces which are higher dimensional than the manifold M. Thus we have to over-embed in m-dimensional spaces with m > q.

FIG. 12. The simple nonlinear noise reduction of human breath rate data. Three iterations have been carried out, starting with neighborhoods of size 0.4 units. Embeddings in 7 dimensions at unit delay have been used. Arguably, the resulting series (lower panel) is less noisy. However, in Sec. VIII we will show evidence that the noise is not just additive and independent of the signal.


The notion of orthogonality depends on the metric used. Intuitively one would think of using the Euclidean metric. But this is not necessarily the best choice. The reason is that we are working with delay vectors which contain temporal information. Thus, even if the middle parts of two delay vectors are close, the late parts could be far away from each other due to the influence of the positive Lyapunov exponents, while the first parts could diverge due to the negative ones. Hence it is usually desirable to correct only the center part of delay vectors and leave the outer parts mostly unchanged, since their divergence is not only a consequence of the noise, but also of the dynamics itself. It turns out that in most applications it is sufficient to fix just the first and the last component of the delay vectors and correct the rest. This can be expressed in terms of a metric tensor P which we define to be^61

P_{ij} = \begin{cases} 1, & i = j \text{ and } 1 < i,j < m, \\ 0, & \text{elsewhere}, \end{cases}    (14)

where m is the dimension of the "over-embedded" delay vectors.

Thus we have to solve the minimization problem

\sum_i \left( \Theta_i P^{-1} \Theta_i \right) = \min,    (15)

with the constraints

a_n^i \cdot (s_n - \Theta_n) + b_n^i = 0, \quad \text{for } i = q+1, \ldots, m,    (16)

and

a_n^i P a_n^j = \delta_{ij},    (17)

where the a_n^i are the normal vectors of M at the point s_n − Θ_n. These ideas are realized in the programs ghkss, project, and noise in TISEAN. While the first two work as a posteriori filters on complete data sets, the last one can be used in a data stream. This means that it is possible to do the corrections online, while the data is coming in (for more details see Sec. V C). All three algorithms mentioned above correct for curvature effects. This is done either by postprocessing the corrections for the delay vectors (ghkss) or by preprocessing the centers of mass of the local neighborhoods (project).
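To make the geometry concrete, here is a toy sketch in Python of the projective idea: a one-dimensional manifold (a circle) embedded in two dimensions is cleaned by moving each point onto the principal axis of its local neighborhood. This is our own minimal illustration, not the TISEAN implementation; in particular it ignores delay vectors, the metric P, and the curvature corrections just described, and all function names and parameters are ours.

```python
import math, random

def local_projection_denoise(pts, k=40):
    """One iteration of a toy locally projective filter: move each 2D point
    onto the dominant principal axis of its k nearest neighbours, i.e. onto
    the estimated local tangent of a one-dimensional manifold."""
    out = []
    for p in pts:
        # k nearest neighbours of p (brute force)
        nbrs = sorted(pts, key=lambda q: (q[0]-p[0])**2 + (q[1]-p[1])**2)[:k]
        mx = sum(q[0] for q in nbrs)/k
        my = sum(q[1] for q in nbrs)/k
        # local 2x2 covariance matrix [[a, b], [b, c]]
        a = sum((q[0]-mx)**2 for q in nbrs)/k
        b = sum((q[0]-mx)*(q[1]-my) for q in nbrs)/k
        c = sum((q[1]-my)**2 for q in nbrs)/k
        # principal eigenvector, closed form for the symmetric 2x2 case
        lam = 0.5*(a + c) + math.sqrt(0.25*(a - c)**2 + b*b)
        if abs(b) > 1e-12:
            vx, vy = b, lam - a
        else:
            vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
        norm = math.hypot(vx, vy)
        vx, vy = vx/norm, vy/norm
        t = (p[0]-mx)*vx + (p[1]-my)*vy     # tangential coordinate of p
        out.append((mx + t*vx, my + t*vy))  # drop the normal component
    return out

random.seed(1)
clean = [(math.cos(0.01*i), math.sin(0.01*i)) for i in range(628)]
noisy = [(cx + random.gauss(0, 0.05), cy + random.gauss(0, 0.05))
         for cx, cy in clean]

def rms_radius_err(pts):
    return math.sqrt(sum((math.hypot(px, py) - 1.0)**2 for px, py in pts)/len(pts))

print(round(rms_radius_err(noisy), 3),
      round(rms_radius_err(local_projection_denoise(noisy)), 3))
```

The residual error after projection is dominated by exactly the curvature bias of the local center of mass that ghkss and project correct for.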

The idea used in the ghkss program is the following. Suppose the manifold were strictly linear. Then, provided the noise is white, the corrections in the vicinity of a point on the manifold would point in all directions with the same probability. Thus, if we added up all the corrections Θ, we would expect them to sum to zero (or ⟨Θ⟩ = 0). On the other hand, if the manifold is curved, we expect that there is a trend towards the center of curvature (⟨Θ⟩ = Θ_av). Thus, to correct for this trend, each correction Θ is replaced by Θ − Θ_av.

A different strategy is used in the program project. The projections are done in a local coordinate system which is defined by the condition that the average of the vectors in the neighborhood is zero. Or, in other words, the origin of the coordinate system is the center of mass ⟨s_n⟩_U of the neighborhood U. This center of mass has a bias towards


the center of the curvature.^2 Hence, a projection would not lie on the tangent at the manifold, but on a secant. Now we can compute the center of mass of these points in the neighborhood of s_n. Let us call it ⟨⟨s_n⟩⟩_U. Under fairly mild assumptions this point has twice the distance from the manifold as ⟨s_n⟩_U. To correct for the bias, the origin of the local coordinate system is set to the point ⟨⟨s_n⟩⟩_U − 2⟨s_n⟩_U.

The implementation and use of locally projective noise reduction as realized in project and ghkss is described in detail in Refs. 61, 62. Let us recall here the most important parameters that have to be set individually for each time series. The embedding parameters are usually chosen differently from other applications, since considerable over-embedding may lead to better noise averaging. Thus, the delay time is preferably set to unity, and the embedding dimension is chosen to provide embedding windows of reasonable length. Only for highly oversampled data (like the magneto-cardiogram, Fig. 15, at about 1000 samples per cycle) are larger delays necessary, so that a substantial fraction of a cycle can be covered without the need to work in prohibitively high-dimensional spaces. Next, one has to decide how many dimensions q to leave for the manifold supposedly containing the attractor. The answer partly depends on the purpose of the experiment. Rather brisk projections can be optimal in the sense of the lowest residual deviation from the true signal. Low rms error can, however, coexist with systematic distortions of the attractor structure. Thus, for a subsequent dimension calculation, a more conservative choice would be in order. Remember, however, that points are only moved towards, but not onto, the local linear subspace, and too low a value of q does not do as much harm as may be thought.

The noise amplitude to be removed can be selected to some degree by the choice of the neighborhood size. In fact, nonlinear projective filtering can be seen, independently of the dynamical systems background, as filtering by amplitude rather than by frequency or shape. To allow for a clear separation of noise and signal directions locally, neighborhoods should be at least as large as the supposed noise level, rather larger. This of course competes with curvature effects. For small initial noise levels, it is recommended to also specify a minimal number of neighbors in order to permit stable linearizations. Finally, we should remark that in successful cases most of the filtering is done within the first one to three iterations. Going further is potentially dangerous, since further corrections may lead mainly to distortion. One should watch the rms correction in each iteration and stop as soon as it does not decrease substantially any more.

As an example for nonlinear noise reduction we treat the data obtained from an NMR laser experiment.^63 Enlargements of two-dimensional delay representations of the data are shown in Fig. 13. The upper panel shows the raw experimental data, which contains about 1.1% of noise. The lower panel was produced by applying three iterations of the noise reduction scheme. The embedding dimension was m = 7; the vectors were projected down to two dimensions. The size of the local neighborhoods was chosen such that at least 50 neighbors were found. One clearly sees that the fractal structure of the attractor is resolved fairly well.


The main assumption for this algorithm to work is that the data is well approximated by a low-dimensional manifold. If this is not the case, it is unpredictable what results are created by the algorithm. In the absence of a real manifold, the algorithm must pick up statistical fluctuations and spuriously interprets them as structure. Figure 14 shows a result of the ghkss program for pure Gaussian noise. The upper panel shows a delay representation of the original data; the lower shows the outcome of applying the algorithm for 10 iterations. The structure created is purely artificial and has nothing to do with structures in the original data. This means that

FIG. 13. A two-dimensional representation of the NMR laser data (top) and the result of the ghkss algorithm (bottom) after three iterations.

FIG. 14. A two-dimensional representation of a pure Gaussian process (top) and the outcome of the ghkss algorithm (bottom) after 10 iterations. Projections from m = 7 down to two dimensions were performed.


if one wants to apply one of the algorithms, one has to carefully study the results. If the assumptions underlying the algorithms are not fulfilled, in principle anything can happen. One should note, however, that the performance of the program itself indicates such spurious behavior. For data which is indeed well approximated by a lower-dimensional manifold, the average corrections applied should rapidly decrease with each successful iteration. This was the case with the NMR laser data, and, in fact, the correction was so small after three iterations that we stopped the procedure. For the white noise data, the correction only decreased at a rate that corresponds to a general shrinking of the point set, indicating a lack of convergence towards a genuine low-dimensional manifold. Below, we will give an example where an approximating manifold is present without pure determinism. In that case, projecting onto the manifold does reduce noise in a reasonable way. See Ref. 64 for material on the dangers of geometric filtering.

C. Nonlinear noise reduction in a data stream

In Ref. 65, a number of modifications of the above procedure have been discussed which enable the use of nonlinear projective filtering in a data stream. In this case, only points in the past are available for the formation of neighborhoods. Therefore the neighbor search strategy has to be modified. Since the algorithm is described in detail in Ref. 65, we only give an example of its use here. Figure 15 shows the result of nonlinear noise reduction on a magneto-cardiogram (see Figs. 1 and 3) with the program noise. The same program has also been used successfully for the extraction of the fetal ECG.^66

VI. LYAPUNOV EXPONENTS

Chaos arises from the exponential growth of infinitesimal perturbations, together with global folding mechanisms to guarantee boundedness of the solutions. This exponential instability is characterized by the spectrum of Lyapunov exponents.^67 If one assumes a local decomposition of the

FIG. 15. The real time nonlinear projective filtering of a magneto-cardiogram time series. The top panel shows the unfiltered data. Bottom: two iterations were done using projections from m = 10 down to q = 2 dimensions (delay 0.01 s). Neighborhoods were limited to a radius of 0.1 units (0.05 in the second iteration) and to maximally 200 points. Neighbors were only sought up to 5 s back in time. Thus the first 5 s of data are not filtered optimally and are not shown here. Since the output of each iteration lags behind its input by one delay window, the last 0.2 s cannot be processed given the data in the upper panel.


phase space into directions with different stretching or contraction rates, then the spectrum of exponents is the proper average of these local rates over the whole invariant set, and thus consists of as many exponents as there are space directions. The most prominent problem in time series analysis is that the physical phase space is unknown, and that instead the spectrum is computed in some embedding space. Thus the number of exponents depends on the reconstruction, and might be larger than in the physical phase space. Such additional exponents are called spurious, and there are several suggestions to either avoid them^68 or to identify them. Moreover, it is plausible that only as many exponents can be determined from a time series as are entering the Kaplan–Yorke formula (see below). To give a simple example: consider the motion of a high-dimensional system on a stable limit cycle. The data cannot contain any information about the stability of this orbit against perturbations, as long as they are exactly on the limit cycle. For transients, the situation can be different, but then the data are not distributed according to an invariant measure and the numerical values are thus difficult to interpret. Apart from these difficulties, there is one relevant positive feature: Lyapunov exponents are invariant under smooth transformations and are thus independent of the measurement function or the embedding procedure. They carry the dimension of an inverse time and have to be normalized to the sampling interval.

A. The maximal exponent

The maximal Lyapunov exponent can be determined without the explicit construction of a model for the time series. A reliable characterization requires that the independence of embedding parameters and the exponential law for the growth of distances are checked^{69,70} explicitly. Consider the representation of the time series data as a trajectory in the embedding space, and assume that you observe a very close return s_{n'} to a previously visited point s_n. Then one can consider the distance Δ_0 = s_n − s_{n'} as a small perturbation, which should grow exponentially in time. Its future can be read from the time series: Δ_l = s_{n+l} − s_{n'+l}. If one finds that |Δ_l| ≈ Δ_0 e^{λl}, then λ is (with probability one) the maximal Lyapunov exponent. In practice, there will be fluctuations because of many effects, which are discussed in detail in Ref. 69. Based on this understanding, one can derive a robust, consistent and unbiased estimator for the maximal Lyapunov exponent. One computes

S(\epsilon, m, t) = \left\langle \ln \left( \frac{1}{|\mathcal{U}_n|} \sum_{s_{n'} \in \mathcal{U}_n} |s_{n+t} - s_{n'+t}| \right) \right\rangle_n .    (18)

If S(ε, m, t) exhibits a linear increase with identical slope for all m larger than some m_0 and for a reasonable range of ε, then this slope can be taken as an estimate of the maximal exponent λ_1.
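The stretching statistic of Eq. (18) can be sketched in a few lines of Python. This is a deliberately slow, brute-force stand-in for lyap_k, run on logistic map data where the maximal exponent is known to be ln 2 ≈ 0.693; the function and all parameter choices are ours.

```python
import math

def lyap_S(x, m=2, tau=1, eps=0.05, tmax=6, theiler=10):
    """S(t) of Eq. (18): for reference points s_n with neighbourhoods U_n of
    radius eps (max norm on delay vectors), average over n the log of the
    mean distance of the scalar observables t steps later."""
    M = len(x) - (m - 1)*tau - tmax        # usable delay vectors
    sums = [0.0]*(tmax + 1)
    refs = 0
    for n in range(0, M, 7):               # subsample the reference points
        U = [k for k in range(M)
             if abs(k - n) > theiler and   # exclude serially correlated pairs
             max(abs(x[k + j*tau] - x[n + j*tau]) for j in range(m)) < eps]
        if len(U) < 5:                     # skip sparsely populated regions
            continue
        refs += 1
        for t in range(tmax + 1):
            d = sum(abs(x[k + (m-1)*tau + t] - x[n + (m-1)*tau + t])
                    for k in U)/len(U)
            sums[t] += math.log(d)
    return [s/refs for s in sums]

# logistic map at r = 4; its maximal Lyapunov exponent is ln 2 = 0.693...
x, v = [], 0.4
for _ in range(2000):
    v = 4.0*v*(1.0 - v)
    x.append(v)
S = lyap_S(x)
slope = (S[2] - S[0])/2     # slope of the initial, roughly linear part
print(round(slope, 2))
```

With longer series and a smaller ε the slope approaches ln 2 more closely before the curve saturates at the attractor size.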

The formula is implemented in the routines lyap_k and lyapunov in a straightforward way. (The program lyap_r implements the very similar algorithm of Ref. 70, where only the closest neighbor is followed for each reference point. Also, the Euclidean norm is used.) Apart from parameters characterizing the embedding, the initial neighborhood size ε is of relevance: the smaller ε, the larger the linear range of S, if there is one. Obviously, noise and the finite number of data points limit ε from below. The default values of lyap_k are rather reasonable for map-like data. It is not always necessary to extend the average in Eq. (18) over the whole available data; reasonable averages can be obtained already with a few hundred reference points s_n. If some of the reference points have very few neighbors, the corresponding inner sum in Eq. (18) is dominated by fluctuations. Therefore one may choose to exclude those reference points which have less than, say, ten neighbors. However, discretion has to be applied with this parameter, since it may introduce a bias against sparsely populated regions. This could in theory affect the estimated exponents due to multifractality. Like other quantities, Lyapunov estimates may be affected by serial correlations between reference points and neighbors. Therefore, a minimum time for |n − n'| can and should be specified here as well. See also Sec. VII.

Let us discuss a few typical outcomes. The data underlying the top panel of Fig. 16 are the values of the maxima of the CO2 laser data. Since this laser exhibits low-dimensional chaos with a reasonable noise level, we observe a clear linear increase in this semi-logarithmic plot, reflecting the exponential divergence of nearby trajectories. The exponent is λ ≈ 0.38 per iteration (map data!), or, when introducing the average time interval, 0.007 per μs. In the bottom panel we show the result for the same system, but now computed on the original flow-like data with a sampling rate of 1 MHz. As an additional structure, an initial steep increase and regular oscillations are visible. The initial increase is due to non-normality and effects of alignment of distances towards the locally most unstable direction, and the oscillations are an effect of the locally different velocities and thus different densities. Both effects can be much more dramatic in

FIG. 16. Estimating the maximal Lyapunov exponent of the CO2 laser data. The top panel shows results for the Poincaré map data, where the average time interval T_av is 52.2 samples of the flow, and the straight line indicates λ = 0.38. For comparison: the iteration of the radial basis function model of Fig. 11 yields λ = 0.35. Bottom panel: Lyapunov exponents determined directly from the flow data. The straight line has slope λ = 0.007. In good approximation, λ_map = λ_flow T_av. Here, the time window w to suppress correlated neighbors has been set to 1000, and the delay time was 6 units.


less favorable cases, but as long as the regular oscillations possess a linearly increasing average, this can be taken as the estimate of the Lyapunov exponent. Normalizing by the sampling rate, we again find λ ≈ 0.007 per μs, but it is obvious that the linearity is less pronounced than for the map-like data. Finally, we show in Fig. 17 an example of a negative result: we study the human breath rate data used before. No linear part exists, and one cannot draw any reasonable conclusion. It is worth considering the figure on a doubly logarithmic scale in order to detect a power law behavior, which, with power 1/2, could be present for a diffusive growth of distances. In this particular example, there is no convincing power law either.

B. The Lyapunov spectrum

The computation of the full Lyapunov spectrum requires considerably more effort than just the maximal exponent. An essential ingredient is some estimate of the local Jacobians, i.e., of the linearized dynamics, which rules the growth of infinitesimal perturbations. One either finds it from direct fits of local linear models of the type s_{n+1} = a_n s_n + b_n, such that the first row of the Jacobian is the vector a_n, and (J)_{ij} = δ_{i−1,j} for i = 2, …, m, where m is the embedding dimension. The a_n is given by the least squares minimization σ² = Σ_l (s_{l+1} − a_n s_l − b_n)², where {s_l} is the set of neighbors of s_n.^{45,71} Or one constructs a global nonlinear model and computes its local Jacobians by taking derivatives. In both cases, one multiplies the Jacobians one by one, following the trajectory, to as many different vectors u_k in tangent space as one wants to compute Lyapunov exponents. Every few steps, one applies a Gram–Schmidt orthonormalization procedure to the set of u_k and accumulates the logarithms of their rescaling factors. Their average, in the order of the Gram–Schmidt procedure, gives the Lyapunov exponents in descending order. The routine lyap_spec uses this method, which goes back to Refs. 71 and 45, employing local linear fits. Apart from the problem of spurious exponents, this method contains some other pitfalls: it assumes that well defined Jacobians exist, and it does not test for their relevance. In particular, when attractors are thin in the embedding space, some (or all) of the local Jacobians might be estimated very badly. Then the whole product can suffer from these bad estimates, and the exponents are correspondingly wrong. Thus the global nonlinear approach can be superior, if a modeling has been successful; see Sec. IV.
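The Jacobian multiplication and re-orthonormalization can be sketched for a case where the Jacobians are known analytically, here the Hénon map; this sidesteps the local linear fits that lyap_spec must perform on measured data. The sketch and its names are ours.

```python
import math

def henon_spectrum(n_iter=20000, a=1.4, b=0.3):
    """Both Lyapunov exponents of the Henon map (x, y) -> (1 - a x^2 + b y, x):
    multiply the analytic Jacobian J = [[-2 a x, b], [1, 0]] onto two tangent
    vectors along the trajectory and re-orthonormalize them (Gram-Schmidt)
    at every step, accumulating the logs of the rescaling factors."""
    x, y = 0.1, 0.1
    u1, u2 = (1.0, 0.0), (0.0, 1.0)
    s1 = s2 = 0.0
    for _ in range(n_iter):
        jx = -2.0*a*x
        v1 = (jx*u1[0] + b*u1[1], u1[0])   # J applied to u1
        v2 = (jx*u2[0] + b*u2[1], u2[0])   # J applied to u2
        x, y = 1.0 - a*x*x + b*y, x        # iterate the map itself
        n1 = math.hypot(*v1)               # Gram-Schmidt: normalize v1, ...
        u1 = (v1[0]/n1, v1[1]/n1)
        proj = v2[0]*u1[0] + v2[1]*u1[1]   # ...orthogonalize v2 against it
        w2 = (v2[0] - proj*u1[0], v2[1] - proj*u1[1])
        n2 = math.hypot(*w2)
        u2 = (w2[0]/n2, w2[1]/n2)
        s1 += math.log(n1)
        s2 += math.log(n2)
    return s1/n_iter, s2/n_iter

l1, l2 = henon_spectrum()
print(round(l1, 3), round(l2, 3))
```

A useful consistency check: since |det J| = b at every point, the two exponents must sum to ln b exactly, up to floating point error.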


FIG. 17. The breath rate data (cf. Fig. 12) exhibit no linear increase, reflecting the lack of exponential divergence of nearby trajectories.


In Table I we show the exponents of the stroboscopic NMR laser data in a three-dimensional embedding as a function of the neighborhood size. Using global nonlinear models, we find the numbers given in the last two rows. More material is discussed in Ref. 2. The spread of values in the table for this rather clean data set reflects the difficulty of estimating Lyapunov spectra from time series, which has to be done with great care. In particular, when the algorithm is blindly applied to data from a random process, it cannot internally check for the consistency of the assumption of an underlying dynamical system. Therefore a Lyapunov spectrum is computed which now is completely meaningless.

The computation of the first part of the Lyapunov spectrum allows for some interesting cross-checks. It was conjectured,^72 and is found to be correct in most physical situations, that the Lyapunov spectrum and the fractal dimension of an attractor are closely related. If the expanding and least contracting directions in space are continuously filled and only one partial dimension is fractal, then one can ask for the dimensionality of a (fractal) volume such that it is invariant, i.e., such that the sum of the corresponding Lyapunov exponents vanishes, where the last one is weighted with the noninteger part of the dimension:

D_{\mathrm{KY}} = k + \frac{\sum_{i=1}^{k} \lambda_i}{|\lambda_{k+1}|},    (19)

where k is the maximum integer such that the sum of the k largest exponents is still non-negative. D_KY is conjectured to coincide with the information dimension.

The Pesin identity is valid under the same assumptions and allows us to compute the KS entropy:

h_{\mathrm{KS}} = \sum_{i=1}^{m} \Theta(\lambda_i)\, \lambda_i .    (20)
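Equations (19) and (20) are straightforward to evaluate once a spectrum is at hand. A small helper (function names are ours), fed with the local linear (k = 40) exponents of Table I:

```python
def kaplan_yorke(exponents):
    """Eq. (19): D_KY = k + sum_{i<=k} lambda_i / |lambda_{k+1}|, where k is
    the largest integer such that the sum of the k largest exponents is
    still non-negative."""
    lams = sorted(exponents, reverse=True)
    total, k = 0.0, 0
    while k < len(lams) and total + lams[k] >= 0:
        total += lams[k]
        k += 1
    if k == len(lams):       # the sum never turns negative: volume preserved
        return float(len(lams))
    return k + total/abs(lams[k])

def pesin_entropy(exponents):
    """Eq. (20): h_KS is the sum of the positive Lyapunov exponents."""
    return sum(l for l in exponents if l > 0)

spec = [0.30, -0.51, -1.21]          # local linear, k = 40, from Table I
print(kaplan_yorke(spec), pesin_entropy(spec))
```

For this spectrum, D_KY = 1 + 0.30/0.51 ≈ 1.59 and h_KS = 0.30 per stroboscopic sampling interval.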

VII. DIMENSIONS AND ENTROPIES

Solutions of dissipative dynamical systems cannot fill a volume of the phase space, since dissipation is synonymous with a contraction of volume elements under the action of the equations of motion. Instead, trajectories are confined to lower-dimensional subsets which have measure zero in the phase space. These subsets can be extremely complicated, and frequently they possess a fractal structure, which means that they are in a nontrivial way self-similar. Generalized dimensions are one class of quantities to characterize this fractality. The Hausdorff dimension is, from the mathematical point of view, the most natural concept to characterize

TABLE I. Lyapunov exponents of the NMR laser data, determined with a three-dimensional embedding. The algorithms described in Sec. VI A give λ1 = 0.3 ± 0.02 for the largest exponent.

Method                      λ1       λ2       λ3

Local linear, k = 20        0.32    −0.40    −1.13
Local linear, k = 40        0.30    −0.51    −1.21
Local linear, k = 160       0.28    −0.68    −1.31
Radial basis functions      0.27    −0.64    −1.31
Polynomial                  0.27    −0.64    −1.15


fractal sets,^67 whereas the information dimension takes into account the relative visitation frequencies and is therefore more attractive for physical systems. Finally, for the characterization of measured data, other similar concepts, like the correlation dimension, are more useful. One general remark is highly relevant in order to understand the limitations of any numerical approach: dimensions characterize a set or an invariant measure whose support is the set, whereas any data set contains only a finite number of points representing the set or the measure. By definition, the dimension of a finite set of points is zero. When we determine the dimension of an attractor numerically, we extrapolate from finite length scales, where the statistics we apply is insensitive to the finiteness of the number of data, to the infinitesimal scales, where the concept of dimensions is defined. This extrapolation can fail for many reasons which will be partly discussed below. Dimensions are invariant under smooth transformations and thus again computable in time delay embedding spaces.

Entropies are an information theoretical concept to characterize the amount of information needed to predict the next measurement with a certain precision. The most popular one is the Kolmogorov–Sinai entropy. We will discuss here only the correlation entropy, which can be computed in a much more robust way. The occurrence of entropies in a section on dimensions has to do with the fact that they can both be determined by the same statistical tool.

A. Correlation dimension

Roughly speaking, the idea behind certain quantifiers of dimensions is that the weight p(ε) of a typical ε-ball covering part of the invariant set scales with its diameter like p(ε) ≈ ε^D, where the value for D depends also on the precise way one defines the weight. Using the square of the probability p_i to find a point of the set inside the ball, the dimension is called the correlation dimension D_2, which is computed most efficiently by the correlation sum:^73

C(m, \epsilon) = \frac{1}{N_{\text{pairs}}} \sum_{j=m}^{N} \; \sum_{k < j - w} \Theta(\epsilon - |s_j - s_k|),    (21)

where the s_i are m-dimensional delay vectors, N_pairs = (N − m − w)(N − m − w + 1)/2 is the number of pairs of points covered by the sums, Θ is the Heaviside step function, and w will be discussed below. On sufficiently small length scales, and when the embedding dimension m exceeds the correlation dimension of the attractor,^74

C(m, \epsilon) \propto \epsilon^{D_2} .    (22)

Since one does not know the correlation dimension before doing this computation, one checks for convergence of the estimated values of D_2 in m.
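A direct transcription of the correlation sum, Eq. (21), including the Theiler window, can look as follows. This brute-force O(N²) version is only our sketch of what c2 computes far more efficiently; it is checked on white noise, where D_2 equals the embedding dimension m.

```python
import math, random

def corr_sum(x, m, eps, tau=1, w=10):
    """Correlation sum C(m, eps) of Eq. (21): the fraction of pairs of
    m-dimensional delay vectors closer than eps (max norm), where pairs
    whose time indices differ by less than the Theiler window w are
    excluded from the count."""
    M = len(x) - (m - 1)*tau           # number of delay vectors
    npairs = count = 0
    for j in range(M):
        for k in range(j - w):         # only k < j - w survives the window
            npairs += 1
            if max(abs(x[j + i*tau] - x[k + i*tau]) for i in range(m)) < eps:
                count += 1
    return count/npairs

random.seed(2)
x = [random.random() for _ in range(800)]   # white noise: D2 = m
slopes = {}
for m in (1, 2):
    c_small, c_large = corr_sum(x, m, 0.1), corr_sum(x, m, 0.2)
    # local slope of log C versus log eps between the two scales
    slopes[m] = math.log(c_large/c_small)/math.log(2.0)
print({m: round(d, 2) for m, d in slopes.items()})
```

The slopes fall slightly below m because at these ε values the finite extent of the unit interval already bends the scaling, the large-scale effect discussed below for Fig. 18.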

The literature on the correct and spurious estimation of the correlation dimension is huge, and this is certainly not the place to repeat all the arguments. The relevant caveats and misconceptions are reviewed, for example, in Refs. 75, 76, 2. The most prominent precaution is to exclude temporally correlated points from the pair counting by the so-called Theiler window w.^75 In order to become a consistent estimator of the correlation integral (from which the dimension is derived), the correlation sum should cover a random sample of points drawn independently according to the invariant measure on the attractor. Successive elements of a time series are not usually independent. In particular, for highly sampled flow data, subsequent delay vectors are highly correlated. Theiler suggested to remove this spurious effect by simply ignoring all pairs of points in Eq. (21) whose time indices differ by less than w, where w should be chosen generously. With O(N²) pairs available, the loss of O(wN) pairs is not dramatic as long as w ≪ N. At the very least, pairs with j = k have to be excluded,^77 since otherwise the strong bias to D_2 = 0, the mathematically correct value for a finite set of points, reduces the scaling range drastically. Choosing for w the first zero of the auto-correlation function, or sometimes even the decay time of the auto-correlation function, is not enough, since these reflect only overall linear correlations.^{75,76} The space–time-separation plot (Sec. III B) provides a good means of determining a sufficient value of w, as discussed, for example, in Refs. 41, 2. In some cases, notably processes with inverse power law spectra, inspection requires w to be of the order of the length of the time series. This indicates that the data does not sample an invariant attractor sufficiently, and the estimation of invariants like D_2

or Lyapunov exponents should be abandoned.

Parameters in the routines d2, c2, and c2naive are, as usual, the embedding parameters m and τ (the embedding dimension and the time delay), as well as the Theiler window w.

Fast implementations of the correlation sum have been proposed by several authors. At small length scales, the computation of pairs can be done in O(N log N) or even O(N) time rather than O(N²) without losing any of the precious pairs; see Ref. 20. However, for intermediate size data sets we also need the correlation sum at intermediate length scales, where neighbor searching becomes expensive. Many authors have tried to limit the use of computational resources by restricting one of the sums in Eq. (21) to a fraction of the available points. By this practice, however, one loses valuable statistics at the small length scales, where points are so scarce anyway that all pairs are needed for stable results. In Ref. 62, both approaches were combined for the first time by using a fast neighbor search for ε < ε_0 and restricting the sum for ε ≥ ε_0. The TISEAN implementations c2 and d2 go one step further and select the range for the sums individually for each length scale to be processed. This turns out to give a major improvement in speed. The user can specify a desired number of pairs which seems large enough for a stable estimation of C(ε); typically 1000 pairs will suffice. Then the sums are extended to a range which guarantees that number of pairs, or, if this cannot be achieved, to the whole time series. At the largest length scales, this range may be rather small, and the user may choose to give a minimal number of reference points to ensure a representative average. Still, using the program c2, the whole computation may thus at large scales be concentrated on the first part of the time series, which seems fair for stationary, nonintermittent data (nonstationary or strongly intermittent data is usually unsuitable for correlation dimension estimation anyway). The program d2 is safer in this respect: rather than restricting the range of the sums, only a randomly selected subset is used. The randomization, however, requires a more sophisticated program structure in order to avoid an overhead in computation time.

1. Takens–Theiler estimator

Convergence to a finite correlation dimension can be checked by plotting scale-dependent "effective dimensions" versus length scale for various embeddings. The easiest way to proceed is to compute (numerically) the derivative of log C(m, ε) with respect to log ε, for example, by fitting straight lines to the log–log plot of C(ε). In Fig. 18(a) we see the output of the routine c2 acting on data from the NMR laser, processed by c2d in order to obtain local slopes. By default, straight lines are fitted over one octave in ε; larger ranges give smoother results. We can see that on

FIG. 18. The dimension estimation for the (noise filtered) NMR laser data. Embedding dimensions 2 to 7 are shown. From above: (a) slopes are determined by straight line fits to the log–log plot of the correlation sum, Eq. (21). (b) The Takens–Theiler estimator of the same slope. (c) Slopes are obtained by straight line fits to the Gaussian kernel correlation sum, Eq. (25). (d) Instead of the correlation dimension, it has been attempted to estimate the information dimension.


large scales, self-similarity is broken due to the finite extension of the attractor, and on small but yet statistically significant scales we see the embedding dimension instead of a saturated, m-independent value. This is the effect of noise, which is infinite dimensional and thus fills a volume in every embedding space. Only on the intermediate scales do we see the desired plateau, where the results are in good approximation independent of m and ε. The region where scaling is established, not just the range selected for straight line fitting, is called the scaling range.

Since the statistical fluctuations in plots like Fig. 18(a) show characteristic (anti-)correlations, it has been suggested^{78,79} to apply a maximum likelihood estimator to obtain optimal values for D_2. The Takens–Theiler estimator reads

$D_{\mathrm{TT}}(\epsilon) = \frac{C(\epsilon)}{\int_0^{\epsilon} \frac{C(\epsilon')}{\epsilon'}\, d\epsilon'}$,  (23)

and can be obtained by processing the output of c2 with c2t. Since C(ε) is available only at discrete values {ε_i, i = 0, ..., I}, we interpolate it by a pure power law [or, equivalently, the log-log plot by straight lines: log C(ε) = a_i log ε + b_i] in between these. The resulting integrals can be solved trivially and summed:

$\int_0^{\epsilon} \frac{C(\epsilon')}{\epsilon'}\, d\epsilon' = \sum_{i=1}^{I} e^{b_i} \int_{\epsilon_{i-1}}^{\epsilon_i} (\epsilon')^{a_i - 1}\, d\epsilon' = \sum_{i=1}^{I} \frac{e^{b_i}}{a_i} \left( \epsilon_i^{a_i} - \epsilon_{i-1}^{a_i} \right)$.  (24)

Plotting D_TT vs ε [Fig. 18(b)] is an interesting alternative to the usual local slopes plot, Fig. 18(a). It is tempting to use such an "estimator of dimension" as a black box to provide a number one might quote as a dimension. This would imply the unjustified assumption that all deviations from exact scaling behavior are due to statistical fluctuations. Instead, one still has to verify the existence of a scaling regime. Only then is D_TT(ε), evaluated at the upper end of the scaling range, a reasonable dimension estimator.
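The piecewise power-law integration of Eq. (24) can be sketched as follows (a minimal sketch of ours, not the c2t source; function and variable names are our own, and the first power law is extrapolated down to ε = 0):

```python
import math

def takens_theiler(eps, c):
    """Takens-Theiler estimates D_TT(eps_i) = C(eps_i) / int_0^{eps_i} C(e')/e' de',
    with C interpolated by pure power laws between sample points, cf. Eq. (24)."""
    integral = 0.0
    estimates = []
    for i in range(1, len(eps)):
        # local power law C(e) = exp(b) * e**a on [eps_{i-1}, eps_i]
        a = (math.log(c[i]) - math.log(c[i - 1])) / (math.log(eps[i]) - math.log(eps[i - 1]))
        b = math.log(c[i]) - a * math.log(eps[i])
        if i == 1:
            # extrapolate the first power law down to eps = 0
            integral = math.exp(b) / a * eps[0] ** a
            estimates.append((eps[0], c[0] / integral))
        integral += math.exp(b) / a * (eps[i] ** a - eps[i - 1] ** a)
        estimates.append((eps[i], c[i] / integral))
    return estimates

# for an exact power law C(eps) = eps**2 the estimator returns 2 at every scale
eps = [0.1 * 1.3 ** k for k in range(10)]
d_tt = takens_theiler(eps, [e ** 2 for e in eps])
```

For real correlation sums, the resulting D_TT(ε) curve would then be inspected for a plateau, as stressed above.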

2. Gaussian kernel correlation integral

The correlation sum, Eq. (21), can be regarded as an average density of points where the local density is obtained by a kernel estimator with a step kernel Θ(ε − r). A natural modification for small point sets is to replace the sharp step kernel by a smooth kernel function of bandwidth ε. A particularly attractive case that has been studied in the literature [80] is given by the Gaussian kernel, that is, Θ(ε − r) is replaced by e^{−r²/4ε²}.

The resulting Gaussian kernel correlation sum C_G(ε) has the same scaling properties as the usual C(ε). It has been observed in Ref. 3 that C_G(ε) can be obtained from C(ε) via

$C_{\mathrm{G}}(\epsilon) = \frac{1}{2\epsilon^2} \int_0^{\infty} d\tilde{\epsilon}\; e^{-\tilde{\epsilon}^2/4\epsilon^2}\, \tilde{\epsilon}\, C(\tilde{\epsilon})$,  (25)

without having to repeat the whole computation. If C(ε) is given at discrete values of ε, the integrals in Eq. (25) can be


carried out numerically by interpolating C(ε) with pure power laws. This is done in c2g, which uses a 15 point Gauss-Kronrod rule for the numerical integration.

B. Information dimension

Another way of attaching weight to ε-balls, which is more natural, is the probability p_i itself. The resulting scaling exponent is called the information dimension D_1. Since the Kaplan-Yorke dimension of Sec. VI is an approximation of D_1, the computation of D_1 through scaling properties is a relevant cross-check for highly deterministic data. D_1 can be computed from a modified correlation sum, where, however, unpleasant systematic errors occur. The fixed mass approach [81] circumvents these problems, so that, including finite sample corrections [77], a rather robust estimator exists. Instead of counting the number of points in a ball, one asks here for the diameter ε which a ball must have to contain a certain number k of points when a time series of length N is given. Its scaling with k and N yields the dimension in the limit of small length scales by

$D_1(m) = \lim_{k/N \to 0} \frac{d\, \log k/N}{d\, \langle \log \epsilon(k/N) \rangle}$.  (26)

The routine c1 computes the (geometric) mean length scale exp⟨log ε(k/N)⟩ for which k neighbors are found in N data points, as a function of k/N. Unlike the correlation sum, finite sample corrections are necessary if k is small [77]. Essentially, the log of k has to be replaced by the digamma function ψ(k). The resulting expression is implemented in c1. Given m and τ, the routine varies k and N such that the largest reasonable range of k/N is covered with moderate computational effort. This means that for 1/N ≤ k/N ≤ K/N (default: K = 100), all N available points are searched for neighbors and k is varied. For K/N < k/N ≤ 1, k = K is kept fixed and N is decreased. The result for the NMR laser data is shown in Fig. 18(d), where a nice scaling with D_1 ≈ 1.35 can be discerned. For comparability, the logarithmic derivative of k/N is plotted versus exp⟨log ε(k/N)⟩ and not vice versa, although k/N is the independent variable. One easily detects again the violations of scaling discussed before: cutoff on the large scales, noise on small scales, fluctuations on even smaller scales, and a scaling range in between. In this example, D_1 is close to D_2, and multifractality cannot be positively established.
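The fixed mass idea can be illustrated by a brute force sketch (ours, not the c1 implementation; the digamma finite sample correction mentioned above is deliberately omitted, and all names are our own):

```python
import math, random

def mean_log_eps(data, k):
    """<log eps(k/N)>: average log-distance to the k-th nearest neighbor,
    brute force in one dimension (no digamma finite-sample correction)."""
    logs = []
    for i, x in enumerate(data):
        dists = sorted(abs(x - y) for j, y in enumerate(data) if j != i)
        logs.append(math.log(dists[k - 1]))
    return sum(logs) / len(logs)

random.seed(1)
data = [random.random() for _ in range(400)]  # uniform noise, so D1 = 1
n = len(data)
k1, k2 = 8, 32
# local slope of log(k/N) versus <log eps>, cf. Eq. (26)
d1 = (math.log(k2 / n) - math.log(k1 / n)) / (mean_log_eps(data, k2) - mean_log_eps(data, k1))
```

For uniform random numbers the slope comes out close to one; the digamma correction matters precisely when k is as small as used here.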

C. Entropy estimates

The correlation dimension characterizes the ε dependence of the correlation sum inside the scaling range. It is natural to ask what we can learn from its m-dependence, once m is larger than D_0. The number of ε-neighbors of a delay vector is an estimate of the local probability density, and, in fact, it is a kind of joint probability: all m components of the neighbor have to be similar to those of the actual vector simultaneously. Thus when increasing m,


joint probabilities covering larger time spans get involved. The scaling of these joint probabilities is related to the correlation entropy h_2, such that

$C(m, \epsilon) \approx \epsilon^{D_2} e^{-m h_2}$.  (27)

As for the scaling in ε, the dependence on m is also valid only asymptotically for large m, which one will not reach due to the lack of data points. So one will study h_2(m) vs m and try to extrapolate to large m. The correlation entropy is a lower bound of the Kolmogorov-Sinai entropy, which in turn can be estimated by the sum of the positive Lyapunov exponents. The program d2 produces the estimates of h_2 directly as output; from the other correlation sum programs they have to be extracted by post-processing the output.

The entropies of first and second order can be derived from the output of c1 and c2, respectively. An alternate means of obtaining these and the other generalized entropies is a box counting approach. Let p_i be the probability to find the system state in box i; then the order q entropy is defined by the limit of small box size and large m of

$\sum_i p_i^q \approx e^{-m h_q}$.  (28)

To evaluate Σ_i p_i^q over a fine mesh of boxes in m ≫ 1 dimensions, economical use of memory is necessary: a simple histogram would take (1/ε)^m storage. Therefore the program boxcount implements the mesh of boxes as a tree with (1/ε)-fold branching points. The tree is worked through recursively so that at each instance at most one complete branch exists in storage. The current version does not implement finite sample corrections to Eq. (28).

VIII. TESTING FOR NONLINEARITY

Most of the methods and quantities discussed so far are most appropriate in cases where the data show strong and consistent nonlinear deterministic signatures. As soon as more than a small or at most moderate amount of additive noise is present, scaling behavior will be broken and predictability will be limited. Thus we have explored the opposite extreme, nonlinear and fully deterministic, rather than the classical linear stochastic processes. The bulk of real world time series falls in neither of these limiting categories because they reflect nonlinear responses and effectively stochastic components at the same time. Little can be done for many of these cases with current methods. Often it will be advisable to take advantage of the well founded machinery of spectral methods and venture into nonlinear territory only if encouraged by positive evidence. This section is about methods to establish statistical evidence for nonlinearity beyond a simple rescaling in a time series.

A. The concept of surrogate data

The degree of nonlinearity can be measured in several ways. But how much nonlinear predictability, say, is necessary to exclude more trivial explanations? All quantifiers of nonlinearity show fluctuations but the distributions, or error bars if you wish, are not available analytically. It is therefore necessary to use Monte Carlo techniques to assess the


significance of the results. One important method in this context is the method of surrogate data [82]. A null hypothesis is formulated, for example, that the data have been created by a stationary Gaussian linear process, and then one attempts to reject this hypothesis by comparing results for the data to appropriate realizations of the null hypothesis. Since the null assumption is not a simple one but leaves room for free parameters, the Monte Carlo sample has to take these into account. One approach is to construct constrained realizations of the null hypothesis. The idea is that the free parameters left by the null are reflected by specific properties of the data. For example, the unknown coefficients of an autoregressive process are reflected in the autocorrelation function. Constrained realizations are obtained by randomizing the data subject to the constraint that an appropriate set of parameters remains fixed. For example, random data with a given periodogram can be made by assuming random phases and taking the inverse Fourier transform of the given periodogram. Random data with the same distribution as a given data set can be generated by permuting the data randomly without replacement. Asking for a given spectrum and a given distribution at the same time already poses a much more difficult question.
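The phase randomization recipe (random phases, inverse Fourier transform of the given periodogram) can be sketched as follows (our minimal sketch; a naive O(N²) DFT stands in for the FFT used in practice, and the names are our own):

```python
import cmath, math, random

def dft(x, sign):
    # naive O(N^2) discrete Fourier transform, for clarity only
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def phase_surrogate(data):
    """Random data with the periodogram of `data`: keep the Fourier amplitudes,
    draw random phases, enforce Hermitian symmetry, transform back."""
    n = len(data)
    spec = dft(data, -1)
    out = [0j] * n
    out[0] = spec[0]
    for j in range(1, n // 2 + 1):
        phi = random.uniform(0.0, 2.0 * math.pi)
        out[j] = abs(spec[j]) * cmath.exp(1j * phi)
        out[n - j] = out[j].conjugate()   # keeps the inverse transform real
    if n % 2 == 0:
        out[n // 2] = abs(spec[n // 2])   # the Nyquist bin must stay real
    return [z.real / n for z in dft(out, +1)]

random.seed(3)
data = [random.random() for _ in range(16)]
surr = phase_surrogate(data)
```

By construction the surrogate has exactly the periodogram of the data, while its single time distribution is not controlled; that is precisely the limitation addressed by the amplitude adjusted schemes below.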

B. Iterative Fourier transform method

Very few real time series which are suspected to show nonlinearity follow a Gaussian single time distribution. Non-Gaussianity is the simplest kind of nonlinear signature but it may have a trivial reason: the data may have been distorted in the measurement process. Thus a possible null hypothesis would be that there is a stationary Gaussian linear stochastic process that generates a sequence {x_n}, but the actual observations are s_n = s(x_n), where s(·) is a monotonic function. Constrained realizations of this null hypothesis would require the generation of random sequences with the same power spectrum (fully specifying the linear process) and the same single time distribution (specifying the effect of the measurement function) as the observed data. The Amplitude Adjusted Fourier Transform (AAFT) method proposed in Ref. 82 attempts to invert the measurement function s(·) by rescaling the data to a Gaussian distribution. Then the Fourier phases are randomized and the rescaling is inverted. As discussed in Ref. 83, this procedure is biased towards a flatter spectrum since the inverse of s(·) is not available exactly. In the same reference, a scheme is introduced that removes this bias by iteratively adjusting the spectrum and the distribution of the surrogates. Alternatingly, the surrogates are rescaled to the exact values taken by the data and then the Fourier transform is brought to the exact amplitudes obtained from the data. The discrepancy between both steps either converges to zero with the number of iterations or to a finite inaccuracy which decreases with the length of the time series. The program surrogates performs iterations until no further improvement can be made. The last two stages are returned, one having the exact Fourier amplitudes and one taking on the same values as the data. For not too exotic data these two versions should be almost identical. The relative discrepancy is also printed.
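The alternation between the two steps can be sketched as follows (a sketch of ours, not the surrogates program: a fixed iteration count replaces its convergence test, the DFT is naive, and the names are our own):

```python
import cmath, math, random

def dft(x, sign):
    # naive O(N^2) discrete Fourier transform, for clarity only
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def iterative_surrogate(data, iterations=20):
    """Alternately impose the exact Fourier amplitudes of the data and the exact
    distribution of its values, starting from a random shuffle (iterative scheme
    of Ref. 83, sketched)."""
    n = len(data)
    amplitudes = [abs(z) for z in dft(data, -1)]
    sorted_vals = sorted(data)
    s = data[:]
    random.shuffle(s)
    for _ in range(iterations):
        # step 1: bring the Fourier amplitudes to the exact values of the data
        spec = dft(s, -1)
        spec = [a * (z / abs(z)) if abs(z) > 0 else a for a, z in zip(amplitudes, spec)]
        s = [z.real / n for z in dft(spec, +1)]
        # step 2: rescale to the exact values taken by the data (rank ordering)
        order = sorted(range(n), key=lambda i: s[i])
        for rank, idx in enumerate(order):
            s[idx] = sorted_vals[rank]
    return s

random.seed(4)
data = [random.gauss(0.0, 1.0) for _ in range(32)]
surr = iterative_surrogate(data)
```

Because the rescaling step is performed last, the returned surrogate takes exactly the values of the data, while its spectrum is only approximately matched; the real program returns both end stages.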


In Fig. 19 we used this procedure to assess the hypothesis that the noise reduction on the breath data reported in Fig. 12 removed an additive noise component which was independent of the signal. If the hypothesis were true, we could equally well add back on the noise sequence or a randomized version of it which lacks any correlations to the signal. In the upper panel of Fig. 19 we show the original data. In the lower panel we took the noise reduced version (cf. Fig. 12, bottom) and added a surrogate of the supposed noise sequence. The result is similar to, but still significantly different from, the original, which makes the additivity assumption implausible.

Fourier based randomization schemes suffer from some caveats due to the inherent assumption that the data constitute one period of a periodic signal, which is not what we really expect. The possible artifacts are discussed, for example, in Ref. 84 and can, in summary, lead to spurious rejection of the null hypothesis. One precaution that should be taken when using surrogates is to make sure that the beginning and the end of the data approximately match in value and phase. Then the periodicity assumption is not too far wrong and harmless. Usually, this amounts to the loss of a few points of the series. One should note, however, that the routine may truncate the data by a few points itself in order to be able to perform a fast Fourier transform, which requires the number of points to be factorizable by small prime factors.

C. General constrained randomization

In Ref. 85 a general method has been proposed to create random data which fulfill specified constraints. With this method, the artifacts and remaining imprecision of the Fourier based randomization schemes can be avoided by specifying the autocorrelation function rather than the Fourier transform. The former does not assume periodic continuation. Maybe more importantly, the restriction to a rather narrow null hypothesis can be relaxed since, in principle, arbitrary statistical observables can be imposed on the surrogates. A desired property of the data has to be formulated in terms of a cost function which assumes an absolute minimum when the property is fulfilled. States arbitrarily close to this minimal cost can be reached by the method of simulated annealing. The cost function is minimized among all possible permutations of the data. See Ref. 85 for a description of the approach.

FIG. 19. Upper: the human breath rate data from Fig. 12. Lower: the noise component extracted by the noise reduction scheme has been randomized in order to destroy correlations with the signal. The result appears slightly but significantly less structured than the original.

The TISEAN package contains the building blocks for a library of surrogate data routines implementing user specified cost functions. Currently, only the autocorrelation function with and without periodic continuation has been implemented. Further, a template is given from which the user may derive her/his own routines. A module is provided that drives the simulated annealing process through an exponential cooling scheme. The user may replace this module by another scheme of her/his choice. A module that performs random pair permutations is given which allows us to exclude a list of points from the permutation scheme. More sophisticated permutation schemes can be substituted if desired. Most importantly, the cost function has to be given as another module. The autocorrelation modules use max_{τ=1,...,τmax} |C(τ) − C(τ)_data|/τ, where C(τ) is the autocorrelation function with or without periodic continuation.

In Fig. 20 we show an example fulfilling the null hypothesis of a rescaled stationary Gaussian linear stochastic process which has been contaminated by an artifact at samples 200-220. The Fourier based schemes are unable to implement the artifact part of the null hypothesis. They spread the structure given by the artifact evenly over the whole time span, resulting in more spikes and less predictability. In fact, the null hypothesis of a stationary rescaled Gaussian linear stochastic process can be rejected at the 95% level of significance using nonlinear prediction errors. The artifact would spuriously be mistaken for nonlinearity. With the program randomize_auto_exp_random, we can


FIG. 20. Upper trace: data from a stationary Gaussian linear stochastic process (x_n = 0.7 x_{n−1} + η_n) measured by s(x_n) = x_n³. Samples 200-220 are an artifact. With the Fourier based scheme (middle trace) the artifact results in an increased number of spikes in the surrogates and reduced predictability. In the lower trace, the artifact has been preserved along with the distribution of values and lags 1,...,25 of the autocorrelation function.


exclude the artifact from the randomization scheme and obtain a correct test.

As an example of a more exotic cost function, let us show the randomization of 500 iterates of the Hénon map, Fig. 21(a). Panel (b) shows the output of surrogates having the same spectrum and distribution. Starting from a random permutation (c), the cost function

$C = \langle x_{n-1} x_n \rangle + \langle x_{n-2} x_n \rangle + \langle x_{n-1}^2 x_n \rangle + \langle x_{n-1} x_n^2 \rangle + \langle x_{n-2}^2 x_n \rangle + \langle x_{n-2} x_{n-1} x_n \rangle + \langle x_{n-1}^2 x_n^2 \rangle + \langle x_{n-1} x_n^3 \rangle + \langle x_{n-1}^3 x_n \rangle$,  (29)

is minimized (randomize_generic_exp_random). It involves all the higher order autocorrelations which would be needed for a least squares fit with the ansatz x_n = c − a x²_{n−1} + b x_{n−2} and in this sense fully specifies the quadratic structure of the data. The random shuffle yields C = 2400; panels (d)-(f) correspond to C = 150, 15, 0.002, respectively.

Since the annealing process can be very CPU time consuming, it is important to provide an efficient code for the cost function. Specifying τmax lags for N data points requires O(N τmax) multiplications for the calculation of the cost function. An update after a pair has been exchanged, however, can be obtained with O(τmax) multiplications. Often, the full sum or supremum can be truncated since after the first terms it is clear that a large increase of the cost is unavoidable. The driving Metropolis algorithm provides the current maximal permissible cost for that purpose.

The computation time required to reach the desired accuracy depends on the choice and implementation of the cost

FIG. 21. Randomization of 500 points generated by the Hénon map. (a) Original data; (b) the same autocorrelations and distribution; (c)-(f) different stages of annealing with a cost function C involving three- and four-point correlations. (c) A random shuffle, C = 2400; (d) C = 150; (e) C = 15; (f) C = 0.002. See the text.


function, but also critically on the annealing schedule. There is a vast literature on simulated annealing which cannot be reviewed here. Experimentation with cooling schemes should keep in mind the basic concept of simulated annealing. At each stage, the system (here, the surrogate to be created) is kept at a certain "temperature." Like in thermodynamics, the temperature determines how likely fluctuations around the mean energy (here, the value of the cost function C) are. At temperature T, a deviation of size ΔC occurs with the Boltzmann probability ∝ exp(−ΔC/T). In a Metropolis simulation, this is achieved by accepting all downhill changes (ΔC < 0), but uphill changes only with probability exp(−ΔC/T). Here the changes are permutations of two randomly selected data items. The present implementation offers an exponential cooling scheme, that is, the temperature is lowered by a fixed factor whenever one of two conditions is fulfilled: either a specified number of changes has been tried, or a specified number of changes has been accepted. Both these numbers and the cooling factor can be chosen by the user. If the state is cooled too fast it gets stuck, or "freezes," in a false minimum. When this happens, the system must be "melted" again and cooling is taken up at a slower rate. This can be done automatically until a goal accuracy is reached. It is, however, difficult to predict how many steps it will take. The detailed behavior of the scheme is still subject to ongoing research, and in all but the simplest cases experimentation by the user will be necessary. To facilitate the supervision of the cooling, the current state is written to a file whenever a substantial improvement has been made. Further, the verbosity of the diagnostic output can be selected.
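The bare bones of such a Metropolis scheme with exponential cooling can be sketched as follows (our simplification, not the randomize program: the cost is the lag-one autocorrelation mismatch only, the schedule parameters are arbitrary choices of ours, and there is no automatic remelting):

```python
import math, random

def lag1(v):
    """Lag-one autocorrelation (non-periodic)."""
    n = len(v)
    m = sum(v) / n
    var = sum((u - m) ** 2 for u in v) / n
    return sum((v[i] - m) * (v[i + 1] - m) for i in range(n - 1)) / ((n - 1) * var)

def anneal(data, temp=1.0, cooling=0.9, tries=150, stages=60):
    """Constrained randomization sketch: random pair permutations, Metropolis
    acceptance exp(-dC/T), temperature lowered by a fixed factor per stage."""
    target = lag1(data)
    x = data[:]
    random.shuffle(x)
    c = abs(lag1(x) - target)
    for _ in range(stages):
        for _ in range(tries):
            i, j = random.randrange(len(x)), random.randrange(len(x))
            x[i], x[j] = x[j], x[i]          # propose a pair permutation
            c_new = abs(lag1(x) - target)
            if c_new <= c or random.random() < math.exp((c - c_new) / temp):
                c = c_new                    # accept: downhill always, uphill with Boltzmann prob.
            else:
                x[i], x[j] = x[j], x[i]      # reject: undo the swap
        temp *= cooling                      # exponential cooling
    return x, c

random.seed(6)
data = [math.sin(0.4 * i) for i in range(80)]
surrogate, final_cost = anneal(data)
```

Recomputing the full cost after every swap, as done here for clarity, is exactly the O(N τmax) expense that the incremental O(τmax) update discussed above avoids.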

D. Measuring weak nonlinearity

When testing for nonlinearity, we would like to use quantifiers that are optimized for the weak nonlinearity limit, which is not what most time series methods of chaos theory have been designed for. The simple nonlinear prediction scheme (Sec. IV B) has proven quite useful in this context. If used as a comparative statistic, it should be noted that sometimes seemingly inadequate embeddings or neighborhood sizes may lead to rather big errors which have, however, small fluctuations. The tradeoff between bias and variance may be different from the situation where predictions are desired per se. The same rationale applies to quantities derived from the correlation sum. Neither the small scale limit, genuine scaling, nor the Theiler correction are formally necessary in a comparative test. However, any temptation to interpret the results in terms like "complexity" or "dimensionality" should be resisted, even though "complexity" does not seem to have an agreed-upon meaning anyway. Apart from average prediction errors, we have found the stabilities of short periodic orbits (see Sec. IV C) useful for the detection of nonlinearity in surrogate data tests. As an alternative to the phase space based methods, more traditional measures of nonlinearity derived from higher order autocorrelation functions (Ref. 86, routine autocor3) may also be considered. If a time-reversal asymmetry is present, its statistical confirmation (routine timerev) is a very powerful


detector of nonlinearity [87]. Some measures of weak nonlinearity are compared systematically in Ref. 88.
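A simple statistic of this kind is the skewness of the increments (a common choice which we sketch here; the exact normalization used by the routine timerev may differ, and the names are our own):

```python
def time_asymmetry(x, tau=1):
    """Time-reversal asymmetry statistic
    <(x_n - x_{n-tau})^3> / <(x_n - x_{n-tau})^2>:
    zero for any time-reversible process, nonzero signals nonlinearity."""
    d = [x[i] - x[i - tau] for i in range(tau, len(x))]
    return sum(v ** 3 for v in d) / sum(v ** 2 for v in d)

# a sawtooth (slow rise, fast fall) is maximally time-asymmetric,
# while a triangle wave is time-reversible
saw = [float(i % 10) for i in range(200)]
tri = [float(abs(i % 10 - 5)) for i in range(200)]
```

In a surrogate data test this number, computed for the data and for each surrogate, plays the role of the discriminating statistic.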

IX. CONCLUSION AND PERSPECTIVES

The TISEAN project makes available a number of algorithms of nonlinear time series analysis to people interested in applications of the dynamical systems approach. To make proper use of these algorithms, it is not essential to have written the programs from scratch, an effort we intend to spare the user by making TISEAN public. Indispensable, however, is a good knowledge of what the programs do, and why they do what they do. The latter requires a thorough background in the nonlinear time series approach which cannot be provided by this paper, but rather by textbooks [2,10], reviews [3,11,12], and the original literature [9]. Here, we have concentrated on the actual implementation as it is realized in TISEAN and on examples of the concrete use of the programs.

Let us finish the discussion by giving some perspectives on future work. So far, the TISEAN project has concentrated on the most common situation of a single time series. While for multiple measurements of a similar nature most programs can be modified with moderate effort, a general framework for heterogeneous multivariate recordings (say, blood pressure and heart beat) has not been established so far in a nonlinear context. Nevertheless, we feel that concepts like generalized synchrony, coherence, or information flow are well worth pursuing and at some point should become available to a wider community, including applied research.

Initial experience with nonlinear time series methods indicates that some of the concepts may prove useful enough in the future to become part of the established time series tool box. For this to happen, availability of the algorithms and reliable information on their use will be essential. The publication of a substantial collection of research level programs through the TISEAN project may be seen as one step in that direction. However, the potential user will still need considerable experience in order to make the right decisions: about the suitability of a particular method for a specific time series, about the selection of parameters, about the interpretation of the results. To some extent, these decisions could be guided by software that evaluates the data situation and the results automatically. Previous experience with black box dimension or Lyapunov estimators has not been encouraging, but for some specific problems, "optimal" answers can, in principle, be defined and computed automatically, once an optimality criterion is formulated. For example, the prediction programs could be encapsulated in a framework that automatically evaluates the performance for a range of embedding parameters, etc. Of course, quantitative assessment of the results is not always easy to implement and depends on the purpose of the study. As another example, it seems realistic to define "optimal" Poincaré surfaces of section and to find the optimal solutions numerically.

Like in most of the time series literature, the issue of stationarity has entered the discussion only as something the lack of which has to be detected in order to avoid spurious results. Taking this point seriously amounts to rejecting a


substantial fraction of time series problems, including the most prominent examples, that is, most data from finance, meteorology, and biology. It is quite clear that the mere rejection of these challenging problems is not satisfactory, and we will have to develop tools to actually analyze, understand, and predict nonstationary data. Some suggestions have been made for the detection of fluctuating control parameters [89-92]. Most of these can be seen as continuous versions of the classification problem, another application which is not properly represented in TISEAN yet.

Publishing software, or reviews and textbooks for that matter, in a field evolving as rapidly as nonlinear time series analysis will always have the character of a snapshot of the state at a given time. Having the options either to wait until the field has saturated sufficiently or to risk that programs, or statements made, will become obsolete soon, we chose the second option. We hope that we can thus contribute to the further evolution of the field.

ACKNOWLEDGMENTS

We wish to thank Eckehard Olbrich, Marcus Richter, and Andreas Schmitz, who have made contributions to the TISEAN project, and the users who patiently coped with early versions of the software, in particular, Ulrich Hermes. We thank Leci Flepp, Nick Tufillaro, Riccardo Meucci, and Marco Ciofini for letting us use their time series data. This work was supported by the SFB 237 of the Deutsche Forschungsgemeinschaft.

1. The TISEAN software package is publicly available at http://www.mpipks-dresden.mpg.de/~tisean. The distribution includes an online documentation system.

2. H. Kantz and T. Schreiber, Nonlinear Time Series Analysis (Cambridge University Press, Cambridge, 1997).

3T. Schreiber, ‘‘Interdisciplinary application of nonlinear time series meods,’’ Phys. Rep.308, 1 ~1998!.

4. D. Kaplan and L. Glass, Understanding Nonlinear Dynamics (Springer-Verlag, New York, 1995).

5. E. Ott, Chaos in Dynamical Systems (Cambridge University Press, Cambridge, 1993).

6. P. Bergé, Y. Pomeau, and C. Vidal, Order Within Chaos: Towards a Deterministic Approach to Turbulence (Wiley, New York, 1986).

7. H.-G. Schuster, Deterministic Chaos: An Introduction (Physik-Verlag, Weinheim, 1988).

8. A. Katok and B. Hasselblatt, Introduction to the Modern Theory of Dynamical Systems (Cambridge University Press, Cambridge, 1996).

9. E. Ott, T. Sauer, and J. A. Yorke, Coping with Chaos (Wiley, New York, 1994).

10. H. D. I. Abarbanel, Analysis of Observed Chaotic Data (Springer-Verlag, New York, 1996).

11. P. Grassberger, T. Schreiber, and C. Schaffrath, "Non-linear time sequence analysis," Int. J. Bifurcation Chaos Appl. Sci. Eng. 1, 521 (1991).

12. H. D. I. Abarbanel, R. Brown, J. J. Sidorowich, and L. Sh. Tsimring, "The analysis of observed chaotic data in physical systems," Rev. Mod. Phys. 65, 1331 (1993).

13. D. Kugiumtzis, B. Lillekjendlie, and N. Christophersen, "Chaotic time series I," Modeling Identification Control 15, 205 (1994).

14. D. Kugiumtzis, B. Lillekjendlie, and N. Christophersen, "Chaotic time series II," Modeling Identification Control 15, 225 (1994).

15. G. Mayer-Kress, in Dimensions and Entropies in Chaotic Systems (Springer-Verlag, Berlin, 1986).

16. M. Casdagli and S. Eubank, in "Nonlinear Modeling and Forecasting," Santa Fe Institute Studies in the Science of Complexity, Proceedings Vol. XII (Addison-Wesley, Reading, MA, 1992).

17. A. S. Weigend and N. A. Gershenfeld, in "Time Series Prediction: Forecasting the Future and Understanding the Past," Santa Fe Institute Studies in the Science of Complexity, Proceedings Vol. XV (Addison-Wesley, Reading, MA, 1993).

18. J. Belair, L. Glass, U. an der Heiden, and J. Milton, in Dynamical Disease (AIP, Woodbury, NY, 1995).

19. H. Kantz, J. Kurths, and G. Mayer-Kress, in Nonlinear Analysis of Physiological Data (Springer-Verlag, Berlin, 1998).

20. T. Schreiber, "Efficient neighbor searching in nonlinear time series analysis," Int. J. Bifurcation Chaos Appl. Sci. Eng. 5, 349 (1995).

21. F. Takens, Detecting Strange Attractors in Turbulence, Lecture Notes in Mathematics Vol. 898 (Springer-Verlag, New York, 1981).

22. T. Sauer, J. Yorke, and M. Casdagli, "Embedology," J. Stat. Phys. 65, 579 (1991).

23. M. Richter and T. Schreiber, "Phase space embedding of electrocardiograms," Phys. Rev. E 58, 6392 (1998).

24. M. Casdagli, S. Eubank, J. D. Farmer, and J. Gibson, "State space reconstruction in the presence of noise," Physica D 51, 52 (1991).

25. A. M. Fraser and H. L. Swinney, "Independent coordinates for strange attractors from mutual information," Phys. Rev. A 33, 1134 (1986).

26. B. Pompe, "Measuring statistical dependences in a time series," J. Stat. Phys. 73, 587 (1993).

27. M. Palus, "Testing for nonlinearity using redundancies: Quantitative and qualitative aspects," Physica D 80, 186 (1995).

28. M. B. Kennel, R. Brown, and H. D. I. Abarbanel, "Determining embedding dimension for phase-space reconstruction using a geometrical construction," Phys. Rev. A 45, 3403 (1992).

29. http://hpux.cs.utah.edu/hppd/hpux/Physics/embedding-26.May.93

30. http://www.zweb.com/apnonlin/

31. I. T. Jolliffe, Principal Component Analysis (Springer-Verlag, New York, 1986).

32. D. Broomhead and G. P. King, "Extracting qualitative dynamics from experimental data," Physica D 20, 217 (1986).

33. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes, 2nd ed. (Cambridge University Press, Cambridge, 1992).

34. R. Vautard, P. Yiou, and M. Ghil, "Singular-spectrum analysis: A toolkit for short, noisy chaotic signals," Physica D 58, 95 (1992).

35. A. Varone, A. Politi, and M. Ciofini, "CO2 laser with feedback," Phys. Rev. A 52, 3176 (1995).

36. R. Hegger and H. Kantz, "Embedding of sequences of time intervals," Europhys. Lett. 38, 267 (1997).

37. J. P. Eckmann, S. Oliffson Kamphorst, and D. Ruelle, "Recurrence plots of dynamical systems," Europhys. Lett. 4, 973 (1987).

38. M. Casdagli, "Recurrence plots revisited," Physica D 108, 206 (1997).

39. N. B. Tufillaro, P. Wyckoff, R. Brown, T. Schreiber, and T. Molteno, "Topological time series analysis of a string experiment and its synchronized model," Phys. Rev. E 51, 164 (1995).

40. http://homepages.luc.edu/~cwebber/

41. A. Provenzale, L. A. Smith, R. Vio, and G. Murante, "Distinguishing between low-dimensional dynamics and randomness in measured time series," Physica D 58, 31 (1992).

42. H. Tong, Threshold Models in Non-Linear Time Series Analysis, Lecture Notes in Statistics Vol. 21 (Springer-Verlag, New York, 1983).

43. A. Pikovsky, "Discrete-time dynamic noise filtering," Sov. J. Commun. Technol. Electron. 31, 81 (1986).

44. G. Sugihara and R. May, "Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series," Nature (London) 344, 734 (1990); reprinted in Ref. 9.

45. J.-P. Eckmann, S. Oliffson Kamphorst, D. Ruelle, and S. Ciliberto, "Lyapunov exponents from a time series," Phys. Rev. A 34, 4971 (1986); reprinted in Ref. 9.

46. J. D. Farmer and J. Sidorowich, "Predicting chaotic time series," Phys. Rev. Lett. 59, 845 (1987); reprinted in Ref. 9.

47. D. Auerbach, P. Cvitanović, J.-P. Eckmann, G. Gunaratne, and I. Procaccia, "Exploring chaotic motion through periodic orbits," Phys. Rev. Lett. 58, 2387 (1987).

48. O. Biham and W. Wenzel, "Characterization of unstable periodic orbits in chaotic attractors and repellers," Phys. Rev. Lett. 63, 819 (1989).

49. P. So, E. Ott, S. J. Schiff, D. T. Kaplan, T. Sauer, and C. Grebogi, "Detecting unstable periodic orbits in chaotic experimental data," Phys. Rev. Lett. 76, 4705 (1996).

50. P. Schmelcher and F. K. Diakonos, "A general approach to the finding of unstable periodic orbits in chaotic dynamical systems," Phys. Rev. E 57, 2739 (1998).

51. D. Kugiumtzis, O. C. Lingjærde, and N. Christophersen, "Regularized local linear prediction of chaotic time series," Physica D 112, 344 (1998).

52. L. Jaeger and H. Kantz, "Unbiased reconstruction of the dynamics underlying a noisy chaotic time series," Chaos 6, 440 (1996).

53. M. Casdagli, "Chaos and deterministic versus stochastic nonlinear modelling," J. R. Statist. Soc. B 54, 303 (1991).

54. D. Broomhead and D. Lowe, "Multivariable functional interpolation and adaptive networks," Complex Syst. 2, 321 (1988).

55. L. A. Smith, "Identification and prediction of low dimensional dynamics," Physica D 58, 50 (1992).

56. M. Casdagli, "Nonlinear prediction of chaotic time series," Physica D 35, 335 (1989); reprinted in Ref. 9.

57. E. J. Kostelich and T. Schreiber, "Noise reduction in chaotic time series data: A survey of common methods," Phys. Rev. E 48, 1752 (1993).

58. M. E. Davies, "Noise reduction schemes for chaotic time series," Physica D 79, 174 (1994).

59. T. Schreiber, "Extremely simple nonlinear noise reduction method," Phys. Rev. E 47, 2401 (1993).

60. D. R. Rigney, A. L. Goldberger, W. Ocasio, Y. Ichimaru, G. B. Moody, and R. Mark, "Multi-channel physiological data: Description and analysis (Data set B)," in Ref. 17.

61. P. Grassberger, R. Hegger, H. Kantz, C. Schaffrath, and T. Schreiber, "On noise reduction methods for chaotic data," Chaos 3, 127 (1993); reprinted in Ref. 9.

62. H. Kantz, T. Schreiber, I. Hoffmann, T. Buzug, G. Pfister, L. G. Flepp, J. Simonet, R. Badii, and E. Brun, "Nonlinear noise reduction: A case study on experimental data," Phys. Rev. E 48, 1529 (1993).

63. M. Finardi, L. Flepp, J. Parisi, R. Holzner, R. Badii, and E. Brun, "Topological and metric analysis of heteroclinic crises in laser chaos," Phys. Rev. Lett. 68, 2989 (1992).

64. A. I. Mees and K. Judd, "Dangers of geometric filtering," Physica D 68, 427 (1993).

65. T. Schreiber and M. Richter, "Nonlinear projective filtering in a data stream," Wuppertal preprint, 1998.

66. M. Richter, T. Schreiber, and D. T. Kaplan, "Fetal ECG extraction with nonlinear phase space projections," IEEE Trans. Biomed. Eng. 45, 133 (1998).

67. J.-P. Eckmann and D. Ruelle, "Ergodic theory of chaos and strange attractors," Rev. Mod. Phys. 57, 617 (1985).

68. R. Stoop and J. Parisi, "Calculation of Lyapunov exponents avoiding spurious elements," Physica D 50, 89 (1991).

69. H. Kantz, "A robust method to estimate the maximal Lyapunov exponent of a time series," Phys. Lett. A 185, 77 (1994).

70. M. T. Rosenstein, J. J. Collins, and C. J. De Luca, "A practical method for calculating largest Lyapunov exponents from small data sets," Physica D 65, 117 (1993).

71. M. Sano and Y. Sawada, "Measurement of the Lyapunov spectrum from a chaotic time series," Phys. Rev. Lett. 55, 1082 (1985).

72. J. Kaplan and J. Yorke, "Chaotic behavior of multidimensional difference equations," in Functional Differential Equations and Approximation of Fixed Points, edited by H. O. Peitgen and H. O. Walther (Springer-Verlag, New York, 1987).

73. P. Grassberger and I. Procaccia, Physica D 9, 189 (1983).

74. T. Sauer and J. Yorke, "How many delay coordinates do you need?," Int. J. Bifurcation Chaos Appl. Sci. Eng. 3, 737 (1993).

75. J. Theiler, J. Opt. Soc. Am. A 7, 1055 (1990).

76. H. Kantz and T. Schreiber, Chaos 5, 143 (1995); reprinted in Ref. 18.

77. P. Grassberger, "Finite sample corrections to entropy and dimension estimates," Phys. Lett. A 128, 369 (1988).

78. F. Takens, in Dynamical Systems and Bifurcations, edited by B. L. J. Braaksma, H. W. Broer, and F. Takens, Lecture Notes in Mathematics Vol. 1125 (Springer-Verlag, Heidelberg, 1985).

79. J. Theiler, "Lacunarity in a best estimator of fractal dimension," Phys. Lett. A 135, 195 (1988).

80. J. M. Ghez and S. Vaienti, "Integrated wavelets on fractal sets I: The correlation dimension," Nonlinearity 5, 777 (1992).

81. R. Badii and A. Politi, "Statistical description of chaotic attractors," J. Stat. Phys. 40, 725 (1985).

82. J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, "Testing for nonlinearity in time series: The method of surrogate data," Physica D 58, 77 (1992); reprinted in Ref. 9.

Chaos, Vol. 9, No. 2, 1999, p. 435, Hegger, Kantz, and Schreiber

83. T. Schreiber and A. Schmitz, "Improved surrogate data for nonlinearity tests," Phys. Rev. Lett. 77, 635 (1996).

84. J. Theiler, P. S. Linsay, and D. M. Rubin, "Detecting nonlinearity in data with long coherence times," in Ref. 17.

85. T. Schreiber, "Constrained randomization of time series data," Phys. Rev. Lett. 80, 2105 (1998).

86. T. Subba Rao and M. M. Gabr, An Introduction to Bispectral Analysis and Bilinear Time Series Models, Lecture Notes in Statistics Vol. 24 (Springer-Verlag, New York, 1984).

87. C. Diks, J. C. van Houwelingen, F. Takens, and J. DeGoede, "Reversibility as a criterion for discriminating time series," Phys. Lett. A 201, 221 (1995).

88. T. Schreiber and A. Schmitz, "Discrimination power of measures for nonlinearity in a time series," Phys. Rev. E 55, 5443 (1997).

89. J. Kadtke, "Classification of highly noisy signals using global dynamical models," Phys. Lett. A 203, 196 (1995).

90. R. Manuca and R. Savit, "Stationarity and nonstationarity in time series analysis," Physica D 99, 134 (1996).

91. M. C. Casdagli, L. D. Iasemidis, R. S. Savit, R. L. Gilmore, S. Roper, and J. C. Sackellares, "Non-linearity in invasive EEG recordings from patients with temporal lobe epilepsy," Electroencephalogr. Clin. Neurophysiol. 102, 98 (1997).

92. T. Schreiber, "Detecting and analyzing nonstationarity in a time series using nonlinear cross predictions," Phys. Rev. Lett. 78, 843 (1997).
