Theory and Practice of Physical Modeling for Musical Sound Transformations
HAL Id: tel-01219693, https://hal.archives-ouvertes.fr/tel-01219693
Submitted on 27 Oct 2015
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Theory and Practice of Physical Modeling for Musical Sound Transformations: An Instrumental Approach to Digital Audio Effects Based on the CORDIS-ANIMA System
Alexandros Kontogeorgakopoulos
To cite this version: Alexandros Kontogeorgakopoulos. Theory and Practice of Physical Modeling for Musical Sound Transformations: An Instrumental Approach to Digital Audio Effects Based on the CORDIS-ANIMA System. Modeling and Simulation. Institut national polytechnique de Grenoble; Université d'Athènes, 2008. English. ⟨tel-01219693⟩
INSTITUT POLYTECHNIQUE DE GRENOBLE and NATIONAL AND KAPODISTRIAN UNIVERSITY OF ATHENS
No. assigned by the library |__|__|__|__|__|__|__|__|__|__|
INTERNATIONAL COTUTELLE THESIS
to obtain the degree of
DOCTOR of the IP Grenoble
and of the University of Athens
Speciality: "Art, Sciences, Technologies"
prepared at the Informatique Création Artistique laboratory
within the doctoral school "Ecole Doctorale Ingénierie pour la Santé, la Cognition et l'Environnement"
and the Postgraduate Studies Program of the
Department of Informatics and Telecommunications
presented and publicly defended
by
Alexandros Kontogeorgakopoulos
on 9 October 2008
TITLE
Theory and Practice of Physical Modeling for Musical Sound Transformations: An Instrumental Approach to Digital Audio Effects Based on the CORDIS-ANIMA System
THESIS DIRECTOR: Claude CADOZ
THESIS CO-DIRECTOR: Georgios KOUROUPETROGLOU
JURY
JURY
M. Jean-Claude RISSET, President
M. Julius SMITH III, Reviewer
M. Giovanni DE POLI, Reviewer
Mme Nadine GUILLEMOT, Examiner
M. Sergios THEODORIDIS, Examiner
M. Georgios KOUROUPETROGLOU, Thesis co-director
M. Claude CADOZ, Thesis director
NATIONAL AND KAPODISTRIAN UNIVERSITY OF ATHENS
SCHOOL OF SCIENCES
DEPARTMENT OF INFORMATICS AND TELECOMMUNICATIONS
POSTGRADUATE STUDIES PROGRAM AND INSTITUT POLYTECHNIQUE DE GRENOBLE
ECOLE DOCTORALE INGENIERIE POUR LA SANTE, LA COGNITION ET L'ENVIRONNEMENT
JOINT DOCTORAL DISSERTATION
Theory and Practice of Physical Modeling for Musical
Sound Transformations:
An Instrumental Approach to Digital Audio Effects
Based on the CORDIS-ANIMA System
Alexandros G. Kontogeorgakopoulos
GRENOBLE AND ATHENS
OCTOBER 2008
DOCTORAL DISSERTATION
Theory and Practice of Physical Modeling for Musical Sound Transformations:
An Instrumental Approach to Digital Audio Effects Based on the
CORDIS-ANIMA System
Alexandros G. Kontogeorgakopoulos
SUPERVISORS: Claude Cadoz, Ingénieur de Recherche, HDR in Informatics; Georgios Kouroupetroglou, Assistant Professor, NKUA
THREE-MEMBER ADVISORY COMMITTEE: Georgios Kouroupetroglou, Assistant Professor, NKUA
Sergios Theodoridis, Professor, NKUA
Anastasia Georgaki, Assistant Professor, NKUA
EXAMINATION COMMITTEE
Examination date: 09/10/2008
Jean-Claude RISSET Directeur de Recherche Emérite
CNRS Marseille
Sergios Theodoridis, Professor, NKUA
Georgios Kouroupetroglou
Assistant Professor, NKUA
Claude Cadoz
Ingénieur de Recherche HDR
Institut Polytechnique de Grenoble
Julius SMITH
Professor - Stanford University
Giovanni DE POLI
Professor - University of Padova
Nadine GUILLEMOT
Professeur
Institut Polytechnique de Grenoble
Abstract
This thesis proposes the use of physical modeling as a means of designing musical sound transformations under the concept of instrumental interaction. We sought a physical modeling paradigm that straightforwardly permits the exploration of new sonic possibilities based on sound transformation techniques. Hence, the present essay is clearly oriented toward and devoted to musical creation.
This work provides new results and new discussions in musical signal processing and in the design of digital audio effects. The novelty is the introduction of the instrumental gesture into audio signal processing algorithms. The same concept has also been applied to the Frequency Modulation (FM) synthesis algorithm.
Concretely, the novel outcomes and contributions of this study are:
× Some theoretical aspects concerning the exploitation of physical modeling for musical sound transformations
× The proposition of a simple modular system architecture for the design of physical audio effect models, based on a common visual block-type programming environment
× The representation of the CORDIS-ANIMA (CA) system by a number of helpful mathematical formalisms
× The redesign/re-definition of several well-known digital audio effects using the CA system: filters, delay-based effects, distortions, amplitude modifiers. In particular, for filter synthesis by CA networks, an algorithm based on the Cauer realization of electrical circuits has been applied.
× The redesign/re-definition of FM synthesis using the CA system.
SUBJECT AREA: Computer Music
KEYWORDS: digital audio effects, physical modeling, musical sound transformation, CORDIS ANIMA, signal processing, instrumental gesture, control, sound processing, mass-interaction formalism
Περίληψη (Summary)

This dissertation proposes the use of physical modeling techniques for the design of digital sound processing algorithms intended for musical creation. The basic idea on which it rests is the presentation of a new and innovative way of controlling sound processing algorithms, based on the interaction paradigm known as instrumental interaction. A physical modeling formalism was sought that directly permits the search for new timbres through sound processing techniques. Consequently, the present research has a clearly musical orientation.

The work provides new results and raises new questions concerning sound processing for musical creation and the design of digital audio effects. The novelty is the introduction of instrumental gestures for the control of sound processing algorithms. The same idea was also applied to the frequency modulation (FM) sound synthesis algorithm.

Concretely, the original results and the contribution of this research are:
× Certain theoretical conclusions concerning the use of physical modeling techniques for the design of digital sound processing algorithms intended for musical creation
× The proposition of a simple modular architecture for the design of physical models of digital audio effects, based on a common block-type graphical programming environment
× The representation of the CORDIS-ANIMA system and its comparison with other useful modeling formalisms
× The design and simulation of various well-known effects with the CORDIS-ANIMA system: filters, effects based on delay lines, distortions, as well as effects acting on the dynamics of sound. A particularly important contribution is the use of the Cauer technique for the synthesis of CORDIS-ANIMA filters.
× The design and simulation of FM synthesis with the CORDIS-ANIMA system.

SUBJECT AREA: Computer Music
KEYWORDS: digital audio effects, physical modeling, musical sound transformation, CORDIS ANIMA, signal processing, instrumental gesture, control, sound processing, mass-interaction formalism
to Stelakis!
Acknowledgements
Those four years I spent at ACROE, at the University of Athens, at IRCAM, at McGill University and elsewhere during my "euro-libraries tour" have given me unforgettable and invaluable experiences. Evidently, there are many people I would like to thank.

Claude Cadoz, who believed in me and inspired me enormously during this adventurous research.
Georgios Kouroupetroglou, who never stopped giving me valuable advice about everything.
Annie Luciani, and especially Jean-Loup Florens, for the help and all the interesting conversations we had. Of course, I cannot forget the whole ACROE team, especially François Poyer, Julien Castet, Olivier Tache and Nicolas Castagné, for the nice moments we spent together...
Jean-Claude Risset, who transmitted to me his magic view of computer music.
Julius Smith, for his comments about my thesis and his incredible site.
Vincent Verfaille, for those interesting discussions we had about musical sound transformation.
The jury, who kindly accepted to read my PhD and honored me with their presence at my thesis defense.

In addition, I would like to thank my family: thanks mama, dad and Noulikiti. You were always there when I needed you.
Special thanks go to Olivia (of course!), Rory, Gareth, Haralambos and all my friends.
ATPS
Preface
The present dissertation is the final work of my research at the Institut National Polytechnique de Grenoble and at the Department of Informatics & Telecommunications of the National & Kapodistrian University of Athens. It serves as documentation of my PhD study, accomplished during the period 2004-2008 in order to obtain a joint doctoral diploma from both establishments. Since this was the first collaboration of its kind between the National & Kapodistrian University of Athens and a foreign establishment, a significant amount of energy, effort and time was spent to attain it.
The research was partly funded for two years by the MIRA program (Mobilité Internationale Rhône-Alpes, France). Within the context of this collaboration between the two institutes, a research program in a similar subject area, on the physical modeling of Greek traditional musical instruments, was funded by the program ΑΝΤΑΓΩΝΙΣΤΙΚΟΤΗΤΑ (Competitiveness; General Secretariat for Research and Technology, Ministry of Development, Hellenic Republic). However, the most important financial support was provided by my parents. I cannot find words to thank them.
As a PhD student, I had the opportunity to follow a large number of courses at IRCAM (Institut de Recherche et Coordination Acoustique/Musique) in Paris and to spend a period in the Sound Processing and Control Laboratory at McGill University. The latter stay was funded by the AIRES CULTURELLES program (Ministère de l'Enseignement Supérieur et de la Recherche, France).

I should underline that this PhD was a great challenge for me, since it addresses a question that has never been posed before. I hope that it will become an inspiration for significant applications and research areas in the computer music domain.

Enjoy it!
Outline
The dissertation is organized and structured as follows:

Chapter 1 presents some definitions of musical sound transformation and digital audio effects. The point of view leans toward the design aspect. Among the several taxonomies found in the literature, a new one is proposed.

Chapters 2 and 3 give an overview of the basic digital audio effect algorithms and their control. They cover previous research exclusively.

Chapter 4 briefly introduces the hypothesis and the context of our research: musical sound transformation by means of physical modeling within the framework of instrumental interaction. Through the prism of gesture and physics, our approach is presented and explained.

Chapter 5 focuses on the CORDIS-ANIMA system. CA models are studied and presented using several useful system representations: CA Networks, Finite Difference representations, Kirchhoff representations, digital signal processing Block Diagrams, State Space internal descriptions, System Function input/output external descriptions and Wave Flow Diagrams.

Chapter 6 examines some issues related to the general design of CA audio effects. The system architecture is presented.

Chapter 7 studies a list of basic digital audio effects and proposes their CA versions. The presented algorithms are: elementary mathematical operations, filters, amplitude modification, echo, comb filter, flanger, pick-up point modulation and clipping/distortion.

Annex A describes a simple way to dynamically modify the physical characteristics of CA physical models. With this general methodology, called the transfer characteristics biasing method, instrumental interaction is not violated.

Annex B presents some primary ideas and results concerning the redesign/re-definition of FM synthesis using the CA system.

Conclusions are… conclusions. The same chapter also presents suggestions for future work.
Contents
Abstract
Acknowledgements
Preface
Outline
Contents
1 Musical Sound Transformations
1.1 What is Musical Sound Transformation?
1.2 What is an Audio Effect?
1.3 Taxonomy of Audio Effects
2 An Overview of Digital Audio Effects Algorithms
2.1 Time-Domain Models
2.1.1 Simple Operations
2.1.2 Filters
2.1.3 Delay-based Effects
2.1.4 Nonlinear Processing
2.1.5 Time-Segment Processing
2.2 Time-Frequency Models
2.2.1 Phase Vocoder
2.2.2 Wavelet Transform
2.2.3 Other Time-Frequency Techniques
2.3 Parametric/Signal-Models
2.3.1 Spectral Models
2.3.2 Source-Filter Models
3 Control of Digital Audio Effect Algorithms
3.1 Gestural Control
3.2 LFO Control
3.3 Control based on Automation
3.4 Algorithmic Control
3.5 Adaptive / Content-based Control
4 Physical Audio Effect Models
4.1 Digital Audio Effects and Physical Modeling: La raison d’être of this PhD research
4.2 Digital Audio Effects and Instrumental Multisensory Interaction: La Nouveauté of this PhD Research
5 CORDIS-ANIMA System Analysis
5.1 CORDIS-ANIMA Network Representation
5.2 Finite Differences (FD) Representations and Finite Derivatives (FDe) Representation
5.2.1 Mass-Interaction Networks: System of FD and FDe equations
5.2.2 Multichannel FD and FDe Representations
5.2.3 N-dimensional Finite Difference Representation
5.2.4 From CA Networks to Partial Differential Equations
5.3 State Space Representation
5.4 Kirchhoff Representation - Electrical Analogous Circuits
5.5 System Function Representation
5.5.1 General System Block Diagrams
5.5.2 One-Ports
5.5.3 Two-Ports
5.5.4 Modal Representation
5.6 Digital Signal Processing Block Diagram Representation
5.7 Wave Flow Representation: Interfacing the DWG with the CA
6 CORDIS-ANIMA Physical Audio Effects Model Design
6.1 System Architecture
6.2 Components and Modules
6.3 Modules Interconnection and Construction Rules
6.4 User Interface
6.5 Simulations/Simulator
7 CORDIS-ANIMA Physical Audio Effect Models
7.1 Elementary Signal Processing Operations
7.1.1 Unit Delay Element
7.1.2 Constant Multiplier
7.1.3 Adder and Subtracter
7.1.4 Memoryless Nonlinear Element
7.2 Basic Low-Order Time-Invariant Filters
7.2.1 FRO: Highpass
7.2.2 REF: Highpass/Lowpass
7.2.3 CEL: Bandpass/Lowpass/Highpass
7.3 Synthesis of High-Order Time-Invariant Filters
7.3.1 Cascade-Form and Parallel-Form CA Structures
7.3.2 String-Form CA Structures
7.4 Time-Variant Filters
7.4.1 Wah-Wah
7.4.2 Time-Variant Resonators: “Pressing-String” and “Sticking-String” Models
7.5 Amplitude Modifiers
7.5.1 Bend Amplitude Modification Model
7.5.2 Mute Amplitude Modification Model
7.5.3 Pluck Amplitude Modification Model
7.6 Delay-based Audio Effects
7.6.1 Delay Model
7.6.2 Comb Filter Models
7.6.3 Flanger Models
7.6.4 Spatialization and Pick-Up Point Modulation
7.7 Nonlinear Audio Effects
7.7.1 Nonlinearities without Memory: Waveshaping
7.7.2 Nonlinearities with Memory: Clipping
Annex A Transfer Characteristics Biasing Method
Annex B Frequency Modulation within the framework of Physical Modeling
B.1 Simple Frequency Modulation
B.2 Physical Frequency Modulation Model
B.3 Triangular Physical Frequency Modulation Model
Conclusions
Bibliography
Chapter 1
Musical Sound Transformations
“Making a good transformation is like writing a tune… There are no rules.”
Trevor Wishart
1.1 What is Musical Sound Transformation?
The idea of sound transformation refers to the process of transforming or modifying a sound into another one with a different quality. A more musically oriented definition describes sound transformations as "the processing of sound to highlight some attribute intended to become part of the musical discourse within a compositional strategy" (glossary of the EARS web site [EARS]). In the present research, we do not use this term in the sense of sound morphing.

Clearly, in this thesis we are interested in the "how to" of formulating a sound transformation: the means, the methods and the tools. Hence, our approach is always from the construction point of view. Figure 1.1 illustrates schematically where we focus our interest.
Figure 1.1 Our interest concerning sound transformations**
Innovative musicians, charismatic sound engineers and pioneering researchers in the domains of music, audio and acoustics have written the history of musical sound transformations. Mechanical, acoustical, electromechanical, electromagnetic, electronic and digital systems were developed and used for this artistic* purpose. We will present several important events in historical order, in order to better grasp the concept and the nature of musical sound transformation.

The search for timbral development from one texture to another is evident in the 20th-century history of electronic and computer music. During the 1920s, Varèse had already started searching for new sound qualities, working with natural musical instruments only [Manning 2004]. The early works of Iannis Xenakis
* We study sound transformations always in a musical context.
** The term texture is employed in a very general way.
are excellent examples of instrumental sound transformations. These transformations greatly influenced and motivated Trevor Wishart to start his investigations into audio effects [Landy 1991]. They were based on the mechanical manipulation of the sound propagation medium (all the possible parts of the vibrating system) and of the excitation mechanism.
The invention of the commercial gramophone record offered a conversion of time information into spatial information. However, this technique was not only used for storing sound information; soon after, composers began to experiment with the recording medium and with the process of sound recording and reproduction. Fred Gaisberg, back in 1890, would hold an opera singer by the arm and move her closer to or further from the gramophone according to the dynamics of the piece [Moorefield 2005]. A manual, acoustic dynamic processing of the recorded sound was achieved by this simple technique. Darius Milhaud carried out several experiments on gramophones investigating vocal transformations during the period 1922 to 1927 [Manning 2004]. An excellent and more contemporary example of musical creation based on the manipulation of the recording support is the Disc Jockey (DJ), who emerged in the late sixties. The whole performance is focused on the direct manipulation of the records: playing in reverse, playing at different speeds, "scratching", playing with many different types of heads and scratching the surface with a sharp tool are some of the techniques used by experimental DJs.
Optical recording has also provided an interesting support and encouraged similar musical experimentation. Pfenninger, in 1932, modified sounds by altering their shapes on the optical soundtrack. The introduction of the magnetic tape recorder into studios after the Second World War brought new promise to sound transformation. Once more, the creative process was based on the manipulation of the support. The enhanced possibilities of tape gave birth to Musique Concrète in 1948. Even though magnetic tape systems did not permit visible physical modification of the waveform patterns, their editing and rewriting capabilities were significantly important to musicians for musical expression.
Analog and digital technology offered a different type of sound treatment. It was neither the sound propagation medium nor the recording support that was manipulated, but a proper mathematical representation of sound. Analog signal processing techniques have been used since the late 19th century, with the invention of the Telharmonium. Most of the widely known audio effects, like the phaser, wah-wah, distortion and chorus, were created with analog signal processing techniques and implemented with electronic circuits [Bode 1984]. Digital signal processing continued with the same idea and offered a more convenient and general framework for the conception, design and implementation of digital audio effects.
This research concerns musical sound transformations where the sound source (that is transformed) is separated from the sound processing system: the transformation and sound generation mechanisms are detached and are not parts of the same object. All the previous sound transformation paradigms fall into this category except for the transformations in the Xenakis example. According to Landy [Landy 1991], this type of musical transformation can be referred to as transformation through synthesis. In a
similar manner, we call the other case, the one we are interested in, transformation through processing*. It is evident that the frontiers between sound synthesis and sound processing are ambiguous, and this separation is more meaningful from a musical perspective than from a functional or technical one.
Figure 1.2 (a) transformation through synthesis (b) transformation through processing
Wishart, whose contribution to sound transformations is invaluable, considers his entire computer-based sound manipulation procedures as musical instruments. Moreover, he regards his research as an indispensable part of his musical work [Wishart]. He says, characteristically: "…In particular the goal of the process set in motion may not be known or even (with complex signals) easily predictable beforehand. In fact, as musicians, we do not need to "know" completely what we are doing (!!). The success of our effort will be judged by what we hear…" [Wishart 1994].
1.2 What is an Audio Effect?
We define an audio effect as the "how" part (the method) of a sound transformation through processing. This definition is more general than the common ones. For example, Verfaille et al. define the audio effect as:
“…we consider that the terms “audio effects”, “sound transformations” and “musical sound
processing” all refer to the same process: applying signal processing techniques to sounds in order to
modify how they will be perceived, or in other words, to transform a sound into another sound with a
different quality” [Verfaille Guastavino Traube 2006].
It is obvious that this does not coincide completely with our definitions of audio effect and sound transformation. In the latter definition, an ambiguity probably arises from the term signal processing. In general, signal processing concerns the techniques and methods for the manipulation of the mathematical representations of signals. Even the terms signal and mathematical representation of a signal are usually considered equivalent. Therefore, signal processing is just one approach to designing sound transformations and covers a particular category of audio effects.
* Landy categorized sound transformations into three categories: i) electroacoustically, ii) through synthesis, iii) through resynthesis. In the present thesis we group categories i) and iii) under the term transformation through processing.
For example, it is uncommon to consider a spring reverb, a guitar or even an electrical network as a signal processing system. On the other hand, we may approach, represent and model it as a signal processing system. This distinction is not only theoretical. A helpful example is the acoustical instrument designer: the methods he applies to construct his instruments are not at all signal-processing-based. As this thesis concerns the design of audio effects more than their analysis, this distinction is essential.
When we deal with the processing of musical audio signals by digital means, we use the term digital audio effect (DAFX is a common abbreviation for digital audio effects). Someone could say that, since we use information-processing systems for the musical transformation of sound, we directly employ signal-processing techniques. We should be prudent once more, as modern digital sound technology has certainly made sound more immaterial and more object-like. Its production is no longer bound to instruments and instrumentalists: it can be manipulated with tools acting on its representations*. On the other hand, this does not mean that we must necessarily employ a signal processing method, or, much more importantly, a signal-thinking approach, to modify a sound with a digital computer. Of course, we cannot deny that at a low level all these modifications are in fact digital signal processing procedures.
The definition of digital audio effects by Zoelzer is more general:
"Digital audio effects are boxes or software tools with input audio signals or sounds which are modified according to some sound control parameters and deliver output signals or sounds" [Zoelzer 2002].
We could rearticulate this definition and describe digital audio effects as digital systems/algorithms that modify incoming audio signals according to the available control parameters. The control signifies all the possible methods available to the user for accessing the control parameters: graphical user interfaces, abstract algorithms, gestural interfaces, sound features, etc. Figure 1.3 illustrates a digital audio effect according to this definition. In chapter 4, we will propose another approach to digital audio effects that differs from the one depicted.
Figure 1.3 Digital audio effect and its control (from [Zoelzer 2002])
* In any case, digital computers do not process numbers (digital signal processing does) but symbols.
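As a minimal illustration of this rearticulated definition, the sketch below implements a trivial digital audio effect whose output depends on the input signal and on two user-accessible control parameters. It is written in Python with NumPy; the chosen effect (a tremolo), the function name and the parameter values are our own illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def tremolo(x, rate_hz, depth, fs=44100):
    """A digital audio effect in the sense above: an algorithm that
    modifies an incoming audio signal according to control parameters
    (here: LFO rate in Hz and modulation depth in [0, 1])."""
    n = np.arange(len(x))
    lfo = 1.0 - depth * 0.5 * (1.0 + np.sin(2.0 * np.pi * rate_hz * n / fs))
    return x * lfo

# The control layer (GUI, gestural interface, automation...) would set
# these parameters; here they are fixed for the example.
fs = 8000
x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
y = tremolo(x, rate_hz=5.0, depth=0.8, fs=fs)
```

Whatever sets `rate_hz` and `depth` (graphical interface, gesture, automation) plays the role of the "control" box in figure 1.3.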
Clearly, the field of musical sound transformation by digital means lies between science and art. It could be seen, strictly scientifically, as a branch of digital signal processing applied to sound signals. However, as we have already stated, the goal is musical, and thus neither as obvious nor as clear as in the digital signal processing field or even in audio engineering. Mathematics and rigorous scientific approaches are often more important for the analysis of the algorithms (when this is possible) than for their design. A characteristic example is the design of reverberation algorithms, which have traditionally been invented through experimentation [Dattorro 1997b].
1.3 Taxonomy of Audio Effects
Apparently, it is extremely difficult to identify, classify and name existing digital audio effects. This is all the more evident in the current market, where there are no official standards and each manufacturer uses different parameter names and adds supplementary signal processing components to generate its own distinctive sound. Moreover, if we try to analyze the unpublished work of Wishart, we will soon be disappointed (impressed, in fact!), as there is an impressive amount of novel, undocumented algorithms that are not widely known, academically or commercially*.
Figure 1.4 Perceptual classifications of various effects. Bold-italic words: perceptual attributes, italic
words: perceptual sub-attributes, other words: audio effects (from [Verfaille Guastavino Traube 2006])
* Fortunately, all these procedures are available through the Composers Desktop Project [Endrich 1997][Composers Desktop Project].
This section recalls some of the classifications of audio effects and proposes a new one that is not necessarily linked to the concept of signal processing. Verfaille summarizes several classifications in [Verfaille Guastavino Traube 2006]. As he points out, these classifications are neither exhaustive nor mutually exclusive. Figure 1.4 illustrates his proposed perceptual classification of various audio effects.

From the designer's point of view, perhaps the most practical classification is a technological one, proposed by Zoelzer [Zoelzer 2002] and illustrated in figure 1.5. We will use this type of classification in the next chapter for the presentation of the basic digital audio effect algorithms.
Figure 1.5 Classification of audio effects based on the underlying techniques [Zoelzer 2002]
A similar but more general classification is based on the input domain. Since a digital audio effect is a discrete-time system, it may be seen as an abstract mathematical operator that transforms the input sound sequence into another sequence. The input sequence is a coded representation of the sound signal: a time-domain representation, a time-frequency representation, or a parametric representation.
Time-domain processing: Most digital audio effects are conceived and designed in the time domain. One of the most intuitive ways to create sound transformations is to cut the input streams, replay them and reassemble them in different ways. All of this may be done down to sample accuracy. Filters, delay functions and reverberation algorithms are other examples of sound transformations that may be realized in the time domain with elementary mathematical operators such as multipliers, adders and delay lines.
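To make this concrete, here is a sketch of an echo built from exactly those elementary operators: a delay line, constant multipliers and adders. It is an illustrative Python/NumPy fragment; the parameter names and values are our own choices.

```python
import numpy as np

def echo(x, delay_samples, feedback, mix=0.5):
    """Time-domain echo (feedback comb filter) built sample by sample
    from a delay line (circular buffer), multipliers and adders."""
    y = np.zeros(len(x))
    buf = np.zeros(delay_samples)  # the delay line
    idx = 0
    for n in range(len(x)):
        delayed = buf[idx]                     # delay-line output
        y[n] = x[n] + mix * delayed            # adder + constant multiplier
        buf[idx] = x[n] + feedback * delayed   # feed the input + echo back in
        idx = (idx + 1) % delay_samples
    return y

# Impulse response: echoes at multiples of the delay, decaying by `feedback`.
x = np.zeros(16)
x[0] = 1.0
y = echo(x, delay_samples=4, feedback=0.5, mix=0.5)
```

Feeding in a unit impulse shows the expected behaviour: a copy of the input, then echoes every `delay_samples` samples, each scaled by a further factor of `feedback`.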
Time-frequency-domain processing: Time-frequency processing permits working with a sound signal from both the frequency and time viewpoints simultaneously. In 1946, Dennis Gabor published an article in which he explicitly defined a time-frequency representation of a signal [Arfib 1991]. Another interesting publication of the same period is the one by Koenig, Dunn and Lacy [Koenig Dunn and Lacy 1946]. Each point in this time-frequency representation corresponds to both a limited interval of time and a limited interval of frequency. In general, with time-frequency methods we project the signal (time representation) onto a set of basis functions to determine their respective correlations, which give the transform coefficients. If these transforms are discrete, the time-frequency plane takes the form of a grid or a matrix of coefficients. In the fixed-resolution case, as in the phase vocoder, the bandwidths of the basis functions/analysis grains are the same for all frequencies and they have the same length. This gives constant time-frequency resolution. In the multi-resolution case, as in the wavelet transform, the analysis grains are of different lengths and their bandwidth is not constant. In this scenario, the wider sub-bands give better time resolution, and vice versa.
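The fixed-resolution case can be sketched as a short-time Fourier transform. In this illustrative Python/NumPy fragment (the window and hop sizes are arbitrary assumptions of ours), every analysis grain has the same length, hence the same bandwidth for every frequency bin, and the coefficients form the grid described above.

```python
import numpy as np

def stft(x, win_len=256, hop=64):
    """Fixed-resolution time-frequency analysis: identical windowed
    grains taken every `hop` samples, so all bins share one bandwidth."""
    win = np.hanning(win_len)
    frames = [x[i:i + win_len] * win
              for i in range(0, len(x) - win_len + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])  # grid: time x frequency

# A sinusoid that falls exactly on analysis bin 32 of the 256-point grains.
x = np.sin(2 * np.pi * 32 * np.arange(2048) / 256)
X = stft(x)
```

Each row of `X` is one time frame and each column one frequency bin; a phase-vocoder effect would modify this matrix and then invert the transform.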
Parametric-domain processing: Parametric, or signal-model, processing of audio signals concerns algorithms based on sound synthesis models: the signal to be modified is first modeled, and then the model parameters are modified to achieve the desired transformation. All the valuable knowledge in sound synthesis can thus also be applied to sound transformation. Hence, as Risset states [Risset 1991], it would be of great interest to develop automatic analysis procedures that could, starting from a given sound, identify the parameter values of a given sound model that yield a more or less faithful imitation of that sound. This is a hard problem for certain synthesis models. The most useful and widely known sound synthesis models are additive synthesis, subtractive or source-filter synthesis, frequency modulation, and waveshaping or nonlinear distortion.
According to the discussion in chapter 1.1, we propose a new classification that applies to all types of
sound processing techniques or audio effects, in three general categories [Kontogeorgakopoulos Cadoz
2007b]. This classification has a strong impact on the way you conceive, control and implement your
audio effects. It is based on the following question:
What do you manipulate?
We distinguished the following interesting basic cases: The energy flux that carries the musical signal
(table 1.1), the recording medium that stores the musical signal (table 1.2) and the symbolic expression
that represents the musical signal (table 1.3). This classification should not be confused with the
technological one, as it covers more than technological aspects. Once again, it is not a mutually
exclusive classification.
Table 1.1 Propagation Medium Processing
Propagation Medium Processing
A physical manipulation takes place on the medium where the sound signal propagates. We could use
the term channel processing as well; however, since the concept of channel comes from information
theory, it is directly related to the notion of language, which could cause confusion. So we preferred
the first term.
Propagation Medium Processing does not necessarily concern mechanical or acoustical vibrating
systems. An electrical network, an optical or an electromagnetic waveguide, for example, may constitute
the propagation medium. What matters is that (a) the initial sound energy propagates in a physical
medium and (b) the sound transformation is achieved physically through the manipulation of that
medium. The Talk Box is a typical audio effect of this category: the sound is conducted to the player's
mouth and modified by the vocal tract resonances. The reverberation chamber and the reverberation
plate or spring are classified in this category as well.
Table 1.2 Recording Medium Processing
Recording Medium Processing
A physical manipulation takes place on the support where the sound signal is recorded. Segmenting
and rearranging the support, scratching it, and altering the shapes that sound takes on it are several
techniques to modify the musical signal.
Recording medium processing concerns every possible medium where the sound signal can be
recorded and stored: phonograph or vinyl record, magnetic tape, optical soundtrack etc. The mastery of
sound processing with such systems logically led to the creation of Musique Concrète and, more
generally, to all electro-acoustic music. The two-turntable setup used by thousands of DJs around the
world offers musical sound transformations by recording medium processing. Clearly, the analog sound
storage medium has proven more appropriate for this type of processing.
Table 1.3 Information Processing
Information Processing
A symbolic manipulation takes place on the sound signal, which is considered as a codified message,
generally expressed by a mathematical function. A set of mathematical operations transforms it either
in the continuous time domain (analog signal processing) or in the discrete time domain (digital signal
processing).
Today, the term audio effect is often used only for effects that belong to this category. This is
reasonable, since the majority of audio effects are computer-based. Computer simulation techniques
have offered the possibility to redesign and implement Propagation Medium Processing and Recording
Medium Processing audio effects with a computer.
Chapter 2
An Overview of Digital Audio Effect Algorithms
In this chapter we will present a survey of digital audio effects. The subject is huge and the bibliography
is “unlimited”. We will mention briefly only the most common effects, with the necessary references.
In chapter 7, where we expose our designs, the references are covered in more detail. We cite five
compilations of digital audio effect algorithms that we found really helpful [Zoelzer 1997][Zoelzer
2002][Orfanidis 1996][Verfaille 2003][Wishart 1994]. Zoelzer and Orfanidis present the algorithms from
a technological perspective. Verfaille uses a perceptual categorization and Wishart chooses a much more
compositional approach without focusing on algorithmic details.
We will expose the algorithms according to the following simple classification: time-domain, time-
frequency and parametric procedures. As we have already stated in chapter one, we try to follow the
designer's point of view.
2.1 Time-Domain Models
2.1.1 Simple Operations
The very elementary signal operations such as addition/subtraction, multiplication by a constant and
signal routing can hardly be considered as audio effects. However, the basic operations of a mixing
console are based on them [Roads 1996][Rumsey 1999]:
× Gain changing (redithering is employed in mixers when the sample resolution has to be reduced)
× Cross-fading
× Mixing
× Mute/Solo
× Input/Output selection
× Grouping facilities
× Auxiliary send/return
A good reference for those operations, analyzed at a more technical level, can be found in [Zoelzer
1997][Mitra 2001].
2.1.2 Filters
Every SISO (Single Input Single Output) digital system can be considered as a filter. The committee on
Digital Signal Processing of the IEEE Group on Audio and Electroacoustics defined a filter as [Rabiner et
al. 1972]:
A digital filter is a computational process or algorithm by which a digital signal or sequence of numbers
(acting as input) is transformed into a second sequence of numbers termed the output digital signal.
Very often the term filter is employed exclusively for the family of linear time-invariant systems (LTI). In
this case, the general time-domain representation of the filter is given by a linear constant-coefficient
finite difference equation.
The most standard filter classes are named according to their amplitude response: lowpass (LP),
highpass (HP), bandpass (BP) and bandreject (BR). In figure 2.1 we illustrate this classification [Dodge
Jerse 1997]. Another classification distinguishes the finite impulse response (FIR) filters, where the
autoregressive term disappears, from the infinite impulse response (IIR) filters, where the auto-regressive
term appears in the difference equation. There are many reference books concerning the fundamentals of
filtering theory [Oppenheim Schafer Buck 1999][Proakis Manolakis 1996]. A more music-friendly
reference is [Dodge Jerse 1997].
Figure 2.1 Classification of filters according to their amplitude response: (a) lowpass (b) highpass
(c) bandpass (d) band reject
The simplest filters are the one-zero, the one-pole, the two-pole and the two-pole/two-zeros filters
[Smith 1995]. They will be presented in more detail in chapter 7. Typical useful parametric
filter structures for first-order highpass, lowpass filters, and second order highpass, lowpass, bandpass
and bandreject filters can be found in [Zoelzer 2002].
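The simplest of these structures, a one-pole lowpass, can be sketched directly from its difference equation. The coefficient value below is illustrative, not taken from the cited references:

```python
import numpy as np

def one_pole_lowpass(x, a=0.9):
    """One-pole IIR lowpass: y[n] = (1 - a) * x[n] + a * y[n-1].
    The auto-regressive term a * y[n-1] is what makes it an IIR filter;
    removing it would leave a (trivial) FIR filter."""
    y = np.zeros(len(x))
    state = 0.0
    for n, xn in enumerate(x):
        state = (1.0 - a) * xn + a * state
        y[n] = state
    return y
```

The (1 - a) scaling gives unit gain at DC, so a step input converges to 1; larger `a` means a lower cutoff frequency.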
Other very popular filters coming from the analog domain are the classical Sallen & Key filter, the
State Variable Filter [Tomlinson 1991] and the Moog filter [Moog 1965]. Digital versions of the first two
filters can be found in [Dutilleux 1998] and for the third one in [Stilson 1996][Huovilainen
2004][Huovilainen Valimaki 2005][Fontana 2007]. The PhD thesis of Stilson is a good reference for all
these vintage filters [Stilson 2006].
In computer music, filters are desired to be adjustable. This means that the filters should be “usefully”
and efficiently varied in real time. All of the previous filters that are suitable for computer music
applications respect this requirement. A very classical time-variant filter effect is the wah-wah, based on
a swept bandpass filter [Dutilleux Zoelzer 2002b][Loscos Aussenac 2005][Smith 2008].
2.1.3 Delay-based Effects
The delay-based effects are probably the most common effects. The main functional unit is the digital
delay line [Smith 2005][Dutilleux Zoelzer 2002]. Fractional delay lines are used when delays of the input
signal with non-integer values of the sampling interval are necessary. Interpolation functions are applied
between samples in order to achieve smooth delay length changes [Laakso Valimaki Karjalainen Laine
1996]. The delay line can also vary in length with time.
Another way to delay a signal is by passing it through an allpass filter, i.e. a filter with unit amplitude
response and an arbitrary phase response [Mitra 2001]. A mathematical expression of a general transfer function
of a finite-order, causal, unity gain allpass filter can be found in [Smith 2005]. The second order allpass
sections are very useful filters due to their tunable characteristics.
Simple delay-based audio effects
Typical delay-based effects realized with time-invariant delay lines are doubling, echo, slap back, tapped
delay and the IIR/FIR/Universal Comb filter [Smith 2005][Dutilleux Zoelzer 2002d][Dattorro 1997]. All
these effects use a simple single delay - or a tapped delay - with delay times ranging from 1 msec to
several seconds, in a feed-forward or feedback configuration, mixed with the original input.
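A feedback (IIR) comb filter of this family can be sketched in a few lines; with a long delay the recirculating taps are heard as discrete echoes. Delay and gain values here are illustrative:

```python
import numpy as np

def feedback_comb(x, delay, g=0.5):
    """IIR comb filter: y[n] = x[n] + g * y[n - delay].
    With delays of tens of milliseconds and more this is an echo;
    with short delays it colors the spectrum (comb-shaped response)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (g * y[n - delay] if n >= delay else 0.0)
    return y

impulse = np.zeros(10)
impulse[0] = 1.0
h = feedback_comb(impulse, delay=3, g=0.5)
```

The impulse response shows the geometrically decaying echoes every `delay` samples.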
When the delay line is time-variant we obtain several other interesting effects. The simple vibrato effect
is just a delay modulated by a low frequency oscillator (LFO). If we use the same delay, without the LFO
and with variable read-pointers / write-pointers we get the Doppler effect [Smith 2005]. The flanger
effect is widely used by guitarists and is similar to vibrato except it uses the comb filter structure
[Hartmann 1978][Disch Zolzer 1999][Dutilleux Zoelzer 2002d][Smith 2005][Huovilainen 2005]. The Leslie
effect has been simulated in [Smith Serafin Abel Berners 2002] [Disch Zolzer 1999]. An easy way to
get the phaser effect is by using cascade second-order allpass sections [Smith 1984] or second order
notch filters [Orfanidis 1996]. By combining a few delay lines modulated by random signals we get a
chorus [Smith 2005][Dutilleux Zoelzer 2002d].
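The LFO-modulated fractional delay underlying the flanger can be sketched as follows, with linear interpolation between samples for the fractional part. Parameter names and values are illustrative, not taken from the cited references:

```python
import numpy as np

def flanger(x, fs, depth=0.002, rate=0.5, mix=0.5):
    """Flanger sketch: a feed-forward comb filter whose delay time is
    modulated by a low-frequency oscillator (LFO). Fractional delays
    are realized by linear interpolation between samples."""
    n = np.arange(len(x))
    delay = 0.5 * depth * fs * (1.0 + np.sin(2 * np.pi * rate * n / fs))
    y = np.zeros(len(x))
    for i in range(len(x)):
        d = i - delay[i]              # fractional read position
        j = int(np.floor(d))
        frac = d - j
        delayed = 0.0
        if j >= 0:
            delayed = (1 - frac) * x[j]
            if j + 1 < len(x):
                delayed += frac * x[j + 1]
        y[i] = (1 - mix) * x[i] + mix * delayed
    return y
```

Removing the LFO and using a fixed modulation of the read pointer instead would give the vibrato and Doppler variants mentioned above.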
Spatial Effects
According to the pioneering work of Schroeder, by combining many of the simple delay-based systems
mentioned before, we can simulate the reverberation* effects [Schroeder Logan 1961][Schroeder 1962]
[Schroeder 1970]. Schroeder introduced the comb filters and the allpass comb filters as basic
* Sound reverberation is a physical phenomenon occurring when sound waves propagate in an enclosed space. It is the result of
the repeated reflections of radiated sound waves from the surfaces of the space. The direct sound and its echoes, which are
delayed, attenuated and filtered replicas, give this effect that modifies the perception of the original sound by changing its
loudness, its timbre and its spatial characteristics. For the study of the reverberation process many approaches have been
followed, and deterministic and stochastic models have been proposed [Blesser 2001][Gardner 1998].
components for their simulation. Many other researchers continued in this direction and ameliorated his
algorithms.
Moorer reconsidered Schroeder’s reverberator and made some improvements [Moorer 1979]. He
increased the number of comb filters from 4 to 6 to effect longer reverberation times and inserted a
one-pole lowpass filter into each comb feedback loop. Gardner based his reverberators on nested allpass
filters [Gardner, 1992][Gardner 1998]. A reverberator similar to Gardner's is Dattorro's [Dattorro
1997b].
Gerzon proposed feedback delay networks for reverberation in 1971 [Gerzon 1971]. J. Stautner and M.
Puckette suggested a similar structure for reverberation based on delay lines interconnected in a
feedback loop by means of a matrix [Stautner Puckette 1982]. In the same direction, Jot developed a
systematic design methodology [Jot Chaigne 1991][Jot 1992].
Smith proposed digital waveguide reverberators in 1985 [Smith 1985]. Recently those networks have
been used for the simulation of spring reverberators [Abel Berners Costello Smith 2006]. The wave
digital mesh and the finite difference schemes are also used widely for artificial reverberation [Van Duyne
Smith 1993] [Savioja Backman Jarvinen Takala 1995] [Bilbao 2007].
Good general references for reverberation algorithms are [Gardner 1998][Rocchesso 2002][Rocchesso
2003][Smith 2005][Zoelzer 1997].
Acoustics were always a part of the musical performance space and share a very long history with the
auditory arts. A sound is decomposed by the human auditory system into the following perceptual
components [Blesser 2001]: (a) the identity of the source, (b) its spatial location and (c) an image of the
space. The image of the space contains the reverberation that provides a sense of the size and the
materials of the enclosing space, and the human echolocation that detects walls and objects at
reasonable distances. The spatial location gives the position of the source, its acoustic field and its
width. A special case concerns the movements of the sound sources that are detected as changes in
direction, distance and by the Doppler effect.
A variety of techniques exist for the simulation of the spatial location of a sound. The most standard
ones are the panorama and the precedence effect for a stereo loudspeaker setup or headphones
[Rocchesso 2002]. Audio engineers use both techniques all the time in the mixdown process. Distance
rendering algorithms are also common; they are based on the intensity of the incoming sound, the ratio
of the reverberated to direct sound and the modification of the high frequencies in the sound [Dodge
Jerse 1997] [Chowning, 1971]. Other advanced techniques for 3D sound with headphones or with many
loudspeakers can be found in [Rocchesso 2002][Rocchesso 2003].
2.1.4 Nonlinear Processing
The audio effects category with the name nonlinear processing includes all the time-based digital audio
effect algorithms that cannot be considered linear*: dynamic processing, simulation of nonlinear
amplifiers, distortion-type effects, and amplitude modulation algorithms. All these systems create
frequency components that are not present in the input signal.
Digital signal processing in general studies linear time-invariant (LTI) systems. However, there are several
methods for the analysis and modeling of systems with nonlinearities, and the domain is very wide.
Two main categories are the nonlinear systems with and without memory [Dutilleux Zoelzer 2002].
In the domain of digital audio effects, a standard modeling approach is to consider the nonlinear system
as a black box. A method for estimating nonlinear transfer functions for nonlinear systems without
memory is presented in [Moeller Gromowski Zoelzer 2002]. For systems with memory, the Volterra and
Wiener theories can be used [Schattschneider Zoelzer 1999].
Another more common modeling approach is based on the “digitization” of analog nonlinear processing
systems [Smith 2007]. In this case an “optimal**” numerical method has to be employed to express in
discrete time the continuous-time mathematical equations that describe each of the analog signal
processing circuit elements (resistors, inductors, capacitors, operational amplifiers, diodes, and
transistors) [Yeh Abel Smith 2007b][Huovilainen 2004].
Probably the most important nonlinear processing units in a modern recording studio are the dynamic
range controllers: compressors, expanders, limiters and noise gates. Up to this point of our survey, all the
types of digital audio effects, such as filters, reverbs and flangers, were designed to make an obvious
modification to the sound; dynamic range controllers do not! Often the effect is noticeable only if you
hear the original dynamic range of an input sound and compare it to the modified version.
According to [Dutilleux Zoelzer 2002]: “Dynamics processing is performed by amplifying devices where
the gain is automatically controlled by the level of the input signal”. The system performs essentially an
automatic level control. Its functional components are the level measurement that can be a peak or a
RMS measurement, the static curve, which is the relationship between the input level and the weighting
level, and the gain factor smoothing that controls the time-responses of the system [McNally 1984]
[Zoelzer 1997][Dutilleux Zoelzer 2002]. In the system diagram, the lower path, consisting of all the
previous processing blocks that derive the gain factor by which the input signal is multiplied (actually
a delayed version of the input signal, to compensate for the delay of the lower path), is usually called
the side chain path.
We give some simple definitions of those effects, taken as they appear in [Orfanidis 1996]:
* the definition of a linear system can be found in any signal processing book (for example [Lathi 1998])
** some important criteria for the choice of the optimal numerical method are the accuracy, the stability, the aliasing and the complexity
× Compressors are used mainly to decrease the dynamic range of audio signals (so that they fit into
the dynamic range of the playback system, for “ducking” background music and for “de-essing”, i.e.
eliminating excessive microphone sibilance).
× Expanders are used for increasing the dynamic range of signals (noise reduction, reducing the
sustain time of an instrument).
× Limiters are extreme forms of compressors that prevent signals from exceeding certain maximum
thresholds.
× Noise gates are extreme cases of expanders that infinitely attenuate weak signals (removing weak
background noise).
Recent publications concerning the digital emulation of analog companding algorithms are [Peissig
Haseborg 2004] [Schimmel 2003].
A discussion on the topic of simulation of valve amplifier circuits can be found in [Dutilleux Zoelzer
2002]. The triodes and the pentodes can be modeled as memoryless functions. From measurements, we
observe that they provide asymmetrical (triode) and symmetrical (pentode) soft clipping. Other more
analytical references are [Schimmel 2003][Karjalainen et al. 2004][Keen 2000]. Valve amplifiers are used
as amplifiers, as preamplifiers for microphones or in other effect devices such as the dynamic range
controllers presented before.
Terms like overdrive, distortion, fuzz and buzz are used to describe similar effects that distort the
waveform of audio signals. The easiest memoryless way to design distortion-type effects is by
waveshaping [Schaefer 1970][Arfib 1979][LeBrun 1979][De Poli 1984][Fernadez-Cid Quiros 2001]. In
chapter 7 we will present several transfer characteristics found in the relevant literature. A simulation of
a distortion and an overdrive guitar pedal has been reported in [Yeh Abel Smith 2007a][Yeh Abel Smith
2007b].
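A minimal waveshaping sketch: the signal passes through a static nonlinear transfer characteristic. The tanh curve and drive value are illustrative choices (tanh gives symmetrical soft clipping), not a transfer function taken from the cited references:

```python
import numpy as np

def waveshaper(x, drive=5.0):
    """Memoryless distortion by waveshaping: y = f(x) with a static
    nonlinear transfer characteristic; tanh soft-clips symmetrically.
    Dividing by tanh(drive) normalizes the output to [-1, 1]."""
    return np.tanh(drive * x) / np.tanh(drive)

t = np.arange(1000) / 1000.0
clean = np.sin(2 * np.pi * 5 * t)       # 5 cycles over the buffer
driven = waveshaper(clean)
```

Because the curve is odd-symmetric, the distorted sine acquires odd harmonics (bins 15, 25, ... of the buffer's spectrum) that were absent from the input.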
Amplitude and ring modulation can be used to alter periodically the amplitude of a signal
[Oppenheim Willsky Young 1983][Dutilleux 1998] [Dutilleux Zoelzer 2002c]. For a low-frequency signal
modulating the amplitude of an audio signal, we obtain the tremolo effect: a cyclical variation of the
input signal's amplitude. Other amplitude modulation-type effects using the single side-band modulator
are reported in [Disch Zolzer 1999][Wardle 1998].
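Both modulations reduce to a sample-by-sample multiplication. A minimal sketch with illustrative parameter names and values:

```python
import numpy as np

def tremolo(x, fs, rate=5.0, depth=0.5):
    """Amplitude modulation by an LFO: a cyclical variation of the
    input signal's amplitude between (1 - depth) and 1."""
    n = np.arange(len(x))
    lfo = 1.0 - depth * 0.5 * (1.0 + np.sin(2 * np.pi * rate * n / fs))
    return lfo * x

def ring_mod(x, fs, freq=440.0):
    """Ring modulation: plain multiplication by a sinusoidal carrier,
    producing sum and difference frequencies (sidebands only)."""
    n = np.arange(len(x))
    return x * np.sin(2 * np.pi * freq * n / fs)
```

Ring-modulating a 100 Hz sine with a 440 Hz carrier yields components at 340 Hz and 540 Hz, with the original 100 Hz suppressed.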
2.1.5 Time-Segment Processing
Time-Segment processing of audio signals in general concerns algorithms that can be decomposed into
three stages:
× An analysis stage where the input signal is divided into segments of fixed or variable length
× A processing stage where simple time domain algorithms are applied on these segments.
× A synthesis stage where the processed segments are merged by an overlap-and-add procedure.
Even before the years of digital computing, these techniques were implemented by electromechanical
devices. Dennis Gabor built one of the earliest electromechanical time/pitch-scaling devices in 1946 using
optical sound recording [Gabor 1946][Roads 1991]. The first method for time or pitch scaling of an
audio signal using a tape recorder appeared in 1954, on a modified tape recorder used for speech - not
many years after the diffusion of the tape recorder by the end of World War II [Fairbanks, Everitt,
Jaeger 1954][Laroche 1998]. Other similar electromechanical devices followed, like Springer's apparatus
[Springer 1955] or the Phonogène [Poullin 1954]. In 1950 Pierre Schaeffer founded the Groupe de
Recherches Musicales (GRM) in Paris and with Pierre Henry he began the musical experiments based on
the manipulation of concrete recorded sounds. Musique Concrète has made intensive use of splicing of
tiny elements of magnetic tape.
The time/pitch scaling methods based on electromechanical devices inspired the time-domain techniques
for time and pitch scale modification of audio signals, which were transposed for the first time to the
digital domain by Lee [Lee 1972][Laroche 1998]. Also, in 1978, when the GRM received its first computer,
many processing techniques were transferred to the digital domain under the Studio 123 software [Geslin
2002].
General Granular Methods and Granulation Effects
Dennis Gabor proposed the idea of a granular representation of sound in 1947. In 1971 Xenakis,
inspired by Gabor, proposed composing music in terms of grains of sound. His work motivated Roads
and Truax, among others, to perform granular synthesis of sound using computers.
Granular synthesis constructs a sound by means of overlapping time-sequential acoustic elements. It is
actually a family of techniques based on the manipulation of sonic grains [Roads 2001]. The control of
the temporal distribution of the grains may be synchronous, where the grains are triggered at fairly
regular intervals, or asynchronous. Also, the grains can be derived from natural sounds or from a sound
synthesis model. Many audio signal processing methods may be grouped within the common paradigm of
granular techniques, like the Short Time Fourier Transform, the Gabor Transform, the Wavelet transform,
the Pitch-Synchronous Granular Synthesis, the FOF and the VOSIM method [Cavaliere & Piccialli 1997]
[De Poli & Piccialli 1991].
A very interesting possibility of the granular technique is the time granulation of sampled sounds - the
granulation effects. In 1988 Barry Truax programmed several granular algorithms in real time [Truax
1988]. Trevor Wishart also designed, developed and used in his pieces various granular-type sound
transformations. Many of his algorithms are presented with examples extracted from his compositions in
his book Audible Design [Wishart 1994]. A short list of them is: granular reordering, granular reversal,
granular time-stretching/pitch-shifting, zigzagging, shredding, looping and iteration, progressive looping,
multi-source brassage, chorusing and all the effects based on wavesets like waveset distortion, waveset
interleaving, waveset substitution etc.
Brassage or time shuffling – a term that includes several granular techniques - is based on the micro-
splicing technique used widely, and for years, in Musique Concrète. In 1980 Bernard Parmegiani
suggested at the Groupe de Recherches Musicales (GRM) that this technique could be done by
computers. A simple algorithm of brassage can be found in [Dutilleux, De Poli, Zolzer 2002].
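A minimal brassage sketch: cut the signal into grains and splice them back in random order. Grain length and function names are illustrative; a polished granulator would also cross-fade or window the grain boundaries:

```python
import numpy as np

def brassage(x, grain_len=1000, seed=0):
    """Time shuffling: segment the signal into equal-length grains
    and re-splice them in random order (digital micro-splicing)."""
    rng = np.random.default_rng(seed)
    grains = [x[i:i + grain_len]
              for i in range(0, len(x) - grain_len + 1, grain_len)]
    rng.shuffle(grains)
    return np.concatenate(grains)

x = np.arange(10000, dtype=float)   # a ramp, so reordering is easy to see
y = brassage(x)
```

The output contains exactly the same samples as the input, only reordered grain by grain.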
Overlap-Add Methods for Time Shifting / Time Stretching
There are many ways to perform time shifting and time stretching with the granular technique. For
example we can affect the duration of the signal without changing the pitch by cloning or omitting
grains. In a similar way we can shift the pitch and conserve the duration using the above technique and
changing appropriately the sampling rate. These techniques were first explored in 1968 [Otis, Grossman
and Cuomo 1968][Roads 1991] but they gave poor results. More sophisticated techniques, based once
again on the time-segment processing paradigm, followed later. For simplicity we will call all these time
shift / time stretch techniques overlap-add methods. The most famous are the SOLA and the PSOLA
methods.
The basic idea of the SOLA algorithm (Synchronized OverLap Add), originally proposed by Roucos and
Wilgus [Roucos & Wilgus 1985], consists of the decomposition of the input signal into equal-length
successive segments of relatively short duration (N = 10 msec to 40 msec), which are then re-positioned
with a time shift. Care must be taken to avoid the discontinuities that appear at the time instants where
segments are joined together. In the case of SOLA we apply a fade-in and fade-out on the overlapping
blocks, starting at the point where the two overlapped segments are maximally similar [Dutilleux, De Poli,
Zolzer 2002]. Thus it is necessary to compute the similarity between the overlapping parts; the cross-
correlation function is the most standard technique. For pitch-scale modification we combine this
algorithm with resampling.
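The SOLA idea can be sketched as follows: segments taken at the analysis hop are re-positioned at stretched intervals, the splice point is chosen by maximizing the cross-correlation of the overlapping parts within a small search range, and the overlap is cross-faded. All parameter names and values are illustrative, not taken from [Roucos & Wilgus 1985]:

```python
import numpy as np

def sola_stretch(x, alpha=1.5, seg=800, hop=400, search=200):
    """SOLA sketch: time-stretch by factor alpha without pitch change."""
    out = np.array(x[:seg], dtype=float)
    n_segs = (len(x) - seg) // hop
    for m in range(1, n_segs):
        segment = x[m * hop : m * hop + seg].astype(float)
        nominal = int(round(m * hop * alpha))      # stretched target position
        overlap = len(out) - nominal
        if overlap <= 0:
            out = np.concatenate([out, segment])
            continue
        # search the shift maximizing similarity between the overlaps
        best_k, best_c = 0, -np.inf
        for k in range(-min(search, nominal), search + 1):
            ov = len(out) - (nominal + k)
            if ov <= 0 or ov > seg:
                continue
            c = np.dot(out[-ov:], segment[:ov])    # unnormalized cross-correlation
            if c > best_c:
                best_c, best_k = c, k
        start = nominal + best_k
        ov = len(out) - start
        fade = np.linspace(0.0, 1.0, ov)           # fade-out old, fade-in new
        out[start:] = (1 - fade) * out[start:] + fade * segment[:ov]
        out = np.concatenate([out, segment[ov:]])
    return out
```

With alpha > 1 the output is longer than the input while the local waveform, and hence the pitch, is preserved.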
The PSOLA (Pitch-Synchronous Overlap and Add) method [Moulines & Charpentier 1990] is a slight
variation of the algorithm described above. It uses the pitch information from the incoming signal to
avoid pitch discontinuities. The length of the analysis time-segments is adjusted according to the local
value of the pitch. With the PSOLA algorithm we can preserve the formants. If prior to the overlap and
add operation the short time segments are resampled, the formants will be modified accordingly. Thus we
can change the formant position in the signal without affecting the pitch and the duration of the sound.
If we lower the sampling rate by a factor γ we raise the formants by the same factor [Laroche 1998].
A simple description of the algorithm can be found in [Dutilleux, De Poli, Zolzer 2002].
A similar approach to the PSOLA algorithm is Lent's algorithm [Lent 1988][Bristow-Johnson 1995].
Variations of the PSOLA algorithm can be found in [Peeters 1998].
Many other algorithms exist in the family of overlap-add methods for time and pitch alteration of an
audio signal, besides the widely used SOLA and PSOLA. All of them are based on the same idea and differ
only in the choice of the segment durations, splicing and weighting windows. Some of them are the
Synchronized Adaptive OverLap-Add SAOLA algorithm [Dorran, Lawlor, Coyle 2003a], the Peak Alignment
OverLap-Add PAOLA [Dorran, Lawlor, Coyle 2003b], the Subband Analysis Synchronized OverLap-Add
SASOLA [Tan & Lin 2000], the Waveform Similarity OverLap-Add WSOLA [Verhelst & Roelands 1993],
the Variable Parameter Synchronized OverLap Add VSOLA [Dorran & Lawlor 2003c] and the Transient
and Voice Detecting Zero Crossing Synchronous OverLap-Add TvdZeroXSola [Pesce 2000].
2.2 Time-Frequency Models
2.2.1 Phase Vocoder
The vocoder first appeared as a voice coding technique, under the name channel vocoder, introduced by
Dudley in 1939. Analog vocoders like those of Dudley, Moog and Bode are of this type. Flanagan and Golden first
described the phase vocoder in 1966 [Flanagan & Golden 1966]. The signal was represented by a sum
of sine waves modulated in frequency and amplitude. The basic objective is the separation of temporal
from spectral information.
Two complementary viewpoints of the phase vocoder exist: the filterbank interpretation and the Fourier
transform interpretation [Dolson 1986][Portnoff 1976][Allen & Rabiner 1977]. The Fourier transform
interpretation is equivalent to the Gabor transform, although this approach was developed later. It can
be seen as a STFT with a Gaussian function for the window [Arfib 1991][Arfib & Delprat 1993].
An important number of audio effects can be developed with the phase vocoder. Good references for
those effects are [Arfib, Keiler, Zolzer 2002][Wishart 1994][Laroche & Dolson 1999][Smith 2007].
Below we name a few of them.
Filtering can be achieved by multiplying every frame by a filtering function in the frequency domain.
Before this multiplication, we have to zero pad the windowed input signal and the filtering function in
the time domain to avoid the aliasing effects of the circular convolution. The time scale modification
algorithm, or time stretching, consists of providing a different synthesis grid from the analysis grid. Three pitch-shifting algorithms based on resampling and time-stretching can be found in [Laroche 1998].
In the robotization effect we put zero phase values on every STFT frame before reconstruction. If the
phases of the STFT values take random values, we get a whisper-like effect. It is also interesting to
randomly change the magnitude and keep the same phase. The denoising effect is a frequency-dependent
dynamics controller: using a non-linear transfer function on the analysed sound, we modify the intensities
of the input's frequency components. The phase is kept as it is, while the magnitude is processed to
attenuate the noise. The mutation effect reconstructs a sound from the STFTs of two sounds.
Diagrammatic descriptions for many other digital audio effects like spectral shifting, spectral freezing,
spectral shaking, spectral undulation, spectral interpolation, spectral blurring and spectral interleaving can
be found in [Wishart 1994].
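The robotization effect described above is among the simplest to sketch: analyze with a windowed STFT, discard the phases (keep the magnitudes), and resynthesize by overlap-add. Frame and hop values are illustrative:

```python
import numpy as np

def robotize(x, frame_len=256, hop=64):
    """Phase-vocoder robotization sketch: zero the phase of every
    STFT frame before reconstruction by windowed overlap-add."""
    window = np.hanning(frame_len)
    y = np.zeros(len(x))
    norm = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        spectrum = np.fft.rfft(x[start:start + frame_len] * window)
        frame = np.fft.irfft(np.abs(spectrum))   # magnitude only: zero phase
        y[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    return y / np.maximum(norm, 1e-9)            # compensate window overlap
```

Replacing `np.abs(spectrum)` with `np.abs(spectrum) * np.exp(1j * random_phases)` would give the whisper-like variant mentioned above.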
2.2.2 Wavelet Transform
The wavelet transform (WT) was first proposed for sound transformation purposes by Kronland-Martinet
[Kronland-Martinet 1988][Dutilleux, Grossmann and Kronland-Martinet 1998]. The underlying idea of the
WT is the decomposition of a signal into elementary grains that are called wavelets. Wavelets are
obtained by time shifts and compressions of a mother wavelet. This procedure derives a two-dimensional
time-frequency representation of the audio input.
A number of sound modifications may be achieved, similar to the Phase Vocoder: filtering, time shifting
the analyzing voices, frequency transposition, the brightness effect, cross-synthesis and thresholding are
some that can be found in [Kronland-Martinet 1988][Evangelista 1991][Gerhards 2002].
2.2.3 Other Time-Frequency Techniques
The Comb Wavelet Transform (CWT) [Evangelista 1994], the Multiplexed Wavelet Transform (MWT)
[Evangelista 1994] and the Pitch-Synchronous Wavelet Transform (PSWT) [Evangelista 1993] are a more
recent family of Wavelet Transforms based on the pitch information of the signal. The MWT is a
particular case of PSWT and applies to signals with constant pitch.
The Wavelet Packet Transform (WPT) developed by Ronald A. Coifman generalizes the wavelet transform.
It generates a set of orthonormal transform bases of which the wavelet transform basis is only one
member. The variety of orthonormal bases which can be formed for the WPT algorithm, coupled with the
infinite number of wavelets and scaling functions which can be created, yields a very flexible analysis tool
used for analyzing, modifying and re-synthesizing sounds [Gerhards 2002]. Apart from the classical
effects implemented with any time-frequency analysis/synthesis technique, the WPT permits modifications
to be made in the resynthesis process by changing the basis filters.
2.3 Parametric/Signal-Models
2.3.1 Spectral Models
The Sinusoidal model is based on modeling time-varying spectral characteristics of sound as sums of
time-varying sine waves [McAulay & Quatieri 1986][Quatieri & McAulay 1998][Smith & Serra
1987][Serra 1997]. A large class of sounds may be represented in terms of estimated amplitudes,
frequencies and phases of sinusoids. The motivation for this model is the Fourier transform of periodic
signals. This model works well for inharmonic and pitch-changing sounds, in contrast to the Phase
Vocoder analysis/synthesis technique, but it is not suited for noise-like ones. The few differences
between the McAulay-Quatieri and Smith-Serra approaches are mentioned in [Serra & Smith 1990].
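The resynthesis side of the model can be sketched as a plain sum of sine waves. For simplicity each partial is held at constant amplitude and frequency; a full sinusoidal model would interpolate the estimated amplitudes, frequencies and phases frame by frame. Function name and parameter values are illustrative:

```python
import numpy as np

def additive_resynth(partials, dur, fs):
    """Sum-of-sinusoids resynthesis: each partial is an
    (amplitude, frequency-in-Hz) pair, constant over the note."""
    t = np.arange(int(dur * fs)) / fs
    return sum(a * np.sin(2 * np.pi * f * t) for a, f in partials)

# an inharmonic spectrum: partials are not integer multiples of 110 Hz
tone = additive_resynth([(1.0, 110.0), (0.5, 247.0), (0.25, 391.0)], 1.0, 8000)
```

Because each partial carries its own frequency track, inharmonic and pitch-changing sounds pose no difficulty, which is exactly the strength noted above.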
Although it is possible to model noise by a sum of sinusoidal signals, it is neither efficient nor meaningful
[George & Smith 1992]. In 1989 Serra and Smith proposed an expansion of the previews model by
adding the residual left by the subtraction of the modeled sinusoids from the original input signal called
25
Spectral Modeling Synthesis (SMS) [Serra & Smith 1990][Serra 1997][Amatriain, Bonada, Loscos, Serra
2002]. In SMS the residual is modeled as a stochastic signal so it is described mathematically as filtered
white noise.
The Sinusoidal + Residual + Transients model is a three-part model that extends the Sinusoidal + Residual
model proposed by Serra with a transient part [Verma & Meng 2000]. This model, called Transient
Modeling Synthesis (TMS), overcomes a pitfall of the Sinusoidal + Residual model concerning transients:
when the transients are modeled as filtered noise, the resulting sound loses its sharpness and its attack.
One method that had been considered earlier was removing the transient parts from the residual during
the noise analysis and then adding them back in at synthesis, but it is clear that this technique does not
retain the flexible spirit of the Sinusoidal + Residual model.
Many effects can be implemented using spectral models. In this case the analyzed data are modified
before resynthesis. The most common effects that appear in the literature are listed below [Bonada
2000][Amatriain, Bonada, Loscos, Serra 2001][Amatriain, Bonada, Loscos, Serra 2002][Serra 1994].
By modifying the amplitude of the partials we can easily obtain a filter with arbitrary resolution. A
frequency shift can be applied to all the partials in order to derive a non-harmonic spectrum. It is also
possible to multiply the partial frequencies by a scaling factor to obtain a pitch shifter. Pitch transposition
with timbre preservation can be achieved if we scale all the partials by the same multiplying factor and
then apply the original spectral shape. Vibrato and tremolo can easily be applied to the analyzed partials
using the previous techniques, possibly with a different modulation depth at different frequencies. The
spectral shape shift produces many interesting effects without generating new partials. By combining
spectral shape shift with pitch transposition with timbre preservation we are able to change the gender
of a given vocal sound. If we add pitch-transposed versions of the original signal we get a harmonizer.
By applying a gain to the residual component we get an effect similar to hoarseness. Morphing is the
transformation by which a new sound with hybrid properties is generated from two or more sounds. It is
easily achieved by interpolating their spectral representations. We can interpolate all the parameters or
only a part of them, such as the fundamental frequency. If we use the deterministic magnitudes of one
sound with the frequencies of the other we get the classical vocoder effect.
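The pitch-transposition-with-timbre-preservation operation described above can be sketched in a few lines (an illustrative toy, with our own naming; a real implementation would estimate the spectral envelope from the analysis data rather than take it as a given function): the partial frequencies are scaled, and each amplitude is then re-read from the original spectral envelope so that the formants stay in place.

```python
def transpose_preserve_timbre(partials, envelope, factor):
    """Scale every partial frequency by `factor`, then take each amplitude
    from the ORIGINAL spectral envelope, so the spectral shape (formants)
    is preserved while the pitch moves.
    partials: list of (freq_hz, amp); envelope: callable freq_hz -> amp."""
    return [(f * factor, envelope(f * factor)) for f, _ in partials]

# toy spectral shape: amplitude falls off linearly up to 4 kHz
env = lambda f: max(0.0, 1.0 - f / 4000.0)
partials = [(440.0, env(440.0)), (880.0, env(880.0))]
shifted = transpose_preserve_timbre(partials, env, 2.0)  # one octave up
```

Plain pitch shifting would instead keep the original amplitudes, dragging the spectral envelope along with the partials.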
2.3.2 Source-Filter Models
Source-filter processing is a typical analysis/synthesis technique that originates in the voice
production and recognition system: the vocal cords are the excitation system (the source) and the
mouth and nose are the resonator system (the filter). We thus have a rich source on which a spectral
envelope is superimposed. In a source/filter system the goal is to extract the spectral envelope of a
sound or, in other words, to separate the excitation from the resonance. Three techniques are used for
the estimation of the spectral envelope: the channel vocoder [Flanagan & Golden 1966], linear
prediction [Atal & Hanauer 1971] and the cepstrum [Noll 1964]. The input stream can be cut into
overlapped frames and analyzed as in the phase vocoder. For synthesis, the overlap-add method is used.
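Of the three envelope estimators just mentioned, the cepstral one is the simplest to sketch. The following illustrative Python (our own construction, using a naive DFT for clarity; real systems use the FFT and tuned lifter lengths) takes the log-magnitude spectrum of a frame, moves to the quefrency domain, keeps only the low quefrencies — the slowly varying part, i.e. the envelope — and transforms back:

```python
import cmath
import math

def dft(x):
    """Naive DFT, adequate for a short illustrative frame."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def cepstral_envelope(frame, n_keep=4):
    """Spectral-envelope estimate via the real cepstrum:
    log-magnitude spectrum -> quefrency domain -> low-pass lifter
    (keep only slow spectral variation) -> smoothed magnitude spectrum."""
    log_mag = [math.log(abs(X) + 1e-12) for X in dft(frame)]
    ceps = idft(log_mag)
    N = len(ceps)
    liftered = [c if (k < n_keep or k > N - n_keep) else 0.0  # low-pass lifter
                for k, c in enumerate(ceps)]
    return [math.exp(S.real) for S in dft(liftered)]
```

The liftered result is a smooth curve riding over the spectral peaks; dividing the frame's spectrum by it yields an estimate of the excitation.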
A simple presentation of the phase vocoder in a musical context can be found in [Arfib, Keiler,
Zolzer 2002].
After separating the source signal from the filter, transformations can be applied to the spectral
envelope and to the source signal, giving interesting audio effects. The first publications regarding
LPC-based digital audio effects are those of Dodge, whose composition “Speech Songs” dates from 1973
[Dodge 1989], Moorer [Moorer 1979b] and Lansky [Lansky 1989].
The vocoding or cross-synthesis effect generates a combination of two sound inputs. The spectral
envelope of one sound is used to filter the other sound or, in an improved version, the excitation signal
of the other sound. The musical instrument called the Vocoder uses this technique. In the formant-changing
effect we remove the original spectral envelope of the sound and impose a frequency-warped version
of it. In spectral interpolation we mix the excitation signals and the spectral envelopes of two sounds.
With this technique, unlike simple mixing, the excitation of one sound influences the spectral envelope
of the other. We can perform time-varying spectral interpolation if the coefficients are time-varying
variables. To pitch-shift a sound without changing its formant structure, we must keep the spectral
envelope of the original sound. If we excite the computed filter with an artificially created pulse-like
signal, over whose timing and periodicity we have control, and update this filter in the synthesis part
at a rate different from the analysis rate, we can obtain a time-scaled version of the signal which preserves
the formant structure. Because the frame rate is changed, we can use interpolation to calculate the
coefficients between the analyzed frames. Some ways in which to process the cepstrum features can
be found in [Pabon 1994].
Chapter 3
Control of Digital Audio Effect Algorithms
This chapter could easily have been a subsection of the previous one. We will continue in a similar fashion
to survey digital audio effects, but from a different point of view: their control. The main control
tendencies of audio effect algorithms will be addressed in the following few pages. The chapter
organization is based on the five control categories proposed by Verfaille et al. in [Verfaille
Wanderley Depalle 2006]: gestural, low frequency oscillator (LFO), automation, algorithmic and adaptive.
All of these control types can be combined in a number of different ways.
Todoroff describes the control problem simply as: “control, in the broad meaning of the word,
encompasses every possible method available to the user for accessing the various parameters of a
digital audio effect” [Todoroff 2002]. But an interesting question arises immediately: where is the
frontier between the control part and the signal processing part of a digital audio effect? In a flanger, for
example, the LFO is part of the processing algorithm and not of the control system. Verfaille et al.
proposed that: “In order to define the frontier between processing and control mapping level, we
consider as part of the effect the control mapping layer that is specific to the way it sounds, whereas
the other control layers belong to the control mapping level.” [Verfaille Wanderley Depalle 2006].
Hidden within the previous quote is the idea of decomposing an audio effect into two
independent units: the signal processing unit and the control unit. Mapping procedures are then
designed and employed to link those two units in a musically and perceptually meaningful way [Arfib
Couturier Kessous Verfaille 2002][Van Nort Castagne 2007][Castagne 2007]. We will see in chapter 4
that the mapping concept is not the only control solution for digital audio effect algorithms. The
present thesis is devoted to a novel type of control that re-introduces the mechanical constraints of
acoustical instruments into sound transformation algorithms.
3.1 Gestural Control
The most common way to control a digital audio effect is through physical gestures. Cadoz proposed a
gesture typology in [Cadoz 1994]. A very interesting general study on the gesture-music theme
can be found in [Cadoz 1999][Cadoz Wanderley 2000]. But how can this particular link between
gestures and audio effects be established? What kinds of gestural controllers are involved?
A variety of gestural controllers have been proposed and developed for musical applications [Roads
1996][Wanderley Battier 2000][Todoroff 2002][Miranda Wanderley 2006]. Wanderley and Depalle
propose the following controller categories: instrument-like controllers, instrument-inspired controllers,
extended instruments and alternate controllers [Wanderley Depalle 2004]. In the same article they
define: “…the gestural controller is the part of the DMI (Digital Music Instruments) where physical
interaction takes place. Physical interaction here means the actions of the performer, be they body
movements, empty-handed gestures, or object manipulation, and the perception by the performer of the
instrument’s status and response by means of tactile-kinesthetic, visual, and auditory senses.”
Since the gestural channel – we will not generalize the discussion towards multimodal interaction – is
the sum of gestural action and gestural perception, sensors and actuators should be part of the gestural
interface* [Luciani 2007b]. To the author’s knowledge, interfaces of this type have never
been proposed or used for sound transformation purposes.
Direct modification of the effect input parameters by gesture transducers is the simplest and without
doubt the most standard approach (figure 3.1a). A good example is a control surface manipulated by
the user using knobs, potentiometers and buttons. In this case the mapping is direct, explicit and
one-to-one [Verfaille Wanderley Depalle 2006].
Direct control with gesture analysis is a more complicated control scheme (figure 3.1b). Gesture
analysis can be considered the process of extracting and measuring information from gesture [Volpe
2007]. In this type of control the mapping is divided into a gesture feature extraction layer and a
mapping layer that transforms those features into effect control values [Verfaille Wanderley Depalle
2006]. A similar mapping approach is employed in the adaptive control that we will present in
section 3.5.
Figure 3.1 (a) direct control of DAFx (b) direct control with gesture analysis of DAFx (xc: gesture
control input, x/y sound input/output)
3.2 LFO Control
Low Frequency Oscillator (LFO) control consists simply of having an LFO drive the control input
parameter of the digital audio effect. A typical application of this type of control is the flanger effect,
where the delay line length is controlled continuously by an LFO. As we mentioned before, this specific
control is part of the effect.
* We prefer the term interface to the term controller
A typical LFO unit includes the following output waveforms: sine, square, triangular and sawtooth. The
square wave can be modified by pulse-width modulation. References for the digital signal processing
design of those algorithms can be found in [Stilson Smith 1996][Lane Hoory Martinez Wang 1997]
[Dattorro 2002]. Other waveforms, produced by wavetable generators, may also be used [Orfanidis
1996].
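The four standard waveforms above can be generated directly from a normalized phase, as in this illustrative sketch (our own naming; anti-aliasing concerns, which the cited references address, are ignored here since an LFO runs well below audio rate):

```python
import math

def lfo_sample(shape, freq_hz, t, pulse_width=0.5):
    """One sample of an LFO at time t (seconds).  All waveforms span [-1, 1].
    The square wave supports pulse-width modulation via `pulse_width`."""
    p = (freq_hz * t) % 1.0                      # normalized phase in [0, 1)
    if shape == "sine":
        return math.sin(2.0 * math.pi * p)
    if shape == "square":
        return 1.0 if p < pulse_width else -1.0  # pulse-width modulated
    if shape == "triangle":
        return 1.0 - 4.0 * abs(p - 0.5)
    if shape == "sawtooth":
        return 2.0 * p - 1.0
    raise ValueError("unknown LFO shape: " + shape)
```

A flanger, for instance, would use such an oscillator to modulate its delay-line length around a base value, e.g. `delay = base + depth * lfo_sample("sine", 0.3, t)`.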
3.3 Control Based on Automation
Automation is control given through a sequencer editor. In this type of system the control data is
recorded or written by the musician and then played back. MIDI is a standard protocol used for this
purpose. Digital sequencing is probably the most common composing paradigm in popular music today.
All modern sequencers provide many helpful sequence-editing operations.
Often the sequencing is programmed by graphical means. Graphical user interfaces (GUI) give the
opportunity to represent the control messages by segment-line curves. Musicians can generate the
desired curves interactively by defining a few points. The tools that let musicians create and modify
functions of time using interactive graphics techniques are called function editors [Roads 1996]. Those
editors are normally built into sequencers and many other music applications.
3.4 Algorithmic Control
A very common control approach used by contemporary composers is algorithmic control. A variety
of algorithms have been employed in electroacoustic or mixed music pieces to modify the control input
parameters of an audio effect: stochastic processes, neural networks, cellular automata, genetic
algorithms, fractal structures, chaos generators, information extracted from collected data and,
in general, all the strategies used in algorithmic composition are more and more becoming the preferred
control choices of “experimental” composers [Roads 1996][Todoroff 2002]. Modern composition
languages such as OpenMusic have recently allowed for the control of a variety of sound processing
tools, such as the SuperVP phase vocoder [Bresson 2006]. Many other languages, among them Csound,
SuperCollider, Max/MSP, Pd and Cmusic, are used to control digital audio effects algorithmically.
Unfortunately, the neomanic cultural atmosphere that surrounds contemporary music demands
complex and highly “sophisticated” algorithmic control schemes that frequently take over the musical
goals. Regrettably, in the end it often seems that the algorithm is more important than the perceived
musical output…the processed sound!
3.5 Adaptive/Content-Based Control
Content-based transformation concerns the processing of audio signals where the signal content plays
an essential role in the algorithm. The word “content” is used for any piece of information related to
the signal that is meaningful to the targeted user and carries semantic information [Amatriain Bonada
Loscos Arcos Verfaille 2003]. This type of transformation includes a feature extraction step.
Sometimes this analysis may be omitted because the input stream can contain content description data.
The description of content can be represented by a hierarchical structure, from low-level features like
the sound level to high-level features such as the notes. A content-based transformation can focus on
transformations of sound properties related to signal properties, modifying the main perceptual axes of
sound (timbre, pitch, loudness, duration and spatial position); it can also affect the higher musical or
symbolic layers, modifying the melodic motive, the tempo, the dynamics, the articulation of a musical
phrase, etc.
The low-level descriptors are purely syntactic because they do not carry any information on the actual
meaning of the source; they are closely related to the signal or any of its representations [Amatriain,
Herrera 2001]. The most common descriptors are the amplitude envelope, the zero-crossing rate, the
voiced/unvoiced indicator, the spectral envelope, the spectral centroid and the fundamental frequency.
All of these can be computed directly in the time domain or the frequency domain [Dutilleux, Zolzer
2003][Arfib, Keiler, Zolzer 2003][Verfaille 2003]. By using spectral models we obtain features like the
amplitude of the sinusoidal component, the amplitude of the residual component, the spectral shape of
the sinusoidal component, the spectral shape of the residual component, the harmonic distortion, the
degree of harmonicity, the noisiness and the spectral tilt [Serra, Bonada 1998].
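As an example of how cheap a low-level time-domain descriptor can be, the zero-crossing rate from the list above reduces to counting sign changes between adjacent samples (an illustrative sketch; normalization conventions vary across the cited references):

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ: a simple
    time-domain correlate of noisiness / high-frequency content, often
    used as a crude voiced/unvoiced indicator."""
    pairs = list(zip(frame, frame[1:]))
    crossings = sum(1 for a, b in pairs if (a >= 0.0) != (b >= 0.0))
    return crossings / len(pairs)
```

A noise-like frame alternates sign often and scores near 1, while a low-pitched voiced frame scores near 0.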
High-level descriptors may be syntactic or semantic and are normally computed from low-level
descriptors. Amatriain and Herrera point out [Amatriain, Herrera 2001]: “…Syntactic descriptors usually
refer to features that can be understood by an end-user without previous signal processing knowledge as
they may refer to psycho acoustical properties of the source, but they do not actually carry any
semantic meaning about the content itself. In other words, syntactic descriptors cannot be used to label
a piece of sound according to what actually ‘is’ but rather to describe how it is structured or what it is
made of. In that sense, the computation of syntactic descriptors (either low or high-level) is not
dependent on any kind of musical knowledge…The ultimate purpose of a semantic descriptor is to label
the piece of sound to which it refers using a commonly accepted concept or term...”. The term high-level
is not clearly defined; for example, the fundamental frequency is a low-level descriptor but the pitch is
a high-level descriptor.
All these descriptors are normally computed over constant-length frames of the input signal [Amatriain,
Bonada, Loscos, Serra 2001]; for example, they are computed directly from the analysis frames of the
analysis/synthesis models or the spectral models. Apart from the instantaneous or frame values,
attributes that characterize the time evolution of the features are calculated from adjacent frames.
Global attributes, such as the average of a feature, are computed directly from the instantaneous
attributes.
An important step is the segmentation of the sound into regions that are homogeneous in terms of
sound attributes. Then the region attributes can be extracted. A simple segmentation process is to
divide a melody into notes and silences and then each of the notes into attack, steady state and
release regions.
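The note/silence part of such a segmentation can be sketched as a thresholding of a frame-level amplitude envelope into contiguous runs (an illustrative toy with our own naming; real segmenters use hysteresis and several features, not a single threshold):

```python
def segment(envelope, threshold=0.05):
    """Split a frame-level amplitude envelope into contiguous 'note' /
    'silence' regions, returned as (label, start, end), end exclusive."""
    regions, start = [], 0
    for i in range(1, len(envelope) + 1):
        cur = envelope[i - 1] >= threshold
        # look ahead one frame; force a final boundary at the end
        nxt = envelope[i] >= threshold if i < len(envelope) else not cur
        if cur != nxt:                      # region boundary
            regions.append(("note" if cur else "silence", start, i))
            start = i
    return regions
```

Each "note" region could then be subdivided into attack, steady state and release, for instance by locating the envelope peak and the point where it settles.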
The concept of controlling digital audio effects by sound features has been generalized and formalized
by Verfaille under the name adaptive digital audio effects (A-DAFx) [Verfaille 2001]. According to his
definition, adaptive digital audio effects are effects controlled by parameters that are extracted from the
sound itself [Verfaille Arfib 2001]. This is an old but very successful control paradigm: many well-known
effects fall into this category, such as dynamics processors, harmonizers and auto tremolo.
Furthermore, the adaptive scheme can be combined with gestural data [Arfib Verfaille 2003].
The general proposed structure of an A-DAFx is depicted in figure 3.2. Three steps are needed: (a) the
feature extraction, (b) the mapping between features and effect parameters, (c) the transformation
algorithm. The mapping parameters can be modified by gesture data. This scheme can be varied by
performing the feature extraction step on the processed sound or by using features from another
sound to control the digital audio effect. The last scenario is called cross-adaptive digital audio effects.
Figure 3.2 Structure of an A-DAFx with gestural input (xc: control input, x/y: input/output sound)
As we have already mentioned, many sound descriptors exist. The most common ones used for
A-DAFx are the RMS, the spectral centroid, the fundamental frequency and the voiced/unvoiced
indicator. The proposed mapping scheme is divided into two levels; details can be found in [Verfaille
Wanderley Depalle 2006].
A considerable number of A-DAFx have been published over the last few years [Verfaille, Arfib
2001][Verfaille, Arfib 2002][Verfaille 2002][Verfaille, Depalle 2004][Verfaille Zoelzer Arfib
2006][Verfaille Wanderley Depalle 2006]. Other content-based digital audio effects that do not strictly
follow the same mapping approach are described in [Cano Loscos Bonada Boer Serra 2000][Gomez
Peterschmitt Herrera 2003][Gouyon Fabig Bonada 2003][Janer Bonada Jorda 2006].
Chapter 4
Physical Audio Effect Models
4.1 Digital Audio Effects and Physical Modeling: La raison d’être of this PhD Research
In the previous chapter we presented a survey of the most common digital audio effects and their
control methodologies. Many of those algorithms were based on physical and acoustical phenomena: the
echo effect, reverberation, and even filters are experienced in acoustical spaces. There is therefore a
strong relation between sound transformations and musical or architectural acoustics. Physical modeling
expressed algorithmically many of the mathematical models from the field of musical acoustics and gave
birth to various interesting digital audio effects*. These effects, even if they are synthesized by digital
means, are classified in the propagation medium processing category presented in chapter 1.
Physical modeling is a buzzword. It is often employed in the domain of “Virtual Analog”, which brought
many vintage analog audio effects to the foreground. It has been used mainly to emulate
“classical” analog audio signal processing systems such as the Moog filter, distortion stompboxes and
modulation effects. Stilson gives a clear definition of virtual analog in his PhD thesis [Stilson 1996]:
“Virtual Analog is the attempt to implement digitally the algorithms employed in analog synthesizers
(most typically subtractive synthesis) and hence emulate their sounds”. Even though this definition
concerns the sound synthesis domain, it can be generalized to include audio effects.
In both cases, virtual analog and physical modeling, the starting point is a mathematical model of a
musical instrument from the field of musical acoustics or electronics. The continuous-time systems are
decomposed into functional parts that are described by mathematical equations; then these equations are
expressed in the discrete-time domain as computer algorithms. Surprisingly, physically based
modelling has rarely been proposed as a method for the design and conception of new digital audio
effect algorithms.
Furthermore, all the digital audio effects with direct physical references – we may call them physical
audio effect models** – still support the control/mapping paradigm, in which the control surface and the
sound processing unit are independent, dissociated, and related to each other by mapping strategies
[Miranda Wanderley 2006]. Our hypothesis is that the interesting expressive possibilities of acoustic
musical instruments come from the energetic coupling between the player and the instrument [Cadoz
Luciani Florens 1984][Castagne 2007]; our goal is to apply this type of interaction to audio effect
algorithms.
* It is not necessary to repeat here the state of the art given in chapter 2
** We will give a definition more relevant to the present PhD research at the end of chapter 4
We should note that even in virtual analog the control problem is crucial: much effort is devoted to
designing a digital system that offers the same controls and control responses as the original analog
system [Smith 2007]. We see that the goal of physical modelling is often not only to offer a functional
model but a structural one [Cadoz 1990].
According to the previous discussion, we are now ready to answer the question: why digital audio
effects based on physical modelling?
First of all, audio effects often result from the creative use of technology. That happened in a very
obvious manner with the phase vocoder, linear prediction and time-segment processing. We thus believe
that the application of physically based techniques to the design of new musical sound transformations
remains a challenge and quite an open question.
Moreover, if we consider that our aural experience is somehow “grounded” in physical experience, it is
logical that we are more sensitive to transformations that occur by physical means. The simple delay line
offered a variety of audio effects probably because we experience delay-type phenomena every day in
acoustical spaces. Something similar could happen with mechanical vibrating systems. Instrumental
interaction surely reinforces the previous statement.
Therefore the motivation of this PhD research is twofold:
× Propose a framework for the design of physical audio effect models
× Design digital audio effects with a system structure that supports physical instrumental
interaction rather than the mapping paradigm
An orientation towards reverberation algorithms based on physical modelling has been excluded on
purpose, since it is already the object of significant research (see chapter 2).
4.2 Digital Audio Effects and Instrumental Multisensory Interaction: La Nouveauté of this PhD Research
In the present thesis, we propose a novel approach to musical sound transformation based on the
physical simulation of vibrating structures, with the aim of investigating the possibilities of physical
modeling to provide more “plausible” sound modifications and alternative control procedures. Briefly, our
method is centered on the numerical simulation of vibrating physical objects: at first the input digital
audio signal feeds a properly designed virtual viscoelastic system that matches the general specifications
of the desired effect; then a set of mechanical manipulations takes place which dynamically modifies
the input sound. This procedure thus gives the sound modification a purely “materialistic” nature: it is
the “matter” which is manipulated after all, and not the signal. According to the classification proposed
in chapter 1, this approach follows the Propagation Medium Processing paradigm.
The initial concept behind our models is to establish a physical interaction between the user-musician and
an audio effect unit that has virtual material substance. This is of course feasible only through suitable
ergotic interfaces or gesture models, as we will see in chapter 6. We should understand that
in this type of “control” no mapping layer exists between gesture and sound, since no representation is
involved in this situation, only physical processes (figure 4.1). Parametric control is replaced by the
mechanical modifications or transformations of the object that plays the role of the audio effect.
Figure 4.1(a) Usual structure of a contemporary real time sound system (b) the traditional instrumental
relation (from [Castagne, Cadoz, Florens, Luciani 2005])
It is obvious from the previous description that this instrumental approach to digital audio effects is
significantly correlated with the instrumental gesture typology proposed by Cadoz [Cadoz 1994][Cadoz
Wanderley 2000]. This analogy is explained in detail below.
Excitation Gesture: The sound source is considered as a perturbation in space. A necessary
condition for it to be heard is that it interacts with a physical object, which is considered as the physical
audio effect model. With our gestures we can move this source in space and evoke or stop this
interaction. The energy comes from the sound source and from the gesture that carries it.
Modification Gesture: Alters the properties of the physical audio effect model. All the dynamical
modifications of the effect are based on modification gestures: damping, grasping, stretching, bending,
pressing, inserting/removing extra parts are some of the available techniques to modify it during the
simulation.
Selection Gesture: Selects the physical audio effect model to which the input sound source is applied.
We could claim that our approach to digital audio effects is completely symmetrical to the one proposed
by Berdahl [Berdahl Smith 2006]. Our proposition is to derive sound transformations by physically
controlling – or, better, physically interacting with – a virtual object. Berdahl, on the contrary, proposes
a methodology to derive sound transformations by controlling real objects, more particularly strings,
by signal processing means: active feedback control.
In the rest of this thesis we will often employ the term physical audio effect model. For our purposes, a
relevant definition is the following:
Physical audio effect model refers to any digital audio effect algorithm that is designed to simulate a
physical system, including the physical interaction with that system, which transforms input sound
signals.
Chapter 5
CORDIS-ANIMA System Analysis
The aim of this part of the thesis is to shed light on some special and particular features of the
CORDIS-ANIMA (CA) formalism by exploring it mathematically. CA models are studied and presented
using several useful system representations: CA networks, finite difference representations, Kirchhoff
representations, digital signal processing block diagrams, state space internal descriptions, system
function input/output external descriptions and wave flow diagrams. This mathematical analysis and
formal approach was crucial in the context of our research on digital audio effects.
Most of the material covered in this chapter could be presented under the name “mathematical aspects
of the CORDIS-ANIMA system”. The two basic references that influenced the subject and the organization
of this chapter are Julius O. Smith III’s book Physical Audio Signal Processing for Virtual Music Instruments
and Digital Audio Effects [Smith 2005] and Nicos Kalouptsidis’s book Signal Processing Systems: Theory
and Design [Kalouptsidis 1997]. The majority of the present work, even if it seems basic, is original
and has been based on the few sparse resources concerning the CA description [Cadoz Luciani Florens
1984][Cadoz Luciani Florens 1990][Florens, Cadoz 1991][Incerti 1993][Djoharian 1993][Habibi 1997]
[Kontogeorgakopoulos Cadoz 2007]. Before passing to the core of this research we present some
essential points concerning physical modeling and simulation.
In every physical modelling technique, mechanical and acoustical systems governed by physical laws are
modelled using several mathematical formalisms and simulated with the use of numerical methods and
digital computers. We shall now review some of the main techniques that have been introduced and
proposed in the domain of computer music over the last forty years [Smith 1996][Valimaki Takala 1996]
[De Poli Rocchesso 1998][Rabenstein Trautmann 2001]:
× simulation techniques based on numerical analysis, i.e. finite difference schemes
× the CORDIS-ANIMA mass-interaction modular scheme
× lumped models used to approximate parts of physical systems, like a singer’s vocal folds or a brass
player’s lips
× the digital waveguide scheme, oriented toward the wave equation
× the modal approach, where the vibrating structure is represented through a series of elementary
oscillators
× state space modular methodologies
× the functional transformation method, which provides a multidimensional transfer function by the
application of a suitable functional transformation
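To give the flavor of the first two items before the formal CA analysis that follows, here is a minimal finite-difference simulation of a single mass tied to a fixed point by a visco-elastic link (an illustrative sketch of our own; the parameter values are arbitrary, and units are normalized to a unit mass and unit time step, which is not the full CA formalism):

```python
def damped_oscillator(n_steps, k=0.1, z=0.01, x0=1.0):
    """Unit mass attached to a fixed point by a visco-elastic link
    (stiffness k, damping z), stepped with the explicit scheme
        x[n+1] = 2 x[n] - x[n-1] + f[n]
        f[n]   = -k x[n] - z (x[n] - x[n-1])
    The mass is released from rest at displacement x0."""
    x_prev = x = x0
    out = []
    for _ in range(n_steps):
        f = -k * x - z * (x - x_prev)          # spring force + damping force
        x_prev, x = x, 2.0 * x - x_prev + f    # central-difference update
        out.append(x)
    return out
```

The output is a decaying oscillation; in a mass-interaction network such as CA, many masses and links of this kind are connected modularly, and an input sound can be injected as a force on one of the masses.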
Every physical modeling technique reflects various aims and may be considered “optimal” according to
the philosophy under which it was conceived. In 2003 Castagne and Cadoz proposed ten general
criteria for evaluating physical modeling techniques oriented to music creation (figure 5.1) [Castagne,
Cadoz 2003].
Vibrating structures, like all kinds of elastic bodies – strings, membranes, bars and plates – can be
considered and approximated as linear deformable objects. Each physical modelling scheme represents
those physical objects differently in a discrete-time and discrete-space form. The various structures may
often be mathematically equivalent in the end or, more generally, may show a high degree of functional
equivalence even if they represent and realize the physical object using different formalisms and
strategies.
Figure 5.1 Ten criteria for evaluating physical modeling techniques.
In signal processing [Mitra 2001] two structures are defined as equivalent if they have the same transfer
function. However, their realizations may not be equivalent at all: dissimilar realizations lead to system
configurations with different complexity* and different memory requirements. Each structure also presents
different finite word-length effects (round-off noise, limit cycles, coefficient sensitivity) [Zolzer 1997], and
poses different stability issues.
Furthermore, each system structure supports and permits diverse control procedures. For example, in
digital filtering several structures with tuneable frequency response characteristics provide independent
tuning of the filter parameters (cut-off frequency, bandwidth) [Mitra 2001]. In the context of physical
modelling, several schemes like CA provide spatial accuracy while others, such as the commuted digital
waveguide, do not, although they are computationally efficient. The aim of the present PhD research
could alternatively be restated in terms of system structures:
How can we redesign the basic digital audio effect algorithms, or design new ones, using system
structures that offer physical instrumental interaction?
* The complexity of a system structure is indicated by the number of multipliers and by the number of two-input adders
For all these reasons it is appealing, in many cases, to pass from one formalism to another and to represent a given model with other mathematical schemes. A further interest is to combine several of the physical modelling approaches into one hybrid model.
For example, in recent years there has been interest in combining the waveguide scheme with finite difference methods and lumped elements to enhance the modelling possibilities of digital waveguides [Erkut Karjalainen 2002][Karjalainen 2003][Karjalainen Erkut 2004a][Smith 2004]. Other models have combined the digital waveguide structure with the functional transformation method, mostly for its algorithmic efficiency [Petrausch Rabenstein 2005].
A further essential motivation for the use of several formalisms is analysis. This corresponds to the 9th of Castagne and Cadoz's criteria for evaluating a physical modelling scheme, as illustrated in figure 5.1, or to the 10th of Jaffe's criteria for evaluating synthesis techniques [Jaffe 1995], depicted in figure 5.2. Since every formalism offers a different type of system description, it is useful to choose the appropriate one for the desired analysis purposes. These purposes may be strictly scientific, helping in the study and development of the physical modelling scheme itself, or more artistic, supporting modelling techniques based on the paradigm of synthesis by analysis. These reasons stimulated us to study how the CA formalism can be transformed into other representations.
Figure 5.2 Ten criteria for evaluating sound synthesis techniques.
A crucial question at this point might be:
Why bother changing formalisms for analysis and synthesis when it is possible to stick with the most convenient formalism and start the simulation directly?
Apart from the fact that, even when models have equivalent mathematical descriptions, their different configurations produce slightly different simulations (as we have already mentioned), there is a much more essential reason. Each formalism permits a different way of manipulation, control and thinking, due to its structure and to the mental image that it conveys to the user. Consequently, a user can employ one representation for the analysis and conception of a model, and then pass to a preferable physical modelling scheme for further manipulation and musical creation.
5.1 CORDIS-ANIMA Network Representation
In the CORDIS-ANIMA formalism a physical object is modelled as a modular assembly of elementary mechanical components [Cadoz Luciani Florens 1993]. Hence it is straightforward to represent the model as a topological network whose nodes are the punctual matter elements <MAT> and whose links are the physical interaction elements <LIA> (figures 5.3 and 5.4). The simulation space used for sound and musical applications is limited to a single dimension; in the present thesis CA systems are strictly one-dimensional. Forces and displacements are projected on a single axis, perpendicular to the network plane. Consequently the geometrical distance between two <MAT> elements reduces to their relative distance on the vibration axis [Incerti Cadoz 1995].
Figure 5.3 A CA network
Figure 5.4 CA modules
By using linear <LIA> elements such as springs and dampers whose parameters do not change with time, we obtain a discrete linear time-invariant (DLTI) system. Table 5.1 presents the linear CA algorithms and table 5.2 the non-linear CA algorithms, as defined and developed in the GENESIS software [Castagne Cadoz 2002], [Castagne Cadoz 2002b]. The symbols that appear in table 5.1, together with their units, are given in table 5.3.
A model is fully described by its network topology, its parameter values and its initial conditions x0 and v0. In GENESIS all CA models are designed graphically, directly on the workbench, as networks using a simplified representation of CA networks enriched with colours.
This highly modular representation makes it possible to design a model based on intuition. As the basic building elements have a strong physical counterpart, they remain pertinent to human senses and create a very realistic mental model. The design phase therefore allows a purely physical approach carried out by "Physical Thinking". Castagne points out [Castagne Cadoz 2002]: "…Models are more easily internalized as representations of real objects than with more mathematical or signal processing physical modelling techniques…" Furthermore, it is very often possible to predict the general behaviour of a model by examining its network, without the use of mathematical analysis tools.
CA networks directly offer another type of control, based on "Physical Instrumental Interaction". In this control scheme we do not affect the parameters of the model (even though this is possible and provided for within the CA system) but apply forces to the <MAT> elements of the model through <LIA> elements, as in reality. This type of control is thus entirely physical and energetically coherent. Since physical models enable an intuitive representation of the actions we perform on real objects, we can imagine several physical gestures to manipulate and control our model: damping, pulling, pushing, etc. This remains feasible for non-real-time simulations without force-feedback gestural interfaces, by designing models that simulate the physical gesture. Deferred-time simulation permits the design of accurate and valid models of the control gesture, with a precision that is not possible in real-time situations.
Table 5.1 CA Linear Algorithms
$$x(n) = 2x(n-1) - x(n-2) + \frac{1}{M}\sum_{i}^{N} f_i(n-1)$$

$$x(n) = c$$

$$f_{\mathrm{RES}_{ij}}(n) = K_{ij}\,[x_i(n) - x_j(n)]$$

$$f_{\mathrm{FRO}_{ij}}(n) = Z_{ij}\,[x_i(n) - x_i(n-1) - x_j(n) + x_j(n-1)]$$

$$f_{\mathrm{REF}_{ij}}(n) = K_{ij}\,[x_i(n) - x_j(n)] + Z_{ij}\,[x_i(n) - x_i(n-1) - x_j(n) + x_j(n-1)]$$
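As a minimal sketch of how the MAS and REF algorithms of table 5.1 combine into a simulation loop (our own illustration, assuming CA units with Ts = 1, a single <MAT> mass anchored to a fixed point through a REF link, and a sign convention that makes the link force restoring on the mass):

```python
# CA units: M [kg], K [kg/sample^2], Z [kg/sample]; illustrative values.
M, K, Z = 1.0, 0.01, 0.001
x_prev2, x_prev1 = 1.0, 1.0           # x(n-2) = x(n-1): displaced and at rest

trajectory = []
for n in range(2000):
    # REF link force from the two previous positions (the other end is fixed at 0)
    f = -K * x_prev1 - Z * (x_prev1 - x_prev2)
    # MAS algorithm: x(n) = 2x(n-1) - x(n-2) + f(n-1)/M
    x = 2.0 * x_prev1 - x_prev2 + f / M
    trajectory.append(x)
    x_prev2, x_prev1 = x_prev1, x

# The result is a damped oscillation: the amplitude decays over time.
assert max(abs(v) for v in trajectory[-100:]) < max(abs(v) for v in trajectory[:100])
```

The loop makes the computability of the formalism concrete: each sample of each <MAT> element is obtained from the two previous samples and the forces of the attached <LIA> elements.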
One drawback* of CA networks is that they do not give information about their functional structure. The algorithms and realization structures behind the model do not appear in this representation; consequently it is not possible to implement the model directly using only the information furnished by these diagrams. This problem can be overcome if the few basic algorithms of each module accompany the networks, though even then the precise implementation of the model remains hidden. CA as a simulation language has been designed to offer an implementation that corresponds optimally to its modularity, which is precisely why we identify CA as a modelling and simulation language.
* for the needs and demands of the present research. In many cases it is an advantage.
Table 5.2 CA Nonlinear Algorithms
$$f_{\mathrm{BUT}_{ij}}(n) = \begin{cases} f_{\mathrm{REF}_{ij}}(n) & x_i(n) - x_j(n) \le S \\ 0 & x_i(n) - x_j(n) > S \end{cases}$$

$$f_{\mathrm{LNLK}_{ij}}(n) = \mathrm{interpolate}\big(x_i(n) - x_j(n),\ \Gamma\big)$$
lookup table $\Gamma$: pairs $(f_r, \Delta x_r)$, $2 \le r \le 20$, where $f_r$ are the defined force points and $\Delta x_r = x_i - x_j$ the defined displacement points; interpolation types: linear, cubic, splines.

$$f_{\mathrm{LNLZ}_{ij}}(n) = \mathrm{interpolate}\big(x_i(n) - x_i(n-1) - x_j(n) + x_j(n-1),\ \Gamma\big)$$
lookup table $\Gamma$: pairs $(f_r, \Delta v_r)$, $2 \le r \le 20$, where $f_r$ are the defined force points and $\Delta v_r = v_i - v_j$ the defined velocity-difference points; interpolation types: linear, cubic, splines.
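A possible reading of the LNLK algorithm in code, with a hypothetical five-point lookup table and linear interpolation (GENESIS also offers cubic and spline interpolation; the table values below are ours, chosen to give a restoring nonlinear spring):

```python
from bisect import bisect_left

# (delta_x_r, f_r) pairs, sorted by displacement difference
table = [(-1.0, 0.5), (-0.5, 0.1), (0.0, 0.0), (0.5, -0.1), (1.0, -0.5)]
xs = [p[0] for p in table]
fs = [p[1] for p in table]

def f_lnlk(dx):
    """Link force for displacement difference dx, by linear interpolation."""
    if dx <= xs[0]:
        return fs[0]                  # clamp outside the defined points
    if dx >= xs[-1]:
        return fs[-1]
    i = bisect_left(xs, dx)           # first index with xs[i] >= dx
    t = (dx - xs[i - 1]) / (xs[i] - xs[i - 1])
    return fs[i - 1] + t * (fs[i] - fs[i - 1])

assert f_lnlk(0.0) == 0.0
assert abs(f_lnlk(0.25) - (-0.05)) < 1e-12    # halfway between 0.0 and -0.1
```

At each sample the displacement difference of the two linked <MAT> elements indexes the table, so arbitrary force-versus-displacement characteristics can be drawn by the user.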
Table 5.3 CA variables and units
x: position [m] — M: mass [kg]
v: velocity [m][sample]^-1 — K: stiffness factor [kg][sample]^-2
F: force [kg][m][sample]^-2 — Z: damping factor [kg][sample]^-1
Another, more important weakness of the network representation is that it does not lend itself directly to mathematical analysis. The tools of linear algebra and calculus are not directly applicable, so it is inconvenient to embark immediately on a mathematical study of these networks.
5.2 Finite Differences (FD) and Finite Derivatives (FDe) Representations
We have seen in the previous section that CA networks, oriented principally towards the simulation and synthesis of physical systems, are closely related to elementary computable structures optimised for synthesis purposes, but do not offer a flexible framework for analysis. Finite derivative representations, on the other hand, used extensively in physics and acoustics, offer this possibility directly.
In general, any problem in classical mechanics can be cast into a finite derivative representation. The application of Newton's laws for the formal description and analysis of a given physical system provides a set of differential equations, partial or ordinary depending on the nature of the problem. A CA system may be approached from this point of view.
5.2.1 Mass-Interaction Networks: Systems of FD and FDe Equations
The point of departure of CA is the general interaction between two objects. This basic concept may be interpreted and expressed in classical Newtonian mechanics as the interaction of punctual masses [Florens, Cadoz 1991]. CA models are therefore, in principle, a special type or sub-category of mass-interaction networks.
An interesting subclass of mass-interaction systems is the linear networks of coupled mechanical oscillators: the mass-spring systems**. The analysis of linear mass-spring systems is referred to as lumped modeling. Figure 5.5 depicts schematically the transition from a linear CA network to a discrete-time (DT) mass-spring network and then to a continuous-time (CT) mass-spring network*. Figure 5.6 illustrates an example of a simple 1-D mass-interaction network.
Figure 5.5 From linear CA networks to discrete-time (DT) and to continuous-time (CT) mass-spring
networks
For each component, the mathematical expression that describes its dynamical behavior is considered and applied. In the linear case, the system is made up entirely of masses, linear springs and linear dampers (also referred to as dashpots). Table 5.4 provides the mathematical relations of the primitive components of the mass-interaction formalism (linear and non-linear) in continuous-time form.
There is a considerable number of methods for solving continuous-time system equations by discrete-time means. CA uses the central finite difference scheme for the acceleration terms and the backward finite difference scheme for the velocity terms in the corresponding differential equations. Equations (5.1) and (5.2) give these approximations in the t/n domain and in the s/z domain (Laplace transform, z-transform).
$$\frac{d}{dt}x(t) \approx \delta x(n) = \frac{1}{T_s}\big(x(n) - x(n-1)\big) \quad\Leftrightarrow\quad s = \frac{1}{T_s}\big(1 - z^{-1}\big) \tag{5.1}$$

$$\frac{d^2}{dt^2}x(t) \approx \delta^2 x(n) = \frac{1}{T_s^2}\big(x(n+1) - 2x(n) + x(n-1)\big) \quad\Leftrightarrow\quad s^2 = \frac{1}{T_s^2}\big(z - 2 + z^{-1}\big) \tag{5.2}$$
** Interesting, that is, for their mathematical properties. In the composition context the majority of the models are non-linear, because they are more "interesting" sonically.
* The arrows represent the transition from one system formalism to another. We use a simple arrow for a certain transition, a dotted arrow for an unsure transition and a marked arrow for an approximate transition.
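The two schemes of equations (5.1) and (5.2) can be checked numerically on a known signal. The sketch below (our own illustration, using x(t) = sin t) confirms the first-order accuracy of the backward difference and the second-order accuracy of the central difference:

```python
import math

def backward_diff(x, n, Ts):
    """Equation (5.1): first derivative by backward difference."""
    return (x(n * Ts) - x((n - 1) * Ts)) / Ts

def central_diff2(x, n, Ts):
    """Equation (5.2): second derivative by central difference."""
    return (x((n + 1) * Ts) - 2 * x(n * Ts) + x((n - 1) * Ts)) / Ts**2

t = 1.0
for Ts in (1e-2, 1e-3):
    n = round(t / Ts)
    v = backward_diff(math.sin, n, Ts)
    a = central_diff2(math.sin, n, Ts)
    assert abs(v - math.cos(t)) < 5 * Ts         # error shrinks like Ts
    assert abs(a - (-math.sin(t))) < 5 * Ts**2   # error shrinks like Ts^2
```

The asymmetry matters for CA: the backward scheme introduces a half-sample delay into velocity-dependent (damping) terms, while the central scheme keeps the acceleration term symmetric.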
Figure 5.6 A 1-D mass-interaction network
These finite difference schemes facilitate the modular realization of the CA algorithms. The resulting finite difference expressions of the equations of table 5.4 have already been given in the previous subchapter. Figure 5.6 illustrates schematically the exact mass-interaction network analog of a simple CA model.
Table 5.4 Mass-interaction analogs of CA linear networks

$$f(t) = M\,\frac{d^2x(t)}{dt^2}$$

$$x(t) = c$$

$$f_{K_{ij}}(t) = K\,[x_i(t) - x_j(t)]$$

$$f_{Z_{ij}}(t) = Z\left[\frac{dx_i(t)}{dt} - \frac{dx_j(t)}{dt}\right]$$

$$f_{KZ_{ij}}(t) = K\,[x_i(t) - x_j(t)] + Z\left[\frac{dx_i(t)}{dt} - \frac{dx_j(t)}{dt}\right]$$

$$f_{\mathrm{nonlinear}_{ij}}(t) = f\!\left(x_i(t) - x_j(t),\ \frac{dx_i(t)}{dt} - \frac{dx_j(t)}{dt}\right)$$
5.2.2 Multichannel FD and FDe Representations
A standard approach to the study of linear multiple-degree-of-freedom (MDOF) vibrating systems is the multi-channel finite derivative representation using matrices and vectors [Tongue 1996]. In general, these systems are Multiple-Input Multiple-Output (MIMO), so this representation is well suited and preferred. It has been widely applied to CA systems [Incerti 2003]. Figure 5.7 illustrates the transition from a CA network to multi-channel FD and FDe representations.
Figure 5.7 From linear CA networks to multi-channel FD and FDe representation
The equations of motion in the linear case are given by:

$$[M]\ddot{X}(t) + [Z]\dot{X}(t) + [K]X(t) = F(t) \;\Leftrightarrow\; \ddot{X}(t) + [M]^{-1}[Z]\dot{X}(t) + [M]^{-1}[K]X(t) = [M]^{-1}F(t) \tag{5.3}$$

Using the double discretization scheme of equations (5.1) and (5.2) employed by CA we obtain a similar expression:

$$[M]\,\frac{X(n+1) - 2X(n) + X(n-1)}{T_s^2} + [K]X(n) + [Z]\,\frac{X(n) - X(n-1)}{T_s} = F(n) \;\Rightarrow$$
$$[M]X(n) + \left(\frac{[K]}{F_s^2} + \frac{[Z]}{F_s} - 2[M]\right)X(n-1) + \left([M] - \frac{[Z]}{F_s}\right)X(n-2) = \frac{F(n-1)}{F_s^2} \;\Rightarrow$$
$$[M]X(n) + \big([K] + [Z] - 2[M]\big)X(n-1) + \big([M] - [Z]\big)X(n-2) = F(n-1) \;\Rightarrow$$
$$X(n) + \big([M]^{-1}[K] + [M]^{-1}[Z] - 2[I]\big)X(n-1) + \big([I] - [M]^{-1}[Z]\big)X(n-2) = [M]^{-1}F(n-1) \tag{5.4}$$
The matrices [M], [Z] and [K] represent respectively the inertia, the viscosity and the elasticity of the system; vector X contains the positions of the masses and vector F the external forces. If the system has the form of masses linked by viscoelastic interactions moving in a one-dimensional space, as in CA, then these matrices are symmetric. The form of the matrices [M], [K] and [Z] can easily be found in any book covering the field of vibrations [Tongue 1996].
Proportional viscosity networks are networks in which the matrices are related by equation (5.5). In this case [M], [K] and [Z] are diagonalizable by the same transformation. Such networks cover a sufficient number of interesting situations.
$$[Z] = a[M] + b[K] \tag{5.5}$$
Equation (5.4) describes a finite difference problem. In these equations Fs denotes the sampling rate in Hertz and Ts the sampling period in seconds. Equations (5.3) and (5.4) are diagonalizable by the same transformation under the condition of proportional viscosity. Recall that in CA we do not use SI units (time is measured in samples), which is why we omitted the Fs terms in equation (5.4).
The finite difference models offer a direct way of implementation by iteration (figure 5.8). Furthermore, they can be treated by the numerous software packages dedicated to scientific computation, such as Matlab®. Optimization problems, the inverse problem and many others can be conveniently approached through this representation.
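As a sketch of such an iteration, the loop below implements equation (5.4) in CA units for a three-mass fixed-fixed chain with proportional viscosity (our own illustration; all parameter values and helper names are chosen for the example):

```python
# X(n) = (2I - M^-1 K - M^-1 Z) X(n-1) + (M^-1 Z - I) X(n-2) + M^-1 F(n-1)

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def madd(A, B, sa=1.0, sb=1.0):
    n = len(A)
    return [[sa * A[i][j] + sb * B[i][j] for j in range(n)] for i in range(n)]

N = 3
I = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
k, b = 0.01, 0.1
K = [[2 * k if i == j else (-k if abs(i - j) == 1 else 0.0) for j in range(N)]
     for i in range(N)]                       # fixed-fixed chain stiffness
Z = [[b * K[i][j] for j in range(N)] for i in range(N)]   # [Z] = b[K]

# With [M] = [I], M^-1 K = K and M^-1 Z = Z:
A1 = madd(madd(I, K, 2.0, -1.0), Z, 1.0, -1.0)            # 2I - K - Z
A2 = madd(Z, I, 1.0, -1.0)                                # Z - I

x1 = [0.0, 1.0, 0.0]          # X(n-1): middle mass displaced
x2 = x1[:]                    # X(n-2) = X(n-1): zero initial velocity
for n in range(2000):
    x = [p + q for p, q in zip(matvec(A1, x1), matvec(A2, x2))]   # F = 0
    x2, x1 = x1, x

assert max(abs(v) for v in x1) < 0.9          # the damped chain has decayed
```

The same matrices can be fed to eigenvalue routines to obtain the modal frequencies and dampings, which is the kind of analysis the network representation does not offer directly.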
Figure 5.8 Multi-channel FD DLTI system implementation
An interesting remark is that similarity transformations produce "equivalent" CA networks with different internal structure. Hence we are able to modify the physical structure of the system while conserving the initial transfer function. It would be interesting to develop similarity transformations that pass from one classical topology to another, e.g. transforming a membrane into a string.
5.2.3 N-dimensional Finite Difference Representation
It has already been stated that CA networks are purely topological. This property often causes confusion, as we tend to attribute spatial dimensions to CA models artificially. However, this spatiality is not always artificial. In this section we present how several CA networks can be formalized and interpreted as N-dimensional finite difference models [Kalouptsidis 1994]. Two simple cases are studied: 2-D problems and 3-D problems in Cartesian coordinates (N refers to the number of spatial dimensions plus the temporal dimension; for example, membranes and plates are considered 3-D problems).
In a CA system, a composed linear lossless string has the topology of figure 5.10. For the moment we ignore the boundary conditions of the problem. The following analysis can be generalized to any interaction force in a linear topology.
Figure 5.9 From linear CA networks to N-Dimensional Finite Difference Representation
The equation of motion is given by the multi-channel FD representation or, in this simple case, directly by the algorithms of table 5.1. We express the equations for the mass with index lx as:
$$M_{lx}\big[x_{lx}(n) - 2x_{lx}(n-1) + x_{lx}(n-2)\big] = f_{\mathrm{RES}_{lx-1,lx}}(n-1) + f_{\mathrm{RES}_{lx+1,lx}}(n-1) \;\Rightarrow$$
$$M_{lx}\big[x_{lx}(n) - 2x_{lx}(n-1) + x_{lx}(n-2)\big] = K_{lx}\big[x_{lx-1}(n-1) - x_{lx}(n-1)\big] + K_{lx+1}\big[x_{lx+1}(n-1) - x_{lx}(n-1)\big]$$

The last equation is written using SI units as:

$$\frac{M_{lx}}{T_s^2}\big[x_{lx}(n) - 2x_{lx}(n-1) + x_{lx}(n-2)\big] = K_{lx}\big[x_{lx-1}(n-1) - x_{lx}(n-1)\big] + K_{lx+1}\big[x_{lx+1}(n-1) - x_{lx}(n-1)\big] \tag{5.6}$$
Since our model is an ideal string model, we may give a geometrical-spatial interpretation to the resulting topology. The string moves along the x-axis and we may imagine each MAS as a punctual spatial point along the y-axis. It is more common to use the y-axis for the motion; hence we modify equation (5.6) accordingly. Also, we no longer use the multi-channel notation with lx as a channel index: lx now denotes the spatial position of each point.
$$M(lx)\big[y(n,lx) - 2y(n-1,lx) + y(n-2,lx)\big] = K(lx+1)T_s^2\big[y(n-1,lx+1) - y(n-1,lx)\big] + K(lx)T_s^2\big[y(n-1,lx-1) - y(n-1,lx)\big] \tag{5.7}$$

The last equation gives the 2-D finite difference representation of an inhomogeneous lossless linear CA model (figure 5.10).
Figure 5.10 A CA 2-D lossless string model
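Equation (5.7) can be iterated directly. The sketch below simulates an illustrative lossless string of 20 <MAT> elements with fixed ends in CA units (the parameter values and the triangular pluck are ours, not from the thesis):

```python
N = 20
M = [1.0] * N
K = [0.05] * (N + 1)            # K[i] links mass i-1 to mass i (ends to walls)

# Triangular initial shape y(n-1), zero initial velocity: y(n-2) = y(n-1)
y1 = [1.0 - abs(i - (N - 1) / 2) / ((N - 1) / 2) for i in range(N)]
y2 = y1[:]

for n in range(500):
    y = [0.0] * N
    for i in range(N):
        left = y1[i - 1] if i > 0 else 0.0       # fixed boundaries at 0
        right = y1[i + 1] if i < N - 1 else 0.0
        f = K[i] * (left - y1[i]) + K[i + 1] * (right - y1[i])
        y[i] = 2 * y1[i] - y2[i] + f / M[i]      # equation (5.7), Ts = 1
    y2, y1 = y1, y

# Lossless: the motion neither decays to zero nor blows up.
assert max(abs(v) for v in y1) < 10.0
```

Making M and K position-dependent lists (rather than constants) is exactly what the inhomogeneous case of (5.7) allows.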
The 3-D case is a bit more complicated and beyond the scope of the present research. We give the equation for the homogeneous case, which follows easily from the previous analysis:
$$M\big[y(n,lx,ly) - 2y(n-1,lx,ly) + y(n-2,lx,ly)\big] = KT_s^2\big[y(n-1,lx+1,ly) - y(n-1,lx,ly)\big] + KT_s^2\big[y(n-1,lx-1,ly) - y(n-1,lx,ly)\big]$$
$$\qquad + KT_s^2\big[y(n-1,lx,ly+1) - y(n-1,lx,ly)\big] + KT_s^2\big[y(n-1,lx,ly-1) - y(n-1,lx,ly)\big] \tag{5.8}$$
We get a similar expression for the 4-D case:
$$M\big[y(n,lx,ly,lz) - 2y(n-1,lx,ly,lz) + y(n-2,lx,ly,lz)\big] = KT_s^2\big[y(n-1,lx+1,ly,lz) - y(n-1,lx,ly,lz)\big] + KT_s^2\big[y(n-1,lx-1,ly,lz) - y(n-1,lx,ly,lz)\big]$$
$$\qquad + KT_s^2\big[y(n-1,lx,ly+1,lz) - y(n-1,lx,ly,lz)\big] + KT_s^2\big[y(n-1,lx,ly-1,lz) - y(n-1,lx,ly,lz)\big]$$
$$\qquad + KT_s^2\big[y(n-1,lx,ly,lz+1) - y(n-1,lx,ly,lz)\big] + KT_s^2\big[y(n-1,lx,ly,lz-1) - y(n-1,lx,ly,lz)\big] \tag{5.9}$$
Figure 5.11 A CA 3-D lossless rectangular mesh model
Figure 5.12 Computation of the AR part of CA system - equation (5.10)
If we do not use the spatial dimensions in the system equation, we can find very general finite difference expressions for the CA models. In general, every linear CA model may be reduced to the following 2-D general finite difference expression:
$$y(n_1,n_2) = -\!\!\sum_{(k_1,k_2)\neq(0,0)}\!\! a(k_1,k_2)\,y(n_1-k_1,\,n_2-k_2) \;+\; \sum_{(k_1,k_2)} b(k_1,k_2)\,u(n_1-k_1,\,n_2-k_2) \tag{5.10}$$
where y corresponds to the displacement of the <MAT> elements, u to the input forces or positions, n1 to the discrete-time variable n, and n2 to an index that uniquely identifies each <MAT> element. Every CA model is iteratively computable. The autoregressive (AR) part of (5.10), which is responsible for the iterative computability of the difference equation, can take the form:
$$y(n_1,n_2) = -\!\!\sum_{(k_1,k_2)\neq(n_1,n_2)}\!\! a(n_1-k_1,\,n_2-k_2)\,y(k_1,k_2) \tag{5.11}$$
The mask a(k1,k2), in the most general CA models takes the form of figure 5.12.
5.2.4 From CA Networks to Partial Differential Equations
The idea of composing a discrete system from a finite number of masses connected by springs may seem restrictive [Smith 2005]. Moreover, composing multidimensional continuous systems by the same approach may appear prohibitive. Since multidimensional physical systems are usually represented by partial differential equations* (PDE), in this chapter we present a link between the CA formalism and PDEs. This analysis helps in understanding the variety of systems that can be modeled by CA. We claim that the class of physical systems that can be sufficiently approximated by this formalism is large enough for compositional and sound design purposes.
In the musical acoustics literature, the finite difference approximation** is the standard technique for creating computational models. We have seen previously that CA networks can be described by a multidimensional FD representation. By considering this representation as a finite difference approximation of a PDE, we pass directly to a system representation by PDEs (figure 5.13). In the present work we treat only the 2-D and 3-D linear homogeneous problems; the study may be generalized to N-D problems.
Figure 5.13 From linear CA networks to PDEs
1-D lossless string model
The ideal vibrating string is given by the equation [Fletcher, Rossing 1998]:
* A type of finite derivative representation in which the unknown function depends on multiple independent variables and their partial derivatives. Multidimensional physical systems are usually represented by partial differential equations (PDE).
** Finite difference methods are widely used numerical methods for approximating the solutions of differential equations.
$$\ddot{y}(t,x) = c^2\,y''(t,x) \quad\text{with}\quad c = \sqrt{\frac{T}{\mu}} \quad\Leftrightarrow\quad T\,y''(t,x) = \mu\,\ddot{y}(t,x) \tag{5.12}$$
where c is the wave velocity, T is the string tension [N], μ is the linear mass density [kg/m] and y is the string displacement [m]. Using the central difference approximation (equation (5.2)) for the second-order temporal and spatial derivatives we obtain:
$$\frac{T}{h_x^2}\big[y(n,m+1) - 2y(n,m) + y(n,m-1)\big] = \frac{\mu}{h_t^2}\big[y(n+1,m) - 2y(n,m) + y(n-1,m)\big] \tag{5.13}$$
where ht is the time step (time sampling interval) and hx is the spatial step (spatial sampling interval). Note that in the previous sections we used the symbol Ts for the sampling interval; although the notation ht is more common in finite difference approximation, in this study we keep the Ts notation. The Von Neumann stability analysis demands the following constraint:
$$\frac{c\,T_s}{h_x} \le 1 \tag{5.14}$$
A practical choice for the time and spatial steps is:

$$h_x = c\,T_s \tag{5.15}$$
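The constraint (5.14) and the choice (5.15) can be expressed as a small check (our own sketch, with illustrative string values):

```python
import math

def is_stable(c, Ts, hx):
    """Von Neumann / CFL condition (5.14): c*Ts/hx <= 1."""
    return c * Ts / hx <= 1.0

T, mu = 100.0, 0.01              # tension [N], linear density [kg/m]
c = math.sqrt(T / mu)            # wave velocity [m/s] (100 m/s here)
Ts = 1.0 / 44100.0
hx = c * Ts                      # practical choice (5.15): exactly at the limit
assert is_stable(c, Ts, hx)
assert not is_stable(c, Ts, hx / 2.0)
```

Halving the spatial step without shortening the time step violates the condition, which is why (5.15) ties the spatial resolution of the model to the sampling rate.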
Equation (5.13) can be written:

$$\frac{\mu}{T_s^2}\big[y(n+1,m) - 2y(n,m) + y(n-1,m)\big] = \frac{T}{h_x^2}\big[y(n,m+1) - 2y(n,m) + y(n,m-1)\big] \;\Rightarrow$$
$$\frac{\mu}{T_s^2}\big[y(n,m) - 2y(n-1,m) + y(n-2,m)\big] = \frac{T}{h_x^2}\big[y(n-1,m+1) - 2y(n-1,m) + y(n-1,m-1)\big] \;\Rightarrow$$
$$\mu\big[y(n,m) - 2y(n-1,m) + y(n-2,m)\big] = \frac{T\,T_s^2}{h_x^2}\big[y(n-1,m+1) - y(n-1,m)\big] + \frac{T\,T_s^2}{h_x^2}\big[y(n-1,m-1) - y(n-1,m)\big] \tag{5.16}$$
Equations (5.16) and (5.7) are equivalent if we substitute:

$$M = \mu \tag{5.17}$$

$$K = T\,\frac{1}{h_x^2} \tag{5.18}$$
Using the step of equation (5.15) we get:

$$K = T\,\frac{1}{h_x^2} = T\,\frac{1}{c^2 T_s^2} = T\,\frac{1}{(T/\mu)\,T_s^2} = \mu F_s^2 \;\Rightarrow\; \frac{K}{F_s^2} = M \tag{5.19}$$
We have thus proven the equivalence of CA with the finite difference approximation of an ideal lossless string, and we have found a way to pass from a PDE to a CA network and vice versa. Although this methodology cannot be applied to every PDE, in most cases a CA network can approximate the classical continuous linear models of musical acoustics. For the rectangular mesh it is straightforward to prove the equivalence between the CA model and the FD approximation.
We must also note that the concept of the spatial step hx does not explicitly appear in the CA modeling approach. A continuous-time model of a reference lossless linear string is described by the set of parameters {μ, T, L}PDE, where L is the length of the string; evidently these are the terms that appear in the PDE (5.12). A CA model of the same string is described by the set of parameters {M, K, N}CA, where N is the number of <MAT> elements. If we want to design the CA model starting from the continuous-time model (something we normally do not do in the CA modeling system!) we must follow the scheme depicted in figure 5.13, starting from the right-hand side. We may start by choosing either the number of <MAT> elements, which corresponds to the degrees of freedom (DOF) of the designed system, or the corresponding spatial step. Equation (5.20) correlates these two variables. Since in CA the spatial step is not a direct modeling parameter, we prefer the number of <MAT> elements.
$$N = \frac{L}{h_x} \;\Leftrightarrow\; h_x = \frac{L}{N} \tag{5.20}$$
From equations (5.17), (5.18) and (5.20) we get the following correspondence between the continuous model expressed by the PDE and the CA model:
$$\{\mu,\,T,\,L\}_{PDE} \;\rightarrow\; \left\{\mu,\ \frac{T N^2}{L^2},\ N\right\}_{CA} \tag{5.21}$$
Remember that in CA we do not use the SI unit system. According to table 5.3, equation (5.21) is
rewritten as:
$$\{\mu,\,T,\,L\}_{PDE} \;\rightarrow\; \left\{\mu,\ \frac{T N^2}{F_s^2 L^2},\ N\right\}_{CA} \tag{5.22}$$
If we follow the spatial step given by equation (5.15), the number N of <MAT> elements is defined:
$$N = \frac{L}{h_x} \;\Rightarrow\; N = \frac{L}{T_s}\sqrt{\frac{\mu}{T}} \tag{5.23}$$
Relation (5.22) now takes the form:

$$\{\mu,\,T,\,L\}_{PDE} \;\rightarrow\; \left\{\mu,\ \mu,\ \frac{L}{T_s}\sqrt{\frac{\mu}{T}}\right\}_{CA} \tag{5.24}$$
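The correspondences (5.22) and (5.23) amount to a small conversion routine. The sketch below uses illustrative string values and function names of our own choosing:

```python
import math

def pde_to_ca(mu, T, L, N, Fs):
    """Correspondence (5.22): {mu, T, L}_PDE -> {M, K}_CA in CA units."""
    M = mu
    K = T * N**2 / (Fs**2 * L**2)
    return M, K

def n_from_stability(mu, T, L, Fs):
    """Number of <MAT> elements when hx = c*Ts, equation (5.23)."""
    return L * Fs * math.sqrt(mu / T)

mu, T, L, Fs = 0.01, 100.0, 0.65, 44100.0     # illustrative string parameters
N = n_from_stability(mu, T, L, Fs)            # about 287 elements
M, K = pde_to_ca(mu, T, L, N, Fs)
# At hx = c*Ts the CA stiffness equals the mass, consistent with (5.24)
assert abs(K - M) < 1e-9
```

In practice N would be rounded to an integer, slightly detuning the model; choosing N freely (and accepting a different hx) is the alternative that equation (5.20) describes.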
1-D string model with losses
We present synoptically the CA networks equivalent to PDEs describing 1-D systems with losses. The ideal vibrating string with losses is given by the equation [Smith 2005]:

$$\mu\,\ddot{y}(t,x) = T\,y''(t,x) - \zeta_a\,\dot{y}(t,x) \tag{5.25}$$
With a further term approximating the internal losses, a more realistic model is:

$$\mu\,\ddot{y}(t,x) = T\,y''(t,x) - \zeta_a\,\dot{y}(t,x) - \zeta_b\,\dot{y}''(t,x) \tag{5.26}$$
The effects of internal losses due to the viscosity of the medium are approximated by the term with coefficient ζb (internal damping), while the effects of losses due to radiation resistance are approximated by the term with coefficient ζa (air damping), which appears in both equations (5.25) and (5.26). These terms, added to the ideal lossless string equation and approximated by the backward finite difference scheme (for the first-order time derivatives) and the central finite difference scheme (for the second-order spatial derivatives), can be written as:
$$-\zeta_a\,\dot{y}(t,x) \approx -\zeta_a\,\frac{y(n,m) - y(n-1,m)}{T_s} = \frac{\zeta_a}{T_s}\big[0 - (y(n,m) - y(n-1,m))\big] \tag{5.27}$$
By combining equations (5.27) and (5.16) with table 5.1 we conclude:

$$Z = \zeta_a\,T_s \;\Leftrightarrow\; Z = \frac{\zeta_a}{F_s} \tag{5.28}$$
Following the same procedure for the term with coefficient ζb:

$$-\zeta_b\,\dot{y}''(t,x) \approx -\frac{\zeta_b}{h_x^2}\big[\dot{y}(t,m+1) - 2\dot{y}(t,m) + \dot{y}(t,m-1)\big] \;\Rightarrow$$
$$-\zeta_b\,\dot{y}''(t,x) \approx -\frac{\zeta_b}{h_x^2}\big[(\dot{y}(t,m+1) - \dot{y}(t,m)) + (\dot{y}(t,m-1) - \dot{y}(t,m))\big] \;\Rightarrow$$
$$-\zeta_b\,\dot{y}''(t,x) \approx -\frac{\zeta_b}{h_x^2}\big[\dot{y}(t,m+1) - \dot{y}(t,m)\big] - \frac{\zeta_b}{h_x^2}\big[\dot{y}(t,m-1) - \dot{y}(t,m)\big] \;\Rightarrow$$
$$-\zeta_b\,\dot{y}''(t,x) \approx -\frac{\zeta_b}{h_x^2}\left[\frac{y(n,m+1) - y(n-1,m+1)}{T_s} - \frac{y(n,m) - y(n-1,m)}{T_s}\right] - \frac{\zeta_b}{h_x^2}\left[\frac{y(n,m-1) - y(n-1,m-1)}{T_s} - \frac{y(n,m) - y(n-1,m)}{T_s}\right] \;\Rightarrow$$
$$-\zeta_b\,\dot{y}''(t,x) \approx -\frac{\zeta_b}{T_s h_x^2}\big[(y(n,m+1) - y(n-1,m+1)) - (y(n,m) - y(n-1,m))\big] - \frac{\zeta_b}{T_s h_x^2}\big[(y(n,m-1) - y(n-1,m-1)) - (y(n,m) - y(n-1,m))\big] \tag{5.29}$$
By combining equations (5.29) and (5.16) with table 5.1 we conclude:

$$Z = \zeta_b\,\frac{T_s}{h_x^2} \;\Leftrightarrow\; Z = \frac{\zeta_b}{F_s\,h_x^2} \tag{5.30}$$
Figure 5.14 illustrates the CA model of equation (5.26).
Figure 5.14 CA string model with losses corresponding to equation (5.26)
1-D linear string model with dispersion
The dispersive 1-D wave equation is given by equation (5.31) [Fletcher, Rossing 1998]:

$$\mu\,\ddot{y}(t,x) = T\,y''(t,x) - \kappa\,y''''(t,x) \quad\text{with}\quad \kappa = \frac{Y\pi\alpha^4}{4} \tag{5.31}$$
where κ is the bending stiffness constant, Y is the Young's modulus and α the radius of the string. This term, added to the ideal lossless string equation, is approximated by the central finite difference scheme:
$$-\kappa\,y''''(t,x) \approx -\frac{\kappa}{h_x^4}\big[6x(n,m) - 4\big(x(n,m+1) + x(n,m-1)\big) + x(n,m+2) + x(n,m-2)\big] \;\Rightarrow$$
$$-\kappa\,y''''(t,x) \approx \frac{\kappa}{h_x^4}\big[4(x(n,m+1) - x(n,m)) + 4(x(n,m-1) - x(n,m)) - (x(n,m+2) - x(n,m)) - (x(n,m-2) - x(n,m))\big] \;\Rightarrow$$
$$-\kappa\,y''''(t,x) \approx \frac{4\kappa}{h_x^4}\big[x(n,m+1) - x(n,m)\big] + \frac{4\kappa}{h_x^4}\big[x(n,m-1) - x(n,m)\big] - \frac{\kappa}{h_x^4}\big[x(n,m+2) - x(n,m)\big] - \frac{\kappa}{h_x^4}\big[x(n,m-2) - x(n,m)\big] \tag{5.32}$$
By combining equations (5.32) and (5.16) with table 5.1 we conclude:

$$K_{m+1,m} = \frac{4\kappa T_s^2}{h_x^4} \qquad K_{m-1,m} = \frac{4\kappa T_s^2}{h_x^4} \qquad K_{m+2,m} = -\frac{\kappa T_s^2}{h_x^4} \qquad K_{m-2,m} = -\frac{\kappa T_s^2}{h_x^4} \tag{5.33}$$
(5.33)
Some general observations concerning the CA representation of PDE models are:
× Time derivatives of order greater than two do not occur in CA (the first order derivative
approximates the frequency independed losses). This is not a problem as mixed derivatives are a
preferable solution for more accurate damping simulations [Smith 2005].
× Spatial derivatives of any order (mixed or not with first order time derivatives) may occur in CA.
There are many constraints though in order to interpret their finite difference approximations as FRES
and FFRO applied to one MAS element (for example in equation (5.33) the stiffness factor K may take
negative values).
5.3 State Space Representation
Multichannel systems that are lumped and linear can easily be described by state-space equations. The multichannel FD and FDe equations (5.3) and (5.4) can be written directly in first-order form [Kalouptsidis 1997]:
$$\dot{Y}(t) = [A_c]\,Y(t) + [B_c]\,U(t) \tag{5.34}$$

$$Y(n+1) = [A_d]\,Y(n) + [B_d]\,U(n) \tag{5.35}$$

$$Z(t) = [C_c]\,Y(t) + [D_c]\,U(t) \tag{5.36}$$

$$Z(n) = [C_d]\,Y(n) + [D_d]\,U(n) \tag{5.37}$$
Figure 5.15 From linear CA networks to state space representation
This formulation is called state space and is used extensively in control engineering. It offers an internal description of the system, since it captures not only the relationship between the input and output signals but also the state of the system through the state variables. Equations (5.34) and (5.35) are the state equations. The output is computed from the state vector Y(t) or Y(n) and the input vector U(t) or U(n). The terms that appear in these equations are given in table 5.4. Figure 5.16 depicts a common implementation of a DLTI (discrete-time linear time-invariant) system in state space.
Figure 5.16 State space DLTI system implementation
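A sketch that builds [Ad] and [Bd] as in table 5.4 for a two-mass system (illustrative parameters of our own) and cross-checks the state-space iteration (5.35) against the direct recursion of equation (5.4):

```python
import numpy as np

N = 2
M = np.eye(N)                                   # unit masses (CA units)
K = 0.01 * np.array([[2.0, -1.0], [-1.0, 2.0]])
Z = 0.1 * K                                     # proportional viscosity
Minv = np.linalg.inv(M)

A11 = 2 * np.eye(N) - Minv @ K - Minv @ Z       # upper-left block of [Ad]
A12 = Minv @ Z - np.eye(N)                      # upper-right block of [Ad]
Ad = np.block([[A11, A12], [np.eye(N), np.zeros((N, N))]])
Bd = np.vstack([Minv, np.zeros((N, N))])        # input matrix [Bd]

x1 = np.array([1.0, 0.0])                       # X(n-1): first mass displaced
x2 = np.zeros(N)                                # X(n-2)
Y = np.concatenate([x1, x2])                    # state Y(n) = [X(n-1); X(n-2)]

for n in range(100):
    Y = Ad @ Y                                  # state-space step, U(n) = 0
    x = A11 @ x1 + A12 @ x2                     # direct iteration of (5.4)
    x2, x1 = x1, x

assert np.allclose(Y[:N], x1) and np.allclose(Y[N:], x2)
```

Packing the two delayed position vectors into one state vector is what turns the second-order recursion (5.4) into the first-order form (5.35), to which the standard tools of control engineering apply.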
We reached the state equation from the finite difference representation of CA (figure 5.15). Other state equations can be found by choosing different state variables. The state of a system at a certain moment is the set of variables that suffices to determine the future behaviour of the system; accordingly, by choosing another set we arrive at another expression. The analogous electrical circuit description of CA, for example, leads us, via a systematic procedure described in [Lathi 1998], to work with forces and positions rather than with velocities and positions as in equations (5.34) and (5.35) (see figure 5.17).
Figure 5.17 From linear CA networks to state space representation
Table 5.4 Terms in the state space formulation

state vectors [2N×1]:
$$Y(t) = \begin{bmatrix} X(t) \\ \dot{X}(t) \end{bmatrix} \qquad Y(n) = \begin{bmatrix} X(n-1) \\ X(n-2) \end{bmatrix}$$

input vectors [N×1]:
$$U(t) = \big[F(t)\big] \qquad U(n) = \big[F(n-1)\big]$$

state matrices [2N×2N]:
$$[A_c] = \begin{bmatrix} [0]_{N\times N} & [I]_{N\times N} \\ -[M]^{-1}[K] & -[M]^{-1}[Z] \end{bmatrix} \qquad [A_d] = \begin{bmatrix} -[M]^{-1}[K] - [M]^{-1}[Z] + 2[I]_{N\times N} & -[I]_{N\times N} + [M]^{-1}[Z] \\ [I]_{N\times N} & [0]_{N\times N} \end{bmatrix}$$

input matrices [2N×N]:
$$[B_c] = \begin{bmatrix} [0]_{N\times N} \\ [M]^{-1} \end{bmatrix} \qquad [B_d] = \begin{bmatrix} [M]^{-1} \\ [0]_{N\times N} \end{bmatrix}$$

output matrices [N×2N]:
$$[C_c] = \big[\,[I]_{N\times N} \;\; [0]_{N\times N}\,\big] \qquad [C_d] = \big[\,-[M]^{-1}[K] - [M]^{-1}[Z] + 2[I]_{N} \;\; -[I]_{N} + [M]^{-1}[Z]\,\big]$$

feedthrough matrices [N×N]:
$$[D_c] = [0]_{N\times N} \qquad [D_d] = [I]_{N\times N}$$
State space models are remarkably interesting because they offer a link between most of the physical modelling formalisms, which intend to provide an internal description of the systems, and the signal-processing world. Depalle et al. [Depalle Rodet Matignon Pouilleute 1990] provide a methodology for the modular construction of musical instruments within this formalism. Furthermore, these models admit a direct realization scheme [Kalouptsidis 1997].
5.4 Kirchhoff Representation - Electrical Analogous Circuits
An electrical analogous circuit of a mechanical system is an electrical circuit in which currents and voltages are analogous to the velocities and forces of the mechanical system [Marshall 2003]. If voltage is the analog of force and current the analog of velocity, the circuit is called an impedance analog; similarly, if voltage is the analog of velocity and current the analog of force, the circuit is called a mobility analog. In electroacoustics, mechanical and acoustical systems are modelled with electrical circuits and simulated on digital computers by dedicated software packages, such as SPICE®.
In the previous sections we saw that CA networks can be seen as a discrete-time approximation of a subclass of mass-spring systems; hence electrical circuits may represent them easily. In practice, to pass from a CA network to a Kirchhoff network we first transform the CA network into a mass-spring network, and then, using the classical electro-mechanical analogies, obtain the Kirchhoff circuit. Figure 5.18 depicts this transition schematically.
Figure 5.18 From linear CA network to Kirchhoff network
The basic linear building modules <MAT> and <LIA> correspond to ideal basic circuit elements such as resistors, capacitors and inductors. Other elements commonly used in passive electrical network design are the open circuit, the short circuit, the voltage source and the current source.
In classical network theory the circuit variables are the voltage υ and the current i, while in CA theory the variables are the force f and the position x. The study of mechanical networks is based on the variables force and velocity, in contrast to the CA system.
The equations relating current i (or charge q) and voltage υ (or magnetic flux ϕ) for the basic circuit elements are given in tables 5.5 and 5.6. The sign for current and voltage may be chosen freely. In the tables we have chosen the signs that correspond directly to the equations of the mechanical analogs. We will see later, in the two-port models, that we do not use the same convention as standardized in the literature. By comparing the previous set of equations with the equations of table 5.4 we get the analogies between CA and electrical systems. Due to the specific form of CA, <MAT> modules correspond to one-ports and <LIA> modules correspond to two-ports. Two-port and one-port theory applied to the CA system is covered in the next section. Obviously, to express those equations in the discrete-time domain we use the approximations given by equations (5.1) and (5.2). Table 5.7 summarizes the CA/electrical circuit analogies for the impedance and the mobility analogs.
Table 5.5 Impedance analog of CA linear network

Mass (MAS) and fixed point (SOL):
$$\upsilon(t) = L\frac{d^2q(t)}{dt^2} \qquad\text{or}\qquad q(t) = c$$

Force and position sources:
$$\upsilon(t) = f(t) \qquad q(t) = x(t)$$

Elastic link (RES):
$$\upsilon_{Cij}(t) = \frac{1}{C}\,[q_i(t) - q_j(t)]$$

Friction link (FRO):
$$\upsilon_{Rij}(t) = R\left(\frac{dq_i(t)}{dt} - \frac{dq_j(t)}{dt}\right)$$

Combined link (REF):
$$\upsilon_{CRij}(t) = \frac{1}{C}\,[q_i(t) - q_j(t)] + R\left(\frac{dq_i(t)}{dt} - \frac{dq_j(t)}{dt}\right)$$
The CA formalism is sometimes mistakenly thought to have been conceived as a way of discretizing elementary mechanical elements or their electrical counterparts. However, the scope of CA is not to offer a numerical approach for expressing and simulating continuous-time models, but to propose a different methodology and an alternative point of view on physical modelling.
It would be interesting, though, to compare it with the Wave Digital Filters (WDF) developed by Fettweis [Fettweis 1986][Bilbao 2001], which were used principally for the discretization of analog filters. This comparison could offer an interesting interface and link with the digital waveguide physical modelling scheme, which shares a similar formalism with them.
It is far more convenient to represent a CA model by its mobility electrical analog, where the voltages are analogs of individual velocities. In the other case the currents are analogs of velocity differences, which poses difficulties since in CA all positions are measured and compared to a predefined point. The necessary steps for forming analogous electrical circuits are described clearly in [Marshall 2003]. In figure 5.19 we illustrate the mobility analog and the impedance analog of a CA model. The mobility analog circuit can be designed directly from a CA network simply by using this example as a reference. Unfortunately this is not true for the impedance analog, with the exception of a few structures such as the string illustrated in the same figure. A general, simple methodology is described below [Habibi 1997].
Table 5.6 Mobility analog of CA linear network

Mass (MAS) and fixed point (SOL):
$$i(t) = C\frac{d^2\varphi(t)}{dt^2} \qquad\text{or}\qquad \varphi(t) = c$$

Force and position sources:
$$i(t) = f(t) \qquad \varphi(t) = x(t)$$

Elastic link (RES):
$$i_{Lij}(t) = \frac{1}{L}\,[\varphi_i(t) - \varphi_j(t)]$$

Friction link (FRO):
$$i_{Rij}(t) = \frac{1}{R}\left(\frac{d\varphi_i(t)}{dt} - \frac{d\varphi_j(t)}{dt}\right)$$

Combined link (REF):
$$i_{LRij}(t) = \frac{1}{L}\,[\varphi_i(t) - \varphi_j(t)] + \frac{1}{R}\left(\frac{d\varphi_i(t)}{dt} - \frac{d\varphi_j(t)}{dt}\right)$$
× If we have n mobile masses we will have n+1 nodes.
× One of the n+1 nodes is the ground. The other nodes are connected to it through the corresponding mass dipoles.
× When two masses (or one mass and the ground) are connected by an interconnection-link, we use the corresponding dipoles for that interconnection.
The impedance analog circuits of structures more complicated than strings unfortunately do not offer intuitive networks, and their design demands a certain, though simple, manipulation of the mobility analog network. There are three rules that we have to respect for the representation of a CA network by a mobility analog Kirchhoff network:
i) in CA we define the velocity relative to a stable point: the ground. So in the equivalent Kirchhoff network all the mass dipoles are connected to the same node: the ground (infinite mass, no movement);
ii) in CA it is not possible to connect two interactions/links in series without a mass in between. Therefore in the equivalent Kirchhoff network every node has to be connected on one side to a mass dipole;
iii) in CA two masses are connected by one interaction-link of elastic and/or friction type. Therefore in the equivalent Kirchhoff network the corresponding nodes have to be connected by the dipoles that serve this interconnection. These dipoles can only be interconnected in parallel.
Figure 5.19 Electrical analogous circuits of a CA model: (a) mobility analog (b) impedance analog
The representation of CA models by electrical circuits is helpful. Most of the techniques and theories conceived in the field of electrical networks are directly applicable: Kirchhoff laws, Thevenin and Norton theorems, impedance analysis, calculation of two-port and scattering parameters, etc. A number of methods developed for filter design and synthesis may be adopted as well. Additionally, many acoustical and mechanical systems modelled in electroacoustics may be simulated with CA networks.
TABLE 5.7 CA electrical analogs

Impedance analog: $f \leftrightarrow \upsilon$, $x \leftrightarrow q$, $M \leftrightarrow L$, $K \leftrightarrow 1/C$, $Z \leftrightarrow R$
Mobility analog: $f \leftrightarrow i$, $x \leftrightarrow \varphi$, $M \leftrightarrow C$, $K \leftrightarrow 1/L$, $Z \leftrightarrow 1/R$
5.5 System Function Representation
The system function of a discrete-time system may be defined generally as the ratio of the z-transform of the output response to the z-transform of the input excitation, with all initial conditions set to zero. For a CA model the excitation or the response may be either a force or a position. This is not the case for mechanical networks, which may use velocities instead of positions.
Figure 5.20 System functions in CA network examples
A CA system function can have the form of an immittance function or a transfer function [Tomlinson 1991]. An immittance denotes both impedances and admittances. The immittance function is referred to as a driving-point immittance when it relates the force and the position at the same <MAT> or <LIA> element (Zd for driving-point impedance, Yd for driving-point admittance) and as a transfer immittance when it relates them at different elements (Zt for transfer impedance, Yt for transfer admittance). The transfer function is referred to as a position transfer function Hx when both the excitation and the response are positions, and as a force transfer function Hf when they are both forces. The above cases are illustrated in figure 5.20.
5.5.1 General System Block Diagrams
A system function is actually a system transformation: the Laplace transform for CLTI systems and the z-transform for DLTI systems [Kalouptsidis 1994]. Consequently, every LTI system S represented by the formalisms described in the previous sections can be expressed as a system function (multichannel in the general case). Figure 5.21 illustrates system block diagrams for a CLTI and a DLTI system.
$$S_s = \mathcal{L}\,S\,\mathcal{L}^{-1} \qquad (5.38)$$
$$S_z = \mathcal{Z}\,S\,\mathcal{Z}^{-1} \qquad (5.39)$$
Figure 5.21 (a) System block diagram of a CLTI system (b) System block diagram of a DLTI system
A system block diagram may represent an immittance or a transfer function. In this representation there is no interest in distinguishing these two cases. System functions are "black boxes": we care only about the input/output relations. As Smith points out for the application of system functions to physical modelling [Smith 2005]: "…Often we begin with a physical model, and decide which portions of the model can be "frozen" as "black boxes" characterized only by their transfer functions*. This is normally done only for linear, time-invariant model components where there is no need to ever "look inside the black box"".
It is clear that there are several methods to compute the system function of a CA model. An easy method is to use its electrical analog and perform all the mathematical operations in the z-domain, where they are expressed algebraically. The rules concerning one-port and two-port interconnections prove very helpful.
One other method to compute the transfer admittance is to use the state space model. The transfer
function matrix is given by the expression:
* In the present thesis we prefer to use the term system function. A transfer function is considered a sub-category of system function (chapter 5.5). When we use the term transfer function we give minimal information about the internal structure of the system: where we pick the input/output relations [Tomlinson 1991].
$$Y_t(z) = [C_d]\,(z[I] - [A_d])^{-1}[B_d] + [D_d] \;\Rightarrow\; Y_t(z) = \frac{1}{\det(z[I] - [A_d])}\,[C_d]\,\mathrm{adj}(z[I] - [A_d])\,[B_d] + [D_d] \qquad (5.40)$$
Ytij(z) is the admittance function relating the output Z to the input U (equations (5.35) and (5.37)). The
zeros of the polynomial det(z[I]-[Ad]) are the characteristic roots of the system.
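As a numerical sketch of equation (5.40), take a hypothetical single-mass model (invented values M = 1, K = 0.2, Z = 0.01, state matrices written in the discrete form of Table 5.4): the transfer admittance can be evaluated pointwise and the characteristic roots read off [Ad].

```python
import numpy as np

def Yt(z, Ad, Bd, Cd, Dd):
    """Transfer admittance of equation (5.40), evaluated at one point z."""
    n = Ad.shape[0]
    return Cd @ np.linalg.inv(z * np.eye(n) - Ad) @ Bd + Dd

M, K, Z = 1.0, 0.2, 0.01
Ad = np.array([[2 - K / M - Z / M, -1 + Z / M], [1.0, 0.0]])
Bd = np.array([[1 / M], [0.0]])
Cd = np.array([[2 - K / M - Z / M, -1 + Z / M]])
Dd = np.array([[1.0]])

# the zeros of det(zI - Ad) are simply the eigenvalues of Ad
poles = np.linalg.eigvals(Ad)
print(np.abs(poles))  # both inside the unit circle: a damped oscillation
print(Yt(np.exp(1j * 0.3), Ad, Bd, Cd, Dd).shape)
```

The inversion above is only illustrative; for large networks one would factor (z[I] − [Ad]) rather than invert it at every frequency point.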
5.5.2 One-Ports
A one-port is a system with a single pair of input/output terminals. In classical network theory the variable across the terminals is the voltage and the variable that enters the one-port is the current. Figure 5.22 depicts an electrical one-port. In CA, according to the analog used, we may have either the force or the position as the variable across the terminals.
A one-port is characterized mathematically by a system function that can take the form of a driving-point admittance Yd or of a driving-point impedance Zd. In table 5.8 we present the driving-point impedances of the CA modules. They were computed by the z-transform of the CA electrical analogs using the finite difference scheme of equation (5.2). For the <LIA> modules, which correspond to two-ports, we present the driving-point impedances seen looking into port 1 when port 2 is open.
Figure 5.22 One-port network
A general one-port may be constituted by an arbitrary interconnection of one-ports. Generally, for the computation of a general one-port from a CA network, we employ its electrical analog (figure 5.23). The simple rules used for the combination of one-ports in series and in parallel* facilitate this computation. It is important to note that the CA construction rules [Cadoz Luciani Florens 1990], [Cadoz Luciani Florens 1993] permit only a certain way of assembling primitive one-ports.
Figure 5.23 From linear CA networks to one-ports
* Impedances add in series and admittances add in parallel
TABLE 5.8 CA driving point impedances

$$Z_d^{MAS}(z) = \frac{f(z)}{x(z)} = \frac{1 - 2z^{-1} + z^{-2}}{(1/M)\,z^{-1}}$$
$$Z_d^{SOL}(z) = \frac{f(z)}{x(z)} \to \infty$$
$$Z_d^{RES}(z) = \frac{f(z)}{x(z)} = K$$
$$Z_d^{FRO}(z) = \frac{f(z)}{x(z)} = Z(1 - z^{-1})$$
$$Z_d^{REF}(z) = \frac{f(z)}{x(z)} = Z_d^{RES} + Z_d^{FRO} = K + Z(1 - z^{-1})$$
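As a sketch of how the one-port combination rules apply to these impedances, they can be coded as functions of z and combined numerically. The parameter values below are invented; note that, consistently with the table, the REF impedance is the series combination of RES and FRO:

```python
import cmath

# driving-point impedances of the CA modules as functions of z
# (Table 5.8 forms, hypothetical parameter values)
def Zd_MAS(z, M): return (1 - 2 * z**-1 + z**-2) / ((1 / M) * z**-1)
def Zd_RES(z, K): return K
def Zd_FRO(z, Zf): return Zf * (1 - z**-1)

def series(*Zs):
    """Impedances add in series."""
    return sum(Zs)

def parallel(*Zs):
    """Admittances add in parallel: Y = sum of 1/Z."""
    return 1.0 / sum(1.0 / Z for Z in Zs)

z = cmath.exp(0.5j)  # evaluation point on the unit circle
Z_ref = series(Zd_RES(z, K=0.3), Zd_FRO(z, Zf=0.02))  # REF = RES + FRO
print(abs(Z_ref))
```

This pointwise evaluation is enough to plot driving-point impedance magnitudes of small CA assemblies without any symbolic algebra.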
5.5.3 Two-Ports
In a similar manner, a two-port network is a system with two pairs of terminals. In figure 5.24 we illustrate the representation of an electrical two-port. The signs of currents and voltages are standardized according to this figure. These two-input / two-output structures are also called four-terminal networks or quadripoles. Classical network theory [Van Valkenburg 1960] and theoretical acoustics [Chaigne 2003] are to some extent concerned with the interconnections and the properties of N-ports, and more often of two-ports. A reference which covers several essential points of N-ports is the PhD dissertation of Stefan Bilbao [Bilbao 2001].
Figure 5.24 Two-port network
The mathematical description of two-port networks is carried out easily with matrices in the z-domain. To represent CA with two-ports there are two possible choices: we can use the DT Kirchhoff representation of CA, or we can start directly from the CA networks.
If we use the Kirchhoff representation, we must first express the electrical analogs of the CA one-ports of table 5.8 and then form the matrices according to network theory. Several possible forms exist which relate the four terminal variables [Dutoit Gosselin 2005]. In the CA case, it is clear that we do not use as terminal variables {υ1, i1, υ2, i2} but {υ1, q1, υ2, q2} for the impedance analog and {ϕ1, i1, ϕ2, i2} for the mobility analog.
Below we present the most typical representations, which are useful to our research: the transmission (or chain) matrix, the impedance matrix and the admittance matrix. The existing relationships among the two-port parameters of each two-port form - transmission parameters, impedance parameters and admittance parameters in the examined case - can easily be calculated or found in the literature in parameter conversion tables [Nilsson Riedel 2004].
Transmission matrix:
$$\begin{bmatrix} \upsilon_1 \\ i_1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} \upsilon_2 \\ -i_2 \end{bmatrix} \qquad (5.41)$$

Impedance matrix:
$$\begin{bmatrix} \upsilon_1 \\ \upsilon_2 \end{bmatrix} = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix} \begin{bmatrix} i_1 \\ i_2 \end{bmatrix} \qquad (5.42)$$

Admittance matrix:
$$\begin{bmatrix} i_1 \\ i_2 \end{bmatrix} = \begin{bmatrix} y_{11} & y_{12} \\ y_{21} & y_{22} \end{bmatrix} \begin{bmatrix} \upsilon_1 \\ \upsilon_2 \end{bmatrix} \qquad (5.43)$$
Typical two-ports, useful in the present research, are depicted in table 5.9. These are the networks of the CA electrical analogs (tables 5.5 and 5.6). For each network the transmission matrix is given. Again, for the CA two-ports we use the terminal variables υ, q (impedance analog) or i, ϕ (mobility analog).
TABLE 5.9 Transmission matrices of characteristic electric networks

$$\begin{bmatrix} 1 & z \\ 0 & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 & 0 \\ 1/z & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 + z_1/z_2 & z_1 \\ 1/z_2 & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 & z_2 \\ 1/z_1 & z_2/z_1 + 1 \end{bmatrix}$$
As we have already stated, we may employ the concept of two-ports directly in the digital signal-processing domain without the intervention of network theory. A digital two-port - or a two-pair, in Mitra's terminology - is nothing more than a two-input/two-output digital structure (figure 5.25) [Mitra 2001]. We may easily construct our two-pairs using what he defines as the chain matrix.
Figure 5.25 A digital two-pair
$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_2 \\ x_2 \end{bmatrix} \quad\text{where}\quad A = \left.\frac{x_1}{y_2}\right|_{x_2 = 0},\;\; B = \left.\frac{x_1}{x_2}\right|_{y_2 = 0},\;\; C = \left.\frac{y_1}{y_2}\right|_{x_2 = 0},\;\; D = \left.\frac{y_1}{x_2}\right|_{y_2 = 0} \qquad (5.44)$$
This particular matrix form has the interesting property that the matrix of the overall cascade interconnection of digital two-pairs is given by the product of the individual matrices of each two-pair. The same holds for the cascade interconnection of two-ports expressed by their transmission matrices.
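This cascade property can be checked numerically. The sketch below multiplies the chain matrices of the series-impedance and shunt-impedance sections of Table 5.9, with invented resistive impedance values standing in for the frequency-dependent entries:

```python
import numpy as np
from functools import reduce

def cascade(*chains):
    """Chain matrix of a cascade = product of the individual chain matrices."""
    return reduce(np.matmul, chains)

# series-impedance and shunt-impedance sections of Table 5.9,
# with illustrative (resistive) impedance values
def series_section(Z):
    return np.array([[1.0, Z], [0.0, 1.0]])

def shunt_section(Z):
    return np.array([[1.0, 0.0], [1.0 / Z, 1.0]])

total = cascade(series_section(0.3), shunt_section(2.0), series_section(0.3))
# each section is reciprocal (unit determinant), hence so is the cascade
print(np.linalg.det(total))
```

With frequency-dependent entries the same code applies, evaluated one point of the unit circle at a time.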
TABLE 5.10 CA digital two-pairs

MAS:
$$\begin{bmatrix} f_1 \\ x_1 \end{bmatrix} = \begin{bmatrix} \dfrac{1 - 2z^{-1} + z^{-2}}{(1/M)\,z^{-1}} & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_2 \\ f_2 \end{bmatrix}$$

SOL: not defined

RES:
$$\begin{bmatrix} x_1 \\ f_1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{K} & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} f_2 \\ x_2 \end{bmatrix}$$

FRO:
$$\begin{bmatrix} x_1 \\ f_1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{Z(1 - z^{-1})} & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} f_2 \\ x_2 \end{bmatrix}$$

REF:
$$\begin{bmatrix} x_1 \\ f_1 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{K + Z(1 - z^{-1})} & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} f_2 \\ x_2 \end{bmatrix}$$
In table 5.10 we provide the two-pair representation of the CA linear modules. It has been computed by applying equation (5.44) to the z-transform of the linear CA algorithms given in table 5.1. For the MAS algorithm we have used two force inputs.
The interconnection between a two-pair and a one-port G(z) is given by equation (5.45).
$$H(z) = \frac{y_1(z)}{x_1(z)} = \frac{C + D\,G(z)}{A + B\,G(z)} \qquad (5.45)$$
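Equation (5.45) is straightforward to evaluate pointwise. As a hedged illustration, take a hypothetical two-pair with chain parameters A = 1/K, B = 1, C = -1, D = 0 (a spring-like element), loaded by a friction one-port G(z) = Z(1 - z^-1); all parameter values are invented:

```python
import cmath

def terminated_two_pair(A, B, C, D, G):
    """Equation (5.45): transfer function of a two-pair loaded by a one-port G(z)."""
    return (C + D * G) / (A + B * G)

# hypothetical spring-like two-pair loaded by a friction one-port;
# K = 0.3 and Zf = 0.02 are illustrative values
K, Zf = 0.3, 0.02
z = cmath.exp(0.25j)             # one point on the unit circle
G = Zf * (1 - 1 / z)             # friction one-port evaluated at z
H = terminated_two_pair(1 / K, 1.0, -1.0, 0.0, G)
print(abs(H))
```

Sweeping z over the unit circle in the same way yields the frequency response of the loaded structure.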
5.5.4 Modal Representation
Modal representation can be considered as a particular case of system function representation [J. O. Smith 2005]. In computer music it has been used as a sound synthesis method called modal synthesis [Djoharian 1993][Adrien 1991][Florens Cadoz 1991].
To derive a modal description of a CA model it is preferable to compute the characteristic polynomial by making the matrix [Ad] of the state space representation diagonal (equation (5.35)). The modes are decoupled and we may write each mode independently. Consequently we obtain 2N parallel first-order systems, where N is the number of masses of our CA system. It is clear that the diagonal state space form is equivalent to a partial-fraction expansion of a transfer function.
If our system forms a viscosity compatible network, we can combine the conjugate poles to obtain a system of N parallel classical two-pole filters. Then our model can be physically represented as a set of independent elementary oscillators. In this case it is more convenient to reach the modal description of a CA system from its multichannel FD or FDe representation (equations (5.3) and (5.4)). The modal representation of a network ([M], [K], [Z]) is characterized by the transformation matrix [Q], where each column represents a mode shape, and by the diagonal stiffness and viscosity matrices [Km] and [Zm]. All these are given by equations (5.46) [Djoharian 1993]. The matrix [Q] is calculated in GENESIS by the Jacobi transformation algorithm.
$$[Q]^t [M][Q] = [I], \qquad [K_m] = [Q]^t [K][Q], \qquad [Z_m] = [Q]^t [Z][Q] \qquad (5.46)$$
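One numerical route to such a [Q] (not the Jacobi algorithm used in GENESIS, but a standard alternative) reduces (5.46) to a symmetric eigenproblem through a Cholesky factor of [M]. The matrices below are invented, with mass-proportional damping so that [Zm] also comes out diagonal:

```python
import numpy as np

def modal_transform(M, K, Z):
    """Find Q with Q^T M Q = I and Q^T K Q diagonal (equation (5.46)),
    via a symmetric generalized eigenproblem."""
    L = np.linalg.cholesky(M)          # M = L L^T
    Linv = np.linalg.inv(L)
    # standard symmetric problem on L^-1 K L^-T
    w, V = np.linalg.eigh(Linv @ K @ Linv.T)
    Q = Linv.T @ V
    Km = Q.T @ K @ Q                   # diagonal (eigenvalues w)
    Zm = Q.T @ Z @ Q                   # diagonal only for viscosity compatible networks
    return Q, Km, Zm

# invented two-mass network with mass-proportional (compatible) damping
M = np.diag([1.0, 2.0])
K = np.array([[0.3, -0.1], [-0.1, 0.2]])
Z = 0.05 * M
Q, Km, Zm = modal_transform(M, K, Z)
print(np.round(Q.T @ M @ Q, 6))  # identity
```

Each column of Q is then a mode shape, and the diagonal entries of Km and Zm give the modal stiffnesses and dampings.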
Modal models have many advantages. As Djoharian points out, "…modal modelling bridges the gap between the structural representation (geometric and dynamic) of the vibrating system objects and their perceptual properties". They preserve in a certain way the physicality of the modelled object and additionally they furnish its perceptual characteristics directly. Moreover they are very simple in terms of their computer simulation. The modal data (frequencies, damping coefficients and mode shapes) can be obtained mathematically or by physical measurements. Therefore it is easy to design accurate linear models, such as instrument resonators, by using these data.
The modal representation is really useful when we need to pass from a continuous-time system function (Laplace domain) to a discrete-time system function (z-domain) according to the CA simulation. The double discretization scheme adopted by CA unfortunately does not permit the use of a direct transformation method from the s-domain to the z-domain. Nevertheless this can be done for some cases (viscosity compatible networks) if we use the modal expressions and compare the form of the elementary oscillator in the continuous-time domain and in the CA formalism.
5.6 Digital Signal Processing Block Diagram Representation
Block diagrams offer a convenient structural representation of the computational algorithm of a system. This kind of representation, using interconnected basic building blocks such as adders, multipliers and unit delays, is the first step in the software or hardware implementation of a digital signal processing system [Mitra 2001]. The interconnections may be in cascade, in parallel or in feedback. Block diagrams contain all the information for the modelling and the simulation of a physical system.
Figure 5.26 From CA network to digital signal processing block diagrams
It is interesting to notice that, just as digital signal processing block diagrams offer a decomposition of the system into interconnected subsystems performing elementary mathematical operations, CA networks suggest a similar decomposition into subsystems performing "elementary physical operations". Each subsystem in both cases is characterized by its input/output relationships. The mathematical blocks are close to the computing machine and encourage "Signal Thinking", while the physical blocks are close to our mental image of the physical world and encourage "Physical Thinking". The first approach is more symbolic/mathematical and the second more material/physical. This type of CA modularity, where each element preserves an "experimentable" physical nature and quality, was one of the basic demands in the design of this formalism.
From the input/output relationships of the CA modules given in table 2 we can construct the block diagrams. There are various ways to realize or simulate these algorithms and consequently to represent them by these elementary functional elements. In tables 5.11 and 5.12 we represent the CA modules using ordinary signal processing block diagrams. As we can see, the subsystems are always interconnected using feedback links. This is a direct derivation from Newtonian Mechanics.
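As an illustration of these feedback interconnections, the algorithm of a single MAS module attached to the ground through a REF link (spring plus backward-difference friction) is only a few lines long; the parameter values below are illustrative:

```python
# minimal sample-by-sample simulation of one <MAT> (MAS) connected to the
# ground by one <LIA> (REF = spring + friction): a damped CA oscillator
M, K, Z = 1.0, 0.05, 0.005           # illustrative parameter values
x_prev, x_prev2 = 1.0, 1.0           # initial displacement, zero initial velocity
out = []
for n in range(2000):
    # link force from past positions (backward difference for the velocity)
    f = -K * x_prev - Z * (x_prev - x_prev2)
    # MAS update: x(n) = 2 x(n-1) - x(n-2) + (1/M) f(n-1)
    x = 2 * x_prev - x_prev2 + f / M
    out.append(x)
    x_prev2, x_prev = x_prev, x
print(max(abs(v) for v in out))      # bounded: the network is passive
```

The one-sample delay between the force computation and the mass update is exactly what keeps the feedback loop free of delay-free paths.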
When we represent a CA model with block diagrams, its physical constitution is no longer easily perceptible or detectable. A direct consequence of this entirely functional point of view is the loss of the "Physical Instrumental Interaction". In this case the control problem normally takes the form of a mapping between the control signals and the available input parameters of the system. Nearly every physical modeling approach follows this non-physical control paradigm.
Table 5.11 Block diagrams of linear CA modules
Table 5.12 Block diagrams of nonlinear CA modules
There are certainly many advantages in representing the physical model in block diagram form. Mitra summarizes them as follows [Mitra 2001]: i) ease in the derivation of the computational algorithm by inspection; ii) ease in the determination of the input/output relation; iii) ease in the manipulation of the diagram to derive an "equivalent" with a different computational algorithm; iv) ease in the determination of the hardware requirements and complexity; v) ease in developing different block diagram representations from transfer functions. We may add to this list the ease of determining the complexity of the computational algorithm and the ease of detecting delay-free loops, i.e. feedback branches without delay elements.
We must note that one important reason for the double discretization scheme - centered for the acceleration and backward for the velocity - adopted by the CA formalism was the delay-free loop difficulty. Within the framework of physical modeling, this problem led to the notions of T-simulable and *-simulable objects [Cadoz Luciani Florens 1990] [Cadoz Luciani Florens 1993].
5.7 Wave Flow Representation: Interfacing the Digital Waveguides with the CA
We have seen that a CA system employs two types of physical variables: forces (intensive variables) and displacements (extensive variables). Therefore it can generally be categorized as a K-modeling approach - K comes from Kirchhoff [Karjalainen, Erkut 2004]. Other widely used physical modeling systems in the domain of computer music use wave variables. Digital waveguides (DWG) and wave digital filters (WDF) are typical paradigms of this W-modeling approach - W comes from wave. In this section we will try to answer two questions:
× Is it possible to represent CA models by DWG models and vice versa?
× Can we mix both representations to derive a mixed model?
In chapter 5.2 we proposed a methodology to represent CA networks by FD models*. For the case of the 1-D ideal string we have proven that the two models are equivalent (using the spatial step of equation (5.15)). The same holds for the DWG and the FD models [Smith 2004]. Therefore, it is possible to represent a CA 1-D ideal string (with or without losses) accurately by digital waveguides (figure 5.27).
The interest in combining different physical modeling approaches, especially in the context of block-based modeling, has appeared in much of the computer music research of the last decade. Hybrid methods have been reported that combine DWG methods with FD [Erkut Karjalainen 2002], [Karjalainen 2003], [Karjalainen 2004], [Karjalainen 2004b] and WDF with State Space Structures [Petrausch, Rabenstein 2005b].
Figure 5.27 From linear CA to DWG networks
* It is clear that there is no interest in representing CA networks by WDF, as both are discrete-time lumped-element formalisms. Every element in both formalisms can be interpreted physically, so the correspondence between the modules is one-to-one (though the corresponding CA network may not exist). Of course they have different numerical properties.
The difficulty in connecting DWG structures with CA models arises from the fact that those systems use different variables. This case is referred to as port incompatibility [Rabenstein et al. 2007]. Furthermore, although CA and DWG blocks may be correctly synthesized separately, their interconnection may cause delay-free loops [Mitra 2001] and consequently prevent their simulation. Another issue that makes things a bit more complicated is the stability of the composed network. However, if the components are passive [Smith 2005] - in other words they do not create energy (all un-amplified acoustic musical instruments are passive) - and the Kirchhoff laws are followed, this problem has no reason to occur.
The interface between CA and DWG networks is achieved with the help of an intermediate block. Following the relevant literature, we call this an interface converter. Its computation is based on the definition of passive reflectance. In figure 5.28 we illustrate a wave flow diagram terminated by a lumped impedance expressed as a reflectance.
Figure 5.28 A wave flow diagram terminated by a lumped impedance expressed as (a) a force reflectance (b) a velocity reflectance
We distinguish two possibilities for the adaptors, according to the CA input variables [Kontogeorgakopoulos Cadoz 2008b]. If the input (to the CA block) is the displacement, then the first block of the CA model must be a <LIA> module and the input/output communication points are of type x/f. In the case of DLTI models the input/output relation is a driving-point admittance Yd. On the contrary, if the input is the force, then the first block of the CA model must be a <MAT> module and the input/output communication points are of type f/x. In the case of DLTI models the input/output relation is a driving-point impedance Zd.
Our aim is i) to maintain the structure of the DWG and the CA model and ii) to obtain a final hybrid
structure where the CA and the DWG models are interconnected through a two-port junction (figure
5.29).
Figure 5.29 The general structure of the DWG to CA converter
A definition of the force and velocity reflectance is (figure 5.28):
$$S_f(z) = \frac{f^-(z)}{f^+(z)} = \frac{Z_{CA} - Z_{DWG}}{Z_{CA} + Z_{DWG}} \qquad (5.47)$$
$$S_v(z) = \frac{v^-(z)}{v^+(z)} = -S_f(z) \qquad (4.48)$$
Equation (5.47) can be expressed as well by admittances:
$$S_f(z) = \frac{Y_{DWG} - Y_{CA}}{Y_{DWG} + Y_{CA}} \qquad (5.49)$$
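A quick numerical sketch of (5.47): terminating a waveguide of (hypothetical) unit wave impedance with the driving-point impedance of a CA friction module gives a reflectance of magnitude at most one over the whole frequency axis, as expected from a passive termination:

```python
import cmath

def Sf(Z_CA, Z_DWG):
    """Force reflectance of equation (5.47)."""
    return (Z_CA - Z_DWG) / (Z_CA + Z_DWG)

# waveguide of illustrative wave impedance 1.0, terminated by a CA
# friction module with driving-point impedance Zd(z) = Z (1 - z^-1)
Z_DWG = 1.0
mags = []
for k in range(1, 256):
    z = cmath.exp(1j * cmath.pi * k / 256)   # upper half of the unit circle
    Z_CA = 0.5 * (1 - 1 / z)
    mags.append(abs(Sf(Z_CA, Z_DWG)))
print(max(mags))  # never exceeds 1 for a passive termination
```

The bound holds because the friction impedance has non-negative real part on the unit circle, so the reflectance maps it into the unit disk.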
We develop equation (5.47) in order to obtain the desired arrangement:
$$S_f(z) = \frac{Z_{CA} - Z_{DWG}}{Z_{CA} + Z_{DWG}} = \frac{Z_{CA} - 2Z_{DWG} + Z_{DWG}}{Z_{CA} + Z_{DWG}} \;\Rightarrow\; S_f(z) = 1 - 2Z_{DWG}\,\frac{1}{Z_{CA} + Z_{DWG}} \;\Rightarrow$$
$$S_f(z) = \begin{cases} 1 - 2Z_{DWG}\,\dfrac{1}{1/Y_{CA} + Z_{DWG}} \\[2mm] 1 - 2Z_{DWG}\,\dfrac{1}{1/Y_{DWG} + Z_{CA}} \end{cases} \qquad (4.50)$$
If we start from equation (5.49) we get a similar result:
$$S_f(z) = \begin{cases} -1 + 2Y_{DWG}\,\dfrac{1}{1/Z_{CA} + Y_{DWG}} \\[2mm] -1 + 2Y_{DWG}\,\dfrac{1}{1/Z_{DWG} + Y_{CA}} \end{cases} \qquad (4.51)$$
The block of figure 5.30 has the following input/output relation:
$$\frac{y(z)}{x(z)} = \frac{1}{1/A + B} \qquad (4.52)$$
Figure 5.30 Feedback connection of systems A and B
Therefore, from equations (4.50) and (4.51) and the block diagram of figure 5.30, we design our DWG to CA adaptors as illustrated in figure 5.31. It is clear from equation (4.48) that when we use velocity waves in the DWG structure, we only add a multiplier of -1 at the input of the DWG part and keep exactly the same structure.
Figure 5.31 DWG to CA converters: (a) force to force (b) force to position (c) an alternative design for
the case force to position (d) an alternative design for the case force to force
We have used the simplest first-order approximations for integration and differentiation in order to pass from displacement to velocity variables and vice versa. This was a natural choice since these schemes are used in the CA algorithms.
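These first-order schemes are just a backward difference and its running-sum inverse; a minimal pure-Python sketch (with T = 1):

```python
# first-order conversions between displacement and velocity samples,
# matching the backward-difference scheme of the CA algorithms (T = 1)
def velocity(x):
    """v(n) = x(n) - x(n-1): backward-difference differentiator."""
    return [xn - xp for xp, xn in zip([0.0] + x[:-1], x)]

def displacement(v, x0=0.0):
    """Inverse operation: running accumulation of the velocity samples."""
    out, acc = [], x0
    for vn in v:
        acc += vn
        out.append(acc)
    return out

x = [0.0, 1.0, 3.0, 2.0]
print(displacement(velocity(x)))  # recovers [0.0, 1.0, 3.0, 2.0]
```

The two operations are exact inverses of each other (given the initial condition), which is what makes the converters of figure 5.31 consistent.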
As we can verify from the CA block diagrams (table 5.11), when the first module is of <MAT> type, delay-free loops never occur (converters (a) and (d) of figure 5.31). In contrast, when the first module is of <LIA> type, delay-free loops do occur. That explains why we have used simple delays in converters (b) and (c). As this way of breaking the delay-free loop is quite naïve, it is preferable to avoid the use of those converters. We must remember that the CA models are passive, so theoretically we have no stability issues except in those two converters, where the supplementary delay affects the dynamics of the system and may cause instabilities. Note that converter (d) can be found directly in the literature* [Karjalainen Erkut 2004]. Here we have presented a more detailed picture of all the possible solutions concerning the CA/DWG interface. We have reached several designs having as a starting point the reflectance function, rather than the application of Kirchhoff laws at a junction point.
* In this research it is not presented as a converter but as an N-port scattering junction. Since it converts K-variables to W-variables we consider it a converter.
Chapter 6
CORDIS-ANIMA Physical Audio Effects Model Design
In chapter 4 we presented the concept of Physical Audio Effects. Our purpose was to demonstrate a general conceptual framework that is as independent as possible from any modeling and simulation system architecture. In this chapter we will present the adopted system architecture. Moreover, we will introduce the design and simulation process. The goal of this research was therefore twofold from the beginning:
Define a system architecture and an approach for the design and development of physical audio effect algorithms clearly oriented towards musical purposes. In this context, we claim that the digital audio effect design procedure is essentially a creative and artistic process. We seek a modeling practice that starts from scratch and permits a straightforward exploration of new sonic possibilities based on sound transformation techniques. A stream processing architecture, where the system is briefly a collection of blocks that compute in parallel and communicate data via channels, was a preferable choice as it facilitates many aspects of the "artistic" conception of sound processing algorithms.
Design and simulate physical audio effects models that satisfy the following requirements:
× the signal processing part of the effect is a simulated passive physical object
× they support instrumental interaction
× they run in real time
× they are modular
× they are as intuitive as a physical object
× they are reasonably simple from a functional point of view
The CA modeling and simulation system features many of these characteristics, so it has been chosen as ideal for the fulfillment of our aims. The description of the general formalism [Cadoz Luciani Florens 1993] attests clearly that the present research and the "raison d'être" of the CA system share very common goals. Hence it was a natural choice to propose a system architecture totally based on the CA system.
Design, as a process, necessitates extensive and broad research, thought, modeling, iterative adjustment and redesign [C. Magnusson 2007]. In figure 6.1 we illustrate our design approaches diagrammatically. As we can see, three different alternative strategies for the design and the simulation of our physical audio effects models have been followed: (a) new designs based on abstract ideas but always in the context of "physical thinking"; (b) designs of preexisting effects directly from their formal-mathematical description; (c) designs of versions of preexisting effects by re-interpreting their basic functionality in the domain of physical modeling. It is clear that CA system analysis was an indispensable part of the research in order to follow the (b) and (c) design methods.
Figure 6.1 Three different alternative strategies for the design and the simulation of our physical audio
effects models
Traditionally, the definition of network synthesis (design + simulation), as opposed to network analysis, is [Van Valkenburg 1960]: "If the network is given and the response is to be determined, the problem is defined as analysis. When the excitation and the response are given and it is required to determine a network, the problem is defined as synthesis". Hence, according to this definition, in network synthesis we are concerned with the design of networks to meet prescribed excitation-response characteristics. However, in the present thesis the response is not strictly determined. We are looking to design models that converge towards a more general, musically meaningful behavior, inspired to a certain extent by other digital audio effects and by physical phenomena.
6.1 System Architecture
Initially, our aim was to use only the special CA version that appears in the GENESIS software [Castagne Cadoz 2002b]. We found out that it was often neither possible nor essential, for the goals of our research, to stay strictly attached to this environment. Without doubt, it would be preferable to have the facility to simulate our models directly with GENESIS, in order to make them straightforwardly available to the number of artists that use it. Even if that proved impractical and unachievable for several models, the concept of "Physical Thinking" promoted by this software has always been respected and has guided all parts of this research.
Figure 6.2 The proposed system architecture (xsi(n): input sound signals, xgi(n)/ygi(n): gestural
input/output signals, xci(n): control signals, ysi(n): output sound signals)
Figure 6.2 illustrates the proposed system architecture. At the core of the system are positioned the CA models that transform the input sounds xsi(n). If necessary, these sound signals are calibrated in amplitude before entering the model by the sound input calibration block. In a similar manner, a calibration appears for the gestural signals xgi(n)/ygi(n) in the gesture i/o calibration block. Remember that the central concept of this research is the possibility of establishing an instrumental interaction between the audio effect model and the human operator through the depicted gestural port. Therefore, the novelty appears in this part of the architecture.
Apart from the gestural signals, we have certain control signals xci(n) that modify dynamically (during the simulation) or non-dynamically some CA model parameters. It is evident that it is more a matter of
meaning than a technological issue to decide which of the CA parameters can be altered dynamically. Moreover, they control several other parameters of the i/o calibration modules and of the sound output post-processing module. The mapping module was necessary in order to pass from certain perceptual parameters to CA ones. The digital audio effects models of chapter 7 will clarify the need for such an architecture.
6.2 Components and Modules
Basically every system consists of subsystems and similarly every system can be viewed as a subsystem
of another system. For example the basic building blocks of digital signal processing are the adder, the
constant multiplier, the signal multiplier and the unit delay element [Proakis Manolakis 1996]. It is clear
that we could go lower and decompose those elementary arithmetic blocks into interconnected
subsystems: in this decomposition the corresponding subsystems could be the Boolean operands and
their interconnection (their implementation algorithm) depends on the adopted type of number
representation [Zolzer 1997][Patterson Hennessy 2005]. However, this last decomposition is in general outside the scope of digital signal processing, as it is concerned more with the computer architecture domain.
The toolbox we employed in this research for the conception and the design of the digital audio effect algorithms is the subset of one-dimensional CA elements. The "physicality" of the primitive CA modules imposed the desired physical and intuitive design approach. It is clear that since we follow a certain modular system architecture, such as CA, we must respect and not violate its principles. Hence disassembling the modules or adding ad-hoc inconsistencies to the formalism components is considered bad design and is strictly avoided.
Apart from the classical CA algorithms of table 5.1, we have developed a few new ones. In the
definition of CA formalism the nonlinear <LIA> modules are conditional links represented by finite state
automata [Florens, Cadoz 1991]. On the other hand, in the GENESIS software we come across an implementation based on lookup tables. In our system we have used a combination of both approaches. We have also used polynomials and truncated polynomial expansions for the approximation of more special functions such as the square root [Mitra 2001].
For the non-CA blocks of the system architecture of figure 6.2 we have still used low-level functional
units: adders/multipliers, comparison operators, selection switches and lookup tables. Therefore the block
constitution of our system is still respected. The goal was to represent the global models by block
diagrams. In [Borin De Poli Sarti 1992], three basic criteria are cited for partitioning a model into sub-model blocks: (a) physical resemblance, (b) functional structure and (c) simplicity of the formal description. We have used the functional criterion to define the non-CA blocks.
We have stated before that stream-based computation, i.e. with block diagrams, is a very strong conceptual tool, especially for the design of digital audio effects. The design approach depicted in figure 6.2 is clearly based on the direct interaction of the user with the flow graph visual representation of the
algorithm. Moreover, block diagrams can be expressed by equivalent algebraic approaches [Orlarey Fober
Letz 2002]. The advantage of this representation is not immediately apparent. According to the authors
of this paper: “Algebraic representations of block diagrams have several interesting applications for visual
programming languages. First of all they are useful to formally define the semantic of the language and,
as stated in the introduction, there is a real need for such formalizations if we want our tools (and the
music based on them) to survive.”
Figure 6.3 illustrates all the necessary primitive blocks used for the research and design of our physical
audio effect models. Apart from the CA modules, several low-level signal processing modules have been employed, such as adders/subtractors, multipliers, lookup tables, selection switches (which select one of several input streams according to an index stream) and comparison operators.
Figure 6.3 Primitive blocks used for the research and design of physical audio effect models
6.3 Modules Interconnection and Construction Rules
After outlining the primitive modules in the previous section, we have to define the system architecture that will identify the general structure of our effects designs. Since the CA system has principally been adopted for the essential signal processing part of our algorithms, it is not necessary to repeat its formalism [Cadoz Luciani Florens 1993]. However, we will focus on some interesting points that have a direct impact on this research and try to clarify them through a rigorous mathematical approach.
Discrete-time systems can be interconnected in order to form larger ones. The ways in which they can
be interconnected are in cascade, in parallel and in feedback (figure 6.4). For the parallel and the
feedback interconnection a third system is needed to perform the addition. We will now study the
Single-Input Single-Output (SISO) situation.
Figure 6.4 Interconnection of SISO systems
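As an illustrative sketch only (the representation and all names below are ours, not part of the CA toolbox), the three SISO interconnections can be expressed in a few lines of Python, treating a system as a per-sample callable and closing the feedback loop with the unit delay that makes it computable:

```python
# Illustrative sketch: a SISO discrete-time system as a per-sample callable,
# combined in cascade, in parallel (a third system, the adder, sums the
# branches) and in feedback (a unit delay closes the loop).

def make_gain(k):
    return lambda x: k * x

def cascade(s1, s2):
    # the output of system 1 feeds the input of system 2
    return lambda x: s2(s1(x))

def parallel(s1, s2):
    # both systems receive x; an adder combines the two branch outputs
    return lambda x: s1(x) + s2(x)

def feedback(forward, back):
    # y(n) = forward(x(n) + back(y(n-1))); the unit delay makes the loop computable
    state = {"y_prev": 0.0}
    def system(x):
        y = forward(x + back(state["y_prev"]))
        state["y_prev"] = y
        return y
    return system
```

For instance, feeding a constant input repeatedly into `feedback(make_gain(0.5), make_gain(0.5))` produces the familiar geometric convergence of a one-pole feedback loop.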
Digital audio effects units are normally interconnected in parallel or in cascade. Multi-effects processors are good examples that exhibit this characteristic [Nielsen 1999]. Clearly, the feedback interconnection is not desirable, since the objective is that the second effect device does not load and alter the dynamic behavior of the previous one.
One of the most significant particularities of the CA system is that the modules are always
interconnected in feedback configuration. This is desirable most of the time as it is a consequence of
the physical laws that justify the dynamic principles of the overall system. In general all physical
modeling approaches must respect this important property. Since, in the context of computer simulation, the "physical communication" between models is discrete and "finite", in the sense that it is accomplished through a finite number of 1-D or multidimensional signals, these signals appear in pairs [Cadoz Luciani Florens 1993]. That is why, in general, the interconnection of physical models is better described by ports [Rabenstein et al. 2007].
If a release from this constraint is critical to our designs, we should follow the physical paradigm: the nature of the interaction between physical objects is determined by their impedances and by their internal sources of energy. It is evident that the exchange of energy between the physical objects, compared to their internal energy, provides a direct measure of the interaction. Figure 6.5 illustrates schematically the four possible types of interaction between two CA physical models.
Figure 6.5 Possible types of interaction between two CA physical models (a) coupled (b) decoupled (c)
model 1 drives model 2 (d) model 2 drives model 1
It is easy to measure and study this problem mathematically for linear CA models. In Habibi's thesis [Habibi 1997] a method based on the continuous-time Kirchhoff representation of CA networks is reported. Below we present briefly a simpler version of it that expresses rigorously the stated problem by employing the discrete-time Kirchhoff representation of CA presented in chapter 5.4.
Using the Norton/Thevenin equivalents of the two interacting models (the approach can be generalized to any number of models), we may compute the exchanged energy E (equation (6.3)) and compare it with the internal energies E1 and E2 (equations (6.1), (6.2)) over a certain period of the simulation time N. The initial conditions are represented in the electrical circuit by equivalent sources [Lathi 1998]. Often in CA models, and especially in GENESIS, there are no input sources but only non-zero initial conditions. In figure 6.6 we illustrate the mobility analogs of two CA models in interaction.
E_1 = \frac{1}{N}\sum_{n=0}^{N-1} i_{z1}(n)\,\nu(n) \qquad (6.1)

E_2 = \frac{1}{N}\sum_{n=0}^{N-1} i_{z2}(n)\,\nu(n) \qquad (6.2)

E = \frac{1}{N}\sum_{n=0}^{N-1} i(n)\,\nu(n) \qquad (6.3)
In table 6.1 we present the conditions on the energetic exchanges under which each type of interaction of figure 6.5 occurs.
Figure 6.6 Mobility analogs of two CA models in interaction
TABLE 6.1 - Energetic conditions for each type of interaction between 2 CA models
(a) coupled: E_1 \ll E and E_2 \ll E
(b) decoupled: E_1 \gg E and E_2 \gg E
(c) model 1 drives model 2: E_1 \gg E and E_2 \ll E
(d) model 2 drives model 1: E_1 \ll E and E_2 \gg E
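As a sketch of how this energetic test could be applied numerically, the fragment below implements equations (6.1)-(6.3) and a threshold-based classification. The signal names, the threshold ratio and the example values are illustrative assumptions, not part of the thesis toolbox:

```python
# Sketch of the energetic test of table 6.1 (equations (6.1)-(6.3)).
# i_z1, i_z2 hold the internal-source currents of the two models, i the
# exchanged current, and v the common velocity signal of the mobility analog;
# the threshold 'ratio' and all example values are illustrative assumptions.

def classify_interaction(i_z1, i_z2, i, v, ratio=100.0):
    N = len(v)
    E1 = sum(i_z1[n] * v[n] for n in range(N)) / N   # equation (6.1)
    E2 = sum(i_z2[n] * v[n] for n in range(N)) / N   # equation (6.2)
    E  = sum(i[n] * v[n] for n in range(N)) / N      # equation (6.3)
    if E >= ratio * E1 and E >= ratio * E2:
        return "coupled"                 # E1 << E and E2 << E
    if E1 >= ratio * E and E2 >= ratio * E:
        return "decoupled"               # E1 >> E and E2 >> E
    if E1 >= ratio * E and E >= ratio * E2:
        return "model 1 drives model 2"  # E1 >> E and E2 << E
    if E2 >= ratio * E and E >= ratio * E1:
        return "model 2 drives model 1"  # E1 << E and E2 >> E
    return "intermediate"
```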
Of course we may adapt purely signal processing techniques to control precisely the type of interaction of two models. In CA this is achieved simply by omitting the feedback link of the interconnection. In the domain of electronics, a typical way of achieving this is the use of unity-gain buffer amplifiers. It is clear that this is not a physical approach, and hence it should be avoided in the context of physical modeling and in our digital audio effects designs.
A procedure that elegantly provides the condition under which two CA models can be interconnected in a feedforward configuration is presented below. We represent the two CA models as two-ports (chapter 5.5.3) by their transmission matrices A' and A'' (figure 6.7). Their transfer functions H' and H'', the impedance looking into the input port of model 2 (we use the symbol '') Z''11 and the admittance looking into the output port of model 1 Y'22 are given by the following formulas [Dutoit Gosselin 2005]. For the sake of economy concerning the size of the manuscript we will not be overly analytical.
Figure 6.7 Transfer function of two two-ports in cascade
H' = \left.\frac{\nu'_2}{\nu'_1}\right|_{i'_2=0} = \frac{1}{A'_{11}} \qquad (6.4)

H'' = \left.\frac{\nu''_2}{\nu''_1}\right|_{i''_2=0} = \frac{1}{A''_{11}} \qquad (6.5)

Z''_{11} = \left.\frac{\nu''_1}{i''_1}\right|_{i''_2=0} = \frac{A''_{11}}{A''_{21}} \qquad (6.6)

Y'_{22} = \left.\frac{i'_2}{\nu'_2}\right|_{\nu'_1=0} = \frac{A'_{11}}{A'_{12}} \qquad (6.7)
A cascade interconnection of two-ports 1 and 2 results in a two-port with transmission matrix A and transfer function H given by equation (6.8):
A = A'\,A'' \;\Rightarrow\;
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} =
\begin{bmatrix} A'_{11} & A'_{12} \\ A'_{21} & A'_{22} \end{bmatrix}
\begin{bmatrix} A''_{11} & A''_{12} \\ A''_{21} & A''_{22} \end{bmatrix}

H = \frac{1}{A_{11}} = \frac{1}{A'_{11}A''_{11} + A'_{12}A''_{21}}
= \frac{1}{A'_{11}}\,\frac{1}{A''_{11}}\,\frac{1}{1 + \dfrac{A'_{12}A''_{21}}{A'_{11}A''_{11}}} \qquad (6.8)
Using (6.4)-(6.7), equation (6.8) can be rewritten as:
H = H'\,H''\,\frac{1}{1 + \dfrac{1}{Y'_{22}\,Z''_{11}}} \qquad (6.9)
So it is now obvious from (6.9):
H \approx H'\,H'' \quad \text{for} \quad Y'_{22}\,Z''_{11} \to \infty \qquad (6.10)
The last condition is general, but also very important. It corresponds to a classical configuration widely used in electronics, named impedance bridging. If we consider system A' as the source and system A'' as the load, equation (6.10) can be written:
Z_{\mathrm{load}} \gg Z_{\mathrm{source}} \qquad (6.11)

The transfer function of a linear system, such as a linear CA model, is unaltered if each impedance is multiplied by the same factor k. Therefore impedance scaling is a typical solution to control physically the type of interconnection between CA sub-models.
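A small numeric check of equations (6.9) and (6.10) illustrates impedance bridging: scaling the load-side impedance by a factor k drives the cascade transfer function toward the product H'H''. All numeric values below are illustrative:

```python
# Numeric check of equations (6.9)-(6.10): the cascade transfer function
# H = H'H'' / (1 + 1/(Y'22 Z''11)) tends to H'H'' as Y'22 Z''11 grows.
# H1, H2, Y22, Z11 and the scaling factors k are illustrative values.

def cascade_H(H1, H2, Y22, Z11):
    # equation (6.9) for the cascade of two two-ports
    return H1 * H2 / (1.0 + 1.0 / (Y22 * Z11))

H1, H2 = 0.8, 0.5       # individual transfer functions H', H''
Y22, Z11 = 1.0, 1.0     # output admittance of model 1, input impedance of model 2
for k in (1.0, 10.0, 1000.0):
    # impedance scaling of model 2 by k raises Z''11 to k*Z11 (section 6.3)
    print(k, cascade_H(H1, H2, Y22, k * Z11))
```

The printed values approach H'H'' = 0.4 as k grows, which is exactly the bridging condition of equation (6.11).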
6.4 User Interface
The term User Interface is used to describe the “medium” by which the user interacts with a system –
a digital signal processing system in our case. Technically speaking an interface can be a software entity,
a hardware entity or both. In Human Computer Interaction a variety of interfaces have been reported
such as command line interfaces, graphical user interfaces, gestural interfaces, etc. In chapter three we presented briefly the main tendencies regarding the control of Digital Audio Effects.
Even if the present research is not addressing the design of user interfaces, the novelty of our contribution is directly related to this question. According to the proposed system architecture of figure 6.2, two different approaches are employed to control the physical audio effect model. The first one employs some form of mapping of the user's actions and gestures into appropriate parameter values needed to drive the sound processing algorithm. This practice is used mostly to tune the models statically. Nevertheless it has been used to a small extent for dynamic sound modifications, as we will see later on.
The second approach suggests a completely novel tactic to interact with the digital audio effect. In this scenario we are not driving the sound processing algorithm: the term control is not the proper one to describe this situation. Following the generic concept of physical audio effects presented in chapter four and the theory of instrumental interaction [Cadoz 1994], this new paradigm of designing, thinking about and interacting with digital audio effects has emerged.
Evidently specific technologies are needed to implement this new type of digital audio effects.
Appropriate and accurate force feedback devices must be developed and used. This problem, in an
artistic context applied to sound synthesis, was confronted for the first time in 1978 by Florens [Florens
1978]. Since then a number of similar devices have been developed under the name TGR in the same
laboratory (ACROE) [Cadoz Luciani Florens 1984][Cadoz Lisowski Florens 1990][Florens Luciani Cadoz
Castagne 2004].
Unfortunately, only a small number of force-feedback systems have had a considerable commercial impact*. In general they remain expensive and consequently are not usually affordable for musicians.
* the PHANToM desktop haptic device is an affordable solution [Massie Salisbury 1994]
Moreover, real-time instrumental interaction often demands high-performance platforms, such as special DSP cards, that significantly augment the cost of a complete real-time force-feedback system like the proposed physical audio effects.
The input/output gestural signals during this thesis were simulated by a collection of simple CA gestural
models. This fact does not affect the generality and the validity of this research at all. It would of course be preferable to additionally run several other experiments with the presence of a human operator. What is interesting, though, with the gesture modules is that they permit a more analytical and rigorous study under completely controllable and measurable conditions.
Besides, the force-feedback transducers are completely integrated into the CA formalism and hence are viewed as special component networks alongside the simulated components**. According to CA terminology, an object can be termed a complete object when all its modules, including transducer modules, are present. The purely digital part is called the internal object [Cadoz Luciani Florens 1993].
It is also interesting to mention that the generic concept of instrumental interaction is still valid for the simulated offline gestures. That means that the simulated gestures should not only be considered as a means to study the instrumental interaction methodically, as we indicated before. They can also be approached as a type of physically based waveform generator that interacts with the audio effect without the intervention of the human user. This approach is very close to the problematic of composing the instrumental gesture.
Figure 6.8 Control possibilities of physical audio effects models
We can imagine for example, low frequency oscillators (LFO) that do not drive the audio effect but
interact with it. In this case and according to the discussion of the previous chapter, their impedance
** In realistic simulator implementations, several undesired delays are introduced that are generally considered as sources of instability [Florens, Voda Urma 2006]
defines the type of interaction: from a feed-forward command type to a more free and organic one. In Annex B we present two LFO physical models used and studied in the present thesis for this type of offline control of digital audio effects.
Figure 6.8 illustrates the 5 control categories proposed by Verfaille et al. [Verfaille Wanderley Depalle 2006] for controlling a physical audio effect model. As we can see, the instrumental control concept may be applied to each type of control. Particularly in the case of adaptive control, the sound features may be considered as internal gesture signals** that represent the gestural intention of the instrumentalist [Gibet Florens 1988]. Figure 6.9 depicts this configuration.
Figure 6.9 Adaptive control of physical audio effect models: configuration with instrumental gesture
From all the possible types and methods, only the LFO control type, with or without simulated instrumental interaction, has been studied and documented in this manuscript. Still, most of the proposed models have principally been conceived to be realizable in real-time simulators. Hence our choices concerning the force-feedback interface that may replace the CA gesture models were realistic and reasonable.
For example, some very generic characteristics of the proposed configuration employed to accomplish
the interaction with the physical audio effect could be:
× morphology of the manipulation similar to a classical keyboard configuration
× size of the working space limited at the digital scales
× degrees of freedom not more than two or three
The questions of force signal resolution and of frequency bandwidth should be examined during the implementation phase in order to achieve optimal but at the same time accurate realizations.
It is important to mention that the concept of applying an instrumental interaction for the control of
sound transformation algorithms is coherent and reliable only when the algorithm is a model of a physical
object [Luciani 2007]. An abstract tactile rendering would break the correlation between the gesture and
** Sylvie Gibet and Jean-Loup Florens considered the instrumentalist as a mechanical system with two parts: a passive one and an ideal generator of force or displacement: the internal gesture signal, which cannot be measured directly
the resulting multisensory* stimuli. For this reason we have insisted on designing the classical audio effects with the CA formalism. Clearly this was not a simple task, and a major part of this research has been spent on it.
6.5 Simulations/Simulator
The design of our physical audio effect models has been carried out with the CA modeling and simulation system. One of the fundamental characteristics of this formalism is that the modeling modules are at the same time the elements of the simulation procedure. Moreover, the algorithmic models can be implemented in a computer system in the form of hardware, firmware or software without further approximations and structural modifications.
The stream processing architecture of CA directly permits a distributed computation. A hardware implementation was designed in 1982 [Dars-Berberyan Cadoz Florens 1983], but soon after, less special-purpose computer systems were preferred. On a general-purpose platform, the algorithm is easily implemented as a computer program using a very simple repetitive sequential scheme [Cadoz Luciani Florens 1984]. The real-time gestural control demands a more sophisticated architecture with additional DSP cards and real-time operating systems.
The present thesis is not concerned with the implementation aspects of physical audio effect models.
However the simulation was a fundamental part of the design procedure as we can see from figure 6.1.
Hence the choice of flexible computer simulation software was crucial. The option of programming directly in a general-purpose language was not a suitable solution; we were looking for a simpler and more direct method to construct, modify and analyze complex system models. Specialized visual block diagram languages proved to be the ideal solution for our model-based design approach. Common solutions for signal processing applications are LabVIEW*, Simulink** and Scicos***. For the present research we have used Simulink.
Models in Simulink can be hierarchical. We have used several levels for the definition of our models
(figure 6.10). First of all we defined the CA modules. Since they are decomposed into basic digital signal
operators, we used directly the required Simulink blocks from the available libraries. Then the rest of the
primitive modules used for the research and design of physical audio effect models of figure 6.3 were
collected and designed. The idea was to construct a basic toolbox to work with. Simulink is tightly
integrated with MATLAB. This offers enormous opportunities and facilities for the control and analysis of
the simulation.
* We could obtain multisensory responses from our designed digital audio effects – in the present dissertation the visual feedback is not taken into account
* www.ni.com/labview
** www.mathworks.com/products/simulink
*** www.scicos.org
In order to study the physical audio effect models we arranged the simulation space in sections/blocks in
the form of black boxes equipped with input/output ports:
× A physical audio effect block in its highest-level input/output representation that follows, of course, the architecture of figure 6.2 and the hierarchy of figure 6.10.
× The reference digital audio effect - if we do have a concrete reference…
× A general audio input block to feed the models with audio signals: sound samples (a small database of 20 diverse sound samples) and other basic waveforms (sine waves, chirp signals, constants, pulses, step signals, random signals). This block has only output ports.
× A general gesture block to study the “control” of the model. It included LFO waveforms and very
simple CA models with varying impedance for the instrumental interaction.
× An output block with all the necessary modules to analyze and capture the output during or after
the simulation (signal viewers and displays, spectral scopes, modules that write the output into
different file formats).
Figure 6.10 Physical Audio Effect model hierarchy
Another simulation software package designed particularly for the CA formalism is GENESIS [Castagne Cadoz 2002][Castagne Cadoz 2002b][Castagne 2002c]. This software was used exclusively at the beginning of this research. Several interesting simulation options offered by GENESIS are not found in other environments. However, it remains a complete simulation system for musical creation, so several needs of scientific research are not efficiently covered.
The most attractive feature of GENESIS is its ergonomic design that is particularly adapted for network
generation, representation and manipulation. The direct manipulation of the basic 1-D CA modules
graphically during modeling, the attractive possibilities for the representation of the network and the
navigation tools within the modeling workbench are very important options. Moreover, the available higher-level tools for the visualization of models during the simulation proved to be fundamental for the physical thinking approach. Several other tools, such as the modal analysis and the tuning of linear structures, are helpful as well.
However, the weaknesses of GENESIS for the study and the design of digital audio effects are important. We do not doubt its significance as a creation tool in any sense. Besides, it is evident that the goals do not always coincide when we design an environment for scientific research or for musical applications.
Figure 6.11 Genesis screenshot
Figure 6.12 Simulink screenshot
The limitations of GENESIS* concerning the analysis and the design of digital audio effects were:
× many analysis tools during or at the end of the simulation were missing: spectrograms, FFT plots etc.
× the state of the system could not be traced at every moment of the simulation
× the reference digital audio effect model could not be simulated at the same time under the same simulation system
× several CA modules were unavailable
× the parameter editing could not be accomplished automatically
* We should note that this analysis concerns version 1.6 for SGI platforms. Recently a new and more advanced version of GENESIS has been developed for several systems such as Windows, Linux and Mac OS, but for the moment it is not available
× optimization or iterative modeling techniques could not be employed
× the sound input functionality was not efficiently developed
× hierarchical multi-layer design was not available
In figure 6.11 the principal window of GENESIS is illustrated. On the right side we recognize the palette, or toolbox, with which the models are created. In figure 6.12 a screenshot from the Simulink environment used to simulate CA models is depicted. We observe the similarities: we tried to use the same visual representation of the CA network.
Chapter 7
CORDIS-ANIMA Physical Audio Effect Models
An essential part of the research was the re-design of several classic digital audio effects. It would probably be better to talk about a redefinition of classic effects through the prism of physical modeling and of haptic gestural interaction, rather than a re-design. The strategies for the design of physical audio effect models have been explained in chapter 6.
As mentioned earlier, we strongly believe that the instrumental interaction can contribute to the plausibility of certain sound transformations and to the amelioration of live performance. Even in deferred-time conditions, the potential to employ the numerous GENESIS models designed to simulate the instrumentalist offers very interesting and promising possibilities. Therefore, in any case, the physical redefinition of known digital audio effect algorithms is motivating.
We have chosen certain basic effects to realize as CA models. It is evident that it is not possible to
provide a physical model for every audio effect algorithm. We cannot imagine for example a physical
analog of time-segment processing algorithms such as granulation or brassage. Then again, it is
necessary to mention that many very important existing effects are experienced physically. Reverberation
and the Doppler Effect are two remarkable examples.
7.1 Elementary Signal Processing Operations
It has already been stated in the previous chapters that the basic building blocks of the majority of digital signal processing algorithms are the adder, the constant multiplier, the signal multiplier and the unit delay element. Since digital audio effects are in general non-linear time-invariant systems, memoryless non-linear blocks such as functions or lookup tables are added to the basic block list.
These elementary blocks can be synthesized with the CA system. The big difficulty remains their parallel and cascade interconnection. Following the impedance scaling methodology of chapter 6, it is possible to sidestep this problem. Unfortunately we cannot employ it as many times as we want in the same model, since we can easily reach the minimum number of the floating-point representation. Equation (7.1) gives this minimum; the number of bits of the exponent part of the floating-point representation is symbolized by we [Zoelzer 2002]. A single-precision floating-point word is 32 bits with we = 8, and a double-precision floating-point word is 64 bits with we = 11 [Smith 2002].
x_{Q\min} = 0.5 \cdot 2^{-2^{w_e-1}+2} \qquad (7.1)
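Equation (7.1) can be evaluated directly; the sketch below (the function name is ours) computes the minimum for the single- and double-precision exponent widths quoted above:

```python
# Evaluation of equation (7.1): smallest floating-point magnitude as a
# function of the exponent width we (function name is an assumption of ours).

def x_q_min(we):
    return 0.5 * 2.0 ** (-2 ** (we - 1) + 2)

print("single precision (we = 8): ", x_q_min(8))
print("double precision (we = 11):", x_q_min(11))
```

Repeated impedance scaling by small factors eventually pushes model parameters below this limit, which is why the technique cannot be applied arbitrarily often.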
Of course our research goal -the design of digital audio effects based on the propagation medium processing paradigm- has not changed. The processing proposed in this chapter aims principally at
exploring and understanding the possibilities of physical modeling. However, these techniques are often employed in physically based composition: large complicated models demand that kind of simple mathematical operation. An example is the mixing of synthesized sounds within the framework of physical modeling.
We must note that before the invention of digital computing machines, several mechanical/electrical apparatuses appeared in order to meet the demand for automatic computation. The fundamental difficulty was once again the desired property of infinite input and zero output impedances for the blocks designed and charged to implement the mathematical operations. A very interesting way to achieve this goal using only passive elements can be found in [Pelegrin 1959]. The use of active circuits, and especially of operational amplifiers, simplified this procedure enormously. The consequences, however, were extremely important: probably this was the critical step where we clearly "decoupled" the concept of energy from the concept of information. Of course the advent of the digital computer made this approach much more evident.
7.1.1 Unit Delay Element
The unit delay is a system that delays the input signal by one sample [Proakis Manolakis 1996]. This
block obviously is not memoryless. In figure 7.1 we depict a basic CA block called CEL that models the
simple damped mass-spring mechanical oscillator driven by an input force [Morse, Ingard 1968].
Figure 7.1 CEL module
From the CA algorithms given in table 5.1 we easily obtain the finite difference equation of the CEL. We observe that this equation is simplified if all the model parameters have the same value.
x(n) + \left[-2 + \frac{K+Z}{M}\right]x(n-1) + \left[1 - \frac{Z}{M}\right]x(n-2) = \frac{1}{M}\,f(n-1)

K = Z = M \;\Rightarrow\; x(n) = \frac{1}{M}\,f(n-1) \qquad (7.2)
Equation (7.2) is the difference equation of a unit delay scaled by M. For M = 1 we obtain the exact expression of a unit delay element. The input signal is position-type and the output signal is force-type; sometimes this element is called a position-to-force transformer. It can be considered the simplest example of an all-pass filter [Proakis Manolakis 1996]. Figure 7.2 depicts its graphical representation and its CA realization.
Figure 7.2 (a) unit delay element and (b) its CA realization
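A minimal simulation sketch confirms the reduction of the CEL finite difference equation to the scaled unit delay of equation (7.2); the parameter values and the input force signal are illustrative:

```python
# Simulation of the CEL finite difference equation and check of (7.2):
# for K = Z = M the CEL reduces to a unit delay scaled by 1/M.
# The parameter values and input force f are illustrative.

def simulate_cel(f, K, Z, M):
    x = [0.0, 0.0]                      # zero initial conditions x(0), x(1)
    for n in range(2, len(f)):
        x.append(-(-2.0 + (K + Z) / M) * x[n - 1]
                 - (1.0 - Z / M) * x[n - 2]
                 + f[n - 1] / M)
    return x

f = [0.0, 0.0, 1.0, -2.0, 0.5, 3.0]
x = simulate_cel(f, K=4.0, Z=4.0, M=4.0)
# with K = Z = M, every output sample x(n) should equal f(n-1)/M
```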
Obviously we cannot design delay line systems using a cascade interconnection of CEL modules as this
violates the syntactic rules of CA [Cadoz Luciani Florens 1993]. However, in chapter 7.6 we will present
a method to design delay lines using the CA system.
7.1.2 Constant Multiplier
A constant multiplier is a system that applies a scale factor to the input signal. Figure 7.3 depicts the graphical representation of a constant multiplier, its CA realization and its electrical analog.
Figure 7.3 (a) constant multiplier, (b) its CA realization and (c) its electrical analog
The RES with impedance ZK is connected to an impedance Z (according to the CA formalism we must always
terminate a <LIA> module with a <MAT> module). We compute the input/output relation from the
impedance analog of the model as depicted in figure 7.3.c (see also chapter 5). Here x(n) is a position
signal and y(n) a force signal.
$$v(n) = q(n)\,(Z_K \parallel Z) \;\Rightarrow\; v(n) = q(n)\frac{Z_K Z}{Z_K + Z} \;\Rightarrow\; v(n) = q(n)\frac{Z_K}{Z_K/Z + 1}$$

$$Z \gg Z_K \;\Rightarrow\; v(n) \approx q(n)Z_K \;\Rightarrow\; f(n) = Z_K x(n) \;\Rightarrow\; f(n) = Kx(n)$$

$$y(n) = Kx(n) \quad \text{for } Z \gg Z_{RES} \qquad (7.3)$$
The previous analysis illustrates that if we attach the RES module to a much higher impedance, it can be
considered a constant multiplier. Ideally the RES can be attached to a SOL module.
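The quality of the constant-multiplier approximation (7.3) can be quantified numerically. The sketch below (plain arithmetic, not CA code) computes the effective gain of the terminated RES and its relative error with respect to the ideal gain ZK:

```python
# Sketch: the constant-multiplier approximation (7.3). The effective gain
# of a RES of stiffness ZK terminated by an impedance Z is the parallel
# combination ZK*Z/(ZK + Z); it tends to ZK as Z >> ZK.
def effective_gain(ZK, Z):
    return ZK * Z / (ZK + Z)

ZK = 1.0
for Z in (10.0, 100.0, 1000.0):
    err = 1.0 - effective_gain(ZK, Z) / ZK   # relative gain error
    print(Z, err)
```

With Z = 100 ZK the gain error is already below one percent, which is why terminating on a SOL (infinite impedance) gives the ideal multiplier.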
7.1.3 Adder and Subtracter
An adder is a system that performs the addition of two signals [Proakis Manolakis 1996]. Its algorithmic
implementation is carried out easily for floating point or fixed-point numbers. In a similar manner the
subtracter performs the subtraction of two signals. In figure 7.4 we depict the graphical representation
of an adder and its CA realization.
Figure 7.4 (a) an adder and (b) its CA realization
Following a procedure similar to the one employed for the constant multiplier we get:
$$y(n) = K_1 x_1(n) + K_2 x_2(n) \quad \text{for } Z \gg Z_{RES1},\; Z \gg Z_{RES2} \qquad (7.4)$$
The result is a weighted addition of the two input signals. The subtracter is derived when one of the
RES modules has negative stiffness. If the Z impedance is infinite -a SOL module- we obtain a perfect
adder/subtracter. This configuration is widely used in GENESIS to mix signals coming from different parts
of the models.
7.1.4 Memoryless Nonlinear Element
Any memoryless nonlinear system can be modeled by a function [Dutilleux Zoelzer 2002][Smith 2005]:
$$y(n) = f[x(n)] \qquad (7.5)$$
In figure 7.5 we depict the graphical representation of a memoryless nonlinear element and its CA
realization.
Figure 7.5 (a) a memoryless non linear block and (b) its CA realization
The transfer curve of an LNLK module is given by the function f[x(n)]; its slope can be considered as a
nonlinear stiffness K(x). The impedance of a RES module -the stiffness K- facilitates the description of
the type of interconnection between the modules; we will therefore employ the same concept, with care,
for the nonlinear module. Strictly speaking, the notion of impedance is valid only for linear systems.
However, this kind of “generalization” permits us to express in a more understandable way the
input/output relation of the CA model of figure 7.5. Following once more a procedure similar to the one
employed for the constant multiplier, we eventually get:
$$y(n) = f_{LNLK}[x(n)] \quad \text{for } Z \gg K(x) \qquad (7.6)$$
We will use this model to design several nonlinear audio effects at the end of the chapter.
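As an illustration of equation (7.5), the sketch below applies a memoryless nonlinearity to a signal. The tanh curve is only an illustrative choice for the LNL Force/Position characteristic (in GENESIS the curve is drawn by the user):

```python
import math

# Sketch: a memoryless nonlinearity y(n) = f[x(n)] used as a waveshaper.
# The tanh characteristic is an illustrative stand-in for a user-designed
# LNL Force/Position curve.
def waveshape(x, drive=4.0):
    return [math.tanh(drive * s) for s in x]

# A sine comes out soft-clipped: odd harmonics are added, peaks stay below 1.
N, Fs, F0 = 64, 8000.0, 500.0
sine = [0.8 * math.sin(2 * math.pi * F0 * n / Fs) for n in range(N)]
shaped = waveshape(sine)
print(max(shaped))
```

The `drive` parameter plays the role of the slope K(x) near the origin: the steeper the curve, the earlier the saturation region is reached.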
7.2 Basic Low-Order Time-Invariant Filters
In this section we will present and characterize the basic CA modules FRO, REF and CEL from a filtering
perspective. Many of these simple models are quite often used in compositions conceived and realized
with the GENESIS software.
7.2.1 FRO: Highpass
The FRO module can be considered from a signal processing point of view as a highpass filter. The
equivalent signal processing network together with its CA representation is given in figure 7.6. Following a
procedure similar to the one employed for the constant multiplier we obtain the difference equation:
$$y(n) = Z[x(n) - x(n-1)] \quad \text{for } Z \gg Z_{FRO} \qquad (7.7)$$
Figure 7.6 (a) the simplest one-zero high pass filter and (b) its CA realization
This equation corresponds to the simplest one-zero FIR highpass filter [Dodge Jerse 1997]. Its amplitude
and phase responses are given by the expressions:
$$H(\omega) = Z\left(1 - e^{-j\omega/F_s}\right) \;\Rightarrow\; |H(\omega)| = 2Z\left|\sin\!\left(\frac{\omega T_s}{2}\right)\right| \qquad (7.8)$$

$$\angle H(\omega) = \tan^{-1}\!\left[\frac{\sin(\omega T_s)}{1 - \cos(\omega T_s)}\right] \qquad (7.9)$$
The cutoff frequency is Fs/4. The parameter Z defines the peak gain of the filter.
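The magnitude response (7.8) can be evaluated directly; the short sketch below confirms the zero at DC, the -3 dB point (relative to the peak 2Z) at Fs/4, and the maximum at the Nyquist frequency:

```python
import math

# Sketch: evaluating equation (7.8) for the one-zero highpass
# y(n) = Z [x(n) - x(n-1)]: |H| = 2 Z |sin(w Ts / 2)|.
def fro_mag(f, Z=1.0, Fs=44100.0):
    return 2 * Z * abs(math.sin(math.pi * f / Fs))

Fs = 44100.0
for f in (0.0, Fs / 4, Fs / 2):   # DC, cutoff, Nyquist
    print(f, fro_mag(f, Z=1.0, Fs=Fs))
```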
7.2.2 REF: Highpass/Lowpass
The REF module can be considered from a signal processing point of view as a one-zero filter that can be
highpass or lowpass [Smith 1995]. The equivalent signal processing network together with its CA
representation is given in figure 7.7. Following a procedure similar to the one employed for the constant
multiplier we obtain the difference equation:
$$y(n) = (K+Z)x(n) - Zx(n-1) \quad \text{for } Z \gg Z_{REF} \qquad (7.10)$$
Figure 7.7 (a) a one-zero filter and (b) its CA realization
This equation corresponds to a one-zero FIR filter. Its amplitude and phase responses are given by the
expressions [Smith 1995]:
$$H(\omega) = (K+Z) - Ze^{-j\omega/F_s} \;\Rightarrow\; |H(\omega)| = \sqrt{(K+Z)^2 + Z^2 - 2Z(K+Z)\cos(\omega T_s)} \qquad (7.11)$$

$$\angle H(\omega) = \tan^{-1}\!\left[\frac{Z\sin(\omega T_s)}{(K+Z) - Z\cos(\omega T_s)}\right] \qquad (7.12)$$
This filter is highpass for -Z/(K+Z) < 0 and lowpass for -Z/(K+Z) > 0. The parameters K and Z also define
the peak gain of the filter. If we wish to stay attached to a realistic situation where K, Z > 0, the REF
always behaves as a highpass filter.
7.2.3 CEL: Bandpass/Lowpass/Highpass
We have already seen that the CEL module is a two-pole system described mathematically by the
following second order difference equation:
$$x(n) + \left[-2 + \frac{K+Z}{M}\right]x(n-1) + \left[1 - \frac{Z}{M}\right]x(n-2) = \frac{1}{M}f(n-1) \qquad (7.13)$$
The CEL characteristics depend on the locations of its poles p1,2, which are a function of the
coefficients given in equation (7.14).

$$a_1 = -2 + \frac{K+Z}{M}, \qquad a_2 = 1 - \frac{Z}{M} \qquad (7.14)$$
In general, when the poles are real and distinct ($a_1^2 > 4a_2$) the system function can be expressed as
two one-pole sections in a parallel interconnection, and when the poles are real and equal ($a_1^2 = 4a_2$)
the system function is expressed as a two-pole filter with a repeated pole. Finally, when the poles are
complex conjugate ($a_1^2 < 4a_2$) the system takes the form of a special case of bandpass filter known as
a digital resonator [Proakis Manolakis 1996]. In figure 7.8 we illustrate the region of stability in the
(a1, a2) coefficient plane, which offers a very clear image of the system characteristics.
Figure 7.8 Region of stability (stability triangle) in the (a1, a2) coefficient plane [Proakis Manolakis
1996]
Particularly in this research, we are interested in three specific forms that this equation can take:
a) The first-order and second-order terms of the recursive part of the equation (the denominator
coefficients of the system function) are zero. The CEL becomes a simple unit delay element (section
7.1.1).
b) The second-order term of the recursive part of the equation becomes zero. In this case we have a
one-pole filter.
c) None of the terms are zero but the two poles are complex conjugate. In this case we have a
digital resonator.
One-pole filter
The equation (7.13) takes the following form when Z = M:

$$x(n) + \left[-1 + \frac{K}{M}\right]x(n-1) = \frac{1}{M}f(n-1) \;\Rightarrow\; H(z) = \frac{(1/M)\,z^{-1}}{1 + \left[-1 + \dfrac{K}{M}\right]z^{-1}} \qquad (7.14)$$
Figure 7.9 gives the signal graph and the CA network of the one-pole filter described by equation
(7.14).
Figure 7.9 (a) a one-pole filter and (b) its CA realization
The one-sample input delay does not influence the magnitude response but has an effect on the phase
response, to which it adds an extra linear term [Proakis Manolakis 1996]. The amplitude and phase
responses are given by the expressions [Smith 1995]:
$$H(\omega) = \frac{(1/M)\,e^{-j\omega/F_s}}{1 + \left[-1 + \dfrac{K}{M}\right]e^{-j\omega/F_s}} \;\Rightarrow\; |H(\omega)| = \frac{1/M}{\sqrt{1 + [K/M-1]^2 + 2[K/M-1]\cos(\omega T_s)}} \qquad (7.15)$$

$$\angle H(\omega) = \begin{cases} -\omega T_s - \tan^{-1}\!\left[\dfrac{-[K/M-1]\sin(\omega T_s)}{1 + [K/M-1]\cos(\omega T_s)}\right], & 1/M > 0 \\[2ex] -\omega T_s + \pi - \tan^{-1}\!\left[\dfrac{-[K/M-1]\sin(\omega T_s)}{1 + [K/M-1]\cos(\omega T_s)}\right], & 1/M < 0 \end{cases} \qquad (7.16)$$
The filter is lowpass when [K/M-1]<0 or highpass when [K/M-1]>0. It is clear that the filter is stable
when |K/M-1|<1.
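These conditions can be checked numerically from the amplitude response (7.15); a minimal sketch with illustrative parameter values:

```python
import math

# Sketch: magnitude response (7.15) of the one-pole filter obtained from
# a CEL with Z = M. Lowpass for K/M - 1 < 0, highpass for K/M - 1 > 0,
# stable for |K/M - 1| < 1.
def one_pole_mag(f, K, M, Fs=44100.0):
    a = K / M - 1.0
    w = 2 * math.pi * f / Fs
    return (1.0 / M) / math.sqrt(1 + a * a + 2 * a * math.cos(w))

# K/M = 0.5 (a < 0): DC gain exceeds Nyquist gain, i.e. lowpass behaviour.
print(one_pole_mag(0.0, K=0.5, M=1.0), one_pole_mag(22050.0, K=0.5, M=1.0))
```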
Digital resonator
When the poles of equation (7.13) are complex conjugate we obtain a two-pole digital resonator. This
filter is found in most computer music programs [Dodge Jerse 1997]. As in the one-pole case, the one-
sample input delay does not influence the magnitude response but adds an extra linear term to the phase
response. The amplitude and phase responses are given by the expressions [Smith 1995]:
$$H(\omega) = \frac{(1/M)\,e^{-j\omega/F_s}}{1 + \left[-2 + \dfrac{K+Z}{M}\right]e^{-j\omega/F_s} + \left[1 - \dfrac{Z}{M}\right]e^{-j2\omega/F_s}} \;\Rightarrow$$

$$|H(\omega)| = \frac{1/M}{\sqrt{\left[1 + \left[\tfrac{K+Z}{M}-2\right]\cos(\omega T_s) + \left[1-\tfrac{Z}{M}\right]\cos(2\omega T_s)\right]^2 + \left[\left[\tfrac{K+Z}{M}-2\right]\sin(\omega T_s) + \left[1-\tfrac{Z}{M}\right]\sin(2\omega T_s)\right]^2}} \qquad (7.17)$$

$$\angle H(\omega) = -\omega T_s - \tan^{-1}\!\left[\frac{-\left[\tfrac{K+Z}{M}-2\right]\sin(\omega T_s) - \left[1-\tfrac{Z}{M}\right]\sin(2\omega T_s)}{1 + \left[\tfrac{K+Z}{M}-2\right]\cos(\omega T_s) + \left[1-\tfrac{Z}{M}\right]\cos(2\omega T_s)}\right] \qquad (7.18)$$
The equivalent signal processing network together with its CA representation is given in figure 7.10.
Figure 7.10 (a) a two-pole filter and (b) its CA realization
If we write the two conjugate poles in polar form $p_{1,2} = re^{\pm j\theta_0}$, we can easily derive
expressions that relate the CA physical parameters M, K, Z with perceptual ones such as the center
frequency* F0 [Hertz] and the 3dB-bandwidth BW [Hertz]:
$$H(z) = \frac{1/M}{1 + \left[-2 + \dfrac{K+Z}{M}\right]z^{-1} + \left[1 - \dfrac{Z}{M}\right]z^{-2}} = \frac{1/M}{\left(1 - re^{j\theta_0}z^{-1}\right)\left(1 - re^{-j\theta_0}z^{-1}\right)} \;\Rightarrow$$

$$\left[-2 + \frac{K+Z}{M}\right] = -2r\cos(\theta_0), \qquad \left[1 - \frac{Z}{M}\right] = r^2, \qquad r = e^{-\pi BW/F_s}, \quad \theta_0 = 2\pi F_0/F_s \;\Rightarrow$$

$$K = M\left[-2e^{-\pi BW/F_s}\cos(2\pi F_0/F_s) + e^{-2\pi BW/F_s} + 1\right] \qquad (7.15)$$

$$Z = M\left[1 - e^{-2\pi BW/F_s}\right] \qquad (7.16)$$
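These tuning relations can be verified numerically: compute K and Z from a desired center frequency and bandwidth, then recover the pole radius and angle from the resulting coefficients. A sketch with illustrative values:

```python
import math

# Sketch: tuning a CEL digital resonator from perceptual parameters
# F0 (center frequency, Hz) and BW (3dB bandwidth, Hz), then recovering
# the pole radius and angle from the CA coefficients a1, a2.
def tune_cel(F0, BW, M=1.0, Fs=44100.0):
    r = math.exp(-math.pi * BW / Fs)
    K = M * (-2 * r * math.cos(2 * math.pi * F0 / Fs) + r * r + 1)
    Z = M * (1 - r * r)
    return K, Z

F0, BW, M, Fs = 1000.0, 50.0, 1.0, 44100.0
K, Z = tune_cel(F0, BW, M, Fs)
a1 = -2 + (K + Z) / M
a2 = 1 - Z / M
r = math.sqrt(a2)                   # pole radius e^(-pi BW / Fs)
theta = math.acos(-a1 / (2 * r))    # pole angle 2 pi F0 / Fs
print(theta * Fs / (2 * math.pi))   # recovers F0
print(-math.log(r) * Fs / math.pi)  # recovers BW
```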
We can see from figure 7.8 that the stability conditions of the digital resonator can be written using the
CA parameters as:
$$a_2 > -a_1 - 1 \;\Rightarrow\; K > 0 \qquad (7.17)$$
* We consider the center frequency as a good approximation of the peak-gain frequency of the resonator. An analysis can be found in [Incerti 1996]
$$a_2 > a_1 - 1 \;\Rightarrow\; K + 2Z < 4M \qquad (7.18)$$
The CEL module has been studied from a more physical point of view in the thesis of Incerti
[Incerti 1996].
Variations of the digital resonator
The CA models of figure 7.11 are alternative realizations of two-pole digital resonators (if their poles are
complex conjugate). We compute their transfer functions Ha, Hb:
Figure 7.11 Variations of the digital resonator
$$\text{model (a):}\quad M[y(n) - 2y(n-1) + y(n-2)] = K[x(n-1) - y(n-1)] + Z[-y(n-1) + y(n-2)] \;\Rightarrow$$

$$H_a(z) = \frac{[K/M]\,z^{-1}}{1 + \left[-2 + \dfrac{K+Z}{M}\right]z^{-1} + \left[1 - \dfrac{Z}{M}\right]z^{-2}} \qquad (7.19)$$

$$\text{model (b):}\quad M[x'(n) - 2x'(n-1) + x'(n-2)] = K[-x'(n-1)] + Z[-x'(n-1) + x'(n-2)] + x(n-1), \quad y(n) = Kx'(n) \;\Rightarrow$$

$$\frac{M}{K}y(z)\left[1 - 2z^{-1} + z^{-2}\right] + y(z)z^{-1} + \frac{Z}{K}y(z)\left[-z^{-1} + z^{-2}\right] = x(z)z^{-1} \;\Rightarrow$$

$$H_b(z) = \frac{[K/M]\,z^{-1}}{1 + \left[-2 + \dfrac{K+Z}{M}\right]z^{-1} + \left[1 - \dfrac{Z}{M}\right]z^{-2}} \qquad (7.20)$$
We observe that they have the same system function. The previous analysis for the CEL module can be
applied once more; the results obtained are very similar, so we do not present them here.
7.3 Synthesis of High-Order Time-Invariant Filters
Using combinations of the above simple models, more complicated filters can be synthesized. In the next
two subsections, three simple synthesis processes will be presented. In general, all linear CA systems are
Multiple Input Multiple Output IIR filters. The modal representation presented in chapter 5.5.4, provides
an enlightening and useful functional view of the system in terms of input/output relations.
CA model design normally relies on cut-and-try methods and know-how gained from experience.
Unfortunately, these approaches have not proven helpful for the synthesis of filters, although we
suppose they could be useful for the design of digital reverberators. In general, the development of
digital filters can be decomposed into the phases illustrated in figure 7.12.
Figure 7.12 Filter development
The most classical way to synthesize digital filters is to decompose the system function obtained in the
design phase, such as the general one given by equation (7.21), into low-order sections.
$$H(z) = \frac{\displaystyle\sum_{k=0}^{M} b_k z^{-k}}{1 + \displaystyle\sum_{k=1}^{N} a_k z^{-k}} \qquad (7.21)$$
7.3.1 Cascade-Form and Parallel-Form CA Structures
It is straightforward to synthesize a filter using a parallel-form CA structure. In figure 7.13 we illustrate
two ways that employ the tunable digital resonator presented in the previous section. Network (a) of the
figure has been used for the implementation of filter banks [Incerti 1996]. Network (b) has been used for
the design of soundboard models of musical instruments and for the realization of modal synthesis in
the GENESIS software.
Equations (7.22) and (7.23) give the expression of the system functions.
$$H_a(z) = \sum_{i=1}^{K}\frac{[K_i/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i+Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \qquad (7.22)$$

$$H_b(z) = \sum_{i=1}^{K}\frac{[1/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i+Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \qquad (7.23)$$
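In signal-processing terms, a parallel-form structure such as (7.23) is a bank of second-order sections whose outputs are summed. A minimal sketch (the parameter triples are arbitrary illustrative values, not a GENESIS model):

```python
# Sketch: a parallel bank of CEL-style second-order resonators, each
# implementing x(n) = -a1 x(n-1) - a2 x(n-2) + f(n-1)/M, with the outputs
# summed as in equation (7.23).
def resonator_bank(x, sections):
    """sections: list of (M, K, Z) CA parameter triples."""
    state = [[0.0, 0.0, 0.0] for _ in sections]  # x1, x2, f1 per section
    out = []
    for s in x:
        y = 0.0
        for (M, K, Z), st in zip(sections, state):
            x1, x2, f1 = st
            xn = -(-2 + (K + Z) / M) * x1 - (1 - Z / M) * x2 + f1 / M
            st[0], st[1], st[2] = xn, x1, s
            y += xn
        out.append(y)
    return out

# Two lightly damped sections excited by an impulse keep ringing.
bank = [(1.0, 0.02, 0.001), (1.0, 0.08, 0.001)]
y = resonator_bank([1.0] + [0.0] * 99, bank)
print(max(abs(v) for v in y[50:]))  # still clearly nonzero: the bank rings
```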
The cascade form of CA structures is less obvious. Once again the difficulty is the impedance scaling
necessary to prevent feedback. In figure 7.14 we illustrate two ways that employ the tunable digital
resonator, as in the parallel-form CA structures. Equations (7.24) and (7.25) give the approximation of
the system functions (Zs stands for the impedance of each section). Using the Norton theorem for model
7.14(a) and the Thevenin theorem for model 7.14(b) we can prove that these equations are correct.
Figure 7.13 Parallel-Form CA Structures
$$H_a(z) \approx \prod_{i=1}^{K}\frac{[K_i/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i+Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \quad \text{for } Z_{s1} \gg Z_{s2} \gg \dots \gg Z_{sN} \qquad (7.24)$$

$$H_b(z) \approx \prod_{i=1}^{K}\frac{[K_i/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i+Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \quad \text{for } Z_{s1} \ll Z_{s2} \ll \dots \ll Z_{sN} \qquad (7.25)$$
Unfortunately, as we can see from equations (7.24) and (7.25), it is not possible to tune the resonator
and adjust the impedance at the same time. A straightforward way to resolve this problem is to replace
the FRO modules with REF modules with elasticity factors K'i. The equations then take the form:
$$H_a(z) \approx \prod_{i=1}^{K}\frac{[K_i/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i + K'_i + Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \quad \text{for } Z_{s1} \gg Z_{s2} \gg \dots \gg Z_{sN} \qquad (7.24)$$

$$H_b(z) \approx \prod_{i=1}^{K}\frac{[K_i/M_i]\,z^{-1}}{1 + \left[-2 + \dfrac{K_i + K'_i + Z_i}{M_i}\right]z^{-1} + \left[1 - \dfrac{Z_i}{M_i}\right]z^{-2}} \quad \text{for } Z_{s1} \ll Z_{s2} \ll \dots \ll Z_{sN} \qquad (7.25)$$
Figure 7.14 Cascade-Form CA Structures
7.3.2 String-Form CA Structures
We have seen in chapter 5.4 that CA models can be represented as Kirchhoff networks in the
continuous time domain. This analogy permits, in a certain number of cases, the application of electrical
network synthesis procedures for the determination of CA networks with a specified frequency
response. The necessary condition which must be satisfied by a rational system function H(s), in order
to be realizable as a driving-point immittance of a passive network, is for H(s) to be a positive real
function [Tomlinson 1991]. A reference for the discrete-time case of positive real functions can be found
in [Smith 1983][Smith 2005]. Below follows the definition of positive real functions in the discrete-
time domain [Smith 2005].
A complex-valued function H(z) of a complex variable is said to be positive real if:
× $z$ real $\Rightarrow$ $H(z)$ real
× $|z| \geq 1 \Rightarrow \mathrm{Re}\{H(z)\} \geq 0$
In this section we consider the problem of the synthesis of CA immittance functions using the Cauer
method [Kontogeorgakopoulos Cadoz 2007c]. Before describing this procedure in more detail, we will
briefly introduce the Cauer technique in the way it appears in the literature of electrical circuits. An
interesting theoretical question would be to find the conditions under which the CA discretization
scheme preserves the positive real property of the continuous-time domain.
The Cauer synthesis procedure of passive electrical networks concerns the implementation of a specified
immittance function by a particular form of ladder electrical network. It is one among several other
methods used for the synthesis of driving point immittances [Temes LaPatra 1977]. Immittance is a
general term used to include both impedance and admittance. In many cases the required immittance is
realized using only LC circuit elements: inductors L with impedance $Z_L(s) = Ls$ and admittance
$Y_L(s) = 1/Ls$, and capacitors C with impedance $Z_C(s) = 1/Cs$ and admittance $Y_C(s) = Cs$.
The necessary conditions that must be satisfied by a rational function that is realizable as an LC
driving-point immittance may be summarized as follows [Huelsman 1993]:
× The poles are simple and on the jω axis
× The zeros are simple and on the jω axis
× The poles and zeros alternate
× There is a pole or a zero at the origin
× There is a pole or a zero at infinity
× The residues of the poles are real and positive
× The functions are reactance functions whose value along the jω axis is purely imaginary, i.e.
$Z_{LC}(j\omega) = jX(\omega)$, $Y_{LC}(j\omega) = jB(\omega)$
× dX(ω)/dω and dB(ω)/dω are always positive
× The functions are odd rational functions: if the numerator is an even polynomial then the
denominator is an odd polynomial and vice-versa.
The general form of a realizable LC driving-point immittance is:

$$I_{LC}(s) = \frac{k_0}{s} + k_\infty s + \sum_i\left[\frac{c_i}{s - j\omega_i} + \frac{c_i^*}{s + j\omega_i}\right] \;\overset{c_i = c_i^*}{\Rightarrow}\; I_{LC}(s) = \frac{k_0}{s} + k_\infty s + \sum_i \frac{2c_i s}{s^2 + \omega_i^2} \qquad (7.25)$$
where $k_0$, $k_\infty$, $c_i$ are the residues of the poles at the origin, at infinity and on the jω axis,
respectively.
The ladder network has a specific topology with alternating series and shunt branches, as shown in figure
7.15. This topology allows the driving-point immittance to be expressed in the following continued-fraction
form [Tomlinson 1991]:
$$Z = Z_1 + \cfrac{1}{Y_1 + \cfrac{1}{Z_2 + \cfrac{1}{Y_2 + \cdots}}}, \qquad Y = Y_1 + \cfrac{1}{Z_1 + \cfrac{1}{Y_2 + \cfrac{1}{Z_2 + \cdots}}} \qquad (7.26)$$
Figure 7.15 A ladder electrical network
The heart of this method follows from the consideration of an LC driving-point immittance that has poles
at infinity, at the origin and in complex-conjugate pairs on the jω axis. Removing any of these poles from
the function results in a function that is still LC realizable.
Each division can be carried out starting with either the highest or the lowest power of s. When each
division starts with the highest powers, the procedure is known as the Cauer I method; in the other case
it is known as the Cauer II method. If the order of the numerator is greater than the order of the
denominator, the Cauer I method is used, which always leads to a ladder with series inductors and shunt
capacitors. Similarly, if the order of the numerator is smaller than the order of the denominator, the
Cauer II method is used, which always leads to a ladder with series capacitors and shunt inductors.
Below is an example of an admittance function that corresponds to a realizable circuit with
$Y_{C1} = s$, $Z_{L1} = s/2$, $Y_{C2} = 4s$, $Z_{L2} = s/6$:

$$Y(s) = \frac{s^4 + 4s^2 + 3}{s^3 + 2s} \;\Rightarrow\; Y(s) = s + \cfrac{1}{s/2 + \cfrac{1}{4s + \cfrac{1}{s/6}}}$$
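The continued-fraction division underlying the Cauer I method can be automated with simple polynomial long division. The sketch below (coefficient lists, highest power first) reproduces the worked example above:

```python
# Sketch: Cauer I continued-fraction expansion of an LC immittance using
# one-term polynomial long division. Reproduces the worked example
# Y(s) = (s^4 + 4s^2 + 3)/(s^3 + 2s) = s + 1/(s/2 + 1/(4s + 1/(s/6))).
def poly_div_step(num, den):
    """Extract one term: num = q*den + r with q a single monomial."""
    q_coef = num[0] / den[0]
    q_deg = (len(num) - 1) - (len(den) - 1)
    r = list(num)
    for i, d in enumerate(den):   # subtract q_coef * s^q_deg * den
        r[i] -= q_coef * d
    while r and abs(r[0]) < 1e-12:
        r.pop(0)                  # drop vanished leading coefficients
    return (q_coef, q_deg), r

def cauer1(num, den, steps=8):
    """Return the Cauer I terms [(coefficient, degree), ...] of num/den."""
    terms = []
    for _ in range(steps):
        term, rem = poly_div_step(num, den)
        terms.append(term)
        num, den = den, rem       # invert the remainder ratio
        if not den:
            break
    return terms

Y_num = [1, 0, 4, 0, 3]   # s^4 + 4s^2 + 3
Y_den = [1, 0, 2, 0]      # s^3 + 2s
print(cauer1(Y_num, Y_den))  # coefficients 1, 1/2, 4, 1/6 of the ladder
```

Each returned coefficient is the element value of one ladder branch (alternating series/shunt), exactly as in figure 7.15.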
We can apply the previous method to the synthesis of string MAS/RES CA models. Our goal was the
design of resonators with specified modal characteristics.
One of the difficulties is to pass from the s-domain to the z-domain: the double discretization scheme
adopted by CA unfortunately does not permit the use of a direct transformation method from the s-
domain to the z-domain or the inverse. Nevertheless we may use an indirect method based on the modal
decomposition. The desired filter function is designed directly in the discrete-time simulation space of CA
using a bank of second-order parallel resonators. Each resonator is tuned to a certain resonance frequency
Fi and has a certain amplitude Ai. The peaks of the resonators are approximated by their natural
frequencies. From these perceptual filter characteristics we compute the physical characteristics, i.e. the
mass M and the elasticity K: M affects the amplitude of the filter and the ratio K/M the resonant
frequency. After the application of the Cauer method we may add some damping to our structure to
adjust experimentally the bandwidth of each resonance peak.
Figure 7.16 illustrates the flowchart of the algorithm used to synthesize a driving-point immittance
function as a CA string based on the Cauer technique. Some steps of the algorithm require several
computations, presented below.
Given a set of resonance frequencies Fi [Hertz] and amplitudes Ai [m], the CA parameters are computed
by equation (7.27), derived from equations (7.15)-(7.16) for Z = 0.

$$M_i = 1/A_i, \qquad K_i = M_i\left[2 - 2\cos\!\left(\frac{2\pi F_i}{F_s}\right)\right] \qquad (7.27)$$
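Equation (7.27) is straightforward to script. The sketch below (with arbitrary illustrative modes) also recovers each Fi from the resulting (Mi, Ki) pair as a sanity check, using cos(θ0) = 1 - Ki/(2 Mi), the pole angle of the undamped CEL:

```python
import math

# Sketch: equation (7.27) turns a desired set of modal frequencies Fi [Hz]
# and amplitudes Ai into CA starting parameters (Mi, Ki).
Fs = 44100.0

def modes_to_ca(modes, Fs=44100.0):
    params = []
    for Fi, Ai in modes:
        Mi = 1.0 / Ai
        Ki = Mi * (2 - 2 * math.cos(2 * math.pi * Fi / Fs))
        params.append((Mi, Ki))
    return params

modes = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]  # illustrative modes
for (Fi, Ai), (Mi, Ki) in zip(modes, modes_to_ca(modes)):
    theta0 = math.acos(1 - Ki / (2 * Mi))  # undamped pole angle
    print(Fi, Mi, Ki, theta0 * Fs / (2 * math.pi))  # last value recovers Fi
```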
From equation (7.28) we can calculate the corresponding parameters of the CA electrical analog, and from
equation (7.29) we derive the expression for the driving-point impedance to which the Cauer method will
be applied. At each iteration of the algorithm, we derive and compute the new CA parameters of the
synthesized CA string model by equation (7.30).
$$L_i = M_i, \qquad C_i = 1/(K_i F_s^2) \qquad (7.28)$$
$$Z(s) = \frac{V(s)}{I(s)} = \frac{V(s)}{Q(s)\,s} = \frac{1}{Y_{all}(s)\,s}, \quad \text{where } Y_{all}(s) = \sum_i Y_i^{iso}(s) \text{ and } Y_i^{iso}(s) = \frac{Q_i(s)}{V_i(s)} = \frac{1}{L_i s^2 + 1/C_i} \qquad (7.29)$$
$$\begin{cases} Z_i(s) = L_i s \\ Y_i(s) = C_i s \end{cases} \;\Rightarrow\; \begin{cases} M'_i = Z_i(s)/s \\ K'_i = s/(Y_i(s)F_s^2) \end{cases} \qquad (7.30)$$
A Cauer development corresponds to a realizable ladder network when all the resulting coefficients are
positive. The CA simulation system, on the other hand, permits the use of negative coefficients.
Several filters were synthesized using the previous algorithm. All the scripts were written in MATLAB® and
the results were transferred to GENESIS or to Simulink. The results of the algorithm were very accurate.
Figure 7.16 Flowchart of the algorithm to synthesize a driving point immittance as a CA string based on
Cauer technique
7.4 Time-Variant Filters
A filter structure is assumed to give direct and dynamic access to several perceptual parameters.
Computer music applications especially require the use of digital filters with tunable characteristics [Mitra
2001]. The most common frequency characteristics are the center/cutoff frequency, the bandwidth and
the gain. In the literature, numerous approaches exist for relating the perceptual filter parameters to the
filter transfer function coefficients. It is evident that this functional point of view disallows instrumental
interaction and leads to the general question of mapping between the control signals and the available
input parameters of the system - in this case the transfer function coefficients.
In this section we will present three time-varying filter models within the framework of instrumental
gesture. A general and more detailed presentation of the employed methods for the dynamical
modification of the CA audio effect models by instrumental gestures has been presented in chapter 4.
7.4.1 Wah-Wah
The wah-wah filter is a bandpass filter with a variable center frequency and a small bandwidth [Dutilleux
Zoelzer 2002b]. The resonant frequency is placed around the [a] and [u] formant frequencies; therefore it
is usually moved from around 400 Hz to 2 kHz. An important factor that makes wah-wah filters sound
special is the way the resonance changes, i.e. the amplitude and the bandwidth, as the frequency is
moved. Measurements and a digital model of the famous Cry Baby analog wah-wah can be found in [Smith
2008]. An auto-wah is a wah-wah filter whose central frequency is controlled by a low frequency oscillator
(figure 7.17a). The effect can be mixed with the direct signal.
A standard second order filter used for the Wah-Wah audio effect is given by equation (7.31) [Loscos
Aussenac 2005]. F0 is the center frequency [Hertz] and BW is the 3dB-bandwidth [Hertz].
$$H(z) = \frac{(1+c)\left(1 - z^{-2}\right)}{2\left[1 + d(1-c)z^{-1} - cz^{-2}\right]} \quad \text{with } c = \frac{\tan(\pi BW/F_s) - 1}{\tan(\pi BW/F_s) + 1}, \quad d = -\cos(2\pi F_0/F_s) \qquad (7.31)$$
Figure 7.17 (a) auto-Wah model and (b) its CA realization (xs/ys: input/output sound signal, xg/yg:
input/output gestural signal, A: oscillator amplitude, F0: oscillator central frequency)
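For reference, equation (7.31) can be sketched as a direct-form biquad; the coefficient formulas below follow the standard allpass-based bandpass design on which (7.31) is based, and a time-varying wah simply recomputes them as F0 moves:

```python
import math
import cmath

# Sketch: the second-order wah-wah bandpass of equation (7.31) as a
# direct-form biquad with fixed (illustrative) F0 and BW.
def wah_coeffs(F0, BW, Fs=44100.0):
    t = math.tan(math.pi * BW / Fs)
    c = (t - 1) / (t + 1)
    d = -math.cos(2 * math.pi * F0 / Fs)
    b = [(1 + c) / 2, 0.0, -(1 + c) / 2]   # numerator of H(z)
    a = [1.0, d * (1 - c), -c]             # denominator of H(z)
    return b, a

def biquad(x, b, a):
    """Direct-form I filtering with fixed coefficients."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for s in x:
        y = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1, y2, y1 = x1, s, y1, y
    return out

def mag(b, a, f, Fs=44100.0):
    """|H| evaluated on the unit circle at frequency f."""
    z = cmath.exp(-2j * math.pi * f / Fs)
    return abs((b[0] + b[1] * z + b[2] * z**2) / (a[0] + a[1] * z + a[2] * z**2))

b, a = wah_coeffs(800.0, 100.0)
print(mag(b, a, 800.0), mag(b, a, 100.0))  # near-unity at F0, small elsewhere
```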
A bandpass filter with time-variant resonant frequency and bandwidth can be designed using a CEL
module [Kontogeorgakopoulos Cadoz 2005]. The CEL module, which is a simulated mass-spring linear
oscillator, is according to our previous discussion a digital second-order IIR resonant filter. If we
dynamically change the physical parameters K, Z and M we affect the frequency response of the filter.
Since our goal is to modify its elastic properties without altering “artificially” the model parameters (in
order to preserve the instrumental interaction), we should use the transfer characteristics biasing method
(see Annex A). Direct parameter modification neither has a physical interpretation nor conserves energy
and momentum.
Figure 7.17b illustrates the CA wah-wah model based on a nonlinear CEL. The nonlinearity is defined by
the designed LNL Force/Position characteristic f(x). The slope of this curve, f'(x), gives the spring
constant K(x), as we can see from equation* (7.32). The mass is attached to two stable-point low-
frequency oscillator (LFO) models. These are located symmetrically, on different levels, around the
equilibrium point of the mass where we apply the input sound signal xs. By changing the position of the
stable points it is possible to affect the bias signal xbias synthesized by the two LFO models. Due to the
symmetrical topology, the equilibrium position of the mass does not change. The positions of the LFO
models control the operating point Q of the LNL function.
$$f(x) \approx f(x_{bias}) + f'(x_{bias})(x - x_{bias}) \quad \text{if } \left|\frac{f''(x_{bias})}{2}(x - x_{bias})\right| \ll \left|f'(x_{bias})\right| \qquad (7.32)$$
The curve must be designed carefully in order to be approximately linear around every “functional point”;
otherwise the system will no longer be a linear oscillator and distortion will occur. For example, a
Duffing oscillator can be chosen as a nonlinear oscillator with a restoring force of the form:
$$f(x) = -K'x\left[1 + \left(\frac{x}{d}\right)^2\right] \qquad (7.33)$$
where K' is the spring constant for small displacements, and d is the characteristic displacement at
which the linear and nonlinear contributions to the restoring force are the same [Morse Ingard 1968].
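A quick numerical look at the Duffing characteristic (7.33) and its local stiffness K(x) = -f'(x) = K'[1 + 3(x/d)^2] (the parameter values are illustrative):

```python
# Sketch: the Duffing restoring force of equation (7.33) and its local
# stiffness K(x) = -f'(x) = K'[1 + 3(x/d)^2]. For |x| << d the spring
# is nearly linear, which is the regime the wah-wah model relies on.
def duffing_force(x, Kp=1.0, d=1.0):
    return -Kp * x * (1 + (x / d) ** 2)

def local_stiffness(x, Kp=1.0, d=1.0):
    return Kp * (1 + 3 * (x / d) ** 2)

# At x = d the linear and cubic contributions to the force are equal.
print(duffing_force(1.0))    # -2 K' d with K' = d = 1
print(local_stiffness(0.1))  # close to K': quasi-linear region
```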
It is obvious that we can generalize the CA wah-wah model and replace the simple resonator with more
complex networks.
7.4.2 Time-Variant Resonators: “Pressing-String” and “Sticking-String” Models
The “pressing-string” and the “sticking-string” models were originally developed and analyzed in
the context of sound synthesis by Tache* [Tache 2004][Tache Cadoz 2006]. Many of these ideas had been
conceived and applied in compositions earlier by Claude Cadoz. Here we briefly present an application of
those concepts in the domain of sound transformation.
“Pressing-String” Model
In string instruments, the most common way to change the pitch of the notes is to shorten the
effective length of the strings by pressing on them with our fingers. The proposed time-variant filter is
based on this instrumental technique.
* More details can be found in appendix A. * Actually the concept is the same; the models of Tache are quite different and meet other needs.
The general model illustrated in figure 7.18 is decomposed into three basic functional sub-models: the
string model, the finger model and the fingerboard model.
The string sub-model is excited by the input sound xs. In CA, a sequence of <MAT> and <LIA> modules
approximates the string of a musical instrument. As we will see in the chapter concerning delay-based
audio effects, a homogeneous linear string gives a type of time-variant comb filter.
The finger sub-model models the viscoelastic interaction between a finger and a string. In the CA system,
this type of nonlinear link is a position-conditional viscoelastic interaction, the BUT.
The fingerboard model, which is simply a SOL, is linked to the MAS of the string model by another BUT.
It helps to stop the movement of the finger model and simulates the fret of a string instrument. We used
several finger models to give an articulation to the sound transformation.
The transient-state characteristics make the transformation richer than a simple time-varying comb filter.
An interesting effect occurs when the finger model is pressed only midway towards the fingerboard model:
this position buzzes easily and requires additional pressure to produce a clear comb-type transformation.
This signal processing algorithm is open to experimentation: by using several sets of parameter values
we may get different results. We can also tune the string model to a pre-defined set of frequencies
by using the technique described in chapter 7.3.2.
Figure 7.18 CA network of the “pressing-string” model (xs/ys: input/output sound signal, xg/yg:
input/output gestural signal)
A simplified version of this model (homogeneous, non-dispersive) is based on the string filter and is
depicted in figure 7.19. This structure is the Karplus-Strong algorithm or the string digital waveguide
model [Karplus Strong 1983][Jaffe Smith 1983][Smith 1992]. The simulation of a slur may reproduce a
similar audio effect. The delay-line length M of Z-M is changed abruptly without further treatment. The
resulting clicks sound more artificial than the transient signal components of the “pressing-string”
model. Of course more complex and sophisticated mixed digital waveguide models can be designed with
the addition of wave digital filters.
Figure 7.19 String digital waveguide model
“Sticking-String” Model
The “sticking-string” model is part of a class of models used for the generation of complex sound
sequences based on CA models with dynamic structures [Tache Cadoz 2006][Tache Cadoz 2006b]. The
idea behind these models is to set up temporary interactions between virtual objects.
In the context of sound transformation we have employed the simplest model; its schematic diagram is
illustrated in figure 7.20. In contrast with the “pressing-string” model, we tried to change the pitch of a
resonator by enlarging the effective length of the string. This has been attained by virtually sticking two
strings together. Even if it remains unrealizable in real-world conditions, it is physically consistent.
Moreover, from an energetic point of view it is completely coherent: the principle of the conservation of
energy is respected in the synthesized isolated system before and after the interaction.
Figure 7.20 CA network of the “sticking-string” model (xs/ys: input/output sound signal, xg/yg:
input/output gestural signal)
The central part of the LNL characteristic depicted in figure 7.20 corresponds to an ideal spring that is
activated only when the distance between the <MAT> elements it is attached to is smaller than a certain
threshold S. A more sophisticated sticking device is proposed in [Tache Cadoz 2006]. The string digital
waveguide model of figure 7.19 can once again be considered a simplified version of this model.
7.5 Amplitude Modifiers
We call amplitude modifiers the category of digital audio effects that are used to modify the sound
intensity level. In chapter one we presented a diagram that proposed a perceptual classification of
various effects [Verfaille Guastavino Traube 2006]. We should indicate that we are not actually
evoking the loudness category of digital audio effects, since the proposed models alter several
perceptual attributes at the same time. Three simple models are presented that offer a “control”
based on instrumental interaction [Kontogeorgakopoulos Cadoz 2008a].
7.5.1 Bend Amplitude Modification Model
The bend amplitude modification model is a physical realization or interpretation of the amplitude
modulation / ring modulation process for low-frequency carrier signals [Oppenheim Willsky Young
1983][Dutilleux 1998][Dutilleux Zoelzer 2002c]. It is a time-variant version of the constant multiplier
based on the transfer characteristics biasing method (Annex A). For this range of frequencies, under
20 Hz, the effect is also called tremolo [Verfaille 2003].
The algorithm, which is illustrated in figure 7.21c, contains two basic blocks: a second-order mechanical
oscillator (the modulator, which is a low-frequency signal -an LFO- in most typical applications) and a
specially designed nonlinear elastic link or spring (a deviation from the linear Hooke law). This link, which
is a <LIA> CA module, is illustrated in figure 7.22. Both the input sound file and the oscillator feed its two
sides with their positions. The input sound “drives” this link with a low-amplitude signal, while the
oscillator provides a high-amplitude bias value that changes slowly and controls the functional point of
the interaction. The force calculated from this interaction is the modulated output of the audio effect.
Figure 7.21 (a) amplitude modulation (b) ring modulation (c) bend amplitude modification model (xs/ys:
input/output sound signal, xg/yg: input/output gestural signal, A: oscillator amplitude, F0: oscillator
central frequency)
As we can observe from this schematic diagram, a high pass filter has been used to cut all the low-
frequency non-audible components of the output signal. This filter must be very selective, thus its order
must be high. If the system is linear for every bias point, we can instead subtract the processed
low-frequency component from the output. We must not forget that for nonlinear systems the relation
f(x_1 + x_2) = f(x_1) + f(x_2) is no longer valid. The input/output sound signals are denoted with x_s/y_s and
the input/output gesture signals are denoted with x_g/y_g.
Figure 7.22 LNLSQ link
The LNLSQ link has been designed carefully so that it can be considered linear for the displacements
caused by the sound input at every “operating point” on the Force/Position curve. Otherwise the system
can no longer be approximated as linear and distortion occurs. Additionally, care has been taken to
preserve a linear relationship between the input position and the output amplitude.
The amplitude of the oscillator -the LFO model- that is controlled by the external gesture determines the
depth of the modulation. Its impedance defines the coupling with the other part of the model: for high
impedances its motion is not influenced by the sound input, whereas for low values a mutual influence
occurs. The parameter S of the SOL module specifies the transition from ring modulation to amplitude
modulation. In figure 7.23 we observe the “dry” input audio signal and the “wet” output produced by the
bend amplitude modification model and by the classical amplitude modulation algorithm.
Figure 7.23 Examples of amplitude modulation by the bend amplitude modification model
7.5.2 Mute Amplitude Modification Model
In the mute amplitude modification model, the sound file excites a simulated linear physical object such
as a string or a simple oscillator. Two BUT-type nonlinear modules are used to stop and damp the
movement of the string. This type of interaction, which in the real-time situation is defined by the user's
physical gesture -the hand movement that touches the virtual string-, controls the wave propagation and
consequently the sound amplitude. A similar type of gestural interaction has been used in the “pressing-
string” model. Consequently, as in the “pressing-string” model, midway between the two states
of the model, the free resonator and the completely “squeezed” one, strong saturation occurs in the
output. The mute amplitude modification model is a highly nonlinear effect. We will see later on that the
same module will be used for the distortion-type audio effect.
In figure 7.24 the schematic diagram of the algorithm is presented for the case of a simple oscillator
(Resonator Model). It contains two main blocks: the vibrating structure, which is the simple resonator
and the stiff nonlinear link.
A representative image that explains the concept of this audio effect is a human hand touching a
speaker in order to stop its movement while it is driven by an input sound. Unfortunately it is not
possible to fully prevent the movement of the driven simple resonator model. Even in the case in which
its movement is completely restricted to an infinitely small space -in virtual conditions like CA we are able
to make it zero- the stability boundaries pose limits to the maximum value of the BUT stiffness. According
to the CEL stability conditions we obtain* equation (7.34). An analysis of the time-varying resonator can
be found in [Smith 1995].

K + K_{BUT} + 2(Z + Z_{BUT}) < 4M   (7.34)
The effect is not a mere amplitude modification: it resembles a combination of distortion, resonator and
amplitude modification. In figure 7.25 we observe the “dry” input audio signal and the “wet” output.
Figure 7.24 (a) dynamic saturator as a simplified version of the mute amplitude modification model (b)
and the mute amplitude modification model (xs/ys: input/output sound signal, xg/yg: input/output
gestural signal)
Figure 7.25 Examples of the mute amplitude modification model
Obviously, the mute amplitude modification model illustrated in figure 7.24b is not perfect. An intuitive
judgment of the model behavior could be the following: “I can completely stop the oscillator movement
by grasping and squeezing the moving mass…”. We verified that the “numerical reality” has its own
rules and imposes other types of limitations. We could say that this model is clearly a distortion model.
The reason it has been presented in this part of the thesis is to demonstrate that intuition,
an essential element of this research, often leads to errors…
* in this limit, the overall model is simplified to a CEL module and the stiffness of the BUT modules define the amplitude of output sound
7.5.3 Pluck Amplitude Modification Model
In the pluck amplitude modification model the sound file is “attached” to a nonlinear link -LNLK module-
as illustrated in figure 7.26. The other side of the link is attached to a stable point. As in the first
model, the calculated force from this interaction is the output of the audio effect. The user controls the
point in space where this link is established and the kind or quality of the interaction. This algorithm
may produce heavily distorted sounds during the transient phase, depending mostly on the type of the
contact-interaction. Two curves used for the look-up table of the LNLK module are given in figure 7.27.
Our concept was to provide a physical model that implements an ON/OFF amplitude switch. The
algorithm exhibits similarities with plucked strings or plucked oscillator models. An image of the process
is a hand that “holds” a sound -for example a hand holding a speaker (the masses in this model)- and
“attaches” it to a vibrating structure that diffuses it, like the resonance box of a guitar.
Figure 7.26 Pluck Amplitude Modification Model (xs/ys: input/output sound signal, xg/yg: input/output
gestural signal)
Figure 7.27 Curves for the look-up table of the LNLK module
Figure 7.28 Examples of the Pluck Amplitude Modification Model
7.6 Delay-based Audio Effects
A variety of digital audio effects make use of time delays. For example, the echo, the comb filter, the
flanger, the chorus and the reverb use the time delay as a building block [Dutilleux Zoelzer
2002d][Smith 2005]. An evident digital synthesis-realization of the time delay is the digital delay line.
Generally, the digital delay line and the unit delay are among the few basic building blocks used in almost
every classical audio digital signal processing algorithm. In this section we will present a number of
important delay-based audio effects realized with the CA system [Kontogeorgakopoulos Alexandros 2008c].
7.6.1 Delay Model
A delay simply delays the input audio signal by an amount of time. For fractional sample delay lengths,
interpolation techniques are used such as linear and allpass interpolation algorithms [Laakso Valimaki
Karjalainen 1996][Smith 2005].
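The linearly interpolated read can be sketched as follows. This is a generic signal-processing illustration (not code from the thesis), and the function name is ours:

```python
import math

def delayed_sample(buf, n, delay):
    """Read buf[n - delay] using linear interpolation for a fractional delay.

    buf is the input signal, n the current sample index, delay the delay
    length in samples (possibly non-integer)."""
    pos = n - delay
    i = math.floor(pos)        # integer part of the read position
    frac = pos - i             # fractional part in [0, 1)
    if i < 0:
        return 0.0             # before the start of the signal
    x0 = buf[i]
    x1 = buf[i + 1] if i + 1 < len(buf) else 0.0
    return (1.0 - frac) * x0 + frac * x1

# A linear ramp is reproduced exactly by linear interpolation:
sig = [float(k) for k in range(10)]
print(delayed_sample(sig, 7, 3.4))   # reads position 3.6 -> 3.6
```

Allpass interpolation, cited above, trades this slight lowpass behavior for phase distortion; linear interpolation is shown here because it is the simplest case.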
Synthesizing a delay line with the mass-interaction physical modeling scheme of CA is neither
straightforward nor computationally effective. On the other hand, its algorithmic structure can be
interesting as it offers a mentally strong physical metaphor and permits direct physical instrumental
interaction.
In chapter 5.4 we presented a method to pass -when possible- from a continuous-time Kirchhoff
network to a CA one. Luckily an electrical transmission line, which may be considered as an analog delay
line, can be transformed into a CA structure. Hence, in the CA system, a digital delay line takes the form
of a virtual string terminated by its characteristic impedance.
In figure 7.29 we depict the network of a digital delay line, an electrical transmission/delay line and its
CA realization. The impedance analog network has been used. The stiffness parameter K of the model
controls the time delay. Analytic expressions of the time delay as a function of the model will be
presented later on.
Figure 7.29 (a) electric transmission line (b) digital delay line (c) CA delay model (xs: input sound, ys:
output sound, qin/qout: input/output charge)
We can compute the characteristic impedance by expressing and decomposing the CA model into two-
ports (chapter 5.5.3). An approximate analytic expression of the time delay as a function of the model
parameters can also be computed by the same decomposition. Figure 7.30 illustrates an elementary CA
two-port. Equation (7.35) expresses mathematically its input/output terminal relations.
Figure 7.30 (a) A CA two-port and (b) its discrete-time (DT) and continuous-time (CT) Kirchhoff
representation

\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} =
\begin{bmatrix} K & -K \\ K & -K - M(1 - 2z^{-1} + z^{-2})/z^{-1} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}   (7.35)
The terms of the matrix that appears in equation (7.35) are called impedance parameters z_{ij}
[Nilsson Riedel 2004]. The characteristic impedance Z_c and the delay time D are functions of those
parameters [Rocard 1971]. Equations (7.36) and (7.37) give their analytic mathematical expressions (we
express the impedance parameters by their Fourier transform). The total time delay [samples] for an
N-MAS CA delay line is given by equation (7.38).
Z_c = K - K\,\frac{2Ke^{-j\omega} + M(1 - 2e^{-j\omega} + e^{-2j\omega})}{2Ke^{-j\omega}}
\pm K\sqrt{\left(\frac{2Ke^{-j\omega} + M(1 - 2e^{-j\omega} + e^{-2j\omega})}{2Ke^{-j\omega}}\right)^2 - 1}   (7.36)

D = \frac{d\cos^{-1}(\chi)}{d\omega} \quad \text{with} \quad \chi = \frac{z_{22} - z_{11}}{2z_{12}}
\;\Rightarrow\;
D = \frac{d}{d\omega}\cos^{-1}\!\left(\frac{2Ke^{-j\omega} + M(1 - 2e^{-j\omega} + e^{-2j\omega})}{2Ke^{-j\omega}}\right)   (7.37)

D_{total} = N D
\;\Rightarrow\;
D_{total} = N\,\frac{d}{d\omega}\cos^{-1}\!\left(\frac{2Ke^{-j\omega} + M(1 - 2e^{-j\omega} + e^{-2j\omega})}{2Ke^{-j\omega}}\right)   (7.38)
We observe from the last expression, equation (7.38), as we also perceive from the audio outputs,
that the model suffers from dispersion. It is remarkable that for M=K the CA delay line synthesizes
precisely the time delay without phase distortion or undesired filtering. Z_c and D in this case are:

Z_c = K(1 - z^{-1})   (7.39)

D = 1 \quad \text{and} \quad D_{total} = N   (7.40)
The characteristic impedance expressed by the equation (7.39) can be synthesized in CA by a FRO
module with Z=K attached to a SOL module.
Instead of the precisely derived equations (7.36)-(7.38) we may use their approximations in the
continuous-time domain, as given by the continuous-time electrical network. Using the results from the
electrical transmission lines [Rocard 1971] and the CA-electrical analogs we get the following helpful
approximations:

Z_c = \sqrt{MK} \quad \text{for} \quad \omega^2 \frac{M}{K} \ll 4   (7.41)

D = \sqrt{\frac{M}{K}} \quad \text{and} \quad D_{total} = N\sqrt{\frac{M}{K}}   (7.42)
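These relations can be checked numerically. The sketch below is our own illustration (not code from the thesis); it evaluates the per-section group delay of equation (7.37), using an algebraically equivalent real form of the arccos argument, chi(w) = 1 - (2M/K) sin^2(w/2), and compares it with the approximation of equation (7.42):

```python
import math

def per_section_delay(M, K, w, dw=1e-6):
    """Group delay D(w) of one CA delay-line section, following eq. (7.37),
    computed with a centered numerical derivative of arccos(chi(w))."""
    chi = lambda omega: 1.0 - (2.0 * M / K) * math.sin(omega / 2.0) ** 2
    return (math.acos(chi(w + dw)) - math.acos(chi(w - dw))) / (2.0 * dw)

# Low-frequency behaviour matches the approximation D = sqrt(M/K) of eq. (7.42):
print(per_section_delay(M=1.0, K=4.0, w=0.01))   # close to sqrt(1/4) = 0.5

# For M = K the section is dispersion-free: D = 1 at every admissible frequency.
print(per_section_delay(M=1.0, K=1.0, w=1.5))    # close to 1.0
```

For M = K, chi(w) reduces to cos(w), so arccos(chi(w)) = w and the group delay is exactly one sample at all frequencies, in agreement with equation (7.40).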
7.6.2 Comb Filter Models
A comb filter is produced when a slightly delayed audio signal is mixed with the original one [Dutilleux
Zoelzer 2002d][Smith 2005][Dutilleux 1998]. When the delayed version is fed back to the delay line
input we have an IIR comb filter. Otherwise we get an FIR comb filter. Both topologies give a large
number of notches in the spectrum of the input signal.
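As a signal-processing reference point (a generic sketch, not the CA models of this section), the two topologies can be written as:

```python
def fir_comb(x, delay, g):
    """FIR comb filter: y[n] = x[n] + g * x[n - delay]."""
    return [xn + g * (x[n - delay] if n >= delay else 0.0)
            for n, xn in enumerate(x)]

def iir_comb(x, delay, g):
    """IIR comb filter: y[n] = x[n] + g * y[n - delay] (needs |g| < 1)."""
    y = []
    for n, xn in enumerate(x):
        feedback = y[n - delay] if n >= delay else 0.0
        y.append(xn + g * feedback)
    return y

impulse = [1.0] + [0.0] * 15
print(fir_comb(impulse, 4, 0.5))   # two taps: at n = 0 and n = 4
print(iir_comb(impulse, 4, 0.5))   # decaying echoes at n = 0, 4, 8, 12
```

The impulse responses make the difference visible: the FIR version has exactly two taps, while the IIR version produces an infinite train of geometrically decaying echoes.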
Two models that synthesize this classical digital audio effect are depicted in figures 7.31 and 7.32. The
models of figures 7.31b and 7.32b use the previous delay model. The multiplications and the additions
can be realized with CA models; for clarity of presentation we preferred to use the mathematical
operators.
Figure 7.31 (a) FIR comb filter (b) CA FIR comb filter model using the delay model (xs: input sound, ys:
output sound)
An effect similar to the IIR one is experienced naturally inside an acoustical cylinder when a sound
circulates inside it: the successive reflections at both ends of the cylinder modify the signal approximately
as an IIR comb filter. It is not difficult to simulate this phenomenon with a CA string model (figure
7.32c). The resulting effect is perceived as a natural resonator. Two important differences from the
signal processing model are (a) the notches do not cover the whole spectrum (the number of notches is
defined by the number of masses N) and (b) the dispersion is inevitable, in contrast to the first model
where it is inexistent.
Figure 7.32 (a) IIR comb filter (b) IIR CA comb filter model using the delay model (c) CA comb filter
model using a CA string (xs: input sound, ys: output sound)
The dispersion can be calculated from the group delay given by equation (7.38). The maximum
frequency can be computed with matrix algebra [Lay 2003]. From equations (7.15), (7.16) we
compute the central frequency of the resonator as a function of its CA parameters [M, K, Z]:

f_0 = \frac{F_s}{2\pi}\cos^{-1}\!\left[\frac{2 - (K + Z)/M}{2\sqrt{1 - Z/M}}\right]   (7.43)
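Equation (7.43) translates directly into code. The sketch below is our own illustration (the function name is hypothetical); it also checks the low-stiffness approximation f_0 ≈ (F_s/2π) sqrt(K/M), obtained for Z = 0 and K much smaller than M:

```python
import math

def resonator_frequency(M, K, Z, Fs):
    """Central frequency of the CA resonator, following eq. (7.43).
    Requires Z < M and an arccos argument inside [-1, 1]."""
    return (Fs / (2.0 * math.pi)) * math.acos(
        (2.0 - (K + Z) / M) / (2.0 * math.sqrt(1.0 - Z / M)))

Fs = 44100.0
f0 = resonator_frequency(M=1.0, K=0.01, Z=0.0, Fs=Fs)
approx = (Fs / (2.0 * math.pi)) * math.sqrt(0.01 / 1.0)
print(f0, approx)   # the two values are within a fraction of a percent
```

Raising K (stiffer spring) raises the central frequency, as expected from the formula.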
From the modal analysis presented in chapter 5.5.4 we can compute the eigenvectors of the matrix [K].
Fortunately, for simple symmetrical and homogeneous networks such as the linear homogeneous CA
string, we can find analytical expressions of the eigenvalues of [K] [Incerti 1996]. This gives us the
possibility to derive an analytic expression of the eigenfrequencies.
[K_m] = [Q^t][K][Q], \quad [K] = K[A]
\;\Rightarrow\; [K_m] = K[Q^t][A][Q]
\;\Rightarrow\; [K_m] = K\,\mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1})

[K_m]_{string} = K\,\mathrm{diag}(\lambda_0, \ldots, \lambda_{N-1}) \quad \text{with:}

\lambda_p = 4\sin^2\!\left(\frac{p\pi}{2N}\right), \quad 0 \le p \le N-1, \quad \text{string - free ends}
\lambda_p = 4\sin^2\!\left(\frac{(p+1)\pi}{2N+1}\right), \quad 0 \le p \le N-1, \quad \text{open string - one fixed end}
\lambda_p = 4\sin^2\!\left(\frac{(p+1)\pi}{2(N+1)}\right), \quad 0 \le p \le N-1, \quad \text{open string - fixed ends}
(7.44)
It is straightforward now to compute the cutoff frequency of the CA string illustrated in figure 7.32c.
From (7.43) with Z = 0:

f_{max} = f_{N-1} = \frac{F_s}{2\pi}\cos^{-1}\!\left[\frac{2 - K_{max}/M}{2}\right]

and, using (7.44),

f_{max} = \frac{F_s}{2\pi}\cos^{-1}\!\left[\frac{2 - (K/M)\,4\sin^2\!\left(\frac{N\pi}{2N+1}\right)}{2}\right]   (7.45)
7.6.3 Flanger Models
A flanger can be easily implemented using variable-length delay lines [Smith 1984][Dattorro 1997]
[Disch Zolzer 1999][Dutilleux Zoelzer 2002d][Smith 2005][Huovilainen 2005]. Basically it is a comb filter
where the delay line length is slightly changed. The use of an interpolation method, such as linear or
allpass interpolation, is necessary to ensure smooth delay line variations.
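In standard signal-processing terms (a generic sketch, not the CA model proposed below), a flanger is a feedforward comb whose delay is swept by an LFO:

```python
import math

def flanger(x, Fs, max_delay_s=0.003, lfo_rate_hz=0.5, g=0.7):
    """Flanger sketch: y[n] = x[n] + g * x[n - d(n)], where the delay d(n)
    sweeps sinusoidally between 0 and max_delay_s seconds.
    Fractional delays are read with linear interpolation."""
    y = []
    for n in range(len(x)):
        d = 0.5 * max_delay_s * Fs * (
            1.0 + math.sin(2.0 * math.pi * lfo_rate_hz * n / Fs))
        pos = n - d
        i = math.floor(pos)
        frac = pos - i
        if i < 0:
            delayed = 0.0        # before the start of the signal
        else:
            x0 = x[i]
            x1 = x[i + 1] if i + 1 < len(x) else 0.0
            delayed = (1.0 - frac) * x0 + frac * x1
        y.append(x[n] + g * delayed)
    return y

tone = [math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(800)]
out = flanger(tone, Fs=8000.0)
```

The sweeping delay moves the comb notches up and down the spectrum, which produces the characteristic "jet" sound.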
In the proposed model we use the previous CA comb filter, where the parameter K that
approximately defines the time delay according to equation (7.42) is altered periodically (figure 7.33a).
For wide amplitude modulation values of the K parameter, a new effect is obtained between dynamic
lowpass filtering and flanging. The strong dispersion also colors the output.
A more physical approach, in the context of physical instrumental interaction, is to use a variation of the
second CA comb filter model where the linear <RES> modules are replaced by nonlinear <LNLK>
modules. The designed nonlinearity characterizes the time delay. A gesture stressing the physical model
biases it and consequently determines the linear regions of the system (Annex A). Therefore, this
gesture affects the time delay of the comb filter structure.
Figure 7.33 (a) CA flanger model using a comb filter model (b) CA flanger model using a nonlinear CA
string (xs: input sound, ys: output sound, xc: control input, xg: gesture input, yg: gesture output)
It would be interesting to investigate the case of a time-variant non-homogeneous resonator. The
synthesis methodology of string-form CA structures can be used for non-harmonic tunings. Probably an
audio effect similar to a phaser could be obtained; for the phaser effect, a small number of
notches/peaks in the spectrum is necessary, in contrast to the flanger [Hartmann 1978].
Another flanger model will be presented in the next section. It has been necessary to describe it in
another part of the document due to the adopted design approach.
7.6.4 Spatialization and Pick-Up Point Modulation
The CA networks have an inherent spatiality due to their topology. The sound can be picked up from
the output of every elementary CA module. Figure 7.34 represents a simple CA spatial vibrato model
(time-variant delay model) with two outputs. Due to the relative time delay between the two output
nodes in the network, we obtain a spatial image of the sound source (the interaural time differences
(ITD) are a strong cue that the human auditory system uses to estimate the apparent direction of a
sound source [Rocchesso 2002]). Hence the geometrical spatial characteristics (a more accurate term
would be topological) of CA models are related to the spatial sound characteristics of the outputs.
Figure 7.34 (a) CA spatial vibrato model (xs: input sound, ys: output sound, xc: control input)
Figure 7.35 CA flanger model using a pick-up point modulation (xs: input sound, ys: output sound, xc:
control input)
The analysis of the previous chapter can be applied in order to choose the proper CA nodes to obtain a
desired spatial image of the sound source*. It is clear that the spatial discretization quantizes the spatial
trajectories. The interaural intensity differences (IID) can also be used to improve the spatial image.
The use of movable pick-up points gives the opportunity to attain dynamic effects. A similar idea has
been applied earlier to digital waveguides [Van Duyne 1992]. The CA model of figure 7.35 is a type of
flanger. Each pick-up point determines the partials reinforced by the string topology. If we place the
pick-up point at a position 1/m along the string length, the partials whose numbers are multiples of m
will be canceled.
* Research related to sound panoramization with the CA system can be found in [Meunier 2000]
Figure 7.36. Linear cross-fader block diagram
7.7 Nonlinear Audio Effects
We have mentioned that the amplitude modification models of chapter 7.5 present nonlinearities.
Therefore, we can push these models towards a completely noticeable nonlinear behavior. Very simple
variations can produce distortion-type digital audio effects. We will present two nonlinear models, based
on the bend amplitude modification model and on the mute amplitude modification model.
7.7.1 Nonlinearities without Memory: Waveshaping
The most common way to distort a signal is by waveshaping [Schaefer 1970][Arfib 1979][LeBrun
1979]. Waveshaping is a memoryless signal processing technique that distorts the amplitude of a sound.
It is characterized by a function called the waveshaper. This function can be expressed
mathematically as a polynomial via a Taylor series expansion:

y(n) = f[x(n)] = \sum_{i=0}^{N} a_i x^i(n)   (7.46)
Since many complex nonlinear systems can be decomposed into LTI systems with a memoryless
nonlinearity, it is a very useful technique. For example, two classical distortion and overdrive
pedals with nonlinearities with memory have recently been approximated efficiently by a waveshaper and
linear filters/equalizers [Yeh Abel Smith 2007a].
In the sound synthesis domain, several techniques have been employed for the design of waveshapers.
The most important technique is called spectral matching and is accomplished through the use of
Chebyshev polynomials [Dodge Jerse 1997]. In the audio effects domain, several simple functions exist
in the literature [Smith 2005][Dutilleux Zolzer 2002][Yeh Abel Smith 2007a]. Many of them are
presented in table 7.1.
We have already developed the memoryless nonlinearity in chapter 7.1.4. The CA network is illustrated in
figure 7.5. It is clear that this model is a special case of the bend amplitude modification model. The
only difference is in the design of the nonlinearity: in contrast to the bend modification model, here the
nonlinear components are desired. Also the gestural input/output is omitted.
In figure 7.37 we illustrate a memoryless nonlinear model with gestural interaction. The LFO model was
not necessary, so it was excluded from this model. We can design the waveshaping function of the LNLK
module according to the formulas proposed in table 7.1.
Figure 7.37 CA memoryless nonlinear element with gestural interaction
TABLE 7.1 Symmetrical Nonlinearities

hard clipper:
f(x) = -1 \;\text{for}\; x \le -1; \quad f(x) = x \;\text{for}\; -1 \le x \le 1; \quad f(x) = 1 \;\text{for}\; x \ge 1

soft clipper:
f(x) = 2x \;\text{for}\; 0 \le |x| \le 1/3; \quad f(x) = \pm[3 - (2 - 3|x|)^2]/3 \;\text{for}\; 1/3 \le |x| \le 2/3; \quad f(x) = \pm 1 \;\text{for}\; 2/3 \le |x| \le 1

cubic soft clipper:
f(x) = -2/3 \;\text{for}\; x \le -1; \quad f(x) = x - x^3/3 \;\text{for}\; -1 \le x \le 1; \quad f(x) = 2/3 \;\text{for}\; x \ge 1

arctangent nonlinearity:
f(x) = \frac{2}{\pi}\tan^{-1}(\alpha x), \quad x \in [-1, 1], \quad \alpha \gg 1

distortion:
f(x) = \frac{x}{|x|}\left[1 - \exp\!\left(-\frac{x^2}{|x|}\right)\right]

tanh approximation:
f(x) = \frac{x}{(1 + |x|^n)^{1/n}}
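Three of these waveshapers are implemented below as a sketch; they are direct translations of the table formulas, and the function names are ours:

```python
def hard_clipper(x):
    """f(x) = -1 for x <= -1, x in between, 1 for x >= 1."""
    return max(-1.0, min(1.0, x))

def cubic_soft_clipper(x):
    """f(x) = x - x^3/3 inside [-1, 1], saturating at +/- 2/3."""
    if x <= -1.0:
        return -2.0 / 3.0
    if x >= 1.0:
        return 2.0 / 3.0
    return x - x ** 3 / 3.0

def tanh_approximation(x, n=2):
    """f(x) = x / (1 + |x|^n)^(1/n); approaches +/- 1 for large |x|."""
    return x / (1.0 + abs(x) ** n) ** (1.0 / n)

print(hard_clipper(2.5))          # 1.0
print(cubic_soft_clipper(0.5))    # 0.4583...
print(tanh_approximation(10.0))   # close to 1.0
```

Note that the cubic soft clipper joins its saturated branches continuously, since x - x^3/3 equals 2/3 at x = 1, which avoids the hard corner (and the harsher aliasing) of the hard clipper.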
7.7.2 Nonlinearities with Memory: Clipping
A simple nonlinear CA network with memory is the mute amplitude modification model. It is not difficult
to verify this from the form of the difference equation (7.47) that represents the system. If we leave
out the gestural interaction and use non-dissipative BUT links, we obtain the simpler network depicted
in figure 7.38 together with its signal processing block.
!
x(n) + ["2 +K + KBUT1 + KBUT2 + ZBUT1 + ZBUT2
M]x(n " 1) + [1"
Z + ZBUT1 + ZBUT2
M] =
=1
Mf(n " 1) +
KBUT1
Mxg1(n " 1) +
KBUT2
Mxg2(n " 1) +
+ZBUT1
M[xg1(n " 1) " xg1(n " 2)] +
ZBUT2
M[xg2(n " 1) " xg2(n " 2)]
with
KBUT1 = f[xg1(n " 1) " x(n " 1)]
KBUT2 = f[xg2(n " 1) " x(n " 1)]
ZBUT1 = f[(xg1(n " 1) " xg1(n " 2)) " (x(n " 1) " x(n " 2))]
ZBUT2 = f[(xg2(n " 1) " xg2(n " 2)) " (x(n " 1) " x(n " 2))]
#
$
% %
&
% %
(7.47)
Figure 7.38 (a) the signal processing block and (b) the CA network of the mute amplitude modification model
Once again, the system does not present a memoryless nonlinearity. We could describe it as a bandpass
filter whose pole location changes with the input. Surprisingly, this system is very similar to the
lowpass filter with a diode limiter used in distortion audio effects [Yeh Abel Smith 2007a][Yeh Abel
Smith 2007b]. This system, with its CA model, is illustrated in figure 7.39. It is easy to observe that
the model of figure 7.38b with K=0 is almost the same as the model of figure 7.39b. The only
difference lies in the design of the lowpass filters Ha (model 7.38b) and Hb (model 7.39b), which are
coupled with the BUT modules - the CA implementation of the diode clipper.
H_a(z) = \frac{(1/M)\,z^{-1}}{1 + \left[-2 + \frac{Z}{M}\right]z^{-1} + \left[1 - \frac{Z}{M}\right]z^{-2}}   (7.48)

H_b(z) = \frac{(Z/M)\,z^{-1}(1 - z^{-1})}{1 + \left[-2 + \frac{Z}{M}\right]z^{-1} + \left[1 - \frac{Z}{M}\right]z^{-2}}
\;\Rightarrow\;
H_b(z) = Z(1 - z^{-1})\,H_a(z)   (7.49)
Figure 7.39 (a) lowpass filter with diode limiter and (b) its CA realization
Annex A
Transfer Characteristics Biasing Method
A simple way to dynamically modify the physical characteristics of CA physical models will be presented.
Our goal is to change, during the simulation, the model's elastic properties without artificially altering its
parameters. The instrumental interaction is not violated by this general methodology.
The concept is based on the biasing technique used in electronics [Sedra Smith 2004]. We may
design an LNLK <LIA> module that can be assumed linear over a limited range of input values. By
adding a DC signal xbias to the input signal x, we control the operating point Q of the
nonlinear transfer characteristic f(x) and thus the amplification factor A(xbias) (figure A.1).
Figure A.1 (a) CA model with an LNLK module (b) its transfer characteristics - biasing
We can explain this method mathematically using the Taylor series of the function f(x) about a point x_{bias}:

f(x) = f(x_{bias}) + f'(x_{bias})(x - x_{bias}) + \frac{f''(x_{bias})}{2!}(x - x_{bias})^2 + \ldots
\;\Rightarrow\;
f(x) - f(x_{bias}) = f'(x_{bias})(x - x_{bias}) + \frac{f''(x_{bias})}{2!}(x - x_{bias})^2 + \ldots

f(x) \approx f(x_{bias}) + f'(x_{bias})(x - x_{bias}) \quad \text{if} \quad \left|\frac{f''(x_{bias})}{2!}(x - x_{bias})\right| \ll |f'(x_{bias})|   (A.1)

Consequently, if we take as input a signal x_s(n) plus a DC signal x_{bias}, the output y(n) = f(x_s(n) + x_{bias})
will be given by equation (A.2):

f(x_s + x_{bias}) \approx f(x_{bias}) + f'(x_{bias})(x_s + x_{bias} - x_{bias})
\;\Rightarrow\;
f(x_s + x_{bias}) \approx f(x_{bias}) + f'(x_{bias})\,x_s   (A.2)
The last equation shows that we have the possibility to control the elasticity factor K = f'(x) with a DC
bias input signal. To get rid of the f(x_{bias}) term, a highpass filter such as a DC blocker can be used;
we can also directly subtract the term. The biasing signal can vary with time, typically at sub-audio
frequencies; in this case the subtraction still works. When the approximation is not valid but we still want
to use the same principle of biasing, to change for example the amount of distortion, we can use a very
selective highpass filter. Many filter design procedures exist in the literature [Proakis Manolakis 1996].
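The approximation (A.2) is easy to verify numerically. In the sketch below (our own illustration, with tanh standing in for an arbitrary smooth nonlinearity), the small-signal gain measured around a bias point matches the local slope f'(xbias):

```python
import math

f = math.tanh   # an arbitrary smooth nonlinearity, chosen for the demo

def local_slope(x, h=1e-6):
    """Centered-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

xbias = 0.8     # operating point set by the DC bias
xs = 1e-3       # small input signal
measured_gain = (f(xs + xbias) - f(xbias)) / xs
print(measured_gain, local_slope(xbias))   # nearly equal, as (A.2) predicts
```

Sliding xbias along the curve changes the measured gain, which is precisely how the bias input controls the effective elasticity K = f'(x) of the LNLK module.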
The transfer characteristics biasing method can be generalized and applied to a nonlinear CA string
model, as illustrated in figure A.2. Below we compute the operating point for each section when the
string is stretched by its end.
Figure A.2 A CA string model
According to classical mechanics, in the equilibrium we have:
f_{s,1}(s - x_1) = -f_{2,1}(x_2 - x_1)
f_{1,2}(x_1 - x_2) = -f_{3,2}(x_3 - x_2)
\ldots
f_{N-2,N-1}(x_{N-2} - x_{N-1}) = -f_{N,N-1}(x_N - x_{N-1})
f_{N-1,N}(x_{N-1} - x_N) = -f_{0,N}(0 - x_N)

Using f(x) = -f(-x):

f_{s,1}(s - x_1) = f_{2,1}(x_1 - x_2)
f_{1,2}(x_1 - x_2) = f_{3,2}(x_2 - x_3)
\ldots
f_{N-2,N-1}(x_{N-2} - x_{N-1}) = f_{N,N-1}(x_{N-1} - x_N)
f_{N-1,N}(x_{N-1} - x_N) = f_{0,N}(x_N - 0)

Since f(x_a) = f(x_b) \Rightarrow x_a = x_b:

s - x_1 = x_1 - x_2
x_1 - x_2 = x_2 - x_3
\ldots
x_{N-2} - x_{N-1} = x_{N-1} - x_N
x_{N-1} - x_N = x_N
In the previous steps we assumed that the functions are odd, i.e. symmetric about the origin,
and one-to-one. With some simple algebraic operations we get:
x_{N-1} - x_N = x_N
x_{N-2} - x_{N-1} = x_N
\ldots
x_1 - x_2 = x_N
s - x_1 = x_N
(A.3)

(A.3) \Rightarrow x_{N-1} = 2x_N,\; x_{N-2} = 3x_N,\; \ldots,\; x_1 = N x_N,\; s = (N+1)x_N
\;\Rightarrow\; x_N = \frac{s}{N+1}   (A.4)

(A.3),(A.4) \Rightarrow x_{N-1} - x_N = \frac{s}{N+1},\; x_{N-2} - x_{N-1} = \frac{s}{N+1},\; \ldots,\;
x_1 - x_2 = \frac{s}{N+1},\; s - x_1 = \frac{s}{N+1}   (A.5)
From (A.5) we can compute the operating point Qi for each nonlinear interaction:
f_{s,1}(s - x_1) = f_{s,1}\!\left(\frac{s}{N+1}\right)
f_{1,2}(x_1 - x_2) = f_{1,2}\!\left(\frac{s}{N+1}\right)
\ldots
f_{N-2,N-1}(x_{N-2} - x_{N-1}) = f_{N-2,N-1}\!\left(\frac{s}{N+1}\right)
f_{N-1,N}(x_{N-1} - x_N) = f_{N-1,N}\!\left(\frac{s}{N+1}\right)
(A.6)
Annex B
Frequency Modulation within the Framework of Physical Modeling
The FM synthesis technique was a breakthrough in the digital sound synthesis domain [Chowning 1973].
It has been a dominant sound synthesis method for years. Since essentially a single parameter controls
the rich synthesized spectra, its use is very simple and it greatly simplifies the production of
musically meaningful audio signals.
The physical modeling paradigm in digital sound synthesis followed the success of the FM technique.
Over the last twenty years, various physically-based schemes have been introduced, capable of
generating models for various sound sources [Valimaki Takala 1996]. An important advantage of this
technique -probably even more interesting than its capabilities for the resynthesis and imitation of real
acoustical instruments- is the playability and the control possibilities that it offers to the user. We have
already seen in chapter 6 that this remark holds both for real-time and deferred-time simulations: in
the first case through the use of haptic interfaces and in the second case through the use of models of
instrumental gestures.
An approach combining FM and physical modeling has been made by Scott Van Duyne [Van Duyne
1992]: by extending the digital waveguide string model through the addition of a movable pick-up point
along the string, he produced FM sidebands around each partial. Nevertheless, this research was not
oriented toward the physical control of the FM algorithm. So far, to the author's knowledge, such models
have not yet been studied.
In this section, two CORDIS-ANIMA (CA) models are proposed which attempt to offer an instrumental
type of control over the FM procedure. The advantages, the disadvantages and some applications are
discussed. These two models - the Physical Frequency Modulation Model (PFM) and the Triangular
Physical Frequency Modulation Model (TPFM) - provide a physical interpretation of the FM algorithm.
Since their design is based on the simulation of some real-world sound generation mechanisms, the
associated mental model of the proposed systems is very effective and intuitive.
B.1 Simple Frequency Modulation
The frequency modulation technique is widely used for broadcasting. The pioneering work of John
Chowning showed that the relevant equations can be applied to the generation of computer-synthesized
sounds [Chowning 1973][Roads 1996]. Vibrato, a slight wavering of pitch, was first simulated using the
FM algorithm. When the vibrato frequency passes from the sub-audio range to the audio range and its
width gets larger, interesting complex sonic results appear.

y(n) = A(n)\sin\!\left[2\pi f_c n + I\sin(2\pi f_m n)\right]   (B.1)
FM synthesis is in general the alteration of the frequency of an oscillator in accordance with the
amplitude of a modulating signal [Dodge Jerse 1997]. The block diagram of the algorithm consists of
oscillators, envelope generators and adders. Both oscillators and envelopes can be designed with
wavetables [Orfanidis 1995]. Figure B.1 shows a simple FM algorithm able to produce time-varying
spectra. The parameters are the modulating frequency Fm, the carrier frequency Fc, the amplitude of
the carrier wave Amp and the peak deviation of the modulating wave Dev. Envelope generators may be
applied to the Amp and Dev parameters. Instead of the peak deviation and the modulating frequency,
the modulation index I = Dev/Fm and the harmonicity factor H = Fc/Fm can be used accordingly.
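Equation (B.1) with a constant amplitude envelope can be sketched directly; this is our own illustration, with parameter names following the text:

```python
import math

def fm_tone(Fc, Fm, I, amp, dur, Fs=44100):
    """Simple FM, following eq. (B.1) with a constant amplitude:
    y(n) = amp * sin(2*pi*Fc*n/Fs + I * sin(2*pi*Fm*n/Fs))."""
    n_samples = int(dur * Fs)
    return [amp * math.sin(2.0 * math.pi * Fc * n / Fs
                           + I * math.sin(2.0 * math.pi * Fm * n / Fs))
            for n in range(n_samples)]

# Harmonicity H = Fc/Fm = 4 and modulation index I = 2:
y = fm_tone(Fc=440.0, Fm=110.0, I=2.0, amp=0.8, dur=0.1)
```

Setting I = 0 removes all sidebands and leaves a pure carrier sinusoid, which is a convenient sanity check on the implementation.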
Figure B.1 FM block diagram that produces time-varying spectra
B.2 Physical Frequency Modulation Model
To design the basic FM algorithm (figure B.1) using a physical modeling approach, first of all we
replaced the oscillators with simulated mechanical ones. The simplest second order mechanical
oscillator is a mass attached to a stable point by a spring; in CA it is represented by the network given
in figure B.2. In the same figure we also present the signal processing block diagram of the
algorithm. This algorithm is a classical filtering method for the design of sinusoidal oscillators [Orfanidis
1995].
Figure B.2 (a) simple second order oscillator model and (b) its block diagram (xg: force gesture input,
yg: position gesture output, ys: position sound output)
The frequency of an oscillator with inertia M, elasticity K, damping factor Z and sampling rate Fs is
given by equation (B.2). M is in [Kgr], K in [Kgr][samples]^-2, Z in [Kgr][samples]^-1, f in
[Kgr][m][samples]^-2 and x in [m].

F = \frac{F_s}{2\pi}\cos^{-1}\!\left(\frac{2 - (K + Z)/M}{2\sqrt{1 - Z/M}}\right)   (B.2)
The next step is to modify the frequency of the oscillator -the carrier oscillator- by another one -the
modulating oscillator- in a physical way. We could easily control the elasticity K of the carrier dynamically
(figure B.2) through the mapping scheme of equation (B.2). It is difficult, though, to imagine a physical
analogy of this procedure.
A more realistic method is to use a nonlinear oscillator instead of a linear one (see Annex A for the
transfer characteristics biasing method). In this case the elasticity K is a function of the distance between
its two extremities. Hence, by attaching it to two other oscillators that have opposite phase, we manage
to alter the frequency periodically. The nonlinear interaction LNLK, described by the block diagram of
figure B.3, may be used to achieve the desired system function (it is a <LIA> module in the CA
formalism): a linear mapping from position to frequency is preserved by the proposed Force-Displacement
characteristic. This characteristic is derived by combining equation (B.2) with the derivative of the LNLK
function f(x), since K = f'(x).
Figure B.3 Nonlinear interaction used in the PFM model (LNLK)
The PFM model (figure B.4) is made of two modulating oscillators -the oscillator L (Left) and the
oscillator R (Right)- and a nonlinear carrier oscillator. We use two symmetrical modulating oscillators
in order to keep one equilibrium point for the central mass. By comparing this physical model with the
FM algorithm of figure B.1 we get directly:
× The parameter Dev corresponds to the maximum amplitude displacements xLmax, xRmax of the
masses of the oscillators L and R
× The parameter Fm corresponds to the choice of the elasticity parameters KL, KR (KL = KR) of the
oscillators L and R
× The parameter Fc corresponds to the positions SL, SR (SL = -SR) of the attachment points of
the oscillators L and R
× The parameter Amp corresponds to the maximum amplitude position xmax of the nonlinear oscillator.
The control of the PFM model is based on the dual concept of force/position, and consequently the
necessity of mapping is avoided. Every parameter of the FM synthesis is reached through physical
gestures. As depicted in figure B.4, we have force input gesture signals (fg, f1g, f2g), position output
gesture signals (xg, x1g, x2g) and the position output sound signal xs, which corresponds to the audio
output of the system. We may attach a force-feedback haptic interface directly, or we may design
physical models that simulate the instrumental gesture in order to control, or more precisely to interact
with, the model.
Figure B.4 PFM model (fg, f1g, f2g: force input gesture signals, xg, x1g, x2g position output gesture
signals, xs audio output)
Several FM instruments which have time-varying spectra use envelope generators. A very simple
amplitude envelope contains an attack, a sustain and a decay phase, with four input parameters: the
attack time, the total duration of the envelope, the decay time and the maximum amplitude [Dodge
Jerse 1997]. The forms of the attack and decay segments may be specified as well; they are either
linear or, more often, exponential, as in natural vibrations.
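A minimal sketch of such an envelope generator, with the four parameters listed above (attack time, total duration, decay time, maximum amplitude), might look as follows; the linear attack, the exponential decay shape and its time constant are illustrative choices, not the thesis' physical model:

```python
import math

def envelope(attack, decay, total, amp, Fs):
    """Attack / sustain / decay amplitude envelope [Dodge Jerse 1997]."""
    n_a, n_d, n_total = int(attack * Fs), int(decay * Fs), int(total * Fs)
    n_s = n_total - n_a - n_d                                          # sustain length (samples)
    env = [amp * (i + 1) / n_a for i in range(n_a)]                    # linear attack
    env += [amp] * n_s                                                 # sustain
    env += [amp * math.exp(-5.0 * (i + 1) / n_d) for i in range(n_d)]  # exponential decay
    return env
```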
Figure B.5 Physical Model of an envelope generator ( x1g, x2g: force input gesture signals, y1g, y2g
position output gesture signals)
A simple physical model of an envelope generator is presented in figure B.5. The two BUT modules
simulate a position-conditional viscous interaction. The use of a negative damping constant (Z), although
it violates the second law of thermodynamics, proves very useful in our designs. The equation that shows
how the amplitude decreases (positive damping factor) or increases (negative damping factor) with time
for the CA second-order oscillator is given by equation (B.3); it approximately follows an exponential
law.

Hence we are able to control the amplitude with accuracy by damping the movements of the masses of our
models. The duration of the viscous interaction and the damping constant shape the amplitude and
indirectly generate amplitude functions of time. In the model of figure B.5 the user holds and moves two
masses toward the PFM model. When these masses approach it (figure B.5), a viscous interaction is
established which alters the kinetic state of the system.
\lambda(n) = \left(1 - \frac{Z}{M}\right)^{n/2} \approx \exp\!\left(-\frac{Z}{2M}n\right) \qquad (B.3)
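Equation (B.3) can be checked numerically; the small Python sketch below (function name illustrative) compares the exact per-sample amplitude factor with its exponential approximation:

```python
def ca_amplitude(n, M, Z):
    """Amplitude factor of the CA second-order oscillator after n samples,
    equation (B.3): lambda(n) = (1 - Z/M)**(n/2), approximately exp(-Z*n/(2*M))."""
    return (1 - Z / M) ** (n / 2)
```

With Z > 0 the amplitude decays; with Z < 0 (the negative damping used in the envelope generator of figure B.5) it grows.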
This model presents instabilities for high modulating frequencies near the audio range, due to the
algorithms of the oscillators. This unfortunately limits the possibility of implementing many FM
instruments. Nevertheless, in the vibrato range the system behaves similarly to classical FM as proposed by
Chowning. In figure B.6 we observe the amplitude modification due to instabilities.
Figure B.6 Examples of the PFM model using the same type of oscillators (Fc=200, Fm=2, Dev=150)
B.3 Triangular Physical Frequency Modulation Model
The TPFM model uses triangular-type oscillators. Although it is rarely necessary to use complicated
oscillators in FM synthesis, as the resulting spectrum can be extremely dense, this model exhibits an
interesting and simple structure, so it is worth our attention.
An interesting way to synthesize triangular waveforms by means of physical modeling is to imagine a
free mass bouncing, without gain or loss of energy, between two walls: its measured position is the
desired signal. The frequency of the mass movement is determined by its initial velocity and by the
distance between the walls: longer distances give lower frequencies and higher-amplitude signals.
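The bouncing-mass idea can be sketched directly in Python (an idealized version with perfectly stiff walls and no interaction time; names and units are illustrative):

```python
def bouncing_mass(L, v, n_samples):
    """Position of a free mass bouncing elastically between walls at +/- L/2.
    v is the velocity in position units per sample; the output is a
    triangular waveform whose round trip takes 2*L/v samples."""
    x, out = 0.0, []
    for _ in range(n_samples):
        x += v
        if x > L / 2:        # reflect on the upper wall, no energy loss
            x, v = L - x, -v
        elif x < -L / 2:     # reflect on the lower wall
            x, v = -L - x, -v
        out.append(x)
    return out
```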
In figure B.7 the CA network of the triangular oscillator is presented. This algorithm does not prevent
aliasing, the amount of which the stiffness constant of the BUT modules controls to some extent. This
can be seen in the spectrogram of figure B.8. Additionally, the BUT modules sometimes furnish
unexpected energy to the system, due to the discrete form of the algorithms. The supplementary energy
changes the velocity of the mass and affects to some degree the resulting frequency of the oscillator.
This phenomenon has been analyzed in [Thill 2003]. Unfortunately, it is difficult to predict the behavior of
this nonlinear system.
Equation (B.4) gives the formulae that map the velocity V0 and the wall distance L to the
frequency of the oscillator. As the BUT module of the model simulates a conditional elastic
interaction, an interaction time should be added to the mass round-trip time for more precise
computations. In addition, it follows directly from the same equation that there is always a maximum
synthesized frequency, imposed by the characteristics of the interaction.
Figure B.7 Simple triangular oscillator model
Figure B.8 Aliasing in CA triangular-type waveforms
F = \frac{1}{2T_{\text{distance}} + 2T_{\text{interaction}}}, \quad \text{with} \quad
T_{\text{interaction}} = \frac{\pi}{F_s \cos^{-1}\!\left[(2M - K)/(2M)\right]}, \quad
T_{\text{distance}} = \frac{L}{V_0} \qquad (B.4)
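Equation (B.4) can likewise be written down directly; the sketch below (illustrative function name, with V0 in m/s and L in m) also exposes the maximum synthesized frequency, reached when the wall distance tends to zero:

```python
import math

def triangular_osc_freq(V0, L, M, K, Fs):
    """Frequency of the CA triangular oscillator, equation (B.4)."""
    T_distance = L / V0                                                # wall-to-wall travel time (s)
    T_interaction = math.pi / (Fs * math.acos((2 * M - K) / (2 * M)))  # elastic contact time (s)
    return 1.0 / (2 * T_distance + 2 * T_interaction)
```

As L tends to 0, F approaches the upper bound Fs * acos((2M - K)/(2M)) / (2*pi) imposed by the interaction alone.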
The TPFM model (figure B.9) is similar to the PFM model. We use exactly the same design for the
modulating oscillators and, for the carrier, the triangular oscillator described above. It is evident that in
the present case the modulating oscillators do not stretch any nonlinear spring to alter the oscillating
frequency: their distance defines it indirectly, as shown by equation (B.4).
The impedance of every functional component of the model is crucial because it defines the interaction.
This holds for the PFM model as well. For example, the impedance of the modulating oscillators must be
much greater than the impedance of the central mass; otherwise the coupling will substantially disturb the
system behavior.
As in the PFM model, by comparing TPFM with the FM algorithm of figure B.1 we get:
× The parameter Dev corresponds to the maximum amplitude position xUpmax, xDownmax of the
masses of the oscillators Up and Down
× The parameter Fm corresponds to the choice of the elasticity parameter KUp, KDown (KUp =
KDown ) of the oscillators Up and Down
× The parameter Fc corresponds to the position SUp, SDown (SUp = -SDown) of the attachment point
of the oscillators Up and Down and to the mass initial velocity
× The parameter Amp is correlated to the frequency, and thus we are not able to obtain an exclusive
control
Figure B.9 TPFM model (xg, x1g, x2g: force input gesture signals, yg, y1g, y2g: position output
gesture signals, ys: audio output)
The control of the TPFM model is similar to that of the PFM model; thus no further explanation is needed.
Figure B.10 Examples of the TPFM model
This model has a very strong physical metaphor, which strengthens its usability. It can be intuitively
combined and linked to other physical models for the creation of more complicated waveforms. Some
drawbacks are the frequency-amplitude dependency, the unexpected energy-exchange phenomena which
give birth to chaotic behaviors, and the upper limit on the frequency.
The goal of this research was to introduce into the successful FM sound synthesis the important, natural
instrumental relation that the instrumentalist has with his acoustic musical instrument. The
deferred-time simulations show, whenever the system is stable, a satisfying agreement between the
synthesized audio output of the proposed models and the classical FM algorithm. Their structure is
physically consistent and hence the system passivity is preserved. This is a vital necessity in instrumental
gestural interaction.
Our belief and motivation is that, through a physical dynamic control of the FM algorithm, a virtuosity will
emerge that will contribute to the quality of the synthesized sound. The validation of the proposed models
in real-time conditions, with the support of force-feedback controllers and through psychoacoustic
experiments, is necessary.
Conclusions and Future Work
This study introduced for the first time the concept of instrumental interaction in the domain of digital
audio effects. It is a novel and original contribution to the computer music domain. A number of simple
classical digital audio effects have been designed and approximated in forms that provide haptic
inputs/outputs to the user. Moreover, their structure has permitted an ergotic interaction loop in which
energy is exchanged between them and the user. The CORDIS-ANIMA system has been employed for the
design and synthesis of all the proposed models. Our thesis was that the energetic coupling between the
acoustic instrument and the instrument player, together with the tactile-proprio-kinesthetic gesture
feedback, is essential and could be transferred into the digital audio effects system.
The same concept has been transferred and applied to FM synthesis. We presented some primary ideas
and results concerning the redesign/re-definition of FM synthesis using a CA system. Once again, the
goal was to introduce the instrumental interaction into the successful FM sound synthesis algorithm. The
models' structure was physically consistent and hence the system passivity was preserved. Using
triangular-type oscillators, interesting chaotic behavior was attained and unpredicted sonic results
occurred. The deferred-time simulations showed a satisfying agreement between the synthesized audio
output of the proposed models and the classical FM algorithm, for modulation frequencies down to the
gesture range. For higher frequencies, instabilities occurred that prevented the simulation of most of the
interesting FM timbres.
An important step for the realization of the present research was the analysis of CA. Therefore, this
physical modeling and simulation system has been represented and analyzed by several other useful
mathematical formalisms. The aim was to enlarge the analysis tools of this mass-interaction approach
and to add new modeling strategies based on a more mathematical framework; furthermore, it was to
make clear the particularities and the limits of this classical physical modeling technique. The choice of
the representation language is extremely important, especially for artistic situations like musical creation.
We demonstrated that this formalism can be easily analyzed and combined with other mathematical
representations. This capability further strengthened its “modeling potential”, and verified and made more
visible its strong theoretical formal basis. The analysis proved crucial for the design of some physical
audio effect models, such as the delay-based effects.
A simple and accurate method to synthesize filters with a string-form structure has also been presented.
Adopting the Cauer technique from the domain of electrical networks, we simulated CA physical
models for filtering purposes, limited of course to several realizable filter designs. The main intention
and interest was to tune those models to a pre-given set of frequencies. This method has also been used
for sound synthesis purposes.
A modular system architecture has been proposed for the design phase. Simulation was a fundamental
part of the design procedure, thus we used and developed some basic tools for Simulink, which is a
common specialized visual block-diagram language. Several other computer scripts have been written in
the Matlab language for technical computing. All these will be available for further research on this specific
subject.
All the proposed algorithms were given in the form of flow graphs. Therefore their implementation is
straightforward and facilitates a functional programming paradigm, which is probably the optimal solution
for DSP musical applications. A very interesting discussion panel at the International Computer Music
Conference 2008, under the name Reinventing Audio and Music Computation for Many-Core Processors,
proposed functional programming as probably the most adequate computing paradigm for the
implementation of real-time audio.
This research provided a global review of musical sound transformation algorithms, focusing mostly on
their design processes. We started our “exploration” from a general but strong philosophical basis (the
“need” for instrumental interaction), proposed a simple classification of audio effects, and progressively
developed a theory and a framework for the design of physical digital audio effect models. In the end we
proposed several models that approximate classical audio effects and presented a few new ones.
The input/output gestural signals in this thesis were simulated off-line by a collection of simple CA
gestural models. We have argued that this fact does not affect the generality and the validity of this
research. However, it is necessary to run several additional experiments with a human operator present.
Therefore the implementation of the proposed algorithms in a real-time simulation system equipped with
a force-feedback interface is probably the most necessary future work. The importance of instrumental
control in digital audio effects must be verified by formal observations and experiments, and under less
formal, less controlled conditions (in order to eliminate the perturbations introduced into the system by
our measurements), such as during a musical performance. We believe that through this physical dynamic
control of the audio effect process, a virtuosity will emerge that will contribute to the quality and the
finesse of sound transformation.
The section in which we proposed some CORDIS-ANIMA/Digital Waveguide converters has been principally
validated mathematically. We should verify that the simulations give the same results and do not present
unexpected “surprises” such as instabilities. Furthermore, it would be interesting to design several simple
mixed models such as bowed strings, air columns etc.
Many of the software tools developed and used in Simulink and Matlab must be implemented in GENESIS.
We insist that its future versions should offer these possibilities for sound transformation. We look
forward to developing our algorithms in this environment in the context of a musical composition.
The gesture models in GENESIS have proved their interest in the past. Their application to physical audio
effect models will surely produce more plausible, “warm” and organic musical sound transformations.
This research was only the first small step in this novel approach to designing and interacting with digital
audio effects. Many other ideas for new digital audio effect algorithms, or variations of more classical
ones, can still be developed from here. The design of physical audio effect models is not finished…
…It has just started!!
Bibliography
[Abel Berners Costello Smith 2006] J. Abel, D. Berners, S. Costello, J. O. Smith, “Spring Reverb
Emulation Using Dispersive Allpass Filters in a Waveguide Structure”, Proceedings of the 121st Convention
of the Audio Engineering Society, California, USA 2006
[Allen & Rabiner 1977] J. B. Allen and L. R. Rabiner, “A unified approach to short-time Fourier analysis
and synthesis”, Proceedings of IEEE, 65, 1977
[Amatriain, Herrera 2001] X. Amatriain, P. Herrera, “Audio Content Transmission”, in Proceedings of
2001 DAFX, 2001
[Amatriain, Bonada, Loscos, Serra, 2001] X. Amatriain, J. Bonada, A. Loscos, X. Serra, “Spectral Modeling
for Higher-level Sound Transformations”, Proceedings of MOSART Workshop on Current Research
Directions in Computer Music. Barcelona, 2001
[Amatriain, Bonada, Loscos, Serra 2002] X. Amatriain, J. Bonada, A. Loscos, X. Serra, “Spectral
Processing”, U. Zolzer DAFX, John Wiley & Sons, 2003
[Amatriain, Bonada, Loscos, Arcos, Verfaille 2003] X. Amatriain, J. Bonada, A. Loscos, J. Arcos, V.
Verfaille, “Contentbased Transformations”, JNMR Vol.(32)1, 2003
[Adrien 1991] J-M Adrien, “The Missing Link: Modal Synthesis”, in Representation of Musical Signals (G.
De Poli, A. Picialli, C. Roads, eds.), MA: MIT Press, 1991
[Arfib 1979] D. Arfib, “Digital Synthesis of Complex Spectra by Means of Multiplication Nonlinear
Distorted Sine Waves”, JAES, 27(10), 1979
[Arfib 1991] D. Arfib, “Analysis transformation and resynthesis of musical sounds with the help of a
time-frequency representation”, G. Poli, A. Piccialli and C. Roads, Representations of musical signals, The
MIT press, 1991
[Arfib & Delprat 1993] D. Arfib and N. Delprat, “Musical Transformations Using the Modification of Time-
Frequency Images”, CMJ 17(2), 1993
[Arfib 1998] D. Arfib, “Des courbes et les sons”, Recherches et applications en informatique musicale,
Hermes, pp. 277-286, 1998
[Arfib, Keiler, Zolzer 2002] D. Arfib, F. Keiler, U. Zolzer, “Time-frequency processing”, U. Zolzer DAFX,
John Wiley & Sons, 2002
[Arfib Couturier Kessous Verfaille 2002] D. Arfib, J.-M. Couturier, L. Kessous, V. Verfaille, “Strategies of
mapping between gesture para- meters and synthesis model parameters using perceptual spaces”,
Organised Sound, 7(2), 135–152, 2002
[Arfib ,Keiler, Zolzer 2002] D. Arfib, F. Keiler, U. Zolzer, “Source-Filter processing”, U. Zolzer DAFX,
John Wiley & Sons, 2002
[Atal & Hanauer 1971] B. S. Atal and S. L. Hanauer, “Speech Analysis and Synthesis by linear prediction
of the speech wave”. JASA, 50, 1971
[Berdahl Smith 2006] E. Berdahl, J. O. Smith, “Some Physical Audio Effects”, Proceedings of the 9th Digital
Audio Effects Conference (DAFx-06), Montreal, Canada, 2006
[Bilbao 2001] S. Bilbao, Wave and Scattering Methods for the Numerical Integration of Partial Differential
Equations, PhD thesis, Stanford University, June 2001,
[Bilbao 2007] S. Bilbao, “A Digital Plate Reverberation Algorithm”, Journal of Audio Engineering Society
55(3), 2007
[Blesser 2001] Barry Blesser, An Interdisciplinary Synthesis of Reverberation Viewpoints, JAES, Vol. 49
No. 10, 2001
[Bode 1984] Harald Bode, “ History of Electronic Sound Modification”, Journal of the Audio Engineering
Society, 32(10), 1984
[Bonada 2000] J. Bonada, “Automatic technique in frequency domain for near-lossless time-scale
modification of audio”, In Proceedings of 2000 ICMC, pp. 396-399, 2000
[Borin De Poli Sarti 1992] G. Borin, G. De Poli, A. Sarti, “Algorithms and Structures for Synthesis Using
Physical Models”, Computer Music Journal, 16(4), pp. 47-5. M.I.T. Press, Cambridge Mass. 1992
[Bresson 2006] J. Bresson, “Sound Processing in OpenMusic”, in Proceeding of the 9th Int. Conference
on Digital Audio Effects (DAFx-06), Montreal, Canada, 2006
[Bristow-Johnson, 1995] R. Bristow-Johnson, “A detailed analysis of a time-domain formant corrected
pitch shifting algorithm”, JAES., 43(5), 1995
[Chaigne 2003] A. Chaigne, Ondes Acoustiques, Les Editions De l’ Ecole Polytechnique, 2003
[Cadoz Luciani Florens 1984] C. Cadoz, A. Luciani and J.-L. Florens, “Responsive Input Devices and
Sound Synthesis by Simulation of Instrumental Mechanisms : The CORDIS System”, Computer Music
Journal 8(3), 1984 – reprint in The Music Machine – edited by Curtis Roads, The MIT Press, Cambridge,
Massachusetts, London, England - pp. 495-508, 1988
[Cadoz 1988] C. Cadoz, “Instrumental Gesture and Musical Composition”, in Proceedings of the 1988
International Computer Music Conference, Cologne, Germany, 1988
[Cadoz 1994] C. Cadoz, “Le geste, canal de communication instrumental”, Techniques et Sciences
Informatiques Vol 13 - n01 pp. 31-64,1994
[Cadoz 1999] C. Cadoz, "Musique, geste, technologie", In H. Genevois and R. de Vivo, eds. Les nouveaux
gestes de la musique, Parenthèses Editions, 1999
[Cadoz Wanderley 2000] C. Cadoz, M. M. Wanderley, “Gesture-Music”, in Trends in Gestural Control of
Music, M. M. Wanderley and M. Battier, eds, ©2000, Ircam – Centre Pompidou, pp. 71-94 , 2000
[Cadoz Lisowski Florens 1990] C. Cadoz, L, Lisowski and J.L Florens, "A modular Feedback Keyboard
design" - Computer Music Journal, 14, N°2, pp. 47-5. M.I.T. Press, Cambridge Mass. 1990
[Cadoz Luciani Florens 1990] C. Cadoz, A. Luciani and J.L Florens, “CORDIS ANIMA : Système de
modélisation et de simulation d’instruments et d’objets physiques pour la creation musicale et l’image
animée", Actes du colloque "Modèle physique, création musicale et ordinateur", France, Grenoble1990
[Cadoz 1990] C. Cadoz, “CORDIS ANIMA : “Simuler pour connaître / Connaître pour simuler", Actes du
colloque "Modèle physique, création musicale et ordinateur", France, Grenoble1990
[Cadoz Luciani Florens 1993] C. Cadoz, A. Luciani and J.L Florens, “CORDIS-ANIMA: A modelling and
Simulation System for Sound and Image Synthesis – The General Formalism”, Computer Music Journal,
17(4), 1993.
[Cadoz 2008] C. Cadoz, “Musical Creation Process and Digital Technology - the Supra-Instrumental
Gesture”, Proceedings of 4th International Conference on Enactive Interfaces, Grenoble, France, 2008
[Cano, Loscos, Bonada, Boer, Serra 2000] P. Cano, A. Loscos, J. Bonada, M. de Boer, X. Serra, “Voice
morphing System for Impersonating in Karaoke applications”, in Proceedings of 2000 ICMC, 2000
[Castagne Cadoz 2003] N. Castagne, C. Cadoz, ''10 Criteria for evaluating physical modelling schemes
for music creation'', Proceeding of the Digital Audio Effects Conference DAFX03, London, UK, 2003
[Castagne Cadoz 2002b] N. Castagne, C. Cadoz, "GENESIS: A Friendly Musician-Oriented Environment for
Mass-Interaction Physical Modeling", in Proceedings of ICMC2002, Sweden, Goteborg, 2002
[Castagne Cadoz 2002] N. Castagne, C. Cadoz, “Creating music by means of ‘Physical Thinking’: The
Musician oriented GENESIS environment”, Proceedings of the Digital Audio Effects Conference DAFX02,
Hamburg, Germany, 2002
[Castagne 2002c] N. Castagne, “GENESIS un environment pour la creation musicale à l’aide de modeles
physiques particulaires”, PhD thesis, INPG, France, 2002
[Castagne Cadoz Florens Luciani 2005] N. Castagne, C. Cadoz, J. L. Florens, A. Luciani, “Haptics in
Computer Music: a Paradigm Shift”, Proceedings of Eurohaptics 2005, 2005
[Castagne 2007] N. Castagne, “Mapping and Control VS Instrumental Interaction”, in Enaction and
Enactive Interfaces, a Handbook of Terms, A. Luciani and C. Cadoz eds, Enactive System Books, 2007
[Cavaliere & Piccialli 1997] S. Cavaliere and A. Piccialli, “Granular synthesis of musical sounds”, C. Roads,
S. T. Pope, A. Piccialli and G. Poli Musical Signal Processing, Swets & Zeitlinger, 1997
[Chowning, 1971] J. Chowning, “The Simulation of Moving Sound Sources”, J. Audio Eng. Soc.,
19(1) : 1–6, 1971
[Chowning 1973] J. M. Chowning, “The Synthesis of Complex Audio Spectra by Means of Frequency
Modulation”, J. Audio Eng. Soc., vol 21, no. 7, pp. 526-534, 1973
[Composers Desktop Project] www.composersdesktop.com
[Dars-Berberyan Cadoz Florens 1983] T. Dars-Berberyan, C. Cadoz, J. L. Florens, "Processeur
spécialisé pour le calcul de sons en temps réel par simulation instrumentale", 11th ICA, Paris, 1983
[Dattorro 1997a] J. Dattorro, “Effect design part 2: delay line modulation and chorus”, JAES 45(10),
1997
[Dattorro 1997b] Jon Dattorro, “Effect Design Part 1: Reverberator and Other Filters”, JAES 45(9),
1997
[Dattorro 2002] J. Dattorro, “Effect design: Part 3 oscillators: Sinusoidal and pseudonoise“, Journal of
the Audio Engineering Society, vol. 50, no. 3, pp. 115-146, 2002
[De Poli 1984] G. De Poli, “Sound Synthesis by Fractional Waveshaping”, JAES, 32(11), 1984
[De Poli & Piccialli 1991] G. De Poli and A. Piccialli, “Pitch-Synchronous Granular Synthesis ”, G. Poli, A.
Piccialli and C. Roads, Representations of musical signals, The MIT press, 1991
[De Poli Rocchesso 1998] G. De Poli, D. Rocchesso, “Physically based Sound Modelling”, Organized
Sound, 3(1), 1998
[Djoharian 1993] P. Djoharian, “Generating Models for Modal Synthesis”, Computer Music Journal, 17(1),
1993.
[Depalle Rodet Matignon Pouilleute 1990] P. Depalle, X. Rodet, D. Matignon, P. Pouilleute, “Premiers
resultants sur les modèles en variables d’état et leur identification”, Colloque Modèles Physiques,
Création Musicale et Ordinateurs, Grenoble 1990
[Disch Zolzer 1999] S. Disch, U. Zolzer, “Modulation and delay line based digital audio effects”,
Proceedings of the 1999 Digital Audio Effects Conference, Trondheim, Norway, 1999
[Dodge 1989] C. Dodge, “On Speech Songs”, Current Directions in Computer Music Research, The MIT
Press, 1989
[Dodge Jerse 1997] C. Dodge, T. Jerse, Computer Music: synthesis, composition and performance,
second edition, Schirmer Books New York, 1997
[Dolson 1986] Mark Dolson, The Phase-Vocoder: a tutorial, CMJ 10(4), 1986
[Dorran ,Lawlor, Coyle 2003a] D. Dorran, R. Lawlor, E. Coyle, “Time-Scale Modification of Speech using a
Synchronised and Adaptive Overlap-Add (SAOLA) Algorithm”, 114th Convention of AES, 2003
[Dorran ,Lawlor, Coyle 2003b] D. Dorran, R. Lawlor, E. Coyle, “High Quality Time-Scale Modification of
Speech using a Peak Alignment Overlap-Add Algorithm (PAOLA)”, IEEE International Conference on
Acoustics, Speech and Signal Processing, 2003
[Dorran & Lawlor 2003c] D. Dorran and R. Lawlor, “An efficient audio time-scale modification algorithm
for use in a subband implementation” , Proceedings of the 2003 Conference on DAFX, 2003
[Dutilleux 1998] P. Dutilleux, “Filters, Delays, Modulations and Demodulations – A Tutorial,” in Proc.
DAFX-98, pp. 4-11, Barcelona, Spain, Nov. 1998
[Dutilleux, Grossmann and Kronland-Martinet 1988] P. Dutilleux, A. Grossmann and R. Kronland-
Martinet, “Application of the wavelet transform to the analysis, transformation and synthesis of musical
sounds”, Proceedings of the 85th AES convention, 1988
[Dutilleux Zoelzer 2002] P. Dutilleux, U. Zolzer, “Nonlinear Processing”, DAFX, edited by U. Zolzer, John
Wiley & Sons, 2002
[Dutilleux Zoelzer 2002b] P. Dutilleux, U. Zolzer, “Filters”, DAFX, edited by U. Zolzer, John Wiley & Sons,
2002
[Dutilleux Zoelzer 2002c] P. Dutilleux, U. Zolzer, “Modulators and Demodulators”, DAFX, edited by U.
Zolzer, John Wiley & Sons, 2002
[Dutilleux Zoelzer 2002d] P. Dutilleux, U. Zolzer, “Delays”, DAFX, edited by U. Zolzer, John Wiley & Sons,
2002
[Dutilleux, De Poli, Zolzer 2002] P. Dutilleux, G. De Poli and U. Zolzer, “Time-segment processing”, U.
Zolzer DAFX, John Wiley & Sons, 2002
[Dutoit Gosselin 2005] T. Dutoit, B. Gosselin, Theory de Circuits, Notes de Cours – Faculté Polytechnique de Mons, 2005
[Endrich 1997] A. Endrich, “Composers’ Desktop Project: a musical imperative”, Organized Sound, vol 2
issue 1, 1997
[EARS] www.ears.dmu.ac.uk
[Erkut Karjalainen 2002] C. Erkut, M. Karjalainen, “Finite Difference Method vs. Digital Waveguide Method
in String Instruments Modeling and Synthesis”, Proceedings of International Symposium on Musical
Acoustics, Mexico City, Mexico, 2002
[Evangelista 1991] G. Evangelista, “Wavelet transform we can play ”, G. Poli, A. Piccialli and C. Roads,
Representations of musical signals, The MIT press, 1991
[Evangelista 1993] G. Evangelista, “Pitch-Synchronous wavelet representation of speech and music
signals”, IEEE transactions on Signal Processing, 41(12), 1993
[Evangelista 1994] G. Evangelista, “Comb and Multiplexed Wavelet Transforms and Their Applications to
Signal Processing”, IEEE Transactions on Signal Processing, 42(2), 1994
[Gouyon Fabig Bonada 2003] F. Gouyon, L. Fabig, J. Bonada, “Rhythmic expressiveness transformations
of audio recordings: swing modifications”, In Proceedings of 2003 DAFX, 2003
[Fairbanks Everitt Jaeger 1954] G. Fairbanks, W. Everitt and R. Jaeger, “Method for time or frequency
compression-expansion of speech”, IEEE Transactions on Audio and Electroacoustics, AU-2, 1954
[Fernandez-Cid Quiros 2001] P. Fernandez-Cid, J. C. Quiros, “Distortion of Musical Signals by Means of
Multiband Waveshaping”, Journal of New Music Research, 30(3), pp. 279-287, 2001
[Fettweis 1986] A. Fettweis, ''Wave digital filters: Theory and practice'', Proceedings of the IEEE, vol.
74, pp. 270-327, Feb. 1986
[Flanagan & Golden 1966] J. L. Flanagan and R. M. Golden, Phase Vocoder, Bell System Technical
Journal, Vol. 45, 1966
[Fletcher, Rossing 1998] N. H. Fletcher, T. D. Rossing, The Physics of Musical Instruments, second
edition, Springer, 1998
[Florens 1978] J. L. Florens, "Coupleur gestuel interactif pour la commande et le contrôle de sons
synthétisés en temps réel", PhD Thesis, INPG, Grenoble, France 1978
[Florens, Cadoz 1991], J. L. Florens, C. Cadoz, ''The Physical Model : Modeling and Simulating the
Instrumental Universe'' in Representations of Musical Signals (G. De Poli, A. Picialli, and C. Roads, eds.),
pp. 269-297, Cambridge, MA: MIT Press, 1991
[Florens Luciani Cadoz Castagne 2004] J. L. Florens, A. Luciani, C. Cadoz, N. Castagne, “ERGOS: Multi-
degrees of Freedom and Versatile Force-Feedback Panoply”, Proceeding of EuroHaptics04, Munich,
Germany, 2004
[Florens Voda Urma 2006] J.L Florens, A. Voda, D. Urma, “Dynamical Issues in Interactive
Representations of Physical Objects”, Proceedings of EuroHaptics 2006, Paris, France, 2006
[Fontana 2007] F. Fontana, “Preserving the Structure of the Moog VCF in the Digital Domain”, in
Proceedings of 2007 International Computer Music Conference ICMC2007, Copenhagen, Denmark, 2007
[Gabor 1946] D. Gabor, “Theory of communication”, Journal of the Institute of Electrical Engineers Part
III, 93, 1946
[Gibet Florens 1988] S. Gibet, J. L. Florens, "Instrumental gesture modeling by identification with time
varying mechanical models", International Computer Music Conference - Cologne 1988
[Gardner 1992] W. G. Gardner “A realtime multichannel room simulator”. J. Acoust. Soc. Am., 92(4
(A)): 2395. http://sound.media.mit.edu/papers.html
[Gardner 1998] W. G. Gardner, “Reverberation Algorithms“, Application of digital signal processing to
audio and acoustics, edited by M. Kahrs and K. Brandenburg, Kluwer Academic Publishers, 1998
[George & Smith 1992] E. B. George and M. J. T. Smith, “Analysis by Synthesis/Overlap-Add Sinusoidal
modeling Applied to the Analysis and Synthesis of musical tones”, JAES, 40(6):497-515, 1992
[Gerhards 2002] R. H. Gerhards, “Sound analysis, modification and resynthesis with wavelet packets”,
Master Thesis – School of Engineering Science Simon Fraser University, 2002
[Gerzon 1971] M. A. Gerzon, “Synthetic stereo reverberation, parts i and ii“, Studio Sound, vol. 13(I),
14(II), pp. 632-635(I), 24-28(II), 1971(I),1972(II)
[Geslin 2002] Y. Geslin, “Digital Sound and Music Transformation Environments: A Twenty-year
Experiment at the Groupe de Recherches Musicales”, JNMR, 31(2), 2002
[Gomez, Peterschmitt, Herrera 2003] E. Gomez, G. Peterschmitt, P. Herrera, “Content-based melodic
transformations of audio for a music processing application”, In Proceedings of 2003 DAFX, 2003
[Habibi 1997] A. Habibi, “Visualisation d’objets très déformables”, PhD Thesis, ACROE and CLIPS, Institut
National Polytechnique de Grenoble, France, 1997
[Hartmann 1978] W. M. Hartmann, “Flanging and Phasers”, Journal of Audio Engineering Society, 26(6),
1978
[Huelsman 1993] L. P. Huelsman, Active and Passive Analog Filter Design, McGraw-Hill, 1993
[Huovilainen 2004] A. Huovilainen, “Non-linear digital implementation of the Moog ladder filter”,
Proceedings of the 2004 Digital Audio Effects Conference pp. 61-64, 2004
[Huovilainen 2005] A. Huovilainen, “Enhanced digital models for digital modulation effects”, Proceedings
of the 2005 Digital Audio Effects Conference, Madrid, Spain, 2005
[Huovilainen Valimaki 2005] A. Huovilainen, V. Valimaki, “New Approaches to Digital Subtractive
Synthesis”, in Proceedings of 2005 International Computer Music Conference ICMC2005, Barcelona, Spain,
2005
[Incerti 1996] Incerti E., ''Synthèse de sons par modélisation physique de structures vibrantes :
application pour la création musicale par ordinateur'', PhD Thesis, ACROE, Institut National Polytechnique
de Grenoble, France, 1996
[Incerti Cadoz 1995] E. Incerti , C. Cadoz, ''Topology, Geometry, Matter of Vibrating Structures
Simulated with CORDIS-ANIMA. Sound Synthesis Methods. '', Proceedings of the International Computer
Music Conference ICMC1995, 1995
[Jaffe Smith 1983] D. A. Jaffe and J. O. Smith, “Extensions of the Karplus-Strong plucked string
algorithm”, Computer Music Journal, vol. 7, no. 2, pp. 56-69, 1983
[Jaffe 1995] D. A. Jaffe, “Ten criteria for evaluating synthesis and processing techniques”, Computer
Music Journal, vol. 19, no. 1, pp. 76-87, 1995
[Janer Bonada Jorda 2006] J. Janer, J. Bonada, S. Jorda, “Groovator – an implementation of real-time
rhythm transformations”, in 121 AES Convention, San Francisco, USA, 2006
[Jot Chaigne 1991] J.-M. Jot, A. Chaigne, “Digital delay networks for designing artificial reverberators”,
Audio Engineering Society Convention, Feb. 1991
[Jot 1992] J. M. Jot, “An analysis/synthesis approach to real-time artificial reverberation”, in
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, San Francisco,
(New York), pp. II.221-II.224, IEEE Press, 1992
[Kalouptsidis 1997] N. Kalouptsidis, Signal Processing Systems Theory and Practice, Wiley, 1997
[Kalouptsidis 1994] N. Kalouptsidis, Signal, Systems Theory and Algorithms (in Greek), second edition,
Diavlos, 1993
[Karjalainen 2003] M. Karjalainen, “Mixed physical modelling: DWG+FDTD+WDF”, Proceedings of the IEEE
Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, 2003
[Karjalainen Erkut 2004] M. Karjalainen, C. Erkut, “Digital Waveguides versus Finite Difference Structures:
Equivalence and Mixed Modeling”, EURASIP Journal on Applied Signal Processing, 2004:7, 2004
[Karjalainen 2004b] M. Karjalainen, "Discrete-time Modeling and Synthesis of Musical Instruments", Joint
Baltic-Nordic Acoustics Meeting, Mariehamn, Aland, 2004
[Karjalainen et al. 2004] M. Karjalainen, T. Maki-Patola, A. Kanerva, A. Huovilainen, P. Janis, “Virtual Air
Guitar”, 117th AES Convention, San Francisco, USA, 2004
[Karplus Strong 1983] K. Karplus and A. Strong, “Digital synthesis of plucked string and drum timbres”,
Computer Music Journal, vol. 7, no. 2, pp. 43-55, 1983
[Keen 2000] R. G. Keen, “A Musical Distortion Primer”, http://www.geofex.com
[Koenig Dunn and Lacy 1946] W. Koenig, H. K. Dunn, L. Y. Lacy, “The Sound Spectrograph”, J.
Acoustical Soc. of America 18(1), 1946
[Kontogeorgakopoulos Cadoz 2005] A. Kontogeorgakopoulos, C. Cadoz, “Digital Audio Effects and
Physical Modelling”, in the Proceedings of Sound and Music Computing Conference SMC05, Italy, 2005
[Kontogeorgakopoulos Cadoz 2007a] A. Kontogeorgakopoulos, C. Cadoz, “Cordis Anima Physical
Modelling and Simulation System Analysis”, in the Proceedings of Sound and Music Computing Conference
SMC07, Lefkada, Greece, 2007
[Kontogeorgakopoulos Cadoz 2007b] A. Kontogeorgakopoulos, C. Cadoz, “Physical Modelling as a
Proposed Framework for the Conception, the Design and the Implementation of Sound Transformations”,
in the Proceedings of International Computer Music Conference ICMC2007, Denmark, 2007
[Kontogeorgakopoulos Cadoz 2007c] A. Kontogeorgakopoulos, C. Cadoz, “Filtering Within the Framework
of the Mass-Interaction Physical Modelling and of Haptic Gestural Interaction”, in the Proceedings of
Digital Audio Effects DAFX07, France, 2007
[Kontogeorgakopoulos Cadoz 2008a] A. Kontogeorgakopoulos, C. Cadoz, “Amplitude Modification
Algorithms using Physical Models”, in the Proceedings of 124 Audio Engineering Society Convention, The
Netherlands, 2008
[Kontogeorgakopoulos Cadoz 2008b] A. Kontogeorgakopoulos, C. Cadoz, “Interfacing Digital Waveguide
with CORDIS-ANIMA Networks” (abstract), in the Proceedings of Acoutics08, Paris, France, 2008
[Kontogeorgakopoulos Cadoz 2008c] A. Kontogeorgakopoulos, C. Cadoz, “Designing and Synthesizing
Delay-Based Digital Audio Effects using the CORDIS-ANIMA Physical Modeling Formalism”, in the
Proceedings of Sound and Music Computing Conference SMC08, Berlin, Germany, 2008
[Kronland-Martinet 1988] R. Kronland-Martinet, “The wavelet transform for analysis, synthesis and
processing of speech and music sounds”, CMJ, 12(4), 1988
[Laakso Valimaki Karjalainen 1996] T. I. Laakso, V. Välimäki, M. Karjalainen, and U. K. Laine, “Splitting
the Unit Delay – Tools for Fractional Delay Filter Design”, IEEE Signal Processing Magazine, vol. 13, pp. 30-
60, Jan. 1996
[Landy 1991] L. Landy, “Sound Transformations in Electroacoustic Music”,
www.composersdesktop.com/landyeam.htm ,1991
[Lane Hoory Martinez Wang 1997] J. Lane, D. Hoory, E. Martinez, and P. Wang, “Modeling analog
synthesis with dsps”, Computer Music Journal, vol. 21, pp. 23-41, Winter 1997
[Lansky 1989] P. Lansky, “Compositional Applications of linear predictive coding”, Current Directions in
Computer Music Research, The MIT Press, 1989
[Laroche 1998] J. Laroche, “Time and pitch scale modification of audio signals”, in M. Kahrs and K.
Brandenburg (eds.), Applications of Digital Signal Processing to Audio and Acoustics, Kluwer Academic Publishers, 1998
[Laroche & Dolson 1999] J. Laroche and M. Dolson, “New phase vocoder techniques for real time pitch
shifting, chorusing, harmonizing and other exotic audio modifications”, JAES, vol. 47(11), 1999
[Lathi 1998] B. P. Lathi, Signal Processing and Linear Systems, Berkeley-Cambridge Press, 1998
[Lay 2003] D. Lay, Linear Algebra and its Applications, Addison-Wesley, 2003
[Le Brun 1979] M. Le Brun, “Digital Waveshaping Synthesis”, JAES, 27(4), 1979
[Lee 1972] F. Lee, “Time compression and expansion of speech by the sampling method”, JAES, 20(9), 1972
[Lent 1989] K. Lent, “An Efficient Method for Pitch Shifting Digitally Sampled Sounds”, CMJ, 13(4),
1989
[Loscos Aussenac 2005] A. Loscos, T. Aussenac, “The Wahwactor: A Voice Controlled Wah-Wah Pedal”,
in Proceedings of International Conference on New Interfaces for Musical Expression NIME05, Vancouver,
Canada, 2005
[Luciani 2007] A. Luciani, “Instrumental Interaction: Technology”, in Enaction and Enactive Interfaces, a
Handbook of Terms, A. Luciani and C. Cadoz eds, Enactive System Books, 2007
[Luciani 2007b] A. Luciani, “Gestural Channel”, in Enaction and Enactive Interfaces, a Handbook of
Terms, A. Luciani and C. Cadoz eds, Enactive System Books, 2007
[Magnusson 2007] C. Magnusson, “Design and Enaction”, in Enaction and Enactive Interfaces, a
Handbook of Terms, A. Luciani and C. Cadoz eds, Enactive System Books, 2007
[Manning 2004] P. Manning, Electronic and Computer Music, Oxford University Press, 2004
[Marliere Urma Florens Marchi 2004] S. Marliere, D. Urma, J. L. Florens, F. Marchi, “Multi-sensorial
interaction with a nano-scale phenomenon: the force curve”, Proceeding of Eurohaptics 2004, Munich,
Germany, 2004
[Marshall 2003] W. Marshall Leach, Jr, Introduction to Electroacoustics and Audio Amplifier Design, third
edition, Kendall/Hunt publishing company, 2003
[Massie Salisbury 1994] T. Massie, K. Salisbury, “The PHANToM Haptic Interface: A Device for Probing
Virtual Objects”, ASME Winter Annual Meeting, DSC, 55-1, 295-300, 1994
[McAulay and Quatieri 1986] R.J. McAulay and T.F. Quatieri, “Speech analysis/synthesis based on a
sinusoidal representation”, IEEE Transactions on Acoustics, Speech and Signal Processing, 34(4):744-
754, 1986
[McNally 1984] G. W. McNally, “Dynamic Range Control of Digital Signals”, JAES, 32(5), 1984
[Meunier 2000] O. Meunier, “La spatialité dans les modèles physiques pour la création musicale”,
Master’s thesis, ACROE, Institut National Polytechnique de Grenoble, France, 2000
[Miranda Wanderley 2006] E. Miranda, M. Wanderley, New Digital Musical Instruments: Control And
Interaction Beyond the Keyboard, A-R Editions, 2006
[Mitra 2001] S. K. Mitra, Digital Signal Processing: A Computer-Based Approach, McGraw-Hill, second
edition, 2001
[Moeller Gromowski Zoelzer 2002] S. Moeller, M. Gromowski, U. Zoelzer, “A Measurement Technique For
Highly Nonlinear Transfer Functions”, Proceedings of 2002 Digital Audio Effects Conference (DAFx-02),
Hamburg, Germany, 2002
[Moorefield 2005] V. Moorefield, The Producer as Composer: Shaping the Sounds of Popular Music, MIT
Press books, 2005
[Moog 1965] R. A. Moog, “A voltage-controlled low-pass high-pass filter for audio signal processing,”
Audio Engineering Society Convention, Preprint 413, Oct. 1965
[Moorer 1979] J. A. Moorer. “About this Reverberation Business”, Computer Music J., 3(2):13–18, 1979
[Moorer 1979b] J. A. Moorer, “The use of linear prediction of speech in Computer Music applications”,
JAES, 27(3), 1979
[Morse, Ingard 1968] P. M. Morse, K. U. Ingard, Theoretical Acoustics, Princeton University Press, 1968
[Moulines & Laroche 1995] E. Moulines and J. Laroche, “Non-parametric techniques for pitch-scale and
time-scale modifications of speech”, Speech Communication, 16, 1995
[Nielsen 1999] S. H. Nielsen, “Real Time Control of Audio Effects”, Proceedings of DAFX1999, Norway,
1999
[Nilsson Riedel 2004] J. W. Nilsson, S. A. Riedel, Electric Circuits, Pearson/Prentice Hall, 2004
[Noll 1964] A. M. Noll, “Short-time spectrum and ‘cepstrum’ techniques for vocal-pitch detection”,
JASA, 36(2), 1964
[Oppenheim Willsky Young 1983] A. V. Oppenheim, A. S. Willsky, I. T. Young, Signals and Systems,
Prentice-Hall, 1983
[Oppenheim Schafer Buck 1999] A. V. Oppenheim, R. W. Schafer, J. R. Buck, Discrete-Time Signal
Processing, second edition, Prentice Hall, 1999
[Orfanidis 1996] S. J. Orfanidis, Introduction to Signal Processing, Prentice-Hall, 1996
[Orlarey Fober Letz 2002] Y. Orlarey, D. Fober, S. Letz, “An Algebra for Block Diagram Languages”, in
Proceedings of ICMC2002, Sweden, Goteborg, 2002
[Otis Grossman Cuomo 1968] A. Otis, G. Grossman and J. Cuomo, “Four sound-processing programs for
the Illiac computer and D/A converter”, Experimental Music Studios Technical Report Number 14,
University of Illinois, 1968
[Pabon 1994] P. Pabon, “Real-time spectrum/cepstrum games”, Proceedings of 1994 ICMC, 1994
[Patterson Hennessy 2005] D. A. Patterson, J. L. Hennessy, Computer Organization and Design: The
Hardware/Software Interface, third edition, Morgan Kaufmann Publishers, 2005
[Peeters 1998] G. Peeters, “Analyse et Synthèse des sons musicaux par la méthode PSOLA”,
Proceedings of 1998 JIM, 1998
[Peissig Haseborg 2004] J. Peissig, J. R. ter Haseborg, “Digital Emulation of Analog Companding
Algorithms for FM Radio Transmission”, in Proceedings of 2004 Digital Audio Effects Conference (DAFx-
04), Naples, Italy, 2004
[Pelegrin 1959] M. Pelegrin, Machines à Calculer Électroniques Arithmétiques et Analogiques, Dunod, 1959
[Pesce 2000] F. Pesce, “Real-time stretching of speech signals”, Proceedings of the 2000 Conference
on DAFX, 2000
[Petrausch Rabenstein 2005] S. Petrausch, R. Rabenstein, “Implementation of arbitrary linear sound
synthesis algorithms by digital waveguide structures”, Proceedings of the Digital Audio Effects
Conference DAFX05, Madrid, Spain, 2005
[Petrausch, Rabenstein 2005b] S. Petrausch, R. Rabenstein, “Interconnection of state space structures
and wave digital filters,” IEEE Trans. Circuits Syst. II, vol. 52, no. 2, pp. 90–93, Feb 2005
[Portnoff 1976] M. R. Portnoff, “Implementation of the digital phase vocoder using the Fast Fourier
Transform”, IEEE Transactions on Acoustics Speech and Signal Processing, vol. ASSP-24(3), 1976
[Poullin 1954] J. Poullin, “L’apport des techniques d’enregistrement dans la fabrication de matières et
formes musicales nouvelles. Applications à la musique concrète”, L’Onde Électrique, 34(324), 1954
[Proakis Manolakis 1996] J. Proakis, D. Manolakis, Digital Signal Processing: Principles, Algorithms and
Applications, third edition, Prentice-Hall, 1996
[Rabenstein Trautmann 2001] R. Rabenstein, L. Trautmann, “Digital Sound Synthesis by Physical
Modelling”, Symposium on Image and Signal Processing and Analysis ISPA01, Pula, Croatia, 2001
[Rabenstein et al. 2007] R. Rabenstein, S. Petrausch, A. Sarti, G. De Sanctis, C. Erkut, M. Karjalainen,
“Block-Based Physical Modeling for Digital Sound Synthesis”, IEEE Signal Processing Magazine, (42), 2007
[Rabiner et al. 1972] L. Rabiner, J. Cooley, H. Helms, L. Jackson, J. Kaiser, C. Rader, R. Schafer, K.
Steiglitz, C. Weinstein, “Terminology in digital signal processing”, IEEE Transactions on Audio and
Electroacoustics, 20(5), 1972
[Risset 1991] J. C. Risset, “Timbre Analysis by Synthesis: Representations, Imitations, and Variants for
Musical Composition”, G. Poli, A. Piccialli and C. Roads, Representations of musical signals, The MIT
press, 1991
[Quatieri McAulay 1998] T. F. Quatieri and R. J. McAulay, “Audio signal processing based on sinusoidal
analysis/synthesis”, in Applications of DSP to Audio & Acoustics (M. Kahrs and K. Brandenburg, eds.), pp.
343-416, Boston/Dordrecht/London: Kluwer Academic Publishers, 1998
[Roads 1991] C. Roads, “Asynchronous Granular Synthesis”, G. Poli, A. Piccialli and C. Roads,
Representations of musical signals, The MIT press, 1991
[Roads 1996] C. Roads, The Computer Music Tutorial, MIT Press, 1996
[Roads 2001] C. Roads, Microsound, MIT Press, 2001
[Rocchesso 2003] D. Rocchesso, Introduction to Sound Processing, http://www.scienze.univr.it/~rocchess
[Rocchesso 2002] D. Rocchesso, “Spatial Effects”, DAFX, edited by U. Zolzer, John Wiley & Sons, 2002
[Roucos & Wilgus 1985] S. Roucos and A.M. Wilgus, “High quality time-scale modification for speech”,
Proceedings ICASSP, 1985
[Schaefer 1970] R. Schaefer, “Electronic Musical Production by Nonlinear Waveshaping”, JAES, 18(4),
1970
[Rocard 1971] Y. Rocard, Dynamique générale des vibrations, Masson et Cie, 1971
[Stautner Puckette 1982] J. Stautner, M. Puckette, “Designing Multichannel Reverberators”, Computer
Music J., 6(1): 52–65, Spring 1982
[Rumsey 1999] F. Rumsey, The Audio Workstation Handbook, Focal Press, 1999
[Schattschneider Zoelzer 1999] J. Schattschneider, U. Zoelzer, “Discrete-time models for nonlinear audio
systems”, Proceedings of 1999 Digital Audio Effects Conference (DAFx-99), Norway, 1999
[Schimmel 2003] J. Schimmel, “Using Nonlinear Amplifier Simulation in Dynamic Range Controllers”, in
Proceedings of 2003 Digital Audio Effects Conference (DAFx-03), London, England, 2003
[Schroeder Logan 1961] M. R. Schroeder, B. F. Logan, “‘Colorless’ Artificial Reverberation”, JAES,
9(3), 1961
[Schroeder 1962] M. R. Schroeder, “Natural-Sounding Artificial Reverberation”, J. Audio Eng.
Soc., 10(3):219–233, July 1962
[Schroeder 1970] M. R. Schroeder, “Digital Simulation of Sound Transmission in Reverberant
Spaces”, J. Acoustical Soc. of America, 47(2): 424–431, 1970
[Sedra Smith 2004] A. Sedra, K. Smith, Microelectronic Circuits, Fifth edition, Oxford University Press,
2004
[Serra, Bonada 1998] X. Serra, J. Bonada, “Sound Transformations Based on the SMS High Level
Attributes”, In Proceedings of 1998 DAFX, 1998
[Serra 1994] X. Serra, “Sound hybridization techniques based on a deterministic plus stochastic
decomposition model”, In Proceedings of 1994 ICMC, pp. 348-351, 1994
[Serra & Smith 1990] X. Serra and J. Smith, “Spectral modeling synthesis: a sound analysis/synthesis
system based on a deterministic plus stochastic decomposition”, CMJ, 14(4):12-24, 1990
[Serra 1997] X. Serra, “Musical Sound Modeling with sinusoids plus noise”, C. Roads, S. T. Pope, A.
Piccialli and G. Poli Musical Signal Processing, Swets & Zeitlinger, 1997
[Smith 1983] J. O. Smith, Techniques for Digital Filter Design and System Identification with Application
to the Violin, PhD thesis, Elec. Engineering Dept., Stanford University (CCRMA), June 1983
[Smith 1984] J. O. Smith, “An all-pass approach to digital phasing and flanging”, Proceedings of the
1984 International Computer Music Conference, Paris, France, 1984
[Smith 1985] J. O. Smith, “A New Approach to Digital Reverberation using closed waveguide networks”,
Proceedings of 1985 International Computer Music Conference, Vancouver, Canada, 1985
[Smith 1992] J. O. Smith, “Physical modeling using digital waveguides”, Computer Music Journal, vol. 16,
pp. 74-91, 1992
[Smith 1995] J. O. Smith, Introduction to Digital Filters, September 2005 Draft, http://ccrma.stanford.edu/~jos/filters05
[Smith 1996] J. O. Smith, “Physical modeling synthesis update”, Computer Music Journal, vol. 20, pp.
44-56, Summer 1996
[Smith Serafin Abel Berners 2002] J. O. Smith, S. Serafin, J. Abel, and D. Berners, “Doppler simulation
and the Leslie”, in Proceedings of the COST-G6 Conference on Digital Audio Effects (DAFx-02), Hamburg,
Germany, pp. 13-20, September 26 2002
[Smith 2002] J. O. Smith, Mathematics of the Discrete Fourier Transform, August 2002 Draft
[Smith 2005] J. O. Smith, Physical Audio Signal Processing For Virtual Musical Instruments and Digital
Audio Effects, December 2005 Draft
[Smith 2007] J. O. Smith, Physical Audio Signal Processing For Virtual Musical Instruments and Digital
Audio Effects, August 2007 Edition
[Smith 2007b] J. O. Smith, Spectral Audio Signal Processing, March 2007 Draft
[Smith 2008] J. O. Smith, “Virtual Electric Guitars and Effects using Faust and Octave”, Linux Audio
Conference LAC2008, Cologne, Germany, 2008
[Smith 1998] J. O. Smith, “Principles of Digital Waveguide Models of Musical Instruments”, Application of
Digital Signal Processing to Audio and Acoustics edited by M. Kahrs and K. Brandenburg, Kluwer
Academic Publishers, 1998
[Smith 2004] J. O. Smith, “On the equivalence of digital waveguide and finite difference time domain
schemes”, July 21, 2004, http://arxiv.org/abs/physics/0407032/
[Springer 1955] A. M. Springer, “Ein akustischer Zeitregler”, Gravesaner Blätter, (1):32-27, 1955
[Stilson 2006] T. S. Stilson, “Efficiently-Variable, Non-Oversampled Algorithms in Virtual-Analog Music
Synthesis”, PhD Thesis, CCRMA, Stanford University, 2006
[Stilson Smith 1996] T. Stilson, J. O. Smith, “Analyzing the Moog VCF with considerations for digital
implementation”, Proceedings of 1996 International Computer Music Conference, Hong-Kong, 1996
[Stilson Smith 1996b] T. Stilson and J. O. Smith, “Alias-free synthesis of classic analog waveforms” in
Proceedings of the 1996 International Computer Music Conference, Hong Kong, Computer Music
Association, 1996
[Strikwerda 1989] J. C. Strikwerda, Finite Difference Schemes and Partial Differential Equations, Pacific
Grove, CA: Wadsworth and Brooks, 1989
[Tache 2004] O. Tache, “Utilisation des liaisons non linéaires pour la construction d’objets évolutifs
dans l’environnement de création musicale GENESIS”, Master’s thesis, ACROE, Institut National Polytechnique
de Grenoble, France, 2004
[Tache Cadoz 2006] O. Tache, C. Cadoz, “Generation of Complex Sound Sequences using Physical
Models with Dynamical Structures”, Proceedings of the 2006 International Computer Music Conference,
pp. 1-8, ICMC, 2006
[Tache Cadoz 2006b] O. Tache, C. Cadoz, “Using Evolving Physical Models for Musical Creation in the
GENESIS Environment”, Proceedings of the third Sound and Music Computing Conference - SMC’06, pp.
53-60, GMEM, May 2006
[Tan & Lin 2000] R. K. C. Tan and A. H. J. Lin, “A Time-Scale Modification Algorithm Based on the
Subband Time-Domain Technique for Broad-Band Signal Applications”, JAES, 48(5), 2000
[Temes LaPatra 1977] G. C. Temes, J. W. LaPatra, Introduction to Circuit Synthesis and Design, McGraw-
Hill, 1977
[Thill 2003] F. Thill, “Simulation de Collisions et Applications à la Synthèse d’objets sonores et
percussifs”, Master thesis, Institut National Polytechnique de Grenoble, 2003
[Todoroff 2002] T. Todoroff, “Control of Digital Audio Effects”, DAFX, edited by U. Zolzer, John Wiley &
Sons, 2002
[Tolonen Valimaki Karjalainen 1998] T. Tolonen, V. Valimaki, M. Karjalainen, “Evaluation of Modern Sound
Synthesis Methods”, Report 48 - Helsinki University of Technology, Department of Electrical and
Communication Engineering, Laboratory of Acoustics and Audio Signal Processing, 1998
[Tomlinson 1991] G. H. Tomlinson, Electrical Networks and Filters – Theory and Design, Prentice Hall,
1991
[Tongue 1996] B. H. Tongue, Principles of Vibration, Oxford University Press, 1996
[Truax 1988] B. Truax, “Real-time granular synthesis with a digital signal processor”, CMJ, 12(2), 1988
[Valimaki Takala 1996] V. Valimaki, T. Takala, “Virtual Musical Instruments – Natural Sound Using
Physical Models”, Organised Sound, vol 1, no 2, pp. 75-86, 1996
[Van Valkenburg 1960] M. E. Van Valkenburg, Introduction to Modern Network Synthesis, New York: John
Wiley and Sons, Inc., 1960.
[Van Duyne Smith 1993] S. A. Van Duyne and J. O. Smith, “Physical modeling with the 2-D digital
waveguide mesh“. in Proceedings of the 1993 International Computer Music Conference, Tokyo, pp. 40-
47, Computer Music Association, 1993
[Van Duyne 1992] S. A. Van Duyne and J. O. Smith, “Implementation of a variable pick-up point on a
waveguide string model with FM/AM applications”, In Proc. Intl. Computer Music Conf, San Jose, pp.
154–157, 1992
[Van Duyne 2007] Scott Van Duyne, “Digital Filtering Applications to modeling wave propagation in
springs, strings, membrane and acoustical space”, PhD Thesis, CCRMA, Stanford University, 2007
[Van Nort Castagne 2007] D. Van Nort, N. Castagne, “Mapping, in Digital Music Instruments”, in
Enaction and Enactive Interfaces, a Handbook of Terms, A. Luciani and C. Cadoz eds, Enactive System
Books, 2007
[Verfaille Arfib 2001] V. Verfaille and D. Arfib, “ADAFx: Adaptive Digital Audio Effects”, In Proceedings
of the 2001 DAFX, 2001
[Verfaille, Arfib 2002] V. Verfaille and D. Arfib, “Implementation Strategies for Adaptive Digital Audio
Effects”, In Proceedings of 2002 DAFX, Hamburg, 2002
[Verfaille 2003] V. Verfaille, “Effets Audionumériques Adaptatifs: Théorie, Mise en oeuvre et Usage en
Création Musicale Numérique,” Ph.D. dissertation, Univ. Méditerranée Aix-Marseille II, Marseille, France,
2003
[Verfaille, Depalle 2004] V. Verfaille, P. Depalle, “Adaptive Effects based on STFT, using a source filter
model”, in Proceedings 2004 DAFX, 2004
[Verfaille Zolzer Arfib 2006] V. Verfaille, U. Zolzer, D. Arfib, “Adaptive Digital Audio Effects (A-DAFx): A
New Class of Sound Transformations”, IEEE Transactions on Audio, Speech and Language Processing,
2006
[Verfaille Guastavino Traube 2006] V. Verfaille, C. Guastavino, C. Traube, “An Interdisciplinary Approach
to Audio Effect Classification”, Proceedings of DAFX06, Montreal, Canada, 2006
[Verfaille Wanderley Depalle 2006] V. Verfaille, M. Wanderley, P. Depalle “Mapping Strategies for Gestural
and Adaptive Control of Digital Audio Effects”, Journal of New Music Research 35(1), 2006
[Verhelst & Roelands 1993] W. Verhelst, M. Roelands, “An overlap-add technique based on waveform
similarity (WSOLA) for high quality time-scale modification of speech”, IEEE International Conference on
Acoustics, Speech, and Signal Processing, vol. 2, 1993
[Verma & Meng 2000] T. S. Verma and T. H. Y. Meng, “Extending spectral modeling synthesis with
transient modeling synthesis”, CMJ, 24(2):47-59, 2000
[Volpe 2007] G. Volpe, “Gesture Analysis”, in Enaction and Enactive Interfaces, a Handbook of Terms,
A. Luciani and C. Cadoz eds, Enactive System Books, 2007
[Wanderley Battier 2000] M. Wanderley, M. Battier (Edt), Trends In Gestural Control Of Music, IRCAM
2000
[Wanderley Depalle 2004] M. Wanderley, P. Depalle, “Gestural control of sound synthesis”, Proceedings
of the IEEE, Special Issue on Engineering and Music—Supervisory Control and Auditory Communication
92(4), 2004
[Wardle 1998] S. Wardle, “A Hilbert-Transformer Frequency Shifter for Audio”, in Proc. of the Int.
Conf. on Digital Audio Effects (DAFx-98), Barcelona, Spain, 1998
[Wishart] T. Wishart, “Computer Sound Transformations: A Personal Perspective from the UK”,
www.composersdesktop.com/trnsform.htm
[Wishart 1994] T. Wishart, Audible Design: A Plain and Easy Introduction to Practical Sound Composition,
Orpheus and Pantomime Ltd, 1994
[Yeh Abel Smith 2007a] D. T. Yeh, J. Abel, and J. O. Smith, “Simplified, physically-informed models of
distortion and overdrive guitar effects pedals,” in Proc. of the Int. Conf. on Digital Audio Effects (DAFx-
07), Bordeaux, France, Sept. 10–14, 2007
[Yeh Abel Smith 2007b] D. T. Yeh, J. Abel, and J. O. Smith, “Simulation of the Diode Limiter in Guitar
Distortion Circuits by Numerical Solution of Ordinary Differential Equations”, in Proc. of the Int. Conf. on
Digital Audio Effects (DAFx-07), Bordeaux, France, Sept. 10–14, 2007
[Zoelzer 1997] U. Zoelzer, Digital Audio Signal Processing, John Wiley & Sons Ltd, 1997
[Zoelzer 2002] U. Zoelzer (Edt), Digital Audio Effects, John Wiley & Sons Ltd, 2002