The ChucK Audio Programming Language:
"A Strongly-timed and On-the-fly Environ/mentality"
Ge Wang
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
By the Department of
Computer Science
Advisor: Perry R. Cook
September 2008
© Copyright by Ge Wang, 2008.
All Rights Reserved
The computer has long been considered an extremely attractive tool for creating,
manipulating, and analyzing sound. Its precision, possibilities for new timbres, and
potential for fantastical automation make it a compelling platform for expression
and experimentation - but only to the extent that we are able to express to the
computer what to do, and how to do it. To this end, the programming language
has perhaps served as the most general, and yet most precise and intimate interface
between humans and computers. Furthermore, "domain-specific" languages can
bring additional expressiveness, conciseness, and perhaps even different ways of
thinking to their users.
This thesis argues for the philosophy, design, and development of ChucK, a
general-purpose programming language tailored for computer music. The goal is to
create a language that is expressive and easy to write and read with respect to time
and parallelism, and to provide a platform for precise audio synthesis/analysis and
rapid experimentation in computer music. In particular, ChucK provides a syntax
for representing information flow, a new time-based concurrent programming model
that allows programmers to flexibly and precisely control the flow of time in code (we
call this "strongly-timed"), and facilities to develop programs on-the-fly - as they
run. A ChucKian approach to live coding as a new musical performance paradigm is
also described. In turn, this motivates the Audicle, a specialized graphical environment
designed to facilitate on-the-fly programming, to visualize and monitor ChucK
programs in real-time, and to provide a platform for building highly customizable
user interfaces.
In addition to presenting the ChucK programming language, a history of music
and programming is provided (Chapter 2), and the various aspects of the ChucK
language are evaluated in the context of computer music research, performance,
and pedagogy (Chapter 6). As part of an extensive case study, the thesis discusses
ChucK as a primary teaching and development tool in the Princeton Laptop Orchestra
(PLOrk), which continues to be a powerful platform for deploying ChucK
1) to teach topics ranging from programming to sound synthesis to music composition,
and 2) for crafting new instruments, compositions, and performances for
computer-mediated ensembles. Additional applications are also described, including
classrooms, live coding arenas, compositions and performances, user studies,
and integrations of ChucK into other software systems.
The contributions of this work include the following. 1) A time-based programming
mechanism (both language and underlying implementation) for ultra-precise
audio synthesis, naturally extensible to real-time audio analysis. 2) A non-preemptive,
time/event-based concurrent programming model that provides fundamental
flexibility and readability without incurring many of the difficulties of
programming concurrency. 3) A ChucKian approach to writing code and designing
audio programs on-the-fly. This rapid prototyping mentality has potentially
wide ramifications in the way we think about coding audio, in designing/testing
software (particularly for real-time audio), as well as new paradigms and practices
in computer-mediated live performance. 4) The Audicle as a new type of audio
programming environment that combines live development with visualizations. 5)
Extended case studies of using, teaching, composing, and performing with ChucK,
most prominently in the Laptop Orchestra. These show the power of teaching programming
via music, and vice versa - and how these two disciplines can reinforce
each other.
In developing the works described in this dissertation (as well as the document
itself), I am indebted to a great many people for teaching, help, guidance, and
encouragement throughout. I am deeply grateful to Perry Cook for his teaching,
mentorship, friendship, and for granting me the freedom to explore new (sometimes
crazy-seeming) directions while providing the guidance and encouragement to help
me to see things through. Immense gratitude to Dan Trueman for his always considerate
and generous help and guidance, and (to both Dan and Perry) for trust
and confidence in me to help develop the laptop orchestra. I have been incredibly
fortunate to work with Paul Lansky, Ken Steiglitz, Brian Kernighan, Bede Liu,
Wayne Wolf, Kai Li, and Andrew Appel. I thank them for guidance from and
to many directions. Very special thanks to Roger Dannenberg, who has provided
invaluable guidance and support throughout my graduate career. Deep thanks to
Melissa Lawson for taking care of us graduate students since before we arrived.
Many aspects of this work have benefited immensely from people working hard
together. It continues to be my great honor to work with folks at the SoundLab:
Ajay Kapur, Ari Lazier, Philip Davidson, Spencer Salazar, Tae Hong Park, Adam
Tindale, Ahmed Abdallah, Paul Botelho, Matt Hoffman, Jeff Bernstein. I am truly
grateful to Ananya Misra - together we've pulled through many projects, demos,
papers, and presentations. I am incredibly honored to work with Scott Smallwood,
whether it is surviving PLOrk or composing/performing together. Heartiest thanks
to comrade Rebecca Fiebrink, for being amazing in everything we do.
Many thanks to Adam Finkelstein, Szymon Rusinkiewicz, Tom Funkhouser,
and the always illuminating folks in the Graphics Lab. To Ed Felten, Alex Halderman,
Frances Perry, and Shirley Gaw, Limin Jia, Yong Wang, and Jay Ligatti for
security and support. Special thanks to CS Staff for amazing support, and to all
faculty, staff, and students at the Princeton Computer Science Department.
I have also been fortunate to collaborate with many great mentors and colleagues
outside of Princeton. My deep thanks to Jon Appleton for his support and friendship.
To Larry Polansky and Kui Dong for their encouragement, and for inviting me
to teach the graduate seminar at Dartmouth in 2007. To the graduate students in
the Electro-acoustic Music program; to Yuri Spitsyn for his friendship and guidance.
Special thanks to Gary Scavone for being a great colleague (even though we rarely
see each other in person), George Tzanetakis for endlessly fascinating discussions on
software design for audio systems, Georg Essl for being a wonderful colleague and
friend, Nic Collins and Shawn Decker for encouragement and interest in this work,
Brad Garton for his wonderful ideas and support. My profound thanks to Max
Mathews and John Chowning for continued inspiration and encouragement, as well
as to Chris Chafe, Julius Smith, Jonathan Berger, Chryssie Nanou, Rob Hamilton,
Jonathan Abel, Bret Ascarelli, Jieun Oh, and the amazing 2007-2008 MA/MST
students, as well as everyone else at CCRMA and the Stanford Music Department
for their deep support and belief in me. I profusely thank the ChucK users and
developers communities, especially Kassen and Kijjaz, as well as Nick Collins, Alex
McLean, and other fellow live coders and colleagues at TOPLAP. I leave out many
wonderful folks to whom I am indebted and whom I keep at heart.
This research was supported by funding from the National Science Foundation
(0101247, 0509447, 9984087) and by Interval Research and the Arial Foundation.
Finally, thanks to my grandparents who are responsible for the good in me, my
parents for standing behind me and encouraging my interests, to Manman for her
sacrifices and undying support.
1 Introduction and Motivation
  1.1 Problem Statement
  1.2 "A New Way of Thinking about Audio Programming"
  1.3 The ChucKian approach
  1.4 Roadmap
2 A History of Music and Programming
  2.1 Early Eras: Before Computers
  2.2 The Computer Age (Part I): Early Languages and Rise of MUSIC-N
    2.2.1 MUSIC I (and II, III, ...)
    2.2.2 The CARL System (or "UNIX for Music")
    2.2.3 Cmix, CLM, Csound
  2.3 The Computer Age (Part II): Real-time Systems and New Approaches
    2.3.1 Graphical Music Programming: Max/MSP + Pure Data
    2.3.2 Programming Libraries for Sound Synthesis
    2.3.3 Nyquist
    2.3.4 SuperCollider
    2.3.5 Graphical vs. Text-based
    2.3.6 Additional Music Programming Languages
  2.4 The Computer Age (Part III): New Language Explorations
    2.4.1 Custom Music Programming Software
  2.5 Synchronous Reactive Systems
    2.5.1 ChucK and Synchronous Languages
  2.6 Future Directions
3 ChucK
  3.1 Language Design
    3.1.1 Two Observations
    3.1.2 Design Goals
  3.2 Core Language Features
    3.2.1 ChucK Operator (=>)
    3.2.2 ChucKian Time
    3.2.3 Concurrency Based on Shreds
    3.2.4 Synthesis and Analysis
  3.3 Language Specification
    3.3.1 Types, Values, Variables
    3.3.2 Arrays
    3.3.3 Operators
    3.3.4 Control Structures
    3.3.5 Manipulating Time
    3.3.6 Functions
    3.3.7 Concurrency, Processes, and Shreds
    3.3.8 Programming with Events
    3.3.9 Unit Generators
    3.3.10 Unit Analyzers
  3.4 System Design and Implementation
    3.4.1 Architecture
    3.4.2 Compilation
    3.4.3 ChucK Virtual Machine + Shreduler
    3.4.4 Audio Computation
  3.5 Properties
    3.5.1 Time and Programming
    3.5.2 Dynamic, Precise Control Rate
    3.5.3 Synchronous Concurrent Control
  3.6 Where to go from here
4 On-the-fly Programming
  4.1 Motivation
  4.2 Challenges
  4.3 A ChucKian Approach
    4.3.1 External Interface
    4.3.2 Internal Semantics
  4.4 An On-the-fly Aesthetic
5 The Audicle
  5.1 Introduction
    5.1.1 Motivation
    5.1.2 Related Environments
  5.2 Audicle Design
  5.3 Faces of the Audicle
    5.3.1 The ShrEditor
    5.3.2 VM-Space
    5.3.3 Shredder
    5.3.4 Time and Timing
    5.3.5 Tabula Rasa
  5.4 Audicle Implementation
  5.5 miniAudicle
  5.6 Discussion
6 Applications and Evaluations
  6.1 Evolution of ChucK
  6.2 Teaching ChucK
    6.2.1 Princeton Laptop Orchestra
    6.2.2 Assignments
    6.2.3 Results and Evaluation
    6.2.4 Additional Courses
  6.3 ChucK in Performance and Research
    6.3.1 Performance in Laptop Orchestra
    6.3.2 S.M.E.L.T.
    6.3.3 ChucK for TAPESTREA
  6.4 Additional and Potential Applications
7 Conclusion
  7.1 Contributions
    7.1.1 The Meaning of "Strongly-timed"
  7.2 Future Work
    7.2.1 Exploring Analysis, Rapid Prototyping, Learning
    7.2.2 Worlds for Collaborative Social Audio Programming
    7.2.3 Planned Language Features
    7.2.4 Laptop Orchestras
    7.2.5 ChucK and the Mobile Phone
  7.3 Concluding Remarks
Bibliography
List of Figures
1.1 Some ChucKian things
1.2 A conjecture
1.3 Another conjecture
2.1 The IBM 360, released in 1965, with human operators
2.2 A simple Max/MSP patch which synthesizes the vowel ahh
2.3 SuperCollider programming environment in action
3.1 Relative emphasis between three design goals
3.2 A statement using the ChucK operator, here connecting the output of foo to dac (both are assumed to be unit generators)
3.3 A statement that uses the ChucK operator to connect three audio elements together
3.4 Two different syntaxes for invoking the same nested function calls
3.5 A ChucK program to generate a sine wave, changing its frequency of oscillation every 100 milliseconds
3.6 complex values, real/imaginary components
3.7 polar values, magnitude/phase components
3.8 Some operations on complex and polar types in ChucK
3.9 Examples of basic operations on time and dur
3.10 Example of constructing the notions of a quarter and a whole with
3.11 Some examples of using the dur type
3.12 Some examples of advancing time with durations in ChucK
3.13 A short program demonstrating time and dur types
3.14 A sound generating program that randomizes frequencies every 100
3.15 Defining a function, then sporking that function on a new shred
3.16 Defining a function, sporking two copies of it on new shreds
3.17 Example showing the yield() function, which relinquishes the VM without advancing time
3.18 Prints out the current shred's id
3.19 Attempts to open a MIDI device, and exits if the operation fails
3.20 Operations using static members of the Machine class
3.21 Passing arguments to shreds via Machine
3.22 Loops through shred arguments and prints each
3.23 Code snippet to wait on an Event, and printing a debug message
3.24 An example that sporks four shreds, and invokes them via signal() and broadcast()
3.25 A program to open a MIDI device, wait on incoming messages, and print them
3.26 An OpenSoundControl receiver
3.27 Defining and using an Event subclass
3.28 Setting up a unit generator network with feedback
3.29 Dynamically connecting/disconnecting unit generators
3.30 Asserting/reading control values
3.31 ChucK run-time architecture
3.32 Phases in the ChucK compiler
3.33 A ChucK shred and primary components
3.34 Single-shredded shreduling algorithm
3.35 Multi-shredded shreduling algorithm, with messaging
3.36 Constructing a classic Karplus and Strong plucked string model
3.37 An envelope follower (and simple onset detector), based on a leaky integrator. (author: Perry Cook)
3.38 A concurrent program framework for singing synthesis, naturally balancing source generation, musical parameters, and interpolation in three shreds. (author: Perry Cook)
4.1 An article about live coding, published in Zeitwissen in 2006
4.2 "external" ChucK shell commands for adding/replacing code
4.3 "internal" ChucK shell commands for adding/replacing code
4.4 Two methods to "synch" with a later time
4.5 "Synching" to some absolute time
4.6 Define a period; synchronize to next period boundary
4.7 Synchronize to period boundary, plus offset
4.8 An on-the-fly programmer/performer and code projection
4.9 An on-the-fly programmer/performer and code projection (close-up)
4.10 A schematic for a double-projection, on-the-fly duet
4.11 An On-the-fly Programming collage, prepared for Art Gallery performance at SIGGRAPH 2006
5.1 Completing the loop. The Audicle strives to bridge runtime interactions with development-time elements.
5.2 The Audicle Console. The cube interface (left) can be used to graphically navigate the AudiCube. The command line prompt on the right accepts text commands.
5.3 The ShrEditor: a version-tracking on-the-fly editing interface
5.4 "Grapes" represent running shreds, grouped by revision
5.5 One can drag revisions to split text buffers
5.6 Many on-the-fly coding buffers
5.7 VMSpace: Audicle face to visualize real-time audio and spectra
5.8 The Shredder: visualizing active and deactivated shreds (the latter ascending towards viewer)
5.9 The Shredder: an Audicle face to visualize and monitor shreds
5.10 The Shredder: a top-down view
5.11 Time'n'Timing (TNT): Audicle face to visualize relative timing between shreds
5.12 miniAudicle: a lightweight integrated development environment for ChucK and on-the-fly programming
6.1 Timeline: evolution of ChucK, 2002-2004
6.2 Timeline: evolution of ChucK, 2005
6.3 Timeline: evolution of ChucK, 2006
6.4 Timeline: evolution of ChucK, 2006-2008
6.5 PLOrk class in session
6.6 PLOrk setup (individual stations)
6.7 PLOrk setup (minus humans)
6.8 PLOrk in action
6.9 The Stanford Laptop Orchestra classroom in motion
6.10 PLOrk setup, onstage at Taplin Auditorium, Princeton
6.11 Teaching in the Stanford Laptop Orchestra
6.12 An on-the-fly programming schematic
6.13 The score for On-the-fly Counterpoint
6.14 Network configuration (partial ensemble)
6.15 Non-Specific Gamelan Taiko Fusion performed in PLOrk
6.16 Non-Specific Groove: a network-synchronized colorful step sequencer implemented in the Audicle. The green highlight moves across the squares in real-time, as coordinated by the ensemble's master machine. Each color is associated with a different sound. All sound synthesis and networking are written in ChucK.
6.17 A possible sequence of suggested colors (texture) and density constitute the score, which the conductor visually conveys to the ensemble
6.18 CliX in performance: the orchestra surrounds the audience (below) from around the balcony at Chancellor Green Library; conductor guides the direction of the performance
6.19 An interface for On The Floor, built in the Audicle, sound synthesis in ChucK
6.20 ChucK ChucK Rocket: game board as seen by one of the players
6.21 ChucK ChucK Rocket: from another viewpoint
6.22 Crystalis: keyboard and trackpad mappings
6.23 TBA: orchestral live coding
6.24 PLOrk Beat Science: Rebecca Fiebrink and Ge Wang
6.25 PLOrk Beat Science: 1 flute, 2 humans, 5 laptops, 5 TriggerFingers, 30 audio channels
6.26 PLOrk Beat Science: floor plan
6.27 Mahadevibot: musical robotic, playing with a human performer; various software components implemented in ChucK
6.28 Networked Audio Performances: Gigapop Ritual (2003, left) between McGill University and Princeton University; right: performance between CCRMA and Banff with a distributed St. Lawrence String Quartet using JackTrip. Both systems used C/C++ based software, though networked audio may be a potential application of ChucK in the future.
7.1 Future work: a denizen in the envisioned collaborative social audio programming virtual world
7.2 Future work: many entities collaboratively live coding in same virtual
7.3 Laptop Orchestras: PLOrk, SLOrk - and hopefully beyond!
Figure 1.1: Some ChucKian things.
Chapter 1
Introduction and Motivation
"The old computing is about what computers can do. The new computing is about
what people can do." - Ben Shneiderman
1.1 Problem Statement
The computer has long been considered an extremely attractive tool for creating
and manipulating sound [54, 93]. Its precision, possibilities for new timbres, and
potential for fantastical automation make it a compelling platform for experimenting
with and making music - but only to the extent that we can actually tell a computer
what to do, and how to do it.
A program is a sequence of instructions for a computer. A programming language
is a collection of syntactic and semantic rules for specifying these instructions,
and eventually for providing the translation from human-written programs to the
corresponding instructions computers carry out. In the history of computing, many
interfaces have been designed to instruct computers, but none have been as fundamental
(or perhaps as enduring) as programming languages. Unlike most other
classes of human-computer interfaces, programming languages don't directly perform
any specific "end-use" task (such as word processing or video editing), but
instead allow us to build software that might perform almost any custom function.
The programming language acts as a mediator between human intention and the
corresponding bits and instructions that make sense to a computer. It is the most
general and yet the most intimate and precise tool for instructing computers.
Programs exist on many levels, ranging from assembler code (extremely low
level) to high-level scripting languages that often embody more human-readable
structures, such as those resembling spoken languages or graphical representations
of familiar objects. Domain-specific languages retain general programmability while
providing additional abstractions tailored to the domain (e.g., sound synthesis).
Yet, even within the domain of audio programming, there is a staggeringly vast
range of tasks that one may wish to perform (or investigate), ranging from methods
for sound synthesis, physical modeling of real-world artifacts and spaces (e.g.,
musical instruments, environmental sounds), analysis and information retrieval of
sound and music, to mapping and crafting of new controllers and interfaces (both
software and physical) for music, algorithmic/generative processes for automated
or semi-automatic composition and accompaniment, real-time music performance,
to many others. Moreover, within each of these areas, there lies unbounded variation
in programming approaches, styles, and demands on the tools (e.g., ability to
create/run real-time programs).
Furthermore, audio programming, in both computational acoustics research and
in music composition and performance, is necessarily an experimental and empirical
process; it requires rapid experimentation, verification/rejection/workshopping
of ideas and approaches, and takes the forms of both short-term and sustained prototyping.
It can greatly benefit from the ability to modify, or even create, parts of
the software system "on-the-fly" - as it runs. We believe that rapid prototyping,
in and of itself, is a uniquely useful approach to programming audio, with its own
benefits (and different ways of thinking about problems).
Faced with such a wide gamut of possibilities and demands, how do we go
about thinking about and designing a general programming tool to address these
aspects of expressive programmability, rapid prototyping, and readability? This is the
problem statement, and this dissertation addresses its various facets in terms of
a new programming language, called ChucK, and chronicles its design, ideas, and
In addition to our desire to address the problems stated above, we are also
motivated to provide new tools for computer science and computer music pedagogy,
and for exploring new musical paradigms. We believe an audio-centric language
such as ChucK should be useful both to novices learning about the domain and
to experts wishing to effectively craft software that is expressive, readable
(to themselves and to others), and that supports clear, concise, and maintainable
representations of sonic ideas and algorithms.
1.2 "A New Way of Thinking about Audio Programming"
The great computer scientist Alan Perlis once said that "a programming language
that doesn't change the way you think is not worth learning." Indeed, we are motivated
in the design of ChucK to investigate new ways of thinking about programming
sound and music, particularly by looking at it from a human-centric
perspective (e.g., as opposed to a machine-centric one). As we posited above, a
programming language is a highly general and yet highly intimate human-computer
interaction (HCI) device.
[Figure 1.2: A conjecture. Label: "HCI Device"]
If that is the case, then perhaps we can think of the task of programming language
design as HCI design - loosely speaking. We say "loosely" because while
the process embodies the high level principle of designing for humans, we do not
necessarily employ any specific theory from the field of human-computer interaction.
Sometimes it is the holistic sum of the features, feel, or even "vibe" that can
make a programming system appealing, inviting, and ultimately useful. So, much of
the design process also tends to be holistic in the above sense, which in retrospect
for ChucK, remains the right decision (we believe). Through this process,
we have produced several interesting paradigms and principles that are potentially
useful for audio programming, and constructed a practical language that employs
these ideas. In addition, the entirety of the programming language and its runtime
system present a new way of thinking about developing software for sound synthesis,
analysis, composition, live performance, and pedagogy. For example, Chapter 4
(On-the-fly Programming) discusses another "equivalence" in the context of writing
code live for musical performance and experimentation (see Figure 1.3).
[Figure 1.3: Another conjecture. Label: "Musical instrument"]
1.3 The ChucKian approach
A central tenet of the ChucKian solution to audio programming is to expose programmability/control
over time (at vastly different granularities) in cooperation
with a time-based concurrent programming model. This gives rise to our notion of
a "strongly-timed" audio programming language - one in which the programmer
has intimate, precise, and modular control over time as it relates to digital audio
In more concrete terms, this entails making time itself both computable and
directly controllable, at any granularity. Programmers specify the exact "pattern"
with which computation is performed in time by embedding explicit timing information
within the code. Based on this semantic, the language's runtime system
ensures properties of determinism and precision between the program and time.
Furthermore, programmers can specify concurrent code modules, each of which independently
controls its own computations over time but can also be synchronized
to other modules via time and other mechanisms (e.g., events and condition variables).
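The timing model described above can be illustrated with a small simulation. The following is a minimal sketch in Python; it is not ChucK code and not the ChucK virtual machine. It assumes that each concurrent module can be modeled as a generator that yields the duration (here in samples) by which it wants to advance its own time, and that a scheduler always resumes whichever module wakes earliest. The names `Clock`, `run`, `fast`, and `slow` are hypothetical and chosen only for this illustration.

```python
import heapq


class Clock:
    """Logical time in samples; only the scheduler advances it."""

    def __init__(self):
        self.now = 0


def run(modules, until):
    """Run generator-based modules under one logical clock.

    Each module is a function (clock, log) -> generator that yields the
    number of samples to wait before it should be resumed. Returns the
    log of (time, name) entries that the modules appended.
    """
    clock = Clock()
    log = []
    queue = []  # entries: (wake_time, spawn_order, generator)
    for order, make_module in enumerate(modules):
        heapq.heappush(queue, (0, order, make_module(clock, log)))
    while queue and queue[0][0] <= until:
        wake, order, module = heapq.heappop(queue)
        clock.now = wake  # time moves only between module steps
        try:
            wait = next(module)  # run the module until it yields a duration
        except StopIteration:
            continue  # the module finished; do not reschedule it
        heapq.heappush(queue, (wake + wait, order, module))
    return log


def fast(clock, log):
    # Hypothetical module: acts every 100 samples.
    while True:
        log.append((clock.now, "fast"))
        yield 100


def slow(clock, log):
    # Hypothetical module: acts every 250 samples.
    while True:
        log.append((clock.now, "slow"))
        yield 250
```

Calling `run([fast, slow], until=500)` records `fast` at times 0, 100, 200, 300, 400, and 500 and `slow` at times 0, 250, and 500, in the same order on every run. Because logical time advances only when a module yields a duration, the interleaving is determined by the timing written in the code rather than by how long each step takes to execute.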
In short, the design of ChucK strives to "hide the mundane aspects of programming,
and expose true control". Additionally, ChucK provides an approach
for on-the-fly programming, where the programmer is enabled and encouraged to
develop/test/prototype programs on-the-fly. This style of development has led to
applications in prototyping, teaching, and live musical performance where the audience
observes the "live code" as musical gestures.
In turn, on-the-fly programming and our interests in exploring ChucK's pedagogical
potentials have led to investigations of empowering the programmer, as well
as the observers (students, colleagues, audience) through visualization of real-time
audio programs and the act of on-the-fly programming. This motivates the Audicle
as an integrated development platform that also serves as a real-time program
monitor providing feedback to the ChucK programmer [98].
Putting these elements together, this thesis addresses ideas and investigations
at the intersection of computer science and music, of technology and art, and of
computing and the humans that interact with it.
1.4 Roadmap
In the rest of this document, we explore the history of programming languages for
sound/music (Chapter 2). We chronicle the design of ChucK, an audio programming
language, and introduce a new way of thinking about music programming, as well
as present some of its ramifications (Chapter 3). We discuss the practice of "on-the-fly
programming", a new way of rapidly prototyping for experimentation and for
live musical performance (Chapter 4). The Audicle, a graphical programming environment
for visualizing ChucK in real-time and for aiding on-the-fly programming,
is presented in Chapter 5. We then look at the various applications of ChucK in
practical contexts, including in performance ensembles such as the Princeton Laptop
Orchestra, in classrooms teaching computer science side-by-side with music and
sound synthesis/analysis, and in several other arenas (Chapter 6). The conclusion
addresses contributions and potential future directions.
Chapter 2
A History of Music and Programming
2.1 Early Eras: Before Computers
The idea of using general-purpose programmable computational automata to make
music can be traced back to as early as 1843. Ada Lovelace, while working with
Charles Babbage, wrote about the applications of the theoretical Analytical Engine,
the successor to Babbage's famous Difference Engine. The original Difference Engine
was chiefly a "calculating machine" whereas the Analytical Engine (which was
never built) was to contain mechanisms for decision and looping, both fundamental
to true programmability. Lady Lovelace rightly viewed the Analytical Engine as
a general-purpose computer, suited for "developping [sic] and tabulating any function
whatever... the engine [is] the material expression of any indefinite function
of any degree of generality and complexity." She further predicted the following:
"Supposing, for instance, that the fundamental relations of pitched sounds in the
science of harmony and of musical composition were susceptible of such expression
and adaptations, the engine might compose elaborate and scientific pieces of music
of any degree of complexity or extent."
Lady Lovelace's prediction was made more than a hundred years before the first
computer-generated sound. But semi-programmable music-making machines appeared
in various forms before the realization of a practical computer. For example,
the player piano, popularized in the early 20th century, is an augmented piano that
"plays itself" according to rolls of paper (called piano rolls) with perforations representing
the patterns to be played. These interchangeable piano rolls can be seen
as simple programs that explicitly specify musical scores.
As electronic music evolved, analog synthesizers gained popularity (around the
1960s). They supported interconnecting and interchangeable sound processing modules.
There is a level of programmability involved, and this block-based paradigm
influenced later design of digital synthesis systems. For the rest of this chapter,
however, we are going to focus on programming as specifying computations to make
sound and music.
As we step into the digital age, we divide our discussion into three overlapping
eras of programming and programming systems for music. They loosely follow
a chronological order, but more importantly each era embodies common themes
in how programmers and composers interact with the computer to make sound.
Furthermore, we should keep a few overall trends in mind. One crucial trend in
this context is that as computers increased in computational power and storage,
programming languages tended to become increasingly high-level, abstracting more
details of the underlying system. This, as we shall see, greatly impacted the evolution
of how we program music.
2.2 The Computer Age (Part I): Early Languages and Rise of MUSIC-N
Our rst era of computer-based music programming systems paralleled the age of
mainframes (the rst generations of\modern"computers in use from1950 to the late
1970s) and the beginning of personal workstations (mid 1970s).The mainframes
were gigantic,often taking up rooms or even entire oors.Early models had no
monitors or screens,programs had to be submitted via punch cards,and the results
delivered as printouts.Computing resources were severely constrained.It was
dicult even to gain access to a mainframe - they were not commodity items and
were centralized and available mostly at academic and research institutions (in 1957
the hourly cost to access a mainframe was $200!).Furthermore,the computational
speed of these early computers were many orders of magnitude (factors of millions
or more) slower than today's machines and were greatly limited in memory (e.g.192
kilobytes in 1957 compared to gigabytes today) [55,15].However,the mainframes
were the pioneering computers and the people who used them made the most of
their comparatively meager resources.Programs were carefully designed and tuned
to yield the highest eciency.
Sound generation on these machines became a practical reality with the advent
of the first digital-to-analog converters (or DACs), which converted digital audio
samples (essentially sequences of numbers) that were generated via computation
to time-varying analog voltages, which can be amplified to drive loudspeakers or be
recorded to persistent media (e.g., magnetic tape).
Figure 2.1: The IBM 360, released in 1965, with human operators.
2.2.1 MUSIC I (and II, III, ...)
The earliest programming environment for sound synthesis, called MUSIC, appeared
in 1957 [54]. It was not quite a full programming language as we might think of
today, but more of an "acoustic compiler", developed by Max Mathews at AT&T
Bell Laboratories. Not only were MUSIC (or MUSIC I, as it was later referred to)
and its early descendants the first music programming languages widely adopted
by researchers and composers, they also introduced several key concepts and ideas
which still directly influence languages and systems today.
MUSIC I and its direct descendants (typically referred to as MUSIC-N
languages), at their core, provided a model for specifying sound synthesis modules,
their connections, and time-varying control. This model eventually gave rise, in
MUSIC III, to the concept of unit generators, or UGens for short. UGens are
atomic, often predefined, building blocks for generating or processing audio signals.
In addition to audio input and/or output, a UGen may support a number of control
inputs that control parameters associated with the UGen.
An example of a UGen is an oscillator, which outputs a periodic waveform (e.g.,
a sinusoid) at a particular fundamental frequency. Such an oscillator might include
control inputs that dictate the frequency and phase of the signal being generated.
Other examples of UGens include filters, gain amplifiers, and envelope generators.
The latter, when triggered, produce amplitude contours over time. If we multiply
the output of a sine wave oscillator with that of an envelope generator, we can
produce a third audio signal: a sine wave with time-varying amplitude. In connecting
these unit generators in an ordered manner, we create a so-called instrument or
patch (the term comes from analog synthesizers that may be configured by
connecting components using patch cables), which determines the audible qualities (e.g.,
timbre) of a sound. In MUSIC-N parlance, a collection of instruments is an
orchestra. In order to use the orchestra to create music, a programmer could craft a
different type of input that contained time-stamped note sequences or control signal
changes, called a score. The relationship: the orchestra determines how sounds are
generated, whereas the score dictates (to the orchestra) what to play and when.
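As a rough illustration of these ideas, the following Python sketch (not MUSIC-N
code) multiplies an oscillator by an envelope to form an instrument, then renders a
time-stamped score with it. All names here (sine_osc, envelope, instrument, render)
are hypothetical, and the low sample rate and linear attack/release contour are
assumptions made only to keep the example small.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate, kept low so the example stays small


def sine_osc(freq, n):
    """Oscillator UGen: n samples of a sinusoid at freq Hz."""
    return [math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def envelope(n, attack=0.1):
    """Envelope UGen: linear rise over the first `attack` fraction, then linear fall."""
    a = int(n * attack)
    return [i / a if i < a else (n - i) / (n - a) for i in range(n)]


def instrument(freq, dur):
    """Instrument (patch): the oscillator output multiplied by the envelope."""
    n = int(dur * SAMPLE_RATE)
    return [s * e for s, e in zip(sine_osc(freq, n), envelope(n))]


def render(score):
    """Render a score, given as a list of (start_sec, dur_sec, freq_hz) notes."""
    total = max(int(start * SAMPLE_RATE) + int(dur * SAMPLE_RATE)
                for start, dur, _ in score)
    out = [0.0] * total
    for start, dur, freq in score:
        offset = int(start * SAMPLE_RATE)
        for i, s in enumerate(instrument(freq, dur)):
            out[offset + i] += s  # overlapping notes are mixed by summation
    return out


# The score says what to play and when; the instrument says how it sounds.
score = [(0.0, 0.5, 440.0), (0.25, 0.5, 660.0)]
audio = render(score)
```

Changing the score changes the music without touching the instrument, and swapping
the instrument changes the timbre without touching the score, which is the
separation the orchestra/score model is after.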
These two ideas - the unit generator, and the notion of an orchestra vs. a score as
programs - have been highly influential to the design of music programming systems
and, in turn, to how computer music is programmed today (but we get ahead of
ourselves).
In those early days, the programming languages themselves were implemented as
low-level assembly instructions (essentially human-readable machine code), which
effectively coupled a language to the particular hardware platform it was
implemented on. As new generations of machines (invariably each with a different set
of assembly instructions) were introduced, new languages or at least new
implementations had to be created for each architecture. After creating MUSIC I, Max
Mathews soon created MUSIC II (for the IBM 704), MUSIC III in 1959 (for the
IBM 7094), and MUSIC IV (also for the 7094, but recoded in a new assembly
language). Bell Labs shared its source code with computer music researchers at
Princeton University - which at the time also housed a 7094 - and many of the
additions to MUSIC IV were later released by Godfrey Winham and Hubert Howe
Around the same time, John Chowning, then a graduate student at Stanford
University, traveled to Bell Labs to meet Max Mathews, who gave Chowning a copy
of MUSIC IV. Copy in this instance meant a box containing about 3000 punch cards,
along with a note saying "Good luck!". John Chowning and colleagues were able to
get MUSIC IV running on a computer that shared the same storage with a second
computer that performed the digital-to-analog conversion. In doing so, they created
one of the world's earliest integrated computer music systems. Several years later,
Chowning (who had graduated by then and joined the faculty at Stanford), Andrew
Moore, and their colleagues completed a rewrite of MUSIC IV, called MUSIC 10
(named after the PDP-10 computer on which it ran), as well as a program called
SCORE (which generated note lists for MUSIC 10).
It is worthwhile to pause here and reflect on how composers had to work with
computers during this period. The composer/programmer would design their software
(usually away from the computer), create punch cards specifying the instructions,
and submit them as jobs during scheduled mainframe access time (also referred to
as batch-processing) - sometimes traveling far to reach the computing facility. The
process was extremely time-consuming. A minute of audio might take several hours
or more to compute, and turn-around times of several weeks were not uncommon.
Furthermore, there was no way to know ahead of time whether the result would
sound anything like what was intended. After a job was complete, the generated
audio would be stored on computer tape and then be digital-to-analog converted,
usually by another computer. Only then could the composer actually hear the
result. It would typically take many such iterations to complete a piece of music.
In 1968, MUSIC V broke the mold by being the first computer music programming
system to be implemented in FORTRAN, a high-level general-purpose
programming language (often considered the first). This meant MUSIC V could be
ported to any computer system that ran FORTRAN, which greatly helped both
its widespread use in the computer music community and its further development.
While MUSIC V was the last and most mature of the Max Mathews/Bell Labs
synthesis languages of the era, it endures as possibly the single most influential
computer music language. Direct descendants include MUSIC 360 (for the IBM 360)
and MUSIC 11 (for the PDP-11) by Barry Vercoe and colleagues at MIT [90], and
later cmusic by F. Richard Moore. These and other systems added much syntactic
and logical flexibility, but at heart remained true to the principles of MUSIC-N
languages: connection of unit generators, and the separate treatment of sound
synthesis (orchestra) and musical organization (score). Less obviously, MUSIC V also
provided the model for many later computer music programming languages and
2.2.2 The CARL System (or "UNIX for Music")
The 1970s and 80s witnessed sweeping revolutions to the world of computing. The
C programming language, one of the most popular in use, was developed in 1972.
The 70s was also a decade of maturation for the modern operating system, which
includes time-sharing of central resources (e.g., CPU time and memory) by multiple
users, the factoring of runtime functionalities between a privileged kernel mode
vs. a more protected user mode, as well as clear process boundaries that protect
applications from each other. From the ashes of the titanic Multics operating system
project arose the simpler and more practical UNIX, with support for multi-tasking of
programs, multi-user, inter-process communication, and a sizable collection of small
programs that can be invoked and interconnected from a command line prompt.
Eventually implemented in the C language, UNIX can be ported with relative ease
to any new hardware platform for which there is a C compiler.
Building on the ideas championed by UNIX, F. Richard Moore, Gareth Loy,
and others at the Computer Audio Research Laboratory (CARL) at University of
California at San Diego developed and distributed an open-source, portable system
for signal processing and music synthesis, called the CARL System [52, 63]. Unlike
previous computer music systems, CARL was not a single piece of software, but
a collection of small, command line programs that could send data to each other.
The "distributed" approach was modeled after UNIX and its collection of
interconnectible programs, primarily for text-processing. As in UNIX, a CARL process
(a running instance of a program) can send its output to another process via the
pipe (|), except instead of text, CARL processes send and receive audio data (as
sequences of floating point samples, called floatsam). For example, the command:
> wave -waveform sine -frequency 440Hz | spect
invokes the wave program, and generates a sine wave at 440Hz, which is then
"piped" (|) to the spect program, a spectrum analyzer. In addition to audio data,
CARL programs could send side-channel information, which allowed potentially
global parameters (such as sample rate) to propagate through the system. Complex
tasks could be scripted as sequences of commands. The CARL System was
implemented in the C programming language, which ensured a large degree of portability
between generations of hardware. Additionally, the CARL framework was
straightforward to extend - one could implement a C program that adhered to the CARL
application programming interface (or API) in terms of data input/output. The
resulting program could then be added to the collection and be available for
immediate use.
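The pipeline idea can be sketched in Python with generators standing in for
processes. This is only an analogy and not CARL code: the functions below borrow
the names wave and spect from the command above, but their signatures, the fixed
sample rate, and the naive DFT used for analysis are all assumptions.

```python
import cmath
import math

SAMPLE_RATE = 8000  # assumed; CARL would propagate this as side-channel information


def wave(frequency, count):
    """Producer stage: yield `count` floating-point samples of a sine wave."""
    for i in range(count):
        yield math.sin(2.0 * math.pi * frequency * i / SAMPLE_RATE)


def spect(samples):
    """Consumer stage: return the frequency (Hz) of the strongest DFT bin."""
    x = list(samples)
    n = len(x)

    def magnitude(k):
        # Naive DFT of bin k; fine for a short example, far too slow for real use.
        return abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))

    peak = max(range(1, n // 2), key=magnitude)
    return peak * SAMPLE_RATE / n


# Loosely analogous to: wave -waveform sine -frequency 440Hz | spect
peak_hz = spect(wave(440.0, 800))
```

Each stage only agrees on the stream format (a sequence of floating point samples),
so a new stage can be written and dropped into the chain without changing the
others.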
In a sense, CARL approached the idea of digital music synthesis from a
divide-and-conquer perspective. Instead of a monolithic program, it provided a flat
hierarchy of small software tools. The system attracted a wide range of composers and
computer music researchers who used CARL to write music and contributed to its
development. Gareth Loy implemented packages for FFT (Fast Fourier Transform)
analysis, reverberation, spatialization, and a music programming language named
Player. Richard Moore contributed the cmusic programming language. Mark
Dolson contributed programs for phase vocoding, pitch detection, sample-rate
conversion, and more. Julius O. Smith developed a package for filter design and a general
filter program. Over time, the CARL Software Distribution consisted of over one
hundred programs. While the system was modular and flexible for many audio
tasks, the architecture was not intended for real-time use. Perhaps mainly for this
reason, the CARL System is no longer widely used in its entirety. However, thanks
to the portability of C and to the fact CARL was open source, much of the
implementation has made its way into countless other digital audio environments and
2.2.3 Cmix, CLM, Csound
Around the same time, the popularity and portability of C gave rise to another
unique programming system: Paul Lansky's Cmix [67, 46]. Cmix wasn't directly
descended from MUSIC-N languages; in fact it's not a programming language, but
a C library of useful signal processing and sound manipulation routines, unified by
a well-defined API. Lansky authored the initial implementation in the mid-1980s to
flexibly mix sound files (hence the name Cmix) at arbitrary points. It was partly
intended to alleviate the inflexibility and large turnaround time for synthesis via
batch processing. Over time, many more signal processing directives and macros
were added. With Cmix, programmers could incorporate sound processing
functionalities into their own C programs for sound synthesis. Additionally, a score could
be specified in the Cmix scoring language, called MINC (which stands for "MINC
Is Not C!"). MINC's syntax resembled that of C and proved to be one of the most
powerful scoring tools of the era, due to its support for control structures (such as
loops). Cmix is still distributed and widely used today, primarily in the form of
RTCmix (the RT stands for real-time), an extension developed by Brad Garton and
David Topper [32].
Common Lisp Music (or CLM) is a sound synthesis language written by Bill
Schottstaedt at Stanford University in the late 1980s [75]. CLM descends from
the MUSIC-N family and employs a Lisp-based syntax for defining the instruments
and score and provides a collection of functions that create and manipulate sound.
Due to the recursive nature of Lisp (which stands for List Processing),
many hierarchical musical structures turned out to be straightforward to represent
using code. A more recent (and very powerful) Lisp-based programming language
is Nyquist [24], authored by Roger Dannenberg. (Nyquist is discussed below; both
CLM and Nyquist are freely available.)
Today, the most widely used direct descendant of MUSIC-N is Csound, originally
authored by Barry Vercoe and colleagues at the MIT Media Lab in the late 1980s
[89, 7, 91]. It supports unit generators as opcodes, objects that generate or process
audio. It embraces the instrument vs. score paradigm: the instruments are defined in
orchestra (.orc) files, while the score is in .sco files. Furthermore, Csound supports
the notion of separate audio and control rates. The audio rate (synonymous with sample
rate) refers to the rate at which audio samples are processed through the system. On
the other hand, control rate dictates how frequently control signals are calculated
and propagated through the system. In other words, audio rate (abbreviated as ar
in Csound) is associated with sound, whereas control rate (abbreviated as kr) deals
with signals that control sound (e.g., changing the center frequency of a resonant
filter or the frequency of an oscillator). The audio rate is typically higher (for
instance 44100 Hz for CD quality audio) than the control rate, which usually is
adjusted to be lower by at least an order of magnitude. The chief reason for this
separation is computational efficiency. Audio must be computed sample-for-sample
at the desired sample rate. However, for a great majority of synthesis tasks, it
makes little perceptual difference if control is asserted at a lower rate, say on the
order of 2000 Hz. This notion of audio rate vs. control rate is widely adopted across
nearly all synthesis systems.
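A minimal Python sketch of this split follows; it is not Csound code. The constant
KSMPS (named after Csound's ksmps, the number of audio samples per control period)
and the function glide are illustrative assumptions. The oscillator's frequency is
recomputed once per control period, while the waveform is computed for every sample.

```python
import math

SAMPLE_RATE = 44100                  # audio rate, in samples per second
KSMPS = 32                           # audio samples per control period (assumed value)
CONTROL_RATE = SAMPLE_RATE / KSMPS   # about 1378 control updates per second


def glide(start_hz, end_hz, n_samples):
    """Sine oscillator whose frequency is a control-rate signal.

    Returns the audio samples and the number of control updates performed.
    """
    out = []
    phase = 0.0
    freq = start_hz
    control_updates = 0
    for i in range(n_samples):
        if i % KSMPS == 0:
            # Control-rate work: runs once per block of KSMPS samples.
            freq = start_hz + (end_hz - start_hz) * i / n_samples
            control_updates += 1
        # Audio-rate work: runs for every sample.
        out.append(math.sin(phase))
        phase += 2.0 * math.pi * freq / SAMPLE_RATE
    return out, control_updates


samples, updates = glide(220.0, 440.0, 4410)
```

Here 4410 audio samples require only 138 evaluations of the control expression. The
saving grows with the cost of the control computation, at the price of the frequency
moving in small steps rather than continuously.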
This rst era of computer music programming pioneered how composers could
interact with the digital computer to specify and generate music.Its mode of work-
ing was associated with the diculties of early mainframes:oine programming,
submitting batch jobs,waiting for audio to generate,and transferring to persis-
tent media for playback or preservation.It paralleled developments in computers
as well as general-purpose programming languages.We examined the earliest mu-
sic languages in the MUSIC-N family as well as some direct descendants.It is
worth noting that several of the languages discussed in this section have since been
augmented with real-time capabilities.In addition to RTCMix,Csound now also
supports real-time audio.
2.3 The Computer Age (Part II): Real-time Systems and New Approaches
This second era of computer programming for music partially overlaps with the first.
The chief difference is that the mode of interaction moved from offline programming
and batch processing to real-time sound synthesis systems, often controlled by
external musical controllers. By the early 1980s, computers had become fast enough
and small enough to allow desktop workstations to outperform the older, gargantuan
mainframes. As personal computers began to proliferate, so did new programming
tools and applications for music generation.
2.3.1 Graphical Music Programming: Max/MSP + Pure Data
We now arrive at one of the most popular computer music programming
environments to this day: Max and later Max/MSP [68]. Miller S. Puckette implemented
the first version of Max (when it was called Patcher) at IRCAM in Paris in the
mid-1980s as a programming environment for making interactive computer music.
At this stage, the program did not generate or process audio samples; its primary
purpose was to provide a graphical representation for routing and manipulating
signals for controlling external sound synthesis workstations in real-time. Eventually,
Max evolved at IRCAM to take advantage of DSP hardware on NeXT computers
(as Max/FTS, where FTS stands for "faster than sound"), and was later released in 1990
as a commercial product by Opcode Systems as Max/Opcode. In 1996, Puckette
released a completely redesigned and open source environment called Pure Data
(PD) [69]. At the time, Pure Data processed audio data whereas Max was
primarily designed for control (MIDI). PD's audio signal processing capabilities then
made their way into Max as a major add-on called MSP (MSP either stands for
Max Signal Processing or for Miller S. Puckette), authored by Dave Zicarelli. Cycling
'74, Zicarelli's company, distributes the current commercial version of Max/MSP.
Meanwhile, IRCAM currently maintains jMax [27] as a freely available, new
implementation of the original Max software.
The modern-day Max/MSP supports a graphical patching environment and a
collection containing thousands of objects, ranging from signal generators, to filters,
to operators, and user interface elements. Using the Max import API, third party
developers can implement external objects as extensions to the environment.
Despite its graphical approach, Max descends from MUSIC V (in fact Max is named
after the father of MUSIC-N, Max Mathews) and embodies a similarly modular
approach to sound synthesis. A simple Max/MSP example is shown in Figure 2.2.
Max oers two modes of operation.In edit mode,a user can create objects,
represented by on-screen boxes containing the object type as well as any initial
arguments.An important distinction is made between objects that generate or pro-
cess audio and control rate objects (the presence of a ~ at the end of the object
name implies audio rate).The user can then interconnect objects by creating con-
nections from the outlets of certain objects to the inlets of others.Depending on
its type,an object may support a number of inlets,each of which is well-dened
in its interpretation of the incoming signal.Max also provides dozens of additional
widgets,including message boxes,sliders,graphs,knobs,buttons,sequencers,and
meters.Events can be manually generated by a bang widget.All of these widgets
can be connected to and from other objects.When Max is in run mode,the patch
topology is xed and cannot be modied,but the various on-screen widgets can be
manipulated interactively.This highlights a wonderful duality:a Max patch is at
once a program and (potentially) a graphical user interface.
Max/MSP has been an extremely popular programming environment for
real-time synthesis, particularly for building interactive performance systems.
Controllers both commodity (MIDI devices) and custom as well as sensors (such as
motion tracking) can be mapped to sound synthesis parameters using Max/MSP.
The visual aspect of the environment lends itself well to monitoring and fine-tuning
patches. Max/MSP can be used to render sequences or scores, though due to the
lack of detailed timing constructs (the graphical paradigm is better at representing
what than when), this can be less straightforward.
Figure 2.2: A simple Max/MSP patch which synthesizes the vowel ahh.
2.3.2 Programming Libraries for Sound Synthesis
So far, we have discussed mostly standalone programming environments, each of
which provides a specialized language syntax and semantics. In contrast to such
languages or environments, a library provides a set of specialized functionalities for
an existing, possibly more general-purpose language. For example, the Synthesis
Toolkit (STK) [20] is a collection of building blocks for real-time sound synthesis
and physical modeling, for the C++ programming language. STK was authored and
released by Perry Cook in the early 1990s, with later contributions by Bill Butnam
and Gary Scavone. JSyn [11], released around the same time, is a collection of
real-time sound synthesis objects for the Java programming language. In each case, the
library provides an API, with which a programmer can write synthesis programs
in the host language (e.g., C++ and Java). For example, STK provides an object
definition called Mandolin, which is a physical model for a plucked string instrument.
It defines the data types which internally comprise such an object, as well as publicly
accessible functionalities that can be invoked to control the Mandolin's parameters
in real-time (e.g., frequency, pluck position, instrument body size, etc.). Using
this definition, the programmer can create instances of Mandolin, control their
characteristics via code, and generate audio from the Mandolin instances in
real-time. While the host languages are not specifically designed for sound, these libraries
allow the programmer to take advantage of language features and existing libraries
(of which there is a huge variety for C++ and Java). This also allows integration
with C++ and Java applications that desire real-time sound synthesis.
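To make the library approach concrete, here is a small Python sketch of what such an
object might look like from the programmer's side. It is not STK's API and not its
Mandolin model: the class PluckedString and its methods are hypothetical, and a
simple Karplus-Strong style delay line is swapped in as the string model purely for
illustration.

```python
import random


class PluckedString:
    """Hypothetical plucked-string object in the style of a synthesis library class."""

    def __init__(self, sample_rate=44100, seed=0):
        self.sample_rate = sample_rate
        self.damping = 0.996            # per-sample loss; lower values decay faster
        self._rng = random.Random(seed)
        self._delay = [0.0, 0.0]
        self._pos = 0

    def set_frequency(self, hz):
        """Control parameter: the delay-line length sets the pitch."""
        self._delay = [0.0] * max(2, round(self.sample_rate / hz))
        self._pos = 0

    def pluck(self, amplitude=1.0):
        """Excite the string by filling the delay line with noise."""
        self._delay = [amplitude * self._rng.uniform(-1.0, 1.0) for _ in self._delay]

    def tick(self):
        """Compute and return the next audio sample."""
        nxt = (self._pos + 1) % len(self._delay)
        out = self._delay[self._pos]
        # Averaging two neighbours acts as a gentle low-pass filter.
        self._delay[self._pos] = self.damping * 0.5 * (out + self._delay[nxt])
        self._pos = nxt
        return out


# Host-language code creates an instance, sets parameters, and pulls samples.
string = PluckedString()
string.set_frequency(441.0)
string.pluck(0.8)
samples = [string.tick() for _ in range(44100)]
```

Because the object is ordinary host-language code, it can be stored in data
structures, driven from loops and conditionals, or embedded in a larger application,
which is the main appeal of the library approach.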
2.3.3 Nyquist
Nyquist is an interactive programming language based on Lisp for sound synthesis
and music composition [24, 23, 83], and is a culmination of ideas explored in
earlier systems such as Arctic [26] and Canon [22]. While adopting familiar elements
of audio programming found in earlier MUSIC-N languages, Nyquist (along with
SuperCollider, discussed below) is among the first music composition and sound
synthesis languages to remove the distinction between the "orchestra" (sound
synthesis) and the "score" (musical events): both can be implemented in the same
framework. This tighter integration allows both synthesis and musical entities to
be specified using a shared "mindset", favoring the high customizability of code
over the ease and simplicity of data (e.g., note lists).
In Nyquist, the composer specifies sound, transformations, and music by
combining expressions, leveraging both audio building blocks as well as the full array
of features in the general purpose Lisp language and environment. Additionally,
Nyquist supports a wide array of advanced ideas. These include "behavioral
abstraction", which allows programmers to specify appropriate underlying behaviors
in different contexts while maintaining a unified high-level interface. Nyquist also
supports the ability to work in both quantitative and perceptual attack times, as
well as an advanced abstract time-warping of compound events [23]. At a more
basic level, Nyquist offers temporal operators such as sim (for simultaneous signals
and events) and seq (for sequential evaluation).
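The flavor of these two operators can be sketched in Python. This is a
simplification and not Nyquist code: sounds are modeled here as plain lists of
samples, whereas Nyquist's operators work on behaviors evaluated in a temporal
context. The function names simply mirror the operators mentioned above.

```python
def seq(*sounds):
    """Sequential combination: each sound starts when the previous one ends."""
    out = []
    for s in sounds:
        out.extend(s)
    return out


def sim(*sounds):
    """Simultaneous combination: all sounds start together and are summed."""
    out = [0.0] * max((len(s) for s in sounds), default=0)
    for s in sounds:
        for i, x in enumerate(s):
            out[i] += x
    return out


a = [1.0, 1.0]
b = [0.5, 0.5, 0.5]
melody = seq(a, b)         # [1.0, 1.0, 0.5, 0.5, 0.5]
chord = sim(a, b)          # [1.5, 1.5, 0.5]
both = sim(seq(a, b), b)   # [1.5, 1.5, 1.0, 0.5, 0.5]
```

Since both operators take sounds and return a sound, they nest freely, so the same
expression language describes a single note, a chord, and a whole piece.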
While Nyquist is not a real-time programming environment (it is interactive), it
provides a powerful and intrinsically different way of thinking about programming
audio and composing music. Nyquist is in wide use today, including as the primary
plug-in programming engine in the open source audio editor Audacity [57].
2.3.4 SuperCollider
SuperCollider is a text-based audio synthesis language and environment [58, 59].
It is highly powerful as a programming language, and the implementation of the
synthesis engine is highly optimized. It combines many of the previous ideas in
computer music language design while making some fundamental changes and additions.
SuperCollider, like languages before it, supports the notion of unit generators for
signal processing (audio and control). However, like Nyquist, there is no longer a
distinction between the orchestra and score. Furthermore, the language, which in
part resembles the Smalltalk and C programming languages, is object-oriented and
provides a wide array of expressive programming constructs for sound synthesis
and user interface programming. This makes SuperCollider suitable not only for
implementing synthesis programs, but also for building large interactive systems for
sound synthesis, algorithmic composition, and for audio research.
At the time of this writing, there have been three major version changes in
SuperCollider. The third and latest (often abbreviated SC3) makes an explicit distinction
between the language (front-end) and synthesis engine (back-end). These loosely
coupled components communicate via OpenSoundControl (OSC), a standard for
sending control messages for sound over the network. One immediate impact of this
new architecture is that programmers can essentially use any front-end language, as
long as it conforms to the protocol required by the synthesis server (called scsynth
in SuperCollider).
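As a sketch of what "conforming to the protocol" involves, the Python code below
builds the bytes of a single OSC message by hand, following the OSC 1.0 encoding
(strings null-terminated and padded to four bytes, a type tag string, big-endian
numbers). The helper names osc_string and osc_message are hypothetical. The /s_new
command, its argument order, and the "default" synth definition reflect our
understanding of the scsynth command set and should be checked against the server's
command reference before use.

```python
import struct


def osc_string(text):
    """Encode an OSC string: ASCII bytes, null-terminated, padded to a multiple of 4."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *args):
    """Encode one OSC message whose arguments are ints, floats, or strings."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)   # 32-bit big-endian integer
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)   # 32-bit big-endian float
        elif isinstance(arg, str):
            tags += "s"
            payload += osc_string(arg)
        else:
            raise TypeError("unsupported OSC argument: %r" % (arg,))
    return osc_string(address) + osc_string(tags) + payload


# Assumed request: create a synth from a definition name, with a node ID,
# an add action, a target node, and one control name/value pair.
packet = osc_message("/s_new", "default", 1000, 0, 0, "freq", 440.0)
```

Any language that can produce such packets and send them to the server (typically
over UDP) can act as a front-end, which is why so many live coding systems use a
synthesis server as their sound engine.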
Figure 2.3: SuperCollider programming environment in action.
2.3.5 Graphical vs. Text-based
It is worthwhile to pause here and reflect on the differences between the graphical
programming environments of Max/MSP and PD vs. the text-based languages and
libraries such as Csound, SuperCollider, STK, and Nyquist (as well as ChucK).
The visual representation presents the data flow directly, in a
what-you-see-is-what-you-get sort of way. Text-based systems lack this representation, and
understanding of the syntax and semantics is required to make sense of the programs.
However, many tasks, such as specifying complex logical behavior, are more easily
expressed in text-based code.
Ultimately it's important to keep in mind that most synthesis and musical tasks
can be implemented in any of these languages. This is the idea of universality:
two constructs (or languages) can be considered equivalent if we can emulate the
behavior of one using the other, and vice versa. However, certain types of tasks may
be more easily specified in a particular language than in others. This brings us back
to the idea of the programming language as a tool, and perhaps more importantly,
as a way of thinking. In general, a tool is useful if it does at least one thing better
than any other tool (for example, a hammer or a screwdriver). Computer music
programming languages are by necessity more general, but differing paradigms (and
syntaxes) lend themselves to different tasks (and no single environment "does it
best" in every aspect: it's important to choose the right tools for the tasks at hand).
In the end, it's also a matter of personal preference - some like the directness of
graphical languages whereas others prefer the feel and expressiveness of text-based
code. It's often a combination of choosing the right tool for the task and finding
what the programmer is comfortable working in.
2.3.6 Additional Music Programming Languages
Formula, short for Forth Music Language, is a programming language for computing
control signals to synthesizers based on concurrent processes operating in a unified
environment that can be scheduled at runtime [3]. Various processes can be specified to
compute pitch sequences as well as control for parameters such as volume, duration,
and articulation. Unlike the languages discussed above, Formula computes control
signals and does not directly generate or synthesize audio.
Haskore is a set of modules in the Haskell programming language created for
expressing musical structures in a high-level declarative style of functional
programming. Like Formula, it is more of a language for describing music (in Haskore's case,
mostly Western music), not sound [38]. An advantage of Haskell (and by extension,
Haskore) is that objects in the language simultaneously represent abstract (musical)
ideas as well as their concrete representation, leading to provable properties that
can be reasoned about as a result of the programming system.
Formes provides an object-oriented, hierarchical event-based approach to dealing
with time [16] and is based on the Lisp programming language. It was not designed
to compute audio directly but rather time-oriented control information for the Chant
synthesis system.
2.4 The Computer Age (Part III): New Language
With the growth of low-cost, high performance computers, the real-time and
interactive music programming paradigms are more alive than ever and expanding
with the continued invention and refinement of new interfaces for musical
expression. Alongside the continuing trend of explosive growth in computing power is
the desire to find new ways to leverage programming for real-time interaction. If
the second era of programming and music evolved from computers becoming
commodities, then this third era is the result of programming itself becoming pervasive.
With the ubiquity of hardware and the explosion of new high-level general-purpose
programming tools (and people willing to use them), more composers and musicians
are crafting not only software to create music, but also new software to create newer
and more custom software for music.
As part of this new age of exploration, a recent movement has been taking shape.
This is the rise of dynamic languages and consequently of using the act of
programming itself as a musical instrument. This, in a way, can be seen as a subsidiary
of real-time interaction, but with respect to programming music, this idea is
fundamentally powerful. For the first time in history, we have commodity computing
machines that can generate sound and music in real-time (and in abundance) from
our program specifications. One of the areas investigated in our third age of
programming and music is the possibility of changing the program itself in real-time
as it's running. Given the infinite expressiveness of programming languages, might
we not leverage code to create music on-the-fly?
The idea of run-time modification of programs to make music (interchangeably
called live coding, on-the-fly programming, interactive programming) is not an
entirely new one. As early as the beginning of the 80s, researchers such as Ron Kuivila
and groups like the Hub experimented with runtime modifiable music systems.
The Hierarchical Music Scoring Language (HMSL) is a Forth-based language,
authored by Larry Polansky, Phil Burk, and others in the 1980s, whose stack-based
syntax encourages runtime programming [12]. These are the forerunners of live
coding. The fast computers of today enable an additional key component: real-time
sound synthesis.
2.4.1 Custom Music Programming Software
An incredibly vibrant and wonderful aspect of the era is the proliferation of
custom, "home-brew" sound programming software. The explosion of new high-level,
general-purpose programming platforms has enabled and encouraged programmers
and composers to build systems very much tailored to their liking. Alex McLean
performs via live coding using the high-level scripting language Perl [60], while
developers such as Andrew Sorensen and Andrew Brown have recently explored live
coding environments based on Scheme [10]. Similar frameworks have been
developed in Python, various dialects of Lisp, Forth, Ruby, and others. Some systems
make sound while others visualize it. Many systems send network messages (in
OpenSoundControl) to synthesis engines such as SuperCollider Server, PD, Max,
and ChucK. In this way, musicians and composers can leverage the expressiveness of
the front-end language to make music while gaining the functionalities of synthesis
languages. Many descriptions of systems and ideas can be found through TOPLAP
(which usually stands for Transnational Organisation for the Proliferation of Live
Audio Programming), a collective of programmers, composers, and performers
exploring live programming to create music [82].
This third era is promising because it enables and encourages new compositional
and performance possibilities not only for professional musicians, researchers, and
academics, but also for anyone willing to learn and explore programming and music.
Indeed, the homebrew aesthetic has encouraged personal empowerment and artistic
independence from established traditions and trends. Also, the new dynamic
environments for programming are changing how we approach more traditional
computer music composition by providing more rapid experimentation and more
immediate feedback. This era is young but growing rapidly, and the possibilities are
truly fascinating. Where will it take programming and music in the future?
2.5 Synchronous Reactive Systems
In addition to the realm of audio and music programming, it is worthwhile to
provide context for this work with respect to synchronous languages for reactive
systems. A reactive system maintains an ongoing interaction with its environment,
rather than (or in addition to) producing a final value upon termination [35, 34].
Typical examples include air traffic control systems and control kernels for
mechanical devices such as wristwatches, trains, and even nuclear power plants. These
systems must react to their environment at the environment's speed. They differ
from transformational systems, which emphasize data computation instead of the
ongoing interaction between systems and their environments; and from interactive
systems, which influence and react to their environments at their own rate (e.g.,
web browsers).
In synchronous languages for reactive systems, a synchrony hypothesis states
that computations happen infinitely fast, allowing events to be considered atomic
and truly synchronous. This affords a type of logical determinism that is an
essential aspect of a reactive system (which should produce the same output at the same
points in time, given the same input), and reconciles determinism with the
modularity and expressiveness of concurrency. Such determinism can lead to programs that
are significantly easier to specify, debug, and analyze compared to non-deterministic
ones, for example those written in more "classical" concurrent languages [36, 1].
Several programming languages have embodied this synchrony hypothesis.
Esterel [6] is a textual imperative language, suitable for specifying small
control-intensive tasks (such as many finite state machines and cases where it's
beneficial to emit signals through a system with zero logical delay). In Esterel, the
notion of time is replaced by that of order, and an Esterel program describes
reactions to events that are deterministically and totally ordered in a sequence of
logical instants. Two events happening at the same logical instant are said to occur
simultaneously (otherwise, they occur in sequence). Communication in Esterel is
carried out exclusively via signals, which can be broadcast with zero delay (i.e.,
visible to other parts of the system at the same logical instant). Esterel provides a
deterministic method to model concurrent low-level control processes and is commonly
used in various control systems and embedded devices, being amenable to
implementation in software as well as in hardware.
Other synchronous languages include LUSTRE [13], a declarative dataflow
language for reactive systems, and SIGNAL [33]. All of these, including Esterel, are
designed for specifying and/or describing low-level reactive systems. As such, they
are not "complete" languages - the programs they specify are meant to be integrated
into more complex host systems implemented via other means.
2.5.1 ChucK and Synchronous Languages
While it wasn't necessarily designed as such, ChucK possesses aspects of
synchronous languages for reactive systems. Computation is assumed to happen
infinitely fast with respect to logical ChucKian time, which can only be advanced as
explicitly requested and allowed by the program. Compared to the existing
synchronous languages for reactive systems, ChucK most closely resembles Esterel -
they are both imperative and, each in its own way, enforce the synchrony
hypothesis. Yet, there are some important differences.
Esterel and other synchronous languages are designed for specifying minimal
reactive kernels, often for integration into more complex host systems. ChucK, on
the other hand, is meant to provide a unifying solution for specifying and developing
an entire system, including a deterministic kernel of control as well as constructs to
build complex components and algorithms. It combines high-level, general-purpose,
audio-centric programmability with the intrinsically low-level benefits offered by the
synchrony hypothesis. Additionally, ChucK is highly dynamic, allowing on-the-fly
creation of high-level objects and processes.
Existing synchronous languages emphasize reaction, whereas ChucK's design
goals and programming style are intended to be reactive as well as proactive and
interactive. The ChucK programming model offers events and signals (which are
reactive), as well as the ability to specify concurrent processes that move themselves
through logical time, to both control and define the system. This encourages
a fundamentally different, proactive mentality when programming. Additionally,
ChucK presents a highly visible and centralized view of logical time (via the now
keyword) that reconciles logical time with real-time. This mechanism
deterministically couples and interleaves user computation with audio computation,
providing a continuous notion of time mapped explicitly to the audio stream (see
Chapter 3 for discussions, examples, and analysis).
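As a minimal sketch of the point that computation takes no logical time, the following ChucK fragment reads now before and after a block of ordinary computation, and then again after explicitly advancing time. The variable names (start, sum) and the loop bound are illustrative choices, not taken from the text; samp and ms are assumed to be the built-in duration units of the standard ChucK distribution.

```chuck
// remember the logical time at which this code starts
now => time start;

// ordinary computation: it costs wall-clock time, but no logical time
0 => int sum;
for( 0 => int i; i < 100000; i++ )
{
    i +=> sum;
}

// logical time has not moved: the difference is zero samples
<<< "elapsed after loop (samples):", (now - start) / samp >>>;

// time advances only when the program explicitly asks for it
250::ms => now;

// now the difference is exactly the requested duration
<<< "elapsed after advance (ms):", (now - start) / ms >>>;
```

However long the loop takes on a given machine, the first reported value stays at zero, which is the sense in which ChucK enforces the synchrony hypothesis with respect to its own logical time.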
Finally, Esterel is meant to facilitate verification to ensure that the synchrony
hypothesis can be reasonably approximated in a practical real-time (often
mission-critical) system. ChucK leverages the determinism for program specification,
debugging, and analysis, but is less concerned with absolute real-time performance,
and more with the determinism bridging code, audio computations, and the ongoing
output. For example, even when a ChucK audio synthesis program cannot keep up
in real-time on a particular machine, the computed audio samples are still
guaranteed to be accurate and interruption-free (e.g., if written to file, and in the
absence of asynchronous input). In other words, ChucK can either assume the computer
to be infinitely fast, or alternately relax the real-time constraint while maintaining
output integrity. In this latter sense, ChucK can also be used in more transformational
In this context, ChucK presents a synchronous language that simultaneously
embodies elements of reactive, transformational, and interactive systems. Moreover,
it embodies a different way of thinking about writing synchronous code that is
inextricably related to time and audio. In the upcoming chapters, we shall explore
these and other mechanisms and properties of ChucK.
2.6 Future Directions
What does the future hold for programming and music? Just as Lady Ada Lovelace
foresaw the computing machine as a programming tool for creating precise, arbitrarily
complex, and "scientific" music, what might we imagine about the ways music will
be made decades from now and beyond?
Several themes and trends pervade the development of programming languages
and systems for music. The movement towards increasingly real-time, dynamic,
and networked programming of sound and music continues; it has been
taking place in parallel with the proliferation and geometric growth of commodity
computing resources until recent times. New trends are emerging. At the time of
this writing (between 2006 and 2008), the focus in the design of commodity machines
is shifting to distributed, multi-core processing units. We may soon have machines
with hundreds (or many more) cores as part of a single computer. How might these
potentially massive parallel architectures impact the way we think about and
program software (in everything from commercial data-processing to sound synthesis to
musical performance)? What new programming paradigms will have to be invented
to take advantage of these and other new computing technologies, such as quantum
computers and (for now) theoretical computing machines? An equally essential
question: how can we better make use of the machines we have?
Finally, let's think back to Ada Lovelace's question from the beginning of this
chapter, and ponder the following: "Supposing, for instance, that the engine were
susceptible of such expression and adaptations, might not the human compose
elaborate and scientific pieces of music of any degree of complexity or extent?"
It's always the right time to imagine what new possibilities await us.
Chapter 3
3.1 Language Design
This chapter presents the design of the ChucK programming language, its primary
features, and its instantiation in the form of the language specification. In the
context of this design, the chapter then addresses the implementation of the language,
as well as some useful properties it offers.
3.1.1 Two Observations
As we formulate the problem statement, we observe two important commonalities
that pervade the gamut of audio programming. The first is that time is intimately
connected with sound and central to how sound and music programs are created
and reasoned about. This may seem like an obvious point, as sound and music
are intrinsically time-based. Yet we also feel that control over time in programming
languages is often under-represented (or sometimes over-abstracted). Low-level
languages like C/C++ and Java have no inherent notion of time; they allow
data types to be built to represent time, but these can be cumbersome to implement
and use. High-level computer music languages tend to abstract time too much,
often embodying a more declarative style and connecting things in a way that assumes
quasi-parallel modules (e.g., similar to analog synthesizers) while hiding much of the
internal scheduling. Timing is also typically broken up into two or more distinct,
internally maintained rates (e.g., audio rate and control rate, the latter often
arbitrarily determined for the programmer). The main problem with these existing
types of programming models is that the programmer knows what, but does not
always know when.
The second observation is two-fold: 1) sound and music are often the
simultaneity of many parallel processes, and thus a programming language for
music/sound can fundamentally benefit from a concurrent programming model that easily
and flexibly captures parallel processes and their interactions; 2) the ability to
program parallelism must be intimately connected to time, and yet it must be provided
in such a way that it operates independently of time. In other words, this
functionality must be "orthogonal" to time to provide the maximal degree of freedom and
From these two observations, the ChucKian insight is to expose
programmability/control over time (at various granularities) in cooperation with a
time-based concurrent programming model. In particular, this entails making time
itself both computable and directly controllable at any granularity. Programmers
specify the algorithm, logic, or pattern according to which computation is performed
in time by embedding explicit timing information within the code. Based on this
framework, the language's runtime system ensures properties of determinism and
precision between the program and time. Furthermore, programmers can specify
concurrent code modules, each of which can independently control its own computations
over time and can also be synchronized to other modules via time and other
mechanisms (e.g., events).
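To make the idea of concurrent modules that each control their own timing more concrete, here is a small sketch in ChucK. It assumes the spork ~ syntax for starting a concurrent process (a "shred"), which this section has not yet introduced; the function name tick and the two periods are hypothetical choices for illustration.

```chuck
// each concurrent module loops at its own rate by advancing time itself
fun void tick( string name, dur period )
{
    while( true )
    {
        // report the shared logical time, in milliseconds
        <<< name, "at", now / ms, "ms" >>>;
        // move this module (and only this module) forward in time
        period => now;
    }
}

// start two modules with different, independently chosen rates
spork ~ tick( "fast", 100::ms );
spork ~ tick( "slow", 250::ms );

// the parent advances one second, letting both modules run;
// when the parent finishes, its child modules are removed
1::second => now;
```

Both modules read the same logical clock, so their relative ordering is determined by the timing written in the code rather than by how the operating system happens to schedule them.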
3.1.2 Design Goals
Based on the observations in the preceding section, a set of core language design
goals can be summarized as follows:
• Flexibility: allow the programmer to naturally express ideas in code, and to
flexibly create, edit, and maintain audio programs.
• Time: allow the programmer to program the passage of time, and to control
and reason about time with precision and across a wide range of temporal
• Concurrency: allow the programmer to write parallel modules that share
both data and time, and that can be precisely synchronized; provide a
deterministic concurrent programming model for audio, minimizing the hassle and
complexity of (preemptive) concurrent programming by taking advantage of
time and events in the language.
• Readability: provide/maintain a strong correspondence between code
structure and timing.
• A do-it-yourself language: combine the expressiveness of lower-level
languages and the ease of high-level computer music languages. Support
high-level musical concepts, precise low-level timing, and the creation of
"white-box" unit generators, all directly in the language.
• Rapid prototyping: allow programs to be created and edited as they run,
for rapid experimentation, pedagogy, and live performance.
• Pedagogy: make audio programming more accessible; an observation is that
many people are willing to (learn to) program in order to make music,
presenting an opportunity to teach programming more effectively (possibly to people
who would otherwise never learn to program). Conversely, the clarity and
logic of a programming language can help teach computer music concepts.
Figure 3.1: Relative emphasis between three design goals.
In terms of the focal points of the design (Figure 3.1), top priority is given
to flexibility and readability. While performance (in the sense of computational
throughput) is a highly important consideration, it is not our top priority. We
design the language to provide maximal control for the programmer, and tailor the
system performance around the design.
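The concurrency goal listed above mentions synchronizing parallel modules through time and events. The following is a minimal sketch of that idea, assuming the Event class, spork ~, and me.yield() from the standard ChucK distribution (none of which this section has introduced yet); the names go and waiter are hypothetical.

```chuck
// a shared event that several modules can wait on
Event go;

fun void waiter( string name )
{
    // suspend this module until the event is triggered
    go => now;
    // all waiters resume at the same logical instant
    <<< name, "woke at", now / ms, "ms" >>>;
}

// start two modules that block on the same event
spork ~ waiter( "a" );
spork ~ waiter( "b" );

// let half a second of logical time pass, then wake every waiter
500::ms => now;
go.broadcast();

// yield without advancing time so the waiters can run before we exit
me.yield();
```

Because the waiters are released at a well-defined point in logical time, both report the same timestamp, with no locks or other preemptive-concurrency machinery in the program.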
3.2 Core Language Features
With the design goals outlined above, we present four key ideas that form the
foundation of ChucK. The goal is to design a "natural" audio programming language
(1) to concurrently and accurately represent complex audio synthesis, (2) to enable
fine-grain, flexible control over time, (3) to provide the capability to operate on
multiple, dynamic, and simultaneous control rates, and (4) to make possible an
on-the-fly style of programming. The central ideas are as follows.
• A unifying, massively overloaded operator.
• A precise timing model that unifies high-level and low-level timing and is
straightforward to write as well as reason about from code.
• A precise concurrent programming model that supports arbitrarily fine
granularity, as well as multiple, simultaneous, and dynamic rates of control.
• A programming paradigm and run-time environment that allow on-the-fly
programming, enabling dynamically modifiable programs for performance and
experimentation.
3.2.1 ChucK Operator (=>)
ChucK is a strongly-typed, imperative programming language. Its syntax and
semantics are governed by a statically-compilable, run-time modifiable type system.
The heart of ChucK's language syntax is based around the ChucK operator
(written as =>). This left-to-right operator originates from the slang term "to
chuck", meaning to throw an entity into or at another entity. The language uses
this notion to help express sequential operations and data flow. => (and related
operators) form the "syntactic glue" that binds ChucK code elements together.
=> is a massively overloaded operator, where the behavior of => depends on
the context - in particular, what is being chucked and what it is chucked to (see
Figure 3.2). In this code fragment, we omit the declaration of the foo variable. But
assuming we declared foo as a unit generator (an audio signal processing element),
then the behavior of => in this context would be to connect the output of foo into
the input of dac (another unit generator).
// connecting 'foo' to 'dac'
foo => dac;
Figure 3.2: A statement using the ChucK operator, here connecting the output of
foo to dac (both are assumed to be unit generators).
A slightly more complex example can be seen in Figure 3.3. This code fragment
constructs a simple synthesis instrument using a series of unit generators (their
declarations are omitted for the moment): a white noise generator, a filter of some
type, and the audio output. Notice that the single line captures the flow of the
signals from left to right - the same order as ChucK programmers would read and
// connect 'noise' to 'filter' to 'dac'
noise => filter => dac;
Figure 3.3: A statement that uses the ChucK operator to connect three audio
elements together.
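For readers who want to run the fragment in Figure 3.3, the following sketch supplies the omitted declarations. It assumes the Noise and LPF (low-pass filter) unit generators from the standard ChucK distribution; the particular filter type, gain, cutoff frequency, and duration are illustrative choices and are not specified by the text.

```chuck
// declarations omitted in the figure (unit generator types are assumptions)
Noise noise;
LPF filter;

// the statement from the figure: signal flows left to right
noise => filter => dac;

// keep the level modest and set a cutoff frequency for the filter
0.2 => noise.gain;
800.0 => filter.freq;

// sound is produced only while logical time advances
2::second => now;
```

Without the final line the program would end immediately, since connecting unit generators by itself does not advance time.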
A ChucK statement can be composed of any appropriate types of objects
(including instances of user-defined types), unit generators, operations, values, and
variables. The semantics of the statement depend on the types of the objects, and
on the overloading of the ChucK operator on those types.
In addition to performing connections of unit generators, the ChucK operator
can be employed in a variety of contexts, including time advancement (see sections
on time and concurrency below), function invocation, assignment, and more. For
example, Figure 3.4 demonstrates two different syntaxes for achieving the same
nested function calls, both valid in ChucK. Note that the "un-nesting" effect of using
=> can lead to more linear and streamlined representations. In this case, the
programmer might think of values passing through a sequence of transformations -
from left to right.
// nested function calls
Math.fabs( Math.min( a, b ) );
// same function calls via =>
( a, b ) => Math.min => Math.fabs;
Figure 3.4: Two different syntaxes for invoking the same nested function calls.
Furthermore, there is a greater family of "ChucK operators", ranging from +=>
(plus chuck), %=> (modulo chuck), and =< (unchuck), to the more recent =^
(upchuck), introduced as part of the ChucK Unit Analyzer framework [101].
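The overloading described in this section can be seen side by side in a short sketch. It assumes the SinOsc unit generator and the Std.mtof function (MIDI note number to frequency) from the standard ChucK distribution; the variable names and values are illustrative only.

```chuck
// unit generator => unit generator: connect audio signals
SinOsc s => dac;
0.3 => s.gain;

// value => function => parameter: MIDI note 60 is converted to a
// frequency by Std.mtof, and the result is chucked to the oscillator
60 => Std.mtof => s.freq;

// value => variable: assignment
5 => int x;
// plus chuck: add the left-hand value into the variable (x becomes 8)
3 +=> x;
<<< "x:", x, "freq:", s.freq() >>>;

// duration => now: advance time, letting the oscillator sound
500::ms => now;

// unchuck: disconnect the oscillator from the output
s =< dac;
// time still advances, but nothing is connected to dac
500::ms => now;
```

In each statement the reading order is the same, left to right, while the meaning is selected by the types on either side of the operator.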
3.2.2 ChucKian Time
A key part of the solution in ChucK is to make time itself computable, and also to
allow a program to be "self-aware" in the sense that it always knows its position
in time, and can control its own progress over time. Furthermore, many program
components can share a central notion of logical ChucKian time, making it possible
to synchronize parallel code solely and naturally based on time as well as to precisely
express and reason about the temporal behavior of a program. This gives rise to our
notion of a strongly-timed audio programming language - one in which programs
have intimate, precise, and modular control over their own timing. With respect to
synthesis and analysis, an immediate ramification is that control can be asserted at
any unit generator at any time and at any rate. In order to make this happen:
• ChucK provides time and dur as native types in the language (for time and
duration).
• The language allows well-defined arithmetic on time and duration (Table 3.1).
• The model provides a deterministic and total mapping of code to time to audio
synthesis. It is natural to reason about and specify timing from anywhere in
a program.
• The language provides now, a special keyword (of type time) that holds the
current ChucK time. Its flexible granularity can be orders of magnitude
finer than the sample rate, and it provides a way to talk about time in an immediate,
deterministic, and well-defined sense.
• ChucK offers a globally consistent means to advance time from anywhere in
the program flow: by duration (D +=> now;) or by absolute time (T => now;).
Table 3.1 shows the resulting types of performing various arithmetic operations on
time and dur types, and whether the operations are commutative by type.
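A few of these arithmetic combinations can be tried directly. The sketch below is illustrative only: the variable names are hypothetical, and it uses the built-in second and ms units together with the :: syntax for writing duration values, which are assumed from the standard ChucK distribution.

```chuck
// a duration is written as a number, ::, and a unit
2::second => dur d;

// dur / dur gives a plain number: 2 seconds is 4 times 500 ms
d / (500::ms) => float ratio;

// time + dur gives a time: a point two seconds from now
now + d => time later;

// time - time gives a dur: how far away that point is
later - now => dur remaining;

<<< "ratio:", ratio, "remaining (ms):", remaining / ms >>>;

// advance by absolute time: run until the computed time point
later => now;

// the target has been reached, so no duration is left
<<< "left (ms):", (later - now) / ms >>>;
```

Advancing to an absolute time such as later is convenient when several pieces of code need to meet at the same point, since each can compute its own duration to that point from now.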
As an example, consider the following program (Figure 3.5), which creates a
patch consisting of a sine wave generator, and changes its frequency of oscillation
randomly every 100 milliseconds.