Music Processing Algorithms

boorishadamantΤεχνίτη Νοημοσύνη και Ρομποτική

29 Οκτ 2013 (πριν από 3 χρόνια και 9 μήνες)

106 εμφανίσεις

Music Processing Algorithms

David Meredith

Recent projects


Musical pattern matching and discovery


Finding occurrences of a query pattern in a work


Finding works that are similar to a query work


Discovering themes in a work


Pitch spelling


Predicting the pitch names (e.g.,
C#4,
B@3
) of notes in a “piano
-
roll” representation
(e.g., MIDI)


Essential for transcription from MIDI (or audio) to
notation

Algorithms for pattern matching
and pattern discovery in music

Uses of musical pattern
discovery algorithms


In content
-
based music retrieval


Creating an index of memorable patterns to enable faster retrieval


For music analysts, performers and listeners


A motivic/thematic analysis can assist understanding and
appreciation


In transcription


Helps with inferring beat and metrical structure


similar patterns have similar metrical structure


Helps with inferring grouping and phrasing


“parallellism” (Lerdahl and Jackendoff, 1983) most important factor in
grouping


In composition and improvisation


Cure composer’s block by suggesting new material based on
patterns discovered in music already written


Automatically create new music that develops themes discovered
in music already played


Use analysed thematic structure as a template for a new work

Importance of repeated patterns
in music analysis and cognition


Schenker (1954. p.5):


repetition “is the basis of music as an art”


Bent and Drabkin (1987, p.5):


“the central act” in all forms of music analysis is
“the test for identity”


Lerdahl and Jackendoff (1983, p.52):


“the importance of parallelism [i.e., repetition] in
musical structure cannot be overestimated. The
more parallelism one can detect, the more
internally coherent an analysis becomes, and the
less independent information must be processed
and retained in hearing or remembering a piece”

Most musical repetitions are
neither perceived nor intended

Rachmaninoff, Prelude in C sharp minor, Op.3, No.2, bars 1
-
6

Interesting musical repetitions
are structurally diverse


Want to discover all and only
interesting

repeated patterns


i.e., themes and motives


Class of interesting repeated patterns is
structurally diverse because


patterns vary widely in structural characteristics


many ways of transforming a musical pattern to
give another pattern that is perceived to be a
version of it


e.g., we can transpose it, embellish it, change tempo
harmony, accompaniment, instrumentation, etc.

Example of repeated motive

Barber, Sonata for Piano, Op.26, 1st mvt, bars 1
-
4

Example of thematic
transformation

J.S.Bach, Contrapunctus VI from Die Kunst der Fuge, bars 1
-
5

String
-
based algorithms for
discovering musical patterns


Most previous approaches assume music
represented as strings


each string represents a voice or part


each symbol represents a note or an interval
between two consecutive notes in a voice


Similarity between two patterns measured
in terms of edit distance calculated using
dynamic programming


see, e.g., Lemstrom (2000), Hsu et al. (1998),
Rolland (1999)

Problems with the string
-
based
approach
-

Edit distance


B is an embellished version of A


If both patterns represented as strings


each symbol represents pitch of note


then edit distance between A and B is 9


If allow pattern with 9 differences to count
as a match, then get many spurious hits

Problems with string
-
based
approach
-

Polyphony


If searching polyphonic music and


do not know voice to which each note belongs (e.g., MIDI
format 0 file);
or


interested in patterns containing notes from 2 or more voices


then


combinatorial explosion in number of possible string
representations


if don’t use all possible representations then may not find all
interesting patterns

Using multidimensional point
sets to represent music (1)

Using multidimensional point
sets to represent music (2)

SIA
-

Discovering all maximal
translatable patterns (MTPs)

Pattern is
translatable

by vector
v

in

dataset if it can be translated
by
v

to give another pattern in the dataset

MTP for a vector
v

contains all points mapped by
v

onto other
points in the dataset

O
(
kn
2

log
n
) time,
O
(
kn
2
) space


where
k

is no. of dimensions &
n

is no. of points

O
(
kn
2
) average time with hashing

SIATEC
-

Discovering all
occurrences of all MTPs

Translational
Equivalence Class
(TEC) is set of all
translationally
invariant
occurrences of a
pattern


Absolute running times of SIA
and SIATEC


SIA and SIATEC implemented in C


run on a 500MHz Sparc on 52 datasets


6
≤n
≤3456, 2
≤k
≤5


< 2 mins for SIA to process piece with 3500 notes


13 mins for SIATEC to process piece with 2000 notes

Need for heuristics to isolate

interesting MTPs


2
n
patterns in a dataset of size
n


SIA generates <
n
2
/2 patterns


=> SIA generates small fraction of all patterns in a dataset


Many interesting patterns derivable from
patterns found by SIA


BUT many of the patterns found by SIA are
NOT interesting


70,000 patterns found by SIA in Rachmaninoff’s
Prelude

in
C# minor


probably about 100 are interesting


=> Need heuristics for isolating interesting
patterns in output of SIA and SIATEC

Heuristics for isolating musical
themes and motives

Cov=6

CR=6/5

Cov=9

CR=9/5

Comp = 1/3

Comp = 2/5

Comp = 2/3

COSIATEC
-

Data compression
using SIATEC

Start

Dataset

SIATEC

List of <Pattern, Translator_set> pairs

Print out best pattern,
P
, and its translators

Remove occurrences of
P

from dataset

Is dataset empty?

End

No

Yes

Using COSIATEC for finding
themes and motives in music

First iteration

Second iteration

SIAM
-

Pattern matching using
SIA


k
dimensions


n

points in dataset


m

points in query


O
(
knm

log(
nm
)) time


O
(
knm
) space


O
(
knm
) average time with
hashing

Query pattern

Dataset

Improving SIAM
-

Ukkonen,
Lemstr
ö
m & M
ä
kinen (2003)


Use sweepline
-
like scanning of the dataset (Bentley and
Ottmann, 1979)


Generalized to approximate matching of sets of
horizontal line
-
segments


However, restricted to 2
-
dimensional representations
(unlike SIA
-
family)


Improved complexity to


O
(
mn
log m +
n
log
n + m
log
m
) running time (without hashing)


O
(
m
) working space


Implemented as algorithm P2 on C
-
BRAHMS demo
web site


<
http://www.cs.helsinki.fi/group/cbrahms/demoengine/
>

Improving SIAM
-

MSM

(Clifford et al., 2006)


Finding size of maximal match is 3SUM hard
(i.e.,
O
(
n
2
) )


Reduce problem of multi
-
dimensional point
-
set
matching to 1d binary wildcard matching


Random projection to 1D


Length reduction by universal hashing


Binary wildcard matching using FFTs


Find best match and check in
O
(
m
) time exactly how
many points match at the location that can be
inferred from this match


Reduces time complexity to
O
(
n

log
n
)

Evaluating MSM:

Precision
-
Recall


Compared with OMRAS (Pickens
et al
., 2003)


Test set of 2338 documents, 480 used as queries


All score encodings in strict score time


Queries had notes deleted, transposed and inserted

Evaluating MSM:

Running time


Run on prefixes of various sizes of first movement of
Beethoven’s 3rd Symphony


Each prefix matched against itself


Compared with largest common subset algorithm of
Ukkonen, Lemström and Mäkinen (2003)


MSM nearly 2 orders of magnitude faster (log scale)

Pitch spelling algorithms

A pitch spelling algorithm

takes this...

Chromatic pitch

Time

...and computes this

Diatonic pitch

Time

Why are pitch spelling algorithms
useful?


In transcription, for generating a correctly
notated score from a MIDI or audio file


In content
-
based music retrieval


For representing better the perceived tonal
relationships between notes


Allows us to find occurrences that sound like the
query but contain different chromatic intervals


For better understanding the cognitive
processes that underlie the perception of tonal
music

Why is the same sound spelt
differently in different contexts?

1

2

3

4

Comparative analysis of pitch
spelling algorithms


Algorithms analysed, evaluated and (in
some cases) improved


Longuet
-
Higgins (1976, 1987, 1993)


Cambouropoulos (1996,1998, 2001, 2003)


Temperley (2001)


Chew and Chen (2003, 2005)


Meredith (2003, 2005, 2006)


Test corpus


195972 notes, 216 movements, 8 baroque and
classical composers


almost exactly equal number of notes (24500) for
each composer

The PS13s1 algorithm

MIDI Note number

Time

Tonic chroma and pitch name class

Freq

Ebb

Bbb

Fb

Cb

Gb

Db

Ab

Eb

Bb

F

C

G

D

A

E

B

F#

C#

G#

D#

A#

2

9

4

11

6

1

8

3

10

5

0

7

2

9

4

11

6

1

8

3

10

1

T

T1

T

1

2

T

T

1

1

T

Initial pitch name class

The PS13s1 algorithm

MIDI Note number

Time

Tonic chroma and pitch name class

Freq

Ebb

Bbb

Fb

Cb

Gb

Db

Ab

Eb

Bb

F

C

G

D

A

E

B

F#

C#

G#

D#

A#

2

9

4

11

6

1

8

3

10

5

0

7

2

9

4

11

6

1

8

3

10

T1

T

1

T

1

T

1

T

1

T

2

Initial pitch name class

Evaluation criteria and
performance metrics


Evaluation criteria


Spelling accuracy

-

how well an algorithm predicts the pitch
names


Style dependence

-

how much spelling accuracy depends
on style


Performance metrics


Note error rate

-

proportion of notes in corpus spelt
incorrectly


Style dependence

-

standard deviation of note error rates
over 8 composers


Robustness to temporal deviations


Best versions of algorithms also run on version of test
corpus in which onsets and durations were randomly
adjusted


To evaluate how well algorithms would work on files
generated directly from performances

Results for algorithms that were
most accurate over clean corpus

Algorithm


Clean corpus

Noisy corpus

NER%

SD

NER%

SD

PS13s1
x

0.56

0.49

0.61

0.54

Temperley*

0.70

1.13

3.32

3.91

Chew and Chen
+

0.85

0.35

0.99

0.55

Cambouropoulos
+

0.85

0.47

0.93

0.53

Longuet
-
Higgins
§

1.79

1.79

1.75

1.71

Fixed LOF Range (Eb
-
G#)

4.38

1.47

4.38

1.47

x
K
pre
= 10,
K
post
= 42

*Two
-
pass, half tempo corpus, without enh. change (MH2PX2)

+
New optimized versions (CamOpt and CCOP01
-
06)

§
Only when music processed a voice at a time (LH1V)

Future work

Further development of SIA family
of algorithms


Compare SIA algorithms with methods
developed in other more mature fields (e.g.,
computer vision, graph matching)


Improve time complexity of SIA algorithms with
techniques such as ones used in MSM


Adapt algorithms for approximate matching and
scaling (matching at different tempi)


Adapt SIA and SIATEC for early pruning of
uninteresting patterns


Further work on PS13s1


Incorporate PS13s1 into complete MIDI
-
to
-
notation transcription system


Incorporate PS13s1 into Sibelius notation
software


Use PS13s1 for key
-
tracking and
harmonic analysis


Use PS13s1 for feature extraction on
audio data

Web
-
scale content
-
based music
search engine


“Query
-
by
-
humming”:


Interactive input system that allows user to enter
query multiple times and/or adjust displayed
interpretation


Sound
-
based browsing


Allows users to browse for music that has similar
tempo, rhythm, harmony, melody, timbre, etc.


Indexing on


melody, harmony, rhythm, tempo, loudness,
timbre,...


Crawling


iTunes, Amazon, MusicBrainz, MySpace, ...


Interactive music systems


Design of interactive musical spaces in which
motion is mapped onto sound in interesting
ways


e.g., experimentation with various different pitch and
timbre spaces


Design of software and hardware that
maximises creativity


e.g. instruments and software that can be used by
both musically trained and untrained to create
interesting music and manipulate it

Transcription


From MIDI to notation


Requires algorithms for


pitch spelling (e.g., PS13s1)


key tracking (also PS13s1?)


metrical structure analysis (finding the beat)


voice analysis


quantization of note duration values


phrase structure analysis

Composition and improvisation


Creating templates for new pieces of music by
analysing the tonal, rhythmic and thematic structures of
corpora of existing works


“Auto
-
completion” or a “composer’s deblocker”


Analyses music already written and suggests ways in which it
might be continued


Intelligent algorithmic improvisation system


Finds salient themes and motives in music already generated


Creates new music that manipulat these themes in interesting
ways


Algorithm for complete stylistic composition


Input a set of songs or works to be analysed


System composes a new work that has similar structure to
those analysed

Open
-
source object
-
oriented music
processing framework


Supports the rapid development and testing of
algorithms for analysing, retrieving, recognizing,
transcribing, composing and performing music.


Classes and interfaces that successfully
encapsulate the properties of a musical
passage, work or collection of works


Supports the operations and transformations
that a musician, musicologist, composer,
listener or performer might want to carry out on
the music.

Systems for corpus
-
building


Need large, varied and high quality “ground truth” test
-
corpora for testing algorithms in music analysis and
music information retrieval


Currently such corpora are scarce


Serious obstacle to progress in MIR and computational
musicology


Need more effective methods for digitizing musical
resources such as scores and expert analyses and
performances


Faster and more reliable systems for creating structured digital
encodings from printed scores, recordings


Such systems are of value to libraries and other parties
(e.g., internet search engines) interested in making
such resources available online.

SLUT

Mange tak!

References


Bent, I. and Drabkin, W. (1987)
Analysis
. Macmillan.


Bentley, J. and Ottmann, T. (1979) "Algorithms for reporting and counting geometric intersections".
IEEE
Transactions on Computers
, C(28), 643
-
647.


Clifford, R., Christodoulakis, M., Crawford, T., Meredith, D. and Wiggins, G. A. (2006) "A fast, randomised,
maximal subset matching algorithm for document
-
level music retrieval". In
Proceedings of the 7th International
Conference on Music Information Retrieval

(ISMIR 2006), Victoria, Canada.


Hsu, J.
-
L., Liu, C.
-
C. and Chen, A. L. B. (1998) "Efficient repeating pattern finding in music databases". In
Proceedings of the 1998 ACM 7th International Conference on Information and Knowledge
Management, pages
281
-
288.


Lemstr
öm, K. (2000)
String Matching Techniques for Music
Retrieval. PhD dissertation, Department of Computer
Science, University of Helsinki.


Lerdahl, F. and Jackendoff, R. (1983)
A Generative Theory of Tonal Music
. MIT Press, Cambridge MA.


Meredith, D., Lemstr
öm, K. and Wiggins, G. A. (2002) "Algorithms for discovering repeated patterns in
multidimensional representations of polyphonic music".
Journal of New Music Research
, 31(4), 321
-
345.


Meredith, D. (2006) "Point
-
set algorithms for pattern discovery and pattern matching in music". In
Content
-
Based
Retrieval,
Dagstuhl Seminar Proceedings, 06171.


Pickens, J., Bello, J. P., Monti, G., Sandler, M., Crawford, T., Dovey, M. and Byrd, D. (2003) "Polyphonic score
retrieval using polyphonic audio queries: A harmonic modeling approach".
Journal of New Music Research
,
32(2), 223
-
236.


Roland, P.
-
Y. (1999) "Discovering patterns in musical sequences".
Journal of New Music Research,
28(4), 334
-
350.


Schenker, H. (1954)
Harmony
. University of Chicago Press, London.


Ukkonen, E., Lemström, K. and Mäkinen, V. (2003) "Geometric algorithms for transposition invariant content
-
based music retrieval" In
Proceedings of the Fourth International Conference on Music Information Retrieval
(ISMIR 2003), Baltimore.