Traditional Irish Dance


Machine Annotation of Traditional Irish Dance Music

PhD Thesis

Dr. Bryan Duggan BSc, MSc, PhD

School of Computing

DIT

Kevin St

bryan.duggan@dit.ie

http://www.comp.dit.ie/bduggan/music

Supervisors:

Prof Brendan O'Shea, DIT

Dr Mikel Gainza, DIT

Prof Padraig Cunningham, UCD

Overview


Introduction


Contributions


Dissemination


Traditional music


Challenges


MATT2 - Machine Annotation of Traditional Tunes


Experiment & results


TANSEY - Turn ANnotation from SEts using similaritY profiles


Experiment & results


Conclusions & future work


The problem

Irish Traditional Music Archive

thesession.org

Introduction


"The main problem in music signal analysis is the
development of algorithms to extract sufficiently
high level content from audio signals. The low level
signal processing algorithms are well understood,
but they produce inaccurate or ambiguous results,
which can be corrected given sufficient musical
knowledge, such as that possessed by a musically
literate human listener. This type of musical
intelligence is difficult to encapsulate in rules or
algorithms that can be incorporated into computer
programs."

- Dixon, 2004

My thesis


CBMIR adapted to the characteristics of
traditional Irish dance music


Embed rules for melodic similarity in this domain
into the system


Compensating for expressiveness
(ornamentation, reversing, the long note,
phrasing)


Transposition invariance


Results in significantly better annotation
accuracy than generic approaches


Solve domain specific problem (sets)

Contributions

C1

The development of a content based music information
retrieval system (MATT2) which supports the input of queries
played on traditional instruments.

C2

The development of a new machine transcription approach for
traditional music that supports transposition invariance for the
keys and modes used to play traditional music.

C3

The development of a framework of algorithms to
accommodate interpretative style in audio queries to a content
based music information retrieval system - Ornamentation
Filtering.

C4

The development of a novel algorithm based on similarity
profiles to annotate sets of traditional Irish dance tunes
(TANSEY).

Dissemination


7 publications at peer reviewed conferences including ISMIR, ICMC, CBMI,
ECAI, AICS


Duggan, B., O'Shea, B., Gainza, M., Cunningham, P.: Compensating for Expressiveness in Queries to a Content Based
Music Information System, 2009 International Computer Music Conference (ICMC 2009), Montreal, Canada, 16-21
August 2009


Duggan, B., O'Shea, B., Gainza, M., Cunningham, P.: Machine Annotation of Sets of Traditional Irish Dance Tunes,
Ninth International Conference on Music Information Retrieval (ISMIR), Drexel University, Philadelphia, USA,
September 2008.


Duggan, B., O'Shea, B., Cunningham, P.: A System for Automatically Annotating Traditional Irish Music Field
Recordings, Sixth International Workshop on Content-Based Multimedia Indexing, Queen Mary University of London,
UK, Jun. 2008


1 PhD symposium


Source code (GPL 2.0)


http://code.google.com/p/matt2/


Test Audio


http://www.comp.dit.ie/bduggan/music/testaudio.html


Supporting audio examples - http://www.comp.dit.ie/bduggan/music/examples.php


Software - http://www.comp.dit.ie/bduggan


Browser based online system


http://tunepal.org


1 article in the Irish Times


Best presentation prize at the ICMC


It seems that on this particular
occasion Touhey wanted to
learn a tune from McFadden. He
had McFadden play it for him
several times and then tried his
own hand at it. Of course
McFadden had to play it again,
pointing out several "errors."
This happened a number of
times until Touhey finally gave
up, for McFadden was playing
the tune a little differently each
time through!

(Krassen 1975)


The Main Challenges

Chapter 2

P1: Support for traditional instruments

P2: Commonly used keys & modes (Table 5, Table 6 & Table 7, Fig 6, Fig 10)

P3: Reversing

P4: C, C# similarity

P5: Phrasing

P6: Transposition in tin-whistles (Table 18)

P7: Ornamentation (Fig 16, Table 8)

P8: The long note (Table 8)

P9: Tempo deviation (Table 3)

P10: The playing of tunes in sets

Literature review

Chapter 2: Traditional Irish Dance Music

Chapter 3: Features of music

Chapter 4: Melodic similarity

Chapter 5: Content based Music Information Retrieval

C1: My solution! - MATT2

Collections

(Petrie 1855; Bunting 1843; Joyce 1909)

The Music of Ireland, The Dance Music of Ireland: 1001 Gems (O'Neill 1903; O'Neill 1907)

Ceol Rince na hÉireann (Breathnach 1963; Breathnach 1976; Breathnach 1985; Breathnach 1996; Breathnach 1999)

ABC (Walshaw 2007)

O'Neill's (1997-2000) in ABC (Chambers 2007)

Norbeck (1997-2000)

X:422
T:Come West Along the Road
R:reel
S:Session
H:See also #432, in A. This version is also played in A.
H:1st part similar to "Over the Moor to Peggy", #710
D:Arcady: Many Happy Returns
D:Noel Hill & Tony McMahon: \'I gCnoc na Gra\'i
Z:id:hn-reel-422
M:C|
K:G
d2BG dGBG|~G2Bd efge|d2BG dGBG|1 ABcd edBc:|2 ABcd edBd||
|:g2bg egdg|(3efg dg edBd|1 g2bg egdB|ABcd edBd:|2 gabg efge|dega bage||


P1: Transcription


Onset detection


ODCF (Gainza 2006; Gainza & Coyle 2007)


12 time domain comb filters


12 semitones from the fundamental note of the instrument


Equation 1


Table 13 (delays)


Figure 36


Java extract


Pitch detection


Peak picking, F0 estimation


Figure 38


Java extract
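
The slide mentions 12 time domain comb filters, one per semitone above the instrument's fundamental note, with the delays listed in Table 13. As a rough sketch of how such delays could be derived (this does not reproduce Table 13 or Equation 1), each delay can be taken as one period of the target pitch in samples. The sample rate of 44100 Hz and the fundamental of 293.66 Hz (D4) are assumptions chosen for illustration, and `comb_filter_delays` is a hypothetical helper name.

```python
def comb_filter_delays(fundamental_hz, sample_rate=44100, semitones=12):
    """Return one comb filter delay (in samples) per semitone.

    Assumption: each delay is a single period of the equal-tempered
    pitch k semitones above the fundamental, rounded to whole samples.
    """
    delays = []
    for k in range(semitones):
        freq = fundamental_hz * 2 ** (k / 12)  # equal temperament
        delays.append(round(sample_rate / freq))
    return delays


# Hypothetical example: an instrument whose fundamental is D4.
delays = comb_filter_delays(293.66)
```

Higher semitones give shorter delays, so the list is strictly decreasing.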

C3, P7, P8, P9:
Compensating for style


Ornamentation Filtering


Removes ornamentation notes


Adds durations back to the subsequent note


Splits long notes


Adapts to tempo deviation


Uses a window size of 6 seconds


Generates a histogram of note durations (offset - onset)


Bin widths calculated on the fly based on 33% fuzz factor


Pseudocode given in Fig 39


Examples given in Table 15
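
The bullet points above can be sketched roughly as follows. This is not the pseudocode from Fig 39; it is a minimal illustration under stated assumptions: a duration joins the first histogram bin whose centre is within 33% of it, the fullest bin is taken as the quaver length, and any note shorter than 67% of a quaver is treated as an ornament. The function names and the ornament threshold are hypothetical.

```python
FUZZ = 0.33    # fuzz factor from the slide
WINDOW = 6.0   # window size in seconds from the slide


def quaver_length(durations, fuzz=FUZZ):
    """Fuzzy histogram: bins are created on the fly; returns the fullest bin's centre."""
    bins = []  # each bin is [centre, count]
    for d in durations:
        for b in bins:
            if abs(d - b[0]) <= fuzz * b[0]:
                b[0] = (b[0] * b[1] + d) / (b[1] + 1)  # running mean
                b[1] += 1
                break
        else:
            bins.append([d, 1])
    return max(bins, key=lambda b: b[1])[0]


def filter_window(notes, fuzz=FUZZ):
    """notes: list of (pitch, duration). Returns a list of quaver-length pitches."""
    quaver = quaver_length([dur for _, dur in notes], fuzz)
    pitches, carried = [], 0.0
    for pitch, dur in notes:
        dur += carried
        carried = 0.0
        if dur < (1 - fuzz) * quaver:   # assumed ornament threshold
            carried = dur               # add its duration to the next note
            continue
        # long notes are split into repeated quavers
        pitches.extend([pitch] * max(1, round(dur / quaver)))
    return pitches


def filter_ornaments(notes, window=WINDOW, fuzz=FUZZ):
    """notes: list of (pitch, onset, offset) in seconds."""
    windows = {}
    for pitch, onset, offset in notes:
        windows.setdefault(int(onset // window), []).append((pitch, offset - onset))
    result = []
    for key in sorted(windows):  # quaver length is re-estimated per window
        result.extend(filter_window(windows[key], fuzz))
    return "".join(result)


# Hypothetical transcription: a long D, then B, G, a short cut (a), G, B, d.
notes = [("D", 0.0, 0.5), ("B", 0.5, 0.75), ("G", 0.75, 1.0), ("a", 1.0, 1.05),
         ("G", 1.05, 1.25), ("B", 1.25, 1.5), ("d", 1.5, 1.75)]
filtered = filter_ornaments(notes)
```

Re-estimating the quaver length in each 6 second window is what lets the sketch follow gradual tempo deviation. In this simplified version an ornament that falls at the very end of a window is dropped rather than carried into the next window.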

Example

The Kilmovee Jig

C3, P2, P4, P5: Breath
detection & Pitch spelling


Uses an energy based breath detector


Annotates with a z (rest symbol)


Uses spellings from ABC notation


Adjusts pitch spellings to the fundamental note
(Breathnach 1985)


A range of 33 notes


Transposes the pitch spellings for the tin-whistle


Spells C & C# the same


(D, G major scales on a tin-whistle)

P3, P7: Corpus
normalisation

Original:

d2BG dGBG|~G2Bd efge|d2BG dGBG|1 ABcd edBc:|2 ABcd edBd||



1 After Ornamentation filtering:

d2BGdGBG|G2Bdefge|d2BGdGBG|1ABcdedBc:|2ABcdedBd||



2 After note expansion:

ddBGdGBG|GGBdefge|ddBGdGBG|1ABcdedBc:|2ABcdedBd||



3 After section expansion:

ddBGdGBGGGBdefgeddBGdGBGABcdedBc

ddBGdGBGGGBdefgeddBGdGBGABcdedBd



4 After register normalisation:

DDBGDGBGGGBDEFGEDDBGDGBGABCDEDBC

DDBGDGBGGGBDEFGEDDBGDGBGABCDEDBD


5 Tune Expansion
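
Steps 1 to 4 above can be sketched for the first part of "Come West Along the Road" as follows. This is an illustrative sketch, not the MATT2 source: it assumes a single repeated section with first and second endings, only handles the roll mark (~) as an ornament, and ignores accidentals, octave marks and triplets. The function names are hypothetical, and step 5 (tune expansion) is not covered.

```python
import re


def remove_ornaments(abc):
    """Step 1: drop roll marks and whitespace."""
    return abc.replace("~", "").replace(" ", "")


def expand_notes(abc):
    """Step 2: repeat a note by its length multiplier, e.g. d2 -> dd."""
    return re.sub(r"([A-Ga-gz])(\d+)",
                  lambda m: m.group(1) * int(m.group(2)), abc)


def expand_sections(abc):
    """Step 3: unfold one repeat with first and second endings."""
    m = re.fullmatch(r"(.*?)\|1(.*?):\|2(.*?)\|\|", abc)
    if m is None:
        return [abc.replace("|", "")]
    common, first, second = (part.replace("|", "") for part in m.groups())
    return [common + first, common + second]


def normalise(abc):
    """Steps 1-4; step 4 (register normalisation) folds everything to upper case."""
    expanded = expand_sections(expand_notes(remove_ornaments(abc)))
    return [line.upper() for line in expanded]


original = "d2BG dGBG|~G2Bd efge|d2BG dGBG|1 ABcd edBc:|2 ABcd edBd||"
normalised = normalise(original)
```

Run on the original line from the slide, the sketch gives the two strings listed under step 4.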



Edit (Levenshtein) distance

Bit-parallelism: (Lemstrom & Perttu 2000; Navarro & Raffinot 2002)

Intervals: (Mongeau & Sankoff 1990; Lemstrom & Ukkonen 2000)

Cost function: (Lemstrom & Ukkonen 2000)

Alignment: (Navarro & Raffinot 2002)

Used by: (Lemstrom & Perttu 2000; Prechelt & Typke 2001; Lu, You & Zhang 2001; Rho & Hwang 2004; Grachten, Arcos & Lopez de Mantaras 2005; Duggan, Cui & P. Cunningham 2006)

Table 11, 12

P5: Matching


Cost function allows a z to match any
character


Tunes expanded where necessary


Substring edit distance


(Navarro & Raffinot 2002)


Corpus strings ranked according to
distance from the query
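
A minimal sketch of the matching step, assuming the standard dynamic programming form of substring edit distance (first row initialised to zero so a match may start anywhere) with unit costs. The only domain rule included is the one on the slide: a z (breath) matches any character. The function names and the second corpus entry are hypothetical, and this is not the MATT2 implementation.

```python
def similarity_profile(pattern, text):
    """Last row of the substring edit distance matrix.

    Entry j is the smallest edit distance between the pattern and any
    substring of the text that ends at position j.
    """
    prev = [0] * (len(text) + 1)  # row 0: a match may start anywhere
    for i, p in enumerate(pattern, start=1):
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            cost = 0 if (p == t or p == "z" or t == "z") else 1
            curr[j] = min(prev[j - 1] + cost,  # match or substitution
                          prev[j] + 1,         # deletion
                          curr[j - 1] + 1)     # insertion
        prev = curr
    return prev


def rank(query, corpus):
    """Order corpus tune names by substring edit distance from the query."""
    return sorted(corpus, key=lambda name: min(similarity_profile(query, corpus[name])))


corpus = {
    "Come West Along the Road": "DDBGDGBGGGBDEFGE",
    "Hypothetical other tune": "EFGEFGABEFGEFGAB",
}
ranking = rank("BGDzBG", corpus)  # z stands for a breath in the query
```

Here the breath in the query is absorbed by the G in the corpus string, so the first tune matches at distance 0 and is ranked first.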


Experiment


50 Whole Tunes


36 min 17 sec


50 Excerpts


11 min 48 sec

MC-ED: Edit distance based on
melodic contours

ABC corpus converted to MIDI (ABC2MIDI)

MIDI note numbers extracted

Converted to Parsons Code U, D, S

Transcription to MIDI numbers

Converted to Parsons Code
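
A small sketch of the conversion from MIDI note numbers to Parsons code, using only the three symbols named on the slide (U for up, D for down, S for same). The helper name is hypothetical and the leading "*" used in some Parsons code conventions is omitted.

```python
def parsons_code(midi_notes):
    """Melodic contour of a note sequence as a string of U, D and S."""
    code = []
    for prev, curr in zip(midi_notes, midi_notes[1:]):
        if curr > prev:
            code.append("U")
        elif curr < prev:
            code.append("D")
        else:
            code.append("S")
    return "".join(code)


# d d B G d as MIDI note numbers
contour = parsons_code([74, 74, 71, 67, 74])
# the same phrase transposed up two semitones
transposed = parsons_code([76, 76, 73, 69, 76])
```

The contour is unchanged by transposition, but it also discards interval sizes, which is one plausible reason a contour-only representation discriminates poorly between tunes.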

TI-ED: Transposition invariant cost
function (SEMEX like)

MIDI note numbers extracted from
ABC & transcription

No style compensation

(Navarro & Raffinot 2002) substring
edit distances

(Lemstrom & Ukkonen 2000)
transposition invariant cost function

MATT2: The complete MATT2
system

Style compensation

(Navarro & Raffinot 2002) substring
edit distances

Results

MC-ED gives very poor accuracy and a
high error rate for both whole tunes (WT) and excerpts (E).

TI-ED is able to successfully annotate
about half the whole tunes and less
than half of the excerpts.

MATT2 gives greater than 90%
accuracy for both WT and E.

When the results are combined, it can
be seen that MC-ED gives 11%
accuracy, TI-ED gives 47% accuracy
and MATT2 gives 93% accuracy


Results for all three systems compared
with stochastic sampling (Binomial)

McNemar's test

χ² for MC-ED, MATT2 = 80.01

χ² for TI-ED, MATT2 = 44.02

See Table 23, 24 for results

See Table 28, 29, 30, Equation 20 for
contingency tables
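
For readers unfamiliar with McNemar's test, the statistic depends only on the discordant cells of the 2x2 contingency table: queries one system annotated correctly and the other did not. The sketch below uses the common continuity-corrected form; whether the thesis applies the correction is not stated on the slide, and the counts in the example are invented purely for illustration (the real ones are in Tables 28 to 30).

```python
def mcnemar_chi2(only_a_correct, only_b_correct, continuity=True):
    """McNemar's chi-squared statistic from the two discordant counts."""
    b, c = only_a_correct, only_b_correct
    if b + c == 0:
        return 0.0
    diff = abs(b - c) - (1 if continuity else 0)
    return diff ** 2 / (b + c)


# Hypothetical counts, NOT taken from the thesis:
# 10 queries only system A got right, 2 only system B got right.
chi2 = mcnemar_chi2(10, 2)
```

With one degree of freedom the 5% critical value is 3.84, so the values of 80.01 and 44.02 reported on the slide indicate a highly significant difference.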


C4, P10: Annotating sets
of tunes


Tunes are always played in a set of 2 or more tunes


Each tune can be repeated multiple times


The order of tunes is unknown


There is no interval between tunes (segue)


Tunes are always in the same time signature


Tunes are often in the same key


Problem: Detect the timings, identify the tunes


Count the turns

C4: Turn ANnotation from
SEts using SimilaritY
profiles (TANSEY)


Transcription & expressiveness
compensation from MATT2


Vector Nj = {os, of, dS, f, ps, e} is retained


Makes use of Similarity Profiles (the
last row of the edit distance matrix)


Human annotations used as a ground
truth

See Fig 45 for pseudocode

Fig 46
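
The slide defines a similarity profile as the last row of the edit distance matrix. The sketch below shows one way such a profile could be used to count turns: each dip in the profile marks a position in the set where a good match of the tune ends. This is an assumption-laden illustration, not the TANSEY pseudocode from Fig 45; the local-minimum rule, the `max_distance` parameter and the function names are all hypothetical.

```python
def similarity_profile(pattern, text):
    """Last row of the substring edit distance matrix (unit costs)."""
    prev = [0] * (len(text) + 1)
    for i, p in enumerate(pattern, start=1):
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            cost = 0 if p == t else 1
            curr[j] = min(prev[j - 1] + cost, prev[j] + 1, curr[j - 1] + 1)
        prev = curr
    return prev


def turn_ends(tune, set_transcription, max_distance):
    """Positions in the set where a sufficiently close match of the tune ends.

    Assumption: a turn end is a local minimum of the profile whose value
    does not exceed max_distance.
    """
    profile = similarity_profile(tune, set_transcription)
    ends = []
    for j in range(1, len(profile)):
        left = profile[j - 1]
        right = profile[j + 1] if j + 1 < len(profile) else float("inf")
        if profile[j] <= max_distance and profile[j] < left and profile[j] <= right:
            ends.append(j)
    return ends


# Toy set: the tune "ABCD" played twice, then a different tune.
ends = turn_ends("ABCD", "ABCDABCDEFGE", max_distance=1)
```

In this toy example the profile dips to zero after the fourth and eighth notes, giving two turns. In a full system the retained note vector would presumably allow such positions to be mapped back to times in the audio.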

Experiment


30 Audio files


1 hour 27 minutes and 18 seconds.


In total, the test audio contained 64
separate tunes with 141 turns


2 second Threshold

Results

See Figure 49 for precision & recall, Table 34, Table 35, Table 37
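
A sketch of how precision and recall might be computed for detected turn times against the human annotations. It assumes the 2 second threshold acts as a tolerance: a detected turn counts as correct if it lies within 2 seconds of a ground truth turn that has not already been matched. The function name and the example times are hypothetical and are not results from the thesis.

```python
def precision_recall(detected, truth, tolerance=2.0):
    """Precision and recall of detected times against ground truth times (seconds)."""
    unmatched = list(truth)
    hits = 0
    for t in detected:
        match = next((g for g in unmatched if abs(g - t) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)  # each annotation can be matched once
            hits += 1
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical times in seconds, for illustration only.
precision, recall = precision_recall([10.5, 42.0, 90.0], [10.0, 41.0, 70.0, 100.0])
```

Two of the three detections fall within tolerance, and two of the four annotated turns are found.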

Conclusions

P1: Support for traditional instruments

P2: Commonly used keys & modes

P3: Reversing

P4: C, C# similarity

P5: Phrasing

P6: Transposition in tin-whistles

P7: Ornamentation

P8: The long note

P9: Tempo deviation

P10: The playing of tunes in sets

Contributions


Contribution 1: The development of a content based music information
retrieval system (MATT2) which supports the input of queries played on
traditional instruments. This is addressed in solutions to P1, P2, P4 and P6
discussed in Chapter 2 and is presented in Chapter 6.


Contribution 2: The development of a new automatic transcription
approach for traditional music that supports transposition invariance for the
keys and modes used to play traditional music, while minimising pitch spelling
errors and avoiding the double weighting of substitutions, insertions and
deletions that occurs when edit distances are calculated on pitch intervals.
This is addressed in the solution to P2 presented in Chapter 6.


Contribution 3: The development of a framework of algorithms to
accommodate expressiveness in audio queries to a content based music
information retrieval system. This is addressed in solutions to P5, P7 and P8
discussed in Chapter 2 and presented in Chapter 6.


Contribution 4: The development of a novel algorithm based on similarity
profiles to annotate sets of traditional Irish dance tunes. This is addressed in
the solution to P10 presented in Chapter 8.

The Future!


11,555 tunes


thesession.org


O'Neill's 1001


Norbeck


Transcription in a Java applet


Matching in a JSP/Servlet


MySQL backend


1767 queries to date!


iPhone version?



“Just love MATT2 and TANSEY, I'm planning on playing with that some more. I
understand its possible to expand the database of tunes it looks up, I have a huge
collection in ABC. The session I go to, as all sessions I'm sure, has loads of good tunes
and we have a couple of players who often can't remember the names of tunes but play
from a seemingly endless well. How great it would be to match these orphan tunes back
with their names and notes. Being able to snatch a recording and match it back to an ABC
and a name is such a functionality.”


“My big interest with MATT2 and TANSEY is the element of the application that strips
away artistic deviation from the 'set' tune. One area I'd be fascinated to work on is
looking at using this artistic input as a means of identifying a performer. Many key
exponents have characteristics within their playing - Tommy People's triplets, Bobby
Casey's tone, Tanseys' use of reversing and linked rolls, etc. It would be great to work on
a program that would analyse the music and try and identify a performer based on
melodic variation and technical ornamentation within the given extract”


“Had a look at your MATT2 program. Looks class. I'm going to try it out with the fiddle for
the craic and see if it works :)”


“Just took MATT2 for a test drive - impressive stuff, well done! I like the way it can even
identify a set and pick out the individual tunes. Brilliant!”


“Thinking of doing a PhD. I am an interactive designer. Your project may need a new
interface. Let me know if you are interested. I'll see if I can couple it with an elective in
here.”


“I have several collections of just reels. These have all been thinned down to only the
tunes that could be read by my (ad hoc) ABC parser. BTW, I use this corpus to do
automatic creation of tunes, ala what you did in your paper with Zheng. (My dissertation
was on creativity and storytelling.)”



All my congratulations for this marvellous program ! An invaluable help to all musicians.
This is really really great !!! Jean Lhuillery, a piper in South West France


I was amazed to see you posting on the session because in the last couple of weeks I
had been saying to Michelle that what we needed was a piece of software that could hear
the tune we played and tell us what it was. & viola! here is Tunepal.org. Your work will be
deeply appreciated.


Your software is amazing. Specially the searching module. For example, I have played a
tune on the bouzouki (courses of 2 strings, resonances, double notes ...) and the result of
the transcription was :

E,F,E,AB,A,DF,A,DFDECA,fDd'A,F,eF,DCdDDDBEzE,

despite the poor result, the tune finder (jigs) gave me "The Gold Ring" twice (rank 1 and
5), which was the correct answer (FWI, there at least 2 different tunes called the Gold
Ring).

Thanks a million for you tremendous work. It makes me saving a lot of time.


Feedback from musicians who’ve tried it has been extremely positive so far, with one
noting in the traditional music chatroom, thesession.org, that he “tried several more
well-known tunes, with good results. It’s very forgiving of minor mistakes, but gets confused if
you swing the rhythm too much”. Another remarked that tunepal.org worked well with
both obscure and well-known tunes.



All in all, maybe a small step for a musician but a giant leap for traditional music?


What I learned from
doing my PhD


Pick your area carefully


Pick your supervisors/externs carefully


Your PhD is your hobby


Publish and present all the time


Be single-minded (when the time comes)


Be original!


Grab the opportunities when they come


Come up with some catchy acronyms


Know the area and the conferences


ABC


Enjoy it! Especially the finish


Buy my book!

“A ripping good yarn” - Damian Gordon