EarGram: an Application for Interactive Exploration of Large Databases of Audio Snippets for Creative Purposes
Gilberto Bernardes
Nov 10, 2012
About This Presentation
Presented at CMMR 2012, London, UK.
Slide Content
earGram
concatenative sound
synthesis in Pure Data
Gilberto Bernardes FEUP INESC
Carlos Guedes FEUP INESC
Bruce Pennycook UT Austin
•brief introduction to concatenative sound
synthesis (CSS)
•description of the main modules of earGram
•detail the recombination strategies for
automatic music generation implemented in
earGram
•demo
•future work
[outline]
[concatenative sound synthesis]
figure labels: units described by feature vectors, e.g. (2, 4, 100, ...) and (5, 1, 80, ...); target; corpus; synthesis
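The matching step suggested by the figure can be sketched as a nearest-neighbour search: each unit is a feature vector, and the unit closest to a target vector is picked for synthesis. This is only an illustration; the Euclidean distance, the function names, and the three-feature corpus are assumptions, not details taken from earGram.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_unit(target, corpus):
    """Return the index of the corpus unit whose vector is closest to the target."""
    return min(range(len(corpus)), key=lambda i: distance(target, corpus[i]))

# Hypothetical corpus of three units; the values are illustrative only.
corpus = [(2, 4, 100), (5, 1, 80), (9, 9, 10)]
chosen = select_unit((4, 2, 85), corpus)  # index of the best-matching unit
```

In practice the features would usually be normalized first, so that a feature with a large numeric range does not dominate the distance.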
[earGram: design scheme]
•offline and online segmentation
•segmentation modes: beat (Dixon, 2001),
onset (Brossier, 2006; Puckette, 2005;
Brent, 2010), uniform size
•auto-mode automatically selects between
beat and onset segmentation modes,
according to the audio characteristics
[analysis: segmentation]
•an n-dimensional vector of numerical
features represents each unit
•the feature values correspond to low- and
mid-level descriptions of the audio data
•feature vectors can be static or dynamic
[analysis: feature vector]
•transition probability table between
consecutive units for harmony
(fundamental bass) and timbre
•meter induction
•label and group units according to their
position within the meter
•representation of the noisiness of the signal
on a meter basis
[analysis: representing the temporal evolution]
[analysis: representing the temporal evolution]
figure: transition probability tables, 1st to 7th order, for the fundamental bass
1st order: rows and columns are pitch classes (C, C#, D, D#, E, ...)
2nd order: rows are pairs of pitch classes (C C, C C#, C D, C D#, C E, ...), columns are pitch classes
example cell values: 0.1, 0.6, 0.3, 0.2, 0.4
figure labels: spec, audibility, pitch weight, pitch salience
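A transition probability table of a given order can be sketched by counting which label follows each context of `order` consecutive labels and normalizing every row. This is a minimal sketch; the function name and the toy fundamental-bass sequence are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def transition_table(sequence, order=1):
    """Map each context (tuple of `order` labels) to the probabilities of the next label."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    table = {}
    for context, row in counts.items():
        total = sum(row.values())
        table[context] = {label: n / total for label, n in row.items()}
    return table

# Hypothetical fundamental-bass labels, one per unit.
bass = ["C", "D", "C", "E", "C", "D", "E", "C"]
first_order = transition_table(bass, order=1)
second_order = transition_table(bass, order=2)
```

Each row sums to 1, so it can be sampled directly when generating a new sequence of units.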
[analysis: representing noisiness]
histogram presenting the normalized
distribution of the noisiness of the units
over the length of a bar
e.g. a 4/4 (C) bar with beats 1, 2, 3, 4
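One way to build such a histogram is to accumulate the noisiness of every unit at its position within the bar and then normalize the totals so they sum to 1. The sketch below assumes that each unit already carries a metrical position and a noisiness value; the names and the normalization choice are illustrative assumptions.

```python
def noisiness_histogram(units, positions_per_bar):
    """units: iterable of (metrical_position, noisiness) pairs.

    Returns one value per bar position, normalized to sum to 1.
    """
    totals = [0.0] * positions_per_bar
    for position, noisiness in units:
        totals[position % positions_per_bar] += noisiness
    grand_total = sum(totals)
    if grand_total == 0:
        return totals  # no noisiness measured: keep an all-zero histogram
    return [t / grand_total for t in totals]

# Hypothetical units in a 4/4 bar (positions 0 to 3 stand for beats 1 to 4).
units = [(0, 0.2), (1, 0.6), (2, 0.2), (3, 0.6), (0, 0.2), (1, 0.2)]
histogram = noisiness_histogram(units, positions_per_bar=4)
```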
[visualizations]
[dimensionality reduction]
figure: point P (24, 26, 20, 50, 15, 60, 38, 20) plotted on eight axes C1 to C8, with per-axis contributions d1 to d8
•star coordinates (Kandogan,
2001)
•requires interactive and visual
exploration of the corpus to
extract meaningful
information (notably clusters)
•Scaling changes contribution
to resulting visualization
•Rotation induces correlation
between data columns
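The projection behind star coordinates can be sketched as a sum of axis vectors: each feature is normalized to its range, multiplied by the length of its axis, and added along the direction of that axis. The sketch below assumes axes spread evenly around a circle by default; the function name, the `scales` and `angles` parameters, and the 0 to 100 feature ranges are assumptions made for illustration.

```python
import math

def star_coordinates(point, minima, maxima, scales=None, angles=None):
    """Project an n-dimensional point onto 2-D star coordinates."""
    n = len(point)
    scales = scales or [1.0] * n  # axis lengths (the "scaling" interaction)
    angles = angles or [2 * math.pi * i / n for i in range(n)]  # axis directions ("rotation")
    x = y = 0.0
    for value, low, high, scale, angle in zip(point, minima, maxima, scales, angles):
        normalized = (value - low) / (high - low) if high > low else 0.0
        x += scale * normalized * math.cos(angle)
        y += scale * normalized * math.sin(angle)
    return x, y

# Point P from the slide, with assumed feature ranges of 0 to 100.
p = (24, 26, 20, 50, 15, 60, 38, 20)
position = star_coordinates(p, [0] * 8, [100] * 8)
```

Lengthening one axis increases the weight of that feature in the layout, and rotating two axes towards each other makes their features pull points in the same direction.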
[clustering]
•k-means: number of clusters
•QT-clustering: cluster diameter, minimum number
of units per cluster
•DBSCAN: neighborhood proximity threshold, minimum
number of units per cluster
[database]
•collection of arrays (one array per feature)
•can be saved into a text file (.txt) and
loaded later to avoid repeating the
time-consuming task of building the
database
•allows intuitive and
interactive exploration of
the corpus
•extended granular
synthesizer with refined
control
•dynamic parameters: gain,
density of events, pitch
deviations, and panning
[performance: spaceMap]
[performance: infiniteMode]
•extends the audio source
indefinitely while
retaining its structural
qualities
•user must select and rank
the characteristics he/she
wants to evolve over time
•is suitable for both
soundscapes and
polyphonic “concert”
music
[performance: infiniteMode]
figure: meter ∩ harmony ∩ noisiness
previously selected unit: 3
units shown in the figure: 4, 8, 2, 0
from the remaining units the algorithm selects the
closest to the previously played unit
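The selection described above can be sketched as a set intersection followed by a nearest-neighbour pick. In this sketch each characteristic contributes a set of candidate unit indices, ordered from highest to lowest rank. Relaxing the lowest-ranked characteristic when the intersection is empty, and excluding the previous unit itself, are assumptions rather than documented behaviour, as are all names and values.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def next_unit(previous, candidate_sets, features):
    """Pick the candidate closest to the previously played unit.

    candidate_sets: sets of unit indices, highest-ranked characteristic first.
    features: dict mapping unit index to its feature vector.
    """
    sets = list(candidate_sets)
    while sets:
        # Assumption: the previous unit is not allowed to repeat immediately.
        remaining = set.intersection(*sets) - {previous}
        if remaining:
            return min(sorted(remaining),
                       key=lambda u: euclidean(features[u], features[previous]))
        sets.pop()  # assumption: drop the lowest-ranked characteristic and retry
    return None

# Hypothetical one-dimensional features and candidate sets.
features = {0: (0.0,), 2: (0.5,), 3: (0.3,), 4: (0.35,), 8: (0.9,)}
meter, harmony, noisiness = {0, 2, 4, 8}, {2, 4, 8}, {0, 2, 4, 8}
chosen = next_unit(3, [meter, harmony, noisiness], features)
```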
[performance: shuffMeter]
•creates patterns stochastically
according to a preassigned time
signature and metrical level
•Clarence Barlow’s Indispensability
algorithm defines a template that
is used as a target phrase
•template is assigned to spectral
flux and loudness
•adaptation during performance
according to two parameters:
roughness and loudness
•roughness scales the template
towards its medium value and
loudness scales the template up to
a maximum value
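The two performance parameters can be read as simple transformations of the template. The sketch below is one possible interpretation, not earGram's implementation: it assumes both parameters lie in [0, 1], takes the "medium value" to be the midpoint between the template's minimum and maximum, and assumes that a higher roughness flattens the template more. The template values are illustrative and are not computed with Barlow's algorithm here.

```python
def adapt_template(template, roughness, loudness):
    """Scale a metrical template by roughness and loudness (both assumed in [0, 1])."""
    middle = (max(template) + min(template)) / 2  # assumed meaning of "medium value"
    # Roughness pulls every value towards the middle of the template.
    flattened = [middle + (v - middle) * (1.0 - roughness) for v in template]
    # Loudness scales the result; 1.0 keeps the template's maximum.
    return [v * loudness for v in flattened]

# Hypothetical four-pulse template with values in [0, 1].
template = [1.0, 0.0, 0.5, 0.25]
unchanged = adapt_template(template, roughness=0.0, loudness=1.0)
flat = adapt_template(template, roughness=1.0, loudness=1.0)
quiet = adapt_template(template, roughness=0.0, loudness=0.5)
```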
[performance: soundscapeMode]
•dynamic control over
soundscape generation
•density (vertical axis): the
number of units played
simultaneously, ranging
from 1 to 5
•sharpness (horizontal axis)
defines the target to be
synthesized according to the
diversity and stability of the
units, which is translated to a
value in the spectral flux space
[performance: soundscapeMode]
density = 3
given a group of units that satisfies the query for
the next unit, we select the one that minimizes the
distance in the Bark space to the previously
selected unit
•concatenate units with a short overlap or
through a phase vocoder
•smooth remaining discontinuities between
concatenated units with a spectral
compander (+spectralcompand~ from the
SoundHack plugins bundle)
[synthesis]
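Concatenation with a short overlap can be sketched as a crossfade between the tail of one unit and the head of the next. The sketch below uses a linear crossfade over plain lists of samples; the fade shape, the function name, and the sample values are assumptions, and the phase vocoder and spectral compander stages are not covered.

```python
def concatenate(units, overlap):
    """Join units (lists of samples) with a linear crossfade of `overlap` samples."""
    if not units:
        return []
    output = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(output), len(unit))
        for i in range(n):
            fade_in = (i + 1) / (n + 1)  # weight of the incoming unit
            tail = len(output) - n + i
            output[tail] = output[tail] * (1.0 - fade_in) + unit[i] * fade_in
        output.extend(unit[n:])  # append the part of the unit after the overlap
    return output

# Two hypothetical four-sample units joined with a two-sample overlap.
joined = concatenate([[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]], overlap=2)
```

The result is shorter than the sum of the unit lengths by `overlap` samples per junction, which is why the overlap is normally kept short.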
•studio experimentation and live performance,
ranging from installations to polyphonic concert
music
•computer-assisted composition
•computer-assisted improvisation
•interactive music systems
•collecting statistics from a corpus for
musicological purposes
[applications]
[eargram.modules]
•modularity
•PD-friendly
•expert users
•multiple corpora
at once
•meaningful representations of the corpus (P. Schaeffer’s
Typo-morphology and D. Smalley’s Spectromorphology)
•more robust rhythmic descriptions
•new performance modes: stepSeq
•bridging the playing modes to interactive music systems
[future work]
•https://sites.google.com/site/eargram/
•code (released under the GNU GPL)
•examples
•documentation
[website]
•Brent, W.: A Timbre Analysis and Classification Toolkit for Pure Data. In: Proceedings
of the ICMC, New York, USA (2010)
•Brossier, P.: Automatic Annotation of Musical Audio for Interactive Applications.
Ph.D. thesis, Queen Mary, University of London (2006)
•Dixon, S.: An interactive beat tracking and visualization system. In: Proceedings of
the ICMC (2001)
•Inselberg, A.: Parallel Coordinates: Visual Multidimensional Geometry and Its
Applications. Springer (2009)
•Kandogan, E.: Visualizing Multi-dimensional Clusters, Trends, and Outliers using
Star Coordinates. In: Proceedings of the Knowledge and Data Mining (2001)
•Pure Data, http://www.puredata.info
•SoundHack Plugins Bundle, http://soundhack.henfast.com/
[references]