initialize algorithms, finalize README.md
lingyigu committed May 13, 2017
1 parent 4f1f8f6 commit ed349bb
Showing 5 changed files with 493 additions and 1 deletion.
152 changes: 151 additions & 1 deletion README.md
@@ -1,2 +1,152 @@
# chord-recognition
Projects developed during the Spring 2017 Computational Audio course at Boston University.
Team: [Lingyi(Echo) Gu](http://github.com/lingyigu), Huiyi Chen

## Getting Started
This project is developed using [Anaconda](https://docs.continuum.io/anaconda/install) with [Python3.3+](https://www.python.org/download/releases/3.0/). To install the Anaconda distribution, see the documentation [here](https://docs.continuum.io/anaconda/install).

This project also uses the external library [librosa](https://github.com/librosa/librosa).<br>
To install ```librosa``` with pip:

    pip3 install librosa

To install ```librosa``` for Anaconda:

    conda install -c conda-forge librosa

To run the scripts, execute ```chordgram.py``` under the ```chord-recognition``` directory.
You can also modify the parameters in ```main.py``` as follows.

```python3
## load file
# directory
dir = "audiosamples"
# filename
filename = "PianoChordsElectric.wav"
file = dir + "/" + filename
# offset: start reading after this time (in seconds)
offset = 0
# duration: only load up to this much audio (in seconds)
duration = 50

## chromagram
# sr: sampling rate
sr = 44100
# hop_length: number of samples between successive chroma frames (frame size)
hop_length = 4096

## chordgram
# w: filter size; w = 30 is a good compromise that works well for most songs
w = 30
```
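To get a feel for what these parameters mean in seconds, the arithmetic can be sketched as below. This is only an illustration using the values listed above; the variable names ```frame_seconds``` and ```window_seconds``` are ours and do not appear in the project.

```python
# Parameter values copied from the listing above.
sr = 44100          # samples per second
hop_length = 4096   # samples between successive chroma frames

# Each chroma frame advances by hop_length samples.
frame_seconds = hop_length / sr

# A mode filter of size w spans roughly w frames of audio.
window_seconds = {w: w * frame_seconds for w in (15, 30)}

print(f"one frame  = {frame_seconds:.4f} s")
for w, seconds in window_seconds.items():
    print(f"w = {w:2d} spans about {seconds:.2f} s")
```

With these settings one frame covers a little under a tenth of a second, so the filter size directly controls how short a chord can be before smoothing removes it.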

## Motivation
We're interested in developing a **Chord Recognition** tool that recognizes major and minor chords.
When we were young, we both had experience with piano.
Recently Huiyi decided to pick her piano skills back up, and Echo has been learning guitar for a while.
When a song is relatively new, it's very unlikely that sheet music for it can be found online.
In that case, we would go to websites like [Chordify](https://chordify.net/) and get the chords from there.
We were curious about how chord recognition works in general, so we decided to work on this topic.


## Algorithms
![algorithm](https://github.com/lingyigu/chord-recognition/blob/master/visualizations/algorithm.png)

#### Chromagram Calculation
The Chromagram is constructed much like a spectrogram, which represents how the frequency spectrum changes over time.
Instead of raw frequency bins, each frame is a 12-dimensional chroma vector over the set of pitch classes ```{C, Db, D, Eb, E, F, F#, G, Ab, A, Bb, B}```, and each element of the vector shows the strength of that pitch class in the input.
Computing the Chromagram amounts to taking the frequencies and amplitudes in the spectrogram and accumulating them into the corresponding notes.

There are many ways to calculate it as discussed in different papers.
Short-time Fourier Transform, Constant Q Transform and Fast Fourier Transform are the three common ones that people use.
We implemented two of them over the course of this project: the CQT approach and the FFT approach.
The CQT approach is the one we chose for the template matching described later. Partial code for the FFT approach is also included at the end of the ```chromagram.py``` file.
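As a rough illustration of the FFT approach, the sketch below folds the magnitude spectrum of a single frame into 12 pitch classes using only NumPy. It is not the code from ```chromagram.py```: the function name ```fft_chroma```, the frequency limits and the reference pitch of 261.63 Hz for C are assumptions made for this example.

```python
import numpy as np

PITCH_CLASSES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def fft_chroma(frame, sr, fmin=55.0, fmax=2000.0, c_ref=261.63):
    """Fold the FFT magnitudes of one frame into a 12-element chroma vector."""
    windowed = frame * np.hanning(len(frame))        # reduce spectral leakage
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)

    # Ignore bins outside the assumed musical range (this also skips 0 Hz).
    keep = (freqs >= fmin) & (freqs <= fmax)

    # Distance in semitones from the reference C, rounded to the nearest note.
    semitones = np.round(12 * np.log2(freqs[keep] / c_ref)).astype(int)

    chroma = np.zeros(12)
    np.add.at(chroma, semitones % 12, magnitude[keep])  # accumulate per pitch class

    peak = chroma.max()
    return chroma / peak if peak > 0 else chroma

# Synthetic test signal: a pure 440 Hz tone (the note A).
sr = 44100
t = np.arange(4096) / sr
chroma = fft_chroma(np.sin(2 * np.pi * 440.0 * t), sr)
strongest = PITCH_CLASSES[int(np.argmax(chroma))]
print(strongest)
```

A CQT spaces its bins logarithmically, so it resolves low notes better than this plain FFT folding does.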

A good way to visualize the Chromagram is a two-dimensional image, showing time or number of frames on the x axis and 12 pitch classes on the y axis.
We can easily identify which pitches are the strongest according to the color.

Given the audio ```PianoChordsElectric.wav``` provided under the ```audiosamples``` directory, here's the Chromagram produced by analyzing the wave file between 0:00 - 0:50.
For the rest of our analysis, we always use this segment of the wave file.

![set1](https://github.com/lingyigu/chord-recognition/blob/master/visualizations/set1.png)

Here's another chromagram produced by analyzing the same wave file between 0:55 - 1:45.

![set2](https://github.com/lingyigu/chord-recognition/blob/master/visualizations/set2.png)

#### Chordgram Calculation
The next thing we need to do is to determine the chord probabilities from the chroma vectors we have calculated from the previous step.
It is basically a comparison between the typical distributions of chords and the energy computed in the chroma vectors to estimate the correct chord.
We will do this analysis frame by frame.

###### Template
The set of chords we want to detect is the 12 major chords, 12 minor chords and a "N.C" chord, which stands for "no chord".
Given the template for ```C``` and ```Cm``` as follows, we can generate the templates for all 24 chords simply by shifting the numbers in the array.

    C major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    C minor = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]

For the special "no chord" case, all pitch classes are weighted equally. The template for "N.C" follows the one given by Christoph Hausner (reference 4 below).

    N.C. = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
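The shifting idea can be sketched with ```np.roll```. This is a minimal illustration and not the project's ```generate_template```: it assumes the C major triad is C-E-G (pitch classes 0, 4 and 7), and the function name ```build_templates``` is made up for this example.

```python
import numpy as np

PITCH_CLASSES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

# Assumed base triads: C major = C-E-G, C minor = C-Eb-G.
C_MAJOR = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0])
C_MINOR = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0])

def build_templates():
    """Return a dict mapping chord name to its 12-element binary template."""
    templates = {}
    for shift, root in enumerate(PITCH_CLASSES):
        # Rotating right by `shift` moves the root from C to the new root.
        templates[root] = np.roll(C_MAJOR, shift)
        templates[root + "m"] = np.roll(C_MINOR, shift)
    templates["NC"] = np.ones(12, dtype=int)  # "no chord": all classes equal
    return templates

templates = build_templates()

def notes_of(chord):
    """List the pitch classes that are switched on in a chord's template."""
    return [PITCH_CLASSES[i] for i in np.flatnonzero(templates[chord])]

print(len(templates))   # 24 chords plus "NC"
print(notes_of("G"))    # pitch classes of G major
print(notes_of("Am"))   # pitch classes of A minor
```

Printing the notes of a few shifted templates is a quick way to check that the rotation goes in the intended direction.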

###### Cosine Similarity
Then we can match the chroma vector calculated for the current frame against the template.
It would be straightforward to use the Euclidean distance, as many researchers do; however, we decided to adopt the [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) computation.
Since we want to improve the ability of our algorithm to detect complex chords, we chose this method because it gave us slightly more accurate results than the Euclidean distance.
By computing the cosine similarity between the chroma vector and every template for each frame, we obtain the Chordgram as a result.

Here's the Chordgram produced.
![chordgram.png](https://github.com/lingyigu/chord-recognition/blob/master/visualizations/chordgram.png)
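The matching step for a single frame can be sketched as follows. The chroma values are invented for illustration, only four hand-written templates are compared, and the zero-norm guard is an addition of this example rather than something taken from the project.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom > 0 else 0.0

# A few templates written out by hand (order: C, Db, D, ..., B).
templates = {
    "C":  np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0], dtype=float),
    "Cm": np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0], dtype=float),
    "Am": np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0], dtype=float),
    "NC": np.ones(12),
}

# Made-up chroma vector: strong C, E and G with a little noise elsewhere.
frame = np.array([0.9, 0.1, 0.0, 0.05, 0.8, 0.1, 0.0, 0.7, 0.0, 0.1, 0.0, 0.05])

scores = {name: cosine_similarity(frame, t) for name, t in templates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))

# Cosine similarity ignores overall loudness; Euclidean distance does not.
louder = 3.0 * frame
same_cosine = abs(cosine_similarity(louder, templates["C"]) - scores["C"]) < 1e-12
same_euclid = np.isclose(np.linalg.norm(louder - templates["C"]),
                         np.linalg.norm(frame - templates["C"]))
print(same_cosine, same_euclid)
```

One practical difference from the Euclidean distance is visible in the last lines: scaling the chroma vector leaves the cosine score unchanged, so loud and quiet frames are treated alike.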

#### Chord Sequence Estimation
We can also estimate the chord sequences from the Chordgram obtained.
In order to do this, we compute the chord with the highest probability during each time step.
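Picking the strongest chord per frame is an ```argmax``` over the chord axis. The tiny chordgram below is invented for illustration (3 chords by 4 frames); a real one would have 25 rows.

```python
import numpy as np

labels = ["C", "G", "NC"]

# Rows are chords, columns are frames; values are similarity scores.
H = np.array([
    [0.9, 0.8, 0.2, 0.1],   # C
    [0.3, 0.4, 0.9, 0.7],   # G
    [0.5, 0.5, 0.5, 0.8],   # NC
])

# For every column (frame), take the row (chord) with the highest score.
sequence = [labels[i] for i in H.argmax(axis=0)]
print(sequence)
```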

###### Mode Filter
Before that, there's another factor we need to take into consideration: noise.
One way to handle it is to apply the mode filter presented by Christoph Hausner (reference 4 below), which replaces each value with the mode of all values in its neighborhood, where ```w``` stands for the filter size.
In our case, we have set ```w = 15``` after testing a few different values. We find this value works fine in this scenario.
A larger filter size makes the smoothing stronger, which helps if we only have slow-changing chords; generally ```w = 30``` is adequate.
We may also repeat the smoothing process to obtain more accurate results.

    R[r] = mode { R[r - w / 2], ..., R[r + w / 2] }
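The formula can be sketched on a sequence of chord labels using only the standard library. Two things here are assumptions of this example and differ from ```chordgram.py``` in this commit, which filters each row of the Chordgram and leaves the first and last frames at zero: the filter is applied to labels, and the window is simply truncated at the edges. The name ```mode_filter``` is made up.

```python
from collections import Counter

def mode_filter(labels, w):
    """Replace each label with the most common label in a window of size w."""
    half = w // 2
    smoothed = []
    for r in range(len(labels)):
        lo = max(0, r - half)                  # truncate the window at the start
        hi = min(len(labels), r + half + 1)    # and at the end of the sequence
        window = labels[lo:hi]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed

# A C chord followed by a G chord, with a few misdetected frames.
noisy = ["C", "C", "G", "C", "C", "C", "G", "G", "C", "G", "G", "G"]
smoothed = mode_filter(noisy, w=5)
print(" ".join(smoothed))
```

Isolated outliers shorter than about half the window disappear, while the real change from C to G survives.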

###### Results
According to the smoothed Chordgram, we obtain the most likely chord for each frame, i.e. the chord with the highest cosine similarity.
Here's the output. It's easy to see that the chords estimated without the smoothing filter fluctuate more.

[CHORD-RECOGNITION]
File: audiosamples/PianoChordsElectric.wav
Time: 0 to 50
[Chords w/t smoothing]
Bb - - Bb - - Bb - B - - - - Eb - Eb - - F# - D - - - Eb Eb - - - - Eb Eb Eb - - - Ab Ab C C C C Ab C Ab C Ab C C C C C C C C C C Em Em Em Em Em Em Em Em E - - - C - - - - - - - - - - - - - - - - - E - G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G - - - - G - - - - F# Eb - - Bb Bb - - Eb D D D D D D D D D D D D D D D D D D D D D D D D D Bb D D D - - - - Eb E B - - F - - - - - - - - - - A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A - - - - - - - B - - - D E A A A A A A A A A A A A A A A A A A A A A A A A Ab Ab Ab A A Ab A - - - - - - F B B B B B B B B B G G G G G G G G G G G G B B C C C C Eb - - - Eb - - C - - - B - - F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# F# B G B F# - - - B Eb - - - - - G - A A A A A A A A A A A A A A Fm Fm A Fm Fm Fm Fm Fm Fm Fm Fm Fm Fm Fm Fm A A A - - F# - - - Db - - - Bb Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab Ab A A Ab A Ab - - - - - - Bb D - - Ab Ab Ab Ab G Ab Ab Ab Ab Ab Ab Ab G G G G G G G G Gm G Ab - G - Eb - Eb Eb - Eb Eb Eb - - F# - A Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb F# F# F# F# F# G - - - Eb Eb - B - Db F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F Bb F F
[Chords w/ smoothing]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - G G G G G G G G G G C G C Ab C C C C C C Em Em Em Em Em Em Em Em Em Em C C C C - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Eb G G G G G G G G G G G G G G G G G G G G G G G G G Eb - - - - - - - - - - - - - - - - - - - - - - D D D D D D D D D D D D D D D D Bb D D Bb - - - - - - - - - - - - - - - - - - - - - - - - - - - - - F A A A A A A A A A A A A A A A A A A A Bb Bb Bb Bb Bb - - - - - - - - E E E E E E E E Db Ab Ab Ab A A A A A A A A A A A A A A A A A E E E E E E E E - - - - - - - - - - - F# F# G G G G G G G G B B B B B Bm G B B B B B B B B B B - - - - - - - - - - - - - - F# F# F# F# F# F# F# F# F# F# F# B B F# F# F# F# F# B B B B B - - - - - - - - - - - - - - Ab Ab Ab A A A A A A A Fm Fm Fm Fm Fm Fm Fm Fm Fm A A A A Db - - - - - - - - - - - Db Db Db Db Db Db Db Db Db Db Db Ab Ab Ab Ab Ab Ab Ab Ab Ab A A A A A A Ab Ab Ab Ab Ab Db Db Db Db - - - Eb Eb G G G G Eb Eb Eb Eb G Ab G G G G G G G G Gm Gm Eb Eb Eb Eb Eb Eb Eb Eb Eb Eb - - - - - - - - - - Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb Bb F# F# F# F# F# F# Bb Bb Bb Bb Bb Bb Bb - - - - - - - - - - - - - Db F F F F F F F F F F F F F F F - - - - - - - - - - - - - - -

We can also compare it to the actual chord information given in ```PinaoChordsDescription.txt```.
As we can see, the general chord information is correct, although it doesn't do a good job with ```C``` and ```Ab```.

Set 1 (0:0 - 0:50): The 12 major chords in root position (i.e., C major would be C-E-G)
played in the circle of fifths:
C, G, D, A, E, B, F#, C#, Ab, Eb, Bb, F

## Limitations and Future Work
1. This program only works well for an audio file of pure piano chords, like the ones provided in the ```audiosamples``` directory.
(I have tried it with a pop song, terrible result!)

2. In this program, we have chosen to use a simple filter method. This strategy doesn't clean up the noise as much as we want.
We may need to combine different filters, such as low-pass and median filters.

3. A few chords are off. We may improve this by trying different measures of fit and finding a better one for this particular kind of music file.

4. It does not have a user interface yet.
The web would be a good way to present this tool, since users could just go to the website and check the chords of a song.

## Reference
1. **"Spectral Analysis"**: [http://www.cs.bu.edu/~snyder/cs591/Lectures/SpectralAnalysisChords.pdf](http://www.cs.bu.edu/~snyder/cs591/Lectures/SpectralAnalysisChords.pdf).

2. **"CHORD RECOGNITION USING MEASURES OF FIT, CHORD TEMPLATES AND FILTERING METHODS"** by Laurent Oudre, Yves Grenier, and Cédric Févotte: [http://www.ee.columbia.edu/~dpwe/papers/OudGF09-chords.pdf](http://www.ee.columbia.edu/~dpwe/papers/OudGF09-chords.pdf).

3. **"TEMPLATE-BASED CHORD RECOGNITION : INFLUENCE OF THE CHORD TYPES"** by Laurent Oudre, Yves Grenier, and Cédric Févotte: [http://laurentoudre.fr/publis/OGF-ISMIR-09.pdf](http://laurentoudre.fr/publis/OGF-ISMIR-09.pdf).

4. **"Design and Evaluation of a Simple Chord Detection Algorithm"** by Christoph Hausner: [http://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/sauer/geyer/BA_MA_Arbeiten/BA-HausnerChristoph-201409.pdf](http://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/sauer/geyer/BA_MA_Arbeiten/BA-HausnerChristoph-201409.pdf)
158 changes: 158 additions & 0 deletions chordgram.py
@@ -0,0 +1,158 @@
from parameters import *
from chromagram import *
import librosa
import librosa.display
import matplotlib.pyplot as plt  # used by the display branches below
import numpy as np
from scipy.stats import mode

def generate_template():
    template = {}
    majors = ["C","Db","D","Eb","E","F","F#","G","Ab","A","Bb","B"]
    minors = ["Cm","Dbm","Dm","Ebm","Em","Fm","F#m","Gm","Abm","Am","Bbm","Bm"]

    # template for C (C-E-G) and Cm (C-Eb-G)
    tc = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    tcm = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]

    # rotate the C template right by one step per semitone above C
    for shifted, chord in enumerate(majors):
        template[chord] = tc[12 - shifted:] + tc[:12 - shifted]

    for shifted, chord in enumerate(minors):
        template[chord] = tcm[12 - shifted:] + tcm[:12 - shifted]

    # template for no chords
    tnc = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    template["NC"] = tnc

    return template

def cossim(u, v):
    """
    :param u: non-negative vector u
    :param v: non-negative vector v
    :return: the cosine similarity between u and v
    """
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def chordgram(C, display=False):
    """
    :param C: chromagram C
    :return: chordgram H
    """
    frames = C.shape[1]

    # initialize
    template = generate_template()
    chords = list(template.keys())
    chroma_vectors = np.transpose(C)
    # chordgram
    H = []

    for n in np.arange(frames):
        cr = chroma_vectors[n]
        sims = []

        for chord in chords:
            t = template[chord]
            # calculate cos sim, add weight (the "NC" score is scaled by 0.7)
            if chord == "NC":
                sim = cossim(cr, t) * 0.7
            else:
                sim = cossim(cr, t)
            sims += [sim]
        H += [sims]
    H = np.transpose(H)

    if display:
        plt.figure(figsize=(10, 5))
        librosa.display.specshow(H, sr=sr, x_axis="frames")
        plt.title("Chordgram")
        plt.colorbar()
        plt.tight_layout()
        plt.show()

    return H

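def smoothing(s):
    """
    :param s: sequence s
    :return: mode filter for sequence s
    """
    w = 15
    # the first and last w entries are not filtered and stay at 0
    news = [0] * len(s)
    for k in np.arange(w, len(s) - w):
        window = [s[i] for i in range(k - w // 2, k + w // 2 + 1)]
        # mode(...).mode is an array in older SciPy releases and a scalar in
        # newer ones; np.atleast_1d handles both
        news[k] = np.atleast_1d(mode(window).mode)[0]
    return news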

def smoothed_chordgram(H, display=False):
    """
    :param H: chordgram
    :return: chordgram after filtering
    """
    chords = H.shape[0]
    H1 = []

    # filter each chord's row of the chordgram independently
    for n in np.arange(chords):
        H1 += [smoothing(H[n])]

    H1 = np.array(H1)
    if display:
        plt.figure(figsize=(10, 5))
        librosa.display.specshow(H1, sr=sr, x_axis="frames")
        plt.title("Chordgram")
        plt.colorbar()
        plt.tight_layout()
        plt.show()

    return H1

def chord_sequence(H):
    """
    :param H: chordgram H
    :return: a sequence of chords
    """
    template = generate_template()
    chords = list(template.keys())

    frames = H.shape[1]
    H = np.transpose(H)
    R = []

    for n in np.arange(frames):
        index = np.argmax(H[n])
        # an all-zero frame (e.g. an unfiltered edge) is reported as "NC"
        if H[n][index] == 0.0:
            chord = "NC"
        else:
            chord = chords[index]

        R += [chord]

    return R

def tostring_chords(input):
    string = ""
    for r in input:
        # "NC" (no chord) is printed as a dash
        if r == "NC":
            string += " -"
        else:
            string += " " + r
    return string

# Chord-Recognition Done!
C = chromagram(file)
H = chordgram(C)
H1 = smoothed_chordgram(H)
R = chord_sequence(H)
R1 = chord_sequence(H1)

print("[CHORD-RECOGNITION]")
print("File:", file)
print("Time:", offset, "to", offset + duration)
print("[Chords w/t smoothing]")
print(tostring_chords(R))
print("[Chords w/ smoothing]")
print(tostring_chords(R1))