Why is music21 using pitch attributes in an unexpected way? - python-3.x

Consider the following testing code.
from music21 import pitch

C0 = 16.35
for f in [261, 130, 653, 64, 865]:
    p = pitch.Pitch()
    p.frequency = f
    # Compare manual frequency with music21 frequency
    f1 = p.frequency
    f2 = C0 * pow(2, p.octave) * pow(2, p.pitchClass / 12) * pow(2, p.microtone.cents / 1200)
    print(f, f1, f2)
    # Compare manual pitchspace with music21 pitchspace
    ps1 = p.ps
    ps2 = 12 * (p.octave + 1) + p.pitchClass + p.microtone.cents / 100
    print(ps1, ps2)
    print()
The output of this is
261 260.99999402174154 521.9489797003519
59.958555 71.95855499999999
130 129.99999854289362 259.974590631057
47.892097 59.892097
653 653.0000144741496 652.9362051837928
75.834954 75.834954
64 63.999998381902046 65.86890433005668
35.623683 36.123683
865 864.9999846113213 890.2594167561009
80.702359 81.202359
There is often a difference between my manual computation of the frequency (and of the pitch space value) and the music21 value.
Note that sometimes this difference is about an octave (as with the first two frequencies, which are close to C notes), but mostly it is about one tone. Another odd thing is that for the third test frequency the pitch space values agree while the frequencies do not.
What could be wrong with my manual formulas?

So it appears that while the deviation of an octave was a bug, the other deviations are intended behaviour. See https://github.com/cuthbertLab/music21/issues/96 for detailed explanation.
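For cross-checking values like the ones printed above without going through the octave and pitch-class attributes, a frequency can be converted to a pitch space number directly. The sketch below is not music21 code. It assumes twelve-tone equal temperament with A4 = 440 Hz mapped to pitch space 69 (an assumption, although it does reproduce the p.ps values in the output above), and the function names are made up for illustration.

```python
import math

A4_HZ = 440.0   # assumed reference tuning
A4_PS = 69.0    # assumed pitch space (MIDI) number of A4


def freq_to_ps(freq_hz):
    """Convert a frequency in Hz to a fractional pitch space number."""
    return A4_PS + 12.0 * math.log2(freq_hz / A4_HZ)


def ps_to_freq(ps):
    """Convert a fractional pitch space number back to a frequency in Hz."""
    return A4_HZ * 2.0 ** ((ps - A4_PS) / 12.0)


for f in [261, 130, 653, 64, 865]:
    ps = freq_to_ps(f)
    # the round trip should give back the input frequency
    print(f, round(ps, 6), round(ps_to_freq(ps), 6))
```

Because this goes straight from frequency to a single number, it does not depend on how a library chooses to split a pitch into octave, pitch class, accidental and microtone.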

Related

Divide by a number which is not power of 2 in Verilog RTL coding

For multiplication and division by a power of 2, we can use left and right shifts.
x>>2 // shifts right by 2 ---> 2^2 = 4 (this divides or multiplies by 4, depending on which end holds the MSB)
However, if we want to divide by a number that isn't a power of 2, how can we achieve this?
Booth's algorithm is an additive one and can take a comparatively longer time than the multiplicative algorithms, like the Newton-Raphson algorithms found in this educational PDF.
Each next approximation is calculated using the previous approximation.
X(N+1) = X(N) * (2 - b * X(N)), where X(0) = 1
So, to find the inverse of b, i.e. 1/b, where b=0.6 (error=e(x)), it takes about 5 iterations.
X(000) = 1.000
X(001) = 1.000 * (2 - (1.000 * 0.6)) = 1.400
X(002) = 1.400 * (2 - (1.400 * 0.6)) = 1.624
X(003) = 1.624 * (2 - (1.624 * 0.6)) = 1.6655744
X(004) = X(003) * (2 - (X(003) * 0.6)) = 1.666665951
X(005) = X(004) * (2 - (X(004) * 0.6)) = 1.666666668
which approximates the answer, which is 1.6666666667.
I included this example in case the referenced PDF disappears. See the referenced PDF or look up the Newton-Raphson algorithm for more information.
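The iteration in the worked example can be reproduced with a few lines of Python. This is only a floating-point sketch of the recurrence, not RTL: a hardware version would use fixed-point values, and the function name and default arguments here are illustrative assumptions. Note that with the starting guess X(0) = 1 the error 1 - b * X(N) is squared at each step, so this starting guess only converges for 0 < b < 2.

```python
def reciprocal(b, iterations=5, x0=1.0):
    """Approximate 1/b with X(N+1) = X(N) * (2 - b * X(N))."""
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x


approx = reciprocal(0.6)
print(approx, 1 / 0.6)

# division a / b then becomes a multiplication by the reciprocal
print(3.0 * approx, 3.0 / 0.6)
```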
By using Booth's restoring division algorithm
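Restoring division is the shift-and-subtract scheme that processes one dividend bit per step. As a rough illustration of the bit-level loop (written in Python rather than Verilog, assuming unsigned operands of a fixed width, with a hypothetical function name):

```python
def restoring_divide(dividend, divisor, width=8):
    """Unsigned shift-and-subtract division, one quotient bit per iteration."""
    assert 0 <= dividend < (1 << width)
    assert 0 < divisor < (1 << width)
    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # shift the next dividend bit into the partial remainder
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        quotient <<= 1
        if trial >= 0:
            remainder = trial      # subtraction succeeded: quotient bit is 1
            quotient |= 1
        # otherwise the old remainder is kept ("restored"): quotient bit is 0
    return quotient, remainder


print(restoring_divide(200, 7))
```

In hardware each loop iteration would typically map to one clock cycle (or one stage of an unrolled array), so the latency grows with the operand width.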

Splitting an int64 into two int32, performing math, then re-joining

I am working within constraints of hardware that has 64bit integer limit. Does not support floating point. I am dealing with very large integers that I need to multiply and divide. When multiplying I encounter an overflow of the 64bits. I am prototyping a solution in python. This is what I have in my function:
upper = x >> 32 #x is cast as int64 before being passed to this function
lower = x & 0x00000000FFFFFFFF
temp_upper = upper * y // z #Dividing first is not an option, as this is not the actual equation I am working with. This is just to make sure in my testing I overflow unless I do the splitting.
temp_lower = lower * y // z
return temp_upper << 32 | lower
This works, somewhat, but I end up losing a lot of precision (my result is sometimes off by a few million). From looking at it, it appears that this is happening because of the division. If the divisor is large enough, it shifts the upper part to the right. Then when I shift it back into place I have a gap of zeroes.
Unfortunately this topic is very hard to google, since anything with upper/lower brings up results about rounding up/down, anything about splitting ints returns results about splitting them into a char array, and anything about int arithmetic brings up basic algebra with integer math. Maybe I am just not good at googling. But can you give me some pointers on how to do this?
Splitting like this is just one thing I am trying; it doesn't have to be the solution. All I need is to be able to temporarily go over the 64-bit integer limit. The final result will be under 64 bits (after the division part). I remember learning in college about splitting a number up like this, doing the math, and then re-combining. But unfortunately, as I said, I am having trouble finding anything online on how to do the actual math on it.
Lastly, my numbers are sometimes small, so I can't chop off the right bits. I need the results to be equivalent to what I would get with something like int128.
I suppose a different way to look at this problem is this. Since I have no problem with splitting the int64, we can forget about that part. So we can pretend that two int64s are being fed to me, one upper and one lower. I can't combine them, because they won't fit into a single int64. So I need to divide them by z first. The combining step is easy. How do I do the division?
Thanks.
As I understand it, you want to perform (x*y)//z.
Your numbers x,y,z all fit on 64bits, except that you need 128 bits for intermediate x*y.
The problem you have is indeed related to division. Writing h and l for the upper and lower halves of x, you have
h * y = qh * z + rh
l * y = ql * z + rl
((h * y) << 32) + l * y = ((qh << 32) + ql) * z + ((rh << 32) + rl)
but nothing says that ((rh << 32) + rl) < z, and in your case the high bits of l * y overlap the low bits of h * y, so you get the wrong quotient, potentially off by many units.
What you should do as the second operation is rather:
(rh << 32) + l * y = ql' * z + rl'
Then the total quotient is (qh << 32) + ql'.
But of course, you must take care to avoid overflow when evaluating the left operand...
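The corrected two-step quotient can be checked quickly in Python. This sketch only verifies the arithmetic identity: Python integers are unbounded, so it does not show the overflow that h * y or (rh << 32) + l * y can still cause on hardware limited to 64 bits. The function name is illustrative.

```python
def muldiv_split(x, y, z):
    """Compute (x * y) // z by splitting x into 32-bit halves."""
    h = x >> 32
    l = x & 0xFFFFFFFF
    qh, rh = divmod(h * y, z)
    # carry the remainder of the high part down into the low part
    ql, rl = divmod((rh << 32) + l * y, z)
    return (qh << 32) + ql


x, y, z = 0x123456789ABCDEF0, 0xFEDCBA9876, 0x1000000007
print(muldiv_split(x, y, z) == (x * y) // z)
```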
Since you are splitting only one of the operands of x*y, I'll assume that the intermediate result always fits in 96 bits.
If that is correct, then your problem is to divide the three 32-bit limbs of x*y by the two 32-bit limbs of z.
It is thus like the Burnikel-Ziegler divide-and-conquer algorithm for division.
The algorithm can be decomposed like this:
obtain the 3 limbs a2,a1,a0 of multiplication x*y by using karatsuba for example
split z into 2 limbs z1,z0
perform the div32( (a2,a1,a0) , (z1,z0) )
Here is some pseudocode, dealing only with positive operands and with no guarantee of being correct, but it gives you an idea of an implementation:
p = 1<<32;

function (a1,a0) = split(a)
    a1 = a >> 32;
    a0 = a - (a1 * p);

function (a2,a1,a0) = mul22(x,y)
    (x1,x0) = split(x) ;
    (y1,y0) = split(y) ;
    (h1,h0) = split(x1 * y1);
    assert(h1 == 0); -- assume that results fits on 96 bits
    (l1,l0) = split(x0 * y0);
    (m1,m0) = split((x1 - x0) * (y0 - y1)); -- karatsuba trick
    a0 = l0;
    (carry,a1) = split( l1 + l0 + h0 + m0 );
    a2 = l1 + m1 + h0 + carry;

function (q,r) = quorem(a,b)
    q = a // b;
    r = a - (b * q);

function (q1,q0,r0) = div21(a1,a0,b0)
    (q1,r1) = quorem(a1,b0);
    (q0,r0) = quorem( r1 * p + a0 , b0 );
    (q1,q0) = split( q1 * p + q0 );

function q = div32(a2,a1,a0,b1,b0)
    (q,r) = quorem(a2*p+a1,b1*p+b0);
    q = q * p;
    (a2,a1)=split(r);
    if a2<b1
        (q1,q0,r)=div21(a2,a1,b1);
        assert(q1==0); -- since a2<b1...
    else
        q0=p-1;
        r=(a2-b1)*p+a1+b1;
    (d1,d0) = split(q0*b0);
    r = (r-d1)*p + a0 - d0;
    while(r < 0)
        q = q - 1;
        r = r + b1*p + b0;

function t=muldiv(x,y,z)
    (a2,a1,a0) = mul22(x,y);
    (z1,z0) = split(z);
    if z1 == 0
        (q2,q1,r1)=div21(a2,a1,z0);
        assert(q2==0); -- otherwise result will not fit on 64 bits
        t = q1*p + ( ( r1*p + a0 )//z0);
    else
        t = div32(a2,a1,a0,z1,z0);
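For prototyping in Python, a simpler (and slower) alternative to the limb-based division above is a bit-serial shift-and-subtract division of a 128-bit product held in two 64-bit words. The sketch below is not the algorithm from the pseudocode: it assumes unsigned 64-bit words (emulated here with masks, since Python integers never overflow), and all function names are hypothetical. Every intermediate value stays below 2^64.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul_64x64(x, y):
    """Return (hi, lo), the two 64-bit halves of the 128-bit product x * y."""
    x1, x0 = x >> 32, x & MASK32
    y1, y0 = y >> 32, y & MASK32
    ll, lh, hl, hh = x0 * y0, x0 * y1, x1 * y0, x1 * y1
    # sum of the partial products that land in bits 32..63, carry included
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    lo = ((mid & MASK32) << 32) | (ll & MASK32)
    hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return hi, lo


def div_128_by_64(hi, lo, z):
    """Divide (hi, lo) by z bit by bit; requires hi < z so the quotient fits."""
    assert 0 < z <= MASK64 and hi < z
    rem, quo = hi, 0
    for i in range(63, -1, -1):
        carry = rem >> 63                      # bit about to be shifted out
        rem = ((rem << 1) & MASK64) | ((lo >> i) & 1)
        quo <<= 1
        if carry or rem >= z:
            rem = (rem - z) & MASK64           # wraps correctly when carry is set
            quo |= 1
    return quo, rem


def muldiv(x, y, z):
    """Compute (x * y) // z for unsigned 64-bit x, y, z with a 64-bit result."""
    hi, lo = mul_64x64(x, y)
    return div_128_by_64(hi, lo, z)[0]


print(muldiv(2**63 + 12345, 2**40 + 7, 2**41 + 3))
```

The loop always runs 64 iterations, so it trades speed for simplicity; the limb-based approach needs far fewer steps but is harder to get right.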

Remove noise from accelerometer data

I have data from an accelerometer which is a bit noisy (as it seems to me). The manufacturer states the noise spectral density as 45 micro g/(Hz)^0.5.
How do I use this information to remove noise from the time signal? I don't have a signal processing background, so can anyone point me to a source that explains how this problem can be handled?
Thank you all.
The acceleration signal and its frequency content look like this
I used the algorithm mentioned here but it doesn't seem to work for me. I found the following code on a blog. Do you think it is implemented correctly? I have tried varying my cutoff frequency, but the noise is still there.
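from numpy import exp, pi, zeros_like

def lpf(x, Fc, Fs, x0=None):
    # single-pole low-pass (exponential smoothing), cutoff Fc at sample rate Fs
    alpha = 1 - exp(-2.0 * pi * Fc / Fs)
    y = zeros_like(x)
    yk = x[0] if x0 is None else x0
    for k in range(len(y)):
        yk += alpha * (x[k] - yk)
        y[k] = yk
    return y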
The noise seems to be randomly distributed and not confined to a particular range of frequencies.
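One way to read the manufacturer's figure: if the noise is white (flat across frequency, which is an assumption, though it matches the observation above), the expected RMS noise is the spectral density multiplied by the square root of the measurement bandwidth. The sketch below estimates that noise floor for a few cutoff frequencies. It uses the equivalent noise bandwidth of an ideal continuous-time first-order low-pass, pi/2 times the cutoff, as an approximation for the filter above, and the function names are made up.

```python
import math


def rms_noise_g(density_ug_per_rthz, bandwidth_hz):
    """RMS noise in g for white noise of the given density over a bandwidth."""
    return density_ug_per_rthz * 1e-6 * math.sqrt(bandwidth_hz)


def single_pole_noise_bandwidth(fc_hz):
    """Equivalent noise bandwidth of a first-order low-pass with cutoff fc."""
    return math.pi / 2.0 * fc_hz


for fc in (10.0, 100.0, 1000.0):
    bw = single_pole_noise_bandwidth(fc)
    print(fc, "Hz cutoff ->", rms_noise_g(45.0, bw), "g RMS")
```

Under this reading, a low-pass filter cannot remove white noise entirely; it only reduces it in proportion to the square root of the bandwidth kept, so lowering the cutoff by a factor of 100 lowers the RMS noise by a factor of 10.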

Spectrogram of two audio files (Added together)

Assume for a moment I have two input signals f1 and f2. I could add these signals to produce a third signal f3 = f1 + f2. I would then compute the spectrogram of f3 as log(|stft(f3)|^2).
Unfortunately I don't have the original signals f1 and f2. I have, however, their spectrograms A = log(|stft(f1)|^2) and B = log(|stft(f2)|^2). What I'm looking for is a way to approximate log(|stft(f3)|^2) as closely as possible using A and B. If we do some math we can derive:
log(|stft(f1 + f2)|^2) = log(|stft(f1) + stft(f2)|^2)
express stft(f1) = x1 + i * y1 & stft(f2) = x2 + i * y2 to write
... = log(|x1 + i * y1 + x2 + i * y2|^2)
... = log((x1 + x2)^2 + (y1 + y2)^2)
... = log(x1^2 + x2^2 + y1^2 + y2^2 + 2 * (x1 * x2 + y1 * y2))
... = log(|stft(f1)|^2 + |stft(f2)|^2 + 2 * (x1 * x2 + y1 * y2))
So at this point I could use the approximation:
log(|stft(f3)|^2) ~ log(exp(A) + exp(B))
but I would ignore the last part 2 * (x1 * x2 + y1 * y2). So my question is: Is there a better approximation for this?
Any ideas? Thanks.
I don't fully understand your notation, but I'll give it a shot. Addition in the time domain corresponds to addition in the frequency domain. Adding two time domain signals x1 and x2 produces a third time domain signal x3. x1, x2 and x3 all have a frequency domain spectrum: F(x1), F(x2) and F(x3). F(x3) is also equal to F(x1) + F(x2), where the addition is performed by adding the real parts of F(x1) to the real parts of F(x2) and the imaginary parts of F(x1) to the imaginary parts of F(x2). So if F(x1)[0] is 1+0j and F(x2)[0] is 0.5+0.5j, then the sum is 1.5+0.5j, whose magnitude is sqrt(1.5*1.5+0.5*0.5) = sqrt(2.5). Judging from your notation you are trying to add the magnitudes, which with this example would be |1+0j| + |0.5+0.5j| = sqrt(1*1) + sqrt(0.5*0.5+0.5*0.5) = 1 + sqrt(0.5). Obviously not the same thing. I think you want something like this:
log(|stft(a) + stft(b)|^2) ~ log(exp(log(|stft(a)|^2)) + exp(log(|stft(b)|^2)))
That is, take the exp() of the 2 log magnitudes, add them, then take the log of the sum.
Stepping back from the math for a minute, we can see that at a fundamental level, this just isn't possible.
Consider a 1st signal f1 that is a pure tone at frequency F and amplitude A.
Consider a 2nd signal f2 that is a pure tone at frequency F and amplitude A, but perfectly out of phase with f1.
In this case, the spectrograms of f1 & f2 are identical.
Now consider two possible combined signals.
f1 added to itself is a pure tone at frequency F and amplitude 2A.
f1 added to f2 is complete silence.
From the spectrograms of f1 and f2 alone (which are identical), you've no way to know which of these very different situations you're in. And this doesn't just hold for pure tones. Any signal and its reflection about the axis suffer the same problem. Generalizing even further, there's just no way to know how much your underlying signals cancel and how much they reinforce each other. That said, there are limits. If, for a particular frequency, your underlying signals had amplitudes A1 and A2, the biggest possible amplitude is A1+A2 and the smallest possible is abs(A1-A2).
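The limits described above can be computed directly from the two log-power spectrograms. In the sketch below, the central estimate log(exp(A) + exp(B)) drops the cross term, which is reasonable only under the assumption that the two signals have unrelated phases so that the cross term averages out. The bounds come from the amplitudes adding fully or cancelling fully. The function name and the random test data are illustrative, not from the original posts.

```python
import numpy as np


def combine_log_power(A, B):
    """Return (lower, estimate, upper) for log|S1 + S2|^2 given log powers A, B."""
    estimate = np.logaddexp(A, B)                   # cross term assumed zero
    upper = 2.0 * np.logaddexp(A / 2.0, B / 2.0)    # amplitudes add: (a1 + a2)^2
    with np.errstate(divide="ignore"):
        # amplitudes cancel: (a1 - a2)^2, which is -inf in log when a1 == a2
        lower = 2.0 * np.log(np.abs(np.exp(A / 2.0) - np.exp(B / 2.0)))
    return lower, estimate, upper


# random complex values standing in for STFT bins of two signals
rng = np.random.default_rng(0)
s1 = rng.normal(size=1000) + 1j * rng.normal(size=1000)
s2 = rng.normal(size=1000) + 1j * rng.normal(size=1000)
A = np.log(np.abs(s1) ** 2)
B = np.log(np.abs(s2) ** 2)
truth = np.log(np.abs(s1 + s2) ** 2)

lower, estimate, upper = combine_log_power(A, B)
print(np.all(lower <= truth + 1e-6), np.all(truth <= upper + 1e-6))
```

The gap between the lower and upper bound is widest when the two signals have similar power in a bin, which is exactly where the missing phase information matters most.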

How to find the fundamental frequency of a guitar string sound?

I want to build a guitar tuner app for iPhone. My goal is to find the fundamental frequency of the sound generated by a guitar string. I have used bits of code from the aurioTouch sample provided by Apple to calculate the frequency spectrum, and I find the frequency with the highest amplitude. It works fine for pure sounds (the ones that have only one frequency), but for sounds from a guitar string it produces wrong results. I have read that this is because of the overtones generated by the guitar string, which might have higher amplitudes than the fundamental. How can I find the fundamental frequency so it works for guitar strings? Is there an open-source library in C/C++/Obj-C for sound analysis (or signal processing)?
You can use the signal's autocorrelation, which is the inverse transform of the magnitude squared of the DFT. If you're sampling at 44100 samples/s, then an 82.4 Hz fundamental has a period of about 535 samples, whereas 1479.98 Hz has a period of about 30 samples. Look for the peak positive lag in that range (e.g. from 28 to 560). Make sure your window is at least two periods of the longest fundamental, which would be 1070 samples here. Rounded up to the next power of two, that's a 2048-sample buffer. For better frequency resolution and a less biased estimate, use a longer buffer, but not so long that the signal is no longer approximately stationary. Here's an example in Python:
from pylab import *
import wave

fs = 44100.0                # sample rate
K = 3                       # number of windows
L = 8192                    # 1st pass window overlap, 50%
M = 16384                   # 1st pass window length
N = 32768                   # 1st pass DFT length: acyclic correlation

# load a sample of guitar playing an open string 6
# with a fundamental frequency of 82.4 Hz (in theory),
# but this sample is actually at about 81.97 Hz
g = frombuffer(wave.open('dist_gtr_6.wav').readframes(-1),
               dtype='int16')
g = g / float64(max(abs(g)))    # normalize to +/- 1.0
mi = len(g) // 4                # start index

def welch(x, w, L, N):
    # Welch's method
    M = len(w)
    K = (len(x) - L) // (M - L)
    Xsq = zeros(N//2+1)         # len(N-point rfft) = N/2+1
    for k in range(K):
        m = k * (M - L)
        xt = w * x[m:m+M]
        # use rfft for efficiency (assumes x is real-valued)
        Xsq = Xsq + abs(rfft(xt, N)) ** 2
    Xsq = Xsq / K
    Wsq = abs(rfft(w, N)) ** 2
    bias = irfft(Wsq)           # for unbiasing Rxx and Sxx
    p = dot(x,x) / len(x)       # avg power, used as a check
    return Xsq, bias, p

# first pass: acyclic autocorrelation
x = g[mi:mi + K*M - (K-1)*L]    # len(x) = 32768
w = hamming(M)                  # hamming[m] = 0.54 - 0.46*cos(2*pi*m/M)
                                # reduces the side lobes in DFT
Xsq, bias, p = welch(x, w, L, N)
Rxx = irfft(Xsq)                # acyclic autocorrelation
Rxx = Rxx / bias                # unbias (bias is tapered)
mp = argmax(Rxx[28:561]) + 28   # index of 1st peak in 28 to 560

# 2nd pass: cyclic autocorrelation
N = M = L - (L % mp)            # window an integer number of periods
                                # shortened to ~8192 for stationarity
x = g[mi:mi+K*M]                # data for K windows
w = ones(M); L = 0              # rectangular, non-overlapping
Xsq, bias, p = welch(x, w, L, N)
Rxx = irfft(Xsq)                # cyclic autocorrelation
Rxx = Rxx / bias                # unbias (bias is constant)
mp = argmax(Rxx[28:561]) + 28   # index of 1st peak in 28 to 560

Sxx = Xsq / bias[0]
Sxx[1:-1] = 2 * Sxx[1:-1]       # fold the freq axis
Sxx = Sxx / N                   # normalize S for avg power
n0 = N // mp
np = argmax(Sxx[n0-2:n0+3]) + n0-2  # bin of the nearest peak power

# check
print("\nAverage Power")
print(" p:", p)
print("Rxx:", Rxx[0])           # should equal dot product, p
print("Sxx:", sum(Sxx), '\n')   # should equal Rxx[0]

figure().subplots_adjust(hspace=0.5)
subplot2grid((2,1), (0,0))
title('Autocorrelation, R$_{xx}$'); xlabel('Lags')
mr = r_[:3 * mp]
plot(Rxx[mr]); plot(mp, Rxx[mp], 'ro')
xticks(mp/2 * r_[1:6])
grid(); axis('tight'); ylim(1.25*min(Rxx), 1.25*max(Rxx))

subplot2grid((2,1), (1,0))
title('Power Spectral Density, S$_{xx}$'); xlabel('Frequency (Hz)')
fr = r_[:5 * np]; f = fs * fr / N;
vlines(f, 0, Sxx[fr], colors='b', linewidth=2)
xticks((fs * np/N * r_[1:5]).round(3))
grid(); axis('tight'); ylim(0,1.25*max(Sxx[fr]))
show()
Output:
Average Power
p: 0.0410611012542
Rxx: 0.0410611012542
Sxx: 0.0410611012542
The peak lag is 538, which is 44100/538 = 81.97 Hz. The first-pass acyclic DFT shows the fundamental at bin 61, which is 82.10 +/- 0.67 Hz. The 2nd pass uses a window length of 538*15 = 8070, so the DFT frequencies include the fundamental period and harmonics of the string. This enables an unbiased cyclic autocorrelation for an improved PSD estimate with less harmonic spreading (i.e. the correlation can wrap around the window periodically).
Edit: Updated to use Welch's method to estimate the autocorrelation. Overlapping the windows compensates for the Hamming window. I also calculate the tapered bias of the Hamming window to unbias the autocorrelation.
Edit: Added a 2nd pass with cyclic correlation to clean up the power spectral density. This pass uses 3 non-overlapping, rectangular windows of length 538*15 = 8070 (short enough to be nearly stationary). The bias for cyclic correlation is a constant, instead of the Hamming window's tapered bias.
Finding the musical pitches in a chord is far more difficult than estimating the pitch of one single string or note played at a time. The overtones for the multiple notes in a chord might all be overlapping and interleaving. And all the notes in common chords may themselves be at overtone frequencies for one or more non-existent lower pitched notes.
For single notes, autocorrelation is a common technique used by some guitar tuners. But with autocorrelation, you have to be aware of some potential octave uncertainty, as guitars may produce inharmonic and decaying overtones which thus don't exactly match from pitch period to pitch period. Cepstrum and Harmonic Product Spectrum are two other pitch estimation methods which may or may not have different problems, depending on the guitar and the note.
RAPT appears to be one published algorithm for more robust pitch estimation. YIN is another.
Also Objective C is a superset of ANSI C. So you can use any C DSP routines you find for pitch estimation within an Objective C app.
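To illustrate why the spectral peak can mislead a tuner while autocorrelation does not, here is a minimal Python sketch on a synthetic tone. The signal (a weak 110 Hz fundamental with stronger 2nd and 3rd harmonics), the search range, and the function name are all assumptions made for the example. A real guitar recording would need the extra care described above (windowing, octave checks, decaying overtones).

```python
import numpy as np


def estimate_f0(x, fs, fmin=80.0, fmax=1500.0):
    """Estimate the fundamental from the highest autocorrelation peak."""
    x = x - np.mean(x)
    n = len(x)
    # zero-pad to 2n so the FFT-based correlation does not wrap around
    spec = np.fft.rfft(x, 2 * n)
    acf = np.fft.irfft(np.abs(spec) ** 2)[:n]
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag


fs = 44100
f0 = 110.0
t = np.arange(4096) / fs
# synthetic "string": weak fundamental, stronger overtones
x = (0.3 * np.sin(2 * np.pi * f0 * t)
     + 1.0 * np.sin(2 * np.pi * 2 * f0 * t)
     + 0.6 * np.sin(2 * np.pi * 3 * f0 * t))

# naive approach: frequency of the largest spectral bin
peak_hz = np.argmax(np.abs(np.fft.rfft(x))) * fs / len(x)

print("spectral peak:", peak_hz, "Hz")
print("autocorrelation estimate:", estimate_f0(x, fs), "Hz")
```

The resolution of this estimate is limited to whole-sample lags. Interpolating around the peak, or a method such as YIN, would refine it.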
Use libaubio (link) and be happy. Trying to implement a fundamental frequency estimator myself was one of my biggest wastes of time. If you want to do it yourself, I advise you to follow the YINFFT method (link).
