FFT harmonics are not integer multiples of the fundamental frequency - audio

The harmonics I find with an FFT are not integer multiples of the fundamental frequency.
This is the result of the FFT (the fundamental frequency is 441.4307 Hz, index 328),
where index is the position on the x-axis of the FFT graph:
2 harmonic freq = 884.2072(index:657)
3 harmonic freq = 1328.3295(index:987)
4 harmonic freq = 1775.1434(index:1319)
5 harmonic freq = 2221.9574(index:1651)
6 harmonic freq = 2675.5005(index:1988)
7 harmonic freq = 3135.7727(index:2330)
8 harmonic freq = 3592.0074(index:2669)
9 harmonic freq = 4064.3921(index:3020)
10 harmonic freq = 4531.3934(index:3367)
11 harmonic freq = 5011.8530(index:3724)
12 harmonic freq = 5480.2002(index:4072)
13 harmonic freq = 5943.1641(index:4416)
14 harmonic freq = 6420.9320(index:4771)
15 harmonic freq = 6887.9333(index:5118)
16 harmonic freq = 7346.8597(index:5459)
17 harmonic freq = 7805.7861(index:5800)
18 harmonic freq = 8264.7125(index:6141)
19 harmonic freq = 8723.6389(index:6482)
20 harmonic freq = 9167.7612(index:6812)
As you can see, the harmonics are not exact integer multiples of the fundamental frequency,
and the error grows with the harmonic number.
When I mix two sounds (notes), I cannot find the harmonics of one of them.
How can I find the harmonics more accurately?
EDIT:
The graph is from this code.
Actually I computed an STFT, and each frame has 2^15 (32768) samples:
s = np.fft.rfft(frames)
so there are several frames (0, 1, 2, ..., n).
I got the frequencies from this code:
timebins, freqbins = np.shape(s)
allfreqs = np.abs(np.fft.fftfreq(2**15, 1./44100)[:freqbins+1])
From this code I got allfreqs[0] = 1.3.
The graph above is one frame of s (s[i]).
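As a minimal sketch of the bin-to-frequency mapping described above, assuming the 32768-sample frame and the 44100 Hz sample rate stated in the question (`n_fft`, `fs` and `bin_freqs` are illustrative names, not the asker's), `np.fft.rfftfreq` returns the centre frequency of every `rfft` bin directly. The frame length is written `2**15` because `^` is bitwise XOR in Python, not exponentiation.

```python
import numpy as np

n_fft = 2**15   # frame length in samples (32768)
fs = 44100      # sample rate in Hz, as stated in the question

# Centre frequency of each rfft bin: k * fs / n_fft for k = 0 .. n_fft // 2
bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

bin_width = fs / n_fft  # spacing between neighbouring bins in Hz
print(f"number of bins: {bin_freqs.size}, bin width: {bin_width:.4f} Hz")
print(f"bin 328  -> {bin_freqs[328]:.4f} Hz")
print(f"bin 6812 -> {bin_freqs[6812]:.4f} Hz")
```

The two printed frequencies should reproduce the values listed for the fundamental and the 20th peak, which suggests the listed numbers come from a bin width of about 1.35 Hz.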
I wonder why those bins are not integer multiples of the fundamental.
The first selected bin (red) is 441 Hz, where x = 328,
and the 20th selected bin (last yellow) is 9167 Hz, where x = 6812.
I found several websites that say every harmonic is an integer multiple of the fundamental frequency (1st harmonic = fundamental frequency, 2nd harmonic = fundamental frequency * 2, and so on).
I expected the 20th selected bin to be at 8820 Hz and x = 6560, i.e. 20 times the fundamental frequency.
The error grows as I move to higher harmonics.
Why does this happen, and how can I fix it?
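To put numbers on the drift, the sketch below compares a few of the peak bins listed above with exact integer multiples of the fundamental bin. It assumes the same 2^15-sample frame and 44100 Hz sample rate; the selection of harmonics and the variable names are only illustrative.

```python
# Peak bins copied from the list above: harmonic number -> FFT bin index
fundamental_bin = 328
harmonic_bins = {2: 657, 5: 1651, 10: 3367, 15: 5118, 20: 6812}

fs, n_fft = 44100, 2**15
bin_width = fs / n_fft  # about 1.35 Hz per bin

ratios = {}
for n, k in harmonic_bins.items():
    measured_hz = k * bin_width                 # frequency of the measured peak
    ideal_hz = n * fundamental_bin * bin_width  # exact integer multiple
    ratios[n] = measured_hz / ideal_hz
    print(f"harmonic {n:2d}: measured {measured_hz:9.2f} Hz, "
          f"ideal {ideal_hz:9.2f} Hz, ratio {ratios[n]:.4f}")
```

A peak that lands one bin off shifts the estimate by only about 1.35 Hz, whereas the 20th peak sits several hundred Hz above 20 times the fundamental, so the deviation is much larger than the FFT resolution.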

Related

FFT plot of raw PCM comes wrong for higher frequency in python

Here I am using numpy's fft function to plot the FFT of a PCM wave generated from a 10000 Hz sine wave, but the amplitude in the plot is wrong.
The frequency comes out correctly using the fftfreq function, and I print it to the console. My Python code is here.
import numpy as np
import matplotlib.pyplot as plt
frate = 44100
filename = 'Sine_10000Hz.bin' #signed16 bit PCM of a 10000Hz sine wave
f = open('Sine_10000Hz.bin','rb')
y = np.fromfile(f,dtype='int16') #Extract the signed 16 bit PCM value of 10000Hz Sine wave
f.close()
####### Spectral Analysis #########
fft_value = np.fft.fft(y)
freqs = np.fft.fftfreq(len(fft_value)) # frequencies associated with the coefficients:
print("freqs.min(), freqs.max()")
idx = np.argmax(np.abs(fft_value)) # Find the peak in the coefficients
freq = freqs[idx]
freq_in_hertz = abs(freq * frate)
print("\n\n\n\n\n\nfreq_in_hertz")
print(freq_in_hertz)
for i in range(2):
    print("Value at index {}:\t{}".format(i, fft_value[i + 1]), "\nValue at index {}:\t{}".format(fft_value.size -1 - i, fft_value[-1 - i]))
#####
n_sa = 8 * int(freq_in_hertz)
t_fft = np.linspace(0, 1, n_sa)
T = t_fft[1] - t_fft[0] # sampling interval
N = n_sa #Here it is n_sample
print("\nN value=")
print(N)
# 1/T = frequency
f = np.linspace(0, 1 / T, N)
plt.ylabel("Amplitude")
plt.xlabel("Frequency [Hz]")
plt.xlim(0,15000)
# 2 / N is a normalization factor Here second half of the sequence gives us no new information that the half of the FFT sequence is the output we need.
plt.bar(f[:N // 2], np.abs(fft_value)[:N // 2] * 2 / N, width=15,color="red")
Output comes in the console (Only minimal prints I am pasting here)
freqs.min(), freqs.max()
-0.5 0.49997732426303854
freq_in_hertz
10000.0
Value at index 0: (19.949569768991054-17.456031216294324j)
Value at index 44099: (19.949569768991157+17.45603121629439j)
Value at index 1: (9.216783424692835-13.477631008179145j)
Value at index 44098: (9.216783424692792+13.477631008179262j)
N value=
80000
The frequency is extracted correctly, but something in my plotting is wrong and I do not know what.
Updating the work:
When I change the multiplication factor in the line n_sa = 10 * int(freq_in_hertz) from 10 to 5, I get the correct plot, but I cannot tell whether that is actually correct.
If I increase the upper limit in plt.xlim(0,15000) to 20000, the plot breaks again; up to 15000 it plots correctly.
I generated Sine_10000Hz.bin with Audacity: I created a 10000 Hz sine wave of 1 s duration at a sampling rate of 44100 and exported it as headerless signed 16-bit audio (raw PCM). I was able to regenerate the sine wave from this file using this script. I also want to calculate its FFT, and I expect a peak at 10000 Hz with amplitude 32767. As you can see, I changed the multiplication factor from 10 to 8 in the line n_sa = 8 * int(freq_in_hertz), and the plot worked, but the amplitude is still incorrect. I will attach my new figure here.
I'm not sure exactly what you are trying to do, but my suspicion is that the Sine_10000Hz.bin file isn't what you think it is.
Is it possible it contains more than one channel (left & right)?
Is it really signed 16 bit integers?
It's not hard to create a 10kHz sine wave in 16 bit integers in numpy.
import numpy as np
import matplotlib.pyplot as plt
n_samples = 2000
f_signal = 10000 # (Hz) Signal Frequency
f_sample = 44100 # (Hz) Sample Rate
amplitude = 2**3 # Arbitrary. Must be > 1. Should be > 2. Larger makes FFT results better
time = np.arange(n_samples) / f_sample # sample times
# The signal
y = (np.sin(time * f_signal * 2 * np.pi) * amplitude).astype('int16')
If you plot 30 points of the signal you can see there are about 5 points per cycle.
plt.plot(time[:30], y[:30], marker='o')
plt.xlabel('Time (s)')
plt.yticks([]); # Amplitude value is artificial. hide it
If you plot 30 samples of the data from Sine_10000Hz.bin does it have about 5 points per cycle?
This is my attempt to recreate the FFT work as I understand it.
fft_value = np.fft.fft(y) # compute the FFT
freqs = np.fft.fftfreq(len(fft_value)) * f_sample # frequencies for each FFT bin
N = len(y)
plt.plot(freqs[:N//2], np.abs(fft_value[:N//2]))
plt.yscale('log')
plt.ylabel("Amplitude")
plt.xlabel("Frequency [Hz]")
I get the following plot
The y-axis of this plot is on a log scale. Notice that the amplitude of the peak is in the thousands. The amplitude of most of the rest of the data points are around 100.
idx_max = np.argmax(np.abs(fft_value)) # Find the peak in the coefficients
idx_min = np.argmin(np.abs(fft_value)) # Find the smallest coefficient
print(f'idx_max = {idx_max}, idx_min = {idx_min}')
print(f'f_max = {freqs[idx_max]}, f_min = {freqs[idx_min]}')
print(f'fft_value[idx_max] {fft_value[idx_max]}')
print(f'fft_value[idx_min] {fft_value[idx_min]}')
produces:
idx_max = 1546, idx_min = 1738
f_max = -10010.7, f_min = -5777.1
fft_value[idx_max] (-4733.232076236707+219.11718299533203j)
fft_value[idx_min] (-0.17017443966211232+0.9557200531465061j)
I'm adding a link to a script I've built that outputs the FFT with the ACTUAL amplitude (for real signals, e.g. your signal). Have a go and see if it works:
dt=1/frate in your case.
https://stackoverflow.com/a/53925342/4879610
After a lot of work I was able to find my issue. As I mentioned under "Updating the work:", the number of samples I used was wrong.
I changed the two lines in the code
n_sa = 8 * int(freq_in_hertz)
t_fft = np.linspace(0, 1, n_sa)
to
n_sa = y.size  # number of samples, taken directly from the raw 16-bit data
t_fft = np.arange(n_sa) / frate  # divide each sample index by the sampling rate to get its time
This solved my issue.
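The fix can be checked without the original file. The sketch below is not the asker's script: it synthesises a 1 s, 10000 Hz, full-scale int16 sine in NumPy (standing in for Sine_10000Hz.bin), uses `np.fft.rfft` and `np.fft.rfftfreq` instead of `fft` plus a hand-built axis, and scales by 2 / N so that a sine of amplitude A shows a peak of height A.

```python
import numpy as np

frate = 44100     # sampling rate in Hz
f_signal = 10000  # sine frequency in Hz
n_sa = frate      # one second of audio, so the bin spacing is 1 Hz

# Stand-in for the raw PCM file: full-scale signed 16-bit sine wave
t = np.arange(n_sa) / frate
y = np.round(32767 * np.sin(2 * np.pi * f_signal * t)).astype(np.int16)

spectrum = np.fft.rfft(y)                     # one-sided FFT of a real signal
freqs = np.fft.rfftfreq(n_sa, d=1.0 / frate)  # frequency axis built from the true sample count

# 2 / N scaling recovers the sine amplitude (not valid for the DC and Nyquist bins)
amplitude = 2.0 * np.abs(spectrum) / n_sa

peak = int(np.argmax(amplitude))
peak_freq = freqs[peak]
peak_amp = amplitude[peak]
print(f"peak at {peak_freq:.1f} Hz with amplitude {peak_amp:.1f}")
```

The key point matches the fix above: both the frequency axis and the normalisation use the actual number of samples in the signal, not a number derived from the detected frequency.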
My spectral output is
Special thanks to @meta4 and @YoniChechik for giving me some suggestions.

How to convert amplitude to dB in python using Librosa?

I have a few questions, which are all very related. The main problem here is to convert the amplitude of an audio file to dB scale and I am doing it as below which I am not sure is correct:
y, sr = librosa.load('audio.wav')
S = np.abs(librosa.stft(y))
db_max = librosa.amplitude_to_db(S, ref=np.max)
db_median = librosa.amplitude_to_db(S, ref=np.median)
db_min = librosa.amplitude_to_db(S, ref=np.min)
db_max_AVG = np.mean(db_max, axis=0)
db_median_AVG = np.mean(db_median, axis=0)
db_min_AVG = np.mean(db_min, axis=0)
My question is how can I convert 'y' to dB scale. Is not 'y' the amplitude?
Also, the shape of 'y' and 'db_max_AVG' is not the same. The size of 'db_max_AVG' is 9137 while the size of 'y' is 4678128.
Another question is that my audio file is 3 minutes and 32 seconds and the shape of y is:
print(y.shape)
(4678128,)
I do not know what this number represents because it obviously does not represent milliseconds or microseconds. Below you can see two plots of 'y' using different methods:
plt.plot(y)
plt.show()
librosa.display.waveplot(y, sr=22050, x_axis='time')
If you just want to convert the time domain amplitude readings from linear values in the range -1 to 1 to dB, this will do it:
import numpy as np
amps = [1, 0.5, 0.25, 0]
dbs = 20 * np.log10(np.abs(amps))
print(amps, 'in dB', dbs)
Should output:
[1, 0.5, 0.25, 0] in dB [  0.          -6.02059991 -12.04119983         -inf]
Note that maximum amplitude (1) goes to 0dB, half amplitude (0.5) goes to -6dB, quarter goes to -12dB.
You get a divide-by-zero warning and a result of -inf for the zero amplitude, as the dB scale cannot cope with silence :)
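One common way around the -inf for silent samples is to clamp the amplitude to a small floor before taking the logarithm. The sketch below assumes a floor of -120 dB, which is an arbitrary choice; `amplitude_to_db_floor` and `floor_db` are illustrative names, not part of NumPy or librosa.

```python
import numpy as np

def amplitude_to_db_floor(amps, floor_db=-120.0):
    """Convert linear amplitudes to dB, clamping silence to floor_db."""
    amps = np.abs(np.asarray(amps, dtype=float))
    min_amp = 10.0 ** (floor_db / 20.0)  # linear amplitude that corresponds to floor_db
    return 20.0 * np.log10(np.maximum(amps, min_amp))

dbs = amplitude_to_db_floor([1, 0.5, 0.25, 0])
print(dbs)
```

The non-zero inputs give the same values as before, while the zero amplitude maps to the floor and no warning is raised.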
Here is a reference to a 1971 Audio Engineering Society paper for the well known 20 * log10(amp) equation:
https://www.aes.org/e-lib/browse.cfm?elib=2157 (see equation 8)
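On the two shape questions: `y` holds one value per audio sample, so its length is the duration times the sample rate, and an STFT produces one column per frame rather than per sample. The arithmetic below is a sketch that assumes librosa's defaults (`sr=22050` for `librosa.load`, and `hop_length=512` with centred frames for `librosa.stft`); those defaults are assumptions here because the question does not state the arguments used.

```python
n_samples = 4678128  # y.shape[0] from the question
sr = 22050           # assumed: default sample rate of librosa.load
hop_length = 512     # assumed: default hop of librosa.stft (n_fft // 4 with n_fft = 2048)

# Duration in seconds is the number of samples divided by the sample rate
duration_s = n_samples / sr
minutes, seconds = divmod(duration_s, 60)
print(f"duration: {int(minutes)} min {seconds:.2f} s")

# With centred frames the STFT has one frame every hop_length samples
n_frames = 1 + n_samples // hop_length
print(f"STFT frames: {n_frames}")
```

Under these assumptions the duration comes out at 3 minutes 32 seconds and the frame count at 9137, matching the audio length and the size of `db_max_AVG` reported in the question.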

How can I simplify a nested loop into torch tensor operations?

I'm trying to convert some code I have written in numpy which contains a nested-loop into tensor operations found in PyTorch. However, after trying to implement my own version I'm not getting the same value on the output. I have managed to do the same with a single loop, so I'm not entirely sure what I'm doing wrong.
#(Numpy Version)
#calculate Kinetic Energy
summation = 0.0
for i in range(0,len(k_values)-1):
    summation += (k_values[i]**2.0)*wavefp[i]*(((self.hbar*kp_values[i])**2.0)/(2.0*self.mu))*wavef[i]
Ek = step*(4.0*np.pi)*summation
#(Numpy Version)
#calculate Potential Energy
summation = 0.0
for i in range(0,len(k_values)-1):
    for j in range(0,len(kp_values)-1):
        summation+= (k_values[i]**2.0)*wavefp[i]*(kp_values[j]**2.0)*wavef[j]*self.MTV[i,j]
Ep = (step**2.0)*(4.0*np.pi)*(2.0/np.pi)*summation
#####################################################
#(PyTorch Version)
#calcualte Kinetic Energy
Ek = step*(4.0*np.pi)*torch.sum( k_values.pow(2)*wavefp.mul(wavef)*((kp_values.mul(self.hbar)).pow(2)/(2.0*self.mu)) )
#(PyTorch Version)
#calculate Potential Energy
summation = 0.0
for i in range(0,len(k_values)-1):
    summation += ((k_values[i].pow(2)).mul(wavefp[i]))*torch.sum( (kp_values.pow(2)).mul(wavef).mul(self.MTV[i,:]) )
Ep = (step**2.0)*(4.0*np.pi)*(2.0/np.pi)*summation
The arrays/tensors k_values, kp_values, wavef, and wavefp have dimensions of (1000,1). The values self.hbar, and self.mu, and step are scalars. The variable self.MTV is a matrix of size (1000,1000).
I would expect that both methods would give the same output but they don't. The code for calculating the Kinetic Energy (in both Numpy and PyTorch) give the same value. However, the potential energy calculation differ, and I'm not entirely sure why.
Many Thanks in advance!
The problem is in the shapes. kp_values and wavef have shape (1000, 1) and need to be converted to (1000,) before the multiplications. Multiplying a (1000, 1) tensor by the (1000,) row MTV[i,:] broadcasts to a (1000, 1000) matrix, so the outcome of (kp_values.pow(2)).mul(wavef).mul(MTV[i,:]) is a matrix, not the vector you assumed.
So, the following should work.
summation += ((k_values[i].pow(2)).mul(wavefp[i]))*torch.sum((kp_values.squeeze(1).pow(2)).mul(wavef.squeeze(1)).mul(MTV[i,:]))
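The shape issue is easy to reproduce on a tiny example. The sketch below uses NumPy rather than PyTorch so that it runs without torch installed; PyTorch follows the same broadcasting rules for these shapes. The arrays `col` and `row` are illustrative stand-ins for `kp_values` and `MTV[i, :]`.

```python
import numpy as np

col = np.arange(3.0).reshape(3, 1)  # shape (3, 1), like kp_values or wavef
row = np.ones(3)                    # shape (3,), like MTV[i, :]

# (3, 1) * (3,) broadcasts to a full (3, 3) matrix
broadcast_shape = (col * row).shape
print(broadcast_shape)  # (3, 3)

# Dropping the trailing axis first gives the intended element-wise product
squeezed_shape = (col.squeeze(1) * row).shape
print(squeezed_shape)   # (3,)
```

Summing the (3, 3) result adds up nine products instead of three, which is why the loop in the question returned a different value.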
And a loop-free Numpy and PyTorch solution would be:
import numpy as np
import torch
step = 1.0
k_values = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
kp_values = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
wavef = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
wavefp = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
MTV = np.random.randint(0, 100, size=(1000, 1000)).astype("float") / 100
# Numpy solution
term1 = k_values**2.0 * wavefp # 1000 x 1
temp = kp_values**2.0 * wavef # 1000 x 1
term2 = np.matmul(MTV, temp) # 1000 x 1, row i holds sum_j MTV[i, j] * temp[j]
summation = np.sum(term1 * term2)
print(summation)
# PyTorch solution (convert the NumPy arrays to tensors first)
k_t, kp_t, wavef_t, wavefp_t, MTV_t = (torch.from_numpy(a) for a in (k_values, kp_values, wavef, wavefp, MTV))
term1 = k_t.pow(2).mul(wavefp_t) # 1000 x 1
term2 = MTV_t.matmul(kp_t.pow(2).mul(wavef_t)) # 1000 x 1
summation = torch.sum(term1.mul(term2))
print(summation.item())
Output
12660.407492918514
12660.407492918514
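A quick way to gain confidence in a loop-free rewrite is to compare it with the nested loop on a small random problem. The sketch below does this in NumPy only, with a hypothetical size of 50 and a fixed seed. Note that the loops here visit every index, whereas `range(0, len(k_values)-1)` in the question stops one element short.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50  # small hypothetical size so the double loop stays fast
k_values, kp_values, wavef, wavefp = (rng.random((n, 1)) for _ in range(4))
MTV = rng.random((n, n))

# Reference: explicit double loop over i and j
loop_sum = 0.0
for i in range(n):
    for j in range(n):
        loop_sum += (k_values[i, 0] ** 2 * wavefp[i, 0]
                     * kp_values[j, 0] ** 2 * wavef[j, 0] * MTV[i, j])

# Vectorised: sum_i sum_j term1[i] * MTV[i, j] * temp[j]
term1 = (k_values ** 2 * wavefp).ravel()  # depends on i only
temp = (kp_values ** 2 * wavef).ravel()   # depends on j only
vec_sum = term1 @ MTV @ temp
ein_sum = np.einsum("i,ij,j->", term1, MTV, temp)

print(loop_sum, vec_sum, ein_sum)
```

The index order matters when `MTV` is not symmetric: `term1` must multiply the row index of `MTV` and `temp` the column index, which the `einsum` subscripts make explicit.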

How to compute correlation ratio or Eta in Python?

According to the answer to this post,
The most classic "correlation" measure between a nominal and an interval ("numeric") variable is Eta, also called correlation ratio, and equal to the root R-square of the one-way ANOVA (with p-value = that of the ANOVA). Eta can be seen as a symmetric association measure, like correlation, because Eta of ANOVA (with the nominal as independent, numeric as dependent) is equal to Pillai's trace of multivariate regression (with the numeric as independent, set of dummy variables corresponding to the nominal as dependent).
I would appreciate if you could let me know how to compute Eta in python.
In fact, I have a dataframe with some numeric and some nominal variables.
Besides, how to plot a heatmap like plot for it?
The answer above omits the square root, so it returns eta squared rather than eta. However, in the main article (used by User777) that issue has been fixed.
Wikipedia has an article on what the correlation ratio is and how to calculate it. I've created a simpler version of the calculation and will use the example from the wiki:
import pandas as pd
import numpy as np
data = {'subjects': ['algebra'] * 5 + ['geometry'] * 4 + ['statistics'] * 6,
'scores': [45, 70, 29, 15, 21, 40, 20, 30, 42, 65, 95, 80, 70, 85, 73]}
df = pd.DataFrame(data=data)
print(df.head(10))
>>> subjects scores
0 algebra 45
1 algebra 70
2 algebra 29
3 algebra 15
4 algebra 21
5 geometry 40
6 geometry 20
7 geometry 30
8 geometry 42
9 statistics 65
def correlation_ratio(categories, values):
    categories = np.array(categories)
    values = np.array(values)
    ssw = 0
    ssb = 0
    for category in set(categories):
        subgroup = values[np.where(categories == category)[0]]
        ssw += sum((subgroup-np.mean(subgroup))**2)
        ssb += len(subgroup)*(np.mean(subgroup)-np.mean(values))**2
    return (ssb / (ssb + ssw))**.5
coef = correlation_ratio(df['subjects'], df['scores'])
print('Eta_squared: {:.4f}\nEta: {:.4f}'.format(coef**2, coef))
>>> Eta_squared: 0.7033
Eta: 0.8386
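The same quantity can be computed without a Python loop over categories. The sketch below is an alternative formulation, not the code above: it uses `np.unique` and `np.bincount` to get the group sizes and means in one pass, and divides the between-group sum of squares by the total sum of squares. `correlation_ratio_np` is an illustrative name.

```python
import numpy as np

def correlation_ratio_np(categories, values):
    """Eta (correlation ratio) of a numeric variable grouped by a nominal one."""
    values = np.asarray(values, dtype=float)
    # Integer code for each observation's category
    _, codes = np.unique(np.asarray(categories), return_inverse=True)
    counts = np.bincount(codes)                         # group sizes
    means = np.bincount(codes, weights=values) / counts  # group means
    grand_mean = values.mean()
    ssb = np.sum(counts * (means - grand_mean) ** 2)    # between-group sum of squares
    sst = np.sum((values - grand_mean) ** 2)            # total sum of squares
    return float(np.sqrt(ssb / sst)) if sst > 0 else 0.0

subjects = ['algebra'] * 5 + ['geometry'] * 4 + ['statistics'] * 6
scores = [45, 70, 29, 15, 21, 40, 20, 30, 42, 65, 95, 80, 70, 85, 73]
eta = correlation_ratio_np(subjects, scores)
print(f"Eta_squared: {eta**2:.4f}, Eta: {eta:.4f}")
```

For the Wikipedia example the between-group sum of squares is 6780 and the total is 9640, so this agrees with the loop version above.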
The answer is provided here:
def correlation_ratio(categories, measurements):
    fcat, _ = pd.factorize(categories)
    cat_num = np.max(fcat)+1
    y_avg_array = np.zeros(cat_num)
    n_array = np.zeros(cat_num)
    for i in range(0,cat_num):
        cat_measures = measurements[np.argwhere(fcat == i).flatten()]
        n_array[i] = len(cat_measures)
        y_avg_array[i] = np.average(cat_measures)
    y_total_avg = np.sum(np.multiply(y_avg_array,n_array))/np.sum(n_array)
    numerator = np.sum(np.multiply(n_array,np.power(np.subtract(y_avg_array,y_total_avg),2)))
    denominator = np.sum(np.power(np.subtract(measurements,y_total_avg),2))
    if numerator == 0:
        eta = 0.0
    else:
        eta = numerator/denominator  # this ratio is eta squared; take np.sqrt of it to get eta
    return eta

Plotting trends and predictions data from OLS (statsmodels)

I have this data from 1992 to 2016:
Year month data stdBS index1
1992-05-01 1992 5 302.35 31.69 727319
1992-06-01 1992 6 305.07 27.59 727350
1992-07-01 1992 7 297.12 29.12 727380
1992-08-01 1992 8 304.39 21.41 727411
1992-09-01 1992 9 294.30 32.26 727442
Using this code:
flow2=fm0['data']
fig,ax = plt.subplots(1,1, figsize=(6,4))
res2 = sm.tsa.seasonal_decompose(flow2)
residual = res2.resid
seasonal = res2.seasonal
trend = res2.trend
fig = res2.plot()
plt.show()
I obtained this plot:
Everything is fine, but now I need to plot the predictions fit
trend2 = trend.reset_index()
X = fm0.index1
y = trend
X = sm.add_constant(X)
model = sm.OLS(y,X, missing='drop')
results = model.fit()
predictions = results.predict(X)
p = results.summary()
With this short code:
fig, ax = plt.subplots(figsize=(8,4))
ax.scatter(df0.index, trend)
ax.plot(df0.index, df0.predic, 'r')
ax.set_ylabel('Data')
I obtained this plot:
But I lost the time index of the original trend plot. Is there a simple way to plot the trend from sm.tsa.seasonal_decompose together with the linear fit predictions against the original time index?
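One possible approach is to keep every result as a pandas Series that carries the DatetimeIndex of the trend, and to regress on the date ordinal (the sample rows suggest `index1` is exactly that: 727319 is the ordinal of 1992-05-01). The sketch below is an assumption-laden illustration, not the asker's pipeline: it uses a synthetic `trend` series in place of the `seasonal_decompose` output and swaps `sm.OLS` for `np.polyfit` to stay dependency-light.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the `trend` series from seasonal_decompose (hypothetical data)
idx = pd.date_range("1992-05-01", periods=36, freq="MS")
ordinals = np.array([d.toordinal() for d in idx], dtype=float)
trend = pd.Series(300.0 + 0.01 * (ordinals - ordinals[0]), index=idx)
trend.iloc[:6] = np.nan   # the decomposed trend has NaN at both ends
trend.iloc[-6:] = np.nan

# Fit only where the trend is defined, keeping the datetime index
valid = trend.dropna()
x = np.array([d.toordinal() for d in valid.index], dtype=float)
x_centered = x - x.mean()  # centring keeps the least-squares problem well conditioned
slope, intercept = np.polyfit(x_centered, valid.to_numpy(), 1)

# Predictions stored as a Series on the same dates as the trend
fitted = pd.Series(intercept + slope * x_centered, index=valid.index, name="fitted")
print(fitted.head())
```

Because `fitted` shares the index of `valid`, plotting both with `ax.scatter(valid.index, valid)` and `ax.plot(fitted.index, fitted, 'r')` puts the dates back on the x-axis.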
