I am having an issue when I run the code below. The goal is to develop an app that achieves real-time sound acquisition. I have set the CHUNK (frame) size to 320 at a 16 kHz sampling rate, hence a frame duration of 0.02 s. The issue is that when I record, the result (the content of the variable "many") contains some glitchy sounds or noise. When I double the CHUNK, the problem disappears. The value 0.02 depends on the nature of the problem I am trying to solve, so it must stay at 0.02 s. Do you have any suggestions?
import pyaudio
import struct
import numpy as np
import matplotlib.pyplot as plt
import time
import IPython.display as ipd

CHUNK = int(1*320)
FORMAT = pyaudio.paFloat32
CHANNELS = 1
RATE = 16000

p = pyaudio.PyAudio()
chosen_device_index = 1
for x in range(0, p.get_device_count()):
    info = p.get_device_info_by_index(x)
    #print p.get_device_info_by_index(x)
    if info["name"] == "pulse":
        chosen_device_index = info["index"]
print("Chosen index: ", chosen_device_index)

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input_device_index=chosen_device_index,
                input=True,
                output=False,
                frames_per_buffer=CHUNK)

plt.ion()
%matplotlib qt
fig, ax = plt.subplots()
x = np.arange(0, CHUNK)
data = stream.read(CHUNK)
print(len(data))
data_ = struct.unpack(str(CHUNK) + 'f', data)
line, = ax.plot(x, data_)
ax.set_ylim([-1, 1])

many = []
while True:
    data = struct.unpack(str(CHUNK) + 'f', stream.read(CHUNK))
    line.set_ydata(data)
    fig.canvas.draw()
    fig.canvas.flush_events()
    many = np.concatenate((many, data), axis=None)

ipd.Audio(many, rate=16000)
From the conversation between you and fdcpp, it seems true that this piece of code
line.set_ydata(data)
fig.canvas.draw()
fig.canvas.flush_events()
many= np.concatenate((many, data),axis=None)
takes more than 0.02 s to run. That is why, when the next CHUNK of data arrives, your code is not yet ready to receive it, which causes an input overflow.
There are different ways to bypass it; one is sketched right below.
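For what it's worth, PyAudio's blocking read takes an exception_on_overflow flag (it defaults to True in recent versions), so you can at least surface the overflow as an error instead of silently recording glitches. This is a hedged aside, not a fix:

try:
    data = stream.read(CHUNK)  # raises IOError when the input buffer overflowed
except IOError:
    data = b'\x00' * 4 * CHUNK  # substitute silence so the unpack below still works
    print("input overflow, chunk replaced with silence")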
But I agree with fdcpp that the best way to solve this problem is to think about your end goal. For example, you can separate the process that receives audio data from the process that consumes it (your line/fig code): one process just receives and stores the audio data, while the other takes the stored data and draws it, as in the sketch below.
But please keep in mind that as long as the drawing part takes more than 0.02 s, you cannot achieve the "real time" display you wanted.
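For illustration, here is a minimal sketch of that separation using PyAudio's callback mode: the audio thread only enqueues raw buffers, and the main thread draws at whatever pace it can manage. The queue name and the drain loop are my own scaffolding, not part of your code:

import queue
import numpy as np
import pyaudio
import matplotlib.pyplot as plt

CHUNK, RATE = 320, 16000
q = queue.Queue()

# PyAudio calls this on its own thread every CHUNK frames; it must
# return quickly, so it does nothing but enqueue the raw bytes.
def callback(in_data, frame_count, time_info, status):
    q.put(in_data)
    return (None, pyaudio.paContinue)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=RATE,
                input=True, frames_per_buffer=CHUNK,
                stream_callback=callback)

plt.ion()
fig, ax = plt.subplots()
line, = ax.plot(np.zeros(CHUNK))
ax.set_ylim([-1, 1])
many = []

try:
    while True:
        # Drain everything recorded since the last redraw, so no chunk
        # is lost even when drawing takes longer than 0.02 s.
        while not q.empty():
            many.append(np.frombuffer(q.get(), dtype=np.float32))
        if many:
            line.set_ydata(many[-1])  # draw only the newest chunk
            fig.canvas.draw_idle()
        fig.canvas.flush_events()
finally:
    stream.close()
    p.terminate()

The full recording is then np.concatenate(many), which you can hand to IPython.display.Audio exactly as before; the acquisition stays gapless even though the display lags behind it.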
Related
I have a sound file in which I am looking for the time the music starts, and I am limited to using only the scipy module. How do I detect the time on the x axis when the sound starts?
An example figure is shown below; the portion of the signal with higher magnitude shows where the music plays.
Note that sometimes there is noise in the signal, which can also have high peaks.
import scipy
import numpy as np
import matplotlib.pyplot as plt

# create a single test signal: a quiet 10 Hz tone followed by a
# louder 25 Hz "music" tone
dt = 0.001
t = np.arange(0, 6, dt)
lowFreq = np.sin(2*np.pi*10*t)
musicFreq = 3.5*np.sin(2*np.pi*25*t)
combinedSignal = np.concatenate([lowFreq, musicFreq])
plt.plot(combinedSignal)
plt.show()
I think you can get the time when the sound starts by just using:
idx_start = np.where(combinedSignal > 1)[0][0]
This returns the first index where the magnitude of your signal is greater than 1.
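Since the question asks for a time on the x axis rather than an index, it is worth adding the one-line conversion using the sampling step from your own script (dt = 0.001 s):

idx_start = np.where(combinedSignal > 1)[0][0]
t_start = idx_start * dt  # seconds; dt = 0.001 in the script above
print(t_start)            # approx. 6.0, where the 3.5-amplitude part begins

The threshold of 1 works here because the quiet tone never exceeds an amplitude of 1 while the music reaches 3.5; with noisy data you would want a threshold above the noise peaks, as you noted.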
I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to CSV in a time-series format.
Using @Beginner's answer heavily from this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output as an image. I'm trying to simplify that somewhat to get a text output in CSV format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:
time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200
For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html. (Note: I simplified the wav down to a single channel and removed the metadata with Audacity to get it to work.)
Heavy props to @Beginner for 99.9% of the following; anything nonsensical is surely mine.
import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks

filepath = "40sw3.wav"

""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
    win = window(frameSize)
    hopSize = int(frameSize - np.floor(overlapFac * frameSize))

    # zeros at beginning (thus center of 1st window should be for sample nr. 0)
    samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
    # cols for windowing
    cols = np.ceil((len(samples) - frameSize) / float(hopSize)) + 1
    # zeros at end (thus samples can be fully covered by frames)
    samples = np.append(samples, np.zeros(frameSize))

    frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize),
                                      strides=(samples.strides[0]*hopSize,
                                               samples.strides[0])).copy()
    frames *= win

    return np.fft.rfft(frames)

""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
    timebins, freqbins = np.shape(spec)

    scale = np.linspace(0, 1, freqbins) ** factor
    scale *= (freqbins-1)/max(scale)
    scale = np.unique(np.round(scale))

    # create spectrogram with new freq bins
    newspec = np.complex128(np.zeros([timebins, len(scale)]))
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            newspec[:, i] = np.sum(spec[:, int(scale[i]):], axis=1)
        else:
            newspec[:, i] = np.sum(spec[:, int(scale[i]):int(scale[i+1])], axis=1)

    # list center freq of bins
    allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
    freqs = []
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            freqs += [np.mean(allfreqs[int(scale[i]):])]
        else:
            freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]

    return newspec, freqs

""" compute spectrogram """
def compute_stft(audiopath, binsize=2**10):
    samplerate, samples = wav.read(audiopath)
    s = stft(samples, binsize)
    sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
    ims = 20.*np.log10(np.abs(sshow)/10e-6)  # amplitude to decibel
    return ims, samples, samplerate, freq

""" plot spectrogram """
def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
    timebins, freqbins = np.shape(ims)

    plt.figure(figsize=(15, 7.5))
    plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
    plt.colorbar()

    plt.xlabel("time (s)")
    plt.ylabel("frequency (hz)")
    plt.xlim([0, timebins-1])
    plt.ylim([0, freqbins])

    xlocs = np.float32(np.linspace(0, timebins-1, 5))
    plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
    ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
    plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])

    if plotpath:
        plt.savefig(plotpath, bbox_inches="tight")
    else:
        plt.show()

    plt.clf()

""" HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT """
ims, samples, samplerate, freq = compute_stft(filepath)

""" Print lengths """
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))

""" Write values to files """
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='\n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='\n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='\n', header='frequencies')
In terms of output values, the file I'm analyzing is approx. 19.1 seconds long and the sample rate is 44100, so I'd expect about 842k values for any given variable. But that is not what I see. Instead:
freq comes out with just a handful of values (512), and while they appear to be in the correct range for the expected frequencies, they are ordered least to greatest, not as a time series like I expected. The 512 values are, I assume, the "fast" in FFT, basically down-sampled...
ims appears to be amplitude, but the values seem too high, although the sample count is correct. I should be seeing -50 up to ~240 dB.
samples . . . not sure.
In short, can someone advise on how I'd get the FFT out to a text file with time, amplitude, and frequency values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I just get the data out?
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. E.g. the FFT of 1024 samples contains 512 frequency bins for one window of time (one time step), each with an amplitude. If you want a time step of one millisecond, you will have to offset the window of samples you feed each STFT so that it is centered on that point in your sample vector; with an FFT about 23 milliseconds long, that implies a high overlap between windows. You could use shorter windows, but the time-frequency trade-off would give you proportionately less frequency resolution.
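To make that concrete, here is one hedged sketch of how you could flatten the ims and freq arrays from your compute_stft into a long-format CSV, one row per (time bin, frequency bin) pair rather than one row per millisecond; the hop arithmetic assumes the binsize=1024, 50%-overlap defaults of your stft:

import csv

ims, samples, samplerate, freq = compute_stft(filepath)
timebins, freqbins = ims.shape
hop = (2**10) // 2  # binsize = 1024 with overlapFac = 0.5

with open(filepath + '-stft.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['time_in_s', 'freq_in_hz', 'amplitude_in_db'])
    for t in range(timebins):
        time_s = t * hop / samplerate  # start of this analysis window
        for k in range(freqbins):
            writer.writerow([round(time_s, 4), round(freq[k], 1),
                             round(ims[t, k], 1)])

Be warned the file grows quickly: roughly 1,600 time bins times 512 frequency bins is on the order of 800k rows for your 19-second clip, which is why the long format beats one row per millisecond.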
I want to perform an FFT on a data array that I have extracted from an MPU6050 sensor connected to an Arduino UNO, using Python.
Please find a data sample below:
0.13,0.04,1.03
0.14,0.01,1.02
0.15,-0.04,1.05
0.16,0.02,1.05
0.14,0.01,1.02
0.16,-0.03,1.04
0.15,-0.00,1.04
0.14,0.03,1.02
0.14,0.01,1.03
0.17,0.02,1.05
0.15,0.03,1.03
0.14,0.00,1.02
0.17,-0.02,1.05
0.16,0.01,1.04
0.14,0.02,1.01
0.15,0.00,1.03
0.16,0.03,1.05
0.11,0.03,1.01
0.15,-0.01,1.03
0.16,0.01,1.05
0.14,0.02,1.03
0.13,0.01,1.02
0.15,0.02,1.05
0.13,0.00,1.03
0.08,0.01,1.03
0.09,-0.01,1.03
0.09,-0.02,1.03
0.07,0.01,1.03
0.06,0.00,1.05
0.04,0.00,1.04
0.01,0.01,1.02
0.03,-0.05,1.02
-0.03,-0.05,1.03
-0.05,-0.02,1.02
I have taken the 1st column (X axis) and saved it in an array.
Reference: https://hackaday.io/project/12109-open-source-fft-spectrum-analyzer/details
From this I took the FFT part, and the code is below:
from scipy.signal import filtfilt, iirfilter, butter, lfilter
from scipy import fftpack
import numpy as np
import string
import matplotlib.pyplot as plt

sample_rate = 0.2
accx_list_MPU = []
outputfile1 = 'C:/Users/Meena/Desktop/SensorData.txt'

def fftfunction(array):
    n = len(array)
    print('The length is....', n)
    k = np.arange(n)
    fs = sample_rate/1.0
    T = n/fs
    freq = k/T
    freq = freq[range(n//2)]

    Y = fftpack.fft(array)/n
    Y = Y[range(n//2)]

    plt.plot(freq, abs(Y))
    plt.grid()
    plt.show()

with open(outputfile1) as f:
    string1 = f.readlines()
    N1 = len(string1)
    for i in range(10, N1):
        if (i % 2 == 0):
            new_list = string1[i].split(',')
            l = len(new_list)
            if (l == 3):
                accx_list_MPU.append(float(new_list[0]))

fftfunction(accx_list_MPU)
I have got the output of the FFT as shown in FFToutput.
I do not understand whether the graph is correct. This is the first time I'm working with FFT; how do we relate the plot back to the data?
This is what I got after the changes suggested: FFTnew
Here's a little rework of your fftfunction:
def fftfunction(array):
    N = len(array)
    amp_spec = abs(fftpack.fft(array)) / N
    freq = np.linspace(0, 1, num=N, endpoint=False)

    plt.plot(freq, amp_spec, "o-", markerfacecolor="none")
    plt.xlim(0, 0.6)  # easy way to hide datapoints
    plt.margins(0.05, 0.05)
    plt.xlabel("Frequency $f/f_{sample}$")
    plt.ylabel("Amplitude spectrum")
    plt.minorticks_on()
    plt.grid(True, which="both")

fftfunction(X)
Specifically, it removes the fs = sample_rate/1.0 part; shouldn't that be the inverse?
The plot then basically tells you how strongly each frequency (relative to the sample frequency) is present in the signal. Looking at your image, at f = 0 you have the signal's offset or mean value, which is around 0.12. Beyond that there is not much going on: no peaks that would indicate a particular frequency being overly present in the measurement data.
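If you would rather see the axis in hertz, a small variation of the sketch above works, assuming you know your true sampling rate in Hz (the fftfunction_hz name and the fs parameter are mine; your sample_rate = 0.2 looks more like a sample period than a rate, so double-check it):

def fftfunction_hz(array, fs):
    # fs is the true sampling rate in Hz; with it the frequency
    # axis becomes absolute instead of relative to f_sample
    N = len(array)
    amp_spec = np.abs(fftpack.fft(array)) / N
    freq_hz = np.linspace(0, fs, num=N, endpoint=False)
    plt.plot(freq_hz[:N // 2], amp_spec[:N // 2], "o-")
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Amplitude spectrum")
    plt.show()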
I am plotting real-time audio data from the audio jack and trying to show the entire signal in one plot (from second 0 to the current time), so I am appending the audio signal to a list ("merged" in my case) and plotting the updated list again and again. But as the data grows (i.e., the number of elements in merged), the plotting becomes slower and slower. Any suggestions to make it faster, keeping in mind that the final plot needs to include all the data points from start to end?
Please find my code below
import pyaudio
import itertools
import numpy as np
import time
import matplotlib.pyplot as plt
from scipy.signal import butter, lfilter
import matplotlib.animation as animation

RATE = 44100
CHUNK = int(RATE/2)  # RATE / number of updates per second

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.set_ylim([-1, 1])
line, = ax.plot([], [], '-k', label='red')
ax.legend()

frames = []

# define callback (2)
def callback(in_data, frame_count, time_info, status):
    # convert data to array
    data = (np.fromstring(in_data, dtype=np.float32))
    frames.append(data)
    return (in_data, pyaudio.paContinue)

if __name__ == "__main__":
    # instantiate PyAudio (1)
    p = pyaudio.PyAudio()
    # open stream using callback (3)
    stream = p.open(format=pyaudio.paFloat32,
                    channels=1,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK,
                    stream_callback=callback)
    # start the stream (4)
    stream.start_stream()

    tt = 0
    xar = []
    while stream.is_active():
        if frames:
            t1 = time.time()

            def animate(i):
                # data appended to frames (the global variable) is pulled
                # out every time one cycle of plotting is over
                data_out = frames.pop()
                xar.append(data_out.tolist())
                merged = list(itertools.chain.from_iterable(xar))  # merging of audio data
                line.set_ydata(merged)
                line.set_xdata(range(len(merged)))
                ax.relim()
                ax.autoscale_view()
                data_filter = []
                data_out = []
                print((time.time() - t1) % 60)

            ani = animation.FuncAnimation(fig, animate, interval=1000)
            plt.show()

    # close stream and connection
    stream.close()
    p.terminate()
    # wait for stream to finish (5)
I have a piece of code that takes a real-time audio signal from the audio jack of my laptop and plots its graph after some basic filtering. The problem I am facing is that the real-time plotting gets slower and slower as the program runs.
Any suggestions to make this plotting faster and proceed at a constant rate? I think the animation function would make it faster, but I was not able to formulate it according to my requirements.
import pyaudio
import numpy as np
import time
import matplotlib.pyplot as plt
import scipy.io.wavfile
from scipy.signal import butter, lfilter
import wave

plt.rcParams["figure.figsize"] = 8, 4

RATE = 44100
CHUNK = int(RATE/2)  # RATE / number of updates per second

# Filter coefficients
nyq = 0.5 * RATE
low = 3000 / nyq
high = 6000 / nyq
b, a = butter(7, [low, high], btype='band')

# Figure structure
fig, (ax, ax2) = plt.subplots(nrows=2, sharex=True)
x = np.linspace(1, CHUNK, CHUNK)
extent = [x[0] - (x[1] - x[0]) / 2., x[-1] + (x[1] - x[0]) / 2., 0, 1]

def soundplot(stream):
    t1 = time.time()
    data = np.array(np.fromstring(stream.read(CHUNK), dtype=np.int32))
    y1 = lfilter(b, a, data)
    ax.imshow(y1[np.newaxis, :], cmap="jet", aspect="auto")
    plt.xlim(extent[0], extent[1])
    plt.ylim(-50000000, 50000000)
    ax2.plot(x, y1)
    plt.pause(0.00001)
    plt.cla()  # which clears data but not axes
    y1 = []
    print(time.time()-t1)

if __name__ == "__main__":
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt32, channels=1, rate=RATE, input=True,
                    frames_per_buffer=CHUNK)
    for i in range(RATE):
        soundplot(stream)
    stream.stop_stream()
    stream.close()
    p.terminate()
This is a little long for a comment, and since you're asking for suggestions I think it's a semi-complete answer. There's more info and examples online about real-time plotting with matplotlib if you need ideas beyond what's here. The library wasn't designed for this, but it's possible.
First step, profile the code. You can do this with
import cProfile
cProfile.run('soundplot(stream)')
That will show where most of the time is being spent.
Without doing that, I'll give a few tips, but be aware that profiling may show other causes.
First, you want to eliminate redundant function calls in the function soundplot. Both of the following are unnecessary:
plt.xlim(extent[0], extent[1])
plt.ylim(-50000000, 50000000)
They can be called once in initialization code. imshow updates these automatically, but for speed you shouldn't call that every time. Instead, in some initialization code outside the function use im=imshow(data, ...), where data is the same size as what you'll be plotting (although it may not need to be). Then, in soundplot use im.set_data(y1[np.newaxis, :]). Not having to recreate the image object each iteration will speed things up immensely.
Since the image object remains through each iteration, you'll also need to remove the call to cla(), and replace it with either show() or draw() to have the figure draw the updated image. You can do the same with the line on the second axis, using line.set_ydata(y).
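A minimal sketch of that pattern, reusing fig, ax, ax2, x, b, a, CHUNK and extent from the question's code; the fixed vmin/vmax are my assumption, since set_data does not rescale the color limits:

# one-time setup, outside the update function
im = ax.imshow(np.zeros((1, CHUNK)), cmap="jet", aspect="auto",
               vmin=-50000000, vmax=50000000, extent=extent)
line, = ax2.plot(x, np.zeros(CHUNK))
ax2.set_ylim(-50000000, 50000000)

def soundplot(stream):
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int32)
    y1 = lfilter(b, a, data)
    im.set_data(y1[np.newaxis, :])  # update the existing image in place
    line.set_ydata(y1)              # update the existing line; no cla()
    plt.pause(0.00001)              # lets the GUI process the redraw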
Please post the before and after rate it runs at, and let me know if that helps.
Edit: some quick profiling of similar code suggests a 100-500x speedup, mostly from removing cla().
Also, looking at your code, the reason it slows down is that cla() is never called on the first axis. Eventually there will be hundreds of images drawn on that axis, slowing matplotlib to a crawl.