I am using the libmpg123 library to parse and decode an mp3 song, and I want to draw something like this:
http://img.photobucket.com/albums/v627/Flabbergast/jurafirstCDpressing-1.jpg
How can I get the amplitude values from an mp3 song? I have been trying to find a solution on a lot of pages, but without any good result.
Thanks for your answers.
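For what it's worth, the amplitude data is exactly what the decoder already gives you: the PCM samples that mpg123_read() fills your output buffer with. A waveform image like the linked one is usually drawn by taking the peak of each block of samples. A minimal Python sketch of that drawing step, where pcm_bytes is a hypothetical buffer you have already collected from mpg123_read() (16-bit signed samples, libmpg123's default output format):

import numpy as np
import matplotlib.pyplot as plt

# pcm_bytes is assumed: raw decoded output collected from mpg123_read()
samples = np.frombuffer(pcm_bytes, dtype=np.int16).astype(np.float32) / 32768.0

bars = 800                                    # one bar per pixel column of the image
block = len(samples) // bars
peaks = np.abs(samples[:bars * block].reshape(bars, block)).max(axis=1)

plt.bar(range(bars), peaks, width=1.0)        # top half of the waveform
plt.bar(range(bars), -peaks, width=1.0)       # mirrored bottom half, for the classic look
plt.show()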
I currently have a PNG of a spectrogram.
I don't have the original audio file, but I am wondering if there is a way I can convert this into a SciPy spectrogram object. I was thinking I could try to convert the image to an audio file first, but it seems there aren't many packages for reconstructing audio from a spectrogram, since so much data has already been lost.
Any ideas and suggestions would be appreciated!
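One possible direction, heavily hedged: if the PNG is a plain magnitude spectrogram (linear frequency axis, no axis labels or colorbar baked into the pixels), you can treat pixel intensities as magnitudes and estimate the missing phase with the Griffin-Lim algorithm. A sketch using librosa, where "spectrogram.png", the -80..0 dB pixel mapping, and the 22050 Hz sample rate are all assumptions:

import numpy as np
import librosa
import matplotlib.pyplot as plt
import soundfile as sf

img = plt.imread("spectrogram.png")                       # hypothetical filename; PNG values land in 0..1
S = img[..., :3].mean(axis=2) if img.ndim == 3 else img   # drop alpha, collapse RGB to grey
S = np.flipud(S)                                          # row 0 of a PNG is the TOP of the image
S = librosa.db_to_amplitude(S * 80.0 - 80.0)              # assumed mapping: black = -80 dB, white = 0 dB
y = librosa.griffinlim(S)                                 # estimate the lost phase, invert the STFT
sf.write("reconstructed.wav", y, 22050)                   # the true sample rate is unknown

The result will only be as good as those assumptions; a mel/log-frequency image, a colormapped image, or an unknown dB scale will make the reconstruction sound very wrong.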
I was working on changing the bitrate of some MP3 files from 160 kbps to 80 kbps, just to see how it affects the audio quality and frequencies. So I plotted the spectrogram for both waves, and it looks something like this (the upper one being 80 kbps, of course).
I was curious whether spectrograms are related to audio bitrate in any way, but a quick search suggested they are not. It would be great if someone could explain the reason behind the differences in frequencies between the spectrograms if there is supposedly no relation. Also, if I had to analyse the differences between the two audio waves, what would be the best way to do it if spectrograms shouldn't be used?
Sorry if I am missing something here, as I am not from the signal processing field, but I am keen on learning. Any direction or links that could help me out would be great. Thanks in advance!
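For reference, the side-by-side comparison described above can be reproduced with SciPy and matplotlib; a sketch with hypothetical filenames, assuming both MP3s were first decoded to WAV:

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile

# hypothetical filenames, decoded from the two MP3s beforehand
rate_hi, hi = wavfile.read("song_160kbps.wav")
rate_lo, lo = wavfile.read("song_80kbps.wav")
hi = hi[:, 0] if hi.ndim == 2 else hi            # keep one channel
lo = lo[:, 0] if lo.ndim == 2 else lo

f_hi, t_hi, S_hi = signal.spectrogram(hi, fs=rate_hi)
f_lo, t_lo, S_lo = signal.spectrogram(lo, fs=rate_lo)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.pcolormesh(t_lo, f_lo, 10 * np.log10(S_lo + 1e-12))   # 80 kbps on top, as in the question
ax1.set_ylabel("Hz (80 kbps)")
ax2.pcolormesh(t_hi, f_hi, 10 * np.log10(S_hi + 1e-12))
ax2.set_ylabel("Hz (160 kbps)")
ax2.set_xlabel("time (s)")
plt.show()

The visible difference is expected, by the way: at low bitrates, lossy encoders save bits by discarding high-frequency content, which shows up as a missing band at the top of the 80 kbps spectrogram.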
So I'm doing real-time audio processing in Python. The good news is, I found this link, which helps me collect data from my PC mic and plot all the data in real time, which is fantastic.
I also found this code from other links, where I can stream the data from the mic to the speaker for a given time:
self.stream = self.p.open(format=pyaudio.paInt16,
                          channels=self.CHANNELS,
                          rate=self.RATE,
                          input=True,           # read from the microphone
                          output=True,          # and write to the speaker on the same stream
                          frames_per_buffer=self.CHUNK)

def stream_data(self):
    for i in range(0, int(self.RATE / self.CHUNK * self.RECORD_SECONDS)):
        data = self.stream.read(self.CHUNK)   # one chunk of raw bytes from the mic
        self.stream.write(data, self.CHUNK)   # pass the same bytes straight to the speaker
Where my idea diverges from the link above is that I want to apply an FFT to the microphone data before I send it to the speaker. If I print the 'data' from the above code, I see that it is a whole lot of raw bytes (which print as hex gibberish) that have to be converted to numeric samples. From the earlier link, I know how to do that as well:
data = np.frombuffer(self.stream.read(self.CHUNK), dtype=np.int16)  # raw bytes -> int16 samples
Now I have the data I need as numeric samples. But once I have processed this data, how can I convert it back into the byte format that 'self.stream.write' can understand and output to the speaker? I'm not sure how that gets done.
I believe I've been able to find an answer, so in case it helps someone else as well, here is the paper that helped me:
Real-Time Digital Signal Processing Using pyaudio_helper and the ipywidgets
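In short, stream.write() accepts raw bytes, and a NumPy array converts back to bytes with tobytes(). A minimal sketch of the round trip, meant to slot into the stream_data() loop above (the FFT/inverse-FFT pair is just a placeholder for whatever processing you do):

import numpy as np

data = self.stream.read(self.CHUNK)                 # raw bytes from the mic
samples = np.frombuffer(data, dtype=np.int16)       # bytes -> int16 samples
spectrum = np.fft.rfft(samples)                     # process in the frequency domain here
processed = np.fft.irfft(spectrum, n=len(samples))  # back to the time domain (floats)
out = processed.astype(np.int16).tobytes()          # int16 samples -> raw bytes again
self.stream.write(out, self.CHUNK)                  # pyaudio plays the bytes as before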
I am planning to build a music genre classifier working with mp3 files, and I want to test which features work best for this. I have seen a paper that used MFCCs (Mel-Frequency Cepstral Coefficients) for this, but as a beginner in machine learning, that method felt complicated. I also saw some that converted mp3 files into spectrograms and analysed those, but with no success. What I am looking for is a few easy-to-extract features to classify mp3 files. Do any other methods exist besides the two I just listed?
There are some papers on this; you can easily google them.
But the simplest features would be the beat speed, the proportion of high to low frequencies, etc.
All of this can be extracted using the FFT (Fast Fourier Transform), but I am afraid this may not be so easy if you haven't done it before...
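To make that concrete, a sketch of both features in Python; "song.mp3" is a hypothetical file, and librosa (which can decode mp3 when ffmpeg/audioread is available) stands in for writing the FFT plumbing yourself:

import numpy as np
import librosa

y, sr = librosa.load("song.mp3", mono=True)      # decode to a mono float waveform

tempo, _ = librosa.beat.beat_track(y=y, sr=sr)   # rough "beat speed" in BPM

spectrum = np.abs(np.fft.rfft(y))                # magnitude spectrum of the whole song
freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
cutoff = 1000.0                                  # arbitrary split between "low" and "high"
ratio = spectrum[freqs >= cutoff].sum() / spectrum[freqs < cutoff].sum()

features = [tempo, ratio]                        # a tiny, easy-to-extract feature vector

The 1 kHz cutoff is arbitrary, and in practice you would compute these per analysis window rather than over the whole song.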
I need to apply an impulse response recorded at 48 kHz to an audio file recorded at 44.1 kHz; the audio could be someone speaking, for example. If I'm using the correct term, I need to convolve the two audio files together so it sounds like someone is speaking inside a cathedral.
What I don't know is how to go about doing this. I looked at the Minim library, since it's the only audio library that I remember using, and I found an example that applies the impulse response of a low-pass filter to an audio file. Is there a way to convolve two audio files together to output a new sound? Audio processing isn't my forte, so please don't mind my ignorance. Thanks, I'm trying to figure this out along the way.
Yes, convolution is what you want, but first your sources need to be at the same sample rate, so you will have to perform a sample-rate conversion on one of them. Once your sources are at the same sample rate, you have two options for performing the convolution: 1. you can do it "directly", which is the most straightforward way but takes M*N time, or 2. you can do it using the Fourier transform, which is more complex but faster; for that you will also need to implement the overlap-add algorithm. Looking at the docs of Minim, it looks to me like they use a standard IIR filter, not convolution by an impulse response, so I don't think that will help; you would have to do a lot of work on top of what Minim gives you to do convolution with the FFT. If you want to go the "direct" route, it will look something like this:
# direct convolution in Python; output is truncated to the input length
output = [0.0] * len(input_samples)
for i in range(len(input_samples)):
    for j in range(len(conv)):
        if i - j >= 0:
            output[i] += input_samples[i - j] * conv[j]
More details here: http://www-rohan.sdsu.edu/~jiracek/DAGSAW/4.3.html or google "discrete convolution".
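If the project isn't locked to Java/Minim, the whole pipeline (resampling plus FFT-based convolution) is a few lines in Python with SciPy; a sketch with hypothetical filenames, assuming mono WAV files:

import numpy as np
from scipy.io import wavfile
from scipy import signal

rate_x, x = wavfile.read("voice_44100.wav")      # 44.1 kHz speech
rate_h, h = wavfile.read("ir_48000.wav")         # 48 kHz cathedral impulse response

# resample the IR from 48 kHz down to 44.1 kHz (44100/48000 reduces to 147/160)
h = signal.resample_poly(h.astype(np.float64), up=147, down=160)

# FFT-based convolution; scipy.signal.oaconvolve is the explicit overlap-add variant
y = signal.fftconvolve(x.astype(np.float64), h)
y /= np.max(np.abs(y))                           # normalize to avoid clipping
wavfile.write("cathedral_voice.wav", rate_x, (y * 32767).astype(np.int16))

Here fftconvolve does the fast convolution for you, so you don't have to implement overlap-add yourself.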
Update: Minim does give you convolution: http://code.compartmental.net/minim/javadoc/ddf/minim/effects/Convolver.html