Real-Time Audio Analysis in Linux

I'm wondering: what is the recommended audio library to use?
I'm attempting to make a small program that will aid in tuning instruments (piano, guitar, etc.). I've read about the ALSA and Marsyas audio libraries.
I'm thinking the idea is to sample data from the microphone, do analysis on chunks of 5-10 ms (from what I've read), then perform an FFT to figure out which frequency contains the largest peak.
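A minimal sketch of that FFT step, assuming FFTW (fftw3) and a chunk of mono samples already captured from the microphone. Note that a 5-10 ms chunk gives only a coarse bin spacing of sampleRate/N Hz, which is one reason the answers below suggest dedicated F0 estimators for tuning:

// Hedged sketch of peak picking with FFTW; not tied to any capture API.
#include <fftw3.h>
#include <vector>

// Returns the frequency (in Hz) of the strongest FFT bin in one chunk.
double peakFrequency(std::vector<double> chunk, double sampleRate) {
    const int n = static_cast<int>(chunk.size());
    fftw_complex* out = fftw_alloc_complex(n / 2 + 1);
    fftw_plan plan = fftw_plan_dft_r2c_1d(n, chunk.data(), out, FFTW_ESTIMATE);
    fftw_execute(plan);
    int peak = 1;                         // skip bin 0 (the DC component)
    double best = 0.0;
    for (int k = 1; k <= n / 2; ++k) {
        const double mag2 = out[k][0] * out[k][0] + out[k][1] * out[k][1];
        if (mag2 > best) { best = mag2; peak = k; }
    }
    fftw_destroy_plan(plan);
    fftw_free(out);
    return peak * sampleRate / n;         // bin index -> frequency in Hz
}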

This guide should help. Don't use ALSA directly for your application; use a higher-level API. If you decide you'd like to use JACK, http://jackaudio.org/applications lists three instrument tuners you can use as example code.

Marsyas would be a great choice for this; it's built for exactly this kind of task.
For tuning an instrument, what you need is an algorithm that estimates the fundamental frequency (F0) of a sound. There are a number of algorithms to do this; one of the newest and best is the YIN algorithm, developed by Alain de Cheveigné and Hideki Kawahara. I recently added the YIN algorithm to Marsyas, and using it is dead simple.
Here's the basic code that you would use in Marsyas:
MarSystemManager mng;

// A Series network: each MarSystem feeds its output to the next
MarSystem* net = mng.create("Series", "series");

// SoundFileSource -> ShiftInput (overlapping windows) -> AubioYin (F0 estimate)
net->addMarSystem(mng.create("SoundFileSource", "src"));
net->addMarSystem(mng.create("ShiftInput", "si"));
net->addMarSystem(mng.create("AubioYin", "yin"));

net->updctrl("SoundFileSource/src/mrs_string/filename", inAudioFileName);

while (net->getctrl("SoundFileSource/src/mrs_bool/notEmpty")->to<mrs_bool>()) {
    net->tick();
    realvec r = net->getctrl("mrs_realvec/processedData")->to<mrs_realvec>();
    cout << r(0,0) << endl;  // element (0,0) holds the F0 estimate for this window
}
This code first creates a Series object that we will add components to. In a Series, each component receives the output of the previous MarSystem. We then add a SoundFileSource, into which you can feed a .wav or .mp3 file. Next we add the ShiftInput object, which outputs overlapping chunks of audio; these are fed into the AubioYin object, which estimates the fundamental frequency of each chunk. We then tell the SoundFileSource that we want to read the file inAudioFileName.
The while statement loops until the SoundFileSource runs out of data. Inside the loop, we take the data that the network has processed and output the (0,0) element, which is the fundamental frequency estimate.
This is even easier when you use the Python bindings for Marsyas.

http://clam-project.org/
CLAM is a full-fledged software framework for research and application development in the Audio and Music Domain. It offers a conceptual model as well as tools for the analysis, synthesis and processing of audio signals.
They have a great API, nice GUI tools, and a few finished apps where you can see how everything fits together.

ALSA is more or less the default standard for Linux now, by virtue of its drivers being included in the kernel and OSS being deprecated. However, there are alternatives to the ALSA userspace library, like JACK, which is aimed at low-latency, professional-type applications. Its API seems nicer; although I've not used it, my brief exposure to the ALSA API makes me think that almost anything would be better.

Audacity includes a frequency plot feature and has built-in FFT filters.

Related

APCS final project: Converting an audio file to a simpler MIDI file

Let's say I have the audio file for Happy Birthday. I want to convert that audio file into a simpler MIDI version of the same tune.
First, I'd like to know whether I have the ability to program this: can a high schooler who's almost finished with APCS program this?
If I can:
How would I change the BPM of the song? I've searched through a bunch of websites, but they weren't very helpful.
I know that audio files can be represented as waveforms. How would I scan for each individual wave in an audio file (I need this to isolate the notes)?
This is a very ambitious project, actually. One reason is that it involves using digital signal processing tools like the FFT (fast Fourier transform) to analyze the sound and pick out the pitches. You might be able to find a library that can do this, but coding such a tool yourself would involve a steep learning curve.
If you would like to look further into this, there is a good online resource called "The Scientist and Engineer's Guide to Digital Signal Processing". I was able to work through and understand the discrete Fourier transform with only high school math (lots of trig) and a bit of calculus. It was a lift, though.
Trying to analyze rhythm is also no easy task. Even with the advanced tools provided in professional notation systems such as Finale, people have trouble playing rhythms in time well enough for the best transcription tools. Algorithms that "quantize" the beats help, but also limit the amount of detail that can be included in the playback.
My guess is that, as interesting and worthwhile as this project would be, bringing it to completion before the semester ends would require putting together prebuilt pieces. A lot of programming is done that way these days.
If you scale the project back to something like just getting your code to analyze a short sample of a single note and give its pitch, that would be both impressive and doable with a lot of work. It could be done with a plain DFT algorithm instead of requiring an FFT, reducing the amount of background you'd have to acquire first. That way, you'd only have to work your way up to understanding and implementing the material at this link, which covers calculating the DFT. Notice that there is example code in BASIC; the code examples throughout this book are a big help.
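For a flavor of what that scaled-back version involves, here is a rough sketch of a naive DFT in C++ (an assumed fragment, not the book's BASIC code). It is O(N^2), but that is fine for a short sample of a single note:

#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Magnitude of DFT bin k for a block of N samples.
double dftBinMagnitude(const std::vector<double>& x, int k) {
    const double N = static_cast<double>(x.size());
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        const double angle = -2.0 * M_PI * k * static_cast<double>(n) / N;
        sum += x[n] * std::complex<double>(std::cos(angle), std::sin(angle));
    }
    return std::abs(sum);
}
// Scanning k from 1 to N/2 and taking the largest magnitude gives the pitch:
// bin k corresponds to the frequency k * sampleRate / N.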

Audio processing in LabVIEW (is stream processing possible?)

I am quite new to LabVIEW and NI devices.
I am working on an active noise cancellation project, where I will be using two microphone inputs and one loudspeaker as output. I have NI myRIO 1900 and cDAQ-9178 devices in our university lab. I need to do real-time audio processing: I will collect data from one microphone and process it using the filtered-x LMS (FxLMS) algorithm to produce anti-noise from the loudspeaker; the other microphone is the error microphone. I want to process the data quickly enough (within 1.7 ms) to get a real-time response at a 44100 Hz sample rate. My questions are: is this possible in LabVIEW, is stream processing possible in LabVIEW, and can I achieve audio latencies as small as this?
I have searched for audio processing objects in the LabVIEW help. I can only find 'Acquire Sound' and 'Play Waveform', and surprisingly 'Acquire Sound' can only be configured for a duration of at least 1 second; I can't enter a time in milliseconds. (I am still having problems installing myRIO, so I have used a host-computer VI to do this.)
Please help! Thank you.
The thing you should be looking into is the FPGA part of the myRIO. You're never going to get a 1.7 ms response time via the host computer. The FPGA can access the analogue inputs and outputs, so if you can get your algorithm to compile onto the FPGA, it should work.
Yes, it is possible with LabVIEW, insofar as any algorithm you want to code up can be executed by LabVIEW. If you're asking whether a library already exists to do the filtering you want, you may want to explore the NI Sound & Vibration toolkit, which is sold separately from LabVIEW, or explore third-party libraries.
The raw waveform mathematics abilities that come with LabVIEW are fairly extensive. You should be able to code whatever transforms you want if you know the underlying math.

Audio signal comparison from Radar

I am working on a student project. We have a radar that gives an audio indication through headphones of the type of target it has detected (e.g. car/truck/man). The radar distinguishes between these targets based on Doppler variation, down-converts it into the audible range, and the operator can hear it through headphones. The system provides sample audio files corresponding to each type of target (man/car/truck) to train the operator to recognize what he is hearing when the live signal is fed in, and accordingly decide what the target is.
I intend for software to do the job of this operator.
I want to compare the live audio signal from the radar with 7 different test audio files and have the software tell me which file matches the input.
Kindly educate me: can audio fingerprinting software do this job?
What you're trying to do can be implemented in GNU Radio, in a lot of ways.
You could, for example, take the audio signal as input to an audio source and connect that to a set of frequency-xlating FIR filters, which you'd design using the gr_filter_design tool. You would then estimate the (potentially decimated) signal in each band by converting the complex samples to their power (Complex to Mag^2), further low-pass filter and decimate, and finally select the band with the highest energy. All of this can be done graphically in the GNU Radio Companion (gnuradio-companion), which generates Python code that sets up the signal flow graph on top of the C++ GNU Radio framework.
I recommend you read the Guided Tutorials and see where you get from there.
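To make the band-energy idea concrete without the GNU Radio API, here is a hedged sketch in plain C++ using a Goertzel filter per band; the band frequencies are made-up placeholders, not values from the radar:

#include <cmath>
#include <cstdio>
#include <vector>

// Power of one block of samples at a single target frequency (Goertzel algorithm).
double goertzelPower(const std::vector<float>& x, double freq, double rate) {
    const double w = 2.0 * M_PI * freq / rate;
    const double coeff = 2.0 * std::cos(w);
    double s1 = 0.0, s2 = 0.0;
    for (float v : x) {
        const double s0 = v + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

int main() {
    std::vector<float> chunk(4096, 0.0f);           // one block of radar audio (stub)
    const double rate = 44100.0;
    const double bands[] = {200.0, 600.0, 1200.0};  // hypothetical per-target bands
    int best = 0;
    double bestPower = -1.0;
    for (int i = 0; i < 3; ++i) {
        const double p = goertzelPower(chunk, bands[i], rate);
        if (p > bestPower) { bestPower = p; best = i; }
    }
    std::printf("strongest band: %.0f Hz\n", bands[best]);
}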

Signal Processing in Go

I have come up with an idea for an audio project, and it looks like Go is a useful language for implementing it. However, it requires the ability to apply filters to incoming audio, and Go doesn't appear to have any sort of audio processing package. I can use cgo to call C code, but every signal processing library I find uses C++ classes, which cgo cannot handle. It looks like libsox may work. Are there any others?
What I need (and what libsox may provide) is to take an incoming audio stream and divide it into frequency bands. If I can do this while only reading the file once, then bonus! I am not sure if libsox can do this.
If you want to use a C++ library you could try SWIG, but you'll have to get it out of Subversion: the next release (2.0.1) will be the first released version to support Go. In my experience the Go support is still a little rough, but then again the library I tried to wrap is a monster.
Alternatively, you could create your own bindings through cgo using the same method SWIG does, but it will be painful and tedious. The basic idea is to first create a C wrapper, then let cgo create a Go wrapper around your C wrapper.
I don't know anything about signal processing or libsox, though. Sorry.
There is a relatively new project called ZikiChombo, which so far contains some basic DSP functionality geared toward audio.
The dsp part of the project has filters on its roadmap, but they are not there yet. On the other hand, some infrastructure for implementing filters, such as a real FFT and block convolution, is already in place. This means that if you want FIRs, and can compute the coefficients by some other means, you can already run them in zc via convolution, with sound, in real time.
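The FIR-by-convolution idea itself is language-agnostic; here is a generic sketch (plain C++, not zc's API) of what applying an FIR by direct convolution means:

// Sketch: direct-form FIR filtering. 'coeffs' would come from an external
// filter design tool (e.g. Iowa Hills), as described in this answer.
#include <cstddef>
#include <vector>

std::vector<float> firFilter(const std::vector<float>& x,
                             const std::vector<float>& coeffs) {
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t k = 0; k < coeffs.size() && k <= n; ++k)
            y[n] += coeffs[k] * x[n - k];   // dot product over a sliding window
    return y;
}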
Basic filter design support (FIR, biquad), for example using an ideal filter as a starting point, will be the next step for zc. There are numerous small, self-contained open source projects for basic and more advanced FIR and IIR filter design, most notably Iowa Hills, which might be more approachable than a larger project for computing filter coefficients outside of Go.
More advanced filtering, such as Butterworth filters and filters based on polynomial solving and the bilinear transform, will take more time for zc.
There are also some software-defined radio projects in Go with some code related to filtering; sorry, I don't have the links offhand, but a search for the topic may lead you to them.
Finally, there is a gonum Fourier package which also supplies an FFT.
So Go is growing some interesting and potentially useful stuff in this domain, but it still has quite a way to go compared to older projects (which are mostly in C/C++, perhaps with a Python wrapper via numpy, for example).
I am using this pure Golang repo to perform Fourier transforms, with good effect:
https://github.com/mjibson/go-dsp
Just supply the FFT call with a slice of float64 PCM samples:
import (
    "github.com/mjibson/go-dsp/fft" // https://github.com/mjibson/go-dsp
)

var audioWave []float64
// ... now populate audioWave with your audio PCM samples

var complexFFT []complex128
// input is the time domain; output is the frequency domain, as equally spaced frequency bins
complexFFT = fft.FFTReal(audioWave)

Sound Synthesis Framework in C/C++/Objective-C?

I've searched the net but didn't find anything interesting. Maybe I'm doing something wrong.
I'm looking for a sound synthesis API written in C, C++ or even Objective-C which can synthesize different types of waves; effects are optional.
Here's a complete library/toolkit for FM (Frequency Modulation) synthesis:
link1
link2
If you have time to spare, creating simple sound synthesis from scratch is actually a fun endeavor. If you create a small buffer of 256 16-bit samples which represents a sine, sawtooth, block or pulse wave, you can copy it repeatedly into a live audio buffer (say, a small 16 KB buffer) which loops constantly. By staying ahead of the play position and continually filling the buffer with new values, you create the sound output.
You can combine the small buffers in interesting ways (the simplest is just to add them together: additive synthesis).
The frequency of the tone can be manipulated by stepping through the small buffers with a bigger or smaller sampling step. Amplitude can be manipulated by scaling the samples before putting them into the output buffer.
Great fun to experiment with!
Once you have this step nailed, you can add more sophisticated processors like filters (low-pass, high-pass, etc.) and effects (reverbs, echoes, etc.).
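A minimal sketch of that idea, with an assumed 256-sample sine table; the pitch comes from the step size through the table and the loudness from scaling:

#include <cmath>
#include <cstdint>
#include <vector>

int main() {
    // 1. A small 256-sample table holding one cycle of a sine wave.
    const int tableSize = 256;
    std::vector<int16_t> sine(tableSize);
    for (int i = 0; i < tableSize; ++i)
        sine[i] = static_cast<int16_t>(32767.0 * std::sin(2.0 * M_PI * i / tableSize));

    // 2. Step through the table to fill an output buffer.
    //    step = freq * tableSize / sampleRate sets the pitch;
    //    'amplitude' scales the samples for loudness.
    const double sampleRate = 44100.0, freq = 440.0, amplitude = 0.5;
    const double step = freq * tableSize / sampleRate;
    std::vector<int16_t> out(4096);  // stands in for the looping live buffer
    double pos = 0.0;
    for (auto& s : out) {
        s = static_cast<int16_t>(amplitude * sine[static_cast<int>(pos) % tableSize]);
        pos += step;
    }
}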
Have you looked at the Synthesis ToolKit (STK)? It's in C++. (I don't think Objective-C is the right language for audio synthesis; in fact Audio Units, Apple's own way of doing audio processing, including generators/filters/effects, are written in C++.)
STK will run on Mac OS X and iOS no problem (CoreAudio is supported), but will also run on Linux and Windows (DirectSound and ASIO), using RtAudio. It's really nicely done and lightweight; these guys have spent a lot of time thinking about it, and it will definitely give you a big head start. It can handle loads of different audio file formats plus MIDI (and hopefully OSC soon...).
There are also Create and CLAM, which are huge frameworks; they include GUI components and many other things which you might or might not want. If you're only interested in doing sound synthesis, I really recommend STK.
PortAudio is also a great C API that we used last semester in an audio programming course. It provides an audio callback...what more could you need!?
I haven't tried incorporating it with anything in Objective-C yet, but will report back when I do.
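For a taste of the callback model, here is a hedged minimal PortAudio sketch that synthesizes a 440 Hz sine inside the callback (error handling omitted for brevity):

#include <portaudio.h>
#include <cmath>

// PortAudio calls this whenever it needs more output samples.
static int sineCallback(const void*, void* output, unsigned long frames,
                        const PaStreamCallbackTimeInfo*, PaStreamCallbackFlags,
                        void* userData) {
    double* phase = static_cast<double*>(userData);
    float* out = static_cast<float*>(output);
    for (unsigned long i = 0; i < frames; ++i) {
        out[i] = 0.2f * static_cast<float>(std::sin(*phase));
        *phase += 2.0 * M_PI * 440.0 / 44100.0;  // advance phase for 440 Hz
    }
    return paContinue;
}

int main() {
    double phase = 0.0;
    PaStream* stream;
    Pa_Initialize();
    Pa_OpenDefaultStream(&stream, 0, 1, paFloat32, 44100, 256, sineCallback, &phase);
    Pa_StartStream(stream);
    Pa_Sleep(2000);            // play for two seconds
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
}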
Writing audio synthesis algorithms in C/Objective-C is quite difficult, in my opinion. I would recommend writing your signal processing algorithms using Pure Data and then using ZenGarden or libpd to embed and interpret the pd patches in your app.
Another C++ library is nsound:
http://nsound.sourceforge.net
One can generate any kind of modulated signal using the Generator class or the provided Sine class. Each time step can have its own instantaneous frequency and phase offset.
You can also experiment with the Python module to prototype your algorithm quickly, then implement in C++. It can produce pretty matplotlib plots from Python and even from C++!
Have you looked at Csound? It's an incredibly flexible audio generation platform that can handle everything from simple waveform generation to FM synthesis and all kinds of filters. It also provides MIDI support, and you can extend it by writing custom opcodes. There's a full C API and several C++ APIs as well.
