Signal Processing in Go - audio

I have come up with an idea for an audio project and it looks like Go is a useful language for implementing it. However, it requires the ability to apply filters to incoming audio, and Go doesn't appear to have any sort of audio processing package. I can use cgo to call C code, but every signal processing library I find uses C++ classes which cgo cannot handle. It looks like libsox may work. Are there any others?
What I need (and what libsox may be able to provide) is to take an incoming audio stream and divide it into frequency bands. If I can do this while only reading the file once, then bonus! I am not sure whether libsox can do this.

If you want to use a C++ library you could try SWIG, but you'll have to get it out of Subversion. The next release (2.0.1) will be the first released version to support Go. In my experience the Go support is still a little rough, but then again the library I tried to wrap is a monster.
Alternatively, you could still create your own bindings through cgo using the same method SWIG does, but it will be painful and tedious. The basic idea is that you first create a C wrapper around the C++ code, then let cgo create a Go wrapper around your C wrapper.
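As an illustration of that pattern, here is a minimal sketch assuming a hypothetical C++ class Filter with a process() method (all names here are invented for the example):

/* filter_wrap.h - a plain C interface over the C++ class */
#ifdef __cplusplus
extern "C" {
#endif
typedef void* FilterHandle;
FilterHandle filter_new(double cutoff);
void filter_process(FilterHandle h, float* buf, int n);
void filter_free(FilterHandle h);
#ifdef __cplusplus
}
#endif

// filter_wrap.cpp - implements the C interface by forwarding to C++
#include "filter_wrap.h"
#include "filter.hpp" // the hypothetical C++ library's header
FilterHandle filter_new(double cutoff) { return new Filter(cutoff); }
void filter_process(FilterHandle h, float* buf, int n) {
    static_cast<Filter*>(h)->process(buf, n);
}
void filter_free(FilterHandle h) { delete static_cast<Filter*>(h); }

// filter.go - cgo wrapper around the C wrapper
package filter

// #cgo LDFLAGS: -lstdc++
// #include "filter_wrap.h"
import "C"

type Filter struct{ h C.FilterHandle }

func New(cutoff float64) *Filter { return &Filter{C.filter_new(C.double(cutoff))} }

func (f *Filter) Process(buf []float32) {
    if len(buf) == 0 {
        return
    }
    // pass a pointer to the slice's backing array; C side must not keep it
    C.filter_process(f.h, (*C.float)(&buf[0]), C.int(len(buf)))
}

func (f *Filter) Close() { C.filter_free(f.h) }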
I don't know anything about signal processing or libsox, though. Sorry.

There is a relatively new project called ZikiChombo, which so far contains some basic DSP functionality geared toward audio.
The dsp part of the project has filters on its roadmap, but they are not there yet. On the other hand, some infrastructure for implementing filters, such as a real FFT and block convolution, is in place. That means if you want FIRs and can compute the coefficients by some other means, you can currently run them via convolution in zc on sound in real time.
Basic filter design support (FIR, biquad), for example using an ideal filter as a starting point, will be the next step for zc. There are numerous small, self-contained open source projects for basic and more advanced FIR and IIR filter design, most notably Iowa Hills, which might be more accessible than a larger project for computing filter coefficients outside of Go.
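To illustrate what "compute the coefficients by some other means" can look like, here is a textbook windowed-sinc low-pass FIR design plus direct-form convolution in plain Go. This is a generic sketch, not zc's or Iowa Hills' actual API, and the cutoff and tap count are arbitrary:

package main

import (
    "fmt"
    "math"
)

// lowPassFIR returns the coefficients of a Hamming-windowed sinc
// low-pass filter. cutoff is normalized (0..0.5, in cycles per sample).
func lowPassFIR(cutoff float64, taps int) []float64 {
    h := make([]float64, taps)
    m := float64(taps - 1)
    sum := 0.0
    for i := range h {
        x := float64(i) - m/2
        var s float64
        if x == 0 {
            s = 2 * cutoff // limit of sin(2*pi*fc*x)/(pi*x) at x = 0
        } else {
            s = math.Sin(2*math.Pi*cutoff*x) / (math.Pi * x)
        }
        w := 0.54 - 0.46*math.Cos(2*math.Pi*float64(i)/m) // Hamming window
        h[i] = s * w
        sum += h[i]
    }
    for i := range h { // normalize for unity gain at DC
        h[i] /= sum
    }
    return h
}

// convolve applies the FIR h to x in direct form
// (zc would use block convolution for real-time streams).
func convolve(x, h []float64) []float64 {
    y := make([]float64, len(x))
    for n := range x {
        for k, hk := range h {
            if n-k >= 0 {
                y[n] += hk * x[n-k]
            }
        }
    }
    return y
}

func main() {
    h := lowPassFIR(0.1, 63) // cutoff at 0.1 * sample rate, 63 taps
    x := make([]float64, 256)
    x[0] = 1 // an impulse, so the output is the filter's impulse response
    fmt.Println(convolve(x, h)[:5])
}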
More advanced filter design, such as Butterworth filters and designs based on polynomial solving and the bilinear transform, will take more time for zc.
There are also some software-defined radio Go projects with code related to filtering; sorry, I don't have the links offhand, but a search for the topic may lead you to them.
Finally, there is the gonum fourier package, which also supplies an FFT.
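A minimal sketch, assuming the gonum.org/v1/gonum/dsp/fourier package path (the sample buffer is a placeholder):

package main

import (
    "fmt"

    "gonum.org/v1/gonum/dsp/fourier"
)

func main() {
    samples := make([]float64, 1024) // fill with real audio in practice
    fft := fourier.NewFFT(len(samples))
    // returns len(samples)/2+1 complex bins; bin i is centered at
    // frequency i*sampleRate/len(samples)
    coeffs := fft.Coefficients(nil, samples)
    fmt.Println(len(coeffs))
}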
So Go is growing some interesting and potentially useful stuff in this domain, but it still has quite a way to go compared to older projects (which are mostly in C/C++, sometimes with a Python wrapper, via numpy for example).

I am using this pure Go repo to perform Fourier transforms to good effect:
https://github.com/mjibson/go-dsp
Just supply the FFT call with a slice of float64 PCM samples:

import (
    "github.com/mjibson/go-dsp/fft" // https://github.com/mjibson/go-dsp
)

var audioWave []float64
// ... now populate audioWave with your audio PCM samples ...

// input: time domain ... output: frequency domain of equally spaced freq bins
var complexFFT []complex128 = fft.FFTReal(audioWave)
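To divide the stream into frequency bands, as the original question asks, you can then group the FFT bins and sum the energy in each group. A minimal sketch (the equal-width band mapping is just one choice, made up for illustration):

package main

import (
    "fmt"
    "math/cmplx"

    "github.com/mjibson/go-dsp/fft"
)

// bandEnergies splits the magnitude spectrum of pcm into nBands
// equal-width bands and returns the summed energy of each band.
func bandEnergies(pcm []float64, nBands int) []float64 {
    spectrum := fft.FFTReal(pcm) // only the first half is unique for real input
    half := len(spectrum) / 2
    energies := make([]float64, nBands)
    for i := 0; i < half; i++ {
        band := i * nBands / half // map bin index to band index
        mag := cmplx.Abs(spectrum[i])
        energies[band] += mag * mag
    }
    return energies
}

func main() {
    pcm := make([]float64, 1024) // fill with real samples in practice
    fmt.Println(bandEnergies(pcm, 8))
}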


Realtime audio manipulation

Here is what I would like to achieve:
I like to play around with creating "new" software/hardware instruments.
Sound processing and creation is always managed by software, but one could play the instrument via an ultrasonic distance sensor, for example. Another idea is to start playback when someone interrupts the light of a photoelectric barrier, and so on...
So the instrument would play common sounds, but it has to be used in an unusual way. For example, the ultrasonic instrument would play a sound if it detects something at a certain distance, and the sound could be manipulated in pitch if the distance gets smaller.
Basically, I would like to play back a sound sample and manipulate it in realtime.
I guess I have to use WAV samples for this, right? And which programming language do you think fits best for this task?
Edited after Kevin's hint: please kick me in the right direction - give me a hint where to start.
Thanks in advance
Since you're using the Processing tag, you can try Processing.
It comes with a sound library like Minim, or you can install Beads, which is great. There's actually a nice book on it: Sonifying Processing.
You might find SuperCollider fun as well.
The main thing is: what are you comfortable with at the moment?
If Processing syntax looks intimidating, you can try a different programming paradigm like dataflow. In that case you can use Pure Data (free, open source) or Max/MSP (very similar, but commercial). The idea is that rather than typing instructions, you connect boxes with wires, which is fun, and the examples are great too.
If you're into C++, there are plenty of libraries. On the creative side, there's a nice set of libraries called openFrameworks that's easy and fun to use. If this is your cup of tea, have a peek at Maximilian.
Bottom line: there are multiple options to achieve the same task. Choose the best tool for you (based on your background), or try each and see which you like best.
You asked "And which programming language do you think fits best for this task?" - I would also suggest Processing. I have used Processing to work with sound previously, and in all cases I used Minim. It has many UGens for generating sounds programmatically.
Also, you want to integrate with some sensors. I'm not sure what types of sensors you will use, but Processing works pretty well with various Arduino modules and sensors. Check this link for more direction.
Furthermore, you can export your project as an .exe or executable .jar file, and the JS version (p5.js) works almost the same as the Java version.

How can I create a 3D model file from geometric shapes?

I am writing a program that will output 3D model files based on simple geometric shapes (e.g. rectangular prisms and cylinders) with known coordinates in 3-dimensional space. As an example, imagine creating a 3D model of Stonehenge. This question suggests that OBJ files are the easiest to generate, but I'm struggling to find a good tutorial or easy-to-use library for doing so.
Can anyone either
(1) describe step-by-step how to create a simple file OR
(2) point me to a tutorial that describes how to do so
Notes:
* Using a GUI-based program to draw such files is not an option for me
* I have no prior experience with 3D modeling
* Other formats such as WRL or DAE would work for me as well
EDIT:
I do not need to use textures, just combinations of simple geometric shapes positioned in 3D space.
I strongly recommend using some ASCII exchange format. There are many out there; I usually use these:
*.x DirectX object (it is C++-style source code)
This one is the easiest to implement!!! But there are not many tools that can handle it. If you do not want to spend too much time coding, then this is the right choice. Just copy the templates (at the start) from any *.x file to get started.
here are some specs
*.iges common and importable on most CAD/CAM platforms (CATIA included)
This one is a bit complicated, but for export purposes it is not that bad. It supports volume operations like +,-,&,^ which are VERY HARD to implement properly, but you do not have to use them :)
*.dxf AutoCAD exchange format
This one is even more complicated than IGES. I do not recommend using it.
*.ac AC3D
I first saw this one in FlightGear.
here are some specs
At first look it is quite easy, but the sub-object implementation is really tricky. Unless you use that, you should be fine.
This approach is easily verifiable in Notepad or by loading into some 3D model viewer. Choose the format that is most suitable for your needs and code save/load functions for your app's internal model class/struct. This way you will be compatible with other software and eliminate the incompatibility problems that are native to creating 'almost known' binary formats like 3ds, ...
In your case I would use IGES (Initial Graphics Exchange Specification)
For export you do not need to implement everything, just a few basic shapes, so it would not be too difficult. I code importers, which are much, much more complicated; my IGES loader class is about 30 KB of C++ source code. Look here for more info.
You did not provide any info about your 3D mesh model structure and capabilities, like what primitives you use, whether your objects are simple or in a skeleton hierarchy, whether you are using textures, and more... so a precise answer is impossible.
Anyway, export often looks like this:
create the header and structure of the target file format
if the format has any directory structure, fill it and write it (IGES)
for sub-objects do not forget to add transformation matrices ...
write the chunks you need (points list, faces list, normals, ...)
With ASCII formats you can do all of this inside a string variable, so you can easily insert into or modify it. Do everything in memory and write the whole thing to a file at the end; this is fast and also adds the capability to work with memory instead of files, which is handy if you want to pack many files into a single package file like *.pak, or send/receive files through IPC or a LAN ...
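As a concrete illustration of the build-everything-in-memory-then-write approach, here is a minimal sketch in Go that emits a Wavefront OBJ file, since the question mentioned OBJ as the easiest format to generate (the box dimensions and file name are arbitrary; a real exporter would loop over the app's shape list instead of one hard-coded box):

package main

import (
    "fmt"
    "os"
    "strings"
)

// writeBoxOBJ builds a minimal Wavefront OBJ for an axis-aligned box
// entirely in memory, then writes it to disk in one go.
func writeBoxOBJ(path string, sx, sy, sz float64) error {
    var b strings.Builder
    // 8 corner vertices ("v x y z" lines)
    for _, v := range [][3]float64{
        {0, 0, 0}, {sx, 0, 0}, {sx, sy, 0}, {0, sy, 0},
        {0, 0, sz}, {sx, 0, sz}, {sx, sy, sz}, {0, sy, sz},
    } {
        fmt.Fprintf(&b, "v %g %g %g\n", v[0], v[1], v[2])
    }
    // 6 quad faces ("f" lines), 1-based indices, counter-clockwise from outside
    for _, f := range [][4]int{
        {1, 4, 3, 2}, {5, 6, 7, 8}, {1, 2, 6, 5},
        {2, 3, 7, 6}, {3, 4, 8, 7}, {4, 1, 5, 8},
    } {
        fmt.Fprintf(&b, "f %d %d %d %d\n", f[0], f[1], f[2], f[3])
    }
    return os.WriteFile(path, []byte(b.String()), 0o644)
}

func main() {
    if err := writeBoxOBJ("box.obj", 2, 1, 1); err != nil {
        fmt.Println(err)
    }
}

The result is plain ASCII you can inspect in any text editor and open in most 3D viewers, which is exactly the verifiability point made above.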
[Edit1] more about IGES
file format specs
I learned IGES from this PDF... I have no clue where I got it, but it was the first valid link I found on Google today. I am sure there is some non-registration link out there too. It is about 13.7 MB and the original name is IGES5-3_forDownload.pdf.
win32 viewer
This is a free IGES viewer. I do not like the interface and handling, but it works. It is necessary to have a functional viewer for testing yours ...
examples
Here are many tutorial files for many entities; there are 3 sub-links (igs, peek, gif) where you can see each example file in several ways for better understanding.
exporting to IGES
You did not provide any info about your 3D mesh internal structure, so I cannot help with the export itself. There are many ways to export the same thing, so pick the one that is closest to your app's 3D mesh representation. For example you can use:
point cloud
rotation surfaces
rectangle (QUAD) surfaces
border-line representation (non-solid)
trimmed surfaces, and many more ...

XAudio2 occlusion processing

I'm building a homebrew game engine, and I am currently working on the audio engine implementation, mostly for self-educational reasons. I want to create an interface wrapper for generic audio processing so I can switch between OpenAL, XAudio2 or other platforms as appropriate or needed. I also want this code to be reusable, so I am trying to make it as complete as possible and have various systems implement as much functionality as possible. For the time being I am focusing on an XAudio2 implementation, and may move on to an OpenAL implementation at a later date.
I've read a good deal over the past few months on 3D processing (listener/emitter), environmental effects (reverberation), exclusion, occlusion, obstruction and direct sound. I want to be able to use any of these effects with audio playback. While I've researched the topics as best I can, I can't find any examples of how occlusion (direct and reflection signal muffling), obstruction (direct signal muffling) or exclusion (reflection signal muffling) are actually implemented. Reading the MSDN documentation turns up passing references to occlusion, but nothing directly about implementation. The best I've found is a generic "use a low-pass filter", which doesn't help me much.
So my question is this: using XAudio2, how would one implement audio reflection signal muffling (exclusion) and audio direct signal muffling (obstruction) or both simultaneously (occlusion)? What would the audio graph look like, and how would these relate to reverberation environmental effects?
Edit 2013-03-26:
On further thinking about the graph, I realized that I may not be looking at the graph from the correct perspective.
Should the graph appear to be: Source → Effects (Submix) → Mastering
-or-
Should the graph appear generically as follows:
Source ─┬─→ Direct ──────→ Effects ─┬─→ Mastering
        └─→ Reflections ─→ Effects ─┘
The second graph would split the graph such that exclusion and obstruction could be calculated separately; part of my confusion has been how they would be processed independently.
I would think, then, that the reverb settings from the 3D audio DSP structure would be applied to the reflections path; that the Doppler would be applied to either just the direct path or to both the direct and reflections paths; and that the reverb environmental effects would affect the reflections path only. Is this getting close to the correct audio graph model?
You want your graph to look something along the lines of:
Input Data ---> Lowpass Filter ---> Output
You adjust the lowpass filter as the source becomes more obstructed. You can also use the lowpass filter's gain to simulate absorption. The filter settings are best set up so that they are exposed in a way that they can be adjusted by the sound designer.
This article covers sound propagation in more detail: http://engineroom.ubi.com/sound-propagation-and-diffraction-simulation/
In terms of this then being passed along the graph for environmental effects such as reverb, you just want those further down the graph:
Input ---> Low pass filter ---> Output ---> Reverb ---> Master Out
This way the reverberated sound will match the occluded sound (otherwise it will sound odd having the reverb mismatched to the direct signal).
Using a low pass filter sounds vague and incomplete, but there is not actually much more to the effect than filtering the high frequencies and adjusting the gain. For more advanced environmental modelling you want to research something like "Precomputed Wave Simulation for Real-Time Sound Propagation of Dynamic Sources in Complex Scenes" (I'm unable to link directly as I don't have enough rep yet!) but it may well be beyond the scope of what you are trying to achieve.
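To make "filtering the high frequencies and adjusting the gain" concrete, here is a sketch of the idea (written in Go since the technique is language-agnostic; the occlusion-to-cutoff and occlusion-to-gain mappings are invented for illustration). In XAudio2 itself you would instead set the built-in per-voice filter with IXAudio2SourceVoice::SetFilterParameters rather than filtering samples by hand:

package main

import (
    "fmt"
    "math"
)

// onePoleLP is the simplest low-pass filter: y[n] = y[n-1] + a*(x[n]-y[n-1]).
type onePoleLP struct {
    a, y float64
}

// setCutoff maps a cutoff frequency in Hz to the smoothing coefficient.
func (f *onePoleLP) setCutoff(hz, sampleRate float64) {
    f.a = 1 - math.Exp(-2*math.Pi*hz/sampleRate)
}

func (f *onePoleLP) process(x float64) float64 {
    f.y += f.a * (x - f.y)
    return f.y
}

func main() {
    const sampleRate = 48000.0
    var f onePoleLP

    // occlusion in [0,1]: 0 = clear line of sight, 1 = fully occluded.
    // Invented mapping: sweep the cutoff exponentially from 20 kHz down to 400 Hz.
    occlusion := 0.7
    cutoff := 20000 * math.Pow(400.0/20000.0, occlusion)
    f.setCutoff(cutoff, sampleRate)

    // a gain drop simulates absorption (also an invented mapping)
    gain := 1 - 0.5*occlusion

    input := []float64{1, 0, 0, 0, 0} // impulse, just to show the response
    for _, x := range input {
        fmt.Printf("%.4f\n", gain*f.process(x))
    }
}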

Sound Synthesis Framework in C/C++/Objective-C?

I've searched the net but didn't find anything interesting. Maybe I'm doing something wrong.
I'm looking for a sound synthesis API written in C, C++ or even Objective-C which can synthesize different types of waves; effects are optional.
Here's a complete library/toolkit for FM (Frequency Modulation) synthesis:
link1
link2
If you have time to spare... creating simple sound synthesis from scratch is actually a fun endeavor. If you create a small buffer of 256 16-bit samples which represents a single cycle of a sine, sawtooth, square or pulse wave, you can copy it into a live audio buffer (e.g. a small buffer of, say, 16 KB) which constantly loops. By staying ahead of the play position and constantly filling the buffer with new values, you create the sound output.
You can combine the small buffers in interesting ways (the simplest is just to add them together: additive synthesis).
The frequency of the tone can be manipulated by using a bigger or smaller sampling step through the small buffers. Amplitude can be manipulated by scaling the samples before putting them into the output buffer.
Great fun to experiment with!
If you have this step nailed, you can add more sophisticated processing like filters (low pass, high pass, etc.) and effects (reverbs, echoes, etc.)
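For example, a minimal sketch of that wavetable idea (in Go just to keep the sketch short, since the technique is language-agnostic; the table size, pitch and amplitude are arbitrary, and it only computes samples, leaving actual playback to whatever audio API you use):

package main

import (
    "fmt"
    "math"
)

const tableSize = 256

// makeSineTable fills one cycle of a sine wave into a small wavetable.
func makeSineTable() [tableSize]float64 {
    var t [tableSize]float64
    for i := range t {
        t[i] = math.Sin(2 * math.Pi * float64(i) / tableSize)
    }
    return t
}

func main() {
    const sampleRate = 44100.0
    table := makeSineTable()

    freq := 440.0                         // desired pitch in Hz
    step := freq * tableSize / sampleRate // bigger step = higher pitch
    amp := 0.5                            // scaling sets the amplitude

    var phase float64
    out := make([]float64, 64) // in a real program, copy into the looping audio buffer
    for i := range out {
        out[i] = amp * table[int(phase)%tableSize]
        phase += step
        if phase >= tableSize {
            phase -= tableSize // wrap around the table
        }
    }
    fmt.Println(out[:8])
}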
Have you looked at the Synthesis ToolKit (STK)? It's in C++ (I don't think ObjC is the right language for audio synthesis; in fact Audio Units, Apple's own way of doing audio stuff, including generators/filters/effects..., are in C++).
STK will run on Mac OS X and iOS no problem (CoreAudio is supported), but will also run on Linux and Windows (DirectSound and ASIO), using RtAudio. It's really nicely done and lightweight; these guys have spent a lot of time thinking about it, and it will definitely give you a big head start. It can handle loads of different audio file formats plus MIDI (and hopefully OSC soon...).
There are also Create and CLAM, which is huge; these include GUI components and many other things which you might or might not want. If you're only interested in doing sound synthesis, I really recommend STK.
PortAudio is also a great C API that we used last semester in an audio programming course. It provides an audio callback...what more could you need!?
I haven't tried incorporating it with anything in Objective-C yet, but will report back when I do.
Writing audio synthesis algorithms in C/Obj-C is quite difficult, in my opinion. I would recommend writing your signal processing algorithms using Pure Data and then using ZenGarden or libpd to embed and interpret the pd patches in your app.
Another C++ library is nsound:
http://nsound.sourceforge.net
One can generate any kind of modulated signal using the Generator class or using the provided Sine class. Each time step can have its own instantaneous frequency and phase offset.
You can also experiment with the Python module to prototype your algorithm quickly, then implement in C++. It can produce pretty matplotlib plots from Python and even from C++!
Have you looked at Csound? It's an incredibly flexible audio generation platform, and can handle everything from simple waveform generation to FM synthesis and all kinds of filters. It also provides MIDI support, and you can extend it by writing custom opcodes. There's a full C API and several C++ APIs as well.

Real Time Audio Analysis In Linux

I'm wondering which audio library is recommended for this.
I'm attempting to make a small program that will aid in tuning instruments (piano, guitar, etc.). I've read about the ALSA and Marsyas audio libraries.
I'm thinking the idea is to sample data from the microphone and do analysis on chunks of 5-10 ms (from what I've read), then perform an FFT to figure out which frequency contains the largest peak.
This guide should help. Don't use ALSA for your application. Use a higher level API. If you decide you'd like to use JACK, http://jackaudio.org/applications has three instrument tuners you can use as example code.
Marsyas would be a great choice for doing this, it's built for exactly this kind of task.
For tuning an instrument, what you need is an algorithm that estimates the fundamental frequency (F0) of a sound. There are a number of algorithms to do this; one of the newest and best is the YIN algorithm, which was developed by Alain de Cheveigné. I recently added the YIN algorithm to Marsyas, and using it is dead simple.
Here's the basic code that you would use in Marsyas:
MarSystemManager mng;

// A Series to contain everything
MarSystem* net = mng.create("Series", "series");

// Process the data from the SoundFileSource with AubioYin
net->addMarSystem(mng.create("SoundFileSource", "src"));
net->addMarSystem(mng.create("ShiftInput", "si"));
net->addMarSystem(mng.create("AubioYin", "yin"));

net->updctrl("SoundFileSource/src/mrs_string/filename", inAudioFileName);

while (net->getctrl("SoundFileSource/src/mrs_bool/notEmpty")->to<mrs_bool>()) {
    net->tick();
    realvec r = net->getctrl("mrs_realvec/processedData")->to<mrs_realvec>();
    cout << r(0,0) << endl;
}
This code first creates a Series object that we will add components to. In a Series, each of the components receives the output of the previous MarSystem in serial. We then add a SoundFileSource, which you can feed a .wav or .mp3 file into. We then add the ShiftInput object, which outputs overlapping chunks of audio; these are fed into the AubioYin object, which estimates the fundamental frequency of each chunk. We then tell the SoundFileSource that we want to read the file inAudioFileName.
The while statement then loops until the SoundFileSource runs out of data. Inside the while loop, we take the data that the network has processed and output the (0,0) element, which is the fundamental frequency estimate.
This is even easier when you use the Python bindings for Marsyas.
http://clam-project.org/
CLAM is a full-fledged software framework for research and application development in the Audio and Music Domain. It offers a conceptual model as well as tools for the analysis, synthesis and processing of audio signals.
They have a great API, nice GUI and a few finished apps where you can see everything.
ALSA is sort of the default standard for Linux now, by virtue of its drivers being included in the kernel and OSS being deprecated. However, there are alternatives to ALSA userspace, like JACK, which seems to be aimed at low-latency professional-type applications. Its API seems nicer; although I've not used it, my brief exposure to the ALSA API makes me think that almost anything would be better.
Audacity includes a frequency plot feature and has built-in FFT filters.
