How do I get the frequency of the microphone in ChucK?

I need to get the frequency that comes out of the microphone input, in order to play certain notes on a sequencer or any instrument depending on the microphone tone.
I use this code to route the microphone to the output:
adc => dac;
while( true ) {
    0.1::second => now;
}
Is there any function on adc that I can use to do what I want?
Thanks! :D

The easiest way to do this is to modify the Spectral Centroid UAna example.
// send the mic through the analysis chain instead of a SinOsc
adc => FFT fft =^ Centroid cent => blackhole;
float trackedFrequency;
512 => fft.size;
Windowing.hann(512) => fft.window;
second / samp => float srate;

while( true )
{
    cent.upchuck();
    // modifying the example to put the analysis result in a variable
    cent.fval(0) * srate / 2 => trackedFrequency;
    <<< trackedFrequency >>>; // use it to set the frequency of something else
    fft.size()::samp => now; // advance time
}
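For example, a minimal sketch (assuming the declarations above; the gain value is arbitrary) that retunes a SinOsc from the tracked value instead of printing it:

// hypothetical continuation: drive an oscillator from the tracked value
SinOsc s => dac;
0.2 => s.gain;

while( true )
{
    cent.upchuck();
    cent.fval(0) * srate / 2 => trackedFrequency;
    trackedFrequency => s.freq; // retune the oscillator each analysis frame
    fft.size()::samp => now;    // advance time by one FFT frame
}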

Related

How to analyze MP3 for beat/drums timestamps, trigger actions and playback at the same time (Rust)

I want to trigger an action (for example, letting a bright light flash) when the beat or drums in an MP3 file are present during playback. I don't know the theoretical procedure/approach I should take.
First I thought about statically analyzing the MP3 as a first step. The result of the analysis would be the timestamps at which the action should be triggered. Then I start the MP3 and another thread triggers the actions at the specific timings. This should be easy because I can use the rodio crate for playback. But the static analysis part is still the hard part.
Analysis algorithm:
My idea was to read the raw audio data from the MP3 using the minimp3 crate and do an FFT with the rustfft crate. When I have the spectrum from the FFT, I could look where the low frequencies have a high magnitude, and that should be the beat of the song.
I tried combining minimp3 and rustfft, but I have absolutely no clue what the data I get really means, and I can't really write a test for it either.
This is my approach so far:
use minimp3::{Decoder, Error, Frame};
use std::fs::File;
use std::sync::Arc;

use rustfft::num_complex::Complex;
use rustfft::num_traits::{FromPrimitive, ToPrimitive, Zero};
use rustfft::FFTplanner;

fn main() {
    let mut decoder = Decoder::new(File::open("08-In the end.mp3").unwrap());
    loop {
        match decoder.next_frame() {
            Ok(Frame { data, sample_rate, channels, .. }) => {
                // we only need mono data; data is interleaved:
                // data[0] is the first sample of the left channel,
                // data[1] the first sample of the right channel, ...
                let mut mono_audio = vec![];
                for i in 0..data.len() / channels {
                    // average the left and right samples of frame i
                    let sum = data[2 * i] as i32 + data[2 * i + 1] as i32;
                    let avg = (sum / 2) as i16;
                    mono_audio.push(avg);
                }
                // unnormalized spectrum; now check where the beat/drums are
                // by checking for high volume in low frequencies
                let spectrum = calc_fft(&mono_audio);
            }
            Err(Error::Eof) => break,
            Err(e) => panic!("{:?}", e),
        }
    }
}

fn calc_fft(raw_mono_audio_data: &Vec<i16>) -> Vec<i16> {
    // perform a forward FFT over the whole frame
    let len = raw_mono_audio_data.len();
    let mut input: Vec<Complex<f32>> = vec![];
    let mut spectrum: Vec<Complex<f32>> = vec![Complex::zero(); len];

    // from Vec<i16> to Vec<Complex<f32>>
    raw_mono_audio_data.iter().for_each(|val| {
        let compl = Complex::from_i16(*val).unwrap();
        input.push(compl);
    });

    let mut planner = FFTplanner::new(false);
    let fft = planner.plan_fft(len);
    fft.process(&mut input, &mut spectrum);

    // back to Vec<i16> (to_i16 returns None for values with a non-zero imaginary part)
    let mut output_i16 = vec![];
    spectrum.iter().for_each(|val| {
        if let Some(val) = val.to_i16() {
            output_i16.push(val);
        }
    });
    output_i16
}
My problem is also that the FFT function doesn't have any parameter where I can specify the sample rate (which is 48 kHz). All I get from decoder.next_frame() is a Vec<i16> with 2304 items.
Any ideas how I can achieve that, and what do the numbers I currently get actually mean?
TL;DR:
Decouple analysis and audio data preparation. (1) Read the MP3/WAV data, join the two channels to mono (easier analysis), take slices from the data with a length that is a power of two (for the FFT; if required, pad with zeroes) and finally (2) apply that data to the crate spectrum_analyzer and learn from its code (which is excellently documented) how the presence of certain frequencies can be obtained from the FFT.
Longer version
Decouple the problem into smaller problems/subtasks.
analysis of audio data in discrete windows => beat: yes or no
a "window" is usually a fixed-size view into the ongoing stream of audio data
choose a strategy here: for example a lowpass filter, an FFT, a combination, ... search for "beat detection algorithm" in the literature
if you are doing an FFT, you should always extend your data window to the next power of two (e.g. by zero-padding).
read the MP3, convert it to mono and then pass the audio samples step by step to the analysis algorithm.
You can use the sampling rate and the sample index to calculate the point in time
=> attach "beat: yes/no" to timestamps inside the song
The analysis part should be kept generally usable, so that it works for live audio as well as for files. Music is usually discretized at 44100 Hz or 48000 Hz with 16-bit resolution. All common audio libraries will give you an interface to access audio input from the microphone with these properties. If you read an MP3 or a WAV instead, the music (the audio data) is usually in the same format. If you analyze windows of length 2048 at 44100 Hz, for example, each window has a duration of 1/f * n == T * n == n/f == (2048/44100) s == ~46.4 ms. The shorter the time window, the faster your beat detection can operate, but the lower your accuracy will be - it's a tradeoff :)
Your algorithm could keep knowledge about previous windows to overlap them to reduce noise/wrong data.
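For example, a minimal sketch in plain Rust (a hypothetical helper, no external crates) of the windowing and timestamp bookkeeping described above; each window would then be fed to the FFT / spectrum analysis step:

// slice mono samples into fixed-size windows (power of two for the FFT),
// zero-padding the last one, and compute each window's start time in seconds
fn windows_with_timestamps(mono: &[i16], sample_rate: u32, window_len: usize) -> Vec<(f32, Vec<i16>)> {
    let mut result = Vec::new();
    for (idx, chunk) in mono.chunks(window_len).enumerate() {
        let mut window = chunk.to_vec();
        window.resize(window_len, 0); // zero-pad to keep the power-of-two length
        let start_time = (idx * window_len) as f32 / sample_rate as f32;
        result.push((start_time, window));
    }
    result
}

// e.g. 2048 samples at 44100 Hz => each window covers 2048/44100 s ≈ 46.4 ms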
To view existing code that solves these sub-problems, I suggest the following crates
https://crates.io/crates/lowpass-filter : Simple low pass filter to get the low frequencies of a data window => (probably a) beat
https://crates.io/crates/spectrum-analyzer : spectrum analysis of an audio window with FFT and excellent documentation about how it is done inside the repository
With the crate beat detector there is a solution that pretty much implements what this question originally asked for: it connects live audio input with the analysis algorithm.

How to detect a basic audio signal within a much bigger one (mpg123 output signal)

I am new to signal processing and I don't really understand the basics (and more). Sorry in advance for any mistakes in my understanding so far.
I am writing C code to detect a basic signal (an 18 Hz simple sinusoid of 2 sec duration; generating it with Audacity is pretty simple) within a much bigger MP3 file. I read the MP3 file and copy it until I match the sound signal.
The signal to match is (1st channel: 18 Hz sine signal, 2nd channel: nothing/doesn't matter).
To match the sound, I am calculating the frequency of the MP3 until I find a good percentage of 18 Hz frequency during ~2 sec. As this frequency is not very common, I don't have to match it very precisely.
I used mpg123 to convert my file, and I fill the buffers with what it returns. I initialised it to convert the MP3 to mono RAW audio:
init:
int ret;
const long *rates;
size_t rate_count, i;

mpg123_rates(&rates, &rate_count);
mpg123_handle *m = mpg123_new(NULL, &ret);
mpg123_format_none(m);
for (i = 0; i < rate_count; ++i)
    mpg123_format(m, rates[i], MPG123_MONO, MPG123_ENC_SIGNED_32);

if (m == NULL)
{
    //err
} else {
    mpg123_open_feed(m);
}
(...)
unsigned char out[8*MAX_MP3_BUF_SIZE];
ret = mpg123_decode(m, buf->data, buf->size, out, 8*MAX_MP3_BUF_SIZE, &size);
(...)
But I have no idea how to use the resulting buffer to calculate the FFT and get the frequency.
//FREQ Calculation with libfftw3
int transform_size = MAX_MP3_BUF_SIZE * 2;
// r2c plan: real (double) input, complex output
double *fftin = (double*) fftw_malloc(sizeof(double) * transform_size);
fftw_complex *fftout = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);
fftw_plan p = fftw_plan_dft_r2c_1d(transform_size, fftin, fftout, FFTW_ESTIMATE);
I can get good RAW audio (PCM?) into a buffer; if I write it to a file, it can be read and converted into a WAV with sox:
sox --magic -r 44100 -e signed -b 32 -c 1 rps.raw rps.wav
Any help is appreciated. My knowledge of signal processing is poor; I am not even sure what to do with the FFT to get the frequency of the signal. The code is just FYI; it is contained in a much bigger project (for which a simple grep is not an option).
Don't use MP3 for this. There's a good chance your 18 Hz will disappear or at least become distorted. 18 Hz is well below audible. MP3 and other lossy algorithms use a variety of techniques to remove sounds that we're not going to hear.
Assuming PCM, since you only need one frequency band, consider using the Goertzel algorithm. This is more efficient than FFT/DFT for your use case.
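A minimal sketch of the Goertzel algorithm in C (the helper name and its parameters are illustrative, not from the original answer); it returns the magnitude of a single target frequency over a block of samples:

#include <math.h>
#include <stddef.h>

/* Goertzel sketch: magnitude of one target frequency in a block of PCM
   samples; target_freq and sample_rate are in Hz. */
double goertzel_magnitude(const double *samples, size_t n,
                          double target_freq, double sample_rate)
{
    double omega = 2.0 * M_PI * target_freq / sample_rate;
    double coeff = 2.0 * cos(omega);
    double s_prev = 0.0, s_prev2 = 0.0;

    for (size_t i = 0; i < n; i++) {
        double s = samples[i] + coeff * s_prev - s_prev2;
        s_prev2 = s_prev;
        s_prev = s;
    }
    /* power of the target bin, then magnitude */
    double power = s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2;
    return sqrt(power > 0.0 ? power : 0.0);
}

Running this over successive ~2 second blocks with target_freq set to 18.0 and comparing the magnitude against a threshold would implement the detection described in the question.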

How to transmit and receive a baseband signal in unetstack?

I am trying to work on a project involving the implementation of acoustic propagation loss models in underwater communication (based on a certain research paper). We are trying to simulate that in unetstack. The ultimate goal is to create a channel model class that has all the loss models implemented.
But for now we have started by trying to send a baseband signal from one node to another, and then we try to capture the frequency on the receiver node and calculate the loss models for that frequency. (The loss models are a function of the signal's frequency.) I have tried to follow some documentation and some blog posts, but I am not able to transmit and receive the signal.
For reference, I have already referred to these articles:
1.) svc-12-baseband
2.) basic-modem-operations-using-unetstack
This is the research paper that I am following to calculate the loss based on different loss models.
I have tried to write a groovy file for simulation, but it does not seem to work. If someone can please have a look and let me know the mistakes I have made, that would be of real help. We are quite new to unetstack as well as topics of underwater signal processing like this, and this is our first attempt at implementing it on a simulator. We are using unetsim-1.3.
Any help is really appreciated! Thanks in advance
import org.arl.fjage.*
import org.arl.unet.*
import org.arl.unet.phy.*
import org.arl.unet.bb.*
import org.arl.unet.sim.*
import org.arl.unet.sim.channels.*
import static org.arl.unet.Services.*
import static org.arl.unet.phy.Physical.*
import java.lang.Math.*

platform = RealTimePlatform

simulate 3.minutes, {
  def n = []
  n << node('1', address: 1, location: [0,0,0])
  n << node('2', address: 2, location: [0,0,0])
  n.eachWithIndex { n2, i ->
    n2.startup = {
      def phy = agentForService PHYSICAL
      def node = agentForService NODE_INFO
      def bb = agentForService BASEBAND
      subscribe phy
      subscribe bb
      if (node.address == 1)
      {
        add new TickerBehavior(50000, {
          float freq = 5000
          float duration = 1000e-3
          int fd = 24000
          int fc = 24000
          int num = duration*fd
          def sig = []
          (0..num-1).each { t ->
            double a = 2*Math.PI*(freq-fc)*t/fd
            sig << (int)(Math.cos(a))
            sig << (int)(Math.sin(a))
          }
          bb << new TxBasebandSignalReq(signal: sig)
          println "sent"
        })
      }
      if (node.address == 2)
      {
        add new TickerBehavior(50000, {
          bb << new RecordBasebandSignalReq(recLen: 24000)
          def rxNtf = receive(RxBasebandSignalNtf, 25000)
          if (rxNtf)
          {
            println "Received"
          }
          println "Tried"
        })
      }
    }
  }
}
In some cases "Tried" is printed even before "sent" is printed. This shows that the (node.address == 2) code executes before the (node.address == 1) code does.
The basic code you have for transmission (TxBasebandSignalReq) and reception (RecordBasebandSignalReq) of signals seems correct.
This should work well on modems, other than the fact that your signal generation is likely flawed for 2 reasons:
You are trying to generate a signal at 5 kHz in baseband representation using a carrier frequency of 24 kHz and a bandwidth of 24 kHz. This signal will be aliased, as this baseband representation can only represent signals of 24±12 kHz, i.e., 12-36 kHz. If you need to transmit a 5 kHz signal, you need your modem to be operating at much lower carrier frequency (easy in the simulator, but in practice you'd need to check your modem specifications).
You are typecasting the output of sin and cos to int. This is probably not what you intended, as the signal is expected to be an array of floats scaled between -1 and +1, so just dropping the (int) cast would be advisable.
On a simulator, you need to ensure that the modem parameters are set up correctly to reflect your assumptions about baseband carrier frequency, bandwidth and recording length:
modem.carrierFrequency = 24000
modem.basebandRate = 24000
modem.maxSignalLength = 24000
The default HalfDuplexModem parameters are different, and your current code would fail for RecordBasebandSignalReq with a REFUSE response (which your code is not checking).
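For example, a hedged sketch (assuming the AgentID returned by agentForService supports request(), as in the Unet shell) of checking the response instead of sending the request blindly:

// hypothetical: send the recording request and inspect the response
def rsp = bb.request(new RecordBasebandSignalReq(recLen: 24000), 1000)
if (rsp == null || rsp.performative != Performative.AGREE) {
  println "recording refused: ${rsp}"
}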
The rest of your code looks okay, but I'd simplify it a bit to:
import org.arl.fjage.*
import org.arl.unet.bb.*
import org.arl.unet.Services

platform = RealTimePlatform

modem.carrierFrequency = 24000
modem.basebandRate = 24000
modem.maxSignalLength = 48000

simulate 3.minutes, {
  def n1 = node('1', address: 1, location: [0,0,0])
  def n2 = node('2', address: 2, location: [0,0,0])
  n1.startup = {
    def bb = agentForService Services.BASEBAND
    add new TickerBehavior(50000, {
      float freq = 25000    // pick a frequency in the 12-36 kHz range
      float duration = 1000e-3
      int fd = 24000
      int fc = 24000
      int num = duration*fd
      def sig = []
      (0..num-1).each { t ->
        double a = 2*Math.PI*(freq-fc)*t/fd
        sig << Math.cos(a)
        sig << Math.sin(a)
      }
      bb << new TxBasebandSignalReq(signal: sig)
      println "sent"
    })
  }
  n2.startup = {
    def bb = agentForService Services.BASEBAND
    add new TickerBehavior(50000, {
      bb << new RecordBasebandSignalReq(recLen: 24000)
      def rxNtf = receive(RxBasebandSignalNtf, 25000)
      if (rxNtf) {
        println "Received"
      }
      println "Tried"
    })
  }
}
This should work as expected!
However, there are a few more gotchas to bear in mind:
You are sending and recording on a timer. On a simulator, this should be okay, as both nodes have the same time origin and no propagation delay (you've set up the nodes at the same location). However, on a real modem, the recording may not be happening when the transmission does.
Transmission and reception of signals with a real modem works well. The Unet simulator is primarily a network simulator and focuses on simulating the communication system behavior of modems, but not necessarily the acoustic propagation. While it supports the BASEBAND service, the channel physics of transmitting signals is not accurately modeled by the default HalfDuplexModem model. So your mileage on signal processing the recording may vary. This can be fixed by defining your own channel model that uses an appropriate acoustic propagation model, but is a non-trivial undertaking.

Where can I find a simple filterbank BP/wavelet filter function in C#/JS?

I am searching for a filter function which inputs an audio signal (a float[]) and returns the amplitude of the audio signal at a given frequency, i.e.:
// pseudocode for what I imagine the filter function to look like
float WAVELET(float in, float freq)
{
    out = in * coefficients * a1 * b1 * c1 * d1 / freq;
    return out;
}

void MAKE_SPECTROGRAM2D(float[] stream)
{
    for (var band = 0; band < 1024; band++)
        for (var i = 0; i < stream.Length; i++)
            spectrogram2D[i, band] = WAVELET(stream[i], band * 22050 / 1024);
}
I have found some wavelet transformation projects (e.g. 1 and 2). It's very confusing trying to understand the code. Wavelets seem to be very different from audio filters. A kind of bandpass filter for audio analysis would be fine. I can only find collections of difficult functions that involve matrix transformations and don't seem designed to take an array of samples as input. I haven't managed to find filterbank audio filter functions online. I am simply confused; some information would be very helpful.
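For illustration only, a minimal sketch in C# of what one band of such a filterbank could look like (a hypothetical helper, not from any particular library): an RBJ biquad band-pass applied to the buffer, with the RMS of the filtered output taken as that band's amplitude.

using System;

static class FilterBank
{
    // amplitude of `signal` around centerFreq: band-pass the buffer with an
    // RBJ biquad, then return the RMS of the filtered output
    public static float BandAmplitude(float[] signal, float centerFreq,
                                      float sampleRate, float q = 4.0f)
    {
        double w0 = 2.0 * Math.PI * centerFreq / sampleRate;
        double alpha = Math.Sin(w0) / (2.0 * q);
        double a0 = 1.0 + alpha;
        // RBJ band-pass coefficients (constant 0 dB peak gain, b1 = 0), normalized by a0
        double b0 = alpha / a0, b2 = -alpha / a0;
        double a1 = -2.0 * Math.Cos(w0) / a0, a2 = (1.0 - alpha) / a0;

        double x1 = 0, x2 = 0, y1 = 0, y2 = 0, sumSquares = 0;
        foreach (float x in signal)
        {
            double y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2;
            x2 = x1; x1 = x;
            y2 = y1; y1 = y;
            sumSquares += y * y;
        }
        return (float)Math.Sqrt(sumSquares / signal.Length); // RMS of the band
    }
}

Calling it for 1024 band center frequencies per block would fill one column of the spectrogram sketched above, although an FFT per block would be much cheaper for that many bands.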

gstreamer read decibel from buffer

I am trying to get the dB level of incoming audio samples. On every video frame, I update the dB level and draw a bar representing a 0 - 100% value (0% being something arbitrary such as -20.0dB and 100% being 0dB.)
gdouble sum, rms;
sum = 0.0;
guint16 *data_16 = (guint16 *) amap.data;

for (gint i = 0; i < amap.size; i = i + 2)
{
    gdouble sample = ((guint16) data_16[i]) / 32768.0;
    sum += (sample * sample);
}
rms = sqrt (sum / (amap.size / 2));
dB = 10 * log10 (rms);
This was adapted to C from a code sample, marked as the answer, from here. I am wondering what it is that I am missing from this very simple equation.
Answered: jacket was correct about the code losing the sign, so everything ended up being positive. Also 10 * log10(rms) is incorrect; it should be 20 * log10(rms), since I am converting amplitude to decibels (as a measure of output power).
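Putting those two fixes together (signed samples and the 20·log10 amplitude-to-dB conversion), a sketch of the corrected loop (assuming interleaved 16-bit samples and averaging over all of them) could look like:

gdouble sum = 0.0, rms, dB;
gint16 *data_16 = (gint16 *) amap.data;        /* signed, so negative samples keep their sign */
gsize n_samples = amap.size / sizeof (gint16);

for (gsize i = 0; i < n_samples; i++)
{
    gdouble sample = data_16[i] / 32768.0;
    sum += sample * sample;
}
rms = sqrt (sum / n_samples);
dB = 20 * log10 (rms);                         /* amplitude ratio -> decibels */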
The level element is best for this task (as @ensonic already mentioned); it's intended for exactly what you need.
So basically you add an element called "level" to your pipeline, then enable its message emission.
The level element then emits messages which contain RMS, peak and decay values. RMS is what you need.
You can setup callback function connected to such message event:
audio_level = gst_element_factory_make ("level", "audiolevel");
g_object_set(audio_level, "message", TRUE, NULL);
...
g_signal_connect (bus, "message::element", G_CALLBACK (callback_function), this);
The bus variable is of type GstBus; I hope you know how to work with buses.
Then, in the callback function, check the element name and get the RMS as described here.
There is also a normalization step using pow() to convert the dB value into the 0.0 - 1.0 range, which you can use to get the percentage you described in your question.
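A minimal sketch of both mappings (assuming the RMS value is reported in dB with 0 dB as full scale; the -20 dB floor is the arbitrary one from the question):

#include <glib.h>
#include <math.h>

/* dB -> linear amplitude ratio: 10^(dB/20); 0 dB maps to 1.0 */
static gdouble rms_db_to_normalized (gdouble rms_db)
{
    return pow (10.0, rms_db / 20.0);
}

/* alternative: map -20 dB .. 0 dB linearly onto 0% .. 100% */
static gdouble rms_db_to_percent (gdouble rms_db)
{
    gdouble percent = (rms_db + 20.0) / 20.0 * 100.0;
    if (percent < 0.0) percent = 0.0;
    if (percent > 100.0) percent = 100.0;
    return percent;
}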
