Spectrogram - Calculating is wrong - audio

Ok, so basically, I am implementing the following algorithm:
1) Slice the signal into chunks of size 256 with an overlap of 128
2) Multiply each chunk with the Hanning window
3) Get DFT
4) Compute the abs value sqrt(re*re+im*im)
Plotting these values as an image with imshow, I get the following result:
This looks OK: it clearly shows some difference, i.e. the spike where the signal has the most amplitude is visible. However, in Python I get this result:
I know that I'm doing something right, but also something wrong. I just can't find where, which makes me doubt that I have done it correctly.
Any rough ideas as to where I could be going wrong here? I mean, is plotting the abs value the right way here or not?
Thanks
EDIT:
Result after clamping..
UPDATE:
Code:
for(unsigned j=0; (j < stft_temp[i].size()/2); j++)
{
    double v = 10 * log10(stft_temp[i][j].re * stft_temp[i][j].re + stft_temp[i][j].im * stft_temp[i][j].im);
    double pixe = 1.5 * (v + 100);
    STFT[i][j] = (int) pixe;
}

Typically you might want to use a log magnitude and then scale to the required range, which would usually be 0..255. In pseudo-code:
mag_dB = 10 * log10(re * re + im * im); // get log magnitude (dB)
pixel_intensity = 1.5 * (mag_dB + 100); // offset and scale
pixel_intensity = min(pixel_intensity, 255); // clamp to 0..255
pixel_intensity = max(pixel_intensity, 0);
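The whole pipeline (slice, window, DFT, log magnitude, clamp) can be sketched roughly as follows. This is only an illustration under a few assumptions: it is pure Python with a naive DFT rather than a real FFT library, the `eps` term is added here to avoid `log10(0)`, and the names `spectrogram`, `to_pixel`, and the 1024-sample test tone are made up for the example.

```python
import cmath
import math

FRAME = 256   # samples per slice, as in the question
HOP = 128     # step between slices, i.e. an overlap of 128

def hann(n):
    # Hann ("Hanning") window of length n
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft(frame):
    # Naive O(n^2) DFT; a real program would call an FFT routine instead.
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2)]  # keep only the non-redundant half

def to_pixel(c, eps=1e-12):
    # eps is an assumption of this sketch: it avoids log10(0) on silent bins.
    mag_db = 10 * math.log10(c.real * c.real + c.imag * c.imag + eps)
    pixel = 1.5 * (mag_db + 100)          # offset and scale, as in the answer
    return int(max(0, min(255, pixel)))   # clamp to 0..255

def spectrogram(signal):
    window = hann(FRAME)
    rows = []
    for start in range(0, len(signal) - FRAME + 1, HOP):
        chunk = [signal[start + i] * window[i] for i in range(FRAME)]
        rows.append([to_pixel(c) for c in dft(chunk)])
    return rows

# Test tone that falls exactly on bin 32 of a 256-point DFT.
tone = [math.sin(2 * math.pi * 32 * t / FRAME) for t in range(1024)]
image = spectrogram(tone)
```

Each row of `image` is one time slice and each column one frequency bin, so the result can be handed to an image plot directly. The offset of 100 and the scale of 1.5 are the values from the pseudo-code and may need tuning for a given signal level.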


THREE.js BufferGeometry points animate individually around sphere and control speed

So I'm going nuts lol, this seems to be super easy and I feel like I'm so close, but I can't seem to pinpoint where I'm going wrong here.
I'm trying to create an animated group of THREE.js BufferGeometry Points and animate them around a sphere pattern INDIVIDUALLY, at different speeds, starting from different positions. I want each of them to move in a circular motion around the sphere, not shoot randomly around like they do now. They can BEGIN in any random spot, but from wherever they start they should follow a normal circular path around the sphere. My other issue is figuring out how to SLOW THEM DOWN.
From what I understand, theta is the angle that needs to be increased to rotate a particle around a sphere, but I'm kind of lost on how to do that properly. Below is my code and codepens. Any advice is greatly appreciated, and please try to dumb down the math terminology for me, as I'm super new to vector math but have been studying to try and learn some cool sh!t.
Primarily, there are two main parts: the initial loop, which does the initial drawing/placement of the particles, and the second loop, which updates them and is meant to move them forward along their circular path around the sphere. If that's not the correct way to go about this, please lmk lol.
Here is the initial placement of the particles at random places all over the outside of the sphere. My update function is the same as of now, although I'm sure the update function is the one that needs to change:
const t = clock.getElapsedTime();
let theta = 0, phi = 0;
for (let i = 0; i < 1000; i++) {
  theta += 2 * Math.PI * Math.random();
  phi += Math.acos(2 * Math.random() - 1);
  const x = radius * Math.cos(theta) * Math.sin(phi)
  const y = radius * Math.sin(theta) * Math.sin(phi)
  const z = radius * Math.cos(phi)
  positions.push(x, y, z);
  sizes.push(Math.random()*100)
  // Math.floor keeps the index in range; Math.round could return colorList.length
  const hex = colorList[Math.floor(Math.random() * colorList.length)]
  const rgb = new THREE.Color(hex)
  colors.push(rgb.r, rgb.g, rgb.b)
}
jsfiddle
The Vue tab in the fiddle has all the code.
I've tried the above with a time constant added to theta and without, and either way all the particles move about randomly around the sphere. I can't figure out how to get the particles to move in a smooth, slower, circular pattern around the sphere. Again, the initial random positions are fine, but the way they update and scatter around randomly is wrong. I know it has to do with the theta variable, I just can't figure out what to do to make it right.
OK, after what seems like months, I FINALLY figured out how to INDIVIDUALLY rotate three.js points around a sphere at different speeds, from random starting positions.
THERE ARE LOTS OF EXAMPLES FOR OLD THREE.JS VERSIONS THAT USE THREE.GEOMETRY, BUT THIS USES THE NEW BUFFERGEOMETRY WITH THE LATEST THREE.JS VERSION, NOT SOME ANCIENT R86 VERSION LIKE THE OTHER EXAMPLES!!!
This first part does the initial plotting of the points:
const radius = 1.5
const vectors = []
let theta = 0; let phi = 0
for (let i = 0; i < 5000; i++) {
  // random starting position on the sphere
  theta = 2 * Math.PI * Math.random()
  phi = Math.acos(2 * Math.random() - 1)
  const px = radius * Math.cos(theta) * Math.sin(phi)
  const py = radius * Math.sin(theta) * Math.sin(phi)
  const pz = radius * Math.cos(phi)
  const vertex = new THREE.Vector3(px, py, pz)
  vertex.delay = Date.now() + (particlesDelay * i)
  // each point gets its own rotation axis and its own angular step per update
  vertex.rotationAxis = new THREE.Vector3(0, Math.random() * 2 - 1, Math.random() * 2 - 1)
  vertex.rotationAxis.normalize()
  vertex.rotationSpeed = Math.random() * 0.1
  vectors.push(vertex)
  positions.push(vertex.x, vertex.y, vertex.z)
  sizes.push(Math.random() * 0.1)
  // Math.floor keeps the index in range; Math.round could return colorList.length
  const hex = colorList[Math.floor(Math.random() * colorList.length)]
  const rgb = new THREE.Color(hex)
  colors.push(rgb.r, rgb.g, rgb.b)
}
geometry.setAttribute('position', new THREE.Float32BufferAttribute(positions, 3).setUsage(THREE.DynamicDrawUsage))
geometry.setAttribute('color', new THREE.Float32BufferAttribute(colors, 3))
geometry.setAttribute('size', new THREE.Float32BufferAttribute(sizes, 1))
const particles = new THREE.Points(geometry, shaderMaterial)
scene.add(particles)
This is the magic that updates the points around the sphere
const posAttribute = particles.geometry.getAttribute('position')
const ps = posAttribute.array
const updateParticles = () => {
  // loop over vectors and animate around sphere
  for (let i = 0; i < vectors.length; i++) {
    const vector = vectors[i]
    vector.applyAxisAngle(vector.rotationAxis, vector.rotationSpeed)
    ps[i * 3] = vector.x
    ps[i * 3 + 1] = vector.y
    ps[i * 3 + 2] = vector.z
  }
  particles.geometry.attributes.position.needsUpdate = true
}
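For anyone curious about the math behind rotating a point about an axis by an angle, here is a small sketch in plain Python using Rodrigues' rotation formula. It is not three.js code and makes no claim about how three.js implements `applyAxisAngle` internally; the helper names and the sample axis, radius, and speed are made up for the illustration. The point it shows is that each update keeps the particle at the same distance from the origin, and a smaller angle per update means slower motion.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)

def rotate_about_axis(v, axis, angle):
    # Rodrigues' formula: v*cos + (axis x v)*sin + axis*(axis . v)*(1 - cos)
    # `axis` must be unit length, `angle` is in radians.
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c)
                 for i in range(3))

# Hypothetical particle: starts on a sphere of radius 1.5 and
# advances 0.05 radians around its own axis on every update.
point = (1.5, 0.0, 0.0)
axis = normalize((0.0, 0.3, -0.8))
speed = 0.05
for _ in range(1000):
    point = rotate_about_axis(point, axis, speed)
radius_after = math.sqrt(dot(point, point))

# Sanity check: a quarter turn about z takes the x axis onto the y axis.
quarter = rotate_about_axis((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
```

After 1000 updates `radius_after` is still 1.5 up to rounding error, which is why the points stay on the sphere. Halving `speed` halves how far a point travels per update.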

Retrieving depth value of a sample position using depth map

I have been trying to implement SSAO following the LearnOpenGL implementation. In their implementation they have utilized the positions g-buffer to obtain the sample positions depth value and I am wondering how I could go about using the depth buffer instead since I have this ready to use, in order to retrieve the depth value instead of using the positions g-buffer. I have shown the LearnOpenGL implementation using the positions texture and the attempt I made at using the depth buffer. I think I might be missing a required step for utilizing the depth buffer but I am unsure.
LearnOpenGL SSAO
Using Positions g-buffer
layout(binding = 7) uniform sampler2D positionsTexture;
layout(binding = 6) uniform sampler2D depthMap;
// ...
vec4 offset = vec4(samplePos, 1.0);
offset = camera.proj * offset; //transform sample to clip space
offset.xyz /= offset.w; // perspective divide
offset.xyz = offset.xyz * 0.5 + 0.5; // transform to range 0-1
float sampleDepth = texture(positionsTexture, offset.xy).z;
I want to use the depth buffer instead. This approach did not seem to work for me:
float sampleDepth = texture(depthMap, offset.xy).x;
Update: 8/01
I have implemented a linearization function for the depth result. I am still unable to obtain the right result. Am I missing something more?
float linearize_depth(float d,float zNear,float zFar)
{
    return zNear * zFar / (zFar + d * (zNear - zFar));
}
float sampleDepth = linearize_depth(texture(depthMap, offset.xy).z, zNear, zFar);
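As a sanity check on the formula itself, the following Python sketch pushes a view-space z through a projection and back. It assumes a standard OpenGL-style perspective matrix and the default depth range of 0..1, neither of which is confirmed by the post; `depth_buffer_value` and the near/far values are made up for the illustration.

```python
def depth_buffer_value(z_view, z_near, z_far):
    # Assumed OpenGL-style projection: z_view is negative in front of the camera.
    z_clip = (-(z_far + z_near) / (z_far - z_near) * z_view
              - 2.0 * z_far * z_near / (z_far - z_near))
    w_clip = -z_view
    z_ndc = z_clip / w_clip      # perspective divide, range -1..1
    return 0.5 * z_ndc + 0.5     # depth buffer range 0..1

def linearize_depth(d, z_near, z_far):
    # Same expression as the GLSL function in the post.
    return z_near * z_far / (z_far + d * (z_near - z_far))

Z_NEAR, Z_FAR = 0.1, 100.0   # hypothetical clip planes
results = []
for z_view in (-0.5, -5.0, -50.0):
    d = depth_buffer_value(z_view, Z_NEAR, Z_FAR)
    results.append((z_view, d, linearize_depth(d, Z_NEAR, Z_FAR)))
```

Under those assumptions the function returns the positive distance `-z_view`, so the formula is consistent with the projection. Two things may be worth checking in the shader: a view-space position's z is negative in front of the camera, so the linearized value would need its sign flipped before being compared with it, and a depth texture's value is read from its first component (`.x` or `.r`, as in the earlier snippet) rather than `.z`.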

Open Scene Graph - Usage of DrawElementsUInt: Drawing a cloth without duplicating vertices

I am currently working on simulating a cloth like material and then displaying the results via Open Scene Graph.
I've gotten the setup to display something cloth like, by just dumping all the vertices into 1 Vec3Array and then displaying them with a standard Point based DrawArrays. However I am looking into adding the faces between the vertices so that a further part of my application can visually see the cloth.
This is currently what I am attempting as for the PrimitiveSet
// create and add a DrawArray Primitive (see include/osg/Primitive). The first
// parameter passed to the DrawArrays constructor is the Primitive::Mode which
// in this case is POINTS (which has the same value GL_POINTS), the second
// parameter is the index position into the vertex array of the first point
// to draw, and the third parameter is the number of points to draw.
unsigned int k = CLOTH_SIZE_X;
unsigned int n = CLOTH_SIZE_Y;
osg::ref_ptr<osg::DrawElementsUInt> indices = new osg::DrawElementsUInt(GL_QUADS, (k) * (n));
for (uint y_i = 0; y_i < n - 1; y_i++) {
    for (uint x_i = 0; x_i < k - 1; x_i++) {
        (*indices)[y_i * k + x_i] = y_i * k + x_i;
        (*indices)[y_i * (k + 1) + x_i] = y_i * (k + 1) + x_i;
        (*indices)[y_i * (k + 1) + x_i + 1] = y_i * (k + 1) + x_i + 1;
        (*indices)[y_i * k + x_i] = y_i * k + x_i + 1;
    }
}
geom->addPrimitiveSet(indices.get());
This does however cause memory corruption when running, and I am not fluent enough in Assembly code to decipher what it is trying to do wrong when CLion gives me the disassembled code.
My thought was that I would iterate over each of the faces of my cloth and then select the 4 indices of the vertices that belong to it. The vertices are inputted from top left to bottom right in order. So:
0 1 2 3 ... k-1
k k+1 k+2 k+3 ... 2k-1
2k 2k+1 2k+2 2k+3 ... 3k-1
...
Has anyone come across this specific use-case before and does he/she perhaps have a solution for my problem? Any help would be greatly appreciated.
You might want to look into using DrawArrays with QUAD_STRIP (or TRIANGLE_STRIP because quads are frowned upon these days). There's an example here:
http://openscenegraph.sourceforge.net/documentation/OpenSceneGraph/examples/osggeometry/osggeometry.cpp
It's slightly less efficient than Elements/indices, but it's also less complicated to manage the relationship between the two related containers (the vertices and the indices).
If you really want to do the Elements/indices route, we'd probably need to see more repro code to see what's going on.
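For the indexed route, the bookkeeping can be sketched independently of OSG. The Python below only generates the index list for a grid whose vertices are stored row by row as in the question; it is an illustration, not OSG code, and `quad_indices` is a made-up name. A grid of k by n vertices has (k-1) by (n-1) faces, so a quad list needs 4 * (k-1) * (n-1) entries, written one after another.

```python
def quad_indices(k, n):
    # k vertices per row, n rows, vertices stored row by row:
    # the vertex at column x, row y has index y * k + x.
    indices = []
    for y in range(n - 1):
        for x in range(k - 1):
            top_left = y * k + x
            # Four corners of one face, walked around its perimeter.
            indices += [top_left,          # top left
                        top_left + k,      # bottom left
                        top_left + k + 1,  # bottom right
                        top_left + 1]      # top right
    return indices

example = quad_indices(4, 3)   # 4 columns, 3 rows -> 3 * 2 = 6 quads
```

Every value in the list is a valid vertex index (at most k * n - 1), and the list length is independent of the vertex count. The winding order that gives front-facing quads depends on how the cloth is oriented, so the corner order above may need reversing.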

Increasing mean deviation with increasing sample size on Excel's NORMINV()

I am seeing strange behaviour in my attempt to code Excel's NORMINV() in C. For norminv() I used an implementation from a mathematician; it's probably correct, since I also tried different ones with the same result. Here's the code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// norminv() is defined elsewhere (not shown)

double calculate_probability(double x0, double x1)
{
    return x0 + (x1 - x0) * rand() / ((double)RAND_MAX);
}
int main() {
    long double probability = 0.0;
    long double mean = 0.0;
    long double stddev = 0.001;
    long double change_percentage = 0.0;
    long double current_price = 100.0;
    srand(time(0));
    int runs = 0;
    long double prob_sum = 0.0;
    long double price_sum = 0.0;
    while (runs < 100000)
    {
        probability = calculate_probability(0.00001, 0.99999);
        change_percentage = mean + stddev * norminv(probability); //norminv(p, mu, sigma) = mu + sigma * norminv(p)
        current_price = current_price * (1.0 + change_percentage);
        runs++;
        prob_sum += probability;
        price_sum += current_price;
    }
    // %Lf is the printf conversion for long double
    printf("\n\n%Lf %Lf\n", price_sum / runs, prob_sum / runs);
    return 0;
}
Now I want to simulate Excel's NORMINV(rand(), 0, 0.001) where rand() is a value > 0 and < 1, 0 is the mean and 0.001 would be the standard deviation.
With 1000 values it looks okay:
100.729780 0.501135
With 10000 values it spreads too much:
107.781909 0.502301
And with 100000 values it sometimes spreads even more:
87.876500 0.498738
Now I don't know why that happens. My assumption is that the random number generator has to be normally distributed, too. In my case probability is calculated fine since the mean is pretty much 0.5 all the time. Thus I don't know why the mean deviation is increasing. Can somebody help me?
You're doing something along the lines of a random walk, except your moves are with a multiplicative scaling factor rather than additive steps.
Consider two successive moves, the first of which gives 20% inflation, the second with 20% deflation. Starting with a baseline of 100, after the first step you're at 120. If you now take 80% of 120, you get 96 rather than the original 100. In other words, seemingly symmetric scaling factors are not actually symmetric. While your scaling factors are random, they are still being created symmetrically around 1, so I'm not surprised to see deviations accumulate.
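A rough Python sketch of both points, under stated assumptions: `statistics.NormalDist(...).inv_cdf` (Python 3.8+) stands in for norminv, the uniform range matches the question, and the step counts, trial count, and seed are arbitrary choices for the illustration.

```python
import random
import statistics

# The asymmetry from the example: +20% followed by -20% does not return to 100.
after_up_then_down = 100 * 1.20 * 0.80   # roughly 96, not 100

step_dist = statistics.NormalDist(0.0, 0.001)   # mean 0, stddev 0.001

def final_price(steps, rng):
    # Same update rule as the question: multiply by (1 + normally distributed change).
    price = 100.0
    for _ in range(steps):
        p = rng.uniform(0.00001, 0.99999)
        price *= 1.0 + step_dist.inv_cdf(p)
    return price

def spread_of_final_prices(steps, trials, rng):
    # Standard deviation of the end price over many independent runs.
    return statistics.stdev(final_price(steps, rng) for _ in range(trials))

rng = random.Random(0)   # fixed seed so the sketch is repeatable
spread_short = spread_of_final_prices(100, 100, rng)
spread_long = spread_of_final_prices(2500, 100, rng)
```

The spread of the end price grows with the number of steps (roughly with the square root of the step count for steps this small), so a longer run drifting further from 100 is the expected behaviour of the walk rather than a sign that norminv is wrong.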

What exactly does a Sample Rate of 44100 sample?

I'm using the FMOD library to extract PCM from an MP3. I get the whole 2 channel - 16 bit thing, and I also get that a sample rate of 44100 Hz is 44,100 samples of "sound" in 1 second. What I don't get is what exactly the 16 bit value represents. I know how to plot coordinates on an xy axis, but what am I plotting? The y axis represents time, the x axis represents what? Sound level? Is that the same as amplitude? How do I determine the different sounds that compose this value? I mean, how do I get a spectrum from a 16 bit number?
This may be a separate question, but it's actually what I really need answered: How do I get the amplitude at every 25 milliseconds? Do I take 44,100 values, divide by 40 (40 * 0.025 seconds = 1 sec) ? That gives 1102.5 samples; so would I feed 1102 values into a blackbox that gives me the amplitude for that moment in time?
Edited original post to add code I plan to test soon: (note, I changed the frame rate from 25 ms to 40 ms)
// 44100 / 25 frames = 1764 samples per frame -> 1764 * 2 channels * 2 bytes [16 bit sample] = 7056 bytes
private const int CHUNKSIZE = 7056;
uint bytesread = 0;
var squares = new double[CHUNKSIZE / 4];
const double scale = 1.0d / 32768.0d;
do
{
    result = sound.readData(data, CHUNKSIZE, ref read);
    Marshal.Copy(data, buffer, 0, CHUNKSIZE);
    //PCM samples are 16 bit little endian
    Array.Reverse(buffer);
    for (var i = 0; i < buffer.Length; i += 4)
    {
        var avg = scale * (Math.Abs((double)BitConverter.ToInt16(buffer, i)) + Math.Abs((double)BitConverter.ToInt16(buffer, i + 2))) / 2.0d;
        squares[i >> 2] = avg * avg;
    }
    var rmsAmplitude = ((int)(Math.Floor(Math.Sqrt(squares.Average()) * 32768.0d))).ToString("X2");
    fs.Write(buffer, 0, (int) read);
    bytesread += read;
    statusBar.Text = "writing " + bytesread + " bytes of " + length + " to output.raw";
} while (result == FMOD.RESULT.OK && read == CHUNKSIZE);
After loading mp3, seems my rmsAmplitude is in the range 3C00 to 4900. Have I done something wrong? I was expecting a wider spread.
Yes, a sample represents amplitude (at that point in time).
To get a spectrum, you typically convert it from the time domain to the frequency domain.
Last Q: Multiple approaches are used - You may want the RMS.
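As a rough illustration of the RMS approach, here is a Python sketch that cuts one second of mono samples into 25 ms windows and computes one RMS value per window. The samples are assumed to be already scaled to -1..1, and the 440 Hz test tone, its amplitude of 0.5, and the name `rms_per_window` are made up for the example.

```python
import math

SAMPLE_RATE = 44100
WINDOW_SECONDS = 0.025
WINDOW = int(SAMPLE_RATE * WINDOW_SECONDS)   # 1102 samples (1102.5 rounded down)

def rms_per_window(samples, window):
    # One RMS value per complete window; a trailing partial window is dropped.
    values = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        values.append(math.sqrt(sum(s * s for s in chunk) / window))
    return values

# One second of a 440 Hz sine with peak amplitude 0.5.
samples = [0.5 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
           for t in range(SAMPLE_RATE)]
levels = rms_per_window(samples, WINDOW)
```

For a sine wave the RMS is the peak amplitude divided by the square root of 2, so every entry of `levels` comes out near 0.354. For stereo data the two channels would be averaged or handled separately before this step.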
Generally, the x axis is the time value and y axis is the amplitude. To get the frequency, you need to take the Fourier transform of the data (most likely using the Fast Fourier Transform [fft] algorithm).
To use one of the simplest "sounds", let's assume you have a pure tone with a single frequency f. In the amplitude/time domain this is represented as y = sin(2 * pi * f * x), where x is time in seconds.
If you convert that into the frequency domain, you just end up with a single peak at frequency f.
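A small Python sketch of that conversion, using a naive DFT in place of a real FFT library. The sample rate of 8000 Hz, the 256-sample length, and the 1000 Hz tone are arbitrary values picked so that the tone lands exactly on one frequency bin.

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (chosen for the example)
N = 256              # number of samples analysed
TONE_HZ = 1000.0     # frequency of the test tone

# Time domain: amplitude of the tone at each sample instant.
samples = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(N)]

def dft_magnitudes(x):
    # Naive O(n^2) DFT; returns the magnitude of each bin up to half the sample rate.
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]

magnitudes = dft_magnitudes(samples)
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N   # bin index converted back to Hz
```

Here `peak_bin` is 32 and `peak_hz` is 1000.0, matching the tone that went in. Real audio contains many frequencies at once, so its spectrum has many peaks of different heights rather than a single one.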
Each sample represents the voltage of the analog signal at a given time.
