Can SuperpoweredFloatToShortIntInterleave be used to process media playback from devices?

I am having trouble understanding Superpowered audio output processing. Everything was fine when I was using SuperpoweredFloatToShortInt with input buffers allocated as:
buffer[n] = (float *)memalign(16, (buffersize + 16) * sizeof(float) * 2);
Then I realised my audio output had been mono all this while, so I decided to process my output using SuperpoweredFloatToShortIntInterleave to give a surround stereo effect. Using the same buffer[n] variable, my audio got processed but was distorted and low-pitched, towards bass, when testing on a device.
I have also tried having separate buffer variables, as learnt here, with these:
inputBufferFloat = (float *)malloc(buffersize * sizeof(float) * 2 + 128);
leftInputBuffer = (float *)malloc(buffersize * sizeof(float) + 128);
rightInputBuffer = (float *)malloc(buffersize * sizeof(float) + 128);
leftOutputBuffer = (float *)malloc(buffersize * sizeof(float) + 128);
rightOutputBuffer = (float *)malloc(buffersize * sizeof(float) + 128);
static bool audioProcessing(void * __unused clientdata, short int *audioInputOutput, int numberOfSamples, int __unused samplerate) {
    SuperpoweredShortIntToFloat(audioInputOutput, inputBufferFloat, numberOfSamples, 2);
    SuperpoweredDeInterleave(inputBufferFloat, leftInputBuffer, rightInputBuffer, numberOfSamples);
    FIR(leftInputBuffer, leftOutputBuffer, numberOfSamples);
    FIR(rightInputBuffer, rightOutputBuffer, numberOfSamples);
    SuperpoweredFloatToShortIntInterleave(leftOutputBuffer, rightOutputBuffer, audioInputOutput, numberOfSamples);
    return true;
}
But the app crashes instantly at test. Please, any help at all will be much appreciated.
Thanks.

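You may have set up the audio I/O with a different buffer size than the "buffersize" you used for the allocations, such as "buffersize" multiplied by two. In that case "numberOfSamples" will be twice as big and your buffers will not be big enough, so the conversion functions write past the end of them.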

Related

An Algorithm for producing fake audio visualizer

Does anybody know an algorithm for making a random series of numbers (like 100 Java bytes, >= -127 and <= 127) which, when drawn as a bar chart, would look similar to a regular audio spectrum, like the SoundCloud ones?
I'm trying to write one. It has multiple random and sine calculations, but the result is very ugly: something between a sine wave and an old toothbrush. I would be very thankful if you could direct me to one which is aesthetically convincing.
An algorithm with an explanation (and/or picture) is fine. Pseudocode would be very nice of you. Actual Java code is a bonus. :D
Edit:
This is the code I'm using right now. It's convoluted, but I'm basically adding a random deviation to a sine wave with random amplitude (which I'm not sure was a good idea).
private static final int FREQ = 7;
private static final double DEG_TO_RAD = Math.PI / 180;
private static final int MAX_AMPLITUDE = 127;
private static final float DEVIATION = 0.1f; // 10 percent is maximum deviation
private void makeSinusoidRandomBytes() {
byte[] bytes = new byte[AUDIO_VISUALIZER_DENSITY];
for (int i = 0; i < AUDIO_VISUALIZER_DENSITY; i++) {
int amplitude = random.nextInt(MAX_AMPLITUDE) - MAX_AMPLITUDE/2;
byte dev = (byte) (random.nextInt((int) Math.max(Math.abs(2 * DEVIATION * amplitude), 1))
- Math.abs(DEVIATION * amplitude));
bytes[i] = (byte) (Math.sin(i * FREQ * DEG_TO_RAD) * amplitude - dev);
}
this.bytes = bytes;
}
A real soundwave is actually a combination of sine waves of different frequencies and amplitudes added together, not random deviations from a sine wave. The difficult part will be to choose a combination of wave amplitudes and frequencies that produces output you subjectively like! However, most sound waves have a base frequency and then a number of overtones which "fit into" that wavelength. For example, a wave might have an overtone at 3/2 of the base frequency with an amplitude of 2/3 of the base amplitude. By combining these overtones and scaling the resulting waveform to the -127 to +127 range, you'll get an actual soundwave.
The following code is C#, but close enough to Java to give you an idea. It's from a game, where I needed to combine many sine waves together to create various types of oscillating effects:
/// <summary>
/// Return a value between 0 and 1 based on a sine-wave oscillating with a given combination of periods at a given point in time
/// </summary>
/// <param name="time">time to get wave value at</param>
/// <param name="periods">lengths of waves</param>
/// <returns>height of wave</returns>
public static float MultiPulse(float time, params float[] periods)
{
float c = 0;
foreach (float p in periods)
{
float cp = (MathHelper.Pi / p) * time;
float s = ((float)Math.Sin(cp) + 1) / 2;
c += s / periods.Length;
}
return c;
}
You probably want to modify that to allow you to specify different amplitudes as well as periods for the waves you are combining.
By combining many widely varying amplitudes and periods (frequencies) you should by trial and error be able to get something convincing.
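A minimal Java sketch of the modification suggested above, assuming you want each wave to have its own amplitude and the sum rescaled to the byte range. The class name FakeWaveform, the method generate, and the particular frequency and amplitude values are illustrative assumptions, not taken from the posts above.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only: names and constants are assumptions.
public class FakeWaveform {
    /**
     * Sums sine waves with the given frequencies (in cycles across the whole
     * array) and amplitudes, then rescales the sum so its peak is +/-127.
     */
    public static byte[] generate(int length, double[] cycles, double[] amplitudes) {
        if (cycles.length != amplitudes.length) {
            throw new IllegalArgumentException("cycles and amplitudes must have the same length");
        }
        double[] sum = new double[length];
        double peak = 0.0;
        for (int i = 0; i < length; i++) {
            double x = 2.0 * Math.PI * i / length; // position along the array, 0..2*pi
            for (int k = 0; k < cycles.length; k++) {
                sum[i] += amplitudes[k] * Math.sin(cycles[k] * x);
            }
            peak = Math.max(peak, Math.abs(sum[i]));
        }
        byte[] out = new byte[length];
        if (peak == 0.0) {
            return out; // nothing to scale, leave all zeros
        }
        for (int i = 0; i < length; i++) {
            out[i] = (byte) Math.round(sum[i] / peak * 127.0);
        }
        return out;
    }

    public static void main(String[] args) {
        Random random = new Random();
        double base = 2 + random.nextInt(4);          // base frequency: 2..5 cycles
        double[] cycles = {base, base * 1.5, base * 1.75};
        double[] amplitudes = {1.0, 2.0 / 3.0, 0.5};  // overtones quieter than the base
        System.out.println(Arrays.toString(generate(100, cycles, amplitudes)));
    }
}
```

Rescaling by the measured peak, rather than by the sum of the amplitudes, keeps the bars tall even when the components rarely line up.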
Based on the idea see sharper gave me, this is the code I'm using right now:
int mainAmp = random.nextInt(MAX_AMPLITUDE) - MAX_AMPLITUDE / 2;
int overtoneAmp = random.nextInt(MAX_AMPLITUDE * 2 / 3) - MAX_AMPLITUDE / 3;
int overtone2Amp = random.nextInt(MAX_AMPLITUDE * 4 / 7) - MAX_AMPLITUDE * 2 / 7;
int mainFreq = random.nextInt(7) + 7;
int overtoneFreq = mainFreq * 3 / 2;
int overtone2Freq = mainFreq * 7 / 4;
byte[] bytes = new byte[AUDIO_VISUALIZER_DENSITY];
for (int i = 0; i < AUDIO_VISUALIZER_DENSITY; i++) {
bytes[i] = (byte) (Math.sin(i * mainFreq * DEG_TO_RAD) * mainAmp
+ Math.sin(i * overtoneFreq * DEG_TO_RAD) * overtoneAmp
+ Math.sin(i * overtone2Freq * DEG_TO_RAD) * overtone2Amp);
}
In the code above the main frequency falls between 7 and 13 (random.nextInt(7) + 7). You can play with those. The other two overtones I'm using are (2 - 1/2)x and (2 - 1/4)x of the main frequency. You can add more, like (2 - 1/8)x etc., or use another series of frequencies. I also randomize the amplitude to get a unique wave each time.
These are some waves I'm drawing using this code: (images not preserved)

Why does this programmatically generated musical chord not sound correct?

I have the following class which generates a buffer containing sound data:
package musicbox.example;
import javax.sound.sampled.LineUnavailableException;
import musicbox.engine.SoundPlayer;
public class CChordTest {
private static final int SAMPLE_RATE = 1024 * 64;
private static final double PI2 = 2 * Math.PI;
/*
* Note frequencies in Hz.
*/
private static final double C4 = 261.626;
private static final double E4 = 329.628;
private static final double G4 = 391.995;
/**
* Returns buffer containing audio information representing the C chord
* played for the specified duration.
*
* @param duration The duration in milliseconds.
* @return Array of bytes representing the audio information.
*/
private static byte[] generateSoundBuffer(int duration) {
double durationInSeconds = duration / 1000.0;
int samples = (int) (durationInSeconds * SAMPLE_RATE);
byte[] out = new byte[samples];
for (int i = 0; i < samples; i++) {
double value = 0.0;
double t = (i * durationInSeconds) / samples;
value += Math.sin(t * C4 * PI2); // C note
value += Math.sin(t * E4 * PI2); // E note
value += Math.sin(t * G4 * PI2); // G note
out[i] = (byte) (value * Byte.MAX_VALUE);
}
return out;
}
public static void main(String... args) throws LineUnavailableException {
SoundPlayer player = new SoundPlayer(SAMPLE_RATE);
player.play(generateSoundBuffer(1000));
}
}
Perhaps I'm misunderstanding some physics or math here, but it seems like each sinusoid ought to represent the sound of each note (C, E, and G), and by summing the three sinusoids, I should hear something similar to when I play those three notes simultaneously on the keyboard. What I'm hearing, however, is not even close to that.
For what it's worth, if I comment out any two of the sinusoids and keep the third, I do hear the (correct) note corresponding to that sinusoid.
Can somebody spot what I'm doing wrong?
To combine audio signals you need to average their samples, not sum them.
Divide the value by 3 before converting to byte.
You don't say in what way it sounds incorrect. Adding three sine values like that, you are going to get a signal that ranges from -3.0 to 3.0, so it is going to clip when you apply your * Byte.MAX_VALUE (in Java the cast to byte actually wraps around, which sounds even worse). This is why averaging probably worked for you. Adding is correct; you just need to scale the result afterwards to prevent clipping, and dividing by the number of sine waves is the easiest way to do this. But if you start changing the number of sine waves dynamically and use the same strategy, you won't get the result you expect: you have to choose the scale for when your signal is at its loudest. Remember that real audio is not going to be at maximum amplitude, so you don't have to worry too much if your synthesised audio isn't either. Also, the way we perceive volume is logarithmic: a signal at half amplitude is only about 6 dB quieter, which is a modest change in perceived loudness.
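A small Java sketch of the "add, then scale" approach described above, assuming 8-bit signed samples as in the question. ChordMixer, mix, and amplitudeRatioToDb are hypothetical names introduced only for illustration.

```java
// Illustrative sketch only: names are assumptions, not part of the question's code.
public class ChordMixer {
    /** Sums one sine wave per frequency and scales by 1/N so the sum stays within [-1, 1]. */
    public static byte[] mix(double[] frequenciesHz, int sampleRate, double seconds) {
        int samples = (int) (seconds * sampleRate);
        byte[] out = new byte[samples];
        double scale = 1.0 / frequenciesHz.length;
        for (int i = 0; i < samples; i++) {
            double t = (double) i / sampleRate; // time of this sample in seconds
            double value = 0.0;
            for (double f : frequenciesHz) {
                value += Math.sin(2.0 * Math.PI * f * t);
            }
            // After scaling, |value * scale| <= 1, so the cast cannot overflow the byte.
            out[i] = (byte) (value * scale * Byte.MAX_VALUE);
        }
        return out;
    }

    /** Level change in decibels for a given amplitude ratio. */
    public static double amplitudeRatioToDb(double ratio) {
        return 20.0 * Math.log10(ratio);
    }
}
```

For example, ChordMixer.mix(new double[]{261.626, 329.628, 391.995}, 44100, 1.0) produces one second of the C chord, and amplitudeRatioToDb(0.5) is roughly -6.02.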

Multi audio tones to sound card using portaudio

I am trying to generate a tone on the sound card (frequency: 1950 Hz, duration: 40 ms, level: -30 dB, right channel (stereo), on stream 1). Eventually, I would like to play two of these tones (one goes to channel 1 and one goes to channel 2).
Any help or direction is greatly appreciated.
Thanks,
DW
Hi Bjorn, I tried this, but I am not getting the frequency I am expecting (plus the sound does not seem clean). Any ideas what's wrong?
I greatly appreciate any help.
#define SAMPLE_RATE (44100)
#define TABLE_SIZE (200)
float FREQUENCY = 422;
...
for(int i=0; i<TABLE_SIZE; i++ )
{
data.sine[i] = (float) sin( (double)i * ((2.0 * M_PI)/(double)SAMPLE_RATE) * FREQUENCY );
}
data.left_phase = 0;
data.right_phase = 0;
...
... in callback function ...
for(unsigned long i = 0; i < framesPerBuffer; i++ )
{
// fill output buffer with sin wave
*out++ = data->amp * data->sine[data->left_phase]; // left
*out++ = data->amp * data->sine[data->right_phase]; // right
data->left_phase += 1;
if( data->left_phase >= TABLE_SIZE )
data->left_phase -= TABLE_SIZE;
data->right_phase += 1;
if( data->right_phase >= TABLE_SIZE )
data->right_phase -= TABLE_SIZE;
}
PortAudio has sample code for generating tones, you just need to figure out the frequency. See for example this answer:
[portaudio]Transmit and Detect frequency - Windows
Update:
Rather than trying to store a table of sine data, simply calculate the sine value in the callback using this formula:
amplitude[n] = sin( n * desiredFreq * 2 * pi / samplerate )
so (untested) your code will look something like this:
typedef struct
{
long n;
} MyData;
float FREQUENCY = 422;
static int MyCallback(
const void *inputBuffer,
void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData
)
{
MyData *data = (MyData*)userData;
float *out = (float*)outputBuffer;
(void) timeInfo; /* Prevent unused variable warnings. */
(void) statusFlags;
(void) inputBuffer;
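for(unsigned long i = 0; i < framesPerBuffer; i++ )
{
// fill output buffer with sin wave
// SAMPLE_RATE is the #define from the question; M_PI comes from <math.h>
float v = (float) sin( data->n * FREQUENCY * 2 * M_PI / (double) SAMPLE_RATE );
*out++ = v; // left
*out++ = v; // right
data->n++; // advance the sample counter, otherwise the output never changes
}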
return paContinue;
}
This code is not without problems: e.g. eventually n will "wrap around", and I'm not sure sin remains accurate and efficient as its input gets larger. Nevertheless it's a good starting point, and if you just need to generate a few seconds of a tone on modern hardware, this is really all you need. If you need something fancier, get this working first; then you can worry about making it more efficient and robust with a LUT.
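One common way around the wrap-around concern is to keep a running phase instead of an ever-growing sample counter. Below is a hedged Java sketch of that idea, not PortAudio code: PhaseOscillator and its methods are hypothetical names, and the buffer is assumed to be interleaved stereo floats like the callback's output.

```java
// Illustrative sketch only: a phase accumulator, independent of any audio API.
public class PhaseOscillator {
    private static final double TWO_PI = 2.0 * Math.PI;
    private final double phaseIncrement; // radians advanced per sample
    private double phase = 0.0;          // kept in [0, 2*pi) between samples

    public PhaseOscillator(double frequencyHz, double sampleRate) {
        this.phaseIncrement = TWO_PI * frequencyHz / sampleRate;
    }

    /** Fills an interleaved stereo buffer (left, right, left, right, ...). */
    public void fill(float[] interleavedStereo, float amplitude) {
        for (int i = 0; i + 1 < interleavedStereo.length; i += 2) {
            float v = amplitude * (float) Math.sin(phase);
            interleavedStereo[i] = v;     // left
            interleavedStereo[i + 1] = v; // right
            phase += phaseIncrement;
            while (phase >= TWO_PI) {
                phase -= TWO_PI;          // wrap so the argument to sin stays small
            }
        }
    }

    public double getPhase() {
        return phase;
    }
}
```

Because the phase is carried over between calls, consecutive buffers join without a discontinuity. For the level asked about in the question, -30 dB relative to full scale corresponds to an amplitude of 10^(-30/20), roughly 0.0316.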

How to play audio sample buffers from AVCaptureAudioDataOutput

The main goal of the app I'm trying to make is peer-to-peer video streaming (sort of like FaceTime, using Bluetooth/WiFi).
Using AVFoundation, I was able to capture video/audio sample buffers. Then I'm sending the video/audio sample buffer data. Now the problem is processing the sample buffer data on the receiving side.
As for the video sample buffer, I was able to get a UIImage from the sample buffer. But for the audio sample buffer, I don't know how to process it so that I can play the audio.
So the question is: how can I process/play the audio sample buffers?
Right now I'm just plotting the waveform, just like in Apple's Wavy sample code:
CMSampleBufferRef sampleBuffer;
CMItemCount numSamples = CMSampleBufferGetNumSamples(sampleBuffer);
NSUInteger channelIndex = 0;
CMBlockBufferRef audioBlockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
size_t audioBlockBufferOffset = (channelIndex * numSamples * sizeof(SInt16));
size_t lengthAtOffset = 0;
size_t totalLength = 0;
SInt16 *samples = NULL;
CMBlockBufferGetDataPointer(audioBlockBuffer, audioBlockBufferOffset, &lengthAtOffset, &totalLength, (char **)(&samples));
int numSamplesToRead = 1;
for (int i = 0; i < numSamplesToRead; i++) {
SInt16 subSet[numSamples / numSamplesToRead];
for (int j = 0; j < numSamples / numSamplesToRead; j++)
subSet[j] = samples[(i * (numSamples / numSamplesToRead)) + j];
SInt16 audioSample = [Util maxValueInArray:subSet ofSize:(numSamples / numSamplesToRead)];
double scaledSample = (double) audioSample / SINT16_MAX; // cast before dividing to avoid integer division
// plot waveform using scaledSample
[self updateUI:scaledSample];
}
To show the video you can use the following; place it in your delegate class. It gets an ARGB picture and converts it to a Qt (Nokia Qt) QImage, which you can replace with another image type.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // lock the pixel buffer while its base address is in use
    CVPixelBufferLockBaseAddress(imageBuffer,0);
    SVideoSample sample;
    sample.pImage = (char *)CVPixelBufferGetBaseAddress(imageBuffer);
    sample.bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    sample.width = CVPixelBufferGetWidth(imageBuffer);
    sample.height = CVPixelBufferGetHeight(imageBuffer);
    QImage img((unsigned char *)sample.pImage, sample.width, sample.height, sample.bytesPerRow, QImage::Format_ARGB32);
    self->m_receiver->eventReceived(img);
    CVPixelBufferUnlockBaseAddress(imageBuffer,0);
    [pool drain];
}

Raw Sound playing

I've been working for some time with image formats and I know that an image is an array of pixels (24 or maybe 32 bits each). The question is: how is a sound file represented? To be honest, I'm not even sure what I should be googling for. I would also like to know how you use the data, I mean actually playing the sounds in the file. For an image file you have all sorts of abstract devices to draw an image on (Graphics in Java and C#, HDC in C++ on Win32, etc.). I hope I have been clear enough.
Here's a dandy overview of how .wav is stored. I found it by typing "wave file format" into google.
http://www.sonicspot.com/guide/wavefiles.html
WAV files can also store compressed audio, but I believe most of the time they are not compressed. But the WAV format is designed as a container for a number of options on how that audio is stored.
Here's a snippet of C# code that I found in another question here on Stack Overflow and that I like: it builds a WAV-formatted audio MemoryStream and then plays that stream (without saving it to a file, which many other answers rely on). Saving it to a file can easily be added with one line of code if you want it on disk, but I would think that most of the time that'd be undesirable.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Windows.Forms;
public static void PlayBeep(UInt16 frequency, int msDuration, UInt16 volume = 16383)
{
var mStrm = new MemoryStream();
BinaryWriter writer = new BinaryWriter(mStrm);
const double TAU = 2 * Math.PI;
int formatChunkSize = 16;
int headerSize = 8;
short formatType = 1;
short tracks = 1;
int samplesPerSecond = 44100;
short bitsPerSample = 16;
short frameSize = (short)(tracks * ((bitsPerSample + 7) / 8));
int bytesPerSecond = samplesPerSecond * frameSize;
int waveSize = 4;
int samples = (int)((decimal)samplesPerSecond * msDuration / 1000);
int dataChunkSize = samples * frameSize;
int fileSize = waveSize + headerSize + formatChunkSize + headerSize + dataChunkSize;
// var encoding = new System.Text.UTF8Encoding();
writer.Write(0x46464952); // = encoding.GetBytes("RIFF")
writer.Write(fileSize);
writer.Write(0x45564157); // = encoding.GetBytes("WAVE")
writer.Write(0x20746D66); // = encoding.GetBytes("fmt ")
writer.Write(formatChunkSize);
writer.Write(formatType);
writer.Write(tracks);
writer.Write(samplesPerSecond);
writer.Write(bytesPerSecond);
writer.Write(frameSize);
writer.Write(bitsPerSample);
writer.Write(0x61746164); // = encoding.GetBytes("data")
writer.Write(dataChunkSize);
{
double theta = frequency * TAU / (double)samplesPerSecond;
// 'volume' is UInt16 with range 0 thru Uint16.MaxValue ( = 65 535)
// we need 'amp' to have the range of 0 thru Int16.MaxValue ( = 32 767)
// so we simply set amp = volume / 2
double amp = volume >> 1; // Shifting right by 1 divides by 2
for (int step = 0; step < samples; step++)
{
short s = (short)(amp * Math.Sin(theta * (double)step));
writer.Write(s);
}
}
mStrm.Seek(0, SeekOrigin.Begin);
new System.Media.SoundPlayer(mStrm).Play();
writer.Close();
mStrm.Close();
} // public static void PlayBeep(UInt16 frequency, int msDuration, UInt16 volume = 16383)
This code also gives a bit of insight into the WAV format, and it lets you build your own WAV data directly in C# source code.
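For readers coming from Java, the same header layout can be sketched with a little-endian ByteBuffer. This is a minimal sketch assuming uncompressed 16-bit PCM and the canonical 44-byte header that the C# code above writes; WavBuilder and toWav are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: wraps raw 16-bit PCM samples in a RIFF/WAVE header.
public class WavBuilder {
    public static byte[] toWav(short[] samples, int sampleRate, int channels) {
        int bytesPerSample = 2;                    // 16-bit PCM
        int frameSize = channels * bytesPerSample; // bytes per sample frame
        int dataSize = samples.length * bytesPerSample;
        ByteBuffer buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataSize);                 // size of everything after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                            // format chunk size for PCM
        buf.putShort((short) 1);                   // format type 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(sampleRate * frameSize);        // bytes per second
        buf.putShort((short) frameSize);           // block align
        buf.putShort((short) 16);                  // bits per sample
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataSize);
        for (short s : samples) {
            buf.putShort(s);                       // sample data, little-endian
        }
        return buf.array();
    }
}
```

The resulting byte array can be written to a .wav file as-is, or wrapped in a ByteArrayInputStream and handed to javax.sound.sampled.AudioSystem.getAudioInputStream for playback.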

Resources