libfaac: Queue input is backward in time - audio

I am using libav along with libfaac to encode audio into AAC.
The logic is as follows:
frames[n]
i = 0;
while (there are frames)
{
    cur_frame = frames[i];
    av_encode_audio(cur_frame, ...., &frame_finished);
    if (frame_finished)
    {
        i++;
    }
}
But for a few frames I am getting this annoying warning: "queue input is backward in time!"

The answer is very simple: you are not supposed to pass the same frame to libfaac again, so even if frame_finished is not 1 you should still move on to the next frame.
It should be as follows:
frames[n]
i = 0;
while (there are frames)
{
    cur_frame = frames[i];
    av_encode_audio(cur_frame, ...., &frame_finished);
    i++;
}
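For reference, here is a rough sketch of the same pattern against libav's avcodec_encode_audio2 (assuming a libavcodec version that has it; codec_ctx, n and the AVFrame pointers are illustrative names, not taken from the question). The only point it makes is that the loop index always advances, whether or not the encoder produced a packet for that frame:

// Sketch only: assumes codec_ctx is an opened AAC (libfaac) encoder context
// and frames[] holds n prepared AVFrame pointers.
for (int i = 0; i < n; i++) {
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = NULL;                  // let the encoder allocate the payload
    pkt.size = 0;

    int got_packet = 0;
    if (avcodec_encode_audio2(codec_ctx, &pkt, frames[i], &got_packet) < 0)
        break;                        // encoding error

    if (got_packet) {
        // write/mux pkt here
        av_free_packet(&pkt);
    }
    // i advances on every iteration: the same frame is never queued twice,
    // so the encoder's timestamps keep moving forward in time.
}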

Related

Creating an audio file based on frequencies

I'm using Node.js for a project I'm doing.
The project is to convert words into numbers and then take those numbers and create an audio output.
The audio output should play the numbers as frequencies. For example, I have an array of numbers [913, 250, 352] and I want to play those numbers as frequencies.
I know I can play them in the browser with the Web Audio API or any other third-party package that allows me to do so.
The thing is that I want to create an actual audio file. I tried converting those numbers into notes and then saving them as a MIDI file. I succeeded, but the problem is that the MIDI file takes the frequencies and converts them to the closest note (example: 913 Hz becomes 932.33 Hz, which is note number 81):
// add a track
var array = gematriaArray
var count = 0
var track = midi.addTrack()
var note
for (var i = 0; i < array.length; i++) {
    note = array[i]
    track = track.addNote({
        // here I'm converting the freq -> MIDI note
        midi: ftom(parseInt(note)),
        time: count,
        duration: 3
    })
    count++
}
// write the output
fs.writeFileSync('./public/sounds/' + name + random + '.mid', Buffer.from(midi.toArray()))
I searched the internet but couldn't find anything that helps.
I really want to have a file that the user can download with those numbers played as frequencies. Does anyone know what can be done to get this result?
Thanks in advance.
This function will populate a buffer with floating-point values which represent the height of the raw audio curve for the given frequency:
var pop_audio_buffer_custom = function (number_of_samples, given_freq, samples_per_second) {
    number_of_samples = Math.round(number_of_samples);
    var audio_obj = {};
    var source_buffer = new Float32Array(number_of_samples);
    audio_obj.buffer = source_buffer;
    var incr_theta = (2.0 * Math.PI * given_freq) / samples_per_second;
    var theta = 0.0;
    for (var curr_sample = 0; curr_sample < number_of_samples; curr_sample++) {
        audio_obj.buffer[curr_sample] = Math.sin(theta);
        console.log(audio_obj.buffer[curr_sample], "theta ", theta);
        theta += incr_theta;
    }
    return audio_obj;
}; // pop_audio_buffer_custom

var number_of_samples = 10000; // long enough to be audible
var given_freq = 300;
var samples_per_second = 44100; // CD quality sample rate
var wav_output_filename = "/tmp/wav_output_filename.wav"

var synthesized_obj = {};
synthesized_obj.buffer = pop_audio_buffer_custom(number_of_samples, given_freq, samples_per_second).buffer;
The world of digital audio is nontrivial. The next step, once you have an audio buffer, is to translate the floating-point representation into something that can be stored in bytes (typically 16-bit integers, depending on your choice of bit depth). That 16-bit integer buffer then needs to be written out as a WAV file.
Audio is a wave, sometimes called a time series. When you pound your fist onto the table, the table wobbles up and down, which pushes tiny air molecules in unison with that wobble. This wobbling of air propagates across the room and reaches a microphone diaphragm, or maybe your eardrum, which in turn wobbles in resonance with the wave. If you glued a pencil onto the diaphragm so it wobbled along with it, and slowly slid a strip of paper along the lead tip of the pencil, you would see a curve being written onto that paper strip. This is the audio curve. An audio sample is just the height of that curve at an instant of time. If you repeatedly wrote down this curve height X times per second at a constant rate, you would have a list of data points of raw audio (this is what the function above creates). So a given audio sample is simply the value of the audio curve's height at a given instant in time. Since computers are discrete rather than continuous, they cannot handle the entire pencil-drawn curve and only care about this list of instantaneously measured curve height values. Those are audio samples.
The 32-bit floating-point buffer above can be fed into the following function to return a 16-bit integer buffer:
var convert_32_bit_float_into_signed_16_bit_int_lossy = function (input_32_bit_buffer) {
    // this method is LOSSY - intended as a preliminary step when saving audio into WAV format files
    // output is a byte array where each 16 bit output value
    // is spread across two bytes in little endian ordering
    var size_source_buffer = input_32_bit_buffer.length;
    var buffer_byte_array = new Int8Array(size_source_buffer * 2); // 8-bit twos complement signed integers
    var value_16_bit_signed_int;
    var index_byte = 0;
    console.log("size_source_buffer", size_source_buffer);
    for (var index = 0; index < size_source_buffer; index++) {
        value_16_bit_signed_int = ~~((0 < input_32_bit_buffer[index]) ? input_32_bit_buffer[index] * 0x7FFF :
            input_32_bit_buffer[index] * 0x8000);
        buffer_byte_array[index_byte] = value_16_bit_signed_int & 0xFF; // bitwise AND to pluck out only the least significant byte
        var byte_two_of_two = (value_16_bit_signed_int >> 8); // bit shift down to access the most significant byte
        buffer_byte_array[index_byte + 1] = byte_two_of_two;
        index_byte += 2;
    }
    // ---
    return buffer_byte_array;
};
The next step is to persist the 16-bit integer buffer above into a WAV file. I suggest you use one of the many Node.js libraries for that (or, even better, write your own, as it is only about two pages of code ;-)
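If you do roll your own, the core of it is the canonical 44-byte RIFF/WAVE header followed by the raw little-endian samples. As a language-agnostic reference only, here is a minimal sketch in C (assuming 16-bit mono PCM and a little-endian host; the same byte layout can be written from Node.js with Buffer methods):

/* Minimal sketch of a canonical 44-byte WAV header for 16-bit mono PCM.
 * Assumes a little-endian host, so multi-byte fields can be written directly. */
#include <stdint.h>
#include <stdio.h>

static void write_wav_header(FILE *f, uint32_t sample_rate, uint32_t num_samples) {
    uint32_t data_bytes  = num_samples * 2;   /* 2 bytes per 16-bit mono sample */
    uint32_t riff_size   = 36 + data_bytes;   /* file size minus the 8-byte RIFF preamble */
    uint32_t fmt_size    = 16;                /* size of the "fmt " chunk body */
    uint16_t audio_fmt   = 1;                 /* 1 = uncompressed PCM */
    uint16_t channels    = 1;
    uint32_t byte_rate   = sample_rate * 2;   /* sample_rate * channels * bytes_per_sample */
    uint16_t block_align = 2;                 /* channels * bytes_per_sample */
    uint16_t bits        = 16;

    fwrite("RIFF", 1, 4, f); fwrite(&riff_size, 4, 1, f); fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f); fwrite(&fmt_size, 4, 1, f);
    fwrite(&audio_fmt, 2, 1, f); fwrite(&channels, 2, 1, f);
    fwrite(&sample_rate, 4, 1, f); fwrite(&byte_rate, 4, 1, f);
    fwrite(&block_align, 2, 1, f); fwrite(&bits, 2, 1, f);
    fwrite("data", 1, 4, f); fwrite(&data_bytes, 4, 1, f);
    /* ...followed by the num_samples little-endian int16 samples themselves */
}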

Using the pa_simple API efficiently

I'm playing around with the pa_simple API to get a better understanding of digital audio. I've written some code to feed custom-generated audio data to the PulseAudio server:
char buff[2];
while (playing) {
    timePassed += periodMicros;
    int val = userFunc(timePassed/1000000.0);
    buff[0] = (val >> 0) & 0xff;
    buff[1] = (val >> 1) & 0xff;
    int error;
    pa_simple_write(s, buff, 2, &error);
}
func generates a simple sine wave like this:
uint16_t func(double time)
{
    double op = sin(2*3.14*freq * time);
    return op * 100;
}
It produces a note as expected, but eats up 50% of the CPU on my machine because of the unthrottled while loop in the first code block. So I tried fixing it like this:
void _mainLoop() {
    char buff[2*bufSize];
    while (playing) {
        for ( int i = 0; i < bufSize; i++ ) {
            uint8_t word = userFunc(timePassed/1000000.0);
            timePassed += periodMicros;
            buff[i*2+0] = (word >> 0) & 0xff;
            buff[i*2+1] = (word >> 1) & 0xff;
        }
        pa_simple_write(s, buff, 2*bufSize, &error);
        this_thread::sleep_for(chrono::microseconds(bufSize*periodMicros));
    }
}
It fixes the CPU issue, but the audio it generates is no longer a pure tone. It sounds a bit higher pitched, with a retro feel to it.
I want to know the correct way to use the pa_simple API without my code eating up the CPU.
Note that the sound generated by the second version isn't fixed even if I change bufSize to 1 and remove the sleep line.
Edit:
This is how I've initialized the PulseAudio connection:
pa_sample_spec ss = {
    .format = PA_SAMPLE_S16LE,
    .channels = 1,
    .rate = 44100
};
s = pa_simple_new(
    NULL,
    "SYNTH",
    PA_STREAM_PLAYBACK,
    NULL,
    "Music",
    &ss,
    NULL,
    NULL,
    NULL
);
It was a bug in the program. I was storing the output of userFunc in an unsigned type, so the negative values were being cut off. Changing that to int and removing the sleep call fixed the problem, since the pa_simple API is synchronous: pa_simple_write blocks until the server has consumed the data, so it paces the loop by itself.
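For completeness, a minimal sketch of what the corrected loop can look like (assuming the PA_SAMPLE_S16LE, 1-channel, 44100 Hz stream s created above; the tone amplitude, buffer size and function name are illustrative):

/* Sketch only: signed 16-bit samples, paced entirely by the blocking write.
 * Assumes the S16LE / mono / 44100 Hz stream "s" from above and a
 * little-endian machine, so an int16_t buffer already matches S16LE. */
#include <math.h>
#include <stdint.h>
#include <pulse/simple.h>

static void play_tone(pa_simple *s, double freq, double seconds) {
    const int rate = 44100;
    int16_t buf[1024];
    double t = 0.0;
    long remaining = (long)(seconds * rate);

    while (remaining > 0) {
        for (int i = 0; i < 1024; i++) {
            buf[i] = (int16_t)(sin(2.0 * M_PI * freq * t) * 3000); /* signed, so the negative half-wave survives */
            t += 1.0 / rate;
        }
        int error;
        /* pa_simple_write blocks until PulseAudio has taken the data,
         * so no sleep is needed and CPU usage stays low. */
        if (pa_simple_write(s, buf, sizeof(buf), &error) < 0)
            return;
        remaining -= 1024;
    }
}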

Play 2 different frequencies alternately in Java

I am a newbie with Java Sound. I want to play 2 different frequencies alternately, for 1 second each, in a loop for some specified time.
For example, if I have the 2 frequencies 440 Hz and 16000 Hz and the time period is 10 seconds, then 440 Hz plays on every 'even' second and 16000 Hz on every 'odd' second, i.e. 5 seconds each in total.
I have learned a few things from some examples, and with their help I have also written a program that plays a single user-specified frequency for a user-specified time.
I will really appreciate it if someone can help me out with this.
Thanks.
I am also attaching that single-frequency code for reference.
import java.nio.ByteBuffer;
import java.util.Scanner;
import javax.sound.sampled.*;

public class Audio {

    public static void main(String[] args) throws InterruptedException, LineUnavailableException {
        final int SAMPLING_RATE = 44100; // Audio sampling rate
        final int SAMPLE_SIZE = 2;       // Audio sample size in bytes

        Scanner in = new Scanner(System.in);
        int time = in.nextInt();         // Time specified by user in seconds

        SourceDataLine line;
        double fFreq = in.nextInt();     // Frequency of sine wave in hz

        // Position through the sine wave as a percentage (i.e. 0 to 1 is 0 to 2*PI)
        double fCyclePosition = 0;

        // Open up audio output, using 44100hz sampling rate, 16 bit samples, mono, and big
        // endian byte ordering
        AudioFormat format = new AudioFormat(SAMPLING_RATE, 16, 1, true, true);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("Line matching " + info + " is not supported.");
            throw new LineUnavailableException();
        }
        line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // Make our buffer size match audio system's buffer
        ByteBuffer cBuf = ByteBuffer.allocate(line.getBufferSize());

        int ctSamplesTotal = SAMPLING_RATE * time; // Output for roughly user specified time in seconds

        // On each pass main loop fills the available free space in the audio buffer
        // Main loop creates audio samples for sine wave, runs until we tell the thread to exit
        // Each sample is spaced 1/SAMPLING_RATE apart in time
        while (ctSamplesTotal > 0) {
            double fCycleInc = fFreq / SAMPLING_RATE; // Fraction of cycle between samples

            cBuf.clear(); // Discard samples from previous pass

            // Figure out how many samples we can add
            int ctSamplesThisPass = line.available() / SAMPLE_SIZE;
            for (int i = 0; i < ctSamplesThisPass; i++) {
                cBuf.putShort((short) (Short.MAX_VALUE * Math.sin(2 * Math.PI * fCyclePosition)));
                fCyclePosition += fCycleInc;
                if (fCyclePosition > 1) {
                    fCyclePosition -= 1;
                }
            }

            // Write sine samples to the line buffer. If the audio buffer is full, this will
            // block until there is room (we never write more samples than buffer will hold)
            line.write(cBuf.array(), 0, cBuf.position());
            ctSamplesTotal -= ctSamplesThisPass; // Update total number of samples written

            // Wait until the buffer is at least half empty before we add more
            while (line.getBufferSize() / 2 < line.available()) {
                Thread.sleep(1);
            }
        }

        // Done playing the whole waveform, now wait until the queued samples finish
        // playing, then clean up and exit
        line.drain();
        line.close();
    }
}
Your best bet is probably creating Clips, as shown in the sample code below.
That said, the MHz range is typically not audible, so it looks like you have a typo in your question. If it's not a typo, you will run into issues with Mr. Nyquist.
Another hint: nobody uses Hungarian notation in Java.
import javax.sound.sampled.*;
import java.nio.ByteBuffer;
import java.nio.ShortBuffer;

public class AlternatingTones {

    public static void main(final String[] args) throws LineUnavailableException, InterruptedException {
        final Clip clip0 = createOneSecondClip(440f);
        final Clip clip1 = createOneSecondClip(16000f);

        clip0.addLineListener(event -> {
            if (event.getType() == LineEvent.Type.STOP) {
                clip1.setFramePosition(0);
                clip1.start();
            }
        });
        clip1.addLineListener(event -> {
            if (event.getType() == LineEvent.Type.STOP) {
                clip0.setFramePosition(0);
                clip0.start();
            }
        });
        clip0.start();

        // prevent JVM from exiting
        Thread.sleep(10000000);
    }

    private static Clip createOneSecondClip(final float frequency) throws LineUnavailableException {
        final Clip clip = AudioSystem.getClip();
        final AudioFormat format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100f, 16, 1, 2, 44100, true);
        final ByteBuffer buffer = ByteBuffer.allocate(44100 * format.getFrameSize());
        final ShortBuffer shortBuffer = buffer.asShortBuffer();
        final float cycleInc = frequency / format.getFrameRate();
        float cyclePosition = 0f;
        while (shortBuffer.hasRemaining()) {
            shortBuffer.put((short) (Short.MAX_VALUE * Math.sin(2 * Math.PI * cyclePosition)));
            cyclePosition += cycleInc;
            if (cyclePosition > 1) {
                cyclePosition -= 1;
            }
        }
        clip.open(format, buffer.array(), 0, buffer.capacity());
        return clip;
    }
}
The method I would use would be to count frames while outputting to a SourceDataLine. When you have written one second's worth of frames, switch frequencies. This will give much better timing accuracy than attempting to fiddle with Clips.
I'm unclear whether the code you are showing is something you wrote or copied-and-pasted. If you have a question about why it doesn't work, I'm happy to help if you show what you tried and what errors or exceptions were generated.
When outputting to a SourceDataLine, there will have to be a step where you convert the short value (-32768..+32767) to two bytes as per the 16-bit encoding specified in the audio format you have. I don't see where this is being done in your code. [EDIT: I can see now that the putShort() method does this, though it only works for big-endian, not the more common little-endian.]
Have you looked over the Java Tutorial's Sound Trail?

How do I swap stereo channels in raw PCM audio data on OS X?

I'm writing audio from an external decoding library on OS X to an AIFF file, and I am able to swap the endianness of the data with OSSwapInt32().
The resulting AIFF file (16-bit PCM stereo) does play, but the left and right channels are swapped.
Would there be any way to swap the channels as I am writing each buffer?
Here is the relevant loop:
do
{
    xmp_get_frame_info(writer_context, &writer_info);
    if (writer_info.loop_count > 0)
        break;

    writeModBuffer.mBuffers[0].mDataByteSize = writer_info.buffer_size;
    writeModBuffer.mBuffers[0].mNumberChannels = inputFormat.mChannelsPerFrame;

    // Set up our buffer to do the endianness swap
    void *new_buffer;
    new_buffer = malloc((writer_info.buffer_size) * inputFormat.mBytesPerFrame);
    int *ourBuffer = writer_info.buffer;
    int *ourNewBuffer = new_buffer;

    memset(new_buffer, 0, writer_info.buffer_size);

    int i;
    for (i = 0; i <= writer_info.buffer_size; i++)
    {
        ourNewBuffer[i] = OSSwapInt32(ourBuffer[i]);
    };

    writeModBuffer.mBuffers[0].mData = ourNewBuffer;
    frame_size = writer_info.buffer_size / inputFormat.mBytesPerFrame;
    err = ExtAudioFileWrite(writeModRef, frame_size, &writeModBuffer);
} while (xmp_play_frame(writer_context) == 0);
This solution is very specific to 2-channel audio. I chose to do it in the same loop where you're changing the byte ordering, to avoid an extra pass. I go through the loop half as many times and process two samples per iteration. The samples are interleaved, so I copy from odd sample indexes into even sample indexes and vice versa.
for (i = 0; i < writer_info.buffer_size/2; i++)
{
    ourNewBuffer[i*2] = OSSwapInt32(ourBuffer[i*2 + 1]);
    ourNewBuffer[i*2 + 1] = OSSwapInt32(ourBuffer[i*2]);
}
An alternative is to use a table lookup for channel mapping.
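As a rough illustration of that alternative (a sketch only, reusing ourBuffer/ourNewBuffer from above; channel_map and num_frames are illustrative names), a lookup table makes the remapping generic instead of hard-coding the swap:

// Sketch: remap channels through a lookup table while doing the endian swap.
// Output channel ch takes its data from input channel channel_map[ch];
// for a plain left/right swap of 2-channel audio the table is { 1, 0 }.
static const int channel_map[2] = { 1, 0 };

for (int frame = 0; frame < num_frames; frame++)   // num_frames = total samples / 2 for stereo
{
    for (int ch = 0; ch < 2; ch++)
    {
        ourNewBuffer[frame * 2 + ch] =
            OSSwapInt32(ourBuffer[frame * 2 + channel_map[ch]]);
    }
}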

Encoding video only FLV

I am trying to generate a video-only FLV file. I am using:
libx264 + ffmpeg
30 fps ( fixed )
playback is done using VLC 2.0.1 and flowplayer
When playing the FLV, the frame rate seems to be ~1 frame per second. This is the way I configure ffmpeg:
AVOutputFormat* fmtOutput = av_oformat_next(0);
while ((0 != fmtOutput) && (0 != strcmp(fmtOutput->name, "flv")))
    fmtOutput = av_oformat_next(fmtOutput);

m_pFmtCtxOutput = avformat_alloc_context();
m_pFmtCtxOutput->oformat = fmtOutput;

AVStream* pOutVideoStream = av_new_stream(m_pFmtCtxOutput, pInVideoStream->id);
AVCodec* videoEncoder = avcodec_find_encoder(CODEC_ID_H264);

pOutVideoStream->codec->width = 640;
pOutVideoStream->codec->height = 480;
pOutVideoStream->codec->level = 30;
pOutVideoStream->codec->pix_fmt = PIX_FMT_YUV420P;
pOutVideoStream->codec->bit_rate = 3000000;
pOutVideoStream->cur_dts = 0;
pOutVideoStream->first_dts = 0;
pOutVideoStream->index = 0;
pOutVideoStream->avg_frame_rate = (AVRational){ 30, 1 };
pOutVideoStream->time_base =
    pOutVideoStream->codec->time_base = (AVRational){ 1, 30000 };
pOutVideoStream->codec->gop_size = 30;

%% Some specific libx264 settings %%

m_dVideoStep = 1000; // packet dts/pts is incremented by this amount each frame

pOutVideoStream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
avcodec_open(pOutVideoStream->codec, videoEncoder);
The resulting file seems OK, with the exception of the playback frame rate.
Bearing in mind that:
pOutVideoStream->avg_frame_rate = (AVRational){ 30, 1 };
pOutVideoStream->time_base = (AVRational){ 1, 30000 };
pOutVideoStream->codec->time_base= (AVRational){ 1, 30000 };
For each frame I increment the dts/pts by 1000.
What am I doing wrong here? Why does the file play back choppy (~1 fps)?
Any help will be appreciated.
Nadav at Sophin
Stepping through the FLV muxer code with a debugger, I found that the ffmpeg implementation supports PTS at millisecond resolution only, that is, time_base = (AVRational){ 1, 1000 }.
Also, AVStream::r_frame_rate must be set in order for the FLV muxer to properly resolve the frame rate.
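In other words, something along these lines should make the FLV play at the intended rate (a sketch against the same legacy API as above; the packet-timestamp line is illustrative):

// Give the stream a millisecond time base and an explicit frame rate,
// which is what the flv muxer expects.
pOutVideoStream->r_frame_rate = (AVRational){ 30, 1 };   // tell the muxer the real frame rate
pOutVideoStream->time_base    = (AVRational){ 1, 1000 }; // FLV timestamps are in milliseconds

// With a 1/1000 time base, consecutive frames at a fixed 30 fps are ~33 ticks apart,
// so the per-frame dts/pts step becomes 1000/30 instead of 1000.
pkt.pts = pkt.dts = frame_index * 1000 / 30;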
