I would like to use the PortAudio library to play audio data. This audio data comes from UDP packets.
I saw there is a Pa_OpenDefaultStream() function (and Pa_OpenStream(), which is pretty similar) to open a stream:
PaStream *stream;
PaError err;
/* Open an audio I/O stream. */
err = Pa_OpenDefaultStream( &stream,
                            0,          /* no input channels */
                            2,          /* stereo output */
                            paFloat32,  /* 32 bit floating point output */
                            SAMPLE_RATE,
                            256,        /* frames per buffer, i.e. the number
                                           of sample frames that PortAudio will
                                           request from the callback. Many apps
                                           may want to use
                                           paFramesPerBufferUnspecified, which
                                           tells PortAudio to pick the best,
                                           possibly changing, buffer size. */
                            patestCallback, /* this is your callback function */
                            &data );    /* This is a pointer that will be passed to
                                           your callback */
I guess I have to use it to play my packets, but I don't know how to use it:
What is the first parameter?
Why do I have to define a callback function?
Here is a link to the PortAudio documentation : http://www.portaudio.com/trac/
Any help would be greatly appreciated :)
Thanks.
The first parameter is the address of a PaStream pointer; Pa_OpenDefaultStream() sets it to point at the newly opened input/output stream. The audio data will be read from / written to this stream.
You need to write a callback function that the PortAudio library will call whenever it needs to read or write audio to / from your PC. Any other audio processing you want to do (e.g. DSP) will be done there as well. A simple callback function would just copy the input to the output, for streaming I/O. If you're having trouble using callbacks, use the Blocking API instead; it may be easier to understand.
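For illustration, a minimal pass-through callback might look something like this - a sketch only, not code from the PortAudio distribution, and it assumes the stream was opened with stereo float32 input as well as output:

static int passThroughCallback( const void *inputBuffer, void *outputBuffer,
                                unsigned long framesPerBuffer,
                                const PaStreamCallbackTimeInfo *timeInfo,
                                PaStreamCallbackFlags statusFlags,
                                void *userData )
{
    const float *in  = (const float *) inputBuffer;
    float       *out = (float *) outputBuffer;

    for( unsigned long i = 0; i < framesPerBuffer; i++ )
    {
        if( in == NULL )        /* no input available */
        {
            *out++ = 0.0f;      /* left  */
            *out++ = 0.0f;      /* right */
        }
        else
        {
            *out++ = *in++;     /* left  */
            *out++ = *in++;     /* right */
        }
    }
    return paContinue;          /* keep the stream running */
}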
Compile and run the examples for details (e.g. patest_read_record.c); there's lots of info there.
I'm going to create a voice chat. My backend server runs on Node.js and almost every connection between client and server uses socket.io.
Are websockets appropriate for my use case? I prefer communication client -> server -> clients over P2P because I expect as many as 1000 clients connected to one room.
If websockets are OK, then what is the best way to send an AudioBuffer to the server and play it back on the other clients? Currently I do it like this:
navigator.getUserMedia({audio: true}, initializeRecorder, errorCallback);
function initializeRecorder(MediaStream) {
    var audioCtx = new window.AudioContext();
    var sourceNode = audioCtx.createMediaStreamSource(MediaStream);

    var recorder = audioCtx.createScriptProcessor(4096, 1, 1);
    recorder.onaudioprocess = recorderProcess;

    sourceNode.connect(recorder);
    recorder.connect(audioCtx.destination);
}

function recorderProcess(e) {
    var left = e.inputBuffer.getChannelData(0);
    io.socket.post('url', left);
}
But after the other clients receive the data, I don't know how to play back this audio stream from the buffer arrays.
EDIT
1) Why isn't the onaudioprocess method fired if I don't connect the ScriptProcessor (the recorder variable) to the destination?
Documentation info - "although you don't have to provide a destination if you, say, just want to visualise some audio data" - Web Audio concepts and usage
2) Why don't I hear anything from my speakers after connecting the recorder variable to the destination, while I do if I connect the sourceNode variable directly to the destination?
Even when the onaudioprocess method doesn't do anything.
Can anyone help?
I think web sockets are appropriate here. Just make sure that you are using binary transfer. (I use BinaryJS for this myself, allowing me to open up arbitrary streams to the server.)
Getting the data from user media capture is pretty straightforward. What you have is a good start. The tricky part is playback. You will have to buffer the data and play it back using your own script processing node.
This isn't too hard if you use PCM everywhere... the raw samples you get from the Web Audio API. The downside of this is that there is a lot of overhead shoving 32-bit floating point PCM around. This uses a ton of bandwidth which isn't needed for speech alone.
I think the easiest thing to do in your case is to reduce the bit depth to an arbitrary bit depth that works well for your application. 8-bit samples are plenty for discernible speech and will take up quite a bit less bandwidth. By using PCM, you avoid having to implement a codec in JS and then having to deal with the buffering and framing of data for that codec.
To summarize, once you have the raw sample data in a typed array in your script processing node, write something to convert those samples from 32-bit float to 8-bit signed integers. Send these buffers to your server, in the same size chunks as they come in, over your binary web socket. The server will then send these to all the other clients on their binary web sockets. When the clients receive audio data, they will buffer it for whatever amount of time you choose, to prevent dropping audio. Your client code will convert those 8-bit samples back to 32-bit float and put them in a playback buffer. Your script processing node will pick up whatever is in the buffer and start playback as data is available.
First off, I'm new to the world of Go and lower level programming, so bear with me... :)
So what I'm trying to do is this: read a .wav file with the libsndfile binding for Go and play it with PortAudio.
I cannot find any examples of this, and I clearly lack basic knowledge about pointers, streams and buffers to make it happen. Here is my take on it so far; I've tried to read the docs and the few examples I've been able to find, and put the pieces together. I think I'm able to open the file and the stream, but I don't get how to connect the two.
package main

import (
    "code.google.com/p/portaudio-go/portaudio"
    "fmt"
    "github.com/mkb218/gosndfile/sndfile"
    "math/rand"
)

func main() {
    portaudio.Initialize()
    defer portaudio.Terminate()

    // Open file with sndfile
    var i sndfile.Info
    file, fileErr := sndfile.Open("hello.wav", sndfile.Read, &i)
    fmt.Println("File: ", file, fileErr)

    // Open portaudio stream
    h, err := portaudio.DefaultHostApi()
    stream, err := portaudio.OpenStream(portaudio.HighLatencyParameters(nil, h.DefaultOutputDevice), func(out []int32) {
        for i := range out {
            out[i] = int32(rand.Uint32())
        }
    })
    defer stream.Close()
    fmt.Println("Stream: ", stream, err)

    // Play portaudio stream
    // ....
    framesOut := make([]int32, 32000)
    data, err := file.ReadFrames(framesOut)
    fmt.Println("Data: ", data, err)
}
I would be ever so grateful for a working example and some tips/links for beginners. If you have a solution that involves other libraries than the two mentioned above, that's ok too.
Aha, audio programming! Welcome to the world of soft-realtime computing :)
Think about the flow of data: a bunch of bits in a .wav file on disk are read by your program and sent to the operating system which hands them off to a sound card where they are converted to an analog signal that drives speakers generating the sound waves that finally reach your ears.
This flow is very sensitive to time fluctuations. If it is held up at any point you will perceive noticeable and sometimes jarring artifacts in the final sound.
Generally the OS/sound card are solid and well tested - most audio artifacts are caused by us developers writing shoddy application code ;)
Libraries such as PortAudio help us out by taking care of some of the thread-priority black magic and making the scheduling approachable. Essentially it says: "OK, I'm going to start this audio stream, and every X milliseconds, when I need the next bit of sample data, I'll call back whatever function you provide."
In this case you've provided a function that fills the output buffer with random data. To playback the wave file instead, you need to change this callback function.
But! You don't want to be doing I/O in the callback. Reading some bytes off disk could take tens of milliseconds, and portaudio needs that sample data now so that it gets to the sound card in time. Similarly, you want to avoid acquiring locks or any other operation that could potentially block in the audio callback.
For this example it's probably simplest to load the samples before starting the stream, and use something like this for the callback:
isample := 0
callback := func(out []int32) {
    for i := 0; i < len(out); i++ {
        out[i] = framesOut[(isample + i) % len(framesOut)]
    }
    isample += len(out)
}
Note that % len(framesOut) will cause the loaded 32000 samples to loop over and over - PortAudio will keep the stream running until you tell it to stop.
Actually, you need to tell it to start too! After opening it call stream.Start() and add a sleep after that or your program is likely to exit before it gets a chance to play anything.
Finally, this also assumes that the sample format in the wave file is the same as the sample format you requested from PortAudio. If the formats don't match you will still hear something, but it probably won't sound pretty! Anyway sample formats are a whole 'nother question.
Of course, loading all your sample data up front so you can refer to it within the audio callback isn't a fantastic approach once you get past hello-world stuff. Generally you use a ring buffer or something similar to pass sample data to the audio callback.
PortAudio provides another API (the "blocking" API) that does this for you. For portaudio-go, this is invoked by passing a slice into OpenStream instead of a function. When using the blocking API you pump sample data into the stream by (a) filling the slice you passed into OpenStream and (b) calling stream.Write().
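For reference, the underlying C API works the same way: passing NULL instead of a callback to Pa_OpenDefaultStream() opens the stream in blocking mode, and you then feed it with Pa_WriteStream(). A rough sketch (error handling and Pa_Initialize()/Pa_Terminate() omitted; moreAudioAvailable() and fillBufferFromFile() are made-up placeholders, and the sample rate / buffer size are arbitrary):

PaStream *stream;
/* NULL callback => blocking read/write stream */
Pa_OpenDefaultStream( &stream, 0, 2, paInt32, 44100, 1024, NULL, NULL );
Pa_StartStream( stream );

int32_t buffer[1024 * 2];                     /* one period of stereo frames   */
while( moreAudioAvailable() )                 /* placeholder "is there data?"  */
{
    fillBufferFromFile( buffer, 1024 );       /* placeholder file reader       */
    Pa_WriteStream( stream, buffer, 1024 );   /* blocks until the data is sent */
}

Pa_StopStream( stream );
Pa_CloseStream( stream );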
This is much longer than I intended so I better leave it there. HTH.
I'm trying to capture user's audio input from the browser. I have done it with WAV but the files are really big. A friend of mine told me that OGG files are much smaller.
Does anyone know how to convert WAV to OGG?
I also have the raw data buffer, so I don't really need to convert; I just need the OGG encoder.
Here's the WAV encoder from Matt Diamond's RecorderJS:
function encodeWAV(samples){
    var buffer = new ArrayBuffer(44 + samples.length * 2);
    var view = new DataView(buffer);

    /* RIFF identifier */
    writeString(view, 0, 'RIFF');
    /* file length */
    view.setUint32(4, 32 + samples.length * 2, true);
    /* RIFF type */
    writeString(view, 8, 'WAVE');
    /* format chunk identifier */
    writeString(view, 12, 'fmt ');
    /* format chunk length */
    view.setUint32(16, 16, true);
    /* sample format (raw) */
    view.setUint16(20, 1, true);
    /* channel count */
    view.setUint16(22, 2, true);
    /* sample rate */
    view.setUint32(24, sampleRate, true);
    /* byte rate (sample rate * block align) */
    view.setUint32(28, sampleRate * 4, true);
    /* block align (channel count * bytes per sample) */
    view.setUint16(32, 4, true);
    /* bits per sample */
    view.setUint16(34, 16, true);
    /* data chunk identifier */
    writeString(view, 36, 'data');
    /* data chunk length */
    view.setUint32(40, samples.length * 2, true);

    floatTo16BitPCM(view, 44, samples);

    return view;
}
Is there one for OGG?
The Web Audio spec is actually intended to allow exactly this kind of functionality, but is just not close to fulfilling that purpose yet:
This specification describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported.
Here's a statement on the current w3c audio spec draft, which makes the following points:
While processing audio in JavaScript, it is extremely challenging to get reliable, glitch-free audio while achieving a reasonably low-latency, especially under heavy processor load.
JavaScript is very much slower than heavily optimized C++ code and is not able to take advantage of SSE optimizations and multi-threading which is critical for getting good performance on today's processors. Optimized native code can be on the order of twenty times faster for processing FFTs as compared with JavaScript. It is not efficient enough for heavy-duty processing of audio such as convolution and 3D spatialization of large numbers of audio sources.
setInterval() and XHR handling will steal time from the audio processing. In a reasonably complex game, some JavaScript resources will be needed for game physics and graphics. This creates challenges because audio rendering is deadline driven (to avoid glitches and get low enough latency).
JavaScript does not run in a real-time processing thread and thus can be pre-empted by many other threads running on the system.
Garbage Collection (and autorelease pools on Mac OS X) can cause unpredictable delay on a JavaScript thread.
Multiple JavaScript contexts can be running on the main thread, stealing time from the context doing the processing.
Other code (other than JavaScript) such as page rendering runs on the main thread.
Locks can be taken and memory is allocated on the JavaScript thread. This can cause additional thread preemption.
The problems are even more difficult with today's generation of mobile devices which have processors with relatively poor performance and power consumption / battery-life issues.
ECMAScript (js) is really fast for a lot of things, and is getting faster all the time depending on what engine is interpreting the code. For something as intensive as audio processing however, you would be much better off using a low-level tool that's compiled to optimize resources specific to the task. I'm currently using ffmpeg on the server side to accomplish something similar.
I know that it is really inefficient to have to send a wav file across an internet connection just to obtain a more compact .ogg file, but that's the current state of things with the web audio api. To do any client-side processing the user would have to explicitly give access to the local file system and execution privileges for the file to make the conversion.
Edit: You could also use Google's Native Client if you don't mind limiting your users to Chrome. It seems like very promising technology that loads in a sandbox and achieves speeds nearly as good as natively executed code. I'm assuming that there will be similar implementations in other browsers at some point.
This question has been driving me crazy because I haven't seen anyone come up with a really clean solution, so I came up with my own library:
https://github.com/sb2702/audioRecord.js
Basic usage
audioRecord.requestDevice(function(recorder){
    // Request user permission for microphone access

    recorder.record();   // Start recording
    recorder.stop();     // Stop recording

    recorder.exportOGG(function(oggBlob){
        // Here's your OGG file
    });

    recorder.exportMP3(function(mp3Blob){
        // Here's your mp3 file
    });

    recorder.exportWAV(function(wavBlob){
        // Here's your WAV file
    });
});
Using the continuous mp3 encoding option, it's entirely reasonable to capture and encode audio input entirely in the browser, cross-browser, without a server or native code.
DEMO: http://sb2702.github.io/audioRecord.js/
It's still rough around the edges, but I'll try to clean / fix it up.
NEW: Derivative work of Matt Diamond's recorderjs recording to Ogg-Opus
To encode a file to Ogg-Opus in whole (i.e. not streaming) in a browser without special extensions, one may use an Emscripten port of opus-tools/opusenc (demo). It comes with decoding support for WAV, AIFF and a couple of other formats, and a built-in resampler.
An Ogg-Vorbis encoder is also available.
Since the questioner is primarily out for audio compression, they might also be interested in mp3 encoding using lame.
OK, this might not be a direct answer, as it does not say how to convert .wav into .ogg. Then again, why bother with the conversion when you can record the .ogg file directly? This depends on the MediaRecorder API, but browsers which support Web Audio usually have it too (Firefox 25+ and Chrome 47+).
github.io Demo
Github Code Source
I would like to write a VPI/PLI interface which will open audio files (i.e. wav, aiff, etc.) and present the data to a Verilog simulator. I am using Icarus at the moment and wish to use libsndfile to handle input file formats and data type conversion.
I am not quite sure what to use in the C code... I have looked at IEEE 1364-2001 and am still confused about which functions I am supposed to use.
Ideally I'd like to have a Verilog module with a data port (serial or parallel), a clock input and a start/stop pin. I'd like to implement two modules: one for playback from a file, and another that would record the output from a filter under test.
Can I do it all in C and just instantiate the module in my testbench, or will I have to write a function (say $read_audio_data) and a wrapper module to call it on each clock pulse? Hm, or maybe I need to create the module and then get a handle for it and pass the value/vector to the handle somehow?
I am not too concerned about how file names will be set, as I probably wouldn't do it from the Verilog code anyway.
I will probably stick to 24-bit integer samples for the time being, and libsndfile is supposed to handle the conversion quite nicely.
Perhaps I'll stick to serial for now (maybe even do it in an I2S-like fashion) and de-serialise it in Verilog if needed.
I have also looked at an Icarus plug-in which implements a video camera that reads PNG files, though there are many more aspects to image processing than there are to audio. Hence that code looks a bit overcomplicated to me at the moment - and I didn't manage to get it to run either.
I suggest approaching it like this:
figure out your C/Verilog interface
implement the audio file access with that interface in mind, but not worrying about VPI
implement the C/Verilog glue using VPI
The interface can probably be pretty simple: one function to open the audio file and specify any necessary parameters (sample size, big/little endian, etc...), and another function that returns the next sample. If you need to support reading from multiple files in the same simulation, you'll need to pass some sort of handle to the PLI functions to identify which file you're reading from.
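As a very rough sketch of that file-access layer, built directly on libsndfile with no VPI involved (the helper names and the fixed-size scratch buffer are just placeholders):

#include <sndfile.h>

static SNDFILE *g_file = NULL;
static SF_INFO  g_info;

/* Placeholder helper: open the audio file, return 0 on success. */
int audio_open (const char *filename)
{
    g_info.format = 0;                        /* let libsndfile detect the format */
    g_file = sf_open (filename, SFM_READ, &g_info);
    return (g_file != NULL) ? 0 : -1;
}

/* Placeholder helper: fetch the next sample of channel 0 as a 24-bit value.
   sf_readf_int() returns left-justified 32-bit ints, hence the shift. */
int audio_next_sample (int *sample)
{
    int frame[16];                            /* assumes <= 16 channels */
    if (sf_readf_int (g_file, frame, 1) != 1)
        return -1;                            /* end of file */
    *sample = frame[0] >> 8;
    return 0;
}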
The Verilog usage could be as simple as:
initial $OpenAudioFile ("filename");
always @(posedge clk)
    audio_data <= $ReadSample;
The image-vpi sample looks like a reasonable example to start from. The basic idioms to use in the C code are:
Argument access
// Get a handle to the system task/function call that invoked your PLI routine
vpiHandle tf_obj = vpi_handle (vpiSysTfCall, NULL);
// Get an iterator for the arguments to your PLI routine
vpiHandle arg_iter = vpi_iterate (vpiArgument, tf_obj);
// Iterate through the arguments
vpiHandle arg_obj;
arg_obj = vpi_scan (arg_iter);
// do something with the first argument
arg_obj = vpi_scan (arg_iter);
// do something with the second argument
Retrieving values from Verilog
s_vpi_value v;
v.format = vpiIntVal;
vpi_get_value (handle, &v);
// value is in v.value.integer
Writing values to Verilog
s_vpi_value v;
v.format = vpiIntVal;
v.value.integer = 0x1234;
vpi_put_value (handle, &v, NULL, vpiNoDelay);
To read or write values larger than 32 bits, you will need to use vpiVectorVal instead of vpiIntVal, and de/encode an array of s_vpi_vecval structures (one element per 32 bits of vector).
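For example, putting a 24-bit sample onto a handle might look roughly like this (a sketch; aval holds the bit values, bval flags any x/z bits, and a single s_vpi_vecval covers up to 32 bits):

s_vpi_value  v;
s_vpi_vecval vec;

vec.aval = sample & 0x00FFFFFF;   /* bit values of the 24-bit sample */
vec.bval = 0;                     /* no x or z bits                  */

v.format       = vpiVectorVal;
v.value.vector = &vec;            /* array of vecvals, 32 bits each  */
vpi_put_value (handle, &v, NULL, vpiNoDelay);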
I have spent a few days now implementing the PLI testbench; if anyone reads this and finds it useful, here is my source code. There is a readme file, and below is a screenshot of some basic results ;)
Use git clone git://github.com/errordeveloper/sftb to obtain the code repo, or download it from github.com.
I have also written about this on my new blog, so hopefully anyone searching for this sort of thing will find it. I couldn't find anything similar, hence I started this project!
This sounds like a good fit for Cocotb, an open-source project which abstracts VPI to provide a Pythonic interface to your DUT. You wouldn't have to write any additional Verilog testbench or wrapper RTL, or call VPI tasks or functions from Verilog, as the testbenches are pure Python.
Your testbench as described would look something like this:
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge

# Whatever audio-file IO library you happen to like best...
from scikits.audiolab import wavread

@cocotb.test()
def stream_file(dut, fname="testfile.wav"):
    # Start a clock generator
    cocotb.fork(Clock(dut.clk, 5000).start())

    data, sample_frequency, encoding = wavread(fname)
    result = []
    for sample in data:
        yield RisingEdge(dut.clk)
        dut.data_in <= sample
        result.append(dut.data_out.value.integer)
    # Write result to output file
Disclaimer: I'm one of the Cocotb developers and thus potentially biased, however I'd also challenge anybody to produce functionality similar to the above testbench as quickly and with fewer lines of (maintainable) code.
I'm working on playing audio from an audio stream using VC++ with the QtMultimedia library. Since I'm not too experienced with Qt's libraries I started by reading in a .wav file and writing it to a buffer:
ifstream wavFile;
char* file = "error_ex.wav";
wavFile.open( file, ios::binary );
After that, I used ifstream's .read() function and wrote all the data into a buffer. After the buffer is written, it's sent off to the audio writer that prepares it for Qt:
QByteArray fData;
for( int i = 0; i < (int)data.size(); ++i )
{
    fData.push_back(data.at(i));
}

m_pBuffer->open(QIODevice::ReadWrite);
m_pBuffer->write( fData );
m_pBuffer->close();
(m_pBuffer is of type QBuffer)
Once the QBuffer is ready I attempt to play the buffer:
QIODevice* ioDevice = m_pAudioOut->start();
ioDevice->write( m_pBuffer->buffer() );
(m_pAudioOut is of type QAudioOutput)
This results in a small pop from the speakers and then it stops playing. Any ideas why?
Running Visual Studio 2008 on Windows XP SP2 using Qt library 4.6.3.
As Frank pointed out, if your requirement is simply to play audio data from a file, a higher-level API would do the job, and would simplify your application code. Phonon would be one option; alternatively, the QtMobility project provides the QMediaPlayer API for high-level use cases.
Given that the question is specifically about using QIODevice, however, and that you mentioned that reading from a WAV file was just your initial approach, I'll assume that you actually need a streaming API, i.e. one which allows the client to control the buffering, rather than handing this control over to a higher-level abstraction such as Phonon.
QAudioOutput can be used in two different modes, depending on which overload of start() is called:
"Pull mode": void QAudioOutput::start(QIODevice *)
In this mode, QAudioOutput will pull data from the supplied QIODevice without further intervention from the client. It is a good choice if the QIODevice being used is one which is provided by Qt (e.g. QFile, QAbstractSocket etc).
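For example, pull mode can be as simple as handing a QFile straight to QAudioOutput. A sketch (the file name is a placeholder; it assumes the file contains raw PCM matching the format, and elides error checks):

QFile sourceFile("test.raw");            // raw PCM samples, not a .wav with a header
sourceFile.open(QIODevice::ReadOnly);

QAudioFormat format;
format.setFrequency(44100);
format.setChannels(2);
format.setSampleSize(16);
format.setCodec("audio/pcm");
format.setByteOrder(QAudioFormat::LittleEndian);
format.setSampleType(QAudioFormat::SignedInt);

QAudioOutput *audioOutput = new QAudioOutput(format);
audioOutput->start(&sourceFile);         // QAudioOutput now pulls from the QFile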
"Push mode": QIODevice* QAudioOutput::start()
In this mode, the QAudioOutput client must push data to the audio device by calling QIODevice::write() on the device returned from start(). This will need to be done in a loop, something like:
QIODevice *ioDevice = audioOutput->start();   // push-mode device to write into
qint64 dataRemaining = ... // assign correct value here
while (dataRemaining) {
    qint64 bytesWritten = ioDevice->write(buffer, dataRemaining);
    dataRemaining -= bytesWritten;
    buffer += bytesWritten;
    // Then wait for a short time
}
How the wait is implemented will depend on the context of your application - if audio is being written from a dedicated thread, it could simply sleep(). Alternatively, if audio is being written from the main thread, you will probably want the write to be triggered by a QTimer.
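The timer-driven variant might look something like this (a sketch; the class, slot and member names are made up, while bytesFree() and periodSize() are QAudioOutput methods):

// m_audioOutput is the QAudioOutput, m_ioDevice the QIODevice* returned by
// m_audioOutput->start(), and m_pBuffer a QBuffer reopened for reading.
void AudioStreamer::writeMore()              // slot driven by a QTimer
{
    int chunks = m_audioOutput->bytesFree() / m_audioOutput->periodSize();
    while (chunks--) {
        QByteArray chunk = m_pBuffer->read(m_audioOutput->periodSize());
        if (chunk.isEmpty())
            break;                           // no more data to play right now
        m_ioDevice->write(chunk);
    }
}

// During setup:
//   m_ioDevice = m_audioOutput->start();
//   connect(m_timer, SIGNAL(timeout()), this, SLOT(writeMore()));
//   m_timer->start(20);                     // poke the device every ~20 ms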
Since you don't mention anything about using a loop around the write() calls in your app, it looks like what is happening is that you write a short segment of data (which plays as a pop), then don't write any more.
You can see code using both modes in the examples/multimedia/audiooutput app which is delivered with Qt.
Are you sure you're using the right (high-level) API? It would be weird if you had to handle data streams and buffering manually. Also, QIODevice::write() doesn't necessarily write the whole buffer but might stop after n bytes, just like POSIX write() (that's why you should always check the return value).
I haven't looked into QtMultimedia yet, but using the more mature Phonon, video and audio output worked just fine for me in the past. It works like this:
Create a Phonon::AudioOutput object
Create a Phonon::MediaObject object
Phonon::createPath( mediaObject, audioObject )
mediaObject->setCurrentSource( Phonon::MediaSource( path ) );
mediaObject->play();
There are also examples in Qt.