I'm currently playing with Web Audio API.
I have a mono synthesizer of my own creation in JavaScript, so I created a JavaScriptAudioNode with 0 inputs and 1 output, connected to the AudioDestinationNode of my context.
Every time the process function is called, I call event.outputBuffer.getChannelData and pass the output channel array to my synth. Since my synth is mono, it expects a single channel (array), but a JavaScriptAudioNode output actually comes with two channels, so I can fill only the left or the right channel, depending on whether I call getChannelData(0) or getChannelData(1).
Is there a way to have a mono JavaScriptAudioNode? If not, is there a way to automatically upmix my mono channel into two stereo channels?
(yeah, I could do it by hand, with a weighted addition, but laziness is the greatest virtue).
Thanks!
If you create the node like this:
node = context.createJavaScriptNode(<BUFFER_SIZE>, 0, 1);
it will have no input channels and a single (mono) output channel, so getChannelData(0) is the only channel you need to fill.
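For example, a minimal sketch of driving such a node (mySynth.render is a hypothetical stand-in for your own synth code, and 4096 is just an example buffer size):

var node = context.createJavaScriptNode(4096, 0, 1); // 0 input channels, 1 output (mono) channel

node.onaudioprocess = function (event) {
  // With a single output channel, channel 0 is the only buffer to fill.
  var output = event.outputBuffer.getChannelData(0);
  mySynth.render(output); // hypothetical: writes 4096 mono samples into output
};

node.connect(context.destination);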
Wireless connections like Bluetooth are limited by transmission bandwidth, resulting in a limited bitrate and audio sampling frequency.
Can a high-definition audio output like 24-bit/96 kHz be created by combining two separate audio streams of 24-bit/48 kHz each, transmitted from a source to receiver speakers/earphones?
I tried to understand how a DSP (digital signal processor) works, but I am unable to find the exact technical terms that describe this kind of audio splitting and recombining technique for increasing audio resolution.
No, you would have to upsample the two original audio streams to 96 kHz. Combining two audio streams will not increase audio resolution; all you're really doing is summing two streams together.
You'll probably want to read this free DSP resource for more information.
Here is a simple construction which could be used to create two audio streams at 24bit/48kHz from a higher resolution 24bit/96kHz stream, which could later be recombined to recreate a single audio stream at 24bit/96kHz.
Starting with an initial high resolution source at 24bit/96kHz {x[0],x[1],x[2],...}:
Take every even sample of the source (i.e. {x[0],x[2],x[4],...} ), and send it over your first 24bit/48kHz channel (i.e. producing the stream y1 such that y1[0]=x[0], y1[1]=x[2], ...).
At the same time, take every odd sample {x[1],x[3],x[5],...} of the source, and send it over your second 24bit/48kHz channel (i.e. producing the stream y2 such that y2[0]=x[1], y2[1]=x[3], ...).
At the receiving end, you should then be able to reconstruct the original 24bit/96kHz audio signal by interleaving the samples from your first and second channel. In other words you would be recreating an output stream out with:
out[0] = y1[0]; // ==x[0]
out[1] = y2[0]; // ==x[1]
out[2] = y1[1]; // ==x[2]
out[3] = y2[1]; // ==x[3]
out[4] = y1[2]; // ==x[4]
out[5] = y2[2]; // ==x[5]
...
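As a concrete illustration, here is a minimal sketch of that even/odd split and recombination (in JavaScript, assuming the streams are plain arrays of samples and the source has an even number of samples):

function split(x) {
  var y1 = [], y2 = [];
  for (var n = 0; n < x.length; n += 2) {
    y1.push(x[n]);     // even samples -> first 48 kHz stream
    y2.push(x[n + 1]); // odd samples  -> second 48 kHz stream
  }
  return { y1: y1, y2: y2 };
}

function recombine(y1, y2) {
  var out = [];
  for (var k = 0; k < y1.length; k++) {
    out.push(y1[k], y2[k]); // interleave back into the 96 kHz stream
  }
  return out;
}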
That said, transmitting those two 24-bit/48 kHz streams would require an effective bandwidth of 2 × 24 bits × 48000 Hz = 2304 kbps, which is exactly the same as transmitting one 24-bit/96 kHz stream. So, while this allows you to fit the audio stream into channels of fixed bandwidth, you are not reducing the total bandwidth requirement this way.
Could you please provide your definition of "combining"? Based on the data rates, it seems like you want to multiplex (combine two mono channels into a stereo channel). If the desire is to "add" two channels together (two monos into a single mono, or two stereo channels into one stereo), then you should not have to increase your sampling rate: you are adding two band-limited signals, so increasing the sampling rate is not necessary.
I want to build a SoundWave by sampling an audio stream.
I read that a good method is to get the amplitude of the audio stream and represent it with a Polygon. But suppose we have an AudioGraph with just a DeviceInputNode and a FileOutputNode (a simple recorder).
How can I get the amplitude from a node of the AudioGraph?
What is the best way to periodize this sampling? Is a DispatcherTimer good enough?
Any help will be appreciated.
First, everything you care about is kind of here:
uwp AudioGraph audio processing
But since you have a different starting point, I'll explain some more core things.
An AudioGraph node is already periodized for you -- that's generally how audio works. I think Win10 defaults to periods of 10 ms and/or 20 ms, but this can (theoretically) be set via the AudioGraphSettings.DesiredSamplesPerQuantum setting, together with AudioGraphSettings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired. I believe the success of this actually depends on your audio hardware and not the OS specifically; my PC can only do 480 and 960. The quantum size is how many samples of the audio signal are accumulated per channel (mono is one channel, stereo is two channels, etc.), and as a by-product it also determines the callback timing.
Win10 and most devices default to a 48000 Hz sample rate, which means they measure/output data that many times per second. So with my quantum size of 480 samples per frame, I am getting 48000/480 = 100 frames every second, which means I get one every 10 milliseconds by default. If you set your quantum to 960 samples per frame, you would get 50 frames every second, or a frame every 20 ms.
To get a callback into that frame of audio every quantum, you need to register a handler for the AudioGraph.QuantumProcessed event. You can directly reference the link above for how to do that.
So by default, a frame of data is stored in an array of 480 floats in [-1, +1]. To get the amplitude, you just average the absolute value of this data.
This part, including handling multiple channels of audio, is explained more thoroughly in my other post.
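To make the averaging concrete, here is a minimal sketch of the amplitude calculation (shown in JavaScript for brevity; the UWP code itself would be C#, but the math is the same), assuming frame is an array of floats in [-1, +1]:

function frameAmplitude(frame) {
  var sum = 0;
  for (var i = 0; i < frame.length; i++) {
    sum += Math.abs(frame[i]); // rectify each sample
  }
  return sum / frame.length;   // mean absolute value of the quantum
}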
Have fun!
I am trying to figure out how to adjust the volume level of a PCM audio stream in node.
I have looked all over npmjs.org at all of the modules that I could find for working with audio, but haven't found anything that will take a stream in, change the volume, and give me a stream out.
Are there any modules that exist that can do this, perhaps even something that wasn't made specifically for it?
If not, then I could create a module, if someone can point me in the right direction for modifying a stream byte by byte.
Here is what I am trying to accomplish:
I am writing a program to receive several PCM audio streams, and mix them for several outputs with varying volume levels. Example:
inputs        vol    output
music          25%   output 1
live audio     80%   output 1
microphone      0%   output 1
music         100%   output 2
live audio      0%   output 2
microphone      0%   output 2
What type of connection are you using? (Would make it easier to give example code)
What you basically want to do is create a connection, then add a listener for the 'data' event on the connection or request object. If you don't set an encoding, the data parameter in the callback will be a Buffer. The 'data' event is triggered after each chunk is delivered through the network.
The Buffer gives you byte-level access to the data stream using regular JavaScript number values. You can then parse each chunk, keep chunks in memory across multiple 'data' events using a closure (in order to buffer multiple chunks), and, when appropriate, write the parsed and processed data to a socket (another socket, or the same one in the case of bi-directional sockets). Don't forget to manage your closure in order to avoid memory leaks!
This is just an abstract description. Let me know if anything needs clarification.
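For illustration, here is a minimal sketch of the byte-level volume change as a Transform stream, assuming raw 16-bit signed little-endian PCM and a reasonably recent Node (stream names like musicStream and output1 are hypothetical):

var Transform = require('stream').Transform;

function createVolumeStream(volume) {
  return new Transform({
    transform: function (chunk, encoding, callback) {
      // Walk the buffer two bytes at a time: one 16-bit signed LE sample.
      for (var i = 0; i + 1 < chunk.length; i += 2) {
        var sample = chunk.readInt16LE(i);
        sample = Math.max(-32768, Math.min(32767, Math.round(sample * volume)));
        chunk.writeInt16LE(sample, i); // write the scaled sample back in place
      }
      callback(null, chunk);
    }
  });
}

// e.g. musicStream.pipe(createVolumeStream(0.25)).pipe(output1);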
We're having trouble playing streamed audio in a browser (using Chrome).
We have a process which is streaming some audio (for example an internet radio) on udp on some port. It's avconv (avconv -y -i SOMEURL -f alaw udp://localhost:PORT).
We have a NodeJs server which receives this audio stream and forwards it to multiple clients connected via websockets. The audio stream which NodeJs receives is wrapped in a buffer which is an array with numbers from 0 to 255. The data is sent to the browser without any issues and then we're using AudioContext to play the audio stream in the browser (our code is based on AudioStreamer - https://github.com/agektmr/AudioStreamer).
At first, all we got was static. Looking into the AudioStreamer code, we realized that the audio stream data should be in the -1 to 1 range. With this knowledge we tried modifying each value in the buffer with the formula x = (x/128) - 1. We did it just to see what would happen, and surprisingly the static became a bit less awful - you could even make out melodies of songs, or words if the audio was speech. But it's still very, very bad, with lots of static, so this is obviously not a solution - but it does show that we are indeed receiving the audio stream via the websockets and not just some random data.
So the question is - what are we doing wrong? Is there a codec/format we should be using? Of course all the code (the avconv, NodeJs and client side) can be modified at will. We could also use another browser if needed, though I assume that's not the problem here. The only thing we do know is that we really need this to work through websockets.
The OS running the avconv and NodeJs is Ubuntu (various versions 10-13)
Any ideas? All help will be appreciated.
Thanks!
Tomas
The conversion from integer samples to floating point samples is incorrect. You must take into account:
Number of channels
Number of bits per sample
Signed/unsigned
Endianness
Let's assume you have a typical WAV file at 16-bit, stereo, signed, little-endian. You're on the right track with your formula, but for signed 16-bit samples, try this:
x = x / 32768
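For example, a minimal sketch of that conversion on the browser side, assuming the WebSocket delivers raw 16-bit signed little-endian PCM as an ArrayBuffer (the a-law stream in the question would need to be decoded to linear PCM first):

function toFloat32(arrayBuffer) {
  var ints = new Int16Array(arrayBuffer);  // signed 16-bit samples (assumes a little-endian platform, which covers essentially all browsers)
  var floats = new Float32Array(ints.length);
  for (var i = 0; i < ints.length; i++) {
    floats[i] = ints[i] / 32768;           // map [-32768, 32767] to roughly [-1, 1)
  }
  return floats;
}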
I've been searching high and low for an example on how to use the Speex library's preprocessor for multichannel audio.
The documentation for speex_preprocess_state_init() says that:
Creates a new preprocessing state. You MUST create one state per channel processed.
I assume that means I need to call speex_preprocess_run() on each channel separately, but won't that potentially "skew" the result if the preprocessor happens to remove more noise from one channel than the other?
Also, speex_preprocess_run() indicates whether the audio is considered voice or noise/silence. If I have to call the function for each channel, what happens if one channel is considered voice and the other isn't?
Am I overthinking this?
Voices recorded in stereo typically mix down to mono without trouble. Microphone placement can cause some phasing issues, but that generally isn't a problem.
Once you mix down to mono, you can process the audio as normal.
Alternatively, you can pick one of the channels, and ignore the second. This might not be as reliable though, as the voice might have been off-axis when recorded.
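If you go the mixdown route, here is a minimal sketch of the mix itself (shown in JavaScript; the same averaging applies in C before handing the buffer to speex_preprocess_run), assuming interleaved stereo samples:

function stereoToMono(interleaved) {
  var mono = new Float32Array(interleaved.length / 2);
  for (var i = 0; i < mono.length; i++) {
    // Average the left and right samples of each frame.
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2;
  }
  return mono;
}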