I thought it would be a simple task ...
- platform: linux on laptop
- language: python
- objective: generate a tone to be heard on speakers or headphones. The tone will be modified in real time, many times per second (think of a metal detector)
The initial design was to generate the tone in Python and pipe it to aplay.
Since aplay consumes data at a known rate (the sampling rate), I thought my tone generator would not have to care about timing, as long as the silences (between tones) were generated at the normal sampling rate (zero amplitude).
First results showed a significant time lag (many seconds). I found that the pipe is fairly long by default (64 KB). That's 8 seconds of samples (at 8 kHz).
I found a way to reduce the pipe size to 4 KB, but it is still too long (0.5 s lag).
Sampling at a very high frequency would reduce the lag but I don't like that solution.
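For reference, a minimal sketch of that first pipe-to-aplay approach might look like the following; the 440 Hz tone, the 100 ms block sizes, and the exact aplay flags are assumptions for illustration, not my actual script.

    import math, struct, subprocess

    RATE = 8000           # samples per second (the 8 kHz mentioned above)
    FREQ = 440.0          # tone frequency -- an assumption for the sketch

    # aplay reads raw signed 16-bit mono samples from stdin at the given rate.
    player = subprocess.Popen(
        ["aplay", "-t", "raw", "-f", "S16_LE", "-r", str(RATE), "-c", "1"],
        stdin=subprocess.PIPE)

    def block(nsamples, amplitude):
        """One block of sine samples; amplitude 0.0 gives zero-amplitude 'silence'."""
        return b"".join(
            struct.pack("<h", int(32767 * amplitude * math.sin(2 * math.pi * FREQ * i / RATE)))
            for i in range(nsamples))

    while True:
        player.stdin.write(block(RATE // 10, 1.0))   # 100 ms of tone
        player.stdin.write(block(RATE // 10, 0.0))   # 100 ms of silence, still at full rate
        player.stdin.flush()

The point is that silence is written as zero-valued samples at the full rate, so aplay's consumption rate sets the timing; the lag comes entirely from how much of this stream sits in the pipe.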
My second approach was to generate real silence (no samples at all) between tones, with the generator sleep()ing for the duration of each silence.
The result is that aplay complains about underruns and, for some reason, the tones were truncated and badly rendered.
So, my question is:
What are the best ways to send a tone to the audio stack without piping?
I'm making a game in which there is a series of events (which happen, say, every 30 frames in a 60 fps setting) that I want to sync with the music (at 120 bpm). In usual cases, e.g. rhythm games, syncing the events to the music is easier, because humans seem to perceive much smaller gaps in music than in video. However, in my case, the game heavily depends on frame-based time, and a lot of things will break if I change the schedule of my series of events.
After a lot of experiments, it seems almost impossible to tweak the music without disturbing the human ear: a jump of ~1 ms is noticeable, a ~10 ms discrepancy between video and audio is noticeable, and a 0.5% change in pitch is noticeable. And I don't have handy tools to speed up audio without changing the pitch.
What is the easiest way out in this circumstance? Is there any reference on this subject that I can refer to? Any advice is appreciated!
The method that I successfully use (in Java) is to route the playback signal through a path that allows the counting of PCM frames (audio frames run at rates like 44100 fps, as opposed to screen updates, which run at rates like 60 fps). I don't know about other languages, but with Java this can be done by outputting audio through a SourceDataLine. As the audio frame count is incremented, it can be compared to the next (pending) item on a collection of events that require triggers to other systems or threads. Java has an excellent class for handling the collection of events: ConcurrentSkipListSet. It is asynchronous, and automatically sorts elements via a Comparator set to the desired PCM frame count.
Some example code showing the counting of frames can be seen in the tutorial Using Files and Format Converters, if you search the page for the phrase "Here, do something useful with the audio data". They are counting bytes, not PCM frames, but the example does give the basic idea.
Why is counting PCM frames effective? I think this has to do with the fact that this code (in Java) is the closest we get to the point where audio data is fed to the native code controlling the sound system, and that this code employs a blocking queue. Thus, the write operations only happen when the audio system is ready to receive and play back more sound data, and audio systems have to be very accurate in how they maintain their rate of processing. The amount of time variance that occurs here (especially if the thread is given a high priority) is smaller than the time variance incurred by the choices the JVM makes as it juggles multiple threads and processes.
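Here is the same frame-counting idea reduced to a sketch in Python rather than Java; the cue times are arbitrary, heapq stands in for ConcurrentSkipListSet, and the two audio functions are placeholders for the real synth and the blocking write.

    import heapq

    SAMPLE_RATE = 44100                 # PCM frames per second
    BUFFER_FRAMES = 1024                # frames pushed to the audio line per write

    # Events keyed by the PCM frame at which they should fire (example values).
    events = [(int(0.5 * SAMPLE_RATE), "half-second cue"),
              (int(1.0 * SAMPLE_RATE), "one-second cue")]
    heapq.heapify(events)

    def next_audio_buffer(nframes):
        # Placeholder for pulling nframes of audio from the synth/mixer.
        return bytes(2 * nframes)       # 16-bit mono silence

    def write_to_audio_line(buf):
        # Placeholder for the blocking write (SourceDataLine.write in the Java case).
        pass

    frames_written = 0
    while events:
        write_to_audio_line(next_audio_buffer(BUFFER_FRAMES))
        frames_written += BUFFER_FRAMES
        # Fire every pending event whose frame count has now been passed.
        while events and events[0][0] <= frames_written:
            _, cue = heapq.heappop(events)
            print("trigger:", cue)

Because the write is blocking, the frame counter only advances at the rate the audio system actually consumes data, which is what makes it a reliable clock for the triggers.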
I need to play 4 audio clips through a web browser.
These clips last 150 ms, 300 ms, 450 ms and 600 ms.
I don't care about latency (if a clip starts playing 100 ms late, that's not important for my purpose).
But I do care about the duration of these clips: does the 150 ms clip last exactly 150 ms, or is there an error due to the audio board or other components?
I know for sure that there is an error (I saw it in a test using a Mac).
My question is: can anyone point me to a paper, an article, or anything that discusses playback duration and tests different setups, or tell me whether this error is always very small (less than 10 ms, for example) regardless of the setup (Windows, Mac, old device, new device)?
In other words: if I play a 100 ms clip, how long does it really last (100 ms? More? Less?)
In what manner is the sound not lasting the correct amount of time?
Does the beginning or the end get cut off?
Does the sound play back slower or faster than it should?
In my experience, I've never heard an error in playback rate caused by the browser or the sound board. But I have come across situations where a sound is played back in a different audio format than the one in which it was encoded. For example, a sound encoded at 48000 fps but played back at 44100 fps will take longer to play and will sound slightly lower in pitch (roughly a semitone and a half). I recommend, as a diagnostic step, confirming the audio format used at each end. How to do so will depend on the systems being used.
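As a rough illustration of what such a mismatch does to duration and pitch, using the sample rates mentioned above and one of the clip lengths from the question:

    import math

    encoded_rate = 48000        # rate the file was encoded at
    playback_rate = 44100       # rate the device actually plays it at
    clip_ms = 150               # one of the clip lengths from the question

    actual_ms = clip_ms * encoded_rate / playback_rate
    semitones = 12 * math.log2(playback_rate / encoded_rate)

    print(f"{clip_ms} ms clip plays for about {actual_ms:.1f} ms")   # ~163.3 ms
    print(f"pitch shift: about {semitones:.2f} semitones")           # ~ -1.47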
I want to build a SoundWave by sampling an audio stream.
I read that a good method is to get the amplitude of the audio stream and represent it with a Polygon. But suppose we have an AudioGraph with just a DeviceInputNode and a FileOutputNode (a simple recorder).
How can I get the amplitude from a node of the AudioGraph?
What is the best way to periodize this sampling? Is a DispatcherTimer good enough?
Any help will be appreciated.
First, everything you care about is kind of here:
uwp AudioGraph audio processing
But since you have a different starting point, I'll explain some more core things.
An AudioGraph node is already periodized for you -- it's generally how audio works. I think Win10 defaults to periods of 10ms and/or 20ms, but this can be set (theoretically) via the AudioGraphSettings.DesiredSamplesPerQuantum setting, with the AudioGraphSettings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired; I believe the success of this functionality actually depends on your audio hardware and not the OS specifically. My PC can only do 480 and 960. This number is how many samples of the audio signal to accumulate per channel (mono is one channel, stereo is two channels, etc...), and this number will also set the callback timing as a by-product.
Win10 and most devices default to a 48000 Hz sample rate, which means they measure/output data that many times per second. So with my QuantumSize of 480 for every frame of audio, I am getting 48000/480, or 100 frames every second, which means I'm getting one every 10 milliseconds by default. If you set your quantum to 960 samples per frame, you would get 50 frames every second, or a frame every 20 ms.
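A quick sanity check of those numbers, sketched in Python:

    sample_rate = 48000                     # Hz, the Win10 default mentioned above
    for quantum in (480, 960):              # the two sizes my hardware supports
        frames_per_second = sample_rate / quantum
        print(quantum, "samples ->", frames_per_second, "frames/s,",
              1000 / frames_per_second, "ms per frame")
        # 480 -> 100 frames/s (10 ms); 960 -> 50 frames/s (20 ms)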
To get a callback into that frame of audio every quantum, you need to register an event into the AudioGraph.QuantumProcessed handler. You can directly reference the link above for how to do that.
So by default, a frame of data is stored in an array of 480 floats from [-1,+1]. And to get the amplitude, you just average the absolute value of this data.
This part, including handling multiple channels of audio, is explained more thoroughly in my other post.
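The averaging itself is only a couple of lines; here is a sketch in Python, with a generated sine standing in for the real AudioGraph frame data:

    import math

    SAMPLES_PER_QUANTUM = 480   # one 10 ms quantum at 48 kHz
    # Dummy frame standing in for the AudioGraph data: a full-scale sine in [-1, +1].
    frame = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(SAMPLES_PER_QUANTUM)]

    amplitude = sum(abs(s) for s in frame) / len(frame)
    print(amplitude)            # ~0.64 for a full-scale sine (2/pi)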
Have fun!
Recently I was super excited to discover baudio, but unfortunately it has a bug (issue #13) that renders it practically useless. In light of this I've taken the core ideas and started making my own version. Here's a breakdown of what it does:
- generate amplitude output at short intervals in the form of 8-bit PCM data (i.e. single hex characters representing amplitude levels 0-15)
- pipe node's stdout to play's (sox's) stdin
I really need it to be an instrument usable in real time, so I'm piping the data (8-bit amplitude samples) in chunks 20 times per second, but this seems to cause a lot of lag and choppiness. When I create a file out of node's output and pipe the uninterrupted stream of hex digits to play:
cat bitfile | play -c 1 -r 44100 -t s8 -
it works perfectly.
Here is the JS for my node applet. It requires the 'through' module:
npm install through
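For illustration only, here is a hypothetical Python stand-in for the same generate-a-chunk-and-pipe-it loop (this is not the actual applet; the 220 Hz tone, the 50 ms chunk size, and the gen.py filename are made up). It would be run as: python gen.py | play -c 1 -r 44100 -t s8 -

    import math, sys, time

    RATE = 44100
    CHUNK = RATE // 20          # 50 ms of samples, pushed 20 times per second
    FREQ = 220.0                # test tone, made up for the sketch

    n = 0
    while True:
        chunk = bytes(
            (int(100 * math.sin(2 * math.pi * FREQ * (n + i) / RATE)) & 0xFF)
            for i in range(CHUNK))
        sys.stdout.buffer.write(chunk)   # raw signed 8-bit samples for play -t s8
        sys.stdout.buffer.flush()
        n += CHUNK
        time.sleep(1 / 20)               # the 20-times-per-second pacing described above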
So my question is:
Can I do anything to reduce this piping lag? Faster hardware? A different approach? Is it hopeless?
I am developing a digital delay on a microcontroller and I am stuck with the delay decay. The delay is implemented with a comb filter.
Here it is: http://www.tonmeister.ca/main/textbook/intro_to_sound_recording837x.png
The delay line, "emulating the tape", is implemented as a circular buffer. Killing the effect abruptly is not an issue; when turning the effect off gracefully, though, I am left with the tail of the delay in the buffer to process, as if the delay had been frozen and its tail allowed to decay slowly (depending on the feedback gain).
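For concreteness, here is a feedback comb filter of that kind sketched in Python; the delay length and gain are placeholders, not my actual firmware.

    DELAY_SAMPLES = 4410        # ~100 ms of delay at 44.1 kHz (placeholder value)
    FEEDBACK = 0.1              # the 1/10 feedback gain used below

    buf = [0.0] * DELAY_SAMPLES # the circular buffer "emulating the tape"
    idx = 0

    def process(x):
        """One sample in, one out: y[n] = x[n] + g * y[n - D] (feedback comb)."""
        global idx
        y = x + FEEDBACK * buf[idx]     # read the sample written D samples ago
        buf[idx] = y                    # recirculate: write the output back into the line
        idx = (idx + 1) % DELAY_SAMPLES
        return y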
My question is: how many times do I have to recirculate the samples through the buffer?
One way I thought to approach this could be by modelling the physical process ... assuming that the input sequence has a loudness of 0 dB for its entire duration and that, after going through the delay line, it gets attenuated by a factor of 1/10. In terms of loudness this corresponds to a drop of 20 dB (since power = voltage^2) every time the sequence goes through the feedback path. The weakest audible sound has a loudness of −130 dB but, taking ambient noise into consideration as well, −120 dB will be sufficient as the least reference power. Hence, after the echoes have been through the feedback path 6 times (120 dB / 20 dB) they will no longer be audible.
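The same estimate, worked out in a few lines of Python so it generalizes to any feedback gain:

    import math

    feedback_gain = 0.1                     # attenuation per pass through the feedback path
    floor_db = -120.0                       # least reference power chosen above

    drop_per_pass_db = 20 * math.log10(feedback_gain)      # -20 dB per pass here
    passes = math.ceil(floor_db / drop_per_pass_db)        # 120 / 20 -> 6 recirculations
    print(passes)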
Is there a more efficient way?
Thank you!