Real duration of audio files played through browsers

I need to play 4 audio files through a web browser.
These files last 150 ms, 300 ms, 450 ms and 600 ms.
I don't care about latency (if a file starts playing 100 ms late, that is not important for my purpose).
But I do care about the duration of these files: does the 150 ms audio last exactly 150 ms, or is there an error introduced by the sound card or other components?
I know for sure that there is some error (I have seen it in a test on a Mac).
My question is: can anyone point me to a paper, an article or anything else that discusses this duration and tests different setups, or tell me whether this error is always very small (less than 10 ms, for example) regardless of the platform (Windows, Mac, old device, new device)?
In other words: if I play a 100 ms audio file, how long does it really last (100 ms? More? Less?)

In what manner is the sound not lasting the correct amount of time?
Does the beginning or the end get cut off?
Does the sound play back slower or faster than it should?
In my experience, I've never heard an error in playback rate caused by the browser or the sound card. But I have come across situations where a sound is played back in a different audio format than the one in which it was encoded. For example, a sound encoded at 48000 fps but played back at 44100 fps will take roughly 9% longer to play and will sound slightly lower in pitch (about a semitone and a half lower). As a diagnostic step, I recommend confirming the audio format used at each end. How to do so will depend on the systems being used.
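For example, if one end of the chain is Java, a quick way to confirm what a file claims to be is javax.sound.sampled. A minimal sketch, assuming a placeholder file name:

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import java.io.File;

public class FormatCheck {
    public static void main(String[] args) throws Exception {
        // The path is a placeholder -- point it at the clip you are testing.
        File clip = new File("tone-150ms.wav");
        AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(clip);
        AudioFormat format = fileFormat.getFormat();

        // Frame count / frame rate gives the nominal duration of the file.
        long frames = fileFormat.getFrameLength();
        double seconds = frames / (double) format.getFrameRate();

        System.out.println("Sample rate: " + format.getSampleRate() + " Hz");
        System.out.println("Channels:    " + format.getChannels());
        System.out.println("Frames:      " + frames);
        System.out.printf("Duration:    %.1f ms%n", seconds * 1000.0);
    }
}
```

Comparing the reported sample rate and nominal duration against what the playback device is configured for usually makes a mismatch obvious.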

Related

Sync music to frame-based time

I'm making a game in which there is a series of events (happening, say, every 30 frames in a 60 fps setting) that I want to sync with the music (at 120 bpm). In the usual cases, e.g. rhythm games, syncing the events to the music is easier, because humans seem to perceive much smaller gaps in music than in video. However, in my case, the game depends heavily on frame-based time, and a lot of things will break if I change the schedule of my series of events.
After a lot of experiments, it seems to me almost impossible to tweak the music without the human ear noticing: a jump of ~1 ms is noticeable, a ~10 ms discrepancy between video and audio is noticeable, and a 0.5% change in pitch is noticeable. And I don't have handy tools to speed up audio without changing the pitch.
What is the easiest way out in this circumstance? Is there any reference on this subject that I can refer to? Any advice is appreciated!
The method that I successfully use (in Java) is to route the playback signal through a path that allows counting PCM frames (audio frames run at rates like 44100 fps, as opposed to screen updates, which run at rates like 60 fps). I don't know about other languages, but in Java this can be done by outputting through a SourceDataLine. As the audio frame count is incremented, it can be compared against the next (pending) item in a collection of events that require triggers to other systems or threads. Java has an excellent class for handling that collection of events: ConcurrentSkipListSet. It is thread-safe, and it automatically sorts elements via a Comparator set to the desired PCM frame count.
Some example code showing the counting of frames can be seen in the tutorial Using Files and Format Converters, if you search the page for the phrase "Here, do something useful with the audio data". That example counts bytes, not PCM frames, but it does give the basic idea.
Why is counting PCM frames effective? I think this is because the code doing the counting (in Java) is as close as we get to the point where audio data is handed to the native code controlling the sound system, and because that write path uses a blocking queue. The write operations therefore only happen when the audio system is ready to receive and play back more sound data, and audio systems have to be very accurate in how they maintain their rate of processing. The amount of time variance that occurs here (especially if the thread is given a high priority) is smaller than the time variance incurred by the JVM as it juggles multiple threads and processes.
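A minimal sketch of that approach (the AudioEvent record, the 440 Hz test tone and the event times are made up for illustration): PCM is written to a SourceDataLine in chunks, a running frame count is kept, and any event whose target frame has been passed is fired.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.SourceDataLine;
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListSet;

public class FrameCountedPlayback {

    // Hypothetical event type: fire a Runnable when playback reaches a PCM frame.
    record AudioEvent(long triggerFrame, Runnable action) {}

    public static void main(String[] args) throws Exception {
        float sampleRate = 44100f;
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);

        // Events sorted by the PCM frame at which they should fire.
        ConcurrentSkipListSet<AudioEvent> events =
                new ConcurrentSkipListSet<>(Comparator.comparingLong(AudioEvent::triggerFrame));
        events.add(new AudioEvent(22050, () -> System.out.println("event at 0.5 s")));
        events.add(new AudioEvent(44100, () -> System.out.println("event at 1.0 s")));

        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();

        long framesWritten = 0;
        byte[] buffer = new byte[4096];            // 2048 mono 16-bit frames per write
        int framesPerBuffer = buffer.length / format.getFrameSize();

        for (int chunk = 0; chunk < 50; chunk++) { // ~2.3 s of a 440 Hz test tone
            for (int i = 0; i < framesPerBuffer; i++) {
                double t = (framesWritten + i) / sampleRate;
                short s = (short) (Math.sin(2 * Math.PI * 440 * t) * 8000);
                buffer[2 * i]     = (byte) (s & 0xff);        // little-endian low byte
                buffer[2 * i + 1] = (byte) ((s >> 8) & 0xff); // high byte
            }
            line.write(buffer, 0, buffer.length);  // blocks until the audio system is ready
            framesWritten += framesPerBuffer;

            // Fire every pending event whose trigger frame has been written.
            while (!events.isEmpty() && events.first().triggerFrame() <= framesWritten) {
                events.pollFirst().action().run();
            }
        }
        line.drain();
        line.close();
    }
}
```

In a real game the triggering would usually be done on a dedicated high-priority playback thread, with the Runnables handing work off to other threads rather than doing it inline.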

Delay Audio on Windows 10 by a second

Is there a way to universally delay audio in Windows 10 (I have Realtek High Definition Audio, as an example)?
I have 2 reasons why I want to accomplish this:
1) My audio plays a quarter of a second before the video (out of sync). This is consistent across YouTube, VLC media player, Windows Media Player, pretty much any video content: the mouth moves 1/4 second after the audio. The delay also builds up over time; about 5 minutes in, it becomes unbearable.
2) Unrelated to #1, I want to scan the audio and edit it in near real time, searching for certain sounds and also reading certain subtitles.

How to get amplitude of an audio stream in an AudioGraph to build a SoundWave using Universal Windows?

I want to build a SoundWave by sampling an audio stream.
I read that a good method is to get the amplitude of the audio stream and represent it with a Polygon. But suppose we have an AudioGraph with just a DeviceInputNode and a FileOutputNode (a simple recorder).
How can I get the amplitude from a node of the AudioGraph?
What is the best way to periodize this sampling? Is a DispatcherTimer good enough?
Any help will be appreciated.
First, everything you care about is kind of here:
uwp AudioGraph audio processing
But since you have a different starting point, I'll explain some more core things.
An AudioGraph node is already periodized for you -- that's generally how audio works. I think Win10 defaults to periods of 10 ms and/or 20 ms, but this can be set (theoretically) via AudioGraphSettings.DesiredSamplesPerQuantum, with AudioGraphSettings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired. I believe the success of this actually depends on your audio hardware rather than on the OS specifically; my PC can only do 480 and 960. This number is how many samples of the audio signal are accumulated per channel (mono is one channel, stereo is two channels, etc.), and it also sets the callback timing as a by-product.
Win10 and most devices default to a 48000 Hz sample rate, which means they measure/output data that many times per second. So with my quantum size of 480 samples per frame of audio, I am getting 48000/480 = 100 frames every second, which means I'm getting one every 10 milliseconds by default. If you set your quantum to 960 samples per frame, you would get 50 frames every second, or a frame every 20 ms.
To get a callback into that frame of audio every quantum, you need to register an event into the AudioGraph.QuantumProcessed handler. You can directly reference the link above for how to do that.
So by default, a frame of data is stored in an array of 480 floats in the range [-1, +1]. To get the amplitude, you just average the absolute values of this data.
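The averaging itself is language-independent; here is a minimal sketch in Java rather than the UWP/C# API, just to show the arithmetic (the 480-sample frame mirrors the quantum size above):

```java
// Mean absolute value of one quantum of samples in the range [-1, +1].
// For the default quantum above, frame.length would be 480.
static float amplitude(float[] frame) {
    float sum = 0f;
    for (float sample : frame) {
        sum += Math.abs(sample);
    }
    return sum / frame.length;
}
```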
This part, including handling multiple channels of audio, is explained more thoroughly in my other post.
Have fun!

How to play a song with the Audio Queue API in order to find the accurate timing of the beats of the song

I am working on a musical project in which I need accurate timing. I have already used NSTimer and NSDate, but I am getting a delay while playing the beats (the beats, i.e. a tick-tock sound). So I have decided to use the Audio Queue API to play my sound file, a .wav file present in the main bundle. I have been struggling with this for 2 weeks. Can anybody please help me out of this problem?
Use uncompressed sounds and a single Audio Queue, continuously running. Don't stop it between beats. Instead mix in raw PCM samples of each Tock sound (etc.) starting some exact number of samples apart. Play silence (zeros) in between. That will produce sub-millisecond accurate timing.
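A minimal, platform-neutral sketch of that mixing idea in Java rather than the Audio Queue API (the tick buffer, tempo and helper name are made up for illustration; the same idea applies when filling Audio Queue buffers): the tick's PCM samples are added into the output at exact sample offsets and everything else stays at zero.

```java
// Mix a short "tick" into a continuous PCM stream at exact sample offsets.
// samplesPerBeat = sampleRate * 60 / bpm; everything between ticks is silence.
static float[] renderClick(float[] tick, int sampleRate, double bpm, int totalSamples) {
    float[] out = new float[totalSamples];            // initialized to silence (zeros)
    double samplesPerBeat = sampleRate * 60.0 / bpm;  // e.g. 44100 * 60 / 120 = 22050
    for (double pos = 0; pos < totalSamples; pos += samplesPerBeat) {
        int start = (int) Math.round(pos);            // exact sample position of this beat
        for (int i = 0; i < tick.length && start + i < totalSamples; i++) {
            out[start + i] += tick[i];                // mix (add) the tick into the stream
        }
    }
    return out;
}
```

Because beat positions are computed in samples rather than with a timer, the spacing between ticks is accurate to within a single sample (about 0.02 ms at 44100 Hz).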

I hear clicking in audio with a DirectShow graph created with Graph Edit yet player software on my PC plays audio smoothly

I have a DirectShow application that I built with Delphi 6 using the DSPACK component library. For two days I have been trying to solve a problem with audio playback. When I run the filter graph I create, I hear repetitive clicks in the playback. What was really confusing was that the audio file I created simultaneously with my filter graph had clean, continuous audio with no gaps. So I knew that the audio buffers were being delivered properly, but something I was doing was "jamming up" the "live" playback. Or so I thought. I spent two days diagnosing the problem, looking for semaphores being held too long (locks) or perhaps timestamp problems, which I documented in this other Stack Overflow post:
Getting stuttering during rendering of my DirectShow filter despite output file being "smooth"
A few minutes ago I decided to try a test with the Graph Edit utility. I created a dead simple graph consisting of just the capture device I was using (VOIP phone microphone), and the renderer device I was using (HD ATI Rear Audio output to headphones). Two filters total. Much to my surprise I heard the same clicking. So here was a case that did not involve my code at all and I heard clicking.
Then I changed the audio renderer in the Graph Edit created filter graph to the VOIP phone ear piece. The clicking went away.
Now I know there is a way to get smooth audio out of the ATI Rear Audio device, since it is the preferred audio output device and everything from videos I play on my PC to wave files sounds flawless on it. So are the other software programs doing something more than just connecting filters? I am wondering if perhaps the default mode for the HD ATI Rear Audio is without double-buffering, and perhaps those other programs know how to enable that feature? Or are they doing something else, for example using another DirectShow or DirectSound filter or technique, to make audio play smoothly on the HD ATI Rear Audio renderer?
What you are possibly experiencing (it depends on the actual stuttering, though) is that when you use capture and playback devices backed by different hardware, their sampling rates differ slightly. For example, you capture 22050 Hz at an actual rate of (22050 - 2%) Hz and you play it back with hardware consuming bytes at (22050 + 2%) Hz.
Obviously this won't work out smoothly: eventually playback will experience data underflow. If you save into a file and play back from the file, it will be smooth, because the file can supply data at whatever rate the playback device needs. If the capture and playback devices are the same hardware, they are likely to share a hardware clock, and the rates match.
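To get a feel for the scale of the problem, here is a small back-of-the-envelope sketch; the 2% figures are just the illustrative numbers from above, not measurements.

```java
// Back-of-the-envelope drift between a capture clock and a playback clock.
// The 2% mismatch mirrors the illustrative figures above.
public class ClockDrift {
    public static void main(String[] args) {
        double nominalRate  = 22050;               // Hz
        double captureRate  = nominalRate * 0.98;  // capture runs 2% slow
        double playbackRate = nominalRate * 1.02;  // playback runs 2% fast

        double deficitPerSecond = playbackRate - captureRate;  // samples/s the renderer is short
        double msPerSecond = deficitPerSecond / playbackRate * 1000;
        System.out.printf("Underflow grows by about %.0f samples (%.0f ms) per second%n",
                deficitPerSecond, msPerSecond);    // ~882 samples, ~39 ms each second
    }
}
```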
The problem is known as "rate matching" and is discussed on MSDN in the Live Sources section.
