How to get the exact timestamp of audio recording start with PortAudio? - node.js

We are using PortAudio for recording audio in our Electron application, with naudiodon as the Node wrapper.
The application needs to record both audio and video, but from different sources. Audio, as said, is recorded with PortAudio, with additional app logic on top. Video, on the other hand, is recorded with the standard MediaRecorder API, with its own formats, properties, and codecs.
We use the 'onstart' event to track the actual video start, and in order to sync audio and video we must also know the exact audio start time.
The problem is that we are not able to detect the exact timestamp of the audio start. What is the correct way to do it?
Here is what we tried:
 1. The first option is to listen to portaudio.AudioIO events, such as 'data' and 'readable'. Those are emitted as soon as PortAudio has a new data chunk, so taking the arrival time of the very first chunk minus its length in milliseconds gives an approximate audio start (see the sketch after this list).
 2. The second option is to pipe AudioIO into a Writable stream and do pretty much the same thing as with the events.
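For reference, here is a minimal sketch of option 1, assuming naudiodon's AudioIO readable stream and a 16-bit stereo 44.1 kHz input; the device id, format values, and variable names are illustrative, not taken from our real app:
const portAudio = require('naudiodon');

const sampleRate = 44100;
const channelCount = 2;
const bytesPerSample = 2; // 16-bit samples

const ai = new portAudio.AudioIO({
  inOptions: {
    channelCount: channelCount,
    sampleFormat: portAudio.SampleFormat16Bit,
    sampleRate: sampleRate,
    deviceId: -1 // default input device
  }
});

let recordingStartMs = null;

ai.on('data', (chunk) => {
  if (recordingStartMs === null) {
    // Duration of this chunk in milliseconds
    const chunkMs = (chunk.length / (sampleRate * channelCount * bytesPerSample)) * 1000;
    // Approximate start: arrival time of the first chunk minus its duration.
    // As described below, buffering can still make this estimate 50 - 500 ms late.
    recordingStartMs = Date.now() - chunkMs;
  }
  // ...hand the chunk to the rest of the recording pipeline here
});

ai.start();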
The issue is that with either option, the calculated start doesn't always match the actual timestamp of the audio start. While experimenting with PortAudio we noticed that the calculated timestamp is later than it should be, as though some chunks are buffered before actually being released.
The actual audio start and the release of the first chunk can differ by roughly 50 - 500 ms with a chunk length of ~50 ms, so chunks are sometimes buffered and sometimes not. Is there any way to track the actual start time of the first chunk? I wasn't able to find any relevant info in the PortAudio docs.
Or are there other ways to keep using PortAudio and record video separately, but still achieve the desired result of syncing them together?
PortAudio 19.5, Naudiodon 2.1.0, Electron 6.0.9, Node.js 12.4.0

Related

HLS Live streaming with re-encoding

I've run into a technical problem and need your help.
The situation:
I record the screen as well as 1 to 2 audio tracks (microphone and speaker).
These recordings are done separately (they could be mixed, but I'd rather not), and every 10 s (this is configurable) I send the chunk of recorded data to my backend. We therefore have 2 to 3 chunks sent every 10 s.
These data chunks are interdependent. Example: the first video chunk starts with the headers and a keyframe, while the second chunk can start in the middle of a frame. It's like taking the entire video and splitting it at an arbitrary byte.
The video stream is H.264 in a WebM container. I don't have a lot of control over it.
The audio stream is Opus in a WebM container. I can't use AAC directly, nor do I have much control.
Given the reality, the server may be restarted at any time (crash, update, scaling, ...). It doesn't happen often (about 4 times a week). In addition, once the recording ends on the client side, the customer can close the application or their computer, which prevents the end of the recording from being sent; once the client reconnects, the missing data chunks are sent. This therefore rules out a "live" stream on the backend side.
Goals :
Store video and audio as it is received on the server in cloud storage.
Be able to start playing the video/audio even when the upload has not finished (so in a live stream)
As soon as the last chunks have been received on the server, I want the entire video to be already available in VoD (Video On Demand) with as little delay as possible.
Everything must be delivered with the audio in AAC. The audio tracks may or may not be mixed together, and may or may not be muxed with the video.
Current solution, and where it's blocked:
The most promising solution I have seen is using HLS to support the live and VoD modes that I need. It would also bring a lot of optimization possibilities for the future.
Video isn't a problem in this context. Here's what I do (a rough node.js sketch of steps 1-2 follows after the notes):
 1. Every time I get a data chunk, I append it to a screen.webm file.
 2. Then I split the file with ffmpeg:
ffmpeg -ss {total_duration_in_storage} -i screen.webm -c:v copy -f hls -hls_time 8 -hls_list_size 0 output.m3u8
 3. I ignore the last segment file unless it's the last chunk.
 4. I upload all the files to the cloud storage along with a newly updated output.m3u8 containing the new segment information.
Note: total_duration_in_storage corresponds to the duration already uploaded to cloud storage, i.e. the sum of the segments present in the last output.m3u8.
Note 2: I ignore the last file in point 3 because that guarantees a keyframe at the start of every segment of my playlist, and therefore lets me seek and re-segment only the parts needed for each new chunk.
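Assuming a node.js backend (the question doesn't say what the backend runs on), steps 1 and 2 above could look roughly like this; file names and the duration bookkeeping are placeholders:
const fs = require('fs');
const { execFile } = require('child_process');

// Step 1: append the incoming chunk to the growing WebM file
function appendChunk(chunkBuffer) {
  fs.appendFileSync('screen.webm', chunkBuffer);
}

// Step 2: re-segment only the part that isn't in cloud storage yet
function segment(totalDurationInStorage, callback) {
  execFile('ffmpeg', [
    '-ss', String(totalDurationInStorage), // skip what was already uploaded
    '-i', 'screen.webm',
    '-c:v', 'copy',                        // no video re-encoding
    '-f', 'hls',
    '-hls_time', '8',
    '-hls_list_size', '0',
    'output.m3u8'
  ], callback);
}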
My problem is the audio. I can use the same method and it works fine as long as I don't re-encode, but I need to re-encode to AAC to be compatible with HLS and with Safari.
If I re-encode only the new chunks as they arrive, there is an audible glitch.
The only possible avenue I have found is to re-encode and re-segment all the files each time a new chunk arrives. This will be problematic for long recordings (multiple hours).
Do you have any solutions for this problem or another way to achieve my goal?
Thanks a lot for your help!

How to get low latency in ExoPlayer RTSP live streaming on Android?

I am working on RTSP live streaming. I am receiving the live stream in my Android app using the ExoPlayer RTSP player, but the latency of that stream is about 3 seconds, while the latency in VLC media player is about 1 second. How can I reduce the latency in ExoPlayer? Is there any way? Please tell me.
What you're facing is buffering latency. VLC uses its own engine and buffering algorithms. If you want to reduce buffering latency in ExoPlayer, you need to get familiar with LoadControl. ExoPlayer uses a DefaultLoadControl by default. This LoadControl is a class that belongs to the ExoPlayer library, and it determines how long the player buffers the stream. If you want to reduce the delay, reduce the LoadControl buffer values.
Here's a quick snippet of creating a SimpleExoPlayer instance with a custom load control:
val customLoadControl = DefaultLoadControl.Builder()
    .setBufferDurationsMs(minBuffer, maxBuffer, playbackBuffer, playbackRebuffer)
    .build()
Parameters in a nutshell: minBuffer is the minimum buffered video duration, maxBuffer is the maximum buffered video duration, playbackBuffer is the buffered duration required to start playing, and playbackRebuffer is the buffered duration required to resume after playback has stalled and is retrying.
In your case, you should set your values really low, especially the playbackBuffer and minBuffer parameters. You should experiment with small values (they are in milliseconds); a value of 1000 for both minBuffer and playbackBuffer is a reasonable starting point.
How to use the custom load control: after building the custom load control instance and storing it in a variable, pass it when building your SimpleExoPlayer:
myMediaPlayer = SimpleExoPlayer.Builder(this@MainActivity)
    .setLoadControl(customLoadControl)
    .build()
What to expect:
Using the default values is generally recommended. If you push the values too far, the app may crash, the stream may stall, or the player may get very glitchy, so adjust the values carefully.
Here are the DefaultLoadControl Javadocs.
In case you don't know what buffering is: ExoPlayer (or any other player) may need to buffer, i.e. load the upcoming portion of the video/audio and store it in memory, where it is much faster to access and reduces playback problems. Streamed media in particular needs buffering because it arrives in chunks, so each chunk that arrives is eventually buffered. If you set the required buffered duration to 1000, you are telling ExoPlayer that as soon as about 1000 milliseconds of the stream have been buffered, playback should start right away. I believe there is no simpler way to explain this. Best of luck.

How can I detect corrupt/incomplete MP3 file, from a node.js app?

The most common situation in which the integrity of an MP3 file is not correct is when the file has been only partially uploaded to the server. In this case, the indicated audio duration doesn't correspond to what is really in the MP3 file: we can hear the beginning, but at some point playback stops and the duration indicated by the audio player is wrong.
I tried libraries like node-ffprobe, but it seems they just read metadata without comparing it to the real audio data in the file. Is there a way to efficiently detect a corrupted or incomplete MP3 file from node.js?
Note: the client uploading the MP3 files is a hardware device (an audio recorder) uploading files to an FTP server, not a browser, so I'm not able to send potentially more useful data from the client.
MP3 files don't normally have a duration. They're just a series of MPEG frames. Sometimes, there is an ID3 tag indicating duration, but not always.
Players can determine duration by choosing one of a few methods:
Decode the entire audio file. This is the slowest method, but if you're going to decode the file anyway, you might as well go this route as it gives you an exact duration.
Read the whole file, skimming through frame headers. You'll have to read the whole file from disk, but you won't have to decode it. Can be slow if I/O is slow, but gives you an exact duration.
Read the first frame's bitrate and estimate duration by file size. Definitely the fastest method, and the one most commonly used by players. Duration is an estimate only, and is reasonably accurate for CBR, but can be wildly inaccurate for VBR.
What I'm getting at is that these files might not actually be broken. They might just be VBR files that your player doesn't know the duration of.
If you're convinced they are broken (such as stopping in the middle of content), then you'll have to figure out how you want to handle it. There are probably only a couple ways to determine this:
Ideally, there's an ID3 tag indicating duration, and you can decode the whole file and determine its real duration to compare.
Usually, that ID3 tag won't exist, so you'll have to check to see if the last frame is complete or not.
Beyond that, you don't really have a good way of knowing if the stream is incomplete, since there is no outer container that actually specifies number of frames to expect.
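One pragmatic way to run the "decode the entire file" check from node.js is to shell out to ffmpeg and decode to the null muxer, treating decoder errors as a sign of a damaged or truncated file. This is a common trick rather than something the answer above prescribes, and the file name is a placeholder:
const { execFile } = require('child_process');

// Decode the whole file, discard the output, and report only errors.
// An empty stderr suggests the MP3 decodes cleanly end to end.
function checkMp3(path, callback) {
  execFile('ffmpeg', ['-v', 'error', '-i', path, '-f', 'null', '-'],
    (err, stdout, stderr) => {
      callback(null, { ok: !err && stderr.trim().length === 0, details: stderr });
    });
}

checkMp3('upload.mp3', (err, result) => {
  console.log(result.ok ? 'decodes cleanly' : 'problems found:\n' + result.details);
});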
The expression for calculating the filesize of an mp3 based on duration and encoding (from this answer) is quite simple:
x = length of song in seconds
y = bitrate in kilobits per second
(x * y) / 8 = filesize (kB); divide by 1024 again for MB
There is also a javascript implementation for the Web Audio API in another answer on that same question. Perhaps that would be useful in your Node implementation.
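As a rough node.js illustration of that estimate (the declared duration and bitrate are assumed to come from metadata such as an ID3 tag or ffprobe, and the threshold is arbitrary):
const fs = require('fs');

// Compare the real file size with the size implied by the declared
// duration and bitrate. Only meaningful for CBR files, as noted above.
function looksTruncated(path, declaredSeconds, bitrateKbps) {
  const actualBytes = fs.statSync(path).size;
  const expectedBytes = (declaredSeconds * bitrateKbps / 8) * 1024; // kB -> bytes
  return actualBytes < expectedBytes * 0.95; // ~5% slack for headers and tags
}

console.log(looksTruncated('song.mp3', 180, 128)); // a 3-minute file at 128 kbps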
mp3diags is older open-source software for fixing MP3s, and it was great for batch processing things like this. The source is C++ and still available if you're feeling nosy and want to see how some of these features are implemented.
Worth a look since it has some features that might be useful in your context:
What is MP3 Diags and what does it do?
low quality audio
missing VBR header
missing normalization data
Correcting files that show incorrect song duration
Correcting files in which the player cannot seek correctly

Icecast: I have strange behaviour with repeats of end of tracks, as well as pitch changes from my Icecast Server

I only began using icecast a few days ago, so if I stuffed something up somewhere, please let me know.
I have a weird problem with Icecast. Every time a track "finishes" on Icecast, a section of the end of the currently playing track (I think about 64 kB of it) is repeated about 2 to 3 times before the next song plays, and the next song doesn't begin at the start but a few seconds in. I also notice that the playback speed (and hence the pitch) sometimes differs from the original.
I consulted this post and this post (quoted below), which taught me what the <burst-on-connect> and <burst-size> tags are used for. It also taught me this:
What's happening here is that nothing is being added to the buffer, so clients connect, get the contents of that buffer, and then the stream ends. The client must be re-connecting repeatedly, and it keeps getting that same buffer.
Cheers to Brad for that post. A solution to this problem was provided in the comments of that post: decrease the <source-timeout> of the Icecast server so that it closes the connection quicker and stops any repeating. But that assumes I want to close the mountpoint, and I don't, because what I'm using Icecast for is actually a 24/7 radio player. If I did close my mountpoint, VLC would just stop and not attempt to reconnect anymore. Unless I've got that wrong; I don't know.
I use VLC to listen to the Icecast streams, and I use nodeshout, a set of libshout bindings for node.js, to send data to a number of mounts on my Icecast server. In the future I plan to build a site that will listen to the Icecast streams, replacing VLC.
icecast.xml
<limits>
  <clients>100</clients>
  <sources>4</sources>
  <queue-size>1008576</queue-size>
  <client-timeout>30</client-timeout>
  <header-timeout>15</header-timeout>
  <source-timeout>30</source-timeout>
  <burst-on-connect>1</burst-on-connect>
  <burst-size>252144</burst-size>
</limits>
This is a summary of the audio sending code on my node.js server.
nodejs
// These lines are a smaller part of a function that sets all the stream information.
// The variables name, description, etc. come from the function's arguments.
var nodeshout = require("nodeshout");
let shout = nodeshout.create();
shout.setHost('localhost');
shout.setPort(8000);
shout.setUser('source');
shout.setPassword(process.env.icecastPassword); //password in .env file
shout.setName(name);
shout.setDescription(description);
shout.setMount(mount);
shout.setGenre(genre);
shout.setFormat(1); // 0=ogg, 1=mp3
shout.setAudioInfo('bitrate', '128');
shout.setAudioInfo('samplerate', '44100');
shout.setAudioInfo('channels', '2');
return shout
// Meanwhile, somewhere lower in the file, this is a summary of how the audio is sent to the Icecast server.
var nodeshout = require("nodeshout")
var {FileReadStream, ShoutStream} = require("nodeshout") //here is where the FileReadStream and ShoutStream functions come from
const filecontent = new FileReadStream(pathToSong, 65536); // if I change 65536 to a higher value, more bytes are repeated at the end of the track; if I decrease it, it starts sounding buggy and off
var streamcontent = filecontent.pipe(new ShoutStream(shoutstream))
streamcontent.on('finish', () => {
next()
console.log("Track has finished on " + stream.name + ": " + chosenTrack)
})
I also notice weirder behaviour: only after the previous song has had its last chunk repeated a few times does the streamcontent 'finish' event in the node.js script fire and tell me that the track is finished.
What I have tried
I tried messing around with the <source-timeout> tag, the number of bytes (or bits, I'm not sure) being sent from node.js, and the burst size. I also tried turning bursting off completely, but that results in super strange behaviour.
I also suspected that creating a new stream for every song (the new ShoutStream(shoutstream) used when piping the file data) was a bad idea, but reusing the same stream meant the program threw an error, because it would try to write the next track to the ShoutStream after it had already reported itself closed.
If any more information is necessary to figure out what is going on, I can provide it. Thanks for your time.
Edit: I would like to add: do you think I should manually control how many bytes are sent to Icecast, and reuse the same stream object instead of creating a new one every time?
I found out why the stream didn't play some tracks properly, as opposed to others.
How I got there
I could not switch to Ogg/Vorbis or Ogg/Opus for my stream, so I had to do something on my source client side. I double-checked that everything was correct and that my audio files had the right bitrate. When I ran ffprobe audio.mp3, the bitrates sometimes did not adhere to the typical rates of 128 kbps, 192 kbps, 320 kbps and so on; it would be some strange value such as 129852, just to give an example.
I then downloaded the Checkmate MP3 checker here and checked my audio files, and they were all encoded with a variable bitrate! VBR, damn it!
TLDR
I fixed my problem by re-encoding all my tracks to a constant bitrate of 128kbps using ffmpeg.
Quick edit: I am pretty sure that programs such as Darkice may already support variable-bitrate streams to Icecast servers, but it would be impractical for me to use Darkice, which is why I stuck with nodeshout.
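For reference, the re-encode described in the TLDR can be scripted from node.js roughly like this; the exact libmp3lame flags are my assumption, since the answer only says the tracks were re-encoded to a constant 128 kbps with ffmpeg:
const { execFile } = require('child_process');

// Re-encode one track to constant 128 kbps MP3 before streaming it.
function reencodeToCbr(input, output, callback) {
  execFile('ffmpeg', [
    '-i', input,
    '-c:a', 'libmp3lame', // MP3 encoder
    '-b:a', '128k',       // constant bitrate
    output
  ], callback);
}

reencodeToCbr('track.mp3', 'track-cbr.mp3', (err) => {
  if (err) throw err;
  console.log('re-encoded to CBR');
});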

How to play a sound with the Audio Queue API in order to get accurate timings for the beats of a song

I am working on a musical project in which I need accurate timing. I already tried NSTimer and NSDate, but I get a delay while playing the beats (the tick-tock sounds). So I decided to use the Audio Queue API to play my sound file, which is in the main bundle in .wav format. I've been struggling with this for 2 weeks; can anybody please help me out?
Use uncompressed sounds and a single Audio Queue, continuously running. Don't stop it between beats. Instead, mix in the raw PCM samples of each tock sound (etc.), starting an exact number of samples apart, and play silence (zeros) in between. That will produce sub-millisecond accurate timing.
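The sample-offset arithmetic behind that answer is independent of the audio API; here is a small JavaScript sketch (matching the rest of this page, even though the question is about iOS) of mixing a short tick into a silent PCM buffer at exact beat positions. The BPM and sample rate are example values:
const sampleRate = 44100;
const bpm = 120;
const samplesPerBeat = Math.round(sampleRate * 60 / bpm); // 22050 samples at 120 BPM

// tick: Float32Array holding the raw PCM samples of the "tock" sound (mono)
function renderClickTrack(tick, totalBeats) {
  const out = new Float32Array(samplesPerBeat * totalBeats); // zeros = silence
  for (let beat = 0; beat < totalBeats; beat++) {
    const offset = beat * samplesPerBeat; // exact sample index of this beat
    for (let i = 0; i < tick.length && offset + i < out.length; i++) {
      out[offset + i] += tick[i]; // mix the tick over the silence
    }
  }
  return out; // feed this continuously to the audio output / queue
}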
