NodeJS Server Side MP3 'Playback'? - node.js

This is my current approach to fake a Radio-Like stream with node.
Node ReadStream
This ReadStream just reads an mp3 and streams it to a html5 based audio player.
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Counter
This counter represents the current playback position of the RadioStream.
It keeps incrementing each second to simulate playback. Once a client connects to the server, the stream will start at the counters position. The only thing which I do not get around is the correct increment size of the counter.
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Or is there a way to offset a mp3 stream by seconds ?
Metadata
Once I have the correct position, it will be super easy to build a playlist with Metadata, such as Song Name, Composer etc and push them to the client via socketio.
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
If you have an idea how to solve this better please let me know.
I also tried using icecast with the following source clients :
vlc: cuesheet support is buggy, it is not forwarding metadata
Liquidsoap : cuesheet support is buggy, it is not forwarding
metadata
I tried executing ezstream with node and start the counter which increments in seconds, but the counter gets very fast out of sync.
Looks like my approch is anything else than ideal, so how to solve this in a smarter way please?

Related

How to have smooth playback experience in playlists?

After creating the playlist with mp4 URLs the loading time between two mp4 files is high and the stream is not running smoothly. Please let me know if this can be fix by changing some settings on the server.
Let me explain the best practices for that. I hope it helps.
Improve WebRTC Playback Experience
ATTENTION: It does not make sense to play the stream with WebRTC because it’s already recorded file and there is no ultra low latency requirement. It make sense to play the stream with HLS. Just keep in mind that WebRTC playback uses more processing resources than the HLS. Even if you would like to decrease the amount of time to switch streams, please read the followings.
Open the embedded player(/usr/local/antmedia/webapps/{YOUR_APP}/play.html)
Find the genericCallback method and decrease the timeout value from 3000 to 1000 or even lower at the end of the genericCallback method. It’s exactly this line
Decrease the key frame interval of the video. You can set to 1 seconds. Generally recommend value is 2 seconds. WebRTC needs key frame to start the play. If the key frame interval is 10 seconds(default in ffmpeg), player may wait up to 10 seconds to play.
Improve HLS Playback Experience
Open the properties file of the application -> /usr/local/antmedia/webapps/{YOUR_APP}/WEB-INF/red5-web.properties
Add the following property
settings.hlsflags=delete_segments+append_list+omit_endlist
Let me explain what it means.
delete_segments just deletes the segment files that is out of the list so that your disk will not get full.
append_list just adds the
new segment files to the older m3u8 file so that player thinks that it’s just playing the same stream.
omit_endlist disables writing the
EXT-X-ENDLIST to the end of the file so player thinks that new segments are in their way and it wait for them. It does not run
stopping the stream.
Disable deleting hls files on ended to not encounter any race condition. Continue editing the file /usr/local/antmedia/webapps/{YOUR_APP}/WEB-INF/red5-web.properties and replace the following line
settings.deleteHLSFilesOnEnded=true with this one
settings.deleteHLSFilesOnEnded=false
Restart the Ant Media Server
sudo service antmedia restart
antmedia.io

Icecast: I have strange behaviour with repeats of end of tracks, as well as pitch changes from my Icecast Server

I only began using icecast a few days ago, so if I stuffed something up somewhere, please let me know.
I have a weird problem with icecast. Everytime a track is "finished" on icecast, a section of the end of the currently playing track (i think 64kbs of the track) is repeated about 2 to 3 times before the next song plays, but the next song doesn't begin playing in the start, but a few seconds of the way through. Also, I can notice that the playback speed (and hence the pitch) sometimes differs from the original as well.
I consulted this post and this post that was quoted below which taught me what the <burst-on-connect> and the <burst-size> tags are used for. It also taught me this:
What's happening here is that nothing is being added to the buffer, so clients connect, get the contents of that buffer, and then the stream ends. The client must be re-connecting repeatedly, and it keeps getting that same buffer.
Cheers to Brad for that post. A solution to this problem was provided in a comments section of that post and it said to decrease the <source-timeout> of the icecast server, so that it will close the connection quicker and stop any repeating. But this is assuming I want to close the mountpoint, and I dont, because what I am using Icecast for is actually a 24/7 radio player. If I did close my mountpoint, then what happens is VLC just turns off and doesn't repeatedly attempt to connect anymore. Unless this is wrong. I don't know.
I use VLC to hear the playback of the icecast streams and I use nodeshout which is a bunch of bindings from libshout built for node.js. I use nodeshout to send data to a bunch of mounts on my icecast server. In the future I plan to make a site that will listen to the icecast streams, meaning it will replace VLC.
icecast.xml
<limits>
<clients>100</clients>
<sources>4</sources>
<queue-size>1008576</queue-size>
<client-timeout>30</client-timeout>
<header-timeout>15</header-timeout>
<source-timeout>30</source-timeout>
<burst-on-connect>1</burst-on-connect>
<burst-size>252144</burst-size>
</limits>
This is a summary of the audio sending code on my node.js server.
nodejs
// these lines of code is a smaller part of a function, and this sets all the information. The variables name, description etc come from the arguments of the function
var nodeshout = require("nodeshout");
let shout = nodeshout.create();
shout.setHost('localhost');
shout.setPort(8000);
shout.setUser('source');
shout.setPassword(process.env.icecastPassword); //password in .env file
shout.setName(name);
shout.setDescription(description);
shout.setMount(mount);
shout.setGenre(genre);
shout.setFormat(1); // 0=ogg, 1=mp3
shout.setAudioInfo('bitrate', '128');
shout.setAudioInfo('samplerate', '44100');
shout.setAudioInfo('channels', '2');
return shout
// now meanwhile somewhere lower in the file, there is this summary of how the audio is sent to the icecast server
var nodeshout = require("nodeshout")
var {FileReadStream, ShoutStream} = require("nodeshout") //here is where the FileReadStream and ShoutStream functions come from
const filecontent = new FileReadStream(pathToSong, 65536); //if I change the 65536 to a higher value, then more bytes are being repeated at the end of the track. If I decrease this, it starts sounding buggy and off.
var streamcontent = filecontent.pipe(new ShoutStream(shoutstream))
streamcontent.on('finish', () => {
next()
console.log("Track has finished on " + stream.name + ": " + chosenTrack)
})
I also notice weirder behaviour. After the previous song had it's last chunk repeated a few times, that's when the server calls the streamcontent.on('finish') event that is located in the nodejs script, and only then does it warn me that the track is finished.
What I have tried
I tried messing around with the <source-timeout> tag, the number of bytes (or bits im not sure) that are being sent on nodejs, the burst size, I also tried turning bursting off completely but it results in super strange behavior.
I also thought creating a new stream every time per song was a bad idea as seen in new ShoutStream(shoutstream) when piping the file data, but using the same stream meant that the program would return an error because it would write the next track to the shoutstream after it had said it had closed.
If any more information is necessary to figure out what is going on, I can provide it. Thanks for your time.
Edit: I would like to add: Do you think I should manually control how many bytes are sent to icecast and then use the same stream object instead of calling a new one every time?
I found out why the stream didn't play some tracks as opposed to others.
How I got there
I could not switch to ogg/vorbis or ogg/opus for my stream, so I had to do something with my source client. I double checked everything was correct and that my audio files were in the correct bitrate. When i ran the ffprobe tool with ffprobe audio.mp3 sometimes the bitrates did not adhere to the typical rates of 120kbps, 128kbps, 192, 312, etc etc so on. It was always some strange value such as 129852 just to provide an example.
I then downloaded the checkmate mp3 checker here and checked my audio files, and they were all encoded in a variable bitrate!!! VBR damnit!
TLDR
I fixed my problem by re-encoding all my tracks to a constant bitrate of 128kbps using ffmpeg.
Quick Edit: I am pretty sure that programs such as Darkice might already support variable bit rate transfers to Icecast servers, but it would be impractical for me to use darkice, hence why I stuck with nodeshout.

How to get exact timestamp of audio recording start in with PortAudio?

We are using PortAudio for recording audio in our electron application. As a node wrapper, we use naudiodon.
The application needs to record both audio and video, but using different sources. Audio, as said, is being recorded with Port Audio, with additional app logic on top. Video, on the other hand, is being recorded with standard MediaRecorder API, with its own formats, properties, and codecs.
We use event 'onstart' to track actual video start and in order to sync audio and video, we must also know the exact audio start time.
Problem is: We are not able to detect that exact timestamp of audio start. What should be the correct way of doing it?
Here is what we tried:
 1. The first option is to listen to portaudio.AudioIO events, such as 'data' and 'readable'. Those are called as soon as PortAudio has new data chunk, so tracking the very first chunk minus its length in milliseconds would result in approximate audio start.
 2. The second option is to add Writable pipe to AudioIO, and do pretty much the same thing as with events.
The issue is, that by doing any of those options, calculated start doesn't always result in the actual timestamp of audio start. While playing around with port audio it was known, that calculated timestamp is higher than it should be, as though some chunks are being buffered before actually released.
Actual audio start and first chunk release can be different, in a range of around 50 - 500 ms with chunk length ~50ms. So chunks might buffer sometimes, and sometimes they don't. Is there any way to track the actual start time of the first chunk? I wasn't able to find any relevant info in checking port audio docs.
Maybe there are any other ways to keep using PortAudio and record video separately, but finally achieve the same desired feature, of synching them together?
PortAudio 19.5, Naudiodon 2.1.0, Electron 6.0.9, Node.js 12.4.0

NodeJS Simulate Live Video Stream

I have a video file that I would like to start broadcasting from NodeJS, preferably through Express, at a given time. That is, if the video starts being available at timestamp t0, then if a client hits the video endpoint at time t0+60, the video playback would start at 60 seconds in.
My key requirement is that when a client connect at a given time, no more of that video be available than what would have been seen so far, so the client connecting at t0+60 would not be able to watch past the minute mark (plus some error threshold) initially, and every ~second, another second of video availability would be added, simulating a live experience synced across all clients regardless of when each loads the stream.
So far, I've tried my luck converting videos to Apple's HLS protocol (because the name sounds promising) and I was able to host the m3u8 files using Node's hls-server library, where the call is very straightforward:
import HLSServer = require('hls-server');
import http = require('http');
const source = __dirname + '/resources';
const server = http.createServer();
const hls = new HLSServer(server, {
path: '/streams', // Base URI to output HLS streams
dir: source // Directory that input files are stored
});
server.listen(8000);
However, it sends the entire video to the browser when asked, and appears to offer no option of forcing a start at a given frame. (I imagine forcing the start position can be done out of band by simply sending the current time to the client and then having the client do whatever is necessary with HTML and Javascript to advance to the latest position).
There are some vague approaches that I saw online that use MP4, but from what I understand, due to its compression, it is hard to know how many bytes of video data correspond to what footage duration as it may widely vary.
There are also some other tutorials which have a direct pipe from an input source such as a webcam, thereby requiring liveness, but for my comparatively simple use case where the video file is already present, I'm content with the ability to maintain a limited amount of precision, such as ±10 seconds, just as long as all clients are forced to be approximately in sync.
Thank you very much in advance, and I appreciate any pointers.

Audio streaming by websockets

I'm going to create voice chat. My backend server works on Node.js and almost every connection between client and server uses socket.io.
Is websockets appropriate for my use case? I prefer communication client -> server -> clients than P2P because I expect even 1000 clients connected to one room.
If websocket is ok, then which method is the best to send AudioBuffer to server and playback on other clients? I do it like that:
navigator.getUserMedia({audio: true}, initializeRecorder, errorCallback);
function initializeRecorder(MediaStream) {
var audioCtx = new window.AudioContext();
var sourceNode = audioCtx.createMediaStreamSource(MediaStream);
var recorder = audioCtx.createScriptProcessor(4096, 1, 1);
recorder.onaudioprocess = recorderProcess;
sourceNode.connect(recorder);
recorder.connect(audioCtx.destination);
}
function recorderProcess(e) {
var left = e.inputBuffer.getChannelData(0);
io.socket.post('url', left);
}
But after receive data on other clients I don't know how to playback this Audio Stream from Buffer Arrays.
EDIT
1) Why if I don't connect ScriptProcessor (recorder variable) to destination, onaudioprocess method isn't fired?
Documentation info - "although you don't have to provide a destination if you, say, just want to visualise some audio data" - Web Audio concepts and usage
2) Why I don't hear anything from my speakers after connect recorder variable to destination and if I connect sourceNode variable directly to destination, I do.
Even if onaudioprocess method doesn't do anything.
Anyone can help?
I think web sockets are appropriate here. Just make sure that you are using binary transfer. (I use BinaryJS for this myself, allowing me to open up arbitrary streams to the server.)
Getting the data from user media capture is pretty straightforward. What you have is a good start. The tricky party is on playback. You will have to buffer the data and play it back using your own script processing node.
This isn't too hard if you use PCM everywhere... the raw samples you get from the Web Audio API. The downside of this is that there is a lot of overhead shoving 32-bit floating point PCM around. This uses a ton of bandwidth which isn't needed for speech alone.
I think the easiest thing to do in your case is to reduce the bit depth to an arbitrary bit depth that works well for your application. 8-bit samples are plenty for discernible speech and will take up quite a bit less bandwidth. By using PCM, you avoid having to implement a codec in JS and then having to deal with the buffering and framing of data for that codec.
To summarize, once you have the raw sample data in a typed array in your script processing node, write something to convert those samples from 32-bit float to 8-bit signed integers. Send these buffers to your server in the same size chunks as they come in on, over your binary web socket. The server will then send these to all the other clients on their binary web sockets. When the clients receive audio data, it will buffer it for whatever amount of time you choose to prevent dropping audio. Your client code will convert those 8-bit samples back to 32-bit float and put it in a playback buffer. Your script processing node will pick up whatever is in the buffer and start playback as data is available.

Resources