HLS Live streaming with S3 - are these assumptions correct? - node.js

I want to make a live stream. And, I want to use HLS.
I understand that an HLS live stream is just a main playlist file with a '.m3u8' extension that lists all the files to be played.
But, for a live stream, since all the files are not readily available, they are added as they come in.
I want to use S3 for now to host these files and the playlist file.
Now, I want to update the playlist file in S3. But it's actually going to replace the existing playlist file instead of just updating it (according to this answer).
So, I'm assuming that there will be no dead-time during the file replace. If there is a dead-time, how do I overcome it? Is this the way to do it, or is there some other better way to do this?
I'm using a NodeJS server, just FYI.
*dead-time: time when there is no file.

I want to make a live stream. And, I want to use HLS.
Why HLS? Why not DASH? DASH is also segmented and implemented almost exactly like HLS, but it has more flexibility as far as codec choice and whatnot. Either is fine, but if you're starting from scratch today, I recommend DASH, and the DASH.js reference player code, which uses Media Source Extensions.
I understand that an HLS live stream is just a main playlist file with a '.m3u8' extension that lists all the files to be played.
Correct.
But, for a live stream, since all the files are not readily available, they are added as they come in.
Correct.
Now, I want to update the playlist file in S3. But it's actually going to replace the existing playlist file instead of just updating it
Yes, and as the other answer noted, there's no difference. The playlist file will be overwritten with the new full copy. The S3 API doesn't allow appending to a file, unless you're doing a multipart upload, which really isn't the same thing. In any case, your playlist file for a live stream isn't going to contain each and every segment anyway. Usually you only keep the last handful of segments in the playlist, but it's up to you to decide how far back to go.
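As a rough sketch of what that full overwrite can look like from Node (this assumes the AWS SDK v3 package @aws-sdk/client-s3; the bucket, key, and segment names are placeholders, not anything from your setup):
// Sliding-window live playlist, fully re-uploaded on every update.
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const s3 = new S3Client({ region: 'us-east-1' });   // placeholder region
const BUCKET = 'my-live-stream-bucket';             // placeholder bucket
// Build a playlist string containing only the last few segments.
function buildPlaylist(segments, mediaSequence, targetDuration) {
  return [
    '#EXTM3U',
    '#EXT-X-VERSION:3',
    `#EXT-X-TARGETDURATION:${targetDuration}`,
    `#EXT-X-MEDIA-SEQUENCE:${mediaSequence}`,
    ...segments.flatMap(s => [`#EXTINF:${s.duration.toFixed(5)},`, s.uri]),
  ].join('\n') + '\n';
}
async function publishPlaylist(segments, mediaSequence) {
  // This is a full overwrite of the playlist object; S3 has no append.
  await s3.send(new PutObjectCommand({
    Bucket: BUCKET,
    Key: 'live/stream.m3u8',
    Body: buildPlaylist(segments, mediaSequence, 6),
    ContentType: 'application/vnd.apple.mpegurl',
    CacheControl: 'max-age=1',   // discourage players/CDNs from holding a stale playlist
  }));
}
// e.g. after segment 42 has been uploaded:
publishPlaylist([
  { uri: 'segment40.ts', duration: 6 },
  { uri: 'segment41.ts', duration: 6 },
  { uri: 'segment42.ts', duration: 6 },
], 40).catch(console.error);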
So, I'm assuming that there will be no dead-time during the file replace.
S3 doesn't replace that object until the full new object is uploaded and stored. There will never be a case where a partial file is there. S3 isn't like a regular file system. Additionally, if a subsequent upload fails, the old object is still going to remain.
HLS and DASH players read the playlist and buffer a ton of data before beginning playback. (This is why they notoriously have high latency.) It will be several seconds before the new segment is uploaded and added to the playlist, so it's important that they already have data in the buffer to play. This is why you don't have to worry about any drop-out, unless there is a failure to upload in time.
I'm using a NodeJS server, just FYI.
Is that so? Sounds like you're using S3 to me... not sure what Node.js has to do with any of this.

Related

NodeJS: Every audio file created through fs.writeFile and Blob has excessive length

I'm creating an app which needs to take the user's voice and convert it to text, but the audio file seems to have something wrong with the length after its creation.
Here is how I'm gathering the audio and converting the data to a Blob. I'm just using the getUserMedia method to record the audio and convert it to a Blob when it's stopped.
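(The snippet itself isn't included above; a typical version of that recording step, using the MediaRecorder API, looks roughly like this. The upload function is hypothetical, not the asker's code.)
let chunks = [];
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const recorder = new MediaRecorder(stream);
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    // Assemble everything recorded so far into a single Blob.
    const blob = new Blob(chunks, { type: 'audio/webm' });
    sendToServer(blob);   // hypothetical upload function
    chunks = [];
  };
  recorder.start();
  // ... later, recorder.stop() fires onstop and produces the Blob.
});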
This is the beginning of the function I pass the Blob object to. I convert the Blob to a buffer and write that buffer to a file in the root directory of my project. But once the file has been written and I go to listen to it, the length of the audio is 435:13:24 no matter how long the original audio was. But even though the length is that long, the file sounds exactly like it should and ends at the correct time.
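(Again, the original function isn't shown; a sketch of that Blob-to-buffer-to-file step in Node might look like this. The file name is made up, and Blob#arrayBuffer assumes a reasonably recent Node version.)
const fs = require('fs');
async function writeAudio(blob) {
  // Convert the Blob's contents to a Node Buffer and write it out.
  const buffer = Buffer.from(await blob.arrayBuffer());
  fs.writeFile('./recording.webm', buffer, (err) => {
    if (err) throw err;
    console.log('audio file written');
  });
}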
Here's a picture of what the file looks like when viewed.
Although this may not seem like a big deal, since listening to the file provides the correct audio, I'm passing the file to an API that converts it to text, and it almost always gives either the wrong transcription or an error about the file containing nothing. I've tried different ways of writing the Blob data to the file, and tried cutting off the excess audio, but nothing has worked.
This is my first post on stack overflow since I ran out of options to fix this, so I'm sorry if the question is kind of vague or I formatted it incorrectly in some way.

How can I detect corrupt/incomplete MP3 file, from a node.js app?

The most common situation in which the integrity of an MP3 file is compromised is when the file has been only partially uploaded to the server. In this case, the indicated audio duration doesn't correspond to what is really in the MP3 file: we can hear the beginning, but at some point the playback stops and the duration indicated by the audio player is wrong.
I tried libraries like node-ffprobe, but it seems they just read metadata, without comparing it against the real audio data in the file. Is there a way to efficiently detect a corrupted or incomplete MP3 file from Node.js?
Note: the client uploading the MP3 files is a hardware device (an audio recorder) uploading files to an FTP server, not a browser, so I'm not able to send potentially more useful data from the client.
MP3 files don't normally have a duration. They're just a series of MPEG frames. Sometimes, there is an ID3 tag indicating duration, but not always.
Players can determine duration by choosing one of a few methods:
Decode the entire audio file. This is the slowest method, but if you're going to decode the file anyway, you might as well go this route as it gives you an exact duration.
Read the whole file, skimming through frame headers. You'll have to read the whole file from disk, but you won't have to decode it. Can be slow if I/O is slow, but gives you an exact duration.
Read the first frame's bitrate and estimate duration by file size. Definitely the fastest method, and the one most commonly used by players. Duration is an estimate only, and is reasonably accurate for CBR, but can be wildly inaccurate for VBR.
What I'm getting at is that these files might not actually be broken. They might just be VBR files that your player doesn't know the duration of.
If you're convinced they are broken (such as stopping in the middle of content), then you'll have to figure out how you want to handle it. There are probably only a couple ways to determine this:
Ideally, there's an ID3 tag indicating duration, and you can decode the whole file and determine its real duration to compare.
Usually, that ID3 tag won't exist, so you'll have to check to see if the last frame is complete or not.
Beyond that, you don't really have a good way of knowing if the stream is incomplete, since there is no outer container that actually specifies number of frames to expect.
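If you do go the "check whether the last frame is complete" route, a very rough sketch in Node might look like this. It only handles MPEG-1 Layer III, ignores ID3 tags, and is meant to illustrate the frame walk rather than be production parsing code:
const fs = require('fs');
// Bitrate (kbps) and sample rate (Hz) tables for MPEG-1 Layer III.
const BITRATES = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const SAMPLE_RATES = [44100, 48000, 32000];
function frameLength(buf, offset) {
  // 11-bit sync word (0xFFE), then the MPEG-1 / Layer III version and layer bits.
  if (buf[offset] !== 0xff || (buf[offset + 1] & 0xe0) !== 0xe0) return null;
  if ((buf[offset + 1] & 0x1e) !== 0x1a) return null;   // only MPEG-1 Layer III handled here
  const bitrateIndex = (buf[offset + 2] >> 4) & 0x0f;
  const sampleRateIndex = (buf[offset + 2] >> 2) & 0x03;
  const padding = (buf[offset + 2] >> 1) & 0x01;
  if (bitrateIndex === 0 || bitrateIndex === 15 || sampleRateIndex === 3) return null;
  return Math.floor((144 * BITRATES[bitrateIndex] * 1000) / SAMPLE_RATES[sampleRateIndex]) + padding;
}
function lastFrameLooksComplete(path) {
  const buf = fs.readFileSync(path);
  // Find the first frame sync (a real parser would skip an ID3v2 tag properly).
  let offset = 0;
  while (offset < buf.length - 4 && frameLength(buf, offset) === null) offset++;
  // Walk frame to frame; the file looks truncated if the last frame runs past EOF.
  while (offset < buf.length - 4) {
    const len = frameLength(buf, offset);
    if (len === null) break;                     // lost sync, e.g. a trailing ID3v1 tag
    if (offset + len > buf.length) return false; // final frame is cut short
    offset += len;
  }
  return true;
}
console.log(lastFrameLooksComplete('upload.mp3') ? 'looks complete' : 'looks truncated');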
The expression for calculating the filesize of an mp3 based on duration and encoding (from this answer) is quite simple:
x = length of song in seconds
y = bitrate in kilobits per second
(x * y) / (8 * 1024) = filesize (MB)
(dividing by 8 converts kilobits to kilobytes, and dividing by 1024 converts kilobytes to megabytes)
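For reference, that estimate translated into Node looks roughly like this (function names are made up, and it only makes sense for CBR files):
// Size/duration estimates based on the CBR formula above.
function estimatedSizeMB(durationSeconds, bitrateKbps) {
  return (durationSeconds * bitrateKbps) / (8 * 1024);
}
function estimatedDurationSeconds(sizeMB, bitrateKbps) {
  return (sizeMB * 8 * 1024) / bitrateKbps;
}
console.log(estimatedSizeMB(180, 128).toFixed(2));   // ~2.81 MB for a 3-minute 128 kbps file
console.log(estimatedDurationSeconds(2.81, 128));    // ~180 seconds back from that size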
There is also a JavaScript implementation for the Web Audio API in another answer on that same question. Perhaps that would be useful in your Node implementation.
mp3diags is some older open-source software for fixing MP3s, and it was great for batch processing stuff like this. The source is C++ and still available if you're feeling nosy and want to see how some of these features are implemented.
Worth a look since it has some features that might be useful in your context. From its "What is MP3 Diags and what does it do?" page, the relevant ones include:
detecting low quality audio
detecting a missing VBR header
detecting missing normalization data
correcting files that show incorrect song duration
correcting files in which the player cannot seek correctly

Download and play partial audio files in Python

I have an audio streaming application that uses requests to download the audio file and then played using Gstreamer.
I want to trim the first few seconds of all the audio files that I have. I could use ffmpeg to trim them, but that would waste CPU resources on my embedded platform and also waste network bandwidth.
(The number of songs is around 1000, and they get downloaded continuously, so it does make a difference.)
I have tried downloading a partial file using the Range header in requests, but that doesn't work. I can't play the file.
Can someone please tell me how I can make this work?
The audio files are generally .m4a / .webm, but they are extracted from YouTube, so I can't say for sure.
This is not an easy task; there is no clean way to do it.
You can probably use the valve element, set it to drop by default,
and then add a timer which sets drop to false.
I'm not sure how well this will work; you'll need to try.
Here are some hints:

Smooth video playback with HTTP Live Stream

I'm trying to save an HTTP live video stream to a file. I know that for this purpose, I will need to request the M3U8 file periodically, parse it to extract the URLs of the media segments, download the segments, and reassemble them. The problem I'm having is finding the right strategy to achieve smooth playback. The reassembled video is always choppy, the audio skips, etc.; only the first few seconds are okay. My M3U8 file looks something like this:
#EXTM3U
#EXT-X-TARGETDURATION:2
#EXT-X-VERSION:3
#EXT-X-ALLOW-CACHE:NO
#EXT-X-MEDIA-SEQUENCE:1105
#EXTINF:1.00000,
tmp/video1.ts
#EXTINF:1.00000,
tmp/video2.ts
#EXTINF:1.00000,
tmp/video2.ts
#EXTINF:1.00000,
tmp/video3.ts
#EXTINF:1.00000,
tmp/video4.ts
#EXTINF:1.00000,
tmp/video5.ts
After I parse the file, I start downloading all the TS files (one at a time), and when I'm about to download the second from last, I request a new M3U8 file. Is this wrong? Maybe the server has not yet updated the segments, and I'm therefore re-downloading the same ones? I tried waiting for 5 seconds (number_of_videos * duration) before requesting a new playlist, but I still experience the playback issues mentioned. Any idea/strategy on how I can achieve smooth playback?
The basic strategy is more or less as follows.
You start by processing the manifest file and downloading the first few segments to fill your buffer. You begin playback once you are happy you have enough data in the buffer, while continuing to download additional segments sequentially until the end of the manifest, at which point you request it again. If you find new segments in the refreshed manifest, you add their URLs to your download queue. If you do not, you wait for a predetermined period of time and refresh the manifest again. For example, your client could poll the M3U8 manifest at an interval of roughly (segment duration * number of segments in the playlist) / 2, i.e. half the total playlist duration.
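As a very rough sketch of that loop in Node (this assumes Node 18+'s built-in fetch; the playlist URL is a placeholder, and the parsing and buffering are deliberately naive):
const PLAYLIST_URL = 'https://example.com/live/stream.m3u8';   // placeholder
const downloaded = new Set();   // segment URIs we've already fetched
const queue = [];               // segment URIs waiting to be fetched
async function refreshPlaylist() {
  const text = await (await fetch(PLAYLIST_URL)).text();
  // Segment lines are everything that isn't a #-tag.
  const segments = text.split('\n').map(l => l.trim()).filter(l => l && !l.startsWith('#'));
  for (const uri of segments) {
    if (!downloaded.has(uri) && !queue.includes(uri)) queue.push(uri);
  }
}
async function downloadNext() {
  const uri = queue.shift();
  if (!uri) return;                                     // nothing new yet
  const res = await fetch(new URL(uri, PLAYLIST_URL));  // resolve relative to the playlist
  const data = Buffer.from(await res.arrayBuffer());
  downloaded.add(uri);
  // append `data` to your output file or hand it to the player's buffer here
}
async function run() {
  await refreshPlaylist();
  // With the 1-second segments from the question, the playlist covers about 6 s,
  // so refresh roughly every half playlist duration and keep draining the queue.
  setInterval(refreshPlaylist, 3000);
  setInterval(downloadNext, 500);
}
run().catch(console.error);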
I know some commercial players enter a paranoid mode when the playback buffer is getting low and the refreshed manifest does not contain any new segments to download. In this case, they begin requesting the updated manifest much more frequently.
You also need to pay attention to caching between your client and the HTTP server hosting the content. For example, some CDNs will cache the manifest file for a minimum mandatory period of time, so if you try to request it within this period, you may get a stale manifest file served back to you.
From your example above (and I hope this is just your hand-crafted example), the duration of each segment appears to be 1 second, which is quite low. If this is really the case, you would want to adjust your initial buffer accordingly.
And lastly, I presume you've verified the source stream with a stable player, to make sure the problem is not on the other end?

How does youtube support starting playback from any part of the video?

Basically I'm trying to replicate YouTube's ability to begin video playback from any part of hosted movie. So if you have a 60 minute video, a user could skip straight to the 30 minute mark without streaming the first 30 minutes of video. Does anyone have an idea how YouTube accomplishes this?
Well the player opens the HTTP resource like normal. When you hit the seek bar, the player requests a different portion of the file.
It passes a header like this:
Range: bytes=10001-
and the server serves the resource from that byte range. Depending on the codec, it may need to read until it gets to a sync frame to begin playback.
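From Node, that request can be made like this (assuming Node 18+'s built-in fetch; the URL and byte offset are just examples):
async function fetchFromOffset() {
  const res = await fetch('https://example.com/video.mp4', {   // placeholder URL
    headers: { Range: 'bytes=10001-' },                        // everything from byte 10001 onward
  });
  console.log(res.status);                        // 206 Partial Content if the server honours the range
  console.log(res.headers.get('content-range'));  // e.g. "bytes 10001-1048575/1048576"
  return Buffer.from(await res.arrayBuffer());
}
fetchFromOffset().catch(console.error);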
Video is a series of frames, played at a frame rate. That said, there are some rules about the order in which frames can be decoded.
Essentially, you have reference frames (called I-Frames) and you have modification frames (called P-Frames and B-Frames)... It is generally true that a properly configured decoder will be able to join a stream on any I-Frame (that is, start decoding), but not on P- and B-Frames... So, when the user drags the slider, you're going to need to find the closest I-Frame and decode from that...
This may of course be hidden under the hood of Flash for you, but that is what it will be doing...
I don't know how YouTube does it, but if you're looking to replicate the functionality, check out Annodex. It's an open standard that is based on Ogg Theora, but with an extra XML metadata stream.
Annodex allows you to have links to named sections within the video or temporal URIs to specific times in the video. Using libannodex, the server can seek to the relevant part of the video and start serving it from there.
If I were to guess, it would be some sort of selective data retrieval, like the Range header in HTTP. That might even be what they use. You can find more about it here.
