Does YouTube store video and audio separately?

youtube-dl can be used to see what formats are used to store YouTube content:
youtube-dl -F https://youtu.be/??????
The above command hints that the audio and video are mostly stored separately. Is that right? Does YouTube streaming combine audio and video in real time?
Formats for a sample YouTube video
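A quick way to check this yourself: pick a video-only and an audio-only format id from that listing, or just use the bestvideo+bestaudio shorthand, and youtube-dl will download both streams and mux them locally (this needs ffmpeg installed):
youtube-dl -f "bestvideo+bestaudio" https://youtu.be/??????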

Most large streaming services will use ABR streaming (see: https://stackoverflow.com/a/42365034/334402).
The two most common ABR streaming formats are HLS and MPEG-DASH. Both provide a manifest or index file, which the player downloads first and which contains links to the individual media streams: typically audio, video, subtitle tracks etc.
For encrypted content the audio and video, and even different bit rate video tracks, may all have separate encryption keys.
The player will download the audio and video tracks and synchronise them for playback.
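As a rough illustration, an HLS master playlist references the audio track separately from the video variants; this one is invented for the example, and the URIs and codec strings are placeholders:
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English",DEFAULT=YES,URI="audio/128k/index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2",AUDIO="aud"
video/720p/index.m3u8
The player fetches this first, then pulls the audio and video segment playlists it points to.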

In general, streaming video and audio are sent in separate channels, and the same goes for multi-track audio such as 5.1. During transport these channels are wrapped in a media container like mp4.
The motive is partly the distinct compression algorithms: some algorithms are best for audio, others for video, and baked into the video algorithms is the spreading and sharing of data over time across video frames (see B-frames for details). These channels are not limited to video and audio; if you own both the sending and receiving sides you can send arbitrary data in many distinct channels by making up your own data protocol. As an aside, modern codecs like H.265 allow data to be sent from the receiver back to the sender while you think you are simply viewing a movie (read the RFC).
YouTube stores each of its various flavors of video and audio in separate files on its end, then combines them, based on the desired streaming quality, on a per-download basis.
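That combining step is easy to reproduce locally with ffmpeg; the file names here are placeholders:
ffmpeg -i video_only.mp4 -i audio_only.m4a -c copy -map 0:v:0 -map 1:a:0 combined.mp4
-c copy remuxes without re-encoding, which is essentially what youtube-dl does after fetching the two streams.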

Related

ffmpeg - correctly handling misaligned audio/video input stream before outputting to rtmp

I use a video player called MPV to transcode a dynamic playlist of media files.
I pipe MPV's encoded output into FFMPEG and format it for rtmp delivery.
However, the playlist may contain media with misaligned audio and video, i.e. the audio track may be shorter or longer than the video track.
No matter what, MPV will only output what it's given. So if my media file has audio that is 1 second long and video that is 2 seconds long, it will output a media stream with exactly the same misalignment, rather than generating null audio or skipping to the next item in the playlist when it first encounters an active stream ending (EOF).
For example, assuming my playlist was full of problematic media where the audio and video of each file were misaligned: if I output this media stream to a popular streaming service's server, it could lead to stuttering and/or loss of A/V sync. Similarly, if I output this media stream to a file and played it back in MPV or another video player, the same gaps show up in the result.
I have tried to fix this in MPV in all sorts of ways, trying every relevant command-line option available. I even wrote a user script that detects audio EOF and skips to the next item in the playlist, but it is not fast enough and still leaves small gaps of audio.
So my only hope is correcting it in ffmpeg. In the event of null audio/video, I need a fallback or a generative filter that can fill these empty gaps with silence (audio) or a colour/image (video).
I'm open to any ideas, and if my understanding of a/v encoding is a little off, please educate me.
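One sketch of the ffmpeg side, assuming the audio track is the short one: the apad audio filter pads the audio with silence indefinitely, and -shortest then cuts the output to the video's length. The input and RTMP URL are placeholders, and depending on the container MPV emits you may need an explicit -f before -i:
ffmpeg -i pipe:0 -af apad -c:v libx264 -c:a aac -shortest -f flv rtmp://example.com/live/streamkey
For the converse case, where the video ends before the audio, the tpad video filter can clone the last frame until the audio runs out.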

Routing AVPlayer audio output to AVAudioEngine

Due to the richness and complexity of my app's audio content, I am using AVAudioEngine to manage all audio across the app. I am converting every audio source to be represented as a node in my AVAudioEngine graph.
For example, instead of using AVAudioPlayer objects to play mp3 files in my app, I create AVAudioPlayerNode objects using buffers of those audio files.
However, I do have a video player in my app that plays video files with audio using AVPlayer (I know of nothing else in iOS that can play video files). Unfortunately, there seems to be no way to obtain the audio output stream as a node in my AVAudioEngine graph.
Any pointers?
If you have a video file, you can extract the audio data from the video.
Then set the AVPlayer's volume to 0 (if you didn't remove the audio data from the video) and play the audio through the AVAudioPlayerNode.
If you receive the video data over the network, you will need to write a parser for the packets and split the streams yourself.
But be warned: A/V sync is a very tough thing.
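For the local-file case, the extraction step the answer mentions is a one-liner with ffmpeg; movie.mp4 and audio.m4a are placeholders, and -c:a copy assumes the track is already AAC (otherwise re-encode):
ffmpeg -i movie.mp4 -vn -c:a copy audio.m4a
You would then load the extracted file via AVAudioFile, schedule it on the AVAudioPlayerNode, and keep the muted AVPlayer for the picture.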

How to stream video in Node.js, or directly from AWS S3, in chunks to the client side the way YouTube, SonyLIV and Facebook do

Let's assume I save a video in an AWS S3 bucket, or it's saved on my Ubuntu server, and now I want to send that video in streams the way all the streaming giants do.
Suppose their video is 1:00 long. They send only a few seconds of data at a time, so we don't need to download the complete video on our side. How do they do it? I tried the same, but when I look at the network tab it isn't like that: the whole video loads at once, and I don't see any chunks in the network requests.
YouTube, SonyLIV, Netflix, Prime Video, Facebook and many other video giants send video chunk by chunk so that it doesn't put too much load on the client side.
One of the links I studied shows the video being saved like this (where abc is the video name):
abc
  144
    audio
      segment0
      segment3
      segment6
      segment9
      ... and so on
    video
      segment0
      segment3
      segment6
      segment9
      ... and so on
  240
    audio
      segment0
      segment3
      segment6
      segment9
      ... and so on
    video
      segment0
      segment3
      segment6
      segment9
      ... and so on
In the above, 144 is the bit rate, and similarly there are other bit rates alongside it. How can I send data like this? First I need to save the data in that layout.
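For what it's worth, ffmpeg's dash muxer can generate this kind of layout for you. This is a sketch: abc.mp4, the bitrates and sizes are placeholders, and the default segment names will differ slightly from the listing above:
mkdir -p abc
ffmpeg -i abc.mp4 \
  -map 0:v -map 0:v -map 0:a \
  -c:v libx264 -c:a aac \
  -b:v:0 250k -s:v:0 256x144 \
  -b:v:1 700k -s:v:1 426x240 \
  -seg_duration 6 \
  -adaptation_sets "id=0,streams=v id=1,streams=a" \
  -f dash abc/manifest.mpd
This writes an init segment plus numbered media segments per rendition, and a manifest.mpd that a DASH player uses to pick the right bitrate on its own.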
On YouTube, Netflix and Prime Video there are options for changing the bitrate and the subtitles. How do they save the data and send it to the client? They also change the quality according to the network speed, so how do I achieve that?
How can I send my video in similar chunks using Node.js?
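Once the segments exist, the Node.js part is mostly just serving static files: the player requests the manifest and then the individual segments, which is exactly the chunked traffic you were expecting to see. Even for a single plain MP4, players will fetch it in pieces if the server honors HTTP Range requests. A minimal sketch in TypeScript (the media folder name, port and MIME choices are assumptions; run with ts-node or compile with tsc):
import { createServer } from "node:http";
import { createReadStream, statSync } from "node:fs";
import { extname, join, normalize } from "node:path";

const ROOT = "media"; // hypothetical folder holding abc/144/video/segment0 etc.

const TYPES: Record<string, string> = {
  ".mpd": "application/dash+xml",
  ".m4s": "video/iso.segment",
  ".mp4": "video/mp4",
};

createServer((req, res) => {
  // Resolve the requested file and refuse paths escaping the media root.
  const filePath = normalize(join(ROOT, req.url ?? "/"));
  if (!filePath.startsWith(ROOT)) { res.writeHead(403); res.end(); return; }

  let size: number;
  try { size = statSync(filePath).size; } catch { res.writeHead(404); res.end(); return; }

  const type = TYPES[extname(filePath)] ?? "application/octet-stream";
  const range = req.headers.range; // e.g. "bytes=0-1048575" sent by the player

  if (range) {
    // Serve only the requested byte window: this is the "chunk" in the network tab.
    const [, s, e] = /bytes=(\d*)-(\d*)/.exec(range) ?? [];
    const start = s ? Number(s) : 0;
    const end = e ? Number(e) : size - 1;
    res.writeHead(206, {
      "Content-Type": type,
      "Content-Range": `bytes ${start}-${end}/${size}`,
      "Content-Length": end - start + 1,
      "Accept-Ranges": "bytes",
    });
    createReadStream(filePath, { start, end }).pipe(res);
  } else {
    // No Range header: send the whole file but advertise that ranges are accepted.
    res.writeHead(200, { "Content-Type": type, "Content-Length": size, "Accept-Ranges": "bytes" });
    createReadStream(filePath).pipe(res);
  }
}).listen(8080);
The 206 Partial Content responses are what show up as chunk-sized requests in the browser. The adaptive quality switching you see on YouTube or Netflix is handled client-side by a DASH/HLS player such as dash.js or hls.js, not by the server.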

mpeg-dash and codecs specification

Looking at this article: http://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-is-MPEG-DASH-79041.aspx
It makes statements like:
DASH is codec-independent, and will work with H.264, WebM and other codecs.
DASH supports both the ISO Base Media File Format (essentially the MP4 format) and MPEG-2 transport streams.
DASH does not specify a DRM method but supports all DRM techniques specified in ISO/IEC 23001-7: Common Encryption.
But how is the audio/video compression, or the DRM method, specified in the Media Presentation? Where can I find more details?
DASH is a streaming protocol - the video stream is inside a 'container' and the container is broken into chunks and streamed. A very high level view of the video component is:
elementary video stream encoded with some codec
fragmented mp4 container (broken into chunks to facilitate ABR)
MPEG DASH streaming protocol
The mp4 container header contains information about all the streams inside it, including the codec used to encode each stream (e.g. h.264 for a video stream).
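You can see this container-level codec information yourself with ffprobe (input.mp4 is a placeholder):
ffprobe -v error -show_entries stream=index,codec_type,codec_name input.mp4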
ABR essentially allows the client device or player to download the video in chunks, e.g. 10-second chunks, and to select the next chunk from the bit rate most appropriate to the current network conditions.
The DASH manifest (essentially an index file that contains pointers to the different bit rate streams etc) contains header information about the protections systems in use, for example Widevine or PlayReady DRMs.
The mp4 container also contains information about the protection system in a special PSSH (Protection System Specific Header) box for each protection system in use, for example again, Widevine or PlayReady.
Generally DASH streams will have the protection information in both places to ensure that all players can play the stream, but last time I looked, I think the spec strictly speaking says it can be in either or both.
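For illustration, here is roughly what the manifest side looks like in an MPD excerpt; the values are made up for the example:
<AdaptationSet mimeType="video/mp4">
  <!-- Common Encryption signalling -->
  <ContentProtection schemeIdUri="urn:mpeg:dash:mp4protection:2011" value="cenc"/>
  <!-- Widevine's well-known system ID -->
  <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"/>
  <Representation id="720p" codecs="avc1.64001f" bandwidth="2000000" width="1280" height="720"/>
</AdaptationSet>
The codecs attribute on the Representation answers the compression question, and the ContentProtection elements answer the DRM one.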
The specs themselves are available here:
http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html (search for DASH)
https://www.iso.org/standard/68042.html - unfortunately, this one requires payment AFAIK. You can see a W3C spec which uses it here, however: https://w3c.github.io/encrypted-media/format-registry/stream/mp4.html
And there is a nice overview of DASH here:
https://www.w3.org/2011/09/webtv/slides/W3C-Workshop.pdf
And, of course, the classic reference to some of the drivers for DASH and similar standards:
https://xkcd.com/927/

Streaming video from nodejs to an open player

Oddball question for somebody just getting started with HTML5 players and streaming video...
When using YouTube, long videos can be scrubbed towards the end and then played from there. I assume YouTube first pulls down metadata like the total video start/stop points and a bunch of thumbnails for scrubbing.
Is this possible with an open HTML5 video player (like projekkter)? The reason I'm asking is that I have video data inside a mongo database that I would like to stream similarly to the YouTube player.
Inside mongo I have a bunch of smaller h264 files, each in its own document: the actual raw h264, usually 1000 kB (max 2 seconds); a creation timestamp (long); and potentially a converted format (like mp4) for known clients. The idea is to query over a time range, order by creation time, and then pipe the results into a readable stream. There is a nice ffmpeg module that takes streams and reformats them if needed. I have thought about piping the stream to the client with binaryjs and appending it into the player.
But the source directives in the documentation are usually URLs, plus I need to lock down the start/stop points for the total video being played, plus thumbnails.
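One caveat if you try this: raw h264 chunks are not something an HTML5 video element can play directly. A common approach, sketched here with placeholder names, is to remux each chunk into fragmented MP4 (raw h264 carries no timestamps, hence the explicit -framerate), which a browser-side player can then append to a Media Source Extensions SourceBuffer:
ffmpeg -framerate 30 -i chunk.h264 -c copy -movflags frag_keyframe+empty_moov chunk.mp4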
