Restreaming video containing two languages live with ffmpeg - audio

I have a project where I need to restream a live stream that has two languages set up in its audio:
Spanish on the left channel and English on the right.
The stream mapping is:
Stream #0:0: Video: h264 ([7][0][0][0] / 0x0007), yuv420p, 512x288 [SAR 1:1 DAR 16:9], q=2-31, 1k tbn, 1k tbc
Stream #0:1: Audio: mp3 ([2][0][0][0] / 0x0002), 44100 Hz, stereo, s16, 18 kb/s
I need to restream this live with just the English from the right channel, or just the Spanish from the left. I've looked everywhere but haven't found any kind of solution.
Since this needs to be done live, I can't use other programs to separate the video and audio.
It needs to be done through ffmpeg, and I wonder whether a stock build is even capable of this, or whether it would need custom modification.

You can use the -map_channel option or the pan filter (note that recent FFmpeg versions deprecate -map_channel in favor of the pan and channelmap filters). You didn't specify whether you want stereo or mono output; for stereo you can simply mute one channel, or duplicate one channel into both the left and right channels of the output. Here are some examples assuming you want to keep a stereo output.
To copy the right channel of the input to the left and right channels of the output:
ffmpeg -i input -map_channel 0.1.1 -map_channel 0.1.1 output
To mute the left channel:
ffmpeg -i input -map_channel -1 -map_channel 0.1.1 output
To mute the left channel using pan:
ffmpeg -i input -filter:a pan="stereo|c1=c1" output
(newer FFmpeg builds use | as the pan separator; older builds use : instead)
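As a sketch, the two single-language variants can also be assembled programmatically. The pan spec below duplicates the chosen channel into both output channels so the result stays stereo; the stream URLs are placeholders and `one_language_cmd` is a hypothetical helper, not part of any ffmpeg API:

```python
import shlex

def one_language_cmd(input_url, output_url, channel):
    """Build an ffmpeg argv that keeps one language of a stereo stream.

    channel: "c0" for the left channel (Spanish), "c1" for the right
    (English). The pan filter copies that channel into both output
    channels, so the output is still stereo.
    """
    pan = f"pan=stereo|c0={channel}|c1={channel}"
    return [
        "ffmpeg", "-i", input_url,
        "-c:v", "copy",        # pass the video through untouched
        "-filter:a", pan,      # remap the audio channels
        output_url,
    ]

# English-only variant (right channel duplicated to both sides):
print(shlex.join(one_language_cmd("rtmp://in/live", "rtmp://out/live", "c1")))
```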
FFmpeg usage questions are better suited to superuser.com, since Stack Overflow is specific to programming.

Related

How to repackage mov/mp4 file into HLS playlist with multiple audio streams

I'm trying to convert some videos (in different formats, e.g. mp4, mov) which contain one video stream and multiple audio streams into one HLS playlist with multiple audio streams (treated as languages) and only one video stream.
I've already browsed a lot of Stack Overflow threads and tried many different approaches, but I was only able to find answers for creating separate HLS playlists, each with a different audio.
Sample scenario which I have to handle:
I have one mov file, containing one video stream and 2 audio streams.
I need to create an HLS playlist from this mov file which uses the one video stream but encodes the 2 audio streams as language tracks (let's say ENG and FRA).
The HLS prepared this way can later be streamed in a player, and the end user would be able to switch between audio tracks while watching the clip.
What I was able to achieve is creating multiple HLS playlists, each with a different audio track.
ffmpeg -i "file_name.mp4" \
-map 0:v -map 0:a -c:v copy -c:a copy -start_number 0 \
-f hls \
-hls_time 10 \
-hls_playlist_type vod \
-hls_list_size 0 \
-master_pl_name master_playlist_name.m3u8 \
-var_stream_map "v:0,agroup:groupname a:0,agroup:groupname,language:ENG a:1,agroup:groupname" file_name_%v_.m3u8
My biggest issue is that I'm having a hard time understanding how the -map and -var_stream_map options should be used in my case, or whether they should be used at all in this scenario.
Here is an example of the output of ffmpeg -i on the original mov file to be converted into HLS:
Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080, 8786 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
Metadata:
handler_name : Apple Video Media Handler
vendor_id : [0][0][0][0]
timecode : 00:00:56:05
Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Stream #0:2[0x3](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
I also checked this blog post, and I would like to achieve this exact effect, but with video, not audio.
For example, -var_stream_map "v:0,a:0 v:1,a:0 v:2,a:0" implies that
the audio stream denoted by a:0 is used in all three video renditions.
The stream map looks fine. However, the HLS muxer will not create a valid HLS playlist, since it is missing the codec and bitrate information for the input streams: the audio and video streams are copied (remember -c:v copy -c:a copy), not parsed or re-encoded. To add that information, use the -tag and -b options to specify the properties of all your video and audio streams in the HLS output.
Example for your video stream:
-tag:v:0 h264 -b:v:0 8786k
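Putting this together with the command from the question, the full invocation might look like the sketch below. The audio bitrates come from the ffprobe output above, the language:FRA tag on a:1 is an assumption based on the question's ENG/FRA scenario, and the file names are placeholders:

```python
import shlex

# Sketch: the original segmenting command with codec tag and bitrate
# hints added for the stream-copied tracks.
cmd = [
    "ffmpeg", "-i", "file_name.mp4",
    "-map", "0:v", "-map", "0:a",
    "-c:v", "copy", "-c:a", "copy",
    "-tag:v:0", "h264", "-b:v:0", "8786k",  # video codec tag + bitrate
    "-b:a:0", "127k", "-b:a:1", "127k",     # audio bitrates from ffprobe
    "-start_number", "0",
    "-f", "hls", "-hls_time", "10",
    "-hls_playlist_type", "vod", "-hls_list_size", "0",
    "-master_pl_name", "master_playlist_name.m3u8",
    "-var_stream_map",
    "v:0,agroup:groupname "
    "a:0,agroup:groupname,language:ENG "
    "a:1,agroup:groupname,language:FRA",
    "file_name_%v_.m3u8",
]
print(shlex.join(cmd))
```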

Dealing with problems in FLAC audio files with ffmpeg

I have gotten a set of FLAC (audio) files from a friend. I copied them to my Sonos music library, and got set to enjoy a nice album. Unfortunately, Sonos would not play the files. As a result I have been getting to know ffmpeg.
Sonos' complaint about the FLAC files was that they were "encoded at an unsupported sample rate". With rolling eyes and a shaking head, I note that the free VLC media player happily plays these files, but the product I've paid for (Sonos) does not. But I digress...
ffprobe revealed that the FLAC files contain both an audio stream and a video stream:
$ ffprobe -hide_banner -show_streams "/path/to/Myaudio.flac"
Duration: 00:02:23.17, start: 0.000000, bitrate: 6176 kb/s
Stream #0:0: Audio: flac, 176400 Hz, stereo, s32 (24 bit)
Stream #0:1: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 450x446 [SAR 72:72 DAR 225:223], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Cover (front)
Cool! I guess this is how some audio players are able to display the album artwork when they play a song? Note also that the audio stream is reported at 176400 Hz! Apparently I'm out of touch; I thought a 44.1 kHz sampling rate effectively removed all of the 'sampling artifacts' we could hear. Anyway, I learned that Sonos supports a maximum sampling rate of 48 kHz, and this (the 176.4 kHz rate) is what Sonos was unhappy about. I used ffmpeg to 'dumb it down' for them:
$ ffmpeg -i "/path/to/Myaudio.flac" -sample_fmt s32 -ar 48000 "/path/to/Myaudio48K.flac"
This seemed to work - at least I got a FLAC file that Sonos would play. However, I also got what looks like a warning of some sort:
[swscaler @ 0x108e0d000] deprecated pixel format used, make sure you did set range correctly
[flac @ 0x7feefd812a00] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
A bit more research turned up this answer, which I don't quite understand, and a comment there says "not to worry", at least with respect to the swscaler part of the warning.
And that (finally) brings me to my questions:
1.a. What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
1.b. How should I use ffmpeg to modify the Video channel to set these specifications (ref. Q 1.a.)?
2.a. How do I remove the Video channel from the .flac audio file?
2.b. How do I add a Video channel into a .flac file?
EDIT:
I asked the above (4) questions after failing to accomplish a 'direct' conversion (a single ffmpeg command) from FLAC at 176.4 kHz to ALAC (.m4a) at 48 kHz (max supported by Sonos). I reasoned that an 'incremental' approach through a series of conversions might get me there. With the advantage of hindsight, I now see I should have posted my original failed direct conversion incantation... we live and learn.
That said, the accepted answer below meets my final objective to convert a FLAC file encoded at 176.4kHz to an ALAC (.m4a) at 48kHz, and preserve the cover art/video channel.
What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
Cover art is just a single frame, so framerate has no relevance here. However, you don't want a video stream; it has to remain a single image, so -vsync 0 should be added. "Muxer" is simply the term for the packager used in media file processing; it is decided by the choice of format, e.g. FLAC, WAV, etc. What's important is the codec for the cover art; usually it's PNG or JPEG. For FLAC, PNG is the default codec.
How do I remove the Video channel from the .flac audio file?
ffmpeg -i "/path/to/Myaudio.flac" -vn -c copy "/path/to/Myaudio48K.flac"
(All this does is skip any video in the input and copy everything else)
How do I add a Video channel into a .flac file?
To add cover art to audio-only formats like MP3 or FLAC, the video stream has to have a disposition of attached picture. So,
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -sample_fmt s32 -ar 48000 -disposition:v attached_pic -vsync 0 "/path/to/Myaudio48K.flac"
For direct conversion to ALAC, use
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -ar 48000 -c:a alac -disposition:v attached_pic -vsync 0 -c:v png "/path/to/Myaudio48K.m4a"

Slow audio-video sync drift when merging wav and mp4 with ffmpeg

I have an mp4 file with only a single video stream (no audio) and a wav audio file that I would like to add to the video using ffmpeg. The audio and the video have been recorded simultaneously during a conference, the former from a mixer output on a PC and the latter from a digital videocamera.
I am using this ffmpeg command:
ffmpeg -i incontro3.mp4 -itsoffset 18.39 -i audio_mix.wav -c:v copy -c:a aac final-video.mp4
where I'm using the -itsoffset 18.39 option since I know that 18.39s is the video-audio delay.
The problem I'm experiencing is that in the output file, while the audio is perfectly in sync with the video at the beginning, it slowly drifts out of sync during the movie.
The output of ffprobe on the video file is:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'incontro3.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.25.100
Duration: 00:47:22.56, start: 0.000000, bitrate: 888 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 886 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
Metadata:
handler_name : VideoHandler
and the ffprobe output for the audio file is:
Input #0, wav, from 'audio_mix.wav':
Metadata:
track : 5
encoder : Lavf57.25.100
Duration: 00:46:32.20, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
I'm using the latest ffmpeg Zeranoe windows build git-9591ca7 (2016-05-25).
Thanks in anticipation for any help/ideas!
UPDATE 1: It looks like the problem is upstream of the audio-video merging, and could lie in the concatenation and conversion of the MTS files generated by the video camera into the mp4 video. I will follow up as I make progress in understanding it...
UPDATE 2: The problem is not in the initial merging of the MTS files generated by the camera. Or, at least, it occurs identically whether I merge them with cat or with ffmpeg -f concat.
UPDATE 3: Following @Mulvya's suggestion, I observed that the drift rate is constant (at least as far as I can tell judging by eye). I also tried to superimpose the A/V tracks with another piece of software, and the drift is exactly the same, thereby ruling out ffmpeg as the culprit. My (bad) feeling is that the issue could be related to the internal clocks of the digital video camera and of the laptop used for audio recording running at slightly different rates (see here the report of an identical issue I just found).
Since the drift rate is constant, you can use a combination of FFmpeg filters to retime the audio.
ffmpeg -i audio_mix.wav -af asetrate=44100*(10/9),aresample=44100 retimed.wav
Here, 44100*(10/9) indicates the actual number of samples that represent one second of sound, i.e. if after 100 seconds of playback of the original WAV the audio just heard corresponds to its 90-second mark, then the sample consumption rate should be increased by a factor of 10/9. That would make for an unconventional sample rate, so aresample is added to resample the audio back to a standard rate.
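As an aside, the correction factor can be derived mechanically from two clock readings; `corrected_rate` below is a hypothetical helper illustrating the arithmetic, not an ffmpeg API:

```python
from fractions import Fraction

def corrected_rate(nominal_rate, playback_s, heard_s):
    """Rate to pass to asetrate so the audio keeps pace with the video.

    If after playback_s seconds of wall-clock playback the audio heard
    corresponds to the heard_s mark of the recording, samples must be
    consumed playback_s/heard_s times faster.
    """
    return nominal_rate * Fraction(playback_s, heard_s)

print(corrected_rate(44100, 100, 90))  # 49000, i.e. 44100 * 10/9
```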

Convert form 30 to 60fps by increasing speed, not duplicating frames, using FFmpeg

I have a video that is incorrectly labelled as 30fps; it is actually 60fps, and so it looks like it's being played at half speed. The audio is fine; that is, the soundtrack finishes halfway through the video clip. I'd like to know how, if possible, to fix this, i.e. double the video speed, making it 60fps, so that the audio and video are in sync.
The file is H.264 and the audio MPEG-4 AAC.
File details as given by ffmpeg, as requested:
ffmpeg version 0.8.9-6:0.8.9-0ubuntu0.13.10.1, Copyright (c) 2000-2013 the Libav developers
built on Nov 9 2013 19:09:46 with gcc 4.8.1
*** THIS PROGRAM IS DEPRECATED ***
This program is only provided for compatibility and will be removed in a future release. Please use avconv instead.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from './Tignes60fps.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2014-01-13 02:23:09
Duration: 00:08:33.21, start: 0.000000, bitrate: 5690 kb/s
Stream #0.0(eng): Video: h264 (High), yuv420p, 1920x1080 [PAR 1:1 DAR 16:9], 5609 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc
Metadata:
creation_time : 2014-01-13 02:23:09
Stream #0.1(eng): Audio: aac, 48000 Hz, stereo, s16, 156 kb/s
Metadata:
creation_time : 2014-01-13 02:23:09
At least one output file must be specified
Use -vsync drop:
ffmpeg -i input.avi -vcodec copy -vsync drop -r 60 output.avi
Source timestamps will be destroyed, and the output muxer will create new ones based on the given frame rate (the -r switch).
Okay so here's how I achieved what I wanted.
avconv -i input.mp4 -r 60 -filter:v "setpts=0.5*PTS" output.mp4
This left the audio unchanged, so it now synced up nicely with the video.
This was originally a video that was incorrectly exported as 30fps when it was really 60, so the video was playing at half speed for twice as long, with the audio track finishing halfway through. The above fixed this: it sped up the video without losing frames, and it now plays at 60fps, at normal speed, in sync with the audio.
Credit to rogerdpack for suggesting setpts, but you were very minimal! A fuller answer would have been appreciated!
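The retiming arithmetic behind setpts=0.5*PTS can be sketched as follows; `retime` is a hypothetical helper, and the figures come from the ffmpeg output above:

```python
def retime(fps, duration_s, pts_factor):
    """New (fps, duration) after multiplying every PTS by pts_factor.

    Halving the presentation timestamps doubles the playback rate:
    frames arrive twice as fast, so the clip lasts half as long.
    """
    return fps / pts_factor, duration_s * pts_factor

new_fps, new_duration = retime(29.97, 513.21, 0.5)  # 8:33.21 = 513.21 s
print(new_fps, new_duration)  # ~59.94 fps, ~256.6 s
```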

How do I alter my FFMPEG command to make my HTTP Live Streams more efficient?

I want to reduce the muxing overhead when creating .ts files using FFmpeg.
I'm using FFmpeg to create a series of transport stream files used for HTTP Live Streaming.
./ffmpeg -i myInputFile.ismv \
-vcodec copy \
-acodec copy \
-bsf h264_mp4toannexb \
-map 0 \
-f segment \
-segment_time 10 \
-segment_list_size 999999 \
-segment_list output/myVarientPlaylist.m3u8 \
-segment_format mpegts \
output/myAudioVideoFile-%04d.ts
My input is in ismv format and contains a video and audio stream:
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 320x240, 348 kb/s, 29.97 tbr, 10000k tbn, 59.94 tbc
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 63 kb/s
There is an issue related to muxing that causes a large amount of overhead to be added to the streams. This is how the issue was described to me for the audio:
So for a given aac stream, the overhead will be 88% (since 200 bytes will map to 2 x 188 byte packets).
For video, the i-frame packets are quite large, so they translate nicely into .ts packets; however, the difference frames can be as small as an audio packet, so they suffer from the same issue.
The solution is to combine several aac packets into one larger stream before packaging them into .ts. Is this possible out of the box with FFMPEG?
It is not possible. Codecs rely on the encapsulating container for framing, which means signalling the start and length of a frame.
Your graphic actually misses an element, which is the PES packet. Your audio frame will be put into a PES packet first (which indicates its length), then the PES packet will be cut into smaller chunks which will be TS packets.
By design, you cannot start a new PES packet (containing an audio frame, in your case) in a TS packet which already contains data; a new PES packet always starts in a new TS packet. Otherwise it would be impossible to start playing mid-stream (the broadcast situation): there would be no way to know at which byte in the TS packet the new PES begins (remember, you have missed the beginning of the current PES).
There are some mitigating factors: the FF FF FF padding will probably be compressed by the networking hardware, and if you are using HTTP (instead of UDP or RTP), gzip compression can be enabled (though I doubt it would help much).
I've fixed the worst problem of syncing the TS output on each frame in http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=75c8d7c2b4972f6ba2cef605949f57322f7c0361 - please try a version after that.
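The 88% figure from the question can be reproduced with a back-of-the-envelope calculation; the sketch below ignores the PES header bytes and assumes the minimal 4-byte TS header:

```python
import math

TS_PACKET = 188   # fixed MPEG-TS packet size, in bytes
TS_PAYLOAD = 184  # 188 minus the 4-byte TS header

def mux_overhead(frame_bytes):
    """Fractional overhead of carrying one frame in its own TS packets.

    A new PES packet must start in a fresh TS packet, so a frame always
    occupies a whole number of packets; the unused tail is padding.
    """
    packets = math.ceil(frame_bytes / TS_PAYLOAD)
    return (packets * TS_PACKET - frame_bytes) / frame_bytes

print(mux_overhead(200))  # 0.88: a 200-byte AAC frame needs 2 x 188 bytes
```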