ffmpeg, how to concat two streams, one with and one without audio

I have one clip filmed at 240 FPS. I want to slow it down 8x and concat the slow motion version of it to the fast version. The fast version has audio but the slow does not. When I open the finished movie using totem in Ubuntu I get no sound. However, the sound appears to be correct when I use VLC. I think this is an issue with the sound not being the same length as the final movie. I think I somehow need to pad the sound to the length of the final movie. Anyone know how to pad the audio or a better way to do this?
ffmpeg -hwaccel cuda -i GX010071_1.MP4 -filter_complex "[0:v]setpts=8*PTS[s];[0:v]framerate=30[f]; [f] [s] concat=n=2 [c]" -map '[c]' -map 0:a -c:v hevc_nvenc SLOW.MP4

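Looks like I just needed to combine the apad filter with the -shortest option: apad pads the audio with silence indefinitely, and -shortest ends the output when the shortest stream (here the video) finishes. The following command works.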
ffmpeg -hwaccel cuda -i GX010071_1.MP4 -filter_complex "[0:v]setpts=8*PTS[s];[0:v]framerate=30[f]; [f] [s] concat=n=2 [c]" -map '[c]' -map 0:a -af apad -c:v hevc_nvenc -shortest SLOW.MP4
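The same padding can probably also be done inside the filtergraph instead of with -af, which keeps all the processing in one place. This is only a sketch under a few assumptions: the function name slowmo_concat and the FFMPEG override variable are made up for illustration, the hardware options are left out for brevity, and I have not checked how exactly -shortest cuts the end on every ffmpeg version.

```shell
# Hypothetical helper: append a slowed-down copy of a clip to itself and
# pad the original audio with silence so it can span the whole output.
# Set FFMPEG=echo to print the arguments instead of running ffmpeg.
slowmo_concat() (
    src=$1
    factor=$2
    dst=$3
    graph="[0:v]framerate=30[f];"
    graph="${graph}[0:v]setpts=${factor}*PTS[s];"
    graph="${graph}[f][s]concat=n=2:v=1:a=0[c];"
    graph="${graph}[0:a]apad[a]"   # apad appends silence indefinitely
    # -shortest stops writing once the finite video stream ends
    "${FFMPEG:-ffmpeg}" -i "$src" -filter_complex "$graph" \
        -map "[c]" -map "[a]" -shortest "$dst"
)

# Usage (runs ffmpeg):
#   slowmo_concat GX010071_1.MP4 8 SLOW.MP4
```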

Related

Concat mp4 videos and merge their audios to the final output

I have several videos and photos and need to merge them with a cross-dissolve effect. The steps are as follows:
Create videos from images and add silent audio to them (so they will also have a sound stream):
ffmpeg -y -f lavfi -i anullsrc -loop 1 -i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.jpg -c:v libx264 -t 10 -pix_fmt yuv420p -vf scale=750:1280 /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
Combine all the videos and audios into one using this command:
ffmpeg
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_68d437c0-f5e2-4651-b07e-91533480b6ef.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_48f3c111-610d-40c7-ac71-6ce2fbb16184.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1593b5d8-7e16-417d-9372-2267581cd504.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1ac7f6be-1b12-4e31-b904-1491cc9b9494.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
-filter_complex
"[0:v]trim=start=0:end=8.032,setpts=PTS-STARTPTS[clip0];
[1:v]trim=start=2:end=13.047,setpts=PTS-STARTPTS[clip1];
[2:v]trim=start=2:end=13.558,setpts=PTS-STARTPTS[clip2];
[3:v]trim=start=2:end=13.186,setpts=PTS-STARTPTS[clip3];
[4:v]trim=start=2,setpts=PTS-STARTPTS[clip4];
[0:v]trim=start=9.032:end=10.032,setpts=PTS-STARTPTS[out0];
[1:v]trim=start=14.047:end=15.047,setpts=PTS-STARTPTS[out1];
[2:v]trim=start=14.558:end=15.558,setpts=PTS-STARTPTS[out2];
[3:v]trim=start=14.186:end=15.186,setpts=PTS-STARTPTS[out3];
[1:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in1];
[2:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in2];
[3:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in3];
[4:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in4];
[in1]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein1];
[in2]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein2];
[in3]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein3];
[in4]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein4];
[out0]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout0];
[out1]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout1];
[out2]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout2];
[out3]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout3];
[fadein1]fifo[fadein1fifo];
[fadein2]fifo[fadein2fifo];
[fadein3]fifo[fadein3fifo];
[fadein4]fifo[fadein4fifo];
[fadeout0]fifo[fadeout0fifo];
[fadeout1]fifo[fadeout1fifo];
[fadeout2]fifo[fadeout2fifo];
[fadeout3]fifo[fadeout3fifo];
[fadeout0fifo][fadein1fifo]overlay[crossfade0];
[fadeout1fifo][fadein2fifo]overlay[crossfade1];
[fadeout2fifo][fadein3fifo]overlay[crossfade2];
[fadeout3fifo][fadein4fifo]overlay[crossfade3];
[clip0][crossfade0][clip1][crossfade1][clip2][crossfade2][clip3][crossfade3][clip4]concat=n=9[output];
[0:a][1:a]acrossfade=d=10:c1=tri:c2=tri[A1];
[A1][2:a]acrossfade=d=10:c1=tri:c2=tri[A2];
[A2][3:a]acrossfade=d=10:c1=tri:c2=tri[A3];
[A3][4:a]acrossfade=d=10:c1=tri:c2=tri[audio] "
-vsync 0 -map "[output]" -map "[audio]" /tmp/media/final/some_filename_d0d2aab0-792a-4540-b2d3-e64abe98bf5c.mp4
And all works pretty well, but if I have, for example:
picture
video
video
picture
then the sound from the second video is mapped onto the first picture, and the sound from the third video onto the second video. The third video ends up with no sound at all.
It seems to happen because the silent audio of the first picture is too short. Am I right?
If so, how can I increase its duration?
I would much appreciate any help with this!
Assuming 5 inputs of 10 seconds each, all with audio streams*, with ffmpeg 4.3 or newer, use the xfade and acrossfade filters.
ffmpeg
-i in1.mp4
-i in2.mp4
-i in3.mp4
-i in4.mp4
-i in5.mp4
-filter_complex
" [0][1]xfade=transition=fade:duration=2:offset=8[V01];
[V01][2]xfade=transition=fade:duration=2:offset=16[V02];
[V02][3]xfade=transition=fade:duration=2:offset=24[V03];
[V03][4]xfade=transition=fade:duration=2:offset=32[video];
[0:a][1:a]acrossfade=d=2:c1=tri:c2=tri[A01];
[A01][2:a]acrossfade=d=2:c1=tri:c2=tri[A02];
[A02][3:a]acrossfade=d=2:c1=tri:c2=tri[A03];
[A03][4:a]acrossfade=d=2:c1=tri:c2=tri[audio]"
-vsync 0 -map "[video]" -map "[audio]" out.mp4
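The xfade offsets above follow a simple rule: each fade starts where the already-merged video ends, minus the overlap, i.e. the sum of the clip durations so far minus (number of fades so far) times the fade duration. A small sketch for computing them when the clips are not all the same length; the function name xfade_offsets is hypothetical, and awk is assumed to be available:

```shell
# Print the xfade offset for every transition.
# Arguments: fade duration, then the duration of each clip in order.
xfade_offsets() (
    fade=$1
    shift
    printf '%s\n' "$@" | awk -v fade="$fade" '
        # From the second clip on, emit the offset of the fade into it:
        # total length of the earlier clips minus (NR - 1) overlaps.
        NR > 1 { printf "%s%g", (NR > 2 ? " " : ""), total - (NR - 1) * fade }
        { total += $1 }
        END { print "" }'
)

xfade_offsets 2 10 10 10 10 10   # prints: 8 16 24 32
```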
*if there's no existing audio stream, add one using the command in step 1.
If the existing audio stream of a file isn't 10 seconds long, use these filters on it before acrossfade.
[input]aresample=async=1:first_pts=0,apad,atrim=0:10[filtered]
and then use this filtered stream as input.
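If several inputs need that treatment, the audio part of the filtergraph can be generated instead of typed by hand. A rough sketch, assuming at least two inputs that should all be forced to the same length; the function name audio_chain and the intermediate labels [a0], [A1], ... are made up for illustration:

```shell
# Print an audio filtergraph that forces each of N inputs to exactly
# `len` seconds (padding with silence or trimming) and then chains
# them with acrossfade. Requires n >= 2.
audio_chain() (
    n=$1
    len=$2
    fade=$3
    i=0
    graph=""
    while [ "$i" -lt "$n" ]; do
        graph="${graph}[${i}:a]aresample=async=1:first_pts=0,apad,atrim=0:${len}[a${i}];"
        i=$((i + 1))
    done
    prev="a0"
    i=1
    while [ "$i" -lt "$n" ]; do
        # The last acrossfade writes to [audio], the others to [A<i>].
        if [ "$i" -eq $((n - 1)) ]; then out="audio"; else out="A${i}"; fi
        graph="${graph}[${prev}][a${i}]acrossfade=d=${fade}:c1=tri:c2=tri[${out}];"
        prev=$out
        i=$((i + 1))
    done
    printf '%s\n' "${graph%;}"   # drop the trailing semicolon
)

# Usage: paste the output after the video chains in -filter_complex
#   audio_chain 5 10 2
```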

Merge 2 Files (audio and Video), with BITC and watermark in FFMPEG

I need to write an ffmpeg profile that merges a video file and an audio file (replacing the audio in the video file with the audio file), adds BITC (burnt-in timecode), and applies a watermark loaded from a network location.
I can do each step separately, but as I'm not an FFmpeg expert, it's hard for me to combine all of the above.
Any advice would be appreciated.
Best regards, all
Use the overlay filter for the watermark and the drawtext filter for the burnt-in timecode:
ffmpeg -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex "[0:v:0]drawtext=fontfile=/usr/share/fonts/TTF/DejaVuSansMono.ttf:timecode='01\:23\:45\:00':r=25:x=(w-text_w)/2:y=h-text_h-20:fontsize=20:fontcolor=white:box=1:boxborderw=4:boxcolor=black[bg];[bg][2:v]overlay=W-w-10:H-h-12:format=auto[v]" -map "[v]" -map 1:a -shortest output.mp4
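The colons in the timecode have to be escaped because drawtext uses ":" to separate its options. If the start timecode comes from a variable, the escaping can be scripted. A small sketch, assuming the input order video, audio, watermark (so the watermark is input 2); the function names escape_timecode and bitc_graph are hypothetical and the box options are trimmed for brevity:

```shell
# Escape the colons in a timecode for use inside a drawtext option:
# 01:23:45:00 -> 01\:23\:45\:00
escape_timecode() {
    printf '%s\n' "$1" | sed 's/:/\\:/g'
}

# Print a filtergraph that burns in the timecode and overlays input 2.
# Arguments: start timecode, frame rate, path to a font file.
bitc_graph() (
    tc=$(escape_timecode "$1")
    rate=$2
    font=$3
    graph="[0:v]drawtext=fontfile=${font}:timecode='${tc}':r=${rate}"
    graph="${graph}:x=(w-text_w)/2:y=h-text_h-20:fontsize=20"
    graph="${graph}:fontcolor=white:box=1:boxcolor=black[bg];"
    graph="${graph}[bg][2:v]overlay=W-w-10:H-h-12[v]"
    printf '%s\n' "$graph"
)

# Usage (runs ffmpeg):
#   ffmpeg -i video.mp4 -i audio.mp3 -i watermark.png \
#     -filter_complex "$(bitc_graph 01:23:45:00 25 /path/to/font.ttf)" \
#     -map "[v]" -map 1:a -shortest output.mp4
```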

Mapping streams by language in FFmpeg

I have lots of files with multiple audio and subtitle languages, however the track numbers aren't consistent (the English audio stream isn't always the first) so using a command such as:
ffmpeg -i "input.mkv" -map 0 -map -0:a:1 -c:v copy -c:a copy "output.mkv"
doesn't yield expected results. After searching around I discovered it was possible to map streams based on language with this command:
ffmpeg -i "input.mkv" -map 0 -map -0:m:language:eng -c:v copy -c:a copy "output.mkv"
However, -map -0:m:language:eng removes all tracks with the English language flag, subtitles included. To keep the subtitle tracks you can add -map 0:s, which is a good solution, but I want to know whether it is possible to map only audio streams by language. I.e.,
I want to remove English audio while retaining all other streams without using stream IDs.
-0:m:language:eng removes the streams tagged as English and keeps all others.
To keep only the English tracks and remove all others, remove the dash at the beginning: 0:m:language:eng
The dash at the beginning creates a negative mapping, which tells ffmpeg to remove the streams that match, and only those.
I know this is 8 months later, but I thought it would be helpful for those who end up here from Google searches like I did.
Edit: Ignore initial reply. Not possible at present. Use workaround on top.
ffmpeg -i "in.mkv" -map 0:a -map -0:m:language:eng -map 0:v -map 0:s -map 0:d? -map 0:t? -c copy "out.mkv"
This achieves the desired result because ffmpeg implements the map options in given order.
You need to suffix the metadata selectors to the stream type selector i.e.
ffmpeg.exe -i "%f" -map 0 -map -0:a:m:language:eng -c:v copy -c:a copy "../%f"
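For batch use, that mapping can be wrapped in a small function. This is a sketch only: the function name drop_audio_lang and the FFMPEG override variable are made up, and it assumes an ffmpeg build whose stream specifiers accept a stream type followed by a metadata selector (a:m:language:...).

```shell
# Remux a file, dropping only the audio streams tagged with one language.
# Set FFMPEG=echo to print the arguments instead of running ffmpeg.
drop_audio_lang() (
    src=$1
    lang=$2
    dst=$3
    # -map 0 selects every stream; the negative map then removes the
    # audio streams whose language metadata matches.
    "${FFMPEG:-ffmpeg}" -i "$src" -map 0 \
        -map "-0:a:m:language:${lang}" -c copy "$dst"
)

# Usage (runs ffmpeg):
#   drop_audio_lang input.mkv eng output.mkv
```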
Updated: As far as I can tell, this is the best way to remove English audio while retaining all other streams without using stream IDs, which I find to be less consistent than language flags. People generally set language flags correctly, whereas audio languages are less likely to keep the same ID.
ffmpeg -i "in.mkv" -map 0:a -map -0:m:language:eng -map 0:v -map 0:s -map 0:d? -map 0:t -c:v copy -c:a copy "out.mkv"
The command will map every audio stream then remove audio with the English language flag. It will then map all video, subtitle and attachment streams. You can add -disposition:a:0 default to give the first audio stream the [default] flag if needed. Note: Only use when you are removing audio that has the default flag already. Change -disposition:a:0 to -disposition:a:1 and so on if you want to set a different audio track to default.
The following maps only the video stream and the English audio stream, selecting them by stream ID (0x1e0 and 0x80 in this VOB):
ffmpeg -i "G:\VIDEO_TS\VTS_01_1.VOB" -map i:0x1e0 -map i:0x80 "THE STRANGERS 1.mp4"

ffmpeg merge silent video with another video+audio

I want to create, in a single command, a video from 3 sources:
a silent background video;
a smaller video to be overlayed (same length of 1), KEEPING its AUDIO;
a PNG logo to be overlayed
I can create the video but cannot get the audio track. I don't understand if -vf is supposed to work in this case. This is what I've tried to do :
ffmpeg.exe -y -i MASTER_SILENT_VIDEO.mp4 -vf "movie=SMALLER_VIDEO_WITH_AUDIO.flv, scale=320:-1[inner];movie=MY_LOGO.png[inner2]; [in][inner] overlay=800:480,amerge [step1]; [step1][inner2] overlay=30:30 [out]" completed.mp4
The "amerge" filter should do the audio merging job, but of course it doesn't work. I've found similar questions involving -map or filtergraph but they refer to mixing a video source and an audio source; I tried several filtergraph examples without success. Any idea?
Overlay one video over the other, using audio from one input
Use -filter_complex, eliminate the movie source filters, and explicitly define output streams with -map:
ffmpeg -y -i main.mp4 -i overlay_with_audio.flv -i logo.png -filter_complex
"[1:v]scale=320:-1[scaled];
[0:v][scaled]overlay=800:480[bg];
[bg][2:v]overlay=30:30,format=yuv420p[video]"
-map "[video]" -map 1:a -movflags +faststart
output.mp4
You may have to provide additional options to the overlay filters depending on the length of the inputs and how you want overlay to react, but because you did not provide the complete console output from your command I had to make a generic, less efficient, and possibly incorrect example.
Overlay one video over the other, merging audio from both inputs
ffmpeg -y -i main.mp4 -i overlay_with_audio.flv -i logo.png -filter_complex
"[1:v]scale=320:-1[scaled];
[0:v][scaled]overlay=800:480[bg];
[bg][2:v]overlay=30:30,format=yuv420p[video];
[0:a][1:a]amerge=inputs=2[audio]"
-map "[video]" -map "[audio]" -ac 2 -movflags +faststart
output.mp4
I'm assuming both inputs are stereo and that you want a stereo output. Also see FFmpeg Wiki: Audio channel Manipulation - 2 × stereo → stereo.

ffmpeg concat and scale simultaneously?

I have two ffmpeg commands:
ffmpeg -i d:\1.mp4 -i d:\1.mp4 -filter_complex "[0:0] [0:1] [1:0] [1:1] concat=n=2:v=1:a=1 [v] [a]" -map "[v]" -map "[a]" d:\3.mp4
and
ffmpeg -i d:\1.mp4 -vf scale=320:240 d:\3.mp4
How to use them both simultaneously?
For posterity:
The accepted answer does not work if the input sources are of different sizes (which is the primary reason why you need to scale before combining).
What you need to do is to first scale and then pipe that video output into the concat filter like so:
ffmpeg -i input1.mp4 -i input2.mp4 -filter_complex \
"[0:v]scale=1024:576:force_original_aspect_ratio=1[v0]; \
[1:v]scale=1024:576:force_original_aspect_ratio=1[v1]; \
[v0][0:a][v1][1:a]concat=n=2:v=1:a=1[v][a]" -map "[v]" -map "[a]" output.mp4
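One caveat: with force_original_aspect_ratio the scaled frames can still end up with different dimensions when the inputs have different aspect ratios, and concat needs identical frame sizes. A common way around this is to pad each scaled stream to the full target size and reset the sample aspect ratio. A sketch that prints such a filtergraph for any number of inputs; the function name scale_concat_graph is hypothetical, and it assumes every input has exactly one video and one audio stream:

```shell
# Print a filtergraph that fits N inputs into the same WxH frame
# (letterboxing where needed) and then concatenates video and audio.
scale_concat_graph() (
    n=$1
    w=$2
    h=$3
    i=0
    graph=""
    pads=""
    while [ "$i" -lt "$n" ]; do
        # Shrink to fit inside WxH, then pad to exactly WxH, centered.
        graph="${graph}[${i}:v]scale=${w}:${h}:force_original_aspect_ratio=decrease,"
        graph="${graph}pad=${w}:${h}:(ow-iw)/2:(oh-ih)/2,setsar=1[v${i}];"
        pads="${pads}[v${i}][${i}:a]"
        i=$((i + 1))
    done
    printf '%s\n' "${graph}${pads}concat=n=${n}:v=1:a=1[v][a]"
)

# Usage (runs ffmpeg):
#   ffmpeg -i input1.mp4 -i input2.mp4 \
#     -filter_complex "$(scale_concat_graph 2 1024 576)" \
#     -map "[v]" -map "[a]" output.mp4
```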
Had this problem today and was pulling my hair for good three hours trying to figure this out and unfortunately the accepted answer did not work as noted in the comments.
ffmpeg -i d:\1.mp4 -i d:\2.mp4 -filter_complex "concat=n=2:v=1:a=1 [v] [a]; [v]scale=320:200[v2]" -map "[v2]" -map "[a]" d:\3.mp4
First we concatenate everything and send the result to [v] and [a] (see the filtergraph syntax docs; these are the outputs of the concat filter). Next we take [v], scale it and output it to [v2]. Lastly we take [v2] and [a] and mux them into the d:\3.mp4 file.
Construct a custom filtergraph and move the resize step closer to the video source. For example, let's work through a more complex graph to get a feel for the filtergraph syntax:
ffmpeg.exe -i Movie_oriented_minus_90.mov -i Movie_pause.mp4 -i Sound_pause.aac -filter_complex "[0:v:0]scale=1920:1080[c1];[c1]vflip[c2];[c2]hflip[clip];[clip][0:a:0][1:v:0][2:a:0]concat=n=2:v=1:a=1[v][a]" -map "[v]" -map "[a]" -c:v libx264 -q:v 0 -acodec mp3 -s 1920x1080 Movie_oriented_plus_90_with_pause.mp4
Stage 1: Movie_oriented_minus_90 is source 0. Its video stream [0:v:0] is fed into the scale filter and output as [c1]; [c1] is flipped vertically into [c2], and [c2] is flipped horizontally into [clip], which rotates the picture by 180 degrees.
Stage 2: two segments are concatenated. The first segment is [clip] (the processed video from source 0) with the sound from the original video, [0:a:0]. The second segment is the video from source 1, [1:v:0], with the audio from source 2, [2:a:0] (30 seconds of silence made with -filter_complex "aevalsrc=0:d=30" in a separate run of ffmpeg).
Stage 3: the resulting streams [v] and [a] are then encoded (video with the x264 codec) into the target mp4 file.
So the main problem in your question was that you tried to concatenate streams of different sizes and only then resize the already concatenated stream, which of course cannot consist of frames with different sizes.
