Concat mp4 videos and merge their audios to the final output

I have several videos and photos and need to merge them with a cross-dissolve effect. The algorithm is as follows:
Create videos from images and add silent audio to them (so they will also have a sound stream):
ffmpeg -y -f lavfi -i anullsrc -loop 1 -i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.jpg -c:v libx264 -t 10 -pix_fmt yuv420p -vf scale=750:1280 /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
Combine all the videos and audios into one using this command:
ffmpeg
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_68d437c0-f5e2-4651-b07e-91533480b6ef.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_48f3c111-610d-40c7-ac71-6ce2fbb16184.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1593b5d8-7e16-417d-9372-2267581cd504.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1ac7f6be-1b12-4e31-b904-1491cc9b9494.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
-filter_complex
"[0:v]trim=start=0:end=8.032,setpts=PTS-STARTPTS[clip0];
[1:v]trim=start=2:end=13.047,setpts=PTS-STARTPTS[clip1];
[2:v]trim=start=2:end=13.558,setpts=PTS-STARTPTS[clip2];
[3:v]trim=start=2:end=13.186,setpts=PTS-STARTPTS[clip3];
[4:v]trim=start=2,setpts=PTS-STARTPTS[clip4];
[0:v]trim=start=9.032:end=10.032,setpts=PTS-STARTPTS[out0];
[1:v]trim=start=14.047:end=15.047,setpts=PTS-STARTPTS[out1];
[2:v]trim=start=14.558:end=15.558,setpts=PTS-STARTPTS[out2];
[3:v]trim=start=14.186:end=15.186,setpts=PTS-STARTPTS[out3];
[1:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in1];
[2:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in2];
[3:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in3];
[4:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in4];
[in1]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein1];
[in2]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein2];
[in3]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein3];
[in4]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein4];
[out0]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout0];
[out1]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout1];
[out2]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout2];
[out3]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout3];
[fadein1]fifo[fadein1fifo];
[fadein2]fifo[fadein2fifo];
[fadein3]fifo[fadein3fifo];
[fadein4]fifo[fadein4fifo];
[fadeout0]fifo[fadeout0fifo];
[fadeout1]fifo[fadeout1fifo];
[fadeout2]fifo[fadeout2fifo];
[fadeout3]fifo[fadeout3fifo];
[fadeout0fifo][fadein1fifo]overlay[crossfade0];
[fadeout1fifo][fadein2fifo]overlay[crossfade1];
[fadeout2fifo][fadein3fifo]overlay[crossfade2];
[fadeout3fifo][fadein4fifo]overlay[crossfade3];
[clip0][crossfade0][clip1][crossfade1][clip2][crossfade2][clip3][crossfade3][clip4]concat=n=9[output];
[0:a][1:a]acrossfade=d=10:c1=tri:c2=tri[A1];
[A1][2:a]acrossfade=d=10:c1=tri:c2=tri[A2];
[A2][3:a]acrossfade=d=10:c1=tri:c2=tri[A3];
[A3][4:a]acrossfade=d=10:c1=tri:c2=tri[audio] "
-vsync 0 -map "[output]" -map "[audio]" /tmp/media/final/some_filename_d0d2aab0-792a-4540-b2d3-e64abe98bf5c.mp4
And all works pretty well, but if I have, for example:
picture
video
video
picture
Then the sound from the second video is mapped onto the first picture, and the sound from the third video onto the second video. The third video actually ends up without sound.
It seems this is happening because the silent audio of the first picture is quite short. Am I right?
If so, how can I increase its duration?
I would much appreciate any help with this!

Assuming 5 inputs of 10 seconds each, all with audio streams*, with ffmpeg 4.3 or newer, use the xfade and acrossfade filters.
ffmpeg
-i in1.mp4
-i in2.mp4
-i in3.mp4
-i in4.mp4
-i in5.mp4
-filter_complex
" [0][1]xfade=transition=fade:duration=2:offset=8[V01];
[V01][2]xfade=transition=fade:duration=2:offset=16[V02];
[V02][3]xfade=transition=fade:duration=2:offset=24[V03];
[V03][4]xfade=transition=fade:duration=2:offset=32[video];
[0:a][1:a]acrossfade=d=2:c1=tri:c2=tri[A01];
[A01][2:a]acrossfade=d=2:c1=tri:c2=tri[A02];
[A02][3:a]acrossfade=d=2:c1=tri:c2=tri[A03];
[A03][4:a]acrossfade=d=2:c1=tri:c2=tri[audio]"
-vsync 0 -map "[video]" -map "[audio]" out.mp4
*if there's no existing audio stream, add one using the command in step 1.
If the existing audio stream of a file isn't 10 seconds long, use these filters on it before acrossfade.
[input]aresample=async=1:first_pts=0,apad,atrim=0:10[filtered]
and then use this filtered stream as input.
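The offsets above follow from the clip lengths: each xfade offset is measured on the first input of that xfade, so with a transition of d seconds the k-th offset is the sum of the first k durations minus k*d. A minimal sketch for computing them when the clips differ in length (xfade_offsets is a hypothetical helper name, and the durations are assumed to be known already, e.g. from ffprobe):

```shell
# xfade_offsets D DUR1 DUR2 ... DURN
# Prints one xfade offset per transition (N-1 lines) for a transition of D seconds.
xfade_offsets() (
  d=$1
  shift
  # For line NR > 1, "total" holds the sum of the previous NR-1 durations,
  # and NR-1 transitions (including this one) shorten the timeline by d each.
  printf '%s\n' "$@" |
    awk -v d="$d" 'NR > 1 { printf "%g\n", total - (NR - 1) * d } { total += $1 }'
)

# Five 10-second clips with a 2-second fade:
xfade_offsets 2 10 10 10 10 10
```

For the five 10-second inputs this reproduces the offsets 8, 16, 24 and 32 used in the command.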

Related

How to take metadata from .mp3 file and put it to a video as a text using FFmpeg?

In my previously opened topic:
How to make FFmpeg automatically inject mp3 audio tracks in the single cycled muted video
I got a detailed explanation from @llogan of how to broadcast a short, looped, muted video on YouTube while automatically injecting audio tracks into it without interrupting the broadcast.
I plan to enhance the flow, and the next question I face is how to dynamically add text to the broadcast.
Prerequisites:
the YouTube broadcast is up and running via ffmpeg
a short 3-minute video is playing in an infinite loop
audio tracks from the playlist are automatically taken by "ffmpeg concat" and injected into the video one by one
this is the basic command to start the broadcast:
ffmpeg -re -fflags +genpts -stream_loop -1 -i video.mp4 -re -f concat -i input.txt -map 0:v -map 1:a -c:v libx264 -tune stillimage -vf format=yuv420p -c:a copy -g 20 -b:v 2000k -maxrate 2000k -bufsize 8000k -f flv rtmp://a.rtmp.youtube.com/live2/my-key
Improvements I want to make
I plan to store some metadata in the audio files (basically an artist name and a song name).
When a particular song starts playing, the artist/song name should be taken from its metadata and displayed on the video as text for as long as the song is playing.
When the current song finishes and a new one starts playing, the previous artist/song text should be replaced with the new one, and so on.
My question is how to properly take metadata and add it to the existing broadcast config using ffmpeg?
This is a fairly broad question and I don't have a complete solution. But I can provide a partial answer containing several commands that you can use to help implement a solution.
Update text on video on demand
See Can you insert text from a file in real time with ffmpeg streaming?
Get title & artist metadata
With ffprobe:
ffprobe -v error -show_entries format_tags=title -of default=nw=1:nk=1 input.mp3
ffprobe -v error -show_entries format_tags=artist -of default=nw=1:nk=1 input.mp3
Or combined: format_tags=title,artist (note that title will display first, then artist, regardless of order in the command).
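If you would rather not depend on the output order, one option is to drop nk=1 so that ffprobe prints the keys (lines of the form TAG:title=...) and pick the fields by name. A rough sketch; format_now_playing is a hypothetical helper name and the "Artist - Title" layout is just an example:

```shell
# Reads ffprobe "TAG:key=value" lines on stdin and prints "Artist - Title".
# Keys are matched case-insensitively because tag case can differ between files.
format_now_playing() {
  awk '
    {
      eq = index($0, "=")
      if (eq == 0) next                      # skip lines without key=value
      key = tolower(substr($0, 1, eq - 1))
      val = substr($0, eq + 1)               # keep any "=" inside the value
      if (key == "tag:artist") artist = val
      if (key == "tag:title")  title  = val
    }
    END { print artist " - " title }
  '
}

# Intended use (requires ffprobe and a tagged file):
#   ffprobe -v error -show_entries format_tags=title,artist -of default=nw=1 input.mp3 | format_now_playing
printf 'TAG:title=Some Song\nTAG:artist=Some Band\n' | format_now_playing
```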
Get duration of a song
See How to get video duration in seconds?
What you need to figure out
The hard part is knowing when to update the file referenced in textfile in drawtext filter as shown in Update text on video on demand above.
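Whatever triggers the update, the file itself should be replaced atomically, otherwise drawtext may read a half-written file. A minimal sketch, assuming drawtext is run with textfile=... and reload=1 (my assumption about the linked setup); update_overlay_text is a hypothetical helper name:

```shell
# update_overlay_text FILE TEXT
# Writes TEXT to a temporary file next to FILE, then renames it over FILE.
# A rename within the same directory replaces the file in one step.
update_overlay_text() (
  target=$1
  text=$2
  tmp="$target.tmp.$$"
  printf '%s\n' "$text" > "$tmp" && mv -f "$tmp" "$target"
)

# Example: write the text for the song that is about to start.
overlay_dir=$(mktemp -d)
update_overlay_text "$overlay_dir/nowplaying.txt" "Some Band - Some Song"
cat "$overlay_dir/nowplaying.txt"
```

A script that knows each song's duration could call this once per song and sleep in between, but keeping that in sync with the stream is still the part you need to work out.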
Lazy solution
Pre-make a video per song including the title and artist info. Simple Bash example:
audio=input.mp3; ffmpeg -stream_loop -1 -i video.mp4 -i "$audio" -filter_complex "[0:v]scale=1280:720:force_original_aspect_ratio=increase,crop=1280:720,setsar=1,fps=25,drawtext=text='$(ffprobe -v error -show_entries format_tags=title,artist -of default=nw=1:nk=1 "$audio")':fontsize=18:fontcolor=white:x=10:y=h-th-10,format=yuv420p[v]" -map "[v]" -map 1:a -c:v libx264 -c:a aac -ac 2 -ar 44100 -g 50 -b:v 2000k -maxrate 2000k -bufsize 6000k -shortest "${audio%.*}.mp4"
Now that you already did the encoding, and everything is conformed to the same attributes for proper concatenation, you can probably just stream copy your playlist to YouTube (but I didn't test):
ffmpeg -re -f concat -i input.txt -c copy -f flv rtmp://a.rtmp.youtube.com/live2/my-key
Refer to your previous question on how to dynamically update the playlist.
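For reference, input.txt for the concat demuxer is a list of file '...' lines. A small sketch that generates it from a list of pre-made videos (make_concat_list is a hypothetical helper name; a single quote inside a file name is written as '\''):

```shell
# make_concat_list FILE...
# Prints one concat-demuxer "file" directive per argument.
make_concat_list() (
  for f in "$@"; do
    # Close the quote, emit an escaped quote, reopen the quote: ' -> '\''
    escaped=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
    printf "file '%s'\n" "$escaped"
  done
)

make_concat_list "song one.mp4" "it's late.mp4"
```

Redirect the output to input.txt. Depending on the paths, the concat demuxer may also need -safe 0 before -i input.txt.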
References:
FFmpeg Wiki: Streaming to YouTube
Resizing videos with ffmpeg to fit into specific size
How to concatenate videos in ffmpeg with different attributes?

How to delay audio after a specific position with ffmpeg?

I have a 10-second a.mp4 with two streams: Stream #0 is a video stream and Stream #1 is an audio stream.
Now, I want to delay the audio stream by 4 seconds after the time position 00:03. That is to say, in the output file I want 00:00-00:03 to be the original audio, 00:03-00:07 to have no sound, and 00:07-00:14 to be the original 00:03-00:10 audio.
I've tried this:
ffmpeg -i a.mp4 -t 00:00:03 -i a.mp4 -itsoffset 4 -ss 00:00:03 -i a.mp4 -map 0:v -map 1:a -map 2:a -codec copy output.mp4
But it seems that there are two audio streams in output.mp4 and only one of them can be played at a time. Then I tried the amix filter:
ffmpeg -i a.mp4 -t 00:00:03 -i a.mp4 -itsoffset 4 -ss 00:00:03 -i a.mp4 -filter_complex "[1:a][2:a] amix=inputs=2" -map 0:v output.mp4
But that doesn't work either. I'm new to ffmpeg, so I have no idea what to do now. Any ideas? Thanks very much!
Use the asetpts filter to change timestamps, and aresample to (optionally) insert silence in that gap.
ffmpeg -i a.mp4 -af "asetpts='if(lt(T\,3),PTS,PTS+4/TB)',aresample=async=1" -c:v copy output.mp4
Test without aresample to see if your player is tolerant of large gaps in the audio stream.
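To reuse this with other positions, the filter string can be generated from the split point and the gap. A minimal sketch (delay_audio_filter is a hypothetical helper name; both arguments are in seconds):

```shell
# delay_audio_filter SPLIT GAP
# Prints an -af value that shifts all audio at or after SPLIT seconds by GAP seconds.
delay_audio_filter() {
  # \\\\ inside the double-quoted format ends up as one literal backslash,
  # which escapes the comma for the filter option parser.
  printf "asetpts='if(lt(T\\\\,%s),PTS,PTS+%s/TB)',aresample=async=1\n" "$1" "$2"
}

delay_audio_filter 3 4
# Intended use: ffmpeg -i a.mp4 -af "$(delay_audio_filter 3 4)" -c:v copy output.mp4
```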

ffmpeg to calculate audio/visual difference between compressed and non-compressed video

I'm trying to calculate the audio + visual difference between a harshly compressed video file and one that hasn't been.
I'm using pipes because ultimately I wish this to take src from a camera stream.
I've managed to get the video results that I'm looking for, but I'm struggling with the audio.
I've added a line to invert the phase of the compressed audio, so that when they add up in the blend they should almost cancel each other out, but that doesn't happen.
ffmpeg -i input.avi -f avi -c:v libxvid -qscale:v 30 -c:a wmav1 - | \
ffmpeg -i - -f avi -af "aeval='-val(0)':c=same" - | \
ffmpeg -i input.avi -i - -filter_complex "blend=all_mode=difference" -c:v libx264 -crf 18 -f avi - | \
ffplay -
I can still hear all the audio, when what I should be hearing are solely compression artifacts. thx
To preface, I'm not sure your method would identify audio compression 'artifacts'.
Your command doesn't perform any audio comparison, it only inverts a single channel. Also, the audio and video are compressed twice and the codecs the last ffmpeg command receives are the default AVI codecs of mpeg4 and mp3.
Use
ffmpeg -i input.avi -f matroska -c:v libxvid -qscale:v 30 -c:a wmav1 - |\
ffmpeg -i input.avi -i - -filter_complex "[0][1]blend=all_mode=difference;[1]aselect=gt(n\,0),asetpts=PTS-STARTPTS[1a];[0][1a]amerge,aeval=val(0)-val(1):c=mono" -c:v rawvideo -c:a pcm_s16le -f matroska - |\
ffplay -
I assume your audio is mono. If your audio has N channels, your aeval will need N expressions, where the Mth expression is val(M-1)-val(N+M-1).
I also trim out the first encoded audio frame in order to mitigate encoder delay that Paul mentioned, and it seems to work here.
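For more than one channel, the expression list can be generated from the rule above. A small sketch (aeval_diff_exprs is a hypothetical helper name); aeval separates per-channel expressions with |, and the c= channel layout still has to be set to match N, which this leaves to the caller:

```shell
# aeval_diff_exprs N
# Prints the '|'-separated aeval expressions for N-channel audio after amerge:
# expression M (1-based) is val(M-1)-val(N+M-1).
aeval_diff_exprs() (
  n=$1
  m=1
  out=""
  while [ "$m" -le "$n" ]; do
    expr_m="val($((m - 1)))-val($((n + m - 1)))"
    if [ -z "$out" ]; then out=$expr_m; else out="$out|$expr_m"; fi
    m=$((m + 1))
  done
  printf '%s\n' "$out"
)

aeval_diff_exprs 2
```

With N=1 this gives val(0)-val(1), the mono expression used in the command.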
There might be some delay introduced with encoded audio samples. Also your command is incorrect.

Append black frames to video when audio is longer in ffmpeg

I'm trying to use ffmpeg as a video editor, mostly because my regular video editor dropped more frames than I was comfortable with.
ffmpeg -i "videoplayback1" -t 00:09:51 -i "audioplayback1" -t 00:09:54.38 -vcodec libx264 -crf 20 -acodec copy "playback1.mp4"
As you can see, I'm trimming the video shorter than the audio, but what I want is the opposite of the -shortest switch: the file should continue for the duration of the audio -t, with actual black frames added for the remainder of that time.
As it is now, the video is still clipped as if I were using the -shortest switch. I tried some -vf and -filter_complex variants, but either I get errors, or the audio is still cut short and the video freezes while the duration is that of the longest -t.
How would I go about adding black frames for as long as the audio is playing?
Your command is malformed in that it's not trimming the video shorter than the audio. Option placement matters. Input options for an input go before that input, so
ffmpeg -t 00:09:51 -i "videoplayback1" -t 00:09:54.38 -i "audioplayback1" -vcodec libx264 ...
For your editing requirement, I would drop the video trim and use the drawbox filter to blacken the frame after the desired video trim point.
ffmpeg -i "videoplayback1" -t 00:09:54.38 -i "audioplayback1" -vf "drawbox=t=fill:enable='gt(t,591)'" -shortest -c:v libx264 -crf 20 -c:a copy "playback1.mp4"
drawbox is set to draw over the whole frame with the default color of black after 591 seconds of video. The filter string is wrapped in double quotes so that the inner single quotes reach ffmpeg and protect the comma in the enable expression. -shortest terminates the output.
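The 591 is simply 00:09:51 expressed in seconds. If the trim point changes often, it can be converted on the fly. A small sketch (hms_to_seconds is a hypothetical helper name; it expects HH:MM:SS with optional decimals):

```shell
# hms_to_seconds HH:MM:SS[.fraction]
# Prints the timestamp as a number of seconds.
hms_to_seconds() {
  printf '%s\n' "$1" | awk -F: '{ printf "%.10g\n", $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds 00:09:51
# Intended use inside the filter:
#   -vf "drawbox=t=fill:enable='gt(t,$(hms_to_seconds 00:09:51))'"
```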

ffmpeg merge silent video with another video+audio

I want to create, in a single command, a video from 3 sources:
a silent background video;
a smaller video to be overlaid (same length as item 1), KEEPING its AUDIO;
a PNG logo to be overlayed
I can create the video but cannot get the audio track. I don't understand if -vf is supposed to work in this case. This is what I've tried to do :
ffmpeg.exe -y -i MASTER_SILENT_VIDEO.mp4 -vf "movie=SMALLER_VIDEO_WITH_AUDIO.flv, scale=320:-1[inner];movie=MY_LOGO.png[inner2]; [in][inner] overlay=800:480,amerge [step1]; [step1][inner2] overlay=30:30 [out]" completed.mp4
The "amerge" filter should do the audio merging job, but of course it doesn't work. I've found similar questions involving -map or filtergraph but they refer to mixing a video source and an audio source; I tried several filtergraph examples without success. Any idea?
overlay one video over other using audio from one input
Use -filter_complex, eliminate the movie source filters, and explicitly define output streams with -map:
ffmpeg -y -i main.mp4 -i overlay_with_audio.flv -i logo.png -filter_complex
"[1:v]scale=320:-1[scaled];
[0:v][scaled]overlay=800:480[bg];
[bg][2:v]overlay=30:30,format=yuv420p[video]"
-map "[video]" -map 1:a -movflags +faststart
output.mp4
You may have to provide additional options to the overlay filters depending on the length of the inputs and how you want overlay to react, but because you did not provide the complete console output from your command I had to make a generic, less efficient, and possibly incorrect example.
overlay one video over other merging audio from both inputs
ffmpeg -y -i main.mp4 -i overlay_with_audio.flv -i logo.png -filter_complex
"[1:v]scale=320:-1[scaled];
[0:v][scaled]overlay=800:480[bg];
[bg][2:v]overlay=30:30,format=yuv420p[video];
[0:a][1:a]amerge=inputs=2[audio]"
-map "[video]" -map "[audio]" -ac 2 -movflags +faststart
output.mp4
I'm assuming both inputs are stereo and that you want a stereo output. Also see FFmpeg Wiki: Audio channel Manipulation - 2 × stereo → stereo.
