Mixing various audio and video sources into a single video - audio

I've already read FFmpeg - Overlay one video onto another video?, How to overlay 2 videos at different time over another video in single ffmpeg command?, FFmpeg - Multiple videos with 4 areas and different play times (and many similar questions tagged [ffmpeg] about setpts), and the following code is working, but I'm sure we can simplify it, and have a more elegant solution.
I'd like to mix multiple sources (image and sound), with different starting points:
t (seconds)         0   1   2   3   4   5   6   7   8   9   10  11  12  13
test.png            [-------------------------------]
a.mp3                       [-------]
without_sound.mp4                               [-------------------] (overlay at x,y=200,200)
b.mp3                                   [---]
with_sound.mp4                  [---------------------------------------] (overlay at x,y=100,100)
This works:
ffmpeg -i test.png
-t 2 -i a.mp3
-t 5 -i without_sound.mp4
-t 1 -i b.mp3
-t 10 -i with_sound.mp4
-filter_complex "
[0]setpts=PTS-STARTPTS[s0];
[1]adelay=2000^|2000[s1];
[2]setpts=PTS-STARTPTS+7/TB[s2];
[3]adelay=5000^|5000[s3];
[4]setpts=PTS-STARTPTS+3/TB[s4];
[4:a]adelay=3000^|3000[t4];
[s1][s3][t4]amix=inputs=3[outa];
[s0][s4]overlay=100:100[o2];
[o2][s2]overlay=200:200[outv]
" -map [outa] -map [outv]
out.mp4 -y
but:
is it normal that we have to use both setpts and adelay? I have tried without adelay and then the sound is not shifted. Said differently, is there a way to simplify:
[4]setpts=PTS-STARTPTS+3/TB[s4];
[4:a]adelay=3000^|3000[t4];
?
is there a way to do it with setpts and asetpts only? When I replaced adelay=5000|5000 with asetpts=PTS-STARTPTS+5/TB, and did the same for the other audio inputs, it didn't give the expected time-shifting (see below)
in similar questions/answers I often see overlay=...:enable='between(t,...,...)', here it seems it is not needed, why?
More generally, how would you simplify this "mix multiple audio and video" ffmpeg code?
More details about the second bullet point: if we replace adelay by asetpts,
-filter_complex "
[0]setpts=PTS-STARTPTS[s0];
[1]asetpts=PTS-STARTPTS+2/TB[s1];
[2]setpts=PTS-STARTPTS+7/TB[s2];
[3]asetpts=PTS-STARTPTS+5/TB[s3];
[4]setpts=PTS-STARTPTS+3/TB[s4];
[4:a]asetpts=PTS-STARTPTS+3/TB[t4];
[s1][s3][t4]amix=inputs=3[outa];
[s0][s4]overlay=100:100[o2];
[o2][s2]overlay=200:200[outv]
it doesn't work: [3] should begin at 0'05" and [4:a] at 0'03", but they all begin at the same time as [1], i.e. at 0'02".
It seems that amix only takes the first asetpts into consideration and discards the others; is that true?

is it normal that we have to use both setpts and adelay?
Yes, the former is for video streams; the latter, for audio. asetpts is not suitable for use with amix since the latter ignores starting time offsets. adelay fills in with silence from 0 to the desired offset.
I often see overlay=...:enable='between(t,...,...)', here it seems it is not needed, why?
Overlay syncs its main and overlay video frames by timestamps. enable is needed if one wishes to disable overlay when synced frames are available for both inputs.
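As a rough illustration of the rule above (setpts to shift video, adelay to pad audio with leading silence), here is a minimal Python sketch that assembles the same filter graph as the question. The helper names video_offset and audio_offset are hypothetical, and stereo audio (two adelay values per stream) is an assumption carried over from the question's 2000|2000 style. The ^| in the question looks like a shell escape for |, so it is left out here on the assumption that the string is passed to ffmpeg as a single argument.

```python
def video_offset(label_in, label_out, start_s):
    """Shift a video stream so its first frame lands at start_s seconds."""
    return f"[{label_in}]setpts=PTS-STARTPTS+{start_s}/TB[{label_out}]"


def audio_offset(label_in, label_out, start_s, channels=2):
    """Pad an audio stream with start_s seconds of leading silence.

    adelay takes one delay in milliseconds per channel, separated by '|'.
    """
    ms = int(round(start_s * 1000))
    delays = "|".join([str(ms)] * channels)
    return f"[{label_in}]adelay={delays}[{label_out}]"


# Same timeline as the question: offsets in seconds per input.
chains = [
    video_offset("0", "s0", 0),
    audio_offset("1", "s1", 2),
    video_offset("2", "s2", 7),
    audio_offset("3", "s3", 5),
    video_offset("4:v", "s4", 3),
    audio_offset("4:a", "t4", 3),
    "[s1][s3][t4]amix=inputs=3[outa]",
    "[s0][s4]overlay=100:100[o2]",
    "[o2][s2]overlay=200:200[outv]",
]
filter_complex = ";".join(chains)
print(filter_complex)
```

The printed string can be passed as the value of -filter_complex. Keeping one helper per stream type makes it harder to forget the audio half of an input that carries both video and audio, such as with_sound.mp4.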

Related

Beeping out portions of an audio file using ffmpeg

I'm trying to use ffmpeg to beep out sections of an audio file (say 10-15 and 20-30). However, only the first portion (10-15) gets beeped, whilst the second portion only gets muted.
ffmpeg -i input.mp3 -filter_complex "[0]volume=0:enable='between(t,10,15)+between(t,20,30)'[main];sine=d=5:f=800,adelay=10s,pan=stereo|FL=c0|FR=c0[beep];[main][beep]amix=inputs=2" output.wav
Using this as my reference, but not able to make much progress.
Edit: sine=d=5 clearly sets the duration to 5 seconds (my bad). It seems this command can only add a beep to one specific portion. How can I change it to add beeps to several sections with varying durations?
ffmpeg -i input.mp3 -af "volume=enable='between(t,5,10)':volume=0[main];sine=d=5:f=800,adelay=5s,pan=stereo|FL=c0|FR=c0[beep];[main][beep]amix=inputs=2,
volume=enable='between(t,15,20)':volume=0[main];sine=d=5:f=800,adelay=15s,pan=stereo|FL=c0|FR=c0[beep];[main][beep]amix=inputs=2, volume=enable='between(t,40,50)':volume=0[main];sine=d=10:f=800,adelay=40s,pan=stereo|FL=c0|FR=c0[beep];[main][beep]amix=inputs=2" output.wav
The above code beeps 5-10, 15-20 and 40-50
This seems to work: separate the per-section settings with a comma and change three values in each one: the between range, sine=d=x (where x is the beep duration) and adelay=ys (where y is the delay, i.e. when the beep starts). The between range is then (t, y, y+x).
References : Mute specified sections of an audio file using ffmpeg and FFMPEG:Adding beep sound to another audio file in specific time portions
I would love to know an easier, more convenient way of doing this, so I'm not marking this as an answer.
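One way to avoid editing three values per section by hand is to generate the filter graph from a list of (start, end) pairs. The sketch below is an assumption-laden illustration, not a tested recipe: beep_filter is a hypothetical helper, and it follows the shape of the first command in the question (one volume filter that mutes all sections, plus one sine source per section, all fed into a single amix).

```python
def beep_filter(sections, freq=800):
    """Build a filter graph that mutes and beeps each (start_s, end_s) section."""
    # One volume filter mutes every section: the enable expression is a sum
    # of between() terms, so it is non-zero inside any listed section.
    enable = "+".join(f"between(t,{s},{e})" for s, e in sections)
    chains = [f"[0]volume=0:enable='{enable}'[main]"]
    labels = ["[main]"]
    for i, (s, e) in enumerate(sections):
        label = f"[beep{i}]"
        # One sine source per section: duration = end - start, delayed to start.
        chains.append(
            f"sine=d={e - s}:f={freq},adelay={s}s,"
            f"pan=stereo|FL=c0|FR=c0{label}"
        )
        labels.append(label)
    # Mix the muted main track with all beep tracks in a single amix.
    chains.append(f"{''.join(labels)}amix=inputs={len(labels)}")
    return ";".join(chains)


print(beep_filter([(5, 10), (15, 20), (40, 50)]))
```

The result is meant to be used as the -filter_complex value. Note that amix may scale its inputs down as the number of inputs grows, so the overall level can change; ffmpeg -h filter=amix lists the options available in a given build.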

How do I mix multiple audio tracks with FFmpeg and adjust each volume?

Let's say I have an input .mp4 file that contains 4 audio tracks.
How can I change their volumes independently and convert it to a new file that contains all 4 audio tracks mixed together and stored in the first audio track? For example, I want the first, second and third audio tracks from the input file to be double their original volume and the fourth to be half its original volume, all saved in the output file's first audio track. What would that command look like?
Here you can find many good answers: How to overlay/downmix two audio files using ffmpeg
where the most comprehensive one links to https://trac.ffmpeg.org/wiki/AudioChannelManipulation
I recently had a similar use case: freely mixing 6 mono tracks of a multi-track recording to stereo output with different volumes on either or both output channels, which can be achieved like this:
ffmpeg -i 0.flac -i 1.flac -i 2.flac -i 3.flac -i 4.flac -i 5.flac \
-filter_complex "[0:a][1:a][2:a][3:a][4:a][5:a]amerge=inputs=6,pan=stereo|c0=c0+1.2*c1+1.2*c2+1.3*c3+c4|c1=c0+1.3*c3+c4+0.8*c5[a]" \
-map "[a]" output.flac
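For the case actually asked about (four audio tracks inside one input file, each with its own gain), a possible approach is one volume filter per track followed by amix. The following Python sketch only builds the argument list; mix_tracks, the file names input.mp4 and output.mp4, and the choice to copy the video stream are all assumptions for illustration.

```python
def mix_tracks(gains, input_index=0):
    """Build a filter graph applying one gain per audio track, then mixing."""
    chains, labels = [], []
    for i, gain in enumerate(gains):
        label = f"[a{i}]"
        # [0:a:i] selects the i-th audio stream of the given input.
        chains.append(f"[{input_index}:a:{i}]volume={gain}{label}")
        labels.append(label)
    chains.append(f"{''.join(labels)}amix=inputs={len(gains)}[mix]")
    return ";".join(chains)


# Tracks 1-3 at double volume, track 4 at half volume, as in the question.
graph = mix_tracks([2, 2, 2, 0.5])

# Hypothetical command line: keeps the video as-is and writes one mixed track.
cmd = [
    "ffmpeg", "-i", "input.mp4",
    "-filter_complex", graph,
    "-map", "0:v", "-map", "[mix]",
    "-c:v", "copy",
    "output.mp4",
]
print(" ".join(cmd))
```

Passing cmd as a list (for example to subprocess.run) avoids shell quoting issues with the brackets in the graph. As with any amix use, the mixed level may be scaled down relative to the individual gains, so the result is worth checking by ear.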

FFMPEG reducing Generation Loss when inserting many videos into another video

I am trying to insert many miniclip .mp4 files into a main.mp4 video. Although I have been able to do this using this solution, I seem to suffer from generation loss.
The command I am using (within a Python script, in a loop at many different intervals) is:
ffmpeg -i main.mp4 -i miniclipX.mp4 -filter_complex "[0:v]drawbox=t=fill:enable='between(t,5,6.4)'[bg];[1:v]setpts=PTS+5/TB[fg];[bg][fg]overlay=x=(W-w)/2:y=(H-h)/2:eof_action=pass;[1:a]adelay=5s:all=1[a1];[0:a][a1]amix" output.mp4
(Then renaming output.mp4 to main.mp4 within a loop)
Is there any way to either:
A) Reduce generation loss by using certain flags
or
B) Include many different input files and many different -filter_complex chains in a single command to achieve what I am after?
Because you did not provide the ffmpeg log (and therefore there is no info about your ffmpeg or your inputs), for this answer I'll assume all videos are the same width and height.
Example to show miniclip1.mp4 at 5 seconds and miniclip2.mp4 at 10 seconds:
ffmpeg -i main.mp4 -i miniclip1.mp4 -i miniclip2.mp4 -filter_complex
"[1:v]setpts=PTS+5/TB[offset1];[0:v][offset1]overlay=x=(W-w)/2:y=(H-h)/2:eof_action=pass[bg];
[2:v]setpts=PTS+10/TB[offset2];[bg][offset2]overlay=x=(W-w)/2:y=(H-h)/2:eof_action=pass;
[1:a]adelay=5s:all=1[a1];
[2:a]adelay=10s:all=1[a2];
[0:a][a1][a2]amix=inputs=3"
output.mp4
Command was broken into multiple lines so it is easier to read. Make it one line when executing.
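Since the question drives ffmpeg from a Python loop, the single-command approach above can be generated for any number of clips, so the video is encoded once rather than once per clip. This is a minimal sketch under the same assumption as the answer (all videos share the same width and height); insert_clips and its label names are hypothetical.

```python
def insert_clips(main, clips, output):
    """Return an ffmpeg argument list overlaying each (path, start_s) clip."""
    if not clips:
        raise ValueError("at least one clip is required")
    args = ["ffmpeg", "-i", main]
    chains = []
    audio_labels = ["[0:a]"]
    background = "[0:v]"
    for n, (path, start) in enumerate(clips, start=1):
        args += ["-i", path]
        # Shift the clip's video to its start time, then overlay it centred.
        chains.append(f"[{n}:v]setpts=PTS+{start}/TB[offset{n}]")
        result = f"[bg{n}]"
        chains.append(
            f"{background}[offset{n}]"
            f"overlay=x=(W-w)/2:y=(H-h)/2:eof_action=pass{result}"
        )
        background = result  # the next clip is overlaid on this result
        # Delay the clip's audio by the same amount.
        chains.append(f"[{n}:a]adelay={start}s:all=1[a{n}]")
        audio_labels.append(f"[a{n}]")
    chains.append(
        f"{''.join(audio_labels)}amix=inputs={len(audio_labels)}[aout]"
    )
    args += ["-filter_complex", ";".join(chains)]
    args += ["-map", background, "-map", "[aout]", output]
    return args


args = insert_clips(
    "main.mp4", [("miniclip1.mp4", 5), ("miniclip2.mp4", 10)], "output.mp4"
)
print(args)
```

The returned list can be handed to subprocess.run without a shell, which also removes the need to collapse the command onto one line.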

How to divide my video horizontally using ffmpeg (without any other side-effects)?

I am processing a video (640 x 1280). I want to split it horizontally into 2 separate videos (each 640 x 640), then combine them side by side into a single video (1280 x 640). I researched this online, and my issue was solved and not solved at the same time.
I made a batch file and added these commands to it:
ffmpeg -i input.mp4 -filter_complex "[0]crop=iw:ih/2:0:0[top];[0]crop=iw:ih/2:0:oh[bottom]" -map "[top]" top.mp4 -map "[bottom]" bottom.mp4
ffmpeg -i top.mp4 -i bottom.mp4 -filter_complex hstack output.mp4
Yes, my task got solved, but several other issues came out of it:
1.) My output video has no audio. I have no idea why.
2.) My input video file is 258 MB, but the result is only 38 MB. I have no idea what is happening. What's more, when I looked closely, the output looked nearly the same (only the animation was not as smooth as in the input file).
3.) It takes too much time (I know that computing takes some time, but maybe there is some way, or some trade-off, to make the process much quicker).
Thanks in advance for helping me
Combine your two commands
ffmpeg -i input.mp4 -filter_complex "[0]crop=iw:ih/2:0:0[top];[0]crop=iw:ih/2:0:oh[bottom];[top][bottom]hstack" -preset fast -c:a copy output.mp4
If you need it to encode faster then use a faster -preset as shown in FFmpeg Wiki: H.264.
x264 is a better encoder than your phone so it is not surprising that the file size is smaller.
Or use your player to do it
No need to wait for encoding. Just have your player do everything upon playback. This does not output a file, but only plays the re-arranged video. Example using mpv:
mpv --lavfi-complex="[vid1]split[v0][v1];[v0]crop=iw:ih/2:0:0[c0];[v1]crop=iw:ih/2:0:oh[c1];[c0][c1]hstack[vo]" input.mp4

Ffmpeg: alternating audio languages in resulting movie for language learning

I need to convert a video with multiple audio languages into a video with a single audio stream in which 2 languages alternate repeatedly:
(10sec Lang2) + (15sec Lang3) + (10sec Lang2) + (15sec Lang3) + ... and so on till the end.
I assume it should be done with piping in and changing audio streams. (I've read the ffmpeg piping documentation but didn't quite understand it.)
I've done the audio switching (with scripting on Windows) by changing audio languages live in the video player, but I need a better, cross-platform solution for a little kid: a video prepared in advance.
If possible, I would also like to adjust the loudness of one of the input audio streams to match the other.
P.S. I think it would be useful for many married programmers to show little kids bilingual cartoons (to prepare for language learning). By balancing the 10/15 sec split you may retain the kid's attention; the older they grow, the more native language they demand.
Not directly relevant, but for reference, this is the kind of command I have experience with:
%ffmpeg% -y -f concat -safe 0 -i %playlist% -i %picture% -map:v 0 -map:v 1 -c:v copy -disposition:v:0 attached_pic -ac 1 -af aresample=resampler=soxr -ar 16000 -%title% -%album% -%artist% %lyrics% -c:a aac -q:a 1 %output%
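The alternation pattern described in the question (10 s of one language, then 15 s of the other, repeating) is periodic, so it can be written as a function of time. The Python sketch below is only an illustration of that arithmetic: alternating_segments lists the segments for a given duration, and mute_expressions produces expressions intended for the volume=0:enable='...' technique seen in the beep question above. Both names are hypothetical, and the expressions have not been checked against real input here.

```python
def alternating_segments(total_s, first_s=10, second_s=15):
    """Return (start, end, which) tuples covering 0..total_s.

    which is 0 for the first language and 1 for the second.
    """
    segments = []
    t, which = 0, 0
    while t < total_s:
        length = first_s if which == 0 else second_s
        end = min(t + length, total_s)  # clip the last segment to the end
        segments.append((t, end, which))
        t, which = end, 1 - which
    return segments


def mute_expressions(first_s=10, second_s=15):
    """Return (mute_first, mute_second) ffmpeg-style time expressions.

    Each expression is non-zero while that language should be silent.
    """
    period = first_s + second_s
    mute_first = f"gte(mod(t,{period}),{first_s})"
    mute_second = f"lt(mod(t,{period}),{first_s})"
    return mute_first, mute_second


print(alternating_segments(60))
print(mute_expressions())
```

With these two expressions, each language track could be muted during the other's turn and the two tracks mixed afterwards; which audio stream index corresponds to which language depends on the input file and is left open here.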

Resources