Mix additional audio file with video(+audio) in ffmpeg

I'm trying to mix an additional audio file with a video that already contains audio. The problem is that I already have a complex ffmpeg command and don't know how to combine the two.
This is my existing ffmpeg command, which uses some offsets, replaces the video's embedded audio with the additional audio file, and also overlays a few gauges and a watermark on the video.
ffmpeg -y
-ss 00:00:01.213 -i videoFile.mp4
-ss 00:00:03.435 -i audioFile.wav
-i watermark.png
-framerate 6 -i gauge1_path/img-%04d.png
-framerate 1 -i gauge2_path/img-%04d.png
-framerate 2 -i gauge3_path/img-%04d.png
-framerate 2 -i gauge4_path/img-%04d.png
-framerate 2 -i gauge5_path/img-%04d.png
-framerate 2 -i gauge6_path/img-%04d.png
-filter_complex "[0][2]overlay=(21):(H-h-21)[ovr0];
[ovr0][3]overlay=(W-w-21):(H-h-21)[ovr1];
[ovr1][4]overlay=(W-w-21):(H-h-333)[ovr2];
[ovr2][5]overlay=(W-w-21):(H-h-418)[ovr3];
[ovr3][6]overlay=(W-w-21):(H-h-503)[ovr4];
[ovr4][7]overlay=(W-w-21):(H-h-588)[ovr5];
[ovr5][8]overlay=(W-w-21):(H-h-673)"
-map 0:v -map 1:a -c:v libx264 -preset ultrafast -crf 23 -t 00:05:10.000 output.mp4
Now I would like to use ffmpeg's amix to mix both audio streams instead of replacing one with the other, ideally with the ability to set their volumes. But the official amix documentation says nothing about volume.
Separately, both commands seem to work OK.
ffmpeg -y -i video.mp4 -i audio.mp3 -filter_complex [0][1]amix=inputs=2[a] -map 0:v -map [a] -c:v copy output.mp4
and
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex [0][2]overlay=(21):(H-h-21)[ovr0] -map [ovr0]:v -map 1:a -c:v libx264 -preset ultrafast -crf 23 output.mp4
but together
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex [0][1]amix=inputs=2[a];[a][2]overlay=(21):(H-h-21)[ovr0] -map [ovr0]:v -map [a] -c:v libx264 -preset ultrafast -crf 23 output.mp4
I'm getting an error:
ffmpeg version N-93886-gfbdb3aa179 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 8.3.1 (GCC) 20190414
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 28.100 / 56. 28.100
libavcodec 58. 52.101 / 58. 52.101
libavformat 58. 27.103 / 58. 27.103
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 53.101 / 7. 53.101
libswscale 5. 4.101 / 5. 4.101
libswresample 3. 4.100 / 3. 4.100
libpostproc 55. 4.100 / 55. 4.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 1970-01-01T00:00:00.000000Z
encoder : Lavf53.24.2
Duration: 00:00:29.57, start: 0.000000, bitrate: 1421 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1032 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
creation_time : 1970-01-01T00:00:00.000000Z
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 383 kb/s (default)
Metadata:
creation_time : 1970-01-01T00:00:00.000000Z
handler_name : SoundHandler
[mp3 @ 0000015e2f934ec0] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'audio.mp3':
Duration: 00:00:45.33, start: 0.000000, bitrate: 128 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
Input #2, png_pipe, from 'watermark.png':
Duration: N/A, bitrate: N/A
Stream #2:0: Video: png, rgb24(pc), 100x56 [SAR 3779:3779 DAR 25:14], 25 tbr, 25 tbn, 25 tbc
[Parsed_amix_0 @ 0000015e2ff2e940] Media type mismatch between the 'Parsed_amix_0' filter output pad 0 (audio) and the 'Parsed_overlay_1' filter input pad 0 (video)
[AVFilterGraph @ 0000015e2f91c600] Cannot create the link amix:0 -> overlay:0
Error initializing complex filters.
Invalid argument
So my question: is it possible to combine amix and overlay, and if so, how and in what order should they be used? Or should I look for something else because amix can't set volume levels?
Thanks in advance!

Use
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex "[0][1]amix=inputs=2,volume=2[a];[0][2]overlay=(21):(H-h-21)[ovr0]" -map "[ovr0]" -map "[a]" -c:v libx264 -preset ultrafast -crf 23 output.mp4
You're sending the mixed audio to the overlay filter, which requires video input. Overlay should be fed the original video stream instead. The audio output [a] should be left alone; it is consumed as a mapped output stream.
A volume filter is added after amix to restore the level: amix scales down each input's volume to avoid clipping.
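If you also want to control each input's level individually, a minimal sketch (the gain factors 0.5 and 1.5 are placeholders, not from the question) applies a volume filter to each audio stream before amix:
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex "[0:a]volume=0.5[a0];[1:a]volume=1.5[a1];[a0][a1]amix=inputs=2[a];[0][2]overlay=(21):(H-h-21)[ovr0]" -map "[ovr0]" -map "[a]" -c:v libx264 -preset ultrafast -crf 23 output.mp4
Adjust 0.5 and 1.5 to set the relative levels of the original and added audio.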

Related

ffmpeg creating image slideshow in audio concat command

I have a working ffmpeg command which, when run in the Windows 10 command prompt, combines 3 audio files with 1 image file into a video.
The single image file is used as a static image for the entire video's duration (16:50, or 1010 seconds).
song1.mp3 00:00:00-00:06:23
song2.mp3 00:06:23-00:12:04
song3.wav 00:12:04-00:16:50
I am trying to change this so that instead of a single static image, I have a slideshow of 4 different images play during the duration of the video.
I was thinking of doing this by taking the total length in seconds (1010) and dividing it by 4, which would mean 252.5 seconds per image, so a timeline like:
img1: 0 - 252.5
img2: 252.5 - 505
img3: 505 - 757.5
img4: 757.5 - 1010
How can I change my command so that the output video is a slideshow of images with the timestamps above?
Here is my working ffmpeg command, with my own annotations which hopefully explain some of it:
ffmpeg -loop 1 -framerate 2 -i "C:\Users\...local_filepath...\ffmpeg-commands\images\img1.png" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song1.mp3" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song2.mp3" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song3.wav" -c:a pcm_s32le -filter_complex concat=n=3:v=0:a=1 -vcodec libx264 -bufsize 3M -filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2" -crf 18 -pix_fmt yuv420p -shortest -tune stillimage -t 1010 "C:\Users\...local_filepath...\ffmpeg-commands\videos\outputvideo.mkv"
ffmpeg
//loop single image
-loop 1
//set video framerate
-framerate 2
//take image and audio files
-i "C:\Users\...local_filepath...\ffmpeg-commands\images\img1.png"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song1.mp3"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song2.mp3"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song3.wav"
//set audio codec
-c:a pcm_s32le
//concatenate the 3 audio files into a single audio stream
-filter_complex concat=n=3:v=0:a=1
//set video codec
-vcodec libx264
//set the rate-control buffer size (limits output bitrate fluctuation)
-bufsize 3M
//scale single image to 640x638 resolution
-filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2"
//range of the CRF scale is 0–51, where 0 is lossless (for 8 bit only, for 10 bit use -qp 0), 23 is the default, and 51 is worst quality
-crf 18
//set pixel format
-pix_fmt yuv420p
//try to ensure video is not longer than the concatenated audio files
-shortest
//use still image
-tune stillimage
//specify output video duration to ensure it is exactly the same length as the concatenated audio
-t 1010
//specify output filepath
"C:\Users\...local_filepath...\ffmpeg-commands\videos\outputvideo.mkv"
And here is the raw command as you would run it.
ffmpeg -loop 1 -framerate 2 -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\images\img1.png" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song1.mp3" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song2.mp3" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song3.wav" -c:a pcm_s32le -filter_complex concat=n=3:v=0:a=1 -vcodec libx264 -bufsize 3M -filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2" -crf 18 -pix_fmt yuv420p -shortest -tune stillimage -t 1010 "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\videos\concatVideo-250771.mkv"
And here is my ffmpeg version info:
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.3.1 (GCC) 20200621
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
We may use the loop filter on each image, and concatenate the "looped" segments:
ffmpeg -r 2 -i img1.png -r 2 -i img2.png -r 2 -i img3.png -r 2 -i img4.png -i song1.mp3 -i song2.mp3 -i song3.wav -filter_complex "[4:a][5:a][6:a]concat=n=3:v=0:a=1[a];[0:v]scale=w=640:h=638,setsar=1,loop=505:505[v0];[1:v]scale=w=640:h=638,setsar=1,loop=505:505[v1];[2:v]scale=w=640:h=638,setsar=1,loop=505:505[v2];[3:v]scale=w=640:h=638,setsar=1,loop=505:505[v3];[v0][v1][v2][v3]concat=n=4:v=1:a=0,pad=ceil(iw/2)*2:ceil(ih/2)*2[v]" -map "[v]" -map "[a]" -c:a pcm_s32le -c:v libx264 -bufsize 3M -crf 18 -pix_fmt yuv420p -tune stillimage -t 1010 outputvideo.mkv
To keep the command shorter, assume all the files are in the same folder.
-r 2 before each input image sets the framerate of that input to 2 fps.
[4:a][5:a][6:a]concat=n=3:v=0:a=1[a] - concatenate the audio input files as in your question.
[0:v]scale=w=640:h=638,setsar=1 and [1:v]scale=w=640:h=638,setsar=1 ... - scale each of the images to 640x638 (since the input images may have different sizes, we have to scale each one before concatenating).
setsar=1 makes sure all the concatenated segments have the same SAR (sample aspect ratio), which usually avoids aspect-ratio deformation.
loop=505:505 - uses the loop filter to repeat each image 505 times (505 frames at 2 fps gives 252.5 seconds).
The first 505 is the number of loops; the second 505 is the number of frames in each loop (here they should have the same value).
[0:v]scale=w=640:h=638,loop=505:505[v0] - applies the scale filter, then loops 505 times, and stores the result under the temporary label [v0].
[v0][v1][v2][v3]concat=n=4:v=1:a=0 - concatenates the 4 video segments (after scaling and looping).
pad=ceil(iw/2)*2:ceil(ih/2)*2[v] - pads the concatenated video so both dimensions are even.
-map "[v]" -map "[a]" - maps the video [v] to the output video stream and the audio [a] to the output audio stream.
Since we have -t 1010, we don't need the -shortest argument (the output duration is defined to be 1010 seconds).
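If you prefer to avoid the loop filters, a hedged alternative is the concat demuxer with per-image durations. Assuming a file slideshow.txt (the name and contents are illustrative):
file 'img1.png'
duration 252.5
file 'img2.png'
duration 252.5
file 'img3.png'
duration 252.5
file 'img4.png'
duration 252.5
file 'img4.png'
The last entry is repeated because the concat demuxer only applies the final duration when the last file is listed again. Then:
ffmpeg -f concat -safe 0 -i slideshow.txt -i song1.mp3 -i song2.mp3 -i song3.wav -filter_complex "[1:a][2:a][3:a]concat=n=3:v=0:a=1[a]" -map 0:v -map "[a]" -vf "scale=640:638,setsar=1,pad=ceil(iw/2)*2:ceil(ih/2)*2" -c:v libx264 -c:a pcm_s32le -crf 18 -pix_fmt yuv420p -t 1010 outputvideo.mkv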

ffmpeg sequence of multiple filters syntax

I am trying to use multiple filters in ffmpeg, but it does not allow more than one -af.
So I decided to try it with -filter_complex.
sudo ffmpeg -f alsa -i default:CARD=Device \
-filter_complex \
"lowpass=5000,highpass=200; \
volume=+5dB; \
afftdn=nr=0.01:nt=w;" \
-c:a libmp3lame -b:a 128k -ar 48000 -ac 1 -t 00:00:05 -y $recdir/audio_$(date '+%Y_%m_%d_%H_%M_%S').mp3
It should work, but for some reason I get an error:
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, alsa, from 'default:CARD=Device':
Duration: N/A, start: 1625496748.441207, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
[AVFilterGraph @ 0xaaab0a8b14e0] No such filter: ''
Error initializing complex filters.
Invalid argument
I have tried quotes and other things; nothing helps.
ffmpeg -f alsa -i default:CARD=Device \
-filter_complex \
"lowpass=5000,highpass=200,volume=+5dB,afftdn=nr=0.01:nt=w" \
-c:a libmp3lame -b:a 128k -ar 48000 -ac 1 -t 00:00:05 -y $recdir/audio_$(date '+%Y_%m_%d_%H_%M_%S').mp3
If you end your filtergraph with ; then ffmpeg expects another filter. That is why you got the error No such filter: ''. Avoid ending with ;.
You have a linear series of simple filters so separate the filters with commas. This also means you can still use -af instead of -filter_complex if you prefer.
See FFmpeg Filtering Introduction to see the difference between ; and ,.
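For reference, the single-chain -af version would look like this (a sketch, keeping the question's device and output naming):
ffmpeg -f alsa -i default:CARD=Device -af "lowpass=5000,highpass=200,volume=5dB,afftdn=nr=0.01:nt=w" -c:a libmp3lame -b:a 128k -ar 48000 -ac 1 -t 00:00:05 -y $recdir/audio_$(date '+%Y_%m_%d_%H_%M_%S').mp3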

Can't use -shortest parameter when using multiple audio streams ffmpeg

I want to add a second audio stream to an mp4 video file already containing sound.
The second audio stream is a little longer than the video, but I want the final product to be the same length.
I tried using the -shortest flag, but the second audio stream I wanted to add was not audible at all.
I think -shortest only allows for one stream, so what can I do to keep the video the same length and keep both audio streams?
Here is the full command I used before asking this question:
ffmpeg -i input.mp4 -i input.m4a -map 0:v -map 0:a -shortest output.mp4
Output of ffmpeg -i output.mp4:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Duration: 00:00:32.08, start: 0.000000, bitrate: 1248 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 480x600 [SAR 1:1 DAR 4:5], 1113 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
You have to map the audio from the 2nd input as well.
ffmpeg -i input.mp4 -i input.m4a -map 0:v -map 0:a -map 1:a -shortest -fflags +shortest -max_interleave_delta 100M output.mp4
See https://stackoverflow.com/a/55804507/ for explanation of how to make shortest effective.
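If you would rather end up with one mixed track instead of two separate audio streams (some players only play the first), a hedged alternative is amix, assuming the mp4's embedded audio matches the video's length:
ffmpeg -i input.mp4 -i input.m4a -filter_complex "[0:a][1:a]amix=inputs=2:duration=shortest[a]" -map 0:v -map "[a]" -c:v copy output.mp4
duration=shortest ends the mix at the shorter input, so the longer m4a no longer extends the output.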

ffmpeg - mux video and audio and trim the audio

I have a long audio part and a short video part which I want to mux together.
I'm trying the following command to mux:
Video_0-0002.h264 - whole file (2 secs long)
Audio.wav - from 4 till 6 seconds
ffmpeg -y -i /Documents/viz-1/01/Video_0-0002.h264 -i /Documents/viz-1/01/Audio.wav -codec:v copy -f mp4 -af atrim=4:6 -strict experimental -movflags faststart /Documents/viz-1/01/Video_0-0001.mp4
But the audio is messed up...
How can I do it correctly?
I also tried the following; it sounds like there is silence at the end.
ffmpeg -y -i Video_0-0003.h264 -i Audio.wav -c:v copy -af atrim=6:8,asetpts=PTS-STARTPTS -strict experimental -movflags +faststart Video_0-0003.mp4
Input #0, h264, from 'Video_0-0003.h264':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264 (Main), yuv420p(progressive), 388x388 [SAR 1:1 DAR 1:1], 30 fps, 30 tbr, 1200k tbn, 60 tbc
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, wav, from 'Audio.wav':
Duration: 00:00:16.98, bitrate: 1411 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Output #0, mp4, to 'Video_0-0003.mp4':
Metadata:
encoder : Lavf57.56.100
Stream #0:0: Video: h264 (Main) ([33][0][0][0] / 0x0021), yuv420p(progressive), 388x388 [SAR 1:1 DAR 1:1], q=2-31, 30 fps, 30 tbr, 1200k tbn, 1200k tbc
Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 128 kb/s
Metadata:
encoder : Lavc57.64.101 aac
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[mp4 @ 0x7fca8f015000] Timestamps are unset in a packet for stream 0. This is deprecated and will stop working in the future. Fix your code to set the timestamps properly
[mp4 @ 0x7fca8f015000] Starting second pass: moving the moov atom to the beginning of the file
frame= 60 fps=0.0 q=-1.0 Lsize= 242kB time=00:00:02.02 bitrate= 982.2kbits/s speed= 21x
video:207kB audio:32kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.382400%
[aac @ 0x7fca8f017400] Qavg: 1076.270
Try
ffmpeg -y -i /Documents/viz-1/01/Video_0-0002.h264 -i /Documents/viz-1/01/Audio.wav -c:v copy -af atrim=4:6,asetpts=PTS-STARTPTS -strict experimental -movflags +faststart /Documents/viz-1/01/Video_0-0001.mp4
You can also try to cut the audio to the video's timing first, and then merge the video and audio tracks.
Use -vn and -an in separate ffmpeg processes.
ffmpeg -i video.mp4 -c:v h264 -an -y video.h264
ffmpeg -i video.mp4 -c:a aac -t 00:01:00 -vn -y audio.aac
And to merge the tracks:
ffmpeg -i audio.aac -i video.h264 -c:v copy -c:a copy -f mp4 -y out.mp4
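The trim can also be done on the input side, so no audio filter is needed at all (a sketch of the original 4-to-6-second case):
ffmpeg -y -i /Documents/viz-1/01/Video_0-0002.h264 -ss 4 -t 2 -i /Documents/viz-1/01/Audio.wav -c:v copy -c:a aac -movflags +faststart /Documents/viz-1/01/Video_0-0001.mp4
Placing -ss 4 -t 2 before the WAV's -i seeks 4 seconds into the audio and reads 2 seconds, equivalent to atrim=4:6 with timestamps reset.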

sound acceleration while converting sound with ffmpeg libfaac

I'm trying to convert audio with ffmpeg using the command:
ffmpeg -y -i /Users/Artem/Sites/waprik/testing/orig.mp4 -acodec libfaac -b:a 64k -ar 41000 -ac 2 -threads 0 -vn /Users/Artem/Sites/waprik/public/testing.m4a
The original sound is 4:18 min, but the output duration is 4 minutes and it sounds accelerated. How can I fix it?
By the way, the original sound is:
Duration: 00:04:18.81
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 191 kb/s
Metadata:
creation_time : 2014-03-07 05:45:06
handler_name : IsoMedia File Produced by Google, 5-11-2011
You mistyped the audio rate. It should be 44100 instead of 41000:
ffmpeg -y -i /Users/Artem/Sites/waprik/testing/orig.mp4 -acodec libfaac -b:a 64k -ar 44100 -ac 2 -threads 0 -vn /Users/Artem/Sites/waprik/public/testing.m4a
Here's the math to prove it! Your initial track is 4 minutes 18 seconds, or 258 seconds. The ratio of your conversion rate to actual rate is 41000/44100, or .9297052. Multiply that ratio by your 258-second track and we end up with a 239.86-second track...or 3 minutes 59.86 seconds.
What was happening is that you were telling ffmpeg that instead of 44100 samples in a second, there were actually only 41000. So it grabbed 41000 of the 44100 samples and called that a second, even though it really wasn't. The result is a faster, shorter, slightly degraded audio file.
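To confirm the fix, ffprobe can print the output duration (it should be back around 258.8 seconds):
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 /Users/Artem/Sites/waprik/public/testing.m4a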
