ffmpeg: creating an image slideshow in an audio concat command

I have a working ffmpeg command which, when run in the Windows 10 command prompt, will combine 3 audio files with 1 image file into a video.
The single image file is used as a static image for the entire video's duration (16:50, or 1010 seconds):
song1.mp3 00:00:00-00:06:23
song2.mp3 00:06:23-00:12:04
song3.wav 00:12:04-00:16:50
I am trying to change this so that instead of a single static image, a slideshow of 4 different images plays over the duration of the video.
I was thinking of doing this by taking the total length in seconds (1010) and dividing by 4, which gives 252.5 seconds per image, so a timeline like:
img1: 0 - 252.5
img2: 252.5 - 505
img3: 505 - 757.5
img4: 757.5 - 1010
How can I change my command so that the output video is a slideshow of images with the timestamps above?
Here is my working ffmpeg command, with my own annotations that hopefully explain some of it:
ffmpeg -loop 1 -framerate 2 -i "C:\Users\...local_filepath...\ffmpeg-commands\images\img1.png" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song1.mp3" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song2.mp3" -i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song3.wav" -c:a pcm_s32le -filter_complex concat=n=3:v=0:a=1 -vcodec libx264 -bufsize 3M -filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2" -crf 18 -pix_fmt yuv420p -shortest -tune stillimage -t 1010 "C:\Users\...local_filepath...\ffmpeg-commands\videos\outputvideo.mkv"
ffmpeg
//loop single image
-loop 1
//set video framerate
-framerate 2
//take image and audio files
-i "C:\Users\...local_filepath...\ffmpeg-commands\images\img1.png"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song1.mp3"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song2.mp3"
-i "C:\Users\...local_filepath...\ffmpeg-commands\audio files\song3.wav"
//set audio codec
-c:a pcm_s32le
//concatenate the 3 audio files into a single audio stream
-filter_complex concat=n=3:v=0:a=1
//set video codec
-vcodec libx264
//set the rate-control buffer size (used for limiting the output bitrate)
-bufsize 3M
//scale single image to 640x638 resolution
-filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2"
//range of the CRF scale is 0–51, where 0 is lossless (for 8 bit only, for 10 bit use -qp 0), 23 is the default, and 51 is worst quality
-crf 18
//set pixel format
-pix_fmt yuv420p
//try to ensure video is not longer than the concatenated audio files
-shortest
//tune the x264 encoder for still-image content
-tune stillimage
//specify output video duration to ensure it is exactly the same length as the concatenated audio
-t 1010
//specify output filepath
"C:\Users\...local_filepath...\ffmpeg-commands\videos\outputvideo.mkv"
And here is the raw command as you would run it:
ffmpeg -loop 1 -framerate 2 -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\images\img1.png" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song1.mp3" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song2.mp3" -i "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\audio files\song3.wav" -c:a pcm_s32le -filter_complex concat=n=3:v=0:a=1 -vcodec libx264 -bufsize 3M -filter:v "scale=w=640:h=638,pad=ceil(iw/2)*2:ceil(ih/2)*2" -crf 18 -pix_fmt yuv420p -shortest -tune stillimage -t 1010 "C:\Users\marti\Documents\projects\jan2023-rendertune\ffmpeg-commands\videos\concatVideo-250771.mkv"
And here is my ffmpeg version info:
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.3.1 (GCC) 20200621
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100

We may use the loop filter for each input image, and concat the "looped" streams:
ffmpeg -r 2 -i img1.png -r 2 -i img2.png -r 2 -i img3.png -r 2 -i img4.png -i song1.mp3 -i song2.mp3 -i song3.wav -filter_complex "[4:a][5:a][6:a]concat=n=3:v=0:a=1[a];[0:v]scale=w=640:h=638,setsar=1,loop=505:505[v0];[1:v]scale=w=640:h=638,setsar=1,loop=505:505[v1];[2:v]scale=w=640:h=638,setsar=1,loop=505:505[v2];[3:v]scale=w=640:h=638,setsar=1,loop=505:505[v3];[v0][v1][v2][v3]concat=n=4:v=1:a=0,pad=ceil(iw/2)*2:ceil(ih/2)*2[v]" -map "[v]" -map "[a]" -c:a pcm_s32le -c:v libx264 -bufsize 3M -crf 18 -pix_fmt yuv420p -tune stillimage -t 1010 outputvideo.mkv
To keep the command shorter, assume all the files are in the same folder.
-r 2 before each input image sets the framerate of that input to 2 fps.
[4:a][5:a][6:a]concat=n=3:v=0:a=1[a] - concatenate the audio input files as in your question.
[0:v]scale=w=640:h=638,setsar=1 and [1:v]scale=w=640:h=638,setsar=1 ... - scale each of the images to 640x638 (since the input images may have different sizes, we have to scale each image before concatenating).
setsar=1 makes sure that all the concatenated videos have the same SAR (Storage Aspect Ratio), which usually avoids aspect-ratio deformation.
loop=505:505 - uses the loop filter to repeat each image 505 times (505 frames at 2 fps gives 252.5 seconds).
The first 505 is the number of loops, the second 505 is the number of frames in each loop (they should have the same value).
[0:v]scale=w=640:h=638,loop=505:505[v0] - applies the scale filter, then loops 505 times, and stores the result under the temporary name [v0].
[v0][v1][v2][v3]concat=n=4:v=1:a=0 - concatenates the 4 videos (after scaling and looping).
pad=ceil(iw/2)*2:ceil(ih/2)*2[v] applies padding to the concatenated video.
-map "[v]" -map "[a]" maps the video [v] to the output video stream, and the audio [a] to the output audio stream.
Since we have -t 1010, we don't need the -shortest argument (the output duration is defined to be 1010 seconds).
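An alternative sketch (untested) skips the loop filter by looping each image input and capping it with an input-side -t, so each image becomes a 252.5-second stream before the video concat; it assumes the same filenames as above:
ffmpeg -loop 1 -framerate 2 -t 252.5 -i img1.png -loop 1 -framerate 2 -t 252.5 -i img2.png -loop 1 -framerate 2 -t 252.5 -i img3.png -loop 1 -framerate 2 -t 252.5 -i img4.png -i song1.mp3 -i song2.mp3 -i song3.wav -filter_complex "[0:v]scale=w=640:h=638,setsar=1[v0];[1:v]scale=w=640:h=638,setsar=1[v1];[2:v]scale=w=640:h=638,setsar=1[v2];[3:v]scale=w=640:h=638,setsar=1[v3];[v0][v1][v2][v3]concat=n=4:v=1:a=0,pad=ceil(iw/2)*2:ceil(ih/2)*2[v];[4:a][5:a][6:a]concat=n=3:v=0:a=1[a]" -map "[v]" -map "[a]" -c:a pcm_s32le -c:v libx264 -bufsize 3M -crf 18 -pix_fmt yuv420p -tune stillimage -t 1010 outputvideo.mkv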

Related

FFmpeg to Azure Media Services Smooth Streaming Input

I would like to ask about an ffmpeg config or command to send fragmented MP4 to an Azure Media Services live event using the Smooth Streaming / isml protocol. AMS is not getting any input from ffmpeg yet.
This is my current running command:
ffmpeg -f dshow -i video="Webcam" -movflags isml+frag_keyframe -f isml -r 10 http://endpoint/ingest.isml/streaming1
When I use RTMP with Wirecast it works well.
Any suggestion on an ffmpeg command with the isml protocol?
Thank you
It is possibly the way you are formatting the ingest URL. The Smooth ingest protocol expects /Streams(yourtrackname-identifier) appended after it.
See the Smooth ingest specification for details
Here is an FFMPEG command line that I had sitting around, which worked for me on a Raspberry Pi at one time:
ffmpeg -i /dev/video1 -pix_fmt yuv420p -f ismv -movflags isml+frag_keyframe -video_track_timescale 10000000 -frag_duration 2000000 -framerate 30 -r 30 -c:v h264_omx -preset ultrafast -map 0:v:0 -b:v:0 2000k -minrate:v:0 2000k -maxrate:v:0 2000k -bufsize 2500k -s:v:0 640x360 -map 0:v:0 -b:v:1 500k -minrate:v:1 500k -maxrate:v:1 500k -s:v:1 480x360 -g 60 -keyint_min 60 -sc_threshold 0 -c:a libfaac -ab 48k -map 0:a? -threads 0 "http://johndeu-nimbuspm.channel.mediaservices.windows.net/ingest.isml/Streams(video)"
Note that I used the following stream identifier: ingest.isml/Streams(video)
Here are a couple more commands that may help.
ffmpeg -v debug -y -re -i "file.wmv" -movflags isml+frag_keyframe -video_track_timescale 10000000 -frag_duration 2000000 -f ismv -threads 0 -c:a libvo_aacenc -ac 2 -b:a 20k -c:v libx264 -preset fast -profile:v baseline -g 48 -keyint_min 48 -b:v 200k -s:v 320x240 http://xxxx.userid.channel.mediaservices.windows.net/ingest.isml/Streams(video)
Multi-bitrate encoding and ingest
ffmpeg -re -stream_loop -1 -i C:\Video\tears_of_steel_1080p.mov -movflags isml+frag_keyframe -f ismv -threads 0 -c:a aac -ac 2 -b:a 64k -c:v libx264 -preset fast -profile:v main -g 48 -keyint_min 48 -sc_threshold 0 -map 0:v -b:v:0 5000k -minrate:v:0 5000k -maxrate:v:0 5000k -s:v:0 1920x1080 -map 0:v -b:v:1 3000k -minrate:v:1 3000k -maxrate:v:1 3000k -s:v:1 1280x720 -map 0:v -b:v:2 1800k -minrate:v:2 1800k -maxrate:v:2 1800k -s:v:2 854x480 -map 0:v -b:v:3 1000k -minrate:v:3 1000k -maxrate:v:3 1000k -s:v:3 640x480 -map 0:v -b:v:4 600k -minrate:v:4 600k -maxrate:v:4 600k -s:v:4 480x360 -map 0:a:0
http://.myradarmedia.channel.mediaservices.windows.net/ingest.isml/Streams(stream0^)
EXPLANATION OF WHAT IS GOING ON IN THE FFMPEG COMMAND LINE ABOVE:
ffmpeg
-re **READ INPUT AT NATIVE FRAMERATE
-stream_loop -1 **LOOP INFINITE
-i C:\Video\tears_of_steel_1080p.mov **INPUT FILE IS THIS MOV FILE
-movflags isml+frag_keyframe **OUTPUT IS SMOOTH STREAMING THIS SETS THE FLAGS
-f ismv **OUTPUT ISMV SMOOTH
-threads 0 ** SETS THE THREAD COUNT TO USE FOR ALL STREAMS. YOU CAN USE A STREAM SPECIFIC COUNT AS WELL
-c:a aac ** SET TO AAC CODEC
-ac 2 ** SET THE OUTPUT TO STEREO
-b:a 64k ** SET THE BITRATE FOR THE AUDIO
-c:v libx264 ** SET THE VIDEO CODEC
-preset fast ** USE THE FAST PRESET FOR X264
-profile:v main **USE THE MAIN PROFILE
-g 48 ** GOP SIZE IS 48 frames
-keyint_min 48 ** KEY INTERVAL IS SET TO 48 FRAMES
-sc_threshold 0 ** DISABLE SCENE-CHANGE KEYFRAMES SO KEYFRAMES ONLY FALL ON THE GOP BOUNDARIES
-map 0:v ** MAP THE FIRST VIDEO TRACK OF THE FIRST INPUT FILE
-b:v:0 5000k **SET THE OUTPUT TRACK 0 BITRATE
-minrate:v:0 5000k ** SET OUTPUT TRACK 0 MIN RATE TO SIMULATE CBR
-maxrate:v:0 5000k ** SET OUTPUT TRACK 0 MAX RATE TO SIMULATE CBR
-s:v:0 1920x1080 **SCALE THE OUTPUT OF TRACK 0 to 1920x1080.
-map 0:v ** MAP THE FIRST VIDEO TRACK OF THE FIRST INPUT FILE
-b:v:1 3000k ** SET THE OUTPUT TRACK 1 BITRATE TO 3Mbps
-minrate:v:1 3000k -maxrate:v:1 3000k ** SET THE MIN AND MAX RATE TO SIMULATE CBR OUTPUT
-s:v:1 1280x720 ** SCALE THE OUTPUT OF TRACK 1 to 1280x720
-map 0:v -b:v:2 1800k ** REPEAT THE ABOVE STEPS FOR THE REST OF THE OUTPUT TRACKS
-minrate:v:2 1800k -maxrate:v:2 1800k -s:v:2 854x480
-map 0:v -b:v:3 1000k -minrate:v:3 1000k -maxrate:v:3 1000k -s:v:3 640x480
-map 0:v -b:v:4 600k -minrate:v:4 600k -maxrate:v:4 600k -s:v:4 480x360
-map 0:a:0 ** FINALLY TAKE THE SOURCE AUDIO FROM THE FIRST SOURCE AUDIO TRACK.
http://.myradarmedia.channel.mediaservices.windows.net/ingest.isml/Streams(stream0^)
The URL above is part of the command (the output destination); it ended up on its own line due to a formatting issue.

Problem with combining a video and an audio stream from USB device

I have two USB devices attached to an RPi, both show up as usual as /dev/video0. Here's some additional info coming from two command line inputs:
Device 1, video only (attached to an RPi4):
ffmpeg -f v4l2 -list_formats all -i /dev/video0 reports
[video4linux2,v4l2 @ 0xe5e1c0] Compressed: mjpeg :
Motion-JPEG : 1280x720 640x480 320x240
v4l2-ctl --list-formats-ext reports
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture
[0]: 'MJPG' (Motion-JPEG, compressed)
Size: Discrete 1280x720
Interval: Stepwise 0.033s - 0.033s with step 0.000s
(30.000-30.000 fps)
Size: Discrete 640x480
Interval: Stepwise 0.033s - 0.033s with step 0.000s
(30.000-30.000 fps)
Size: Discrete 320x240
Interval: Stepwise 0.033s - 0.033s with step 0.000s
(30.000-30.000 fps)
Does work: ffmpeg -f v4l2 -i /dev/video0 -vcodec h264_omx -preset ultrafast -tune zerolatency -g 300 -b:v 1M -mpegts_service_type advanced_codec_digital_hdtv -f mpegts udp://OtherMachine:Port?pkt_size=1316
Device 2, video and audio (attached to an RPi3, but does not work either on the RPi4):
ffmpeg -f v4l2 -list_formats all -i /dev/video0 reports
[video4linux2,v4l2 @ 0x2c41210] Compressed: mjpeg :
Motion-JPEG : 1920x1080 1280x720
v4l2-ctl --list-formats-ext reports
ioctl: VIDIOC_ENUM_FMT
Index : 0
Type : Video Capture
Pixel Format: 'MJPG' (compressed)
Name : Motion-JPEG
Size: Discrete 1920x1080
Interval: Discrete 0.033s
(30.000 fps)
Interval: Discrete 0.067s
(15.000 fps)
Size: Discrete 1280x720
Interval: Discrete 0.033s
(30.000 fps)
Interval: Discrete 0.067s
(15.000 fps)
After quite some tedious work and way too many hours I got this running:
Video only: ffmpeg -f v4l2 -input_format mjpeg -i /dev/video0 -c:v copy -preset ultrafast -tune zerolatency -g 300 -f matroska udp://OtherMachine:Port?pkt_size=1316
Does not work at all: ffmpeg -f v4l2 -input_format mjpeg -i /dev/video0 -c:v copy -preset ultrafast -tune zerolatency -g 300 -f mpegts udp://OtherMachine:Port?pkt_size=1316, on "OtherMachine" I do see that there is an incoming data stream via VLC, but it could not be digested properly.
Audio only: ffmpeg -f alsa -thread_queue_size 1024 -i plughw:1 -c:a mp2 -ac 2 -ar 44100 -preset ultrafast -tune zerolatency -b:a 128K -f mpegts udp://OtherMachine:Port?pkt_size=1316
But this does not work either:
ffmpeg -f v4l2 -input_format mjpeg -i /dev/video0 -f alsa -thread_queue_size 1024 -i plughw:1 -c:v copy -c:a mp2 -ac 2 -ar 44100 -preset ultrafast -tune zerolatency -g 300 -b:a 128K -f mpegts udp://OtherMachine:Port?pkt_size=1316
Could you please provide a hint on how to get these two streams from device 2 working together? Both come from the same hardware/device. My guess is that the MJPG video stream is somehow not fully compliant with the mpegts standard (as it is for device 1), since it works with matroska but not with mpegts. Could that be? What needs to be done in that case?
Another hint, with the same kind of hardware setup I can do this
cvlc -vvv v4l2:///dev/video0 --input-slave=alsa://plughw:1,0 --sout='#transcode{acodec=mpga,ab=128}:std{access=http,mux=asf,dst=:Port}'
So, here my understanding is that video gets passed on unchanged (mjpeg) and audio gets transcoded via VLC's mpga, which presumably corresponds to mp2 in ffmpeg. The container format is asf, but I was not able to get that running with ffmpeg for no obvious reason. Anyway, picking up this VLC broadcast stream via http://StreamingMachine:Port on any other machine in my network works well. But how do I achieve that with ffmpeg directly, and potentially not as an http:// but a udp:// or pipe stream?
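(For reference, this is what I imagine the ffmpeg equivalent would look like, reusing the matroska container that already worked for video-only; an untested sketch with the same device names as above:)
ffmpeg -f v4l2 -input_format mjpeg -i /dev/video0 -f alsa -thread_queue_size 1024 -i plughw:1 -c:v copy -c:a mp2 -ac 2 -ar 44100 -b:a 128K -f matroska udp://OtherMachine:Port?pkt_size=1316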
Alternatively, let me ask this question: given that I have an incoming mjpeg video stream as well as an incoming mp2 audio stream, which container format (OK, it's obviously not mpegts) is the most appropriate for combined streaming across my LAN, or even into a pipe for further processing? Believe me, I tried my very best over a couple of hours to find out how to proceed, but with no success. To my humble knowledge there is no table providing answers to questions of this kind.
I'd be glad to get some insights.
Best

Mix additional audio file with video(+audio) in ffmpeg

I'm trying to mix an additional audio file with a video that also has audio in it. The problem is that I already have a complex ffmpeg command and don't know how to combine the two.
This is my existing ffmpeg command, which uses some offsets, replaces the embedded audio (the audio inside the video) with the additional audio file, and also overlays a few gauges and a watermark onto the video.
ffmpeg -y
-ss 00:00:01:213 -i videoFile.mp4
-ss 00:00:03:435 -i audioFile.wav
-i watermark.png
-framerate 6 -i gauge1_path/img-%04d.png
-framerate 1 -i gauge2_path/img-%04d.png
-framerate 2 -i gauge3_path/img-%04d.png
-framerate 2 -i gauge4_path/img-%04d.png
-framerate 2 -i gauge5_path/img-%04d.png
-framerate 2 -i gauge6_path/img-%04d.png
-filter_complex [0][2]overlay=(21):(H-h-21)[ovr0];
[ovr0][3]overlay=(W-w-21):(H-h-21)[ovr1];
[ovr1][4]overlay=(W-w-21):(H-h-333)[ovr2];
[ovr2][5]overlay=(W-w-21):(H-h-418)[ovr3];
[ovr3][6]overlay=(W-w-21):(H-h-503)[ovr4];
[ovr4][7]overlay=(W-w-21):(H-h-588)[ovr5];
[ovr5][8]overlay=(W-w-21):(H-h-673)
-map 0:v -map 1:a -c:v libx264 -preset ultrafast -crf 23 -t 00:5:10:000 output.mp4
Now I would like to use ffmpeg's amix in order to mix both audio tracks instead of replacing one, if possible with the ability to set volumes. But the official amix documentation says nothing about volume.
Separately, both seem to work OK:
ffmpeg -y -i video.mp4 -i audio.mp3 -filter_complex [0][1]amix=inputs=2[a] -map 0:v -map [a] -c:v copy output.mp4
and
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex [0][2]overlay=(21):(H-h-21)[ovr0] -map [ovr0]:v -map 1:a -c:v libx264 -preset ultrafast -crf 23 output.mp4
but together
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex [0][1]amix=inputs=2[a];[a][2]overlay=(21):(H-h-21)[ovr0] -map [ovr0]:v -map [a] -c:v libx264 -preset ultrafast -crf 23 output.mp4
I'm getting an error:
ffmpeg version N-93886-gfbdb3aa179 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 8.3.1 (GCC) 20190414
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 28.100 / 56. 28.100
libavcodec 58. 52.101 / 58. 52.101
libavformat 58. 27.103 / 58. 27.103
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 53.101 / 7. 53.101
libswscale 5. 4.101 / 5. 4.101
libswresample 3. 4.100 / 3. 4.100
libpostproc 55. 4.100 / 55. 4.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 1970-01-01T00:00:00.000000Z
encoder : Lavf53.24.2
Duration: 00:00:29.57, start: 0.000000, bitrate: 1421 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1032 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
creation_time : 1970-01-01T00:00:00.000000Z
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 383 kb/s (default)
Metadata:
creation_time : 1970-01-01T00:00:00.000000Z
handler_name : SoundHandler
[mp3 @ 0000015e2f934ec0] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'audio.mp3':
Duration: 00:00:45.33, start: 0.000000, bitrate: 128 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
Input #2, png_pipe, from 'watermark.png':
Duration: N/A, bitrate: N/A
Stream #2:0: Video: png, rgb24(pc), 100x56 [SAR 3779:3779 DAR 25:14], 25 tbr, 25 tbn, 25 tbc
[Parsed_amix_0 @ 0000015e2ff2e940] Media type mismatch between the 'Parsed_amix_0' filter output pad 0 (audio) and the 'Parsed_overlay_1' filter input pad 0 (video)
[AVFilterGraph @ 0000015e2f91c600] Cannot create the link amix:0 -> overlay:0
Error initializing complex filters.
Invalid argument
So my question: is it possible to combine amix and overlay together, and how, and in which order should they be used? Or should I look at something different, because amix is unable to set volume levels?
Thanks in advance!
Use
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex [0][1]amix=inputs=2,volume=2[a];[0][2]overlay=(21):(H-h-21)[ovr0] -map [ovr0]:v -map [a] -c:v libx264 -preset ultrafast -crf 23 output.mp4
You're sending the mixed audio to the overlay filter, which requires a video input. Overlay should be fed the original video stream. The audio output [a] should be left alone; it is consumed as a mapped output stream.
The volume filter is added after amix to restore volume; amix reduces the input volumes in order to avoid clipping.
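If you need independent levels per input rather than a single post-mix gain, a sketch (same filenames as above; the volume values are placeholders) applies a volume filter to each audio stream before amix:
ffmpeg -y -i video.mp4 -i audio.mp3 -i watermark.png -filter_complex "[0:a]volume=1.0[a0];[1:a]volume=0.5[a1];[a0][a1]amix=inputs=2[a];[0:v][2]overlay=(21):(H-h-21)[ovr0]" -map "[ovr0]" -map "[a]" -c:v libx264 -preset ultrafast -crf 23 output.mp4
(Newer ffmpeg builds also expose a weights option on amix for setting per-input weights directly, if your version has it.)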

Audio Slowly Desynchronizing When Segmenting

I use ffmpeg's ability to segment video while I record so I can record constantly without my hard drive filling up.
It works really well, except that the audio desynchronizes from the video when the file segments. The video seems to be uninterrupted, but I can actually hear a tiny jump in the audio when I join segments later on. One would think that ffmpeg would store packets in a queue during segmentation so nothing is lost, but that doesn't seem to be the case... Any way I could force it to do something like that?
Here is my current block:
ffmpeg -y -thread_queue_size 5096 -f dshow -video_size 3440x1440 -rtbufsize 2147.48M -framerate 100 -pixel_format nv12 ^
-itsoffset 00:00:00.012 -i video="Video (00 Pro Capture HDMI 4K+)" -thread_queue_size 5096 -guess_layout_max 0 -f dshow ^
-rtbufsize 2147.48M -i audio="SPDIF/ADAT (1+2) (RME Fireface UC)" -map 0:0,1:0 -map 1:0 -c:v h264_nvenc -preset: llhp ^
-pix_fmt nv12 -b:v 250M -minrate 250M -maxrate 250M -bufsize 250M -b:a 384k -ac 2 -r 100 -vsync 1 ^
-max_muxing_queue_size 5096 -segment_time 600 -segment_wrap 9 -f segment C:\Users\djcim\Videos\PC\PC\PC%02d.mp4
I am delaying the video stream because right out of the gate it's a little bit ahead of the audio.
PS: aresample or async seem to have no effect or at least not a desirable one.
Using -reset_timestamps in conjunction with writing .ts instead of .mp4 has mostly solved this issue. -reset_timestamps does not appear to work when writing .mp4; not sure why, maybe a bug?
I say mostly because the audio still drifts by around a frame after the first segment, but not exponentially; I find audio de-synced by one frame acceptable. Although I should mention that now, when I try to concat the clips back together, I sometimes have audio drift issues. Using aresample=async=250 fixes the drift after concat, but you can hear the audio stretch a bit when doing so. Can't expect everything to work perfectly.
ffmpeg -y -thread_queue_size 9999 -indexmem 9999 -guess_layout_max 0 -f dshow -video_size 3440x1440 -rtbufsize 2147.48M ^
-framerate 200 -pixel_format nv12 -i video="Video (00 Pro Capture HDMI 4K+)":audio="SPDIF/ADAT (1+2) (RME Fireface UC)" ^
-map 0:0,0:1 -map 0:1 -flags +cgop -force_key_frames expr:gte(t,n_forced*2) -c:v h264_nvenc -preset: llhp -pix_fmt nv12 ^
-b:v 250M -minrate 250M -maxrate 250M -bufsize 250M -c:a aac -ar 44100 -b:a 384k -ac 2 -r 100 ^
-af "atrim=0.038, asetpts=PTS-STARTPTS, aresample=async=250" -vsync 1 -ss 00:00:01.096 -max_muxing_queue_size 9999 ^
-f segment -segment_time 600 -segment_wrap 9 -reset_timestamps 1 C:\Users\djcim\Videos\PC\PC\PC%02d.ts
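For joining the .ts segments afterwards, a typical approach (a sketch, assuming the PC%02d.ts names above) is the concat demuxer with stream copy. First a list file:
file 'PC00.ts'
file 'PC01.ts'
file 'PC02.ts'
then:
ffmpeg -f concat -safe 0 -i list.txt -c copy joined.ts
Stream copy avoids re-encoding, so it adds no quality loss; any remaining drift would still need the aresample pass mentioned above.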

How to input an audio file, generate video, split, crop and overlay to output a kaleidoscope effect

I need to create an FFMPEG script which reads in an audio file ("testloop.wav" in this example), generates a video from the waveform using the "showcqt" filter, and then crops and overlays the output from that to generate a kaleidoscope effect. This is the code I have so far; the generation of the initial video and the output section work correctly, but there is a fault in the split, crop and overlay section which I cannot trace.
ffmpeg -i "testloop.wav" -i "testloop.wav" \
-filter_complex "[0:a]showcqt,format=yuv420p[v]" -map "[v]" \
"split [tmp1][tmp2]; \
[tmp1] crop=iw:(ih/3)*2:0:0, pad=0:ih+ih/2 [top]; \
[tmp2] crop=iw:ih/3:0:(ih/3)*2, hflip [bottom]; \
[top][bottom] overlay=0:(H/3)*2"\
-map 1:a:0 -codec:v libx264 -crf 21 -bf 2 -flags +cgop -pix_fmt yuv420p -codec:a aac -strict -2 -b:a 384k -r:a 48000 -movflags faststart "${i%.wav}.mp4
You can't split the graph or define multiple filter_complexes; everything has to go into a single -filter_complex. Also, there's no need to feed the input twice.
ffmpeg -i "testloop.wav" \
-filter_complex "[0:a]showcqt,format=yuv420p, \
split [tmp1][tmp2]; \
[tmp1] crop=iw:(ih/3)*2:0:0, pad=0:ih+ih/2 [top]; \
[tmp2] crop=iw:ih/3:0:(ih/3)*2, hflip [bottom]; \
[top][bottom] overlay=0:(H/3)*2"\
-c:v libx264 -crf 21 -bf 2 -flags +cgop -pix_fmt yuv420p \
-c:a aac -strict -2 -b:a 384k -ar 48000 -movflags +faststart out.mp4
(I'm not debugging the logic of the effect you're trying to achieve, only the syntax.)
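As an aside, the ${i%.wav} in the question's output name suggests the command runs inside a shell loop over .wav files; a sketch of that usage (the loop wrapper is my assumption, not part of the answer):
for i in *.wav; do
  ffmpeg -i "$i" \
    -filter_complex "[0:a]showcqt,format=yuv420p,split[tmp1][tmp2]; \
      [tmp1]crop=iw:(ih/3)*2:0:0,pad=0:ih+ih/2[top]; \
      [tmp2]crop=iw:ih/3:0:(ih/3)*2,hflip[bottom]; \
      [top][bottom]overlay=0:(H/3)*2" \
    -c:v libx264 -crf 21 -bf 2 -flags +cgop -pix_fmt yuv420p \
    -c:a aac -strict -2 -b:a 384k -ar 48000 -movflags +faststart "${i%.wav}.mp4"
done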
