Audio drifts when concatenating clips

I am trying to concatenate a bunch of short, 1-second video clips in ts format using the following command:
var convertCommand = "cd clips; ffmpeg -y -i concat:\"" + convertedFilenames.join("|") + "\" -c:a aac -strict experimental -bsf:a aac_adtstoasc \"" + user._id + ".mp4\"; mv \"" + user._id + ".mp4\" \"full/" + user._id + ".mp4\"";
This works well; however, the audio slowly "drifts": after about 15 seconds it is delayed by about 1 second.
Is there a way I can encode audio differently to avoid this? Does this have to do with these commands?
-c:a aac -strict experimental -bsf:a aac_adtstoasc
For completeness, this is the script used to first trim each clip to 1 second:
cd clips; ffmpeg -y -i ./converted/${1}.ts -ss 00:00:00 -t 00:00:01 -vcodec libx264 -acodec libvo_aacenc -y ./converted/${1}_trimmed.ts;
Thanks a lot in advance.

What you describe sounds like audio recorded at 48 kHz but played back at 44.1 kHz. Concatenation isn't going to resample that audio for you; it simply muxes the streams into the target container.
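As a rough sanity check of that diagnosis, here is a small sketch of the arithmetic. The function name is made up for illustration, and the 15-second figure comes from the question; it only shows how much lag a 48 kHz vs. 44.1 kHz mismatch would produce, not that this is definitely the cause.

```python
def playback_lag(duration_s: float, recorded_hz: int, played_hz: int) -> float:
    """Seconds of lag accumulated after `duration_s` seconds of source audio
    when samples recorded at `recorded_hz` are played back at `played_hz`."""
    samples = duration_s * recorded_hz      # number of samples actually present
    played_duration = samples / played_hz   # time the player needs to emit them
    return played_duration - duration_s


if __name__ == "__main__":
    lag = playback_lag(15, 48_000, 44_100)
    print(f"lag after 15 s of source audio: {lag:.2f} s")
```

For 15 seconds of material this gives a lag of roughly 1.3 seconds, which is the same order as the drift described in the question.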

Related

Concat mp4 videos and merge their audios to the final output

I have several videos and photos and need to merge them with a cross-dissolve effect. The algorithm is as follows:
Create videos from images and add silent audio to them (so they will also have a sound stream):
ffmpeg -y -f lavfi -i anullsrc -loop 1 -i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.jpg -c:v libx264 -t 10 -pix_fmt yuv420p -vf scale=750:1280 /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
Combine all the videos and audios into one using this command:
ffmpeg
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_68d437c0-f5e2-4651-b07e-91533480b6ef.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_48f3c111-610d-40c7-ac71-6ce2fbb16184.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1593b5d8-7e16-417d-9372-2267581cd504.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_1ac7f6be-1b12-4e31-b904-1491cc9b9494.mp4
-i /tmp/media/import-2020-Aug-19-Wednesday-05-40-34/temp_ea5c93fd-d946-4742-b8f7-ea9ae4d43441.mp4
-filter_complex
"[0:v]trim=start=0:end=8.032,setpts=PTS-STARTPTS[clip0];
[1:v]trim=start=2:end=13.047,setpts=PTS-STARTPTS[clip1];
[2:v]trim=start=2:end=13.558,setpts=PTS-STARTPTS[clip2];
[3:v]trim=start=2:end=13.186,setpts=PTS-STARTPTS[clip3];
[4:v]trim=start=2,setpts=PTS-STARTPTS[clip4];
[0:v]trim=start=9.032:end=10.032,setpts=PTS-STARTPTS[out0];
[1:v]trim=start=14.047:end=15.047,setpts=PTS-STARTPTS[out1];
[2:v]trim=start=14.558:end=15.558,setpts=PTS-STARTPTS[out2];
[3:v]trim=start=14.186:end=15.186,setpts=PTS-STARTPTS[out3];
[1:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in1];
[2:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in2];
[3:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in3];
[4:v]trim=start=0:end=2,setpts=PTS-STARTPTS[in4];
[in1]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein1];
[in2]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein2];
[in3]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein3];
[in4]format=pix_fmts=yuva420p,fade=t=in:st=0:d=2:alpha=1[fadein4];
[out0]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout0];
[out1]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout1];
[out2]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout2];
[out3]format=pix_fmts=yuva420p,fade=t=out:st=0:d=2:alpha=1[fadeout3];
[fadein1]fifo[fadein1fifo];
[fadein2]fifo[fadein2fifo];
[fadein3]fifo[fadein3fifo];
[fadein4]fifo[fadein4fifo];
[fadeout0]fifo[fadeout0fifo];
[fadeout1]fifo[fadeout1fifo];
[fadeout2]fifo[fadeout2fifo];
[fadeout3]fifo[fadeout3fifo];
[fadeout0fifo][fadein1fifo]overlay[crossfade0];
[fadeout1fifo][fadein2fifo]overlay[crossfade1];
[fadeout2fifo][fadein3fifo]overlay[crossfade2];
[fadeout3fifo][fadein4fifo]overlay[crossfade3];
[clip0][crossfade0][clip1][crossfade1][clip2][crossfade2][clip3][crossfade3][clip4]concat=n=9[output];
[0:a][1:a]acrossfade=d=10:c1=tri:c2=tri[A1];
[A1][2:a]acrossfade=d=10:c1=tri:c2=tri[A2];
[A2][3:a]acrossfade=d=10:c1=tri:c2=tri[A3];
[A3][4:a]acrossfade=d=10:c1=tri:c2=tri[audio] "
-vsync 0 -map "[output]" -map "[audio]" /tmp/media/final/some_filename_d0d2aab0-792a-4540-b2d3-e64abe98bf5c.mp4
And all works pretty well, but if I have, for example:
picture
video
video
picture
Then the sound from the second video is mapped onto the first picture, and the sound from the third video onto the second video. The third video then plays without any sound.
It seems this happens because the silent audio of the first picture is very short. Am I right?
If so, how can I increase its duration?
I would much appreciate any help with this!
Assuming 5 inputs of 10 seconds each, all with audio streams*, with ffmpeg 4.3 or newer, use the xfade and acrossfade filters.
ffmpeg
-i in1.mp4
-i in2.mp4
-i in3.mp4
-i in4.mp4
-i in5.mp4
-filter_complex
" [0][1]xfade=transition=fade:duration=2:offset=8[V01];
[V01][2]xfade=transition=fade:duration=2:offset=16[V02];
[V02][3]xfade=transition=fade:duration=2:offset=24[V03];
[V03][4]xfade=transition=fade:duration=2:offset=32[video];
[0:a][1:a]acrossfade=d=2:c1=tri:c2=tri[A01];
[A01][2:a]acrossfade=d=2:c1=tri:c2=tri[A02];
[A02][3:a]acrossfade=d=2:c1=tri:c2=tri[A03];
[A03][4:a]acrossfade=d=2:c1=tri:c2=tri[audio]"
-vsync 0 -map "[video]" -map "[audio]" out.mp4
*if there's no existing audio stream, add one using the command in step 1.
If the existing audio stream of a file isn't 10 seconds long, use these filters on it before acrossfade.
[input]aresample=async=1:first_pts=0,apad,atrim=0:10[filtered]
and then use this filtered stream as input.
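The offsets 8, 16, 24, 32 in the command above follow from the clip lengths and the fade duration. Here is a minimal sketch of that calculation, plus a helper that formats the padding filter chain from this answer. Both function names and the stream labels are made up for illustration; nothing here calls ffmpeg.

```python
def xfade_offsets(durations, fade):
    """Return the `offset` value for each xfade in a left-to-right chain.

    Each transition starts `fade` seconds before the end of the video
    accumulated so far; after the transition the accumulated length grows
    by (next clip duration - fade).
    """
    offsets = []
    total = durations[0]
    for d in durations[1:]:
        offsets.append(total - fade)
        total += d - fade
    return offsets


def pad_audio_filter(label_in, label_out, duration):
    """Format the aresample/apad/atrim chain that forces one audio stream
    to exactly `duration` seconds."""
    return (
        f"[{label_in}]aresample=async=1:first_pts=0,apad,"
        f"atrim=0:{duration}[{label_out}]"
    )


if __name__ == "__main__":
    print(xfade_offsets([10, 10, 10, 10, 10], 2))
    print(pad_audio_filter("1:a", "a1", 10))
```

With five 10-second clips and a 2-second fade, the first function returns the same offsets used in the command. For clips of different lengths, pass the real durations instead.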

All audio should play after combining with ffmpeg, but only the first part has audio

This command combines 3 mp4 files using ffmpeg. Each file has audio.
After combining, I can hear audio only for the first part.
How can I solve this problem?
===========================================================================
ffmpeg -y -i "tmp/titled-0c33a83dc70534c67f66.mp4" -i "tmp/titled-1c2fc9a95e644ab135a3.mp4" -i "tmp/titled-73c3fb1a3ea435cacdd2.mp4" -i "logo/logo.png" -filter_complex "
nullsrc=s=1280x720[bg];
[0:v]setpts=PTS-STARTPTS+0/TB[v0];
[1:v]setpts=PTS-STARTPTS+4.039/TB[v1];
[2:v]setpts=PTS-STARTPTS+8.078/TB[v2];
[bg][v0]overlay=x='if(lte(t,4.039),0,min(0,-w*min(1,max(0,0.98*(t-4.039)^2))))':y=0,trim=duration=13.145[bg];
[bg][v1]overlay=x='if(gte(t,8.078),-w*min(1,max(0,0.98*(t-8.078)^2)),max(0,1280*(1-min(1,max(0,0.69*(atan(8*(t-4.039)^2.7)))))))':y=0[bg];
[bg][v2]overlay=x='max(0,1280*(1-min(1,max(0,0.69*(atan(8*(t-8.078)^2.7))))))':y=0"
-y -vcodec h264 -crf 13 -acodec aac -strict -2 "out.mp4"

ffmpeg to calculate audio/visual difference between compressed and non-compressed video

I'm trying to calculate the audio + visual difference between a harshly compressed video file and one that hasn't been.
I'm using pipes because ultimately I wish this to take src from a camera stream.
I've managed to get the video results that I'm looking for, but I'm struggling with the audio.
I've added a line to invert the phase of the compressed audio, so that when they add up in the blend they should almost cancel each other out, but that doesn't happen.
ffmpeg -i input.avi -f avi -c:v libxvid -qscale:v 30 -c:a wmav1 - | \
ffmpeg -i - -f avi -af "aeval='-val(0)':c=same" - | \
ffmpeg -i input.avi -i - -filter_complex "blend=all_mode=difference" -c:v libx264 -crf 18 -f avi - | \
ffplay -
I can still hear all the audio, when what I should be hearing are solely compression artifacts. thx
To preface, I'm not sure your method would identify audio compression 'artifacts'.
Your command doesn't perform any audio comparison, it only inverts a single channel. Also, the audio and video are compressed twice and the codecs the last ffmpeg command receives are the default AVI codecs of mpeg4 and mp3.
Use
ffmpeg -i input.avi -f matroska -c:v libxvid -qscale:v 30 -c:a wmav1 - |\
ffmpeg -i input.avi -i - -filter_complex "[0][1]blend=all_mode=difference;[1]aselect=gt(n\,0),asetpts=PTS-STARTPTS[1a];[0][1a]amerge,aeval=val(0)-val(1):c=mono" -c:v rawvideo -c:a pcm_s16le -f matroska - |\
ffplay -
I assume your audio is mono. If your audio has N channels, your aeval will need N expressions where the Mth expression is val(M-1)-val(N+M-1)
I also trim out the first encoded audio frame in order to mitigate encoder delay that Paul mentioned, and it seems to work here.
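For audio with more than one channel, the expression list can be generated rather than typed by hand. This is a minimal sketch; the function name is made up, and it only builds the '|'-separated expression string that aeval takes. The `c=` channel layout in the command would also need adjusting for N channels, which this sketch leaves out.

```python
def aeval_difference_exprs(n_channels: int) -> str:
    """Build the '|'-separated aeval expression list for N-channel audio.

    After amerge, channels 0..N-1 come from the first input and channels
    N..2N-1 from the second, so the Mth expression (1-based) is
    val(M-1)-val(N+M-1).
    """
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    return "|".join(
        f"val({m - 1})-val({n_channels + m - 1})"
        for m in range(1, n_channels + 1)
    )


if __name__ == "__main__":
    for n in (1, 2):
        print(n, aeval_difference_exprs(n))
```

For mono this reproduces the `val(0)-val(1)` used in the command above.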
There might be some delay introduced with encoded audio samples. Also your command is incorrect.

Increasing a file's volume using VLC CLI

My goal is to have a script that takes an audio file and increases its volume by 50%.
I currently use the following AutoHotKey snippet to encode a file to MP3:
run_string := "bash -c ""\""c:\Program Files\VideoLAN\VLC\vlc.exe\"" -I dummy \""" . file_path . "\"" --sout='#transcode{acodec=mp3,vcodec=dummy}:standard{access=file,mux=raw,dst=\""" . file_path . ".mp3\""}' vlc://quit"""
How can I modify this line to not only encode to mp3, but also increase the volume of the file by 50%? I tried setting --volume 150, but it just made the file play. I don't want to play it; I want it saved with that volume.
If you have suggestions for other Windows-compatible tools to modify audio that can do this, (along with instructions on how to do this) I'll be happy to hear about them.
I suggest you use ffmpeg. It is a very powerful, cross-platform (32- or 64-bit) audio and video converter. It can be downloaded from Zeranoe FFmpeg - Builds
The sample commands below extract audio from video or convert between audio formats, with optional volume increase or decrease.
Extract audio from video to MP3, or convert audio to MP3 (sample InputFilePath_VideoOrAudio = "e:\video.mp4" or "e:\audio.m4a")
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec libmp3lame -ab 192k -ar 48000 -sn -dn -vn "E:\out.mp3"
To extract audio from video to MP3 and raise the volume to 150% of the original (a 50% increase) while extracting, add the -af "volume=1.5" parameter.
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec libmp3lame -ab 192k -ar 48000 -sn -dn -vn -af "volume=1.5" "E:\out.mp3"
List of audio converter parameters (mp3, ogg, ac3, wma, flac, wav, aiff, m4a, ...). To change the volume level while converting, add the -af "volume=VolumeValue" parameter.
VolumeValue=0.5 halves the volume (50% of the original)
VolumeValue=1.5 raises the volume to 150% of the original (a 50% increase)
VolumeValue=2.0 doubles the volume (200% of the original), and so on.
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec libmp3lame -ab 192k -ar 48000 -sn -dn -vn "E:\out.mp3"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec ac3 -ab 192k -ar 48000 -sn -dn -vn "E:\out.ac3"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -f ogg -acodec libvorbis -ab 192k -ar 48000 -sn -dn -vn "E:\out.ogg"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec wmav2 -ab 192k -ar 48000 -sn -dn -vn "E:\out.wma"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec flac -sn -dn -vn "E:\out.flac"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -sn -dn -vn "E:\out.wav"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -f aiff -sn -dn -vn "E:\out.aiff"
e:\ffmpeg\ffmpeg.exe -y -i "InputFilePath_VideoOrAudio" -acodec aac -ab 192k -ar 48000 -sn -dn -vn "E:\out.m4a"
Note 1: some codecs may be experimental; in that case, add the -strict experimental or -strict -2 parameter.
Note 2: the -ab parameter sets the audio bit rate. Some devices cannot play audio files with a bit rate greater than 192k. Use -ab 128k or -ab 192k together with -ar 44100 to produce audio files that play on most mobile devices. -ac 2 means stereo; -ac 1 means mono.
To convert only a specific part of the input file, use the -ss and -t parameters. -ss is the start position and -t is the duration. Important: place -ss before -i; otherwise ffmpeg decodes and discards everything up to the -ss position, which is slow.
Samples: assume the input file duration is 00:20:00 (20 minutes).
Using only -ss 00:05:00 converts the input from the 5th minute to the end of the file. The output file will be 15 minutes long.
Using -ss 00:05:00 with -t 120 or -t 00:02:00 converts 120 seconds, starting from the 5th minute. The output file will be 120 seconds long.
e:\ffmpeg\ffmpeg.exe -y -ss 00:05:00 -i "InputFilePath_VideoOrAudio" -t 120 -acodec libmp3lame -ab 192k -ar 48000 -sn -dn -vn "E:\out.mp3"
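The duration arithmetic in those samples can be sketched as follows. The helper names are made up for illustration, and the code only mirrors the reasoning above (start offset, optional duration cap); it does not query ffmpeg.

```python
def parse_time(value) -> float:
    """Accept seconds (int, float or numeric string) or "HH:MM:SS[.fff]"
    and return the value in seconds."""
    if isinstance(value, (int, float)):
        return float(value)
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def output_duration(input_duration, ss=0, t=None) -> float:
    """Expected output length in seconds for a given -ss / -t pair."""
    remaining = max(0.0, parse_time(input_duration) - parse_time(ss))
    if t is None:
        return remaining                      # -ss only: run to end of input
    return min(remaining, parse_time(t))      # -t caps the duration


if __name__ == "__main__":
    print(output_duration("00:20:00", ss="00:05:00"))           # seconds
    print(output_duration("00:20:00", ss="00:05:00", t=120))    # seconds
```

For the 20-minute sample this gives 900 seconds (15 minutes) with -ss alone and 120 seconds when -t 120 or -t 00:02:00 is added.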
Note: -y answers YES in advance to ffmpeg's yes/no questions, such as whether to overwrite an existing output file. With -y, ffmpeg overwrites the output file without asking if it already exists.
-sn disables subtitle streams, -vn disables video streams, and -dn disables data streams in the output file.
If you just want a CLI tool then you could use ffmpeg:
ffmpeg.exe -i test.mp3 -af volume=1.5 loud.mp3
Here test.mp3 is the input, volume=1.5 is the new volume level, and loud.mp3 is the output name.
If you'd like to be able to do it programmatically, looking at your profile I deduced that python should not be a problem :)
So you can use the nice pydub module together with ffmpeg (or avconv which it also supports) for your task.
E.g:
from pydub import AudioSegment
AudioSegment.converter = r"C:\PATH_TO_FFMPEG_DIR\bin\ffmpeg.exe"
sound = AudioSegment.from_mp3("test.mp3") # <- the input file
new = sound.export("loud.mp3", format="mp3", parameters=["-vol", "384"]) # 384 <-> 150% volume
new.flush()
new.close()
The reason for 384 is that the ffmpeg doc states that
-vol volume change audio volume (256=normal)
So 256*1.5 = 384
Tested this on my windows 7 machine just now...
Hope this helps.
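To avoid doing the 256*1.5 arithmetic by hand, here is a small sketch that converts a percentage change into the values used in the answers above. The function name and dictionary keys are made up for illustration; the dB figure is only there for reference.

```python
import math


def volume_settings(percent_change: float) -> dict:
    """Translate 'change the volume by N percent' into ffmpeg-style values.

    percent_change=50 means 'increase by 50%', i.e. a gain factor of 1.5.
    """
    factor = 1 + percent_change / 100
    if factor <= 0:
        raise ValueError("resulting gain factor must be positive")
    return {
        "af": f"volume={factor:g}",      # argument for -af
        "vol": round(256 * factor),      # value for -vol (256 = normal)
        "db": 20 * math.log10(factor),   # same gain expressed in decibels
    }


if __name__ == "__main__":
    print(volume_settings(50))
```

For a 50% increase this yields `volume=1.5` for -af and 384 for -vol, matching the numbers used above.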
The "--volume" option in VLC doesn't actually change the volume of the output file as you would expect. What you want to do is add the compressor filter and then set "compressor-makeup-gain". Set it to a value from 1-24 depending on how loud you want the output to be. So your command would be something like this:
run_string := "bash -c ""\""c:\Program Files\VideoLAN\VLC\vlc.exe\"" -I dummy \""" . file_path . "\"" --sout='#transcode{acodec=mp3,vcodec=dummy,afilter=compressor}:standard{access=file,mux=raw,dst=\""" . file_path . ".mp3\""}' --compressor-makeup-gain=20 vlc://quit"""
By the way, for anyone who is trying to figure out how to use VLC to increase the volume of the audio in a video file, here's how you can do that:
"C:\Program Files (x86)\VideoLAN\VLC\vlc.exe" yoursourcefile.mp4 :sout=#transcode{acodec=mp3,ab=128,channels=2,samplerate=44100,afilter=compressor}:file{dst=outputfilename.mp4} :sout-all :sout-keep --compressor-makeup-gain=20
Replace "yoursourcefile.mp4" and "outputfilename.mp4" with your own file names. In my experience, VLC crashed about half the time I ran this command, so you may need to try it more than once if it crashes on you.
Run this in a directory to increase the volume of every file in it, one at a time (running them all at once would eat up all the CPU):
FOR %f IN (*) DO (start /wait "" "C:\Program Files (x86)\VideoLAN\VLC\vlc.exe" %f :sout=#transcode{acodec=mp3,afilter=compressor}:file{dst=Boost%f} :sout-all :sout-keep --play-and-exit --compressor-makeup-gain=10)
I believe mp3gain has a command line option for this. You could run this as a separate pass over the generated file:
http://mp3gain.sourceforge.net/

ffmpeg stream offset command (-itsoffset) not working

I would really appreciate if someone could give some pointers regarding the use of itsoffset with ffmpeg. I have read a number of posts on this subject, some of them explain very clearly how to re-synchronize audio and video with -itsoffset, but I haven't been able to make it work.
My avi file is encoded with ffmpeg, in two passes, using the following command for the second pass:
ffmpeg -i whole-vts_01.avs -pass 2 -y -vcodec libxvid -vtag XVID -b:v 1300K -g 240 -trellis 2 -mbd rd -flags +mv4+aic -acodec ac3 -ac 2 -ar 48000 -b:a 128k output.avi
For whatever reason, I end up with a 1 sec delay in the video (or the audio is 1 sec early). It doesn't happen too often but I see it from time to time.
Among other attempts, I have tried the following:
(1) ffmpeg -i output.avi -itsoffset 00:00:01.0 -i output.avi -vcodec copy -acodec copy -map 0:0 -map 1:1 output-resynched.avi
(2) ffmpeg -i output.avi -itsoffset 00:00:01.0 -i output.ac3 -vcodec copy -acodec copy -map 0:0 -map 1:0 output-resynched2.avi
(3) ffmpeg -itsoffset -00:00:01.00 -i output.avi output-resynched8.avi
(4) ffmpeg -i output.avi -itsoffset -1.0 -i output.avi -vcodec copy -acodec copy -map 0:1 -map 1:0 output-resynched13.avi
Here are the results:
(1) Audio garbled and only 5m 35s long vs. 1h 41m.
(2) (output.ac3 is the audio component of output.avi.) Video and audio identical to the original; the offset didn't work.
(3) Audio did get shifted, but the original encoding parameters were replaced with default ones (as expected).
(4) Audio garbled and only 9m 56s long vs. 1h 41m.
I see that many people explain, and apparently use the process described above, but it doesn't seem to be working for me. Am I missing something obvious? I would very much like to be able to use -itsoffset as it is cleaner than my workaround solution.
FWIW, here is a different, and longer way of obtaining the desired result:
First create a shifted video only file using -ss:
ffmpeg -i output.avi -ss 1.0 -vcodec copy -an output_videoshifted.avi
Then extract the audio:
ffmpeg -i output.avi -vn -acodec copy output_audioonly.ac3
And finally remux both components:
ffmpeg -i output_videoshifted.avi -i output_audioonly.ac3 -vcodec copy -acodec copy -map 0:0 -map 1:0 output-resynched14.avi
The process works, is fast enough, but I would really prefer to use the one pass -itsoffset solution.
Here is what I did, and it worked for me.
Both -i inputs are the same video file.
To delay the video by 1 second, put -itsoffset before the first input (used for video) and copy the audio from the second input unchanged:
ffmpeg -y -itsoffset 00:00:01.000 -i "d:\Video1.mp4" -i "d:\Video1.mp4"
-map 0:v -map 1:a -vcodec copy -acodec copy
-f mp4 -threads 2 -v warning "Video2.mp4"
To delay the audio by 1 second, put -itsoffset before the second input (used for audio) and copy the video from the first input unchanged:
ffmpeg -y -i "d:\Video1.mp4" -itsoffset 00:00:01.000 -i "d:\Video1.mp4"
-map 0:v -map 1:a -vcodec copy -acodec copy
-f mp4 -threads 2 -v warning "Video2.mp4"
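If you need to generate either variant from a script, the argument list can be built like this. This is a minimal sketch: the function name and its parameters are made up for illustration, it uses only options that already appear in the commands above, and it builds the list without running ffmpeg.

```python
def itsoffset_command(src, dst, offset="00:00:01.000", delay="video"):
    """Build the ffmpeg argument list for the two variants shown above.

    The same file is opened twice. -itsoffset applies to the input that
    follows it, so placing it before the first -i delays the video (mapped
    from input 0) and placing it before the second -i delays the audio
    (mapped from input 1).
    """
    if delay not in ("video", "audio"):
        raise ValueError("delay must be 'video' or 'audio'")
    shifted = ["-itsoffset", offset, "-i", src]
    plain = ["-i", src]
    first = shifted if delay == "video" else plain
    second = shifted if delay == "audio" else plain
    return [
        "ffmpeg", "-y", *first, *second,
        "-map", "0:v", "-map", "1:a",
        "-vcodec", "copy", "-acodec", "copy",
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(itsoffset_command("Video1.mp4", "Video2.mp4", delay="audio")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is on the PATH.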
The problem lies in -vcodec copy -acodec copy, because with stream copy the shift only works on keyframes. I have had the same problem.
Don't copy the audio/video; use -itsoffset as before, but re-encode with
-vcodec libxvid -vtag XVID -b:v 1300K -g 240 -trellis 2 -mbd rd -flags +mv4+aic -acodec ac3 -ac 2 -ar 48000 -b:a 128k
instead. It should work.
