I would like to cut a video at the beginning at any particular timestamp, and it needs to be precise, so the nearest key frame is not good enough.
Also, these videos are rather long - an hour or longer - so I would like to avoid re-encoding altogether if possible, or otherwise re-encode only a minimal fraction of the total duration. Thus, I would like to maximise the use of -vcodec copy.
How can I accomplish this using ffmpeg?
NOTE: See scenario, and my own rough idea for a possible solution below.
Scenario:
Original video
Length of 1:00:00
Has a key frame every 10s
Desired cut:
From 0:01:35 through till the end
Attempt #1:
Using -ss 0:01:35 -i blah.mp4 -vcodec copy, what results is a file where:
audio starts at 0:01:30
video also starts at 0:01:30
this starts both the audio and the video too early
using -i blah.mp4 -ss 0:01:35 -vcodec copy, what results is a file where:
audio starts at 0:01:35,
but the video is blank/ black for the first 5 seconds,
until 0:01:40, when the video starts
this starts the audio on time,
but the video starts too late
Rough idea
(1) cut 0:01:30 to 0:01:40
re-encode this to have new key frames,
including one at the target time of 0:01:35
then cut this to get the 5 seconds from 0:01:35 through 0:01:40
(2) cut 0:01:40 through till the end
without re-encoding, using -vcodec copy
(3) ffmpeg concat the first short clip (the 5 second one)
with the second long clip
I know/ can work out the commands for (2) and (3), but am unsure about what commands are needed for (1).
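List timestamps of key frames (recent FFmpeg releases have removed pkt_pts_time; if the command below fails there, use frame=pts_time instead):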
ffprobe -v error -select_streams v:0 -skip_frame nokey -show_entries frame=pkt_pts_time -of csv=p=0 input.mp4
It will output something like:
0.000000
2.502000
3.795000
6.131000
10.344000
12.554000
16.266000
...
Let's say you want to delete timestamps 0 to 5, and then stream copy the remainder. The closest following key frame is 6.131.
Re-encode 5 to 6.131. Ensure the input and output match attributes and formats. For MP4 default settings should do most of the work, assuming H.264/AAC, but you may have to manually match the profile.
ffmpeg -i input.mp4 -ss 5 -to 6.131 trimmed.mp4
Make input.txt for the concat demuxer:
file 'trimmed.mp4'
file 'input.mp4'
inpoint 6.131
Concatenate:
ffmpeg -f concat -i input.txt -c copy output.mp4
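The steps above can be sketched as a small Python helper that picks the first key frame at or after the cut point and prepares the re-encode command and the concat list. This is only a sketch under assumptions: the function names (parse_keyframe_times, next_keyframe, plan_cut) are made up for illustration, the file names are the ones used in the answer, and the key frame times are assumed to come from the ffprobe output shown earlier.

```python
import bisect


def parse_keyframe_times(ffprobe_output):
    """Parse one timestamp per line, skipping blank or non-numeric lines."""
    times = []
    for line in ffprobe_output.splitlines():
        field = line.strip().rstrip(",")
        try:
            times.append(float(field))
        except ValueError:
            continue
    return sorted(times)


def next_keyframe(keyframes, cut):
    """Return the first key frame at or after `cut`, or None if there is none."""
    i = bisect.bisect_left(keyframes, cut)
    return keyframes[i] if i < len(keyframes) else None


def plan_cut(keyframes, cut, src="input.mp4", head="trimmed.mp4"):
    """Return (re-encode command or None, concat list text) for a cut at `cut`."""
    kf = next_keyframe(keyframes, cut)
    if kf is None:
        raise ValueError("no key frame at or after the cut point")
    if kf == cut:
        # The cut already sits on a key frame, so nothing needs re-encoding.
        return None, f"file '{src}'\ninpoint {kf}\n"
    reencode = ["ffmpeg", "-i", src, "-ss", str(cut), "-to", str(kf), head]
    concat_list = f"file '{head}'\nfile '{src}'\ninpoint {kf}\n"
    return reencode, concat_list
```

The command is returned as an argument list so it can be passed to subprocess.run without shell quoting; the list text is what would be written to input.txt.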
try
ffmpeg -i src.mp4 -vcodec copy -reset_timestamps 1 -map 0 out.mp4
or
ffmpeg -i src.mp4 -vcodec copy -reset_timestamps 1 -map 0 src_.m3u8
which generates HLS playlists.
How can I use the ffmpeg command-line tool on Windows to split a sound file into multiple sound files, each a fixed 30 seconds long, without changing any of the sound properties? I got this manual example from here:
ffmpeg -i long.mp3 -acodec copy -ss 00:00:00 -t 00:00:30 half1.mp3
ffmpeg -i long.mp3 -acodec copy -ss 00:00:30 -t 00:00:30 half2.mp3
But is there a way to tell it to split the input file into equal-length sound files of 30 seconds each, with the last one holding whatever remains?
You can use the segment muxer.
ffmpeg -i long.mp3 -acodec copy -vn -f segment -segment_time 30 half%d.mp3
Add -segment_start_number 1 to start segment numbering from 1.
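To see what the segment muxer is expected to produce, here is a rough Python sketch that lists the nominal (start, length) of each segment and the file names for a given total duration. The helper names segment_spans and segment_names are hypothetical. With -acodec copy the real split points fall on packet boundaries, so actual lengths will only be approximately these values.

```python
def segment_spans(total, segment_time=30):
    """Return nominal (start, length) pairs covering `total` seconds."""
    if total < 0 or segment_time <= 0:
        raise ValueError("total must be >= 0 and segment_time must be > 0")
    spans = []
    start = 0
    while start < total:
        # The last segment simply holds whatever is left.
        spans.append((start, min(segment_time, total - start)))
        start += segment_time
    return spans


def segment_names(count, pattern="half%d.mp3", start_number=0):
    """Expand the %d pattern the way the numbering in the command above works."""
    return [pattern % n for n in range(start_number, start_number + count)]
```

For a 75 second file this gives two 30 second segments and one 15 second remainder, named half0.mp3 to half2.mp3 unless the start number is changed.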
I need to insert a short beep into another audio file (similar to a censorship bleep) using Linux and/or PHP.
I'm thinking there should be some way to do it with ffmpeg (with some combination of -t, concat, map, async, adelay, itsoffset?) or avconv or mkvmerge - but haven't found anyone doing this. Maybe I need to do it in 2 stages somehow?
For example if I have a 60 second mp3 and want to beep out 2 seconds at 2 places the desired result would be:
0:00-0:15 from original
0:15-0:17 beep (overwrites the 2 secs of original)
0:17-0:40 from original
0:40-0:42 beep
0:42-0:60 from original
I have a 2 second beep.mp3, but can use something else instead like -i "sine=frequency=1000:duration=2"
You can use the concat demuxer.
Create a text file, e.g.
file main.wav
inpoint 0
outpoint 15
file beep.wav
file main.wav
inpoint 17
outpoint 40
file beep.wav
file main.wav
inpoint 42
and then
ffmpeg -f concat -i list.txt out.mp3
Convert the beep file to have the same sampling rate and channel count as the main audio.
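If there are many bleeps, the list file can be generated instead of written by hand. The following Python sketch builds the concat list from the bleep spans; build_beep_list is a made-up name, main.wav and beep.wav are the file names from the answer, and it assumes the beep file is exactly as long as each span it replaces (2 seconds in the question).

```python
def build_beep_list(bleeps, total, main="main.wav", beep="beep.wav"):
    """Return concat list text replacing each (start, end) span of `main` with `beep`."""
    lines = []
    pos = 0  # start of the next piece of the original to keep
    for start, end in sorted(bleeps):
        if start < pos or end <= start or end > total:
            raise ValueError("bleeps must not overlap and must lie inside the file")
        if start > pos:
            lines += [f"file {main}", f"inpoint {pos}", f"outpoint {start}"]
        lines.append(f"file {beep}")
        pos = end
    if pos < total:
        # Keep the tail of the original after the last bleep.
        lines += [f"file {main}", f"inpoint {pos}", f"outpoint {total}"]
    return "\n".join(lines) + "\n"
```

For the 60 second example, build_beep_list([(15, 17), (40, 42)], 60) returns the text to save as list.txt.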
First, make beep.mp3 about as long as your MP3 file (60 seconds, or slightly less).
Then cut each piece by placing -ss <start_time> -t <duration> before the corresponding -i <your_file>.mp3, and join all five pieces with the concat filter:
ffmpeg -ss 00:00:00 -t 15 -i ./original.mp3 -ss 00:00:15 -t 2 -i ./beep.mp3 -ss 00:00:17 -t 23 -i ./original.mp3 -ss 00:00:40 -t 2 -i ./beep.mp3 -ss 00:00:42 -i ./original.mp3 -filter_complex '[0:0][1:0][2:0][3:0][4:0]concat=n=5:v=0:a=1[out]' -map '[out]' ./output.mp3
This gives you the output.mp3 file you need.
I have an mp4 file and I want to take two sequential sections of the video out and render them as individual files, later recombining them back into the original video. For instance, with my video video.mp4, I can run
ffmpeg -i video.mp4 -ss 56 -t 4 out1.mp4
ffmpeg -i video.mp4 -ss 60 -t 4 out2.mp4
creating out1.mp4 which contains 00:00:56 to 00:01:00 of video.mp4, and out2.mp4 which contains 00:01:00 to 00:01:04. However, later I want to be able to recombine them again quickly (i.e., without reencoding), so I use the concat demuxer,
ffmpeg -f concat -safe 0 -i files.txt -c copy concat.mp4
where files.txt contains
file out1.mp4
file out2.mp4
which theoretically should give me back 00:00:56 to 00:01:04 of video.mp4, however there are always dropped audio frames where the concatenation occurs, creating a very unpleasant sound artifact, an audio blip, if you will.
I have tried using async and -af apad on initially creating the two sections of the video but I am still faced with the same problem, and have not found the solution elsewhere. I have experienced this issue in multiple different use cases, so hopefully this simple example will shed some light on the real problem.
I suggest you export segments to MOV with PCM audio, then concat those but with re-encoding audio.
ffmpeg -i video.mp4 -c:a pcm_s16le -ss 56 -t 4 out1.mov
...
and then, with files.txt listing the .mov segments,
ffmpeg -f concat -safe 0 -i files.txt -c:v copy concat.mp4
I noticed that the ffmpeg amix filter doesn't produce good results in one specific situation. It works fine if the input files have equal duration: in that case the volume drops by a constant factor, which can be fixed with ",volume=2".
In my case the files have different durations, and the resulting volume is not good: the first mixed stream ends up with the lowest volume and the last one with the highest. You can see in the image that the volume increases linearly over time.
My command:
ffmpeg -i temp_0.mp4 -i user_2123_10.mp4 -i user_2123_3.mp4 -i user_2123_4.mp4
-i user_2123_7.mp4 -i user_2123_5.mp4 -i user_2123_1.mp4 -i user_2123_8.mp4
-i user_2123_0.mp4 -i user_2123_6.mp4 -i user_2123_9.mp4 -i user_2123_2.mp4
-i user_2123_11.mp4 -filter_complex "[1:a]adelay=34741.0[aud1];
[2:a]adelay=18241.0[aud2];[3:a]adelay=20602.0[aud3];
[4:a]adelay=27852.0[aud4];[5:a]adelay=22941.0[aud5];
[6:a]adelay=13142.0[aud6];[7:a]adelay=29810.0[aud7];
[8:a]adelay=12.0[aud8];[9:a]adelay=25692.0[aud9];
[10:a]adelay=32143.002[aud10];[11:a]adelay=16101.0[aud11];
[12:a]adelay=40848.0[aud12];
[0:a][aud1][aud2][aud3][aud4][aud5][aud6][aud7]
[aud8][aud9][aud10][aud11]
[aud12]amix=inputs=13:duration=first:dropout_transition=0"
-vcodec copy -y temp_1.mp4
That could be fixed by padding each clip with silence at the beginning and end, so that they all have the same duration and the volume stays at the same level.
Please suggest how I can use amix to mix many inputs and ensure a constant volume level.
amix scales each input's volume by 1/n, where n is the number of active inputs. This is evaluated for each audio frame. So when an input drops out, n decreases, the remaining inputs are attenuated less, and hence their volumes increase.
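The 1/n behaviour can be illustrated with a tiny Python model. This is a simplification, not amix's actual implementation: amix_gain is a hypothetical helper, each input is described only by the time at which it ends, and any smoothing from dropout_transition is ignored.

```python
def amix_gain(end_times, t):
    """Gain applied to each still-active input at time t under a plain 1/n model."""
    # An input counts as active until it ends.
    active = sum(1 for end in end_times if end > t)
    return 1.0 / active if active else 0.0
```

With three inputs ending at 10, 20 and 30 seconds, the gain on the survivors goes from 1/3 to 1/2 to 1 as the others finish, which matches the rising volume seen in the question.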
Changing the dropout_transition for all earlier inputs, as suggested in other answers, is one approach, but I think it will result in coarse volume modulations. A better method is to normalize the audio after the amix.
At present, you have two options: the loudnorm or the dynaudnorm filter. The latter is much faster.
Syntax is to add it after the amix, so
[aud11][aud12]amix=inputs=13:duration=first:dropout_transition=0,dynaudnorm"
Read the documentation if you wish to tweak parameters such as maximum volume or RMS mode normalization.
Recent versions of FFmpeg include the normalize parameter for the amix filter, which you can use to turn off the constantly changing normalization. Here's the documentation for it.
Your amix filter string can be changed to:
[aud12]amix=inputs=13:normalize=0
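For a long list of inputs, the whole filter graph can be generated. Below is a Python sketch that builds a -filter_complex string of the same shape as in the question, ending in amix with normalize=0. build_amix_filter is a made-up name, input 0 is assumed to be the undelayed base track, and the "delay|delay" form (delaying both channels of a stereo input) is my assumption; the question's command passes a single value.

```python
def build_amix_filter(delays_ms, normalize=False):
    """Build a filter graph: input 0 undelayed, inputs 1..N delayed with adelay."""
    parts = []
    labels = ["[0:a]"]
    for i, delay in enumerate(delays_ms, start=1):
        # Delay input i and give the result a label for amix to pick up.
        parts.append(f"[{i}:a]adelay={delay}|{delay}[aud{i}]")
        labels.append(f"[aud{i}]")
    mix = f"{''.join(labels)}amix=inputs={len(labels)}:normalize={int(normalize)}"
    return ";".join(parts + [mix])
```

Note that with normalization off the inputs are simply summed, so the mix may get louder than any single input and the levels may need lowering afterwards.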
The solution I've found is to specify the volume for each track in descending order and use no normalization filter afterwards.
I use this example, where I mix the same audio file at different offsets:
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0,volume=3[a];[1]adelay=2000|2000,volume=2[b];[2]adelay=4000|4000,volume=1[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-volume.mp3
More details, see this image. The first track is the normal mixing, the second is the one with volumes specified; the third is the original track. As we can see the 2nd track looks to have a normal volume.
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0[a];[1]adelay=2000|2000[b];[2]adelay=4000|4000[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-no-volume.mp3
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0,volume=3[a];[1]adelay=2000|2000,volume=2[b];[2]adelay=4000|4000,volume=1[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-volume.mp3
I can't really understand why amix changes the volume; anyway, I had been digging around for a while for a good solution.
The solution seems to be a combination of "pre-amp", or multiplication, as Maxim puts it, AND you have to set dropout_transition >= max delay + max input length (or a very high number):
amix=inputs=13:dropout_transition=1000,volume=13
Notes:
amix has to resample to float anyway, so there is no downside to adding the volume filter (which by default resamples to float, too).
And since we're using floats, there's no clipping and (almost) no loss of precision.
H/t to #Mulvya for the analysis, but their solution is frustratingly non-mathematical.
I was originally trying to do this with sox, which was too slow. Sox's remix filter has the -m switch which disables the 1/n adjustment.
While faster, ffmpeg seems to be using way more memory for the same task. YMMV - I didn't test this thoroughly, because I finally settled on a small python script which uses pydub's overlay function, and only keeps the final output file and one segment in memory (whereas ffmpeg and sox seem to keep all of the segments in memory).
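The rule from this answer (dropout_transition at least max delay plus max input length, followed by volume=N) can be written down as a short Python sketch. amix_preamp_options is a hypothetical helper, and delays and lengths are assumed to be given in seconds, one entry per input.

```python
import math


def amix_preamp_options(delays_s, lengths_s):
    """Return the 'amix=...,volume=N' tail using the rule described above."""
    if not delays_s or len(delays_s) != len(lengths_s):
        raise ValueError("need one delay and one length per input")
    n = len(delays_s)
    # Round up so the transition is never shorter than the required bound.
    transition = math.ceil(max(delays_s) + max(lengths_s))
    return f"amix=inputs={n}:dropout_transition={transition},volume={n}"
```

Any larger transition value, such as the 1000 used above, satisfies the same condition; this just computes the smallest whole number of seconds that does.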
I had the same problem but found a solution!
First, the problem: I had to mix a background music file with 3 different TTS voice pieces that start with different delays. Towards the end, the background sound was extremely loud.
I tried the suggested answer but it did not work for me; the end volume was still much higher. So my thought was: "All inputs must have the same length, so that the same number of audio streams is active in the mix at all times."
Using apad with whole_len set on all TTS inputs, in combination with the -shortest option, did the trick for me.
Example call:
ffmpeg -y
-nostats
-hide_banner
-v quiet
-hwaccel auto
-f image2pipe
-i pipe:0
-i bgAudio.aac
-i TTS1.mp3
-i TTS2.mp3
-i TTS3.mp3
-filter_complex [1:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false[a0];[2:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=7680|7680,apad=whole_len=2346240[a1];[3:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=14640|14640,apad=whole_len=2346240[a2];[4:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=3240|3240,apad=whole_len=2346240[a3];[a0][a1][a2][a3]amix=inputs=4:dropout_transition=0,asplit=6[audio0][audio1][audio2][audio3][audio4][audio5];[0:v]format=yuv420p,split=6[1080p][720p][480p][360p][240p][144p]
-map [audio0] -map [1080p] -s 1920x1080 -shortest out1080p.mp4
-map [audio1] -map [720p] -s 1280x720 -shortest out720p.mp4
-map [audio2] -map [480p] -s 858x480 -shortest out480p.mp4
-map [audio3] -map [360p] -s 640x360 -shortest out360p.mp4
-map [audio4] -map [240p] -s 426x240 -shortest out240p.mp4
-map [audio5] -map [144p] -s 256x144 -shortest out144p.mp4
Hope this helps someone!
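One detail worth spelling out: apad's whole_len is a number of samples, not seconds, so it has to be derived from the target duration and the sample rate. A minimal Python sketch, with hypothetical helper names whole_len_samples and apad_filter:

```python
def whole_len_samples(duration_s, sample_rate):
    """Convert a target duration in seconds to a sample count for apad's whole_len."""
    return round(duration_s * sample_rate)


def apad_filter(duration_s, sample_rate):
    """Return an apad filter string that pads a stream up to `duration_s` seconds."""
    return f"apad=whole_len={whole_len_samples(duration_s, sample_rate)}"
```

For instance, 48.88 seconds at an assumed 48000 Hz works out to 2346240 samples, the value used in the call above; the sample rate of the original files is not stated, so treat that pairing as a guess.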
Try to use multiplication:
"amix=inputs="+ chunks.length + ":duration=first:dropout_transition=3,volume=" + chunks.length
Sorry for not sending the ffmpeg output.
In the end we wrote a small utility in C++ for mixing audio, after first converting the MP4 files to raw (PCM) format. That worked just fine for us, even though it requires additional disk space for the raw intermediate files.
The code looks like this:
short addSounds(short a, short b) {
    double da = a;
    da /= 65536.0;
    da += 0.5;
    double db = b;
    db /= 65536.0;
    db += 0.5;
    double z = 0;
    if (da < 0.5 && db < 0.5) {
        z = 2 * da * db;
    }
    else {
        z = 2 * (da + db) - 2 * da * db - 1;
    }
    z -= 0.5;
    z *= 65536.0;
    return (short)z;
}
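For anyone who wants to try the same formula without compiling anything, here is a direct Python port of the function above, plus a hypothetical mix_tracks helper that applies it sample by sample. It assumes signed 16-bit samples in the range -32768 to 32767, as in the C++ version.

```python
def add_sounds(a, b):
    """Mix two signed 16-bit samples with the same formula as addSounds above."""
    # Map each sample from [-32768, 32767] to [0, 1).
    da = a / 65536.0 + 0.5
    db = b / 65536.0 + 0.5
    if da < 0.5 and db < 0.5:
        z = 2 * da * db
    else:
        z = 2 * (da + db) - 2 * da * db - 1
    # Map back to the 16-bit range; int() truncates like the C cast.
    return int((z - 0.5) * 65536.0)


def mix_tracks(track_a, track_b):
    """Mix two equal-rate sample sequences, stopping at the shorter one."""
    return [add_sounds(a, b) for a, b in zip(track_a, track_b)]
```

A nice property to check is that mixing with silence (a sample of 0) returns the other sample unchanged, and that the result stays within the 16-bit range without a separate clipping step.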
I will show you my code.
"amix="+inputs.size()+",volume="+(inputs.size()+1)/2+"[mixout]\""
I don't use dropout_transition=0 because it causes the problem you describe.
However, I also found that the volume gets lower as the number of inputs increases,
so I make the volume louder.
Try changing the dropout transition to the duration of the first input:
duration=first:dropout_transition=_duration_of_the_first_input_in_seconds_
here is my ffmpeg command:
ffmpeg -y -i long.wav -i short.wav -filter_complex "[1:a]adelay=6000|6000[a1];[1:a]adelay=10000|10000[a2];[1:a]adelay=14000|14000[a3];[1:a]adelay=18000|18000[a4];[1:a]adelay=21000|21000[a5];[1:a]adelay=25500|25500[a6];[0:a][a1][a2][a3][a4][a5][a6]amix=inputs=7:duration=first:dropout_transition=32[aout]" -map "[aout]" -ac 2 -b:a 192k -ar 44100 output.mp3
See the two dropout transitions in the screenshot.