ffmpeg 4: Using the stream_loop parameter to loop the audio during a video ends up with an infinite loop

Summary
Context
The software I use
The problem
Results
4.1. Actual Results
4.2. Expected Results
What did I try to fix the bug?
How to reproduce this bug: minimal and testable example with the provided required data
The question
Sources
Context
I want to use a WAV audio file as the background sound of a WEBM video. The video can be shorter or longer than the audio. When I add the audio to the video, I do not know the length of either stream. The audio must repeat until the video ends (the audio can be truncated if the video ends before the last repetition of the audio finishes).
The software I use
I use ffmpeg version 4.2.2-1ubuntu1~18.04.sav0.
The problem
ffmpeg seems to enter an infinite loop while it mixes the audio and the video. Also, the length of the output file being generated (which contains both video and audio) equals the length of the audio instead of the length of the video.
The problem seems to be triggered by this command line:
ffmpeg -i directory_1/video.webm -stream_loop -1 -fflags +shortest -max_interleave_delta 50000 -i directory_2/audio.wav directory_3/video_and_audio.webm
Results
Actual Results
Three things:
The infinite loop of the ffmpeg process: I must manually stop the ffmpeg process
The output video file with music (still being generated, but already written to disk): it contains both audio and video, but its length equals the length of the audio instead of the length of the video.
The following output logs:
ffmpeg version 4.2.2-1ubuntu1~18.04.sav0 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
configuration: --prefix=/usr --extra-version='1ubuntu1~18.04.sav0' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, matroska,webm, from 'youtubed/my_youtube_video.webm':
Metadata:
encoder : Chrome
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0(eng): Video: vp8, yuv420p(progressive), 3200x1608, SAR 1:1 DAR 400:201, 1k tbr, 1k tbn, 1k tbc (default)
Metadata:
alpha_mode : 1
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, wav, from 'tmp_music/original_music.wav':
Duration: 00:00:11.78, bitrate: 1411 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (vp8 (native) -> vp9 (libvpx-vp9))
Stream #1:0 -> #0:1 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
[libvpx-vp9 # 0x5645268aed80] v1.8.2
[libopus # 0x5645268b09c0] No bit rate set. Defaulting to 96000 bps.
Output #0, webm, to 'youtubed/my_youtube_video_with_music.webm':
Metadata:
encoder : Lavf58.29.100
Stream #0:0(eng): Video: vp9 (libvpx-vp9), yuv420p(progressive), 3200x1608 [SAR 1:1 DAR 400:201], q=-1--1, 200 kb/s, 1k fps, 1k tbn, 1k tbc (default)
Metadata:
alpha_mode : 1
encoder : Lavc58.54.100 libvpx-vp9
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: opus (libopus), 48000 Hz, stereo, s16, 96 kb/s
Metadata:
encoder : Lavc58.54.100 libopus
Expected Results
No infinite loop during the ffmpeg process
Concerning the output logs, I don't know what they should look like.
The output file with the audio and the video should:
3.1. If the video is longer than the audio, then the audio is repeated until it exactly fits the video. The audio can be truncated.
3.2. If the video is shorter than the audio, then the audio is truncated and exactly fits the video.
3.3. If both video and audio are of the same length, then the audio exactly fits the video.
How to reproduce this bug? (+ required data)
Download the following files (resp. audio and video) (I must refresh these download links every 24 hours):
1.1. https://a.uguu.se/dmgsmItjJMDq_audio.wav
1.2. https://a.uguu.se/w3qHDlGq6mOW_video.webm
Move them into the directory/directories of your choice.
Open your CLI, move to the appropriate directory, and copy, paste, and execute the command given in the section "The problem" (remember to adjust the directories in the command according to step 2).
You'll face my problem.
What did I try to fix the bug?
Nothing, since I don't even understand why the bug occurs.
The question
How can I correct my command so that it mixes these audio and video streams without ffmpeg entering an infinite loop, keeping in mind that I don't know their lengths and that the audio must be repeated to fit the video, even if the last repetition has to be truncated because the video stream has ended?
Sources
The source is the command line you can find in Part. The problem.

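The placement of some of your options is wrong. -stream_loop -1 is an input option and correctly precedes the audio input, but all of the shortest-related options (-shortest, -fflags +shortest, -max_interleave_delta) are output options: they belong after the inputs, directly in front of the output file. Your command also lacks -shortest itself.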
ffmpeg -i directory_1/video.webm -stream_loop -1 -i directory_2/audio.wav -c:v copy -shortest -fflags +shortest -max_interleave_delta 100M directory_3/video_and_audio.webm
There's no need to transcode the video unless you wish to.
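If the command is generated from a script, the option placement can be made explicit by building the argument list in order. The following is only a sketch: the function name loop_audio_command and the file names are made up for illustration, and the options are the ones from the command above.

```python
def loop_audio_command(video, audio, output):
    """Build an ffmpeg argument list that loops `audio` until `video` ends.

    Hypothetical helper: the option values are copied from the answer's
    command; nothing here is tuned for a particular file.
    """
    return [
        "ffmpeg",
        "-i", video,                      # input 0: the video
        "-stream_loop", "-1",             # input option: applies to the next -i only
        "-i", audio,                      # input 1: the audio, looped indefinitely
        "-c:v", "copy",                   # output option: do not transcode the video
        "-shortest",                      # output option: stop at the shortest stream
        "-fflags", "+shortest",           # output option
        "-max_interleave_delta", "100M",  # output option
        output,                           # output file comes last
    ]


if __name__ == "__main__":
    cmd = loop_audio_command("video.webm", "audio.wav", "video_and_audio.webm")
    print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) as-is, which also avoids shell quoting problems with directory names.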

Related

Is there any program that will render all segments of non-silence from an mp3 file into a separate file? [duplicate]

I am trying to use the following command with the latest ffmpeg build to remove silence from my .mp3 files:
ffmpeg -i SILENCE.mp3 -af silencedetect=n=-50dB:d=1 -y -ab 192k SILENCE_OUT.mp3
However, the following output is produced:
ffmpeg version N-66154-g1654ca7 Copyright (c) 2000-2014 the FFmpeg developers
built on Sep 5 2014 22:10:38 with gcc 4.8.3 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-av
isynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enab
le-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --
enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-lib
modplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrw
b --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinge
r --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --en
able-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis
--enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-
libx265 --enable-libxavs --enable-libxvid --enable-decklink --enable-zlib
libavutil 54. 7.100 / 54. 7.100
libavcodec 56. 1.100 / 56. 1.100
libavformat 56. 4.100 / 56. 4.100
libavdevice 56. 0.100 / 56. 0.100
libavfilter 5. 1.100 / 5. 1.100
libswscale 3. 0.100 / 3. 0.100
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 0.100 / 53. 0.100
Input #0, mp3, from 'SILENCE.mp3':
Metadata:
title : Snowblind (Featuring Tasha Baxter)
artist : Au5
album : Snowblind (Featuring Tasha Baxter)
genre : Electronica
performer : Au5
track : 1/1
date : 2014
album_artist : Au5,Tasha Baxter
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
encoder : Lavf55.42.100
Duration: 00:05:50.80, start: 0.025057, bitrate: 192 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 192 kb/s
Output #0, mp3, to 'SILENCE_OUT.mp3':
Metadata:
TIT2 : Snowblind (Featuring Tasha Baxter)
TPE1 : Au5
TALB : Snowblind (Featuring Tasha Baxter)
TCON : Electronica
TPE3 : Au5
TRCK : 1/1
TDRL : 2014
TPE2 : Au5,Tasha Baxter
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
TSSE : Lavf56.4.100
Stream #0:0: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s16p, 192 kb/s
Metadata:
encoder : Lavc56.1.100 libmp3lame
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
[silencedetect # 0000000004398f40] silence_start: -0.00628118
[silencedetect # 0000000004398f40] silence_end: 3.21413 | silence_duration: 3.22
041
[silencedetect # 0000000004398f40] silence_start: 343.844
[libmp3lame # 00000000043b2940] Trying to remove 1152 samples, but the queue is
empty
size= 8223kB time=00:05:50.79 bitrate= 192.0kbits/s
video:0kB audio:8222kB subtitle:0kB other streams:0kB global headers:0kB muxing
overhead: 0.011485%
The generated audio file however still has the original length without any silence removed.
Any help is appreciated!
EDIT:
Alright, silence detect is only DETECTING the silence. Not removing it. I will try to post a solution for this.
Use the silenceremove filter. This removes silence from the audio track only; it leaves any video unedited, so audio and video will go out of sync.
Its arguments are a little cryptic.
An example
ffmpeg -i input.mp3 -af silenceremove=1:0:-50dB output.mp3
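This removes silence:
at the beginning (indicated by the first argument, 1, which is start_periods)
with a minimum length of zero (indicated by the second argument, 0, which is start_duration)
where silence is classified as anything under -50 decibels (indicated by the third argument, -50dB, which is start_threshold).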
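The positional form is easier to read when the option names are spelled out. A small sketch, assuming the three positional values map to the filter's start_periods, start_duration and start_threshold options (the helper name silenceremove_start is made up for illustration):

```python
def silenceremove_start(threshold_db=-50, min_duration=0, periods=1):
    """Return a silenceremove filter string with named options.

    Hypothetical helper: it only formats the string and does not check
    that the local ffmpeg build ships the filter.
    """
    return (
        f"silenceremove=start_periods={periods}"
        f":start_duration={min_duration}"
        f":start_threshold={threshold_db}dB"
    )


if __name__ == "__main__":
    # Same filter as the positional example, with every argument named.
    print(silenceremove_start())
```

The resulting string is passed to -af in place of silenceremove=1:0:-50dB.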
Documentation:
FFMPEG silence remove filter
Anyone looking for the right threshold for classifying silence may also wish to look into normalising the input audio volume to 0dB first; to do this in ffmpeg, see this answer.
Edit
As pointed out by @mems, to check whether your version of ffmpeg has the filter, run
ffmpeg -hide_banner -filters | grep silenceremove
if you have the filter it'll output something like
silenceremove A->A Remove silence
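ffmpeg's silencedetect filter only detects silence. You have to scan the ffmpeg output yourself and cut the MP3 file accordingly.
In theory, this would be done by running (the null muxer discards the decoded audio, so only the log is produced):
ffmpeg -i INPUT.mp3 -af silencedetect=n=-50dB:d=1 -f null -
and monitoring the output for lines of the form: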
[silencedetect # 0000000004970f80] silence_start: -0.00154195
[silencedetect # 0000000004970f80] silence_end: 3.20435 | silence_duration: 3.2059
...
[silencedetect # 0000000004970f80] silence_start: 343.84
Then cut away the silence at the start and end (343.84 - 3.20435 = 340.63565 seconds of audio to keep):
ffmpeg -i INPUT.mp3 -ss 3.20435 -t 340.63565 OUTPUT.mp3
I ended up writing a small Java program which does it. Hints:
ffmpeg writes its log to stderr, so you need to use ProcessBuilder with redirectErrorStream(true).
Next, you need to extract the silence_start and silence_end values.
Then you can use those timestamps to cut the audio.
Following code may be helpful:
Using Java and FFMPEG with silencedetect to remove audio silence
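The extraction step can be sketched in a few lines. This is not the Java program mentioned above, only an illustration in Python under two assumptions: the log has a leading silence (its silence_end marks where the audio begins) and a trailing silence (the last silence_start marks where it stops). The name trim_window is made up.

```python
import re

# Match the numeric value after each marker; the "[silencedetect ...]" prefix is ignored.
START_RE = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
END_RE = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def trim_window(log_text):
    """Return (ss, t): the seek position and duration for `-ss` and `-t`."""
    starts = [float(m.group(1)) for m in START_RE.finditer(log_text)]
    ends = [float(m.group(1)) for m in END_RE.finditer(log_text)]
    if not starts or not ends:
        raise ValueError("log contains no silence_start/silence_end markers")
    ss = ends[0]        # audio begins where the first silence ends
    stop = starts[-1]   # audio stops where the last silence begins
    if stop <= ss:
        raise ValueError("no trailing silence found after the leading one")
    return ss, stop - ss


if __name__ == "__main__":
    sample = """\
[silencedetect # 0000000004970f80] silence_start: -0.00154195
[silencedetect # 0000000004970f80] silence_end: 3.20435 | silence_duration: 3.2059
[silencedetect # 0000000004970f80] silence_start: 343.84
"""
    ss, t = trim_window(sample)
    print(f"ffmpeg -i INPUT.mp3 -ss {ss} -t {t:.5f} OUTPUT.mp3")
```

In a real run the log text would come from the stderr of the ffmpeg process, for example via subprocess.run(..., stderr=subprocess.PIPE, text=True).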
After reading the FFmpeg silenceremove documentation, this is how you remove silence at the beginning and end of an audio file (keeps silence in the middle).
ffmpeg -i "INPUT.mp3" -af silenceremove=start_periods=1:stop_periods=1:detection=peak "OUTPUT.mp3"
As indicated by other posters, silencedetect doesn't remove anything. To remove all silence (here lower than -30 dB) from an audio file, and leave 2 second gaps between fragments, use the following.
ffmpeg -i inputfile.mp3 -af "silenceremove=start_periods=1:stop_periods=-1:start_threshold=-30dB:stop_threshold=-30dB:start_silence=2:stop_silence=2" outputfile.mp3
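The following command removes silence from both the beginning and the end of the file. It trims the leading silence, reverses the audio with areverse, trims the (now leading) trailing silence, and reverses the audio back.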
ffmpeg -i input.mp3 -af "silenceremove=start_periods=1:start_duration=1:start_threshold=-50dB:detection=peak,aformat=dblp,areverse,silenceremove=start_periods=1:start_duration=1:start_threshold=-50dB:detection=peak,aformat=dblp,areverse" input_silence_removed.mp3

Unknown V4L2 pixel format equivalent for yuvj420p

I am trying to pipe a mp4 video located in Videos/video.mp4 to a virtual webcam device located at /dev/video0.
I tried running:
ffmpeg -re -i Videos/video.mp4 -map 0:v -f v4l2 /dev/video0
and I keep getting the following error:
[video4linux2,v4l2 # 0x5580cf270100] Unknown V4L2 pixel format equivalent for yuvj420p
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:0 --
Conversion failed!
Full log:
ffmpeg version 4.2.2-1+b1 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9 (Debian 9.2.1-28)
configuration: --prefix=/usr --extra-version=1+b1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Videos/video.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2020-03-23T04:24:01.000000Z
com.android.version: 8.1.0
Duration: 00:01:00.14, start: 0.000000, bitrate: 20048 kb/s
Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuvj420p(pc, smpte170m), 1920x1080, 19898 kb/s, SAR 1:1 DAR 16:9, 29.43 fps, 29.58 tbr, 90k tbn, 180k tbc (default)
Metadata:
rotate : 270
creation_time : 2020-03-23T04:24:01.000000Z
handler_name : VideoHandle
Side data:
displaymatrix: rotation of 90.00 degrees
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2020-03-23T04:24:01.000000Z
handler_name : SoundHandle
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
Press [q] to stop, [?] for help
[video4linux2,v4l2 # 0x5580cf270100] Unknown V4L2 pixel format equivalent for yuvj420p
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:0 --
Conversion failed!
The desired result is that the mp4 video is seen by apps that try to view the webcam. I am running this on a desktop without a webcam or video interface, which is why I am using /dev/video0
Add -vf format=yuv420p (or the alias -pix_fmt yuv420p).
The v4l2 output device doesn't support yuvj420p which is the pixel format of your input. In most cases ffmpeg will automatically choose a supported pixel format, but it is unable to do so for V4L2 output, so you have to manually do it:
ffmpeg -re -i Videos/video.mp4 -map 0:v -vf format=yuv420p -f v4l2 /dev/video0

When I append a silent audio (mp3) to an existing list of audio it garbles the final audio?

After several hours I have narrowed down the issue with the garbled audio to be the 2-seconds silence audio mp3 I am appending (I think I had produced it once with Wavelab)
However, I tried using ffmpeg according to a post to produce a similar 2 seconds audio but it too will corrupt/garble/chop voice in the final concatenation of audio files.
ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 2 -q:a 9 -acodec libmp3lame SILENCE_2sec.MP3
I typically have several audio files to concatenate, but for simplicity I have been able to narrow it down to a couple of files, which gives the following script. It is a simple Windows batch file that you should be able to use to reproduce the issue at your end.
rem
rem
SET EXE="S:\_BINS\FFmpeg 4.2.1 20200112\bin\ffmpeg.exe"
SET ROOTPATH=.\
SET IN_FILE="%ROOTPATH%MyList.txt"
ECHO file '%ROOTPATH%HELLO.mp3' > MyList.txt
ECHO file 'SILENCE_2sec.MP3' >> MyList.txt
SET OPTIONS= -f concat -safe 0 -i %IN_FILE% -c copy -y
SET OUT_FILE="%ROOTPATH%CONCATENATED_AUDIO_2.MP3"
SET INFO_FILE="INFO.TXT"
%EXE% %OPTIONS% %OUT_FILE% 1> %INFO_FILE% 2>&1
ECHO ======================== >> %INFO_FILE%
ECHO IN_FILE=%IN_FILE% >> %INFO_FILE%
ECHO EXE=%EXE% >> %INFO_FILE%
ECHO OPTIONS=%OPTIONS% >> %INFO_FILE%
ECHO ======================== >> %INFO_FILE%
Here is the console info output from the ffmpeg, let me know if you need other output include ones from ffprobe
ffmpeg version git-2020-01-10-3d894db Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20191125
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 38.100 / 56. 38.100
libavcodec 58. 65.103 / 58. 65.103
libavformat 58. 35.101 / 58. 35.101
libavdevice 58. 9.103 / 58. 9.103
libavfilter 7. 70.101 / 7. 70.101
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
[mp3 # 000000000036af80] Estimating duration from bitrate, this may be inaccurate
Input #0, concat, from '.\MyList.txt':
Duration: N/A, start: 0.000000, bitrate: 32 kb/s
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Output #0, mp3, to '.\CONCATENATED_AUDIO_2.MP3':
Metadata:
TSSE : Lavf58.35.101
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[mp3 # 0000000000372d00] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 17280 >= 17255
size= 11kB time=00:00:02.73 bitrate= 33.2kbits/s speed=2.73e+03x
video:0kB audio:11kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.137446%
========================
IN_FILE=".\MyList.txt"
EXE="S:\_BINS\FFmpeg 4.2.1 20200112\bin\ffmpeg.exe"
OPTIONS= -f concat -safe 0 -i ".\MyList.txt" -c copy -y
========================
I believe I am running FFmpeg 4.2.1, recently installed (20200112)
You may produce the HELLO.mp3 by saving the following link
https://translate.google.com.vn/translate_tts?en=UTF-8&q=Hello+&tl=en&client=tw-ob
FYI, I am still a novice of ffmpeg and using it more like a black box with the help I received in this very super forum.
Please be as explicit as you can with command line options on how I can fix this issue.
Thank you.
Additional Hints Debugging:
If I append more files after the silence audio it seems that the silence audio impacts (garbles, chops) the previous audio.
You may try the following for the list of audio files input.
ECHO file '%ROOTPATH%HELLO.mp3' > MyList.txt
ECHO file 'SILENCE_2sec.MP3' >> MyList.txt
ECHO file '%ROOTPATH%HELLO.mp3' >> MyList.txt
ECHO file '%ROOTPATH%HELLO.mp3' >> MyList.txt
I typically add one or more silence files to get a trailing silence after the actual audio. That's my current logic. However, if you have an alternative to appending silence while concatenating several audio files, or a way to append x seconds of silence to an existing audio file, I can use that method from my code as well.
Thank you.
The silent audio needs to match the parameters of the main audio:
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
The parameters above are:
sample rate (24000 Hz)
channel layout (mono)
sample format (fltp)
bitrate (32 kb/s)
The important parameters are sample rate and channel layout. In the anullsrc filter you can set these with the r/sample_rate and cl/channel_layout options as shown in ffmpeg -h filter=anullsrc.
Example command:
ffmpeg -f lavfi -i anullsrc=r=24000:cl=mono -t 2 -b:a 32k -c:a libmp3lame SILENCE_2sec.MP3
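When the main audio varies from run to run, the matching values can be read from the stream line that ffmpeg or ffprobe prints. The sketch below is only an illustration: the helper name silence_args is made up, the regular expression assumes a stream line shaped like the one quoted above (it will not match layouts printed as "2 channels"), and libmp3lame is hard-coded because the example targets MP3.

```python
import re

# Pull codec, sample rate, channel layout, sample format and bitrate
# out of a line such as "Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s".
STREAM_RE = re.compile(
    r"Audio:\s*(?P<codec>\w+).*?,\s*(?P<rate>\d+) Hz,\s*(?P<layout>[\w.]+),"
    r"\s*(?P<fmt>\w+),\s*(?P<kbps>\d+) kb/s"
)


def silence_args(stream_line, seconds, output):
    """Build an ffmpeg argument list for silence matching `stream_line`."""
    m = STREAM_RE.search(stream_line)
    if m is None:
        raise ValueError("unrecognised stream line: " + stream_line)
    source = f"anullsrc=r={m.group('rate')}:cl={m.group('layout')}"
    return [
        "ffmpeg", "-f", "lavfi", "-i", source,
        "-t", str(seconds),
        "-b:a", m.group("kbps") + "k",
        "-c:a", "libmp3lame",   # assumption: the main audio is MP3
        output,
    ]


if __name__ == "__main__":
    line = "Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s"
    print(" ".join(silence_args(line, 2, "SILENCE_2sec.MP3")))
```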

ffmpeg replace part of audio file with looped audio

I am quite new to ffmpeg and I am trying to replace a part of a first audio file with another second file. The second file can be too short, so some sort of loop should exist.
After some research I came up with the following command arguments and it gives me the output as long as I only do one replacement. But I would like to do multiple replacements. So any help on what I am doing wrong? Any suggestions/remarks on the way of working are also very welcome.
(Any typos in the commands below can be ignored, I generate the command by script and for ease of use I simplified the names.)
Works (One replacement):
"ffmpeg.exe" -y -i "first.wav" -i "second.wav" -filter_complex "[1:a][1:a][1:a]concat=n=3:v=0:a=1,asetpts=PTS-STARTPTS[replaceBase];[0:a]atrim=0:3,asetpts=PTS-STARTPTS[partA];[replaceBase]atrim=0:2,asetpts=PTS-STARTPTS[replaceA];[0:a]atrim=start=5,asetpts=PTS-STARTPTS[partB];[partA][replaceA][partB]concat=n=3:v=0:a=1[aout]" -map "[aout]" Out.wav
Does not work (multiple replacements):
"ffmpeg.exe" -y -i "first.wav" -i "second.wav" -filter_complex "[1:a][1:a][1:a]concat=n=3:v=0:a=1,asetpts=PTS-STARTPTS[replaceBase];[0:a]atrim=0:3,asetpts=PTS-STARTPTS[partA];[replaceBase]atrim=0:2,asetpts=PTS-STARTPTS[replaceA];[0:a]atrim=5:4,asetpts=PTS-STARTPTS[partB];[replaceBase]atrim=0:2,asetpts=PTS-STARTPTS[replaceB];[0:a]atrim=start=6,asetpts=PTS-STARTPTS[partC];[partA][replaceA][partB][replaceB][PartC]concat=n=4:v=0:a=1[aout]" -map "[aout]" Out.wav
ffmpeg version N-76860-g72eaf72 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 5.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
libavutil 55. 9.100 / 55. 9.100
libavcodec 57. 16.100 / 57. 16.100
libavformat 57. 19.100 / 57. 19.100
libavdevice 57. 0.100 / 57. 0.100
libavfilter 6. 15.100 / 6. 15.100
libswscale 4. 0.100 / 4. 0.100
libswresample 2. 0.101 / 2. 0.101
libpostproc 54. 0.100 / 54. 0.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from '3897583stereo.wav':
Duration: 00:00:12.07, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, 2 channels, s16, 256 kb/s
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, wav, from 'beep-021.wav':
Metadata:
encoder : Lavf57.19.100
Duration: 00:00:00.30, bitrate: 1413 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s16, 1411 kb/s
[wav # 057242c0] Invalid stream specifier: replaceBase.
Last message repeated 1 times
Stream specifier 'STREAM CUT matches no streams.
Thanks in advance!
I managed to find a workaround (or maybe this is simply how it should be done): a labelled filter output such as [replaceBase] can be consumed only once, so I split the looped stream with asplit and give each replacement its own copy. Remarks on this way of processing are still welcome...
"ffmpeg.exe" -y -i "first.wav" -i "second.wav" -filter_complex "[1:a][1:a][1:a]concat=n=3:v=0:a=1,asetpts=PTS-STARTPTS[replaceBase];[replaceBase]asplit=2 [replaceA][replaceB];[0:a]atrim=0:3,asetpts=PTS-STARTPTS[partA];[replaceA]atrim=0:2,asetpts=PTS-STARTPTS[replaceTrimmedA];[0:a]atrim=5:6,asetpts=PTS-STARTPTS[partB];[replaceB]atrim=0:2,asetpts=PTS-STARTPTS[replaceTrimmedB];[0:a]atrim=start=8,asetpts=PTS-STARTPTS[partC];[partA][replaceTrimmedA][partB][replaceTrimmedB][partC]concat=n=5:v=0:a=1[aout]" -map "[aout]" Out.wav
Regards,
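Since the command is generated by a script anyway, the graph can be assembled for any number of replacements. The following is a sketch of the same asplit structure, not the poster's actual generator: the function name replace_graph and the labels base, r0, keep0, fill0 and so on are made up, and it assumes at least one replacement, sorted and non-overlapping, with the first one starting after 0.

```python
def replace_graph(replacements, loops=3):
    """Build a -filter_complex string replacing ranges of input 0 with looped input 1.

    `replacements` is a list of (start, end) pairs in seconds.
    """
    n = len(replacements)
    if n == 0:
        raise ValueError("need at least one replacement")
    chains = [
        # Loop the second input by concatenating it with itself.
        "[1:a]" * loops + f"concat=n={loops}:v=0:a=1,asetpts=PTS-STARTPTS[base]",
        # A labelled output can be consumed only once, so fan it out with asplit.
        f"[base]asplit={n}" + "".join(f"[r{i}]" for i in range(n)),
    ]
    order = []
    cursor = 0
    for i, (start, end) in enumerate(replacements):
        # Part of the first input that is kept before this replacement.
        chains.append(f"[0:a]atrim={cursor}:{start},asetpts=PTS-STARTPTS[keep{i}]")
        # Replacement audio, trimmed to the length of the removed range.
        chains.append(f"[r{i}]atrim=0:{end - start},asetpts=PTS-STARTPTS[fill{i}]")
        order += [f"[keep{i}]", f"[fill{i}]"]
        cursor = end
    # Remainder of the first input after the last replacement.
    chains.append(f"[0:a]atrim=start={cursor},asetpts=PTS-STARTPTS[keep{n}]")
    order.append(f"[keep{n}]")
    chains.append("".join(order) + f"concat=n={len(order)}:v=0:a=1[aout]")
    return ";".join(chains)


if __name__ == "__main__":
    # Replace 3s-5s and 6s-8s of the first input, as in the command above.
    print(replace_graph([(3, 5), (6, 8)]))
```

The final concat count is derived from the number of segments, so it stays in step with the labels as replacements are added.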

Resources