Is there any program that will render all segments of non-silence from an mp3 file into a separate file? [duplicate] - audio

I am trying to use the following command with the latest ffmpeg build to remove silence from my .mp3 files:
ffmpeg -i SILENCE.mp3 -af silencedetect=n=-50dB:d=1 -y -ab 192k SILENCE_OUT.mp3
However, the following output is produced:
ffmpeg version N-66154-g1654ca7 Copyright (c) 2000-2014 the FFmpeg developers
built on Sep 5 2014 22:10:38 with gcc 4.8.3 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-decklink --enable-zlib
libavutil 54. 7.100 / 54. 7.100
libavcodec 56. 1.100 / 56. 1.100
libavformat 56. 4.100 / 56. 4.100
libavdevice 56. 0.100 / 56. 0.100
libavfilter 5. 1.100 / 5. 1.100
libswscale 3. 0.100 / 3. 0.100
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 0.100 / 53. 0.100
Input #0, mp3, from 'SILENCE.mp3':
Metadata:
title : Snowblind (Featuring Tasha Baxter)
artist : Au5
album : Snowblind (Featuring Tasha Baxter)
genre : Electronica
performer : Au5
track : 1/1
date : 2014
album_artist : Au5,Tasha Baxter
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
encoder : Lavf55.42.100
Duration: 00:05:50.80, start: 0.025057, bitrate: 192 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 192 kb/s
Output #0, mp3, to 'SILENCE_OUT.mp3':
Metadata:
TIT2 : Snowblind (Featuring Tasha Baxter)
TPE1 : Au5
TALB : Snowblind (Featuring Tasha Baxter)
TCON : Electronica
TPE3 : Au5
TRCK : 1/1
TDRL : 2014
TPE2 : Au5,Tasha Baxter
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
TSSE : Lavf56.4.100
Stream #0:0: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s16p, 192 kb/s
Metadata:
encoder : Lavc56.1.100 libmp3lame
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
[silencedetect @ 0000000004398f40] silence_start: -0.00628118
[silencedetect @ 0000000004398f40] silence_end: 3.21413 | silence_duration: 3.22041
[silencedetect @ 0000000004398f40] silence_start: 343.844
[libmp3lame @ 00000000043b2940] Trying to remove 1152 samples, but the queue is empty
size= 8223kB time=00:05:50.79 bitrate= 192.0kbits/s
video:0kB audio:8222kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.011485%
However, the generated audio file still has its original length, with no silence removed.
See the following images:
Any help is appreciated!
EDIT:
Alright, silencedetect only DETECTS the silence; it does not remove it. I will try to post a solution for this.

Use the silenceremove filter. It removes silence from the audio track only and leaves any video unedited, so audio and video will go out of sync.
Its arguments are a little cryptic.
An example:
ffmpeg -i input.mp3 -af silenceremove=1:0:-50dB output.mp3
This removes silence
at the beginning (indicated by the first argument, 1),
with a minimum length of zero (indicated by the second argument, 0),
where silence is classified as anything under -50 decibels (indicated by -50dB).
Documentation:
FFMPEG silence remove filter
Also, anyone looking for the right threshold to classify silence may wish to normalise their input audio volume to 0 dB first; to do this in ffmpeg, see this answer.
Edit
As pointed out by @mems, to detect whether your version of ffmpeg has the filter, run
ffmpeg -hide_banner -filters | grep silenceremove
If you have the filter, it will output something like
silenceremove A->A Remove silence

ffmpeg's silencedetect filter only detects the silence. One has to scan the ffmpeg output and cut the mp3 file.
In theory, this would be done as:
ffmpeg -i INPUT.mp3 -af silencedetect=n=-50dB:d=1 -f null -
and monitoring for output in the form of:
[silencedetect @ 0000000004970f80] silence_start: -0.00154195
[silencedetect @ 0000000004970f80] silence_end: 3.20435 | silence_duration: 3.2059
...
[silencedetect @ 0000000004970f80] silence_start: 343.84
And then cutting the start and end silence, where the -t duration is 343.84 - 3.20435 = 340.63565:
ffmpeg -i INPUT.mp3 -ss 3.20435 -t 340.63565 OUTPUT.mp3
I ended up writing a small Java program which does it. Hints:
ffmpeg writes its log to stderr. This means you need to use ProcessBuilder with redirectErrorStream(true).
Secondly, you need to extract the silence_start and silence_end information.
Then you can use the timestamps to cut the audio.
Following code may be helpful:
Using Java and FFMPEG with silencedetect to remove audio silence
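The parsing step described in this answer can be sketched in a few lines. This is not the Java program the answer refers to; it is a minimal Python illustration that assumes the log looks like the excerpt in this answer, with one leading and one trailing silence. The names parse_silences and trim_arguments, and the file names INPUT.mp3 and OUTPUT.mp3, are hypothetical.

```python
import re

# Matches the "silence_start: <t>" and "silence_end: <t>" fields that
# silencedetect prints; "silence_duration" is deliberately not matched.
SILENCE_RE = re.compile(r"silence_(start|end):\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log_text):
    """Return (kind, seconds) tuples in the order they appear in the log."""
    return [(m.group(1), float(m.group(2))) for m in SILENCE_RE.finditer(log_text)]


def trim_arguments(events):
    """Compute the (-ss, -t) values that cut leading and trailing silence.

    Assumption: the first silence_end closes the leading silence and the
    last silence_start opens the trailing silence.
    """
    ends = [t for kind, t in events if kind == "end"]
    starts = [t for kind, t in events if kind == "start"]
    if not ends or not starts or starts[-1] <= ends[0]:
        raise ValueError("log does not show both leading and trailing silence")
    seek = ends[0]
    duration = starts[-1] - seek
    return seek, duration


sample_log = """\
[silencedetect @ 0000000004970f80] silence_start: -0.00154195
[silencedetect @ 0000000004970f80] silence_end: 3.20435 | silence_duration: 3.2059
[silencedetect @ 0000000004970f80] silence_start: 343.84
"""

seek, duration = trim_arguments(parse_silences(sample_log))
command = ["ffmpeg", "-i", "INPUT.mp3",
           "-ss", f"{seek:.5f}", "-t", f"{duration:.5f}", "OUTPUT.mp3"]
print(" ".join(command))
```

In a real run the log text would come from the stderr of the silencedetect command rather than from a string literal.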

After reading the FFmpeg silenceremove documentation, this is how you remove silence at the beginning and end of an audio file (keeps silence in the middle).
ffmpeg -i "INPUT.mp3" -af silenceremove=start_periods=1:stop_periods=1:detection=peak "OUTPUT.mp3"

As indicated by other posters, silencedetect doesn't remove anything. To remove all silence (here lower than -30 dB) from an audio file, and leave 2 second gaps between fragments, use the following.
ffmpeg -i inputfile.mp3 -af "silenceremove=start_periods=1:stop_periods=-1:start_threshold=-30dB:stop_threshold=-30dB:start_silence=2:stop_silence=2" outputfile.mp3

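The following command removes silence from the beginning and end of the file. It trims the leading silence, reverses the audio, trims the leading silence again (which is the original trailing silence), and reverses the audio back.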
ffmpeg -i input.mp3 -af "silenceremove=start_periods=1:start_duration=1:start_threshold=-50dB:detection=peak,aformat=dblp,areverse,silenceremove=start_periods=1:start_duration=1:start_threshold=-50dB:detection=peak,aformat=dblp,areverse" input_silence_removed.mp3

Related

add black&silence to beginning of a video

Hi, I am struggling to add black and silence to the beginning of a video with ffmpeg. I searched a lot, but the solutions I found look too complex for me.
Below is the command I found to add black and silence to the end of a video. How can I adapt it to add them to the beginning instead?
ffmpeg -i input.mp4 -f lavfi -i color=s=1920x1080:d=10 -filter_complex [0:v][1]concat -af [0]apad -shortest output.mp4
It looks like I need to use adelay instead of apad. Below is the command that makes sense to me, but the audio is not delayed.
ffmpeg -i input.mp4 -f lavfi -i color=s=1920x1080:d=10 -filter_complex [1][0:v]concat -af [0]adelay=10 output.mp4
Here is the input info and ffmpeg version:
ffmpeg -i input.mp4
ffmpeg version 4.2.1-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:01:00.00, start: 0.000998, bitrate: 2526 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 2394 kb/s, 24 fps, 24 tbr, 16k tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 124 kb/s (default)
Metadata:
handler_name : SoundHandler
At least one output file must be specified
Thanks!
There are several methods to do this. The first method is simple but re-encodes the main video. The second is slightly more complicated but does not re-encode the main video, so quality is preserved and it will be faster for long videos.
tpad & adelay filters
Using the tpad and adelay filters:
ffmpeg -i input.mp4 -filter_complex "[0:v]tpad=start_duration=2[v];[0:a]adelay=2s:all=true[a]" -map "[v]" -map "[a]" output.mp4
If your ffmpeg is older than version 4.2 then change adelay=2s:all=true to adelay=2000|2000.
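The two adelay spellings mentioned above differ only in how the delay is written. As a rough sketch, a helper can produce either form from a delay in seconds and a channel count; the function name adelay_filter and its legacy flag are hypothetical and exist only for this illustration.

```python
def adelay_filter(delay_seconds, channels, legacy=False):
    """Return an adelay filter string that delays every channel equally."""
    if legacy:
        # Older syntax: one delay in milliseconds per channel, joined by "|".
        milliseconds = int(round(delay_seconds * 1000))
        return "adelay=" + "|".join([str(milliseconds)] * channels)
    # Newer syntax from the answer: one delay with an "s" suffix for all channels.
    return f"adelay={delay_seconds:g}s:all=true"


print(adelay_filter(2, 2))
print(adelay_filter(2, 2, legacy=True))
```

For a stereo input and a 2 second delay, this yields the same two strings that appear in the answer.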
color & anullsrc filters with concat demuxer
Make 2 seconds of black video and silent audio that match the attributes of the input, using the color and anullsrc filters:
ffmpeg -f lavfi -i color=size=1920x1080:rate=24:duration=2 -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 -video_track_timescale 16k -shortest black.mp4
Make join.txt containing:
file 'black.mp4'
file 'input.mp4'
Concatenate with the concat demuxer:
ffmpeg -f concat -i join.txt -c copy output.mp4
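When this is scripted, the list file and the final command can be generated rather than typed by hand. The sketch below assumes the file names used in this answer; the helper names concat_entry, write_concat_list and concat_command are hypothetical. The quote escaping follows ffmpeg's quoting rules as I understand them (a literal single quote inside a quoted path is written as '\''), so treat that detail as an assumption to verify against your ffmpeg version.

```python
import os
import tempfile


def concat_entry(path):
    """Format one line of a concat-demuxer list file."""
    # Assumption: inside single quotes, a literal quote is written as '\''.
    escaped = path.replace("'", "'\\''")
    return f"file '{escaped}'"


def write_concat_list(list_path, media_paths):
    """Write one "file '<path>'" line per input, in playback order."""
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in media_paths:
            handle.write(concat_entry(path) + "\n")


def concat_command(list_path, output_path):
    """Argument list equivalent to the concat command in the answer."""
    return ["ffmpeg", "-f", "concat", "-i", list_path, "-c", "copy", output_path]


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as workdir:
    demo_list = os.path.join(workdir, "join.txt")
    write_concat_list(demo_list, ["black.mp4", "input.mp4"])
    with open(demo_list, encoding="utf-8") as handle:
        list_lines = handle.read().splitlines()

command = concat_command("join.txt", "output.mp4")
print("\n".join(list_lines))
print(" ".join(command))
```

The black clip is listed first so that it plays before the main video.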

ffmpeg 4: Using the stream_loop parameter to loop the audio during a video ends up with an infinite loop

Summary
1. Context
2. The software I use
3. The problem
4. Results
4.1. Actual Results
4.2. Expected Results
5. What did I try to fix the bug?
6. How to reproduce this bug: minimal and testable example with the provided required data
7. The question
8. Sources
Context
I want to set a WAV audio file as the background sound of a WEBM video. The video can be shorter or longer than the audio. At the moment I add the audio to the video, I don't know the length of either stream. The audio must repeat until the video ends (the audio can be truncated if the video ends before the end of the last repetition of the audio).
The software I use
I use ffmpeg version 4.2.2-1ubuntu1~18.04.sav0.
The problem
ffmpeg seems to enter an infinite loop when it mixes the audio and the video. Also, the length of the output file being generated (which contains both video and audio) is equal to the length of the audio instead of the length of the video.
The problem seems to be triggered by this command line:
ffmpeg -i directory_1/video.webm -stream_loop -1 -fflags +shortest -max_interleave_delta 50000 -i directory_2/audio.wav directory_3/video_and_audio.webm
Results
Actual Results
Three things:
The ffmpeg process loops indefinitely: I must stop it manually.
The output video file with music (still being generated, but written anyway) contains both audio and video, but its length is equal to the length of the audio instead of the length of the video.
The following output logs:
ffmpeg version 4.2.2-1ubuntu1~18.04.sav0 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
configuration: --prefix=/usr --extra-version='1ubuntu1~18.04.sav0' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, matroska,webm, from 'youtubed/my_youtube_video.webm':
Metadata:
encoder : Chrome
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0(eng): Video: vp8, yuv420p(progressive), 3200x1608, SAR 1:1 DAR 400:201, 1k tbr, 1k tbn, 1k tbc (default)
Metadata:
alpha_mode : 1
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, wav, from 'tmp_music/original_music.wav':
Duration: 00:00:11.78, bitrate: 1411 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (vp8 (native) -> vp9 (libvpx-vp9))
Stream #1:0 -> #0:1 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
[libvpx-vp9 @ 0x5645268aed80] v1.8.2
[libopus @ 0x5645268b09c0] No bit rate set. Defaulting to 96000 bps.
Output #0, webm, to 'youtubed/my_youtube_video_with_music.webm':
Metadata:
encoder : Lavf58.29.100
Stream #0:0(eng): Video: vp9 (libvpx-vp9), yuv420p(progressive), 3200x1608 [SAR 1:1 DAR 400:201], q=-1--1, 200 kb/s, 1k fps, 1k tbn, 1k tbc (default)
Metadata:
alpha_mode : 1
encoder : Lavc58.54.100 libvpx-vp9
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: opus (libopus), 48000 Hz, stereo, s16, 96 kb/s
Metadata:
encoder : Lavc58.54.100 libopus
Expected Results
No infinite loop during the ffmpeg process
Concerning the output logs, I don't know what they should look like.
The output file with the audio and the video should:
3.1. If the video is longer than the audio, then the audio is repeated until it exactly fits the video. The audio can be truncated.
3.2. If the video is shorter than the audio, then the audio is truncated and exactly fits the video.
3.3. If both video and audio are of the same length, then the audio exactly fits the video.
How to reproduce this bug? (+ required data)
Download the following files (resp. audio and video) (I must refresh these download links every 24 hours):
1.1. https://a.uguu.se/dmgsmItjJMDq_audio.wav
1.2. https://a.uguu.se/w3qHDlGq6mOW_video.webm
Move them into the directory/directories of your choice.
Open your CLI, move to the appropriate directory and copy, paste and execute the command given in the section "The problem" (don't forget to adjust the directories in the command according to step 2).
You'll face my problem.
What did I try to fix the bug?
Nothing, since I don't even understand why the bug occurs.
The question
How can I correct my command to mix these audio and video streams without an infinite loop in the ffmpeg process, keeping in mind that I don't know their lengths and that the audio must be repeated to fit the video, even if the last repetition must be truncated because the video stream has ended?
Sources
The source is the command line you can find in the section "The problem".
The placement of some of your options is wrong. All of the shortest-related options belong in front of the output.
ffmpeg -i directory_1/video.webm -stream_loop -1 -i directory_2/audio.wav -c:v copy -shortest -fflags +shortest -max_interleave_delta 100M directory_3/video_and_audio.webm
There's no need to transcode the video unless you wish to.
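The placement rule in this answer is easier to see when the command is assembled piece by piece. The following is only an illustrative sketch: the function name loop_audio_command is hypothetical, and the option values are copied from the corrected command above.

```python
def loop_audio_command(video, audio, output):
    """Build the argument list for looping `audio` under `video`."""
    args = ["ffmpeg"]
    # First input: the video, with no input options.
    args += ["-i", video]
    # Options written before an -i apply to that input only, so the loop
    # flag goes directly in front of the audio input.
    args += ["-stream_loop", "-1", "-i", audio]
    # Everything after the last input and before the output name is an
    # output option; the shortest-related options belong here.
    args += ["-c:v", "copy", "-shortest", "-fflags", "+shortest",
             "-max_interleave_delta", "100M", output]
    return args


command = loop_audio_command("directory_1/video.webm",
                             "directory_2/audio.wav",
                             "directory_3/video_and_audio.webm")
print(" ".join(command))
```

In the original command, the shortest-related options sat before the audio input, so ffmpeg treated them as options for that input rather than for the output.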

When I append a silent audio (mp3) to an existing list of audio it garbles the final audio?

After several hours I have narrowed the garbled-audio issue down to the 2-second silence MP3 I am appending (I think I once produced it with Wavelab).
However, I tried using ffmpeg, following a post, to produce a similar 2-second silent file, but it too corrupts, garbles or chops the voice in the final concatenation of audio files.
ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 2 -q:a 9 -acodec libmp3lame SILENCE_2sec.MP3
I typically have several audio files to concatenate, but for simplicity I have narrowed it down to a couple of files, giving the following script. It is a simple Windows batch file that you should be able to use to reproduce the issue at your end.
rem
rem
SET EXE="S:\_BINS\FFmpeg 4.2.1 20200112\bin\ffmpeg.exe"
SET ROOTPATH=.\
SET IN_FILE="%ROOTPATH%MyList.txt"
ECHO file '%ROOTPATH%HELLO.mp3' > MyList.txt
ECHO file 'SILENCE_2sec.MP3' >> MyList.txt
SET OPTIONS= -f concat -safe 0 -i %IN_FILE% -c copy -y
SET OUT_FILE="%ROOTPATH%CONCATENATED_AUDIO_2.MP3"
SET INFO_FILE="INFO.TXT"
%EXE% %OPTIONS% %OUT_FILE% 1> %INFO_FILE% 2>&1
ECHO ======================== >> %INFO_FILE%
ECHO IN_FILE=%IN_FILE% >> %INFO_FILE%
ECHO EXE=%EXE% >> %INFO_FILE%
ECHO OPTIONS=%OPTIONS% >> %INFO_FILE%
ECHO ======================== >> %INFO_FILE%
Here is the console output from ffmpeg. Let me know if you need other output, including output from ffprobe.
ffmpeg version git-2020-01-10-3d894db Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20191125
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 38.100 / 56. 38.100
libavcodec 58. 65.103 / 58. 65.103
libavformat 58. 35.101 / 58. 35.101
libavdevice 58. 9.103 / 58. 9.103
libavfilter 7. 70.101 / 7. 70.101
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
[mp3 @ 000000000036af80] Estimating duration from bitrate, this may be inaccurate
Input #0, concat, from '.\MyList.txt':
Duration: N/A, start: 0.000000, bitrate: 32 kb/s
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Output #0, mp3, to '.\CONCATENATED_AUDIO_2.MP3':
Metadata:
TSSE : Lavf58.35.101
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[mp3 @ 0000000000372d00] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 17280 >= 17255
size= 11kB time=00:00:02.73 bitrate= 33.2kbits/s speed=2.73e+03x
video:0kB audio:11kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.137446%
========================
IN_FILE=".\MyList.txt"
EXE="S:\_BINS\FFmpeg 4.2.1 20200112\bin\ffmpeg.exe"
OPTIONS= -f concat -safe 0 -i ".\MyList.txt" -c copy -y
========================
I believe I am running FFmpeg 4.2.1, recently installed (20200112).
You can produce HELLO.mp3 by saving the audio from the following link:
https://translate.google.com.vn/translate_tts?en=UTF-8&q=Hello+&tl=en&client=tw-ob
FYI, I am still a novice with ffmpeg and use it more like a black box, with the help I have received in this forum.
Please be as explicit as you can with command line options on how I can fix this issue.
Thank you.
Additional Hints Debugging:
If I append more files after the silence audio, the silence audio seems to affect (garble, chop) the preceding audio.
You may try the following list of input audio files.
ECHO file '%ROOTPATH%HELLO.mp3' > MyList.txt
ECHO file 'SILENCE_2sec.MP3' >> MyList.txt
ECHO file '%ROOTPATH%HELLO.mp3' >> MyList.txt
ECHO file '%ROOTPATH%HELLO.mp3' >> MyList.txt
I typically add one or more silence files to produce a pause after the actual audio. That's my current logic. However, if you have an alternative to appending silence while concatenating several audio files, or a way to append x seconds of silence to an existing audio file, I can use that method from my code as well.
Thank you.
The silent audio needs to match the parameters of the main audio:
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
The parameters above are:
sample rate (24000 Hz)
channel layout (mono)
sample format (fltp)
bitrate (32 kb/s)
The important parameters are sample rate and channel layout. In the anullsrc filter you can set these with the r/sample_rate and cl/channel_layout options as shown in ffmpeg -h filter=anullsrc.
Example command:
ffmpeg -f lavfi -i anullsrc=r=24000:cl=mono -t 2 -b:a 32k -c:a libmp3lame SILENCE_2sec.MP3
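If the main audio varies from job to job, the matching parameters can be read from the stream line instead of being typed by hand. The sketch below assumes the stream line has exactly the shape shown above; the regular expression and the function name silence_command are hypothetical, and the output name is the one from the example command.

```python
import re

# Assumed shape: "Audio: <codec>, <rate> Hz, <layout>, <sample fmt>, <n> kb/s"
STREAM_RE = re.compile(
    r"Audio:\s*(?P<codec>\w+),\s*(?P<rate>\d+) Hz,\s*(?P<layout>[\w.]+),"
    r"\s*(?P<fmt>\w+),\s*(?P<kbps>\d+) kb/s"
)


def silence_command(stream_line, seconds, output):
    """Build an ffmpeg command for silence that matches `stream_line`."""
    match = STREAM_RE.search(stream_line)
    if match is None:
        raise ValueError("not an audio stream line in the expected form")
    rate, layout, kbps = match["rate"], match["layout"], match["kbps"]
    return ["ffmpeg", "-f", "lavfi", "-i", f"anullsrc=r={rate}:cl={layout}",
            "-t", str(seconds), "-b:a", f"{kbps}k",
            "-c:a", "libmp3lame", output]


stream_line = "Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s"
command = silence_command(stream_line, 2, "SILENCE_2sec.MP3")
print(" ".join(command))
```

For the stream line from this question, the result is the same command as the example above.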

Sample accurate audio slicing in ffmpeg?

I need to slice an audio file in .wav format into 10 second chunks.
These chunks need to be exactly 10 seconds, not 10.04799988232 seconds.
The current command I am using is:
ffmpeg -i test.wav -ss 0 -to 10 -c:a libfdk_aac -b:a 80k aac/test.aac
ffmpeg version 3.2.2 Copyright (c) 2000-2016 the FFmpeg developers
built with Apple LLVM version 8.0.0 (clang-800.0.42.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/3.2.2 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-opencl --disable-lzma --enable-nonfree --enable-vda
libavutil 55. 34.100 / 55. 34.100
libavcodec 57. 64.101 / 57. 64.101
libavformat 57. 56.100 / 57. 56.100
libavdevice 57. 1.100 / 57. 1.100
libavfilter 6. 65.100 / 6. 65.100
libavresample 3. 1. 0 / 3. 1. 0
libswscale 4. 2.100 / 4. 2.100
libswresample 2. 3.100 / 2. 3.100
libpostproc 54. 1.100 / 54. 1.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from '/Users/chris/Repos/mithc/client/assets/audio/wav/test.wav':
Duration: 00:04:37.62, bitrate: 2307 kb/s
Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s
Output #0, adts, to '/Users/chris/Repos/mithc/client/assets/audio/aac/test.aac':
Metadata:
encoder : Lavf57.56.100
Stream #0:0: Audio: aac (libfdk_aac), 48000 Hz, stereo, s16, 80 kb/s
Metadata:
encoder : Lavc57.64.101 libfdk_aac
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s24le (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
size= 148kB time=00:00:15.01 bitrate= 80.6kbits/s speed=40.9x
video:0kB audio:148kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
This command does not produce exact slices. Any ideas how this can be accomplished?
Not possible*. AAC audio is stored in frames which decode to 1024 samples. So, for a 48000 Hz feed, each frame has a duration of 0.02133 seconds.
If you store the audio in a container like M4A which indicates duration per-packet, the duration of the last frame is adjusted to satisfy the specified t/ss-to. But the last frame still contains the full 1024 samples. See the readout below of the last 3 frames of a silent stream specified to be 10 seconds in a M4A. Compare the packet size(s) vis-a-vis the duration.
stream #0:
keyframe=1
duration=0.021
dts=9.941 pts=9.941
size=213
stream #0:
keyframe=1
duration=0.021
dts=9.963 pts=9.963
size=213
stream #0:
keyframe=1
duration=0.016
dts=9.984 pts=9.984
size=214
If this stream were originally stored in .aac, total duration would not be 10.00 seconds. Now whether M4A does the trick for you will depend on your player.
*There is a variant of AAC which decodes to 960 samples per frame, so 48 kHz audio could be encoded to a stream exactly 10 seconds long. FFmpeg does not include such an AAC encoder. AFAIK, many apps, including iTunes, will not play such a file correctly. If you want to encode to this spec, there's an encoder available at https://github.com/Opendigitalradio/ODR-AudioEnc
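The frame arithmetic behind this answer can be checked directly. The sketch below uses exact fractions and deliberately ignores any encoder delay or priming samples, which is a simplifying assumption; the function names frame_count and padded_duration are hypothetical.

```python
import math
from fractions import Fraction


def frame_count(seconds, sample_rate, samples_per_frame):
    """Exact number of codec frames needed to hold `seconds` of audio."""
    return Fraction(seconds) * sample_rate / samples_per_frame


def padded_duration(seconds, sample_rate, samples_per_frame):
    """Duration once the last partial frame is rounded up to a whole frame."""
    whole_frames = math.ceil(frame_count(seconds, sample_rate, samples_per_frame))
    return Fraction(whole_frames * samples_per_frame, sample_rate)


# 1024-sample frames at 48 kHz: 10 s is not a whole number of frames.
print(float(frame_count(10, 48000, 1024)))
print(float(padded_duration(10, 48000, 1024)))

# 960-sample frames at 48 kHz: 10 s is a whole number of frames.
print(float(frame_count(10, 48000, 960)))
print(float(padded_duration(10, 48000, 960)))
```

With 1024-sample frames, 10 seconds needs 468.75 frames, so the last frame carries samples beyond the 10 second mark. With 960-sample frames, the same 10 seconds is exactly 500 frames.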

