FFmpeg concatenates two m4a files incorrectly - audio

When I concatenate two m4a files with the concat demuxer, ffmpeg produces a file whose duration is incorrect. As the readouts below show, the duration of the output file is very different from the combined duration of the two input files. Please help me spot the issue. My ultimate goal is to append silent audio to the end of an audio file. For that, I generate a silent audio file with ffmpeg and then try to concatenate it with the other audio file.
Command I used to generate the silent audio file:
ffmpeg -nostdin -loglevel error -y -threads 0 -filter_complex aevalsrc=0 -t 4 /home/ec2-user/videocreation/temp/silence.m4a
Command I used for concat:
ffmpeg -f concat -safe 0 -i temp.txt -c copy output.m4a
I have two file paths listed in temp.txt:
[ec2-user@ip-10-0-1-126 server]$ cat temp.txt
file /home/ec2-user/videoData/DnXptC4ld8/FADING_OUT_VOLUP_Blrt_Decrypt_1ed5c4d569d8a1f23428b65217f65eaf_audio.m4a
file /home/ec2-user/videocreation/temp/silence.m4a
First file ffprobe:
[ec2-user@ip-10-0-1-126 server]$ ffprobe /home/ec2-user/videoData/DnXptC4ld8/FADING_OUT_VOLUP_Blrt_Decrypt_1ed5c4d569d8a1f23428b65217f65eaf_audio.m4a
ffprobe version N-80097-g89e9393 Copyright (c) 2007-2016 the FFmpeg developers
built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-4)
configuration: --prefix=/home/ec2-user/ffmpeg_build --extra-cflags=-I/home/ec2-user/ffmpeg_build/include --extra-ldflags=-L/home/ec2-user/ffmpeg_build/lib --bindir=/home/ec2-user/bin --pkg-config-flags=--static --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
libavutil 55. 24.100 / 55. 24.100
libavcodec 57. 43.100 / 57. 43.100
libavformat 57. 37.100 / 57. 37.100
libavdevice 57. 0.101 / 57. 0.101
libavfilter 6. 46.100 / 6. 46.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 0.101 / 2. 0.101
libpostproc 54. 0.100 / 54. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/ec2-user/videoData/DnXptC4ld8/FADING_OUT_VOLUP_Blrt_Decrypt_1ed5c4d569d8a1f23428b65217f65eaf_audio.m4a':
Metadata:
major_brand : M4A
minor_version : 512
compatible_brands: isomiso2
encoder : Lavf57.44.100
Duration: 00:00:01.77, start: 0.000000, bitrate: 4 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 11025 Hz, mono, fltp, 0 kb/s (default)
Metadata:
handler_name : SoundHandler
Second file ffprobe:
[ec2-user@ip-10-0-1-126 server]$ ffprobe /home/ec2-user/videocreation/temp/silence.m4a
ffprobe version N-80097-g89e9393 Copyright (c) 2007-2016 the FFmpeg developers
built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-4)
configuration: --prefix=/home/ec2-user/ffmpeg_build --extra-cflags=-I/home/ec2-user/ffmpeg_build/include --extra-ldflags=-L/home/ec2-user/ffmpeg_build/lib --bindir=/home/ec2-user/bin --pkg-config-flags=--static --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
libavutil 55. 24.100 / 55. 24.100
libavcodec 57. 43.100 / 57. 43.100
libavformat 57. 37.100 / 57. 37.100
libavdevice 57. 0.101 / 57. 0.101
libavfilter 6. 46.100 / 6. 46.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 0.101 / 2. 0.101
libpostproc 54. 0.100 / 54. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/ec2-user/videocreation/temp/silence.m4a':
Metadata:
major_brand : M4A
minor_version : 512
compatible_brands: isomiso2
encoder : Lavf57.44.100
Duration: 00:00:04.02, start: 0.000000, bitrate: 4 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name : SoundHandler
Output file ffprobe:
[ec2-user@ip-10-0-1-126 server]$ ffprobe output.m4a
ffprobe version N-80097-g89e9393 Copyright (c) 2007-2016 the FFmpeg developers
built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-4)
configuration: --prefix=/home/ec2-user/ffmpeg_build --extra-cflags=-I/home/ec2-user/ffmpeg_build/include --extra-ldflags=-L/home/ec2-user/ffmpeg_build/lib --bindir=/home/ec2-user/bin --pkg-config-flags=--static --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
libavutil 55. 24.100 / 55. 24.100
libavcodec 57. 43.100 / 57. 43.100
libavformat 57. 37.100 / 57. 37.100
libavdevice 57. 0.101 / 57. 0.101
libavfilter 6. 46.100 / 6. 46.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 0.101 / 2. 0.101
libpostproc 54. 0.100 / 54. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output.m4a':
Metadata:
major_brand : M4A
minor_version : 512
compatible_brands: isomiso2
encoder : Lavf57.37.100
Duration: 00:00:23.22, start: 0.000000, bitrate: 0 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 11025 Hz, mono, fltp, 0 kb/s (default)
Metadata:
handler_name : SoundHandler

For the concat demuxer with -c copy, the files need to have the same stream properties. Your silence file has a different sampling rate (44100 Hz) from the first file (11025 Hz). Use
ffmpeg -f lavfi -i anullsrc -ar 11025 -ac 1 -t 4 silence.m4a
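One way to catch this kind of mismatch before concatenating is to compare the audio parameters of the inputs. The sketch below is only illustrative: it assumes the values have already been read (for example from the ffprobe readouts above) into plain dictionaries, and the name find_mismatches and the dictionary keys are hypothetical, not part of any ffmpeg tool.

```python
# Hypothetical helper: compare the audio parameters that should agree
# before a stream-copy concat. Values are copied from the readouts above.

KEYS = ("codec", "sample_rate", "channels")

def find_mismatches(first, second, keys=KEYS):
    """Return {key: (first_value, second_value)} for every differing key."""
    return {k: (first[k], second[k]) for k in keys if first[k] != second[k]}

speech = {"codec": "aac", "sample_rate": 11025, "channels": 1}
silence = {"codec": "aac", "sample_rate": 44100, "channels": 1}

print(find_mismatches(speech, silence))  # {'sample_rate': (11025, 44100)}
```

An empty result means the listed parameters agree; it does not guarantee that every property relevant to concatenation matches.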

Related

add black&silence to beginning of a video

Hi, I am struggling to add black and silence to the beginning of a video with ffmpeg. I searched a lot, but the solutions I found look too complex for me.
Below is the command I found to add black and silence to the end of a video. How can I adapt it to add them to the beginning instead?
ffmpeg -i input.mp4 -f lavfi -i color=s=1920x1080:d=10 -filter_complex [0:v][1]concat -af [0]apad -shortest output.mp4
It looks like I need to use adelay instead of apad. Below is the command that makes sense to me, but the audio is not delayed.
ffmpeg -i input.mp4 -f lavfi -i color=s=1920x1080:d=10 -filter_complex [1][0:v]concat -af [0]adelay=10 output.mp4
Here is the input info and ffmpeg version:
ffmpeg -i input.mp4
ffmpeg version 4.2.1-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:01:00.00, start: 0.000998, bitrate: 2526 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 2394 kb/s, 24 fps, 24 tbr, 16k tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 124 kb/s (default)
Metadata:
handler_name : SoundHandler
At least one output file must be specified
Thanks!
There are several methods to do this. The first method is simple and easy but re-encodes the main video. The other method is slightly more complicated but does not re-encode the main video, so the quality is preserved and it will be faster for long videos.
tpad & adelay filters
Using the tpad and adelay filters:
ffmpeg -i input.mp4 -filter_complex "[0:v]tpad=start_duration=2[v];[0:a]adelay=2s:all=true[a]" -map "[v]" -map "[a]" output.mp4
If your ffmpeg is older than version 4.2 then change adelay=2s:all=true to adelay=2000|2000.
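If you script this, the two adelay spellings above can be generated from one delay value. This is a minimal sketch, assuming you already know whether your build accepts the newer syntax; the function name adelay_filter and its parameters are made up for illustration.

```python
def adelay_filter(seconds, channels, has_all_option=True):
    """Build an adelay filter string for a delay given in seconds."""
    if has_all_option:
        # Newer syntax from the answer: one delay applied to all channels.
        return f"adelay={seconds}s:all=true"
    # Older syntax: one delay in milliseconds per channel, joined by '|'.
    ms = int(round(seconds * 1000))
    return "adelay=" + "|".join([str(ms)] * channels)

print(adelay_filter(2, 2))                        # adelay=2s:all=true
print(adelay_filter(2, 2, has_all_option=False))  # adelay=2000|2000
```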
color & anullsrc filters with concat demuxer
Make a 2-second black video with silent audio that matches the attributes of the input (frame size, frame rate, timescale, sample rate and channel layout), using the color and anullsrc filters:
ffmpeg -f lavfi -i color=size=1920x1080:rate=24:duration=2 -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 -video_track_timescale 16k -shortest black.mp4
Make join.txt containing:
file 'black.mp4'
file 'input.mp4'
Concatenate with the concat demuxer:
ffmpeg -f concat -i join.txt -c copy output.mp4
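The values in the black.mp4 command come straight from the input readout (1920x1080, 24 fps, 16k tbn, 44100 Hz stereo). As a rough sketch of how that mapping could be scripted, the helpers below build the same argument list and the join.txt text. The function names are hypothetical, and the list writer assumes the file names contain no single quotes.

```python
def black_clip_args(width, height, fps, sample_rate, channel_layout,
                    timescale, seconds, out="black.mp4"):
    """Return the ffmpeg argument list for a black, silent clip."""
    return [
        "ffmpeg",
        "-f", "lavfi", "-i",
        f"color=size={width}x{height}:rate={fps}:duration={seconds}",
        "-f", "lavfi", "-i",
        f"anullsrc=channel_layout={channel_layout}:sample_rate={sample_rate}",
        "-video_track_timescale", str(timescale),
        "-shortest", out,
    ]

def join_list(paths):
    """Return concat list text, one 'file' directive per path."""
    # Assumption: no path contains a single quote, so no escaping is done.
    return "".join(f"file '{p}'\n" for p in paths)

args = black_clip_args(1920, 1080, 24, 44100, "stereo", "16k", 2)
print(" ".join(args))
print(join_list(["black.mp4", "input.mp4"]), end="")
```

The argument list can be passed to subprocess.run as-is, which avoids shell quoting issues.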

After transcoding using ffmpeg, I found audio bitrate is not the value I expected

I used ffmpeg to transcode some files into a new format with certain parameters. After transcoding, I found that some of the output file's metadata is not what I expected: the value in the output is not the same as the one I set on the command line.
Before transcoding, I checked the media info of the input file:
ffmpeg -i dz2015082000010.mpg
ffmpeg version 3.2.4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 4.8.3 (GCC) 20140911 (Red Hat 4.8.3-9)
configuration: --enable-static --enable-memalign-hack --enable-libx264 --enable-gpl --enable-pthreads --enable-version3 --enable-avisynth --enable-bzlib --enable-iconv --enable-zlib --enable-nonfree --extra-cflags=-I/usr/local/include/ --extra-ldflags=-L/usr/local/lib --enable-debug=3 --disable-optimizations --enable-nonfree --enable-libmp3lame
libavutil 55. 34.101 / 55. 34.101
libavcodec 57. 64.101 / 57. 64.101
libavformat 57. 56.101 / 57. 56.101
libavdevice 57. 1.100 / 57. 1.100
libavfilter 6. 65.100 / 6. 65.100
libswscale 4. 2.100 / 4. 2.100
libswresample 2. 3.100 / 2. 3.100
libpostproc 54. 1.100 / 54. 1.100
Input #0, mpeg, from 'dz2015082000010.mpg':
Duration: 00:01:49.30, start: 0.685389, bitrate: 15723 kb/s
Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, top first), 1920x1080 [SAR 1:1 DAR 16:9], 15000 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 384 kb/s
At least one output file must be specified
Next, transcoding with the cmd line:
ffmpeg -i dz2015082000010.mpg -vcodec libx264 -b:v 4000k -s 1920x1080 -r 25 -g 25 -vprofile main -acodec aac -strict -2 -b:a 128k -ac 2 -ar 44100 -y output.ts
After transcoding, I check the media info of the output file:
ffmpeg -i output.ts
ffmpeg version 3.2.4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 4.8.3 (GCC) 20140911 (Red Hat 4.8.3-9)
configuration: --enable-static --enable-memalign-hack --enable-libx264 --enable-gpl --enable-pthreads --enable-version3 --enable-avisynth --enable-bzlib --enable-iconv --enable-zlib --enable-nonfree --extra-cflags=-I/usr/local/include/ --extra-ldflags=-L/usr/local/lib --enable-debug=3 --disable-optimizations --enable-nonfree --enable-libmp3lame
libavutil 55. 34.101 / 55. 34.101
libavcodec 57. 64.101 / 57. 64.101
libavformat 57. 56.101 / 57. 56.101
libavdevice 57. 1.100 / 57. 1.100
libavfilter 6. 65.100 / 6. 65.100
libswscale 4. 2.100 / 4. 2.100
libswresample 2. 3.100 / 2. 3.100
libpostproc 54. 1.100 / 54. 1.100
Input #0, mpegts, from 'full-2.ts':
Duration: 00:01:49.30, start: 1.456778, bitrate: 4455 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x101]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 44100 Hz, stereo, fltp, 4 kb/s
At least one output file must be specified
I don't know why the audio bitrate changed to 4 kb/s after transcoding, since I set it with -b:a 128k. Can anybody help me? BTW, the output file sounds all right.
The native AAC encoder won't waste bits on silent portions, and it doesn't do strict CBR. If you really need the output to be around the target bitrate, you can mix in a very low level of noise.

Sample accurate audio slicing in ffmpeg?

I need to slice an audio file in .wav format into 10 second chunks.
These chunks need to be exactly 10 seconds, not 10.04799988232 seconds.
The current command I am using is:
ffmpeg -i test.wav -ss 0 -to 10 -c:a libfdk_aac -b:a 80k aac/test.aac
ffmpeg version 3.2.2 Copyright (c) 2000-2016 the FFmpeg developers
built with Apple LLVM version 8.0.0 (clang-800.0.42.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/3.2.2 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-opencl --disable-lzma --enable-nonfree --enable-vda
libavutil 55. 34.100 / 55. 34.100
libavcodec 57. 64.101 / 57. 64.101
libavformat 57. 56.100 / 57. 56.100
libavdevice 57. 1.100 / 57. 1.100
libavfilter 6. 65.100 / 6. 65.100
libavresample 3. 1. 0 / 3. 1. 0
libswscale 4. 2.100 / 4. 2.100
libswresample 2. 3.100 / 2. 3.100
libpostproc 54. 1.100 / 54. 1.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from '/Users/chris/Repos/mithc/client/assets/audio/wav/test.wav':
Duration: 00:04:37.62, bitrate: 2307 kb/s
Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s
Output #0, adts, to '/Users/chris/Repos/mithc/client/assets/audio/aac/test.aac':
Metadata:
encoder : Lavf57.56.100
Stream #0:0: Audio: aac (libfdk_aac), 48000 Hz, stereo, s16, 80 kb/s
Metadata:
encoder : Lavc57.64.101 libfdk_aac
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s24le (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
size= 148kB time=00:00:15.01 bitrate= 80.6kbits/s speed=40.9x
video:0kB audio:148kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
This command does not produce exact slices. Any ideas how this can be accomplished?
Not possible*. AAC audio is stored in frames which decode to 1024 samples. So, for a 48000 Hz feed, each frame has a duration of 0.02133 seconds.
If you store the audio in a container like M4A, which records a duration for each packet, the duration of the last frame is adjusted to satisfy the specified -t or -ss/-to. But the last frame still contains the full 1024 samples. See the readout below of the last 3 frames of a silent stream specified to be 10 seconds in an M4A. Compare the packet sizes with the durations.
stream #0:
keyframe=1
duration=0.021
dts=9.941 pts=9.941
size=213
stream #0:
keyframe=1
duration=0.021
dts=9.963 pts=9.963
size=213
stream #0:
keyframe=1
duration=0.016
dts=9.984 pts=9.984
size=214
If this stream were originally stored in .aac, total duration would not be 10.00 seconds. Now whether M4A does the trick for you will depend on your player.
*There is a variant of AAC which decodes to 960 samples per frame. So, 48 kHz audio could be encoded to a stream exactly 10 seconds long. FFmpeg does not include such an AAC encoder. AFAIK, many apps, including iTunes, will not play such a file correctly. If you want to encode to this spec, there's an encoder available at https://github.com/Opendigitalradio/ODR-AudioEnc
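The frame arithmetic behind this can be checked with a few lines of Python. This is just a worked calculation using the numbers from the answer (48000 Hz, 1024 or 960 samples per frame, a 10 second target); the function name frames_for is made up for the example.

```python
from fractions import Fraction

def frames_for(duration_s, sample_rate, frame_size):
    """Exact number of frames needed to hold duration_s whole seconds."""
    return Fraction(duration_s * sample_rate, frame_size)

frame_1024 = Fraction(1024, 48000)      # seconds per 1024-sample frame
print(round(float(frame_1024), 5))      # 0.02133

print(frames_for(10, 48000, 1024))      # 1875/4, i.e. 468.75 frames
print(frames_for(10, 48000, 960))       # 500

# 468 whole frames end at 9.984 s, leaving 0.016 s for the last packet,
# which matches the dts and duration of the final frame in the readout.
print(float(468 * frame_1024))          # 9.984
print(float(10 - 468 * frame_1024))     # 0.016
```

With 1024-sample frames, 10 seconds needs a fractional number of frames, so the last frame always carries more samples than the slice calls for. With 960-sample frames the count comes out to a whole number.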

how to change the audio bitrate sent by local ip camera?

How can I change the audio bit rate of the output generated by openRTSP? I would like it to have the same bit rate as the camera sends.
./openRTSP "rtsp://user:pass@IP_CAMERA/....."
The bit rate sent by the camera is 64 kb/s, but when I inspect the audio output of openRTSP I get 352 kb/s.
ffmpeg version git-2014-07-16-aa1d096 Copyright (c) 2000-2014 the FFmpeg developers
built on Jul 16 2014 18:28:34 with gcc 4.6 (Ubuntu/Linaro 4.6.3-1ubuntu5)
configuration: --extra-cflags=-I/home/zied/junk/include --extra-ldflags=-L/usr/local/lib/ --enable-gpl --enable-libx264
libavutil 52. 92.100 / 52. 92.100
libavcodec 55. 69.100 / 55. 69.100
libavformat 55. 48.100 / 55. 48.100
libavdevice 55. 13.102 / 55. 13.102
libavfilter 4. 11.100 / 4. 11.100
libswscale 2. 6.100 / 2. 6.100
libswresample 0. 19.100 / 0. 19.100
libpostproc 52. 3.100 / 52. 3.100
[mulaw @ 0x9ac0360] Estimating duration from bitrate, this may be inaccurate
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, mulaw, from 'audio-PCMA-2.ul':
Duration: 00:00:48.46, bitrate: 352 kb/s
Stream #0:0: Audio: pcm_mulaw, 44100 Hz, 1 channels, s16, 352 kb/s
Best regards,
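openRTSP does not change the bitrate; it just saves the incoming samples to a file.
The 352 kb/s figure follows from the parameters in the readout above (44100 Hz, 8 bits per mu-law sample, 1 channel):
44100 * 8 / 1000 = 352.8 kbps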
If you want a lower bitrate, you need to see if your camera supports other audio formats.
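The same calculation as a small function, in case you want to try other parameters. This is only arithmetic for uncompressed or companded PCM, where every sample takes a fixed number of bits; the function name is hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate of fixed-size PCM samples in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000

print(pcm_bitrate_kbps(44100, 8))  # 352.8
print(pcm_bitrate_kbps(8000, 8))   # 64.0
```

For comparison, the second line shows that 8-bit mono audio at 8000 Hz works out to 64 kb/s, the figure quoted for the camera.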

FFMPEG command issue

I am having an issue with FFMPEG. To be exact I'm trying to generate a number of 'meaningful' thumbnails from a video file.
I have found this command on the internet:
ffmpeg -ss 3 -i input.mp4 -vf "select=gt(scene\,0.4)" -frames:v 5 -vsync vfr fps=fps=1/600 out%02d.jpg
Sadly it doesn't work for me, as I'm getting:
[NULL @ 0x86c2420] Unable to find a suitable output format for 'fps=fps=1/600'
fps=fps=1/600: Invalid argument
I have tried including "fps=fps=1/600" inside -vf, which resulted in only one picture being generated. What am I doing wrong?
EDIT:
This is an example of a full output:
$ ffmpeg -ss 3 -i video.ogg -vf "select=gt(scene\,0.4)" -frames:v 5 -vsync vfr fps=fps=1/600 out%02d.jpg
ffmpeg version 2.5.3 Copyright (c) 2000-2015 the FFmpeg developers
built on Jan 10 2015 23:26:13 with gcc 4.9.2 (GCC) 20141224 (prerelease)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-avresample --enable-fontconfig --enable-gnutls --enable-gpl --enable-libass --enable-libbluray --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libv4l2 --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-runtime-cpudetect --enable-shared --enable-swresample --enable-vdpau --enable-version3 --enable-x11grab
libavutil 54. 15.100 / 54. 15.100
libavcodec 56. 13.100 / 56. 13.100
libavformat 56. 15.102 / 56. 15.102
libavdevice 56. 3.100 / 56. 3.100
libavfilter 5. 2.103 / 5. 2.103
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 3.100 / 53. 3.100
[theora @ 0x9b59140] 7 bits left in packet 82
[ogg @ 0x9b586e0] Broken file, keyframe not correctly marked.
Last message repeated 2 times
Input #0, ogg, from 'video.ogg':
Duration: 00:09:56.46, start: 0.000000, bitrate: 2237 kb/s
Stream #0:0: Video: theora, yuv420p, 854x480, 24 tbr, 24 tbn, 24 tbc
Stream #0:1: Audio: vorbis, 48000 Hz, stereo, fltp, 192 kb/s
[NULL @ 0x9b97660] Unable to find a suitable output format for 'fps=fps=1/600'
fps=fps=1/600: Invalid argument
All I had to do was add -vf before "fps=fps=1/600".
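The error message makes sense once you see how the command line is split up: ffmpeg treats any argument that is not an option or an option's value as an output file name. The sketch below imitates that with a deliberately simplified model. The TAKES_VALUE set only covers the options used in this command and is an assumption for illustration, not ffmpeg's real option table.

```python
import shlex

# Options from the example that consume exactly one following value.
# Simplified for illustration; not ffmpeg's real option table.
TAKES_VALUE = {"-ss", "-i", "-vf", "-frames:v", "-vsync"}

def leftover_args(command):
    """Return tokens that are neither options nor option values."""
    tokens = shlex.split(command)[1:]  # drop the program name
    leftovers, i = [], 0
    while i < len(tokens):
        if tokens[i] in TAKES_VALUE:
            i += 2  # skip the option and its value
        else:
            leftovers.append(tokens[i])
            i += 1
    return leftovers

broken = ('ffmpeg -ss 3 -i input.mp4 -vf "select=gt(scene\\,0.4)" '
          '-frames:v 5 -vsync vfr fps=fps=1/600 out%02d.jpg')
fixed = broken.replace("vfr fps=", "vfr -vf fps=")

print(leftover_args(broken))  # ['fps=fps=1/600', 'out%02d.jpg']
print(leftover_args(fixed))   # ['out%02d.jpg']
```

In the broken command, fps=fps=1/600 is left over and so is taken as an output file, which is why ffmpeg cannot find an output format for it.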
