Merging multichannel audio tracks from Mumble with ffmpeg

We record talks through Mumble, and because Mumble has a nifty multichannel recording feature, I figured we could get subtitles from YouTube by uploading each track to YouTube separately with
for file in *; do ffmpeg -loop 1 -r 2 -i "$img" -i "$file" -vf scale=-1:380 -c:v libx264 -preset slow -tune stillimage -crf 18 -c:a copy -shortest -pix_fmt yuv420p -threads 0 "$file".mkv; done
I can then prepend a nickname for each speaker to the automatic captions (i.e. the subtitles from YouTube), e.g. with a sed shell script. Works like a charm.
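The nickname step could be sketched as follows. This is only an illustration: it assumes the captions were downloaded in SRT format, it uses awk rather than the sed script mentioned above, and the function name prefix_speaker is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: prefix every caption text line of an SRT stream
# (read from stdin) with a speaker nickname.
# Cue numbers, timing lines ("-->") and blank lines pass through unchanged.
# Limitation: a caption line consisting only of digits would be mistaken
# for a cue number and left unprefixed.
prefix_speaker() {
  awk -v nick="$1" '
    /^[0-9]+$/ || /-->/ || /^[ \t]*$/ { print; next }
    { print nick ": " $0 }
  '
}
```

A possible call, with hypothetical file names, would be prefix_speaker chrisaiki2 < chrisaiki2.srt > chrisaiki2.named.srt.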
But merging those tracks with ffmpeg gets tricky. I use
ffmpeg -i input1.ogg -i input2.ogg -i input3.ogg -i input4.ogg -i input5.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amerge=inputs=5[aout]" -map "[aout]" -ac 2 output.ogg
Somehow ffmpeg shortens the resulting audio track, and I don't yet know why. I tried putting the longest track first and last, since including silent tracks made the mixdown even shorter. Here are the warnings:
[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1
[Parsed_amerge_0 @ 0x7f8b29f02d20] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
But it says
[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1
even when I change the order of inputs.
Although, according to Mumble's documentation, the tracks should be of equal length, VLC's media info shows different track times. However, the tracks are not out of sync, just cut off at the end.
I also have no idea why ffmpeg mentions FLAC, all the files are vorbis.
ffmpeg -i Mumble-2017-09-09-16-33-18-149.210.187.155-chrisaiki2.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-Recorder.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-steempowerpics.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-Taconator.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-fuzzynewest.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amerge=inputs=5[aout]" -map "[aout]" -ac 2 output5.ogg
ffmpeg version 2.8.4 Copyright (c) 2000-2015 the FFmpeg developers
built with Apple LLVM version 7.0.2 (clang-700.1.81)
configuration: --prefix=/usr/local/Cellar/ffmpeg/2.8.4 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-opencl --enable-libx264 --enable-libmp3lame --enable-libvo-aacenc --enable-libxvid --enable-vda
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
Input #0, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-chrisaiki2.ogg':
Duration: 00:40:01.19, start: 0.000000, bitrate: 17 kb/s
Stream #0:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
Metadata:
ENCODER : libsndfile
TITLE : chrisaiki2
Input #1, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-Recorder.ogg':
Duration: 00:33:57.88, start: 0.000000, bitrate: 1 kb/s
Stream #1:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
Metadata:
ENCODER : libsndfile
TITLE : Recorder
Input #2, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-steempowerpics.ogg':
Duration: 00:33:53.93, start: 0.000000, bitrate: 1 kb/s
Stream #2:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
Metadata:
ENCODER : libsndfile
TITLE : steempowerpics
Input #3, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-Taconator.ogg':
Duration: 00:35:36.37, start: 0.000000, bitrate: 6 kb/s
Stream #3:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
Metadata:
ENCODER : libsndfile
TITLE : Taconator
Input #4, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-fuzzynewest.ogg':
Duration: 00:41:53.23, start: 0.000000, bitrate: 30 kb/s
Stream #4:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
Metadata:
ENCODER : libsndfile
TITLE : fuzzynewest
File 'output5.ogg' already exists. Overwrite ? [y/N] y
[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1
[Parsed_amerge_0 @ 0x7f8b29f02d20] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[flac @ 0x7f8b2b005600] encoding as 24 bits-per-sample
Output #0, ogg, to 'output5.ogg':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit), 128 kb/s (default)
Metadata:
encoder : Lavc56.60.100 flac
Stream mapping:
Stream #0:0 (vorbis) -> amerge:in0
Stream #1:0 (vorbis) -> amerge:in1
Stream #2:0 (vorbis) -> amerge:in2
Stream #3:0 (vorbis) -> amerge:in3
Stream #4:0 (vorbis) -> amerge:in4
amerge -> Stream #0:0 (flac)
Press [q] to stop, [?] for help
size= 100900kB time=00:33:53.94 bitrate= 406.4kbits/s
video:0kB audio:100441kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.457024%
Mumble multichannel talk on reddit

The amerge documentation states:
If inputs do not have the same duration, the output will stop with the
shortest.
amix may be a better filter for this case.
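The log in the question is consistent with this: the shortest input is 00:33:53.93 (steempowerpics) and the amerge output ends at time=00:33:53.94. A small sketch of that check, assuming durations are written as HH:MM:SS.ss the way ffmpeg prints them; the function name shortest_duration is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: given durations in HH:MM:SS.ss form (one per argument),
# print the shortest one. With amerge this is where the output would stop.
shortest_duration() {
  for d in "$@"; do
    printf '%s\n' "$d"
  done | awk -F: '
    {
      secs = $1 * 3600 + $2 * 60 + $3
      if (NR == 1 || secs < min) { min = secs; best = $0 }
    }
    END { print best }
  '
}

# Durations of the five inputs as reported in the log above
shortest_duration 00:40:01.19 00:33:57.88 00:33:53.93 00:35:36.37 00:41:53.23
```

For the five durations from the log this prints 00:33:53.93.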

I used amix in the end like this:
ffmpeg -i input1.ogg -i input2.ogg -i input3.ogg -i input4.ogg -i input5.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amix=inputs=5:duration=longest[aout]" -map "[aout]" -ac 2 -c:a libvorbis -b:a 128k output.ogg
ffmpeg didn't recognize libvorbis so I had to reinstall with brew first: brew reinstall ffmpeg --with-libvorbis
I then used
ffmpeg -loop 1 -r 2 -i "$img" -i "$snd" -vf scale=-1:380 -c:v libx264 -preset slow -tune stillimage -crf 18 -c:a copy -shortest -pix_fmt yuv420p -threads 0 output.mkv
to turn the mixed audio track into a video that I could upload to YouTube.
I had merged the subtitles which were generated with YouTube as well and I just added those to the resulting video. Works like a charm.
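For a different number of speakers, the filter graph string can be generated instead of typed by hand. A minimal sketch, assuming every input file contributes exactly one audio stream; amix_filter is a made-up helper name, and the ffmpeg invocation is shown only as a comment.

```shell
#!/bin/sh
# Hypothetical helper: build the -filter_complex argument for mixing N inputs
# with amix, keeping the duration of the longest input.
amix_filter() {
  n=$1
  labels=""
  i=0
  while [ "$i" -lt "$n" ]; do
    labels="${labels}[${i}:a]"   # one input pad label per file
    i=$((i + 1))
  done
  printf '%s amix=inputs=%s:duration=longest[aout]\n' "$labels" "$n"
}

amix_filter 5
# Possible use (not run here):
#   ffmpeg -i input1.ogg ... -i input5.ogg -filter_complex "$(amix_filter 5)" \
#     -map "[aout]" -ac 2 -c:a libvorbis -b:a 128k output.ogg
```

For five inputs this prints the same filter string as the command above.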

Related

Decoding AAC to PCM with ffmpeg results in noise

I have a .mp4 file generated with ffmpeg as follows.
ffmpeg -y -i video_extended.mp4 -itsoffset 00:00:04.00 -i output5-1.wav -map 0:0 -map 1:0 -c:v copy -c:a aac -ac 6 -ar 48000 -b:a 128k -async 1 mixed.mp4
Playing the mixed.mp4 file with ffplay works fine, with no impact on the sound quality. Below is the output I get from ffplay -i mixed.mp4:
> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'mixed_h264_aac_512k_async_qp0_all_I.mp4':
> Metadata:
> major_brand : isom
> minor_version : 512
> compatible_brands: isomiso2avc1mp41
> encoder : Lavf58.76.100
> Duration: 00:00:16.02, start: 0.000000, bitrate: 49136 kb/s
> Stream #0:0[0x1](und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv422p10le(progressive), 1920x1080, 65409 kb/s, 59.94 fps, 59.94 tbr, 11988 tbn (default)
> Metadata:
> handler_name : VideoHandler
> vendor_id : [0][0][0][0]
> Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 71 kb/s (default)
> Metadata:
> handler_name : SoundHandler
> vendor_id : [0][0][0][0]
> Switch subtitle stream from #-1 to #-1 vq= 1606KB sq= 0B f=0/0
Then, I decode the mixed.mp4 file back to raw PCM using the following command.
ffmpeg -i mixed.mp4 -vn -acodec pcm_s16le -f s16le -ar 48000 -ac 6 raw_audio.pcm
However, this raw_audio.pcm contains a lot of noise, and ffplay prints the following:
[s16le @ 0x7f7490000c80] Estimating duration from bitrate, this may be inaccurate
Input #0, s16le, from 'separated_audio_s16.pcm':
Duration: 00:00:16.02, bitrate: 4607 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, 6 channels, s16, 4608 kb/s
[pcm_s16le @ 0x7f749002b940] Multiple frames in a packet.
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 32 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
Switch subtitle stream from #-1 to #-1 vq= 0KB sq= 0B f=0/0
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Can someone please explain the issue here? Note that the ffplay command that works correctly for mixed.mp4 shows fltp as the audio format, whereas when playing the raw_audio.pcm file, it is seen as s16.
Is this a resampling issue in ffmpeg, and how can I rectify this?
I’m using ffmpeg and ffplay versions 5.0.1 in a Fedora 36 system.
Thank you.
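As a sanity check on the numbers in that log (not a diagnosis of the noise): 6 channels of 16-bit samples make 12 bytes per sample frame, which is the size the decoder says it expected, and at 48000 Hz that gives the 4608 kb/s shown for the stream. A small sketch of the arithmetic, with illustrative variable names:

```shell
#!/bin/sh
# Raw PCM parameters taken from the commands and log above
channels=6
bits_per_sample=16
sample_rate=48000

# Bytes in one sample frame (one sample for every channel)
block_align=$((channels * bits_per_sample / 8))
# Bitrate of the raw stream in kb/s
bitrate_kbps=$((sample_rate * block_align * 8 / 1000))

echo "bytes per sample frame: $block_align"
echo "bitrate: $bitrate_kbps kb/s"
```

A packet of 8 bytes is not a whole number of 12-byte sample frames, which appears to be what the "Invalid PCM packet" message is reporting.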

Add another audio over a file with mixed audio tracks in ffmpeg

I have a file that was formed by concatenating three different files: a.mp4, b.mp4 and c.mp4.
ffmpeg -f concat -i "concat-file.txt" -map 0:v -map 0:a -c:v libx264 -crf 23 -fflags +genpts joined-file.mp4
After that I run this command, mentioned here: How to add a new audio (not mixing) into a video using ffmpeg?
ffmpeg -i joined-file.mp4 -i audio.mp3 -filter_complex "[0:a][1:a]amerge=inputs=2[a]" -map 0:v -map "[a]" -c:v copy -ac 2 -shortest output.mp4
What is causing this issue?
Thanks. :)
UPDATE:
Here are the commands that I have been running:
ffmpeg -i "middle/b.mp4" -c:v copy -video_track_timescale 30k -c:a aac -ac 6 -ar 44100 -shortest "wrap/b.mp4"
ffmpeg -f concat -i "concat-file.txt" -map 0:v -map 0:a -c:v libx264 -crf 23 -fflags +genpts "joined/abc.mp4"
ffmpeg -i "joined/abc.mp4" -i audio.mp3 -filter_complex "[0:a:0][1:a:0]amerge=inputs=2[a]" -map 0:v -map "[a]" -c:v copy -ac 2 -shortest "final/abc-cmplt.mp4"
Here is the "concat-file.txt":
file 'bits/a.mp4'
file 'wrap/b.mp4'
file 'bits/c.mp4'
All the video files a.mp4, b.mp4 and c.mp4 have their original audio. After I run the commands above, the joint video abc-cmplt.mp4 has combined audio (audio.mp3 plus their own) for the first (a.mp4) and last parts (c.mp4). However, the middle part only has its own audio and the extra audio I am trying to add does not seem to merge with the audio of b.mp4 in the final joint file.
Output of ffmpeg -i bits/a.mp4 -i wrap/b.mp4 -i bits/c.mp4:
ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20200122
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bits/a.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2021-03-10T08:50:04.000000Z
Duration: 00:00:01.05, start: 0.000000, bitrate: 1846 kb/s
Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1080x1920 [SAR 1:1 DAR 9:16], 1462 kb/s, 30 fps, 30 tbr, 30k tbn, 60 tbc (default)
Metadata:
creation_time : 2021-03-10T08:50:04.000000Z
handler_name : ?Mainconcept Video Media Handler
encoder : AVC Coding
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
Metadata:
creation_time : 2021-03-10T08:50:04.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'wrap/b.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:00:27.93, start: 0.000000, bitrate: 234 kb/s
Stream #1:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1080x1920, 231 kb/s, 30 fps, 30 tbr, 30k tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from 'bits/c.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2021-03-10T08:42:52.000000Z
Duration: 00:00:01.05, start: 0.000000, bitrate: 1829 kb/s
Stream #2:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1080x1920 [SAR 1:1 DAR 9:16], 1320 kb/s, 30 fps, 30 tbr, 30k tbn, 60 tbc (default)
Metadata:
creation_time : 2021-03-10T08:42:52.000000Z
handler_name : ?Mainconcept Video Media Handler
encoder : AVC Coding
Stream #2:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
Metadata:
creation_time : 2021-03-10T08:42:52.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler

All files to be concatenated by the concat demuxer must have the same attributes. b.mp4 has a different H.264 profile and lacks audio. Fix that:
ffmpeg -i "middle/b.mp4" -f lavfi -i anullsrc=cl=stereo:r=48000 -c:v libx264 -profile:v main -video_track_timescale 30k -shortest "wrap/b.mp4"
Then concatenate and mix the audio:
ffmpeg -f concat -i "concat-file.txt" -i audio.mp3 -c:v libx264 -crf 23 -filter_complex "[0:a:0][1:a:0]amerge=inputs=2" -ac 2 "joined/abc.mp4"
Option                           Description
-f lavfi                         Tell ffmpeg the following input is a filter instead of a file.
-i anullsrc=cl=stereo:r=48000    Use the anullsrc filter to generate silent stereo audio with a 48000 Hz sample rate.
-profile:v main                  Set the H.264 profile to Main.
-video_track_timescale 30k       Set the timescale to 30k to match the other videos (30k tbn).
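The list file read by the concat demuxer can also be generated rather than written by hand. A minimal sketch, assuming the paths contain no single quotes (those would need escaping); concat_list is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print one "file '<path>'" line per argument,
# in the format the concat demuxer reads.
concat_list() {
  for f in "$@"; do
    printf "file '%s'\n" "$f"
  done
}

concat_list bits/a.mp4 wrap/b.mp4 bits/c.mp4
# To write the list: concat_list bits/a.mp4 wrap/b.mp4 bits/c.mp4 > concat-file.txt
```

The order of the arguments is the order in which the parts are joined, so the re-encoded wrap/b.mp4 goes in the middle.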

ffmpeg : Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

I'm experiencing troubles running ffmpeg on my synology. I'm trying to convert .avi video to mp4.
Here is the command :
ffmpeg -i vid20160623.avi -acodec libfaac -b:a 128k -vcodec mpeg4 -b:v 1200k -flags +aic+mv4 -f mp4 vid20160623.mp4
And the logs :
ffmpeg version 2.7.1 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 4.9.3 (crosstool-NG 1.20.0) 20150311 (prerelease)
configuration: --prefix=/usr --incdir='${prefix}/include/ffmpeg' --arch=arm --target-os=linux --cross-prefix=/usr/local/arm-unknown-linux-gnueabi/bin/arm-unknown-linux-gnueabi- --enable-cross-compile --enable-optimizations --enable-pic --enable-gpl --enable-shared --disable-static --enable-version3 --enable-nonfree --enable-libfaac --enable-encoders --enable-pthreads --disable-bzlib --disable-protocol=rtp --disable-muxer=image2 --disable-muxer=image2pipe --disable-swscale-alpha --disable-ffserver --disable-ffplay --disable-devices --disable-bzlib --disable-altivec --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libmp3lame --disable-vaapi --disable-decoder=amrnb --disable-encoder=zmbv --disable-encoder=dca --disable-encoder=ac3 --disable-encoder=ac3_fixed --disable-encoder=eac3 --disable-decoder=dca --disable-decoder=eac3 --disable-decoder=truehd --cc=/usr/local/arm-unknown-linux-gnueabi/bin/arm-unknown-linux-gnueabi-ccache-gcc
libavutil 54. 27.100 / 54. 27.100
libavcodec 56. 41.100 / 56. 41.100
libavformat 56. 36.100 / 56. 36.100
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 16.101 / 5. 16.101
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.100 / 1. 2.100
libpostproc 53. 3.100 / 53. 3.100
Input #0, avi, from 'vid20160623.avi':
Metadata:
encoder : MEncoder git-ab94fc6-4.4.3
Duration: 00:20:18.07, start: 0.000000, bitrate: 1197 kb/s
Stream #0:0: Video: mpeg4 (Advanced Simple Profile) (XVID / 0x44495658), yuv420p, 624x352 [SAR 1:1 DAR 39:22], 1056 kb/s, 25 fps, 23.98 tbr, 25 tbn, 23.98 tbc
Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 48000 Hz, stereo, s16p, 128 kb/s
Output #0, mp4, to 'vid20160623.mp4':
Metadata:
encoder : MEncoder git-ab94fc6-4.4.3
Stream #0:0: Video: mpeg4, none, q=2-31, 128 kb/s, SAR 351:352 DAR 0:0, 23.98 fps
Metadata:
encoder : Lavc56.41.100 mpeg4
Stream #0:1: Audio: aac, 0 channels, 128 kb/s
Metadata:
encoder : Lavc56.41.100 libfaac
Stream mapping:
Stream #0:0 -> #0:0 (mpeg4 (native) -> mpeg4 (native))
Stream #0:1 -> #0:1 (mp3 (native) -> aac (libfaac))
Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
I tried to reduce b:a and b:v but it did not work.
Any help will be welcome.

It looks like the encoder is not recognizing the resolution of the output stream. Also, the codec frame rate differs from the actual frame rate. Try:
ffmpeg -i vid20160623.avi -acodec libfaac -b:a 128k -vf "scale=624:352,setsar=1" -vcodec mpeg4 -r 25 -b:v 1200k -flags +aic+mv4 -f mp4 vid20160623.mp4

Maybe a late answer, but I looked for a while for the same reason...
The issue is not in your command line but in the FFmpeg version itself!
You asked for a 1200k video bitrate, but the output Stream #0:0 line shows it would try 128 kb/s!
I have just used the same command line with FFmpeg 2.8.11 and it works like a charm!

FFMPEG - Get the exact calculated audio filesize after encode

I'm trying to predict an audio (MP3) file size before encoding with ffmpeg; afterward, the encoded file needs to have exactly the calculated size.
Here is the formula I'm using to predict and calculate the file size (I hope I'm not wrong):
( (Bitrate x Duration) / 8 ) x 1000 = File size in bytes, with the bitrate in kbit/s and the duration in seconds.
I'm going to give a real example so that everyone can understand the use case.
Example :
Having an m4a file with the following data:
Name : Assuming xxx.m4a
Filesize : 8 304 014 bytes (8.3 MB)
Bitrate : 256k
Duration : 260 seconds
Expected filesize : ( (256 x 260) / 8 ) x 1000 = 8 320 000 bytes
Then I'm running the following ffmpeg command:
ffmpeg -i xxx.m4a -f mp3 -y -minrate 256k -maxrate 256k -bufsize 256k -b:a 256k -fs 8320000 output.mp3
Console output :
ffmpeg version 2.7.2 Copyright (c) 2000-2015 the FFmpeg developers
built with Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
configuration: --prefix=/usr/local/Cellar/ffmpeg/2.7.2_1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-opencl --enable-libx264 --enable-libmp3lame --enable-libvo-aacenc --enable-libxvid --enable-vda
libavutil 54. 27.100 / 54. 27.100
libavcodec 56. 41.100 / 56. 41.100
libavformat 56. 36.100 / 56. 36.100
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 16.101 / 5. 16.101
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.100 / 1. 2.100
libpostproc 53. 3.100 / 53. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'xxx.m4a':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
encoder : Lavf56.36.100
Duration: 00:04:20.53, start: 0.000000, bitrate: 254 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 253 kb/s (default)
Metadata:
handler_name : SoundHandler
Output #0, mp3, to 'output.mp3':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
TSSE : Lavf56.36.100
Stream #0:0(und): Audio: mp3 (libmp3lame), 44100 Hz, stereo, fltp, 256 kb/s (default)
Metadata:
handler_name : SoundHandler
encoder : Lavc56.41.100 libmp3lame
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
size= 8127kB time=00:04:20.02 bitrate= 256.1kbits/s
video:0kB audio:8127kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.011765%
Problem and questions:
Can you tell me why I am getting an output of 8 322 546 bytes and not 8 320 000 as expected?
Is there something wrong in my formula or in the ffmpeg command?
What solution can you suggest to get the exact predicted file size?
Thank you in advance.

Besides the muxing overhead inherent in the container, MP3 audio is stored in frames, and each frame holds a fixed 1152 samples. The encoder outputs only full frames, so for an output sampling rate of 44100 Hz, the closest to 260 seconds is
ceiling of (260 x 44100/1152) = 9954 frames = ~260.02285 seconds.
This alone throws your calculation off, even if the encoding assumptions were right.
Even then, the bit reservoir may come into play.
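The frame arithmetic above can be reproduced with a few lines of shell. This is a rough sketch only: the function names are invented for the example, the duration is taken as whole seconds, and the byte estimate covers the audio frames at a constant bitrate while ignoring tags, the encoder's info frame and the bit reservoir.

```shell
#!/bin/sh
# Samples per MP3 frame (MPEG-1 Layer III, which covers 44100 Hz)
SAMPLES_PER_FRAME=1152

# Number of whole frames needed to hold <seconds> of audio at <rate> Hz
mp3_frames() {
  echo $(( ($1 * $2 + SAMPLES_PER_FRAME - 1) / SAMPLES_PER_FRAME ))
}

# Duration actually covered by those frames, in seconds
mp3_padded_duration() {
  awk -v f="$(mp3_frames "$1" "$2")" -v spf="$SAMPLES_PER_FRAME" -v r="$2" \
    'BEGIN { printf "%.5f\n", f * spf / r }'
}

# Rough audio payload in bytes at a constant bitrate given in bits per second
mp3_payload_bytes() {
  awk -v f="$(mp3_frames "$1" "$2")" -v spf="$SAMPLES_PER_FRAME" -v r="$2" -v br="$3" \
    'BEGIN { printf "%d\n", f * spf * br / (8 * r) }'
}

mp3_frames 260 44100
mp3_padded_duration 260 44100
mp3_payload_bytes 260 44100 256000
```

For 260 seconds at 44100 Hz and 256 kbit/s this gives 9954 frames and roughly 8 320 731 bytes, which is already above the 8 320 000 bytes predicted by the formula in the question; the rest of the gap to the observed size is not modelled here.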
Edit:
You can drop the bitrate and add silent padding, but this too isn't precise as muxing overhead comes into play
ffmpeg -i xxx.m4a -f lavfi -t 5 -i anullsrc -lavfi "[0:a][1:a]concat=n=2:v=0:a=1" -f mp3 -y -minrate 224k -maxrate 224k -bufsize 224k -b:a 224k -fs N output.mp3
Here, the fs should be calculated as per MP3 + 5 seconds duration.

ffmpeg - when merging an image and audio, audio gets shortened

I am trying to merge a png image with 11 seconds of audio and create an mp4 file. When I execute ffmpeg I end up with a total duration of 10 seconds for the mp4 file. The command I'm using is...
ffmpeg -r 6 -loop 1 -i "image1.png" -i "audio1.wav" out.mp4
UPDATE: Here is the log that is produced...
FFmpeg version SVN-r15986, Copyright (c) 2000-2008 Fabrice Bellard, et al.
configuration: --extra-cflags=-fno-common --enable-memalign-hack --enable-pthreads --enable-libmp3lame --enable-libxvid --enable-libvorbis --enable-libtheora --enable-libspeex --enable-libfaac --enable-libgsm --enable-libx264 --enable-libschroedinger --enable-avisynth --enable-swscale --enable-gpl
libavutil 49.12. 0 / 49.12. 0
libavcodec 52. 6. 0 / 52. 6. 0
libavformat 52.23. 1 / 52.23. 1
libavdevice 52. 1. 0 / 52. 1. 0
libswscale 0. 6. 1 / 0. 6. 1
built on Dec 3 2008 01:59:37, gcc: 4.2.4
Input #0, image2, from 'image1.png':
Duration: 00:00:00.16, start: 0.000000, bitrate: N/A
Stream #0.0: Video: png, rgb32, 400x300, 6.00 tb(r)
Input #1, wav, from 'audio1.wav':
Duration: 00:00:11.07, bitrate: 88 kb/s
Stream #1.0: Audio: pcm_u8, 11025 Hz, mono, s16, 88 kb/s
File 'out.mp4' already exists. Overwrite ? [y/N] y
Output #0, mp4, to 'out.mp4':
Stream #0.0: Video: mpeg4, yuv420p, 400x300, q=2-31, 200 kb/s, 6.00 tb(c)
Stream #0.1: Audio: libfaac, 11025 Hz, mono, s16, 64 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Stream #1.0 -> #0.1
Press [q] to stop encoding
frame= 1 fps= 0 q=4.1 Lsize= 42kB time=0.17 bitrate=2063.7kbits/s
video:14kB audio:26kB global headers:0.kB muxing overhead 4.894235%
I have also tried using
ffmpeg -loop 1 -i "image1.png" -i "audio1.wav" -t 11 out.mp4
This command does create an mp4 of 11 seconds but the audio is still cut off at 10 seconds.
Why is the audio being cutoff at 10 seconds?
Thanks,
Gary

One possibility is that your audio file is just 10 seconds long.
Are you sure you are losing a whole second of the audio? Maybe it is just a few milliseconds, which could point to a rounding issue. You can check this by running
ffprobe "audio1.wav"
and then
ffprobe "out.mp4"
Explicitly setting the audio codec and bitrate (e.g. -c:a aac -b:a 64k; note that -b:a takes a bitrate, not copy) might help.
