Understanding the ffmpeg decoder format - audio

Using ffmpeg, I get this output (2 different examples)
Input #0, au, from 'c:\TestFiles\sample.au':
Duration: 00:02:02.09, start: 0.000000, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16be ([3][0][0][0] / 0x0003), 44100 Hz, stereo, s16, 1411 kb/s
---
Input #0, wav, from 'c:\TestFiles\sample.wav':
Metadata:
encoder : Lavf57.83.100
Duration: 00:02:02.09, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
I understand what the pcm_s16be part means, but what is the information that follows it in parentheses, e.g. ([3][0][0][0] / 0x0003)? I can't find any documentation that explains it.

[3][0][0][0] / 0x0003 <-- these are codec tags.
The hex value on the right is the raw tag value as stored (usually little-endian); the form on the left is its character representation, one byte at a time. A byte that is a printable character is shown as that character, e.g. mp4a; otherwise its decimal value is shown in square brackets.
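As a rough illustration of that mapping, here is a minimal C sketch that turns a 32-bit tag into the bracketed form. The function name codec_tag_to_string is made up for this example, and the exact set of bytes treated as printable is an assumption; it is not taken from the ffmpeg sources.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Render a 32-bit codec tag in the style of the ffmpeg stream line:
 * least-significant byte first, printable bytes as characters, any other
 * byte as a decimal number in square brackets.
 * Assumption: "printable" means letters, digits and the characters ". -_". */
void codec_tag_to_string(uint32_t tag, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return;
    out[0] = '\0';

    for (int i = 0; i < 4; i++) {
        unsigned c = tag & 0xFFu;
        int printable = (c >= '0' && c <= '9') ||
                        (c >= 'a' && c <= 'z') ||
                        (c >= 'A' && c <= 'Z') ||
                        (c != 0 && strchr(". -_", (int)c) != NULL);
        int n;

        if (printable)
            n = snprintf(out + used, out_size - used, "%c", (int)c);
        else
            n = snprintf(out + used, out_size - used, "[%u]", c);

        /* Stop if the buffer is too small; the result is then truncated. */
        if (n < 0 || (size_t)n >= out_size - used)
            break;

        used += (size_t)n;
        tag >>= 8; /* move on to the next byte */
    }
}
```

With a buffer of 32 bytes, the tag 0x0003 from the .au file comes out as [3][0][0][0], and a tag such as 0x6134706D comes out as mp4a, because its bytes read from the low end are 'm', 'p', '4', 'a'.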

Related

Extracting mp3 from mp4 via FFMPEG doesn't apply audio bitrate

I want to extract a small, mono, low-bitrate, low-sample-rate audio track from any mp4 file. I use the following FFmpeg command, but the resulting audio somehow has 2 channels and its bitrate is the same as that of the original audio in the mp4.
ffmpeg.exe -report -y -i "{mp4}" -vn -acodec libmp3lame -ac 1 -ab 64k -ar 24000 -f mp3 output "{mp3}"
Here is my report:
ffmpeg started on 2021-07-17 at 18:30:20
Report written to "ffmpeg-20210717-183020.log"
Log level: 48
Command line:
"C:\\ffmpeg\\bin\\ffmpeg.exe" -report -y -i "C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4" -vn -acodec libmp3lame -ac 1 -ab 64k -f mp3 output "C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3"
ffmpeg version 4.3.1-2021-01-01-full_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev5, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-l
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Splitting the commandline.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-i' ... matched as input url with argument 'C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4'.
Reading option '-vn' ... matched as option 'vn' (disable video) with argument '1'.
Reading option '-acodec' ... matched as option 'acodec' (force audio codec ('copy' to copy stream)) with argument 'libmp3lame'.
Reading option '-ac' ... matched as option 'ac' (set number of audio channels) with argument '1'.
Reading option '-ab' ... matched as option 'ab' (audio bitrate (please use -b:a)) with argument '64k'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'mp3'.
Reading option 'output' ... matched as output url.
Reading option 'C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4.
Successfully parsed a group of options.
Opening an input file: C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4.
[NULL # 000002e7d956ed40] Opening 'C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4' for reading
[file # 000002e7d956fd80] Setting default whitelist 'file,crypto,data'
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] ISO: File Type Major Brand: mp42
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] Setting codecpar->delay to 1 for stream st: 0
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] Before avformat_find_stream_info() pos: 139797 bytes read:65536 seeks:1 nb_streams:2
[h264 # 000002e7d9570c00] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 # 000002e7d9570c00] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 # 000002e7d9570c00] nal_unit_type: 5(IDR), nal_ref_idc: 3
[h264 # 000002e7d9570c00] Format yuv420p chosen by get_format().
[h264 # 000002e7d9570c00] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 # 000002e7d9570c00] no picture
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 2
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 2
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[h264 # 000002e7d9570c00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] All info found
[mov,mp4,m4a,3gp,3g2,mj2 # 000002e7d956ed40] After avformat_find_stream_info() pos: 158020 bytes read:65536 seeks:1 frames:11
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2020-11-24T06:56:49.000000Z
Duration: 00:00:56.45, start: 0.000000, bitrate: 893 kb/s
Stream #0:0(eng), 10, 1/30000: Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 749 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
creation_time : 2020-11-24T06:56:49.000000Z
handler_name : Mainconcept Video Media Handler
encoder : AVC Coding
Stream #0:1(eng), 1, 1/44100: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
Metadata:
creation_time : 2020-11-24T06:56:49.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
Successfully opened the file.
Parsing a group of options: output url output.
Applying option vn (disable video) with argument 1.
Applying option acodec (force audio codec ('copy' to copy stream)) with argument libmp3lame.
Applying option ac (set number of audio channels) with argument 1.
Applying option ab (audio bitrate (please use -b:a)) with argument 64k.
Applying option f (force format) with argument mp3.
Successfully parsed a group of options.
Opening an output file: output.
[file # 000002e7d958e340] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Parsing a group of options: output url C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3.
Successfully parsed a group of options.
Opening an output file: C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3.
[file # 000002e7da1bc480] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Stream mapping:
Stream #0:1 -> #0:0 (aac (native) -> mp3 (libmp3lame))
Stream #0:1 -> #1:0 (aac (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph_0_in_0_1 # 000002e7dac2b7c0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_1 # 000002e7dac2b7c0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_1 # 000002e7dac2b7c0] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_1 # 000002e7dac2b7c0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_1 # 000002e7dac2b7c0] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_0_0 # 000002e7dac2b8c0] Setting 'sample_fmts' to value 's32p|fltp|s16p'
[format_out_0_0 # 000002e7dac2b8c0] Setting 'sample_rates' to value '44100|48000|32000|22050|24000|16000|11025|12000|8000'
[format_out_0_0 # 000002e7dac2b8c0] Setting 'channel_layouts' to value '0x4'
[format_out_0_0 # 000002e7dac2b8c0] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph # 000002e7da1bdfc0] query_formats: 4 queried, 7 merged, 3 already done, 0 delayed
[auto_resampler_0 # 000002e7d958edc0] [SWR # 000002e7d95fd240] Using fltp internally between filters
[auto_resampler_0 # 000002e7d958edc0] [SWR # 000002e7d95fd240] Matrix coefficients:
[auto_resampler_0 # 000002e7d958edc0] [SWR # 000002e7d95fd240] FC: FL:0.707107 FR:0.707107
[auto_resampler_0 # 000002e7d958edc0] ch:2 chl:stereo fmt:fltp r:44100Hz -> ch:1 chl:mono fmt:fltp r:44100Hz
Output #0, mp3, to 'output':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
TSSE : Lavf58.45.100
Stream #0:0(eng), 0, 1/44100: Audio: mp3 (libmp3lame), 44100 Hz, mono, fltp, 64 kb/s (default)
Metadata:
creation_time : 2020-11-24T06:56:49.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
encoder : Lavc58.91.100 libmp3lame
[graph_1_in_0_1 # 000002e7d9a5bf00] Setting 'time_base' to value '1/44100'
[graph_1_in_0_1 # 000002e7d9a5bf00] Setting 'sample_rate' to value '44100'
[graph_1_in_0_1 # 000002e7d9a5bf00] Setting 'sample_fmt' to value 'fltp'
[graph_1_in_0_1 # 000002e7d9a5bf00] Setting 'channel_layout' to value '0x3'
[graph_1_in_0_1 # 000002e7d9a5bf00] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_1_0 # 000002e7da1b8580] Setting 'sample_fmts' to value 's32p|fltp|s16p'
[format_out_1_0 # 000002e7da1b8580] Setting 'sample_rates' to value '44100|48000|32000|22050|24000|16000|11025|12000|8000'
[format_out_1_0 # 000002e7da1b8580] Setting 'channel_layouts' to value '0x4|0x3'
[AVFilterGraph # 000002e7da1be5c0] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
Output #1, mp3, to 'C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
TSSE : Lavf58.45.100
Stream #1:0(eng), 0, 1/44100: Audio: mp3 (libmp3lame), 44100 Hz, stereo, fltp (default)
Metadata:
creation_time : 2020-11-24T06:56:49.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
encoder : Lavc58.91.100 libmp3lame
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
size= 129kB time=00:00:16.40 bitrate= 64.2kbits/s speed=32.8x
size= 254kB time=00:00:32.49 bitrate= 64.1kbits/s speed=32.5x
size= 256kB time=00:00:47.83 bitrate= 43.8kbits/s speed=31.8x
[out_0_0 # 000002e7d95b3d40] EOF on sink link out_0_0:default.
[out_1_0 # 000002e7d9bae780] EOF on sink link out_1_0:default.
No more output streams to write to, finishing.
[libmp3lame # 000002e7d958dbc0] Trying to remove 175 more samples than there are in the queue
[libmp3lame # 000002e7da1bb940] Trying to remove 175 more samples than there are in the queue
size= 442kB time=00:00:56.45 bitrate= 64.1kbits/s speed= 33x
video:0kB audio:1324kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Video1.mp4):
Input stream #0:0 (video): 10 packets read (17858 bytes);
Input stream #0:1 (audio): 2431 packets read (886147 bytes); 2431 frames decoded (2489344 samples);
Total: 2441 packets (904005 bytes) demuxed
Output file #0 (output):
Output stream #0:0 (audio): 2161 frames encoded (2489344 samples); 2162 packets muxed (451813 bytes);
Total: 2162 packets (451813 bytes) muxed
Output file #1 (C:/Users/ScottRobertson/Desktop/VirtualExhibition/Final Booth Data/1-Abidi (gold)/Video/Audio.mp3):
Output stream #1:0 (audio): 2161 frames encoded (2489344 samples); 2162 packets muxed (903627 bytes);
Total: 2162 packets (903627 bytes) muxed
2431 frames successfully decoded, 0 decoding errors
[AVIOContext # 000002e7d958e400] Statistics: 2 seeks, 3 writeouts
[AVIOContext # 000002e7da1bc580] Statistics: 2 seeks, 5 writeouts
[AVIOContext # 000002e7d9578040] Statistics: 2725576 bytes read, 65 seeks
How can I force the desired audio bitrate?
Also, why are there two outputs, called Output file #0 and Output file #1, and why doesn't #0 have a path?
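Remove the word output from your command.
It is not an option name; ffmpeg treats it as the name of an output target, since it does not immediately follow a token of the form -option.
Output options apply only to the next output file on the command line. That is why the file named output (Output #0) was encoded as mono at 64 kb/s, while Audio.mp3 (Output #1) received no options and was encoded with the defaults, i.e. stereo.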
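The effect can be modelled with a toy tokenizer. This is only a sketch of the rule described in the answer, not ffmpeg's actual parser: the function names are invented, and the list of options that take no value is limited to the three that appear in the question's command.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Options from the question's command that take no value.
 * Illustrative list only; a real parser knows many more. */
static const char *const FLAG_OPTIONS[] = { "-report", "-y", "-vn" };

static bool is_flag_option(const char *tok)
{
    for (size_t i = 0; i < sizeof FLAG_OPTIONS / sizeof FLAG_OPTIONS[0]; i++)
        if (strcmp(tok, FLAG_OPTIONS[i]) == 0)
            return true;
    return false;
}

/* Count the tokens that a parser of this shape would treat as output names:
 * every token that is neither an option nor the value of an option. */
size_t count_output_names(const char *const *argv, size_t argc)
{
    size_t outputs = 0;

    for (size_t i = 0; i < argc; i++) {
        const char *tok = argv[i];

        if (tok[0] == '-' && tok[1] != '\0') {
            /* Value-taking option (this includes -i <input>):
             * skip the token that follows it. */
            if (!is_flag_option(tok) && i + 1 < argc)
                i++;
            continue;
        }
        outputs++; /* bare token: taken as an output name */
    }
    return outputs;
}
```

Fed the tokens of the question's command, this model counts two output names (output and the mp3 path); with the stray word removed it counts one.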

ALSA: snd_pcm_writei returns EAGAIN

I'm struggling to understand what I'm doing wrong in my audio playback routine.
I've a thread that takes buffers from other threads and plays them the same way this alsa example program does: https://www.alsa-project.org/alsa-doc/alsa-lib/_2test_2pcm_8c-example.html
I'm referring to the write_loop() function. This is the device configuration of the pcm.c example program (output of snd_pcm_dump()):
ALSA <-> PulseAudio PCM I/O Plugin
Its setup is:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 1
rate : 44100
exact rate : 44100 (44100/1)
msbits : 16
buffer_size : 22050
period_size : 4410
period_time : 100000
tstamp_mode : NONE
tstamp_type : GETTIMEOFDAY
period_step : 1
avail_min : 4410
period_event : 0
start_threshold : 22050
stop_threshold : 22050
silence_threshold: 0
silence_size : 0
boundary : 6206523236469964800
What I see after placing some printf() calls around snd_pcm_writei() is that it gets executed 5 times in a row, and on every subsequent loop iteration snd_pcm_writei() takes 100 ms to complete. This is exactly what I was expecting to see.
This is the device setup of my program:
ALSA <-> PulseAudio PCM I/O Plugin
Its setup is:
stream : PLAYBACK
access : RW_INTERLEAVED
format : FLOAT_LE
subformat : STD
channels : 1
rate : 44100
exact rate : 44100 (44100/1)
msbits : 32
buffer_size : 13230
period_size : 4410
period_time : 100000
tstamp_mode : NONE
tstamp_type : GETTIMEOFDAY
period_step : 1
avail_min : 4410
period_event : 0
start_threshold : 4410
stop_threshold : 13230
silence_threshold: 0
silence_size : 0
boundary : 7447827883763957760
What happens is that snd_pcm_writei() runs 5 times (which is fine), but after that, on every new loop iteration, it returns immediately with -EAGAIN. If I keep retrying for 100 ms (at 100% CPU usage), the buffer eventually gets played and snd_pcm_writei() returns a positive number; for the next audio buffer I immediately get -EAGAIN again for 100 ms, and so on. The audio playback itself, however, is fine.
What I don't understand is why it doesn't wait 100 ms to play the new buffer instead of immediately returning -EAGAIN (I cannot find anything in the ALSA docs about snd_pcm_writei() returning -EAGAIN).
Thanks in advance for any help!
A PCM device can be in blocking mode (calls wait until they can complete) or in non-blocking mode (calls return -EAGAIN instead of waiting).
This mode can be set with the SND_PCM_NONBLOCK flag when calling snd_pcm_open(), or afterwards with snd_pcm_nonblock().
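If the device has to stay in non-blocking mode, the busy retry described in the question can be replaced by a wait between attempts. The sketch below shows that loop shape in plain C with callbacks, so it compiles without ALSA. All names in it (write_all, fake_pcm and so on) are hypothetical; in a real program the write callback would wrap snd_pcm_writei() and the wait callback would wrap snd_pcm_wait().

```c
#include <errno.h>
#include <stddef.h>

/* Callback types standing in for the device calls (hypothetical names). */
typedef long (*write_frames_fn)(void *dev, const float *buf, unsigned long frames);
typedef int (*wait_ready_fn)(void *dev, int timeout_ms);

/* Write all frames. On -EAGAIN, sleep in wait_ready() instead of spinning.
 * Returns the number of frames written, or a negative error code. */
long write_all(void *dev, write_frames_fn write_frames, wait_ready_fn wait_ready,
               const float *buf, unsigned long frames, unsigned channels)
{
    unsigned long done = 0;

    while (done < frames) {
        long n = write_frames(dev, buf + done * channels, frames - done);

        if (n == -EAGAIN) {
            int rc = wait_ready(dev, 1000); /* up to 1 s; a timeout simply retries */
            if (rc < 0)
                return rc;
            continue;
        }
        if (n < 0)
            return n; /* other errors (e.g. underrun) are left to the caller */
        done += (unsigned long)n;
    }
    return (long)done;
}

/* Simulated device, for demonstration only: it accepts frames while there is
 * room and frees one period of room each time the caller waits. */
struct fake_pcm {
    unsigned long free_frames;
    unsigned long period;
    unsigned eagain_count;
    unsigned wait_count;
};

long fake_write(void *dev, const float *buf, unsigned long frames)
{
    struct fake_pcm *p = dev;
    unsigned long n;

    (void)buf;
    if (p->free_frames == 0) {
        p->eagain_count++;
        return -EAGAIN; /* no room: behave like a non-blocking PCM */
    }
    n = frames < p->free_frames ? frames : p->free_frames;
    p->free_frames -= n;
    return (long)n;
}

int fake_wait(void *dev, int timeout_ms)
{
    struct fake_pcm *p = dev;

    (void)timeout_ms;
    p->free_frames += p->period; /* pretend one period has been played */
    p->wait_count++;
    return 1;
}
```

The simpler alternative is to switch the handle to blocking mode with snd_pcm_nonblock(), in which case snd_pcm_writei() itself does the waiting and no such loop is needed.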

How to convert MP3 to AMR using ffmpeg in windows commandline

I am using a Windows build of ffmpeg to convert MP3 to AMR.
For some reason it fails with the error given below.
I don't know how to give the correct parameters for AMR.
C:\Program Files (x86)\AMR to MP3 Converter>ffmpeg -i mfile.mp3 -ar 8000 -ab 12.2k audio.amr
FFmpeg version SVN-r26400, Copyright (c) 2000-2011 the FFmpeg developers
built on Jan 18 2011 04:07:05 with gcc 4.4.2
configuration: --enable-gpl --enable-version3 --enable-libgsm --enable-libvorbis --enable-libtheora --enable-libspeex --enable-libmp3lame --enable-libopenjpeg --enable-libschroedinger --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-libvpx --disable-decoder=libvpx --arch=x86 --enable-runtime-cpudetect --enable-libxvid --enable-libx264 --enable-librtmp --extra-libs='-lrtmp -lpolarssl -lws2_32 -lwinmm' --target-os=mingw32 --enable-avisynth --enable-w32threads --cross-prefix=i686-mingw32- --cc='ccache i686-mingw32-gcc' --enable-memalign-hack
libavutil 50.36. 0 / 50.36. 0
libavcore 0.16. 1 / 0.16. 1
libavcodec 52.108. 0 / 52.108. 0
libavformat 52.93. 0 / 52.93. 0
libavdevice 52. 2. 3 / 52. 2. 3
libavfilter 1.74. 0 / 1.74. 0
libswscale 0.12. 0 / 0.12. 0
[mp3 # 003abeb0] max_analyze_duration reached
[mp3 # 003abeb0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'mfile.mp3':
Metadata:
title : SPL_TRACK
artist : Krishnan
album : Lord
genre : Lord
track : 5
Duration: 00:30:06.00, start: 0.000000, bitrate: 127 kb/s
Stream #0.0: Audio: mp3, 44100 Hz, 2 channels, s16, 128 kb/s
[libopencore_amrnb # 017e0d20] Only mono supported
Output #0, amr, to 'audio.amr':
Stream #0.0: Audio: libopencore_amrnb, 8000 Hz, 2 channels, s16, 12 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Error while opening encoder for output stream #0.0 - maybe incorrect parameters
such as bit_rate, rate, width or height
As the error message says, "Only mono supported", so add -ac 1.
In any case, your ffmpeg is prehistoric. Get a new binary from https://ffmpeg.zeranoe.com/builds/
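For reference, narrowband AMR is a mono codec sampled at 8000 Hz with eight fixed bitrate modes, so the parameters can be sanity-checked before calling the encoder. The following is a small C sketch of such a check; the function name amr_nb_params_ok is made up for this example, and whether a particular ffmpeg build rejects or rounds other bitrates is not something the log above shows.

```c
#include <stdbool.h>
#include <stddef.h>

/* The eight AMR-NB modes, in bits per second. */
static const int AMR_NB_BITRATES[] = {
    4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200
};

/* Return true if the parameters fit narrowband AMR:
 * one channel, 8000 Hz, and one of the eight mode bitrates. */
bool amr_nb_params_ok(int channels, int sample_rate, int bit_rate)
{
    if (channels != 1)
        return false; /* this is the "Only mono supported" case */
    if (sample_rate != 8000)
        return false;

    for (size_t i = 0; i < sizeof AMR_NB_BITRATES / sizeof AMR_NB_BITRATES[0]; i++)
        if (AMR_NB_BITRATES[i] == bit_rate)
            return true;
    return false;
}
```

The command in the question corresponds to amr_nb_params_ok(2, 8000, 12200), which fails on the channel count; with -ac 1 added it becomes amr_nb_params_ok(1, 8000, 12200), which passes.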

Pulseaudio/alsa : slow playback device wake-up

I have a Debian machine (3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux) with several audio devices connected. The stereo devices work well, but I have a problem with a mono headset. When I type the command
aplay -v -D plughw:2,0 ~/piano2.wav
the device waits up to 3-4 seconds before starting to output sound. If I retype the command in the following 5 seconds, the sound is played directly, but if I wait a bit more, I have to wait 3-4 seconds again before hearing anything.
Here is the output when I run the command above :
Playing WAVE '/home/console/piano2.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Plug PCM: Route conversion PCM (sformat=S16_LE)
Transformation table:
0 <- 0*0.5 + 1*0.5
Its setup is:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 2
rate : 48000
exact rate : 48000 (48000/1)
msbits : 16
buffer_size : 24000
period_size : 6000
period_time : 125000
tstamp_mode : NONE
period_step : 1
avail_min : 6000
period_event : 0
start_threshold : 24000
stop_threshold : 24000
silence_threshold: 0
silence_size : 0
boundary : 6755399441055744000
Slave: Rate conversion PCM (16000, sformat=S16_LE)
Converter: libspeex (builtin)
Protocol version: 10002
Its setup is:
stream : PLAYBACK
access : MMAP_INTERLEAVED
format : S16_LE
subformat : STD
channels : 1
rate : 48000
exact rate : 48000 (48000/1)
msbits : 16
buffer_size : 24000
period_size : 6000
period_time : 125000
tstamp_mode : NONE
period_step : 1
avail_min : 6000
period_event : 0
start_threshold : 24000
stop_threshold : 24000
silence_threshold: 0
silence_size : 0
boundary : 6755399441055744000
Slave: Hardware PCM card 2 'Jabra PRO 9460' device 0 subdevice 0
Its setup is:
stream : PLAYBACK
access : MMAP_INTERLEAVED
format : S16_LE
subformat : STD
channels : 1
rate : 16000
exact rate : 16000 (16000/1)
msbits : 16
buffer_size : 8000
period_size : 2000
period_time : 125000
tstamp_mode : NONE
period_step : 1
avail_min : 2000
period_event : 0
start_threshold : 8000
stop_threshold : 8000
silence_threshold: 0
silence_size : 0
boundary : 9007199254740992000
appl_ptr : 0
hw_ptr : 0
And this is my .asoundrc file :
pcm.!default {
    type plug
    slave {
        pcm "hw:0,0"
    }
}
pcm.device1 {
    type plug
    slave {
        pcm "hw:1,0"
    }
}
pcm.device2 {
    type plug
    slave {
        pcm "hw:2,0"
    }
}
pcm.device3 {
    type plug
    slave {
        pcm "hw:3,0"
    }
}
ctl.!default {
    type hw
    card 0
}
ctl.device1 {
    type hw
    card 1
}
ctl.device2 {
    type hw
    card 2
}
ctl.device3 {
    type hw
    card 3
}
Does anybody have an idea why I get such a delay when my device wakes up?
Thanks

muxing into MPEG-TS: wrong parameters for audio stream

I am trying to mux video (H.264) and audio (PCM_S16LE, no compression) into an MPEG transport stream using ffmpeg. The video shows fine. The audio stream, however, does not play. The audio stream, as shown by ffprobe, is AAC, which is obviously not my intention. So I must be doing something wrong when adding the audio stream. Any idea how I can correct this?
This is my code for adding an audio stream:
void add_audio_stream()
{
    CodecID codec_id = CODEC_ID_PCM_S16LE;
    AVStream *p_ast = av_new_stream(fc, 1);
    if (!p_ast) {
        fprintf(stderr, "Could not alloc audio stream\n");
        exit(1);
    }
    ai = p_ast->index;
    AVCodecContext *pcc = p_ast->codec;
    avcodec_get_context_defaults2( pcc, AVMEDIA_TYPE_AUDIO );
    pcc->codec_type = AVMEDIA_TYPE_AUDIO;
    pcc->codec_id = codec_id;
    pcc->sample_fmt = AV_SAMPLE_FMT_S16;
    //pcc->bit_rate = 44100*16*2;
    pcc->bit_rate = 0;
    pcc->sample_rate = 44100;
    pcc->channels = 2;
    pcc->time_base = (AVRational){1, 44100};
    // some formats want stream headers to be separate
    if (fc->oformat->flags & AVFMT_GLOBALHEADER)
    {
        printf(" **** 1 ****\n");
        pcc->flags |= CODEC_FLAG_GLOBAL_HEADER;
    }
    else
        printf(" **** 2 ****\n");
    AVCodec *codec;
    /* find the audio encoder */
    codec = avcodec_find_encoder(pcc->codec_id);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }
    /* open it */
    if (avcodec_open(pcc, codec) < 0)
    {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }
}
Here is the output of ffprobe:
ffprobe version N-32405-g6337de9, Copyright (c) 2007-2011 the FFmpeg developers
built on Sep 8 2011 11:20:12 with gcc 4.4.3
configuration: --enable-gpl --enable-version3 --enable-nonfree --enable-postproc --enable-libfaac --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libxvid --enable-x11grab --enable-libmp3lame
libavutil 51. 16. 0 / 51. 16. 0
libavcodec 53. 13. 0 / 53. 13. 0
libavformat 53. 12. 0 / 53. 12. 0
libavdevice 53. 3. 0 / 53. 3. 0
libavfilter 2. 39. 0 / 2. 39. 0
libswscale 2. 1. 0 / 2. 1. 0
libpostproc 51. 2. 0 / 51. 2. 0
[mpegts # 0xa96daa0] Continuity Check Failed
[mpegts # 0xa96daa0] Continuity Check Failed
[aac # 0xa974da0] channel element 0.1 is not allocated
[aac # 0xa974da0] More than one AAC RDB per ADTS frame is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
.
.
lot of gobbly-gook about missing AAC parameters . . .
.
.
[aac # 0xa974da0] More than one AAC RDB per ADTS frame is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[aac # 0xa974da0] Error decoding AAC frame header.
[mpegts # 0xa96daa0] max_analyze_duration 5000000 reached at 5429789
[mpegts # 0xa96daa0] Continuity Check Failed
[mpegts # 0xa96daa0] Continuity Check Failed
Input #0, mpegts, from 'test_audio_video.mts':
Duration: 00:00:40.35, start: 0.010000, bitrate: 1907 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0.0[0x100]: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p, 640x480, 30 fps, 30 tbr, 90k tbn, 60 tbc
Stream #0.1[0x101]: Audio: aac ([6][0][0][0] / 0x0006), 96000 Hz, 4.0, s16, 9 kb/s
I doubt that MPEG-2 TS allows PCM audio. It can carry MP1, MP2 or AAC. AAC is probably being picked as a default choice rather than by identification.
Also, unlike video, audio headers are not very discriminative, i.e. they don't have start codes and the like, so other than the PES header there is usually no way to find out what type of audio it is.
Encode the audio if possible.
Try the GSpot application to cross-check.
It looks like MPEG-TS supports AES3-formatted PCM audio as private data, as specified by SMPTE 302M.
There is currently an s302m encoder/decoder in ffmpeg that will allow you to easily accomplish your goal.
Some time ago I played with it too, and what I found out is that:
- Blu-ray only supports a 48000 Hz sample rate.
- I always use big-endian, not little-endian.
I think ffmpeg will use Blu-ray settings for mpeg2_ts.
