More precision from ffmpeg silencedetect - audio

I am trying to split a very large (70-hour) MP3 file into smaller files. My first step is to get the timestamps using the silencedetect filter in ffmpeg. It works fine for the first few timestamps, but unfortunately the results are rounded to six significant digits.
The code I am executing is:
ffmpeg -i input.mp3 -af silencedetect=d=3 -hide_banner -nostats -f null -
My results are:
Input #0, mp3, from 'input.mp3':
Duration: 70:46:05.32, start: 0.050113, bitrate: 64 kb/s
Stream #0:0: Audio: mp3, 22050 Hz, stereo, fltp, 64 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf58.29.100
Stream #0:0: Audio: pcm_s16le, 22050 Hz, stereo, s16, 705 kb/s
Metadata:
encoder : Lavc58.54.100 pcm_s16le
[silencedetect # 0x5590d08bd700] silence_start: 10.6895
[silencedetect # 0x5590d08bd700] silence_end: 15.0054 | silence_duration: 4.31587
[silencedetect # 0x5590d08bd700] silence_start: 446.958
[silencedetect # 0x5590d08bd700] silence_end: 450.959 | silence_duration: 4.00168
[silencedetect # 0x5590d08bd700] silence_start: 1168.17
[silencedetect # 0x5590d08bd700] silence_end: 1172.17 | silence_duration: 4.00694
[silencedetect # 0x5590d08bd700] silence_start: 1880.8
[silencedetect # 0x5590d08bd700] silence_end: 1884.8 | silence_duration: 3.99265
...
[silencedetect # 0x5590d08bd700] silence_start: 123108
[silencedetect # 0x5590d08bd700] silence_end: 123111 | silence_duration: 3.61946
[silencedetect # 0x5590d08bd700] silence_start: 123286
[silencedetect # 0x5590d08bd700] silence_end: 123290 | silence_duration: 4.01646
[silencedetect # 0x5590d08bd700] silence_start: 124229
[silencedetect # 0x5590d08bd700] silence_end: 124233 | silence_duration: 4.01846
[silencedetect # 0x5590d08bd700] silence_start: 124442
[silencedetect # 0x5590d08bd700] silence_end: 124446 | silence_duration: 4.0298
...
Rounding to the nearest second is not sufficient for my purposes. Ideally, I would like each timestamp to be accurate to the hundredth of a second or something similar. Does anybody know a way to achieve this?

Append ametadata=print:file=- to the filterchain and parse stdout in your program. For each frame it prints the frame number, the pts, and the time in seconds. Grab the time_base from ffprobe and multiply it by the pts to compute an accurate time.
If you're using Python, you can try the following with my ffmpegio package:
from ffmpegio import analyze as ffa, probe as ffp
from pprint import pprint
input = "BigBuckBunny.mp4"
tb = next(info for info in ffp.streams_basic(input)
if info["codec_type"] == "audio")["time_base"]
print(f'time_base = {tb} s')
# analyze the first 5 minutes and return the silent intervals found there
(logger,) = ffa.run(input, ffa.SilenceDetect(d=1), time_units="pts", to=60 * 5)
pprint([(pts0 * tb, pts1 * tb) for pts0, pts1 in logger.output.interval])
This returns the silent intervals as fractions:
time_base = 1/44100 s
[(Fraction(947456, 11025), Fraction(958976, 11025)),
(Fraction(976384, 11025), Fraction(39680, 441)),
(Fraction(1018624, 11025), Fraction(146176, 1575))]
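For readers who would rather not add a package, the ametadata approach described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not a tested recipe: the helper name parse_ametadata is made up here, the sample text only imitates the "frame:… pts:… pts_time:…" header lines followed by key=value lines that ametadata=print is assumed to write, and the 1/22050 time base is an assumed value that has to match the time base the pts values are actually expressed in. The time recovered this way is that of the frame carrying the metadata, so it is only as fine as one frame.

```python
import re
from fractions import Fraction

# Assumed invocation (taken from the advice above, not verified here):
#   ffmpeg -i input.mp3 -af silencedetect=d=3,ametadata=print:file=- -f null -
# Assumed header layout, e.g. "frame:2356364 pts:2714531328 pts_time:123108"
HEADER = re.compile(r"frame:\s*(\d+)\s+pts:\s*(-?\d+)")
WANTED = {"lavfi.silence_start", "lavfi.silence_end"}


def parse_ametadata(lines, time_base):
    """Return (key, exact_seconds) for each silence_start/silence_end entry.

    exact_seconds is the pts of the frame carrying the entry times time_base.
    """
    events = []
    pts = None
    for line in lines:
        line = line.strip()
        match = HEADER.match(line)
        if match:
            pts = int(match.group(2))  # remember the pts of the current frame
            continue
        key, sep, _value = line.partition("=")
        if sep and pts is not None and key in WANTED:
            events.append((key, pts * time_base))
    return events


# Made-up sample text, for illustration only.
sample = """\
frame:2356364 pts:2714531328 pts_time:123108
lavfi.silence_start=123108
frame:2356433 pts:2714610816 pts_time:123112
lavfi.silence_end=123112
lavfi.silence_duration=3.6
"""

events = parse_ametadata(sample.splitlines(), Fraction(1, 22050))
for key, seconds in events:
    print(key, f"{float(seconds):.3f}")
```

Keeping the result as a Fraction, as the ffmpegio example does, avoids introducing any rounding until the value is finally printed.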

Unfortunately, this is hardcoded in FFmpeg:
static inline char *av_ts_make_time_string(char *buf, int64_t ts, AVRational *tb)
{
    if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
    else snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6g", av_q2d(*tb) * ts);
    return buf;
}
The relevant part is the %.6g conversion, which limits the output to six significant digits.
You'll have to submit a patch to get it changed.
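The effect of %.6g is easy to reproduce outside FFmpeg, because Python's % operator follows the same printf rules. The sketch below is only an illustration: the function name is made up and the input values are invented to resemble the timestamps in the question.

```python
def ffmpeg_time_string(seconds):
    """Mimic the "%.6g" conversion shown above: six significant digits."""
    return "%.6g" % seconds


for t in (10.68954, 1880.8, 123108.4567):
    # Small values keep their decimals; six-digit values lose them entirely.
    print(t, "->", ffmpeg_time_string(t))
```

Once the integer part needs six digits (about 27.8 hours in), no digits are left for the fraction, which matches the output in the question.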

Related

ffmpeg says "No JPEG data found in image" when reading image paths from Linux pipe

I'm trying to convert a set of pictures into a video, and I want to read the file paths of the pictures from the pipe. The command I would like to run looks like this:
find dir/*.JPG | sort | ffmpeg -f image2pipe -r 1 -vcodec mjpeg -s 6000x4000 -pix_fmt yuvj422p -i - -vcodec libx264 -s 1080x720 -r 20 -pix_fmt yuv420p out.mkv
But I keep getting the No JPEG data found in image error. Here is the full log:
Input #0, image2pipe, from 'pipe:':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: mjpeg, yuvj422p(bt470bg/unknown/unknown), 6000x4000, 1 fps, 1 tbr, 1 tbn, 1 tbc
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
[mjpeg # 0x558e98cd7300] No JPEG data found in image
Error while decoding stream #0:0: Invalid data found when processing input
[swscaler # 0x558e98ce9440] deprecated pixel format used, make sure you did set range correctly
[libx264 # 0x558e98cdaac0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 # 0x558e98cdaac0] profile High, level 3.1, 4:2:0, 8-bit
[libx264 # 0x558e98cdaac0] 264 - core 161 r3039 544c61f - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=20 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'out.mkv':
Metadata:
encoder : Lavf58.76.100
Stream #0:0: Video: h264 (H264 / 0x34363248), yuv420p, 1080x720, q=2-31, 20 fps, 1k tbn
Metadata:
encoder : Lavc58.134.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 0 fps=0.0 q=0.0 Lsize= 1kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Conversion failed!
The pictures are in the following format (as reported by mediainfo) and the filenames are in the form DSC_1234.JPG:
Format : JPEG
Video
Format : JPEG
Width : 6 000 pixels
Height : 4 000 pixels
Display aspect ratio : 3:2
Color space : YUV
Chroma subsampling : 4:2:2
Bit depth : 8 bits
Compression mode : Lossy
Also, I would like to avoid solutions that do not pipe the paths (-f image2 -i DSC_%04d.JPG, for example). Do you have any idea what's happening?
Your command passes the names of the files into the pipe, but ffmpeg needs the contents of the files. Adjust your command like this:
find dir/ -iname '*.jpg' | sort | xargs cat | ffmpeg -f image2pipe -r 1 -vcodec mjpeg -s 6000x4000 -pix_fmt yuvj422p -i - -vcodec libx264 -s 1080x720 -r 20 -pix_fmt yuv420p out.mkv
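The same idea, feeding file contents rather than file names to image2pipe, can be sketched in Python, which also sidesteps the trouble xargs has with spaces in file names. The function names below are made up for this illustration, the ffmpeg options are copied unchanged from the command above, and encode() has not been run against real images here.

```python
import subprocess
from pathlib import Path


def sorted_jpegs(directory):
    """Return the .jpg/.jpeg files in `directory` (any letter case), sorted by path."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.suffix.lower() in (".jpg", ".jpeg"))


def write_images(paths, sink):
    """Write the raw bytes of each file, in order, to a binary writable."""
    for path in paths:
        sink.write(Path(path).read_bytes())


def encode(directory, output="out.mkv"):
    """Pipe the image contents into ffmpeg's stdin and return its exit status."""
    cmd = ["ffmpeg", "-f", "image2pipe", "-r", "1", "-vcodec", "mjpeg",
           "-s", "6000x4000", "-pix_fmt", "yuvj422p", "-i", "-",
           "-vcodec", "libx264", "-s", "1080x720", "-r", "20",
           "-pix_fmt", "yuv420p", output]
    with subprocess.Popen(cmd, stdin=subprocess.PIPE) as proc:
        write_images(sorted_jpegs(directory), proc.stdin)
        proc.stdin.close()  # signal end of input so ffmpeg can finish
    return proc.returncode
```

Calling encode("dir") would then play the role of the shell pipeline.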

Using FFmpeg with URL input causes SIGSEGV in AWS Lambda (Python runtime)

I'm trying to implement a video converting solution on AWS Lambda following their article named Processing user-generated content using AWS Lambda and FFmpeg.
However, when I run my command with subprocess.Popen() it returns -11, which translates to SIGSEGV (segmentation fault).
I've tried to process the video with the newest (4.3.1) static build from John Van Sickle's site as well as with the "official" ffmpeg-lambda-layer, but it doesn't seem to matter which one I use: the result is the same.
If I download the video to the Lambda's /tmp directory and pass this downloaded file as an input to FFmpeg, it works correctly (with the same parameters). However, I'm trying to avoid this because the /tmp directory's maximum size is only 512 MB, which is not enough for me.
The relevant code which returns SIGSEGV:
ffmpeg_cmd = '/opt/bin/ffmpeg -stream_loop -1 -i "' + s3_source_signed_url + '" -i /opt/bin/audio.mp3 -i /opt/bin/watermark.png -shortest -y -deinterlace -vcodec libx264 -pix_fmt yuv420p -preset veryfast -r 30 -g 60 -b:v 4500k -c:a copy -map 0:v:0 -map 1:a:0 -filter_complex scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1,overlay=(W-w)/2:(H-h)/2,format=yuv420p -loglevel verbose -f flv -'
command1 = shlex.split(ffmpeg_cmd)
p1 = subprocess.Popen(command1, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p1.communicate()
print(p1.returncode) #prints -11
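As a side note on reading that value: on POSIX systems subprocess reports a child killed by signal N as a return code of -N, so the number can be mapped to a name with the standard signal module. The helper below is a small sketch; its name is made up for this illustration.

```python
import signal


def describe_returncode(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        try:
            # Negative values mean "terminated by signal -returncode" on POSIX.
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


print(describe_returncode(-11))
print(describe_returncode(0))
```

Logging this next to stderr makes it easier to tell a crash from an ordinary non-zero exit.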
stderr of FFmpeg:
ffmpeg version 4.1.3-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzvbi --enable-libzimg
libavutil 56. 22.100 / 56. 22.100
libavcodec 58. 35.100 / 58. 35.100
libavformat 58. 20.100 / 58. 20.100
libavdevice 58. 5.100 / 58. 5.100
libavfilter 7. 40.101 / 7. 40.101
libswscale 5. 3.100 / 5. 3.100
libswresample 3. 3.100 / 3. 3.100
libpostproc 55. 3.100 / 55. 3.100
[tcp # 0x728cc00] Starting connection attempt to 52.219.74.177 port 443
[tcp # 0x728cc00] Successfully connected to 52.219.74.177 port 443
[h264 # 0x729b780] Reinit context to 1280x720, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'https://bucket.s3.amazonaws.com --> presigned url with 15 min expiration time':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41isomavc1
creation_time : 2015-09-02T07:42:42.000000Z
Duration: 00:00:15.64, start: 0.000000, bitrate: 2640 kb/s
Stream #0:0(und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, left), 1280x720 [SAR 1:1 DAR 16:9], 2475 kb/s, 25 fps, 25 tbr, 25 tbn, 50 tbc (default)
Metadata:
creation_time : 2015-09-02T07:42:42.000000Z
handler_name : L-SMASH Video Handler
encoder : AVC Coding
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 160 kb/s (default)
Metadata:
creation_time : 2015-09-02T07:42:42.000000Z
handler_name : L-SMASH Audio Handler
[mp3 # 0x733f340] Skipping 0 bytes of junk at 1344.
Input #1, mp3, from '/opt/bin/audio.mp3':
Metadata:
encoded_by : Logic Pro X
date : 2021-01-03
coding_history :
time_reference : 158760000
umid : 0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000004500F9E4
encoder : Lavf58.49.100
Duration: 00:04:01.21, start: 0.025057, bitrate: 320 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
Metadata:
encoder : Lavc58.97
Input #2, png_pipe, from '/opt/bin/watermark.png':
Duration: N/A, bitrate: N/A
Stream #2:0: Video: png, 1 reference frame, rgba(pc), 701x190 [SAR 1521:1521 DAR 701:190], 25 tbr, 25 tbn, 25 tbc
[Parsed_scale_0 # 0x7341140] w:1920 h:1080 flags:'bilinear' interl:0
Stream mapping:
Stream #0:0 (h264) -> scale
Stream #2:0 (png) -> overlay:overlay
format -> Stream #0:0 (libx264)
Stream #1:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
[h264 # 0x72d8600] Reinit context to 1280x720, pix_fmt: yuv420p
[Parsed_scale_0 # 0x733c1c0] w:1920 h:1080 flags:'bilinear' interl:0
[graph 0 input from stream 0:0 # 0x7669200] w:1280 h:720 pixfmt:yuv420p tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph 0 input from stream 2:0 # 0x766a980] w:701 h:190 pixfmt:rgba tb:1/25 fr:25/1 sar:1521/1521 sws_param:flags=2
[auto_scaler_0 # 0x7670240] w:iw h:ih flags:'bilinear' interl:0
[deinterlace_in_2_0 # 0x766b680] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 2:0' and the filter 'deinterlace_in_2_0'
[Parsed_scale_0 # 0x733c1c0] w:1280 h:720 fmt:yuv420p sar:1/1 -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x2
[Parsed_pad_1 # 0x733ce00] w:1920 h:1080 -> w:1920 h:1080 x:0 y:0 color:0x000000FF
[Parsed_setsar_2 # 0x733da00] w:1920 h:1080 sar:1/1 dar:16/9 -> sar:1/1 dar:16/9
[auto_scaler_0 # 0x7670240] w:701 h:190 fmt:rgba sar:1521/1521 -> w:701 h:190 fmt:yuva420p sar:1/1 flags:0x2
[Parsed_overlay_3 # 0x733e440] main w:1920 h:1080 fmt:yuv420p overlay w:701 h:190 fmt:yuva420p
[Parsed_overlay_3 # 0x733e440] [framesync # 0x733e5a8] Selected 1/50 time base
[Parsed_overlay_3 # 0x733e440] [framesync # 0x733e5a8] Sync level 2
[libx264 # 0x72c1c00] using SAR=1/1
[libx264 # 0x72c1c00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 # 0x72c1c00] profile Progressive High, level 4.0, 4:2:0, 8-bit
[libx264 # 0x72c1c00] 264 - core 157 r2969 d4099dd - H.264/MPEG-4 AVC codec - Copyleft 2003-2019 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=2 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=9 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=60 keyint_min=6 scenecut=40 intra_refresh=0 rc_lookahead=10 rc=abr mbtree=1 bitrate=4500 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, flv, to 'pipe:':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41isomavc1
encoder : Lavf58.20.100
Stream #0:0: Video: h264 (libx264), 1 reference frame ([7][0][0][0] / 0x0007), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 4500 kb/s, 30 fps, 1k tbn, 30 tbc (default)
Metadata:
encoder : Lavc58.35.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/4500000 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: mp3 ([2][0][0][0] / 0x0002), 44100 Hz, stereo, fltp, 320 kb/s
Metadata:
encoder : Lavc58.97
frame= 27 fps=0.0 q=32.0 size= 247kB time=00:00:00.03 bitrate=59500.0kbits/s speed=0.0672x
frame= 77 fps= 77 q=27.0 size= 1115kB time=00:00:02.03 bitrate=4478.0kbits/s speed=2.03x
frame= 126 fps= 83 q=25.0 size= 2302kB time=00:00:04.00 bitrate=4712.4kbits/s speed=2.64x
frame= 177 fps= 87 q=26.0 size= 3576kB time=00:00:06.03 bitrate=4854.4kbits/s speed=2.97x
frame= 225 fps= 88 q=25.0 size= 4910kB time=00:00:07.96 bitrate=5047.8kbits/s speed=3.13x
frame= 272 fps= 89 q=27.0 size= 6189kB time=00:00:09.84 bitrate=5147.9kbits/s speed=3.22x
frame= 320 fps= 90 q=27.0 size= 7058kB time=00:00:11.78 bitrate=4907.5kbits/s speed=3.31x
frame= 372 fps= 91 q=26.0 size= 8098kB time=00:00:13.84 bitrate=4791.0kbits/s speed=3.4x
And that's the end of it. It should continue processing until 00:04:02, as that's my audio's length, but it stops here every time (at approximately my video's length).
The relevant code which works correctly:
ffmpeg_cmd = '/opt/bin/ffmpeg -stream_loop -1 -i "' + '/tmp/' + s3_source_key + '" -i /opt/bin/audio.mp3 -i /opt/bin/watermark.png -shortest -y -deinterlace -vcodec libx264 -pix_fmt yuv420p -preset veryfast -r 30 -g 60 -b:v 4500k -c:a copy -map 0:v:0 -map 1:a:0 -filter_complex scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2,setsar=1,overlay=(W-w)/2:(H-h)/2,format=yuv420p -loglevel verbose -f flv -'
command1 = shlex.split(ffmpeg_cmd)
p1 = subprocess.Popen(command1, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p1.communicate()
print(p1.returncode) #prints 0
With this code, the video is repeated as many times as needed to match the length of the audio.
Both versions work correctly on my computer.
This question is almost the same but in my case FFmpeg is able to access the signed URL.

Create HLS streamable audio file from mp3

I am using the following command to create an HLS AAC audio stream for web streaming:
ffmpeg -y -i song.mp3 -c:a aac -b:a 128k -f hls -hls_time 7 -hls_list_size 0 -hls_segment_filename file%d.m4a playlist.m3u8
This command works only with some audio files. With many MP3 files I receive the following output:
C:\ffmpeg>ffmpeg -y -i song.mp3 -c:a aac -b:a 128k -f hls -hls_time 7 -hls_list_size 0 -hls_segment_filename file%d.m4a playlist.m3u8
ffmpeg version git-2020-01-31-62d92a8 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20200122
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 38.100 / 56. 38.100
libavcodec 58. 67.100 / 58. 67.100
libavformat 58. 37.100 / 58. 37.100
libavdevice 58. 9.103 / 58. 9.103
libavfilter 7. 72.100 / 7. 72.100
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
[mp3 # 0000027d800babc0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'song.mp3':
Metadata:
TSS : Logic Pro 8.0.2
iTunNORM : 000000EE 000000ED 00000C34 00001135 000088F0 0000B505 000080FA 00007577 00009B82 00018F49
iTunSMPB : 00000000 00000210 00000A07 00000000008783E9 00000000 007AD4E6 00000000 00000000 00000000 00000000 00000000 00000000
genre : Rock
TCM : Kevin MacLeod
album : Funk and Blues
TKE : C
TBP : 101
title : Funkorama
artist : Kevin MacLeod
date : 2008-06-16 18:35
Duration: 00:03:21.46, start: 0.000000, bitrate: 325 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 400x400 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Other
Stream mapping:
Stream #0:1 -> #0:0 (mjpeg (native) -> h264 (libx264))
Stream #0:0 -> #0:1 (mp3 (mp3float) -> aac (native))
Press [q] to stop, [?] for help
[hls # 0000027d80100c40] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
[libx264 # 0000027d800c1280] using SAR=1/1
[libx264 # 0000027d800c1280] MB rate (56250000) > level limit (16711680)
[libx264 # 0000027d800c1280] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 # 0000027d800c1280] profile High 4:4:4 Predictive, level 6.2, 4:4:4, 8-bit
[libx264 # 0000027d800c1280] 264 - core 159 - H.264/MPEG-4 AVC codec - Copyleft 2003-2019 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=4 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, hls, to 'playlist.m3u8':
Metadata:
TSS : Logic Pro 8.0.2
iTunNORM : 000000EE 000000ED 00000C34 00001135 000088F0 0000B505 000080FA 00007577 00009B82 00018F49
iTunSMPB : 00000000 00000210 00000A07 00000000008783E9 00000000 007AD4E6 00000000 00000000 00000000 00000000 00000000 00000000
genre : Rock
TCM : Kevin MacLeod
album : Funk and Blues
TKE : C
TBP : 101
title : Funkorama
artist : Kevin MacLeod
date : 2008-06-16 18:35
encoder : Lavf58.37.100
Stream #0:0: Video: h264 (libx264), yuvj444p(pc, progressive), 400x400 [SAR 72:72 DAR 1:1], q=-1--1, 90k fps, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Other
encoder : Lavc58.67.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp, 128 kb/s
Metadata:
encoder : Lavc58.67.100 aac
[mp3float # 0000027d80146580] overread, skip -7 enddists: -6 -6 speed=68.6x
[mp3float # 0000027d80146580] overread, skip -6 enddists: -5 -5
[mp3float # 0000027d80146580] overread, skip -6 enddists: -4 -4
Last message repeated 2 times
[mp3float # 0000027d80146580] overread, skip -7 enddists: -6 -6
Last message repeated 2 times
[mp3float # 0000027d80146580] overread, skip -5 enddists: -2 -2
[mp3float # 0000027d80146580] overread, skip -7 enddists: -6 -6
[mp3float # 0000027d80146580] overread, skip -6 enddists: -4 -4
Last message repeated 1 times
[mp3float # 0000027d80146580] overread, skip -7 enddists: -6 -6
Last message repeated 1 times
[mp3float # 0000027d80146580] overread, skip -6 enddists: -4 -4
[mp3float # 0000027d80146580] overread, skip -5 enddists: -3 -3
[mp3float # 0000027d80146580] overread, skip -6 enddists: -4 -4
[mp3float # 0000027d80146580] overread, skip -7 enddists: -6 -6
Last message repeated 2 times
[mp3float # 0000027d80146580] overread, skip -5 enddists: -4 -4
[hls # 0000027d80100c40] Opening 'file0.m4a' for writingate=N/A speed=64.1x
[hls # 0000027d80100c40] Opening 'playlist.m3u8.tmp' for writing
frame= 1 fps=0.3 q=33.0 Lsize=N/A time=00:03:21.45 bitrate=N/A speed=63.7x
video:7kB audio:3209kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[libx264 # 0000027d800c1280] frame I:1 Avg QP:34.64 size: 6567
[libx264 # 0000027d800c1280] mb I I16..4: 19.5% 53.0% 27.5%
[libx264 # 0000027d800c1280] 8x8 transform intra:53.0%
[libx264 # 0000027d800c1280] coded y,u,v intra: 46.8% 26.1% 15.3%
[libx264 # 0000027d800c1280] i16 v,h,dc,p: 38% 39% 9% 14%
[libx264 # 0000027d800c1280] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 14% 26% 8% 5% 6% 5% 7% 7%
[libx264 # 0000027d800c1280] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 42% 16% 14% 7% 4% 5% 3% 4% 4%
[libx264 # 0000027d800c1280] kb/s:4728240.00
[aac # 0000027d800bcc40] Qavg: 2138.508
Notice the "mp3float overread" message.
The result is a single file0.m4a file instead of segments split every 7 seconds as specified.
This is an example audio file that triggers the problem when I try to convert it to an AAC HLS stream: https://incompetech.com/music/royalty-free/index.html?isrc=USUAN1100474
How can I convert an audio file to a web-friendly HLS stream with ffmpeg?
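The option -hls_list_size 0 only tells the muxer to keep every segment in the playlist; by itself it does not merge the output into one file. Your log shows the MP3's cover art (Stream #0:1, attached pic) being encoded as a single-frame H.264 video stream alongside the audio, which is a likely reason no new segment is ever started; dropping it with -vn may be worth trying.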
I use -muxdelay 0 -f segment -sc_threshold 0 -segment_time 15 -segment_list "playlist.m3u8" -segment_format mpegts "file%d.ts" in all my HLS video encode commands.
Adapted to your case, that would be:
ffmpeg -y -i "song.mp3" -c:a aac -b:a 128k -muxdelay 0 -f segment -sc_threshold 0 -segment_time 7 -segment_list "playlist.m3u8" -segment_format mpegts "file%d.m4a"
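Another variation that may be worth trying keeps the hls muxer from the question and only drops the cover-art stream that the log shows being encoded as video (Stream #0:1 -> #0:0). The sketch below just assembles the argument list in Python. The function name and the .ts segment template are assumptions made for this illustration, and whether -vn alone cures the problem for every file has not been verified here.

```python
import subprocess


def hls_audio_command(source, playlist="playlist.m3u8", segment_seconds=7):
    """Build an ffmpeg argument list for an audio-only HLS stream."""
    return [
        "ffmpeg", "-y", "-i", source,
        "-vn",                              # ignore the attached picture
        "-c:a", "aac", "-b:a", "128k",
        "-f", "hls",
        "-hls_time", str(segment_seconds),
        "-hls_list_size", "0",              # keep every segment in the playlist
        "-hls_segment_filename", "file%d.ts",
        playlist,
    ]


cmd = hls_audio_command("song.mp3")
print(" ".join(cmd))
# To run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.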

No sound when capture screen

I capture my PC screen with the following ffmpeg command:
ffmpeg -f pulse -ac 2 -i default -f x11grab -r 30 -s 1920x1080 -i :0.0 -acodec pcm_s16le -vcodec libx264 -preset ultrafast -threads 0 -y /tmp/output.mkv
This is the output displayed on my console when I execute the above command:
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from 'default':
Duration: N/A, start: 1515543051.106987, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
[x11grab # 0x564bda5b8520] Stream #0: not enough frames to estimate rate; consider increasing probesize
Input #1, x11grab, from ':0.0':
Duration: N/A, start: 1515543052.415508, bitrate: N/A
Stream #1:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1920x1080, 30 fps, 1000k tbr, 1000k tbn, 1000k tbc
No pixel format specified, yuv444p for H.264 encoding chosen.
Use -pix_fmt yuv420p for compatibility with outdated media players.
[libx264 # 0x564bda5c1560] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 LZCNT
[libx264 # 0x564bda5c1560] profile High 4:4:4 Predictive, level 4.0, 4:4:4 8-bit
[libx264 # 0x564bda5c1560] 264 - core 148 r2748 97eaef2 - H.264/MPEG-4 AVC codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html - options: cabac=0 ref=1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=6 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
Output #0, matroska, to '/tmp/output.mkv':
Metadata:
encoder : Lavf57.56.101
Stream #0:0: Video: h264 (libx264) (H264 / 0x34363248), yuv444p, 1920x1080, q=-1--1, 30 fps, 1k tbn, 30 tbc
Metadata:
encoder : Lavc57.64.101 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
encoder : Lavc57.64.101 pcm_s16le
Stream mapping:
Stream #1:0 -> #0:0 (rawvideo (native) -> h264 (libx264))
Stream #0:0 -> #0:1 (pcm_s16le (native) -> pcm_s16le (native))
Music was playing while I made the screen capture, and I could hear it playing loudly.
Strangely, the video came out perfectly but without sound when I played the captured /tmp/output.mkv.
I opened my volume control with pavucontrol.
Nothing appears in the recording window; maybe the blank recording window is the reason there is no sound in the screen capture.
How can I fix it?
Solved.
ffmpeg -f pulse -ac 2 -i default -f x11grab -r 30 -s 1920x1080 -i :0.0 -acodec pcm_s16le -vcodec libx264 -preset ultrafast -threads 0 -y /tmp/test.avi
The fix was to change the output file format from MKV to AVI.

FFMPEG: Converting from raw audio to audio/mp4 (audio is being converted with slow speed)

If I convert from mp3 to mp4 directly everything works perfectly. But if I try to convert from raw pcm, the audio speed is slowed down.
I've tried the following (this works):
ffmpeg -i mp3/1.mp3 -strict -2 final.mp4
This doesn't work as expected:
ffmpeg -f s16le -i final.raw -strict -2 -r 26 final.mp4
With the following output:
Input #0, s16le, from 'final.raw':
Duration: 00:08:37.38, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le, 44100 Hz, 1 channels, s16, 705 kb/s
File 'final.mp4' already exists. Overwrite ? [y/N] y
Output #0, mp4, to 'final.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, mono, fltp, 128 kb/s
Metadata:
encoder : Lavc56.60.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
size= 8273kB time=00:08:37.38 bitrate= 131.0kbits/s
video:0kB audio:8185kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.073808%
I've tried to set parameters like:
ffmpeg -ar 44100 -f s16le -i final.raw -strict -2 -r 26 final.mp4
With no luck.
To get the PCM from the MP3, I'm using the Node.js lame decoder:
var decoder = new lame.Decoder({
    channels: 2,
    bitDepth: 16,
    sampleRate: 44100,
    bitRate: 128,
    outSampleRate: 44100, // 22050
    mode: lame.STEREO
});
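Raw PCM carries no header, so the duration ffmpeg reports depends entirely on the sample rate and channel count it assumes. The decoder settings above request two channels, while the ffmpeg log reports "1 channels"; reading interleaved stereo samples as mono would make the audio last twice as long. Whether that is the cause here is an assumption. The arithmetic below only illustrates the effect, using the duration from the log to estimate the file size; if it applies, passing -ac 2 before -i would be the thing to try.

```python
def pcm_duration(num_bytes, sample_rate, channels, bytes_per_sample=2):
    """Duration in seconds of headerless PCM with the given layout."""
    return num_bytes / (sample_rate * channels * bytes_per_sample)


# File size estimated from the log: 00:08:37.38 read as 44100 Hz, mono, 16-bit.
size = round(517.38 * 44100 * 1 * 2)

as_mono = pcm_duration(size, 44100, channels=1)
as_stereo = pcm_duration(size, 44100, channels=2)
print(f"read as mono:   {as_mono:.2f} s")
print(f"read as stereo: {as_stereo:.2f} s")
```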
