Detecting video volume - audio

I'm streaming a few RTMP streams through nginx and I want to check every few seconds which stream has the highest volume.
Specifically these streams are of talking heads and I assume that usually only one of them is speaking at a time, and I'm trying to find which one.
Since nginx can output HLS (Apple HTTP Live Streaming), I decided to check the last segment of each stream every few seconds using ffmpeg.
Example:
ffmpeg -f mp3 -i /my/path/camera67/123.ts -af "volumedetect" -f null /dev/null
For some reason the max_volume is always zero (max_volume: 0.0 dB) and mean_volume seems meaningless regarding the volume.
Do you have any idea why it's always zero?
Is there a helpful way to understand mean_volume?
Can you think of a different tool that may give me the volume (e.g. mediainfo or ffprobe)?
I also tried:
ffmpeg -f lavfi -i amovie=/my/path/camera67/123.ts,volumedetect
This time I got:
[mpegts @ 0x130bf40] start time for stream 1 is not set in estimate_timings_from_pts
[mpegts @ 0x130bf40] Could not find codec parameters for stream 1 (Audio: aac ([15][0][0][0] / 0x000F), 0 channels, fltp): unspecified sample rate
Consider increasing the value for the 'analyzeduration' and 'probesize' options
[Parsed_amovie_0 @ 0x130bcc0] No audio stream with index '-1' found
[lavfi @ 0x130abc0] Error initializing filter 'amovie' with args '/my/path/camera67/123.ts'
amovie=/my/path/camera67/123.ts,volumedetect: Invalid argument
Any idea?
Thanks,
T.

So here's what happened.
I streamed MP3 to nginx, which transcoded the input to HLS segments, and those don't support MP3.
Listening to the RTMP output made me think that the audio was working fine, but when I listened to the HLS output I heard nothing.
I changed my original stream to AAC, then the HLS stream gave the right output and I immediately saw a correlation between the music and the mean and max volumes.
Thank you all.
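For polling several streams, the volumedetect summary can be scraped from ffmpeg's log. This is a minimal sketch, assuming each segment has already been analysed with something like ffmpeg -i /my/path/camera67/123.ts -af volumedetect -f null /dev/null and its stderr captured as text. The function names and the stream-name keys are made up for illustration. Both values are in dB relative to full scale, so they are at most 0 and closer to 0 means louder; mean_volume is probably the more useful one for guessing who is talking.

```python
import re

# Matches log lines such as:
#   [Parsed_volumedetect_0 @ 0x55d0c0] mean_volume: -27.3 dB
_VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?\d+(?:\.\d+)?) dB")


def parse_volumedetect(stderr_text):
    """Return {'mean_volume': float, 'max_volume': float} found in ffmpeg's log."""
    return {name: float(value) for name, value in _VOLUME_RE.findall(stderr_text)}


def loudest_stream(logs_by_stream):
    """Pick the stream whose analysed segment has the highest mean_volume.

    logs_by_stream maps a (hypothetical) stream name to the captured stderr
    text of one volumedetect run on that stream's latest segment.
    """
    levels = {
        name: parse_volumedetect(log)["mean_volume"]
        for name, log in logs_by_stream.items()
    }
    return max(levels, key=levels.get)
```

A segment with no decodable audio produces no mean_volume line, so parse_volumedetect returns an empty dict and loudest_stream raises KeyError; a real poller would probably want to skip such streams.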

Related

ffmpeg duplicate last frame as long as the audio length

I am recording AVI files with Camtasia. For some reason the video stream length is 2,3-5 seconds less than the audio stream.
When I convert the video with ffmpeg from AVI to MP4 it cuts the audio to the video length.
Would duplicating the last frame until the end of the audio be a solution? If yes how can this be done using ffmpeg?
The important thing is to convert the AVI to MP4 using ffmpeg and keep the audio stream of the video complete.
Thank you.
Edit 1: This issue is automatically solved by ffmpeg 2.x somehow but ffmpeg 4.x will cut audio. With the same settings the old version converts correctly.
Edit 2: tpad helped. Thank you very much @kesh. I used
-filter_complex 'tpad=stop=NUMBER_OF_FRAMES:stop_mode=clone'
I tried to get the duration using ffprobe and multiplied the number of seconds by the number of frames per second, but it was not enough. For each video I had to increase that number by 100 to 150 frames.
The issue is I cannot detect the exact number of frames to tell tpad. I also tried
-filter_complex 'tpad=stop=-1:stop_mode=clone'
but it freezes while processing.
Is there any other option?
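The frame count for tpad can be derived from the gap between the two streams rather than from a single duration. This is a rough sketch, assuming the video and audio stream durations (in seconds) and the frame rate have already been read per stream, for example with ffprobe. The function names are made up, and it does not explain why the estimate in the question fell 100 to 150 frames short.

```python
import math


def frames_to_pad(video_seconds, audio_seconds, fps):
    """Number of cloned frames needed so the video lasts at least as long as the audio."""
    gap = audio_seconds - video_seconds
    # Round up so the padded video never ends before the audio; never go negative.
    return max(0, math.ceil(gap * fps))


def tpad_filter(video_seconds, audio_seconds, fps):
    """Build the tpad filter string that clones the last frame for the missing frames."""
    frames = frames_to_pad(video_seconds, audio_seconds, fps)
    return f"tpad=stop={frames}:stop_mode=clone"
```

As far as I understand tpad, stop=-1 pads indefinitely, which would explain why the encode never finishes on its own. Combining it with the -shortest output option, so that encoding ends when the audio ends, is an alternative that may be worth trying; how well it behaves can depend on the ffmpeg version.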

Is there a way to ensure mp3 duration accuracy with variable bit rate using FFMPEG?

In our application, we are processing audio files using ffmpeg. Specifically, we use the NodeJS library fluent-ffmpeg.
Our audio files are generated from various text to speech providers. We recently noticed that when we converted audio using SSML to add pauses to the generated audio, the duration on the file was no longer correct. Upon further investigation, we noticed that the durations of the standard audio files were also incorrect, just more accurate overall due to the more consistent data. When we put a pause at the beginning of the audio, the estimate was the worst, overshooting by a very large margin (e.g., a 25s audio clip would read as 3 minutes long, but skip to the end when playing past the 25s mark).
I did some searching and research into the structure of MP3 files, and to me it seems like the issue is because the duration gets estimated by various audio players. Windows media player is an example, but Firefox's web player seems to also do this. I tried changing the ffmpeg command from using .audioQuality(0), which sets ffmpeg to use VBR, to .audioBitrate(320), which tells ffmpeg to use a constant bitrate.
For reference, we are using libmp3lame, and the full command that gets run is the following, for the VBR and CBR cases respectively:
For VBR (broken durations): ffmpeg -i <URL> -acodec libmp3lame -aq 0 -f mp3 pipe:1
For CBR (correct duration): ffmpeg -i <URL> -acodec libmp3lame -b:a 320k -f mp3 pipe:1
Note: we then pipe the output to the requesting client application after sending the appropriate file headers, hence the pipe:1 output. The input is a cloud storage URL where the source file is located.
This fixes our problem of having a correct duration, and it makes sense to me why this would fix it if the problem was because the duration is being estimated by some of these players / audio consumers. But, this came at the cost that the file size was significantly larger, which also makes sense to me. While testing we found that compared to the same file in WAV, the VBR mp3 was about 10% of the WAV file size, while the CBR mp3 was still 50% of the WAV file size. This practically defeats the purpose of supporting the mp3 format for our use-case, which is a smaller but slightly lossy alternative to the large WAV file.
While researching, I found that there can be ID3 tags in a chunk at the beginning of the mp3 file, specifying information so that the consumer of the audio can know the duration before having processed the whole file. But I also found that there doesn't seem to be a standard, at least for duration; the tags are used more for things like song title, album, artist, etc.
My question is, is there a way to get the proper duration onto an mp3 file, preferably via some ffmpeg mechanism, while still using VBR? Thanks!
FFmpeg does write a Xing header by default with duration info. However, that value is only known after the entire stream data has been received, so ffmpeg has to seek to the head to write it. Since you're piping the output, that can't be done.
Write the file locally or to some seekable destination, and then upload.
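A sketch of the "write to a seekable file first" approach follows. It is in Python rather than fluent-ffmpeg, purely as an illustration. The helper names are made up, and the ffmpeg arguments are the VBR ones from the question with pipe:1 swapped for a temporary path.

```python
import os
import tempfile


def build_vbr_command(source_url, output_path):
    """ffmpeg arguments for VBR MP3 written to a seekable file instead of pipe:1.

    -y is needed because the temporary file already exists when ffmpeg starts.
    """
    return [
        "ffmpeg", "-y", "-i", source_url,
        "-acodec", "libmp3lame", "-aq", "0",
        "-f", "mp3", output_path,
    ]


def temp_mp3_path():
    """Create an empty temporary .mp3 file and return its path (caller deletes it)."""
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    return path
```

The caller would run the command (for example with subprocess.run(build_vbr_command(url, path), check=True)), then send the finished file to the client and delete it. Because the output is a regular file, ffmpeg can seek back and fill in the Xing header once the whole stream has been encoded.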

Using FFmpeg or Similar to Normalize audio in a video to EBU R128 standard

This is my first time asking a question here on Stack Overflow.
I am stuck and really struggling with this. I am trying to make the audio of some of my MXF video files comply with the EBU R128 standard.
This means that the integrated loudness has to be -23 LUFS, and it must not deviate from that by more than 0.5 LU.
My current process
Watch_folder > Encoding to MXF > Output_folder
I need to make sure that when the MXF files reach the output folder, they are EBU R128 loudness compliant.
What I have done so Far:
FFMPEG:
ffmpeg -i input.mxf -af loudnorm=I=-23:LRA=7:tp=-2:print_format=json -f null -
got the result:
Input Integrated: -15.1 LUFS
Input True Peak: +0.0 dBTP
Input LRA: 17.1 LU
Input Threshold: -26.2 LUFS
Output Integrated: -17.1 LUFS
Output True Peak: -1.5 dBTP
Output LRA: 5.3 LU
Output Threshold: -27.6 LUFS
Normalization Type: Dynamic
Target Offset: +1.1 LU
Then I did:
ffmpeg -i input.mxf -af loudnorm=I=-23:LRA=7:tp=-2:measured_I=-15.1:measured_LRA=17.1:measured_tp=0:measured_thresh=-27.6:offset=1.1 -ar 48k -y output.mxf
However, when I put it through the software Eff, it says that it's not EBU compliant.
EDIT:
This also reduces the quality. For example, my 6 GB file becomes 250 MB and you can tell the quality has been downgraded.
ffmpeg-normalize
I did the following:
ffmpeg-normalize input.mxf -c:a pcm_s32le -ar 48000 -o output.mxf
but this gives me errors.
If I do it without the output file type, I get an MKV, which will not work for me. I need it to be MXF.
OK, a few issues here.
Firstly, if your file is measured at -26.2 LUFS, you'd need to add 3.2 dB to get it to -23. But you can't do that, because your true peak is too high (you'd be over full scale). You'll need to compress (dynamic audio compression, not file/rate compression) the audio or use at least a limiter to achieve this.
A good R128 audio track should be mixed properly rather than just run through a normaliser, otherwise you risk it either failing the standard or unwanted audio effects.
If you don't have access to audio editing software or someone who can do this for you, then FFMPEG does include an audio limiter, which will give you enough headroom to raise the level to -23 LUFS.
You can do that with something like this (limit is a linear value between 0.0625 and 1, so a ceiling of about -1.5 dB is written as 0.84):
-filter_complex alimiter=level_in=1:level_out=1:limit=0.84:attack=7:release=100:level=disabled
However, tuning a limiter well depends on what the video file is of (music, speech, etc) and it is something that's worth taking some time over. Alter the attack and release values until you get the result you want.
Secondly, the reason that FFMPEG has produced a smaller file of lower quality is that you didn't specify anything for the video. By default FFMPEG re-encodes the video with the output format's default codec and bitrate, so whatever your codec here is (I am assuming DNxHD from the fact that you're using an MXF wrapper) needs to be specified. Alternatively, FFMPEG will copy the video stream and leave it alone if you include the option -c:v copy (which means copy the video stream without re-encoding).
Post your results once you have tried these...!
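For the two-pass loudnorm workflow shown in the question, the second-pass filter can be generated from the first pass's JSON report instead of being typed by hand. This is a minimal sketch, assuming the JSON block printed by print_format=json has already been extracted from ffmpeg's log as a string; the function names are made up. Note that measured_thresh is meant to receive the input threshold (input_thresh), not the output one. The command builder adds -c:v copy as suggested above and keeps -ar 48k from the question.

```python
import json


def second_pass_filter(first_pass_json, target_i=-23.0, target_lra=7.0, target_tp=-2.0):
    """Build the second-pass loudnorm filter string from the first pass's JSON report."""
    m = json.loads(first_pass_json)
    # loudnorm reports its measurements as strings; they are passed through unchanged.
    return (
        f"loudnorm=I={target_i}:LRA={target_lra}:TP={target_tp}"
        f":measured_I={m['input_i']}:measured_LRA={m['input_lra']}"
        f":measured_TP={m['input_tp']}:measured_thresh={m['input_thresh']}"
        f":offset={m['target_offset']}"
    )


def second_pass_command(src, dst, first_pass_json):
    """ffmpeg arguments for the second pass, leaving the video stream untouched."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-af", second_pass_filter(first_pass_json),
        "-c:v", "copy", "-ar", "48k", dst,
    ]
```

Whether the result passes a particular compliance checker still depends on the material; this only removes the risk of copying the wrong measured value into the second pass.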

mkv file out of sync with linear drift

I have a bunch of mkv files, with FLAC as the audio codec and FFV1 as the video one.
The files were created using an EasyCap acquisition dongle from an analog VCR source. Specifically, I used VLC's "open acquisition device" prompt and selected PAL. Then, I converted the files (audio PCM, video raw YUV) to (FLAC, FFV1) using
ffmpeg.exe -i input.avi -acodec flac -vcodec ffv1 -level 3 -threads 4 -coder 1 -context 1 -g 1 -slices 24 -slicecrc 1 output.mkv
Now, the files are progressively out of sync. It may be due to the fact that while (maybe) the video has a constant framerate, the FLAC track has a variable framerate. So, is there a way to sync the video track to the audio, or something similar? Can FFmpeg do this? Thanks.
EDIT
On Mulvya's hint, I plotted the difference in sync at various times; the first column shows the seconds elapsed, the second shows the difference in seconds. The plot seems to behave linearly, with a constant slope of 0.0078. NOTE: measurements were taken by hand, with a chronometer.
EDIT 2
Playing around with VirtualDub, I found that changing the framerate to 25 fps from the original 24.889 (Video->Frame rate...->Change frame rate to) and using the track converted to wav definitely does work. Two problems, though: VirtualDub crashes when importing the original FFV1-FLAC mkv file, so I had to convert the video to H264 to try it out; moreover, I find it difficult to use an external encoder to save VirtualDub's output.
So, could I avoid using VirtualDub, and simply use ffmpeg for it? Here's the exported vdscript:
VirtualDub.audio.SetSource("E:\\4_track2.wav", "");
VirtualDub.audio.SetMode(0);
VirtualDub.audio.SetInterleave(1,500,1,0,0);
VirtualDub.audio.SetClipMode(1,1);
VirtualDub.audio.SetEditMode(1);
VirtualDub.audio.SetConversion(0,0,0,0,0);
VirtualDub.audio.SetVolume();
VirtualDub.audio.SetCompression();
VirtualDub.audio.EnableFilterGraph(0);
VirtualDub.video.SetInputFormat(0);
VirtualDub.video.SetOutputFormat(7);
VirtualDub.video.SetMode(3);
VirtualDub.video.SetSmartRendering(0);
VirtualDub.video.SetPreserveEmptyFrames(0);
VirtualDub.video.SetFrameRate2(25,1,1);
VirtualDub.video.SetIVTC(0, 0, 0, 0);
VirtualDub.video.SetCompression();
VirtualDub.video.filters.Clear();
VirtualDub.audio.filters.Clear();
The first line imports the wav-converted audio track.
Can I set an equivalent pipe in ffmpeg (possibly, using FLAC - not wav)? SetFrameRate2 is maybe the key, here.
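The frame-rate change that worked in VirtualDub amounts to rescaling the video timestamps by 24.889/25. This is a small sketch of that arithmetic and of a setpts filter string that expresses it; the function names are made up, and the two rates are simply the ones quoted above. The drift this ratio predicts (about 0.0044 s per second) is not the same as the hand-measured slope, so the real ratio should be checked against the files.

```python
def drift_seconds(elapsed_seconds, actual_fps=24.889, nominal_fps=25.0):
    """Offset accumulated after elapsed_seconds of video that is played at
    actual_fps although its frames belong at nominal_fps."""
    return elapsed_seconds * (1.0 - actual_fps / nominal_fps)


def retime_filter(actual_fps=24.889, nominal_fps=25.0):
    """setpts expression that moves frame n from n/actual_fps to n/nominal_fps."""
    return f"setpts=PTS*{actual_fps}/{nominal_fps}"
```

An untested guess at the matching command would be ffmpeg -i input.mkv -vf "setpts=PTS*24.889/25.0" -r 25 -c:v ffv1 -c:a copy output.mkv, which re-encodes the video losslessly with FFV1 and leaves the FLAC track as it is.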

Extract audio from Transport Stream and preserve length

I'm using ffmpeg to extract audio from an MPEG Transport Stream file recorded by a DVB-S card. The command:
ffmpeg -i video.ts -vn audio.wav
The source file seems to be corrupted. I noticed the corruption happens from time to time, especially for videos longer than 1 hour. I've got errors like these:
[mp2 @ 0x1bb5500] Header missing
Error while decoding stream #0:1
[mpegts @ 0x17eaf40] Continuity check failed for pid 5261 expected 2 got 6
The problem is that the resulting audio.wav is shorter than the source video (40m33s and 40m59s respectively). I'm looking for a way to preserve the original length in the resulting audio file.
I tried a recent ffmpeg under Windows and avconv under Ubuntu, with MP3 and WAV as output formats. In every case I got the same results.
I couldn't find out whether it's possible to do this with ffmpeg; however, I found ProjectX, a tool which tries to fix the broken TS stream. Website: http://project-x.sourceforge.net/
With:
java -jar ProjectX.jar -demux my_video.ts
the stream is demuxed into audio and video files which are guaranteed to have the same length. I simply mux them back using ffmpeg.
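The final muxing step could look roughly like the sketch below. The function name and the file names in the usage note are placeholders, since the names ProjectX actually writes are not given here. The streams are copied without re-encoding.

```python
def build_mux_command(video_path, audio_path, output_path):
    """ffmpeg arguments that mux one video and one audio file with stream copy."""
    return [
        "ffmpeg",
        "-i", video_path,
        "-i", audio_path,
        "-map", "0:v:0",   # first video stream of the first input
        "-map", "1:a:0",   # first audio stream of the second input
        "-c", "copy",
        output_path,
    ]
```

For example, build_mux_command("my_video.m2v", "my_video.mp2", "fixed.ts") yields an argument list that can be passed to subprocess.run.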

Resources