Optimised FFMPEG parameters - audio

How do I optimize the following command to meet the specifications below? The current one has random lags and pauses while streaming.
ffmpeg -re -y -i FILENAME.mp4 -vcodec libx264 -b:v 600k -filter:v yadif -ac 1 -ar 44100 -f flv "rtmp://..."
Video Format
Maximum 720p (720 x 1280) resolution, at 30 frames per second. (or 1
key frame every 2 seconds).
You must send an I-frame (keyframe) at least once every two seconds
throughout the stream.
Recommended max bit rate is 4000 Kbps.
The API accepts H264 encoded video and AAC encoded audio only.
Advanced Settings
Pixel Aspect Ratio: Square.
Frame Types: Progressive Scan.
Audio Sample Rate: 44.1 KHz.
Audio Bitrate: 128 Kbps stereo.
Bitrate Encoding: CBR.
My file is an MP4 generated with iMovie. Thanks in advance!

Use VBV by adding -maxrate and -bufsize. Use a -maxrate value that is below your maximum upload rate and leave some room for overhead.
Use the -g option for your keyframe interval with a value that is double that of your frame rate.
Make sure you're not using an ancient version of ffmpeg and x264.
See more details at FFmpeg Wiki: Encoding for Streaming Sites.
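As a rough illustration of the advice above, the following Python sketch assembles an ffmpeg argument list with VBV settings and a two-second keyframe interval. The function name, the default bitrates, and the choice of a buffer twice the size of -maxrate are assumptions made for the example, not values from the spec; the audio settings follow the 44.1 kHz, 128 kbps stereo AAC requirement quoted in the question.

```python
def streaming_args(src, url, fps=30, video_kbps=2500, audio_kbps=128):
    """Build an ffmpeg argument list for an H.264/AAC RTMP stream.

    video_kbps should sit below the available upload rate (assumed default).
    """
    if video_kbps > 4000:
        raise ValueError("the quoted spec recommends at most 4000 kbps")
    return [
        "ffmpeg", "-re", "-i", src,
        "-c:v", "libx264",
        "-b:v", f"{video_kbps}k",
        "-maxrate", f"{video_kbps}k",      # VBV ceiling
        "-bufsize", f"{2 * video_kbps}k",  # assumed: buffer of twice maxrate
        "-g", str(2 * fps),                # one keyframe every two seconds
        "-c:a", "aac",
        "-b:a", f"{audio_kbps}k",
        "-ar", "44100",
        "-ac", "2",
        "-f", "flv", url,
    ]


if __name__ == "__main__":
    # The URL below is a placeholder, not a real endpoint.
    print(" ".join(streaming_args("FILENAME.mp4", "rtmp://example.invalid/live")))
```

The list can be passed to subprocess.run as is. The sketch leaves out filters such as yadif, so add them back if the source really is interlaced.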

These look like the specs for the Facebook Live API.
It is doubtful that the random lags and pauses are on your end (or at least not in the encoding), but do check your network connection to ensure you have ample bandwidth and relatively low latency. Facebook Live operates with low latency, so your connection needs to be very stable.

Related

FFMPEG command to mix audio and video with adjustable volume

I have:
Video file of X length
Audio of Y length
I am trying to achieve an output video that has the following qualities:
The volume level of the added audio should be adjustable
The audio should loop till the end of the video
It should not break even if the input video does not have any audio
I should be able to mute the audio of the source video if needed.
All of the above, in the fastest possible way.
I'm not well versed with FFMPEG, maybe some experts could help.
Since you are using a library, I assume that you know how to run pure FFmpeg commands.
Based on your third condition, we will divide the solution into two parts:
It should not break even if the input video does not have any audio
To cover this condition, you can check whether there is an audio stream in your video file before running any FFmpeg command, using the code below:
// Requires: import android.media.MediaMetadataRetriever;
private boolean isVideoContainAudioStream(String videoPath) {
    MediaMetadataRetriever retriever = new MediaMetadataRetriever();
    retriever.setDataSource(videoPath);
    // METADATA_KEY_HAS_AUDIO yields "yes" when an audio stream is present, otherwise null.
    String hasAudioStream = retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_HAS_AUDIO);
    return "yes".equals(hasAudioStream);
}
1. Part One:
If the function above returns true, your video file contains an audio stream, so you can run the command below:
ffmpeg -i video.mp4 -filter_complex "amovie=/path/to/audio/file/audio.mp3:loop=0,asetpts=N/SR/TB,volume=2.0[audio];[0:a]volume=0.5[sa];[sa][audio]amix[fa]" -map 0:v -map [fa] -vcodec libx264 -preset ultrafast -shortest fout.mp4
In the command above, we load the audio file from a specific path with the amovie filter.
loop=0, loop the audio infinitely
asetpts=N/SR/TB, generate timestamps by counting samples
volume=2.0, multiply the audio volume by 2.0
The video's audio stream is accessible through the [0:a] filter pad, so we take it, set its volume to half of the input's volume, and name it [sa]. If you want to mute the audio of the source video, change that part to:
[0:a]volume=0.0[sa]
After that, we mix the two audio streams using the amix filter and name the result [fa]. At this point we have everything we wanted, and we just need to merge the audio and video streams.
-vcodec libx264, we use x264 video encoding because it has lots of options for tuning performance and speed
-shortest, since we loop the audio infinitely, we tell ffmpeg to stop writing output when the shortest stream ends (the video stream is the shorter one for sure)
-preset ultrafast, preset is one of the x264 options; ultrafast gives you more encoding speed at the cost of a larger output file. Using veryfast for this flag is usually a good balance of speed and size.
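To make the adjustable parts explicit, here is a small Python sketch that builds the same kind of argument list from a music volume, a source volume, and a has_audio flag (for example the result of the audio-stream check above). The function and parameter names are hypothetical. When has_audio is false, the sketch maps the looped music track directly, so the [0:a] pad is never referenced.

```python
def mix_args(video, music, out, music_volume=2.0, source_volume=0.5, has_audio=True):
    """Build ffmpeg arguments that loop `music` under `video`.

    Set source_volume=0.0 to mute the video's own audio.
    """
    # Looped music track with regenerated timestamps and adjustable volume.
    music_chain = f"amovie={music}:loop=0,asetpts=N/SR/TB,volume={music_volume}[audio]"
    if has_audio:
        # Scale the source audio, then mix it with the music track.
        graph = f"{music_chain};[0:a]volume={source_volume}[sa];[sa][audio]amix[fa]"
        audio_label = "[fa]"
    else:
        # No source audio: use the music track on its own.
        graph = music_chain
        audio_label = "[audio]"
    return [
        "ffmpeg", "-i", video,
        "-filter_complex", graph,
        "-map", "0:v", "-map", audio_label,
        "-c:v", "libx264", "-preset", "ultrafast",
        "-shortest", out,
    ]


if __name__ == "__main__":
    print(mix_args("video.mp4", "audio.mp3", "fout.mp4", source_volume=0.0))
```

Passing the list to a process runner avoids shell quoting of the bracketed labels. Paths containing characters that are special in filtergraphs (such as colons or commas) would still need escaping, which this sketch does not attempt.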
2. Part Two:
If the isVideoContainAudioStream function returns false (which means your input video has no audio), you can run the command below:
ffmpeg -i mute_video.mp4 -filter_complex "amovie=/path/to/audio/file/audio.mp3:loop=0,asetpts=N/SR/TB,volume=2.0[audio]" -map 0:v -map [audio] -vcodec libx264 -preset ultrafast -crf 18 -shortest m_fout.mp4
In the command above, we use another x264 option called CRF:
Constant Rate Factor (CRF)
Use this rate control mode if you want to keep the best quality and care less about the file size. This is the recommended rate control mode for most uses.
The range of the CRF scale is 0–51, where 0 is lossless, 23 is the default, and 51 is worst quality possible. A lower value generally leads to higher quality, and a subjectively sane range is 17–28. Consider 17 or 18 to be visually lossless or nearly so; it should look the same or nearly the same as the input but it isn't technically lossless.
The range is exponential, so increasing the CRF value +6 results in roughly half the bitrate / file size, while -6 leads to roughly twice the bitrate.
Choose the highest CRF value that still provides an acceptable quality. If the output looks good, then try a higher value. If it looks bad, choose a lower value.
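The "+6 roughly halves the bitrate" rule quoted above can be turned into a quick estimate. This is only a sketch of that rule of thumb (the function name is made up for the example); real encodes vary with content.

```python
def relative_bitrate(crf_from, crf_to):
    """Approximate bitrate ratio when moving from crf_from to crf_to.

    Based on the rule of thumb that +6 CRF roughly halves the bitrate.
    """
    return 2.0 ** ((crf_from - crf_to) / 6.0)


if __name__ == "__main__":
    # Moving from the default (23) to the value used above (18).
    print(round(relative_bitrate(23, 18), 2))
```

By this estimate, going from the default of 23 to 18 costs a bit under twice the bitrate, while going from 23 to 29 saves about half.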
That's it. There are lots of options for the x264 encoder; you can check all available options at this link:
H.264 Video Encoding Guide

Dealing with problems in FLAC audio files with ffmpeg

I have gotten a set of FLAC (audio) files from a friend. I copied them to my Sonos music library, and got set to enjoy a nice album. Unfortunately, Sonos would not play the files. As a result I have been getting to know ffmpeg.
Sonos' complaint with the FLAC files was that it was "encoded at an unsupported sample rate". With rolling eyes and shaking head, I note that the free VLC media player happily plays these files, but the product I've paid for (Sonos) - does not. But I digress...
ffprobe revealed that the FLAC files contain both an Audio channel and a Video channel:
$ ffprobe -hide_banner -show_streams "/path/to/Myaudio.flac"
Duration: 00:02:23.17, start: 0.000000, bitrate: 6176 kb/s
Stream #0:0: Audio: flac, 176400 Hz, stereo, s32 (24 bit)
Stream #0:1: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 450x446 [SAR 72:72 DAR 225:223], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Cover (front)
Cool! I guess this is how some audio players are able to display the 'album artwork' when they play a song? Note also that the Audio stream is reported at 176400 Hz! Apparently I'm out of touch; I thought that 44.1khz sampling rate effectively removed all of the 'sampling artifacts' we could hear. Anyway, I learned that Sonos would support a max of 48kHz sampling rate, and this (the 176.4kHz rate) is what Sonos was unhappy about. I used ffmpeg to 'dumb it down' for them:
$ ffmpeg -i "/path/to/Myaudio.flac" -sample_fmt s32 -ar 48000 "/path/to/Myaudio48K.flac"
This seemed to work - at least I got a FLAC file that Sonos would play. However, I also got what looks like a warning of some sort:
[swscaler @ 0x108e0d000] deprecated pixel format used, make sure you did set range correctly
[flac @ 0x7feefd812a00] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
A bit more research turned up this answer which I don't quite understand, and then in a comment says, "not to worry" - at least wrt the swscaler part of the warning.
And that (finally) brings me to my questions:
1.a. What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
1.b. How should I use ffmpeg to modify the Video channel to set these specifications (ref. Q 1.a.)?
2.a. How do I remove the Video channel from the .flac audio file?
2.b. How do I add a Video channel into a .flac file?
EDIT:
I asked the above (4) questions after failing to accomplish a 'direct' conversion (a single ffmpeg command) from FLAC at 176.4 kHz to ALAC (.m4a) at 48 kHz (max supported by Sonos). I reasoned that an 'incremental' approach through a series of conversions might get me there. With the advantage of hindsight, I now see I should have posted my original failed direct conversion incantation... we live and learn.
That said, the accepted answer below meets my final objective to convert a FLAC file encoded at 176.4kHz to an ALAC (.m4a) at 48kHz, and preserve the cover art/video channel.
What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
A cover art is just a single frame so framerate has no relevance in this case. However, you don't want a video stream, it has to remain a single image, so -vsync 0 should be added. Muxer is simply the specific term for the packager as used in media file processing. It is decided by the choice of format e.g. FLAC, WAV..etc. What's important is the codec for the cover art; usually, it's PNG or JPEG. For FLAC, PNG is the default codec.
How do I remove the Video channel from the .flac audio file?
ffmpeg -i "/path/to/Myaudio.flac" -vn -c copy "/path/to/Myaudio48K.flac"
(All this does is skip any video in the input and copy everything else)
How do I add a Video channel into a .flac file?
To add cover art to audio-only formats, like MP3, FLAC..etc, the video stream has to have a disposition of attached picture. So,
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -sample_fmt s32 -ar 48000 -disposition:v attached_pic -vsync 0 "/path/to/Myaudio48K.flac"
For direct conversion to ALAC, use
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -ar 48000 -c:a alac -disposition:v attached_pic -vsync 0 -c:v png "/path/to/Myaudio48K.m4a"
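To decide up front whether a file needs this treatment, the JSON form of the ffprobe output can be inspected. The sketch below assumes the output of something like ffprobe -v error -show_streams -of json, and uses a hand-written sample string modelled on the file in the question rather than real ffprobe output. The function name and the 48000 Hz limit parameter are illustrative.

```python
import json

# Hand-written sample modelled on the question's file (not real ffprobe output).
SAMPLE = """
{"streams": [
  {"index": 0, "codec_type": "audio", "codec_name": "flac",
   "sample_rate": "176400", "disposition": {"attached_pic": 0}},
  {"index": 1, "codec_type": "video", "codec_name": "mjpeg",
   "disposition": {"attached_pic": 1}}
]}
"""


def inspect_streams(probe_json, max_rate=48000):
    """Report the audio sample rate and whether cover art is attached."""
    streams = json.loads(probe_json)["streams"]
    audio = next(s for s in streams if s["codec_type"] == "audio")
    rate = int(audio["sample_rate"])  # ffprobe reports this as a string
    has_cover = any(
        s["codec_type"] == "video"
        and s.get("disposition", {}).get("attached_pic") == 1
        for s in streams
    )
    return {"sample_rate": rate, "needs_resample": rate > max_rate, "has_cover": has_cover}


if __name__ == "__main__":
    print(inspect_streams(SAMPLE))
```

A result with needs_resample set suggests adding -ar 48000, and has_cover indicates there is an attached picture worth preserving.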

Using FFmpeg or Similar to Normalize audio in a video to EBU R128 standard

This is my first time asking a question here on Stack Overflow.
I am stuck and really struggling with this. I am trying to make the audio of some of my MXF video files comply with the EBU R128 standard.
This means that it has to be -23 and not higher than 0.5.
My current process
Watch_folder > Encoding to MXF > Output_folder
I need to make sure that when files reach the output folder, those MXF files are EBU R128 loudness compliant.
What I have done so far:
FFMPEG:
ffmpeg -i input.mxf -af loudnorm=I=-23:LRA=7:tp=-2:print_format=json -f null -
got the result:
Input Integrated: -15.1 LUFS
Input True Peak: +0.0 dBTP
Input LRA: 17.1 LU
Input Threshold: -26.2 LUFS
Output Integrated: -17.1 LUFS
Output True Peak: -1.5 dBTP
Output LRA: 5.3 LU
Output Threshold: -27.6 LUFS
Normalization Type: Dynamic
Target Offset: +1.1 LU
Then I ran:
ffmpeg -i input.mxf -af loudnorm=I=-23:LRA=7:tp=-2:measured_I=-15.1:measured_LRA=17.1:measured_tp=0:measured_thresh=-27.6:offset=1.1 -ar 48k -y output.mxf
However, when I put it through the software Eff, it says that it's not EBU compliant.
EDIT:
This also reduces the quality. For example, my 6 GB file becomes 250 MB and the quality is visibly degraded.
ffmpeg-normalize
I ran the following:
ffmpeg-normalize input.mxf -c:a pcm_s32le -ar 48000 -o output.mxf
but this gives me errors.
If I run it without specifying the output file type, I get an MKV, which will not work for me. I need it to be MXF.
OK, a few issues here.
Firstly, if your file is measured at -26.2 LUFS, you'd need to add 3.2 dB to get it to -23. But you can't do that, because your true peak is too high (you'd be over full scale). You'll need to compress (dynamic audio compression, not file/rate compression) the audio or use at least a limiter to achieve this.
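The arithmetic behind this can be written down as a small sketch. The function names are illustrative, and the -1 dBTP ceiling is the maximum true peak level from EBU R128; a static gain shifts both the integrated loudness and the true peak by the same number of dB.

```python
def required_gain(measured_lufs, target_lufs=-23.0):
    """Static gain in dB needed to move measured loudness to the target."""
    return target_lufs - measured_lufs


def needs_limiter(measured_lufs, true_peak_dbtp, target_lufs=-23.0, max_true_peak_dbtp=-1.0):
    """True if applying the static gain would push the true peak past the ceiling."""
    peak_after_gain = true_peak_dbtp + required_gain(measured_lufs, target_lufs)
    return peak_after_gain > max_true_peak_dbtp


if __name__ == "__main__":
    print(round(required_gain(-26.2), 1), needs_limiter(-26.2, 0.0))
```

With a measurement of -26.2 LUFS and a true peak of 0.0 dBTP, the gain is about +3.2 dB and the peak would end up well over the ceiling, which is the situation described above. Note that the log in the question lists -26.2 LUFS as the Input Threshold and -15.1 LUFS as the Input Integrated value; if -15.1 is the figure that applies, the required gain is negative (about -7.9 dB) and the peak is no longer the obstacle.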
A good R128 audio track should be mixed properly rather than just run through a normaliser, otherwise you risk it either failing the standard or unwanted audio effects.
If you don't have access to audio editing software or someone who can do this for you, then FFMPEG does include an audio limiter, which will give you enough headroom to raise the level to -23 LUFS.
You can do that with something like this:
-filter_complex alimiter=level_in=1:level_out=1:limit=1.5:attack=7:release=100:level=disabled
However, tuning a limiter well depends on what the video file is of (music, speech, etc) and it is something that's worth taking some time over. Alter the attack and release values until you get the result you want.
Secondly, the reason that FFMPEG has produced a smaller file of lower quality is because you didn't specify anything in the video section. FFMPEG's default action with video is (usually) to encode to h264, so whatever your codec here is (I am assuming DNxHD from the fact that you're using an MXF wrapper) needs to be specified. FFMPEG will copy the video stream though and leave it alone if you include the option -c:v copy (which means copy video codec, basically).
Post your results once you have tried these...!
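As a sketch of how the second loudnorm pass might be assembled with the video left untouched, the following Python builds the argument list from the JSON that the first pass prints with print_format=json. The sample JSON is written by hand from the values in the question, and the function name and the pcm_s24le audio codec are assumptions for the example; check which audio codecs your MXF workflow accepts. The measured_thresh option takes the input threshold from the first pass.

```python
import json

# Hand-written from the values in the question (first-pass measurements).
FIRST_PASS = """
{"input_i": "-15.10", "input_tp": "0.00", "input_lra": "17.10",
 "input_thresh": "-26.20", "target_offset": "1.10"}
"""


def second_pass_args(src, out, first_pass_json, audio_codec="pcm_s24le"):
    """Build the second loudnorm pass, copying the video stream unchanged."""
    m = json.loads(first_pass_json)
    af = (
        "loudnorm=I=-23:LRA=7:tp=-2"
        f":measured_I={m['input_i']}:measured_LRA={m['input_lra']}"
        f":measured_tp={m['input_tp']}:measured_thresh={m['input_thresh']}"
        f":offset={m['target_offset']}"
    )
    return [
        "ffmpeg", "-i", src,
        "-c:v", "copy",        # leave the video stream alone
        "-af", af,
        "-c:a", audio_codec,   # assumed codec, adjust as needed
        "-ar", "48000",
        "-y", out,
    ]


if __name__ == "__main__":
    print(second_pass_args("input.mxf", "output.mxf", FIRST_PASS))
```

Because the video is copied rather than re-encoded, the output size should stay close to the input size. A limiter, if needed, would go in the same -af chain ahead of loudnorm.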

How to convert MP3's to constant bitrate using FFMPEG

I have found that MP3's encoded with variable bit rate cause the currentTime property to be reported incorrectly, especially when scrubbing. That has wreaked havoc on my app and has been a nightmare to debug.
I believe I need to convert all my MP3's to constant bitrate. Can FFMPEG (or something else) help me do that efficiently?
Props to Terrill Thompson for attempting to pin this down*
I also had issues with HTML5 being inaccurate for large mp3s. Since quality was not a big issue for my audio, I converted to constant bit rate of 8kbps, sample rate 8k, mono and it solved my issues.
You can convert to a constant bit rate for a few files using Audacity (export > save to mp3 > constant bit rate).
Or, using FFMPEG:
ffmpeg -i input.wav -codec:a libmp3lame -b:a 8k output.mp3
If you also want to reduce to mono and a 8k sample rate:
ffmpeg -i input.wav -codec:a libmp3lame -b:a 8k -ac 1 -ar 8000 output.mp3
Using the second command compressed an hour of audio to under 5MB.
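The size figure can be sanity-checked with simple arithmetic, and the same arithmetic shows why a constant bitrate makes seeking predictable: the byte offset for a given time is a straight multiplication. This is a sketch that ignores headers and tags, and the function names are made up for the example.

```python
def cbr_size_bytes(bitrate_bps, seconds):
    """Approximate size of a constant-bitrate stream, ignoring headers and tags."""
    return bitrate_bps * seconds // 8


def byte_offset(bitrate_bps, t_seconds):
    """Approximate byte position of time t in a constant-bitrate stream."""
    return int(bitrate_bps * t_seconds / 8)


if __name__ == "__main__":
    print(cbr_size_bytes(8000, 3600))  # one hour at 8 kbps
    print(byte_offset(8000, 90.5))     # where 1 min 30.5 s would start
```

One hour at 8 kbps comes to 3,600,000 bytes, which is consistent with the "under 5MB" observation.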
Something else is going on. currentTime should not be influenced by the fact that you are using variable-bit rate MP3s.
Perhaps the context sampleRate is not the same as the sample rate of the MP3s? That will mess up timing of the audio samples because WebAudio will resample the MP3s to the context sample rate.

ffmpeg conversion to mp4 shifts the audio by one frame

I have a .mov file (codec = motion jpeg) that has an audio stream that includes small pulses at every second.
When I convert this file to mp4 using ffmpeg I notice that all my pulses are now off by one frame.
I simply used "ffmpeg -i source_file.mov target_file.mp4"
Here is an image of the comparison between the audio signals:
A1 is the original audio (.mov) and A2 is the mp4 output audio of ffmpeg.
As you can see the pulses are one frame late compared to the original.
I know that the h264 codec is lossy but one frame offset seems like a big loss if you ask me.
Is there any option I could use with ffmpeg to have a better audio stream ?
Here is the input file: https://www.dropbox.com/s/6y5g7lo5dvu0ub1/BBB_09_tree_trunk_009_ANIM_001.mov?dl=0
Here is the output file:
https://www.dropbox.com/s/10zuzwn0qs8l853/BBB_09_tree_trunk_009_ANIM_001.mp4?dl=0
If you copy the audio over, you shouldn't get the shift.
ffmpeg -i source_file.mov -c:a copy target_file.mp4
I've been working on this issue for my own needs and my file format has to be mp4. I'm working from mxf files. I've tried several options and found this to give the most accurate result (I've removed specifics for simplicity):
ffmpeg -ss 00:00:00.021 -i "input.mxf" -itsoffset -0.044 -i "input.mxf" -c:v libx264 -c:a aac -map 0:a -map 1:v "output.mp4"
Starting the first file at 21ms and mapping it as the audio, then shifting the video back 44ms gave me the most accurate sync (within several samples). I don't know why 22ms wasn't as accurate (when that's what the primer sample issue seems to equate to) and I found nothing that allowed me to work more granularly, in samples. A filter with a PTS offset had no effect. Perhaps it works differently with different file formats. It's also worth noting that the same command without the -itsoffset gave the same sync result with one difference: the video stream duration was 1 frame and 1ms off the audio and container durations. With the -itsoffset, the durations were only 1ms different. You can use 22ms to achieve an accurate duration, but check your sync, as it might be out by that slightest bit more.
Also worth noting that I stumbled across some developer commentary on the -itsoffset option which clarified that it doesn't work on audio, it works on video. It seems like the answer above is suggesting to map the offset against the audio, which apparently is not how the function is built to work. https://trac.ffmpeg.org/ticket/1349
Try MPEG-2 audio with -acodec mp2; it worked for me.
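For a sense of scale, the conversion between samples and milliseconds is sketched below. The 1024-sample figure is one AAC frame and is used here only as an assumed priming length for illustration; the actual encoder delay depends on the encoder, and the frame rates shown are examples rather than values taken from the files above.

```python
def samples_to_ms(samples, sample_rate):
    """Duration of a block of audio samples in milliseconds."""
    return 1000.0 * samples / sample_rate


def video_frame_ms(fps):
    """Duration of one video frame in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    for rate in (44100, 48000):
        # Assumed priming length of 1024 samples (one AAC frame).
        print(rate, round(samples_to_ms(1024, rate), 2))
    for fps in (24, 25, 30):
        print(fps, round(video_frame_ms(fps), 2))
```

Under that assumption the delay lands between 21 and 24 ms at common sample rates, which is in the range of the 21 to 22 ms offsets discussed above and about half a video frame at 24 or 25 fps.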

Resources