Is there a way to detect black frames or audio loss in video files with FFmpeg?

I am trying to run a QC check on my video files.
I know that there is a way to detect black frames or audio loss in a video file. Can anyone help me with how the syntax is written?
I have tried the following, but I am having issues because I do not know how to interpret the output.
ffmpeg -i inputfile.mxf -vf blackdetect=d=0.1:pix_th=.1 -f rawvideo -y /dev/null
Also, is there any way to check whether I have any packets that are in error, using ffprobe or ffmpeg?
I also do not understand what this d=0.1:pix_th=.1 is doing.
EDIT:
I have used this command now
ffmpeg -i 01.mxf -vf blackdetect=d=0:pix_th=.01 -f rawvideo -y /NUL
this gives me
[blackdetect @ 000001a2ed843740] black_start:0.04 black_end:2
black_duration:1.96
[mpeg2video @ 000001a2ed86efc0] ac-tex damaged at 45 304.08
bitrate=829328.3kbits/s dup=1 drop=0 speed= 5.6x
However, the actual video has more black frames than this.
Is there a way to tell it to continue looking through the video and report all the black frames, not just the first instance?

I also do not understand what this 0.1:pix_th=.1 is doing?
d=0.1 sets the minimum duration, in seconds, of continuous black that you want to detect. For example, if you set it to 5 you will only be notified when the input video contains black for 5 or more seconds; black segments shorter than 5 seconds won't be detected.
pix_th=.1 sets the pixel threshold for deciding whether a frame counts as black (how dark the frame has to be).
You can set a value between 0 and 1.
0 -> pure black (maximum darkness).
1 -> light black (detects all frames, because you are telling ffmpeg to treat everything from the minimum to the maximum pixel value as a black frame).
However, the actual video has more black frames than this.
Is there a way to tell it to continue looking through the video and get all black frames, not just the first instance?
Increase the pix_th value and check.
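For example, a starting point you could tune (the duration and threshold values here are assumptions, and redirecting stderr to a file makes it easier to collect every reported black interval):
ffmpeg -i 01.mxf -vf "blackdetect=d=0.5:pix_th=0.10" -an -f null - 2> black.log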
For more information, see BlackDetect.

Related

Combine Audio and Images in Stream

I would like to be able to create images on the fly and also create audio on the fly too and be able to combine them together into an rtmp stream (for Twitch or YouTube). The goal is to accomplish this in Python 3 as that is the language my bot is written in. Bonus points for not having to save to disk.
So far, I have figured out how to stream to rtmp servers using ffmpeg by loading a PNG image and playing it on loop as well as loading a mp3 and then combining them together in the stream. The problem is I have to load at least one of them from file.
I know I can use Moviepy to create videos, but I cannot figure out whether or not I can stream the video from Moviepy to ffmpeg or directly to rtmp. I think that I have to generate a lot of really short clips and send them, but I want to know if there's an existing solution.
There's also OpenCV which I hear can stream to rtmp, but cannot handle audio.
A redacted version of an ffmpeg command I have successfully tested with is
ffmpeg -loop 1 -framerate 15 -i ScreenRover.png -i "Song-Stereo.mp3" -c:v libx264 -preset fast -pix_fmt yuv420p -threads 0 -f flv rtmp://SITE-SUCH-AS-TWITCH/.../STREAM-KEY
or
cat Song-Stereo.mp3 | ffmpeg -loop 1 -framerate 15 -i ScreenRover.png -i - -c:v libx264 -preset fast -pix_fmt yuv420p -threads 0 -f flv rtmp://SITE-SUCH-AS-TWITCH/.../STREAM-KEY
I know these commands are not set up properly for smooth streaming; the result manages to screw up both Twitch's and YouTube's players, and I will have to figure out how to fix that.
The problem with this is I don't think I can stream both the image and the audio at once when creating them on the spot. I have to load one of them from the hard drive. This becomes a problem when trying to react to a command or user chat or anything else that requires live reactions. I also do not want to destroy my hard drive by constantly saving to it.
As for the Python code, what I have tried so far in order to create a video is the following code. This still saves to the HD and is not responsive in realtime, so it is not very useful to me. The video itself is okay, with one exception: as time passes, the time shown in the QR code and the video's own clock drift farther and farther apart towards the end of the video. I can work around that limitation if it shows up while live streaming.
def make_frame(t):
    # Render a QR code containing the current timestamp and return it as an RGB array.
    img = qrcode.make("Hello! The second is %s!" % t)
    return numpy.array(img.convert("RGB"))

clip = mpy.VideoClip(make_frame, duration=120)
clip.write_gif("test.gif", fps=15)
gifclip = mpy.VideoFileClip("test.gif")
gifclip.set_duration(120).write_videofile("test.mp4", fps=15)
My goal is to be able to produce something along the lines of this pseudo-code:
original_video = qrcode_generator("I don't know, a clock, pyotp, today's news sources, just anything that can be generated on the fly!")
original_video.overlay_text(0,0,"This is some sample text, the left two are coordinates, the right three are font, size, and color", Times_New_Roman, 12, Blue)
original_video.add_audio(sine_wave_generator(0,180,2)) # frequency min-max, seconds
# NOTICE - I did not add any time measurements to the actual video itself. The whole point is this is a live stream and not a video clip, so the time frame would be now. The 2 seconds list above is for our psuedo sine wave generator to know how long the audio clip should be, not for the actual streaming library.
stream.send_to_rtmp_server(original_video) # Doesn't matter if ffmpeg or some native library
The above example is what I am looking for in terms of video creation in Python and then streaming. I am not trying to create a clip and then stream it later; I am trying to have the program respond to outside events and then update its stream to do whatever it wants. It is sort of like a chat bot, but with video instead of text.
def track_movement(...):
...
return ...
original_video = user_submitted_clip(chat.lastVideoMessage)
original_video.overlay_text(0,0,"The robot watches the user's movements and puts a blue square around it.", Times_New_Roman, 12, Blue)
original_video.add_audio(sine_wave_generator(0,180,2)) # frequency min-max, seconds
# It would be awesome if I could also figure out how to perform advanced actions such as tracking movements or pulling a face out of a clip and then applying effects to it on the fly. I know OpenCV can track movements and I hear that it can work with streams, but I cannot figure out how that works. Any help would be appreciated! Thanks!
Because I forgot to add the imports, here are some useful imports I have in my file!
import pyotp
import qrcode
from io import BytesIO
from moviepy import editor as mpy
The library pyotp is for generating one-time password authenticator codes, qrcode is for the QR codes, BytesIO is used for virtual files, and moviepy is what I used to generate the GIF and MP4. I believe BytesIO might be useful for piping data to the streaming service, but how that happens depends entirely on how data is sent to the service, whether that is ffmpeg over the command line (from subprocess import Popen, PIPE) or a native library.
Are you using ffmpeg.exe and running a command through CMD? If so, you can use either the concat demuxer or a pipe. With the concat demuxer, ffmpeg can take its image input from a text file; the file should contain the image paths, and ffmpeg can find those images in different folders. The following command line shows how you can use the concat demuxer; the image locations are saved in input.txt.
ffmpeg -f concat -i input.txt -vsync vfr -pix_fmt yuv420p output.mp4
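For reference (the paths and durations below are made-up examples), input.txt for the concat demuxer is just a list of file directives, optionally with a duration for each image:
file '/path/one/frame001.png'
duration 0.2
file '/path/two/frame002.png'
duration 0.2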
But the most suitable solution would be to use a data pipe to feed images to ffmpeg.
cat *.png | ffmpeg -f image2pipe -i - output.mkv
You can check this link for more information about the ffmpeg data pipe.
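If you want to drive that pipe from Python instead of cat, a minimal sketch might look like the following (the stream URL, frame rate, and loop length are placeholders, this only covers the video side, and audio would still need its own input):

import subprocess
import qrcode

# Start ffmpeg reading PNG images from stdin and streaming them out as H.264/FLV.
ffmpeg = subprocess.Popen(
    ["ffmpeg", "-f", "image2pipe", "-framerate", "15", "-i", "-",
     "-c:v", "libx264", "-preset", "fast", "-pix_fmt", "yuv420p",
     "-f", "flv", "rtmp://SITE-SUCH-AS-TWITCH/.../STREAM-KEY"],
    stdin=subprocess.PIPE)

for i in range(150):                   # e.g. 10 seconds of frames at 15 fps
    img = qrcode.make("Frame %d" % i)  # generate each frame in memory, no disk involved
    img.save(ffmpeg.stdin)             # qrcode writes PNG bytes straight into the pipe

ffmpeg.stdin.close()
ffmpeg.wait()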
Generating multiple videos and streaming them in real time is not a very stable solution; you can run into several problems.
I have settled on using Gstreamer to create my streams on the fly. It allows me to take separate video and audio streams and combine them together. I do not exactly have a working example right now, but I will hopefully either get an answer or figure it out on my own soon, over at Gstreamer in Python exits instantly, but is fine on command line.

SoX detecting volume is always near max

I'm trying to detect speech volume above a threshold in short (2-3 second) audio files with SoX, but it always comes out at about 90% of max volume regardless of silence or noise.
This is the command I'm using (I've tried varying the scale option):
sox noise.wav -n stats -s 99
If I shout with the microphone right at my mouth, or bash it, I can get a detectable difference of about 95% volume, but it is a desktop-style microphone. Playing back the audio files, there is clearly audible silence recorded, and there is still a big distinction when speaking from a distance.
Is there a setting I'm missing, or has anyone else encountered this?

How can low frame rate video be made to look more smooth?

I am trying to clean up a video that was recorded in 2003 in low-light conditions on what was possibly a cameraphone. The video has been cleaned up somewhat (cropped, logos removed and stabilized), but it remains quite jerky, due in large part to its low frame rate. What are some tricks that might clean up the video in this regard? I feel that I am asking for something a bit like tweening in flash animations, but for pixels, whereby additional frames are generated using nearby frames of the video. Does such a trick exist? Is there another way to approach this problem?
To reproduce the video processing so far, take the following steps:
# get video
wget http://www.anwarweb.net/saddamdown.wmv
# crop
ffmpeg -i saddamdown.wmv -filter:v "crop=292:221:14:10" -c:a copy saddamdown_crop.wmv
# remove logo 1
ffmpeg -i saddamdown_crop.wmv -vf delogo=x=17:y=77:w=8:h=54 -c:a copy saddamdown_crop_delogo_1.wmv
# remove logo 2
ffmpeg -i saddamdown_crop_delogo_1.wmv -vf delogo=x=190:y=174:w=54:h=8 -c:a copy saddamdown_crop_delogo_1_delogo_2.wmv
# stabilize
ffmpeg -i saddamdown_crop_delogo_1_delogo_2.wmv -vf deshake saddamdown_crop_delogo_1_delogo_2_deshake.wmv
Note: The video is of the Saddam Hussein execution.
You could try with slowmoVideo: https://github.com/slowmoVideo/slowmoVideo
It's open-source software for creating smooth slow-motion effects from pixel motion analysis (Windows, Linux, and OS X with Wine or CrossOver; it reads and writes via ffmpeg).
First calculate the slow-down ratio: for example, if the original video is 18 fps and the desired output is 24 fps, set the speed in slowmoVideo to 75% (18/24 = 0.75).
The result depends a lot on the video content; obviously, the more static the shots, the better.
In any case, you can tweak what they call "Optical Flow", which is the analysis part of the process.
Good luck ;)
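If you would rather stay entirely within ffmpeg (this is a separate suggestion, not part of the slowmoVideo workflow above), the minterpolate filter does motion-compensated frame interpolation; a minimal sketch, where the 30 fps target and output name are assumptions to adjust:
ffmpeg -i saddamdown_crop_delogo_1_delogo_2_deshake.wmv -vf "minterpolate=fps=30:mi_mode=mci" saddamdown_smooth.wmv
As with slowmoVideo, the quality of the interpolation depends heavily on how much the content moves between frames.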

How to Simply Remove Duplicate Frames from a Video using ffmpeg

First of all, I'd preface this by saying I'm NO EXPERT with video manipulation,
although I've been fiddling with ffmpeg for years (in a fairly limited way). Hence, I'm not too flash with all the language folk often use... and how it affects what I'm trying to do in my manipulations... but I'll have a go with this anyway...
I've checked a few links here, for example:
ffmpeg - remove sequentially duplicate frames
...but the content didn't really help me.
I have some hundreds of video clips that have been created under both Windows and Linux using both ffmpeg and other similar applications. However, they have some problems with times in the video where the display is 'motionless'.
As an example, let's say we have some web site that streams a live video into, say, a Flash video player/plugin in a web browser. In this case, we're talking about a traffic camera video stream, for example.
There's an instance of ffmpeg running that is capturing a region of the (Windows) desktop into a video file, viz:-
ffmpeg -hide_banner -y -f dshow ^
-i video="screen-capture-recorder" ^
-vf "setpts=1.00*PTS,crop=448:336:620:360" ^
-an -r 25 -vcodec libx264 -crf 0 -qp 0 ^
-preset ultrafast SAMPLE.flv
Let's say the actual 'display' that is being captured looks like this:-
123456789 XXXXX 1234567 XXXXXXXXXXX 123456789 XXXXXXX
^---a---^ ^-P-^ ^--b--^ ^----Q----^ ^---c---^ ^--R--^
...where each character position represents a (sequence of) frame(s). Owing to a poor internet connection, a "single frame" can be displayed for an extended period (the 'X' characters being an (almost) exact copy of the immediately previous frame). So this means we have segments of the captured video where the image doesn't change at all (to the naked eye, anyway).
How can we deal with the duplicate frames?... and how does our approach change if the 'duplicates' are NOT the same to ffmpeg but LOOK more-or-less the same to the viewer?
If we simply remove the duplicate frames, the 'pacing' of the video is lost, and what used to take, maybe, 5 seconds to display now takes a fraction of a second, giving a very jerky, unnatural motion, even though there are no duplicate images in the video. This seems to be achievable using ffmpeg with the 'mpdecimate' filter, viz:-
ffmpeg -i SAMPLE.flv ^ ... (i)
-r 25 ^
-vf mpdecimate,setpts=N/FRAME_RATE/TB DEC_SAMPLE.mp4
That reference I quoted uses a command that shows which frames 'mpdecimate' will remove when it considers them to be 'the same', viz:-
ffmpeg -i SAMPLE.flv ^ ... (ii)
-vf mpdecimate ^
-loglevel debug -f null -
...but knowing that (complicatedly formatted) information, how can we re-organize the video without executing multiple runs of ffmpeg to extract 'slices' of video for re-combining later?
In that case, I'm guessing we'd have to run something like:-
user specifies a 'threshold duration' for the duplicates
(maybe run for 1 sec only)
determine & save main video information (fps, etc - assuming
constant frame rate)
map the (frame/time where duplicates start)->no. of
frames/duration of duplicates
if the duration of duplicates is less than the user threshold,
don't consider this period as a 'series of duplicate frames'
and move on
extract the 'non-duplicate' video segments (a, b & c in the
diagram above)
create 'new video' (empty) with original video's specs
for each video segment
extract the last frame of the segment
create a short video clip with repeated frames of the frame
just extracted (duration = user spec. = 1 sec)
append (current video segment+short clip) to 'new video'
and repeat
...but in my case, a lot of the captured videos might be 30 minutes long and have hundreds of 10 sec long pauses, so the 'rebuilding' of the videos will take a long time using this method.
This is why I'm hoping there's some "reliable" and "more intelligent" way to use
ffmpeg (with/without the 'mpdecimate' filter) to do the 'decimate' function in only a couple of passes or so... Maybe there's a way that the required segments could even be specified (in a text file, for example) and, as ffmpeg runs, it would
stop/restart its transcoding at the specified times/frame numbers?
Short of this, is there another application (for use on Windows or Linux) that could do what I'm looking for, without having to manually set start/stop points or
extract/combine video segments by hand...?
I've been trying to do all this with ffmpeg N-79824-gcaee88d under Win7-SP1 and (a different version I don't currently remember) under Puppy Linux Slacko 5.6.4.
Thanks a heap for any clues.
I assume what you want to do is to keep frames with motion plus up to 1 second of duplicate frames, but discard the rest.
ffmpeg -i in.mp4 -vf
"select='if(gt(scene,0.01),st(1,t),lte(t-ld(1),1))',setpts=N/FRAME_RATE/TB"
trimmed.mp4
What the select filter expression does is make use of an if-then-else operator:
gt(scene,0.01) checks whether the current frame has detected motion relative to the previous frame. The value will have to be calibrated based on manual observation, by seeing which value accurately captures actual activity as compared to sensor/compression noise or visual noise in the frame. See here on how to get a list of all scene change values (a sample command is also shown after this explanation).
If the frame is evaluated to have motion, the then clause evaluates st(1,t). The function st(val,expr) stores the value of expr in a variable numbered val and it also returns that expression value as its result. So, the timestamp of the kept frames will keep on being updated in that variable until a static frame is encountered.
The else clause checks the difference between the current frame timestamp and the timestamp of the stored value. If the difference is less than 1 second, the frame is kept, else discarded.
The setpts sanitizes the timestamps of all selected frames.
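For reference (this command is not from the linked page, but it is the same listing used in the answer below), one way to print the scene-change score of every frame, so the 0.01 threshold can be calibrated, is:
ffmpeg -i in.mp4 -vf "select='gte(scene,0)',metadata=print" -an -f null -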
Edit: I tested my command with a video input I synthesized and it worked.
I've done a bit of work on this question... and have found the following works pretty well...
It seems like the input video has to have a "constant frame rate" for things to work properly, so the first command is:-
ffmpeg -i test.mp4 ^
-vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" ^
-vsync cfr test01.mp4
I then need to look at the 'scores' for each frame. Such a listing is produced by:-
ffmpeg -i test01.mp4 ^
-vf select="'gte(scene,0)',metadata=print" -f null -
I'll look at all those scores... and average them (mean) - a bit dodgy, but it seems to work OK. In this example, that average score is '0.021187'.
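(A hedged sketch of that averaging step, not the Perl code the author mentions; it assumes the output of the previous command was redirected to a file, e.g. with "2> scores.txt", and that the scores appear as lavfi.scene_score=... lines in that log:)

import re

# Pull every lavfi.scene_score value out of the captured metadata=print log
# and report the mean, which is then used as the 'scene' threshold below.
scores = []
with open("scores.txt") as log:
    for line in log:
        match = re.search(r"lavfi\.scene_score=([0-9.]+)", line)
        if match:
            scores.append(float(match.group(1)))

if scores:
    print("mean scene score: %f" % (sum(scores) / len(scores)))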
I then have to select a 'persistence' value -- how long to let the 'duplicated' frames run. If you force it to only keep one frame, the entire video will tend to run much too quickly... So, I've been using 0.2 seconds as a starting point.
So the next command becomes:-
ffmpeg -i test01.mp4 ^
-vf "select='if(gt(scene,0.021187),st(1,t),lte(t-ld(1),0.20))',
setpts=N/FRAME_RATE/TB" output.mp4
After that, the resultant 'output.mp4' video seems to work pretty well. Only the 'persistence' value might need a bit of fiddling, to compromise between a smoother-playing video and scenes that change a bit abruptly.
I've put together some Perl code that works Ok, which I'll work out how to post, if folks are interested in it... eventually(!)
Edit: Another advantage of doing this 'decimating' is that the files are of shorter duration (obviously) AND they are smaller in size. For example, a sample video that ran for 00:07:14 and was 22 MB in size went to 00:05:35 and 11 MB.
Variable frame rate encoding is totally possible, but I don't think it does what you think it does. I am assuming that you wish to remove these duplicate frames to save space/bandwidth? If so, it will not work, because the codec is already doing it. Codecs use reference frames and only encode what has changed from the reference, so the duplicate frames take almost no space to begin with. Basically, each frame is encoded as a packet of data saying "copy the previous frame and make this change". The X frames have zero changes, so it only takes a few bytes to encode each one.
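(Not part of the answer above, but one way to see this for yourself is to list per-packet sizes for the video stream with ffprobe; the near-duplicate frames show up as very small packets:)
ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,size -of csv SAMPLE.flv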

Envelope pattern in SoX (Sound eXchange) or ffmpeg

I've been using SoX to generate white noise. I'm after a way of modulating the volume across the entire track in a way that will create a pattern similar to this:
(image: White noise envelope effect)
I've experimented with fade, but that fades in to 100% volume and fades out to 0% volume, which is just a pain in this instance.
The tremolo effect isn't quite what I'm after either, as the frequency of the pattern will be changing over time.
The only other alternative is to split the white noise file into separate files, apply fade and then apply trim to either end so it doesn't fade all the way, but this seems like a lot of unnecessary processing.
I've been checking out this example Using SoX to change the volume level of a range of time in an audio file, but I don't think it's quite what I'm after.
I'm using the command-line in Ubuntu with SoX, but I'm open to suggestions with ffmpeg, or any other Linux based command-line solution.
With ffmpeg, you could use the volume filter
ffmpeg -i input.wav -af \
"volume='if(lt(mod(t\,5)/5\,0.5), 0.2+0.8*mod(2*t\,5)/5\, 1.0-0.8*mod(t-(5/2)\,5)/(5/2))':eval=frame" \
output.wav
The expression in the filter above increases the volume from 0.2 to 1.0 over t=0 to t=2.5 seconds, then gradually back down to 0.2 at t=5 seconds. The period of the envelope here is 5 seconds.
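If you need a different period, every 5 in the expression (including the (5/2) terms) would be replaced with the new period in seconds; the 0.2 and 0.8 constants set the floor of the envelope and the depth of the swing, so they can be adjusted independently of the period.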
