I want to create a Python script that makes a subtitle (a ticking timecode).
I have an H.264 binary file that carries a timestamp in each frame, so I want to parse these timestamps and build the subtitle from them.
Here's what I've tried with ffmpeg:
ffmpeg -y -i video.avi -vf "drawtext=fontfile=C:\Windows\Fonts\consolab.ttf: fontsize=12:fontcolor=yellow: box=1:boxcolor=black#0.4: text='TIME\: %{pts\:gmtime\:1561939200}':x=(w-tw)/2:y=(h-th)/2" test.avi
The output .avi shows the timestamp correctly, but I had to re-encode the source video file, which takes a lot of time.
So I want to take a different approach and create subtitles instead.
The question is: is there any way to make a subtitle from the frame time, or any useful info to fill me in?
(screenshots of the current result and the expected result)
You can use the timecode option of drawtext directly by defining the starting time:
ffmpeg -i source_file -c:v libx264 -vf drawtext="fontfile=C\\:/Windows/Fonts/arial.ttf:fontsize=45:timecode='10\:00\:00\:00':fontcolor='white':box=1:boxcolor='black':rate=25:x=(w-text_w)/2:y=h/1.2" -c:a aac -f mp4 output_with_TC.mp4
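Two details worth noting about this command: the colons inside the timecode value are escaped as \: because ':' also separates drawtext options, and rate= should match the source frame rate, since it controls how fast the burnt-in timecode ticks.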
You can also grab the timecode from the file using ffprobe and use it in a batch file:
@echo off
cd C:\bin
set file=%1
For %%A in ("%file%") do (
    Set Folder=%%~dpA
    Set Name=%%~nA
)
:: Ask ffprobe for the container's timecode tag
set "CommandLine=ffprobe -i %file% -show_entries format_tags=timecode -of default=noprint_wrappers=1"
setlocal EnableDelayedExpansion
for /F "delims=" %%I in ('!CommandLine!') do set "TC=%%I"
echo.File is: %file%
echo.TC is: %TC%
:: %TC% looks like "TAG:timecode=HH:MM:SS:FF" - pull out the four fields
set hours=%TC:~13,2%
set minutes=%TC:~16,2%
set seconds=%TC:~19,2%
set frames=%TC:~22,2%
set h=!hours: =!
set m=!minutes: =!
set s=!seconds: =!
set f=!frames: =!
set final_TC=%h%:%m%:%s%:%f%
echo.final_TC is : %final_TC%
ffmpeg -i %file% -c:v libx264 -vf drawtext="fontfile=C\\:/Windows/Fonts/arial.ttf:fontsize=45:timecode=\'%final_TC%\':fontcolor='white':box=1:boxcolor='black#0.4':rate=25:x=1600:y=30" -c:a aac -f mp4 output_%Name%.mp4 -y
endlocal
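If you would rather drive the same idea from Python instead of a batch file, here is a rough sketch (assuming ffprobe/ffmpeg are on the PATH; get_timecode and the file names are just illustrative, and rate=25 assumes a 25 fps source):

import subprocess

def get_timecode(path):
    """Read the container's timecode tag with ffprobe (empty string if absent)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format_tags=timecode",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True)
    return out.stdout.strip()

src = "source_file.mp4"
tc = get_timecode(src) or "00:00:00:00"
drawtext = ("drawtext=fontfile=C\\:/Windows/Fonts/arial.ttf:fontsize=45:"
            "timecode='" + tc.replace(":", "\\:") + "':rate=25:"
            "fontcolor=white:box=1:boxcolor=black@0.4:x=(w-text_w)/2:y=h/1.2")
subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                "-vf", drawtext, "-c:a", "aac", "output_with_TC.mp4"])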
Maybe this code will help you to get a quick solution (I've modified it slightly for posting here):
Set the fps variable to your own video's FPS.
Set the myTimeNum variable by using a timer that increments a frameNum counter once per frame interval.
Logic:
Where the example FPS is 10...
FPS= 10; frameNum = 0; myTimeNum = 0
This means that 10 times per second (using a timer) you must compute:
myTimeNum = (1000 / FPS) * frameNum
timecode = smpte.totc(myTimeNum, fps)
frameNum++
Code to try:
# Converts frames to SMPTE timecode of arbitrary frame rate and back.
# For DF calculations use 29.97 frame rate.
# Igor Ridanovic, igor#HDhead.com
def totc(x, fps):
    """Converts frame count to SMPTE timecode."""
    spacer = ':'
    frHour = fps * 3600
    frSec = fps * 60
    hr = int(x // frHour)
    mn = int((x - hr * frHour) // frSec)
    sc = int((x - hr * frHour - mn * frSec) // fps)
    fr = int(round(x - hr * frHour - mn * frSec - sc * fps))
    return (
        str(hr).zfill(2) + spacer +
        str(mn).zfill(2) + spacer +
        str(sc).zfill(2) + spacer +
        str(fr).zfill(2))

# totc() comes from Igor Ridanovic's smpte.py; it is defined above so this
# snippet is self-contained and can be run as-is.
fps = 30
myTimeNum = 50000
timecode = totc(myTimeNum, fps)
print(timecode)  # -> 00:27:46:20 for 50000 frames at 30 fps
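Since the original question asks for a Python script that turns per-frame timestamps into a subtitle, here is a minimal sketch of that last step, assuming you have already parsed one UNIX timestamp per frame out of the H.264 stream, as your pts:gmtime drawtext command suggests (frame_times, fps and write_srt are illustrative names; the parsing itself is not shown). It writes one SRT cue per frame, so a player displays a ticking clock without re-encoding the video:

import datetime

def srt_time(seconds):
    """Format a position in the video as SRT's HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600000)
    m, ms = divmod(ms, 60000)
    s, ms = divmod(ms, 1000)
    return "%02d:%02d:%02d,%03d" % (h, m, s, ms)

def write_srt(frame_times, fps, path="timecode.srt"):
    """frame_times: one UNIX timestamp per frame, in display order."""
    frame_dur = 1.0 / fps
    with open(path, "w") as f:
        for i, ts in enumerate(frame_times):
            start = i * frame_dur  # each cue covers exactly one frame
            label = datetime.datetime.utcfromtimestamp(ts).strftime("%H:%M:%S.%f")[:-3]
            f.write("%d\n%s --> %s\nTIME: %s\n\n" %
                    (i + 1, srt_time(start), srt_time(start + frame_dur), label))

# write_srt(frame_times, fps=25)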
Related
So I have a video that I am trying to crop with ffmpeg, and it does crop successfully, but the cropped video has about 1 second of black video at the start.
I assume it is my lack of knowledge of how to set the right codec, or I am splitting a frame somehow.
This is the block of code that does the cropping:
import subprocess
import time

def crop_video(original_vid_name, cropped_vid_name, start_time, end_time):
    start_time_timeformat = time.strftime('%H:%M:%S', time.gmtime(start_time))
    end_time_timeformat = time.strftime('%H:%M:%S', time.gmtime(end_time))
    ffmpeg_timecrop_cmd = "-ss {} -to {}".format(start_time_timeformat, end_time_timeformat)
    ffmpeg_quality_codec_cmd = '-q:v 10 -crf 18 -c:v copy -c:a copy -avoid_negative_ts make_zero'
    ffmpeg_cmd = 'ffmpeg {} -i {} {} {}'.format(
        ffmpeg_timecrop_cmd, original_vid_name, ffmpeg_quality_codec_cmd, cropped_vid_name
    )
    subprocess.call(ffmpeg_cmd, shell=True)
Please let me know what I am doing wrong.
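One possible explanation, offered as a guess: when the video stream is copied (-c:v copy), the cut cannot begin on an arbitrary frame, and anything before the first decodable keyframe tends to show up as black in players; the -q:v/-crf flags are also ignored during stream copy. A hedged sketch that re-encodes only the video for the cut (crop_video_reencode is an illustrative name; the interface matches the function above):

import subprocess
import time

def crop_video_reencode(original_vid_name, cropped_vid_name, start_time, end_time):
    # Re-encode the video so the cut does not depend on keyframe positions;
    # audio is still stream-copied.
    start_tf = time.strftime('%H:%M:%S', time.gmtime(start_time))
    end_tf = time.strftime('%H:%M:%S', time.gmtime(end_time))
    cmd = ['ffmpeg', '-ss', start_tf, '-to', end_tf, '-i', original_vid_name,
           '-c:v', 'libx264', '-crf', '18', '-c:a', 'copy',
           '-avoid_negative_ts', 'make_zero', '-y', cropped_vid_name]
    subprocess.call(cmd)  # passing a list avoids shell quoting issues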
I'm recording audio with Julia and want to be able to trigger a 5 second recording after the audio signal exceeds a certain volume. This is my record script so far:
using PortAudio, SampledSignals, LibSndFile, FileIO, Dates
stream = PortAudioStream("HDA Intel PCH: ALC285 Analog (hw:0,0)")
buf = read(stream, 5s)
close(stream)
save(string("recording_", Dates.format(now(), "yyyymmdd_HHMMSS"), ".wav"), buf, Fs = 48000)
I'm new to Julia and signal processing in general. How can I tell this only to start recording once the audio exceeds a specified volume threshold?
You need to test the sound you capture for average amplitude and act on that. Save if loud enough, otherwise rinse and repeat.
using PortAudio, SampledSignals, LibSndFile, FileIO, LinearAlgebra

const hassound = 10 # choose this to fit

suprathreshold(buf, thresh = hassound) = norm(buf) / sqrt(length(buf)) > thresh # power over threshold

stream = PortAudioStream("HDA Intel PCH: ALC285 Analog (hw:0,0)")
while true
    buf = read(stream, 5s)
    if suprathreshold(buf)
        save("recording.wav", buf, Fs = 48000) # should really append here maybe???
    end
end
close(stream) # keep the stream open between reads; only close it once you are done
I have tried many methods but still cannot get the exact length of an mp3 file.
With moviepy:
audiofile = AudioFileClip(url)
print("duration moviepy: " + str(audiofile.duration))
I get this result:
duration moviepy: 183.59
With mutagen:
from mutagen.mp3 import MP3
audio = MP3(url)
print("duration mutagen: " + str(audio.info.length))
I get a different duration value:
duration mutagen: 140.93416666666667
The actual duration when I open the file in Windows Media Player is 2m49s.
I don't know what is wrong with my audio file; I tested a few other files from the music website and those give the correct value.
This is my audio file
Try pysox.
I tried pysox on the audio file attached to this question post.
Note: pysox needs the SoX CLI.
Here is how to use it:
import sox
mp3_path = "YOUR STRANGE MP3 FILEPATH"
length = sox.file_info.duration(mp3_path)
print("duration sec: " + str(length))
print("duration min: " + str(int(length/60)) + ':' + str(int(length%60)))
The results are:
duration sec: 205.347982
duration min: 3:25
And here is the other duration information for that mp3 file:
ID3 information => 2:49
mutagen => 2:20
pysox => 3:25
actual length => 3:26
mutagen seems to just read the ID3 information.
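For one more data point from Python, you could also ask ffprobe for the duration it sees (a sketch, assuming ffprobe is installed and on the PATH; get_duration is an illustrative name, and for VBR files without a proper header this value can itself be an estimate):

import subprocess

def get_duration(path):
    """Return the duration in seconds as reported by ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True)
    return float(out.stdout.strip())

length = get_duration("YOUR STRANGE MP3 FILEPATH")
print("duration min: %d:%02d" % (int(length / 60), int(length % 60)))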
Using Mutagen
pip install mutagen
python:
import os
from mutagen.mp3 import MP3

def convert_seconds(seconds):
    hours = seconds // 3600
    seconds %= 3600
    minutes = seconds // 60
    seconds %= 60
    return "%02d:%02d:%02d" % (hours, minutes, seconds)

path = "Your mp3 files folder."
total_length = 0
for root, dirs, files in os.walk(os.path.abspath(path)):
    for file in files:
        if file.endswith(".mp3"):
            audio = MP3(os.path.join(root, file))
            length = audio.info.length
            total_length += length

hours, minutes, seconds = convert_seconds(total_length).split(":")
print("total duration: " + str(int(hours)) + ':' + str(int(minutes)) + ':' + str(int(seconds)))
I need to remove the B-Frames and also add a silent audio track to an mpeg. This is my source file (mediainfo input.mpg):
General
Complete name : input.mpg
Format : MPEG-PS
File size : 3.88 MiB
Duration : 4s 0ms
Overall bit rate : 8 131 Kbps
Writing library : encoded by TMPGEnc (ver. 2.525.64.184)
Video
ID : 224 (0xE0)
Format : MPEG Video
Format version : Version 1
Format settings, BVOP : Yes
Format settings, Matrix : Default
Format settings, GOP : M=3, N=9
Duration : 4s 0ms
Bit rate : 8 000 Kbps
Width : 800 pixels
Height : 600 pixels
Display aspect ratio : 4:3
Frame rate : 30.000 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.556
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
GOP, Open/Closed : Open
GOP, Open/Closed of first frame : Closed
Stream size : 3.80 MiB (98%)
Writing library : TMPGEnc 2.525.64.184
I'm trying it with:
ffmpeg -f lavfi -i anullsrc -i input.mpg -c:v mpeg1video -b:v 8000k \
-minrate 8000k -maxrate 8000k -pix_fmt yuv420p -g 9 -acodec mp2 \
-ac 2 -ab 128k -ar 44100 -async 1 -shortest -y out.mpg
mediainfo out.mpg
General
Complete name : out.mpg
Format : MPEG-PS
File size : 3.96 MiB
Duration : 4s 23ms
Overall bit rate : 8 251 Kbps
Video
ID : 224 (0xE0)
Format : MPEG Video
Format version : Version 1
Format settings, BVOP : No
Format settings, Matrix : Default
Format settings, GOP : N=9
Duration : 4s 0ms
Bit rate : 8 000 Kbps
Width : 800 pixels
Height : 600 pixels
Display aspect ratio : 4:3
Frame rate : 30.000 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.556
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
GOP, Open/Closed : Open
GOP, Open/Closed of first frame : Closed
Stream size : 3.80 MiB (96%)
Audio
ID : 192 (0xC0)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Duration : 4s 23ms
Bit rate mode : Constant
Bit rate : 128 Kbps
Channel(s) : 2 channels
Sampling rate : 44.1 KHz
Compression mode : Lossy
Delay relative to video : -11ms
Stream size : 62.9 KiB (2%)
Unfortunately, the audio duration differs from the video duration, and there is a "Delay relative to video" of -11 ms.
In another post I found this option:
-af asetpts=PTS+0.011/TB
which gives me this output:
Audio
ID : 192 (0xC0)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Duration : 3s 997ms
Bit rate mode : Constant
Bit rate : 128 Kbps
Channel(s) : 2 channels
Sampling rate : 44.1 KHz
Compression mode : Lossy
Stream size : 62.5 KiB (2%)
This is close, but still not the "4s 0ms" I expected. How can I add a silent audio track with the absolutely exact duration? And am I encoding the video correctly?
Try
ffmpeg -f lavfi -i "aevalsrc=0|0:d=4.00" -i input.mpg -c:v mpeg1video -b:v 8000k \
-minrate 8000k -maxrate 8000k -pix_fmt yuv420p -g 9 -acodec mp2 \
-ac 2 -ab 128k -ar 44100 -async 1 -shortest -y out.mpg
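A note on why this form is used, to the best of my understanding: aevalsrc=0|0:d=4.00 generates exactly 4.00 seconds of two-channel silence (one zero expression per channel, separated by |), so the audio length is fixed up front instead of relying on -shortest to cut an endless anullsrc stream, which appears to be what produced the small 23 ms overshoot above. Set d= to match your video's duration.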
I am running my ALSA driver on Ubuntu 14.04, 64-bit, with the 3.16.0-30-generic kernel.
The hardware is proprietary, hence I can't give many details.
The following is the existing driver implementation:
The driver is given the sample format, sample rate and channel_count as input via module parameters. (Due to requirements, the inputs need to be provided via module parameters.)
Initial snd_pcm_hardware structure for the playback path:
#define DEFAULT_PERIOD_SIZE   (4096)
#define DEFAULT_NO_OF_PERIODS (1024)

static struct snd_pcm_hardware xxx_playback =
{
    .info             = SNDRV_PCM_INFO_MMAP |
                        SNDRV_PCM_INFO_INTERLEAVED |
                        SNDRV_PCM_INFO_MMAP_VALID |
                        SNDRV_PCM_INFO_SYNC_START,
    .formats          = SNDRV_PCM_FMTBIT_S16_LE,
    .rates            = (SNDRV_PCM_RATE_8000 |
                         SNDRV_PCM_RATE_16000 |
                         SNDRV_PCM_RATE_48000 |
                         SNDRV_PCM_RATE_96000),
    .rate_min         = 8000,
    .rate_max         = 96000,
    .channels_min     = 1,
    .channels_max     = 1,
    .buffer_bytes_max = (DEFAULT_PERIOD_SIZE * DEFAULT_NO_OF_PERIODS),
    .period_bytes_min = DEFAULT_PERIOD_SIZE,
    .period_bytes_max = DEFAULT_PERIOD_SIZE,
    .periods_min      = DEFAULT_NO_OF_PERIODS,
    .periods_max      = DEFAULT_NO_OF_PERIODS,
};
Similar values are used for the capture-side snd_pcm_hardware structure.
Please note that the values below are replaced in the playback open entry point, based on the current audio test configuration
(the user provides the audio format, sample rate and channel count via module parameters as inputs to the driver, and these are refilled into the snd_pcm_hardware structure):
xxx_playback.formats = user_format_input
xxx_playback.rates = xxx_playback.rate_min, xxx_playback.rate_max = user_sample_rate_input
xxx_playback.channels_min = xxx_playback.channels_max = user_channel_input
Similarly, the values are re-filled for the capture snd_pcm_hardware structure in the capture open entry point.
The hardware clocks are configured based on channel_count, format and sample_rate, and the driver registers successfully with the ALSA layer.
aplay/arecord work fine for channel_count = 1, 2 or 4.
During aplay/arecord, when the "runtime->channels" value is checked in the driver, it reflects the configured channel_count, which sounds correct to me.
The recorded data matches what was played, since it is a loopback test.
But when I use channel_count = 3, both aplay and arecord report
"Broken configuration for this PCM: no configurations available" for a wave file with a channel count of 3.
ex: Playing WAVE './xxx.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 3
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
aplay: set_params:1204: Broken configuration for this PCM: no configurations available
With the following changes I was able to move ahead a bit:
.........................
Method1:
The driver is given channel_count 3 as input via module parameter.
Modified the driver to fill the snd_pcm_hardware structure with playback->channels_min = 2 and playback->channels_max = 3; similar values for the capture path.
aplay/arecord report 'Channels count non available', though the wave file in use has 3 channels.
ex: aplay -D hw:CARD=xxx,DEV=0 ./xxx.wav Playing WAVE './xxx.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 3
aplay: set_params:1239: Channels count non available
Tried aplay/arecord with plughw, and they moved ahead:
arecord -D plughw:CARD=xxx,DEV=0 -d 3 -f S16_LE -r 48000 -c 3 ./xxx_rec0.wav
aplay -D plughw:CARD=xxx,DEV=0 ./xxx.wav
Recording WAVE './xxx_rec0.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 3
Playing WAVE './xxx.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 3
End of Test
During aplay/arecord, when the "runtime->channels" value is checked in the driver, it returns 2, although the played wave file has a channel count of 3.
When the data in the recorded file is checked, it is all silence.
.........................
Method2:
The driver is given channel_count 3 as input via module parameter.
Modified the driver to fill the snd_pcm_hardware structure with playback->channels_min = 3 and playback->channels_max = 4; similar values for the capture path.
aplay/arecord report 'Channels count non available', though the wave file in use has 3 channels.
Tried aplay/arecord with plughw, and they moved ahead.
During aplay/arecord, when the "runtime->channels" value is checked in the driver, it returns 4, although the played wave file has a channel count of 3.
When the data in the recorded file is checked, it is all silence.
.........................
So from the above observations, runtime->channels ends up as either 2 or 4; a 3-channel configuration is never used by the ALSA stack even though it was requested. When plughw is used, ALSA converts the data to run with 2 or 4 channels.
Can anyone help me understand why I am unable to use a channel count of 3?
I will provide more information if needed.
Thanks in advance.
A period (and the entire buffer) must contain an integral number of frames, i.e., you cannot have partial frames.
With three channels, one frame has six bytes. The fixed period size (4096) is not divisible by six without remainder.
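For illustration, a tiny sketch of that arithmetic, assuming S16_LE samples (2 bytes each); it finds the largest period size not exceeding the requested 4096 bytes that still holds a whole number of 3-channel frames:

BYTES_PER_SAMPLE = 2                         # S16_LE
channels = 3
frame_bytes = channels * BYTES_PER_SAMPLE    # 6 bytes per frame

requested_period = 4096
valid_period = (requested_period // frame_bytes) * frame_bytes

print(frame_bytes)   # 6
print(valid_period)  # 4092 -> 682 whole frames per period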
Thanks CL.
I used a period size of 4092 for this particular test case with channel count 3, and was able to do the loopback successfully (without using plughw).
One last question: when I used plughw earlier and runtime->channels was either 2 or 4, why was the recorded data all silence?