Lag at the junction between two concatenated videos in moviepy - python-3.x

After the videos are merged, there is a slight lag at the junction between them: the last second of clip1 seems to be duplicated before clip2 starts. What could be the reason? The code:
from moviepy.editor import VideoFileClip, concatenate_videoclips

clip1 = VideoFileClip("../1.mp4")
clip2 = VideoFileClip("../2.mp4")
final_clip = concatenate_videoclips([clip1, clip2])
final_clip.write_videofile("../final.mp4")
An example of the lag is available at the link below, starting at the 2-second mark:
https://drive.google.com/file/d/1MPFmAeQan8LxnNkyKcv1YKuaAl4yTMY0/view?usp=sharing

Related

How to set individual image display durations with ffmpeg-python

I am using ffmpeg-python 0.2.0 with Python 3.10.0. Displaying videos in VLC 3.0.17.4.
I am making an animation from a set of images. Each image is displayed for a different amount of time.
I have the basics in place with inputting images and concatenating streams, but I can't figure out how to correctly set frame duration.
Consider the following example:
import ffmpeg

stream1 = ffmpeg.input(image1_file)
stream2 = ffmpeg.input(image2_file)
combined_streams = ffmpeg.concat(stream1, stream2)
output_stream = ffmpeg.output(combined_streams, output_file)
ffmpeg.run(output_stream)
With this I get a video with a duration of a split second that barely shows an image before ending, which is to be expected with two individual frames.
For this example, my goal is to have a video of 5 seconds total duration, showing the image in stream1 for 2 seconds and the image in stream2 for 3 seconds.
Attempt 1: Setting t for inputs
stream1 = ffmpeg.input(image1_file, t=2)
stream2 = ffmpeg.input(image2_file, t=3)
combined_streams = ffmpeg.concat(stream1, stream2)
output_stream = ffmpeg.output(combined_streams, output_file)
ffmpeg.run(output_stream)
With this, I get a video with the duration of a split second and no image displayed.
Attempt 2: Setting frames for inputs
stream1 = ffmpeg.input(image1_file, frames=48)
stream2 = ffmpeg.input(image2_file, frames=72)
combined_streams = ffmpeg.concat(stream1, stream2)
output_stream = ffmpeg.output(combined_streams, output_file, r=24)
ffmpeg.run(output_stream)
In this case, I get the following error from ffmpeg:
Option frames (set the number of frames to output) cannot be applied to input url ########## -- you are trying to apply an input option to an output file or vice versa. Move this option before the file it belongs to.
I can't tell if this is a bug in ffmpeg-python or if I did it wrong.
Attempt 3: Setting framerate for inputs
stream1 = ffmpeg.input(image1_file, framerate=1/2)
stream2 = ffmpeg.input(image2_file, framerate=1/3)
combined_streams = ffmpeg.concat(stream1, stream2)
output_stream = ffmpeg.output(combined_streams, output_file)
ffmpeg.run(output_stream)
With this, I get a video with the duration of a split second and no image displayed. However, when I set both framerate values to 1/2, I get an animation of 4 seconds duration that displays the first image for two seconds and the second image for two seconds. This is the closest I got to a functional solution, but it is not quite there.
I am aware that multiple images can be globbed by input, but that would apply the same duration setting to all images, and my images each have different durations, so I am looking for a different solution.
Any ideas for how to get ffmpeg-python to do this are much appreciated.
A simple solution is adding loop=1 and framerate=24 to the first example:
import ffmpeg
image1_file = 'image1_file.png'
image2_file = 'image2_file.png'
output_file = 'output_file.mp4'
stream1 = ffmpeg.input(image1_file, framerate=24, t=2, loop=1)
stream2 = ffmpeg.input(image2_file, framerate=24, t=3, loop=1)
combined_streams = ffmpeg.concat(stream1, stream2)
output_stream = ffmpeg.output(combined_streams, output_file)
ffmpeg.run(output_stream)
loop=1 - Makes the input image repeat in a loop (the looped duration is capped by t=2 and t=3).
framerate=24 - Images don't have a framerate (as opposed to video), so they get the default framerate (25 fps) if framerate is not specified.
Assuming the desired output framerate is 24 fps, we may set the input framerate to 24 fps as well.
Setting framerate=24 sets the input framerate to 24 fps (and prevents framerate conversion).
You need to manipulate the timestamp of the source images and use the ts_from_file option of the image2 demuxer:
ts_from_file
If set to 1, will set frame timestamp to modification time of image file. Note that monotonity of timestamps is not provided: images go in the same order as without this option. Default value is 0. If set to 2, will set frame timestamp to the modification time of the image file in nanosecond precision.
You should be able to use os.utime if it is OK to modify the original files, or shutil.copy2 to copy them and then modify the copies.
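For illustration, here is a rough, untested sketch of that approach with ffmpeg-python; the frame_%03d.png copies, the exact option spellings, and the vsync handling are assumptions rather than verified behaviour:

import os
import shutil

import ffmpeg

# Hypothetical inputs: (source image, seconds to display).
images = [('image1_file.png', 2), ('image2_file.png', 3)]

# Copy each image (shutil.copy2 preserves metadata) and space the copies'
# modification times by the cumulative display durations, starting at t=0.
elapsed = 0
for i, (src, seconds) in enumerate(images):
    dst = 'frame_%03d.png' % i
    shutil.copy2(src, dst)
    os.utime(dst, (elapsed, elapsed))  # (atime, mtime) in seconds
    elapsed += seconds

# ts_from_file=1 makes the image2 demuxer take each frame's timestamp from
# the file's modification time; vsync='vfr' keeps those timestamps as-is.
stream = ffmpeg.input('frame_%03d.png', ts_from_file=1)
ffmpeg.run(ffmpeg.output(stream, 'output_file.mp4', vsync='vfr'))

Note that the last image's display time is not encoded by a timestamp gap, so the total duration may still need to be pinned, e.g. with t=5 on the output.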

Fastest way to slice and download hundreds of NetCDF files from THREDDS/OPeNDap server

I am working with NASA-NEX-GDDP CMIP6 data. I currently have working code that individually opens and slices each file, however it takes days to download one variable for all model outputs and scenarios. My goal is to have all temperature and precipitation data for all model outputs and scenarios, then apply climate indicators and make an ensemble with xclim.
import xarray as xr

url = 'https://ds.nccs.nasa.gov/thredds2/dodsC/AMES/NEX/GDDP-CMIP6/UKESM1-0-LL/ssp585/r1i1p1f2/tasmax/tasmax_day_UKESM1-0-LL_ssp585_r1i1p1f2_gn_2098.nc'
lat = 53
lon = 0
try:
    with xr.open_dataset(url) as ds:
        ds.interp(lat=lat, lon=lon).to_netcdf(url.split('/')[-1])
except Exception as e:
    print(e)
This code works but is very slow (days for one variable, one location). Wondering if there is a better, faster way? I'd rather not download the whole files as they are each 240 MB!
Update:
I have also tried the following to take advantage of dask parallel tasks and it is slightly faster but still on the order of days to complete for a full variable output:
import xarray as xr

def interp_one_url(path, lat, lon):
    with xr.open_dataset(path) as ds:
        ds = ds.interp(lat=lat, lon=lon)
    return ds

urls = ['https://ds.nccs.nasa.gov/thredds2/dodsC/AMES/NEX/GDDP-CMIP6/UKESM1-0-LL/ssp585/r1i1p1f2/tasmax/tasmax_day_UKESM1-0-LL_ssp585_r1i1p1f2_gn_2100.nc',
        'https://ds.nccs.nasa.gov/thredds2/dodsC/AMES/NEX/GDDP-CMIP6/UKESM1-0-LL/ssp585/r1i1p1f2/tasmax/tasmax_day_UKESM1-0-LL_ssp585_r1i1p1f2_gn_2099.nc']
lat = 53
lon = 0
paths = [url.split('/')[-1] for url in urls]
datasets = [interp_one_url(url, lat, lon) for url in urls]
xr.save_mfdataset(datasets, paths=paths)
One way is to download via NASA's NCSS portal instead of OpenDAP. The URL is different, but it can be built iteratively as well.
e.g.
import wget

lat = 53
lon = 0
URL = "https://ds.nccs.nasa.gov/thredds/ncss/AMES/NEX/GDDP-CMIP6/ACCESS-CM2/historical/r1i1p1f1/pr/pr_day_ACCESS-CM2_historical_r1i1p1f1_gn_2014.nc?var=pr&north={}&west={}&east={}&south={}&disableProjSubset=on&horizStride=1&time_start=2014-01-01T12%3A00%3A00Z&time_end=2014-12-31T12%3A00%3A00Z&timeStride=1&addLatLon=true"
wget.download(URL.format(lat, lon, lon + 1, lat - 1))  # north, west, east, south boundary
This accomplishes the slicing and the download in one step. Once you have the URLs, you can use something like wget and run the downloads in parallel, which will be faster than selecting and saving one file at a time.
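For example, here is a minimal sketch of parallelizing those downloads with a thread pool; the URL list construction is an assumption, and only the 2014 template from above is shown:

from concurrent.futures import ThreadPoolExecutor

import wget

lat = 53
lon = 0
# Build one NCSS URL per file using the template above; a real run would
# generate one URL per model/scenario/year.
urls = [URL.format(lat, lon, lon + 1, lat - 1)]

def fetch(url):
    # bar=None disables wget's progress bar, which is not thread-safe.
    return wget.download(url, bar=None)

# Downloads are I/O-bound, so threads overlap the waiting; keep the
# worker count modest to avoid hammering the server.
with ThreadPoolExecutor(max_workers=8) as pool:
    files = list(pool.map(fetch, urls))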

Python Pandas data frame setting copy of slice working sometimes but not always, despite nearly identical code

I have one data frame called patient_df that is made like this:
import pandas as pd

PATIENT_COLS = ['Origin', 'Status', 'Team', 'Bed', 'Admit_Time', 'First_Consult', 'Decant_Time', 'Ward_Time', 'Discharge_Order', 'Discharged'] # data to track for each patient
patient_df = pd.DataFrame(columns=PATIENT_COLS)
Then, at multiple points in my code I will access a row of this data frame and update fields associated with it (the row at patient_ID doesn't exist prior to me creating it in the first line):
patient_df.loc[patient_ID] = [None for i in range(NUM_PATIENT_COLS)]
record = patient_df.loc[patient_ID]
record.Origin = ORIGIN()
record.Admit_Time = sim_time
This code runs perfectly with no errors or warnings and the output is as expected (the actual data frame is updated).
However, I have another data frame called ip_df:
ip_df = pd.read_csv(PATH + 'Clean_IP.csv')
Now, when I try to access the rows in the same way (this time the rows already exist):
for patient in ALC_patients:
    record = ip_df.loc[patient]
    orig_end = record.IP_Discharge_DT
    record.IP_LOS = MAX_STAY
    record.IP_Discharge_DT = record.N_Left_DT + timedelta(days=MAX_STAY)
I get
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
Now, I realize what's happening is that I'm actually accessing a copy of the data frame and thus not updating the actual one, and I can fix this by using
ip_df.loc[patient, 'IP_LOS'] = MAX_STAY
However, I find the first style much cleaner, plus I don't have to make the data frame search for the row again every time. Why does this work for patient_df but not for ip_df, and is there anything I can change to use code more like what I have for patient_df?
pd.options.mode.chained_assignment = None # default='warn'
According to this link, setting this in your code will turn off the warning.
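Note that silencing the warning does not change what gets written. If the goal is to actually update ip_df, here is a sketch of the loop using single .loc assignments, with stand-in data since the question's CSV is not available:

from datetime import timedelta

import pandas as pd

MAX_STAY = 30  # hypothetical stand-in for the question's constant

# Minimal stand-in for ip_df and ALC_patients from the question.
ip_df = pd.DataFrame(
    {'IP_LOS': [5.0],
     'IP_Discharge_DT': [pd.Timestamp('2020-01-10')],
     'N_Left_DT': [pd.Timestamp('2020-01-05')]},
    index=['patient_1'],
)
ALC_patients = ['patient_1']

for patient in ALC_patients:
    orig_end = ip_df.loc[patient, 'IP_Discharge_DT']
    # A single .loc assignment writes to ip_df directly, so pandas never
    # sees a chained assignment and raises no warning.
    ip_df.loc[patient, 'IP_LOS'] = MAX_STAY
    ip_df.loc[patient, 'IP_Discharge_DT'] = ip_df.loc[patient, 'N_Left_DT'] + timedelta(days=MAX_STAY)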

OpenCV - Capture arbitrary frame from video file

I use the following code to extract a specific frame from a video file. In this example, I'm simply getting the middle frame:
import tempfile

import cv2

video_path = '/tmp/wonderwall.mp4'
vidcap = cv2.VideoCapture(video_path)
middle_frame = int(vidcap.get(cv2.CAP_PROP_FRAME_COUNT) / 2)
count = 0
success = True
while success:
    success, image = vidcap.read()
    if count == middle_frame:
        temp_file = tempfile.NamedTemporaryFile(suffix='.jpg', delete=False)
        cv2.imwrite(temp_file.name, image)
    count += 1
However, with this method, capturing the middle frame in a very large file can take a while.
Apparently, in the older cv module, one could do:
import cv
img = cv.QueryFrame(capture)
Is there a similar way in cv2 to grab a specific frame in a video file, without having to iterate through all frames?
You can do it in the same way in C++ (the Python conversion should be easy).
cv::VideoCapture cap("file.avi");
double number_of_frames = cap.get(CV_CAP_PROP_FRAME_COUNT);
cap.set(CV_CAP_PROP_POS_FRAMES, indexOfTheFrameYouWant);
cv::Mat frameIWant;
cap.read(frameIWant);
For reference:
VideoCapture::get(int propId)
It can take various flags and returns nearly anything you could wish for (see http://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html and look for get()).
VideoCapture::set(int propId, double value)
set() will do what you want (same doc, look for set()) if you use propId 1 (CV_CAP_PROP_POS_FRAMES) and the index of the frame you desire.
Note that if the index you pass is greater than the maximum frame count, the code will grab the last frame of the video if you are lucky, or crash at runtime.
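In Python with cv2, the equivalent seek looks roughly like this (assuming a recent OpenCV build where the CAP_PROP_* constants live directly on the cv2 module):

import cv2

video_path = '/tmp/wonderwall.mp4'  # path from the question
vidcap = cv2.VideoCapture(video_path)

# Seek straight to the middle frame instead of reading every frame.
frame_count = int(vidcap.get(cv2.CAP_PROP_FRAME_COUNT))
vidcap.set(cv2.CAP_PROP_POS_FRAMES, frame_count // 2)

success, image = vidcap.read()
if success:
    cv2.imwrite('middle_frame.jpg', image)

Keep in mind that for some codecs this seek is only approximate, since the decoder has to land on a nearby keyframe.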

Sphinx 4 Transcription Time Index

How do I get the time index (or frame number) in Sphinx 4 when I set it to transcribe an audio file?
The code I'm using looks like this:
audioURL = ...
AudioFileDataSource dataSource = (AudioFileDataSource) cm.lookup("audioFileDataSource");
dataSource.setAudioFile(audioURL, null);

Result result;
while ((result = recognizer.recognize()) != null) {
    Token token = result.getBestToken();
    //DoubleData data = (DoubleData) token.getData();
    //long frameNum = data.getFirstSampleNumber(); // data seems to always be null
    String resultText = token.getWordPath(false, false);
    ...
}
I tried to get the time of transcription from the result/token objects, similar to what a subtitler does. I've found Result.getFrameNumber() and Token.getFrameNumber(), but they appear to return the number of frames decoded rather than the time (or frame) at which the result was found within the entire audio file.
I looked at AudioFileDataSource.getDuration() [=private] and the Recognizer classes, but haven't figured out how to get the time index of the transcription.
Ideas? :)
Frame number is the time multiplied by the frame rate, which is 100 frames/second; for example, frame 250 corresponds to 250 / 100 = 2.5 seconds into the audio.
Anyway, please find the patch for subtitles demo which returns timings here:
http://sourceforge.net/mailarchive/forum.php?thread_name=1380033926.26218.12.camel%40localhost.localdomain&forum_name=cmusphinx-devel
The patch applies to the subversion trunk, not to the 1.0-beta version.
Please note that this part is under major refactoring, so the API will be obsolete soon. However, I hope you will be able to create subtitles with just a few calls, without all the current complexity.
