Suppose the following:
I have two video files, each with audio.
They show pretty much the same thing (going by the images), but differ in codec, quality, and audio.
To give a more specific example, suppose further:
Video 1 (the 'target') runs at 24 FPS in 2160p [HEVC]. The first 25 seconds of audio and video are not part of the actual content.
Video 2 runs at 30 FPS in 720p [x264]. The first 19 seconds of audio and video are not part of the actual content.
I now want to merge the audio of "Video 2" into the target video so that it lines up exactly with the target's picture, i.e. the two audio tracks match in timing (apart from, e.g., the language).
My initial thought would be to synchronize both videos by frames using fingerprinting/hashes similar to various repost checker bots.
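To make the fingerprinting idea concrete, here is a minimal sketch of the matching step only. It assumes each video has already been reduced to one 64-bit perceptual hash per second (for example by sampling frames at a common rate, so the 24 vs 30 FPS difference drops out); how the hashes are computed is not shown, and the function names are made up for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def best_lag(ref, other, max_lag, min_overlap=10):
    """Return the lag (in samples) for which other[i] best matches ref[i + lag].

    ref and other are sequences of integer frame hashes taken at the same
    sampling rate. The lag with the lowest mean Hamming distance wins.
    """
    best_score = None
    best = None
    for lag in range(-max_lag, max_lag + 1):
        dists = [
            hamming(ref[i + lag], other[i])
            for i in range(len(other))
            if 0 <= i + lag < len(ref)
        ]
        if len(dists) < min_overlap:
            continue  # too little overlap to trust the score
        score = sum(dists) / len(dists)
        if best_score is None or score < best_score:
            best_score, best = score, lag
    if best is None:
        raise ValueError("no lag had enough overlapping samples")
    return best
```

With hashes taken once per second, the returned lag is directly the offset in seconds between the two files.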
Is there an easier method, and how would I proceed from there (preferably using ffmpeg)?
Bonus points if the first 25 seconds of the "Video 1" audio are prepended to the "Video 2" audio while the first 19 seconds are omitted.
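If the two offsets are already known (25 s and 19 s here), the bonus case might be sketched as the ffmpeg invocation built below. It is only a sketch under assumptions: both files play at the same real-time speed (a different frame rate alone does not change that, but a speed-up such as 24 to 25 FPS would), the file names are placeholders, and AAC is an arbitrary choice for the re-encoded audio.

```python
def build_merge_command(target, donor, target_intro, donor_intro, output):
    """Build an ffmpeg argument list that keeps the target's video and
    replaces its audio with: target intro audio + donor audio after its intro.

    target_intro / donor_intro are the intro lengths in seconds.
    """
    filter_complex = (
        # first part: the target's own audio up to the end of its intro
        f"[0:a]atrim=end={target_intro},asetpts=PTS-STARTPTS[intro];"
        # second part: the donor's audio with its intro cut off
        f"[1:a]atrim=start={donor_intro},asetpts=PTS-STARTPTS[body];"
        # join both parts into one audio stream
        "[intro][body]concat=n=2:v=0:a=1[aout]"
    )
    return [
        "ffmpeg", "-i", target, "-i", donor,
        "-filter_complex", filter_complex,
        "-map", "0:v:0", "-map", "[aout]",
        "-c:v", "copy",   # video is not re-encoded
        "-c:a", "aac",    # assumption: any audio codec the container accepts
        output,
    ]


# Placeholder file names, offsets from the example above.
cmd = build_merge_command("video1.mkv", "video2.mkv", 25, 19, "merged.mkv")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run` as-is; building it as a list avoids shell quoting problems with the filter graph.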
Related
I have a recording as a collection of files in mpegts format, like
audio: a-1.ts, a-2.ts, a-3.ts, a-4.ts
video: v-1.ts, v-2.ts, v-3.ts
I need to make a single video clip in mp4 or mkv format.
However, there are two problems:
The audio and video segments each have a different duration, and the number of audio segments differs from the number of video segments. The total durations of audio and video match. Hence I cannot concat audio and video segments pairwise using ffmpeg and merge them afterwards; I get sync issues that grow progressively.
A few segments are corrupt or missing. So if I concat the audio and video streams separately using ffmpeg, I get streams of different lengths. When I merge these streams using ffmpeg, a/v synchronization is correct only up to the point where the first missing packet is encountered.
It's OK if video freezes for a while or there is silence for a while as long as most of the video is in sync with audio.
I've checked with tsduck, and PCR seems to be present in all audio and video segments, yet I could not find a way to merge the streams using the MPEG-TS PCR as the sync reference. Please advise how I can achieve this.
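One thing that could be tried, sketched below: MPEG-TS segments can be joined by plain byte concatenation, which leaves the timestamps stored in the packets untouched. The sketch joins each stream's segments in numeric order and then builds a stream-copy mux command. The file naming pattern follows the example above, the output names are placeholders, and whether ffmpeg preserves the timestamp jump left by a missing segment is an assumption that would need checking on a real recording.

```python
import re
import shutil
from pathlib import Path


def segment_index(path):
    """Extract the numeric suffix from names like 'a-12.ts'."""
    match = re.search(r"-(\d+)\.ts$", Path(path).name)
    if match is None:
        raise ValueError(f"unexpected segment name: {path}")
    return int(match.group(1))


def join_segments(segments, output):
    """Byte-concatenate TS segments in numeric order; return the order used."""
    ordered = sorted(segments, key=segment_index)
    with open(output, "wb") as out:
        for segment in ordered:
            with open(segment, "rb") as src:
                shutil.copyfileobj(src, out)
    return ordered


def build_mux_command(video_ts, audio_ts, output):
    """Build an ffmpeg argument list that muxes both joined files without re-encoding."""
    return [
        "ffmpeg", "-i", video_ts, "-i", audio_ts,
        "-map", "0:v", "-map", "1:a",
        "-c", "copy",
        output,
    ]
```

Sorting by the numeric suffix matters because a plain string sort would put `a-10.ts` before `a-2.ts`.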
I want to do live audio translation via microphone: take a live video/audio stream from Facebook, plug a mic into the laptop, and translate live by mixing the existing audio stream with the one coming from the mic (the translation). That part works; I mix the two audio streams into one with the "amix" audio filter. Now I want to refine it. Is it possible, when voice is detected on the mic, to automatically lower the volume of the original audio stream by 20% so that the translation (mic audio) is heard more clearly, and then, once the mic has been silent for, let's say, 3-5 seconds, to fade the original audio stream back up to normal volume? Is this too much for ffmpeg, or should I try sox or something similar?
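What is described here is usually called ducking, and ffmpeg's `sidechaincompress` filter is one way to approximate it. The sketch below only builds the filter graph string. It assumes input 0 is the original stream and input 1 is the mic; the threshold, ratio, attack and release values are guesses to tune by ear, and a compressor ratio does not translate into an exact 20% reduction.

```python
def build_ducking_filter(threshold=0.03, ratio=4, attack_ms=50, release_ms=3000):
    """Build a filter graph that lowers input 0 while input 1 (the mic) is active.

    release_ms controls how slowly the original audio comes back up
    after the mic goes quiet.
    """
    return (
        # the mic is needed twice: once as the control signal, once in the mix
        "[1:a]asplit=2[sc][mic];"
        # compress the original audio whenever the control signal is loud
        f"[0:a][sc]sidechaincompress=threshold={threshold}:ratio={ratio}"
        f":attack={attack_ms}:release={release_ms}[ducked];"
        # mix the ducked original with the mic
        "[ducked][mic]amix=inputs=2:duration=longest[out]"
    )


print(build_ducking_filter())
```

The string would go after `-filter_complex`, with `-map "[out]"` selecting the mixed audio.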
I use a video player called MPV to transcode a dynamic playlist of media files.
I pipe MPV's encoded output into FFMPEG and format it for rtmp delivery.
However, the playlist may contain media with misaligned audio and video, i.e. the audio track may be shorter or longer than the video track.
No matter what MPV will only output what it's given. So if my media file has audio that is 1 second long and video that is 2 seconds long, it will output a media stream with exactly the same misalignment, rather than generating null audio or skipping to the next item in the playlist when it first encounters an active stream ending (eof).
For example, assuming my playlist was full of problematic media where the audio and video of each file was misaligned:
If I output this media stream to a popular streaming service's server, it could lead to stuttering and/or loss of a/v sync.
Similarly, if I output this media stream to a file and played it back in MPV or another video player, the result appears to be more like this:
I have tried to fix this in MPV in all sorts of ways, trying every relevant command line option available. I even wrote a user script that detects 'eof' audio and skips to the next item in the playlist, but it is not fast enough and still leads to small gaps of audio.
So my only hope is correcting it in ffmpeg. In the event of null audio/video, I need a fallback or a generative filter that can fill these empty gaps with silence (audio) or a colour/image (video).
I'm open to any ideas, and if my understanding of a/v encoding is a little off, please educate me.
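For the audio side, one candidate is the `aresample` filter with `async=1`, which fills or trims audio so that it follows its timestamps. The sketch below builds such a command for a piped input. It assumes the gaps arrive as jumps in the audio timestamps (it does nothing for missing video), and the RTMP URL and the AAC/FLV choices are placeholders.

```python
def build_gap_filling_command(rtmp_url):
    """Build an ffmpeg argument list that reads from stdin and pads audio gaps with silence."""
    return [
        "ffmpeg", "-i", "pipe:0",
        "-c:v", "copy",                 # pass the already encoded video through
        "-af", "aresample=async=1",     # fill/trim audio to match its timestamps
        "-c:a", "aac",                  # filtering forces an audio re-encode
        "-f", "flv", rtmp_url,
    ]


# Hypothetical endpoint, for illustration only.
cmd = build_gap_filling_command("rtmp://example.invalid/live/stream-key")
print(" ".join(cmd))
```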
Is there a way to delay audio system-wide in Windows 10? (I have Realtek High Definition Audio, as an example.)
I have 2 reasons why I want to accomplish this
1) My audio plays a quarter of a second before the video (out of sync). This is consistent across YouTube, VLC media player, Windows Media Player, and pretty much any video content: the mouth moves 1/4 second after the audio. The delay also builds up over time; about 5 minutes in, it becomes unbearable.
2) Unrelated to #1 I want to scan the audio and edit it in near real time. Searching for certain sounds and also reading certain subtitles.
For the past few days, I have been trying to convert my lossless .mov video (which has an audio track) to .webm format.
Some info on the video & audio is that the fps is 30. Also the audio track has about 3-5 seconds of silence/blank audio before you start hearing some music.
My problem is that it seems the transcoding to webm strips away this blank audio, because when I play the video, the audio starts right away. I've also noticed that playback jumps straight to ~4 seconds into the video. When I play it in the browser, it jumps to that moment in the timeline. If I try to scrub to the beginning, the video ends.
I have figured some things out.
This is just a webm problem. It does not happen with ogv or mp4.
It only happens if there is blank audio at the beginning of the audio track.
I am using ffmpeg with the libvpx and libvorbis libraries, and I am using just the basic command-line setup:
ffmpeg -i "infile" "outfile.webm"
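A first diagnostic step could be to compare the start times of the streams in the source file, since "blank audio" at the beginning may actually be an audio stream that starts later than the video. The sketch below builds an ffprobe command and parses its JSON output; the helper names and the sample JSON are illustrative and not taken from the real file.

```python
import json


def build_probe_command(path):
    """Build an ffprobe argument list that prints per-stream start times as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=index,codec_type,start_time",
        "-of", "json", path,
    ]


def stream_start_times(ffprobe_json):
    """Map (index, codec_type) to start time in seconds; streams without one are skipped."""
    starts = {}
    for stream in json.loads(ffprobe_json).get("streams", []):
        try:
            start = float(stream["start_time"])
        except (KeyError, ValueError):
            continue  # start_time missing or not numeric
        starts[(stream["index"], stream["codec_type"])] = start
    return starts


# Made-up example of what the output might look like.
sample = '{"streams": [{"index": 0, "codec_type": "video", "start_time": "0.000000"}, {"index": 1, "codec_type": "audio", "start_time": "4.000000"}]}'
print(stream_start_times(sample))
```

If the audio stream does report a later start time, that would point at timestamp handling during the conversion rather than at the encoders themselves.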