Fluent FFMPEG trigger callback on JPEG output each frame - node.js

I'm trying to use Fluent FFMPEG in Node.js to output cropped frames from a video. I want to trigger an OCR call to Tesseract on every frame that is created. Is there a way in Fluent FFMPEG to listen for each file being created?
Ideally I would like to output each frame to a buffer, skipping the save to disk and speeding up the Tesseract calls. Any help would be much appreciated!
Here's the code to generate the still frames:
console.time("Process time");
const ffmpeg = require('fluent-ffmpeg');
ffmpeg('test.mp4')
  .duration(1)
  .videoFilters([
    {
      filter: 'crop',
      options: '1540:1000:250:0'
    }
  ])
  .outputOptions('-q:v 2')
  .output('images/outimage_%03d.jpeg')
  .on('end', function() {
    console.log('Finished processing');
    console.timeEnd("Process time");
  })
  .run();
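As far as I know, fluent-ffmpeg does not emit an event per output file. One possible workaround, sketched below under the same crop settings, is to stream the frames to stdout as MJPEG via image2pipe and split the stream on the JPEG end-of-image marker (0xFF 0xD9), so each frame arrives as a Buffer without touching disk. Note that recognizeFrame is a hypothetical placeholder for the Tesseract call.

const ffmpeg = require('fluent-ffmpeg');

// Sketch: emit one concatenated MJPEG stream instead of numbered files,
// then split it into per-frame Buffers on the JPEG end-of-image marker.
const command = ffmpeg('test.mp4')
  .duration(1)
  .videoFilters([{ filter: 'crop', options: '1540:1000:250:0' }])
  .outputOptions(['-q:v 2', '-vcodec mjpeg'])
  .format('image2pipe');

const frames = command.pipe(); // PassThrough stream of back-to-back JPEGs
let pending = Buffer.alloc(0);
const EOI = Buffer.from([0xff, 0xd9]); // JPEG end-of-image marker

frames.on('data', chunk => {
  pending = Buffer.concat([pending, chunk]);
  let idx;
  while ((idx = pending.indexOf(EOI)) !== -1) {
    const frame = pending.slice(0, idx + 2); // one complete JPEG
    pending = pending.slice(idx + 2);
    recognizeFrame(frame); // hypothetical: hand the Buffer to Tesseract here
  }
});

frames.on('end', () => console.log('Finished processing'));

Each Buffer can then be handed to whichever Tesseract binding you use, avoiding the intermediate files entirely.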

Related

FFMPEG Encoding a video from a Readable stream

I'm facing an issue with the seeked event in Chrome, which seems to be due to how the video being seeked is encoded.
The problem occurs most frequently when using ytdl-core and piping a Readable stream into an FFmpeg child process.
let videoStream: Readable = ytdl.downloadFromInfo(info, {
  ...options,
  quality: "highestvideo"
});
With ytdl-core, in order to get the highest quality you must combine the audio and video streams yourself. Here is how I am doing it:
const ytmux = (link, options: any = {}) => {
  const result = new stream.PassThrough({
    highWaterMark: options.highWaterMark || 1024 * 512
  });
  ytdl.getInfo(link, options).then((info: videoInfo) => {
    let audioStream: Readable = ytdl.downloadFromInfo(info, {
      ...options,
      quality: "highestaudio"
    });
    let videoStream: Readable = ytdl.downloadFromInfo(info, {
      ...options,
      quality: "highestvideo"
    });
    // create the ffmpeg process for muxing
    let ffmpegProcess: any = cp.spawn(
      ffmpegPath.path,
      [
        // suppress non-crucial messages
        "-loglevel",
        "8",
        "-hide_banner",
        // input video and audio by pipe
        "-i",
        "pipe:3",
        "-i",
        "pipe:4",
        // re-encode video with libx264, copy the audio
        "-c:v",
        "libx264",
        "-x264opts",
        "fast_pskip=0:psy=0:deblock=-3,-3",
        "-preset",
        "veryslow",
        "-crf",
        "18",
        "-c",
        "copy",
        "-pix_fmt",
        "yuv420p",
        "-movflags",
        "frag_keyframe+empty_moov",
        "-g",
        "300",
        // output fragmented mp4 to a pipe
        "-f",
        "mp4",
        // map video from the first input and audio from the second
        "-map",
        "0:v",
        "-map",
        "1:a",
        "pipe:5"
      ],
      {
        // no popup window for Windows users
        windowsHide: true,
        stdio: [
          // stdin, stdout and stderr are inherited from the parent,
          "inherit",
          "inherit",
          "inherit",
          // and pipe video, audio, output
          "pipe",
          "pipe",
          "pipe"
        ]
      }
    );
    audioStream.pipe(ffmpegProcess.stdio[4]);
    videoStream.pipe(ffmpegProcess.stdio[3]);
    ffmpegProcess.stdio[5].pipe(result);
  });
  return result;
};
I am playing around with tons of different arguments. The resulting video gets uploaded to a Google Cloud Storage bucket. Then, when seeking in Chrome, I run into issues with certain frames: they simply cannot be seeked to.
When I pass the file through FFmpeg locally and re-encode it, then upload it, I notice there are no issues.
Here is an image comparing the two results when running ffmpeg -i FILE (the one on the left works fine, and the differences are minor).
I have tried adjusting the arguments in the muxer code and am continuing to compare against the re-encoded video. I have no idea why this is happening; it seems to be something to do with the frames.

How to effectively turn high resolution images into a video with ffmpeg?

I have 24 frames (frame-%d.png)
I want to turn them into a video that will be 1 second long
That means that each frame should play for 1/24 seconds
I'm trying to figure out the correct settings in order to achieve that:
await new Promise((resolve) => {
  ffmpeg()
    .on('end', () => {
      setTimeout(() => {
        console.log('done')
        resolve()
      }, 100)
    })
    .on('error', (err) => {
      throw new Error(err)
    })
    .input('/my-huge-frames/frame-%d.png')
    .inputFPS(1/24)
    .output('/my-huge-video.mp4')
    .outputFPS(24)
    .noAudio()
    .run()
})
Are my inputFPS(1/24) and outputFPS(24) settings correct?
Each frame-%d.png is huge: 32400 x 32400 pixels (~720 MB). Will ffmpeg be able to generate such a video, and if so, will the video be playable? If not, what is the maximum resolution each frame-%d.png should have instead?
Since the process will be quite heavy, I believe using the command line could be more appropriate. In that case, what is the equivalent of the above JS code on the command line (as in ffmpeg -framerate etc.)?
Your output image size is too large for most common video codecs. Approximate maximum frame sizes:
H.264: 2048×2048
H.265: 8192×4320
AV1: 7680×4320
You may be able to do raw RGB or raw YUV, but that is going to be huge: roughly 1.5 GB per frame for YUV 4:2:0.
What are you planning to play this on? I know of some dome theaters that are theoretically able to run something like 15 simultaneous 4K feeds, but those are processed beforehand.
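As for the command-line part of the question: to fit all 24 frames into one second, the input framerate should be 24 frames per second (an input rate of 1/24 would make each frame last 24 seconds). A rough command-line sketch, assuming the same paths and adding a scale filter only as an example of bringing the frames down to a size a codec can handle, might look like:
ffmpeg -framerate 24 -i /my-huge-frames/frame-%d.png -vf scale=3840:-2 -r 24 -an -pix_fmt yuv420p /my-huge-video.mp4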

Precise method of segmenting & transcoding video+audio (via ffmpeg), into an on-demand HLS stream?

Recently I've been messing around with FFmpeg and streams through Node.js. My ultimate goal is to serve a transcoded video stream, from any input filetype, via HTTP, generated in real time as it's needed, in segments.
I'm currently attempting to handle this using HLS. I pre-generate a dummy m3u8 manifest using the known duration of the input video. It contains a bunch of URLs that point to individual constant-duration segments. Then, once the client player starts requesting the individual URLs, I use the requested path to determine which time range of video the client needs. Then I transcode the video and stream that segment back to them.
Now for the problem: This approach mostly works, but has a small audio bug. Currently, with most test input files, my code produces a video that - while playable - seems to have a very small (< .25 second) audio skip at the start of each segment.
I think this may be an issue with splitting using time in ffmpeg, where possibly the audio stream cannot be accurately sliced at the exact frame the video is. So far, I've been unable to figure out a solution to this problem.
If anybody has any direction they can steer me in, or even a pre-existing library/server that solves this use case, I appreciate the guidance. My knowledge of video encoding is fairly limited.
I'll include an example of my relevant current code below, so others can see where I'm stuck. You should be able to run this as a Node.js Express server, then point any HLS player at localhost:8080/master to load the manifest and begin playback. See the transcode.get('/segment/:seg.ts') line at the end for the relevant transcoding bit.
'use strict';
const express = require('express');
const ffmpeg = require('fluent-ffmpeg');

let PORT = 8080;
let HOST = 'localhost';
const transcode = express();

/*
 * This file demonstrates an Express-based server, which transcodes & streams a video file.
 * All transcoding is handled in memory, in chunks, as needed by the player.
 *
 * It works by generating a fake manifest file for an HLS stream, at the endpoint "/m3u8".
 * This manifest contains links to each "segment" video clip, which browser-side HLS players will load as-needed.
 *
 * The "/segment/:seg.ts" endpoint is the request destination for each clip,
 * and uses FFMpeg to generate each segment on-the-fly, based off which segment is requested.
 */

const pathToMovie = 'C:\\input-file.mp4'; // The input file to stream as HLS.
const segmentDur = 5; // Controls the duration (in seconds) that the file will be chopped into.

const getMetadata = async (file) => {
  return new Promise(resolve => {
    ffmpeg.ffprobe(file, function(err, metadata) {
      console.log(metadata);
      resolve(metadata);
    });
  });
};

// Generate a "master" m3u8 file, which the player should point to:
transcode.get('/master', async (req, res) => {
  res.set({"Content-Disposition": "attachment; filename=\"m3u8.m3u8\""});
  res.send(`#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=150000
/m3u8?num=1
#EXT-X-STREAM-INF:BANDWIDTH=240000
/m3u8?num=2`);
});

// Generate an m3u8 file to emulate a premade video manifest. Guesses segments based off duration.
transcode.get('/m3u8', async (req, res) => {
  let met = await getMetadata(pathToMovie);
  let duration = met.format.duration;
  let out = '#EXTM3U\n' +
    '#EXT-X-VERSION:3\n' +
    `#EXT-X-TARGETDURATION:${segmentDur}\n` +
    '#EXT-X-MEDIA-SEQUENCE:0\n' +
    '#EXT-X-PLAYLIST-TYPE:VOD\n';
  let splits = Math.max(duration / segmentDur);
  for (let i = 0; i < splits; i++) {
    out += `#EXTINF:${segmentDur},\n/segment/${i}.ts\n`;
  }
  out += '#EXT-X-ENDLIST\n';
  res.set({"Content-Disposition": "attachment; filename=\"m3u8.m3u8\""});
  res.send(out);
});

// Transcode the input video file into segments, using the given segment number as time offset:
transcode.get('/segment/:seg.ts', async (req, res) => {
  const segment = req.params.seg;
  const time = segment * segmentDur;
  let proc = new ffmpeg({source: pathToMovie})
    .seekInput(time)
    .duration(segmentDur)
    .outputOptions('-preset faster')
    .outputOptions('-g 50')
    .outputOptions('-profile:v main')
    .withAudioCodec('aac')
    .outputOptions('-ar 48000')
    .withAudioBitrate('155k')
    .withVideoBitrate('1000k')
    .outputOptions('-c:v h264')
    .outputOptions(`-output_ts_offset ${time}`)
    .format('mpegts')
    .on('error', function(err, st, ste) {
      console.log('an error happened:', err, st, ste);
    })
    .on('progress', function(progress) {
      console.log(progress);
    })
    .pipe(res, {end: true});
});

transcode.listen(PORT, HOST);
console.log(`Running on http://${HOST}:${PORT}`);
I had the same problem as you, and I managed to fix this issue, as I mentioned in the comment, by starting a complete HLS transcoding instead of manually producing the segment requested by the client. I'm going to simplify what I've done and also share the link to my GitHub repo where I've implemented this. I did the same as you for generating the m3u8 manifest:
const segmentDur = 4; // Segment duration in seconds
const splits = Math.max(duration / segmentDur); // duration = duration of the video in seconds

let out = '#EXTM3U\n' +
  '#EXT-X-VERSION:3\n' +
  `#EXT-X-TARGETDURATION:${segmentDur}\n` +
  '#EXT-X-MEDIA-SEQUENCE:0\n' +
  '#EXT-X-PLAYLIST-TYPE:VOD\n';
for (let i = 0; i < splits; i++) {
  out += `#EXTINF:${segmentDur}, nodesc\n/api/video/${id}/hls/${quality}/segments/${i}.ts?segments=${splits}&group=${group}&audioStream=${audioStream}&type=${type}\n`;
}
out += '#EXT-X-ENDLIST\n';
res.send(out);
resolve();
This works fine when you transcode the video (i.e. use, for example, libx264 as the video encoder in the ffmpeg command later on). If you use the copy video codec, the segments won't match the segment duration, based on my testing. Now you have a choice here: either you start the ffmpeg transcoding at the point when the m3u8 manifest is requested, or you wait until the first segment is requested. I went with the second option, since I want to support starting the transcoding based on which segment is requested.
Now comes the tricky part: when the client requests a segment (api/video/${id}/hls/<quality>/segments/<segment_number>.ts in my case), you first have to check whether any transcoding is already active. If a transcoding is active, you have to check whether the requested segment has been processed or not. If it has been processed, we can simply send the requested segment back to the client. If it hasn't been processed yet (for example because of a user seek action), we can either wait for it (if the latest processed segment is close to the requested one) or stop the previous transcoding and restart it at the newly requested segment.
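As a rough sketch of that bookkeeping (the names activeTranscodings, Transcoding, filePath, waitForSegment and segmentPath are illustrative, not taken from the repo, and an Express app is assumed):

const activeTranscodings = new Map(); // one transcoding session per video id

app.get('/api/video/:id/hls/:quality/segments/:seg.ts', async (req, res) => {
  const requested = parseInt(req.params.seg, 10);
  let session = activeTranscodings.get(req.params.id);

  // Start a transcoding at the requested segment if none is active, or restart
  // it if the user seeked far beyond (or before) what has been processed so far.
  const tooFarAhead = session && requested > session.latestSegment + 10;
  const behindStart = session && requested < session.startSegment;
  if (!session || tooFarAhead || behindStart) {
    if (session) session.stop();
    session = new Transcoding(filePath, requested); // startSegment = requested
    activeTranscodings.set(req.params.id, session);
    await session.start();
  }

  // Wait until ffmpeg has produced the requested segment, then send it back.
  await session.waitForSegment(requested);
  res.sendFile(session.segmentPath(requested));
});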
I'm going to try to keep this answer as simple as I can. The ffmpeg command I use to achieve the HLS transcoding looks like this:
this.ffmpegProc = ffmpeg(this.filePath)
  .withVideoCodec(this.getVideoCodec())
  .withAudioCodec(audioCodec)
  .inputOptions(inputOptions)
  .outputOptions(outputOptions)
  .on('end', () => {
    this.finished = true;
  })
  .on('progress', progress => {
    const seconds = this.addSeekTimeToSeconds(this.timestampToSeconds(progress.timemark));
    const latestSegment = Math.max(Math.floor(seconds / Transcoding.SEGMENT_DURATION) - 1); // - 1 because the first segment is 0
    this.latestSegment = latestSegment;
  })
  .on('start', (commandLine) => {
    logger.DEBUG(`[HLS] Spawned Ffmpeg (startSegment: ${this.startSegment}) with command: ${commandLine}`);
    resolve();
  })
  .on('error', (err, stdout, stderr) => {
    if (err.message != 'Output stream closed' && err.message != 'ffmpeg was killed with signal SIGKILL') {
      logger.ERROR(`Cannot process video: ${err.message}`);
      logger.ERROR(`ffmpeg stderr: ${stderr}`);
    }
  })
  .output(this.output);
this.ffmpegProc.run();
Where output options are:
return [
  '-copyts', // Fixes timestamp issues (Keep timestamps as original file)
  '-pix_fmt yuv420p',
  '-map 0',
  '-map -v',
  '-map 0:V',
  '-g 52',
  `-crf ${this.CRF_SETTING}`,
  '-sn',
  '-deadline realtime',
  '-preset:v ultrafast',
  '-f hls',
  `-hls_time ${Transcoding.SEGMENT_DURATION}`,
  '-force_key_frames expr:gte(t,n_forced*2)',
  '-hls_playlist_type vod',
  `-start_number ${this.startSegment}`,
  '-strict -2',
  '-level 4.1', // Fixes chromecast issues
  '-ac 2', // Set two audio channels. Fixes audio issues for chromecast
  '-b:v 1024k',
  '-b:a 192k',
];
And input options:
let inputOptions = [
  '-copyts', // Fixes timestamp issues (Keep timestamps as original file)
  '-threads 8',
  `-ss ${this.startSegment * Transcoding.SEGMENT_DURATION}`
];
Parameters worth noting are -start_number in the output options, which basically tells ffmpeg which number to use for the first segment; if the client requests, for example, segment 500, we want to keep it simple and start the numbering at 500 if we have to restart the transcoding. Then we have the standard HLS settings (hls_time, hls_playlist_type and f). In the input options I use -ss to seek to the requested position; since we told the client in the generated m3u8 manifest that each segment is 4 seconds long, we can just seek to 4 * requestedSegment.
You can see that in the 'progress' event from ffmpeg I calculate the latest processed segment by looking at the timemark. By converting the timemark to seconds and then adding the seek time applied to the transcoding, we can work out approximately which segment was just finished by dividing that number of seconds by the segment duration, which I've set to 4.
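For illustration, the two helpers referenced in the progress handler might be implemented roughly like this (an assumption on my part; fluent-ffmpeg reports progress.timemark as an HH:MM:SS.xx string):

// Convert an ffmpeg timemark ("HH:MM:SS.xx") into seconds.
timestampToSeconds(timemark) {
  const [hours, minutes, seconds] = timemark.split(':').map(parseFloat);
  return hours * 3600 + minutes * 60 + seconds;
}

// The timemark is relative to the seeked input, so add the seek offset back
// to get an absolute position in the file.
addSeekTimeToSeconds(seconds) {
  return seconds + this.startSegment * Transcoding.SEGMENT_DURATION;
}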
Now there is a lot more to keep track of than just this: you have to save the ffmpeg processes that you've started so you can check whether a segment is finished and whether a transcoding is active when a segment is requested. You also have to stop already running transcodings if the user requests a segment far in the future, so you can restart the transcoding with the correct seek time.
The downside to this approach is that the file is actually being transcoded and saved to your file system while the transcoding is running, so you need to remove the files when the user stops requesting segments.
I've implemented this so it handles the things I've mentioned (long seeks, different resolution requests, waiting until a segment is finished, etc.). If you want to have a look at it, it's located here: Github Dose; the most interesting files are the transcoding class, the hlsManger class and the endpoint for the segments. I tried to explain this as well as I can, so I hope you can use it as some sort of base or idea on how to move forward.

NodeJs: How to pipe two streams into one spawned process stdin (i.e. ffmpeg) resulting in a single output

In order to convert PCM audio to MP3 I'm using the following:
function spawnFfmpeg() {
  var args = [
    '-f', 's16le',
    '-ar', '48000',
    '-ac', '1',
    '-i', 'pipe:0',
    '-acodec', 'libmp3lame',
    '-f', 'mp3',
    'pipe:1'
  ];
  var ffmpeg = spawn('ffmpeg', args);
  console.log('Spawning ffmpeg ' + args.join(' '));
  ffmpeg.on('exit', function (code) {
    console.log('FFMPEG child process exited with code ' + code);
  });
  ffmpeg.stderr.on('data', function (data) {
    console.log('Incoming data: ' + data);
  });
  return ffmpeg;
}
Then I pipe everything together:
writeStream = fs.createWriteStream( "live.mp3" );
var ffmpeg = spawnFfmpeg();
stream.pipe(ffmpeg.stdin);
ffmpeg.stdout.pipe(/* destination */);
The thing is... Now I want to merge (overlay) two streams into one. I already found how to do it with ffmpeg: How to overlay two audio files using ffmpeg
But, the ffmpeg command expects two inputs and so far I'm only able to pipe one input stream into the pipe:0 argument. How do I pipe two streams in the spawned command? Would something like ffmpeg -i pipe:0 -i pipe:0... work? How would I pipe the two incoming streams with PCM data (since the command expects two inputs)?
You could use named pipes for this, but that isn't going to work on all platforms.
I would instead do the mixing in Node.js. Since your audio is in normal PCM samples, that makes this easy. To mix, you simply add them together.
The first thing I would do is convert your PCM samples to a common format... 32-bit float. Next, you'll have to decide how you want to handle cases where both channels are running at the same time and both are carrying loud sounds such that the signal will "clip" by exceeding 1.0 or -1.0. One option is to simply cut each channel's sample value in half before adding them together.
Another option, depending on your desired output, is to let it exceed the normal range and pass it to FFmpeg. FFmpeg can take in 32-bit float samples. There, you can apply proper compression/limiting to bring the signal back under clipping before encoding to MP3.
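As a rough sketch of the mixing approach described above (assuming two Buffers of interleaved signed 16-bit little-endian samples; mixPcm16 and both buffer names are illustrative):

// Mix two s16le PCM buffers: scale to float, halve each channel to avoid
// clipping, sum, then convert back to s16le for ffmpeg's pipe:0 input.
function mixPcm16(bufA, bufB) {
  const samples = Math.min(bufA.length, bufB.length) / 2; // 2 bytes per sample
  const out = Buffer.alloc(samples * 2);
  for (let i = 0; i < samples; i++) {
    const a = bufA.readInt16LE(i * 2) / 32768; // scale to -1.0..1.0
    const b = bufB.readInt16LE(i * 2) / 32768;
    let mixed = a * 0.5 + b * 0.5; // halve each channel before summing
    mixed = Math.max(-1, Math.min(1, mixed)); // clamp just in case
    out.writeInt16LE(Math.round(mixed * 32767), i * 2);
  }
  return out;
}

The mixed buffer can then be written to ffmpeg.stdin exactly as the single stream is piped above, or kept as 32-bit float and fed to ffmpeg with -f f32le if you prefer to do the limiting there.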

node js ffmpeg complexfilter overlay another video

I have this code, which is intended to output a video containing vid1 and vid2 side by side. I add padding to the right of vid1 and tried to use overlay to put vid2 in that space, but instead the output video shows a duplicate of vid1 on the right. Can someone please tell me what is wrong with my code and how to fix it? Thanks.
ffmpeg("vid1.mp4")
.input("vid2.mp4")
.complexFilter([
"scale=300:300[rescaled]",
{
filter:"pad",options:{w:"600",h:"300"},
inputs:"rescaled",outputs:"padded"
},
{
filter:"overlay", options:{x:"300",y:"0"},
inputs:["padded","vid2.mp4"],outputs:"output"
}
], 'output')
.output("output.mp4")
.on("error",function(er){
console.log("error occured: "+er.message);
})
.on("end",function(){
console.log("success");
})
.run();
I used the following code in a previous project to do the same thing:
ffmpeg()
  .input("vid1.mp4")
  .input("vid2.mp4")
  .complexFilter([
    '[0:v]scale=300:300[0scaled]',
    '[1:v]scale=300:300[1scaled]',
    '[0scaled]pad=600:300[0padded]',
    '[0padded][1scaled]overlay=shortest=1:x=300[output]'
  ])
  .outputOptions([
    '-map [output]'
  ])
  .output("output.mp4")
  .on("error", function(er) {
    console.log("error occurred: " + er.message);
  })
  .on("end", function() {
    console.log("success");
  })
  .run();
Note that in this case, any audio from the video is disregarded and dropped. If you want audio as well, you will have to add complex mixdown filters that use the [0:a] and [1:a] channels as input.
The -map parameter in the outputOptions list tells ffmpeg to map the labelled [output] stream into the output.mp4 file. If you need audio, you will have to add another -map parameter to the outputOptions for the audio as well.
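For instance, a hedged sketch of how that could look (the amix filter and the [aout] label here are my own illustration, assuming both inputs carry an audio stream):

ffmpeg()
  .input("vid1.mp4")
  .input("vid2.mp4")
  .complexFilter([
    '[0:v]scale=300:300[0scaled]',
    '[1:v]scale=300:300[1scaled]',
    '[0scaled]pad=600:300[0padded]',
    '[0padded][1scaled]overlay=shortest=1:x=300[output]',
    // Mix the two audio tracks into a single stream labelled [aout]:
    '[0:a][1:a]amix=inputs=2:duration=shortest[aout]'
  ])
  .outputOptions([
    '-map [output]',
    '-map [aout]' // map the mixed audio alongside the video
  ])
  .output("output.mp4")
  .run();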
