Read an image with Jimp while processing previous image - node.js

This may have been asked before, but I couldn't find it!
I need to process a series of image files with jimp, and I'm trying to be as efficient as possible.
My current program simply uses "await jimp.read(...)" synchronously before processing it. What I'd like to do is read the images asynchronously, but here's the catch: I need to process the results in order.
At a minimum, I'm thinking I need to read the next image asynchronously while processing the current image, then (when finished processing) wait for the next image to finish being read before starting the next iteration, if that makes sense.
Is there a pattern I should use for doing this? Is there a better approach than what I describe above?
Thanks!
Edited to include a summary of my current approach:
image = await jimp.read("image_1");
for (let i = 1; i <= totalImages; i++) {
    if (i < totalImages)
        var imagePromise = jimp.read("image_" + (i + 1));
    processImage(image);
    image = await imagePromise;
}
It just processes one image while reading the next, but it may need to (a)wait if the next image hasn't been completely read yet.
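For what it's worth, here is a slightly tidier sketch of the same prefetch idea, with the await moved to the top of the loop (the require style may vary by Jimp version, and processImage() plus the "image_N" names are just the placeholders from above):
const jimp = require("jimp"); // import style may differ depending on the Jimp version

// Prefetch pattern: start reading image i+1, process image i,
// then wait for the prefetched read before the next iteration.
async function processAll(totalImages) {
    let nextImage = jimp.read("image_1");
    for (let i = 1; i <= totalImages; i++) {
        const image = await nextImage;                 // wait for the read started earlier
        if (i < totalImages) {
            nextImage = jimp.read("image_" + (i + 1)); // kick off the next read now
        }
        processImage(image);                           // runs while the next read is in flight
    }
}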

Related

Unable to use one readable stream to write to two different targets in Node JS

I have a client side app where users can upload an image. I receive this image in my Node JS app as readable data and then manipulate it before saving like this:
uploadPhoto: async (server, request) => {
    try {
        const randomString = `${uuidv4()}.jpg`;
        const stream = Fse.createWriteStream(`${rootUploadPath}/${userId}/${randomString}`);
        const resizer = Sharp()
            .resize({
                width: 450
            });
        await data.file
            .pipe(resizer)
            .pipe(stream);
This works fine, and writes the file to the project's local directory. The problem comes when I try to use the same readable data again in the same async function. Please note, all of this code is in a try block.
const stream2 = Fse.createWriteStream(`${rootUploadPath}/${userId}/thumb_${randomString}`);
const resizer2 = Sharp()
    .resize({
        width: 45
    });
await data.file
    .pipe(resizer2)
    .pipe(stream2);
The second file is written, but when I check the file, it seems corrupted or didn't successfully write the data. The first image is always fine.
I've tried a few things and found one method that seems to work, but I don't understand why. I add this code just before I create the second write stream:
data.file.on('end', () => {
    console.log('There will be no more data.');
});
Putting the code for the second write stream inside the on-end callback block doesn't make a difference; however, if I leave the code outside of the block, between the first write stream code and the second write stream code, then it works and both files are successfully written.
It doesn't feel right leaving the code the way it is. Is there a better way I can write the second thumbnail image? I've tried using the Sharp module to read the file after the first write stream writes the data and then create a smaller version of it, but it doesn't work; the file never seems to be ready to use.
You have two alternatives, depending on how your software is designed.
If possible, I would avoid executing two transform operations on the same stream in the same "context", e.g. an API endpoint. I would rather separate those two different transforms so they do not work on the same input stream.
If that is not possible or would require too many changes, the solution is to fork the input stream and then pipe it into two different Writables. I normally use Highland.js fork for these tasks.
Please also see my comments on how to properly handle streams with async/await to check when the write operation is finished.
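For illustration, a minimal sketch of the fork-and-pipe idea using Node's built-in PassThrough and stream.pipeline instead of Highland.js (data.file, Fse, Sharp and the path variables are taken from the question; assumes a Node version that ships stream/promises):
const { PassThrough } = require('stream');
const { pipeline } = require('stream/promises');

// Fork the upload into two branches and resize each independently.
const fullBranch = new PassThrough();
const thumbBranch = new PassThrough();
data.file.pipe(fullBranch);
data.file.pipe(thumbBranch);

// pipeline() resolves only when each write has actually finished.
await Promise.all([
    pipeline(fullBranch, Sharp().resize({ width: 450 }),
        Fse.createWriteStream(`${rootUploadPath}/${userId}/${randomString}`)),
    pipeline(thumbBranch, Sharp().resize({ width: 45 }),
        Fse.createWriteStream(`${rootUploadPath}/${userId}/thumb_${randomString}`)),
]);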

Is there a way to keep node js cluster send() messages in order

Using some added sequence checks below, I see that messages are sometimes arriving out of order and that breaks the code.
I am thinking I must queue up out-of-order messages upon receive to make sure things get processed in order.
Is this just the nature of NodeJS ?
// In the master process:
msg.sequence = next_sequence[i]++;
worker[i].send(msg);

// In worker(s):
process.on("message", handler);
....
var last_sequence = 0;
function handler(msg) {
    if (last_sequence + 1 != msg.sequence) console.log(...);
    last_sequence = msg.sequence;
}
After using send(JSON.stringify(msg)) and JSON.parse when receiving, the behavior seems more deterministic and the message sequence numbers arrive in order.
So it seems that send() does not copy the data immediately, and the object can still be changed for a short while after send() is called.
Can anyone confirm this ?
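That matches the workaround described above: handing send() its own copy of the message, so later mutations in the master cannot race with serialization. A minimal sketch:
// In the master process: send a deep copy so the original object can be reused safely.
msg.sequence = next_sequence[i]++;
worker[i].send(JSON.parse(JSON.stringify(msg)));
// On Node 17+, structuredClone(msg) is an alternative to the JSON round-trip.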

nodejs vs. ruby / understanding requests processing order

I have a simple utility that I use to resize images on the fly via URL params.
Having some trouble with the Ruby image libraries (CMYK to RGB is, how to say… "unavailable"), I gave it a shot via Node.js, which solved the issue.
Basically, if the image does not exist, Node or Ruby transforms it. Otherwise, when the image has already been requested/transformed, the Ruby or Node processes aren't touched and the image is returned statically.
The Ruby version works perfectly. It is a bit slow if a lot of transforms are requested at once, but very stable; it always goes through whatever the amount (I see the images arriving on the page one after another).
With Node it also works, but when a large number of images is requested for a single page load, the first image is transformed, then all the other requests return the very same image (the last one transformed). If I refresh the page, the first image (already transformed) is returned right away, the second one is returned correctly transformed, but then all the other images returned are the same as the one just transformed, and it goes on like that for every refresh. Not optimal; basically the requests get "merged" at some point and all return the same image, for reasons I don't understand.
(When I say "large amount", I mean more than 1.)
The ruby version :
get "/:commands/*" do |commands,remote_path|
path = "./public/#{commands}/#{remote_path}"
root_domain = request.host.split(/\./).last(2).join(".")
url = "https://storage.googleapis.com/thebucket/store/#{remote_path}"
img = Dragonfly.app.fetch_url(url)
resized_img = img.thumb(commands).to_response(env)
return resized_img
end
The node js version :
app.get('/:transform/:id', function (req, res, next) {
    parser.parse(req.params, function (resized_img) {
        // the transforms are done via lovell/sharp
        // parser.parse parses the params, writes the file,
        // and returns the file path
        // then:
        fs.readFileSync(resized_img, function (error, data) {
            res.write(data)
            res.end()
        })
    })
})
Feels like I'm missing a crucial point in Node here. I expected the same behaviour with Node and Ruby, but obviously the same pattern transposed to Node just does not work as expected. Node is not waiting for a request to finish processing, but rather processes requests in an order that is not clear to me.
I also understand that I'm not putting the right words to describe the issue; I'm hoping it might speak to some experienced users who can provide clarifications for a better understanding of what happens behind the Node scenes.

Mongoose js batch find

I'm using mongoose 3.8. I need to fetch 100 documents, execute the callback function, then fetch the next 100 documents and do the same thing.
I thought .batchSize() would do the same thing, but I'm getting all the data at once.
Do I have to use limit or offset? If yes, can someone give a proper example to do it?
If it can be done with batchSize, why is it not working for me?
MySchema.find({}).batchSize(20).exec(function (err, docs) {
    console.log(docs.length);
});
I thought it would print 20 each time, but it's printing the whole count.
This link has the information you need.
You can do this,
var pagesize=100;
MySchema.find().skip(pagesize*(n-1)).limit(pagesize);
where n is the parameter you receive in the request, which is the page number client wants to receive.
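If the goal is to walk the whole collection 100 documents at a time rather than serve pages to a client, the same skip/limit idea can be looped. A sketch (handleBatch() is a placeholder for the per-batch work):
var pagesize = 100;

function processPage(n) {
    MySchema.find({}).skip(pagesize * (n - 1)).limit(pagesize)
        .exec(function (err, docs) {
            if (err || docs.length === 0) return; // stop on error or when the collection is exhausted
            handleBatch(docs);                    // work on this batch of up to 100 documents
            processPage(n + 1);                   // then fetch the next page
        });
}

processPage(1);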
The docs say:
In most cases, modifying the batch size will not affect the user or the application, as the mongo shell and most drivers return results as if MongoDB returned a single batch.
You may want to take a look at streams and perhaps try to accumulate subresults:
var stream = Dummy.find({}).stream();
stream.on('data', function (dummy) {
    callback(dummy);
});
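Building on that, a sketch that accumulates the streamed documents into batches of 100 before invoking the callback (the batch size and the callback name are assumptions):
var batch = [];
var stream = Dummy.find({}).stream();

stream.on('data', function (dummy) {
    batch.push(dummy);
    if (batch.length === 100) {
        callback(batch); // hand over a full batch
        batch = [];
    }
});

stream.on('close', function () {
    if (batch.length > 0) callback(batch); // flush the final partial batch
});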

How do I load lines from a file at an interval in NodeJS without loading the whole file into memory?

I'm not very experienced in Node, but I'd like to load a file one line at a time and process it, with two special constraints:
I don't want to load the entire file into memory (it could be huge)
I want to process each line about a second apart. Ideally, at a random interval between 100ms and 2000ms
Or another way of looking at the problem: I want to treat a file as a test stream of data.
Everything I've found thus far seems to involve either loading the whole thing into an array at once, or loading it line by line but doing so pretty much instantly.
Hmm, just found this. Dunno if there's a better way to do it, but it seems to be pretty straightforward using line-by-line:
var Random = require('random-js')();
var LineByLineReader = require('line-by-line'),
    lr = new LineByLineReader(filename);

lr.on('line', function (line) {
    lr.pause(); // pause so the next line isn't read right away
    sendToStream(line);
    setTimeout(function () {
        lr.resume(); // resume after a random interval
    }, Random.integer(100, 2000));
});
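An alternative sketch using Node's built-in readline module instead of the line-by-line package, with the same pause/resume trick (filename and sendToStream() are the placeholders from above):
const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({ input: fs.createReadStream(filename) });

rl.on('line', (line) => {
    rl.pause();                        // stop pulling more input for now
    sendToStream(line);
    setTimeout(() => rl.resume(), 100 + Math.floor(Math.random() * 1900));
});
// Note: readline may still emit a few already-buffered lines after pause(),
// so the spacing is approximate rather than strict.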
