How do you reliably serve images from a server? - node.js

This has got to be a simple question, I know. But this problem is driving me insane.
I have an app that allows users to upload both images & files that can be up to 10 MB in size.
I'm using Node.js on the backend to handle them and am currently saving the base64 into a blob in the database (MySQL). I figured this would be the issue, but I checked the database and it only had 3% average usage, so this wasn't the bottleneck. The EC2 instance doesn't go over 10% CPU, so this seems fine as well. NetworkIn averages around 200 MB and NetworkOut averages around 100 MB.
I moved things over to an S3 bucket; however, fetching the images still goes through the server, which is causing super slow loading times (like 10 seconds).
Should I just throw in the towel and move things over to Imgur, or does anybody know of something I can do or check?
Edit: Here's the monitor

Related

Nodejs memory gets filled too quick when uploading images ~10MB

Summary
Uploading an image from a Nodejs backend to AWS S3 via JIMP fills up heaps of memory.
Workflow
1. Frontend (React) sends the image via form submission to the API
2. The server parses the form data
3. JIMP rotates the image
4. JIMP resizes the image if it is > 1980px wide
5. JIMP creates a Buffer
6. The Buffer is uploaded to S3
7. Promise resolves -> image metadata (URL, bucket name, index, etc.) is saved in the database (MongoDB)
Background
The server is hosted on Heroku with only 512 MB of RAM. Uploading smaller images and all other requests work fine. However, the app crashes when uploading a single image larger than ~8 MB, even with only a single user online.
Investigation so far
I've tried to replicate this in my local environment. Since I don't have a memory restriction, the app won't crash, but memory usage is ~870 MB when uploading a 10 MB image. A 6 MB image stays around 60 MB of RAM usage. I've updated all the npm packages and have tried disabling any processing of the image.
I've tried to look for memory leaks, as seen in the following screenshots; however, following the same workflow as above for the same 6 MB image and taking 3 heap snapshots gives around 60 MB of RAM usage.
First, I thought the problem was that the image processing (resizing) takes too much memory, but this would not explain the big gap between 60 MB (for a 6 MB image) and around 800 MB for a 10 MB image.
Then I thought it was related to the item "system / JSArrayBufferData" (seen in Ref2), which takes around 30% of the memory. However, this item is always there, even when I do not upload an image. It only appears just before I stop recording the snapshot in the "Memory" tab of the Chrome dev tools. I'm still not 100% sure what exactly it is.
Now I believe this is related to the "TimeList" items (seen in Ref3). I think they come from timeouts while waiting for the file to be uploaded to S3. However, here as well, I'm absolutely not sure why this is happening.
The following are screenshots of what I consider the important parts of the snapshots, taken with the Chrome inspector attached to the Node.js server running with the --inspect flag.
Ref1: Shows the full items of the 3rd snapshot. All 3 snapshots were taken after uploading the same 6 MB image. Garbage seems to be properly collected, as the memory size did not increase.
Ref2: Shows the end of the 3rd snapshot, just before I stopped recording. I'm unsure what "system / JSArrayBufferData" is.
Ref3: Shows the end of the 5th snapshot; this is the one with the 10 MB image. The little, continuous spikes are the "TimeList" items, which seem to be related to a timeout. They appear when the server is waiting for a response from AWS, and this seems to be what's filling up the memory, as this item is not there when uploading something smaller than 10 MB.
Ref4: Shows the immediate end of the 5th snapshot, just before stopping the recording. "system / JSArrayBufferData" appears again, but only at the end.
Question
Unfortunately, I'm not sure how to articulate my question, as I don't know what the problem is or what I really need to look out for. I would be very appreciative of any tips or experiences.
The high memory consumption was caused by the package "Jimp", which was used to read the file, rotate it, resize it, and create a buffer to upload to the file storage system.
The reading step, i.e. Jimp.read('filename'), caused the memory problem. It's a known bug, as seen here: https://github.com/oliver-moran/jimp/issues/153
I've since switched to the 'sharp' image processing package and am now able to easily upload images and videos larger than 10 MB.
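For reference, the same rotate/resize/buffer workflow from the question looks roughly like this with sharp. This is only a sketch of the replacement pipeline, not the poster's actual code; the function name and the 1980px constant mirror the workflow described above, and it assumes sharp is installed (`npm install sharp`):

```javascript
// Sketch of the sharp-based pipeline replacing the Jimp one.
// MAX_WIDTH mirrors the "> 1980px wide" rule from the workflow above.
const MAX_WIDTH = 1980;

async function processImage(inputBuffer) {
  const sharp = require('sharp'); // loaded lazily; requires `npm install sharp`
  return sharp(inputBuffer)
    .rotate()                                                // auto-rotate using EXIF orientation
    .resize({ width: MAX_WIDTH, withoutEnlargement: true })  // shrink only, never upscale
    .toBuffer();                                             // resolves with the processed bytes
}
```

sharp streams pixels through its native libvips bindings rather than decoding the whole image into JS-heap buffers, which is why the memory profile is so much flatter than with Jimp.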
I hope this helps people running into this as well.
Cheers

My node app becomes slow even after increased the ram size

The Node app that I am working on becomes very slow and sometimes stops responding entirely. While checking the logs I found that there was a problem with memory: the app uses all 1400 MB of RAM. I searched for a solution and found advice to increase the memory, so I increased the max heap space to 6 GB, but the app still hits 6 GB, hangs for a period of time, and then restarts, which makes it very slow. Is there any way to solve this problem, like clearing the memory, or some other solution to make it fast?
Note: I am using Sequelize queries for SQL; I'm not sure whether the problem is because of that.
Thanks in advance

When uploading multiple small files it takes up to 60 seconds for it to complete

I'm trying to set up uploading to Google Cloud Storage, and typically I will have about 200 concurrent uploads of files that are 5 to 10 KB in size. When I use the same code with a local Ceph S3-compatible storage, upload time is barely more than 2-3 ms (which is obvious), and when uploading to Google's S3-like storage with 3 to 5 threads, upload time is usually within 200 ms per file. However, as soon as I reach decent concurrency, I get linear increases in the upload times.
The first 10 files are uploaded within 200 ms, the next 10 within 5 s, the next 10 within 10 s, and so on until it gets to 60 s.
If I use multiple processes, the result is the same. I'm using Node.js to perform the uploads with the https://github.com/Automattic/knox module. The pool is turned off, so it's not an issue of sockets being queued up; I've also tested enabling the pool with maxSockets set to 500 or so, which doesn't help much. When checking with sockstat, I only have up to 40 connections open to Google's servers concurrently, even though I initiate more than 500 to 1000 uploads at the same time using 16 processes. This is extremely weird.
Can anybody help me diagnose the problem? Is there a limit on the number of connections Google allows to be opened from a single IP address?
I'm sure it's not a problem with my code, because I was using the same code with local S3 storage beforehand (by local I mean a cluster of 20 machines with disks in the same data center). If there were a problem with blocking operations, a lack of sockets, or anything similar, I would have seen an increase in upload times there just as well, but there is no such thing when using Ceph. The reason I'm trying to migrate to Google is that managing dying hard drives is pretty annoying, and that happens often.
According to Google's quotas page,
Sockets
Daily Data and Per-Minute (Burst) Data Limits
Applications using sockets are rate limited on a per minute and a per day basis. Per minute limits are set to handle burst behavior from applications.
The page also shows the limits. You might be running into them, or it could be a limitation of the hardware Google has your app running on.
Maybe it's a disk throughput problem; check your server's iostat first. Maybe your disk is too slow to handle the file traffic, or you got stuck because of open-socket or file-descriptor limits.
If that is the case, some of these problems can be fixed via OS fine-tuning. If it's a disk latency problem, you can switch from HDD to SSD or increase your cluster size.
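Whatever the server-side cause, queueing like this (first batch fast, later batches linearly slower) is the classic symptom of firing far more requests than the remote end will service at once. A common client-side mitigation is to cap concurrency so requests don't pile up in ever-longer queues. A minimal stdlib sketch of such a limiter (the name `pLimit` is illustrative; `upload` stands in for whatever knox/GCS call you actually make):

```javascript
// Minimal promise-concurrency limiter: at most `max` tasks run at once;
// the rest wait in a FIFO queue until a slot frees up.
function pLimit(max) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= max || queue.length === 0) return;
    active++;
    const { fn, resolve, reject } = queue.shift();
    fn().then(resolve, reject).finally(() => { active--; next(); });
  };
  return (fn) => new Promise((resolve, reject) => {
    queue.push({ fn, resolve, reject });
    next();
  });
}

// Usage sketch: cap uploads at 20 in flight instead of 200+.
// const limit = pLimit(20);
// await Promise.all(files.map((f) => limit(() => upload(f))));
```

Tuning `max` to roughly the number of connections the provider actually services (around 40, per the sockstat observation above) keeps per-file latency flat instead of linearly growing.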

What is consuming memory in my Node JS application?

Background
I have a relatively simple node js application (essentially just expressjs + mongoose). It is currently running in production on an Ubuntu Server and serves about 20,000 page views per day.
Initially the application was running on a machine with 512 MB memory. Upon noticing that the server would essentially crash every so often I suspected that the application might be running out of memory, which was the case.
I have since moved the application to a server with 1 GB of memory. I have been monitoring the application and within a few minutes the application tends to reach about 200-250 MB of memory usage. Over longer periods of time (say 10+ hours) it seems that the amount keeps growing very slowly (I'm still investigating that).
I have since been trying to figure out what is consuming the memory. I have been going through my code and have not found any obvious memory leaks (for example, unclosed db connections and such).
Tests
I have implemented a handy heapdump function using node-heapdump, and I have now enabled --expose-gc to be able to manually trigger garbage collection. From time to time I trigger a manual GC to see what happens with the memory usage, but it seems to have no effect whatsoever.
I have also tried analysing heapdumps from time to time, but I'm not sure if what I'm seeing is normal or not. I do find it slightly suspicious that there is one entry with 93% of the retained size, but it just points to "builtins" (I'm not really sure what that signifies).
Upon inspecting the 2nd highest retained size (Buffer) I can see that it links back to the same "builtins" via a setTimeout function in some Native Code. I suspect it is cache or https related (_cache, slabBuffer, tls).
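For anyone trying the same experiment, the manual-GC measurement described above can be sketched like this. It assumes the process was started with `node --expose-gc app.js` (otherwise `global.gc` is undefined and the sketch degrades to a plain measurement); the function name is illustrative:

```javascript
// Measure heap usage before and after forcing a full garbage collection.
// If heapUsed barely drops, the memory is genuinely retained (a leak or a
// cache), not just garbage awaiting collection.
function gcAndMeasure() {
  const before = process.memoryUsage().heapUsed;
  if (typeof global.gc === 'function') {
    global.gc(); // only available when node was started with --expose-gc
  }
  const after = process.memoryUsage().heapUsed;
  return { before, after, freed: before - after };
}
```

If a forced GC frees almost nothing (as observed above), the growth is retained memory, so the heapdump diff is the right tool; if it frees a lot, the process is simply not under enough memory pressure for V8 to bother collecting.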
Questions
Does this look normal for a Node JS application?
Is anyone able to draw any sort of conclusion from this?
What exactly is "builtins" (does it refer to builtin js types)?

Socket.io CPU and Ram usage

Recently started playing with socket.io on my Digital Ocean droplet (1 core 1gb ram). I'm currently playing with twitter streams.
Currently, there is a single twitter stream which emits tweets only. The client takes the tweets and prints them to DOM.
The CPU usage constantly moves back and forth between 60% and 15% (generally around 30-40%), and RAM usage is around 150 MB.
This seems very weird to me as without socket.io things are a lot calmer.
Do you know what might be going on here?
If you're using Node, 150 MB of RAM might not be that atypical; Node starts at around ~100 MB. Do you have any sort of console logging in place to check when your events are being emitted? There might be lots of events you aren't seeing, and marking them with console.log statements might make that very apparent.
