I have an Express app that allows downloads via special IDs in the URL.
Somebody can add a file, say file.pdf, and someone else can download it via http://app/files/<64 char id>.
For the files/:id route I use res.download().
The download speed, however, never exceeds 300 kbit/s and averages around 162 kbit/s (measured with wget). This is in contrast to the 200 Mbit/s bandwidth that iperf3 reports between the two systems. The files are also placed on an SSD, so access speed should not be an issue.
The average file size is around 10 MB.
Is there any way I can speed res.download up, or how can I implement the same app with different, more performant Express functions?
We are experimenting with using Blazor WebAssembly with Angular. It works nicely, but Blazor requires a lot of DLLs to be loaded, so we decided to store them in Azure Blob Storage, served through the Microsoft CDN on Azure.
When we check the average latency as users start working, it shows values between 200-400 ms, but the maximum latency jumps to 5-6 minutes.
This happens for our usual workload of 1k-2k users over the course of 1 hour. If they don't have the Blazor files cached locally yet, that can be over 60 files per user requested from the CDN.
My question is whether this is expected behaviour or whether we have a bad configuration somewhere.
I mention Blazor WebAssembly just in case; I'm not sure whether the problem is specific to the way these files are loaded, or whether it is simply the large number of fetched files.
Thanks for any advice in advance.
I did check whether the files are served from cache, and from the response headers it seems so: x-cache: TCP_HIT. The byte hit ratio from the CDN profile also seems OK: mostly 100%, and it never falls below 65%.
The platform I'm working on involves a client (ReactJS) and a server (NodeJS, Express), of course.
The major feature of this platform involves users uploading images constantly.
Everything has been set up successfully using multer to receive images as form data on my API server, and now it's time to create an "image management system".
The main problem I'll be tackling is the unpredictable file size from users. The files are images, and they depend on the user's OS, i.e. users taking pictures or taking screenshots.
The first solution is to determine the max file size and transport the file to the API server using a compression algorithm. When the backend receives it successfully, the image is uploaded to a CDN (Cloudinary), and the link is stored in the database along with other related records.
The second, which I'm strongly leaning towards, is shifting this "upload to CDN" step to the client side: the client connects to Cloudinary directly, grabs the secure link afterwards, and inserts it into the JSON that is sent to the server.
This eliminates the problem of grappling with file sizes, which is progress, but I'd like to know whether it is good practice.
Restricting the file size is possible when using the Cloudinary Upload Widget for client-side uploads.
You can include the 'maxFileSize' parameter when calling the widget, setting its value to 500000 (the value should be provided in bytes).
https://cloudinary.com/documentation/upload_widget
If the client tries to upload a larger file, he/she will get back an error stating that the max file size was exceeded, and the upload will fail.
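A minimal sketch of the widget setup described above; cloudName and uploadPreset are placeholders for your own account values, and the widget script is assumed to be loaded in the browser:

```javascript
// Placeholder account values; maxFileSize is given in bytes (500000 ≈ 500 kB).
const widgetOptions = {
  cloudName: 'my-cloud',       // placeholder
  uploadPreset: 'my_unsigned', // placeholder
  maxFileSize: 500000,         // uploads above this size fail client-side
};

// In the browser, after loading the widget script
// (https://upload-widget.cloudinary.com/global/all.js):
if (typeof cloudinary !== 'undefined') {
  const widget = cloudinary.createUploadWidget(widgetOptions, (error, result) => {
    if (!error && result && result.event === 'success') {
      // result.info.secure_url is the CDN link to store in your database
      console.log(result.info.secure_url);
    }
  });
  widget.open();
}
```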
Alternatively, you can choose to limit the dimensions of the image: if they are exceeded, instead of failing the upload with an error, the image will automatically be scaled down to the given dimensions while retaining its aspect ratio, and the upload request will succeed.
However, this method doesn't guarantee that the uploaded file will be below a certain desired size (e.g. 500 KB), as each image is different: one image scaled down to the given dimensions may result in a file size below your threshold, while another may slightly exceed it.
This can be achieved using the limit cropping method as part of an incoming transformation.
https://cloudinary.com/documentation/image_transformations#limit
https://cloudinary.com/documentation/upload_images#incoming_transformations
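As a rough sketch, an incoming transformation using the limit crop could look like this with the Node SDK; the 1600x1600 bound is an example value, not a recommendation:

```javascript
// Incoming-transformation options for a server-side upload; images larger
// than the bound are scaled down (keeping aspect ratio), smaller images
// are stored untouched.
const uploadOptions = {
  transformation: [{ width: 1600, height: 1600, crop: 'limit' }],
};

// With the Cloudinary Node SDK this would be used as, e.g.:
//   cloudinary.v2.uploader.upload('photo.jpg', uploadOptions);
```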
The goal of the app is to generate a PDF using Puppeteer: we fetch the data, build the HTML template, generate the PDF with headless Chrome, and then return a link to the newly generated PDF.
The issue is that it takes about 7000 ms to generate a PDF, mainly because of three Puppeteer functions: launch (launch the headless browser), goto (navigate to the HTML template) and pdf (generate the PDF).
So, with around 7-8 seconds to answer one request, more incoming requests or a sudden spike could easily push the total to 40-50 seconds for 30 simultaneous requests, which I find unacceptable.
After much time spent on research, I will implement the cluster module to take advantage of multiple processes.
But besides clustering, are there any other possible options to optimize the time on a single instance?
There are some things to consider...
Consider calling puppeteer.launch once per application start. Your conversion script then just checks whether a browser instance already exists and reuses it by calling newPage(), which basically creates a new tab, instead of creating a new browser every time.
You may consider intercepting requests with page.on('request', this.onPageRequest) when calling goto(), and filtering out certain types of files the page is loading that you don't need for PDF rendering; you may filter out external resources as well if that applies to your case.
When using pdf(), you may return a Buffer from your service instead of writing to the file system and returning a link to the created PDF file. This may or may not speed things up, depending on your service setup; in any case, less IO should be better.
This is probably all you can do for a single instance of your app. With the implementation above, a regular (couple of pages) PDF with a few images renders for me in 1-2 seconds.
To speed things up further, use clustering. Rather than embedding it inside your application, you may consider using the PM2 process manager to start and scale multiple instances of your service.
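A sketch combining the three suggestions above, assuming puppeteer is installed; the set of blocked resource types is an example filter, so tune it to what your template actually needs:

```javascript
// Example filter: resource types the PDF rendering does not need.
const BLOCKED = new Set(['image', 'font', 'media']);
const shouldAbort = (type) => BLOCKED.has(type);

let browserPromise = null; // launched once, reused for every request
function getBrowser() {
  if (!browserPromise) browserPromise = require('puppeteer').launch();
  return browserPromise;
}

async function renderPdf(html) {
  const browser = await getBrowser();
  const page = await browser.newPage(); // a new tab, not a new browser
  try {
    // Intercept requests and drop the ones we filtered out above.
    await page.setRequestInterception(true);
    page.on('request', (req) =>
      shouldAbort(req.resourceType()) ? req.abort() : req.continue()
    );
    await page.setContent(html, { waitUntil: 'networkidle0' });
    return await page.pdf({ format: 'A4' }); // a Buffer, no file-system round trip
  } finally {
    await page.close();
  }
}
```

The returned Buffer can be sent straight to the HTTP response, which avoids the write-then-read round trip through the file system.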
I am running an Express website on an Azure Website instance (note that I say Azure Website, not Azure Web Role).
Initially, uploading large files failed with an HTTP 500 error. After much research, I found that the solution is to manually adjust the value of the <requestLimits maxAllowedContentLength="xxxxxxx" /> parameter in the web.config file. I increased that value to 1 GB, and large files started to upload successfully.
However, when I increase maxAllowedContentLength to something much larger (say, 5 GB or 10 GB), the website does not even start up anymore. It looks like there is a hard-coded limit on how large this parameter can be.
Does anyone have links to documentation where Microsoft specifies the max value of this parameter for an Azure Website, or any pointers on how to get files of up to 10 GB uploaded?
maxAllowedContentLength is a uint, which has a max value of 4,294,967,295, making the limit 4 GB. If you want to upload larger amounts of data, you will have to use chunked transfer encoding.
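For reference, a web.config fragment with the parameter at its uint maximum would look like the following sketch:

```xml
<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- uint maximum: 4,294,967,295 bytes, i.e. just under 4 GB -->
        <requestLimits maxAllowedContentLength="4294967295" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
```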
I am working on my client's pure HTML/CSS website, with data bindings to JSON datasets using Knockout.js. For tables I have used the DataTables library.
I have hosted the website on Windows Azure Websites.
Here is the link to the website: http://bit.ly/(REMOVED SINCE IT IS CONFIDENTIAL)
It takes around 4 seconds to load the website even though I have used a CDN for the common JS libraries.
It should not have that much load time, and I am unable to find the culprit. I am fetching data from 4 different datasets; does that impact performance? Or is there a problem with the Windows Azure datacenter, since it takes a while to get a response from the Azure server? Is Azure the culprit?
You can examine the page load time on the website link given above.
Any help would be appreciated.
Solution:
Instead of using synchronous calls, I used:
$.getJSON(url, function (data) {
    // whole Knockout.js logic and bindings
});
All model .js files (starting with patientMedicationChart-Index.js) are loaded synchronously (async: false is set in that file). This means the browser has to wait for each script file to finish loading before continuing to load the next.
I count about 10 files loaded like that for your demo, each of which takes (for me) about 200 ms to load (about 95% of that 200 ms is spent waiting for a response, which also seems rather slow; that might be a server issue with Azure). Ten of those is already 2 seconds spent loading these files, and only after all of them have loaded will the page's ready event be triggered.
There might be a reason for wanting to load those files synchronously, but as it is, it's causing a significant part of the loading time for the entire page.
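If the files do have to arrive before the bindings are applied, the requests can at least run in parallel rather than one blocking synchronous call at a time. A minimal sketch; fetchJson is injectable so it can be jQuery's $.getJSON, window.fetch, or anything else returning a promise of parsed JSON, and the dataset URLs below are placeholders:

```javascript
// Load all datasets concurrently and resolve once every one has arrived.
function loadAll(urls, fetchJson) {
  return Promise.all(urls.map((u) => fetchJson(u)));
}

// Browser usage (hypothetical dataset names):
//   loadAll(
//     ['/data/patients.json', '/data/medications.json'],
//     (u) => fetch(u).then((r) => r.json())
//   ).then(([patients, medications]) => {
//     // apply the Knockout bindings once, after everything has arrived
//   });
```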