I have a background script that makes repeated AJAX requests to an Amazon S3 storage bucket.
I've noticed that each time a new request is made to a different object in the same bucket, the background script has to wait for a DNS lookup (~50ms) and an SSL handshake (~80ms).
For some reason I cannot identify, the background script has no memory that it has made previous requests to the same Amazon S3 server / bucket.
I have tried to solve this by adding a preconnect link to the top of the background page, but this has no effect. Pinging the bucket with a HEAD request would not actually save me any time either.
Any ideas from Chrome Extension wizards?
Related
I have an API in Node.js that receives images. I noticed that whenever I start sending an image to the API, requests from other users are blocked until the image has been completely received. How can I resolve this and prevent other requests from being blocked or queued?
I have a Node app that takes a URL, scrapes some text with Puppeteer, and translates it using DeepL before sending me back the result in a txt file. It works as expected locally, but since I have a lot of URLs to visit and want to learn, I'm trying to make this app work with AWS Lambda and a Docker image.
I was thinking about using a GET/POST request to send the URL to API Gateway to trigger my Lambda and wait for it to send me back the txt file. The issue is that the whole process takes 2-3 minutes to complete and send back the file. That is not a problem locally, but I know you should not have an HTTP request wait for 3 minutes before returning.
I don't really know how to tackle this problem. Should I create a local server and make the Lambda post a request to my IP address once it is done?
I'm at a loss here.
Thanks in advance!
There are a few alternatives for what seems to be an asynchronous processing concern.
Poke the Lambda with the data it needs (via an API, SDK, or CLI), then have it write its results to an S3 bucket. You can then poll the S3 bucket for the results asynchronously and pull them down. This requires some scripting.
Another approach would be to have the Lambda post the results to an SNS topic that you've subscribed to.
That said, I'm not entirely sure what is meant by local IP, but I would avoid pushing data directly to a self-managed server (or your local IP). Instead, I would use one of the AWS "decoupling" services like SNS, SQS, or even S3 to split apart the processing steps. This way it's possible to make many requests and pull down the data as needed.
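The decoupled flow described above (submit the work, return right away, fetch the result later) can be sketched as follows. This is a minimal in-memory illustration only: `JobStore`, `submit`, `complete`, and `poll` are hypothetical names, and the `Map` stands in for whatever S3 bucket, SQS queue, or SNS topic would hold the results in a real deployment.

```typescript
// Hypothetical in-memory stand-in for an S3 bucket (or queue) that holds results.
type JobStatus =
  | { state: "pending" }
  | { state: "done"; result: string };

class JobStore {
  private jobs = new Map<string, JobStatus>();
  private nextId = 1;

  // Called by the HTTP handler: record the job and return an id immediately,
  // instead of holding the request open for minutes.
  submit(): string {
    const id = `job-${this.nextId++}`;
    this.jobs.set(id, { state: "pending" });
    return id;
  }

  // Called by the worker (the Lambda, in the scenario above) when it finishes.
  complete(id: string, result: string): void {
    if (!this.jobs.has(id)) {
      throw new Error(`unknown job: ${id}`);
    }
    this.jobs.set(id, { state: "done", result });
  }

  // Called by the client on each poll; cheap and returns at once.
  poll(id: string): JobStatus | undefined {
    return this.jobs.get(id);
  }
}

const jobStore = new JobStore();
const jobId = jobStore.submit();
const statusBefore = jobStore.poll(jobId); // still pending
jobStore.complete(jobId, "translated text");
const statusAfter = jobStore.poll(jobId);  // done, carries the result
```

The point of the design is that no single HTTP request spans the whole 2-3 minute job: the submit call and each poll return quickly, and the slow work happens in between.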
I've got a Laravel service that loads a React page that fires off 30+ axios calls after loading. When I look at the source tab, it looks like only 3 of the calls are being processed at a time.
I'm testing this by connecting to the AWS RDS instance from my local environment. I tried using a db.t3.medium and a db.t3.large with no noticeable change.
The application has multiple database connections. Each request uses all three connections to gather the required data. All of the requests execute the exact same query against one database, and then each request executes a query on a different table in the second database.
Is there a reason why AWS isn't processing all of my requests simultaneously?
You aren't looking at the right performance indicator. You are looking at your browser's network console, and your browser limits the number of requests it can make to the same host simultaneously.
You can find more information here: Max parallel http connections in a browser?
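To see why the network panel shows requests running in small groups, the sketch below simulates a per-host connection cap. `finishTimes` is a hypothetical helper, and the cap of 6 and the 100 ms request duration are assumed values for illustration (browsers commonly allow about six HTTP/1.1 connections per host, but the exact number depends on the browser and protocol).

```typescript
// Simulate requests that each hold one connection slot for durationsMs[i].
// Returns the time at which each request finishes when at most `limit` run at once.
function finishTimes(durationsMs: number[], limit: number): number[] {
  if (limit < 1) {
    throw new Error("limit must be at least 1");
  }
  // Each entry is the time at which that connection slot becomes free again.
  const freeAt: number[] = new Array<number>(limit).fill(0);
  return durationsMs.map((duration) => {
    // The request waits for whichever slot frees up first.
    const earliest = Math.min(...freeAt);
    const slot = freeAt.indexOf(earliest);
    const finished = earliest + duration;
    freeAt[slot] = finished;
    return finished;
  });
}

const thirtyCalls: number[] = new Array<number>(30).fill(100);
const cappedTotal = Math.max(...finishTimes(thirtyCalls, 6));    // 500
const uncappedTotal = Math.max(...finishTimes(thirtyCalls, 30)); // 100
```

Under these assumptions the same 30 calls take five times longer end to end with the cap, even though the server behind them is no slower.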
I have a Node.js REST API which has endpoints for users to upload assets (mostly images). I distribute my assets through a CDN. Right now I call my endpoint /assets/upload with a multipart form; the API creates the DB resource for the asset and then uses SFTP to transfer the image to the CDN origin. Upon success I respond with the URL of the uploaded asset.
I noticed that the most expensive operation for relatively small files is the connection to the origin through SFTP.
So my first question is:
1. Is it a bad idea to always keep the connection alive so that I can always reuse it to sync my files?
My second question is:
2. Is it a bad idea to have my API handle the SFTP transfer to the CDN origin? Should I consider having a CDN origin that could handle the HTTP request itself?
Short Answer: (1) It is not a bad idea to keep the connection alive, but it comes with complications. I recommend trying without reusing connections first. (2) The upload should go through the API, but there may be ways to optimize how the API-to-CDN transfer happens.
Long Answer:
1. Is it a bad idea to always keep the connection alive so that I can always reuse it to sync my files?
Keeping the connection alive is generally not a bad idea. Reusing connections can improve site performance.
However, it does come with some complications. You need to make sure the connection is up. You need to make sure that if the connection went down you recreate it. There are cases where the SFTP client thinks that the connection is still alive, but it actually isn't, and you need to do a retry. You also need to make sure that while one request is using a connection, no other requests can do so. You would possibly want a pool of connections to work with, so that you can service multiple requests at the same time.
If you're lucky, the SFTP client library already handles this (see if it supports connection pools). If you aren't, you will have to do it yourself.
My recommendation - try to do it without reusing the connection first, and see if the site's performance is acceptable. If it isn't, then consider reusing connections. Be careful though.
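The bookkeeping described above (check liveness, recreate dead connections, never share one connection between two requests) can be sketched as a small pool. This is an illustration, not a real SFTP client: `Pool`, `FakeConnection`, and the `create` and `isAlive` callbacks are hypothetical, and an actual SFTP library may already provide equivalent behavior.

```typescript
// Hypothetical stand-in for an SFTP session; a real client would hold a socket.
interface FakeConnection {
  id: number;
  alive: boolean;
}

class Pool<T> {
  private idle: T[] = [];
  private inUse = new Set<T>();
  private create: () => T;
  private isAlive: (conn: T) => boolean;
  private maxSize: number;

  constructor(create: () => T, isAlive: (conn: T) => boolean, maxSize: number) {
    this.create = create;
    this.isAlive = isAlive;
    this.maxSize = maxSize;
  }

  // Hand out a healthy connection, or undefined when the pool is exhausted.
  acquire(): T | undefined {
    // Discard idle connections that died while waiting.
    let conn = this.idle.pop();
    while (conn !== undefined && !this.isAlive(conn)) {
      conn = this.idle.pop();
    }
    if (conn === undefined) {
      if (this.inUse.size >= this.maxSize) return undefined;
      conn = this.create();
    }
    this.inUse.add(conn);
    return conn;
  }

  // Give a connection back; dead ones are dropped instead of being pooled.
  release(conn: T): void {
    if (this.inUse.delete(conn) && this.isAlive(conn)) {
      this.idle.push(conn);
    }
  }
}

let createdCount = 0;
const sftpPool = new Pool<FakeConnection>(
  () => ({ id: ++createdCount, alive: true }),
  (conn) => conn.alive,
  2,
);

const firstConn = sftpPool.acquire();  // new connection, id 1
const secondConn = sftpPool.acquire(); // new connection, id 2
const thirdConn = sftpPool.acquire();  // undefined: both connections are busy
if (firstConn) sftpPool.release(firstConn);
const reusedConn = sftpPool.acquire(); // the same object as firstConn
if (reusedConn) {
  reusedConn.alive = false;            // simulate a dropped connection
  sftpPool.release(reusedConn);        // discarded rather than pooled
}
const replacementConn = sftpPool.acquire(); // fresh connection, id 3
```

A production version would also need to detect the case mentioned above where the client believes a connection is alive when it is not, typically by retrying the operation on a fresh connection.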
2. Is it a bad idea to have my API handle the SFTP transfer to the CDN origin? Should I consider having a CDN origin that could handle the HTTP request itself?
It is generally a good idea to have the HTTP request go through the API for a couple of reasons:
For security reasons, you want your CDN upload credentials to be stored on your API, and not on your client (website or mobile app). You should assume that your website's code can be seen (via view source) and that people can decompile or reverse engineer mobile apps, so they would be able to see your credentials in the code.
This hides implementation details from the client, so you can change this in the future without the client code needing to change.
@tarun-lalwani's suggestion is actually a good one: use S3 to store the image, and use a Lambda trigger to upload it to the CDN. There are a couple of Node.js libraries that allow you to stream the image through your API's HTTP request towards the S3 bucket directly. This means that you don't have to worry about disk space on your machine instance.
Regarding your question on @tarun-lalwani's comment: one way to do it is to use the S3 image URL until the Lambda function is finished. S3 can serve images too, if given the proper permissions. Then, after the Lambda function has finished uploading to the CDN, you just replace the image path in your DB.
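The fallback described above (serve from S3 until the CDN copy exists, then switch) could look roughly like this. The `AssetRecord` shape, its field names, and the example URLs are assumptions for illustration; the original post does not describe the actual schema.

```typescript
// Hypothetical shape of the asset row stored in the database.
interface AssetRecord {
  id: string;
  s3Url: string;         // set as soon as the upload to S3 succeeds
  cdnUrl: string | null; // filled in later, once the Lambda has synced the file
}

// Prefer the CDN copy when it exists; otherwise fall back to S3.
function publicUrl(asset: AssetRecord): string {
  return asset.cdnUrl ?? asset.s3Url;
}

// What the Lambda's completion step might do: produce the updated record.
function markSynced(asset: AssetRecord, cdnUrl: string): AssetRecord {
  return { ...asset, cdnUrl };
}

const uploadedAsset: AssetRecord = {
  id: "a1",
  s3Url: "https://example-bucket.s3.amazonaws.com/a1.png",
  cdnUrl: null,
};
const urlBeforeSync = publicUrl(uploadedAsset); // the S3 URL
const syncedAsset = markSynced(uploadedAsset, "https://cdn.example.com/a1.png");
const urlAfterSync = publicUrl(syncedAsset);    // the CDN URL
```

Keeping both URLs on the record, rather than overwriting one path, means the API can respond immediately after the S3 upload and clients pick up the CDN URL on their next read.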
In HTTP 1.0 I know that a new socket connection is made as soon as the browser sends a new GET request. I was wondering if the browser sends the GET request for each individual file in the website. For example, let's say we have a static website with 3 image files and the index.html file. When we connect to the server, does the browser send 4 separate requests (aka 4 different connections), or does it only connect to the website once and retrieve all the content (aka only 1 connection is enough)?
As explained in this answer (regarding HTTP 1.0 vs 1.1), in v1.0 every request is sent over a separate connection, so that would be 4. However, thanks to caching mechanisms (which are available in v1.0), the browser might not send any request at all, and hence not open any connection.
If you open the developer console in a browser and look at the Network tab (in Chrome), it shows you all of the requests that are made. The browser makes an individual request for each resource. Also, if an image is used 20 times, it is requested once and displayed 20 times. Although all of these requests are made separately, they may still go through the same connection, because a request and a connection are not the same thing. Hope this gives you a bit of direction. These two links may give you a bit more information on connections to the server.
https://en.wikipedia.org/wiki/HTTP_persistent_connection
https://en.wikipedia.org/wiki/HTTP_pipelining
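As a rough model of the two answers above, the sketch below counts requests and connections for the example page. `countTraffic` and its inputs are hypothetical. It assumes that cached resources need no revalidation and that the keep-alive case uses a single persistent connection, whereas real browsers usually open several connections in parallel.

```typescript
interface Traffic {
  requests: number;
  connections: number;
}

// urls: every resource reference on the page (duplicates allowed).
// cached: URLs the browser can serve from its cache without contacting the server.
// persistent: false models HTTP/1.0 without keep-alive; true models one reused connection.
function countTraffic(urls: string[], cached: string[], persistent: boolean): Traffic {
  const cachedSet = new Set(cached);
  // A resource referenced many times is still fetched only once.
  const unique = new Set(urls);
  let requests = 0;
  unique.forEach((url) => {
    if (!cachedSet.has(url)) requests++;
  });
  // With a persistent connection, all requests share one connection (if any are sent).
  const connections = persistent ? Math.min(requests, 1) : requests;
  return { requests, connections };
}

const pageResources = ["/index.html", "/a.png", "/b.png", "/c.png", "/a.png"];
const http10 = countTraffic(pageResources, [], false);               // 4 requests, 4 connections
const keepAlive = countTraffic(pageResources, [], true);             // 4 requests, 1 connection
const allCached = countTraffic(pageResources, pageResources, false); // 0 requests, 0 connections
```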