I have a node.js server running behind an nginx proxy. node.js is running an HTTP 1.1 (no SSL) server on port 3000. Both are running on the same server.
I recently set up nginx to use HTTP2 with SSL (h2). It seems that HTTP2 is indeed enabled and working.
However, I want to know whether the fact that the proxy connection (nginx <--> node.js) is using HTTP 1.1 affects performance. That is, am I missing the HTTP2 benefits in terms of speed because my internal connection is HTTP 1.1?
In general, the biggest immediate benefit of HTTP/2 is the speed increase offered by multiplexing for browser connections, which are often hampered by high latency (i.e. slow round trip times). Multiplexing also reduces the need for (and cost of) multiple connections, which are a workaround used to achieve similar performance in HTTP/1.1.
For internal connections (e.g. between a webserver acting as a reverse proxy and back end app servers) the latency is typically very, very low, so the speed benefits of HTTP/2 are negligible. Additionally, each app server will typically already be a separate connection, so again no gains there.
So you will get most of your performance benefit from just supporting HTTP/2 at the edge. This is a fairly common set up - similar to the way HTTPS is often terminated on the reverse proxy/load balancer rather than going all the way through.
However, there are potential benefits to supporting HTTP/2 all the way through. For example, it could allow server push all the way from the application. There are also potential benefits from reduced packet size on that last hop, due to the binary nature of HTTP/2 and header compression; though, like latency, bandwidth is typically less of an issue for internal connections, so the importance of this is arguable. Finally, some argue that a reverse proxy does less work connecting an HTTP/2 connection to another HTTP/2 connection than it does converting between HTTP/2 and HTTP/1.1, though I'm sceptical that's even noticeable since they are separate connections (unless the proxy is acting simply as a TCP pass-through). So, to me, the main reason for end-to-end HTTP/2 is to allow end-to-end server push, but even that is probably better handled with HTTP Link headers and 103 Early Hints, due to the complications of managing push across multiple connections. I'm also not aware of any HTTP proxy server that supports this (few enough support HTTP/2 to the backend, never mind chaining HTTP/2 connections like this), so you'd need a layer-4 load balancer forwarding TCP packets rather than chaining HTTP requests - which brings other complications.
For now, while servers are still adding support and server push usage is low (and best practice is still being worked out), I would recommend having HTTP/2 only at the edge. At the time of writing, nginx doesn't support HTTP/2 for proxy_pass connections (though Apache does), has no plans to add this, and makes an interesting point about whether a single HTTP/2 connection might introduce slowness (emphasis mine):
Is HTTP/2 proxy support planned for the near future?
Short answer:
No, there are no plans.
Long answer:
There is almost no sense to implement it, as the main HTTP/2 benefit
is that it allows multiplexing many requests within a single
connection, thus [almost] removing the limit on number of
simultaneous requests - and there is no such limit when talking to
your own backends. Moreover, things may even become worse when using
HTTP/2 to backends, due to single TCP connection being used instead
of multiple ones.
On the other hand, implementing HTTP/2 protocol and request
multiplexing within a single connection in the upstream module will
require major changes to the upstream module.
Due to the above, there are no plans to implement HTTP/2 support in
the upstream module, at least in the foreseeable future. If you
still think that talking to backends via HTTP/2 is something needed -
feel free to provide patches.
Finally, it should also be noted that, while browsers require HTTPS for HTTP/2 (h2), most servers don't, and so could support this final hop over plain HTTP (h2c). So there would be no need for end-to-end encryption if that is not present on the Node part (as it often isn't). Though, depending on where the backend server sits in relation to the front end server, using HTTPS even for this connection is perhaps something that should be considered if traffic will be travelling across an unsecured network (e.g. CDN to origin server across the internet).
EDIT AUGUST 2021
HTTP/1.1 being text-based rather than binary makes it vulnerable to various request smuggling attacks. At DEF CON 2021, PortSwigger demonstrated a number of real-life attacks, mostly related to issues when downgrading front end HTTP/2 requests to back end HTTP/1.1 requests. These could probably mostly be avoided by speaking HTTP/2 all the way through, but given the current level of support in front end servers and CDNs for speaking HTTP/2 to the backend, and in backends for accepting it, it seems it'll take a long time for this to become common. In the meantime, front end HTTP/2 servers ensuring these attacks aren't exploitable seems like the more realistic solution.
NGINX now supports HTTP/2 push for proxy_pass locations (via http2_push_preload) and it's awesome...
Here I am pushing favicon.ico, minified.css, minified.js, register.svg, purchase_litecoin.svg from my static subdomain too. It took me some time to realize I can push from a subdomain.
location / {
    http2_push_preload on;

    add_header Link "<//static.yourdomain.io/css/minified.css>; as=style; rel=preload";
    add_header Link "<//static.yourdomain.io/js/minified.js>; as=script; rel=preload";
    add_header Link "<//static.yourdomain.io/favicon.ico>; as=image; rel=preload";
    add_header Link "<//static.yourdomain.io/images/register.svg>; as=image; rel=preload";
    add_header Link "<//static.yourdomain.io/images/purchase_litecoin.svg>; as=image; rel=preload";

    proxy_hide_header X-Frame-Options;
    proxy_http_version 1.1;
    proxy_redirect off;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://app_service;
}
In case someone is looking for a solution when it is not convenient to make your services HTTP/2 compatible: here is a basic NGINX configuration you can use to expose an HTTP/1.1 service over HTTP/2.
server {
    listen [::]:443 ssl http2;
    listen 443 ssl http2;
    server_name localhost;

    # note: "ssl on;" is deprecated; the "ssl" parameter on the listen directives above is enough
    ssl_certificate /Users/xxx/ssl/myssl.crt;
    ssl_certificate_key /Users/xxx/ssl/myssl.key;

    location / {
        proxy_pass http://localhost:3001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
NGINX does not support HTTP/2 as a client. Since they're running on the same server, with effectively no latency and no bandwidth constraint, I don't think it would make a huge difference either way. I would make sure you are using keepalives between nginx and node.js.
https://www.nginx.com/blog/tuning-nginx/#keepalive
You are not losing performance in general, because nginx matches the request multiplexing the browser does over HTTP/2 by making multiple simultaneous requests to your node backend. (One of the major performance improvements of HTTP/2 is that it allows the browser to make multiple simultaneous requests over the same connection, whereas in HTTP/1.1 only one request at a time is possible per connection - and browsers limit the number of connections, too.)
Related
We are using nodejs + socket.io with the polling transport type, because we have to pass a token in the headers to authenticate the client, so I cannot avoid the polling transport.
We are using nginx in front of 4 socket application instances.
I am getting two problems because of this:
1. When the polling call finishes and upgrades to the websocket transport, I get a 400 Bad Request. As far as I can tell, this is because the second request lands on a different socket server, which rejects the websocket transport upgrade.
2. These connections keep getting re-triggered rapidly, even once the websocket connection is successful.
Problem #2 only occurs when we run multiple instances of the socket server; with a single server it works fine and the connection doesn't terminate.
When using NGINX as a load balancer implementing a reverse proxy in front of a multi-instance websocket application, you have to configure NGINX so that once a client connects to an instance, all subsequent requests from that client are proxied to the same instance, to avoid unwanted disconnections. In other words, you want sticky sessions.
This is well documented in the Socket.io official documentation.
http {
    server {
        listen 3000;
        server_name io.yourhost.com;

        location / {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $host;
            proxy_pass http://nodes;

            # enable WebSockets
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }

    upstream nodes {
        # enable sticky session with either "hash" (uses the complete IP address)
        hash $remote_addr consistent;
        # or "ip_hash" (uses the first three octets of the client IPv4 address, or the entire IPv6 address)
        # ip_hash;
        # or "sticky" (needs commercial subscription)
        # sticky cookie srv_id expires=1h domain=.example.com path=/;

        server app01:3000;
        server app02:3000;
        server app03:3000;
    }
}
The key line is hash $remote_addr consistent;, declared inside the upstream block.
Note that here there are 3 different socket instances deployed on hosts app01, app02, and app03 (all on port 3000). If you want to run all of your instances on the same host, you should run them on different ports (example: app01:3001, app02:3002, app03:3003).
Moreover, note that if you have multiple socket server instances with several clients connected, you want client1, connected to ServerA, to be able to "see" and communicate with client2, connected to ServerB. For that, ServerA and ServerB need to communicate, or at least share information. Socket.io can handle this for you with little effort, using a Redis instance and the redis-adapter module. Check this part of the socket.io documentation.
Final note: both links I shared point to different sections of the same socket.io docs page. I strongly suggest you read the whole page to get a complete overview of the architecture.
I'm getting "HTTP ERROR 502 Bad Gateway" when I click on a worker link in my standalone Spark UI. Looking at the master logs I can see a corresponding message...
HttpSenderOverHTTP.java:219 Generated headers (4096 bytes), chunk (-1 bytes), content (0 bytes) - HEADER_OVERFLOW/HttpGenerator#231f022d{s=START}
The network infrastructure in front of my Spark UI does indeed generate a header that is bigger than 4096 bytes, and the Spark reverse proxy is attempting to pass that to the worker UI. If I bypass that infrastructure the UI works as it should.
After digging into the Spark UI code I believe that the requestBufferSize init parameter of the Jetty ProxyServlet controls this.
Can this be increased at run-time via (say) a Java property? For example, something like...
SPARK_MASTER_OPTS=-Dorg.eclipse.jetty.proxy.ProxyServlet.requestBufferSize=8192 ...
I've tried the above without success -- I'm not familiar enough with Jetty or Servlets in general to know if that's even close to valid. Obviously I'm also looking into ways of reducing the header size but that involves systems that I have much less control over.
(Spark v3.0.2 / Jetty 9.4)
Here's the workaround that I was forced to use -- Putting a proxy in front of the Spark UI that strips the headers.
I used NGINX with this in the default.conf...
server {
    listen 8080;

    location / {
        proxy_pass http://my-spark-master:8080/;
        proxy_pass_request_headers off;
        proxy_redirect off;
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
I have been fighting this 502 issue for some time now, and indeed it seems to be caused by large headers from an upstream proxy. I solved it by removing headers that aren't required anyway. Review the headers in your browser's dev tools, then remove them with, for example:
proxy_set_header Accept-Encoding "";
Thanks for the great tip!
Paul
I am trying to scale my Socket.io Node.js server horizontally using Cloud Foundry (on IBM Cloud).
As of now, my manifest.yml for cf looks like this:
applications:
- name: chat-app-server
  memory: 512M
  instances: 2
  buildpacks:
  - nginx_buildpack
This way the deployment goes through, but of course the socket connections between client and server fail because the connection is not sticky.
The official Socket.io documentation gives an example for using NginX for using multiple nodes.
When using a custom nginx.conf file using the Socket.io template I am missing some information (highlighted with ???).
events { worker_connections 1024; }

http {
    server {
        listen {{port}};
        server_name ???;

        location / {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $host;
            proxy_pass http://nodes;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }

    upstream nodes {
        # enable sticky session based on IP
        ip_hash;
        server ???:???;
        server ???:???;
    }
}
I've tried to find out where cloud foundry runs the two instances specified in the manifest.yml file with no luck.
How do I get the required server addresses/ports from cloud foundry?
Is there a way to obtain this information dynamically from CF?
I am deploying my application using cf push.
I haven't used Socket.IO before, so I may be off base, but from a quick read of the docs, it seems like things should just work.
Two points from the docs:
a.) When using WebSockets, this is a non-issue. Cloud Foundry fully supports WebSockets. Hopefully, most of your clients can do that.
b.) When falling back to long polling, you need sticky sessions. Cloud Foundry supports sticky sessions out-of-the-box, so again this should just work. There is one caveat, though, regarding CF's support of sticky sessions: it expects the session cookie name to be JSESSIONID.
Again, I'm not super familiar with Socket.IO, but I suspect it's probably using a different session cookie name by default (most things outside of Java do). You just need to change the session cookie name to JSESSIONID and sticky sessions should work.
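For example, if your Node server happened to use express-session (an assumption on my part; adapt to whatever session middleware you actually use), the rename would be a one-line change via its name option:

```javascript
// Hedged sketch: renaming the session cookie so Gorouter's sticky
// sessions recognise it. express-session is an assumed dependency here.
// const session = require('express-session');

const sessionOptions = {
  name: 'JSESSIONID',       // Gorouter keys sticky sessions off this cookie name
  secret: 'change-me',      // placeholder secret, not from the original question
  resave: false,
  saveUninitialized: false,
};

// app.use(session(sessionOptions));
```

The middleware wiring is commented out above since it depends on your app; the point is only that the cookie name is configurable.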
TIP: you can check the session cookie name by looking at your cookies in your browser's dev tools.
Final note. You don't need Nginx here at all. Gorouter, which is Cloud Foundry's routing layer, will handle the sticky session support for you.
I have a Node.js/Express application running on an Ubuntu server. It sits behind an NGINX reverse proxy that passes traffic on port 80 (or 443 for ssl) to the application's port.
I've recently had an issue where for no identifiable reason, traffic trying to access / will eventually get a 504 error and timeout. As a test, I increased the timeout and am now getting a 502 error. I can access some other routes on my application, /login for example, with no problems.
When I restart my Express application, my app runs fine with no issues, usually for a few days until this happens again. Viewing the logs for my Express app, a good request looks something like:
GET / 200 15.786 ms - 1214
Whereas requests that aren't responding properly look like this:
GET / - - ms - -
This application has been running properly for about 13 months with no issues, this issue has arisen with no prompting. I haven't pushed any updates within the time that this has occurred.
Here is my NGINX config (modified a bit for security, e.g. example.com)
upstream site_upstream {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    listen 443 ssl;
    server_name example.com;

    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    location / {
        proxy_pass http://site_upstream;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_redirect http://rpa_upstream https://example.com;
    }
}
I am unsure whether this is an issue with my NGINX config or with my application itself, as neither configuration has changed.
It sounds like a memory leak in either nginx or your Node application. If it starts to work again after restarting your Node application, without restarting nginx, then it's likely a problem with your Node app.
Try also accessing your app directly, without the proxy, to see what problems you get in that case. You can sometimes get more detailed info that way in your browser's developer tools, or with command-line tools like curl, or benchmarks like Apache ab. Running heavy benchmarks with ab can help you reproduce the problem quickly instead of waiting days for it to reappear.
Of course it's hard to say what's exactly the problem when you don't show any code.
If it was working fine before, and if you didn't upgrade anything (your app, any Node modules, or Node itself) during that time, then maybe your traffic increased slightly and now you start seeing the problems that were not manifesting before. Or maybe your system now uses more RAM for other tasks and the memory leak starts to be a problem quicker than before.
You can start logging the data returned by process.memoryUsage() at regular intervals and see if anything looks problematic.
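A minimal sketch of that logging (the interval and the megabyte formatting are arbitrary choices of mine):

```javascript
// Log a memory snapshot at a fixed interval: a leak shows up as a
// steadily growing heapUsed/rss over hours or days.
function memorySnapshot() {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  const mb = (bytes) => (bytes / 1024 / 1024).toFixed(1) + ' MB';
  return {
    rss: mb(rss),             // total resident set size of the process
    heapTotal: mb(heapTotal), // V8 heap allocated
    heapUsed: mb(heapUsed),   // V8 heap actually in use
    external: mb(external),   // memory of C++ objects bound to JS (e.g. Buffers)
  };
}

// e.g. once a minute:
// setInterval(() => console.log(new Date().toISOString(), memorySnapshot()), 60 * 1000);
console.log(memorySnapshot());
```

Pipe the output to a file and compare snapshots taken just after a restart against ones taken shortly before the app becomes unresponsive.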
Also monitor your Node processes with ps, top, htop or other commands, or check the memory usage in /proc/PID/status etc.
You can also monitor /proc/meminfo at regular intervals and see if the total memory used on your system correlates with your application becoming unresponsive.
Another thing that may be causing problems is, for example, connections to your database responding slowly or not at all, if you are not handling errors and timeouts inside your application. Adding more extensive logging (a line on entering every route handler, a line before every I/O operation starts, and a line after every I/O operation either succeeds, fails, or times out) should give you some more insight into it.
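As a sketch of that kind of I/O logging (the helper and its name are mine, not from any library), you can wrap each I/O promise so it always logs start, success, failure, or timeout:

```javascript
// Wrap an I/O promise so it logs when it starts and whether it
// succeeded, failed, or timed out - a hanging DB call then shows up
// in the logs instead of silently stalling the route handler.
function logged(label, promise, timeoutMs) {
  const start = Date.now();
  console.log(`${label}: started`);
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label}: timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  return Promise.race([promise, timeout])
    .then((result) => {
      console.log(`${label}: ok in ${Date.now() - start} ms`);
      return result;
    })
    .catch((err) => {
      console.log(`${label}: failed (${err.message})`);
      throw err; // re-throw so the caller still handles the error
    })
    .finally(() => clearTimeout(timer));
}

// hypothetical usage inside a route handler:
// const user = await logged('db.findUser', db.findUser(id), 5000);
```

If requests to / hang while /login works, logs like these narrow the stall down to a specific I/O operation.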
I want to use nginx as a load balancer in front of several node.js application nodes.
round-robin and ip_hash methods are unbelievably easy to implement but in my use case, they're not the best fit.
I need nginx to route clients to backend nodes according to their session IDs, which are assigned by the first node they land on.
While googling, I came across the "hash" method, but I couldn't find many resources about it.
Here is what I tried:
my_site.conf:
http {
    upstream my_servers {
        hash $remote_addr$http_session_id consistent;
        server 127.0.0.1:3000;
        server 127.0.0.1:3001;
        server 127.0.0.1:3002;
    }

    server {
        listen 1234;
        server_name example.com;

        location / {
            proxy_pass http://my_servers;
            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
            proxy_redirect off;
            proxy_buffering off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
And at the application, I return Session-ID header with the session id.
res.setHeader('Session-ID', req.sessionID);
I'm missing something, but what?
$http_session_id refers to the header sent by the client (browser), not your application's response. What you need is http://nginx.org/r/sticky, but it's available in the commercial subscription only.
There is a third-party module that does the same as the commercial one, but you'll have to recompile nginx.
It doesn't work out of the box because nginx is a (good) webserver, but not a real load-balancer.
Prefer haproxy for load-balancing.
Furthermore, what you need is not hashing. You need persistence on the session-id header, and you need to be able to persist on the source IP until you get that header.
This is pretty straightforward with HAProxy. HAProxy can also be used to check whether the session id was generated by the server or forged by the client.
backend myapp
    # create a stick table in memory
    # (note: use peers to synchronize the content of the table)
    stick-table type string len 32 expire 1h size 1m

    # match client http-session-id in the table to do the persistence
    stick match hdr(http-session-id)
    # if not found, then use the source IP address
    stick on src,lower    # dirty trick to turn the IP address into a string
    # learn the http-session-id that has been generated by the server
    stick store-response hdr(http-session-id)

    # add a header if the http-session-id seems to be forged (not found in the table)
    # (note: only available in 1.6-dev)
    acl has-session-id req.hdr(http-session-id) -m found
    acl unknown-session-id req.hdr(http-session-id),in_table(myapp)
    http-request set-header X-warning unknown\ session-id if has-session-id unknown-session-id
Then you are fully secured :)
Baptiste