Identify an user IP behind a Proxy with Express.js

Identify an user IP behind a Proxy with Express.js - node.js

I've a site that sometimes receives many, unusual requests and want to block the requester. The session id is different for all requests but x-forwarded-for header is always the same.
Now, the x-forwarded-for IP I traced it with a geo-locator and it shows it is from a known ISP (Cox), so I can't just block all incoming requests from that ISP. The remoteAddress shows the address of MY load balancer.
Any idea what I can do to block this person without blocking all the ISP customers?

Related

AppEngine Cron Verification IP Address

I have an AppEngine Node.js application running in a standard environment, and I'm having some trouble with cron verification. The docs say that you can verify that the IP address comes from 0.1.0.2. In the request logs I can see that the request IP is 0.1.0.2; however, in my fastify request object, request.ip is 127.0.0.1. Anyone know what could be happening here?
I was thinking that maybe there's some sidecar like nginx that is accepting the requests, but in that case, I would expect to x-forwarded-for to be defined, but it's not.

As per documentation, X-Forwarded-For value is the list of IP addresses through which the client request has been routed. You can see that the first IP (0.1.0.1) as expected, is the IP of the client that created the request. The subsequent IPs are the proxy servers that handled the request before it reached the application server (e.g.X-Forwarded-For: clientIp, proxy1Ip, proxy2Ip). Therefore, in this case the VM is seeing the remote IP from a Google Cloud internal Load Balancer address, but we do use X-Forwarded-For for things like this.
One quick solution is to only check for the X-Appengine-Cron header rather than also checking the IP address. The X-Appengine-Cron header is set internally by Google App Engine. If your request handler finds this header it can trust that the request is a cron request.
For more information, you can refer to these Stackoverflow Link1 and Link2 which may help you.

Node server :remote-addr displayed local IP (192.X.X.X) when accessed from python-requests

I have an express server that uses nginx and monitors the X-Forwarded-For header.
The node server has the following lines of code:
app.set('trust proxy', '127.0.0.1');
app.use(morgan(':remote-addr')); // and other info too
Normally, when users make requests, independent of the client (mobile app, scripts, etc.) the IP displayed is the remote one.
Recently, I have observed that someone tried to hack into my server using python-requests/2.22.0 and the remote IP was not his IP address, it was 192.X.X.X. I tried to reproduce this myself by accessing the server from itself, but the remote address (global server IP address) was displayed.
Can you better explain to me how this works and if this is something I should be worried about?

They never accessed your server through Nginx; check the logs. They sent a local connection header directly to the IP:port hosting your server. This could be damaging if your security policies are not set correctly, it could leak site IPs and potentially allow an attacker to have a free path into your server without response back and no limits.
As we get scarier, the user could initiate a BGP hijack and take over the relay points sending users to your server end-points; this is one to YouTube or google more about.
As we finish off, know most hosting companies allow for private networking and do give somewhat of a firewall to use but most users assume this is secure when it actually is not! These private networks connect you to the hundreds->thousands servers in a rack or zone. So if the attacker bought a server next to yours (which would likely be a bot) they could scan the private networks for some fun-time which is against TOS but the hosts don't check this good enough or secure it.
In your case, it sounds like the server is responding to the entire internet and bots are having a go at it; Try setting your Node.js server up as localhost only, at port 443 or whatever and host that through nginx. That way anytime someone inserts your IP or domain name it is forwarded by nginx to the local resource. Someone couldn't just use the IP + Node.js port and play games. If you do this, a user may still send the header with fake IP but it won't result to IP Leak, or anything bad unless that IP had super powers on your site, which no filter on your site should say 192.168.x.x gets ADMIN mode. You can feel confident.

How does Host header help on a physical host hosting multiple Servers?

I have 1 single machine with an IP 1.2.3.4. This machine has 2 web servers and an ftp server:
Web Server 1 listens to port 82; the domain for it: ws1.example.com
Web Server 2 listens to port 83; the domain for it: ws2.example.com
FTP Server listens to port 21; the domain for it: ftp.example.com
This is what the DNS mapping looks like:
ws1.example.com CNAME example.com
ws2.example.com CNAME example.com
ftp.example.com CNAME example.com
example.com A 1.2.3.4
Case 1: I make a request at the browser URL ws1.example.com:82 and the DNS redirects me to example.com but with the Host header: ws1.example.com.
Case 2: I make a request at the browser URL ws2.example.com:83 and the DNS redirects me to example.com but with the Host header: ws2.example.com.
In both the cases:
the request ultimately reaches the same physical machine
when the request arrives:
In Case 1, the request arrives at this machine and the request is attended to by the application that is listening on port 82 i.e. Web Server 1.
In Case 2, the request arrives at this machine and the request is attended to by the application that is listening on port 83 i.e. Web Server 2.
The Host header, as I understand, is used to inform the receiving host to identify which server (from the multiple servers that this IP has been hosting) is this request meant for and accordingly directs the request to the appropriate application.
My question is:
In this example, what is the purpose of the Host header as the same physical machine with the same IP has multiple applications listening at their corresponding ports. Once the request reaches this machine, the appropriate port will anyway pick up and the other applications will ignore the request as the port does not match the request. So, what purpose is the Host header serving here when apprpriate ports are anyway doing their job, right and well?
Can I infer that
CNAMES
Multiple Web Servers behind a single IP
subsequent resolution of a particular user request to the appropriate Web Server with the Host header
make sense only when you are using something like a Reverse Proxy e.g. 1 machine interfaces with the client and redirects user requests to the appropriate web server on separate machines all listening on the same port e.g. 80, each in the network behind the reverse proxy in which case you have ws1.example.com and ws2.exmple.com both be redirected to the reverse proxy example.com and this reverse proxy now forwards it to the appropriate host based on the Host header?

No DNS redirections
First an important terminology fix:
There are no "redirects" in the DNS. In your case, the DNS is just use to map a name to an IP. Sometimes, because of CNAME, a name is mapped to another name which is then mapped to an IP. It does not matter if there are intermediate steps like that, at the end a name maps to an IP (or there is a DNS resolution failure)
This also means that if the URL has a specific port, then that is not changed, the final IP will be queried over the port mentioned in the URL.
Redirections are an HTTP level feature: when querying a webserver for https://www.mygreatsite.example/foo it will reply with an HTTP return code of 301, 302, 303, 307 or 308 and giving you (the HTTP client, aka the browser) the new URL to go to.
HTTP virtual hosting
In the good old days, IP addresses were plenty. If you were hosting both www.site1.example and www.site2.example on the same physical box you could attach one different IP address to each.
Hence, in that specific case, in a way, the HTTP host header is useless, the mere fact of connecting either to 192.0.2.37 or 192.0.2.42 already lets you know which site you want.
In fact in HTTP/0.9 there was no host header, as there were no headers at all.
But then, with mass virtual hosting coming into play, and IPv4 addresses becoming scarce, you could not anymore attach one single IP address per site, since it was also a waste.
So you had, through the DNS, either directly or indirectly (CNAME records), both websites resolving to the same IP.
Hence when the HTTP client connected to the server, the server by default has no way to know which website do you want. That is why the HTTP host header filled by the client lets the server know which website you want to access, irrespective to its IP address, that was resolved earlier through the DNS.
By default HTTP uses port 80, so it is often not visible in the URLs.
Of course if you forced your clients to use http://www.site1.example:4569 on one side and http://www.anothersite2.com:9873 on another side, then you are right the host header would not be really needed.
Except that the plan falls down for many reasons:
Port numbers are not an infinite space either and many of them are already used typically for other things; so even if you extend this scheme at one point you could not attach new websites to the same IP
But more important than the previous technical point, for humans this will be a nightmare and many people will use forget the port number and then not coming to the appropriate website.
Hence typically it is not done like that, if you want to expose some given service over HTTP but in a non default port you typically install a reverse proxy in front of it. Or you do an HTTP redirection from http://www.coolpublicname.example/ to http://www.complicatedinternalname.example:9713, but then the client sees this naked truth.
HTTPS virtual hosting
In passing note that HTTPS added a level of complexity because the HTTPS webserver needs to send its certificate to the client, but since each website can have a different certificate it needs to know which website the client wants to use, which it could learn through the host HTTP header but then comes after the TLS handshake is finished, so in the early stage of the server sending a certificate this is not available yet.
So at the earliest times of HTTPS we were forced again to do IP-based virtual hosting and not name-based virtual hosting like it was possible in pure HTTP thanks to the host header.
The solution was found with a TLS extension, the Server Name Indication (SNI), something that the client sends early to the server and gives the website name, so that the server can send the appropriate certificate, and hence we are back in business in the name-based case where you can theoretically have an infinite number of names resolving to the same IP for them to be served by one given webserver.

Is it worth it to use request-ip package for an express.js app instead of req.ip

I need to do a basic flooding control, nothing very sophisticated. I want to get source IP and delay the answer if they are requesting too many times in a short period.
I saw that there is a req.ip field but also a package: https://www.npmjs.com/package/request-ip
What's the difference?

I suggest you to use the request-ip module, because it looks for specific headers in the request and falls back to some defaults if they do not exist.
The following is the order it uses to determine the user ip from the request.
X-Client-IP
X-Forwarded-For header may return multiple IP addresses in the format: "client IP, proxy 1 IP, proxy 2 IP", so we take the the first one.
CF-Connecting-IP (Cloudflare)
Fastly-Client-IP (Fastly CDN and Firebase hosting header when forwared to a cloud function)
True-Client-IP (Akamai and Cloudflare)
X-Real-IP (nginx proxy/FastCGI)
X-Cluster-Client-IP (Rackspace LB, Riverbed Stingray)
X-Forwarded, Forwarded-For and Forwarded (Variations of #2)
appengine-user-ip (Google App Engine)
req.connection.remoteAddress
req.socket.remoteAddress
req.connection.socket.remoteAddress
req.info.remoteAddress
Cf-Pseudo-IPv4 (Cloudflare fallback)
request.raw (Fastify)
It permits to get the real client IP regardless of your web server configuration or proxy settings, or even the technology of the connection (HTTP, WebSocket...)
You can also take a look to the express req.ips (yes, ips, not req.ip) property to get more informations about the request:
req.ips (http://expressjs.com/en/api.html)
When the trust proxy setting does not evaluate to false, this property contains an array of IP addresses specified in the X-Forwarded-For request header. Otherwise, it contains an empty array. This header can be set by the client or by the proxy.
For example, if X-Forwarded-For is client, proxy1, proxy2, req.ips would be ["client", "proxy1", "proxy2"], where proxy2 is the furthest downstream.

If IP addresses can be spoofed so

If IP addresses can be spoofed by creating false or manipulated http headers, and therefore it should not be relied upon in validating the incoming request in our PHP/ASP pages, how come servers take that and rely on it? For example, denying IPs or allowing them are all based on IP.
do servers get the IP information some other ( and more reliable ) way than say PHP/ASP gets it thru server variables?

Servers are typically willing to rely upon the IP address of a connection for low-risk traffic because setting up a TCP session requires a three-way handshake. This handshake can only succeed if the IP address in the packets is routable and some machine is prepared to handle the connection. A rogue router could fake IP addresses but in general, it is more difficult to fake connections the further away from either endpoint the router is, so most people are prepared to rely on it for low-risk uses. (DNS spoofing is far more likely way to misrepresent a connection endpoint, for example.)
Higher-risk users must use something more like TLS, IPsec, or CIPSO (rare) to validate the connection end-point, or build user authentication onto the lower layers to authenticate specific connections (OpenSSH).
But the actual contents of the TCP session can be anything and everything -- and a server should not rely upon the contents of the TCP session (such as HTTP headers) to faithfully report IP addresses or anything else vital.

IP addresses cannot be spoofed. The address is needed for the server to send a reply.
PHP gets the IP address for its $_SERVER global from the server (hence the variable name!), which determines the address from lower in the protocol stack.
EDIT:
sarnold makes a good point that, in principle, one could corrupt routing tables to misdirect traffic. (Indeed, I believe there was an incident of this in a Tier 1 router in Asia a couple years ago.) So I should clarify that my comment that "IP addresses cannot be spoofed" was narrowly tailored to point out that the server variables will always faithfully reflect the destination IP. What goes on beyond the the server's borders is another matter altogether.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string