I have 1 single machine with an IP 1.2.3.4. This machine has 2 web servers and an ftp server:
Web Server 1 listens to port 82; the domain for it: ws1.example.com
Web Server 2 listens to port 83; the domain for it: ws2.example.com
FTP Server listens to port 21; the domain for it: ftp.example.com
This is what the DNS mapping looks like:
ws1.example.com CNAME example.com
ws2.example.com CNAME example.com
ftp.example.com CNAME example.com
example.com A 1.2.3.4
Case 1: I make a request at the browser URL ws1.example.com:82 and the DNS redirects me to example.com but with the Host header: ws1.example.com.
Case 2: I make a request at the browser URL ws2.example.com:83 and the DNS redirects me to example.com but with the Host header: ws2.example.com.
In both the cases:
the request ultimately reaches the same physical machine
when the request arrives:
In Case 1, the request arrives at this machine and the request is attended to by the application that is listening on port 82 i.e. Web Server 1.
In Case 2, the request arrives at this machine and the request is attended to by the application that is listening on port 83 i.e. Web Server 2.
The Host header, as I understand it, tells the receiving host which server (of the several that this IP is hosting) a request is meant for, so that the host can direct the request to the appropriate application.
My question is:
In this example, what is the purpose of the Host header, given that the same physical machine with the same IP has multiple applications listening on their own ports? Once the request reaches this machine, the application on the matching port will pick it up anyway, and the other applications will never see it because the port does not match. So what purpose does the Host header serve here, when the ports are already doing the job?
Can I infer that
CNAMES
Multiple Web Servers behind a single IP
subsequent resolution of a particular user request to the appropriate Web Server with the Host header
make sense only when you are using something like a reverse proxy? That is, one machine interfaces with the client and forwards user requests to the appropriate web server, with the web servers running on separate machines in the network behind the reverse proxy and all listening on the same port (e.g. 80). In that case ws1.example.com and ws2.example.com both resolve to the reverse proxy at example.com, and the reverse proxy forwards each request to the appropriate host based on the Host header.
No DNS redirections
First an important terminology fix:
There are no "redirects" in the DNS. In your case, the DNS is just used to map a name to an IP address. Sometimes, because of a CNAME record, a name is mapped to another name, which is then mapped to an IP address. It does not matter whether there are intermediate steps like that: in the end a name maps to an IP address (or the DNS resolution fails).
This also means that if the URL specifies a port, that port is not changed: the client connects to the final IP address on the port given in the URL.
Redirections are an HTTP-level feature: when a client queries a web server for https://www.mygreatsite.example/foo, the server can reply with an HTTP status code of 301, 302, 303, 307 or 308 and give the HTTP client (i.e. the browser) the new URL to go to.
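To make the name-to-IP mapping concrete, here is a minimal sketch that follows the CNAME records from the question until it reaches an A record. The RECORDS table is copied from the question; the helper names resolve and connection_target are made up for illustration, and a real resolver does far more than this (caching, TTLs, other record types).

```python
# Records copied from the question's DNS mapping.
RECORDS = {
    "ws1.example.com": ("CNAME", "example.com"),
    "ws2.example.com": ("CNAME", "example.com"),
    "ftp.example.com": ("CNAME", "example.com"),
    "example.com": ("A", "1.2.3.4"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until an A record is found; return its IP."""
    for _ in range(max_hops):
        if name not in records:
            raise LookupError(f"no record for {name}")
        rtype, value = records[name]
        if rtype == "A":
            return value
        name = value  # CNAME: keep resolving the target name
    raise LookupError("CNAME chain too long")


def connection_target(host, port):
    """DNS only supplies the address; the port from the URL is kept as-is."""
    return (resolve(host), port)
```

Both ws1.example.com:82 and ws2.example.com:83 end up at the same address, and only the port (chosen by the client, not by the DNS) differs.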
HTTP virtual hosting
In the good old days, IP addresses were plentiful. If you were hosting both www.site1.example and www.site2.example on the same physical box, you could attach a different IP address to each.
Hence, in that specific case, the HTTP Host header is in a way useless: the mere fact of connecting to either 192.0.2.37 or 192.0.2.42 already tells the server which site you want.
In fact in HTTP/0.9 there was no host header, as there were no headers at all.
But then, with mass virtual hosting coming into play and IPv4 addresses becoming scarce, you could no longer dedicate one IP address to each site, since that was also a waste.
So you had, through the DNS, either directly or indirectly (CNAME records), both websites resolving to the same IP.
Hence, when the HTTP client connects to the server, the server by default has no way of knowing which website the client wants. That is why the HTTP Host header, filled in by the client, tells the server which website is being requested, irrespective of the IP address that was resolved earlier through the DNS.
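As an illustration of name-based virtual hosting, the sketch below picks a site purely from the Host header of a raw request. The SITES table and the function names are hypothetical, and the port stripping assumes a hostname or IPv4 address (not a bracketed IPv6 literal).

```python
# Hypothetical sites sharing one IP address and one port.
SITES = {
    "www.site1.example": "content of site 1",
    "www.site2.example": "content of site 2",
}


def host_from_request(raw_request: bytes) -> str:
    """Return the lowercased Host header value without an optional :port."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            return host.rsplit(":", 1)[0] if ":" in host else host
    raise ValueError("request has no Host header")


def serve(raw_request: bytes) -> str:
    """Choose the site by name; the IP and port were the same for both."""
    return SITES.get(host_from_request(raw_request), "unknown site")
```

Two requests arriving on the same IP and port are told apart only by the header, e.g. serve(b"GET / HTTP/1.1\r\nHost: www.site1.example\r\n\r\n") versus the same request with Host: www.site2.example.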
By default HTTP uses port 80, so it is often not visible in the URLs.
Of course, if you forced your clients to use http://www.site1.example:4569 on one side and http://www.anothersite2.com:9873 on the other, then you are right: the Host header would not really be needed.
Except that the plan falls down for many reasons:
Port numbers are not an infinite space either, and many of them are typically already used for other things; so even if you extended this scheme, at some point you could not attach new websites to the same IP.
But more important than the previous technical point: for humans this would be a nightmare. Many people would forget the port number and then not reach the appropriate website.
Hence it is typically not done like that. If you want to expose a given service over HTTP on a non-default port, you typically install a reverse proxy in front of it. Or you do an HTTP redirection from http://www.coolpublicname.example/ to http://www.complicatedinternalname.example:9713, but then the client sees the naked truth.
HTTPS virtual hosting
In passing, note that HTTPS added a level of complexity. The HTTPS web server needs to send its certificate to the client, and since each website can have a different certificate, the server needs to know which website the client wants. It could learn that from the HTTP Host header, but that header only arrives after the TLS handshake has finished, so it is not yet available at the early stage when the server sends its certificate.
So in the earliest days of HTTPS we were forced back to IP-based virtual hosting instead of the name-based virtual hosting that the Host header made possible in plain HTTP.
The solution was found in a TLS extension, Server Name Indication (SNI): the client sends the website name to the server early in the handshake, so that the server can send the appropriate certificate. Hence we are back in business in the name-based case, where you can theoretically have an unlimited number of names resolving to the same IP and served by one given web server.
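For illustration, Python's standard ssl module exposes SNI on the server side through SSLContext.sni_callback. The sketch below is only an outline: the site names are placeholders, no certificates are loaded (a real setup would call load_cert_chain on each context, with file paths of your own), and nothing is actually listening.

```python
import ssl

DEFAULT_SITE = "www.site1.example"  # placeholder names, not from a real setup

# One TLS context per website; each would normally get its own certificate
# via context.load_cert_chain(certfile, keyfile).
CONTEXTS = {
    name: ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    for name in ("www.site1.example", "www.site2.example")
}


def pick_certificate(ssl_socket, server_name, initial_context):
    """SNI callback: switch to the context that matches the requested name.

    server_name is None when the client sent no SNI extension, in which
    case the default site's context is used.
    """
    ssl_socket.context = CONTEXTS.get(server_name, CONTEXTS[DEFAULT_SITE])
    return None  # None tells the ssl module to continue the handshake


# The context the listening socket would be wrapped with.
listener = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
listener.sni_callback = pick_certificate
```

The callback runs during the handshake, before any HTTP data (and therefore before any Host header) has been received, which is exactly the gap SNI fills.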
Related
I have an express server that uses nginx and monitors the X-Forwarded-For header.
The node server has the following lines of code:
app.set('trust proxy', '127.0.0.1');
app.use(morgan(':remote-addr')); // and other info too
Normally, when users make requests, independent of the client (mobile app, scripts, etc.) the IP displayed is the remote one.
Recently, I have observed that someone tried to hack into my server using python-requests/2.22.0 and the remote IP was not his IP address, it was 192.X.X.X. I tried to reproduce this myself by accessing the server from itself, but the remote address (global server IP address) was displayed.
Can you better explain to me how this works and if this is something I should be worried about?
They never accessed your server through Nginx; check the logs. They sent a header claiming a local address directly to the IP:port hosting your server. This could be damaging if your security policies are not set correctly: it could leak site IPs and potentially give an attacker an unrestricted path into your server, with no response going back through the proxy and no limits applied.
To get scarier, an attacker could initiate a BGP hijack and take over the relay points that send users to your server endpoints; this is a topic worth searching YouTube or Google for.
To finish off, know that most hosting companies offer private networking and provide some sort of firewall, and most users assume this is secure when it actually is not! These private networks connect you to the hundreds or thousands of servers in a rack or zone. So if an attacker bought a server next to yours (which would likely be a bot), they could scan the private network for fun. That is against the TOS, but the hosts don't check for it or secure against it well enough.
In your case, it sounds like the server is responding to the entire internet and bots are having a go at it. Try setting your Node.js server up to listen on localhost only, on port 443 or whatever, and expose it through nginx. That way, any time someone enters your IP or domain name, nginx forwards the request to the local resource, and nobody can just use the IP plus the Node.js port and play games. If you do this, a user may still send the header with a fake IP, but it won't result in an IP leak or anything bad unless that IP has super powers on your site, and no filter on your site should say that 192.168.x.x gets ADMIN mode. You can feel confident.
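To see why the way a request reaches the application matters, here is a simplified model of how a "trust proxy" style setting is commonly evaluated. It is not Express's actual implementation, and the function name and addresses are made up for illustration.

```python
def client_ip(peer_addr, x_forwarded_for, trusted):
    """Return the address to log as the client.

    peer_addr        address of the TCP peer that connected to the app
    x_forwarded_for  raw header value ("" if the header is absent)
    trusted          set of proxy addresses allowed to vouch for clients
    """
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    candidate = peer_addr
    # Walk the header from right to left for as long as the current hop
    # is a proxy we trust; stop at the first untrusted address.
    while candidate in trusted and hops:
        candidate = hops.pop()
    return candidate


TRUSTED = {"127.0.0.1"}

# Through a proxy that appends the real peer: the spoofed entry is ignored.
via_proxy = client_ip("127.0.0.1", "192.168.0.1, 203.0.113.9", TRUSTED)

# Direct connection to the app port: the header is not trusted at all.
direct = client_ip("198.51.100.7", "192.168.0.1", TRUSTED)

# Proxy that passes the client's header through unchanged: spoof succeeds.
passed_through = client_ip("127.0.0.1", "192.168.0.1", TRUSTED)
```

Under this model a forged address only shows up in the logs when a trusted proxy forwards the client-supplied header without appending or overwriting it, so the proxy's header configuration is worth checking alongside the logs.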
In node.js how do I create a server accessible with a name not a port?
instead of:
https://example.com:port
this kind of thing:
https://example.com/name/
A server (of any kind) is only named by the domain and port in the URL - it is not named by the path at all. The browser parses the URL, takes the domain and port, looks up that domain in DNS to get the IP address, then makes a TCP connection to that specific IP address and port. So, in your example, that would be:
https://example.com:port
or
https://example.com
where the latter just uses the default port, which is 443 for https (80 is the default for plain http). Only those portions of the URL specify the server that the browser will connect to. The path is then sent to that server and the server can then decide what it wants to do with that path when it receives the request.
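The split between "where to connect" and "what to ask for" can be sketched with Python's standard urllib.parse; the helper name connection_parts is made up for illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_parts(url):
    """Return (host, port, path): host and port pick the server, path is sent to it."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.path or "/"
```

Only the first two values influence which server the browser connects to; the third is just data carried inside the request.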
That said, there are server-side tools you can use that will handle a request at the above server, look at the path and then forward that request to a different server/port. This is often called a proxy server. So, for example, you can run nginx (a pre-built, configurable proxy) that will let you configure that you want a request to https://example.com/name/ to go to some other host (which you can configure as some other IP address and port).
The browser will connect to example.com (which is your proxy) and send the http request for /name. The proxy will receive that request, look at the path, see that it is configured to forward that request to a different host, then connect to that other host, send the request to it, get the response back, then return the response back to the browser. The browser will not necessarily know that this "forwarding" is going on behind-the-scenes. It makes a request and gets an answer.
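The decision the proxy makes can be sketched as a prefix lookup. The ROUTES table below is hypothetical (the upstream addresses and ports are invented), and a real proxy such as nginx expresses this in its own configuration syntax rather than in code.

```python
# Hypothetical routing table, most specific prefix first.
ROUTES = [
    ("/name/", ("127.0.0.1", 3001)),  # the app that should answer /name/...
    ("/", ("127.0.0.1", 3000)),       # everything else
]


def pick_upstream(path):
    """Return the (host, port) the proxy would forward this path to."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    raise LookupError(f"no route for {path}")
```

The browser still only ever talks to example.com on its one port; the second connection, to whatever pick_upstream returns, is made by the proxy.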
I am trying to get my head around Windows, Networks and Domains.
I currently have a server - svr. This is on my domain companyname.co.uk
I can connect to server and ping both svr and svr.companyname.co.uk.
On this server I have a number of applications with web access; TeamCity, Octopus etc. We currently connect to them by browsing to svr:xxxx where xxxx is the port of the web app host (http://svr:9090/ for TC)
I want to create friendly aliases - for example teamcity.companyname.co.uk would point at svr:9090, octopus.companyname.co.uk would point to svr:8090.
However, not being experienced in this area I can't seem to find relevant documents or sites that fully explain what I am looking for.
First, to make one thing clear: when you visit a web page like http://example.com, your web browser is actually making a request to example.com:80. This is done transparently because port 80 is the standard port for the HTTP protocol. As you know, you can request a non-standard port by appending it to the domain name in the URL: http://example.com:888/.
Unfortunately, you cannot have a domain name "alias" that somehow includes a non-standard port - your browser will always use the default port for the scheme (80 for HTTP) if you don't specify a port.
One solution would be to use a proxy - nginx, apache, lighttpd, and others can all do this.
The idea is that you set up a proxy server that is listening on port 80 on your host. It waits for connections, then forwards those connections to a different server (on the same host, or on a different one) based on some rule. So, for example, you might have rules that look something like this:
IF host = teamcity.companyname.co.uk THEN forward to teamcity:9090
IF host = octopus.companyname.co.uk THEN forward to octopus:8090
The syntax for these rules varies widely between different proxy configurations, so this is just an example.
Note that this is not a redirect - the user's browser connects to teamcity.companyname.co.uk for all requests. It's the proxy that sends the request on to a different service and forwards any responses back to the client "behind the scenes".
These proxy configurations can get quite complex. For example, what if your teamcity application serves a page with a link on it that points to http://teamcity:9090/path/to/page? The user's browser is going to fail if they click on that link. Fortunately, proxies can be configured to rewrite URLs like this on the fly. You'll need to do some research to tailor this solution to your situation.
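As a rough illustration of both ideas (forwarding by host name and rewriting internal links), here is a small sketch. The rules are the two example rules from above; the function names are invented, and real proxies do this through configuration directives rather than hand-written code.

```python
# The two example rules: public host name -> internal "host:port".
FORWARD_RULES = {
    "teamcity.companyname.co.uk": "teamcity:9090",
    "octopus.companyname.co.uk": "octopus:8090",
}


def upstream_for(host_header):
    """Return the internal target for a Host header, or None if unknown."""
    return FORWARD_RULES.get(host_header.lower())


def rewrite_links(body, public_host):
    """Replace links to the internal address with the public host name."""
    internal = FORWARD_RULES[public_host]
    return body.replace(f"http://{internal}/", f"http://{public_host}/")
```

With this, a page from the TeamCity backend containing http://teamcity:9090/path/to/page would reach the browser as http://teamcity.companyname.co.uk/path/to/page, which the user can actually follow.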
I have tried to send a DNS packet to get the IP of a website.
In some cases, like google, the IP was right, and when I typed it in the URL bar it took me to google.
But in other cases (for example stackoverflow.com) it gave me an IP that didn't lead to the website.
To be sure that my packet is right, I tried nslookup on the command line, and the result was the same.
So I can't find the right IP address of a website.
This is the message that appears when I try to open stackoverflow:
Fastly error: unknown domain: 151.101.65.69.
Please check that this domain has been added to a service.
Generally speaking, you cannot open a website just by entering its IP address in your browser's address bar. Web servers (and possibly many other network components between you and the web server) often host more than one website on that IP address, so they rely on the exact domain name typed in the address bar to serve the right content.
I think it's caused by a restriction on your internet connection. Try contacting your ISP (your internet provider) about this problem. They will probably know more about its cause.
Short answer: you need a host header.
Long answer: Since HTTP/1.1, introduced in 1997 (and then updated in 1999 and in 2014), a request needs a Host header. That allows the web server to route the request to the corresponding server configuration, a virtual host in Apache speak. Some servers do not have this configured and serve requests for any host from the same web server configuration.
HTTP/1.1 also allowed multi-tenant proxies, such as Fastly, to exist on the Internet. Fastly is a CDN - content delivery network - that caches a website's content closer to users and delivers it locally (faster than from a cloud or a colo, thus the name).
When you're not specifying the domain for the request, it looks like your client (or a library) is using the IP address as the Host header. That's why the response from Fastly talks about a domain: unknown domain: 151.101.65.69.
While Fastly does support pinning a service to a dedicated IP address, which would have made your request work, it doesn't look like stackoverflow is using that feature, as they might not need it.
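To show where the "unknown domain" comes from, the sketch below builds the request text a simple client might send. The helper build_get_request and its host_override parameter are invented for illustration; nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url, host_override=None):
    """Build a minimal HTTP/1.1 GET request for the given URL as bytes.

    By default the Host header is whatever appeared in the URL, which is
    an IP address when the URL was typed with an IP address.
    """
    parts = urlsplit(url)
    host = host_override or parts.netloc
    path = parts.path or "/"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return request.encode("ascii")


by_ip = build_get_request("http://151.101.65.69/")
by_name = build_get_request("http://151.101.65.69/", host_override="stackoverflow.com")
```

The first request carries Host: 151.101.65.69, which a multi-tenant server cannot map to any site; the second goes to the same address but names the site it wants.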
Reading a lot about servers, load balancing and similar topics, a question came to mind.
DNS servers are servers which give you the IP for a given domain name. Is there a "dictator" knowing all the valid DNS servers in the world? If I want to make a DNS server, and someone requests a website it doesn't have, how would it know which other DNS server to redirect the request to? What if I tell facebook.com to have a spoof IP, so that everyone getting the IP from my DNS server would be communicating with a spoof facebook server? Obviously, this isn't how it works (at least not to a large degree), because then someone would have done it already to attack hundreds of people.
When one registers a domain, one has to specify the name server for that domain. What happens during this process? Is a request sent to this DNS server to notify it there is a new domain to save in the database? If so, how can anyone own the top domains like .com? And why cannot I for example make my own top domain name if I can make my own DNS server?
After looking at nginx as a load balancing system, I'm starting to wonder a bit. Is it so that a request to http://www.google.com/ works like this? The computer asks a DNS server for the IP address for google.com, and then requests it? This will only be one IP, and all requests to Google end up at this one server? And then this IP will be connected to an nginx server, or a more basic hardware unit, to route the request internally to other servers? So all requests go to one server before it redirects the request to a data center?
After looking up google.com, it says the name servers are ns1.google.com etc.. But what is the point of them, if you need a different name server to get to ns1.google.com in the first place?
Obviously what I've written doesn't make sense, because if it were true, the web as a whole would be unusable because of people exploiting the possibilities for malicious causes. And I can't imagine how ONE server could handle ALL the requests thrown at google.com.
I've tried searching Google, but all I get is theoretical explanations that led me to where I am now. It would have been great if someone would point me to some articles that explain this thoroughly, and hopefully a lot of other people will find this question useful.
Anyone can run a DNS server, but the challenge is getting someone to use it. Normally the DNS server IP is provided as a DHCP option or is statically assigned. If you can get someone to use your server, you can return any IP for any hostname, including creating new top-level domains (subject to any filtering at the client, of course. Web browsers might have difficulty with a new TLD, for example). Note that with DNSSEC, this will eventually change, as the name record will be digitally signed and your server won't be able to fake the signature exactly.
DNS servers operate in a tree. When one server receives a request for a domain it does not control, it forwards the request on to another DNS server. The other DNS server may be the one which returns the IP (this is called the authoritative server), or it may return a NS record which points to another server which then must be queried. The DNS root servers provide for resolving TLDs.
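The walk down the tree can be sketched with a toy model. Everything in ZONES is invented (server labels, the name www.example.com. and its address from the documentation range), and real resolution involves caching, multiple name servers per zone and more record types.

```python
# Toy name servers: each maps names it knows to either an NS referral
# (the label of another toy server) or a final A record.
ZONES = {
    "root": {"com.": ("NS", "tld-com")},
    "tld-com": {"example.com.": ("NS", "ns-example")},
    "ns-example": {"www.example.com.": ("A", "192.0.2.10")},
}


def iterative_lookup(name, zones=ZONES, max_hops=10):
    """Start at the root and follow NS referrals until an A record is found.

    Returns (ip, list of servers asked).
    """
    server = "root"
    asked = []
    for _ in range(max_hops):
        asked.append(server)
        zone = zones[server]
        if name in zone and zone[name][0] == "A":
            return zone[name][1], asked  # authoritative answer
        referrals = [
            suffix for suffix, (rtype, _) in zone.items()
            if rtype == "NS" and name.endswith("." + suffix)
        ]
        if not referrals:
            raise LookupError(f"{server} knows nothing about {name}")
        # Follow the most specific delegation this server offers.
        server = zone[max(referrals, key=len)][1]
    raise LookupError("too many referrals")
```

Looking up www.example.com. asks the root, then the server for com., then the server for example.com., which is the authoritative one and returns the address.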
A DNS server does not need to always return the same IP for a given name. It may choose to return a different IP based on region, client IP, or even per-request. This is the most typical way to load balance. Multiple DNS servers can also load balance the DNS requests by using anycast routing, where many servers share the same public IP and multiple routes are published for that IP, so that each client's traffic is routed to whichever server is closest in routing terms.
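A toy version of "different answers for the same name" might look like the sketch below. The address pool, the region table and the function name are all invented for illustration, and real DNS load balancing also has to deal with TTLs and resolver caching.

```python
import itertools

# Hypothetical addresses (documentation ranges) behind one host name.
POOL = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
REGIONAL = {"eu": "198.51.100.1", "us": "203.0.113.1"}

_rotation = itertools.cycle(POOL)


def answer(region=None):
    """Return the IP to hand out for the next query.

    Queries from a known region get that region's address; all others
    are spread over the pool in round-robin order.
    """
    if region in REGIONAL:
        return REGIONAL[region]
    return next(_rotation)
```

Each caller still receives a single, ordinary-looking answer; the balancing comes from different callers receiving different ones.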