does the echo reply do another DNS lookup? - dns

When I do ping www.google.com the protocol ICMP does a DNS lookup and send a echo request to the google IP. When google responds, another dns lookup happens?
And if I do ping -n www.google.com do I save time but not performing the two dns lookups?

the protocol ICMP does a DNS lookup
This is incorrect. ICMP, like IP deals with addresses. When you use names, applications (that is the ping program) will query the OS to translate the name to an IP.
This mapping may be done in different way and the DNS is only one among others. On Unix systems, see the /etc/nsswitch.conf file that dictates how various things should be resolved, like hostsnames. Typically it is at least a mix of the content of the /etc/hosts file and DNS queries.
So once the ping program, typically using getaddringo OS class has resolved the name to an IP address, then it starts using the ICMP protocol towards that address.
When google responds, another dns lookup happens?
No, why should it? Again, when ICMP happens we are already and now solely in a world of addresses, there are no names there.
do I save time but not performing the two dns lookups?
Yes, probably, but just once (DNS uses caches) and the difference will be marginal and lost normally in many other things.
But the interesting question is why do you have this question, I mean what do you try to attempt going that way? Note that ping (because it uses ICMP) is a poor troubleshooting tool as ICMP traffic is often configured very differently in networks than IP traffic.
(And you should not use www.google.com as an anchor for tests, have a look at RIPE anchors instead for example, but again it depends a lot on what is your true underlying problem here).

Related

Two hostnames sharing the same IP

I noticed that one of Google's mail servers (alt4.aspmx.l.google.com) points to 74.125.200.26, but when I do a reverse DNS lookup on that IP I see that the hostname associated with it is sa-in-f26.1e100.net. My limited understanding of DNS is that when you have a situation like that, one hostname is an alias of the other, but that's not the case here.
My initial goal was making a Python program that given an IP address and a hostname, returns a boolean answer indicating whether the IP belongs to a mail server of that domain. The algorithm I implemented used dig to search all mail servers of a domain and then tried to match any of them to the hostname associated with the given IP (which I found using dig -x). My program fails with the case I mentioned before. What am I missing?
Sorry for my bad english. Thanks!
Many services can run on one server/ipaddress, and many hostnames can resolve to one IP address. In the other direction, one ip address will most often resolve to only one hostname (if it has PTR record at all), and the name will very often be something generic like ip-xx-yy-zz-qq.networkcarrier.net (so unrelated to any of the services that are legitimately running on that server).
Depending on the purpose of your check, perhaps you can just test if the hostname A record points to the required IP address (because your initial requirement is flawed: ip addresses do not belong to domains, they belong to network providers).
(Still, for some purposes, most notably as anti spam measure, there is a use case for checking if ip address resolves to some particular hostname.)

IGMPv2 flood source detection

In wireshark I can see Membership Query, general IGMPv2 requests coming over and over from 0.0.0.0 source which suggests ( according to RFC ) machine that hasn't received address yet. My question is how in Linux environment I can find such machine. This query triggers many answers and causes significant network communication slowdown.
When a machine is connected to a network for the first time, it will try to find the DHCP servers in order to get an IP address configuration. Untill then, as you already said, it has no IP address and the only identifier it has is it's MAC address, which is used to keep a comunication alive while it negotiates with the DHCP server (during this period it does not have an IP address until the very last).
Answering your question, you'd find the machine you are looking for making use of the MAC address. If you are on a small network, a manual check (ifconfig) will do it but, if you are on a big one, you better check the ARP table of your switch(es) to have a better idea where it could be.

nslookup not returning all ip addresses for host

I'm working on a program that tracks when search engines cannot find important files on my server.
From my computer I typed:
nslookup google.com
and I received 11 IP addresses in return all beginning with 74.125.226
I then tried:
dig google.com
and I still receive the same IP addresses.
When I navigate to here through searching: http://www.iplists.com/google.txt, the number of IP addresses is greater than 10 and that list includes ones starting with 64.249 which also appears in my server log.
Am I using the wrong linux tools to find every IP address associated with google? If so, what tools should I use?
I want to generate the IP address database myself rather than rely on someone's post of IP addresses just in case IP addresses are updated in the future.
The keyword here is GeoDNS: When website/service has many servers around the world (with the main purpose of providing the lowest possible latency to user requests), using GeoDNS it can respond with different IP addresses to DNS queries for the same hostname, based on the locaton of the DNS client that is making the query (of course it will try to respond with the IP address of the closest server).
Basically, there is no way you can retrieve all possible addresses just by querying DNS from one location. Also, google can add/remove/change addresses/servers on daily basis so you probably can never have the definite list (unless google publishes it somewhere).
The real-time example, this might help you:
https://www.whatsmydns.net/#A/www.google.com
Additional reading:
What is the difference between Anycast and GeoDNS / GeoIP wrt HA?

Where is an IP Anycast Nameserver system implemented?

Ive been reading alot about nameserver the last days. For our websites we want to optimize the waiting time of the visitors that is caused by our namserver. I will have some questions about IP Anycast and the general function of the DNS. Let me start by explaining what I understood the DNS works from user side:
User X wants to visit www.example.com, the following steps happen to get the IP address:
1.Step: User X sends request to the Nameserver of his ISP or nameserver by choice.(recursive nameserver)
2.Step: If the adress is not found, the recursive nameserver will send a request to one of the 13 root nameserver to get the nameserver for the .com TLD
3.Step: Query the .com nameserver to get the auhorative nameserver
4.Step: Query the auhorative nameserver to get the ip-address for www.example.com
First I realized that as a owner of a website you can only optimize Step number 4 and all other steps are not in our hands.
I came across IP Anycast nameserver (what is also used for the 13root nameservers) and totally understand the concept of distributed machines. But what I dont understand is where the decision logic, to which of the distributed machines the user will be send, according to his "position",is implemented? I mean when i buy an anycast nameserver, the logic should be implemented on the .com nameserver (Step 3), so that this nameserver decides to which machine of my anycast nameserver the user will be send.
For me thats really hard to understand and im asking myself if it really works that way? I hope someone can help me with these understanding questions.
Beside of that i found out, that another small method to gain some speed for the user, is to only use A Records and no CName Records anymore.
Are there some more ways to optimize a nameserver?
Thanks in advance!
Your question is not really related to programming, but more to operations, and is also a little too vague ("Are there some more ways to optimize a nameserver?").
But let us try to give you pointers.
User X wants to visit www.example.com, the following steps happen to get the IP address:
Your following description is then mostly correct. Note that at each step, by default until very recently, a recursive nameserver will send the whole name queried to each nameserver. Recently, QNAME Minimization appeared as a standard and now recursive nameservers can send to each authoritativ nameservers only the labels it neeeds to reply. This enhances privacy without changes to the protocol, but is not widespread today because some authoritative nameservers do not work correctly when queried that way.
As a domain name owner you can indeed only have an impact at the last step. But remember that recursive nameservers have caches, so the list of root nameservers as well as the list of .COM nameservers for example are so "hot" (so often needed) that they surely sit always in resolvers' caches, so basically step 1 and 2 happens do not happen often (at start when cache is empty typicaly).
I came across IP Anycast nameserver (what is also used for the 13root nameservers) and totally understand the concept of distributed machines. But what I dont understand is where the decision logic, to which of the distributed machines the user will be send, according to his "position",is implemented?
First things first, IP anycasting is not specific to the DNS, it is just hugely popular here because
it solves the load balancing/fail over problem that all big TLDs have
it works specially well with DNS over UDP which is a simple one query one reply protocol.
So any service can theoretically be anycasted. It means that a given IP address just appears at different locations in the world.
To summarize very broadly, Internet traffic between providers (AS numbers) is exchanged at peering points, where they interconnect and each provider says "I know about IP block 192.0.2.0/24, please send me all traffic for it", etc. for each blocks
(again this is a summary. Blocks are allocated by RIRs, and yes by default this is not very much authenticated so BGP hijacks happen when another provider also says "give me this traffic" when it shouldn't - and it happens because of malicious goals or just simple human errors).
For a normal (technical term: "unicast") IP address, only a single provider (AS) will announce it somewhere (technically: announce its block not just a single IP) and everything will be configured in such a way that wherever the start of the exchange is, for this single IP as destination, it will arrive at the exact same box.
On the contrary, for an anycast IP address, either a single provider or multiple ones (that is multiple Autonomous Systems) will announce this IP at various locations (peering points) in the globe. At each peering point, traffic for this IP will get taken by the provider announcing it there and then it will route this traffic to a specific server "nearby". Announcements of the same IP at peering point A and peering point B, will drive corresponding traffic on one side in datacentre X and the other for datacentre Y.
For the client, when everything works, it does not change anything, as long as all the replying servers react the same way to the same query. The client does not (and sometimes can not) even know the IP is anycasted or that it want to location X when another client doing the same thing will instead hit location Y.
So in short nameservers "decide" nothing in this regard. At each point of the DNS resolution, when they need to contact nameserver NS1 they know its IP address is IP1 and they just open an UDP (or sometimes TCP) connection to this IP, absolutely normally. It is the underlying IP and BGP protocols that will, if anycast is in action, make the response come from the appropriate "close" server.
Note that anycasting in this way, for DNS, achieve both:
fail-over : if one server dies, with appropriate monitoring, its provider withdraws its IP announcement, that is this local copy kind of disappear and the traffic will automatically (in order of seconds) shift to any other instance where the same IP is announced
load-balancing : rougly speaking, if you anycast one IP on 2 locations, each should receive 50% of traffic. It is not true in practice, and is very complicated (read: impossible) to predict or even monitor, because it all depends on the peering points, the agreements between the providers and various other policies (simple example: if you peer at two points where on first there is only one provider sending you trafic, and at other point you have 100 providers with whom you exchange traffic, then you may get more connections going to the second instance... except of course if single provider at first peering point is an ISP with millions of clients, where the other 100 providers are single small organizations...)
So, some nameservers are anycasted. Nowadays all the root ones are (but this was not true 16 months ago, see https://b.root-servers.org/news/2017/04/17/anycast.html as b.root-servers.org was the last one to board the anycast wagon) as well as all big TLDs, sometimes even with more that one "Anycast DNS providers".
For any domain name, you can get some providers that will give you a DNS service for it, based on a "cloud" of anycasted nameservers.
See for example:
https://www.pch.net/services/dns_anycast
https://www.netnod.se/dns/netnod-dns-services
https://dyn.com/dns/network-map/
http://www.cdns.net/anycast.html
https://www.rcodezero.at/en/home/
https://aws.amazon.com/route53/
https://cloud.google.com/dns/
and many others.
Now following on a totally different topic:
Beside of that i found out, that another small method to gain some speed for the user, is to only use A Records and no CName Records anymore.
This is not really something you gain things with, and CNAME records are useful in many other cases.
Again, you need to remember that there are caches.
So even if your configuration is:
www.mywebsite.example CNAME www.mywebsite.example.somegreatCDN.example
www.mywebsite.example.somegreatCDN.example A 192.0.2.128
it is true that this means on paper two DNS requests to finally be able to do an HTTP query, but in practice things will be cached (even more so today with big public open resolvers such as 1.1.1.1 or 8.8.8.8 or 9.9.9.9, that are anycasted too in fact), so the difference will be negligible (and only impacts the first time, never again until it is in cache) ... especially in the case of HTTP and everything that happens later that is opening, frequently, dozens of connections to download javscript source codes, CSS files, fonts, etc. that may be hosted elsewhere.
A lot of websites use CNAME records without negative impact. See www.amazon.com for example, right now:
;; ANSWER SECTION:
www.amazon.com. 730 IN CNAME www.cdn.amazon.com.
www.cdn.amazon.com. 11 IN CNAME d3ag4hukkh62yn.cloudfront.net.
d3ag4hukkh62yn.cloudfront.net. 11 IN A 54.239.172.122
You may however argue that some names will be more popular than others and hence stay longer in cache, which is certainly the case.
And finally:
Are there some more ways to optimize a nameserver?
Based on what? We touched various subjects above, all are compromises, you sacrifice something (it may be just "money") to gain something else (redundancy, etc.). There is no generic rule to declare when this compromise makes sense or not, it will depend a lot on your situation and what you are trying to do.
You are right, and should be congratulated about that, that you should invest some time around DNS setup, both for security and performance reasons. While a lot of money is often invested in huge HTTP setup to sustain various problems or spikes of activity (but even the best fail sometimes, see the recent Amazon Prime Day opening that was a gigantic failure), but often people forget about the DNS because it is on the infrastructure level so not well known nor understood (using UDP makes it already stand out from all other known protocols, as this is rare).
For example there is another completely different thing (it is orthogonal to anycasting, so it can work with or without it, the goals are different) that is related: "geo-DNS" means when a nameserver will reply differently based from where the client asks. This is meant to give, for example, a different IP for a webserver, one that is closer to client (so in that case the webserver itself is probably not anycasted). This is done by just looking at the source IP from the DNS packet, but it is often not good enough because the authoritative nameservers only see as source IP the one from the recursive nameserver and not the real end client one and nowadays with big open public recursive nameservers the location should be far off, so you also have a specific DNS option called EDNS Client Subnet that can be passed between recursive and authoritative nameservers so that they get the end client real IP address (in fact a block not a single IP for privacy reasons) and can act upon it.
Short answer is: you are right. The NameServers is where you can optimize and all "IP Anycast" products I have seen is just a NameServer setup that has a lot of locations.
They use the same system as the "root servers of the internet" but this does not mean that they have the same function. The IP Anycast is simply a method for multiple servers in different locations to serve the same IP address.
From WIKIPEDIA (http://en.wikipedia.org/wiki/Anycast)
On the Internet, anycast is usually implemented by using Border Gateway Protocol to simultaneously announce the same destination IP address range from many different places on the Internet. This results in packets addressed to destination addresses in this range being routed to the "nearest" point on the net announcing the given destination IP address.
If you are using a big ISP like ASCIO or someone using ULTRADNS you probably do not have to worry about this step too much, but if the NS is a local ISP it is worth considering. Make sure you have NS where your visitors are.
I assume this is where you came into contact with "IP Anycast" products. None that I have seen offers anything to attack step 1-2-3 but rather offers a large setup of NameServers allowing them to reduce resolving time due to closeness of networks.
Let me know if you are of the understanding that the offer is for a root NameServer setup, because I would like to see this.

First packet to be sent when starting to browse

Imagine a user sitting at an Ethernet-connected PC. He has a browser open. He types "www.google.com" in the address bar and hits enter.
Now tell me what the first packet to appear on the Ethernet is.
I found this question here: Interview Questions on Socket Programming and Multi-Threading
As I'm not a networking expert, I'd like to hear the answer (I'd assume it is "It depends" ;) ).
With a tool like Wireshark, I can obviously check my own computers behaviour. I'd like to know whether the packets I see (e.g. ARP, DNS, VRRP) are the same in each ethernet configuration (is it dependent on the OS? the driver? the browser even :)?) and which are the conditions in which they appear. Being on the data-link layer, is it maybe even dependent on the physical network (connected to a hub/switch/router)?
The answers that talk about using ARP to find the DNS server are generally wrong.
In particular, IP address resolution for off-net IP addresses is never done using ARP, and it's not the router's responsibility to answer such an ARP query.
Off-net routing is done by the client machine knowing which IP addresses are on the local subnets to which it is connected. If the requested IP address is not local, then the client machine refers to its routing table to find out which gateway to send the packet to.
Hence in most circumstances the first packet sent out will be an ARP request to find the MAC address of the default gateway, if it's not already in the ARP cache.
Only then can it send the DNS query via the gateway. In this case the packet is sent with the DNS server's IP address in the IP destination field, but with the gateway's MAC address on the ethernet packet.
You can always download wireshark and take a look.
Though to spoil the fun.
Assuming, the IP address of the host is not cached, and the MAC address of the DNS server is not cached, the first thing that will be sent will be a broadcast ARP message trying to find out the MAC address of the DNS server (which the router will respond to with its own address).
Next, the host name will be resolved using DNS. Then the returned IP address will be resolved using ARP (again the router will respond with its own address), and finally, the HTTP message will actually be sent.
Actually, it depends on a variety of initial conditions you left unspecified.
Assuming the PC is running an operating system containing a local DNS caching resolver (mine does), the first thing that happens before any packets are sent is the cache is searched for an IP address. This is complicated, because "www.google.com" isn't a fully-qualified domain name, i.e. it's missing the trailing dot, so the DNS resolver will accept any records already in its cache that match its search domain list first. For example, if your search domain list is "example.com." followed by "yoyodyne.com." then cached resources matching the names "www.google.com.example.com." "www.google.com.yoyodyne.com." and finally "www.google.com." will be used if available. Also note: if the web browser is one of the more popular ones, and the PC is running a reasonably current operating system, and the host has at least one network interface with a global scope IPv6 address assigned (and the host is on a network where www.google.com has AAAA records in its DNS horizon), then the remote address of the server might be IPv6 not IPv4. This will be important later.
If the remote address of the Google web server was locally cached in DNS, and the ARP/ND6 cache contains an entry for the IPv4/IPv6 address (respectively) of a default router, then the first transmitted packet will be a TCP SYN packet sourced from the interface address attached to the router and destined for the cached remote IPv4/IPv6 address. Alternatively, the default router could be reachable over some kind of layer-2 or layer-3 tunnel, in which case, the SYN packet will be appropriately encapsulated.
If the remote address of the Google web server was not locally cached, then the host will first need to query for the A and/or AAAA records in the DNS domain search list in sequence until it gets a positive response. If the first DNS resolving server address in the resolver configuration is in one of the local IPv4 subnet ranges, or in a locally attached IPv6 prefix with the L=1 bit set in the router advertisement, and the ARP/ND6 cache already contains an entry for the address in question, then the first packet the host will send is a direct DNS query for either an A record or a AAAA record matching the first fully-qualified domain name in the domain search list. Alternatively, if the first DNS server is not addressable on-link, and a default router has an ARP/ND6 cache entry already, then the DNS query packet will be sent to the default router to forward to the DNS server.
In the event the local on-link DNS server or a default router (respectively, as the case above may be) has no entry in the ARP/ND6 cache, then the first packet the host will send is either an ARP request or an ICMP6 neighbor solicitation for the corresponding address.
Oh, but wait... it's even more horrible. There are tweaky weird edge cases where the first packet the host sends might be a LLMNR query, an IKE initiation, or... or... or... how much do you really care about all this, buckaroo?
It depends
Got that right. E.g. does the local DNS cache contain the address? If not then a DNS lookup is likely to be the first thing.
If the host name is not in DNS cache nor in hosts file, first packet will go to DNS.
Otherwise, the first packet will be HTTP GET.
Well, whatever you try to do, the first thing happening is some Ethernet protocol related data. Notably, Ethernet adapters have to decide whether the Ethernet bus is available (so there's some collision detection taking place here)
It's hard to answer your question because it depends a lot on the type of ethernet network you're using. More information on Ethernet transmission can be found here and here

Resources