How does geographic lookup by IP work?

Is which IPs are assigned to which ISPs public information? How do geo IP services obtain and maintain this information?
How can I personally figure out where a certain IP belongs without using one of these services?

For what it's worth, I worked at a senior level in the ISP industry for more than a decade so I have quite some experience with this.
Large IP ranges are allocated as needed by IANA to each of the Regional Internet Registries.
The regions are generally continental in size - IP addresses are not assigned on a per-country basis.
The RIRs in turn then allocate IP addresses to ISPs, who in turn assign them to end-users.
Each of the RIRs maintains a whois server which can be queried to find out not only which ISP has been assigned any netblock, but to a certain extent which end-user, and that end-user's address.
Note that many ISPs do not fill out this information for every single customer. Hence if you're a residential subscriber of a DSL service, it's likely that the Geo records will give the address of your ISP, and not your own address.
The various GeoLocation providers mostly work by mining these whois records. Note that the legality of doing so is something of a gray area; see, for example, RIPE's database copyright statement.
IANA also maintains the root zone for the DNS, but that is completely separate from any IP allocation functions. It is very important to maintain the distinction between domain name operations and IP addresses.
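For illustration, the whois protocol itself is trivial: you open a TCP connection to port 43 on the registry's server, send the query, and read the reply (RFC 3912). A minimal Python sketch, querying ARIN's server by way of example:

    import socket

    def whois(server: str, query: str) -> str:
        # RFC 3912: send the query terminated by CRLF, then read until EOF
        with socket.create_connection((server, 43), timeout=10) as sock:
            sock.sendall((query + "\r\n").encode())
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode(errors="replace")

    # The reply names the netblock and the organization it is allocated to.
    print(whois("whois.arin.net", "8.8.8.8"))

(If you query the wrong RIR, the response generally refers you to the right one.)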

To answer the specific question about "how it works": there's a lot of manual labor involved, and the databases are to a large extent maintained by hand. As other answers point out, there's no real correlation between IP ranges and countries, much less specific regions. More recently, the distribution of IP address space has become even more decentralized, which means small private vendors can acquire IPv4 address ranges regardless of geographic region. This is why Google acquired Urchin, so they could use its services for Google Analytics, which provides very accurate IP-to-geographic-region information.
If you don't want to use a service like MaxMind (free for personal use, and the database is open to some extent) or Google Analytics (free for personal use), there are free (and hence always slightly outdated) databases floating around, sometimes as flat files.
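For example, a lookup against a local copy of MaxMind's free GeoLite2 database might look like this (a sketch using their geoip2 Python package; the .mmdb filename is a placeholder for wherever you put the downloaded database file):

    import geoip2.database

    # Requires a GeoLite2-City.mmdb file downloaded from MaxMind
    with geoip2.database.Reader("GeoLite2-City.mmdb") as reader:
        response = reader.city("8.8.8.8")
        print(response.country.iso_code)   # e.g. "US"
        print(response.city.name)          # may be None for coarse records
        print(response.location.latitude, response.location.longitude)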

There are a variety of libraries that have mapping tables as well as services you can incorporate into your code.
The most important thing to understand is that there is no direct relationship between an IP address and any part of the world. The addresses are allocated in large blocks to organizations that are roughly geographical, which in turn allocate smaller blocks; this may happen at several levels for any given IP address (Alnitak explains the process well).
The fact is: WHOIS data does not have to be accurate. If I have an address block, I can say it is on Mars. And even if you narrow down the location of the final organization (say a very small ISP in Alaska), the user might be using dialup from Hawaii, or the server might be hosting a company from Guam.
So, there is always an element of risk/estimation in mapping an IP address (or a domain name) to a physical location. This is not to say you should never do it, there are many applications where rough or imperfect information is very useful.

Beware, the data is often slow to be updated, and even slower to replicate. My work place changed ISPs a number of years ago, and we were assigned a block of formerly Canadian IP addresses (we're based in the US). For months Google continued to give us google.ca as our default search engine. About half the time my home IP address comes up as being from my town, the other half from a town in another state.
Jason is right that the process is the same, but the updates are even slower and the data less accurate.

Alnitak's answer is pretty much on the mark.
As a side note, if you want to use a .dll to determine the user's location, then you can try the IPAddressExtension found on CodePlex. It has an internal database of ISPs mapped to locations. As Alnitak mentioned above, each ISP has IP blocks, so this information is all buried inside the .dll :)
It's really easy to use. Just reference the .dll and then create an instance of a System.Net.IPAddress object; the extension methods are listed on it.
I also need to declare that I'm the author of that CodePlex project.
Please check it out :)
EDIT: added information about me being the author of that product.

Related

Understanding load balancing and DNS records

I am curious how to set up multiple load-balancers (with different IP addresses) behind a specific domain.
I understand that it is possible to set up multiple A records in DNS pointing to all of my load-balancers, but I also understand that this is not ideal.
DNS doesn't do any kind of is-alive checks, so if a load-balancer dies, the DNS will still send users to this address, right?
So how do you connect a domain/DNS with multiple load-balancers, while preventing a dead load-balancer from getting requests...
I read something about anycast, but is this the only solution?
I am just curious about how this issue is normally handled.
Thanks.
You have multiple solutions.
On a pure DNS level you can publish your records with a low TTL (say 5 minutes), and have your monitoring systems change the content of the zone by removing a dead record when it is detected. This does not provide immediate fail-over but can often be good enough.
It also does not involve overly complicated systems.
Also, some DNS servers allow a "programmed part": a dynamic backend that can compute records based on external parameters, such as doing live checks and replying only with the live records.
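As a sketch of the first approach, the monitoring system can push an RFC 2136 dynamic update to the master server when a health check fails. The zone, record, and server address below are hypothetical, and a real setup would also sign the update with a TSIG key:

    import dns.query
    import dns.update

    def withdraw_dead_record(zone: str, name: str, dead_ip: str, server: str):
        # Remove only the A record of the failed load-balancer;
        # the remaining A records keep answering.
        update = dns.update.Update(zone)
        update.delete(name, "A", dead_ip)
        dns.query.tcp(update, server, timeout=10)

    # e.g. called by the monitoring system after a failed health check:
    # withdraw_dead_record("example.com", "www", "203.0.113.10", "192.0.2.53")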
Anycast is another solution indeed, and it then has no relationship with the DNS anymore (the DNS itself can be "anycasted", but that addresses the DNS's own failover needs, not your application's).
Basically, your multiple systems, in various places in the world, are advertised with the same IP address. So the DNS has only one record.
With the "magic" of BGP, each instance announcing a given IP address will collect all the nearby traffic, so you get load-balancing for free, in fact. And you need some specific tooling so that, as soon as some local instance is dead (or in maintenance mode, for example), you stop announcing its IP address there, so that all other networks in the world, again because of BGP, learn that to reach "something" behind that IP they need to go somewhere else, to another instance of yours announcing this IP.
This is far more complicated to set up, as you need a proven BGP setup (and making errors in BGP can have even greater consequences than in DNS), multiple instances located in different datacentres, and possibly multiple AS numbers, depending on how you want your anycast done. This clearly needs a skilled professional in BGP routing, whereas the first solution with only DNS (in the simple case of just changing a static zonefile) is within reach of any enthusiastic amateur.
So the answer also depends slightly on the network locations of your load-balancers.

Can one intercept data outside of their local network?

Is it possible to access and intercept data transmissions between two hosts which are on separate subnets but still on the internet?
For example, intercepting traffic between two hosts located on a Japanese ISP's subnet, by an attacker located on a US ISP's subnet, without the use of malware, physical access, or ISP intervention?
Or is it just movie stuff?
There are things you can do to gain access to specific users' data streams, given sufficient time, effort, and energy. It does not require malware or viruses, nor does it require physical access to the target networks.
What it does require is persistence and an awful lot of late nights and long days, often filled with not a great deal beyond probes and tests for vulnerable network or user kit. Once you find a way through perimeter firewalls and networks, there is still a fair amount of work to go; however, there are things you can do on perimeter routers and switches that allow you to dump copious amounts of data flowing across the ports to a disk or elsewhere, for inspection and fact-finding on your theoretical targets.
This is just one, very quickly drafted, way of achieving your specified goal, but I can assure you, it is almost never like you see it at the movies.
HTH

DNS server in country A and hosting in B

This is something where I get confused.
Say I acquire a domain name blabla.ge (.ge is for Georgia) and host my files with a US-based hosting company. What are the downsides, if any, and is there an option to change the DNS server?
Cheers!
Agreed, there is no real downside. The TLD is really not that important to basic usage. Yes, root servers factor in here, but nothing that will impact your daily activities, and you don't really need to worry.
For the nameservers, you can change these to any servers you wish and have access to manage the records. Location isn't important other than basic routing and response time. Nameservers generally should be on diverse networks and diverse locations per Best Practices. I have nameservers available in multiple countries and there's nothing wrong with that. If you are using the nameservers provided by your registrar, you likely have the diversity I mentioned, although they may be located in a single country (which is fine).
I have multiple domains registered with TLDs such as .nl, .im, .com.de, etc. Some of these point to US-only nameservers, some use nameservers in multiple countries, and a couple use the nameservers provided by my registrar (who I purchased the domain from).
From there, my A records point to servers in diverse locations, primarily the US and the Netherlands. This setup works great, performance is adequate, and there are no major downsides to doing it this way. You can change your nameservers for the .ge domain to use US servers, or you can leave them overseas and use A records to point to your server(s) in the US. You can debate which method would be "best" given a situation, but neither method is "wrong."
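If you want to check which nameservers a domain currently delegates to after making a change, here's a quick sketch with dnspython (pip install dnspython; blabla.ge being the asker's placeholder domain):

    import dns.resolver

    # Print the NS records the domain currently delegates to
    for rr in dns.resolver.resolve("blabla.ge", "NS"):
        print(rr.target)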
So in short, no major downside to doing this at all. And yes, changing your DNS server (nameserver) is always an option. Hope this helps.

Is it a good idea to call an image by its IP address instead of a domain?

Let's say there is a page with 100 different user photos shown on the page;
that is at least 100 DNS lookups right there. Would this be reduced if I were to link using an IP instead of a domain URL?
http://217.345.33.444/images/photo.jpg instead of http://domain.com/images/photo.jpg
It lowers DNS lookup overhead but will force painful, monotonous, error-prone changes if that IP ever changes down the road.
Also, once a single name is resolved, it shouldn't be looked up again...
It's a bit late at night in my timezone, but I thought that DNS lookups are cached in various spots (even on the local machine?), so it is not as bad as you think.
Thus the first call to look up the domain will travel a fair way, but the results should be cached on in-between machines so that there are fewer performance hits on the later calls.
I am sure that this sort of thing was thought long and hard about by the designers of the DNS protocols.
Edit notes
It's taken me three edits just to get my spelling and grammar straight; it is definitely too late at night for me.
DNS lookups are cached by your computer, so there will only be a single lookup per unique domain.
Additionally, most people use their internet provider's DNS server, and it will typically cache DNS lookups as well, so a lot of the time, the DNS lookup will just be a single network hop away.
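You can see how long resolvers are allowed to cache an answer by looking at the TTL on the returned record set. A small sketch with dnspython (the domain is just an example):

    import dns.resolver

    answer = dns.resolver.resolve("example.com", "A")
    print("addresses:", [rr.address for rr in answer])
    # the TTL is how long caches may keep serving this answer
    print("cacheable for", answer.rrset.ttl, "seconds")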
You have no way of knowing when the IP address of a domain will change, so I do not recommend this approach.
Is there a reason you don't store the images on your own domain? If you did that:
the DNS issue would go away;
a lot of web servers don't allow hot linking of images, so this problem would be solved as well;
it would also create the possibility of spriting images together, if the set of images shown together doesn't change often.
Why is that 100 DNS lookups? Are all the images on different domains? You should typically incur only one lookup per unique domain (and that's assuming that domain has never been resolved before).
How confident are you that your IP address will never change? Also, if you had those 100 images spread across 4 different domains, performance would increase, because browsers limit how many connections they open in parallel to a single hostname.
Every browser I know looks up the DNS name only once and then caches it. Even if it doesn't, the operating system does. There are not 100 lookups as you suspected.
You can verify that with any simple traffic sniffer, as I did.
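If you want to reproduce that check yourself, one possible sketch with scapy (pip install scapy; capturing packets needs root/administrator privileges) is to count the DNS queries leaving your machine while the page loads:

    from scapy.all import sniff
    from scapy.layers.dns import DNS, DNSQR

    def show_query(pkt):
        # qr == 0 means this packet is a query, not a response
        if pkt.haslayer(DNSQR) and pkt[DNS].qr == 0:
            print(pkt[DNSQR].qname.decode())

    # capture DNS queries on the default interface for 30 seconds
    sniff(filter="udp port 53", prn=show_query, timeout=30)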

Cross-colo fail-over design, DNS level fail-over?

I'm interested in cross-colo fail-over strategies for web applications, such that if the main site fails, users seamlessly land at the fail-over site in another colo.
The application side of things looks to be mostly figured out with a master-slave database setup between the colos and services designed to recover and be able to pick up mid-stream. I'm trying to figure out the strategy for moving traffic from the main site to the fail-over site. DNS failover, even with low TTLs, seems to carry a fair bit of latency.
What strategies would you recommend for quickly moving traffic between colos, assuming the servers at the main colo are unreachable?
If you have other interesting experience / words of wisdom about cross-colo failover I'd love to hear those as well.
DNS based mechanisms are troublesome, even if you put low TTLs in your zone files.
The reason for this is that many applications (e.g. MSIE) maintain their own caches which ignore the TTL. Other software will do a single gethostbyname() or equivalent call and store the result until the program is restarted.
Worse still, many ISPs' recursive DNS servers are known to ignore TTLs below their own preferred minimum and impose their own higher TTLs.
Ultimately if the site is to run from both data centers without changing its IP address then you need to look at arrangements for "Multihoming" via global BGP4 route announcements.
With multihoming you need to get at least a /24 netblock of "provider independent" (aka "PI") IP address space, and then arrange for it to be announced to the global routing table from the backup site only if the main site goes offline.
As for DNS, I like to reference "Why DNS Based Global Server Load Balancing Doesn't Work". For everything else -- use BGP.
Designing networks to load balance using BGP is still not an easy task, and I myself am certainly not an expert on this. It's also more complex than Wikipedia can tell you, but there are a couple of interesting articles on the web that detail how it can be done:
Load Balancing In BGP Networks
Load Sharing in Single and Multi homed environments
There is always more if you search for BGP and load balancing. There are also a couple of whitepapers on the net which describe how Akamai does their global load balancing (I believe it's BGP too), which is always interesting to read and learn about.
Beyond the obvious concepts you can use software and hardware to achieve, you might also want to check with your ISP/provider/colo if they can set you up.
Also, no offense in regard to your choice of colo (who's the provider?), but most places should be set up to deal with downtime and so on; they should not require you to take action. Of course floods or aliens can always strike, but in that case I guess there are more important issues. :-)
If you can: Multicast (http://en.wikipedia.org/wiki/Multicast) or Anycast (http://en.wikipedia.org/wiki/Anycast).
