What is the theory behind port forwarding when using P2P programs - p2p

Every time I use a different router and different P2P program, I get the same problem - port forwarding. I then usually read random values of ports(TCP, UDP, whatever) and paste it into random places in my router setttings page and repeat this process until the damn thing starts working. As I am a bit tired of doing that i would like to understand the theory behind it a little bit, so that I can put the right things in right places immediately. Could anybody just explain it briefly to me in a few words? Apologies for lengthy description of the problem, but I didn't know how to describe the level of understanding that I am talking about in a more concise way.
Thanks.

Well, the router hides you from the outer world, so you can only make outgoing connections, for which router takes care of sending your requests to the outer world, receiving responses, and sending those back to you. No one can send a packet to you unless you have specifically asked for it—i.e. you can only receive responses.
In case on p2p, the ability to send packets to your machine is important if not vital. So what you do is ask router to forward (here! that's where the word comes from) all incoming packets to port X to your machine, port X.

Originally IP addresses were provided per device, now-a-days we tend to have 1 IP address per household (unless your doing something crazy), also called your external IP. Your external IP is your connection to the world via your router, but each computer within your network has it's own IP (called internal IP). Port forwarding allows the external world to establish communications with a specific computer.
A web server is a simple example, web services typically rely on port 80, what-if in your network you had 4 computers, 1 of which was your web server. How would the outside world know which PC to contact? Port Forwarding allows you to tell your router to direct internet traffic to that server.

Related

TCP hole punching in Node without a server

I'm trying to follow the code given here to implement NAT hole punching in Node.js. I'd like to know if the server is strictly necessary. Having read about hole punching, I am under the impression that the purpose of the server is to allow the clients to exchange some information (including but not limited to their addresses and ports they want to communicate on) so that they can proceed to talk directly. Assuming the clients already had each other's information (again, including but not limited to their addresses and ports), would the server still be necessary? If so why and if not, how could this be implemented?
For instance, say one were to build an application where client_A prints out all information that would have been transmitted to the server for user_A to read, who then sends this to user_B, who then submits this info to client_B (this could be done via email for example). Wouldn't this avoid the need for a server?
Here is another explanation of why I think it might be possible to remove the server in the middle:
In NAT hole punching (assuming I understand it correctly), the communications begin when client_A sends a message to the server. The message contains some information that the server then passes on to client_B when client_B contacts the server. After this point, client_A and client_B are able to communicate directly without the need for the server. I am under the impression that once a direct connection between client_A and client_B has been established, the server could go offline and the two clients would still be able to communicate directly with one another. If this is the case, then I would imagine that any information that is being used to maintain this connection (be that addresses, ports, or any other kind of info) could be exchanged through any other channel (eg: email, a handwritten letter, a voice call, etc) at the beginning of the protocol, and then the connection could be established without ever needing the server.
Regarding 'tricking' the router
As manishig pointed out to me in a comment (thanks), NAT hole punching also requires tricking the router. If I understand correctly (please correct me if not) the router is tricked by having the router store the info for directing incoming packets from the server to client_A, however, these packets are actually coming from client_B after the initial phase of the protocol. If this is a correct description of the problem, is there a way to trick the router that doesn't require using a server?
There are ways to communicate between two remote computers over the internet without an intermidiate server, but IMO it is not the preferred way.
Why an intermidiate server is needed?
If client_A and client_B are both in the same LAN (e.g your home/office network) you can make sure (configure on the clients side and/or the router) that they will have a static ip address over this LAN and they can just talk freely.
E.G: If client_A is listening on port 8080, client_B can create a connection to client_A_ip on port 8080
Over the internet any packet sent is passed through NAT usually at least twice. One time after going through your LAN (e.g your home/office router) and at least once over an ISP endpoint. Which means you have no controll over the public ip and port assigned to your packet.
Now not only that you don't have controll over your packet's assigned public ip and port, these are also not static. They won't change while you have an active TCP connection, but you don't have any other guarantee from your ISP regarding your assigned public ip and port.
The intermediate server`s purpose is to dynamically update each client with it's peer info and also keeping the tcp connection open, so that peer to peer comunication will be available.
Alternative solution to an intermidiate server (Not recommended)
If you want your clients to communicate without an intermidiate server you can buy a public static ip from your ISP (if they support it) and then there are ways you can make (with some config) that one of your clients have a public static ip and port that the other client can connect to.
But I wouldn't recommend it, since it requires some understanding in IT and security risks.
Also if both client's are portable and connect to different networks all the time it's not a valid solution

Designing a DSR load balancer

I want to build a DSR load balancer for an application I am writing. I wont go into the application because it is irrelevant for this discussion. My goal is to create a simple load balancer that does direct server response for TCP packets. The idea is to receive all packets at the load balancer, then using something like round robin, select a server from a list of available servers which are defined in some config file. The next step would be to alter the packer received and change the destination ip to be equal to the chosen backend server. Finally, the packet will be sent over to the backend server using normal system calls for sending packets. Theoretically the backend server should receive the packet, and send one back to the original requester, and then the requester can communicate directly with the backend server rather than going through the load balancer.
I am concerned that this design will not work as I expect it to. The main question is, what happens when computer A send a packet to IP Y, but receives a packet back in the same TCP stream from a computer at IP X? Will it continue to send packets to IP Y? Or will it switch over to IP X?
So it turns out this is possible, but only halfway so, and I will explain what I mean by this. I have three processes, one which is netcat, used to initiate an tcp request, a second process, the dsr-lb, which receives packets on a certain port, changes the destination ip to a backend server(passed in via command line arg), and forwards it off using raw sockets, and a third process which is a basic echo server. I got this working on a local setup. The local setup consists of netcat running on my desktop, and dsr-lb and echo servers running on two different linux VMs on the desktop as well. The path of the packets was like this:
nc -> dsr-lb -> echo -> nc
When I said it only half works, what I meant was that outgoing traffic has to always go through the dsr-lb, but returning traffic can go directly to the client. The client does not send further traffic directly to the backend server, but still goes through the dsr-lb. This makes sense since the client opened a socket to the dsr-lb ip, and internally still remembers this ip, regardless of where the packet came from.
The comment saying "if its from a different IP, it's not the same stream. tcp is connection-based" is incorrect. I read through the linux source code, specifically the receive tcp packet portion, and it turns out that linux uses source ip, source port, destination ip, and destination port to calculate a hash which is uses to find the socket that should receive the traffic. However, if no such socket matches the hash, it tries again using only the destination ip and destination port and that is how this "magic" works. I have no idea if this would work on a windows machine though.
One caveat to this answer is that I also spun up two remote VMs and tried the same experiment, and it did not work. I am guessing it worked while all the machines were on the same switch, but there might be a little more work to do to get it to work if it goes through different routers. I am still trying to figure this out, but from using tcpdump to analyze the traffic, for some reason the dsr-lb is forwarding to the wrong port on the echo server. I am not sure if something is corrupted, or if the checksum is wrong after changing the destination ip and some router along the way is dropping it or changing it somehow(I suspect this might be the case) but hopefully I can get it working over an actual network.
The theory should still hold though. The IP layer is basically a packet forwarding layer and routers should not care about the contents of the packets, they should just forward packets based on their routing tables, so changing the destination of the packet while leaving the source the same should result in the source receiving any answer. The fact that the linux kernel ultimately resolves packets to sockets just using destination ip and port means the only real roadblock to this working does not really exist.
Also, if anyone is wondering why bother doing this, it may be useful for a loadbalancer in front of websocket servers. Its not as great as a direct connection from client to websocket server, but it is better than a loadbalancer that handles both requests and responses, which makes it more scalable, and more able to run on less resources.

Python – Exchange Data Between Different Networks

I investigated a lot about this topic but most of the guides just teach how to exchange data between devices on the same network and, regarding exchanging data between devices on different networks, no source was totally clear to me. I hope with this question somebody can give me (and other users) a good overview. If you have any guide or book about it I’d be super interested (for Java would also be fine).
First of all I’m interested in the difference between programs that
need to exchange data quickly (it may be an online videogame) versus
programs that need to exchange data accurately (it may be a message
app). My understanding is that the difference between the two is the
protocol used: in the first case is UDP (where no checks are done to
ensure there is no packets loss), in the second case is TCP (where
checks are done and data is exchanged more slowly). Is this correct?
So in an hypothetical Python script in the first case the socket
created would look like this:
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
While in the second case would look like this:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
My understanding is that to exchange data between different networks
you have to use port forwarding (very good explanation here),
concept that is clear to me. However, do you have any source that
suggests how to do it in Python? Also, is port forwarding
everything you need to do in order to exchange data between
different networks? Finally, I’m not sure I understand the role UPnP
plays in port forwarding. Based on this question it seems UPnP
is a way to automatically port forwarding. Is it correct? Can I use
miniupnpc library to do it automatically?
Finally, if I switch off and on my router, the private IP addresses
assigned to the devices connected to the network change (so the
private IP of my phone connected to my home WiFi could change, for
example, from 192.168.1.2 to 192.168.1.11). Does this represent a
problem in networking programming? If I set on the router a certain
port and the traffic that comes to that port is directed to a
certain private IP address and then this IP changes I suppose there is a
problem. Is this correct? If it is what is the solution?
Your understanding of use cases for UDP and TCP seem roughly
accurate. UDP ensure lower latency (not always) so for apps that
require lowest latency possible while also not caring about missed
packets, UDP is used. So if you think about video streaming, once a
packet is missed, it makes no sense to hold up every future packet
for that one old packet. This is because a small amount of data that
is missed doesn't really affect a user's watching experience. For
gaming, we want the newest data as soon as possible, so waiting for
old data also doesn't matter. But if you're implementing a protocol
or something that requires all data to be transmitted, TCP makes
sense since its absolutely vital that all information gets to the
receiver and in order.
There are a few methods to exchange data between two private networks. Port forwarding is certainly one method, and both machines on either network would have to have port forwarding. I don't know anything about automated port forwarding like you mention, but you can go into your router settings and set it up pretty easily. Another method of talking across networks is something like webRTC. Its a protocol that uses the STUN TURN and ICE protocols to perform something called NAT traversal. Short story shorter, it tricks your routers into letting your machines talk to each other(analogous to temporary port forwarding).
You're right that this could be an issue. However you should be able to setup static IP addresses in your router. So you can assign one machine to have a static IP address, setup port forwarding, and bam you have a permanent(hopefully) open connection.

Is there a way to test if a computer's connection is firewalled?

I'm writing a piece of P2P software, which requires a direct connection to the Internet. It is decentralized, so there is no always-on server that it can contact with a request for the server to attempt to connect back to it in order to observe if the connection attempt arrives.
Is there a way to test the connection for firewall status?
I'm thinking in my dream land where wishes were horses, there would be some sort of 3rd-party, public, already existent servers to whom I could send some sort of simple command, and they would send a special ping back. Then I could simply listen to see if that arrives and know whether I'm behind a firewall.
Even if such a thing does not exist, are there any alternative routes available?
Nantucket - does your service listen on UDP or TCP?
For UDP - what you are sort of describing is something the STUN protocol was designed for. It matches your definition of "some sort of simple command, and they would send a special ping back"
STUN is a very "ping like" (UDP) protocol for a server to echo back to a client what IP and port it sees the client as. The client can then use the response from the server and compare the result with what it thinks its locally enumerated IP address is. If the server's response matches the locally enumerated IP address, the client host can self determinte that it is directly connected to the Internet. Otherwise, the client must assume it is behind a NAT - but for the majority of routers, you have just created a port mapping that can be used for other P2P connection scenarios.
Further, you can you use the RESPONSE-PORT attribute in the STUN binding request for the server to respond back to a different port. This will effectively allow you to detect if you are firewalled or not.
TCP - this gets a little tricky. STUN can partially be used to determine if you are behind a NAT. Or simply making an http request to whatismyip.com and parsing the result to see if there's a NAT. But it gets tricky, as there's no service on the internet that I know of that will test a TCP connection back to you.
With all the above in mind, the vast majority of broadband users are likely behind a NAT that also acts as a firewall. Either given by their ISP or their own wireless router device. And even if they are not, most operating systems have some sort of minimal firewall to block unsolicited traffic. So it's very limiting to have a P2P client out there than can only work on direct connections.
With that said, on Windows (and likely others), you can program your app's install package can register with the Windows firewall so your it is not blocked. But if you aren't targeting Windows, you may have to ask the user to manually fix his firewall software.
Oh shameless plug. You can use this open source STUN server and client library which supports all of the semantics described above. Follow up with me offline if you need access to a stun service.
You might find this article useful
http://msdn.microsoft.com/en-us/library/aa364726%28v=VS.85%29.aspx
I would start with each os and ask if firewall services are turned on. Secondly, I would attempt the socket connections and determine from the error codes if connections are being reset or timeout. I'm only familiar with winsock coding, so I can't really say much for Linux or mac os.

NAT, P2P and Multiplayer

How can an application be designed such that two peers can communicate directly with each other (assuming both know each other's IPs), but without outgoing connections? That's, no ports will be opened. Bitorrent for example does it, but multiplayer games (as far as I know) require port forwarding.
I'm not sure what you mean by No Outgoing Connections, I'm going to assume like everyone else you meant no Incoming Connections (they are behind a NAT/FW/etc).
The most common one mentioned so far is UPNP, which in this context is a protocol that allows you as a computer to talk to the Gateway and say forward me this port because I want someone on the outside to be able to talk to me. UPNP is also designed for other things, but this is the common thing for home networking (Actually it's one of many definitions).
There are also more common and slightly more reliable ways if you don't own the network. The most common is called STUN but if I recall correctly there are a few variants. Basically you use a third party server that allows incoming connections to try and coordinate a communication channel. Basically, what you do is send a UDP packet to you're peer, which will open up you're NAT for a response, but gets dropped on you're peer's NAT (since no forwarding rule exists yet). Through the connection to the intermediary, they are then told to do the same, which now opens up their NAT, and matches the existing rule in you're NAT. Now the communications can proceed. Their is a variant of this which will allow a TCP/IP connection as well by sending SYN and SYN-ACK messages with some coordination.
The Wikipedia articles I've linked to has links to the relevant rfc's for these protocols on precisely how they work. Essentially it comes down to, there isn't an easy answer, as this is a very network centric problem.
You need a "meeting point" in the network somewhere: the participants "meet" at a "gateway" of some sort and the said "gateway function" takes care of the forwarding.
At least that's one way of doing it: I won't try to comment on the details of Bittorrent... I am sure you can google for links.
UPNP dealt with this mostly in the recent years, but the need to open ports is because the application has been coded to listen on a specific port for a response.
Ports beneath 1024 are called "registered" because they've been assigned a port number because a company paid for it. This doesn't mean you couldn't use port 53 for a webserver or SSH, just that most will assume when they see it that they are dealing with DNS. Ports above 1024 are unregistered, so there's no association - your web browser, be it Internet Explorer/Firefox/etc, is using an unregistered port to send the request to the StackOverflow webserver(s) on port 80. You can use:
netstat -a
..on windows hosts to see what network connections are currently established, including the port involved.
UPNP can be used to negotiate with the router to open and forward a port to your application. Even bit-torrent needs at least one of the peers to have an open port to enable p2p connections. There is no need for both peers to have an open port however, since they both communicate with the same server (tracker) that lets them negotiate and determine who has an open port.
An alternative is an echo-server / relay-server somewhere on the internet that both peers trust, and have that relay all the traffic.
The "problem" with this solution is that the echo-server needs to have lots of bandwidth to accomodate all connected peers since it relays all the traffic rather than establish p2p connections.
Check out EchoWare: http://www.echogent.com/tech.htm

Resources