Intercept traffic above the transport layer - linux

Firstly, I'm relatively new to network programming. I want to intercept and delay HTTP traffic before it gets to the server application. I've delved into libnetfilter_queue which gives me all the information I need to delay suitably, but at too low a level. I can delay traffic there, but unless I accept the IP datagrams almost immediately (so sending them up the stack when I want to delay them), they will get resent (when no ACK arrives), which isn't what I want.
I don't want or need to have to deal with TCP, just the payloads it delivers. So my question is how do I intercept traffic on a particular port before it reaches its destination, but after TCP has acknowledged and checked it?
Thanks
Edit: Hopefully it's obvious from the tag and libnetfilter_queue - this is for Linux

Hijack the connections through an HTTP proxy. Google up a good way to do this if you can't just set HTTP_PROXY on the client, or set up your filter running with the IP and port number of the current server, moving the real server to another IP.
So the actual TCP connections are between the client and you, then from you to the server. Then you don't have to deal with ACKs, because TCP always sees mission accomplished.
edit: I see the comments on the original already came up with this idea using iptables to redirect the traffic through your transparent proxy process on the same machine.

Well I've done what I suggested in my comment, and it works, even if it did feel a long-winded way of doing it.
The (or a) problem is that the web server now, understandably, thinks that every request comes from localhost. Really I would like this delay to be transparent to both client and server (except in time of course!). Is there anything I can do about this?
If not, what are the implications? Each HTTP session happens through a different port - is that enough for them to be separated completely as they should be? Presumably so considering it works when behind a NAT where the address for many sessions is the same.

Related

Designing a DSR load balancer

I want to build a DSR load balancer for an application I am writing. I wont go into the application because it is irrelevant for this discussion. My goal is to create a simple load balancer that does direct server response for TCP packets. The idea is to receive all packets at the load balancer, then using something like round robin, select a server from a list of available servers which are defined in some config file. The next step would be to alter the packer received and change the destination ip to be equal to the chosen backend server. Finally, the packet will be sent over to the backend server using normal system calls for sending packets. Theoretically the backend server should receive the packet, and send one back to the original requester, and then the requester can communicate directly with the backend server rather than going through the load balancer.
I am concerned that this design will not work as I expect it to. The main question is, what happens when computer A send a packet to IP Y, but receives a packet back in the same TCP stream from a computer at IP X? Will it continue to send packets to IP Y? Or will it switch over to IP X?
So it turns out this is possible, but only halfway so, and I will explain what I mean by this. I have three processes, one which is netcat, used to initiate an tcp request, a second process, the dsr-lb, which receives packets on a certain port, changes the destination ip to a backend server(passed in via command line arg), and forwards it off using raw sockets, and a third process which is a basic echo server. I got this working on a local setup. The local setup consists of netcat running on my desktop, and dsr-lb and echo servers running on two different linux VMs on the desktop as well. The path of the packets was like this:
nc -> dsr-lb -> echo -> nc
When I said it only half works, what I meant was that outgoing traffic has to always go through the dsr-lb, but returning traffic can go directly to the client. The client does not send further traffic directly to the backend server, but still goes through the dsr-lb. This makes sense since the client opened a socket to the dsr-lb ip, and internally still remembers this ip, regardless of where the packet came from.
The comment saying "if its from a different IP, it's not the same stream. tcp is connection-based" is incorrect. I read through the linux source code, specifically the receive tcp packet portion, and it turns out that linux uses source ip, source port, destination ip, and destination port to calculate a hash which is uses to find the socket that should receive the traffic. However, if no such socket matches the hash, it tries again using only the destination ip and destination port and that is how this "magic" works. I have no idea if this would work on a windows machine though.
One caveat to this answer is that I also spun up two remote VMs and tried the same experiment, and it did not work. I am guessing it worked while all the machines were on the same switch, but there might be a little more work to do to get it to work if it goes through different routers. I am still trying to figure this out, but from using tcpdump to analyze the traffic, for some reason the dsr-lb is forwarding to the wrong port on the echo server. I am not sure if something is corrupted, or if the checksum is wrong after changing the destination ip and some router along the way is dropping it or changing it somehow(I suspect this might be the case) but hopefully I can get it working over an actual network.
The theory should still hold though. The IP layer is basically a packet forwarding layer and routers should not care about the contents of the packets, they should just forward packets based on their routing tables, so changing the destination of the packet while leaving the source the same should result in the source receiving any answer. The fact that the linux kernel ultimately resolves packets to sockets just using destination ip and port means the only real roadblock to this working does not really exist.
Also, if anyone is wondering why bother doing this, it may be useful for a loadbalancer in front of websocket servers. Its not as great as a direct connection from client to websocket server, but it is better than a loadbalancer that handles both requests and responses, which makes it more scalable, and more able to run on less resources.

Is there a way to test if a computer's connection is firewalled?

I'm writing a piece of P2P software, which requires a direct connection to the Internet. It is decentralized, so there is no always-on server that it can contact with a request for the server to attempt to connect back to it in order to observe if the connection attempt arrives.
Is there a way to test the connection for firewall status?
I'm thinking in my dream land where wishes were horses, there would be some sort of 3rd-party, public, already existent servers to whom I could send some sort of simple command, and they would send a special ping back. Then I could simply listen to see if that arrives and know whether I'm behind a firewall.
Even if such a thing does not exist, are there any alternative routes available?
Nantucket - does your service listen on UDP or TCP?
For UDP - what you are sort of describing is something the STUN protocol was designed for. It matches your definition of "some sort of simple command, and they would send a special ping back"
STUN is a very "ping like" (UDP) protocol for a server to echo back to a client what IP and port it sees the client as. The client can then use the response from the server and compare the result with what it thinks its locally enumerated IP address is. If the server's response matches the locally enumerated IP address, the client host can self determinte that it is directly connected to the Internet. Otherwise, the client must assume it is behind a NAT - but for the majority of routers, you have just created a port mapping that can be used for other P2P connection scenarios.
Further, you can you use the RESPONSE-PORT attribute in the STUN binding request for the server to respond back to a different port. This will effectively allow you to detect if you are firewalled or not.
TCP - this gets a little tricky. STUN can partially be used to determine if you are behind a NAT. Or simply making an http request to whatismyip.com and parsing the result to see if there's a NAT. But it gets tricky, as there's no service on the internet that I know of that will test a TCP connection back to you.
With all the above in mind, the vast majority of broadband users are likely behind a NAT that also acts as a firewall. Either given by their ISP or their own wireless router device. And even if they are not, most operating systems have some sort of minimal firewall to block unsolicited traffic. So it's very limiting to have a P2P client out there than can only work on direct connections.
With that said, on Windows (and likely others), you can program your app's install package can register with the Windows firewall so your it is not blocked. But if you aren't targeting Windows, you may have to ask the user to manually fix his firewall software.
Oh shameless plug. You can use this open source STUN server and client library which supports all of the semantics described above. Follow up with me offline if you need access to a stun service.
You might find this article useful
http://msdn.microsoft.com/en-us/library/aa364726%28v=VS.85%29.aspx
I would start with each os and ask if firewall services are turned on. Secondly, I would attempt the socket connections and determine from the error codes if connections are being reset or timeout. I'm only familiar with winsock coding, so I can't really say much for Linux or mac os.

What is the theory behind port forwarding when using P2P programs

Every time I use a different router and different P2P program, I get the same problem - port forwarding. I then usually read random values of ports(TCP, UDP, whatever) and paste it into random places in my router setttings page and repeat this process until the damn thing starts working. As I am a bit tired of doing that i would like to understand the theory behind it a little bit, so that I can put the right things in right places immediately. Could anybody just explain it briefly to me in a few words? Apologies for lengthy description of the problem, but I didn't know how to describe the level of understanding that I am talking about in a more concise way.
Thanks.
Well, the router hides you from the outer world, so you can only make outgoing connections, for which router takes care of sending your requests to the outer world, receiving responses, and sending those back to you. No one can send a packet to you unless you have specifically asked for it—i.e. you can only receive responses.
In case on p2p, the ability to send packets to your machine is important if not vital. So what you do is ask router to forward (here! that's where the word comes from) all incoming packets to port X to your machine, port X.
Originally IP addresses were provided per device, now-a-days we tend to have 1 IP address per household (unless your doing something crazy), also called your external IP. Your external IP is your connection to the world via your router, but each computer within your network has it's own IP (called internal IP). Port forwarding allows the external world to establish communications with a specific computer.
A web server is a simple example, web services typically rely on port 80, what-if in your network you had 4 computers, 1 of which was your web server. How would the outside world know which PC to contact? Port Forwarding allows you to tell your router to direct internet traffic to that server.

DNS Server Refusing Connection

I am implementing a dns client, in which i try to connect to a local dns server, but the dns server is returning the message with an error code 5 , which means that its refusing the connection.
Any thoughts on why this might be happening ?? Thanks
DNS response error code 5 ("Refused") doesn't mean that the connection to the DNS server is refused.
It means that the DNS server refuses to provide whatever data you asked for, or to do whatever action you asked it to do (for example a dynamic update).
Since you mention a "connection", I assume that you are using TCP?
DNS primarilly uses UDP, and some DNS servers will refuse all requests over TCP.
So the solution might be as simple as switching to UDP.
Otherwise, assuming you are building your own DNS client from scratch, my first guess would be that you are formatting the request incorrectly. Eventhough the DNS protocol seems fairly simple, it is very easy to get this wrong.
Finally, the DNS server may of course simply be configured to refuse requests for whatever you are asking.
explicitly adding the network from which i wanted to allow-recursion fixed this problem for me:
these two lines added to /etc/bind/named.conf.options
recursion yes;
allow-recursion { 10.2.0.0/16; };
Policy enforcement?
The DNS server could be configured to accept only connections from certain hosts.
Hmm, if you're able to access StackOverflow you have a working DNS server SOMEwhere. Try doing
host -v stackoverflow.com
and look for messages like
Received 50 bytes from 192.168.1.1#53 in 75 ms
then pick the address out of that line and use THAT as your DNS - it's obviously willing to talk to you.
If you're on Windows, use NSLOOKUP for the same purpose. Your name server's address will be SOMEwhere in the output.
EDIT:
When I'm stuck for a DNS server, I use the one whose address I can remember most easily: 4.2.2.2 . See how that works for you.
You might try monitoring the conversation using WireShark. It can also decode the packets for you, which might help you determine if your client's packets are correctly encoded. Just filter on port 53 (DNS) to limit the packets captured by the trace.
Also, make sure you're using UDP and not TCP for queries; TCP should be used primarily for zone transfers, not queries.

Avoid traffic shaping by using ssh on port 443

I heard that if you use port 443 (the port usually used for https) for ssh, the encrypted packets look the same to your isp.
Could this be a way to avoid traffic shaping/throttling?
I'm not sure it's true that any given ssh packet "looks" the same as any given https packet.
However, over their lifetime they don't behave the same way. The session set up and tear down don't look alike (SSH offer a plain text banner during initial connect, for one thing). Also, typically wouldn't an https session be short lived? Connect, get your data, disconnect, whereas ssh would connect and persist for long periods of time? I think perhaps using 443 instead of 22 might get past naive filters, but I don't think it would fool someone specifically looking for active attempts to bypass their filters.
Is throttling ssh a common occurrence? I've experienced people blocking it, but I don't think I've experienced throttling. Heck, I usually use ssh tunnels to bypass other blocks since people don't usually care about it.
443, when used for HTTPS, relies on SSL (not SSH) for its encryption. SSH looks different than SSL, so it would depend on what your ISP was actually looking for, but it is entirely possible that they could detect the difference. In my experience, though, you'd be more likely to see some personal firewall software block that sort of behavior since it's nonstandard. Fortunately, it's pretty easy to write an SSL tunnel using a SecureSocket of some type.
In general, they can see how much bandwidth you are using, whether or not the traffic is encrypted. They'll still know the endpoints of the connection, how long it's been open, and how many packets have been sent, so if they base their shaping metrics on this sort of data, there's really nothing you can do to prevent them from throttling your connection.
Your ISP is probably more likely to traffic shape port 443 over 22, seeing as 22 requires more real-time responsiveness.
Not really a programming question though, maybe you'll get a more accurate response somewhere else..

Resources