Virtual TCP connections on Linux

There are various services listening on my host's IP interface, and I am writing a proxy running on the same system that should be able to initiate TCP connections to them. It should be able to specify any source IP address for the connections. I could do this with a TUN device, but the actual connections originate from networks not based on TCP, so the proxy would have to implement TCP and segment the streams by itself, which is non-trivial. I would prefer to use the socket API and somehow spoof the source address and port. Is this possible in Linux, or is there another solution?

I found the solution. The IP_TRANSPARENT socket option should allow this: once it is set, the socket can bind() to a non-local source address before calling connect().
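A minimal sketch of that approach in Python, under a few assumptions: the helper name `connect_from` is hypothetical, the process holds CAP_NET_ADMIN (needed to set IP_TRANSPARENT), and the fallback value 19 is the Linux constant from <linux/in.h> in case the Python build does not expose it. Replies addressed to a non-local source address still have to be routed back to this host, which is outside the sketch.

```python
import socket

# Linux value from <linux/in.h>; used only if this Python build lacks the constant.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def connect_from(src_ip, src_port, dst_ip, dst_port):
    """Open a TCP connection whose source address need not be local.

    Hypothetical helper: requires CAP_NET_ADMIN, otherwise setsockopt()
    fails with PermissionError.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow bind() to an address that is not configured on this host.
        s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        s.bind((src_ip, src_port))      # the spoofed source address and port
        s.connect((dst_ip, dst_port))   # the local service to reach
    except OSError:
        s.close()
        raise
    return s
```

The kernel's own TCP implementation still does the segmentation and retransmission, so the proxy only has to read and write the stream.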

Related

How to bind a service into any host's port?

Hello, 👋
I was wondering how services (like mysql, apache, mongoDB) are bound to a port on the server/local machine. How does this work?
I'm guessing that when the service starts, it tries to connect to the port and, if possible, the service is "paused" until the OS receives a request on the selected port. Is there any documentation explaining how this works?
Thank you!

May I help you?
This is a list of TCP and UDP port numbers used by protocols for operation of network applications.
The Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) only need one port for duplex, bidirectional traffic. They usually use port numbers that match the services of the corresponding TCP or UDP implementation, if they exist.

Do all the sockets in a namespace connect to the same port on the server in socket.io?

I thought when a server is started, it creates a specific number of TCP ports on a computer, so whenever a new connection comes in, it assigns a port to that client ('connection'). Recently I opened the tutorialsPoint website 'https://www.tutorialspoint.com/socket.io/socket.io_namespaces.htm', where it is written:
"Socket.IO allows you to “namespace” your sockets, which essentially means assigning different endpoints or paths. This is a useful feature to minimize the number of resources (TCP connections) and at the same time separate concerns within your application by introducing separation between communication channels. Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server".
This part I did not understand: "Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server". My question is: how can all the connections share a single port on the web server?
Any help will be highly appreciated.

Do all the sockets in a namespace connect to the same port on the server in socket.io?
Yes, they do.
First off socket.io is built on the underlying webSocket protocol. A webSocket connection starts with an http connection which is built on top of a TCP connection and then the two sides agree to "upgrade" the protocol to start talking the webSocket protocol instead of the http protocol.
So, when a socket.io connection comes in, it's initially an http connection.
Second, any TCP server is listening for inbound connections on a known port. The client must know what that port is and the client attempts to connect to the combination of IP address and port. A regular TCP server using only one network adapter will just be listening on that one port. All inbound client connections will arrive on that one port.
I thought when a server is started, it creates a specific number of TCP ports on a computer, so whenever a new connection comes in, it assigns a port to that client ('connection').
That's not how it works. A listening server creates a passive socket listening for inbound connections on one specific port. When a TCP client initiates an outbound connection, that client picks a dynamically selected port number for that outbound connection (that is unique for that client and not currently in use). This source port number is typically not visible in TCP, http, webSocket or socket.io programming (though you can see what it is if you want - you just don't have to use it yourself at the level we usually program at). It's part of the TCP plumbing that helps packets get delivered to the right socket. So, at that point it has a source IP address and a source port number. It then attempts to connect to a target IP address on a target port.
That unique combination of those four parameters:
source IP
source port (dynamically assigned on the client)
target IP (known in advance by the client)
target port (known in advance by the client)
defines a unique TCP connection. No two TCP connections will have the same four parameters. If the same client makes another TCP connection to the same target IP and port, it will be assigned a different source port number and thus it will be a different unique combination.
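As a rough illustration of the four parameters, the sketch below opens two connections from the same client to the same listener on the loopback interface and collects each connection's (source IP, source port, target IP, target port). The function name `open_two_connections` is made up for this example.

```python
import socket


def open_two_connections():
    """Connect twice to one listener and return each connection's 4-tuple."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(2)
    target = server.getsockname()

    clients, accepted, tuples = [], [], []
    for _ in range(2):
        client = socket.create_connection(target)
        conn, _addr = server.accept()
        clients.append(client)
        accepted.append(conn)
        # (source IP, source port) + (target IP, target port)
        tuples.append(client.getsockname() + client.getpeername())

    for s in clients + accepted + [server]:
        s.close()
    return tuples


if __name__ == "__main__":
    for four_tuple in open_two_connections():
        print(four_tuple)
```

Both connections share the target IP and target port; only the source port assigned by the OS differs, and that is enough to tell them apart.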
There's one little (somewhat confusing) aspect here that I'll make you aware of, but not try to overly explain or confuse things by. Many clients are actually on a private network and have a private IP address. That private IP address is not what the server actually sees as the source of the connection. At some point the connection goes through a gateway that connects the private network to a public network. This gateway will do NAT (network address translation). It will swap the private source IP/port for a public source IP/port that corresponds to the gateway itself. It remembers what it swapped so that when packets come back in the other direction, it can swap it back. So, the target server actually believes it's communicating with the gateway, but anything the target sends to the gateway is "forwarded" on to the private IP address/port of the original sender. So, you don't really need to understand the details of the gateway except that it serves as a broker between the private IP address of some computer on a private network and some computer on the public internet that you are trying to connect to. For the rest of the discussion, you should forget about this and just pretend that both source and target are on the public internet with public IP addresses (even though that is almost never the actual case, but the gateway makes it just work as if they were).
"Socket.IO allows you to “namespace” your sockets, which essentially means assigning different endpoints or paths. This is a useful feature to minimize the number of resources (TCP connections) and at the same time separate concerns within your application by introducing separation between communication channels. Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server".
In socket.io, when you connect on a namespace, you are creating a new underlying webSocket connection to the same target IP/port. A server can have many inbound connections to the same IP/port. Each is given its own TCP socket and the four parameters mentioned above uniquely define each one. When an inbound network packet arrives at the lowest level, TCP can tell which source IP and source port it came from and which target IP/port it was sent to, and that allows the TCP driver to figure out which socket that packet belongs to so that the packet can be delivered to the code that is monitoring that specific socket.
This part I did not understand: "Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server". My question is: how can all the connections share a single port on the web server?
To use a namespace in socket.io, you make a new socket.io connection to that specific namespace. You don't use multiple namespaces on a single socket.io connection. But a namespace operates at a higher level than the TCP or webSocket connection logic. It rides on top of that in the application layer. So, all namespace connections, no matter which namespace you are using, connect to the same server on the same IP and same port. Once the connection has been established, socket.io sends some data saying that it would like a "logical" connection on this namespace, and then the receiving socket.io code is informed that the new connection belongs in this namespace.
Here's a useful article to read on the topic: Understanding socket and port in TCP.

Create VPN over TCP connection

I need to create a virtual IP network over a TCP connection. The hosting system is Linux; with the TUN/TAP kernel driver, it's quite easy to receive and re-inject IP packets of the virtual network.
The difficult part is transmitting the received IP packets to another host. For some non-technical reasons, I can only transmit the packets over TCP, not UDP. Transmitting IP packets over UDP is easy, but with TCP it becomes tricky. Here's the reason:
UDP doesn't support retransmission/reordering, just like IP. So, if one UDP packet is sent for every received virtual IP packet, the kernel TCP/IP protocol stack would still see virtual IP packet loss/duplication/reordering (those are required for TCP/IP to work well; if those "features" are missing, the TCP connection speed on the virtual network would suffer). If IP packets are transmitted over TCP, all required "features" will be missing, unless they are simulated somehow.
It seems I have to fake some kind of packet duplication/loss/reordering on the TCP connection, or patch the kernel TCP/IP protocol stack. Neither option is easy.
Is there a simpler solution to my problem? Or did I just go in a completely wrong direction? I'm all ears.
==== UPDATE ====
I'm thinking about using a raw IP socket (which could easily get rid of all the TCP retransmission/reordering stuff on the physical network while still using TCP packets) to transmit the received virtual network IP packets. But on the receiving host, how can I receive only the packets I'm interested in and return all other IP packets to the kernel TCP/IP stack?

First of all, you do not want to make a VPN over TCP because you would end up with tcp-over-tcp eventually. The main issue is that the timers of your inner TCP and outer TCP might differ significantly which negatively impacts your TCP session reliability. You can find a bit longer explanation here.
UDP doesn't support retransmission/reordering, just like IP. So, if one UDP packet is sent for every received virtual IP packet, the kernel TCP/IP protocol stack would still see virtual IP packet loss/duplication/reordering (those are required for TCP/IP to work well; if those "features" are missing, the TCP connection speed on the virtual network would suffer). If IP packets are transmitted over TCP, all required "features" will be missing, unless they are simulated somehow.
This does not make sense. If your outer layer uses TCP as a transport mechanism, nothing stops your inner layer from still using the full IP/TCP stack, including those features. They can conflict badly, like I said, but it's not as if this functionality disappears or breaks completely.
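For what it's worth, carrying the inner packets over an outer TCP stream mostly comes down to marking packet boundaries, since TCP delivers a byte stream rather than discrete packets. A minimal sketch, assuming a 2-byte big-endian length prefix (an illustrative choice, not something the question specifies); `frame` and `Deframer` are hypothetical names:

```python
import struct


def frame(packet: bytes) -> bytes:
    """Prefix one inner IP packet with its length so it survives a byte stream."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 2-byte length prefix")
    return struct.pack("!H", len(packet)) + packet


class Deframer:
    """Rebuild whole packets from arbitrary chunks read off the TCP socket."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes):
        """Add received bytes and return every packet that is now complete."""
        self._buf += data
        packets = []
        while len(self._buf) >= 2:
            (length,) = struct.unpack_from("!H", self._buf, 0)
            if len(self._buf) < 2 + length:
                break  # the rest of this packet has not arrived yet
            packets.append(bytes(self._buf[2:2 + length]))
            del self._buf[:2 + length]
        return packets
```

The sender writes `frame(packet)` for each packet read from the TUN device; the receiver passes whatever `recv()` returns to `feed()` and re-injects each returned packet.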
It seems like you actually want to use TCP just to have the headers and ignore the actual protocol. This would indeed avoid the issues with TCP over TCP. However, once again, this is a very bad idea. Flow processing for firewalls, NAT, DPI, and TCP boosters is becoming more and more common; if you fake TCP packets you might end up stressing those boxes, possibly deteriorating your own connection once again.
So you should ask yourself why you can't use UDP, and whether an alternative protocol (header) such as GRE or L2TP would be acceptable.

TCP/IP basics: Destination port relevance

OK, this is kind of embarrassing, but I have a rather "noob" question.
In client-server TCP communication, where my system is a client accessing a remote server at, say, port XX, isn't the client opening a random port YY on its system to talk to remote port XX?
So when we code, we do specify the destination port XX, right?
For the client, the port YY itself is chosen when the socket is created, isn't it?
Is there any way I could monitor/restrict/control any client talking to a particular server (say, clients talking to servers at specific serving ports)?
Is there any iptables rule or some firewall rule restricting the client?
Can this be done at all?
Are destination ports saved in the socket structures? If so, where?
Thanks!

First, the server side creates a listening socket with the chain of socket(2), bind(2), and listen(2) calls, then waits for incoming client connection requests with the accept(2) call. Once a client connects (socket(2) and then connect(2) on the client side) and the TCP/IP stacks of the client and the server machines complete the three-way handshake, accept(2) returns a new socket descriptor - that's the server's end of the connected socket. Both bind(2) on the server side and connect(2) on the client side take the server's address and port.
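The call chain above maps almost one-to-one onto Python's socket module. A small sketch on the loopback interface; the names `serve_once` and `demo` and the upper-casing "protocol" are invented for illustration:

```python
import socket
import threading


def serve_once(listener):
    """Accept one client and send back its message in upper case."""
    conn, _peer = listener.accept()          # accept(2): the connected socket
    with conn:
        conn.sendall(conn.recv(1024).upper())


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket(2)
    listener.bind(("127.0.0.1", 0))          # bind(2); port 0 = any free port
    listener.listen(1)                       # listen(2)
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket(2)
    client.connect(listener.getsockname())   # connect(2) to server address/port
    client.sendall(b"hello")
    reply = client.recv(1024)

    client.close()
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```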
Now, the full TCP connection is described by four numbers - server address, server port, client address, and client port. The first two must obviously be known to the client prior to the connection attempt (otherwise, where do we go?). The client address and port, while they could be specified explicitly with bind(2), are usually assigned dynamically - the address is the IP address of the outgoing network interface, as determined by the routing table, and the port is selected from the range of ephemeral ports.
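To make the "specified explicitly with bind(2)" case concrete, here is a hedged sketch in which the client binds its own address and port before connecting. The helper name `connect_from_port` is hypothetical; passing port 0 falls back to the usual ephemeral assignment.

```python
import socket


def connect_from_port(target, source_ip, source_port=0):
    """Connect to `target` from an explicitly chosen local address and port."""
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.bind((source_ip, source_port))  # bind(2) before connect(2)
        client.connect(target)
    except OSError:
        client.close()
        raise
    return client
```

The server then sees exactly that address and port as the peer of the accepted socket.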
The netstat(8) command shows you established connections. Adding the -a flag lets you see listening sockets as well; the -n flag disables DNS and service-name resolution, so you just see numeric addresses and ports.
Linux iptables(8) allows you to restrict where clients are allowed to connect to. You can restrict based on source and destination ports, addresses, and more.
You can get a socket's local binding with the getsockname(2) call; the remote binding is given by getpeername(2).
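A short sketch of those two calls, using Python's wrappers of the same names on a loopback connection (the function name `endpoint_names` is made up for the example):

```python
import socket


def endpoint_names():
    """Return local and remote bindings as seen from both ends of a connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, _addr = listener.accept()

    names = {
        "client_local": client.getsockname(),   # getsockname(2)
        "client_remote": client.getpeername(),  # getpeername(2)
        "server_local": conn.getsockname(),
        "server_remote": conn.getpeername(),
    }
    for s in (client, conn, listener):
        s.close()
    return names


if __name__ == "__main__":
    for name, addr in endpoint_names().items():
        print(name, addr)
```

Each side's local binding is the other side's remote binding, so the destination port the client asked for shows up as `client_remote` and as `server_local`.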
Hope this makes it a bit more clear.

Yes you can create a firewall rule to prevent outbound TCP connections to port XX. For example, some organizations prevent outbound TCP port 25, to prevent spam being sent from network PCs to remote SMTP servers.

Remote port blocking in firewalls?

Some people use a firewall on their laptops which not only blocks their own local incoming ports (except those they need for their application) but also blocks messages unless they are sent from a specific port number. We're talking about a local UDP server which is listening for UDP broadcasts.
The problem is that the remote client uses a random port, say 1024, which is blocked unless they tell the firewall to accept it.
What puzzles me is that as far as I know from using sockets in my programs is that usually the client gets its port number from the OS, whereas only when you have a server, you bind your socket to a distinct port, right?
In my literature and in tutorials and code snippets in the web I haven't found any clue that clients should be using fixed port numbers at all.
So how is this in reality? Am I probably missing a point?
Are there client applications around using fixed ports?
Is it actually useful to block remote ports with a firewall?
And if yes, what level of added security does this give you?
Thanks in advance for the enlightenment...

Although the default APIs allow the network stack to select a local port for client connections, clients may specify a fixed port for various reasons.
Some specifications (FTP) specify a fixed port for clients. Most servers don't care if clients get this correct.
Some clients use a fixed pool of ports for egress from a LAN to the Internet. This allows firewall rules to more completely lock down outbound traffic.
Source ports are sometimes used as a weak type of "security through obscurity".

If you have not explicitly bound to an address and/or port before sending, the OS picks one for you.
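A quick way to watch this happen for UDP is the sketch below. It assumes Linux behaviour (an unbound UDP socket reports port 0 until its first send), and the function name `udp_ephemeral_port` is invented for the example.

```python
import socket


def udp_ephemeral_port():
    """Send one datagram from an unbound socket and report its source port."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # the "server": explicitly bound
    receiver.settimeout(2)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    before = sender.getsockname()[1]     # 0 on Linux: nothing bound yet
    sender.sendto(b"ping", receiver.getsockname())
    after = sender.getsockname()[1]      # port the OS picked at send time

    _data, source = receiver.recvfrom(1024)
    sender.close()
    receiver.close()
    return before, after, source[1]


if __name__ == "__main__":
    print(udp_ephemeral_port())
```

The receiver sees the OS-chosen port as the datagram's source, which is why a firewall rule keyed on a fixed remote port rarely matches ordinary clients.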
Daemons are usually bound to a fixed port, so that:
you can actually contact them without having to try all possible ports or utilize a secondary resolver (remember the SUNRPC portmapping crap?)
and because a TCP socket that calls listen() without having bound to a port has no well-known port for clients to find (on Linux it is given an ephemeral one), IIRC.
Are there client applications around using fixed ports?
Some can be configured so, like BIND9.
useful to block remote ports with a firewall?
No, because your peer may choose any port of his. Block him and you'll lose a customer, so to speak.
