socket programming with getaddrinfo - linux

I'm using getaddrinfo in my socket programming in linux. I have created a client and a server. Currently the client has a hardcoded static port number. Everything works fine.
But I want the system to dynamically assign a port number to the client whenever it connects to the server. How do I do this using getaddrinfo?
I'm using a TCP socket.

Just don't call bind before calling connect and the TCP stack will assign the client a "random" source port. If you need to know what port you're connecting from (and you usually don't), you can call getsockname after calling connect.
Alternatively, you can call bind specifying port 0. In that case, again, the stack will assign the client a "random" unused source port to connect from. This option is preferable if you don't want to special-case letting the implementation select the port, or if you need to specify the local IP address for some reason.
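Both options can be sketched in a few lines. The sketch below uses Python's socket module rather than C, purely for brevity; the calls map one-to-one onto getaddrinfo, bind, connect and getsockname. The helper name connect_with_auto_port and the throwaway loopback listener are assumptions made for illustration, not part of the original question.

```python
import socket


def connect_with_auto_port(host, port, bind_first=False):
    """Connect to (host, port) and return the socket plus its source port."""
    # Resolve the server the same way a getaddrinfo-based C client would.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )[0]
    sock = socket.socket(family, socktype, proto)
    if bind_first:
        # Port 0 asks the stack for any unused port; "" means any local address.
        sock.bind(("", 0))
    sock.connect(addr)
    # getsockname() reports the local (source) address and port.
    return sock, sock.getsockname()[1]


# Throwaway local listener (hypothetical stand-in for the real server).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
host, port = listener.getsockname()

ports = []
for bind_first in (False, True):
    client, source_port = connect_with_auto_port(host, port, bind_first)
    print(f"bind_first={bind_first}: source port {source_port}")
    ports.append(source_port)
    client.close()
listener.close()
```

In both cases the client never names a port itself; the only difference is whether the port is chosen at bind time or at connect time.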

Related

Node+supertest flakes with "Client network socket disconnected before secure TLS connection was established"

My node tests are randomly failing with “Client network socket disconnected before secure TLS connection was established” and I’ve been debugging this for weeks. What’s going wrong? I’m using supertest.
TL;DR: Don’t rely on supertest to call listen and close on your server. Call server.listen before calling supertest.agent and handle calling close on your own.
Useful reading: https://gavv.github.io/articles/ephemeral-port-reuse/
The sockets created by net.Server.listen have the SO_REUSEADDR flag added to them. This means there can be multiple binds to the same port as long as they all add the SO_REUSEADDR flag.
Supertest by default will call server.listen(0), which creates an IPv6 socket on an ephemeral port using SO_REUSEADDR.
When you later use supertest to talk to your local server, it seems to prefer connecting over IPv4 instead of IPv6. And that’s OK, at least on Macs, because of dual-stacking, i.e. binding to “::” on IPv6 also listens on the same port on IPv4 if it isn’t taken by some other process.
However, every once in a while there can exist another process listening on the IPv4 version of the ephemeral port that’s being used by the test (SO_REUSEADDR allows this). The dual-stacking logic from above will choose the IPv4 socket belonging to a random process over the IPv6 socket that’s actually from your test.
There are a million reasons why this shouldn’t work, and the foreign process closes its end before the TLS handshake finishes, randomly giving you the error in the question.

Thankfully, if your server is already listening when you call supertest.agent, supertest no longer tries to be smart about implicitly calling listen/close and you can use a fixed port outside of the ephemeral range to avoid all of this.

Do all the sockets in a namespace connect to the same port on the server in socket.io?

I thought when a server is started, it creates a specific number of TCP ports on a computer. so whenever a new connection comes in, it assigns a port to that client ('connection'). Recently I opened tutorialsPoint website 'https://www.tutorialspoint.com/socket.io/socket.io_namespaces.htm' and in there is written:
"Socket.IO allows you to “namespace” your sockets, which essentially means assigning different endpoints or paths. This is a useful feature to minimize the number of resources (TCP connections) and at the same time separate concerns within your application by introducing separation between communication channels. Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server".
This part i did not understand: "Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server". My question is how can all the connections share a single port on the web-server.
Any help will be highly appreciated.
Do all the sockets in a namespace connect to the same port on the server in socket.io?
Yes, they do.
First off socket.io is built on the underlying webSocket protocol. A webSocket connection starts with an http connection which is built on top of a TCP connection and then the two sides agree to "upgrade" the protocol to start talking the webSocket protocol instead of the http protocol.
So, when a socket.io connection comes in, it's initially an http connection.
Second, any TCP server is listening for inbound connections on a known port. The client must know what that port is and the client attempts to connect to the combination of IP address and port. A regular TCP server using only one network adapter will just be listening on that one port. All inbound client connections will arrive on that one port.
I thought when a server is started, it creates a specific number of TCP ports on a computer. so whenever a new connection comes in, it assigns a port to that client ('connection').
That's not how it works. A listening server creates a passive socket listening for inbound connections on one specific port. When a TCP client initiates an outbound connection, that client picks a dynamically selected port number for that outbound connection (that is unique for that client and not currently in use). This source port number is typically not visible in TCP, http, webSocket or socket.io programming (though you can see what it is if you want - you just don't have to use it yourself at the level we usually program at). It's part of the TCP plumbing that helps packets get delivered to the right socket. So, at that point it has a source IP address and a source port number. It then attempts to connect to a target IP address on a target port.
That unique combination of those four parameters:
source IP
source port (dynamically assigned on the client)
target IP (known in advance by the client)
target port (known in advance by the client)
defines a unique TCP connection. No two TCP connections will have the same four parameters. If the same client makes another TCP connection to the same target IP and port, it will be assigned a different source port number and thus it will be a different unique combination.
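A small sketch makes the four parameters visible. It is written in Python only for brevity, and the loopback listener is a hypothetical stand-in for a real server: one client opens three connections to the same target IP and port, and each ends up with a different source port.

```python
import socket

# Stand-in server: one listening socket on one port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
target = listener.getsockname()

# Three connections from the same client to the same target IP/port.
clients = [socket.create_connection(target) for _ in range(3)]

# (source IP, source port, target IP, target port) for each connection.
tuples = [c.getsockname() + c.getpeername() for c in clients]
for src_ip, src_port, dst_ip, dst_port in tuples:
    print(f"{src_ip}:{src_port} -> {dst_ip}:{dst_port}")

for c in clients:
    c.close()
listener.close()
```

Only the source port differs between the printed lines, which is enough to make each combination unique.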
There's one little (somewhat confusing) aspect here that I'll make you aware of, but not try to overly explain or confuse things by. Many clients are actually on a private network and have a private IP address. That private IP address is not what the server actually sees as the source of the connection. At some point the connection goes through a gateway that connects the private network to a public network. This gateway will do NAT (network address translation). It will swap the private source IP/port for a public source IP/port that corresponds to the gateway itself. It remembers what it swapped so that when packets come back in the other direction, it can swap it back. So, the target server actually believes it's communicating with the gateway, but anything the target sends to the gateway is "forwarded" on to the private IP address/port of the original sender. So, you don't really need to understand the details of the gateway except that it serves as a broker between the private IP address of some computer on a private network and some computer on the public internet that you are trying to connect to. It does what's called "network address translation" to make this all work. For the rest of the discussion, you should forget about this and just pretend that source and target are both on the public internet with public IP addresses (even though that is almost never the actual case, but the gateway makes it just work as if they were).
"Socket.IO allows you to “namespace” your sockets, which essentially means assigning different endpoints or paths. This is a useful feature to minimize the number of resources (TCP connections) and at the same time separate concerns within your application by introducing separation between communication channels. Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server".
In socket.io, when you connect on a namespace, you are creating a new underlying webSocket connection to the same target IP/port. A server can have many inbound connections to the same IP/port. Each is given its own TCP socket and the four parameters mentioned above uniquely define each one. When an inbound network packet arrives at the lowest level, TCP can tell which source IP and source port it came from and which target IP/port it was sent to, and that allows the TCP driver to figure out which socket that packet belongs to so that the packet can be delivered to the code that is monitoring that specific socket.
This part i did not understand: "Multiple namespaces actually share the same WebSockets connection thus saving us socket ports on the server". My question is how can all the connections share a single port on the web-server.
To use a namespace in socket.io, you make a new socket.io connection to that specific namespace. You don't use multiple namespaces on a single socket.io connection. But, a namespace operates at a higher level than the TCP or webSocket connection logic. It rides on top of that in the application layer. So, all namespace connections, no matter which namespace you are using, connect to the same server on the same IP and same port. Once the connection has been established, socket.io sends some data that it would like a "logical" connection on this namespace and then the receiving socket.io code is informed that the new connection belongs in this namespace.
Here's a useful article to read on the topic: Understanding socket and port in TCP.

If a port is used to connect one service, is it OK to be used to connect another service?

For example, I have a ruby on rails app(10.0.0.3), it will connect redis(10.0.0.4) and mysql(10.0.0.5)
if ror has used 10.0.0.3:12345 to establish a TCP connection to redis(10.0.0.4:6379), can ror use 10.0.0.3:12345 at the same time to connect(TCP) to 10.0.0.5:3306?
I'm confused of srcIP:srcPORT:dstIP:dstPORT, since dst ip is different, so I can use the port??
In theory this is possible, as a TCP connection is identified by the 4-tuple {source IP, source port, target IP, target port}.
However the kernel will probably not actually allow the second and subsequent bind() calls using the same local port, as bind() precedes connect().
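Whether the kernel accepts the second bind() depends on the platform and on socket options such as SO_REUSEADDR, so the sketch below (Python, for brevity) reports the outcome instead of assuming one. The function name try_shared_source_port is hypothetical, and two loopback listeners on different ports stand in for the redis and mysql hosts; the destinations therefore differ by port rather than by IP, which still yields two distinct 4-tuples.

```python
import socket


def make_listener():
    """Stand-in for a remote service: a listener on a free loopback port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s


def try_shared_source_port(dest_a, dest_b):
    """Try to reach two destinations from one local port.

    Returns ("shared", port) on success, or ("refused", error_text) if the
    kernel rejects the second bind() or connect().
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for s in (first, second):
            # Assumption: both sockets opt in to address reuse.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind(("127.0.0.1", 0))
        first.connect(dest_a)
        port = first.getsockname()[1]
        # Explicitly request the port the first connection already uses.
        second.bind(("127.0.0.1", port))
        second.connect(dest_b)
        return ("shared", port)
    except OSError as exc:
        return ("refused", str(exc))
    finally:
        first.close()
        second.close()


service_a, service_b = make_listener(), make_listener()
outcome = try_shared_source_port(service_a.getsockname(), service_b.getsockname())
print(outcome)
service_a.close()
service_b.close()
```

If you let the OS pick source ports (no explicit bind), none of this matters: each connection simply gets its own port.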

Nodejs TCP connection client port assignment

I created tcp connection between client and server using nodejs (net module). Server is listening on already predefined port and client is connecting to that port.
As far as i understand port for client is dynamically assigned by node? Is that correct?
What kind of algorithm node is using to assign "random" port for the client? How this works, is this determined by node or by OS?
Is it possible to define static port which client is going to use? Is it possible to define range of ports for the client to use?
NOTE: I think I found a discussion/question with a similar subject on stackoverflow before, but I cannot find it anymore. I would appreciate it if you could share any reliable resources regarding this subject.
The source port number is usually pretty much irrelevant to your programming unless you have a router or firewall that is somehow restrictive in that regard. It is merely used by the underlying TCP infrastructure to keep track of different TCP connections.
From this article:
A TCP/IP connection is identified by a four element tuple: {source IP,
source port, destination IP, destination port}. To establish a TCP/IP
connection only a destination IP and port number are needed, the
operating system automatically selects source IP and port.
The above referenced article describes how Linux selects the source port number.
As to your particular questions:
What kind of algorithm node is using to assign "random" port for the
client? How this works, is this determined by node or by OS?
It is determined by the OS. That source port number is selected by the originating host at the TCP level before the connection is even made to node.js.
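To see that the choice is made by the OS, you can compare an assigned source port with the range the kernel is configured to use. This is a sketch in Python rather than node, for brevity; it assumes Linux, where the range is exposed in /proc/sys/net/ipv4/ip_local_port_range, and returns None on systems without that file. The function names are made up for illustration.

```python
import socket
from pathlib import Path

# Linux-specific location of the ephemeral port range setting.
RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def ephemeral_port_range():
    """Return (low, high) on Linux, or None where the file is unavailable."""
    try:
        low, high = RANGE_FILE.read_text().split()
        return int(low), int(high)
    except (OSError, ValueError):
        return None


def os_assigned_source_port():
    """Open a loopback connection without bind() and return its source port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    port = client.getsockname()[1]
    client.close()
    listener.close()
    return port


port_range = ephemeral_port_range()
source_port = os_assigned_source_port()
print(f"configured range: {port_range}, assigned source port: {source_port}")
```

Changing that range is an OS-level setting, which is why it applies to every program on the host and not just to node.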
Some other reference articles:
Does the TCP source port have to be unique per host?
how can an application use port 80/HTTP without conflicting with browsers?
Note: there is no security reason I'm aware of for a firewall to limit the source port number or block certain source port numbers. They are a TCP bookkeeping number only, not related at all to security or the type of service being used. Note, this is different than the destination port, which is usually correlated directly with the type of service being used (e.g. 80 is HTTP, 25 is SMTP, 143 is IMAP, etc.). When you make a TCP connection to a different host, you specify the host address and the destination port number. You don't specify the source port number.
The selected answer provides a lot of info, but does not deal with the underlying problem. Node does not appear to allow https.request to specify a port for the client. There exist localAddress and localPort options, but they appear to be broken.
I've opened a new question on this issue. Hopefully someone will answer with something other than "just don't do that."
Is there a way to set the source port for a node js https request?

TCP/IP basics: Destination port relevance

Ok this is kind of embarrassing but I just have a rather "noob" question.
In client-server TCP communications, where my system is a client accessing a remote server at say port XX, isn't the client opening a random port YY on its system to talk to remote port XX?
So when we code we do specify the destination port XX, right?
For the client, the port YY itself is chosen when the socket is created, isn't it?
Is there any way I could monitor/restrict/control any client talking to a particular server? (like say clients talking to servers at specific serving ports??)
Is there any iptables rule or some firewall rule restricting the client?
Can this be done at all??
Are destination ports saved in the socket structures? If so where??
Thanks!
First, the server side creates a listening socket with the chain of socket(2), bind(2), and listen(2) calls, then waits for incoming client connection requests with the accept(2) call. Once a client connects (socket(2) and then connect(2) on the client side) and the TCP/IP stacks of the client and the server machines complete the three-way handshake, accept(2) returns a new socket descriptor - that's the server's end of the connected socket. Both bind(2) on the server side and connect(2) on the client side take the server's address and port.
Now, the full TCP connection is described by four numbers - server address, server port, client address, and client port. The first two must obviously be known to the client prior to the connection attempt (otherwise, where do we go?). The client address and port, while they could be specified explicitly with bind(2), are usually assigned dynamically - the address is the IP address of the outgoing network interface, as determined by the routing table, and the port is selected from the range of ephemeral ports.
The netstat(8) command shows you established connections. Adding the -a flag lets you see listening sockets; the -n flag disables DNS and service resolution, so you just see numeric addresses and ports.
Linux iptables(8) allows you to restrict where clients are allowed to connect to. You can restrict based on source and destination ports, addresses, and more.
You can get socket local binding with getsockname(2) call, remote binding is given by getpeername(2).
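The call sequence above, and where each end keeps the addresses and ports, can be sketched with both ends in one process. Python is used only for brevity; its socket methods wrap the same socket(2), bind(2), listen(2), accept(2), connect(2), getsockname(2) and getpeername(2) calls. The loopback address and variable names are illustrative assumptions.

```python
import socket

# Server side: socket(2), bind(2), listen(2).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a port for the demo
server.listen(1)
listen_addr = server.getsockname()

# Client side: socket(2), then connect(2) to the server's address and port.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listen_addr)

# accept(2) returns a new socket: the server's end of this connection.
conn, _ = server.accept()

# Each end stores both halves of the connection.
client_local, client_remote = client.getsockname(), client.getpeername()
server_local, server_remote = conn.getsockname(), conn.getpeername()
print("client:", client_local, "->", client_remote)
print("server:", server_local, "<-", server_remote)

for s in (conn, client, server):
    s.close()
```

The client's remote address is the server's local address and vice versa, and the accepted socket keeps the same local port as the listening socket.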
Hope this makes it a bit more clear.
Yes you can create a firewall rule to prevent outbound TCP connections to port XX. For example, some organizations prevent outbound TCP port 25, to prevent spam being sent from network PCs to remote SMTP servers.
