I want a TCP server that waits for clients to connect, and as soon as they do, sends them some data continuously. I also want the server to notice if a client disappears suddenly, without a trace, and to remove them from the list of open sockets.
My code looks like this:
#!/usr/bin/env python3
import select, socket
# Listen Port
LISTEN_PORT = 1234
# Create socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Setup the socket
server.setblocking(0)
server.bind(('0.0.0.0', LISTEN_PORT))
server.listen(5)
# Make socket reusable
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# Setup TCP Keepalive
server.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 3)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
# Tell user we are listening
print("Listening on port %s" % LISTEN_PORT)
inputs = [server]
outputs = []
while True:
    # Detecting clients that disappeared does NOT work when we ARE
    # watching if any sockets are writable
    #readable, writable, exceptional = select.select(inputs, outputs, inputs)
    # Detecting clients that disappeared works when we aren't watching
    # if any sockets are writable
    readable, writable, exceptional = select.select(inputs, [], inputs)
    for s in readable:
        if s is server:
            connection, client_address = s.accept()
            print("New client connected: %s" % (client_address,))
            connection.setblocking(0)
            inputs.append(connection)
            outputs.append(connection)
        else:
            try:
                data = s.recv(1024)
            except TimeoutError:
                print("Client dropped out")
                inputs.remove(s)
                if s in outputs:
                    outputs.remove(s)
                continue
            if data:
                print("Data from %s: %s" % (s.getpeername(), data.decode('ascii').rstrip()))
            else:
                print("%s disconnected" % (s.getpeername(),))
    for s in writable:
        s.send(b".")
As you can see, I'm using TCP Keepalive to allow me to see if a client has disappeared. The problem I'm seeing is this:
When I'm NOT having select() watch for writable sockets and the client disappears, select() stops blocking after the TCP Keepalive timeout expires and the socket ends up in the readable list, so I can remove the vanished client from inputs and outputs (which is good).
When I AM having select() watch for writable sockets and the client disappears, select() does NOT stop blocking after the TCP Keepalive timeout expires, and the client socket never ends up in the readable or writable list, so it never gets removed.
I'm using telnet from a different machine as a client. To replicate a client disappearing, I'm using iptables to block the client from talking to the server while the client is connected.
Anyone know what's going on?
As the comments to your question have mentioned, the TCP keepalive settings (SO_KEEPALIVE and friends) won't make any difference for your use-case. TCP keepalive is a mechanism for notifying a program when the peer on the other side of an otherwise idle TCP connection has gone away. Since you are regularly sending data on the TCP connection(s), the keepalive functionality is never invoked (or needed): the data you send goes unacknowledged once the remote client has gone away, and that alone is enough for the TCP stack to eventually report the connection as dead.
That said, I modified/simplified your example server code to get it to work (as correctly as possible) on my machine (a Mac, FWIW). What I did was:
Moved the socket.setsockopt(SO_REUSEADDR) to before the bind() line, so that bind() won't fail after you kill and then restart the program.
Changed the select() call to watch for writable-sockets.
Added exception-handling around the send() calls.
Moved the remove-socket-from-lists code into a separate RemoveSocketFromLists() function, to avoid redundant code.
Note that the expected behavior for TCP is that if you quit a client gently (e.g. by control-C'ing it, or killing it via Task Manager, or otherwise causing it to exit in such a way that its host TCP stack is still able to communicate with the server to tell the server that the client is dead) then the server should recognize the dead client more or less immediately.
If, on the other hand, the client's network connectivity is disconnected suddenly (e.g. because someone yanked out the client computer's Ethernet or power cable) then it may take the server program several minutes to detect that the client has gone away, and that's expected behavior, since there's no way for the server to tell (in this situation) whether the client is dead or not. (i.e. it doesn't want to kill a viable TCP connection simply because a router dropped a few TCP packets, causing a temporary interruption in communications to a still-alive client)
If you want to try to drop the clients quickly in that scenario, you could try requiring the clients to send() a bit of dummy-data to the server every second or so. The server could keep track of the timestamp of when it last received any data from each client, and force-close any clients that it hasn't received any data from in "too long" (for whatever your idea of too long is). This would more or less work, although it risks false-positives (i.e. dropping clients that are still alive, just slow or suffering from packet-loss) if you set your timeout-threshold too low.
#!/usr/bin/env python3
import select, socket
# Listen Port
LISTEN_PORT = 1234
# Create socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Make socket reusable (set before bind() so that it takes effect)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# Setup the socket
server.setblocking(0)
server.bind(('0.0.0.0', LISTEN_PORT))
server.listen(5)
# Tell user we are listening
print("Listening on port %s" % LISTEN_PORT)
inputs = [server]
outputs = []
# Removes the specified socket from every list in the list-of-lists
def RemoveSocketFromLists(s, listOfLists):
    for nextList in listOfLists:
        if s in nextList:
            nextList.remove(s)
while True:
    # Watch for readable AND writable sockets; a client that has gone
    # away shows up as an error (or empty read) from recv(), or as an
    # error from send()
    readable, writable, exceptional = select.select(inputs, outputs, [])
    for s in readable:
        if s is server:
            connection, client_address = s.accept()
            print("New client connected: %s" % (client_address,))
            connection.setblocking(0)
            inputs.append(connection)
            outputs.append(connection)
        else:
            try:
                data = s.recv(1024)
                if not data:
                    # An empty read means the client closed the connection
                    raise ConnectionError("connection closed by peer")
                print("Data from %s: %s" % (s.getpeername(), data.decode('ascii', errors='replace').rstrip()))
            except OSError:
                print("recv() reports that %s disconnected" % s)
                # Also drop it from writable so we don't send() to it below
                RemoveSocketFromLists(s, [inputs, outputs, writable])
                s.close()
    for s in writable:
        try:
            numBytesSent = s.send(b".")
        except OSError:
            print("send() reports that %s disconnected" % s)
            RemoveSocketFromLists(s, [inputs, outputs])
            s.close()
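The "force-close clients that have been silent for too long" idea mentioned above could be sketched roughly like this. It is only a sketch: the HeartbeatTracker class, its method names, and the injectable clock are hypothetical helpers made up for illustration, not part of the server code above.

```python
import time


class HeartbeatTracker:
    """Remembers when each client was last heard from (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # client -> timestamp of last received data

    def touch(self, client):
        # Call on accept() and whenever recv() returns data from this client.
        self.last_seen[client] = self.clock()

    def forget(self, client):
        # Call when a client is removed for any reason.
        self.last_seen.pop(client, None)

    def stale_clients(self):
        # Clients that have been silent for longer than the timeout.
        now = self.clock()
        return [c for c, seen in self.last_seen.items()
                if now - seen > self.timeout_seconds]
```

To use something like this in the server loop, you would presumably pass a timeout to select.select() (so the loop wakes up even when nothing happens), call touch() after each successful recv(), and once per iteration pass every socket returned by stale_clients() to RemoveSocketFromLists() and close it. The threshold is a trade-off: too low and you drop clients that are merely slow.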
Related
I have a long-running paho-MQTT (Python 3) client. The client is listen-only: it subscribes to topics and acts on those inputs, but it does not publish. Everything runs fine until the server becomes unresponsive (server restart or network transport failure); at that point the client stops receiving anything, since the connection is broken. The subscriptions are all QoS 0.
What mechanism exists to alert the client that the server is inop? Do I need to manually check for stale input or is there a call-back or exception that will get thrown? If stale input is detected, what's the best practice for recovery to re-establish the subscriptions?
As described in the Paho Python docs:
on_disconnect()
on_disconnect(client, userdata, rc)
Called when the client disconnects from the broker.
client, the client instance for this callback
userdata, the private user data as set in Client() or user_data_set()
rc, the disconnection result
The rc parameter indicates the disconnection state. If
MQTT_ERR_SUCCESS (0), the callback was called in response to a
disconnect() call. If any other value the disconnection was
unexpected, such as might be caused by a network error.
On Disconnect Example
def on_disconnect(client, userdata, rc):
    if rc != 0:
        print("Unexpected disconnection.")
mqttc.on_disconnect = on_disconnect
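For the "how do I re-establish the subscriptions" part, a common pattern is to subscribe inside on_connect(), so the subscriptions are renewed whenever the client reconnects. The sketch below assumes the paho-mqtt 1.x callback signatures (matching the on_disconnect(client, userdata, rc) form quoted above); the TOPICS list and BROKER_HOST are hypothetical placeholders.

```python
# Hypothetical topic list; replace with the topics your client needs.
TOPICS = [("sensors/#", 0), ("alerts/#", 0)]


def on_connect(client, userdata, flags, rc):
    # Runs after every successful (re)connection, so the subscriptions
    # are renewed once the broker is reachable again.
    if rc == 0:
        client.subscribe(TOPICS)
    else:
        print("Connection refused, rc=%s" % rc)


def on_disconnect(client, userdata, rc):
    if rc != 0:
        print("Unexpected disconnection, waiting for reconnect.")


# Wiring, assuming an existing paho Client instance named mqttc:
#   mqttc.on_connect = on_connect
#   mqttc.on_disconnect = on_disconnect
#   mqttc.connect(BROKER_HOST, 1883, keepalive=60)
#   mqttc.loop_forever()  # keeps retrying the connection after a drop
```

With this arrangement the keepalive interval passed to connect() is what lets the client notice a silent broker, so a shorter value should mean faster detection at the cost of a little more ping traffic.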
I am trying to create a server-client network with up to 200 clients connecting to one server. The part that worries me is that the server has to handle up to 200 connections:
import socket, threading
# creating a server socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((socket.gethostbyname(socket.gethostname()), PORT))
server.listen()
# allowing clients to connect
while True:
    client, addr = server.accept()
    print(addr, "has connected !")
    # handle each client in its own thread
    t = threading.Thread(target=handle_client, args=(client,), daemon=True)
    t.start()
Having up to 200 clients will create up to 200 threads. My gut is telling me: "that is too much". So the first question is: should one create a network with a single server and over 100 clients? And the second question is: is there a good way to handle such a situation?
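One possible alternative to a thread per client is to multiplex all connections in a single thread with the standard-library selectors module. The following is a minimal sketch under that assumption, not a drop-in replacement for the code above: the serve() function, the handle_data callback, and the should_stop hook are hypothetical names chosen for illustration.

```python
import selectors
import socket


def serve(server_sock, handle_data, should_stop=lambda: False):
    """Serve many clients from one thread (illustrative sketch)."""
    sel = selectors.DefaultSelector()
    server_sock.setblocking(False)
    sel.register(server_sock, selectors.EVENT_READ)
    try:
        while not should_stop():
            # Wake up periodically so should_stop() gets checked.
            for key, _events in sel.select(timeout=0.1):
                sock = key.fileobj
                if sock is server_sock:
                    try:
                        client, _addr = server_sock.accept()
                    except BlockingIOError:
                        continue
                    client.setblocking(False)
                    sel.register(client, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(4096)
                except BlockingIOError:
                    continue        # spurious wakeup, nothing to read yet
                except OSError:
                    data = b""      # treat socket errors as a disconnect
                if data:
                    handle_data(sock, data)
                else:
                    # Empty read: the client closed the connection.
                    sel.unregister(sock)
                    sock.close()
    finally:
        # Close any clients that are still connected, then the selector.
        for key in list(sel.get_map().values()):
            if key.fileobj is not server_sock:
                key.fileobj.close()
        sel.close()
```

You would call it with an already bound and listening socket, for example serve(server, lambda sock, data: sock.sendall(data)) for a simple echo. Whether this is actually better than 200 threads depends on what handle_client does; for mostly idle connections either approach can work.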
I tried to make a script that connects to a server using Python's socket module.
# connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# in this case socket.socket() is clearer
connection = socket.socket()
connection.connect((ip, port))
But when the ip or the port are incorrect, it tries to connect forever.
So how can you implement a certain time to wait for a server to respond/connect? For example when the server isn't reachable after 5 seconds, the connection will be closed with an error that the server is unreachable/offline.
To specify a specific time to wait for the server you can simply do this:
connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connection.settimeout(5) # connect() gives up after 5 seconds
connection.connect((ip, port))
connection.settimeout(None) # back to blocking mode (important, otherwise the timeout also applies to later recv()/send() calls)
In this example, connect() waits up to 5 seconds and then raises an exception if the server has not responded in time. You can catch this exception in a try statement with "except socket.timeout:"
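Put together with the try statement, it could look something like the sketch below. The connect_with_timeout() helper name and its return-None-on-failure convention are made up for illustration; adapt the error handling to whatever your script needs.

```python
import socket


def connect_with_timeout(ip, port, timeout=5.0):
    """Return a connected socket, or None if the server cannot be reached."""
    connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    connection.settimeout(timeout)
    try:
        connection.connect((ip, port))
    except socket.timeout:
        # No answer at all within the timeout (e.g. unreachable host).
        print("No response from %s:%s within %s seconds" % (ip, port, timeout))
    except OSError as err:
        # Other failures, e.g. connection refused because the port is closed.
        print("Could not connect to %s:%s: %s" % (ip, port, err))
    else:
        connection.settimeout(None)  # back to blocking mode for normal use
        return connection
    connection.close()
    return None
```

The socket.timeout clause has to come before the OSError clause, because socket.timeout is itself a subclass of OSError.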
Honestly, this is lifted more or less from the Tutorials Point page on Python networking: https://www.tutorialspoint.com/python/python_networking.html
Server program:
import socket # Import socket module
dd="You connected successfully to the server"
dd1=bytes(dd,'UTF-8')
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM) # Create a socket object
host = socket.gethostname() # Get local machine name
port = 12345 # Reserve a port for your service.
s.bind((host, port)) # Bind to the port
s.listen(5) # Now wait for client connection.
while True:
    c, addr = s.accept() # Establish connection with client.
    print ('Got connection from', addr)
    c.send(dd1)
    c.close() # Close the connection
Client side:
import socket # Import socket module
s = socket.socket(socket.AF_INET) # Create a socket object
host = '80.78.xxx.xxx' # Get local machine name/value
port = 12345 # Reserve a port for your service.
s.connect((host, port))
print(s.recv(102004))
s.close() # Close the socket when done
So, I'm able to run both on the same PC and get a result. If my understanding of the code is correct, host refers to the local machine. But it should also work when I try to access the server using its public IP address, and it doesn't. Please help. It returns the error: Error 10060
https://help.globalscape.com/help/archive/cuteftp6/socket_error_=__10060.htm#:~:text=10060%20is%20a%20Connection%20Time,prefers%20PORT%20for%20data%20connections.&text=ERROR%3A%3E%20Can't%20connect%20to%20remote%20server.
I forwarded ports 12340 to 12350 to my IP address on the router and disabled all firewalls. Yet this still happens.
A similar error happened when I tried to put a website up using node.js. It works perfectly on localhost but doesn't work when I try to access it using the public IP address. I'm very confused and would be glad if you could point me to any literature to get a deeper understanding.
I have a problem with a TCP socket receiving messages with the wrong destination port.
The OS is Ubuntu Linux 10.10 and kernel version is 2.6.31-11-rt, but this happens with other kernels, too. The C/C++ program with this problem does this:
A TCP server socket is listening to connections in INADDR_ANY at port 9000.
Messages are received with recv(2) by a TCP message receiver thread. Connection is not closed after reading message, but the thread continues to read from the same connection forever.
Error: messages sent to ports other than 9000 are also received by the TCP message receiver. For example, when a remote SFTP client connects to the PC where the TCP message receiver is listening, the TCP message receiver also receives the SFTP messages. How is this EVER possible? How can the TCP ports "leak" this way? I think SFTP should use port 22, right? Then how is it possible that those messages are visible on port 9000?
More info:
At the same time there's a raw socket listening on another network interface, and the interface is in promiscuous mode. Could this have an effect?
The TCP connection is not closed in between message receptions. The message listener just keeps reading data from the socket. Is this really a correct way to implement a TCP message receiver?
Has anyone seen this kind of problem? Thanks in advance.
EDIT:
Ok, here is some code. The code looks to be all right, so the main strange thing is: how can a TCP socket ever receive data sent to another port?
/// Create TCP socket and make it listen to defined port
TcpSocket::listen() {
    m_listenFd = socket(AF_INET, SOCK_STREAM, 0);
    ...
    bzero(&m_servaddr, sizeof(sockaddr_in));
    m_servaddr.sin_family = AF_INET;
    m_servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    m_servaddr.sin_port = htons(9000);
    bind(m_listenFd, (struct sockaddr *)&m_servaddr, sizeof(sockaddr_in));
    ...
    listen(m_listenFd, 1024);
    ...
    m_connectFd = accept(m_listenFd, NULL, NULL);
}
/// Receive message from TCP socket.
TcpSocket::receiveMessage() {
    Uint16 receivedBytes = 0;
    // get the common fixed-size message header (this is our own message structure)
    Uint16 numBytes = recv(m_connectFd, msgPtr + receivedBytes, sizeof(SCommonTcpMSGHeader), MSG_WAITALL);
    ...
    receivedBytes = numBytes;
    expectedMsgLength = commonMsgHeader->m_msgLength; // commonMsgHeader is mapped to received header bytes
    ...
    // ok to get message body
    numBytes = recv(m_connectFd, msgPtr + receivedBytes, expectedMsgLength - receivedBytes, MSG_WAITALL);
}
The TCP connection is not closed in between message receptions. The
message listener just keeps reading data from the socket. Is this
really a correct way to implement a TCP message receiver?
Yes, but it must close the socket and exit when it receives the end-of-stream (EOS) indication, i.e. when recv() returns zero.
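As a rough illustration of that rule, here is a small Python sketch of a receiver that keeps reading framed messages from one connection and stops cleanly at end of stream. The framing is an assumption made for the example only (a 2-byte big-endian header holding the total message length, header included); it is not the SCommonTcpMSGHeader layout from the question, and recv_exact() and receive_messages() are hypothetical helper names.

```python
import socket
import struct

# Assumed framing for this sketch: 2-byte big-endian total message length.
HEADER = struct.Struct("!H")


def recv_exact(sock, count):
    """Read exactly count bytes; return None if the peer closes first."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:               # recv() returned zero bytes: end of stream
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def receive_messages(sock, on_message):
    """Read framed messages until the peer closes, then close our side."""
    try:
        while True:
            header = recv_exact(sock, HEADER.size)
            if header is None:
                break               # clean end of stream between messages
            (total_length,) = HEADER.unpack(header)
            if total_length < HEADER.size:
                break               # malformed length field, give up
            body = recv_exact(sock, total_length - HEADER.size)
            if body is None:
                break               # peer closed in the middle of a message
            on_message(body)
    finally:
        sock.close()
```

The point of the loop is that every recv() result is checked for the zero-byte case before the data is used, which is the step a receiver that "just keeps reading" tends to miss.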
I think it's ten to one your raw socket and TCP socket FDs are getting mixed up somewhere.
Umm... It appears that it was the raw socket after all which has received the messages. Can see from the log that it's the raw message handler printing out message reception things, not the TCP message handler. Duh...! :S Sorry. So it seems the binding of the raw socket into the other network interface doesn't work correctly. Need to fix that. Funny though that sometimes it works with SSH/SFTP, sometimes it does not.