Python, multithreading, sockets sometimes fail to create - Linux

I recently observed some rather odd behaviour that happens on Linux but not on FreeBSD, and I was wondering whether anyone has an explanation, or at least a guess, of what might really be going on.
The problem:
The socket creation method, socket.socket(), sometimes fails. This only happens when multiple threads are creating the sockets; single-threaded code works just fine.
To expand on how socket.socket() fails: most of the time I get "error 13: Permission denied", but I have also seen "error 93: Protocol not supported".
Notes:
I have tried this on Ubuntu 18.04 (bug is there) and FreeBSD 12.0 (bug is not there)
It only happens when multiple threads are creating sockets
I've used UDP as the protocol for the sockets, although UDP seems to be the more fault-tolerant of the two. I have tried TCP as well; it goes haywire even faster, with similar errors.
It only happens sometimes, so multiple runs might be required, or, as in the example I provide below, an inflated number of threads should also do the trick.
Code:
Here's some minimal code that you can use to reproduce that:
from threading import Thread
import socket

def foo():
    # Each thread looks up the protocol number and then creates a socket.
    udp = socket.getprotobyname('udp')
    try:
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    except Exception as e:
        print(type(e))
        print(repr(e))

def main():
    for _ in range(6000):
        t = Thread(target=foo)
        t.start()

main()
Note:
I have used an artificially large number of threads just to maximize the probability that you hit the error at least once within a run with UDP. As I said earlier, if you try TCP you'll see A LOT of errors with that number of threads. In reality, even a more realistic number of threads, like 10 or 20, will trigger the error; you'll just likely need multiple runs to observe it.
Wrapping the socket creation in a while loop with try/except causes all subsequent calls to fail as well.
Wrapping the socket creation in try/except and, in the exception-handling branch, restarting the function (i.e. calling it again) works and does not fail. Both variants are sketched below.
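For concreteness, here is roughly what those two variants look like; this is a sketch based on the reproduction code above, with the outcomes noted as I observed them rather than guaranteed:

import socket

# Variant 1 (kept failing for me): retry the socket() call with the protocol
# number that was already fetched in this thread.
def foo_retry_loop(attempts=5):
    udp = socket.getprotobyname('udp')
    for _ in range(attempts):
        try:
            return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        except OSError as e:
            print(repr(e))  # every retry failed the same way in my runs

# Variant 2 (recovered for me): call the function again, which re-runs
# getprotobyname() before the next socket() attempt.
def foo_restart():
    udp = socket.getprotobyname('udp')
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    except OSError:
        return foo_restart()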
Any ideas, suggestions or explanations are welcome!!!
P.S.
Technically I know I can get around my problem by having a single thread create as many sockets as I need and pass them as arguments to my other threads, but that is not the point really. I am more interested in why this is happening and how to solve it, rather than what workarounds there might be, even though these are also welcome. :)

I managed to solve it. The problem comes from getprotobyname() not being thread safe!
See:
The Linux man page
On another note, the FreeBSD man page also hints that this might cause problems with concurrency; however, my experiments suggest that it does not. Maybe someone can follow up?
Anyway, a fixed version of the code, for anyone interested, is to get the protocol number in the main thread (which seems sensible and is what I should have done in the first place) and then pass it as an argument. This both reduces the number of system calls you perform and fixes any concurrency-related problems with that call within the program. The code would look as follows:
from threading import Thread
import socket

def foo(proto_num):
    try:
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, proto_num)
    except Exception as e:
        print(type(e))
        print(repr(e))

def main():
    # Look up the protocol number once, before any worker threads exist.
    proto_num = socket.getprotobyname('udp')
    for _ in range(6000):
        t = Thread(target=foo, args=(proto_num,))
        t.start()

main()
Socket-creation exceptions in the form of "Permission denied" or "Protocol not supported" are no longer raised this way. Also note that if you use SOCK_DGRAM the proto_num argument is redundant and can be skipped altogether; the solution is more relevant if someone wants to create a SOCK_RAW socket, as sketched below.
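As an illustration of that last point, a minimal sketch of the SOCK_RAW case; the protocol name 'icmp' and the need for root privileges are assumptions for the example, not something from the original problem:

import socket
from threading import Thread

# Fetch the protocol number once, in the main thread, before spawning workers.
icmp = socket.getprotobyname('icmp')  # raw sockets typically require root

def raw_worker(proto_num):
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_RAW, proto_num)
        s.close()
    except OSError as e:
        print(repr(e))

Thread(target=raw_worker, args=(icmp,)).start()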

Related

How to work around opencv bug crashing on imshow when not in main thread [duplicate]

This has been answered for Android, Objective-C and C++ before, but apparently not for Python. How do I reliably determine whether the current thread is the main thread? I can think of a few approaches, none of which really satisfies me, considering it could be as easy as comparing against threading.MainThread, if that existed.
Check the thread name
The main thread is instantiated in threading.py like this:
Thread.__init__(self, name="MainThread")
so one could do
if threading.current_thread().name == 'MainThread'
but is this name fixed? Other code I have seen checks whether MainThread is contained anywhere in the thread's name.
Store the starting thread
I could store a reference to the starting thread the moment the program starts up, i.e. while there are no other threads yet (a sketch follows below). This would be absolutely reliable, but isn't it way too cumbersome for such a simple query?
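A minimal sketch of that idea, assuming the module below is first imported from the main thread, before any workers start:

import threading

# Captured at import time, while the starting (main) thread is presumably
# the only thread alive.
_STARTING_THREAD = threading.current_thread()

def is_starting_thread():
    return threading.current_thread() is _STARTING_THREAD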
Is there a more concise way of doing this?
The problem with threading.current_thread().name == 'MainThread' is that one can always do:
threading.current_thread().name = 'MyName'
assert threading.current_thread().name == 'MainThread' # will fail
Perhaps the following is more solid:
threading.current_thread().__class__.__name__ == '_MainThread'
Having said that, one may still cunningly do:
threading.current_thread().__class__.__name__ = 'Grrrr'
assert threading.current_thread().__class__.__name__ == '_MainThread' # will fail
But this option still seems better; "after all, we're all consenting adults here."
UPDATE:
Python 3.4 introduced threading.main_thread() which is much better than the above:
assert threading.current_thread() is threading.main_thread()
UPDATE 2:
For Python < 3.4, perhaps the best option is:
isinstance(threading.current_thread(), threading._MainThread)
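If you need to support interpreters on both sides of 3.4, a small sketch that picks whichever check is available (it still touches the protected class on old versions):

import threading

def in_main_thread():
    if hasattr(threading, "main_thread"):  # Python 3.4+
        return threading.current_thread() is threading.main_thread()
    # Older interpreters: fall back to the protected class check.
    return isinstance(threading.current_thread(), threading._MainThread)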
The answers here are old and/or bad, so here's a current solution:
if threading.current_thread() is threading.main_thread():
    ...
This method is available in Python 3.4+.
If, like me, accessing protected attributes gives you the heebie-jeebies, you may want an alternative to using threading._MainThread, as suggested. In that case, you can exploit the fact that only the main thread can install signal handlers, so the following can do the job:
import signal

def is_main_thread():
    try:
        # Back up the current signal handler
        back_up = signal.signal(signal.SIGINT, signal.SIG_DFL)
    except ValueError:
        # Only the main thread is allowed to set signal handlers
        return False
    # Restore the original signal handler
    signal.signal(signal.SIGINT, back_up)
    return True
Updated to address a potential issue pointed out by @user4815162342.
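A quick sanity check of that helper, assuming is_main_thread() from above is in scope:

import threading

def report():
    print(threading.current_thread().name, "main?", is_main_thread())

report()                              # prints: MainThread main? True
t = threading.Thread(target=report)   # in the worker, signal.signal() raises ValueError
t.start()                             # prints something like: Thread-1 main? False
t.join()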

redis-py not closing threads on exit

I am using redis-py 2.10.6 and redis 4.0.11.
My application uses redis for both the db and the pubsub. When I shut down, I often get either a hang or a crash. The latter usually complains about a bad file descriptor or an I/O error on a file (I don't use any files), and it happens while handling a pubsub callback, so I'm guessing the underlying issue is the same: somehow I don't get disconnected properly and the pool used by my redis.Redis object is still alive and kicking.
An example of the output of the former kind of error (during _read_from_socket):
redis.exceptions.ConnectionError: Error while reading from socket: (9, 'Bad file descriptor')
Other times the stacktrace clearly shows redis/connection.py -> redis/client.py -> threading.py, which proves that redis isn't killing the threads it uses.
When I start the application I run:
self.redis = redis.Redis(host=XXXX, port=XXXX)
self.pubsub = self.redis.pubsub()
subscriptions = {'chan1': self.cb1, 'chan2': self.cb2} # cb1 and cb2 are functions
self.pubsub.subscribe(**subscriptions)
self.pubsub_thread = self.pubsub.run_in_thread(sleep_time=1)
When I want to exit the application, the last instruction I execute in main is a call to a function in my redis-using class, whose implementation is:
self.pubsub.close()
self.pubsub_thread.stop()
self.redis.connection_pool.disconnect()
My understanding is that in theory I do not even need to do any of these 'closing' calls, and yet, with or without them, I still can't guarantee a clean shutdown.
My question is, how am I supposed to guarantee a clean shutdown?
I ran into this same issue and it's largely caused by improper handling of the shutdown by the redis library. During the cleanup, the thread continues to process new messages and doesn't account for situations where the socket is no longer available. After scouring the code a bit, I couldn't find a way to prevent additional processing without just waiting.
Since this is run during a shutdown phase and it's a remedy for a 3rd party library, I'm not overly concerned about the sleep, but ideally the library should be updated to prevent further action while shutting down.
self.pubsub_thread.stop()
time.sleep(0.5)
self.pubsub.reset()
This might be worth an issue log or PR on the redis-py library.
The PubSubWorkerThread class checks self._running.is_set() inside its loop.
To do a "clean shutdown" you should call self.pubsub_thread._running.clear() to clear the event, and the thread will stop.
Check how it works here:
https://redis.readthedocs.io/en/latest/_modules/redis/client.html?highlight=PubSubWorkerThread#
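For context, a rough paraphrase of how that loop is structured; this is a sketch of the idea, not the actual redis-py source:

import threading

class PubSubWorkerSketch(threading.Thread):
    def __init__(self, pubsub, sleep_time=1.0):
        super().__init__()
        self.pubsub = pubsub
        self.sleep_time = sleep_time
        self._running = threading.Event()

    def run(self):
        # The loop only keeps polling while the event stays set.
        self._running.set()
        while self._running.is_set():
            self.pubsub.get_message(ignore_subscribe_messages=True,
                                    timeout=self.sleep_time)

    def stop(self):
        # Clearing the event makes the while condition false, so run() returns.
        self._running.clear()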

Python - multithreading using classes

I am an absolute beginner in Python multithreading. My application needs to telnet into around 200 servers, execute commands and return the responses. I have created separate classes for telnetting and for processing the response. I read about the GIL and race conditions in threading, but I'm not sure whether they will have an impact on my code, because for every thread I am creating a new instance of the class and calling its method. So technically the threads will not share the same resource. Can anyone please explain whether my assumption is right, and if not, explain the right way of doing it?
Main method:
if __name__ == "__main__":
thread_list = []
for ip in server_list: # server list contains the IP of hosts
config_object = Configuration () # configuration class has method for telnet device
thread1 = threading.Thread(target=config_object.captureconfigprocess, args=(ip))
thread_list.append(thread1)
for thread in thread_list:
thread.start()
for thread in thread_list:
thread.join()
I read about the GIL and race conditions in threading but not sure whether they will have an impact on my code
Python threads are real OS threads, but the GIL means only one thread executes Python bytecode at a time, so CPU-bound work does not run in parallel and heavy context switching can hurt performance. Python threads will be more than enough for most cases, but they may or may not be enough for yours. 200 servers may seem like a lot, but it all boils down to how much communication happens between those 200 servers and your Python client; telnet sessions are mostly I/O-bound, and threads generally handle that well. To be sure, you have to try. If you need true parallelism, use multiprocessing; a sketch is included at the end of this answer.
So technically the threads will not share the same resource.
If each thread is using its own resources, then shared resources are not an issue to worry about.
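If you do go the multiprocessing route, here is a minimal sketch; Configuration, captureconfigprocess and server_list are the question's own names and are assumed to be defined at module level so the worker processes can pickle the call:

from concurrent.futures import ProcessPoolExecutor

def capture(ip):
    # One Configuration instance per call, mirroring the question's code.
    return Configuration().captureconfigprocess(ip)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(capture, server_list))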

When asyncio.StreamReader.read() is called, which types of error can occur?

I am making a TCP server with asyncio.
I want to add exception handling to my code (like below):
try:
    data = await reader.read(SERVER_IO_BUFFER_SIZE)
except SomeError:
    # handle the error
So I looked at the official asyncio documentation, but I can't find any information about the errors that may occur.
(link: https://docs.python.org/3/library/asyncio-stream.html#asyncio.StreamReader.read)
How can I get information about the errors that may occur?
The exact errors that may occur will depend on the type of the stream behind the StreamReader. An implementation that talks to a socket will raise IOError, while an implementation that reads data from a database might raise some database-specific errors.
If you are dealing with the network, e.g. through asyncio.open_connection or asyncio.start_server, you can expect instances of IOError and its subclasses. In other words, use except IOError as e.
Also, if the coroutine is cancelled, you can get asyncio.CancelledError at any await. You probably don't want to handle that exception - just let it propagate, and be sure to use the appropriate finally clauses or with context managers to ensure cleanup. (This last part is a good idea regardless of CancelledError.)
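A minimal sketch of that handling inside a handler coroutine; SERVER_IO_BUFFER_SIZE is the question's constant, given an assumed value here:

import asyncio

SERVER_IO_BUFFER_SIZE = 4096  # assumed value for the question's constant

async def handle_client(reader, writer):
    try:
        data = await reader.read(SERVER_IO_BUFFER_SIZE)
        writer.write(data)      # echo the data back, just as an example
        await writer.drain()
    except IOError as e:
        # IOError is an alias of OSError; ConnectionResetError, BrokenPipeError,
        # TimeoutError and friends are all subclasses of it.
        print("network error:", e)
    finally:
        writer.close()
        await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()

# asyncio.run(main())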

Multithreaded HTTP GET requests slow down badly after ~900 downloads

I'm attempting to download around 3,000 files (each being maybe 3 MB in size) from Amazon S3 using requests_futures, but the download slows down badly after about 900, and actually starts to run slower than a basic for-loop.
It doesn't appear that I'm running out of memory or CPU bandwidth. It does, however, seem like the Wi-Fi connection on my machine slows to almost nothing: I drop from a few thousand packets/sec to just 3-4. The weirdest part is that I can't load any websites until the Python process exits and I restart my Wi-Fi adapter.
What in the world could be causing this, and how can I go about debugging it?
If it helps, here's my Python code:
import requests
from requests_futures.sessions import FuturesSession
from concurrent.futures import ThreadPoolExecutor, as_completed
# get a nice progress bar
from tqdm import tqdm

def download_threaded(urls, thread_pool, session):
    futures_session = FuturesSession(executor=thread_pool, session=session)
    futures_mapping = {}
    for i, url in enumerate(urls):
        future = futures_session.get(url)
        futures_mapping[future] = i

    results = [None] * len(futures_mapping)

    with tqdm(total=len(futures_mapping), desc="Downloading") as progress:
        for future in as_completed(futures_mapping):
            try:
                response = future.result()
                result = response.text
            except Exception as e:
                result = e
            i = futures_mapping[future]
            results[i] = result
            progress.update()

    return results

s3_paths = []  # some big list of file paths on Amazon S3

def make_s3_url(path):
    return "https://{}.s3.amazonaws.com/{}".format(BUCKET_NAME, path)

urls = map(make_s3_url, s3_paths)

with ThreadPoolExecutor() as thread_pool:
    with requests.session() as session:
        results = download_threaded(urls, thread_pool, session)
Edit with various things I've tried:
time.sleep(0.25) after every future.result() (performance degrades sharply around 900)
4 threads instead of the default 20 (performance degrades more gradually, but still degrades to basically nothing)
1 thread (performance degrades sharply around 900, but recovers intermittently)
ProcessPoolExecutor instead of ThreadPoolExecutor (performance degrades sharply around 900)
calling raise_for_status() to throw an exception whenever the status is greater than 200, then catching this exception by printing it as a warning (no warnings appear)
use ethernet instead of wifi, on a totally different network (no change)
creating futures in a normal requests session instead of using a FuturesSession (this is what I did originally, and found requests_futures while trying to fix the issue)
running the download on only a narrow range of files around the failure point (e.g. file 850 through file 950) -- performance is just fine here, print(response.status_code) shows 200 all the way, and no exceptions are caught.
For what it's worth, I have previously been able to download ~1500 files from S3 in about 4 seconds using a similar method, albeit with files an order of magnitude smaller
Things I will try when I have time today:
Using a for-loop
Using Curl in the shell
Using Curl + Parallel in the shell
Using urllib2
Edit: it looks like the number of threads is stable, but when the performance starts to go bad the number of "Idle Wake Ups" appears to spike from a few hundred to a few thousand. What does that number mean, and can I use it to solve this problem?
Edit 2 from the future: I never ended up figuring out this problem. Instead of doing it all in one application, I just chunked the list of files and ran each chunk with a separate Python invocation in a separate terminal window. Ugly but effective! The cause of the problem will forever be a mystery, but I assume it was some kind of problem deep in the networking stack of my work machine at the time.
This isn't a surprise.
You don't get any parallelism when you have more threads than cores.
You can prove this to yourself by simplifying the problem to a single core with multiple threads.
What happens? You can only have one thread running at a time, so the operating system context switches each thread to give everyone a turn. One thread works, the others sleep until they are woken up in turn to do their bit. In that case you can't do better than a single thread.
You may do worse because context switching and memory allocated for each thread (1MB each) have a price, too.
Read up on Amdahl's Law.
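If you want to test that premise against the question's code, here is a minimal tweak, reusing download_threaded and urls from the question, that caps the pool at the core count instead of the library default:

import os
import requests
from concurrent.futures import ThreadPoolExecutor

# Recent Pythons default to min(32, os.cpu_count() + 4) workers; cap it at
# the core count to see whether over-subscription is part of the problem.
max_workers = os.cpu_count() or 4

with ThreadPoolExecutor(max_workers=max_workers) as thread_pool:
    with requests.session() as session:
        results = download_threaded(urls, thread_pool, session)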
