I'm trying to understand whether requests.post is thread safe. In other words, assuming the server side can (and will) handle multiple requests concurrently, will something like this
def f(i) :
response = requests.post(...)
# do something with the response
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
for i in range(10):
executor.submit(func, i)
work as expected?
this question and the discussion around it suggest that although the requests package claims to be thread-safe the Session() object is not thread safe. Anyway, it's not clear to me whether this has any bearing on my specific case or not (perhaps because I don't have a clear grasp of the relationship between sessions, cookies, and the requests.post operation).
Related
I have a simple rest service which allows you to create task. When a client requests a task - it returns a unique task number and starts executing in a separate thread. The easiest way to implement it
class Executor:
def __init__(self, max_workers=1):
self.executor = ThreadPoolExecutor(max_workers)
def execute(self, body, task_number):
# some logic
pass
def some_rest_method(request):
body = json.loads(request.body)
task_id = generate_task_id()
Executor(max_workers=1).execute(body)
return Response({'taskId': task_id})
Is it a good idea to create each time ThreadPoolExecutor with one (!) workers if i know than one request - is one new task (new thread). Perhaps it is worth putting them in the queue somehow? Maybe the best option is to create a regular stream every time?
Is it a good idea to create each time ThreadPoolExecutor...
No. That completely defeats the purpose of a thread pool. The reason for using a thread pool is so that you don't create and destroy a new thread for every request. Creating and destroying threads is expensive. The idea of a thread pool is that it keeps the "worker thread(s)" alive and re-uses it/them for each next request.
...with just one thread
There's a good use-case for a single-threaded executor, though it probably does not apply to your problem. The use-case is, you need a sequence of tasks to be performed "in the background," but you also need them to be performed sequentially. A single-thread executor will perform the tasks, one after another, in the same order that they were submitted.
Perhaps it is worth putting them in the queue somehow?
You already are putting them in a queue. Every thread pool has a queue of pending tasks. When you submit a task (i.e., executor.execute(...)) that puts the task into the queue.
what's the best way...in my case?
The bones of a simplistic server look something like this (pseudo-code):
POOL = ThreadPoolExecutor(...with however many threads seem appropriate...)
def service():
socket = create_a_socket_that_listens_on_whatever_port()
while True:
client_connection = socket.accept()
POOL.submit(request_handler, connection=connection)
def request_handler(connection):
request = receive_request_from(connection)
reply = generate_reply_based_on(request)
send_reply_to(reply, connection)
connection.close()
def main():
initialize_stuff()
service()
Of course, there are many details that I have left out. I can't design it for you. Especially not in Python. I've written servers like this in other languages, but I'm pretty new to Python.
I am a absolute beginner in python multi threading. My application needs to telnet around 200 servers, execute commands and return the response. I have created separate classes for telnetting and processing the response. I read about GIL and race conditions in threading but not sure whether they will have impact in my code. Because for every thread i am creating a new instance of the class and accessing the method. So technically the threads will not share same resource. Can anyone please explain whether my assumption is right if not please explain the right way of doing it ?
Main method :
if __name__ == "__main__":
thread_list = []
for ip in server_list: # server list contains the IP of hosts
config_object = Configuration () # configuration class has method for telnet device
thread1 = threading.Thread(target=config_object.captureconfigprocess, args=(ip))
thread_list.append(thread1)
for thread in thread_list:
thread.start()
for thread in thread_list:
thread.join()
I read about GIL and race conditions in threading but not sure whether they will have impact in my code
Python does not have real threads. OS will see all python threads as one process and that will require CPU to context switch between instructions sent by python. This will cripple the performance of your code. Although python threads will be more than enough for most of the case, it may or may not be enough for your case. 200 servers may seem too much but it all boils down to how much communication happens between those 200 servers and your python client. To be sure, you have to try. If you want a better solution, use multiprocessing.
So technically the threads will not share same resource.
If each thread is using it's own resourse than shared resourse is not an issue to worry about.
I've been searching for a while now but can't find an answere.
I know that cherrypy creates a new thread for handling requests (GET, PUT, POST, DELETE etc).
Now i fetch the parameters like this:
...
#cherrypy.tools.json_in()
#cherrypy.tools.json_out()
def POST(self):
Forum.lock_post.acquire()
conn = self.io.psqlConnect(self.dict_psql)
cur = conn.cursor(cursor_factory = psycopg2.extras.RealDictCursor)
params = cherrypy.request.json
...
return some_dict
As you can see im locking the thread to avoid race condition on the variable params. But is this really necessary? I'm asking cos if i do it like this all the other requests on POST will have to wait. Is there any better solution without locking the whole POST? I'm using params several times along the code.
First a clarification, CherryPy doesn't create a new thread for each requests, it has a predetermined pool of threads (10 by default), from which indeed one thread can be used to handle a single request at a time.
As for if you should lock cherrypy.request.json. You really don't, there is a concept called "thread locals" on which you can have multiple references to different objects depending on which thread is accessing such object. (python docs).
Having said that... you should make sure that the code that you write doesn't interfere with the state of the other threads (you can use the cherrypy.thread_data as a quick fix).
Take a look into the cherrypy plugin architecture, if you want a resource to be shared among threads usually a plugin is the way to: http://docs.cherrypy.org/en/latest/extend.html#plugins
Please consider a scala.js application which runs in the browser and consists of a main program and a web worker.
The main thread delegates long running operations to the web worker by passing messages that contain the names of methods and the parameters required to invoke them. The worker passes method return values back to the main thread in the form of response messages.
In simpler terms, this program abstracts web worker messaging so that code in the main thread can call methods in the worker thread in idiomatic and asynchronous Scala syntax.
Because web workers do not associate messages with their responses in any way, the abstraction relies on a registry, an intermediary object, that governs each cross context method call to associate the invocation with the result. This singleton could also bind callback functions but is there a way to accomplish this with futures instead of callbacks?
How can I build an abstraction over this registry that allows programmers to use it with the standard asynchronous programming structures in Scala: futures and promises?
How should I write this functionality so that scala programmers can interact with it in the canonical way? For example:
// long running method in the web worker
val f: Future[String] = Registry.ultimateQuestion(42) // async
f onSuccess { case q => println("The ultimate question is: " + q) }
I'm new to futures and promises, but it seems like they usually complete when some execution block terminates. In this case, receiving a response from the web worker signifies completion of the future. Is there a way to write a custom future that delegates its completion status to an external process? Is there another way to link the web worker response message to the status of the future?
Can/Should I extend the Future trait? Is this possible in Scala.js? Is there a concrete class that I should extend? Is there some other way to encapsulate these cross context web worker method calls in existing asynchronous Scala functionality?
Thank you for your consideration.
Hmm. Just spitballing here (I haven't used workers yet), but it seems like associating the request with the Future is fairly easy in the single-threaded JavaScript world you're working in.
Here's a hypothetical design. Say that each request/response to the worker is automatically wrapped in an Envelope; the Envelope contains a RequestId. So the send side looks something like (this is pseudo-code, but real-ish):
def sendRequest[R](msg:Message):Future[R] = {
val promise = Promise[R]
val id = nextRequestId()
val envelope = Envelope(id, msg)
register(id, promise)
sendToWorker(envelope)
promise.future
}
The worker processes msg, wraps the result in another Envelope, and the result gets handled back in the main thread with something like:
def handleResult(resultEnv:Envelope):Unit = {
val promise = findRegistered(resultEnv.id)
val result = resultEnv.msg
promise.success(result)
}
That needs some filling in, and some thought about what the types like R should be, but that sort of outline would probably work decently well. If this was the JVM you'd have to worry about all sorts of race conditions, but in the single-threaded JS world it probably can be as simple as using an autoincrementing integer for the request ID, and storing away the Promise...
I know it sounds heavy weight, but I'm trying to solve an hypothetical situation. Imagine you have N observers of some object. Each one interested in the object state. When applying the Observer Pattern the observable object tends to iterate through its observer list invoking the observer notify()|update() method.
Now imagine that a specific observer has a lot of work to do with the state of the observable object. That will slow down the last notification, for example.
So, in order to avoid slowing down notifications to all observers, one thing we can do is to notify the observer in a separate thread. In order for that to work, I suppose that a thread for each observer is needed. That is a painful overhead we are having in order to avoid the notification slow down caused by heavy work. Worst than slowing down if thread approach is used, is dead threads caused by infinite loops. It would be great reading experienced programmers for this one.
What people with years on design issues think?
Is this a problem without a substancial solution?
Is it a really bad idea? why?
Example
This is a vague example in order to demonstrate and, hopefully, clarify the basic idea that I don't even tested:
class Observable(object):
def __init__(self):
self.queues = {}
def addObserver(self, observer):
if not observer in self.queues:
self.queues[observer] = Queue()
ot = ObserverThread(observer, self.queues[observer])
ot.start()
def removeObserver(self, observer):
if observer in self.queues:
self.queues[observer].put('die')
del self.queues[observer]
def notifyObservers(self, state):
for queue in self.queues.values():
queue.put(state)
class ObserverThread(Thread):
def __init__(self, observer, queue):
self.observer = observer
self.queue = queue
def run(self):
running = True
while running:
state = self.queue.get()
if state == 'die':
running = False
else:
self.observer.stateChanged(state)
You're on the right track.
It is common for each observer to own its own input-queue and its own message handling thread (or better: the queue would own the thread, and the observer would own the queue). See Active object pattern.
There are some pitfalls however:
If you have 100's or 1000's of observers you may need to use a thread pool pattern
Note the you'll lose control over the order in which events are going to be processed (which observer handles the event first). This may be a non-issue, or may open a Pandora box of very-hard-to-detect bugs. It depends on your specific application.
You may have to deal with situations where observers are deleted before notifiers. This can be somewhat tricky to handle correctly.
You'll need to implement messages instead of calling functions. Message generation may require more resources, as you may need to allocate memory, copy objects, etc. You may even want to optimize by implementing a message pool for common message types (you may as well choose to implement a message factory that wrap such pools).
To further optimize, you'll probably like to generate one message and send it to all to observers (instead of generating many copies of the same message). You may need to use some reference counting mechanism for your messages.
Let each observer decide itself if its reaction is heavyweight, and if so, start a thread, or submit a task to a thread pool. Making notification in a separate thread is not a good solution: while freeing the observable object, it limits the processor power for notifications with single thread. If you do not trust your observers, then create a thread pool and for each notification, create a task and submit it to the pool.
In my opinion when you have a large no of Observers for an Observable, which do heavy processing, then the best thing to do is to have a notify() method in Observer.
Use of notify(): Just to set the dirty flag in the Observer to true. So whenever the Observer thread will find it appropriate it will query the Observable for the required updates.
And this would not require heavy processing on Observable side and shift the load to the Observer side.
Now it depends on the Observers when they have to Observe.
The answer of #Pathai is valid in a lot of cases.
One is that you are observing changes in a database. In many ways you can't reconstruct the final state from the snapshots alone, especially if your state is fetched as a complex query from the database, and the snapshot is an update to the database.
To implement it, I'd suggest using an Event object:
class Observer:
def __init__(self):
self.event = threading.Event()
# in observer:
while self.event.wait():
# do something
self.event.clear()
# in observable:
observer.event.set()