Should I create threads beforehand to save time? - multithreading

I am using Python 2.7 with multithreading. If a thread dies, I create a new one
to compensate for it. Should I create a lot of threads beforehand, store them,
and draw from them when one or more existing threads die, or should I create one whenever a thread dies?
Which is more efficient in terms of time?

When you say a thread "dies", do you mean you intentionally terminate it or it fails due to error?
If you're intentionally terminating it and you're worried about the time required to spawn a new thread, why not keep the thread persistent and simply have it do the job that the new thread would have done? This is a pretty standard approach - maintain a pool of "worker" threads and have a work queue with pending items to execute. They all run an identical loop which is to pull an item off the queue and execute it. These items can be objects with methods which contain the code to execute if it's convenient to work that way - if the tasks are all very similar then it might be easier to put the code into the thread's own function instead.
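The worker-pool idea described above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the names `worker` and `run_jobs`, the `None` shutdown sentinel and the default of four workers are all assumptions made for the example.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7, as in the question


def worker(tasks, results):
    """Identical loop for every worker: pull a callable off the queue and run it."""
    while True:
        job = tasks.get()
        if job is None:            # hypothetical sentinel meaning "shut down"
            tasks.task_done()
            break
        try:
            results.put(job())
        except Exception as exc:   # a failing job must not kill the worker
            results.put(exc)
        finally:
            tasks.task_done()


def run_jobs(jobs, num_workers=4):
    """Run the callables in the list `jobs` on a fixed pool; results are unordered."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:              # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return [results.get() for _ in range(len(jobs))]
```

Because a job that raises is caught inside the loop, the worker survives the error, so there is nothing to "replace" when a task fails.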
If you're talking about threads failing due to error, I wouldn't have imagined this was common enough to worry about it. If it is, you probably need to look at making your code more robust.
In either case, spawning a thread on most systems should be a lightweight activity - a lot more lightweight than spawning a whole new process, for example. As a result, I really wouldn't worry about keeping a pool of threads in reserve to use - that really sounds like early optimisation to me.
Even if spawning threads were slow, consider what you would be doing by spawning threads in advance - you would be taking up more memory (some memory in the OS to keep track of the thread, some in Python for the objects that it uses to track the thread), although not a great deal; you'd also be spending more time at the start of your program creating all these threads. So, you might save a little time while you were running, but instead your program takes significantly longer to start. That doesn't sound like a sensible trade-off to me unless the speed and latency of your code is absolutely critical while it's running, and if speed is that critical then I'm not sure a pure Python solution is the right approach anyway. Something like C/C++ is going to give you better control of scheduling, at the expense of much more complexity.
In summary: seriously, don't worry about it, just spawn threads as you need them. Trust me, there will be much bigger speed problems elsewhere in your code which are much more deserving of your time.
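If you would rather measure than take this on trust, something like the following gives a rough figure for your own machine. It is only a sketch: `average_spawn_time` is a made-up helper, and the number it returns will vary with the OS, the Python version and system load.

```python
import threading
import time


def average_spawn_time(count=200):
    """Return the mean seconds needed to start and join one no-op thread."""
    start = time.time()
    for _ in range(count):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.time() - start) / count


if __name__ == "__main__":
    print("average spawn + join: %.6f s" % average_spawn_time())
```

Compare the result with how long one of your real tasks takes; if the task is orders of magnitude longer, pre-creating threads buys you nothing noticeable.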

Related

Will a waiting thread still eat up cpu time?

I'm trying to make a thread pool for a game engine and I've been considering how my system should react to third party libraries spawning their own threads.
From what I've read, it is ideal to only have one thread for each CPU you have access to. So if my third party physics update spawns four threads, it would be ideal to turn off four threads from my thread pool while it is running, then turn them back on afterwards, that way multiple threads are never contending over one CPU.
My question is about the underlying mechanics behind functionality like condition variables. Since spawning threads is expensive, having four threads wait on a condition variable and then notifying them when the physics is done seems like a much better option than joining four threads and re-spawning them afterwards. But if they are waiting on a variable, are the threads truly "asleep" or are they still contending for CPU resources in the background?
Although you did not write what platform you are programming on, in most implementations threads that are waiting consume little to no CPU resources.
They do, however, use some memory (to save the stack, etc.), so you should avoid spawning an excessive number of threads and try to reuse them as much as possible, since, as you noted, spawning a new thread is an expensive operation on most platforms.
Even though you did not provide a lot of information, I'm guessing that in your scenario letting the threads wait is a much better option, as a small number of threads will not use a lot of resources and possibly having to spawn new threads frequently will affect performance badly on almost all platforms.
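As a rough illustration of "letting the threads wait", here is a minimal Python sketch in which a few threads park on a condition variable until they are told to resume. The class name `ParkedWorkers` and its attributes are invented for the example; a game engine would do the equivalent with its own platform's condition variable.

```python
import threading


class ParkedWorkers(object):
    """Threads that wait on a condition variable until resume() is called."""

    def __init__(self, count):
        self.cond = threading.Condition()
        self.paused = True
        self.woken = 0
        self.threads = [threading.Thread(target=self._run) for _ in range(count)]
        for t in self.threads:
            t.start()

    def _run(self):
        with self.cond:
            while self.paused:       # re-check the flag after every wakeup
                self.cond.wait()     # releases the lock and blocks until notified
            self.woken += 1          # stand-in for "go back to doing real work"

    def resume(self):
        with self.cond:
            self.paused = False
            self.cond.notify_all()   # wake every parked thread at once

    def join(self):
        for t in self.threads:
            t.join()
```

While the threads sit in `wait()` they are blocked rather than polling, which is the "asleep" behaviour the question asks about.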
Any good third party library should give you the option of running its work through your thread pool, to avoid that problem in the first place.
For example here's the documentation on how you can do that with PhysX - https://developer.nvidia.com/sites/default/files/akamai/physx/Docs/TaskManager.html

Can code running in a background thread be faster than in the main VCL thread in Delphi?

If anybody has had a lot of experience timing code running on the main VCL thread vs a background thread, I'd like to get an opinion. I have some code that does some heavy string processing running in my Delphi 6 application on the main thread. Each time I run an operation, the time for each operation hovers around 50 ms on a single thread on my i5 Quad core. What makes me really suspicious is that the same code running on an old Pentium 4 that I have, shows the same time for the operation when usually I see code running about 4 times slower on the Pentium 4 than the Quad Core. I am beginning to wonder if the code might be consuming significantly less time than 50 ms but that there's something about the main VCL thread, perhaps Windows message handling or executing Windows API calls, that is creating an artificial "floor" for the operation. Note, an operation is triggered by an incoming request on a socket if that matters, but the time measurement does not take place until the data is fully received.
Before I undertake the work of moving all the code on to a background thread for testing, I am wondering if anyone has any general knowledge in this area? What have your experiences been with code running on and off the main VCL thread? Note, the timing measurements are being done when there is absolutely no user triggered activity going on during the tests.
I'm also wondering if raising the priority of the thread to just below real-time would do any good. I've never seen much improvement in my run times when experimenting with those flags.
-- roschler
Given all threads have the same priority, as they normally do, there can't be a difference, for the following reasons. If you're seeing a difference, re-evaluate the code (make sure you run the same thing in both VCL and background threads) and make sure you time it properly:
The compiler generates the exact same code, it doesn't care if the code is going to run in the main thread or in a background thread. In fact you can put the whole code in a procedure and call that from both your worker thread's Execute() and from the main VCL thread.
For the CPU, all cores and all threads are equal, unless it's actually a Hyper-Threading CPU, where not all cores are real; but then see the next point.
Even if not all CPU cores are equal, your thread is very unlikely to stay on the same core: the operating system is free to move it around at will (and does actually schedule your thread to run on different cores at different times).
Messaging overhead doesn't matter for the main VCL thread, because unless you're calling Application.ProcessMessages() manually, the message pump is simply stopped while your procedure does its work. The message pump is passive: your thread needs to request messages from the queue, but since the thread is busy doing your work, it's not requesting any messages, so there is no overhead there.
There's just one place where threads are not equal, and this can change the perceived speed of execution: It's the operating system that schedules threads to execution units (cores), and for the operating system threads have different priorities. You can tell the OS a certain thread needs to be treated differently using the SetThreadPriority() API (which is used by the TThread.Priority property).
Without simple source code to reproduce the issue, and how you are timing your threads, it will be difficult to understand what occurs in your software.
Sounds definitely like one of:
An Architecture issue - how are your threads defined?
A measurement issue - how are you timing your threads?
A typical scaling issue of both the memory manager and the RTL string-related implementation.
About the last point, consider this:
The current memory manager (FastMM4) does not scale well on multi-core CPUs; try a per-thread memory manager, like our experimental SynScaleMM - note e.g. that the Free Pascal Compiler team has recently written a new scaling MM from scratch to avoid this issue;
Try changing the string processing implementation to avoid memory allocation (use static buffers) and string reference counting (every string reference-counting access produces a LOCK DEC/INC, which does not scale well on multi-core CPUs - use per-thread char-level processing, e.g. PChar on static buffers instead of string).
I'm sure that without string operations, you'll find that all threads are equivalent.
In short: neither the current Delphi MM nor the current string implementation scales well on multi-core CPUs. You just ran into a known issue of the current RTL. Read this SO question.
When your code has control of the VCL thread, for instance if it is in one method and doesn't call out to any VCL controls or call Application.ProcessMessages, then the run time will not be affected just because it's in the main VCL thread.
There is no overhead, since you "own" the whole processing power of the thread when you are in your own code.
I would suggest that you use a profiling tool to find where the actual bottleneck is.
Performance can't be assessed statically. For that you need to get AQTime, or some other performance profiler for Delphi. I use AQTime, and I love it, but I'm aware it's considered expensive.
Your code will not magically get faster just because you moved it to a background thread. If anything, your all-inclusive-time until you see results in your UI might get a little slower, if you have to send a lot of data from the background thread to the foreground thread via some synchronization mechanisms.
If however you could execute parts of your algorithm in parallel, that is, split your work so that you have 2 or more worker threads processing your data, and you have a quad core processor, then your total time to do a fixed load of work, could decrease. That doesn't mean the code would run any faster, but depending on a lot of factors, you might achieve a slight benefit from multithreading, up to the number of cores in your computer. It's never ever going to be a 2x performance boost, to use two threads instead of one, but you might get 20%-40% better performance, in your more-than-one-threaded parallel solutions, depending on how scalable your heap is under multithreaded loads, and how IO/memory/cache bound your workload is.
As for raising thread priorities, generally all you will do there is upset the delicate balance of your Windows system's performance. By raising the priorities you will achieve (sometimes) a nominal, but unrepeatable and non-guaranteeable increase in performance. Depending on the other things you do in your code, and your data sources, playing with priorities of threads can introduce subtle problems. See Dining Philosophers problem for more.
Your best bet for optimizing the speed of string operations is to first test it and find out exactly where it is using most of its time. Is it heap operations? Memory copy and move operations? Without a profiler, even with advice from other people, you will still be committing a cardinal sin of programming: premature optimization. Be results oriented. Be science based. Measure. Understand. Then decide.
Having said that, I've seen a lot of horrible code in my time, and there is one thing people do that totally kills their threaded app performance: using TThread.Synchronize too much.
Here's a pathological (Extreme) case, that sadly, occurs in the wild fairly frequently:
procedure TMyThread.Execute;
begin
  while not Terminated do
    Synchronize(DoWork);
end;
The problem here is that 100% of the work is really done in the foreground, other than the "while not Terminated" check, which executes in the thread context. To make the above code even worse, add a non-interruptible sleep.
For fast background thread code, use Synchronize sparingly or not at all, and make sure the code it calls is simple and executes quickly, or better yet, use TThread.Queue or PostMessage if you could really live with queueing main thread activity.
Are you asking if a background thread would be faster? If your background thread would run the same code as the main thread and there's nothing else going on in the main thread, you don't stand to gain anything with a background thread. Threads should be used to split and distribute processing loads that would otherwise contend with one another and/or block one another when running in the main thread. Since you seem to be dealing with a case where your main thread is otherwise idle, simply spawning a thread to run slow code will not help.
Threads aren't magic, they can't speed up slow code or eliminate processing bottlenecks in a particular segment not related to contention on the main thread. Make sure your code isn't doing something you don't know about and that your timing methodology is correct.
My first hunch would be that your interaction with the socket is affecting your timing in a way you haven't detected... (I know you said you're sure that's not involved - but maybe check again...)

Is firing off a Thread a valid answer to simplifying code?

As multi-processor and multi-core computers become more and more ubiquitous, is simply firing off a new thread a (relatively) simple and painless way of simplifying code? For instance, in a current personal project, I have a network server listening on a port. Since this is just a personal project, it's just a desktop app, with a GUI integrated into it for configuration. So, the app reads something like this:
Main()
    Read configuration
    Start listener thread
    Run GUI
Listener Thread
    While the app is running
        Wait for a new connection
        Run a client thread for the new connection
Client Thread
    Write synchronously
    Read synchronously
    ad infinitum, or till they disconnect
This approach means that while I have to worry about a lot of locking, with the potential issues that involves, I avoid a lot of spaghetti code from asynchronous calls, etc.
A slightly more insidious version of this came up today when I was working on the startup code. The startup was quick, but it was using lazy loading for a lot of the configuration, which meant that while startup was quick, actually connecting to and using the service was difficult because of the lag while it loaded different sections (this was actually measurable in real time, up to 3-10 seconds sometimes). So I moved to a different strategy: on startup, loop through everything and force the lazy loading to kick in... but this made it start prohibitively slow; get up, go get a coffee slow. Final solution: throw the loop into a separate thread with feedback in the system tray while it's still loading.
Is this "Meh, throw it in another thread, it'll be fine" attitude ok? At what point do you start getting diminishing returns and/or even reduced performance?
Multithreading does a lot of things, but I don't think "simplification" is ever one of them.
It's a great way to introduce bugs into code.
Using multiple threads properly is not easy. It should not be attempted by new developers.
In my opinion, multi-threaded programming is pretty high up on the difficulty (and complexity) scale, along with memory management. To me, the "Meh, throw it in another thread, it'll be fine" attitude is a bit too casual. Think long and hard you must, before forking threads you do.
No.
Plainly and simply, multithreading increases complexity and is a nearly trivial way to add bugs to code. There are concurrency issues such as synchronization, deadlock, race conditions, and priority inversion to name a few.
Secondly, the performance gains are not automatic. Recently, there was an excellent article in MSDN Magazine along these lines. The salient details are that a certain operation was taking 46 seconds per ten iterations coded as a single-threaded operation. The author parallelized the operation naively (one thread per four cores) and the operation dropped to 30 seconds per ten iterations. Sounds great until you take into consideration that the operation now eats 300% more processing power but only experienced a 34% gain in efficiency. It's not worth consuming all available processing power for a gain like that.
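To make the arithmetic behind those figures explicit, here is a small worked example using only the 46 s, 30 s and four-core numbers quoted above. The helper name `parallel_stats` is made up for illustration.

```python
def parallel_stats(serial_seconds, parallel_seconds, cores):
    """Return (speedup, fraction of wall-clock time saved, per-core efficiency)."""
    speedup = serial_seconds / float(parallel_seconds)
    time_saved = 1.0 - parallel_seconds / float(serial_seconds)
    efficiency = speedup / cores
    return speedup, time_saved, efficiency


speedup, time_saved, efficiency = parallel_stats(46, 30, 4)
print("speedup %.2fx, time saved %.0f%%, efficiency %.0f%%"
      % (speedup, time_saved * 100, efficiency * 100))
```

That works out to a speedup of roughly 1.53x, about 35% less wall-clock time (the 34% quoted above), and only around 38% efficiency per core, which is the point being made: four cores were kept busy to get about one and a half cores' worth of progress.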
This gives you the extra job of debugging race conditions, and handling locks and synchronisation issues.
I would not use this unless there was a real need.
Read up on Amdahl's law, best summarized by "The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program."
As it turns out, if only a small part of your app can run in parallel you won't get much gains, but potentially many hard-to-debug bugs.
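Amdahl's law is easy to play with numerically. The function below is just the textbook formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the program that can run in parallel and n is the number of processors; the function name is my own.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / float(processors))


if __name__ == "__main__":
    for n in (2, 4, 8, 1000):
        print("%4d processors: %.2fx" % (n, amdahl_speedup(0.5, n)))
```

With half of the program parallelizable, four processors give at most 1.6x, and no number of processors can ever reach 2x, because the sequential half still has to run.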
I don't mean to be flip but what's in that configuration file that it takes so long to load? That's the origin of your problem, right?
Before spawning another thread to handle it, perhaps it can be pared down? Reduced, perhaps put in another data format that would be quicker, etc?
How often does it change? Is it something you can parse once at the beginning of the day and put the variables in shared memory so subsequent runs of your main program can just attach and get the needed values from there?
While I agree with everyone else here in saying that multithreading does not simplify code, it can be used to greatly simplify the user experience of your application.
Consider an application that has a lot of interactive widgets (I am currently developing one where this helps) - in the workflow of my application, a user can "build" the current project they are working on. This requires disabling the interactive widgets my application presents to the user and presenting a dialog with an indeterminate progress bar and a friendly "please wait" message.
The "build" occurs on a background thread; if it were to happen on the UI thread it would make the user experience less enjoyable - after all, it's no fun not being able to tell whether or not you are able to click on a widget in an application while a background task is running (cough, Visual Studio). Not to say that VS doesn't use background threads, I'm just saying their user experience could use some improvement. But I digress.
The one thing I take issue with in the title of your post is that you think of firing off threads when you need to perform tasks - I generally prefer to reuse threads - in .NET, I generally favor using the system thread pool over creating a new thread each time I want to do something, for the sake of performance.
I'm going to provide some balance against the unanimous "no".
DISCLAIMER: Yes, threads are complicated and can cause a whole bunch of problems. Everyone else has pointed this out.
From experience, a sequence of blocking reads/writes to a socket (which requires a separate thread) is much simpler than non-blocking ones. With blocking calls, you can tell the state of the connection just by looking at where you are in the function. With non-blocking calls, you need a bunch of variables to record the state of the connection, and check and modify them every time you interact with the connection. With blocking calls, you can just say "read the next X bytes" or "read until you find X" and it will actually do it (or fail). With non-blocking calls, you have to deal with fragmented data, which usually requires keeping temporary buffers and filling them as necessary. You also end up checking if you've received enough data every time you receive a little more. Plus you have to keep a list of open connections and handle unexpected closes for all of them.
It doesn't get much simpler than this:
void WorkerThreadMain(Connection connection) {
    Request request = ReadRequest(connection);
    if(!request) return;
    Reply reply = ProcessRequest(request);
    if(!connection.isOpen) return;
    SendReply(reply, connection);
    connection.close();
}
I'd like to note that this "listener spawns off a worker thread per connection" pattern is how web servers are designed, and I assume it's how a lot of request/response sort of server applications are designed.
So in conclusion, I have experienced the asynchronous socket spaghetti code you mentioned, and spawning off worker threads for every connection ended up being a good solution. Having said all this, throwing threads at a problem should usually be your last resort.
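For a concrete feel of the "listener spawns a worker thread per connection" pattern, here is a minimal Python sketch. It is an illustration only: `handle_client` and `serve` are names I made up, and the "protocol" (reply with the upper-cased bytes) is a placeholder for real request handling.

```python
import socket
import threading


def handle_client(conn):
    """Blocking loop for one client; connection state is just 'where we are' here."""
    try:
        while True:
            data = conn.recv(4096)        # blocks until data arrives
            if not data:                  # empty read: the peer disconnected
                break
            conn.sendall(data.upper())    # blocks until the reply is sent
    finally:
        conn.close()


def serve(listener):
    """Accept connections and hand each one to its own thread."""
    while True:
        try:
            conn, _addr = listener.accept()
        except socket.error:              # accept failed, e.g. listener closed
            break
        t = threading.Thread(target=handle_client, args=(conn,))
        t.daemon = True                   # do not keep the process alive
        t.start()
```

Note that there is no per-connection state table and no buffer bookkeeping: each thread's local variables and its position in the loop are the state.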
I think you have no choice but to deal with threads, especially with networking and concurrent connections. Do threads make code simpler? I don't think so. But without them how would you program a server that can handle more than 1 client at the same time?

Which is the better method? Allowing the thread to sleep for a while or deleting it and recreating it later?

We have a process that needs to run every two hours. It's a process that needs to run on its own thread so as to not interrupt normal processing.
When it runs, it will download 100k records and verify them against a database. The framework to run this has a lot of objects managing this process. These objects only need to be around when the process is running.
What's a better standard?
Keep the thread in wait mode by letting it sleep until I need it again. Or,
Delete it when it is done and create it the next time I need it? (System Timer Events.)
There is not that much difference between the two solutions. I tend to prefer the one where the thread is created each time.
Having a thread lying around consumes resources (memory at least). In a garbage collected language, it may be easy to have some object retained in this thread, thus using even more memory. If you do not keep the thread lying around, all resources are freed and made available to the main process for those two hours.
When you want to stop your whole process, where your thread may be executing or not, you need to interrupt the thread cleanly. It is always difficult to interrupt a thread or knowing if it is sleeping or working. You may have some race conditions there. Having the thread started on demand relieves you from those potential problems: you know if you started the thread and in that case calling thread_join makes you wait until the thread is done.
For those reasons, I would go for the thread on demand solution, even though the other one has no insurmountable problems.
Starting one thread every two hours is very cheap, so I would go with that.
However, if there is a chance that at some time in the future the processing could take more than the run interval, you probably want to keep the thread alive. That way, you won't be creating a second thread that will start processing the records while the first is still running, possibly corrupting data or processing records twice.
Either should be fine but I would lean towards keeping the thread around for cases where the verification takes longer than expected (ex: slow network links or slow database response).
How would you remember to start a new thread when the two hours are up? With a timer? (That's on another thread!) With another thread that sleeps until the specified time? Shutting down the thread and restarting it based on something running somewhere else does you no good if the something else is either on its own separate thread, or blocks the main app while it's waiting to "Create" the worker thread when the two hours are up, no?
Just let the Thread sleep...
I agree with Vilx that it's mostly a matter of taste. There is processing and memory overhead of both methods, but probably not enough for either to matter.
If you are using Java you could check the Timer class. It allows you to schedule tasks at a given time.
Also, if you need more control you can use the Quartz library.
I guess actually putting the thread to sleep is most effective; ending it and recreating it would "cost" some resources, while putting it to sleep would just take up a little space in the scheduler, and its data could be paged out by the operating system if needed.
But anyway it's probably not a very big difference, and the difference would probably depend on how good the OS's scheduler is, etc...
It really depends on one thing as I can tell... state.
If the thread creates a lot of state (allocates memory) that is useful to have during the next iteration of the thread run, then I would keep it around. That way, your process can potentially optimize its run by only performing certain operations if certain things changed since the last running.
However, if the state that the process creates is significant compared with the amount of work to be done, and you are short on resources on the machine, then it may not be worth the cost of keeping the state around in between executions. If that's the case, then you should recreate the thread from scratch each time.
I think it's just a matter of taste. Both are good. Use the one which you find easier to implement. :)
I would create the thread a single time, and use events/condition variables to let it sleep until signaled to wake up again. That way if the amount of time needed ever has to change, you only need change the timing in firing the event and your code will still be pretty clean.
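A rough sketch of that "one long-lived thread that sleeps until signaled" approach, written in Python for brevity. The class `PeriodicWorker` and its method names are invented for the example; on other platforms the event would be whatever event or condition-variable primitive is available.

```python
import threading


class PeriodicWorker(object):
    """One long-lived thread that runs `task` on a timeout or when triggered."""

    def __init__(self, interval_seconds, task):
        self.interval = interval_seconds
        self.task = task
        self.wake = threading.Event()
        self.shutdown = False
        self.thread = threading.Thread(target=self._loop)
        self.thread.start()

    def _loop(self):
        while True:
            self.wake.wait(self.interval)   # sleep until timeout or a signal
            self.wake.clear()
            if self.shutdown:
                break
            self.task()                     # runs can never overlap: one thread

    def trigger(self):
        """Ask for a run right now instead of waiting for the timeout."""
        self.wake.set()

    def stop(self):
        self.shutdown = True
        self.wake.set()
        self.thread.join()
```

Usage would be along the lines of `PeriodicWorker(2 * 60 * 60, verify_records)`, where `verify_records` is a hypothetical function doing the download and verification. Changing the schedule later only means changing the interval or who calls `trigger()`.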
I wouldn't think it's very important, but the best approach is very platform dependent.
A .NET System.Threading.Timer costs nothing while it's waiting, and will invoke your code on a pool thread. In theory, that would be the best of both your suggestions.
Another important thing to consider if you are on a garbage collected system like Java is that anything strongly referenced by a sleeping thread is not garbage. In that respect, it's better to kill idle threads, and let them, and any objects they reference, get cleaned up.
It all depends, of course. But by default I would go with a separate process (not thread) started on demand.

Threads or asynch?

How do you make your application multithreaded?
Do you use async functions?
Or do you spawn a new thread?
I think that async functions already spawn a thread, so if your job is just doing some file reading, being lazy and spawning your job on a thread would just "waste" resources...
So is there some kind of design guideline for when to use threads or async functions?
If you are talking about .Net, then don't forget the ThreadPool. The thread pool is also what async functions often use. Spawning too many threads can actually hurt your performance. A thread pool is designed to spawn just enough threads to do the work the fastest. So do use a thread pool instead of spawning your own threads, unless the thread pool doesn't meet your needs.
PS: And keep an eye out on the Parallel Extensions from Microsoft
Spawning threads is only going to waste resources if you start spawning tons of them; one or two extra threads aren't going to affect the platform's performance. In fact, System currently has over 70 threads for me, and msn is using 32 (I really have no idea how a messenger can use that many threads, especially when it's minimised and not really doing anything...)
Usually a good time to spawn a thread is when something will take a long time, but you need to keep doing something else.
E.g. say a calculation will take 30 seconds. The best thing to do is spawn a new thread for the calculation, so that you can continue to update the screen and handle any user input, because users will hate it if your app freezes until it's finished doing the calculation.
On the other hand, creating threads to do something that can be done almost instantly is nearly pointless, since the overhead of creating (or even just passing work to an existing thread using a thread pool) will be higher than just doing the job in the first place.
Sometimes you can break your app into a couple of separate parts which run in their own threads. For example in games the updates/physics etc may be one thread, while graphics are another, sound/music is a third, and networking is another. The problem here is you really have to think about how these parts will interact or else you may have worse performance, bugs that happen seemingly "randomly", or it may even deadlock.
I'll second Fire Lancer's answer - creating your own threads is an excellent way to process big tasks or to handle a task that would otherwise be "blocking" to the rest of a synchronous app, but you have to have a clear understanding of the problem that you must solve and develop in a way that clearly defines the task of a thread, and limits the scope of what it does.
For an example I recently worked on - a Java console app runs periodically to capture data by essentially screen-scraping urls, parsing the document with DOM, extracting data and storing it in a database.
As a single threaded application, it, as you would expect, took an age, averaging around 1 url a second for a 50kb page. Not too bad, but when you scale out to needing to process thousands of urls in a batch, it's no good.
Profiling the app showed that most of the time the active thread was idle - it was waiting for I/O operations - opening of a socket to the remote URL, opening a connection to the database etc. It's this sort of situation that can easily be improved with multithreading. Rewriting to be multi-threaded and with just 5 threads instead of one, even on a single core cpu, gave an increase in throughput of over 20 times.
In this example, each "worker" thread was explicitly limited in what it did - open the remote url, parse the data, store it in the db. All the "high level" processing - generating the list of urls to parse, working out which comes next, handling errors - remained under the control of the main thread.
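The same shape can be sketched in a few lines of Python 3 (the original was a Java app, so this is an analogy, not the actual code). `fetch` is a stand-in that only sleeps to simulate waiting on the network, and `fetch_all` and the five-worker default are assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Placeholder for download + parse + store; the sleep simulates I/O wait."""
    time.sleep(0.05)
    return len(url)


def fetch_all(urls, workers=5):
    """The main thread owns the URL list; workers only do the narrow fetch step."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))   # results come back in input order
```

Because each worker spends nearly all of its time blocked on (simulated) I/O, several of them can overlap their waits even on a single core, which is where the throughput gain in this kind of workload comes from.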
The use of threads makes you think more about the way your application needs threading and can in the long run make it easier to improve / control your performance.
Async methods are faster to use but they are a bit magic - a lot of things happen to make them possible - so it's probable that at some point you will need something that they can't give you. Then you can try and roll some custom threading code.
It all depends on your needs.
The answer is "it depends".
It depends on what you're trying to achieve. I'm going to assume that you're aiming for more performance.
The simplest solution is to find another way to improve your performance. Run a profiler. Look for hot spots. Reduce unnecessary IO.
The next solution is to break your program into multiple processes, each of which can run in their own address space. This is easiest because there is no chance of the individual processes messing each other up.
The next solution is to use threads. At this point you're opening a major can of worms, so start small, and only multi-thread the critical path of the code.
The next solution is to use async IO. Generally only recommended for people writing some sort of very heavily loaded server, and even then I would rather re-use one of the existing frameworks that abstract away the details, e.g. the C++ framework ICE, or an EJB server under Java.
Note that each of these solutions has multiple sub-solutions - there are different breeds of threads and different kinds of asynch IO, each with slightly different performance characteristics, but again, it's generally best to let the framework handle it for you.

Resources