I wrote a multithreaded download application, and now I need to show the progress of each download thread, as IDM does. As data is downloaded, the progress bar is notified, and each thread's segment of the progress bar has to start from a specified position. My questions are:
How can I increment the progress position according to the downloaded data? This is simple in a single-threaded application using IDHTTPWORK; can I use the same method in a multithreaded application, or is there another simple method?
Do I need to synchronise the instructions that increment the position?
Suppose you have N downloads, of known size M[i] bytes. Before you start downloading, sum these values to get the total number of bytes to be downloaded, M.
While the threads are working they keep track of how many bytes have been downloaded so far, m[i] say. Then, at any point in time the proportion of the task that is complete is:
Sum(m[i]) / M
You can update the progress from the main thread using a timer. Each time the timer fires, calculate the sum of the m[i] counts. There's no need for synchronisation here so long as the m[i] values are aligned. Any data races are benign.
Now, m[i] might not be stored in an array. You might have an array of download thread objects, where each object stores all the information relating to its download, including m[i].
Alternatively you can use the same sort of synchronized updating as you do for single-threaded code. Remove the timer and update from the main thread when you get new progress information. However, with a lot of threads there is a lot of synchronization, and that can potentially lead to contention. The lock-free approach above would be my preference, even though it involves polling on the timer.
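For illustration, here is a minimal sketch of the polling approach in Python rather than Delphi. The names Download and overall_progress are made up for this example, and the "download" is simulated by a loop. Each worker is the only writer of its own counter, and the reader just sums whatever values it sees, so no lock is taken (this relies on integer reads being safe in CPython, which is an assumption of the sketch).

```python
import threading

class Download:
    """Hypothetical per-thread download object holding M[i] and m[i]."""
    def __init__(self, size):
        self.size = size   # M[i]: known total size in bytes
        self.done = 0      # m[i]: written only by this download's worker thread

    def run(self, chunk=1024):
        # Stand-in for the real download loop: pretend a chunk arrived each pass.
        while self.done < self.size:
            self.done += min(chunk, self.size - self.done)

def overall_progress(downloads):
    """Sum(m[i]) / M, read without locking; call this from the UI timer."""
    total = sum(d.size for d in downloads)
    return sum(d.done for d in downloads) / total if total else 1.0

downloads = [Download(10_000), Download(25_000), Download(5_000)]
progress_before = overall_progress(downloads)

threads = [threading.Thread(target=d.run) for d in downloads]
for t in threads:
    t.start()
for t in threads:
    t.join()

progress_after = overall_progress(downloads)
```

In a real application the timer handler would call overall_progress while the threads are still running and set the progress bar position from the result.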
You can take a look at the subclassed MFC list controls developed in the article by Michael Dunn 15 years ago: Articles/79/Neat-Stuff-to-Do-in-List-Controls-Using-Custom-Dra on codeproject dot com.
If you implement one of them, say, CXListCtrl* pListCtrl, at thread creation time, then the progress reporting of that thread becomes as simple as making calls such as:
pListCtrl->SetProgress(mItem,0);
when it's time to start showing progress, and
pListCtrl->SetProgress(mItem,0, i);
when you're i% done.
Actually, if you just want the progress bar functionality and don't care about all that's under the hood, you could obtain and use without modification (or license issues) the class XListCtrl.cpp in the Work Queue article at Articles/3607/Work-Queue on that same site.
I recently came across a question about multithreading. The situation: there is a variable number of cars constantly changing their locations, and multiple users posting requests to get the location of any car at any moment. What data structure would handle this situation, and why?
You could use a mutex (one per car).
Lock: before changing location of the associated car
Unlock: after changing location of the associated car
Lock: before getting location of the associated car
Unlock: after done doing work that relies on that location being up to date
I'd answer with:
Try to make threading an external concept to your system yet make the system as modular and encapsulated as possible at the same time. It will allow adding concurrency at later phase at low cost and in case the solution happens to work nicely in a single thread (say by making it event-loop-based) no time will have been burnt for nothing.
There are several ways to do this. Which way you choose depends a lot on the number of cars, the frequency of updates and position requests, the expected response time, and how accurate (up to date) you want the position reports to be.
The easiest way to handle this is with a simple mutex (lock) that allows only one thread at a time to access the data structure. Assuming you're using a dictionary or hash map, your code would look something like this:
Map Cars = new Map(...)
Mutex CarsMutex = new Mutex(...)
Location GetLocation(carKey)
{
acquire mutex
result = Cars[carKey].Location
release mutex
return result
}
You'd do that for Add, Remove, Update, etc. Any method that reads or updates the data structure would require that you acquire the mutex.
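As a rough sketch of the single-mutex version, here is how it might look in Python. The class name CarLocations and its method names are made up for illustration; the only real API used is threading.Lock.

```python
import threading

class CarLocations:
    """Dictionary of car locations guarded by one mutex (hypothetical class)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._cars = {}

    def update(self, car_key, location):
        with self._lock:              # acquire mutex, released on exit
            self._cars[car_key] = location

    def get_location(self, car_key):
        with self._lock:
            return self._cars.get(car_key)

    def remove(self, car_key):
        with self._lock:
            self._cars.pop(car_key, None)

cars = CarLocations()

def drive(car_key, steps):
    # Each car thread keeps overwriting its own position.
    for i in range(steps):
        cars.update(car_key, (i, i))

threads = [threading.Thread(target=drive, args=(k, 100)) for k in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every reader and writer goes through the same lock, which is simple and correct but serialises all access.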
If the number of queries far outweighs the number of updates, then you can do better with a reader/writer lock instead of a mutex. With an RW lock, you can have an unlimited number of readers, OR you can have a single writer. With that, querying the data would be:
acquire reader lock
result = Cars[carKey].Location
release reader lock
return result
And Add, Update, and Remove would be:
acquire writer lock
do update
release writer lock
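Python's standard library has no reader/writer lock, so the following is only a small hand-rolled sketch built on threading.Condition to show the idea. The class ReadWriteLock and its method names are invented for this example. It favours readers, so a steady stream of readers could starve a writer; a production lock would need a fairness policy.

```python
import threading

class ReadWriteLock:
    """Minimal reader-preferring RW lock (illustrative, not production grade)."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of active readers
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()   # wake a waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()       # wake readers and writers

rw = ReadWriteLock()
cars = {}

def update_location(car_key, location):
    rw.acquire_write()
    try:
        cars[car_key] = location
    finally:
        rw.release_write()

def get_location(car_key):
    rw.acquire_read()
    try:
        return cars.get(car_key)
    finally:
        rw.release_read()

update_location("car-1", (51.5, -0.1))
```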
Many runtime libraries have a concurrent dictionary data structure already built in. .NET, for example, has ConcurrentDictionary. With those, you don't have to worry about explicitly synchronizing access with a Mutex or RW lock; the data structure handles synchronization for you, either with a technique similar to that shown above, or by implementing lock-free algorithms.
As mentioned in comments, a relational database can handle this type of thing quite easily and can scale to a very large number of requests. Modern relational databases, properly constructed and with sufficient hardware, are surprisingly fast and can handle huge amounts of data with very high throughput.
There are other, more involved, methods that can increase throughput in some situations depending on what you're trying to optimize. For example, if you're willing to have some latency in reported position, then you could have position requests served from a list that's updated once per minute (or once every five minutes). So position requests are fulfilled immediately with no lock required from a static copy of the list that's updated once per minute. Updates are queued and once per minute a new list is created by applying the updates to the old list, and the new list is made available for requests.
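The snapshot idea from the last paragraph could be sketched like this in Python. SnapshotLocations and its methods are hypothetical names. The sketch assumes that replacing an attribute reference is atomic, which holds in CPython, and that publish() is called by a single background thread or timer (once per minute in the description above).

```python
import queue

class SnapshotLocations:
    """Readers use an immutable snapshot; updates are queued and applied in batches."""
    def __init__(self):
        self._snapshot = {}             # never mutated after it is published
        self._pending = queue.Queue()   # thread-safe queue of (car, location)

    def update(self, car_key, location):
        self._pending.put((car_key, location))

    def get_location(self, car_key):
        # No lock: reads whichever snapshot is currently published.
        return self._snapshot.get(car_key)

    def publish(self):
        # Build a new dict from the old one plus queued updates, then swap it in.
        new_snapshot = dict(self._snapshot)
        while True:
            try:
                car_key, location = self._pending.get_nowait()
            except queue.Empty:
                break
            new_snapshot[car_key] = location
        self._snapshot = new_snapshot

locations = SnapshotLocations()
locations.update("car-1", (10, 20))
stale = locations.get_location("car-1")   # update not visible until publish()
locations.publish()
fresh = locations.get_location("car-1")
```

The trade-off is the one stated above: reads are never blocked, but a reported position can be as old as the publish interval.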
There are many different ways to solve your problem.
I'm working on an application that processes tab-separated text files containing item details. The files can be large, reaching one or two million lines, and since processing can take a long time I want to update a progress bar so the user knows the application hasn't hung or, better, gets an idea of the remaining time.
I've already researched this and I know how to update a simple progress bar, but the examples tend to be simplistic: they call something like progressBar.setProgress(counter++, 100) from a Timer, or the logic is simple and written in the same class. I'm also new to the language, having done mostly Java and some JavaScript in the past, among others.
I wrote the logic for processing the file (validation of input and creation of output files). But then, if I call the processing logic in the main class the update will be done at the end of processing (flying by so fast from 0 to 100) no matter if I update variables and try to dispatch events or things like that; the bar won't reflect the processing progress.
Would processing the input by chunks be a valid approach? And then, I'm not sure if the processing delay of one data chunk won't affect the processing of the next chunk and so on, because the timer tick is set to be 1 millisecond and the chunk processing time would be longer than that. Also, if the order of the input won't be affected or the result will get corrupted in some way. I've read multithreading is not supported in the language, so should that be a concern?
I already coded the logic described before and it seems to work:
// called by mouse click event
function processInput():void {
    timer = new Timer(1);
    timer.addEventListener(TimerEvent.TIMER, processChunk);
    timer.start();
}
function processChunk(event:TimerEvent):void {
    // code to calculate start and end index for the data chunk,
    // every time processChunk is executed these indexes are updated
    // slice() copies the chunk; splice() would remove items and shrink wholeInputArray
    var dataChunk:Array = wholeInputArray.slice(index0, index1);
    processorObj.processChunk(dataChunk);
    progressBar.setProgress(index0, wholeInputArray.length);
    progressBar.label = index0 + " processed items";
    if (index1 >= wholeInputArray.length) { // no more data to process
        timer.stop();
        progressBar.setProgress(wholeInputArray.length, wholeInputArray.length);
        progressBar.label = "Processing done";
        // do post processing here: show results, etc.
    }
}
The declaration for the progress bar is as follows:
<mx:ProgressBar id="progressBar" x="23" y="357" width="411" direction="right"
labelPlacement="center" mode="manual" indeterminate="false" />
I tested it with an input of 50000 lines and it seems to work generating the same result as the other approach that processes the input at once. But, would that be a valid approach or is there a better approach?
Thanks in advance.
Your solution is good; I use it most of the time.
But multithreading is now supported in AS3 (for desktop and web only, for the moment).
Have a look at the Worker documentation and the Worker example.
Hope that helps :)
May I ask whether this Timer, as shown, is the one you actually use? If so, you are in for a lot of trouble with your application in the long run when it comes to reloading and getting the Timer to stop, close, and so on. The event listener handling is incomplete and will certainly cause problems.
I recommend getting this right before going further. I know this from experience: some of my own AIR applications need several hundred timers running one after another in modules, and some of my web apps need a few as well, though not quite so intensively.
I'm sure smoother execution will be the reward! Regards, aktell
Use Workers. Splitting the data into chunks and processing them one at a time is a valid but quite cumbersome approach; with workers you can simply spawn a background worker, do all the parsing there and return the result, all without blocking the GUI. The worker approach should also need less time for parsing, because there is no need to stop the parser and wait for the next frame.
Workers would be an ideal solution, but quite complicated to set up. If you're not up to it right now, here's a PseudoThread solution I use in similar situations which you can probably get up and running in 5 minutes:
Pseudo Threads
It uses EnterFrame events to balance between doing work and letting the UI do its thing, and you can manually update the progress bar within your 'thread' code. I think it would be easily adapted to your needs, since your data is easily sliced.
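The linked PseudoThread code is not reproduced here, so the following is only a generic sketch of the same idea, written in Python rather than AS3: do as much work as fits in a small time budget, then hand control back so the UI can redraw. The function name pseudo_thread and the budget_s parameter are made up for this example; in AS3 each yield would correspond to returning from an EnterFrame handler.

```python
import time

def pseudo_thread(items, process, budget_s=0.01):
    """Process items in time slices, yielding (done, total) after each slice."""
    done = 0
    total = len(items)
    while done < total:
        deadline = time.perf_counter() + budget_s
        while done < total:
            process(items[done])   # always make progress on at least one item
            done += 1
            if time.perf_counter() >= deadline:
                break              # budget used up: give the UI a turn
        yield done, total

results = []
frames = []
for done, total in pseudo_thread(list(range(1000)), lambda x: results.append(x * 2)):
    frames.append((done, total))   # here the progress bar would be updated
```

Compared with a fixed chunk size, a time budget adapts to how expensive each item is, so the UI stays responsive on slow and fast machines alike.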
Without Workers (which it seems you are not yet familiar with), AS3 behaves as single-threaded. Your timer handlers will not overlap. If one of your chunks takes longer than the 1 ms timer interval to complete, the next timer event is processed as soon as it can be. Further events are not queued up while a chunk overruns the interval (assuming your processing code is blocking).
The previous answers show the "correct" solution to this, but this might get you where you need to be faster.
It's a very common problem every developer faces every now and then, when visual updates may be so rapid and fast that it causes the contents of the form to flicker. I'm currently using a thread to search files and trigger an event to its calling (main VCL) thread to report each and every search result. If you've ever used the FindFirst / FindNext, or done any large loop for that matter which performs very fast and rapid iterations, then you would know that updating the GUI on every little iteration is extremely heavy, and nearly defeats the purpose of a thread, because the thread then becomes dependent on how fast the GUI can update (on each and every iteration inside the thread).
What I'm doing upon every event from the thread (there could be 100 events in 1 millisecond), is simply incrementing a global integer, to count the number of iterations. Then, I am displaying that number in a label on the main form. As you can imagine, rapid updates from the thread will cause this to flicker beyond control.
So what I would like to know is how to avoid this rapid flicker in the GUI when a thread is feeding events to it faster than it's able to update?
NOTE: I am using VCL Styles, so the flicker becomes even worse.
This is indeed a common problem, not only with threads, but with any loop that needs to update the GUI while iterating faster than the GUI is able to update. The quick and easy solution is to use a Timer to update your GUI. Whenever the loop triggers an update, don't immediately update the GUI. Instead, set some global variable (like the global iteration count) for each thing which may need to be updated (the label to display the count), and then make the timer do the GUI updates. Set the timer's interval to something like 100-200 msec. This way, GUI updates occur only as frequently as the timer interval allows.
Another advantage to this is the performance of your thread will no longer depend on how fast your GUI can update. The thread can trigger its event and only increment this integer, and continue with its work. Keep in mind that you still must make sure you're thread-protecting your GUI. This is an art of its own which I will not cover and assume you already know.
NOTE: The more GUI updates you need to perform, the higher you may need to tweak the timer's interval.
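To make the idea concrete without depending on the VCL, here is a small simulation in Python. The worker only increments a counter; a polling loop stands in for the timer, and appending to label_updates stands in for assigning the label's caption. All names are invented for the example.

```python
import threading
import time

class Counter:
    def __init__(self):
        self.value = 0

def search(counter, iterations):
    # The worker never touches the GUI; it only bumps the counter.
    for _ in range(iterations):
        counter.value += 1   # single writer, so no lock is needed for this sketch

counter = Counter()
worker = threading.Thread(target=search, args=(counter, 200_000))
label_updates = []   # each entry represents one repaint of the label

worker.start()
while worker.is_alive():
    label_updates.append(counter.value)   # "timer tick": read and display
    time.sleep(0.05)                      # timer interval
worker.join()
label_updates.append(counter.value)       # final update once the search is done
```

The label is repainted once per tick instead of once per search result, so the number of repaints depends on the interval, not on how fast the thread runs.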
I'm a real beginner in multithreading. This question is about high-level multithreading in PyQt.
Suppose that a table widget requires much time to be populated because of some single items, making the window unresponsive meanwhile.
So I imagine that a responsive window requires a multithreaded solution in this case, where the big calculations (not all of them) run in separate threads.
A simpler version could use a separate thread for each column instead of each item.
Working examples are really appreciated.
Thank you, and sorry for my bad English.
EDIT: I removed the 'QtConcurrent' "requisite" from my original question.
I don't have a working example on hand for you at the moment, but I can at least offer a suggestion...
You can create a QThread (or a pool) that loops on a Queue.
Your main gui thread can place a data structure into the queue that includes the input parameters, and the destination cell (row/col).
The thread loop receives a new item from the queue, does the calculation, and then emits a signal like cellDataReady(row, col, value).
This way you can run through the table data and at any time when a calculation is needed, just queue it up.
If you want to do it the QThreadPool route, all of the threads can be pulling from the same queue object. Whichever one is free next will grab the next item, calculate, and emit.
Emitting signals from the threads will allow your main gui to connect to them and simply add the value into the table.
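Here is a rough sketch of that structure using only the Python standard library, so it runs without Qt. It is not a PyQt example: plain threading.Thread objects stand in for the QThread pool, and a results queue stands in for the cellDataReady(row, col, value) signal. The calculation and all names are placeholders.

```python
import queue
import threading

def worker(tasks, results):
    """Pull (row, col, n) jobs until a None sentinel arrives."""
    while True:
        job = tasks.get()
        if job is None:
            break
        row, col, n = job
        value = sum(i * i for i in range(n))   # stand-in for the slow calculation
        results.put((row, col, value))         # in PyQt: emit cellDataReady(...)

tasks = queue.Queue()
results = queue.Queue()
pool = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for t in pool:
    t.start()

# The GUI thread queues one job per cell that needs an expensive value.
for row in range(2):
    for col in range(3):
        tasks.put((row, col, 10 * (row + 1) + col))
for _ in pool:
    tasks.put(None)   # one sentinel per worker so every thread exits
for t in pool:
    t.join()

# In PyQt this would be the slot connected to the signal, filling the table.
table = {}
while not results.empty():
    row, col, value = results.get()
    table[(row, col)] = value
```

In a real PyQt application the slot would run in the GUI thread as results arrive, rather than after all workers have finished as in this simplified version.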
I'm looking for a design pattern that would fit my application design.
My application processes large amounts of data and produces some graphs.
Data processing (fetching from files, CPU intensive calculations) and graph operations (drawing, updating) are done in separate threads.
Graph can be scrolled - in this case new data portions need to be processed.
Because there can be several series on a graph, multiple threads can be spawned (two threads per series, one for dataset update and one for graph update).
I don't want to create multiple progress bars. Instead, I'd like to have a single progress bar that informs about global progress. At the moment I can think of MVC and Observer/Observable, but it's a little bit blurry :) Maybe somebody could point me in the right direction, thanks.
I once spent the best part of a week trying to make a smooth, non-hiccupy progress bar over a very complex algorithm.
The algorithm had 6 different steps. Each step had timing characteristics that depended heavily on A) the underlying data being processed, not just the "amount" of data but also the "type" of data, and B) the number of CPUs: 2 of the steps scaled extremely well with an increasing number of CPUs, 2 steps ran in 2 threads and 2 steps were effectively single-threaded.
The mix of data effectively had a much larger impact on execution time of each step than number of cores.
The solution that finally cracked it was really quite simple. I made 6 functions that analyzed the data set and tried to predict the actual run-time of each analysis step. The heuristic in each function analyzed both the data sets under analysis and the number of cpus. Based on run-time data from my own 4 core machine, each function basically returned the number of milliseconds it was expected to take, on my machine.
f1(..) + f2(..) + f3(..) + f4(..) + f5(..) + f6(..) = total runtime in milliseconds
Now given this information, you can effectively know what percentage of the total execution time each step is supposed to take. Now if you say step1 is supposed to take 40% of the execution time, you basically need to find out how to emit 40 1% events from that algorithm. Say the for-loop is processing 100,000 items, you could probably do:
for (int i = 0; i < numItems; i++){
if (i % (numItems / percentageOfTotalForThisStep) == 0) emitProgressEvent();
.. do the actual processing ..
}
This algorithm gave us a silky smooth progress bar that performed flawlessly. Your implementation technology can have different forms of scaling and features available in the progress bar, but the basic way of thinking about the problem is the same.
And yes, it did not really matter that the heuristic reference numbers were worked out on my machine - the only real problem is if you want to change the numbers when running on a different machine. But you still know the ratio (which is the only really important thing here), so you can see how your local hardware runs differently from the one I had.
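As an illustration only (this is not the original implementation), the weighting scheme can be sketched in Python as follows. The predicted run-times are passed in as plain numbers where the original used the heuristic functions f1..f6, and the function name run_with_progress and its parameters are invented. Tracking a target percentage per item also avoids the division by zero that the modulo test above would hit when a step has fewer items than percentage points.

```python
def run_with_progress(predicted_ms, item_counts, emit):
    """Emit 1..100 so that each step gets a share proportional to its predicted time."""
    total_ms = sum(predicted_ms)
    emitted = 0    # last percentage reported
    base = 0.0     # percentage already covered by finished steps
    for ms, n in zip(predicted_ms, item_counts):
        share = 100.0 * ms / total_ms   # this step's slice of the bar
        for i in range(n):
            # ... the actual processing of item i would go here ...
            target = int(base + share * (i + 1) / n)
            while emitted < target:
                emitted += 1
                emit(emitted)
        base += share
    while emitted < 100:   # guard against floating-point shortfall
        emitted += 1
        emit(emitted)

events = []
# Three steps predicted at 400, 100 and 500 ms, with very different item counts.
run_with_progress([400, 100, 500], [1000, 10, 50], events.append)
```

With these inputs the first step accounts for percentages 1 to 40, the second for 41 to 50 and the third for 51 to 100, regardless of how many items each step loops over.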
Now the average SO reader may wonder why on earth someone would spend a week making a smooth progress bar. The feature was requested by the head salesman, and I believe he used it in sales meetings to get contracts. Money talks ;)
In situations with threads or asynchronous processes/tasks like this, I find it helpful to have an abstract type or object in the main thread that represents (and ideally encapsulates) each process. So, for each worker thread, there will presumably be an object (let's call it Operation) in the main thread to manage that worker, and obviously there will be some kind of list-like data structure to hold these Operations.
Where applicable, each Operation provides the start/stop methods for its worker, and in some cases - such as yours - numeric properties representing the progress and expected total time or work of that particular Operation's task. The units don't necessarily need to be time-based, if you know you'll be performing 6,230 calculations, you can just think of these properties as calculation counts. Furthermore, each task will need to have some way of updating its owning Operation of its current progress in whatever mechanism is appropriate (callbacks, closures, event dispatching, or whatever mechanism your programming language/threading framework provides).
So while your actual work is being performed off in separate threads, a corresponding Operation object in the "main" thread is continually being updated/notified of its worker's progress. The progress bar can update itself accordingly, mapping the total of the Operations' "expected" times to its total, and the total of the Operations' "progress" times to its current progress, in whatever way makes sense for your progress bar framework.
Obviously there's a ton of other considerations/work that needs be done in actually implementing this, but I hope this gives you the gist of it.
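A minimal sketch of that arrangement in Python might look like the following. Operation is the name used above; SummaryBar, advance and refresh are hypothetical. The callback here runs in the worker thread and takes a lock; in a GUI framework you would instead marshal the notification to the main thread with whatever mechanism the framework provides.

```python
import threading

class Operation:
    """Represents one worker: expected total work and progress so far."""
    def __init__(self, total, on_change):
        self.total = total
        self.progress = 0
        self._on_change = on_change

    def advance(self, amount):
        # Called by the worker that owns this Operation.
        self.progress = min(self.total, self.progress + amount)
        self._on_change()

class SummaryBar:
    """Aggregates all Operations into one overall fraction."""
    def __init__(self):
        self._lock = threading.Lock()
        self.operations = []
        self.fraction = 0.0

    def add_operation(self, total):
        op = Operation(total, self.refresh)
        self.operations.append(op)
        return op

    def refresh(self):
        with self._lock:
            total = sum(op.total for op in self.operations)
            done = sum(op.progress for op in self.operations)
            self.fraction = done / total if total else 1.0

bar = SummaryBar()
ops = [bar.add_operation(total) for total in (6230, 1000, 250)]

def work(op):
    while op.progress < op.total:
        op.advance(10)   # report every 10 calculations

threads = [threading.Thread(target=work, args=(op,)) for op in ops]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The units are calculation counts rather than time, as suggested above; any unit works as long as progress and total use the same one.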
Multiple progress bars aren't such a bad idea, mind you. Or maybe a complex progress bar that shows several threads running (like download manager programs sometimes have). As long as the UI is intuitive, your users will appreciate the extra data.
When I try to answer such design questions I first look at similar or analogous problems in other applications, and how they're solved. So I would suggest you do some research by considering other applications that display complex progress (like the download manager example) and try to adapt an existing solution to your application.
Sorry I can't offer more specific design, this is just general advice. :)
Stick with Observer/Observable for this kind of thing. Some object observes the various series processing threads and reports status by updating the summary bar.