Displaying a progress bar for a long-running process in ActionScript/Flash Builder without mixing logic - multithreading

I'm working on an application that processes text files (tab-separated, possibly large, reaching one or two million lines) containing details of items. Since the processing time can be long, I want to update a progress bar so the user knows that the application didn't just hang, or better, to give an idea of the remaining time.
I've already done some research and I know how to update a simple progress bar, but the examples tend to be simplistic, e.g. calling something like progressBar.setProgress(counter++, 100) from a Timer; in other examples the logic is simple and written in the same class. I'm also new to the language, having done mostly Java and some JavaScript in the past, among others.
I wrote the logic for processing the file (validation of input and creation of output files). But if I call the processing logic from the main class, the update happens only at the end of processing (flying from 0 to 100 in an instant), no matter whether I update variables and try to dispatch events along the way; the bar won't reflect the processing progress.
Would processing the input in chunks be a valid approach? I'm not sure whether the processing delay of one data chunk would affect the processing of the next chunk and so on, because the timer tick is set to 1 millisecond and the chunk processing time would be longer than that. I'm also not sure whether the order of the input would be affected or the result corrupted in some way. I've read that multithreading is not supported in the language, so should that be a concern?
I already coded the logic described before and it seems to work:
// called by mouse click event
function processInput():void {
    timer = new Timer(1);
    timer.addEventListener(TimerEvent.TIMER, processChunk);
    timer.start();
}

function processChunk(event:TimerEvent):void {
    // code to calculate the start and end index for the data chunk;
    // every time processChunk is executed these indexes are updated
    // slice, not splice: splice would remove the items and shrink wholeInputArray,
    // which would throw off the progress calculation below
    var dataChunk:Array = wholeInputArray.slice(index0, index1);
    processorObj.processChunk(dataChunk);
    progressBar.setProgress(index0, wholeInputArray.length);
    progressBar.label = index0 + " processed items";
    if (index1 >= wholeInputArray.length) { // no more data to process
        timer.stop();
        progressBar.setProgress(wholeInputArray.length, wholeInputArray.length);
        progressBar.label = "Processing done";
        // do post-processing here: show results, etc.
    }
}
The declaration for the progress bar is as follows:
<mx:ProgressBar id="progressBar" x="23" y="357" width="411" direction="right"
labelPlacement="center" mode="manual" indeterminate="false" />
I tested it with an input of 50000 lines and it seems to work, generating the same result as the other approach that processes the input all at once. But would this be a valid approach, or is there a better one?
Thanks in advance.

Your solution is good; I use it most of the time.
But multithreading is now supported in AS3 (for desktop and web only, for the moment).
Have a look at the Worker documentation and Worker example.
Hope that helps :)

May I ask whether this Timer, as shown, is the actual working Timer? Because if it is, you are in for a lot of trouble with your application in the long run, with reloading and getting the Timer to stop and close properly; the event listener as written is incomplete and will give you problems for sure.
I would recommend getting this right first before going further. I know this from experience: in some of my own AIR applications I need to have several hundred of them running one after another in modules, as well as in some of my web apps (not quite so intense, but still a few).
I'm sure a smoother execution will be the reward! Regards, aktell

Use Workers. Splitting data into chunks and then processing them is a valid but quite cumbersome approach; with Workers you can simply spawn a background worker, do all the parsing there and return a result, all without blocking the GUI. The Worker approach should also need less time to do the parsing, because there is no need to stop the parser and wait for the next frame.

Workers would be an ideal solution, but quite complicated to set up. If you're not up to it right now, here's a PseudoThread solution I use in similar situations which you can probably get up and running in 5 minutes:
Pseudo Threads
It uses EnterFrame events to balance between doing work and letting the UI do its thing, and you can manually update the progress bar within your 'thread' code. I think it could easily be adapted to your needs, since your data is easily sliced.

Without using Workers (which it seems you are not yet familiar with), AS3 behaves single-threaded, so your timer callbacks will not overlap. If one of your chunks takes longer than the timer period to complete, the next timer event will simply be processed when it can be; further events will not queue up while your (blocking) processing code runs.
The previous answers show the "correct" solution to this, but this might get you where you need to be faster.

Related

Handling large amounts of arbitrarily scheduled tasks in node

Premise: I have a calendar-like system that allows the creation/deletion of 'events' at a scheduled time in the future. The end goal is to perform an action (send a message/reminder) prior to and at the start of the event. I've done a bit of searching and have narrowed it down to what seem to be my two most viable choices:
Unix Cron Jobs
Bree
I'm not quite sure which will best suit my end goal though, and additionally, it feels like there must be some additional established ways to do things like this that I just don't have proper knowledge of, or that I'm entirely skipping over.
My questions:
If, theoretically, the system were to be handling an arbitrarily large amount of 'events', all for arbitrary times in the future, which of these options is more practical system-resource-wise? Is my concern in this regard even valid?
Is there any foreseeable problem with filling up a crontab with a large volume of jobs - or, in bree's case, scheduling a large amount of jobs?
Is there a better idea I've just completely missed so far?
This mainly stems from Bree's use of node 'worker threads'. I'm very unfamiliar with this concept
and concerned that, since a 'worker thread' is spawned for every job, I could very quickly tie up all of my available threads and grind... something, to a halt. This, however, sounds somewhat silly and possibly wrong (possibly indicative of my complete lack of knowledge here), and thus, my question.
Thanks, Stark.
For a calendar-like system, it seems you could query your database to find all events occurring in the next hour, then create a setTimeout() for each one of those. Then, an hour later, do the same thing again. Then, upon any server restart, do the same thing again. You don't really need to worry about events that aren't imminent: they can just sit in the database until shortly before their time. You will just need an efficient way to query the database to find events that are imminent and use a timer for them.
WorkerThreads are fairly heavyweight items in nodejs, as they create a whole separate heap and a whole new instance of a V8 interpreter. You would definitely not want a separate WorkerThread for each event.
I should add that timers in nodejs are very lightweight items and it is no problem to have lots of them. They are just stored in a sorted linked list, and only the insertion of a new timer takes a little bit more time (to do an insertion sort as it is added to the list) as the list gets longer. There is no continuous run-time overhead from having lots of timers. The event loop just checks the first item in the linked list to see if it's time yet for the next timer to fire. If so, it removes it from the head of the list and calls its callback. If not, it goes about the rest of the event loop work items and will check the first item in the list again the next time through the event loop.
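To make the pattern concrete, here is a rough sketch of that "scan for imminent events, set a timer per event, rescan later" idea. It is written in Python with threading.Timer purely for illustration (in node you would use setTimeout and your own database query), and fetch_events_within / send_reminder are placeholder names:

import threading

POLL_INTERVAL = 3600  # seconds; re-scan the database every hour

def fetch_events_within(seconds):
    # Placeholder: query your database for events starting within `seconds` from now.
    # Returns a list of (event_id, seconds_until_start) tuples.
    return []

def send_reminder(event_id):
    # Placeholder: whatever message/action the event requires.
    print("reminder for event", event_id)

def schedule_imminent_events():
    # Only events due within the next poll window get an in-memory timer;
    # everything else just sits in the database until a later scan picks it up.
    for event_id, seconds_until_start in fetch_events_within(POLL_INTERVAL):
        threading.Timer(seconds_until_start, send_reminder, args=(event_id,)).start()
    # Re-run the scan on the next poll (also run it once at server startup).
    threading.Timer(POLL_INTERVAL, schedule_imminent_events).start()

schedule_imminent_events()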

Time delay in Matlab for a specific function, while letting the rest of the functions run

I am currently working on an image processing object in Matlab. I am acquiring images from a webcam using the snapshot function, which are to be processed in various ways (irrelevant to the question).
I would like these snapshots to be acquired every 5 seconds. However, during these 5 seconds, I do not want my program to pause and wait; I want it to run the image processing functions. I have tried pause, but that obviously pauses the whole program. From the way I imagine the processor circuitry, based on my basic IC knowledge, I am looking to implement an event coming from a clock counter, which would stop the machine instructions dealing with the image processing part and prioritise the instructions involving the image acquisition.
I have stumbled on this link that talks about multithreading in Matlab using Java. Is there an easier way of implementing what I want to do?
Could you please suggest some functions that achieve what I want to do? If there is no function which does what I want, could you point me towards some articles or books which deal with the subject?
You could use a timer object
rate = 5; % call every 5 s
my_timer = timer('TimerFcn', {@my_timer_callback, arguments}, 'Period', rate, 'ExecutionMode', 'fixedRate'); % the cell array passes additional arguments to the callback
start(my_timer) % stop(my_timer) to end processing
and do the processing inside my_timer_callback.
function my_timer_callback(obj,event,arguments)
% do processing here
Better would be to run the callback triggered by the camera, so I would look into whether Matlab allows you to attach callbacks to the camera data acquisition (e.g. in the same way as for daq objects).

Designing concurrency in a Python program

I'm designing a large-scale project, and I think I see a way I could drastically improve performance by taking advantage of multiple cores. However, I have zero experience with multiprocessing, and I'm a little concerned that my ideas might not be good ones.
Idea
The program is a video game that procedurally generates massive amounts of content. Since there's far too much to generate all at once, the program instead tries to generate what it needs as or slightly before it needs it, and expends a large amount of effort trying to predict what it will need in the near future and how near that future is. The entire program, therefore, is built around a task scheduler, which gets passed function objects with bits of metadata attached to help determine what order they should be processed in and calls them in that order.
Motivation
It seems like it ought to be easy to make these functions execute concurrently in their own processes. But looking at the documentation for the multiprocessing module makes me reconsider: there doesn't seem to be any simple way to share large data structures between processes. I can't help but imagine this is intentional.
Questions
So I suppose the fundamental questions I need to know the answers to are thus:
Is there any practical way to allow multiple threads to access the same list/dict/etc... for both reading and writing at the same time? Can I just launch multiple instances of my star generator, give it access to the dict that holds all the stars, and have new objects appear to just pop into existence in the dict from the perspective of other threads (that is, I wouldn't have to explicitly grab the star from the process that made it; I'd just pull it out of the dict as if the main thread had put it there itself).
If not, is there any practical way to allow multiple threads to read the same data structure at the same time, but feed their resultant data back to a main thread to be rolled into that same data structure safely?
Would this design work even if I ensured that no two concurrent functions tried to access the same data structure at the same time, either for reading or for writing?
Can data structures be inherently shared between processes at all, or do I always explicitly have to send data from one process to another as I would with processes communicating over a TCP stream? I know there are objects that abstract away that sort of thing, but I'm asking if it can be done away with entirely; have the object each thread is looking at actually be the same block of memory.
How flexible are the objects that the modules provide to abstract away the communication between processes? Can I use them as a drop-in replacement for data structures used in existing code and not notice any differences? If I do such a thing, would it cause an unmanageable amount of overhead?
Sorry for my naivete, but I don't have a formal computer science education (at least, not yet) and I've never worked with concurrent systems before. Is the idea I'm trying to implement here even remotely practical, or would any solution that allows me to transparently execute arbitrary functions concurrently cause so much overhead that I'd be better off doing everything in one thread?
Example
For maximum clarity, here's an example of how I imagine the system would work:
The UI module has been instructed by the player to move the view over to a certain area of space. It informs the content management module of this, and asks it to make sure that all of the stars the player can currently click on are fully generated and ready to be clicked on.
The content management module checks and sees that a couple of the stars the UI is saying the player could potentially try to interact with have not, in fact, had the details that would show upon click generated yet. It produces a number of Task objects containing the methods of those stars that, when called, will generate the necessary data. It also adds some metadata to these task objects, assuming (possibly based on further information collected from the UI module) that it will be 0.1 seconds before the player tries to click anything, and that stars whose icons are closest to the cursor have the greatest chance of being clicked on and should therefore be requested for a time slightly sooner than the stars further from the cursor. It then adds these objects to the scheduler queue.
The scheduler quickly sorts its queue by how soon each task needs to be done, then pops the first task object off the queue, makes a new process from the function it contains, and then thinks no more about that process, instead just popping another task off the queue and stuffing it into a process too, then the next one, then the next one...
Meanwhile, the new process executes, stores the data it generates on the star object it is a method of, and terminates when it gets to the return statement.
The UI then registers that the player has indeed clicked on a star now, and looks up the data it needs to display on the star object whose representative sprite has been clicked. If the data is there, it displays it; if it isn't, the UI displays a message asking the player to wait and continues repeatedly trying to access the necessary attributes of the star object until it succeeds.
Even though your problem seems very complicated, there is a very easy solution. You can hide away all the complicated details of sharing your objects across processes using a proxy.
The basic idea is that you create a manager that manages all the objects that should be shared across processes. This manager then creates its own process, where it waits for some other process to instruct it to change the object. But enough said; it looks like this:
import multiprocessing as m

manager = m.Manager()
starsdict = manager.dict()
process = m.Process(target=yourfunction, args=(starsdict,))
process.start()  # start(), not run(): run() would execute yourfunction in the current process
process.join()
The object stored in starsdict is not the real dict; instead, it forwards all the changes and requests you make on it to its manager. This is called a "proxy"; it has almost exactly the same API as the object it mimics. These proxies are picklable, so you can pass them as arguments to functions in new processes (as shown above) or send them through queues.
You can read more about this in the documentation.
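A more complete, self-contained sketch of the same idea (the generate_star function and the dictionary keys are just illustrative placeholders):

import multiprocessing as m

def generate_star(starsdict, star_id):
    # Stand-in for the expensive procedural generation step.
    starsdict[star_id] = {"name": "Star %d" % star_id, "planets": star_id % 5}

if __name__ == "__main__":
    manager = m.Manager()
    starsdict = manager.dict()  # proxy to a dict living in the manager's process

    workers = [m.Process(target=generate_star, args=(starsdict, i)) for i in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    # The main process sees the entries the workers added, as if it had added them itself.
    print(dict(starsdict))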
I don't know how proxies react if two processes access them simultaneously. Since they're made for parallelism I guess they should be safe, even though I've heard they're not. It would be best to test this yourself or look it up in the documentation.
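If you prefer the second model from your question (workers feed their results back to the main process, which is the only one that touches the shared data structure), a queue-based sketch, again with placeholder names, could look like this:

import multiprocessing as m

def generate_star(star_id, results):
    # Do the expensive generation in the worker, then hand the result back.
    results.put((star_id, {"name": "Star %d" % star_id}))

if __name__ == "__main__":
    results = m.Queue()
    workers = [m.Process(target=generate_star, args=(i, results)) for i in range(4)]
    for w in workers:
        w.start()

    stars = {}
    for _ in range(len(workers)):      # collect exactly one result per worker
        star_id, data = results.get()  # blocks until a result is available
        stars[star_id] = data          # only the main process writes to this dict

    for w in workers:
        w.join()
    print(stars)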

Having MATLAB to run multiple independent functions which contains infinite while loop

I am currently working with three MATLAB functions, trying to make them run nearly simultaneously in a single MATLAB session (as far as I know, MATLAB is single-threaded). Each of these functions is allocated an individual task. It might be difficult for me to explain all the details of each function here, but I'll try to include as much information as possible.
They are CONTROL/CAMERA/DATA_DISPLAY tasks. The approach I am using is to create Timer objects so that each function's callback runs continuously, each with a different callback period.
CONTROL sends and receives data over WiFi through a UDP port; it checks for available packets and executes its callback constantly.
CAMERA receives camera frames continuously through TCP and displays them; one timer object T1 refreshes the captured frame for this function.
DATA_DISPLAY displays all the received data and refreshes continuously, so another timer T2 refreshes the display for this function.
However, I noticed that timer T2 blocks timer T1 when it executes, slowing down the whole process. I am working on a system with a multi-core CPU, and I would expect MATLAB to be able to execute both timer objects in parallel, taking advantage of the computational cores.
From looking into the Parallel Computing Toolbox in MATLAB, it does not seem able to deal with infinite loops or continuous callbacks, since the code never finishes and displays nothing when executed; though possibly I am just not sure how to use this toolbox.
Or can anyone suggest a good way of restructuring the code into a more efficient form?
Many thanks
I see a problem with using the Parallel Computing Toolbox here. The design implies that the jobs are controlled via your primary MATLAB instance. Besides this, the primary instance is the only one with a GUI, which would require letting your DISPLAY_DATA task control everything. I don't know if this is possible, but it would result in a very strange architecture. On top of that, inter-process communication is not the best idea when processing large amounts of data.
To solve the issue, I would use Java to display your data and realise the DISPLAY_DATA part. The connection to Java is very fast and simple to use. You would have to write a small Java GUI with an appendframe function that allows your CAMERA job to push new data. Obviously, updating the GUI should be done in parallel, without blocking.

wxpython using gauge pulse with threaded long running processes

The program I am developing uses threads to deal with long-running processes. I want to use Gauge Pulse to show the user that, whilst a long-running thread is in progress, something is actually taking place. Otherwise, visually nothing happens for quite some time when processing large files, and the user might think that the program is doing nothing.
I have placed a gauge within the status bar of the program. My problem is this: I am having problems when trying to call gauge pulse; no matter where I place the code, it either runs too fast and then halts, or runs at the correct speed for a few seconds and then halts.
I've tried placing the one line of code below into the thread itself. I have also tried creating another thread from within the long-running process thread to call the code below. I still get the same sort of problems.
I do not think that I could use wx.CallAfter, as this would defeat the point: Pulse needs to be called whilst the process is running, not after the fact. I also tried using time.sleep(2), which is not good either, as it slows the process down, which is something I want to avoid. Even when using time.sleep(2) I still had the same problems.
Any help would be massively appreciated!
progress_bar.Pulse()
You will need to find some way to send update requests to the main GUI from your thread during the long-running process. For example, if you were downloading a very large file using a thread, you would download it in chunks and, after each chunk is complete, you would send an update to the GUI.
If you are running something that doesn't really allow chunks, such as creating a large PDF with fop, then I suppose you could use a wx.Timer() that just tells the gauge to pulse every so often. Then when the thread finishes, it would send a message to stop the timer object from updating the gauge.
The former is best for showing progress while the latter works if you just want to show the user that your app is doing something. See also
http://wiki.wxpython.org/LongRunningTasks
http://www.blog.pythonlibrary.org/2010/05/22/wxpython-and-threads/
http://www.blog.pythonlibrary.org/2013/09/04/wxpython-how-to-update-a-progress-bar-from-a-thread/
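To illustrate the second approach, here is a minimal sketch that pulses a gauge from a wx.Timer while a worker thread runs, then stops the timer via wx.CallAfter when the work finishes. The frame layout, widget names and the sleep standing in for the real work are all just placeholders:

import threading
import time
import wx

class MainFrame(wx.Frame):
    def __init__(self):
        super(MainFrame, self).__init__(None, title="Long task demo")
        self.gauge = wx.Gauge(self, range=100)
        self.timer = wx.Timer(self)
        self.Bind(wx.EVT_TIMER, self.on_timer, self.timer)

        self.timer.Start(100)  # pulse every 100 ms while the worker runs
        worker = threading.Thread(target=self.long_task)
        worker.daemon = True
        worker.start()

    def on_timer(self, event):
        self.gauge.Pulse()  # runs on the GUI thread, so touching the widget here is safe

    def long_task(self):
        time.sleep(10)  # stand-in for the real long-running work
        # Never touch widgets directly from the worker thread;
        # hand control back to the GUI thread instead.
        wx.CallAfter(self.timer.Stop)
        wx.CallAfter(self.gauge.SetValue, 100)

if __name__ == "__main__":
    app = wx.App(False)
    MainFrame().Show()
    app.MainLoop()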
