I am creating an app that accesses a database. On every database access, the app waits for the job to be finished.
To keep the UI responsive, I want to put all the database stuff in a separate thread.
Here is my idea:
The db-thread creates all database components it needs when it is created
Now the thread just sits there and waits for a command
If it receives a command, it performs the action and goes back to idle. During that time the main thread waits.
the db-thread lives as long as the app is running
Does this sound ok?
What's the best way to get the database results from the db-thread into the main thread?
I haven't done much with threads so far, therefore I'm wondering if the db-thread can create a query component out of which the main thread reads the results. Main thread and db thread will never access the query at the same time. Will this still cause problems?
What you are looking for is the standard data access technique, called asynchronous query execution. Some data access components implement this feature in an easy-to-use manner. At least dbGo (ADO) and AnyDAC implement that. Lets consider the dbGo.
The idea is simple - you call the convenient dataset methods, like a Open. The method launches required task in a background thread and immediately returns. When the task is completed, an appropriate event will be fired, notifying the application, that the task is finished.
The standard approach with the DB GUI applications and the Open method is the following (draft):
include eoAsyncExecute, eoAsyncFetch, eoAsyncFetchNonBlock into dataset ExecuteOptions;
disconnect TDataSource.DataSet from dataset;
set dataset OnFetchComplete to a proc P;
show "Hello ! We do the hard work to process your requests. Please wait ..." dialog;
call the dataset Open method;
when the query execution will be finished, the OnFetchComplete will be called, so the P. And the P hides the "Wait" dialog and connects TDataSource.DataSet back to the dataset.
Also your "Wait" dialog may have a Cancel button, which an user may use to cancel a too long running query.
First of all - if you haven't much experience with multi-threading, don't start with the VCL classes. Use the OmniThreadLibrary, for (among others) those reasons:
Your level of abstraction is the task, not the thread, a much better way of dealing with concurrency.
You can easily switch between executing tasks in their own thread and scheduling them with a thread pool.
All the low-level details like thread shutdown, bidirectional communication and much more are taken care of for you. You can concentrate on the database stuff.
The db-thread creates all database components it needs when it is created
This may not be the best way. I have generally created components only when needed, but not destroyed immediately. You should definitely keep the connection open in a thread pool thread, and close it only once the thread has been inactive for some time and the pool disposes of it. But it is also often a good idea to keep a cache of transaction and statement objects.
If it receives a command, it performs the action and goes back to idle. During that time the main thread waits.
The first part is being handled fine when OTL is used. However - don't have the main thread wait, this will bring little advantage over performing the database access directly in the VCL thread in the first place. You need an asynchronous design to make best use of multiple threads. Consider a standard database browser form that has controls for filtering records. I handle this by (re-)starting a timer every time one of the controls changes. Once the user finishes editing the timer event fires (say after 500 ms), and a task is started that executes the statement that fetches data according to the filter criteria. The grid contents are cleared, and it is repopulated only when the task has finished. This may take some time though, so the VCL thread doesn't wait for the task to complete. Instead the user could even change the filter criteria again, in which case the current task is cancelled and a new one started. OTL gives you an event for task completion, so the asynchronous design is easy to achieve.
What's the best way to get the database results from the db-thread into the main thread?
I generally don't use data aware components for multi-threaded db apps, but use standard controls that are views for business objects. In the database tasks I create these objects, put them in lists, and the task completion event transfers the list to the VCL thread.
Main thread and db thread will never access the query at the same time.
With all components that load data on-demand you can't be sure of that. Often only the first records are fetched from the db, and fetching continues after they have been consumed. Such components obviously must not be shared by threads.
I have implemented both strategies: Thread pool and adhoc thread creation.
I suggest to begin with the adhoc thread creation, it is simpler to implement and simpler to scale.
Only move to a thread pool if (with careful evaluation) (1) there is a lot of resources (and time) invested in the creation of the thread and (2) you have a lot of creation requests.
In both cases you must deal with passing parameters and collect results. I suggest to extend the thread class with properties that allow this data passing.
Refer to the documentation of the classes, components and functions that the thread use to make sure they are thread safe, that is, they can be use simultaneously from different threads. If not, you will need to synchronize the access. In some cases you may find slight differences regarding thread safety. As an example, see DateTimeToStr.
If you create your thread at start and reuse it later whenever you need it, you have to make sure that you disconnect the db components (grid..) from the underlying datasource (disableControls) each time you're "processing" data.
For the sake of simplicity, I would inherit TThread and implement all the business logic in my own class. The result dataset would be a member of this class and I would connect it the db aware compos in with synchronize.
Anyway, it is also very important to delegate as much work as possible to the db server and keep the UI as lightweight as possible. Firebird is my favourite db server: triggers, for select, custom UDF dlls developed in Delphi, many thread safe db components with lots of examples and good support (forum) : jvUIB...
Good Luck
Related
I'd like to be able to open a TDataSet asynchronously in its own thread so that the main VCL thread can continue until that's done, and then have the main VCL thread read from that TDataSet afterwards. I've done some experimenting and have gotten into some very weird situations, so I'm wondering if anyone has done this before.
I've seen some sample apps where a TDataSet is created in a separate thread, it's opened and then data is read from it, but that's all done in the separate thread. I'm wondering if it's safe to read from the TDataSet from the main VCL thread after the other thread opens the data source.
I'm doing Win32 programming in Delphi 7, using TmySQLQuery from DAC for MySQL as my TDataSet descendant.
Provided you only want to use the dataset in its own thread, you can just use synchronize to communicate with the main thread for any VCL/UI update, like with any other component.
Or, better, you can implement communication between the mainthread and worker threads with your own messaging system.
check Hallvard's solution for threading here:
http://hallvards.blogspot.com/2008/03/tdm6-knitting-your-own-threads.html
or this other one:
http://dn.codegear.com/article/22411
for some explanation on synchronize and its inefficiencies:
http://www.eonclash.com/Tutorials/Multithreading/MartinHarvey1.1/Ch3.html
I have seen it done with other implementations of TDataSet, namely in the Asta components. These would contact the server, return immediately, and then fire an event once the data had been loaded.
However, I believe it depends very much on the component. For example, those same Asta components could not be opened in a synchronous manner from anything other than the main VCL thread.
So in short, I don't believe it is a limitation of TDataSet per se, but rather something that is implementation specific, and I don't have access to the components you've mentioned.
One thing to keep in mind about using the same TDataSet between multiple threads is you can only read the current record at any given time. So if you are reading the record in one thread and then the other thread calls Next then you are in trouble.
Also remember the thread will most likely need its own database connection. I believe what is needed here is a multi-threaded "holding" object to load the data from the thread into (write only) which is then read only from the main VCL thread. Before reading use some sort of syncronization method to insure that your not reading the same moment your writing, or writing the same moment your reading, or load everything into a memory file and write a sync method to tell the main app where in the file to stop reading.
I have taken the last approach a few times, depdending on the number of expected records (and the size of the dataset) I have even taken this to a physical disk file on the local system. It works quite well.
I've done multithreaded data access, and it's not straightforward:
1) You need to create a session per thread.
2) Everything done to that TDataSet instance must be done in context of the thread where it was created. That's not easy if you wanted to place e.g. a db grid on top of it.
3) If you want to let e.g. main thread play with your data, the straight-forward solution is to move it into a separate container of some kind,e.g. a Memory dataset.
4) You need some kind of signaling mechanism to notify main thread once your data retrieval is complete.
...and exception handling isn't straightforward, either...
But: Once you've succeeded, the application will be really elegant !
Most TDatasets are not thread safe. One that I know is thread safe is kbmMemtable. It also has the ability to clone a dataset so that the problem of moving the record pointer (as explained by Jim McKeeth) does occur. They're one of the best datasets you can get (bought or free).
I was going through the details of node.jsand came to know that, It supports asynchronous programming though essentially it provides a single threaded model.
How is asynchronous programming handled in such cases? Is it like runtime itself creates and manages threads, but the programmer cannot create threads explicitly? It would be great if someone could point me to some resources to learn about this.
Say it with me now: async programming does not necessarily mean multi-threaded.
Javascript is a single-threaded runtime - you simply aren't able to create new threads in JS because the language/runtime doesn't support it.
Frank says it correctly (although obtusely) In English: there's a main event loop that handles when things come into your app. So, "handle this HTTP request" will get added to the event queue, then handled by the event loop when appropriate.
When you call an async operation (a mysql db query, for example), node.js sends "hey, execute this query" to mysql. Since this query will take some time (milliseconds), node.js performs the query using the MySQL async library - getting back to the event loop and doing something else there while waiting for mysql to get back to us. Like handling that HTTP request.
Edit: By contrast, node.js could simply wait around (doing nothing) for mysql to get back to it. This is called a synchronous call. Imagine a restaurant, where your waiter submits your order to the cook, then sits down and twiddles his/her thumbs while the chef cooks. In a restaurant, like in a node.js program, such behavior is foolish - you have other customers who are hungry and need to be served. Thus you want to be as asynchronous as possible to make sure one waiter (or node.js process) is serving as many people as they can.
Edit done
Node.js communicates with mysql using C libraries, so technically those C libraries could spawn off threads, but inside Javascript you can't do anything with threads.
Ryan said it best: sync/async is orthogonal to single/multi-threaded. For single and multi-threaded cases there is a main event loop that calls registered callbacks using the Reactor Pattern. For the single-threaded case the callbacks are invoked sequentially on main thread. For the multi-threaded case they are invoked on separate threads (typically using a thread pool). It is really a question of how much contention there will be: if all requests require synchronized access to a single data structure (say a list of subscribers) then the benefits of having multiple threaded may be diminished. It's problem dependent.
As far as implementation, if a framework is single threaded then it is likely using poll/select system call i.e. the OS is triggering the asynchronous event.
To restate the waiter/chef analogy:
Your program is a waiter ("you") and the JavaScript runtime is a kitchen full of chefs doing the things you ask.
The interface between the waiter and the kitchen is mediated by queues so requests are not lost in instances of overcapacity.
So your program is assigned one thread of execution. You can only wait one table at a time. Each time you want to offload some work (like making the food/making a network request), you run to the kitchen and pin the order to a board (queue) for the chefs (runtime) to pick-up when they have spare capacity. The chefs will let you know when the order is ready (they will call you back). In the meantime, you go wait another table (you are not blocked by the kitchen).
So the accepted answer is misleading. The JavaScript runtime is definitionally multithreaded because I/O does not block your JavaScript program. As a waiter you can continue serving customers, while the kitchen cooks. That involves at least two threads of execution. The reality is that the runtime will maintain several threads of execution behind the scenes, in order to efficiently serve the single thread directly corresponding to your script.
By design, only one thread of execution is assigned to the synchronous running of your JavaScript program. This is a good thing because it makes your program easier to reason about than having to handle multiple threads of execution yourself. Don't worry: your JavaScript program can still get plenty complicated though!
I have an application where I need to query a database to get/put information. I can't do it synchronously as it would block my entire process until the function returns.
Basically I have a few functions that run one or more queries at certain points.
fun
stuff1
stuff2
stuff3
query1
stuff4
query2
stuff5
I could start the functions in separate threads, but then I would have to lock everything to prevent races (I think locking could be slow ?)
I could start the queries asynchronously and monitor them but then I would have to split my functions and use callbacks that would run when the qouery is over
I am interested in a general solution, but my platform is POSIX and the database is (unfortunately) mysql.
What would you do ? How would you handle this ?
Thank you for your time.
There are patterns that have been known and used for quite sometime and I believe they have not changed.
Running independent functions: Use different threads
Running independent functions and a then function dependent on all of them: Use different threads and joining them at the end - synchronise them. I do not know about POSIX but in .NET we have EventWaitHandle that can wait for multiple threads and notify when all finished.
Running functions that each depends on another: run on a single background thread and chain the callbacks. Again .NET offers Task<T> chaining which makes reading and writing the code much simpler. jquery now offers promise which is the same thing.
Depends on how complicated the situation is. In a simple scenario breaking the work across multiple functions which are given as callbacks to the query will work - and is a valid solution. In a more complicated scenario, you need some dependency injection framework like spring.
You can create a new thread that would handle a database queries queue. This thread would hold a list containing the next actions to perform on the database and would be accessed by a function like : MyDatabaseQueue.PerformActionWhenFree( Action a, Callback callmebackwhendone ). This thread would be responsible for creating one query thread at a time. That way you can always receive more queries in the queue and have only one database query thread at a time.
Well if the queries are tightly coupled, you can simply start a parallel thread with pthread_create and run them sequentially on that thread. Thus, your main thread won't be blocked and you still won't need to employ any locks.
I don't want multiple windows, each with its own UI thread, nor events raised on a single UI thread, not background workers and notifications, none of that Invoke, BeginInvoke stuff either.
I'm interested in a platform that allows multiple threads to update the same window in a safe manner. Something like first thread creates three buttons, the second thread another five, and they both can access them,change their properties and delete them without any unwanted consequences.
I want safe multi-threaded access to the UI without Invoking, a platform where the UI objects can be accessed directly from any thread without raising errors like "The object can only be accessed from the thread that created it". To let me do the synchronizing if I have to, not prevent me from cross-tread accessing the UI in a direct manner.
I'm gonna get down voted but ... Go Go Gadget Soapbox.
Multi threaded GUI are not possible in the general case. It has been attempted time and time again and it never comes out well. It is not a coincidence that all of the major windowing frameworks follow the single threaded ui model. They weren't copying each other, it's just that the constraints of the problem lead them to the same answer. Many people smarter than you or i have tried to solve this.
It might be possible to implement a multi-thread ui for a particular project. I'm only saying that it can't be done in the general case. That means it's unlikely you'll find a framework to do what you want.
The gist of the problem is this. Envision the gui components as a chain (in reality it's more like a tree, but a chain is simple to describe). The button connects to the frame, connects to the box, connects to the window. There are two source of events for a gui the system/OS and the user. The system/OS event originate at the bottom of the chain (the windowing system), the user event originate at the top of the chain (the button). Both of these events must move through the gui chain. If two threads are pushing these events simultaneously they must be mutex protected. However, there is no known algorithm for concurrently traversing a double linked list in both directions. It is prone to dead lock. GUI experts tried and tried to figure out ways to get around the deadlocking problem, and eventually arrived at the solution we use today called Model/View/Controller, aka one thread runs the UI.
You could make a thread-safe Producer/Consumer queue of delegates.
Any thread that wants to update a UI component would create a delegate encapsulating the operations to be performed, and add it to the queue.
The UI thread (assuming all components were created on the same thread) would then periodically pull an item from the queue, and execute the delegate.
I don't believe a platform like that exists per se
There is nothing stopping you from saying taking .Net and creating all new controls which are thread safe and can work like that(or maybe just the subset of what you need) which shouldn't be an extremely large job(though definitely no small job) because you can just derive from the base controls and override any thread-unsafe methods or properties.
The real question though is why? It would definitely be slower because of all the locking. Say your in one thread that is doing something with the UI, well it has to lock the window it's working on else it could be changed without it knowing by the other thread. So with all the locking, you will spend most of your drawing time and such waiting on locks and (expensive) context switches from threads. You could maybe make it async, but that just doesn't seem safe(and probably isn't) because controls that you supposedly just created may or may not exist and would be about like
Panel p=new Panel();
Button b=new Button();
WaitForControlsCreated(); //waits until the current control queue is cleared
p.Controls.Add(b);
which is probably just as slow..
So the real question here is why? The only "good" way of doing it is just having an invoke abstracted away so that it appears you can add controls from a non-UI thread.
I think you are misunderstanding how threads really work and what it takes to actually make an object thread safe
Accept that any code updating the GUI has to be on the GUI thread.
Learn to use BeginInvoke().
On Windows, Window handles have thread affinity. This is a limitation of the Window manager. It's a bad idea to have multiple threads accessing the same window on Windows.
I'm surprised to see these answers.
Only the higher level language frameworks like C# have thread restrictions on GUI elements.
Windows, at the SDK layer, is 100% application controlled and there are no restrictions on threads except at insignificant nitty gritty level. For example if multiple threads want to write to a window, you need to lock on a mutex, get the device context, draw, then release the context, then unlock the mutex. Getting and releasing a device context for a moment of drawing needs to be on the same thread... but those are typically within 10 lines of code from each other.
There isn't even a dedicated thread that windows messages come down on, whatever thread calls "DispatchMessage()" is the thread the WINPROC will be called on.
Another minor thread restriction is that you can only "PeekMessage" or "GetMessage" a window that was created on the current thread. But really this is very minor, and how many message pumps do you need anyway.
Drawing is completely disconnected from threads in Windows, just mutex your DC's for drawing. You can draw anytime, from anywhere, not just on a WM_PAINT message.
BeOS / Haiku OS
Based on my guessing of your requirement, you want a single Windows Form and having ways to execute certain routines asynchronously (like multi-threading), yes?
Typically (for the case of .NET WinForms) Control.Invoke / Control.BeginInvoke is used to a certain effect what I think you want.
Here's an interesting article which might help: http://www.yoda.arachsys.com/csharp/threads/winforms.shtml
I'd like to be able to open a TDataSet asynchronously in its own thread so that the main VCL thread can continue until that's done, and then have the main VCL thread read from that TDataSet afterwards. I've done some experimenting and have gotten into some very weird situations, so I'm wondering if anyone has done this before.
I've seen some sample apps where a TDataSet is created in a separate thread, it's opened and then data is read from it, but that's all done in the separate thread. I'm wondering if it's safe to read from the TDataSet from the main VCL thread after the other thread opens the data source.
I'm doing Win32 programming in Delphi 7, using TmySQLQuery from DAC for MySQL as my TDataSet descendant.
Provided you only want to use the dataset in its own thread, you can just use synchronize to communicate with the main thread for any VCL/UI update, like with any other component.
Or, better, you can implement communication between the mainthread and worker threads with your own messaging system.
check Hallvard's solution for threading here:
http://hallvards.blogspot.com/2008/03/tdm6-knitting-your-own-threads.html
or this other one:
http://dn.codegear.com/article/22411
for some explanation on synchronize and its inefficiencies:
http://www.eonclash.com/Tutorials/Multithreading/MartinHarvey1.1/Ch3.html
I have seen it done with other implementations of TDataSet, namely in the Asta components. These would contact the server, return immediately, and then fire an event once the data had been loaded.
However, I believe it depends very much on the component. For example, those same Asta components could not be opened in a synchronous manner from anything other than the main VCL thread.
So in short, I don't believe it is a limitation of TDataSet per se, but rather something that is implementation specific, and I don't have access to the components you've mentioned.
One thing to keep in mind about using the same TDataSet between multiple threads is you can only read the current record at any given time. So if you are reading the record in one thread and then the other thread calls Next then you are in trouble.
Also remember the thread will most likely need its own database connection. I believe what is needed here is a multi-threaded "holding" object to load the data from the thread into (write only) which is then read only from the main VCL thread. Before reading use some sort of syncronization method to insure that your not reading the same moment your writing, or writing the same moment your reading, or load everything into a memory file and write a sync method to tell the main app where in the file to stop reading.
I have taken the last approach a few times, depdending on the number of expected records (and the size of the dataset) I have even taken this to a physical disk file on the local system. It works quite well.
I've done multithreaded data access, and it's not straightforward:
1) You need to create a session per thread.
2) Everything done to that TDataSet instance must be done in context of the thread where it was created. That's not easy if you wanted to place e.g. a db grid on top of it.
3) If you want to let e.g. main thread play with your data, the straight-forward solution is to move it into a separate container of some kind,e.g. a Memory dataset.
4) You need some kind of signaling mechanism to notify main thread once your data retrieval is complete.
...and exception handling isn't straightforward, either...
But: Once you've succeeded, the application will be really elegant !
Most TDatasets are not thread safe. One that I know is thread safe is kbmMemtable. It also has the ability to clone a dataset so that the problem of moving the record pointer (as explained by Jim McKeeth) does occur. They're one of the best datasets you can get (bought or free).