HI
I have a control that accesses a database using proprietary datasets. The database is an old ISAM bases database.
The control uses a background thread to query the database using the proprietary datasets.
A form will have several of these controls on it, each using their own thread to access the data as they all need to load simultaneously.
The proprietary datasets handle concurrency by displaying a VCL TForm notifying the user that the table being opened is locked by another user and that the dataset is waiting for the lock to be released.
The form has a cancel button on it which lets the user cancel the lock wait.
The problem:
When using the proprietary datasets from within a thread, the application will crash, hang or give some error if the lock wait form it displayed. I suspect this is to do with the VCL not being thread safe.
I have solved the issue by synchronizing Dataset.Open however this holds up the main thread until the dataset.open returns, which can take a considerable amount of time depending on the complexity of the query.
I have displayed a modal progress bar which lets to user know that something it happening but I don't like this idea as the user will be sitting waiting for the progress bar to complete.
The proprietary dataset code is compiled into the main application, i.e. its not stored in a separate DLL. We are not allowed to change how the locking works or whether a form is displayed or not at this stage of the development process as we are too close to release.
Ideally I would like to have Dataset.open run in the controls thread as well instead of having the use the main thread, however this doesn't seem likely to work.
Can anyone else suggest a work around? please.
Fibers won't help you one bit, because they are in the Windows API solely to help ease porting old code that was written with cooperative multitasking in mind. Fibers are basically a form of co-routines, they all execute in the same process, have their own stack space, and the switching between them is controlled by the user code, not by the OS. That means that the switching between them can be made to occur only at times that are safe, so no synchronization issues. OTOH that means that only one fiber can be running within one thread at the same time, so using fibers with blocking code has the same characteristics as calling blocking code from within one thread - the application becomes unresponsive.
You could use fibers together with multiple threads, but that can be dangerous and doesn't bring any benefit over using threads alone.
I have used fibers successfully within VCL applications, but only for specific purposes. Forget about them if you want to deal with potentially blocking code.
As for your problem - you should make a control that is used for display purposes only, and which uses the standard inter-process communication mechanisms to exchange data with another process that accesses your database.
COM objects can run in out-of-process mode. May be in delphi it will be a bit easier to use them, then another IPC mechanisms.
Related
Nodejs can not have a built-in thread API like java and .net
do. If threads are added, the nature of the language itself will
change. It’s not possible to add threads as a new set of available
classes or functions.
Nodejs 10.x added worker threads as an experiment and now stable since 12.x. I have gone through the few blogs but did not understand much maybe due to lack of knowledge. How are they different than the threads.
Worker threads in Javascript are somewhat analogous to WebWorkers in the browser. They do not share direct access to any variables with the main thread or with each other and the only way they communicate with the main thread is via messaging. This messaging is synchronized through the event loop. This avoids all the classic race conditions that multiple threads have trying to access the same variables because two separate threads can't access the same variables in node.js. Each thread has its own set of variables and the only way to influence another thread's variables is to send it a message and ask it to modify its own variables. Since that message is synchronized through that thread's event queue, there's no risk of classic race conditions in accessing variables.
Java threads, on the other hand, are similar to C++ or native threads in that they share access to the same variables and the threads are freely timesliced so right in the middle of functionA running in threadA, execution could be interrupted and functionB running in threadB could run. Since both can freely access the same variables, there are all sorts of race conditions possible unless one manually uses thread synchronization tools (such as mutexes) to coordinate and protect all access to shared variables. This type of programming is often the source of very hard to find and next-to-impossible to reliably reproduce concurrency bugs. While powerful and useful for some system-level things or more real-time-ish code, it's very easy for anyone but a very senior and experienced developer to make costly concurrency mistakes. And, it's very hard to devise a test that will tell you if it's really stable under all types of load or not.
node.js attempts to avoid the classic concurrency bugs by separating the threads into their own variable space and forcing all communication between them to be synchronized via the event queue. This means that threadA/functionA is never arbitrarily interrupted and some other code in your process changes some shared variables it was accessing while it wasn't looking.
node.js also has a backstop that it can run a child_process that can be written in any language and can use native threads if needed or one can actually hook native code and real system level threads right into node.js using the add-on SDK (and it communicates with node.js Javascript through the SDK interface). And, in fact, a number of node.js built-in libraries do exactly this to surface functionality that requires that level of access to the nodejs environment. For example, the implementation of file access uses a pool of native threads to carry out file operations.
So, with all that said, there are still some types of race conditions that can occur and this has to do with access to outside resources. For example if two threads or processes are both trying to do their own thing and write to the same file, they can clearly conflict with each other and create problems.
So, using Workers in node.js still has to be aware of concurrency issues when accessing outside resources. node.js protects the local variable environment for each Worker, but can't do anything about contention among outside resources. In that regard, node.js Workers have the same issues as Java threads and the programmer has to code for that (exclusive file access, file locks, separate files for each Worker, using a database to manage the concurrency for storage, etc...).
It comes under the node js architecture. whenever a req reaches the node it is passed on to "EVENT QUE" then to "Event Loop" . Here the event-loop checks whether the request is 'blocking io or non-blocking io'. (blocking io - the operations which takes time to complete eg:fetching a data from someother place ) . Then Event-loop passes the blocking io to THREAD POOL. Thread pool is a collection of WORKER THREADS. This blocking io gets attached to one of the worker-threads and it begins to perform its operation(eg: fetching data from database) after the completion it is send back to event loop and later to Execution.
I have a question, Node.js uses libuv inside of u core, to manage its event loop and by default works whit 4 threads and process queue whit limit of 1024 process.
Process queue limit
Threads by default
So, because most programmers say it's single thread?
By default, node.js only uses ONE thread to run your Javascript. Thus your Javascript runs as single threaded. No two pieces of your Javascript are ever running at the same time. This is a critical design element in Javascript and is why it does not generally have concurrency problems with access to shared variables.
The event driven system works by doing this:
Fetch event from event queue.
Run the Javascript callback associated with the event.
Run that Javascript until it returns control back to the system.
Fetch the next event from the event queue and go back to step 2.
If no event in the event queue, go to sleep until an event is added to the queue, then go to step 1.
In this way, you can see that a given piece of Javascript runs until it returns control back to the system and then, and only then, can another piece of Javascript run. That's where the notion of "single threaded" comes from. One piece of Javascript running at a time. It vastly simplifies concurrency issues and, when combined with the non-blocking I/O model, it makes a very efficient system, even when lots of operations are "in flight" (though only one is actually running at a time).
Yes, node.js has some threads inside of libuv that are used for things like implementing file system access. But those are only for native code inside the library and do NOT make your Javascript multi-threaded in any way.
Now, recent versions of node.js do have Worker Threads which allow you to actually run multiple threads of Javascript, but each thread is a very separate environment and you must communicate with other threads via messages without the direct sharing of variables. This is relatively new to nodejs version 10.5 (though it's similar in concept to WebWorkers in the browser. These Worker Threads are not used at all unless you specifically engage them with custom programming designed to take advantage of them and live within their specific rules of operation.
I have a scenario where i am spawning multiple threads (each of them working on a browser instance). But all threads share 1 common operation which i need to synchronize. I was thinking of creating locks using file created on disk, but am afraid will that really work ?
Basically i need some locking mechanism on windows platform. I am spawning multiple threads, each thread is an automated script (javascript/batch) working over browser, but i need to synchronise any operation which triggers opening of save as/upload dialog in browser.
Regards
Depends strongly on the language, platform, OS, etc you want to develop on.
For POSIX you could achieve that e.g. with mutexes:
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html
If you develop with C++17:
http://en.cppreference.com/w/cpp/thread/mutex
Regarding your question: you could also do that with file locks. In that case you would have to wait for the file to be locked again, which can be achieved with select()
Why does being thread safe matter in a web app? Pylons (Python web framework) uses a global application variable which is not thread safe. Does this matter? Is it only a problem if I intend on using multi-threading? Or, does it mean that one user might not have updated state if another user... I'm just confusing myself. What's so important with this?
Threading errors can lead to serious and subtle problems.
Say your system has 10 members. One more user signs up to your system and the application adds him to the roster and increments the count of members; "simultaneously", another user quits and the application removes him from the roster and decrements the count of members.
If you don't handling threading properly, your member count (which should be 10) could easily be nine, 10, or 11, and you'll never be able to reproduce the bug.
So be careful.
You should care about thread safety. E.g in java you write a servlet that provides some functionality. The container will deploy an instance of your servlet, and as HTTP requests arrive from clients, over different TCP connections, each request is handled by a separate thread which in turn will call your servlet. As a result, you will have your servlet being call from multiple threads. So if it is not thread-safe, then erroneous result will be returned to the user, due to data corruption of access to shared data by threads.
It really depends on the application framework (which I know nothing about in this case) and how the web server handles it. Obviously, any good webserver is going to be responding to multiple requests simultaneously, so it will be operating with multiple threads. That web server may dispatch to a single instance of your application code for all of these requests, or it may spawn multiple instances of your web application and never use a given instance concurrently.
Even if the app server does use separate instances, your application will probably have some shared state--say, a database with a list of users. In that case, you need to make sure that state can be accessed safely from multiple threads/instances of your web app.
Then, of course, there is the case where you use threading explicitly in your application. In that case, the answer is obvious.
Your Web Application is almost always multithreading. Even though you might not use threads explicitly. So, to answer your questions: it's very important.
How can this happen? Usually, Apache (or IIS) will serve several request simultaneously, calling multiple times from multiple threads your python programs. So you need to consider that your programs run in multiple threads concurrently and act accordingly.
(This was too long to add a comment to the other fine answers.)
Concurrency problems (read: multiple access to shared state) is a super-set of threading problems. The (concurrency problems) can easily exist at an "above thread" level such as a process/server level (the global variable in the case you mention above is process-unique value, which in turn can lead to an inconsistent view/state if there are multiple processes).
Care must be taken to analyze the data consistency requirements and then implement the software to fulfill those requirements. I would always err on the side of safe, and only degrade in carefully analyzed areas where it is acceptable.
However, note that CPython runs only one thread context for Python code execution (to get true concurrent threads you need to write/use C extensions), so, while you can get a form of race condition upon expected data, you won't get (all) the same kind of partial-write scenarios and such that may plague C/C++ programs. But, once again. Err on the side of a consistent view.
There are a number of various existing methods of making access to a global atomic -- across threads or processes. Use them.
Threads make the design, implementation and debugging of a program significantly more difficult.
Yet many people seem to think that every task in a program that can be threaded should be threaded, even on a single core system.
I can understand threading something like an MPEG2 decoder that's going to run on a multicore cpu ( which I've done ), but what can justify the significant development costs threading entails when you're talking about a single core system or even a multicore system if your task doesn't gain significant performance from a parallel implementation?
Or more succinctly, what kinds of non-performance related problems justify threading?
Edit
Well I just ran across one instance that's not CPU limited but threads make a big difference:
TCP, HTTP and the Multi-Threading Sweet Spot
Multiple threads are pretty useful when trying to max out your bandwidth to another peer over a high latency network connection. Non-blocking I/O would use significantly less local CPU resources, but would be much more difficult to design and implement.
Performing a CPU intensive task without blocking the user interface, for example.
Any application in which you may be waiting around for a resource (for example, blocking I/O from network sockets or disk devices) can benefit from threading.
In that case the thread blocking on the slow operation can be put to sleep while other threads continue to run (including, under some operating systems, the GUI thread which, if the OS cannot contact it for a while, will offer the use the chance to destroy it, thinking it's deadlocked somehow).
So it's not just for multi-core machines at all.
An interesting example is a webserver - you need to be able to handle multiple incoming connections that have nothing to do with each other.
what kinds of non-performance related
problems justify threading?
Web applications are the classic example. Each user request is conceptually a new thread. Nothing to do with performance, it's just a natural fit for the design.
Blocking code is usually much simpler to write and easier to read (and therefore maintain) than non-blocking code. Yet, using blocking code limits you to a single execution path and also locks out things like user interface (mentioned) and other IO ports. Threading is an elegant solution in these cases.
Another case when multithreading is to be considered is when you have several near-synchronous IO channels that should be managed: using multiple threads (and usually a local message queue) allows for much clearer code.
Here are a couple of specific and simple scenarios where I have launched threads...
A long running report request by the user. When the report is submitted, it is placed in a queue to be processed by a separate thread. The user can then go on within the application and check back later to see the status of their report, they aren't left with a "Processing..." page or icon.
A thread that iterates cache storage, removing data that has expired or no longer needed. The thread's job within the application is independent of any processing for a specific user, but part of the overall application run-time maintenance.
although, not specifically a threading scenario, logging within our web site is handed off to a parallel process, so the throughput of the web site isn't hindered by the time it takes to record log data.
I agree that threading just for threadings sake isn't a good idea and it can introduce problems within your application if isn't done properly, but it is an extremely useful tool for solving some problems.
Whenever you need to call some external component (be it a database query, a 3. party library, an operating system primitive etc.) that only provides a synchronous/blocking interface or using the asynchronous interface not worth the extra trouble and pain - and you also need some form of concurrency - e.g. serving multiple clients in a server or keep the GUI still responsive.
Well, how do you know if you're app is going to run on a multi-core system or not?
Beyond that, there are a lot of processes that take up time, but don't require the CPU. Such as writing to a disk or networking. Who wants to push a button in a GUI and then have to sit there and wait for a network connection. Even on a single core machine, having a separate IO thread greatly improves user experience. You always at least want a separate thread for the UI.
Yet many people seem to think that
every task in a program that can be
threaded should be threaded, even on a
single core system.
"Many people"... Who?
Also from my experience many many programs that should be multithreaded aren't (especially games.. I have an i7 and yet most games still use only 1 of my cores), so I'm not sure what you're talking about. Definitely programs like calc.exe are not multithread (or, if they are, 1 thread does 99% of the work).
Performing a CPU intensive task
without blocking the user interface,
for example.
Yes, this is true but this is fairly easy to implement and it's not what the OP is referring to (since, in this case, 1 thread does almost all the work and you only need very few mutexes)