How to do Asynchronous thread execution in Java? - multithreading

I am working on a coding exercise in which I have to create a logging framework in Java.
The logging steps resemble Log4J: (1) a message is processed by a processor; (2) the processed message is appended to an I/O stream, a file, or a server by an appender.
The requirement is that the class handle calls from multiple threads at the same time (asynchronously). The class has two methods: log(), used for logging, and shutdown(), used for termination.
I have message_processor and appender objects, but the messages must be appended in FIFO order.
When log() is called, message_processor processes the message, which is very time-consuming, and then the processed message is appended and logged.
My rough idea is to create a static data structure that holds each thread together with its key, and to add an entry whenever log() is called.
After adding the entry, I will start the thread and keep checking whether it has completed.
When a thread has completed and has no unfinished predecessors, I will append its message; otherwise I will wait for the predecessors to complete. I also have to maintain a flag recording whether each thread has completed.
log() will spawn threads for message processing, wait for them to complete their tasks, and then append the messages in order.
The data structure will be read and written by multiple threads at the same time.
My question is which data structure I should use to implement this in Java. For the flag, should I create a class that implements the Runnable interface, or is there a data structure that already provides this kind of functionality?
Is this an efficient way to implement asynchronous threading?
Any suggestions are welcome.

I think you should use the concurrent data structures that the Java platform already provides (for example, those in java.util.concurrent) instead of writing your own implementation.
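As a rough illustration of that advice, the sketch below combines two executors from java.util.concurrent: a worker pool that runs the slow processing in parallel, and a single-threaded executor that waits on each result in submission order before appending. The class name OrderedAsyncLogger, its constructor parameters, and the use of Function and Consumer to stand in for message_processor and the appender are assumptions made for illustration, not part of the question.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical class: processes messages in parallel, appends them in FIFO order.
class OrderedAsyncLogger {
    private final ExecutorService processors;
    // One thread only, so append tasks run strictly in the order they were queued.
    private final ExecutorService appenderThread = Executors.newSingleThreadExecutor();
    private final Function<String, String> processor; // stands in for message_processor
    private final Consumer<String> appender;          // stands in for the appender

    OrderedAsyncLogger(int threads, Function<String, String> processor, Consumer<String> appender) {
        this.processors = Executors.newFixedThreadPool(threads);
        this.processor = processor;
        this.appender = appender;
    }

    // synchronized so that both submissions for one message happen atomically,
    // which keeps the append order equal to the order of log() calls.
    public synchronized void log(String message) {
        Callable<String> work = () -> processor.apply(message);
        Future<String> processed = processors.submit(work);
        appenderThread.execute(() -> {
            try {
                appender.accept(processed.get()); // blocks until this message is processed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (ExecutionException e) {
                appender.accept("processing failed: " + e.getCause());
            }
        });
    }

    // Stops accepting new messages and waits (up to a minute) for pending appends.
    public synchronized void shutdown() {
        processors.shutdown();
        appenderThread.shutdown();
        try {
            appenderThread.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The Future plays the role of the "completed" flag from the question, and the single-threaded executor's internal queue plays the role of the shared data structure, so no hand-written polling is needed. Calling log() after shutdown() would throw RejectedExecutionException in this sketch.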

Related

Condition variable usage pattern in C/C++ and other languages

If you look at documentation describing the usage of condition variables (cv), you'll see that e.g. in PThreads and C++ you don't need to hold the mutex of a cv to call notify on this cv. Whereas e.g. in Java and Python, you must lock the mutex to do the same thing.
Is there some deep reason why things are implemented this way (I am asking about the latter case), given that an implementation of a language like Java eventually uses some native threading tools?
The basic Java synchronization tools notify() and notifyAll() both require you to synchronize on the object before calling them. This is a simple safety measure, since you must also synchronize on the object before waiting.
For example, suppose you have two threads: one reads data from a buffer and the other writes data into the buffer.
The reading thread needs to wait until the writing thread has finished writing a block of data into the buffer, and then it can read the block.
If wait(), notify(), and notifyAll() could be called without synchronization, you could get a race condition where:
The reading thread checks the buffer, finds it empty, and is about to call wait().
At the same time, the writing thread adds data and calls notify() to signal that it has done so.
The reading thread then enters wait() and waits forever, because the notify() was processed before the wait() was.
Forcing the check-and-wait and the notify to happen within synchronized blocks on the same object removes this race condition.
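The rule can be seen in a minimal sketch of the reader/writer buffer described above. The class name OneSlotBuffer and its put/take methods are hypothetical; the point is that the emptiness check and the wait() happen under the same monitor that the writer must hold to call notifyAll(), so a notification cannot slip in between them.

```java
// Hypothetical one-element buffer guarded by its own monitor.
class OneSlotBuffer<T> {
    private T item; // null means the buffer is empty

    // Writer side: waits until the slot is free, stores the value, wakes waiters.
    public synchronized void put(T value) throws InterruptedException {
        while (item != null) {
            wait(); // releases the monitor while waiting, reacquires it on wake-up
        }
        item = value;
        notifyAll();
    }

    // Reader side: waits until a value is present, removes it, wakes waiters.
    public synchronized T take() throws InterruptedException {
        while (item == null) {
            wait(); // the loop also guards against spurious wake-ups
        }
        T value = item;
        item = null;
        notifyAll();
        return value;
    }
}
```

Calling wait() or notifyAll() outside a synchronized method or block on the same object throws IllegalMonitorStateException, which is how Java enforces the requirement at run time.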

A log4j logger with multiple buffers

I want to create a logger that will handle messages from multiple threads. The threads will be executed by an ExecutorService and will stay alive for a few minutes. Each of them performs an activity that is completely independent of the other threads. When I read the log, I want to see each thread's messages grouped together in a consistent state, but I also want all of them in a single file. So I want to use only one logger instance (as I will log into a single file), but each thread will pass its own buffer to this logger. When a thread is about to finish execution, it should flush the buffer, so that when I read the log, the messages originating from this thread are not interspersed with other threads' messages.
How can I achieve this with log4j? I searched the docs, but either I can't specify my requirements well or this kind of feature is not supported.

How can akka actor interact between threads

I've read the Akka documentation and still don't have a clear understanding of how threads interact when using Akka. The docs may omit this as obvious, but it is not obvious to me.
All Akka actors seem to run in the same thread that calls them. I see actors as coroutines whose stack is reset each time receive is called.
You may chain a long sequence of actor switches in a straight line: each receive performs a small non-blocking operation and triggers another receive to continue the work. There is no event loop that can handle messages from outside the actor system.
I'd like to catch a request from another thread, perform control operations, and wait for the next message.
Some use cases outline my needs:
A thread constantly polls data from some sources. Once the data matches a pattern, it invokes an event-driven handler based on actors. A logical controller makes a decision and passes it to workers. There should be two persistent threads: one works constantly on polling, and the other works asynchronously to control it. Actors should not run on the first thread, since they would disrupt the polling periods, and the first thread should not block the actors, so they need another thread.
A two-sided board game. One side has a controller thread that schedules calculation time, interacts with the board server, and so on. The other thread is a heavy calculation thread that loops over different variants and cannot be written in Akka because it is blocking by nature.
I am aware of Akka futures, but they represent a task that runs once when fired and shuts down after reaching its goal. Futures combine well with Akka actors but cannot express looping worker threads.
The Akka actor system incorporates different kinds of network event loops. You may use its built-in remote actor system or the well-known 0mq protocol. But using the network for thread interaction seems like overkill to me.
What is the intended way to glue a non-Akka thread to an Akka one? Should I write a couple of special procedures to pass messages in a thread-safe way?
If you need polling, then the polling thread should just turn whatever is polled into a message and fire it off to an actor.
I find it more useful to use an Actor with a receiveTimeout to do non-blocking polling at an interval; when something is polled, the actor publishes it to some other actor, or perhaps even to its ActorSystem's EventStream, for true pub-sub action.

DB-connection in separate thread - what's the best way?

I am creating an app that accesses a database. On every database access, the app waits for the job to be finished.
To keep the UI responsive, I want to put all the database stuff in a separate thread.
Here is my idea:
The db-thread creates all database components it needs when it is created
Now the thread just sits there and waits for a command
If it receives a command, it performs the action and goes back to idle. During that time the main thread waits.
The db-thread lives as long as the app is running
Does this sound ok?
What's the best way to get the database results from the db-thread into the main thread?
I haven't done much with threads so far, therefore I'm wondering if the db-thread can create a query component out of which the main thread reads the results. Main thread and db thread will never access the query at the same time. Will this still cause problems?
What you are looking for is a standard data access technique called asynchronous query execution. Some data access components implement this feature in an easy-to-use manner; at least dbGo (ADO) and AnyDAC do. Let's consider dbGo.
The idea is simple: you call the usual dataset methods, such as Open. The method launches the required task in a background thread and returns immediately. When the task is completed, an appropriate event is fired, notifying the application that the task has finished.
The standard approach with DB GUI applications and the Open method is the following (draft):
include eoAsyncExecute, eoAsyncFetch, eoAsyncFetchNonBlock in the dataset's ExecuteOptions;
disconnect TDataSource.DataSet from the dataset;
set the dataset's OnFetchComplete to a procedure P;
show a "Hello! We are doing the hard work to process your request. Please wait..." dialog;
call the dataset's Open method;
when the query execution has finished, OnFetchComplete is called, and so is P. P hides the "Wait" dialog and connects TDataSource.DataSet back to the dataset.
Your "Wait" dialog may also have a Cancel button, which a user may use to cancel a query that runs too long.
First of all, if you don't have much experience with multi-threading, don't start with the VCL classes. Use the OmniThreadLibrary (OTL), for these reasons among others:
Your level of abstraction is the task, not the thread, a much better way of dealing with concurrency.
You can easily switch between executing tasks in their own thread and scheduling them with a thread pool.
All the low-level details like thread shutdown, bidirectional communication and much more are taken care of for you. You can concentrate on the database stuff.
The db-thread creates all database components it needs when it is created
This may not be the best way. I have generally created components only when needed, but not destroyed immediately. You should definitely keep the connection open in a thread pool thread, and close it only once the thread has been inactive for some time and the pool disposes of it. But it is also often a good idea to keep a cache of transaction and statement objects.
If it receives a command, it performs the action and goes back to idle. During that time the main thread waits.
The first part is handled fine when OTL is used. However, don't have the main thread wait; this brings little advantage over performing the database access directly in the VCL thread in the first place. You need an asynchronous design to make the best use of multiple threads. Consider a standard database browser form that has controls for filtering records. I handle this by (re-)starting a timer every time one of the controls changes. Once the user finishes editing, the timer event fires (say after 500 ms), and a task is started that executes the statement that fetches data according to the filter criteria. The grid contents are cleared, and the grid is repopulated only when the task has finished. This may take some time, though, so the VCL thread doesn't wait for the task to complete. Instead, the user could even change the filter criteria again, in which case the current task is cancelled and a new one started. OTL gives you an event for task completion, so the asynchronous design is easy to achieve.
What's the best way to get the database results from the db-thread into the main thread?
I generally don't use data aware components for multi-threaded db apps, but use standard controls that are views for business objects. In the database tasks I create these objects, put them in lists, and the task completion event transfers the list to the VCL thread.
Main thread and db thread will never access the query at the same time.
With all components that load data on-demand you can't be sure of that. Often only the first records are fetched from the db, and fetching continues after they have been consumed. Such components obviously must not be shared by threads.
I have implemented both strategies: a thread pool and ad hoc thread creation.
I suggest beginning with ad hoc thread creation; it is simpler to implement and simpler to scale.
Only move to a thread pool if, after careful evaluation, (1) a lot of resources (and time) are invested in the creation of each thread and (2) you have a lot of creation requests.
In both cases you must deal with passing parameters and collecting results. I suggest extending the thread class with properties that allow this data passing.
Refer to the documentation of the classes, components, and functions that the thread uses to make sure they are thread safe, that is, that they can be used simultaneously from different threads. If not, you will need to synchronize the access. In some cases you may find slight differences regarding thread safety. As an example, see DateTimeToStr.
If you create your thread at startup and reuse it later whenever you need it, you have to make sure that you disconnect the db components (grid, etc.) from the underlying datasource (DisableControls) each time you're "processing" data.
For the sake of simplicity, I would inherit from TThread and implement all the business logic in my own class. The result dataset would be a member of this class, and I would connect it to the db-aware components within Synchronize.
Anyway, it is also very important to delegate as much work as possible to the db server and keep the UI as lightweight as possible. Firebird is my favourite db server: triggers, for select, custom UDF dlls developed in Delphi, many thread safe db components with lots of examples and good support (forum) : jvUIB...
Good Luck

How to use queue with two threads-- one for consumer and one for producer

I am using an application where a lower level application always invokes a callback RecData(char *buf) when it receives data.
In the callback I create two threads and pass the producer and consumer functions to them, respectively.
My code:
void RecData(char *buf)
{
    /* A new producer thread and a new consumer thread are started on every callback. */
    CreateThread(NULL, 0, producer_queue, (void *)buf, 0, NULL);
    CreateThread(NULL, 0, consumer_queue, NULL, 0, NULL);
}
The above works when I receive one data item at a time. If I receive, say, five items almost at the same time, then producer_queue should first put all of them in the queue and only then should consumer_queue start retrieving them. Instead, as soon as producer_queue puts the first item in the queue, consumer_queue retrieves it.
What you want to do, I believe, is control access to the queue. You'll want to look at using a mutex to control reading from the queue.
When you receive data, lock the mutex, then enqueue the data. When you are done queuing the data, release the lock.
When reading from the queue, first try to acquire the mutex. If the producer is writing data to the queue, you won't be able to start reading until it has finished writing all of its data and released the lock. Once the reader holds the mutex, the writer thread cannot write while you are reading data.
This approach can introduce deadlocks. If your writer thread dies before releasing the lock, your reader thread will not be able to continue (then again, your thread dying may just trigger an error state).
I hope this makes sense.
Use the concept of condition variables. The problem you have is the most common one in the multi-threaded programming world. Just using mutexes doesn't help the situation. Always remember that mutexes are for locking and condition variables are for waiting. The latter is the safer and more reliable way to signal when a thread should start consuming from a shared queue.
Check out the link below on how you can create a condition variable on your own on Windows:
http://www.cs.wustl.edu/~schmidt/win32-cv-1.html
If you are using Windows Vista or later, the MSDN example below may help you:
http://msdn.microsoft.com/en-us/library/ms686903(VS.85).aspx
In all cases, use the logic shown on Schmidt's website, as it looks more portable (oh yes, portable across different versions of Windows at least). Schmidt's implementation gives you the feel of the standard POSIX API, which is the widely used standard on most modern UNIX/Linux systems.
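To make the "mutexes are for locking, condition variables are for waiting" distinction concrete, here is a minimal sketch of a queue with one lock and one condition. It is written in Java (the language of the first question on this page) rather than against the Win32 API, and the class name SignalledQueue is hypothetical; the same shape maps onto POSIX or Windows condition variables.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical unbounded queue: the lock protects the data, the condition
// lets the consumer sleep until there is something to consume.
class SignalledQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Queue<T> items = new ArrayDeque<>();

    // Producer side: enqueue under the lock, then signal one waiting consumer.
    public void put(T item) {
        lock.lock();
        try {
            items.add(item);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    // Consumer side: sleep on the condition while the queue is empty.
    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await(); // releases the lock while waiting
            }
            return items.remove();
        } finally {
            lock.unlock();
        }
    }
}
```

With a queue like this, a single long-lived consumer thread can be started once and loop on take(), and the callback only needs to call put(); there is no need to start a new consumer thread for every piece of data.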
