Hibernate in a multithreaded environment

I have a web service that uses Hibernate as its DAL, on MySQL with InnoDB.
Since I want to keep web service calls really short (for a better user experience on the client side), I am using 2 threads with a message queue to do some of the work.
The first thread receives a userId in the message, loads the user from the DB, gets the user's email address and sends an email to it.
The second thread is used like this:
Web service call... doing some actions.
Adding an ActivityLog to the DB.
Doing session.save(log); then commit().
Now we send a message to the thread with the logId.
Message received: insert new entries into the timeline table (userId, logId). The session is a different session object from the one used by the main logic.
Should I expect problems with this? With lazy loading? With the threads, given that the message is sent to the thread after commit()?

A web service -> message queue architecture is pretty standard when you don't need a synchronous reply from the web service.
In the web service, store whatever you want in the database, and the message queue consumer will pick it up later.
Using a different session is not a problem, but if you keep a single static Session for each of these threads, they may be subject to Session bloat: objects piling up in the Session cache.
More on this: http://suryagaddipati.wordpress.com/2008/02/15/hibernate-rich-clients-and-long-running-sessions/

Related

Spawn multiple threads from a single EJB @Asynchronous method

I'm building a Java EE application, and one of its requirements is to send messages to registered e-mail addresses (around 1000 to 2000). Access to the application is limited, and at any time there will be fewer than 2 users logged in.
For sending e-mails I'm using JavaMail, a @Stateless bean and an @Asynchronous method.
My problem is that it takes too long to send the 1000+ e-mails: around 1.2 seconds per e-mail on my development server. What should I do to reduce the time? Can I spawn multiple Stateless beans? Or, with such low user access, would creating around 10 to 15 threads not be too bad?
Your performance problem is probably due to creating a new connection to send each message, as described in the JavaMail FAQ. Your code needs to be able to cache and reuse the connection to the mail server. A better approach for sending the messages asynchronously might be to put the information necessary to construct the message in a JMS message and then use a (pool of) MDB to process the information, turn it into a mail message, and send it, while caching and reusing the Transport object that represents the connection to the server.
You need to configure the async thread pool size in your container; the default size is usually 15 parallel threads. There isn't one thread per bean instance, but if the pool fills up, calls are queued and at most 15 send at a time.

How/Where to store temporary data without using claim check pattern?

I have a use case that requires our application to send a notification to an external system when a particular event occurs. The notification is delivered by putting a message into a JMS queue.
The transactional requirements are not that strict. Hence, instead of using JTA for such a trivial use case, I decided to use a JMS local transaction, as Spring understands how to synchronize a JMS local transaction with any managed transaction (e.g. a database transaction) to elevate 1PC.
The problem I am facing is that the notification has to be enriched with some data before it is sent. This extra information has no relevance to my business domain, which is responsible for generating the event. So I am not sure where to temporarily store that extra data so that I can reclaim it before sending the notification. The illustration below may help in understanding the problem.
HTTP Request ---> Rest API ---> Application Domain ---> Event Generation ---> Notification
As the illustration shows, the extra data is part of the REST API request payload, and I do not want to pass it through my domain layer and pollute it just to send the notification.
One solution I thought of is to use a thread-scoped queue channel to reclaim the data before sending the notification. The REST API would initiate the process by putting the extra data into the queue, and before sending the notification I would pull it from the queue to enrich the notification message.
The part I am unable to achieve in this solution is how to pull the message from the queue when I receive the event somewhere in the application (between the event generation and notification phases).
If my approach does not make any sense, then please suggest a solution that does not use the claim check pattern.
Why not simply store the information in a header (or headers)? The domain layer doesn't need to know it's there.
Or, for your solution, create a new QueueChannel for each request, and store a reference to it in a header and receive() from it on the back end, but it's easier to just use a header directly.

WCF OperationContext not thread safe?

I'm code reviewing a WCF service.
In the header of each message we inject data that the service is going to use later to build a connection string to a DB.
That's because the service is going to be used by a number of different sites, each with its own DB that the service has to query.
We use WCF extensibility. We have a custom MessageInspector that, after receiving the request, extracts the data from the message header, creates a context (that implements IExtension) and adds it to OperationContext.Current.Extensions.
Before sending the reply, the custom context is removed from the Extensions collection.
This is a fairly common pattern, as discussed here:
Where to store data for current WCF call? Is ThreadStatic safe?
and here:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/319cac66-66e8-4dfe-9a82-dfd289c9df1f/wcf-doesnt-have-session-storage-so-where-should-one-store-call-specific-data?forum=wcf
This all works fine as long as the service receives a request, processes it, sends the reply and receives the next request.
But what if the service receives a request and, before it can reply, gets a second request? I built a small console application to test it. I send 2 messages from 2 different threads and make the WCF service wait for 2 seconds, to ensure the second request comes in before the first one is completed. This is what I get:
Site Id : test1450 ; Session: uuid:2caf47cf-7d46-4d72-9275-d9c037fa0e70;id=2 : Thread Id: 6
Site Id : test1450 ; Session: uuid:2caf47cf-7d46-4d72-9275-d9c037fa0e70;id=3 : Thread Id: 22
It looks like wcf creates 2 sessions executing on 2 different threads, but Site Id is the same. It shouldn't. Judging from this it looks like OperationContext.Current.Extensions is a collection shared between threads.
Right now I'm inclined to think my test is wrong and I missed something.
Has anyone tried something similar and found out that OperationContext.Current is not thread safe?
OperationContext.Current, like similar properties such as HttpContext.Current, holds a thread-affine (or thread-static) value. So it is thread safe in the sense that multiple threads can read it, but different threads will get different instances. It can be thought of as a dictionary that maps specific threads to instances.
So in this context they are not thread safe.
Requests are served by a thread pool, so concurrent requests will have different thread ids (up to the point where the thread pool is full; after that, requests are put on hold).
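To make the "dictionary between threads and instances" idea concrete, here is a minimal sketch of thread-affine storage. It is not WCF code: it uses Java's ThreadLocal as an analogue of a thread-static value, and the names ThreadAffineDemo, SITE_ID, site-A and site-B are made up for illustration. Two overlapping "requests" each store a site id and read back only their own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadAffineDemo {
    // Hypothetical per-request value; each thread gets its own independent slot.
    static final ThreadLocal<String> SITE_ID = new ThreadLocal<>();

    // Simulates one request: store the site id, do some work, read it back.
    static void handle(String siteId, Map<String, String> seen) {
        SITE_ID.set(siteId);
        try {
            Thread.sleep(50); // keep both requests in flight at the same time
            seen.put(Thread.currentThread().getName(), SITE_ID.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            SITE_ID.remove(); // pooled threads are reused, so clear the slot
        }
    }

    // Runs two overlapping requests and returns what each thread read back.
    static Map<String, String> runTwoRequests() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread first = new Thread(() -> handle("site-A", seen), "request-1");
        Thread second = new Thread(() -> handle("site-B", seen), "request-2");
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runTwoRequests());
    }
}
```

If two concurrent requests report the same value, as in the test output in the question, that suggests the value is coming from somewhere shared (for example the test client or a static field) rather than from the per-thread slot itself.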

API with Work Queue Design Pattern

I am building an API that is connected to a work queue, and I'm having trouble with the structure. What I'm looking for is a design pattern for a worker queue that is accessed through an API.
Details:
I'm using a Node.js server and Express to create an API that takes a request and returns JSON. These requests can take a long time to process (they are very data intensive), which is why we use a queuing system (RabbitMQ).
For example, let's say I send a request to the API that will take 15 minutes to process. The Express API formats the request and puts it in a RabbitMQ (AMQP) queue. The next available worker takes the request off the queue and starts to process it. After it's done (in this case, 15 minutes later) it saves the data into MongoDB. .... now what .....
My issue is: how do I get the finished data back to the caller of the API? The caller is a completely separate program that contacts the API via something like an Ajax request.
The worker will save the processed data into a database, but I have no way to push it back to the original calling program.
Does anyone have any resources on APIs backed by a work queue?
Please and thank you.
On the initiating call, you should return to the client a task identifier that will persist with the data all the way to MongoDB.
You can then provide an additional API method for the client to check the task's status. This method should take a single parameter, the task identifier, and check whether a document with that identifier has made it into your collection in MongoDB. Return false if it doesn't exist yet, true when it does.
The client will have to poll the task status API method repeatedly (perhaps at a 1 minute interval) until it returns true.
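The flow above can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Java rather than Node.js, an in-memory map stands in for the MongoDB collection, a single-thread executor stands in for the RabbitMQ worker, and the names TaskStatusDemo, submit, isDone and pollUntilDone are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TaskStatusDemo {
    // Stand-in for the MongoDB collection: task id -> finished result.
    private final Map<String, String> finished = new ConcurrentHashMap<>();

    // Stand-in for the queue plus worker; a daemon thread so the JVM can exit.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "worker");
        t.setDaemon(true);
        return t;
    });

    // Initiating call: enqueue the work and return a task id right away.
    String submit(String payload) {
        String taskId = UUID.randomUUID().toString();
        worker.execute(() -> {
            // Pretend this is the long-running job; the result is stored under the id.
            finished.put(taskId, payload.toUpperCase());
        });
        return taskId;
    }

    // Status call: true once a result for this id has been stored.
    boolean isDone(String taskId) {
        return finished.containsKey(taskId);
    }

    // Client side: poll the status call at a fixed interval, then fetch the result.
    // Returns null if the task did not finish within maxAttempts polls.
    String pollUntilDone(String taskId, int maxAttempts, long intervalMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isDone(taskId)) {
                return finished.get(taskId);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }

    void shutdown() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        TaskStatusDemo api = new TaskStatusDemo();
        String taskId = api.submit("report");
        System.out.println(api.pollUntilDone(taskId, 20, 50));
        api.shutdown();
    }
}
```

In a real deployment the polling interval would be much longer (the answer suggests about a minute), and the status method would query the database instead of a map.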

Azure Storage Queue - correlate response to request

When a Web Role places a message onto a Storage Queue, how can it poll for a specific, correlated response? I would like the back-end Worker Role to place a message onto a response queue, with the intent being that the caller would pick the response up and go from there.
Our intent is to leverage the Queue in order to offload some heavy processing onto the back-end Worker Roles in order to ensure high performance on the Web Roles. However, we do not wish to respond to the HTTP requests until the back-end Workers are finished and have responded.
I am actually in the middle of making a similar decision. In my case I have a WCF service running in a web role which should offload calculations to worker roles. When the result has been computed, the web role will return the answer to the client.
My basic data structure knowledge tells me that I should avoid using something that is designed as a queue in a non-queue way. That means a queue should always be serviced in a FIFO-like manner. So if queues are used for both requests and responses, the threads waiting to return data to the client will have to wait until their calculation message is at the "top" of the response queue, which is not optimal. If the responses are stored in Azure tables instead, the threads have to poll for messages, creating unnecessary overhead.
What I believe is a possible solution to this problem is using a queue for the requests. This enables the competing consumers pattern and thereby load balancing. On messages sent into this queue you set the correlationId property. For the reply, the pub/sub ("topics") part of Azure Service Bus is used together with a correlation filter. When your back end has processed the request, it publishes a result to a "responseSubject" with the correlationId given in the original request. This response can then be retrieved by your client by calling CreateSubscription (sorry, I apparently can't post more than two links, so search for it) with that correlation filter, and it should get notified when the answer is published. Note that the CreateSubscription part should be done just once, in the OnStart method. Then you can do an async BeginReceive on that subscription, and the role will be notified in the given callback when a response for one of its requests is available. The correlationId will tell you which request the response is for. So your last challenge is giving this response back to the thread holding the client connection.
This could be achieved by creating a Dictionary with the correlationIds (probably GUIDs) as keys and the responses as values. When your web role gets a request, it creates the GUID, sets it as the correlationId, adds it to the dictionary, fires the message to the queue and then calls Monitor.Wait() on the GUID object. Then have the receive method invoked by the topic subscription add the response to the dictionary and call Monitor.Pulse() on that same GUID object. This awakens your original request thread, and you can now return the answer to your client. (Or something like that. Basically you just want your thread to sleep and not consume any resources while waiting.)
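The dictionary-plus-wait idea can be sketched as follows. This is an illustration, not Azure code: it is written in Java, an in-process BlockingQueue stands in for the request queue and the topic subscription, and a CompletableFuture per correlation id is swapped in for the Monitor-based wait and wake-up described above. All names (CorrelationDemo, call, onResponse, startWorker) are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CorrelationDemo {
    // Pending requests: correlation id -> future the request thread is blocked on.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Stand-in for the request queue; each message is {correlationId, body}.
    private final BlockingQueue<String[]> requestQueue = new LinkedBlockingQueue<>();

    // Web role side: register the id, send the request, block until the reply arrives.
    String call(String body, long timeoutMillis) {
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(correlationId, reply);
        requestQueue.add(new String[] {correlationId, body});
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            pending.remove(correlationId); // never leak entries, even on timeout
        }
    }

    // Subscription callback: route a response to the thread waiting on this id.
    void onResponse(String correlationId, String result) {
        CompletableFuture<String> reply = pending.get(correlationId);
        if (reply != null) {
            reply.complete(result); // wakes the blocked request thread
        }
    }

    // Stand-in for the worker role: take requests and publish correlated responses.
    void startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String[] message = requestQueue.take();
                    onResponse(message[0], "done:" + message[1]);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();
    }

    public static void main(String[] args) {
        CorrelationDemo webRole = new CorrelationDemo();
        webRole.startWorker();
        System.out.println(webRole.call("calc-1", 2000));
    }
}
```

The timeout and the cleanup in the finally block matter in practice: without them, a lost response would leave a request thread blocked and its entry in the map forever.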
The queues on the Azure Service Bus have a lot more capabilities and paradigms, including pub/sub, which can address the issues of servicing a queue across multiple instances.
One approach with pub/sub is to have one queue for requests and one for responses. Each requesting instance would also subscribe to the response queue with a filter on the header, so that it only receives the responses targeted at it. The request message would, of course, contain the value to be placed in the response header to drive the filter.
For the Service Bus based solution, there are samples available for implementing the Request/Response pattern with Queues and Topics (pub/sub).
Let the worker role keep polling and processing messages. As soon as a message is processed, add an entry to Table storage with the required correlationId (RowKey) and the processing result, before deleting the processed message from the queue.
Then the web roles just need to look up the Table with the desired correlationId (RowKey) and PartitionKey.
Have a look at using SignalR between the worker role and the browser client. So your web role puts a message on the queue and returns a result to the browser (something simple like 'waiting...') and hook it up to the worker role with SignalR. That way your web role carries on doing other stuff and doesn't have to wait for a result from asynchronous processing, only the browser needs to.
There is nothing intrinsic to Windows Azure queues that does what you are asking. However, you could build this yourself fairly easily. Include a message ID (GUID) in your push to the queue and when processing is complete, have the worker push a new message with that message ID into a response channel queue. Your web app can poll this queue to determine when processing is completed for a given command.
We have done something similar and are looking to use something like SignalR to help reply back to the client when commands are completed.

Resources