Load Spring context using multiple threads

I have a large application context consisting of many context files, using autowiring and package scans, starting up web services, establishing connections to databases and an external legacy system, etc. I have been thinking about how to improve context loading times, since startup takes a while without really using much CPU. Is there a way to tell the application context to initialize using multiple threads? In theory it should be possible, since the dependencies are already defined. I'd like the resources (DB, web services and legacy connections) to be initialized in parallel.

There's one option that comes to mind, though I'm not sure it will work, as I've never tried it (in my view, if an app takes too long to start up, that is a sign it should be broken down into smaller components, where each component is an app in its own right).
The solution that I think might work is to have a hierarchy of context files, so you can instantiate the parent application context and then instantiate each of the child contexts concurrently. The problem with this approach is that you cannot have dependencies between child contexts, although you can have indirect ones (e.g. the parent context has an event dispatcher, classes in one child context listen to events triggered from the parent context, and another child context triggers events on the parent context).
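To make that concrete, here is a minimal sketch of the parent/child approach, assuming XML config and made-up file names; each child context is started on its own thread and may only depend on beans in the parent:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class ParallelContextLoader {

    public static void main(String[] args) throws Exception {
        // Shared beans (e.g. the event dispatcher) live in the parent context.
        ApplicationContext parent =
                new ClassPathXmlApplicationContext("parent-context.xml");

        // Hypothetical child context files; each may depend on the parent,
        // but never on a sibling.
        String[] childConfigs = { "db-context.xml", "ws-context.xml", "legacy-context.xml" };

        ExecutorService pool = Executors.newFixedThreadPool(childConfigs.length);
        List<Future<ApplicationContext>> children = new ArrayList<>();
        for (String config : childConfigs) {
            Callable<ApplicationContext> task = () ->
                    new ClassPathXmlApplicationContext(new String[] { config }, parent);
            children.add(pool.submit(task));
        }

        // Block until every child context has started; startup failures surface here.
        for (Future<ApplicationContext> child : children) {
            child.get();
        }
        pool.shutdown();
    }
}
```

Whether this actually saves time depends on how much of your startup cost is I/O-bound (DB and web service handshakes) rather than CPU-bound bean creation.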

Related

Task vs Service for database operations

What is the difference between a JavaFX 8 Task and a Service, and in which cases is it better to use one over the other? Which is better to use for database operations?
Main Difference between Task and Service - One Time versus Repeated Execution
A Task is a one-off thing: you can only use a Task once. If you want to perform the same task again, you need to construct a new Task instance.
A Service has a reusable interface so that you can start and restart a single service instance multiple times. Behind the scenes, it just takes a Task definition as input and creates new tasks as needed.
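To illustrate the difference, here is a minimal sketch (the task body is just a placeholder): the Task is built fresh for every run, while the Service wraps the Task factory and can be started and restarted:

```java
import javafx.concurrent.Service;
import javafx.concurrent.Task;

public class TaskVsService {

    // A Task is single-use: construct it, run it once, then discard it.
    static Task<String> newFetchTask() {
        return new Task<String>() {
            @Override
            protected String call() throws Exception {
                return "some result"; // placeholder for the real background work
            }
        };
    }

    // A Service is reusable: it creates a fresh Task each time it is (re)started.
    static Service<String> newFetchService() {
        return new Service<String>() {
            @Override
            protected Task<String> createTask() {
                return newFetchTask();
            }
        };
    }

    // Usage, from the JavaFX Application thread:
    //   new Thread(newFetchTask()).start(); // a Task is run once and thrown away
    //   Service<String> service = newFetchService();
    //   service.start();                    // first run
    //   service.restart();                  // later runs reuse the same Service instance
}
```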
Example Use Cases
Task Example => monitoring and reporting progress of a long running startup task on application initialization, like this Splash Page example.
Service Example => The internal load worker implementation for WebEngine where the same task, loading a page asynchronously, needs to be repeated for each page loaded.
Recommendation - Initially try to solve your problem using only a Task and not a Service
Until you are more familiar with concurrency in JavaFX, I'd advise sticking to just using a Task rather than a Service. Tasks have a slightly simpler interface. You can accomplish most of what a Service does simply by creating new Task instances when you need them. If, after understanding Task, you find yourself wanting a predefined API for starting or restarting Tasks, then start using Service at that time.
Database Access Sample using Tasks
Either Task or Service will work for performing database operations off of the JavaFX application thread. Which to use depends on your personal coding preference as well as the particular database operation being performed.
Here is an example which uses a Task to access a database via JDBC. The example was created for JavaFX - Background Thread for SQL Query.
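That example is linked rather than reproduced here, but a rough sketch of the same idea looks like this (the JDBC URL, query and listView are placeholders, not taken from the linked answer):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

import javafx.concurrent.Task;

public class CustomerNamesTask extends Task<List<String>> {

    @Override
    protected List<String> call() throws Exception {
        List<String> names = new ArrayList<>();
        // The JDBC URL and query are placeholders for your own database.
        try (Connection con = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT name FROM customer")) {
            while (rs.next()) {
                names.add(rs.getString("name"));
            }
        }
        return names;
    }
}

// Usage: run the query off the JavaFX Application thread, update the UI on success.
//   CustomerNamesTask task = new CustomerNamesTask();
//   task.setOnSucceeded(e -> listView.getItems().setAll(task.getValue()));
//   new Thread(task).start();
```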
Background Information
The JavaFX concurrency tutorial provides a good overview of Task and Service.
There is excellent documentation in the Task and Service javadoc, including sample code for example use cases.
Worker, Task and Service definitions (from Javadoc)
Task and Service are both Workers, so they have this in common:
A Worker is an object which performs some work in one or more background threads, and whose state is observable and available to JavaFX applications and is usable from the main JavaFX Application thread.
Task definition:
A fully observable implementation of a FutureTask. Tasks expose additional state and observable properties useful for programming asynchronous tasks in JavaFX. ... Because Service is designed to execute a Task, any Tasks defined by the application or library code can easily be used with a Service.
Service definition:
A Service is a non-visual component encapsulating the information required to perform some work on one or more background threads. As part of the JavaFX UI library, the Service knows about the JavaFX Application thread and is designed to relieve the application developer from the burden of managing multithreaded code that interacts with the user interface. As such, all of the methods and state on the Service are intended to be invoked exclusively from the JavaFX Application thread.
Service implements Worker. As such, you can observe the state of the background operation and optionally cancel it. Service is a reusable Worker, meaning that it can be reset and restarted. Due to this, a Service can be constructed declaratively and restarted on demand.

One or more AppDomains created server-side when multiple clients call a WCF service?

The question is pretty much in the title, but I will elaborate.
I have a Silverlight application that acts as a slightly extended user interface.
The main part of my program will run on a server to keep the shared database coherent.
This is where my question comes in: Will two clients calling a WCF service each get a thread inside that service OR will they get a full AppDomain each?
The difference is that in the first case they can share the DB easily, but in the second scenario they cannot, as I understand it.
EDIT: This is because the DB makes use of the Identity Map pattern [Fowler], where objects in use are kept in memory (a static singleton variable); multiple AppDomains would mess that up.
(I asked my university teacher and searched quite a bit before asking this seemingly simple question.)
The threading model for WCF services is determined by the ConcurrencyMode you configure for your service: http://msdn.microsoft.com/en-us/library/system.servicemodel.concurrencymode.aspx.
Regarding AppDomains - that depends entirely on how you're hosting your service. If you're running a ServiceHost of your own, manually, there will always be exactly 1 AppDomain on the server side, unless you decide to start managing and spinning up your own.
If you're hosting inside IIS...it's up to IIS how it handles requests. It may reuse 1 AppDomain, it may spin up multiple AppDomains (unless you override the setting in the web.config to permit only 1 AppDomain per worker process), or it may spin up multiple physical worker processes (which inherently implies multiple AppDomains) if you have web garden mode enabled.
All this said, I'm not sure exactly why this would affect your data access strategy. Multiple threads or AppDomains should have no problem sharing a DB.

Synchronization on App Engine

I'm building an application using Google App Engine. The application consists of several servlets, one of which has a static member object that holds a lot of internal state. Multiple Android phones contact this servlet, causing the servlet to update the state of the static member object. However, when multiple phones happen to talk to the server at the same time, I get synchronization issues in which the static object is being modified by multiple threads at once. I've tried throwing in some synchronized blocks, but this does not seem to help.
I think the reason has something to do with how App Engine spawns threads for HTTP requests, but I'm not sure. What's the standard way to synchronize access to shared objects in App Engine? Thanks!
Your GAE app can be started on different servers; there can be, and usually will be, several instances running at one time. Any instance can be started or killed at any time, without prior notice. So any in-memory state is unreliable.
You have to use either the memcache service or the datastore instead.
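As a rough sketch of the memcache approach (the servlet and key name are made up; memcache entries can be evicted at any time, so anything that must survive belongs in the datastore):

```java
import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class SharedStateServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Shared across all instances, unlike a static field in this servlet.
        MemcacheService cache = MemcacheServiceFactory.getMemcacheService();

        // Atomic increment; the key name is made up, and the last argument is the
        // initial value used when the key does not exist yet.
        Long updates = cache.increment("state-update-count", 1L, 0L);

        resp.setContentType("text/plain");
        resp.getWriter().println("Updates so far: " + updates);
    }
}
```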

Run multiple WorkerRoles per instance

I have several WorkerRoles that only do work for a short time, and it would be a waste of money to put each of them in its own instance. We could merge them into a single one, but that would be a mess, and in the future they are supposed to work independently as the load increases.
Is there a way to create a "multi role" WorkerRole in the same way you can create a "multi site" WebRole?
If that's not possible, I think I can create a "master worker role" that is able to load the assemblies from a given folder, look for RoleEntryPoint-derived classes via reflection, create instances and invoke the .Run() or .OnStart() methods. This "master worker role" would also rethrow unexpected exceptions, and call .OnStop() on all sub RoleEntryPoints when .OnStop() is called on the master one. Would that work? What should I be aware of?
As mentioned by others, this is a very common technique for maximizing utilization of your instances. There are examples and "frameworks" that abstract the worker infrastructure from the actual work you want done, including one in this (our) sample: http://msdn.microsoft.com/en-us/library/ff966483.aspx (scroll down to "inside the implementation")
The most common ways of triggering work are:
1. Time-scheduled workers (like "cron" jobs)
2. Message-based workers (work triggered by the presence of a message)
The code sample mentioned above implements further abstractions for #2 and is easily extensible for #1.
Bear in mind, though, that all interactions with queues are based on polling. The worker will not be woken up when a new message arrives on the queue; you need to actively query the queue for new messages. Querying too often will make Microsoft happy, but probably not you :-). Each query counts as a transaction that is billed (10K of those = $0.01). A good practice is to poll the queue for messages with some kind of delayed back-off. Also, get messages in batches.
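The back-off idea itself is language-agnostic; here is a minimal sketch in Java, with a hypothetical QueueClient interface standing in for whatever queue SDK you actually use:

```java
import java.util.List;

public class BackoffQueuePoller {

    // Placeholder for your real queue client (Azure storage queue SDK, etc.).
    interface QueueClient {
        List<String> getMessages(int batchSize);
    }

    public void pollForever(QueueClient queue) throws InterruptedException {
        long delayMs = 1_000;           // start by polling once a second
        final long maxDelayMs = 60_000; // cap the back-off at one minute

        while (!Thread.currentThread().isInterrupted()) {
            List<String> batch = queue.getMessages(32); // fetch in batches
            if (batch.isEmpty()) {
                Thread.sleep(delayMs);
                delayMs = Math.min(delayMs * 2, maxDelayMs); // back off while idle
            } else {
                delayMs = 1_000; // reset once there is work again
                for (String message : batch) {
                    handle(message);
                }
            }
        }
    }

    private void handle(String message) {
        // process the message, then delete it from the queue
    }
}
```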
Finally, taking this to an extreme, you can also combine web roles and worker roles in a single instance. See here for an example: http://blog.smarx.com/posts/web-page-image-capture-in-windows-azure
Multiple worker roles provide a very clean implementation. However, the cost footprint for idle role instances is going to be much higher than for a single worker role.
Role-combining is a common pattern I've seen, working with ISV's on their Windows Azure deployments. You can have a background thread that wakes up every so often and runs a process. Another common implementation technique is to use an Azure Queue to send a message representing a process to execute. You can have multiple queues if you want, or a single command queue. In any case, you would have a queue listener running in a background thread, which would run in each instance. The first one to get the message processes it. You could take it further, and have a timed process pushing those messages onto the queue (maybe every 24 hours, or every hour).
Aside from CPU and memory limits, just remember that a single role can only have a maximum of 5 endpoints (fewer if you're using Remote Desktop).
EDIT: As of September 2011, role configuration has become much more flexible, now that you have 25 Input endpoints (accessible from the outside world) and 25 Internal endpoints (used for communication between roles) across an entire deployment. The MSDN article is here
I recently blogged about overloading a Web Role, which is somewhat related.
While there's no real issue with the solutions that have been pointed out for running multiple worker components within a single Worker Role, I just want you to keep in mind that the entire point of having distinct Worker Roles in the first place is isolation in the face of faults. If you shove everything into a single Worker Role instance, just one of those worker components behaving badly can take down every other worker component in that role. Suddenly you're writing a lot of infrastructure to provide isolation and fault tolerance across components, which is pretty much what Azure is there to provide for you.
Again, I'm not saying it's an absolute rule to strictly do one thing or the other. There are places where multiple components under a single Worker Role make sense (especially monetarily). I'm simply saying that you should keep in mind why it's designed this way in the first place and factor that in appropriately as you plan your architecture.
Why would a 'multi role' be a mess? You could write each worker role implementation as a loosely coupled component and then compose a Worker Role from all appropriate components.
When you later need to separate some of the responsibilities out to a separate worker role, you can compose a new worker role with only this component, while at the same time removing it from the old worker role.
If you wanted to, you could employ late binding so that this could even be done without recompilation, but often I don't think that would be worth the effort.

Thread Safe web apps - why does it matter?

Why does being thread safe matter in a web app? Pylons (a Python web framework) uses a global application variable which is not thread safe. Does this matter? Is it only a problem if I intend to use multi-threading? Or does it mean that one user might not have updated state if another user... I'm just confusing myself. What's so important about this?
Threading errors can lead to serious and subtle problems.
Say your system has 10 members. One more user signs up to your system and the application adds him to the roster and increments the count of members; "simultaneously", another user quits and the application removes him from the roster and decrements the count of members.
If you don't handle threading properly, your member count (which should be 10) could easily be 9, 10, or 11, and you'll never be able to reproduce the bug.
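In code, the lost update looks roughly like this (the class and method names are made up):

```java
public class Roster {

    private int memberCount = 10;

    // Unsafe: two threads can read the same value and each write back count ± 1,
    // silently losing one of the updates.
    public void signUpUnsafe() { memberCount++; }
    public void quitUnsafe()   { memberCount--; }

    // Safe: only one thread at a time may touch the counter.
    public synchronized void signUp() { memberCount++; }
    public synchronized void quit()   { memberCount--; }

    public synchronized int count()   { return memberCount; }
}
```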
So be careful.
You should care about thread safety. E.g. in Java, you write a servlet that provides some functionality. The container deploys a single instance of your servlet, and as HTTP requests arrive from clients over different TCP connections, each request is handled by a separate thread, which in turn calls your servlet. As a result, your servlet is called from multiple threads. So if it is not thread-safe, erroneous results will be returned to users due to corruption of the shared data accessed by those threads.
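As a small sketch of that servlet scenario (the hit counter is made up), here is shared state handled safely:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HitCounterServlet extends HttpServlet {

    // One servlet instance serves all requests, so this field is shared state.
    // A plain long would be corrupted under concurrent requests;
    // AtomicLong keeps the increment thread-safe.
    private final AtomicLong hits = new AtomicLong();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        long current = hits.incrementAndGet();
        resp.setContentType("text/plain");
        resp.getWriter().println("Hits: " + current);
    }
}
```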
It really depends on the application framework (which I know nothing about in this case) and how the web server handles it. Obviously, any good webserver is going to be responding to multiple requests simultaneously, so it will be operating with multiple threads. That web server may dispatch to a single instance of your application code for all of these requests, or it may spawn multiple instances of your web application and never use a given instance concurrently.
Even if the app server does use separate instances, your application will probably have some shared state--say, a database with a list of users. In that case, you need to make sure that state can be accessed safely from multiple threads/instances of your web app.
Then, of course, there is the case where you use threading explicitly in your application. In that case, the answer is obvious.
Your web application is almost always multithreaded, even if you don't use threads explicitly. So, to answer your question: it's very important.
How can this happen? Usually, Apache (or IIS) will serve several requests simultaneously, calling your Python programs multiple times from multiple threads. So you need to assume that your programs run concurrently in multiple threads and act accordingly.
(This was too long to add as a comment to the other fine answers.)
Concurrency problems (read: multiple accesses to shared state) are a superset of threading problems. They can easily exist at an "above thread" level, such as the process/server level (the global variable in the case you mention is a per-process value, which in turn can lead to an inconsistent view/state if there are multiple processes).
Care must be taken to analyze the data consistency requirements and then implement the software to fulfill those requirements. I would always err on the safe side, and only relax consistency in carefully analyzed areas where that is acceptable.
However, note that CPython runs only one thread at a time for Python code execution (to get truly concurrent threads you need to write/use C extensions), so while you can still get races on the data you expect, you won't get (all of) the same kinds of partial-write scenarios that can plague C/C++ programs. But, once again: err on the side of a consistent view.
There are a number of existing methods for making access to a global atomic, across threads or processes. Use them.
