Tomcat - one thread per request - or other alternatives? - multithreading

My understanding is that in Tomcat, each request takes up one Java (and thus OS) thread.
Imagine I have an app with lots of long-running requests (e.g. a poker game with multiple players) that involves in-game chat, AJAX long-polling, etc.
Is there a way to change the Tomcat configuration/architecture for my webapp so that I'm not using a thread for each request, but can instead 'intercept' the request and response so they can be processed as part of a queue?

I think you're right that Tomcat likes to handle each request in its own thread. This can become a problem when there are many concurrent requests. I have the following suggestions:
Configure the maxThreads and acceptCount attributes of the Connector elements in server.xml. This caps the number of threads that can be spawned. Once that limit is reached, further requests are queued, and acceptCount sets the size of that queue. This is the simplest option to implement, but it is not a good long-term solution.
Configure multiple Connector elements in server.xml and make them share a thread pool by adding an Executor element in server.xml. You probably want to point Tomcat to your own implementation of the Executor interface.
If you want finer-grained control over how requests are serviced, consider implementing your own connector. The 'protocol' attribute of the Connector element in server.xml should point to your new connector. I have done this to add a custom SSL connector, and it works great.
Could you reduce this problem to a general requirement to make Tomcat more scalable in terms of the number of requests/connections? The generic solution to that is to configure a load balancer in front of multiple Tomcat instances.
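As a rough illustration of the first suggestion above, the sketch below mimics the maxThreads/acceptCount behaviour with a plain java.util.concurrent pool. It is only an analogy, not Tomcat's connector code: in Tomcat, acceptCount is the backlog of pending connections, whereas here it is modelled as an in-JVM task queue. The class and method names (BoundedRequestPool, countRejected) are made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedRequestPool {

    /**
     * Submits `requests` long-running tasks to a pool capped at `maxThreads`
     * workers with a waiting queue of `acceptCount` slots, and returns how
     * many submissions were rejected because both were full.
     */
    public static int countRejected(int maxThreads, int acceptCount, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(acceptCount));
        // Every task blocks on this latch, simulating a long-running request.
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> awaitQuietly(release));
                } catch (RejectedExecutionException e) {
                    rejected++; // all threads busy and the queue is full
                }
            }
        } finally {
            release.countDown(); // let the simulated requests finish
            pool.shutdown();
        }
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 2 threads + 3 queue slots accept 5 of the 10 requests.
        System.out.println("rejected = " + countRejected(2, 3, 10));
    }
}
```

With 2 threads and 3 queue slots, 5 of 10 simultaneous long-running requests are turned away, which is why capping threads alone does not help much when most requests are long-lived.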

Related

Ktor, Netty and increasing the number of threads per endpoint

Using Ktor and Kotlin 1.5 to implement a REST service backed by Netty. A couple of things about this service:
"Work" takes non-trivial amount of time to complete.
A unique client endpoint sends multiple requests in parallel to this service.
There are only a handful of unique client endpoints.
The service is not scaling as expected. We ran a load test with parallel requests coming from a single client and we noticed that we only have two threads on the server actually processing the requests. It's not a resource starvation problem - there is plenty of network, memory, CPU, etc. and it doesn't matter how many requests we fire up in parallel - it's always two threads keeping busy, while the others are sitting idle.
Is there a parameter we can configure to increase the number of threads available to process requests for specific endpoints?
Netty uses what is called a non-blocking IO model (http://tutorials.jenkov.com/java-concurrency/single-threaded-concurrency.html).
In this model, a single event-loop thread can handle many tasks concurrently, as long as you follow best practices (that is, you do not block the event-loop thread).
You might need to check the following configuration options for Netty (https://ktor.io/docs/engines.html#configure-engine):
connectionGroupSize = x
workerGroupSize = y
callGroupSize = z
Default values usually are set rather low and tweaking them could be useful for the time-consuming 'work'. The exact values might vary depending on the available resources.
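To see why only a fixed number of threads ever look busy when the handlers block, here is a plain-Java sketch. It uses java.util.concurrent only, not the Ktor or Netty APIs, and the names (PoolSizeDemo, peakConcurrency) are invented for the illustration. Each task blocks its thread, so the number of tasks in flight can never exceed the pool size, however many requests are submitted. As far as I understand, that is roughly what the group sizes above bound, assuming the 'work' blocks the thread it runs on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSizeDemo {

    /** Runs `tasks` blocking tasks on a pool of `poolSize` threads and returns the peak number in flight. */
    public static int peakConcurrency(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        release.await(); // simulated blocking 'work'
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                    done.countDown();
                });
            }
            // Wait until every pool thread is occupied, then let the work finish.
            int expected = Math.min(poolSize, tasks);
            while (running.get() < expected) {
                Thread.sleep(1);
            }
            release.countDown();
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("pool of 2: peak = " + peakConcurrency(2, 10));
        System.out.println("pool of 8: peak = " + peakConcurrency(8, 10));
    }
}
```

Ten parallel requests against a pool of two never get more than two threads working, which matches the symptom in the question. Raising the pool size raises the ceiling.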

How splitting in spring integration works for web container?

I want to use Spring Integration for HTTP inbound message processing.
I know that a Spring Integration channel would run on a container thread, but if I want to use a splitter,
which threads would be used?
How would the result of the split be returned to the initial web request thread?
(Note: I am not 100% sure I understand your use case, but as a general remark:)
The Spring Integration splitter splits a message into multiple "smaller" messages. This is unrelated to multi-threading; that is, it does not per se imply that the smaller messages are processed in parallel. It is still a sequential stream of smaller messages.
You can then process the smaller messages in parallel by defining a handler with a given parallelism, and you can configure that handler to use a dedicated thread pool.
(Sorry if this does not answer your question, please clarify).
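To make the distinction concrete, here is a minimal sketch in plain java.util.concurrent. It deliberately does not use the Spring Integration API, and the names (SplitAndProcess, splitAndProcess) are hypothetical. The split itself runs sequentially on the calling thread, the parts are handled on a dedicated pool, and the caller blocks until all results are collected, which is one way the combined result can end up back on the original request thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitAndProcess {

    /** Splits a comma-separated message, processes the parts on a dedicated pool, returns results in order. */
    public static List<String> splitAndProcess(String message, int parallelism) {
        // Step 1: the split is sequential and happens on the calling thread.
        List<String> parts = Arrays.asList(message.split(","));

        // Step 2: a dedicated pool handles the smaller messages in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<String>> handlers = new ArrayList<>();
            for (String part : parts) {
                handlers.add(() -> part.trim().toUpperCase(Locale.ROOT));
            }
            // Step 3: the caller blocks here until every part is done;
            // invokeAll returns the futures in the same order as the tasks.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(handlers)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(splitAndProcess("a, b, c", 2));
    }
}
```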

Apache Camel - Browse Exchanges of a SEDA queue

I'm working on a small app which uses Apache Camel with JMX active.
Very simply put, I have a route using SEDA component - just 1 consumer - which in a nutshell creates its own thread and queues incoming Exchanges if the route is busy.
Basically I'd like to monitor/browse/visualize the Exchanges that are waiting in the SEDA queue. I've tried Hawtio and JConsole with JMX but it only provides the number of total and current inflight exchanges on that given route. It doesn't mention the number of Exchanges waiting to be processed.
I've also tried the Browse component, which keeps track of all exchanges passed to the browse endpoint. However, it keeps all the exchanges, as opposed to just the "queued" ones.
I'm wondering if there is something out-of-the-box in Camel which allows me to do this or if I overlooked something in Hawtio or JConsole.
Thanks in advance.
You can see on the SedaEndpoint MBean how many messages are in the queue. You can find it in the endpoints tree in Hawtio, or in plain JMX.
@ManagedAttribute(description = "Current queue size")
public int getCurrentQueueSize() {
    return queue.size();
}
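If you want to read that attribute programmatically rather than through Hawtio or JConsole, a minimal sketch with the standard JMX API could look like the following. To keep it runnable without Camel, it registers a stand-in MBean (DemoQueue, a hypothetical class) under a made-up ObjectName. Against a real Camel context you would use the actual ObjectName of the SedaEndpoint MBean instead, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class QueueSizeProbe {

    /** Management interface exposing one read-only attribute, "CurrentQueueSize". */
    public interface DemoQueueMBean {
        int getCurrentQueueSize();
    }

    /** Hypothetical stand-in for the real endpoint; this is NOT Camel's SedaEndpoint. */
    public static class DemoQueue implements DemoQueueMBean {
        private final Queue<String> queue = new ConcurrentLinkedQueue<>();

        public void enqueue(String exchange) {
            queue.add(exchange);
        }

        @Override
        public int getCurrentQueueSize() {
            return queue.size();
        }
    }

    /** Reads the attribute the same way a JMX client would. */
    public static int readQueueSize(MBeanServer server, ObjectName name) {
        try {
            return (Integer) server.getAttribute(name, "CurrentQueueSize");
        } catch (Exception e) {
            throw new IllegalStateException("Could not read CurrentQueueSize", e);
        }
    }

    /** Registers the stand-in MBean, queues three items, and reads the size back over JMX. */
    public static int demo() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            // Made-up name for the demo; real Camel ObjectNames look different.
            ObjectName name = new ObjectName("demo:type=endpoints,name=demoQueue");
            DemoQueue queue = new DemoQueue();
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new StandardMBean(queue, DemoQueueMBean.class), name);
            try {
                queue.enqueue("exchange-1");
                queue.enqueue("exchange-2");
                queue.enqueue("exchange-3");
                return readQueueSize(server, name);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("CurrentQueueSize = " + demo());
    }
}
```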

Mule: Thread count under load with doThreading="false"

We have a Mule app with an HTTP inbound endpoint, and I'm trying to figure out how to control the thread count under load. As an experiment, I have added the following configuration:
<core:configuration>
<core:default-threading-profile doThreading="false" maxThreadsActive="500" poolExhaustedAction="RUN"/>
</core:configuration>
Under load I'm seeing the thread count peak at over 1000 threads. I'm not sure why this is the case given the maxThreadsActive setting and doThreading="false". From reading about poolExhaustedAction="RUN", I would expect the listener thread to block while processing inbound requests rather than spawn new ones, and finally to reject the connection if its backlog queue is full. I never see rejected client connections.
Does Mule maintain a separate thread pool for each inbound endpoint in the app (sorry if this is in the documentation)? Even if so, I don't think it explains what I'm seeing.
Any help appreciated. We are running a number of Mule apps in one container, and I'd like to control the total number of threads.
Thanks, Alfie.
Clearly the doThreading attribute on default-threading-profile is not enough to control Mule threading as a whole, nor does it impose a global cap on the threading behaviour of individual transports. I reckon you're getting 500 threads for the HTTP message receiver pool and 500 for the VM message dispatcher pool.
I strongly suggest reading about tuning Mule: http://www.mulesoft.org/documentation/display/current/Tuning+Performance
My gut feel is that you need to
configure threading on each transport (VM, HTTP), strictly specifying the pool size for receivers and dispatchers,
select flow processing strategies that prevent Mule from spawning new threads (i.e. use synchronous to hog the receiver threads),
select exchange patterns that also prevent Mule from spawning new threads (i.e. use request-response to piggyback the current execution thread).
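For what it's worth, the poolExhaustedAction="RUN" behaviour mentioned in the question is, as I understand it, the same idea as the JDK's caller-runs rejection policy: when the pool is exhausted, the submitting thread runs the task itself instead of a new thread being spawned. The sketch below shows that policy with plain java.util.concurrent. It is not Mule code, and the class and method names are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CallerRunsDemo {

    /** Returns true if a task submitted to an exhausted pool ran on the submitting thread. */
    public static boolean overflowRunsOnCaller() {
        // One worker thread, no waiting queue, caller-runs policy on exhaustion.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<Thread> overflowThread = new AtomicReference<>();
        try {
            // First task occupies the only pool thread until released.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Pool is now exhausted, so this task runs on the calling thread.
            pool.execute(() -> overflowThread.set(Thread.currentThread()));
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return overflowThread.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("overflow ran on caller: " + overflowRunsOnCaller());
    }
}
```

The consequence is back-pressure on whoever submits the work rather than rejected connections, which fits the observation that no client connections are refused.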

Prevent thread blocking in Tomcat

I have a Java servlet that acts as a facade to other webservices deployed on the same Tomcat instance. My wrapper servlet creates N more threads, each of which invokes a webservice, then collates the responses and sends the result back to the client. The webservices are all deployed on the same Tomcat instance as different applications.
I am seeing thread blocking on this facade wrapper service after a few hours of deployment which brings down the Tomcat instance. All blocked threads are endpoints to this facade webservice (like http://domain/appContext/facadeService)
Is there a way to control such thread-blocking, due to starvation of available threads that actually do the processing? What are the best practices to prevent such deadlocks?
The common solution to this problem is to use the Executor framework. You need to express each web service call as a Callable and pass it to the executor, either on its own or as part of a Collection<Callable> (see the Javadoc for the complete list of options).
You have two ways to control the time. The first is to use the timeout parameters of an appropriate ExecutorService method (for example invokeAll), where you specify the maximum time allowed for the web service calls. The other is to get the result (which is expressed as a Future<T>) and use .get(long, TimeUnit) to specify the maximum amount of time you are willing to wait for it.
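Both options can be sketched as follows. This is a minimal example under stated assumptions: the web service calls are stood in for by simple Callable lambdas, and the names FacadeCalls, callAll, callOne and the "TIMEOUT"/"ERROR" markers are invented for the illustration. In a real servlet you would more likely share one bounded pool rather than create a pool per request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FacadeCalls {

    /** Option 1: invokeAll with an overall timeout; unfinished calls are cancelled. */
    public static List<String> callAll(List<Callable<String>> calls, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        List<String> results = new ArrayList<>();
        try {
            List<Future<String>> futures =
                    pool.invokeAll(calls, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    results.add("TIMEOUT"); // did not finish within the limit
                } else {
                    try {
                        results.add(future.get());
                    } catch (ExecutionException e) {
                        results.add("ERROR"); // the call itself threw
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    /** Option 2: submit one call and bound the wait with Future.get(long, TimeUnit). */
    public static String callOne(Callable<String> call, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the call that is still running
            return "TIMEOUT";
        } catch (ExecutionException e) {
            return "ERROR";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "ERROR";
        } finally {
            pool.shutdownNow();
        }
    }

    /** One fast call and one call that outlives the 1 second limit. */
    public static List<String> demo() {
        List<Callable<String>> calls = new ArrayList<>();
        calls.add(() -> "ok");
        calls.add(() -> { Thread.sleep(10_000); return "late"; });
        return callAll(calls, 1_000);
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(callOne(() -> "ok", 1_000));
    }
}
```

Either way, a slow backend call can no longer hold a facade thread indefinitely, which is what leads to the starvation described in the question.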

Resources