We are monitoring Tomcat using an SNMP tool and it is showing me:
Thread Total Started Count = 500 (It's changing frequently)
I have hunted down the OID and found it is "jvmThreadTotalStartedCount": http://support.ipmonitor.com/mibs/JVM-MANAGEMENT-MIB/item.aspx?id=jvmThreadTotalStartedCount
It says: "The total number of threads created and started since the Java Virtual Machine started."
My question is: what does this mean? Could someone explain it to me in simple/basic language?
A thread is a flow of execution within a process. There are processes that only have a single flow of execution (single-threaded) and others, like Tomcat, which partition their behavior into several flows of execution, in parallel (multi-threaded).
Tomcat, as a web server, typically allocates one thread to handle each request it receives, up to a limit (which might be 500 in your case), after which subsequent requests are queued, waiting for a thread to become free to handle them. This is known as thread pooling.
So, to answer your first question, Thread Total Started Count is the total count of all the different flows of execution that have been created by this instance of Tomcat since it started running.
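If it helps to see the same counter from inside the JVM rather than over SNMP: the value behind that OID is exposed by the standard JMX ThreadMXBean. Below is a minimal standalone sketch (not your Tomcat setup, just an illustration) that prints it and shows that it only ever increases:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadCountProbe {
        public static void main(String[] args) throws InterruptedException {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            // Total threads ever created and started since the JVM started
            // (this is the value reported as jvmThreadTotalStartedCount).
            System.out.println("Total started: " + threads.getTotalStartedThreadCount());

            // For comparison: threads that are currently alive right now.
            System.out.println("Currently live: " + threads.getThreadCount());

            // Start one extra thread and show that the total-started counter
            // only ever goes up, even after that thread has finished.
            Thread t = new Thread(() -> {});
            t.start();
            t.join();

            System.out.println("Total started after one more thread: "
                    + threads.getTotalStartedThreadCount());
        }
    }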
I was just playing with threads to see how much CPU they consume. I checked two scenarios.
In the first scenario I created four threads and started them, each running an infinite loop. Soon those threads consumed all 4 of my CPU cores. Checking the performance monitor in Task Manager, I found CPU consumption was 100%.
In the second scenario I tried it in a web application, running an infinite loop in a REST controller (on Tomcat 8.5). My expectation was that if I requested the URL 4 times with a browser (in different tabs, obviously), CPU consumption should reach 100%. But I couldn't see 100% CPU consumption.
Why is there difference?
My second question is: how would I tune the server thread pool? I have to use more than 4 threads because it is possible that a few of them are waiting on I/O operations. I am using Hibernate as the ORM, which maintains connection pooling. So how many threads should I use in the thread pool, and how many connections in the connection pool? How would I decide?
We can't answer the first part of your question without seeing your code. But I suspect the problem is in the way that you have implemented the threads in the webapp case. (Because what you report shouldn't happen ...)
The answer to the second part is "trial and error". More specifically:
Make the pool sizes tunable parameters.
Develop a benchmark that is representative of your expected system load.
Run benchmark with different settings, measure performance and graph results.
Based on the graph (and other criteria) pick the settings that are the best compromise between performance and resource (e.g. memory) utilization.
Thread pools and connection pools are different, and have different resource implications. The first is (largely) about memory; i.e. thread stacks and temporary objects used by the threads while they are active. The second is (largely) about resources associated with connections (active or idle).
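As a rough illustration of the "make the pool sizes tunable parameters" point (the property names and defaults here are invented for the example, not something your stack mandates), you could read both sizes from system properties so each benchmark run can vary them from the command line:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolConfig {
        public static void main(String[] args) {
            // Hypothetical property names -- pick whatever fits your deployment.
            int workerThreads = Integer.getInteger("app.workerThreads", 50);
            int dbConnections = Integer.getInteger("app.dbConnections", 10);

            // Worker thread pool: sized mainly by memory (one stack per thread)
            // and by how many requests you want in flight at once.
            ExecutorService workers = Executors.newFixedThreadPool(workerThreads);

            // The connection pool size would be handed to your pooling library
            // (e.g. the max pool size setting of c3p0 or HikariCP) the same way.
            System.out.println("workers=" + workerThreads + ", connections=" + dbConnections);

            // A benchmark run then becomes, for example:
            //   java -Dapp.workerThreads=100 -Dapp.dbConnections=20 ...
            workers.shutdown();
        }
    }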
I have the following strange situation.
We have a process, call it Distributor, that receives tasks over ZeroMQ/TCP from a Client and accumulates them in a queue. There is a Worker process, which talks with the Distributor over ZeroMQ/IPC. The Distributor forwards each incoming task to the Worker and waits for an answer. As soon as the Worker answers, it sends it another task (if one was received in the meantime) and returns the answer to the Client (over a separate ZeroMQ/TCP connection). If a task is not processed within 10 ms, it is dropped from the queue.
With 1 Worker, the system is capable of processing ~3,500 requests/sec. The client sends 10,000 requests/sec, so 6,500 requests are dropped.
But when I run some unrelated process on the server which takes 100% CPU (a busy-wait loop, or whatever), then, strangely, the system can suddenly process ~7,000 requests/sec. When that process is stopped, it drops back to 3,500. The server has 4 cores.
The same happens when running 2, 3 or 4 Workers (connected to the same Distributor), with slightly different numbers.
The Distributor is written in C++. The Worker is written in Python and uses the pyzmq binding. The worker does simple arithmetic and does not depend on any external I/O other than the Distributor.
There is a theory that this has to do with ZeroMQ using threads on separate CPUs when the server is free, and the same CPU when it's busy. If this is the case, I would appreciate an idea how to configure thread/CPU affinity of ZeroMQ so that it works correctly (without running a busy loop in background).
Is there any ZeroMQ setting that might explain / fix this?
EDIT:
This doesn't happen with a Worker written in C++.
This was indeed a CPU affinity problem. It turns out that when using ZeroMQ in a setting where a worker processes an input and then waits for the next one, if a context switch causes it to be scheduled on another CPU, a lot of time is wasted copying the ZeroMQ data.
Running the worker with
taskset -c 1 python worker.py
solves the problem.
I have developed a Foxx application and it is running on machine A. The CPU utilization is usually below 3-4% and sometimes spikes to 20%. I have close to 6 million records.
The same application is deployed on another machine (an exact replica of machine A) with only about 100k records, but CPU utilization is at around 200%.
How do I debug this? What is happening on machine B? Both machines have the same application, the same ArangoDB version, and the same configuration. Disk I/O is also the same; memory utilization on machine B is 1/6th of machine A's.
Any pointers? This is happening in a production environment, so it's really important for me to debug it quickly.
We were finally able to reproduce such an issue ourselves. We found there was a situation in which a scheduler thread could go into a busy-wait state, resulting in the following loop being executed over and over:
a scheduler thread calling epoll_wait()
epoll_wait() returning instantly, signalling an event for a certain file descriptor
the correct event handling callback being called, but not removing the file descriptor from the list of watched descriptors
goto 1
As the one file descriptor was not properly cleared from the list of watched descriptors, epoll_wait() always signalled that an event was available for that file descriptor. This made it return almost instantly, and the whole loop above was executed many times per second.
This caused CPU spikes in threads named scheduler.
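This is not ArangoDB's code, but the same busy-wait pattern is easy to reproduce with any epoll-style event loop. As a rough analogy, the Java NIO sketch below registers a readable channel and then neither reads from it nor cancels its registration, so select() returns immediately on every iteration, much like the epoll_wait() loop described above:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.Pipe;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;

    public class BusyLoopDemo {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            Pipe pipe = Pipe.open();
            pipe.sink().write(ByteBuffer.wrap(new byte[]{1})); // make the source readable
            pipe.source().configureBlocking(false);
            pipe.source().register(selector, SelectionKey.OP_READ);

            long wakeups = 0;
            long end = System.nanoTime() + 1_000_000_000L; // spin for one second
            while (System.nanoTime() < end) {
                selector.select(10); // returns at once: the descriptor is still ready
                // The "callback" here neither reads the pending data nor cancels
                // the key, so the descriptor stays permanently ready -- this is
                // the analogue of not removing the fd from the watched list.
                selector.selectedKeys().clear();
                wakeups++;
            }
            System.out.println("select() woke up " + wakeups + " times in one second");
        }
    }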
We found one cause to be a client-side connection timing out while the operation triggered by that connection was still being executed on the server side. For example, if a client called a server route that took 5 seconds to complete and respond, but the client disconnected after 3 seconds, this might have happened.
What made it hard to reproduce is that it did not affect all such client connections, but only some; which ones is still unclear.
This particular issue was fixed in ArangoDB 2.6.5, so you may want to give it a try when it is released.
I know that Node is a single-threaded system, and I was wondering whether a child process uses its own thread or its parent's. Say, for example, I have an AMD E-350 CPU with two threads. If I ran a Node server that spawned ten child instances which all work continuously, would it allow it, or would it fail because the hardware itself is not sufficient?
I can say from my own experience that I successfully spawned 150 child processes inside an Amazon t2.micro with just one core.
The reason? I was DoS-ing myself for testing my core server's limits.
The attack stayed alive for 8 hours, until I gave up, but it could've been working for much longer.
My code was simply running an HTTP client pool and as soon as one request was done, another one spawned. This doesn't need a lot of CPU. It needs lots of network, though.
Most of the time, the processes were just waiting for requests to finish.
However, in a high-concurrency application, the performance will be awful if you share memory between so many processes.
I have a web application that simply acts as a Front Controller, using Spring Boot to call other remote REST services, where I am combining Spring's DeferredResult with Observables subscribed on Schedulers.computation().
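For reference, the pattern we are using looks roughly like the sketch below (simplified; the endpoint, class names and the remote call are placeholders, not our actual code). The Tomcat thread returns the DeferredResult immediately, and a computation thread completes it later:

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;
    import org.springframework.web.context.request.async.DeferredResult;
    import rx.Observable;
    import rx.schedulers.Schedulers;

    @RestController
    public class FrontController {

        @GetMapping("/aggregate")
        public DeferredResult<String> aggregate() {
            DeferredResult<String> result = new DeferredResult<>();

            // Placeholder for the remote REST calls; the real code combines
            // several of these. The Tomcat thread is released as soon as this
            // method returns; the computation thread completes the result later.
            Observable.fromCallable(() -> callRemoteService())
                    .subscribeOn(Schedulers.computation())
                    .subscribe(result::setResult, result::setErrorResult);

            return result;
        }

        private String callRemoteService() {
            return "response"; // stand-in for the actual remote call
        }
    }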
We are also using JMeter to stress-test the web application, and we have noticed that requests start to fail with a 500 status, no response data and no logs anywhere once the number of concurrent threads scheduled in JMeter increases beyond 25, which is obviously a very "manageable" number for Tomcat.
Digging into the issue with the use of VisualVM to analyze how the threads were being created and used, we realized that the use of rx.Schedulers was somehow impacting the number of threads created by Tomcat NIO. Let me summarize our tests based on the rx.Scheduler used and a test in JMeter with 100 users (threads):
SCHEDULERS.COMPUTATION()
As we're using Schedulers.computation() and my local machine has 4 available processors, 4 computation event-loop threads are created by RxJava (named RxComputationThreadPool-XXX) and ONLY 10 Tomcat threads (named http-nio-8080-exec-XXX), as per VisualVM:
http://screencast.com/t/7C9La6K4Kt6
SCHEDULERS.IO() / SCHEDULERS.NEWTHREAD()
This scheduler seems to basically act like Schedulers.newThread(), so a new thread is always created when required. Again, we can see lots of threads created by RxJava (named RxNewThreadScheduler-XXX), but ONLY 10 for Tomcat (named http-nio-8080-exec-XXX), as per VisualVM:
http://screencast.com/t/K7VWhkxci09o
SCHEDULERS.IMMEDIATE() / NO SCHEDULER
If we disable the creation of new threads in RxJava, either by setting the Schedulers.immediate() or removing it from the Observable, then we see the expected behaviour from Tomcat's threads, i.e. 100 http-nio-8080-exec corresponding to the number of users defined for the JMeter test:
http://screencast.com/t/n9TLVZGJ
Therefore, based on our testing, it's clear to us that the combination of RxJava with Schedulers and Tomcat 8 is somehow constraining the number of threads created by Tomcat... And we have no idea why or how this is happening.
Any help would be much appreciated as this is blocking our development so far.
Thanks in advance.