High CPU Utilisation Cassandra (Native Transport Request) - cassandra

We are running Cassandra version 2.0.9 in production on a 4-node cluster. For the past few days we have been experiencing a high spike in CPU utilisation, as you can see in the picture below.
This is the JConsole output.
When we looked into the threads that are eating a lot of CPU, we came across the Native Transport Request threads; these are each eating a lot of CPU (like 12%), which is huge.
Thread stack trace.
Threads info.
Thread CPU%.
What could the problem be, and how should we go about debugging it?
Why is the majority of the NTR requests stuck in BCrypt.java? Is this the problem?
The cluster was behaving normally a few days ago, but now 3 out of the 4 nodes are constantly at high CPU utilisation.

You have authentication enabled, which stores a bcrypted hash rather than the password itself, so every authentication request has to run bcrypt to check the credentials. This ends up being a CPU issue if you are continually creating new connections instead of reusing an authenticated session. Sessions are long-lived objects and with the PHP driver they should be persistent by default (https://github.com/datastax/php-driver/tree/master/features#persistent-sessions), but if you are using CGI or something that constantly creates new processes you will still have issues. Maybe try php-fpm?
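To illustrate the reuse pattern, here is a minimal sketch using the DataStax Node.js driver rather than the PHP driver linked above (the contact point, data centre name, credentials and query are placeholder assumptions): build one Client at startup and share it across requests.

```typescript
// Minimal sketch with the DataStax Node.js driver (npm package: cassandra-driver).
// The contact point, data centre, credentials and table are placeholders.
import { Client, auth } from "cassandra-driver";

// One long-lived, authenticated client for the whole process: the bcrypt check
// happens once per connection, not once per query.
const client = new Client({
  contactPoints: ["10.0.0.1"],
  localDataCenter: "datacenter1",
  authProvider: new auth.PlainTextAuthProvider("app_user", "app_password"),
});

export async function getUser(id: string) {
  // Reuses the already-authenticated connection pool; nothing is re-authenticated here.
  const result = await client.execute(
    "SELECT * FROM users WHERE id = ?",
    [id],
    { prepare: true }
  );
  return result.first();
}
```

The anti-pattern to avoid is building a new Client (and re-authenticating) inside every request handler, which is effectively what short-lived CGI-style processes end up doing.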

Related

Issue in pm2 - It stops responding

I am facing an issue with my application servers. Assume there are two nodes behind the load balancer.
Suddenly one of those nodes becomes unhealthy.
When I logged in to that instance, there were no logs coming from pm2.
Then I checked its CPU, and it was very high.
Please guide me on how I can fix this issue, or on any way to debug it.
Check out flame graphs to see where your Node app is CPU bound.
You can also use the new debugging system in Node 6.3 (--inspect) to debug with the full power of Chrome DevTools.
PM2 has some limited protection for runaway issues like this via the max-memory-restart option. Typically, high CPU will also correlate with high memory usage and this option can be used to restart your app when it begins consuming large amounts of memory (which in your case may or may not be the correct moment but it should help).
--max-memory-restart <memory> specify max memory amount used to autorestart (in octet or use syntax like 100M)
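For reference, the same limit can go into a pm2 ecosystem file instead of the command line; a minimal sketch, assuming a hypothetical app named "api" started from ./app.js and an arbitrary 300 MB threshold:

```typescript
// ecosystem.config.js (pm2 reads this as plain JavaScript; the app name,
// script path and 300M threshold are example values only).
module.exports = {
  apps: [
    {
      name: "api",
      script: "./app.js",
      // Restart the process automatically once it uses more than ~300 MB.
      max_memory_restart: "300M",
    },
  ],
};
```

Started with pm2 start ecosystem.config.js, the limit then applies without repeating the flag on every restart.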

tcpsndbuf causing max CPU load

I run a web application, which I have stripped down as much as possible to avoid high tcpsndbuf values. Still, the failcnt on tcpsndbuf is reached after a few days, despite very, very little traffic.
I initially thought it was an application container issue, as described in this thread.
The tcpsndbuf value rises and rises; it seems the buffer is never released completely. Thus it continuously increases until it breaches the limit and exacts its tribute of 100% CPU load.
However, after ruling out different sources such as the Hibernate configuration and MySQL driver errors, I want to focus on OpenVZ (managed by Parallels Plesk/Power Panel) and Apache's mod_proxy.
The reason I believe this is that numerous /usr/sbin/apache2 -k start processes are listed in the process list (Parallels Power Panel, Processes).
I use mod_proxy to route port 80 to my application server, which is on a different port on the same host.
How can I analyse the socket states in detail? E.g. which buffer is held by which application, and when was it assigned? Can I see the mod_proxy connections? Is that kind of data available? Any other hints on tcpsndbuf provoking maximum CPU load are highly welcome.

IIS application initialization module and memory management

I am researching the IIS Application Initialization module, and from what I can see, when using the AlwaysRunning option for the application pool's Start Mode setting, it basically starts a worker process that keeps running even when there are no requests. When this option is applied, the process is started automatically.
My concern is memory management and CPU usage, specifically how these are handled given that the process is always running.
How does this compare to setting the Start Mode to OnDemand and increasing the Idle Time-out to a couple of days? That way, I guess, the process would run idle for x days before being terminated, then be reinitialized on the next request and keep running for another couple of days. If I set the time-out to, say, 1.5 days, someone is bound to use the application at least once a day, so the process would keep running and never be terminated.
Can anyone share experience regarding this topic?
Thanks
I have a multisite application that runs a few sites under separate app pools. All are set to OnDemand for Start Mode with an Idle Time-out of 1740 minutes, and I also use Page Output Cache in the app with different durations for different page types. There is also NHibernate behind the scenes, and the DB is MySQL.
The most active site has more than 100k visits per day and is almost never idle. When it starts after a recycle, it needs 30 seconds to 2 minutes to become fully operational, depending on the requests at that moment, and CPU usage goes from 40% to 70%. Once the site is up, CPU usage is very low (0-4%) if there are no new entries in the DB, and memory usage is around 3 GB when everything is cached. Sometimes CPU goes up to 20% if there are new requests (for uncached content) at that moment and a new entry is being saved.
Also, Page Output Cache works on a first-come-first-served basis, so this can cause a small problem while the caching is being done: the user must wait, and a little more CPU is needed to do the caching.
The biggest problem in my case is using NHibernate and MySQL, but Page Output Cache resolved it for me once I decided to cache the page modules and content. I realised it is better for an application to be hungry for memory than for CPU.
3.5k visitors at one moment, when everything is cached, gave me the same memory usage (3 GB) and overall server CPU around 40%.
The other sites use around 1-1.5 GB of memory, and their CPU never goes above 20% at start-up.
Another application with the same app pool settings, but using MSSQL with EF, I can barely even notice running on the server. It is used by 10-60 users per minute, there is not much content except embed codes, and it uses 1-5% CPU and never more than 8 MB of memory. After a recycle it is back up in less than 10 seconds.
From this experience I can tell you that it all depends on what the application serves, how it works :), and how much content you have.
If you use OnDemand with a long Idle Time-out, it ends up much the same as AlwaysRunning while the process is sitting unused. If you use OnDemand with a short Idle Time-out, you will more often need CPU to start the process.

Nodejs on heavy load

I am using Node.js with socket.io for my chat system. My hardware has a 13.6 GHz CPU and 16 GB of RAM.
When the online user count reaches 600, some users can't connect to the socket or send messages, and some users get disconnected from the chat.
How can I resolve this problem? What is your opinion on it?
First, I'm not sure how you have a clock speed of 13.6 GHz for a single thread. I'd assume your CPU has multiple cores, or your motherboard supports multiple processor sockets, and 13.6 is simply the sum. (8.2 GHz was a world record, set on July 23rd, 2013.)
Secondly, I'd ask yourself why the disconnect is happening.
Are you watching processor load - is a single thread maxing out its processor allocation (i.e. 100% usage on a single core)?
How's your RAM consumption? Is it climbing, and has the OS offloaded memory onto the page file/swap partition? Could there be a memory leak?
Is your network bandwidth capped? Has it reached its maximum capacity?
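If you want to watch those numbers from inside the Node process itself, a tiny illustrative monitor like the sketch below logs heap usage, system load and a rough event-loop delay, so you can see whether memory climbs or the loop stalls before the disconnects begin (the 5-second interval is an arbitrary choice):

```typescript
import * as os from "os";

// Every 5 seconds, log heap/RSS usage, the 1-minute load average and a rough
// event-loop delay (how long a setImmediate callback waits to run).
setInterval(() => {
  const mem = process.memoryUsage();
  const start = Date.now();
  setImmediate(() => {
    const loopLagMs = Date.now() - start;
    console.log(
      `heapUsed=${(mem.heapUsed / 1024 / 1024).toFixed(1)}MB`,
      `rss=${(mem.rss / 1024 / 1024).toFixed(1)}MB`,
      `load1=${os.loadavg()[0].toFixed(2)}`,
      `loopLag=${loopLagMs}ms`
    );
  });
}, 5000);
```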
My high-level recommendations are:
Make sure your application is non-blocking. This means using asynchronous methods whenever possible. By design, however, all Node.js' Net methods are asynchronous.
Consider clustering your application and using a shared port. This is possible with Node.js child processes using Cluster (see the sketch after this list). This will distribute the load across multiple CPU cores. You don't have to worry about load balancing (e.g. round-robin, fastest, ratio): clients are handled on a first-available, first-served basis, and whichever Node.js process can handle the client's request first wins.
Verify your NIC has enough throughput to handle the load. If it is configured for / auto-negotiating as 10BASE-T or 100BASE-TX half-duplex, you could be in trouble.
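Here is a minimal clustering sketch with a plain HTTP server (the port is an arbitrary example; for socket.io specifically you would also need sticky sessions or a shared adapter such as the Redis adapter so a client keeps talking to the same worker):

```typescript
import cluster from "cluster";
import * as http from "http";
import * as os from "os";

if (cluster.isPrimary) { // cluster.isMaster on Node versions before 16
  // Fork one worker per CPU core; all workers share the same listening port.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on("exit", (worker) => {
    console.log(`worker ${worker.process.pid} died, starting a replacement`);
    cluster.fork();
  });
} else {
  // Each worker runs its own server instance; incoming connections are spread
  // across the workers by the cluster module.
  http
    .createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    })
    .listen(3000);
}
```

Socket.io keeps per-connection state in each worker, so in practice a sticky-session layer (or its Redis adapter) goes on top of a setup like this.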
Ultimately, you need to perform more diagnostics to isolate the issue. Curiosity, digging, patience and research will lead you to the answer. Your question is far too open-ended to be provided with an exact answer - it is more theoretical. There are also too many variables to pinpoint an exact cause.

How to determine the best number of threads in Tomcat?

How does one determine the best values for maxSpareThreads, minSpareThreads, maxThreads, acceptCount, etc. in Tomcat? Are there existing best practices?
I do understand this needs to be based on hardware (e.g. per core) and can only be a basis for further performance testing and optimization on specific hardware.
the "how many threads problem" is quite a big and complicated issue, and cannot be answered with a simple rule of thumb.
Considering how many cores you have is useful for multi threaded applications that tend to consume a lot of CPU, like number crunching and the like. This is rarely the case for a web-app, which is usually hogged not by CPU but by other factors.
One common limitation is lag between you and other external systems, most notably your DB. Each time a request arrive, it will probably query the database a number of times, which means streaming some bytes over a JDBC connection, then waiting for those bytes to arrive to the database (even is it's on localhost there is still a small lag), then waiting for the DB to consider our request, then wait for the database to process it (the database itself will be waiting for the disk to seek to a certain region) etc...
During all this time, the thread is idle, so another thread could easily use that CPU resources to do something useful. It's quite common to see 40% to 80% of time spent in waiting on DB response.
The same happens also on the other side of the connection. While a thread of yours is writing its output to the browser, the speed of the CLIENT connection may keep your thread idle waiting for the browser to ack that a certain packet has been received. (This was quite an issue some years ago, recent kernels and JVMs use larger buffers to prevent your threads for idling that way, however a reverse proxy in front of you web application server, even simply an httpd, can be really useful to avoid people with bad internet connection to act as DDOS attacks :) )
Considering these factors, the number of threads should be usually much more than the cores you have. Even on a simple dual or quad core server, you should configure a few dozens threads at least.
So, what is limiting the number of threads you can configure?
First of all, each thread consumes (or at least used to consume) a lot of resources. Each thread has a stack, which consumes RAM. Moreover, each thread will actually allocate objects on the heap to do its work, consuming RAM again, and the act of switching between threads (context switching) is quite heavy for the JVM/OS kernel.
This makes it hard to run a server with thousands of threads "smoothly".
Given this picture, there are a number of techniques (mostly: try, fail, tune, try again) to determine more or less how many threads your app will need:
1) Try to understand where your threads spend their time. There are a number of good tools, but even the jvisualvm profiler can be a great one, or a tracing aspect that produces summary timing stats. The more time they spend waiting for something external, the more threads you can spawn to use the CPU during those idle times.
2) Determine your RAM usage. Given that the JVM will use a certain amount of memory independently of how many threads you use (most notably the permgen space, usually up to a hundred megabytes; again, jvisualvm will tell you), try running with one thread, then ten, then one hundred, while stressing the app with JMeter or whatever, and see how heap usage grows. That can impose a hard limit.
3) Try to determine a target. Each user request needs a thread to be handled. If your average response time is 200 ms per "get" (it is better not to count the loading of images, CSS and other static resources), then each thread is able to serve 4-5 pages per second. If each user is expected to "click" every 3-4 seconds (it depends: is it a browser game or a site with a lot of long texts?), then one thread will "serve 20 concurrent users", whatever that means. If in the peak hour you have 500 distinct users hitting your site in a minute, then you need enough threads to handle that (a rough worked example follows after this list).
4) Crash-test the high limit. Use JMeter, configure a server with a lot of threads on a spare virtual machine, and see how response time gets worse when you go over a certain limit. More than the hardware, the thread implementation of the underlying OS is important here, but no matter what, it will hit a point where the CPU spends more time trying to figure out which thread to run than actually running it, and that number is not so incredibly high.
5) Consider how threads will impact other components. Each thread will probably use one (or maybe more than one) connection to the database: is the database able to handle 50/100/500 concurrent connections? Even if you are using a sharded cluster of NoSQL servers, does the server farm offer enough bandwidth between those machines? What else will run on the same machine as the web-app server? Apache httpd? Squid? The database itself? A local caching proxy to the database like mongos or memcached?
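As a back-of-the-envelope illustration of point 3 above, here is one way to sketch the arithmetic; every figure is taken from the example numbers in that paragraph (and the 500-users reading is one possible interpretation), not from measurements:

```typescript
// Back-of-the-envelope sizing using the example figures from point 3 above.
const avgResponseSec = 0.2;    // 200 ms per "get", static resources excluded
const thinkTimeSec = 3.5;      // each user "clicks" roughly every 3-4 seconds
const peakActiveUsers = 500;   // users active on the site during the peak minute

// One thread can serve 1 / 0.2 = 5 requests per second.
const requestsPerThreadPerSec = 1 / avgResponseSec;

// 500 users clicking every ~3.5 s generate roughly 143 requests per second.
const peakRequestsPerSec = peakActiveUsers / thinkTimeSec;

// Little's-law style estimate: threads busy at once = arrival rate * service time.
const busyThreadsOnAverage = Math.ceil(peakRequestsPerSec * avgResponseSec);

console.log({ requestsPerThreadPerSec, peakRequestsPerSec, busyThreadsOnAverage });
// => roughly { requestsPerThreadPerSec: 5, peakRequestsPerSec: 143, busyThreadsOnAverage: 29 }
```

This only gives the average concurrency; real sizing still needs headroom for bursts and slow requests, which is what the crash test in point 4 is for.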
I've seen systems in production with only 4 threads + 4 spare threads, because the work done by that server was merely to resize images, so it was nearly 100% CPU-intensive, and others configured on more or less the same hardware with a couple of hundred threads, because the webapp was doing a lot of SOAP calls to external systems and spending most of its time waiting for answers.
Once you've determined the approximate minimum and maximum number of threads that are optimal for your webapp, I usually configure things this way:
1) Based on the constraints on RAM, other external resources and experiments on context switching, there is an absolute maximum which must not be reached. So, use maxThreads to limit it to about half or 3/4 of that number.
2) If the application is reasonably fast (for example, it exposes REST web services that usually send a response in a few milliseconds), then you can configure a large acceptCount, up to the same number as maxThreads. If you have a load balancer in front of your web application server, set a small acceptCount; it's better for the load balancer to see unaccepted requests and switch to another server than to put users on hold on an already busy one.
3) Since starting a thread is (still) considered a heavy operation, use minSpareThreads to have a few threads ready when peak hours arrive. This again depends on the kind of load you are expecting. It's even reasonable to have minSpareThreads, maxSpareThreads and maxThreads set up so that an exact number of threads is always ready, never reclaimed, and performance is predictable. If you are running Tomcat on a dedicated machine, you can raise minSpareThreads and maxSpareThreads without any danger of hogging other processes; otherwise tune them down, because threads are resources shared with the rest of the processes running on most OSes.
