I am running a Node.js application that uses Redis and the Sequelize library (to connect to MySQL). The application runs on Cloud Run. Early in the morning, when transactions start, responses are fast. But as time passes, the 50th percentile response time stays under 1 second, whereas the 95th and 99th percentile response times climb to as much as 15 seconds, resulting in very high latency. Memory stays at about 20% of the 512 MB limit. Also, CPU utilization at the 95th and 99th percentile is above 80%, while at the 50th percentile it is below 30%. What could be the issue? Is it due to memory paging or some other reason?
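For reference, here is a minimal sketch of the kind of Sequelize + MySQL setup involved; the database name, credentials, and pool values are placeholders rather than the actual configuration:

const { Sequelize } = require('sequelize');

// Minimal sketch only; names, credentials, and pool sizes are placeholders.
const sequelize = new Sequelize('mydb', 'dbuser', 'dbpassword', {
  host: 'localhost',
  dialect: 'mysql',
  pool: {
    max: 10,        // maximum open connections; if too low, requests queue up under load
    min: 0,
    acquire: 30000, // ms to wait for a free connection before throwing an error
    idle: 10000     // ms a connection may sit idle before being released
  },
  logging: false
});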
I ran a load test against my Node API (running under pm2) and got the following results from the wrk tool:
Running 30s test
  1 threads and 100 connections
  Thread Stats   Avg       Stdev     Max       +/- Stdev
    Latency      52.57ms   15.31ms   200.31ms    80.37%
    Req/Sec       1.92k    199.55      2.20k     79.60%
  57315 requests in 30.02s, 29.63MB read
Requests/sec: 1909.28
Transfer/sec: 0.99MB
However, when I use all available threads, the requests/sec drops and the max latency goes up.
Running 30s test
  8 threads and 100 connections
  Thread Stats   Avg       Stdev     Max       +/- Stdev
    Latency      57.05ms   19.87ms   345.07ms    80.40%
    Req/Sec      213.29     34.97    313.00      70.82%
  50944 requests in 30.06s, 26.33MB read
Requests/sec: 1694.56
Transfer/sec: 0.88MB
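For reference, the two runs above differ only in the thread count, i.e. they correspond to invocations along these lines (the target URL is a placeholder):

wrk -t1 -c100 -d30s http://localhost:3000/api
wrk -t8 -c100 -d30s http://localhost:3000/api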
How do I make sense of the stdev value? And what does the ~80% figure in the +/- Stdev column mean?
Why would the req/sec go down when I use more threads?
We are running a 2-node Cassandra cluster.
The following are the latency stats for reads and writes when executed independently:
99% write latency    avg write latency    99% read latency    avg read latency    GC time
545                  0.227                2816                1.793               2400
However, the total read time for the same batch set is almost 3 times worse when reads and writes are performed in parallel (write latencies being almost unaffected).
99% read latency    avg read latency    GC time
4055                1.955               6851
There is no compaction recorded on the application keyspace, though we can see compaction on the system and system_schema keyspaces.
What may be causing the sizeable jump in read timings for the same sample set when writes happen concurrently with the reads?
Another point to mention: the bloom filter false positive count is always 0, which seems to indicate the bloom filters are being used effectively.
Any pointers on what to investigate are appreciated.
The YCSB Endpoint benchmark would have you believe that Cassandra is the golden child of NoSQL databases. However, recreating the results on our own boxes (8 cores with hyperthreading, 60 GB of memory, two 500 GB SSDs), we are seeing dismal read throughput for workload B (read-mostly, i.e. 95% reads, 5% updates).
The cassandra.yaml settings are exactly the same as the Endpoint settings, barring the different IP addresses and our disk configuration (one SSD for data, one for the commit log). While their throughput is ~38,000 operations per second, ours is ~16,000 more or less regardless of the thread count or number of client nodes, i.e. one worker node with 256 threads reports ~16,000 ops/sec, while 4 nodes each report ~4,000 ops/sec.
I've set the readahead value to 8 KB for the SSD data drive. I'll put the custom workload file below.
When analyzing disk I/O and CPU usage with iostat, the read throughput is consistently ~200,000 KB/s, which suggests the YCSB cluster throughput should be far higher: with 100-byte records that is roughly 2,000,000 record-sized reads per second of raw disk bandwidth versus the ~16,000 ops/sec we observe. Around 25-30% of CPU time is spent in %iowait and 10-25% in user time.
top and nload show no obvious bottleneck (<50% memory usage, and 10-50 Mbit/s on a 10 Gbit/s link).
# The name of the workload class to use
workload=com.yahoo.ycsb.workloads.CoreWorkload
# There is no default setting for recordcount but it is
# required to be set.
# The number of records in the table to be inserted in
# the load phase or the number of records already in the
# table before the run phase.
recordcount=2000000000
# There is no default setting for operationcount but it is
# required to be set.
# The number of operations to use during the run phase.
operationcount=9000000
# The offset of the first insertion
insertstart=0
insertcount=500000000
# Retry a failed insert up to 10 times, about 1 second apart
core_workload_insertion_retry_limit=10
core_workload_insertion_retry_interval=1
# The number of fields in a record
fieldcount=10
# The size of each field (in bytes)
fieldlength=10
# Should read all fields
readallfields=true
# Should write all fields on update
writeallfields=false
fieldlengthdistribution=constant
readproportion=0.95
updateproportion=0.05
insertproportion=0
readmodifywriteproportion=0
scanproportion=0
maxscanlength=1000
scanlengthdistribution=uniform
insertorder=hashed
requestdistribution=zipfian
hotspotdatafraction=0.2
hotspotopnfraction=0.8
table=usertable
measurementtype=histogram
histogram.buckets=1000
timeseries.granularity=1000
The key was increasing native_transport_max_threads in the cassandra.yaml file.
Along with the settings increased per the comments (more connections in the YCSB client as well as higher concurrent reads/writes in Cassandra), Cassandra jumped to ~80,000 ops/sec.
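For reference, the relevant cassandra.yaml settings are along these lines (the values shown are illustrative, not the exact numbers we ended up with):

# cassandra.yaml excerpt; values are illustrative
native_transport_max_threads: 256
concurrent_reads: 64
concurrent_writes: 64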
I used CouchDB to create a private NPM mirror, but I found that beam.smp keeps my CPU usage at 100%. Is there any way to make it lower, say 50%?
Thank you very much.
You cannot directly limit CPU/memory usage for CouchDB, but you can tweak the Replicator options to reduce it. The options you're interested in:
http_connections
Defines the maximum number of HTTP connections per replication. Keeping this lower reduces the transfer bandwidth used.
[replicator]
http_connections = 20
worker_batch_size
With lower batch sizes, checkpoints are made more frequently. Lower batch sizes also reduce the total amount of RAM used.
[replicator]
worker_batch_size = 500
worker_processes
The number of replication workers. Keeping this lower means less data is replicated concurrently, which reduces CPU usage since there is less data to process.
[replicator]
worker_processes = 4
Play with these options to find the right combination for your limits.
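For example, a combined configuration in local.ini could look like this (treat the numbers as starting points to tune, not recommendations):

[replicator]
http_connections = 20
worker_batch_size = 500
worker_processes = 4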