Redis config for pubsub and caching in AWS EC2 - node.js

I'm using Redis on EC2, and my question is: what would be an ideal config for a Redis instance whose sole purpose is pub/sub and caching?
Obviously I can turn off saving to disk since I'm not persisting anything, but would a small disk with high memory be ideal?
Say 100k users are subscribed all at once, each to their own pub/sub channel. Will the following EC2 instance be enough:
High-Memory Extra Large Instance
17.1 GiB of memory
6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
420 GB of instance storage
64-bit platform
I/O Performance: Moderate
EBS-Optimized Available: No
API name: m2.xlarge
I'm having a hard time estimating this, since I don't know what to measure, or how to measure, the memory footprint of pub/sub in Redis.

Pub/sub in Redis is transient and is never persisted to disk, so indeed you don't need to worry about persistence here.
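If it helps, here is a minimal sketch of applying a cache-only configuration at runtime from Node, assuming the node-redis v4 client; the host, the 14gb cap and the allkeys-lru policy are illustrative placeholders, and the same settings can of course live in redis.conf instead.

```typescript
import { createClient } from 'redis';

const redis = createClient({ url: 'redis://my-ec2-host:6379' }); // placeholder host
await redis.connect();

// Turn off RDB snapshots and AOF, since nothing needs to survive a restart.
await redis.configSet('save', '');
await redis.configSet('appendonly', 'no');

// Cap memory and let Redis evict cache keys under pressure.
// Pub/sub messages are never stored, so only the cache competes for this budget.
await redis.configSet('maxmemory', '14gb');
await redis.configSet('maxmemory-policy', 'allkeys-lru');
```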
The rule of thumb for estimating the Redis pub/sub memory footprint is the expected number of messages per second times the average message size.
This is quite conservative, because it assumes it takes a full second to forward each message to all subscribers.
Using the above estimate, if each of your 100k users sends 1 message per second, you'd be able to accommodate messages of roughly 150 KB each.
So this should be plenty.
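A quick back-of-the-envelope version of that estimate for the m2.xlarge above (the ~15% headroom is my assumption, not part of the answer):

```typescript
// Back-of-the-envelope check of the rule of thumb for the m2.xlarge (17.1 GiB).
const instanceMemoryBytes = 17.1 * 1024 ** 3; // 17.1 GiB of RAM on the instance
const headroom = 0.85;                        // assumed: leave ~15% for Redis/OS overhead
const messagesPerSecond = 100_000;            // 100k users, 1 message/sec each

const maxAvgMessageBytes = (instanceMemoryBytes * headroom) / messagesPerSecond;
console.log(`${Math.round(maxAvgMessageBytes / 1024)} KB per message`); // ~152 KB
```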

Related

Scaling Node.js: Using autoscaling groups with small virtual servers or cluster processes on VM with many vCPUs?

In learning about Node.js's cluster module I've been turning over the following architecture in my head: balancing cost against performance, would it be more beneficial (i.e. cheapest but still scalable) to run your Node.js application in a cloud service's autoscaling group using small servers with one virtual CPU (say, AWS's t2.small EC2: 1 vCPU, 2 GB memory), or to use a larger server (say, an m5.xlarge: 4 vCPUs, 16 GB memory), have Node.js cluster four child processes to use the 4 vCPUs, and still autoscale?
A possible trade-off is the time it takes AWS to spin up another small server when autoscaling; on the other hand, for a low-traffic or utility app you have to take on the cost of running the larger server even when usage is low. But if the time it takes to spin up another server to handle the load is nominal, does that negate the benefit of using the cluster module?
Specifically, my question is twofold: are these two approaches feasible, and if so, is my presumption about the cluster module's usefulness in the small-server approach correct?
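For reference, the clustered approach described in the question looks roughly like this, assuming a plain HTTP app; the port and the trivial handler are placeholders:

```typescript
// One Node process per vCPU behind a shared port, using the built-in cluster module.
import cluster from 'node:cluster';
import os from 'node:os';
import http from 'node:http';

if (cluster.isPrimary) {
  const workers = os.cpus().length; // e.g. 4 on an m5.xlarge
  for (let i = 0; i < workers; i++) cluster.fork();
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited, forking a replacement`);
    cluster.fork();
  });
} else {
  http
    .createServer((_req, res) => res.end(`handled by ${process.pid}\n`))
    .listen(3000);
}
```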

What is the relationship between the three RDS metrics Free Memory, Active Memory and Freeable Memory?

What are the three AWS RDS metrics Free Memory (Enhanced Monitoring), Active Memory (Enhanced Monitoring), and Freeable Memory (CloudWatch monitoring)?
What is the relationship between them?
Looking at the two monitoring screenshots, the values of the three metrics are all different.
Let me answer your question in two parts.
What is the difference between Enhanced Monitoring and CloudWatch monitoring?
As per the official guide:
Amazon RDS Enhanced Monitoring — Look at metrics in real time for the operating system.
Amazon CloudWatch Metrics – Amazon RDS automatically sends metrics to CloudWatch every minute for each active database. You are not charged additionally for Amazon RDS metrics in CloudWatch.
Meaning, Enhanced Monitoring lets you monitor operating-system counters, while CloudWatch monitoring gives you performance counters per database instance.
What do Free/Active/Freeable Memory represent?
From the Enhanced Monitoring documentation:
Free Memory: the amount of unassigned memory, in kilobytes.
Active Memory: the amount of assigned memory, in kilobytes.
From the official CloudWatch documentation:
Freeable Memory: the amount of available random access memory. Units: bytes.
Freeable memory is not an indication of the actual free memory available. It is memory that is currently in use but can be freed and used for other purposes; it is a combination of the buffers and cache in use on the database instance.
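As a hypothetical illustration of the CloudWatch side, FreeableMemory could be pulled with the AWS SDK for JavaScript v3 like this; the region and DB instance identifier are placeholders:

```typescript
import { CloudWatchClient, GetMetricStatisticsCommand } from '@aws-sdk/client-cloudwatch';

const cw = new CloudWatchClient({ region: 'us-east-1' });
const end = new Date();
const start = new Date(end.getTime() - 60 * 60 * 1000); // last hour

const { Datapoints } = await cw.send(
  new GetMetricStatisticsCommand({
    Namespace: 'AWS/RDS',
    MetricName: 'FreeableMemory', // reported in bytes
    Dimensions: [{ Name: 'DBInstanceIdentifier', Value: 'my-db-instance' }],
    StartTime: start,
    EndTime: end,
    Period: 300, // 5-minute datapoints
    Statistics: ['Average'],
  }),
);

for (const dp of Datapoints ?? []) {
  console.log(dp.Timestamp, `${((dp.Average ?? 0) / 1024 ** 3).toFixed(2)} GiB freeable`);
}
```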

Scaling Node.js App on Heroku Using Dynos

I am trying to better understand scaling a Node.js server on Heroku. I have an app that handles large amounts of data and have been running into some memory issues.
If a Node.js server is upgraded to a 2x dyno, does this mean my application will automatically be able to use up to 1.024 GB of RAM on a single thread? My understanding is that a single Node thread has a memory limit of ~1.5 GB, which is above the limit of a 2x dyno.
Now, let's say that I upgrade to a Performance-M dyno (2.5 GB of memory); would I need to use clustering to fully take advantage of that 2.5 GB?
Also, if a single request is made to my Node.js app for a large amount of data, and processing it exceeds the memory allocated to that cluster worker, will the process use some of the memory allocated to another worker, or will it just throw an error?
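One way to see the actual heap ceiling inside a dyno is to ask V8 directly; a small sketch (the ~1.5 GB figure above is the default old-space limit and can be raised with --max-old-space-size):

```typescript
import v8 from 'node:v8';

// heap_size_limit is the hard ceiling V8 will allow for this process,
// regardless of how much RAM the dyno itself has.
const limitGiB = v8.getHeapStatistics().heap_size_limit / 1024 ** 3;
console.log(`V8 heap limit for this process: ${limitGiB.toFixed(2)} GiB`);
```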

Why is my client CPU utilization so high when I use a cassandra cluster?

This is a follow-on question to Why is my cassandra throughput not improving when I add nodes?. I have configured my client and nodes as closely as I could to what is recommended here: http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html. The whole setup is not exactly world class (the client is a laptop with 32 GB of RAM and a modern-ish processor, for example). I am more interested in developing an intuition for the Cassandra infrastructure at this point.
I notice that if I shut down all but one of the nodes in the cluster and run my test client against it, I get a throughput of ~120-140 inserts/s and a client CPU utilization of ~30-40%. When I bring up all 6 nodes and run the same client against them, I see a throughput of ~110-120 inserts/s and my client CPU utilization reaches ~80-100%.
All my tests are run against a clean DB (I completely delete all DB files and restart from scratch), and I insert 30M rows.
My test client is multi-threaded, and each thread writes exclusively to one partition using unlogged batches, as recommended by various sources for a schema like mine (e.g. https://lostechies.com/ryansvihla/2014/08/28/cassandra-batch-loading-without-the-batch-keyword/).
Is this CPU spike expected behavior?
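For context, the per-partition unlogged-batch pattern referred to above looks roughly like this with the DataStax Node.js driver (cassandra-driver); the contact point, keyspace and table are placeholders:

```typescript
import cassandra from 'cassandra-driver';

const client = new cassandra.Client({
  contactPoints: ['10.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'ks',
});

// All rows in one batch share the same partition key, so the whole batch
// lands on a single replica set instead of fanning out across the cluster.
async function insertPartition(pk: string, rows: Array<{ ts: number; value: string }>) {
  const queries = rows.map((r) => ({
    query: 'INSERT INTO events (pk, ts, value) VALUES (?, ?, ?)',
    params: [pk, r.ts, r.value],
  }));
  await client.batch(queries, { prepared: true, logged: false }); // unlogged batch
}
```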

CPU usage when searching using solr

We have a SolrCloud setup of 4 shards (one shard per physical machine) holding ~100 million documents. ZooKeeper runs on one of those 4 machines. We encounter complex queries that combine wildcards and proximity searches, and it sometimes takes more than 15 seconds to get the top 100 documents. Query traffic is very low at the moment (2-3 queries every minute). The 4 servers hosting the cloud have the following specs:
(2 servers -> 64 GB RAM, 24 CPU cores, 2.4 GHz) + (2 servers -> 48 GB RAM, 24 CPU cores, 2.4 GHz).
We are providing 8 GB of JVM heap per shard. Our 510 GB index per machine, stored on SSDs (totalling 4 * 510 GB ≈ 2 TB), is mapped into the OS disk cache using the remaining RAM on each server. So I suppose RAM is not an issue for us.
Now, the interesting thing to note is: when a query is fired at the cloud, only one CPU core is utilized at 100% and the rest are all at 0%. The same behaviour is replicated on all the machines. No other processes are running on these machines.
Shouldn't Solr be doing multi-threading of some kind to utilize the CPU cores? Can I somehow increase CPU utilization per query, since traffic is not a problem? If so, how?
A single request to a Solr shard is largely processed single-threaded (you can set threads for faceting on multiple fields). The rule of thumb is to keep the document count per shard to no more than a few hundred million. You are well below that at 25M/shard, but as you say, your queries are complex. What you see is simply the effect of single-threaded processing.
The solution to your problem is to use more shards, as all shards are queried in parallel. Since you have a lot of free CPU cores and very little traffic, you might want to try running 10 shards on each machine. It is not a problem for SolrCloud to have 40 shards in total, and the increased merging overhead should be insignificant compared to your heavy queries.
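A hypothetical sketch of creating such a collection through the SolrCloud Collections API; the host, collection and config names are placeholders, and maxShardsPerNode is only needed on older Solr versions, where it defaults to 1:

```typescript
// Create a 40-shard collection (10 shards per machine across the 4 machines).
const params = new URLSearchParams({
  action: 'CREATE',
  name: 'mycollection_40shards',
  numShards: '40',
  replicationFactor: '1',
  maxShardsPerNode: '10',
  'collection.configName': 'myconfig',
});

const res = await fetch(`http://solr-host:8983/solr/admin/collections?${params}`);
console.log(res.status, await res.text());
```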
