Azure Websites - Scale Up vs. Scale Out

Has anyone seen any analysis or info on when it is ideal to scale out vs. scale up? When does one make more sense than the other?
Currently, 2 small instances will cost the same as one medium under both the standard and basic modes.
Is having 2 small instances, and thus 4 GB of RAM, the same as having 1 Medium instance with 4 GB of RAM (but without an SLA)? And the same question for cores. All the other features are the same.
Does either CPU pressure or memory pressure, two easy metrics, dictate which way to scale?
In this case, scaling out does not present an issue in terms of apps/sites working across different machines.

When you can, always prefer scaling out over scaling up. The chance of a single VM going down due to a reboot/upgrade/etc. and causing catastrophic downtime is much greater than zero, while the overhead of running two VMs and load-balancing between them is minimal, and the chance of both VMs being down at the same time is much, much smaller.
In addition, if you ever need three servers' worth of capacity, scaling up with medium instances will not give you the right granularity.
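A back-of-envelope sketch of why two load-balanced instances are safer, assuming independent failures and a hypothetical per-VM downtime fraction (the 0.1% figure below is purely illustrative, not an Azure SLA number):

```python
# Back-of-envelope availability math for scale-out vs. scale-up.
# Assumes failures are independent; p is a hypothetical per-VM downtime fraction.
p = 0.001  # illustrative: each VM is down 0.1% of the time

single_vm_downtime = p   # one medium instance: down whenever its VM is down
both_down = p * p        # two load-balanced small instances: outage only if both fail at once

print(f"1 VM  : down {single_vm_downtime:.4%} of the time")
print(f"2 VMs : down {both_down:.6%} of the time")
```

With these numbers, the two-instance setup is unavailable three orders of magnitude less often, which is the intuition behind always preferring scale-out when the app tolerates it.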

Having 2 Small instances of 1.75 GB each IS NOT the same as having 1 Medium instance with 3.5 GB of RAM. It is better to have a Medium instance because closer to 3.5 GB is then available to applications, instead of just 1.75 GB per instance. Each OS also takes some RAM away, approximately 800-900 MB, and having two instances pays that OS overhead twice.
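The answer's arithmetic can be sketched out, using the midpoint of its 800-900 MB OS-overhead estimate:

```python
# Usable RAM after OS overhead, per the answer's ~800-900 MB estimate.
os_overhead_gb = 0.85  # midpoint of the 800-900 MB range (an estimate, not a measured value)

two_small = 2 * (1.75 - os_overhead_gb)   # overhead paid twice, once per instance
one_medium = 3.5 - os_overhead_gb         # overhead paid only once

print(f"2 x Small : {two_small:.2f} GB usable")
print(f"1 x Medium: {one_medium:.2f} GB usable")
```

Roughly 1.8 GB usable versus 2.65 GB usable: the Medium instance keeps an extra OS overhead's worth of RAM for applications.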

Related

Express (NodeJS) more cores vs. more nodes? (With Analysis and Examples)

When it comes to running Express (NodeJS) in something like Kubernetes, would it be more cost effective to run with more cores and fewer nodes, or more nodes with fewer cores each? (Assume the cost of CPUs per node is linear, e.g. 1 node with 4 cores costs the same as 2 nodes with 2 cores each.)
In terms of redundancy, more nodes seems the obvious answer.
However, in terms of cost effectiveness, fewer nodes seems better, because with more nodes you are paying more for overhead and less for running your app. Here is an example:
1 node with 4 cores costs $40/month. It is running:
10% Kubernetes overhead on one core
90% your app on that core and near 100% on the others
Therefore you are paying $40 for 90% + 3 x 100% = 390% of a core running your app.
2 nodes with 2 cores each cost a total of $40/month. Each is running:
10% Kubernetes overhead on one core (per node)
90% your app on that core and near 100% on the other (per node)
Now you are paying $40 for 2 x (90% + 100%) = 2 x 190% = 380% of a core running your app.
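The arithmetic above can be reproduced with a small script (the 10% per-node overhead is the question's illustrative figure, not a measured value):

```python
# Reproduce the question's cost-effectiveness arithmetic.
# overhead: fraction of one core consumed by Kubernetes per node (illustrative 10%).
def useful_compute(cores_per_node: int, nodes: int, overhead: float = 0.10) -> float:
    """Total core-equivalents left for the app across all nodes."""
    per_node = (1 - overhead) + (cores_per_node - 1) * 1.0
    return nodes * per_node

print(useful_compute(4, 1))  # 1 node  x 4 cores: ~3.9 core-equivalents ("390%")
print(useful_compute(2, 2))  # 2 nodes x 2 cores: ~3.8 core-equivalents ("380%")
```

The same $40 buys slightly more usable compute on the single larger node, because the fixed per-node overhead is paid only once.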
I am assuming that balancing the two somewhere around 4-8 cores per node is ideal, so you aren't paying so much for each node, you scale nodes less often, and you get a high percentage of compute running your app per node. Is my logic right?
Edit: Math typo
Because a node does not come empty; it has to run some core components such as:
kubelet
kube-proxy
a container runtime (Docker, gVisor, or another)
other DaemonSets.
Sometimes, 3 large VMs are better than 4 medium VMs in terms of making the best use of capacity.
However, the main decider is the type of your workload (your apps):
If your apps eat memory more than CPU (like Java apps), a [2 CPUs, 8 GB] node is better than a [4 CPUs, 8 GB] node; choose memory-optimized instances.
If your apps eat CPU more than memory (like ML workloads), choose the opposite: compute-optimized instances.
The golden rule 🏆 is to calculate the whole cluster's capacity rather than looking at the individual capacity of each node.
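One way to see the 3-large-vs-4-medium point: per-node system overhead (kubelet, kube-proxy, the container runtime, DaemonSets) is paid once per node, so at equal raw capacity, fewer larger nodes leave more allocatable for workloads. The node sizes and overhead figures below are illustrative assumptions, not measured numbers:

```python
# Whole-cluster capacity check: fixed per-node overhead is paid once per node,
# so fewer, larger nodes can leave more allocatable capacity at equal raw totals.
def allocatable(nodes, cpu_per_node, mem_per_node, cpu_overhead=0.5, mem_overhead=1.0):
    """Total (CPU cores, GB RAM) left for workloads across the cluster."""
    return (nodes * (cpu_per_node - cpu_overhead),
            nodes * (mem_per_node - mem_overhead))

print(allocatable(3, 8, 32))  # 3 large VMs:  24 raw cores / 96 GB raw
print(allocatable(4, 6, 24))  # 4 medium VMs: 24 raw cores / 96 GB raw
```

Both clusters have identical raw capacity (24 cores, 96 GB), but the 3-large layout keeps 22.5 cores / 93 GB allocatable versus 22 cores / 92 GB for the 4-medium layout, which is the "calculate the whole capacity" rule in action.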
In the end, you need to consider not only cost effectiveness but also:
Resilience
High availability (HA)
Redundancy

Cassandra: operating a cluster with very different machines [duplicate]

This question already has an answer here:
Cassandra with uneven hardware, how to configure?
(1 answer)
Closed 7 years ago.
I have a development machine and we are transitioning to production. The machine itself is not too bad:
HOST: HP - ProLiant BL460c G7 - CZJ20601RL
PROC: 2 x Intel(R) Xeon(R) CPU X5660 @ 2.80GHz; HT is on (total: 24 thread(s))
RAM : 6 x 2048 MB (total: 11895 MB)
DISK: 2 x 300 GB SAS
but the disks are rather small.
The two other production machines will have larger disks. How can I make sure that I don't fill up the disk of the first machine? If I do, what's going to happen?
I thought about reducing the number of "tokens" (vnodes): 256 on the two production machines and only 64 on this one.
I thought about reducing the number of "tokens" (vnodes)
Tuning the number of vnode tokens is a good way to size the load on a cluster with heterogeneous hardware.
However, it's largely guesswork. Ideally, if your high-end servers have 2x the CPU, 2x the memory, and 2x the disk bandwidth, you can do a 2x scaling with vnode tokens.
In your case, it's more complicated because the hardware scaling factor is not so obvious.
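A rough sketch of sizing num_tokens proportionally to each machine's capacity, as discussed above. The capacity factors here are hypothetical guesses, which is exactly the "guessing" problem just mentioned:

```python
# Sketch: scale num_tokens (vnodes, set in cassandra.yaml) in proportion to each
# machine's estimated relative capacity. Capacity factors are illustrative guesses.
baseline_tokens = 256  # tokens on the full-size production machines

# hypothetical relative capacity vs. a production machine (blend of CPU/RAM/disk)
machines = {
    "prod-1": 1.0,
    "prod-2": 1.0,
    "dev-blade": 0.25,  # the smaller disks dominate the estimate here
}

for name, factor in machines.items():
    print(f"{name}: num_tokens = {round(baseline_tokens * factor)}")
```

With a 0.25 factor, the development blade ends up at 64 tokens against 256 on the production machines, matching the question's proposed split; Cassandra then routes roughly a quarter as much data to that node.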
How can I make sure that I don't fill up the disk of the first machine?
System monitoring. Also, OpsCenter can give you metrics about system disk usage if you install the agents on each server.

Better performance on Azure Websites with 2 Small Instances or 1 Medium

With Azure's recent introduction of the Basic tier, I have a question on performance:
I run a small website that receives around 30 000 hits a month. It runs great on Azure Websites with a SQL Azure DB.
Assuming it is priced similarly and my generous MSDN credits can afford it (i.e. free):
Basic and Standard appear to be the same in terms of CPU and memory etc. (this comes down to the size of the instance that you select, e.g. Small/Medium/Large). I don't need 50 GB of space or the extra features such as staging/backups, so I have dropped to Basic. This means I can now afford to either:
1) Upgrade to 2 Small Instances (2 x 1 Core, 1.75 GB)
2) Upgrade to 1 Medium Instance (1 x 2 Cores, 3.5 GB)
Which will give better performance (in terms of average responsiveness for the user when they browse the site)? I have tried both and they appear about the same. I would guess that 2 instances would handle concurrent load better and 1 Medium would handle CPU-heavy processing better?

Cassandra latency patterns under constant load

I've got pretty unusual latency patterns in my production setup:
the whole cluster (3 machines: 48 GB RAM, 7,500 RPM disks, 6 cores each) shows latency spikes every 10 minutes, on all machines at the same time.
See this screenshot.
I checked the logfiles and it seems as there are no compactions taking place at that time.
I've got 2k reads and 5k reads/sec. No optimizations have been made so far.
Caching is set to "ALL", hit rate for the row cache is at ~0.7.
Any ideas? Is tuning memtable size an option?
Best,
Tobias

Solr Indexing Time

Solr 1.4 is doing great with respect to indexing on a dedicated physical server (Windows Server 2008). Indexing around 1 million full-text documents (around 4 GB in size) takes around 20 minutes with heap size = 512M-1G and 4 GB RAM.
However, when using Solr on a VM with 4 GB RAM, it took 50 minutes to index the first time. Note that there were no network delays and no RAM issues. When I then increased the RAM to 8 GB and increased the heap size, the indexing time increased to 2 hours. That was really strange. Note that, apart from SQL Server, there is no other process running. However, I have not checked file I/O. Could that be the bottleneck? Does Solr have any issues running in a virtualized environment?
I read a paper today by Brian & Harry, "ON THE RESPONSE TIME OF A SOLR SEARCH ENGINE IN A VIRTUALIZED ENVIRONMENT", and they claim that performance deteriorates when RAM is increased while Solr is running on a VM, but that is with respect to query times, not indexing times.
I am a bit confused as to why it took longer on the VM when I repeated the same test a second time with increased heap size and RAM.
I/O on a VM will always be slower than on dedicated hardware. This is because the disk is virtualized and I/O operations must pass through an extra abstraction layer. Indexing requires intensive I/O operations, so it's not surprising that it runs more slowly on a VM. I don't know why adding RAM causes a slowdown though.
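Since the question mentions not having checked file I/O, one crude way to rule it in or out is to time a sequential write with an fsync on both the physical server and the VM. This is a rough sketch only, not a substitute for a proper disk benchmark tool:

```python
import os
import time

# Crude sequential-write timing to compare raw disk throughput between machines.
# The fsync ensures the figure reflects the disk, not just the OS page cache.
def write_throughput(path="iotest.bin", size_mb=256):
    """Write size_mb megabytes, fsync, and return approximate MB/s."""
    chunk = b"\0" * (1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(path)
    return size_mb / elapsed

print(f"sequential write: {write_throughput():.1f} MB/s")
```

If the VM's number comes out far below the physical server's, the virtualized I/O layer is a likely culprit for the slower indexing.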
