Benchmarking a Node.js server

I've written a Node.js server that I would like to benchmark. It has the following components, which I'd like to benchmark separately:
- socket.io: how many continuous connections can it accept and process (where is the saturation point)
- redis: the same as above
- express: don't want to benchmark it
I know there is some (though not a lot of) documentation about this on the internet, but I don't want to reinvent the wheel, and I don't want to spend countless hours trying a solution that turns out to be wrong for the job.
This is why I'm asking here: what should I use to get a number/graph (whatever) of how many simultaneous connections the server can process before it gets bogged down? It would also be nice to monitor CPU, memory and swap of the process (yes, I know I can use countless techniques or write my own script, but maybe something like that already exists).
I'm not looking for an answer that just links to a solution I already know exists; I'd like an answer from someone with actual experience who can make a point or two and point me in the right direction.
Thank you

You can use ApacheBench (ab) to test the load your server can take - man page
Some nice tutorials:
nixcraft/howto-performance-benchmarks-a-web-server
petefreitag/Using Apache Bench for Simple Load Testing
Usage:
$ ab -k -n 1000 -c 100 http://www.yourserver.com/
-k - enable HTTP keep-alive
-n N - will send N requests in total to the server
-c X - will issue X requests concurrently
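For the Redis part of your stack, the redis-benchmark utility that ships with Redis can drive it directly, and a simple polling loop is usually enough to watch CPU/memory of the Node process while a test runs. A rough sketch (host, port, request volumes and <pid> are placeholders to adjust):
# stress Redis directly (redis-benchmark ships with Redis)
redis-benchmark -h 127.0.0.1 -p 6379 -c 100 -n 100000 -t set,get

# poll CPU/memory/RSS of the Node process every 2 seconds while a test runs
while : ; do
    ps -o %cpu,%mem,rss -p <pid>
    sleep 2
done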

Related

Linux. Can packets pass libpcap by?

I am writing a Linux program that monitors internet traffic, i.e. how many bytes I have used over some period of time. I use Pcap4J for Java (a libpcap wrapper) and I have a question about it. What happens if my program hasn't finished processing one packet when a new one arrives?
1. Does it slow down the download/upload rate for the whole OS?
2. Does it skip the new one, so my program never knows it passed by?
In other words, if I've downloaded 1 GB of data on my computer, how many bytes does my program see: 100%, or can some packets pass my program by and still reach their destination?
And let me know if it is a bad idea to write a traffic-monitoring app using this library!
Your application loses packets. In your words, they pass by.
However, if your idea is to have a metric of how many packets went in and out of your system in a given time, there are definitely better ways to achieve it.
On Linux you can just do a script that does something like this:
DEVICE=eth0
# byte counters since boot (note the path: /sys/class/net, not /sys/net)
RX0=$(cat /sys/class/net/$DEVICE/statistics/rx_bytes)
TX0=$(cat /sys/class/net/$DEVICE/statistics/tx_bytes)
while : ; do
    sleep 5
    RX1=$(cat /sys/class/net/$DEVICE/statistics/rx_bytes)
    TX1=$(cat /sys/class/net/$DEVICE/statistics/tx_bytes)
    echo "RX bytes: $(($RX1-$RX0))"
    echo "TX bytes: $(($TX1-$TX0))"
    RX0=$RX1
    TX0=$TX1
done
You can adjust the interval or make the device a parameter; I think you'll get the idea.

Tools to measure TCP connection latency

I want to measure the time it takes to complete the TCP three-way handshake. I want to measure this on my Linux server. What are the best practices for this? Note that I want to measure this latency on the server side, for all connections that are being accepted.
Sorry, you're right, I misunderstood the question.
I think you could achieve this using tcpdump, which is a very complete tool for seeing everything that happens in TCP traffic.
From your comment I see you want to measure the time between the SYN and the ACK packet.
With tcpdump you can filter the connections and specific packages:
tcpdump -i <interface> "tcp[tcpflags] & (tcp-syn|tcp-ack) != 0"
And by default the time will be displayed in the first column of tcpdump results.
Check this, I think it could help.
I don't know if it's the best practice. Also, if you want to manipulate that data, you can pipe the results into awk or something similar.
EDIT: While searching Google I also found this resource, which is really interesting.
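If you want to go one step further and pair each incoming SYN with the final ACK of the handshake, here is a rough sketch (eth0 and port 80 are assumptions, it relies on tcpdump's default output format, and tcpdump usually needs root):
tcpdump -l -n -i eth0 "tcp[tcpflags] & (tcp-syn|tcp-ack) != 0 and dst port 80" |
awk '
    # $1 is the timestamp, $3 is the client addr.port
    /Flags \[S\],/  { syn[$3] = $1 }                 # bare SYN from the client
    /Flags \[\.\],/ { if ($3 in syn) {               # final ACK of the handshake
                          print $3, "SYN at", syn[$3], "ACK at", $1
                          delete syn[$3]
                      } }
'
The two timestamps printed per line are the SYN arrival and the handshake-completing ACK; subtracting them gives the handshake latency for that client.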

How do I have my server automatically shut down if a UDP port has not been active for a certain amount of time?

I suppose this may be an odd question, but I have a small EC2 instance that costs quite a lot of money every month. It's charged hourly though, so I only turn on this particular instance when I need it, and power it off when I'm done.
The purpose of this instance is to host a Counter-Strike: Global Offensive dedicated server, which I only power on when I have a scrim to play.
Instead of forgetting to turn it off and being charged a lot, or having an unintelligent start-up script that tells the instance to power off after 3 hours, I was thinking of a more intelligent design.
Here's my idea: the instance intelligently powers itself off when it senses it is no longer in use, by determining that no network activity on UDP 27015 has been recorded over the last 10 minutes, checking 3 times before powering off.
That way I can power-on, play the match, and not worry about powering off the server :-)
It sounds cool in my head. The question is how I go about solving the task. I imagine a bash-script executed every 10 minutes with the help of cron.
If I'm not being entirely crazy here, could someone suggest a bash script? Or maybe a better way to solve this quest of saving $$ by having the server power itself off when it senses it is no longer in use!
I'm not too familiar with EC2 instances, but if they are running some form of Linux... Under Fedora I can use ifconfig to see how much data has been received/transmitted across the network interface. It's not just the single port but all ports on that interface... Would that number suffice for you? It ought to be pretty trivial to monitor it every few minutes and see when the load drops off...
Possibly a simple script to start with that is started when the EC2 instance is brought up and just logs the data. An hour after your game you can grab the log, manually shut down, and review it at your leisure to see if this will work. (It's amazing how many things use the network sometimes...)
Afterthought: Perhaps tcpdump would be better? It will work with UDP port 27015. You might need some way to time it out, like running it as a background process, possibly with the -c option, sleeping for a while, and then killing the tcpdump process if it's still running. You may need to pipe through wc -l or just grep for the final "packets captured" line. Caveat: tcpdump may need to be run as root.
E.g. /usr/sbin/tcpdump -n -nn -q -c 100 -i eth0 port 27015
Further afterthought:
#!/bin/bash --norc
/usr/sbin/tcpdump -n -nn -q -i eth0 port 27015 2>./logfile 1>/dev/null &
TCPDUMP_PID=$!
echo "sleeping... pid=$TCPDUMP_PID"
sleep 30
echo "wake up"
kill $TCPDUMP_PID
sleep 2
cat ./logfile
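Putting the pieces together, a rough sketch of the cron job itself might look like the following. It assumes eth0, UDP 27015, root privileges (for tcpdump and shutdown) and a counter file in /var/tmp; run it every 10 minutes from root's crontab (*/10 * * * *) and adjust to taste:
#!/bin/bash
IFACE=eth0
PORT=27015
COUNTER=/var/tmp/idle_checks

# listen for up to 60 seconds (or until 50 packets) and count what we saw
PKTS=$(timeout 60 /usr/sbin/tcpdump -n -q -c 50 -i "$IFACE" udp port "$PORT" 2>/dev/null | wc -l)

if [ "$PKTS" -gt 0 ]; then
    echo 0 > "$COUNTER"                   # traffic seen: reset the strike count
else
    STRIKES=$(( $(cat "$COUNTER" 2>/dev/null || echo 0) + 1 ))
    echo "$STRIKES" > "$COUNTER"
    if [ "$STRIKES" -ge 3 ]; then         # three idle checks in a row
        /sbin/shutdown -h now             # on an EBS-backed instance this normally stops (not terminates) it
    fi
fi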

Website Benchmarking using ab

I am trying my hand at various benchmarking tools for the website I am working on and have found Apache Bench (ab) to be an excellent tool for load testing. It is a command-line tool and apparently very easy to use. However, I have a doubt about two of its basic flags. The site I was reading says:
Suppose we want to see how fast Yahoo can handle 100 requests, with a maximum of 10 requests running concurrently:
ab -n 100 -c 10 http://www.yahoo.com/
and the explanation for the flags states:
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
-n requests Number of requests to perform
-c concurrency Number of multiple requests to make
I guess I am just not able to wrap my head around "number of requests to perform" versus "number of multiple requests to make". What happens when I give them both together, as in the example above?
Can anyone give me a simpler explanation of what these two flags do together?
In your example, ab will create 10 connections to yahoo.com and request a page over each of them simultaneously.
If you omit -c 10, ab will create only one connection and will create the next one only when the first completes (i.e. when the whole main page has been downloaded).
If we pretend that the server's response time does not depend on the number of requests it is handling simultaneously, your example will complete 10 times faster than it would without -c 10.
Also: What is concurrent request (-c) in Apache Benchmark?
-n 100 -c 10 means "issue 100 requests, 10 at a time."
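As a concrete (idealized) example: if every request takes 200 ms, -n 100 with the default concurrency of 1 takes about 100 × 0.2 s = 20 s of wall-clock time, whereas -n 100 -c 10 keeps ten requests in flight at once and finishes in roughly 20 s / 10 = 2 s. Real servers slow down under concurrency, so the actual speed-up is usually smaller.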

Linux - Program Design for Debug - Print STDOUT streams from several programs

Let's say I have 10 programs (in terminals) working in tandem: {p1,p2,p3,...,p10}.
It's hard to keep track of all STDOUT debug statements in their respective terminal. I plan to create a GUI to keep track of each STDOUT such that, if I do:
-- Click on p1 would "tail" program 1's output.
-- Click on p3 would "tail" program 3's output.
It's a decent approach, but there may be better ideas out there. It's just overwhelming to have 10 terminals; I'd rather have 1 super terminal that keeps track of all of this.
And unfortunately, Linux "screen" is not an option. RESTRICTIONS: I only have the ability to either redirect STDOUT to a file or read directly from STDOUT.
If you are looking for a creative alternative, I would suggest that you look at sockets.
If each program writes to the socket (rather than STDOUT), then your master terminal can act as a server and organize the output.
Now, from what you described, it seems as though you are relatively constrained to STDOUT; however, it could be possible to do something like this:
# (use netcat (or nc on some systems) to write to a socket on the provided port)
./prog1 | netcat localhost 12312
I'm not sure if this fits in the requirements of what you are doing (and it might be more effort than it is worth!), but it could provide a very stable solution.
EDIT: As was pointed out in the comments, netcat does exactly what you would need to make this work.
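To sketch the listening half of this (with the caveat that netcat flavours differ: some variants, e.g. the OpenBSD one, accept -l to listen and -k to keep accepting new connections, and the port numbers here are just examples), the master terminal could do something like:
# master terminal: one listener per program, each line tagged with its source
nc -lk 12312 | sed 's/^/[p1] /' &
nc -lk 12313 | sed 's/^/[p2] /' &
wait

# each program then pipes its STDOUT to its own port, as above:
#   ./prog1 | netcat localhost 12312
#   ./prog2 | netcat localhost 12313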
