Generating specific CPU, disk and network utilization workloads on Linux

I am looking for a Linux tool to generate workloads with pre-defined CPU, disk and network utilization. For example, I need to keep my CPU utilization at 55% and the disk utilization at 30% for a minute on my Ubuntu workstation. Is there any tool to generate such workloads for CPU, disk and net?
P.S. It is preferable to have one tool that covers all the features mentioned above, but if there are separate tools for CPU, disk and net, I will be happy if you could share the links.

As there is no "take 30% of system resources" function, I don't think there is a corresponding tool. The Linux kernel hands out as many resources as are requested and free, depending on the scheduling mechanism and more.
A tool like you are looking for would have to:
Generate Load (no problem)
Check the system load (no problem)
Regulate the power of the load generating functions (BIG problem)
Varying the amount of load could be accomplished with dynamic sleeps and similar tricks, but it is hard to do well and the result is not very precise.
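As a very rough sketch of that dynamic-sleep idea (the target, the 100 ms period and the one-minute run time are all made-up values, not something an existing tool gives you), a bash loop could alternate a busy phase and an idle phase like this:
#!/usr/bin/env bash
# Alternate a busy spin and a sleep inside a 100 ms period so that one core
# hovers near TARGET percent utilization. Purely illustrative, not precise.
TARGET=55
BUSY=$(awk "BEGIN{print $TARGET/1000}")        # busy seconds per 100 ms period
IDLE=$(awk "BEGIN{print (100-$TARGET)/1000}")  # idle seconds per 100 ms period
END=$((SECONDS + 60))                          # run for about one minute
while (( SECONDS < END )); do
    timeout "$BUSY" bash -c 'while :; do :; done'  # spin
    sleep "$IDLE"                                  # yield the CPU
done
In practice the result drifts with scheduler and process start-up overhead, which is exactly the regulation problem described above.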
For disk I/O you could try IOZone, for example, and play a little with its parameters.
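For instance, something along these lines (file size, record size and path are placeholders) runs the sequential write and read tests:
# -i 0 = write/rewrite, -i 1 = read/re-read, -s = file size, -r = record size
iozone -i 0 -i 1 -s 1g -r 128k -f /tmp/iozone.tmp
Varying -r and -s changes how hard the disk is hit, but it still will not pin utilization at an exact percentage.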

Related

How to improve read/write speed when using distributed file system?

If I browse the Distributed File System (DFS) shared folder I can create a file and watch it replicate almost immediately across to the other office DFS share. Accessing the shares is pretty instant even across the broadband links.
I would like to improve the read/write speed. Any tips much appreciated.
Improving hardware always helps, but keep in mind that in any distributed file system the performance of the underlying hosts will have an influence. Besides that, in many cases you can't touch the hardware, and you need to optimize the network or tune your systems to best fit your current provider's architecture.
An example of this, mainly in virtualized environments, is disabling TCP segmentation offload on the network cards (e.g. ifconfig_DEFAULT="SYNCDHCP -tso" in FreeBSD's rc.conf): it can considerably improve throughput, but at the cost of more CPU usage.
Depending on how far you want to go you can start all these optimizations from the very bottom:
creating your custom lean kernel/image
test/benchmark network settings with iperf (see the example after this list)
fine-tune your FS; if using ZFS, here are some guides:
http://open-zfs.org/wiki/Performance_tuning
https://wiki.freebsd.org/ZFSTuningGuide
performance impact when using Solaris ZFS lz4 compression
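For the iperf point above, a minimal check could look like this (the hostname is a placeholder); run the server on one node and the client on another:
iperf3 -s                                    # on the storage node
iperf3 -c storage1.example.com -P 4 -t 30    # on the client: 4 parallel streams for 30 s
If the measured throughput is already close to the raw link speed, the network is probably not what is limiting the file system.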
Regarding MooseFS, there are some threads about how the block size affects I/O performance and how, in many cases, disabling the cache allows blocks > 4k.
Mainly for FreeBSD we added a special cache option for the MooseFS client
called DIRECT.
This option is available in the MooseFS client since version 3.0.49.
To disable the local cache and enable DIRECT communication, please use this
option during mount:
mfsmount -H mfsmaster.your.domain.com -o mfscachemode=DIRECT /mount/point
In most filesystems the main speed factors are the type of access (sequential or random) and the block size. Hardware performance is also a factor on MooseFS. You can improve speed by improving hard drive performance (for example, you can switch to SSDs), network topology (network latency), and network capacity.
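A crude way to see the block-size effect on your own mount (mount point and sizes are placeholders; drop oflag=direct if the mount does not support it) is to compare sequential writes at different block sizes:
dd if=/dev/zero of=/mnt/mfs/test.bin bs=4k count=262144 oflag=direct   # 1 GiB in 4 KiB blocks
dd if=/dev/zero of=/mnt/mfs/test.bin bs=1M count=1024 oflag=direct     # 1 GiB in 1 MiB blocks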

CPU utilization in performance testing

I am doing performance testing on an app. I found that when the number of virtual users increases, the response time increases linearly (that should be natural, right?), but the CPU utilization stops increasing once it reaches around 60%. Does that mean the CPU is the bottleneck? If not, what could the bottleneck be?
The bottleneck might or might not be the CPU; you need to monitor other OS metrics as well, to wit:
Physical RAM
Swap usage
Network IO
Disk IO
Each of them could be the bottleneck.
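One simple way to watch all of these at once during a test run is sar from the sysstat package (the interval and count below are arbitrary):
sar -u -r -S -n DEV -d 5 120   # CPU, memory, swap, network interfaces and disks, every 5 s for 10 minutes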
Also, when you increase the number of users, an ideal system should increase its TPS (transactions per second) by the same factor. When you increase the number of virtual users and TPS does not increase, you have hit the saturation point, and you need to find out what is slowing your system down.
If resource utilization is far from 95-100% and your system still shows large response times, the reason can be non-optimal application code, a slow database query, or something like that; in this case you will need to use profiling tools to get to the bottom of the issue.
See the How to Monitor Your Server Health & Performance During a JMeter Load Test article for more information on monitoring the application under test.

Why do so many applications allocate an incredibly large amount of virtual memory while not using any of it?

I've been watching a weird phenomenon for quite some time, since overcommit is enabled by default on Linux systems.
It seems to me that pretty much every high-level application (e.g. an application written in a high-level programming language like Java, Python or C#, including some desktop applications written in C++ that use large libraries such as Qt) uses an insane amount of virtual memory. For example, it's normal for a web browser to allocate 20 GB of RAM while using only 300 MB of it, or for a desktop environment, a MySQL server, and pretty much every Java or Mono application to allocate tens of gigabytes of RAM.
Why is that happening? What is the point? Is there any benefit in this?
I noticed that when I disable overcommit on Linux, on a desktop system that actually runs a lot of these applications the system becomes unusable and doesn't even boot up properly.
Languages that run their code inside virtual machines (like Java (*), C# or Python) usually assign large amounts of (virtual) memory right at startup. Part of this is necessary for the virtual machine itself, part is pre-allocated to parcel out to the application inside the VM.
With languages executing under direct OS control (like C or C++), this is not necessary. You can write applications that dynamically use just the amount of memory they actually require. However, some applications / frameworks are still designed in such a way that they request a large chunk of memory from the operating system once, and then manage that memory themselves, in the hope of being more efficient about it than the OS.
There are problems with this:
It is not necessarily faster. Most operating systems are already quite smart about how they manage their memory. Rule #1 of optimization: measure, optimize, measure.
Not all operating systems do have virtual memory. There are some quite capable ones out there that cannot run applications that are so "careless" in assuming that you can allocate lots & lots of "not real" memory without problems.
You already found out that if you turn your OS from "generous" to "strict", these memory hogs fall flat on their noses. ;-)
(*) Java, for example, cannot expand its VM once it is started. You have to give the maximum size of the heap as a parameter (-Xmx<n>). Thinking "better safe than sorry" leads to severe overallocations by certain people / applications.
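For example (the sizes and jar name are only illustrative), the maximum heap is fixed on the command line and the JVM typically reserves that much virtual address space right at startup:
java -Xms512m -Xmx8g -jar app.jar   # heap may grow up to 8 GB, whether or not it is ever used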
These applications usually have their own method of memory management, which is optimized for their own usage and is more efficient than the default memory management provided by the system. So they allocate a huge memory block up front, to skip or minimize the effect of the memory management provided by the system or libc.
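To see the gap the question describes on your own machine, compare a process's virtual size with its resident size (the process name is just an example) and check the current overcommit policy:
ps -o pid,vsz,rss,comm -C firefox     # VSZ = allocated virtual memory, RSS = actually resident
cat /proc/sys/vm/overcommit_memory    # 0 = heuristic, 1 = always allow, 2 = strict accounting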

Monitor system (kernel) load on Linux

I am working on a project which consists of a kernel-level distributed network file system.
I have reached the point where I am testing my implementation. For that I would like to monitor its CPU load, and by this I am referring to the kernel load of my module.
As I understood from a similar post, there is no way of monitoring the load of a kernel module directly, so I was wondering what would be the best way to do it.
An example of testing my app would be to run the dd command in parallel.
At the moment I am using pidstat -C "dd" -p ALL to monitor the system load of the dd command. At the same time I am looking at the top tool (top -d 0.2, to see more 'accurate' values).
With all this, I do not feel confident that my way of monitoring is particularly accurate.
Any advice is highly appreciated.
Thank you.
You could use something like collectd to monitor all sorts of metrics, possibly displaying them with Graphite for a simple overview, together with some processing (like averages over time).
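A minimal collectd.conf sketch along those lines (the Graphite host is a placeholder) gathers CPU, disk and network metrics and pushes them to Graphite via the write_graphite plugin:
LoadPlugin cpu
LoadPlugin disk
LoadPlugin interface
LoadPlugin write_graphite
<Plugin write_graphite>
  <Node "graphing">
    Host "graphite.example.com"
    Port "2003"
    Protocol "tcp"
  </Node>
</Plugin>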
That said, rather than monitoring the CPU load you could measure the throughput. By loading the system as much as you can you should be easily able to pinpoint which resource is the bottleneck: Disk I/O, network I/O, CPU, memory, or something else. And for a distributed network file system, you'll want to ensure that the bottleneck is very clearly disk or network I/O.
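Since you are already driving the file system with dd, its transfer-rate report is a reasonable first throughput number (path and sizes are placeholders; drop the direct flags if the mount does not support them):
dd if=/dev/zero of=/mnt/yourfs/bench.bin bs=1M count=2048 oflag=direct   # write throughput
dd if=/mnt/yourfs/bench.bin of=/dev/null bs=1M iflag=direct              # read throughput
Running the same commands against a local disk and against your file system shows how much your layer costs, independently of the CPU numbers.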

What is the difference in the "Host Cache Preference" settings when adding a disk to an Azure VM?

When adding a VHD data disk to a VM I am asked for a "Host Cache Preference" (None, Read Only, Read/write).
Can someone tell me the effect of choosing one over the other?
Specifically, I am using a VM as a Build Server so the disks are used for compiling .Net source code. Which setting would be best in this scenario?
Just as the names suggest, this setting controls host-side caching of the disk's I/O. The effect of changing it is that reads, writes, or both can be cached for performance. For example, if you have a read-only database, a Lucene index, or other read-only files, it would be optimal to turn on the read cache for the drive.
I have not seen dramatic performance changes from this setting on the drives (until I used SQL Server/Lucene). High I/O will be improved by striping disks... In your case, if you have millions of lines of code across tens of thousands of files, then you could see a performance improvement in reading/writing. The default IOPS cap for a single drive is 500 IOPS (which is about two 15k SAS drives or a high-end SSD). If you need more than that, add more disks and stripe them...
For example, on an extra-large VM you can attach 16 drives at 500 IOPS each (~8,000 IOPS):
http://msdn.microsoft.com/en-us/library/windowsazure/dn197896.aspx
(There are some good write-ups/whitepapers from people who did this and got optimal performance by adding the maximum number of smaller drives rather than just one massive one.)
Short summary: leave the defaults for caching. Test with an I/O tool for specific performance numbers. Single-drive performance will most likely not matter; if I/O is your bottleneck, striping drives will help MUCH more than the caching setting on the VHD drive.
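If the build VM happened to be Linux, striping the attached data disks could be done roughly like this with mdadm (device names and mount point are placeholders; on a Windows build server you would use Storage Spaces or a striped volume instead):
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/build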
