Simulating Disk I/O - linux

We are currently evaluating storage for a virtualization environment (Xen). The storage is an active-active cluster and I need to test some stuff there like split brain scenarios etc.
I'm looking for a tool that simulates a lot of small disk I/O, like a virtual machine would read/write to its image file.
I don't need performance testing tools but more something like data integrity. Is there anything around?

btest[1] is able to do such things, it can read/write in random/sequential and can also verify the data afterwards.
[1] http://sourceforge.net/projects/btest/

Related

How to improve read/write speed when using distributed file system?

If I browse the Distributed File System (DFS) shared folder I can create a file and watch it replicate almost immediately across to the other office DFS share. Accessing the shares is pretty instant even across the broadband links.
I would like to improve the read/write speed. Any tips much appreciated.
Improving hardware always help but keep in mind that in any distributed file system the performance of the parent host will influence besides that in many cases you can't touch the hardware and you need to optimize network or tune your systems to best fit your current provider architecture.
An example of this, mainly in virtualized environments, is the case when disabling the TCP segmentation offload from the network cards, ifconfig_DEFAULT="SYNCDHCP -tso", it will considerably improve the throughput but at a cost of more CPU usage.
Depending on how far you want to go you can start all these optimizations from the very bottom:
creating your custom lean kernel/image
test/benchmark network settings (iperf)
fine tune your FS, if using ZFS here are some guides:
http://open-zfs.org/wiki/Performance_tuning
https://wiki.freebsd.org/ZFSTuningGuide
performance impact when using Solaris ZFS lz4 compression
Regarding moosefs there are some threads about how the block size affects I/O performance and how in many cases by disabling cache allow blocks > 4k.
Mainly for FreeBSD we added special cache option for MooseFS client
called DIRECT.
This option is available in MooseFS client since version 3.0.49.
To disable local cache and enable DIRECT communication please use this
option during mount:
mfsmount -H mfsmaster.your.domain.com -o mfscachemode=DIRECT /mount/point
In most filesystems speed factors are: type of access (sequential or random) and block size. Hardware performance is also the factor on MooseFS. You can improve speed by improving hard drives performance (for example you can switch to SSD), network topology (network latency) and network capacity.

Monitor system (kernel) load on linux

I am working on a project which consists of a kernel distributed network file system.
I reached the point where I am testing my implementation. For that I would like to monitor the CPU load of it and by this I am referring to the kernel load of my module.
As I understood from a similar post there is no way of monitoring the load of a kernel module, therefore I was wondering which would be the best way to do it?
An example of testing my app would be to run the dd command in parallel.
At the moment I am using pidstat -c "dd" -p ALL to monitor the system load of command dd. At the same time I am looking at the top tool (top -d 0.2 to see more 'accurate' values).
With all these I do not feel confident that my way of monitoring is pretty accurate.
Any advice is highly appreciated.
Thank you.
You could use something like collectd to monitor all sorts of metrics, possibly showing it with Graphite for a simple overview with some processing tools (like averages over time).
That said, rather than monitoring the CPU load you could measure the throughput. By loading the system as much as you can you should be easily able to pinpoint which resource is the bottleneck: Disk I/O, network I/O, CPU, memory, or something else. And for a distributed network file system, you'll want to ensure that the bottleneck is very clearly disk or network I/O.

is something like 'core local storage' possible?

I know there's thread local storage (TLS) for threads, however, is there something like core local storage for each core in multi-core environment ?
Since a thread can be moved from core to core without warning, there doesn't seem to be any obvious use for such a capability.
Does it exist? As others have said there's really no point for it that I can see.
But can you build something like that? Oh very simple just grab some memory and index it with the cpu.id, there's no magic behind thread local storage either.
If you're trying to improve performance, just consider that thread local storage well be exclusively in one cores cache which is pretty much just as well and has the advantage of adapting to changing cores cheaply.

How to make Cassandra use two disks on ZFS in SmartOS?

I heard that there's a huge improvement when Cassandra can write it's logfiles to one disk, and the SS Tables to another. I have two disks, and if I was running Linux I would mount each in a different path and configure Cassandra to write on those.
What I would like to know is how to do that in ZFS and SmartOS.
I'm a complete newbie in SmartOS, and from what I understood I add the disks to the storage pool, are they then managed as being one ?
psanford explained how to use two disks, but that's probably not what you want here. That's usually recommended to work around deficiencies in the operating system's I/O scheduling. ZFS has a write throttle to avoid saturating disks[0], and SmartOS can be configured to throttle I/Os to ensure that readers see good performance when some users (possibly the same user) are doing heavy writes[1]. I'd be surprised if the out-of-the-box configuration wasn't sufficient, but if you're seeing bad performance, it would be good to quantify that.
[0] http://dtrace.org/blogs/ahl/2014/02/10/the-openzfs-write-throttle/
[1] http://dtrace.org/blogs/wdp/2011/03/our-zfs-io-throttle/
By default SmartOS aggregates all your disks together into a single ZFS pool (SmartOS names this pool "zones"). From this pool you create ZFS datasets which can either look like block devices (used for KVM virtual machines) or as filesystems (used for SmartOS zones).
You can setup more than one pool in SmartOS, but you will have to do it manually. The Solaris documentation is still quite good and applicable to modern Illumos distributions (including SmartOS). Chapter 4 has all the relevant information for creating a new ZFS pool, but it can be as simple as:
zpool create some_new_pool_name c1t0d0 c1t1d0
This assumes that you have access to the global zone.
If I were running a Cassandra cluster on bare metal and I wanted to benefit from things like ZFS and DTrace I would probably use OmniOS instead of SmartOS. I don't want any contention for resources with my database machines, so I wouldn't run any other zones or VMs on that hardware (which is what SmartOS is really good at).

Architecture to Sandbox the Compilation and Execution Of Untrusted Source Code

The SPOJ is a website that lists programming puzzles, then allows users to write code to solve those puzzles and upload their source code to the server. The server then compiles that source code (or interprets it if it's an interpreted language), runs a battery of unit tests against the code, and verifies that it correctly solves the problem.
What's the best way to implement something like this - how do you sandbox the user input so that it can not compromise the server? Should you use SELinux, chroot, or virtualization? All three plus something else I haven't thought of?
How does the application reliably communicate results outside of the jail while also insuring that the results are not compromised? How would you prevent, for instance, an application from writing huge chunks of nonsense data to disk, or other malicious activities?
I'm genuinely curious, as this just seems like a very risky sort of application to run.
A chroot jail executed from a limited user account sounds like the best starting point (i.e. NOT root or the same user that runs your webserver)
To prevent huge chunks of nonsense data being written to disk, you could use disk quotas or a separate volume that you don't mind filling up (assuming you're not testing in parallel under the same user - or you'll end up dealing with annoying race conditions)
If you wanted to do something more scalable and secure, you could use dynamic virtualized hosts with your own server/client solution for communication - you have a pool of 'agents' that receive instructions to copy and compile from X repository or share, then execute a battery of tests, and log the output back via the same server/client protocol. Your host process can watch for excessive disk usage and report warnings if required, the agents may or may not execute the code under a chroot jail, and if you're super paranoid you would destroy the agent after each run and spin up a new VM when the next sample is ready for testing. If you're doing this large scale in the cloud (e.g. 100+ agents running on EC2) you only ever have enough spun up to accommodate demand and therefore reduce your costs. Again, if you're going for scale you can use something like Amazon SQS to buffer requests, or if you're doing a experimental sample project then you could do something much simpler (just think distributed parallel processing systems, e.g. seti#home)

Resources