What is the recommended environment for running multiple CasperJS instances?

I am new to CasperJS and planning to use it to accurately simulate anywhere from a few dozen to low hundreds of concurrent sessions accessing a private server on a private network. Unlike typical HTTP load generators (Apache bench, httperf, ...), my purpose is to be able to control each session programmatically (increase delay between requests, have 'smarts' built into each script) and to give each session a distinct source IP address.
My current thinking is to use OpenVZ containers (openvz.org) to create each 'virtual' client running CasperJS (the minimal functionality I need is following elements on the UI and taking screenshots). I would love to hear from anyone who has done something similar.
The crux of my question is: what would the 'slimmest' environment for running CasperJS be? I'd like to strip down the OS as much as possible to be able to scale to many clients. Specifically:
any recommended low-footprint UNIX/Linux distributions for CasperJS?
any specific recommendations on stripping down mainstream (CentOS, Debian, ...) distributions?
Thank you all in advance. I look forward to hearing your input on this specific question or similar experiences/tools for what I'm trying to achieve...
Fernando

CasperJS is headless, i.e. it doesn't need X running to function. Any bare-bones Linux distribution will serve you well.
any recommended low-footprint UNIX/Linux distributions for CasperJS?
Arch is very lightweight and has an easy-to-follow Beginners' Guide. Arch's AUR has a package for CasperJS that's pretty straightforward to set up as well. Just make sure to grab the required base-devel package (pacman -S base-devel) before installing from the AUR, as it's needed for the Arch Build System.
any specific recommendations on stripping down mainstream (CentOS, Debian, ...) distributions?
Not so much stripping down, but CrunchBang is based on the latest Debian release. It may be worth taking a look at. It would be much less of a hassle to set up than Arch, and it uses the same APT package manager as Debian / Ubuntu. It installs with the lightweight Openbox window manager, but you can remove this and X altogether if you'd like.
With that said, even a lightweight Linux environment won't help much with the amount of memory each CasperJS instance will use. You could probably pull off a few dozen depending on the amount of memory available, but a few hundred may not be feasible. It all depends on how much memory each website uses. CasperJS comes with some configuration options that may help reduce memory (e.g. don't load images, plugins, etc.), but those may defeat the purpose of your tests.
The best advice I can give is to try it out for yourself. Write a simple script that will open the pages you are going to use and pass a callback to CasperJS's run() function to keep it alive (i.e. don't exit from Casper). It can be as simple as:
casper.start('http://example.com/site1', function () {});
casper.thenOpen('http://example.com/site2', function () {});
casper.run(function () {
    // wait 60 seconds before exit . . . or remove to never exit
    setTimeout(function () { casper.exit(); }, 60000);
});
Spin up multiple instances, and watch your total memory usage. You can use CLI tools like top, or use this alias, which totals the memory usage for the current user.
alias memu="ps -u $(whoami) -o pid,rss,command | awk '{print \$0}{sum+=\$2} END {print \"Total\", sum/1024, \"MB\"}'"
From this you should be able to see roughly how much memory each instance takes, and how many you can run at once on one machine.
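As a rough illustration, you could spin up a batch from an interactive shell like this (memtest.js is a hypothetical name for the keep-alive script above):
for i in $(seq 1 20); do
    casperjs memtest.js &    # each instance runs in the background
done
memu                         # total memory usage while the instances are alive
wait                         # block until all instances exit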


How to install software 5 times in Linux?

The software (let's call it SW) has complex dependencies. I do not know whether it is enough to just install SW 5 times in different directories.
My motivation is multi-threaded running. If only one copy of SW is installed, common parameters may be shared between different threads.
Could Docker help? Then the five copies of SW would be completely independent of each other.
But I want the five copies to actually occupy disk space on the host, not just run inside five containers.
It depends. Much complex free software can be configured to be installed under a specific name or suffix.
For example, many GNU programs (and some non-GNU ones) have a configure script produced by autoconf. In that case, try configure --help first. You can probably use --prefix and/or --program-suffix.
So, for an autoconf-ed program with some configure script, you could build it five times with five different --program-suffix strings. See also GNU stow.
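A minimal sketch of that approach, assuming the package tolerates being rebuilt in the same source tree (the /opt/sw-N prefixes are illustrative):
# build and install five independent copies, each under its own prefix
for i in 1 2 3 4 5; do
    ./configure --prefix=/opt/sw-$i --program-suffix=-$i
    make
    make install
    make distclean   # reset the source tree before the next configuration
done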
For free software programs not using such configure scripts, you need to read their documentation and source code. Most of the time the documentation explains how to configure them for such purposes. Failing that, you can still modify their source code for your needs.
For proprietary programs, you should dive into their documentation, and discuss with the vendor (perhaps paying them to adapt the software to your needs).
BTW, your question is unrelated to multi-threading (e.g. to POSIX threads).

Detect Fast User Switch Linux

I'm currently attempting to detect when a user has fast-user-switched to another user on the Linux platform (specifically, Fedora 14-16, RedHat 4.7-6.x, CentOS 4-6, OpenSuse 10-11). I've been looking for something similar to the WTSRegisterSessionNotification() function that is available on Windows, but all I've come across is a bunch of references to bugs in the Wine software.
Has anyone else run across this snag? There seems to be a ton of resources on how to do this on Windows and Mac OS X (which is fine), but on Linux there seems to be nothing...
EDIT:
Apparently, on newer systems (at least Fedora 16) this may be a viable option. I wonder if it has a DBus interface... More to come soon!
First of all, I need to tell you I'm not an expert in this area, but I have enough knowledge to give you pointers to places where you may go and learn more. So I may be wrong in some ways.
My guess is:
- this is not easy
- for most methods you may implement, there are probably many ways to trick them into believing something that's not true, which may lead to security problems
- your method may depend on the:
  - chosen Linux distribution
  - version of the distribution
  - desktop environment
  - display manager
As far as I know (and I may be wrong if something has changed during the last few years), fast user switching is implemented by launching another X server on another VT. So one way would be to detect whether there are multiple X servers running.
But there are many cases where there are multiple X servers running and it's not because of fast user switching. Examples: multiseat setups, or even simple Xephyr logins. With Xephyr and XDMCP, you may even have the same user logged in twice in a non-fast-user-switching case.
I started googling about this and found this old web page:
http://fedoraproject.org/wiki/Desktop/FastUserSwitching
If things haven't changed since then, you should study ConsoleKit and PolicyKit (and also DeviceKit and maybe Systemd today) and their DBus APIs.
There are also the commands ck-list-sessions and ck-launch-session. But I believe you can fool these commands easily: try ck-launch-session xterm and then ck-list-sessions.
Why exactly are you trying to detect fast user switching? What's your ultimate goal? Maybe you can solve your problem without trying to detect fast user switch...
Well it appears that the most useful way of getting at this information is to use the ConsoleKit DBus interface.
The following procedure outlines how to enumerate the sessions and determine if they are active or not:
1.) Enumerate the sessions using the following:
Bus: org.freedesktop.ConsoleKit
Path: /org/freedesktop/ConsoleKit/Manager
Method: org.freedesktop.ConsoleKit.Manager.GetSessions
What is returned is an array of object paths that export the Session interface. These, in turn, can be queried using DBus to get their appropriate properties. For example, I used dbus-send to communicate with ConsoleKit to enumerate the sessions in my system:
dbus-send --system --print-reply --dest=org.freedesktop.ConsoleKit /org/freedesktop/ConsoleKit/Manager org.freedesktop.ConsoleKit.Manager.GetSessions
And what I received in return was the following:
method return sender=:1.15 -> dest=:1.205 reply_serial=2
array [
object path "/org/freedesktop/ConsoleKit/Session2"
]
2.) Using the returned object path(s), I can query them for their attributes, such as whether they are active, using the following:
Bus: org.freedesktop.ConsoleKit
Path: /org/freedesktop/ConsoleKit/Session2
Method: org.freedesktop.ConsoleKit.Session.IsActive
Depending on the method, I can query what I need from the session(s)! Using the ConsoleKit interface I can also retrieve the identifier for the current session, so I can always query it to see if it's active when I need to. Just for fun, here's the output of the following command:
dbus-send --system --print-reply --dest=org.freedesktop.ConsoleKit /org/freedesktop/ConsoleKit/Session2 org.freedesktop.ConsoleKit.Session.IsActive
method return sender=:1.15 -> dest=:1.206 reply_serial=2
boolean true
Neat.
You have to do it by polling to be sure it works on all machines (you obviously don't have to have DBus running to do user switching!).
Solaris, HP-UX, and others, do not do user switching on the console.
Platforms to support: linux, FreeBSD, AIX. Linux/BSD use virtual terminals; AIX uses /dev/lft0 if you're interested.
Suppose you want to reliably and securely run an application on the console, and restart it on the new active X server when the console switches to another VT. The problems are that you may or may not have a desktop environment running (some of us use twm!). The session may not have been started via a login manager (you could press Ctrl-Alt-F2 on Linux, log in, and run startx quite happily). The system might not even have xdm/gdm/similar installed.
The dumb solution is the only reliable one: every few seconds, query what the active virtual terminal is (VT_GETSTATE on Linux, VT_GETACTIVE on BSD). If it's changed, you know a switch has happened. If you switched to a non-graphical session (e.g. with Ctrl-Alt-F1) there won't be an X server active. A rough sketch of this polling loop is shown below.
Otherwise, you have to hunt hard to find which display number is active. For example, you might see two X servers in ps, with display numbers :1 and :2. Which of those is on VT7, though? The final piece of the puzzle, mapping VT numbers to display numbers, is the hardest. That part is answered in the duplicate question "Which virtual terminal is a given X process running on?".
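Here is a minimal sketch of the polling loop on Linux, assuming the kbd package's fgconsole(1) is available (I believe it issues the same VT_GETSTATE ioctl under the hood):
# poll the active VT every few seconds and report switches
last=""
while true; do
    cur=$(fgconsole 2>/dev/null)   # prints the active VT number
    if [ -n "$cur" ] && [ "$cur" != "$last" ]; then
        [ -n "$last" ] && echo "VT switch: $last -> $cur"
        last=$cur
    fi
    sleep 2
done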

Stripping down a kernel in Linux?

I recently read a post (admittedly a few years old) giving advice for a fast number-crunching program:
"Use something like Gentoo Linux with 64 bit processors as you can compile it natively as you install. This will allow you to get the maximum punch out of the machine as you can strip the kernel right down to only what you need."
Can anyone elaborate on what they mean by stripping down the kernel? Also, as this post was about 6 years old, which current version of Linux would be best for this (to aid my Google searches)?
There is some truth in the statement, as well as something somewhat nonsensical.
You do not spend resources on processes you are not running. So as a first step I would try to minimise the number of processes running. For that, we quite enjoy the Ubuntu Server ISO images at work -- if you install from those, log in and run ps or pstree, you see a thing of beauty: six or seven processes. Nothing more. That is good.
That the kernel is big (in terms of source size or installation) does not matter per se. Much of this size stems from drivers you may not be using anyway. And the same rule applies again: what you do not run does not compete for resources.
So think about a headless server, stripped down -- rather than your average desktop installation with more than a screenful of processes trying to make the life of a desktop user easier.
You can create a custom Linux kernel for any distribution.
Start by going to kernel.org and downloading the latest source. Then choose your configuration interface (these days you have the choice of console text 'config', ncurses-style 'menuconfig', KDE-style 'xconfig' and GNOME-style 'gconfig') and run make whateverconfig. After choosing all the options, type make to build your kernel, then make modules to compile all the selected modules. make modules_install copies the modules into place under /lib/modules, and make install copies the kernel files to your /boot directory. Next, go to /boot and use mkinitrd to create the RAM disk needed to boot properly, if needed. Then add the kernel to your GRUB menu.lst by copying the latest entry and adding a similar one pointing to the new kernel version.
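Condensed into commands, the sequence looks roughly like this (the version number is illustrative):
tar xf linux-3.2.tar.bz2 && cd linux-3.2
make menuconfig          # pick only the drivers/options you need
make                     # build the kernel image
make modules             # build the selected modules
make modules_install     # install modules under /lib/modules/<version>
make install             # copy the kernel image and map files to /boot
# then create an initrd if required and add a menu.lst entry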
Of course, that's a basic overview and you should probably search for 'linux kernel compile' to find more detailed info. Selecting the necessary kernel modules and options takes a bit of experience - if you choose the wrong options, the kernel might not be bootable and you'll have to start over, which is a pain because selecting the options and compiling the kernel can take 15-30 minutes.
Ultimately, it isn't going to make a large difference to compile a stripped-down custom kernel unless your given task is very, very performance sensitive. It makes sense to remove things you're never going to use from the kernel, though, like say ISDN support.
I'd have to say this question is more suited to SuperUser.com, by the way, as it's not quite about programming.

How to make R use all processors?

I have a quad-core laptop running Windows XP, but looking at Task Manager R only ever seems to use one processor at a time. How can I make R use all four processors and speed up my R programs?
I have a basic system I use to parallelize my programs on "for" loops. This method is simple once you understand what needs to be done. It only works for local computing, but that seems to be what you're after.
You'll need these libraries installed:
library("parallel")
library("foreach")
library("doParallel")
First you need to create your computing cluster. I usually do other stuff while running parallel programs, so I like to leave one core open. The "detectCores" function will return the number of cores in your computer.
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl, cores = detectCores() - 1)
Next, call your for loop with the "foreach" command, along with the %dopar% operator. I always use a "try" wrapper to make sure that any iterations where the operations fail are discarded, and don't disrupt the otherwise good data. You will need to specify the ".combine" parameter, and pass any necessary packages into the loop. Note that "i" is defined with an equals sign, not an "in" operator!
data = foreach(i = 1:length(filenames), .packages = c("ncdf", "chron", "stats"),
               .combine = rbind) %dopar% {
    try({
        # your operations; line 1...
        # your operations; line 2...
        # your output
    })
}
Once you're done, clean up with:
stopCluster(cl)
The CRAN Task View on High-Performance Computing with R lists several options. XP is a restriction, but you can still get something like snow working over sockets within minutes.
As of version 2.15, R now comes with native support for multi-core computations. Just load the parallel package
library("parallel")
and check out the associated vignette
vignette("parallel")
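A minimal sketch using the built-in parallel package (a socket cluster, so it also works on Windows; the worker count of 4 is illustrative):
library(parallel)
cl <- makeCluster(4)                       # spawn 4 worker processes
res <- parLapply(cl, 1:100, function(x) sqrt(x))
stopCluster(cl)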
I hear tell that REvolution R supports better multi-threading than the typical CRAN version of R, and REvolution also supports 64-bit R on Windows. I have been considering buying a copy, but I found their pricing opaque. There's no price list on their web site. Very odd.
I believe the multicore package works on XP. It gives some basic multi-process capability, especially through offering a drop-in replacement for lapply() and a simple way to evaluate an expression in a new thread (mcparallel()).
On Windows I believe the best way to do this would probably be with foreach and snow as David Smith said.
However, Unix/Linux-based systems can compute using multiple processes with the 'multicore' package. It provides a high-level function, 'mclapply', that performs a parallel lapply across multiple cores. An advantage of the 'multicore' package is that each worker process gets a private copy of the Global Environment that it may modify. Initially, this copy is just a pointer to the Global Environment, making the sharing of variables extremely quick if the Global Environment is treated as read-only.
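For instance, a minimal sketch (POSIX systems only; the core count is illustrative):
library(parallel)   # mclapply now lives here; it originated in multicore
res <- mclapply(1:8, function(i) i^2, mc.cores = 4)   # forked workers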
Rmpi requires that the data be explicitly transferred between R processes instead of working with the 'multicore' closure approach.
-- Dan
If you do a lot of matrix operations and you are using Windows, you can install revolutionanalytics.com/revolution-r-open for free; it comes with the Intel MKL libraries, which allow you to do multithreaded matrix operations. On Windows, if you take the libiomp5md.dll, Rblas.dll and Rlapack.dll files from that install and overwrite the ones in whatever R version you like to use, you'll have multithreaded matrix operations (typically a 10-20x speedup for matrix operations). Or you can use the ATLAS Rblas.dll from prs.ism.ac.jp/~nakama/SurviveGotoBLAS2/binary/windows/x64, which also works on 64-bit R and is almost as fast as the MKL one. I found this the single easiest thing to do to drastically increase R's performance on Windows systems. Not sure why these don't come as standard on R Windows installs, in fact.
On Windows, multithreading unfortunately is not well supported in R (unless you use OpenMP via Rcpp), and the available socket-based parallelization on Windows systems, e.g. via package parallel, is very inefficient. On POSIX systems things are better, as you can use forking there (package multicore is, I believe, the most efficient one). You could also try package Rdsm for multithreading within a shared-memory model. I've got a version on my GitHub with the Unix-only flag removed, which should also work on Windows (earlier, Windows wasn't supported because the dependency bigmemory supposedly didn't work on Windows, but now it seems it does):
library(devtools)
devtools::install_github('tomwenseleers/Rdsm')
library(Rdsm)

Building a custom Linux Live CD

Can anyone point me to a good tutorial on creating a bootable Linux CD from scratch?
I need help with a fairly specialized problem: my firm sells an expansion card that requires custom firmware. Currently we use an extremely old live CD image of RH7.2 that we update with current firmware. Manufacturing puts the cards in a machine, boots off the CD, the CD writes the firmware, they power off and pull the cards. Because of this cycle, it's essential that the CD boot and shut down as quickly as possible.
The problem is that with the next generation of cards, I have to update the CD to a 2.6 kernel. It's easy enough to acquire a pre-existing live CD - but those are all designed for showing off Linux on the desktop - which means they take forever to boot.
Can anyone fix me up with a current How-To?
Update:
So, just as a final update for anyone reading this later - the tool I ended up using was "livecd-creator".
My reason for choosing this tool was that it is available for RedHat-based distributions like CentOS, Fedora and RHEL - all distributions that my company supports already. In addition, while the project is very poorly documented, it is extremely customizable. I was able to create a minimal LiveCD and edit the boot sequence so that it booted directly into the firmware updater instead of a bash shell.
The whole job would have only taken an hour or two if there had been a README explaining the configuration file!
There are a couple of interesting projects you could look into.
But first: does it have to be a CD-ROM? That's probably the slowest possible storage (well, apart from tape, maybe) you could use. What about a fast USB stick, an IEEE 1394 hard disk, or maybe even an eSATA hard disk?
Okay, there are several Live CDs that are designed to be very small, in order to e.g. fit on a business-card-sized CD. Some were also designed to be booted from a USB stick, back when that meant 64-128 MiByte: Damn Small Linux is one of the best-known ones; however, it uses a 2.4 kernel. There is a sister project called Damn Small Linux - Not, which has a 2.6 kernel (although it seems it hasn't been updated in years).
Another project worth noting is grml, a Live-CD for system administration tasks. It does not boot into a graphic environment, and is therefore quite fast; however, it still contains about 2 GiByte of software compressed onto a CD-ROM. But it also has a smaller flavor, aptly named grml-small, which only contains about 200 MiByte of software compressed into 60 MiByte.
Then there is Morphix, which is a Live-CD builder toolkit based on Knoppix. ("Morphable Knoppix"!) Morphix is basically a tool to build your own special purpose Live-CD.
The last thing I want to mention is MachBoot. MachBoot is a super-fast Live-CD. It uses various techniques to massively speed up the boot process. I believe they even trace the order in which blocks are accessed during booting and then remaster the ISO so that those blocks are laid out contiguously on the medium. Their current record is less than 6 seconds to boot into a full graphical desktop environment. However, this also seems to be stale.
One key piece of advice I can give is that most LiveCDs use a compressed filesystem called squashfs to cram as much data on the CD as possible. Since you don't need compression, you could run the mksquashfs step (present in most tutorials) with -noDataCompression and -noFragmentCompression to save on decompression time. You may even be able to drop the squashfs approach entirely, but this would require some restructuring. This may actually be slower depending on your CD-ROM read speed vs. CPU speed, but it's worth looking into.
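A hedged sketch of that step (the directory and output names are illustrative):
# build the live filesystem image with compression disabled, trading a
# larger image for less CPU time spent decompressing at boot
mksquashfs ./livecd-root filesystem.squashfs \
    -noDataCompression -noFragmentCompression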
This Ubuntu tutorial was effective enough for me to build a LiveCD based on 8.04. It may be useful for getting the feel of how a LiveCD is composed, but I would probably not recommend using an Ubuntu LiveCD.
If at all possible, find a minimal LiveCD and build up with only minimal stripping out, rather than stripping down a huge LiveCD like Ubuntu. There are some situations in which the smaller distros are using smaller/faster alternatives rather than just leaving something out. If you want to get seriously hardcore, you could look at Linux From Scratch, and include only what you want, but that's probably more time than you want to spend.
Creating Your Own Custom Ubuntu 7.10 Or Linux Mint 4.0 Live-CD With Remastersys
Depends on your distro. Here's a good article you can check out from LWN.net
There is a book I used which covers a lot of distros, though it does not cover creating a flash-bootable image. The book is Live Linux(R) CDs: Building and Customizing Bootables. You can use it with supplemental information from your distro of choice.
Debian Live provides the best tools for building a Linux Live CD. Webconverger uses Debian Live for example.
It's very easy to use.
sudo apt-get install live-helper # from Debian unstable, which should work fine from Ubuntu
lh_config # edit config/* to your liking
sudo lh_build
