I want to run this command multiple times, but I'm not sure how to do it.
It has to be a single command. What this command does is push my CPU cores to
100%:
dd if=/dev/zero of=/dev/null
It's for an assignment. Please help if you can.
Thank you.
This is what the assignment says. Maybe it will be helpful in figuring it out:
"Figure out how to run multiple instances of the command dd
if=/dev/zero of=/dev/null at the same time. You could also use the
command sum /dev/zero. You should run one instance per CPU core, so as
to push CPU utilization to 100% on all of the CPU cores in your
virtual machine. You should be able to launch all of the instances by
running a single command or pipeline as a regular user "
So far I have tried
dd if=/dev/zero of=/dev/null | xargs -p2
but that doesn't do the job right
Your assignment is probably already due and over. But for future readers, here's a single line solution.
perl -e 'print "/dev/zero\n" x'$(nproc --all) | xargs -n 1 -P $(nproc --all) -I{} dd if={} of=/dev/null
How does this work? Let's dissect the pipeline.
nproc --all will return the number of cores in the system. Let's pretend your system has 4 cores.
perl -e 'print "/dev/zero\n" x 4' will print 4 lines of /dev/zero.
Output
/dev/zero
/dev/zero
/dev/zero
/dev/zero
The output of perl is then passed to xargs.
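-n 1 tells xargs to use only one argument at a time. (With -I this is already implied, and recent GNU xargs versions warn that the two options are mutually exclusive.)
-I {} tells xargs to replace each occurrence of {} in the command with that argument.
-P 4 tells xargs to run up to 4 instances of the command in parallel.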
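To see what xargs would launch without loading the CPU, a dry run can help. The sketch below is only an illustration and not part of the original answer: dry_run is a hypothetical helper name, seq stands in for the perl one-liner, and echo is placed in front of dd so that each command is printed instead of executed.

```shell
# Hypothetical helper (not from the answer above): show the commands xargs
# would start for N parallel slots, without actually running dd.
dry_run() {
    # seq prints N lines; -I{} substitutes each line into the command,
    # and -P N allows up to N commands to run at once.
    seq "$1" | xargs -P "$1" -I{} echo "dd if=/dev/zero of=/dev/null  # instance {}"
}

dry_run 4
```

Removing the echo turns the dry run into the real thing. The printed lines may appear in any order because the instances run concurrently.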
A shorter version of the above command can be written like this:
perl -e 'print "if=/dev/zero of=/dev/null\n" x '$(nproc --all) | xargs -n2 -P0 dd
This will run 4 copies:
dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null
But it is really not recommended as a solution for homework, as it looks as if you do not understand what | does. Here nothing is being sent through the pipe. It has the advantage that it is easy to stop with a Ctrl-C.
If the goal is simply to increase carbon emissions then this is shorter:
burnP6 | burnP6 | burnP6 | burnP6
If you have GNU Parallel:
yes /dev/zero | parallel dd if={} of=/dev/null
yes | parallel burnP6
GNU Parallel starts by default 1 job per CPU core, and thus it only reads that many arguments from yes.
Many ways.. for example repeating the command four times (the & already separates the commands, so no ; is needed after it):
command & command & command & command &
..or in a more systematic way:
for i in {1..4}
do
dd if=/dev/zero of=/dev/null &
done
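The loop above hard-codes four instances and leaves them running in the background. A possible variant, sketched here under the assumption that either nproc or getconf _NPROCESSORS_ONLN is available, starts one dd per core, remembers the PIDs, and stops them again after a fixed time. burn_cores and its argument are hypothetical names chosen for this illustration.

```shell
# Hypothetical helper: load every core with dd for a given number of seconds.
# The body runs in a subshell so its variables do not leak into the caller.
burn_cores() (
    seconds=${1:-5}
    # nproc is part of GNU coreutils; getconf is the fallback elsewhere.
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    pids=""
    i=0
    while [ "$i" -lt "$cores" ]; do
        dd if=/dev/zero of=/dev/null 2>/dev/null &
        pids="$pids $!"              # remember each background PID
        i=$((i + 1))
    done
    sleep "$seconds"
    kill $pids                       # stop all instances
    wait $pids 2>/dev/null || true   # reap them; they exit as "killed"
    echo "ran $cores instances for $seconds s"
)

burn_cores 2
```

Keeping the PIDs makes cleanup explicit, which the plain loop lacks: there the dd processes keep running until they are killed by hand.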
Or you could try my homemade parallel data transfer tool, pdd. This tool spawns several threads, and each thread is bound to a CPU core.
I am trying to count the lines in all the files in a very large folder under Ubuntu.
The files are .gz files and I use
zcat * | wc -l
to count all the lines in all the files, and it's slow!
I want to use multi-core computing for this task and found GNU Parallel.
I tried this bash command:
parallel zcat * | parallel --pipe wc -l
but not all the cores are busy.
I read that job startup might cause major overhead, so I tried batching with
parallel -X zcat * | parallel --pipe -X wc -l
without improvement.
How can I use all the cores to count the lines in all the files in a folder, given that they are all .gz files and need to be decompressed before counting the rows? (I don't need to keep them uncompressed afterwards.)
Thanks!
If you have 150,000 files, you will likely get problems with "argument list too long". You can avoid that like this:
find . -maxdepth 1 -name \*gz -print0 | parallel -0 ...
If you want the name beside the line count, you will have to echo it yourself, since your wc process will only be reading from its stdin and won't know the filename:
find ... | parallel -0 'echo {} $(zcat {} | wc -l)'
Next, we come to efficiency and it will depend on what your disks are capable of. Maybe try with parallel -j2 then parallel -j4 and see what works on your system.
As Ole helpfully points out in the comments, you can avoid having to output the name of the file whose lines are being counted by using GNU Parallel's --tag option, which tags each output line with its argument, so this is even more efficient:
find ... | parallel -0 --tag 'zcat {} | wc -l'
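If GNU Parallel is not installed, roughly the same effect can be approximated with find and xargs -P. The following is a minimal sketch rather than a drop-in replacement: count_gz_lines is a hypothetical helper name, gzip -dc is used as a portable equivalent of zcat, and the default of 4 jobs is an arbitrary assumption to be tuned as described above. It prints only the grand total, which is what the question asks for.

```shell
# Hypothetical helper: total line count of all .gz files directly in DIR,
# decompressing up to JOBS files at the same time.
count_gz_lines() {  # usage: count_gz_lines DIR [JOBS]
    find "$1" -maxdepth 1 -name '*.gz' -print0 |
        # One sh per file: decompress to stdout and count the lines.
        xargs -0 -n 1 -P "${2:-4}" sh -c 'gzip -dc "$1" | wc -l' _ |
        # Each worker prints one number; add them up.
        awk '{ total += $1 } END { print total + 0 }'
}
```

For example, count_gz_lines . 8 would count the files in the current directory with eight concurrent jobs. Passing the file name as "$1" to sh, instead of pasting it into the command string, keeps names with spaces or quotes intact.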
Basically the command you are looking for is:
ls *gz | parallel 'zcat {} | wc -l'
What it does is:
ls *gz: list all gz files on stdout
Pipe it to parallel
Spawn subshells with parallel
Run in said subshells the command inside quotes 'zcat {} | wc -l'
About the '{}', according to the manual:
This replacement string will be replaced by a full line read from the input source
So each line piped to parallel gets fed to zcat.
Of course this is basic; I assume it could be tuned, and the documentation and examples might help.
I ran the following command :
time for i in {1..100}; do find / -name "*.service" | wc -l; done
I got 100 lines of output, then:
real 0m35.466s
user 0m15.688s
sys 0m14.552s
I then ran the following command :
time for i in {1..100}; do find / -name "*.service" | awk 'END{print NR}'; done
I got 100 lines of output, then:
real 0m35.036s
user 0m15.848s
sys 0m14.056s
To be precise, I had already run find / -name "*.service" just before, so it was cached for both commands.
I expected wc -l to be faster. Why isn't it?
others have mentioned that you're probably timing find, not wc or awk. still, there may be interesting differences to explore between wc and awk in their various flavors.
here are the results I get:
Mac OS 10.10.5 awk 0.16m lines/second
GNU awk/gawk 4.1.4 4.4m lines/second
Mac OS 10.10.5 wc 6.8m lines/second
GNU wc 8.27 11m lines/second
i didn't use find, but instead used wc -l or awk 'END{print NR}' on a large text file (66k lines) in a loop.
i varied the order of the commands and didn't find any deviations large enough to change the rankings i reported.
LC_CTYPE=C had no measurable effect on any of these.
conclusions
don't use mac builtin command line tools except for trivial amounts of data.
GNU wc is faster than GNU awk at counting lines.
i use MacPorts GNU binaries. it would be interesting to see how Homebrew binaries compare. (i'm guessing they'd lose.)
Three things:
Such a small difference is usually not significant:
0m35.466s - 0m35.036s = 0m0.43s or 1.2%
Yet wc -l is much faster (about 8x in the run below) than awk 'END{print NR}'.
% time seq 100000000 | awk 'END{print NR}' > /dev/null
real 0m13.624s
user 0m14.656s
sys 0m1.047s
% time seq 100000000 | wc -l > /dev/null
real 0m1.604s
user 0m2.413s
sys 0m0.623s
My guess is that the hard drive cache holds the find results, so after the first run with wc -l, most of the reads needed for find are in the cache. Presumably the difference in times between the initial find with disk reads and the second find with cache reads, would be greater than the difference in run times between awk and wc.
One way to test this is to reboot, which clears the hard disk cache, then run the two tests again, but in the reverse order, so that awk is run first. I'd expect that the first-run awk would be even slower than the first-run wc, and the second-run wc would be faster than the second-run awk.
I'm trying to run a program on a dedicated core in Linux. (I know Jailhouse is a good way to do so, but I have to use off-the-shelf Linux. :-( )
Other processes, such as interrupt handlers, kernel threads, and service processes, may also run on the dedicated core occasionally. I want to disable as many of them as possible. To do that, I first need to pin down the list of processes that may run on the dedicated core.
My question is:
Are there any existing tools that I can use to trace the list of PIDs or processes that run on a specific core over a time interval?
Thank you very much for your time and help in this question!
TL;DR Dirty hacky solution.
DISCLAIMER: At some point stops working "column: line too long" :-/
Copy this to: core-pids.sh
#!/bin/bash
TARGET_CPU=0
touch lastPIDs
touch CPU_PIDs
while true; do
ps ax -o cpuid,pid | tail -n +2 | sort | xargs -n 2 | grep -E "^$TARGET_CPU" | awk '{print $2}' > lastPIDs
for i in {1..100}; do printf "#\n" >> lastPIDs; done
cp CPU_PIDs aux
paste lastPIDs aux > CPU_PIDs
column -t CPU_PIDs > CPU_PIDs.humanfriendly.tsv
sleep 1
done
Then
chmod +x core-pids.sh
./core-pids.sh
Then open CPU_PIDs.humanfriendly.tsv with your favorite editor, and inspect!
The key is in the "ps -o cpuid,pid" bit, for more detailed info, please comment. :D
Explanation
Infinite loop with
ps ax -o cpuid,pid | tail -n +2 | sort | xargs -n 2 | grep -E "^$TARGET_CPU" | awk '{print $2}' > lastPIDs
ps ax -o cpuid,pid
Show pid's associated to CPU
tail -n +2
remove headers
sort
sort by cpuid
xargs -n 2
remove white space at the beginning
grep -E "^$TARGET_CPU"
filter by CPU id
awk '{print $2}'
get pid column
> lastPIDs
output to file those last pid's for the target CPU id
for i in {1..100}; do printf "#\n" >> lastPIDs; done
hack for pretty .tsv print with the "column -t" command
cp CPU_PIDs aux
CPU_PIDs holds the whole timeline, we copy it to aux file to allow the next command to use it as input and output
paste lastPIDs aux > CPU_PIDs
Append lastPIDs columns to the whole timeline file CPU_PIDs
column -t CPU_PIDs > CPU_PIDs.humanfriendly.tsv
pretty print whole timeline CPU_PIDs file
Attribution
stackoverflow answer to: ps utility in linux (procps), how to check which CPU is used
by Mikel
stackoverflow answer to: Echo newline in Bash prints literal \n
by sth
stackoverflow answer to: shell variable in a grep regex
by David W.
superuser answer to: Aligning columns in output from a UNIX command
by Janne Pikkarainen
nixCraft article: HowTo: Unix For Loop 1 to 100 Numbers
The best way to obtain what you want is to operate as follows:
Use the isolcpus= Linux kernel boot parameter to "free" one core from the Linux scheduler
Disable the irqbalance daemon (in case it is executing)
Set the IRQs affinities to the other cores by manually writing the CPU mask on /proc/irq/<irq_number>/smp_affinity
Finally, run your program setting the affinity to the dedicated core through the taskset command.
In this case, such core will only execute your program. For checking, you can type ps -eLF and look at the PSR column (which specifies the CPU number).
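The PSR check mentioned above can be narrowed to a single core with a small filter. This is a sketch that assumes the procps version of ps found on most Linux distributions. procs_on_cpu is a hypothetical helper name. Note that PSR only reports the CPU a thread last ran on at the moment of the snapshot, so the command would have to be repeated to observe a time interval.

```shell
# Hypothetical helper: list threads whose last-used CPU (PSR column) is $1.
procs_on_cpu() {
    # -e: all processes, -L: one line per thread.
    # A trailing "=" on every column suppresses the header line.
    ps -eL -o psr= -o pid= -o tid= -o comm= | awk -v cpu="$1" '$1 == cpu'
}

procs_on_cpu 0
```

On a core isolated as described above, the output should ideally shrink to the pinned program plus a few per-CPU kernel threads.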
Not a direct answer to the question, but I usually use perf's context-switches software event to identify how much the system or other processes perturb my benchmarks.
I'm using the ruby binding, ruby-xz.
require "securerandom"
require "xz"
random_string = SecureRandom.random_bytes(100)
compressed_string = XZ.compress(random_string, compression_level = 9, check = :none, extreme = true)
compressed_string.size # => always 148
I've tested it tens of thousands of times, on strings of varying length.
I know that at least half of the strings are 1-incompressible (cannot be compressed by more than 1 bit), 3/4 of the strings are 2-incompressible, etc. (This follows from a counting argument.) This, obviously, says nothing about the lower bound of the number of compressible strings, but there are bound to be a few, aren't there?
Explanation
There are a few reasons:
liblzma, when not in RAW mode, adds a header describing the dictionary size and a few other settings. That is one of the reasons it grows in size.
LZMA, like a lot of other compressors, uses a range encoder to encode the output of the dictionary compression (in essence a badass version of LZ77) in the least amount of bits needed. So at the end of the bit stream, the last bits are padded to make it into a full byte.
You are compressing random noise, which, as you note, is hard to compress. The range encoder tries to find the least amount of bits to encode the symbols output by the dictionary compression round. So in this case, there will be a lot of symbols. If there were one (or two) recurring patterns that LZMA found, it could be that in the end it only saves a bit or two from the output, which, as explained in point 2, you cannot observe on a byte level.
Experiment
Some small experiments for observing the overhead.
empty file with lzma in raw mode:
$ dd if=/dev/urandom bs=1k count=0 2>/dev/null | xz -9 -e --format=raw -c 2>/dev/null | wc -c
1
it needed at least one or two bits to say it reached the end of the stream, and this was padded to one byte
1k file filled with zeroes
$ dd if=/dev/zero bs=1k count=1 2>/dev/null | xz -9 -e --format=raw -c 2>/dev/null | wc -c
19
quite nice, but complexity theory wise, still perhaps a few bytes too many (1000x'\0' would have been optimal encoding)
1k file with all bits at 1
$ dd if=/dev/zero bs=1k count=1 2>/dev/null | sed 's/\x00/\xFF/g'| xz -9 -e --format=raw -c 2>/dev/null | wc -c
21
interestingly, xz compresses this a little worse than all zeroes. most likely related to the fact that LZMA dictionary works on a bit level (which was one of the novel ideas of LZMA).
1k random file:
$ dd if=/dev/urandom bs=1k count=1 2>/dev/null | xz -9 -e --format=raw -c 2>/dev/null | wc -c
1028
so 4 bytes more than the input, still not bad.
1000 runs of 1k random files:
$ for i in {1..1000}; do dd if=/dev/urandom bs=1k count=1 2>/dev/null | xz -9 -e --format=raw -c 2>/dev/null | wc -c; done | sort | uniq -c
1000 1028
so every time, 1028 bytes needed.
How can I use all CPU cores on my server (4 cores, Debian Linux on OpenVZ) to gzip one big file faster?
I am trying to use these commands, but I cannot put the pieces together.
get number of cores
CORES=$(grep -c '^processor' /proc/cpuinfo)
this splits the big file into smaller pieces
split -b100 file.big
this runs the gzip command on multiple cores
find /source -type f -print0 | xargs -0 -n 1 -P $CORES gzip --best
I don't know if this is the best way to optimize the gzip process for big files.
Use pigz, a parallel gzip implementation.
Unlike parallel with gzip, pigz produces a single gzip stream.
Try GNU Parallel
cat bigfile | parallel --pipe --recend '' -k gzip -9 >bigfile.gz
This will use all your cores to gzip in parallel.
By way of comparison, on my Mac running OSX Mavericks, and using a 6.4GB file on solid state disk, this command
time gzip -9 <bigger >/dev/null
takes 4 minutes 23s and uses 1-2 CPUs at around 50%.
Whereas the GNU Parallel version below
time cat bigger | parallel --pipe --recend '' -k gzip -9 >/dev/null
takes 1 minute 44 seconds and keeps all 8 cores 80+% busy. A very significant difference, with GNU Parallel running in under 40% of the time of the simplistic approach.
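For systems with neither pigz nor GNU Parallel, the pieces from the question (split, xargs -P, gzip) can be combined by hand. The sketch below is one possible way to do that, not a tested production tool: split_gzip, its parameters, and the 64m default chunk size are assumptions made for this example. It relies on the fact that a gzip file may consist of several concatenated members, so the compressed chunks can simply be joined with cat.

```shell
# Hypothetical helper: compress INPUT to OUTPUT using JOBS parallel gzip
# processes, by compressing fixed-size chunks and concatenating the results.
split_gzip() (  # usage: split_gzip INPUT OUTPUT [JOBS] [CHUNK_SIZE]
    input=$1 output=$2 jobs=${3:-4} chunk=${4:-64m}
    work=$(mktemp -d) || exit 1
    # Cut the input into pieces named chunk.aa, chunk.ab, ...
    split -b "$chunk" "$input" "$work/chunk."
    # Compress the pieces in parallel; ignore .gz files created meanwhile.
    find "$work" -type f -name 'chunk.*' ! -name '*.gz' -print0 |
        xargs -0 -n 1 -P "$jobs" gzip -9
    # The chunk names sort in split order, so the glob restores the sequence.
    cat "$work"/chunk.*.gz > "$output"
    rm -rf "$work"
)
```

With the CORES variable from the question, a call could look like split_gzip file.big file.big.gz "$CORES". Each chunk is compressed independently, so the result is likely to be slightly larger than a single gzip stream, and very large inputs may need a bigger chunk size or longer suffixes (split -a).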