Linux - read or collect file content faster (e.g. CPU temp every second)

I'm working on a system running Ubuntu. I'm reading basic data like CPU frequency and temperature from the thermal zones provided in /sys/class/thermal.
Unfortunately, I've got around 100 thermal_zones from which I need to read the data. I do it with:
for SENSOR_NODE in /sys/class/thermal/thermal_zone*; do printf "%s: %s\n" $(cat ${SENSOR_NODE}/type) $(cat ${SENSOR_NODE}/temp); done
Collecting all the data takes ~2.5-3 seconds, which is way too long.
Since I want to collect the data every second, my question is whether there is a way to "read" or "collect" the data faster?
Thank you in advance

There's only so much you can do while writing your code in shell, but let's start with the basics.
Command substitutions, $(...), are expensive: they require creating a pipe, fork()ing a new subprocess, connecting the pipe to that subprocess's stdout, reading from the pipe, and waiting for the commands running in that subshell to exit.
External commands, like cat, are expensive: they require loading and linking a separate executable, and unless you run them with exec (in which case the command takes over the shell's own process and PID), they also require a new process to be fork()ed off.
All POSIX-compliant shells give you a read command:
for sensor_node in /sys/class/thermal/thermal_zone*; do
read -r sensor_type <"$sensor_node/type" || continue
read -r sensor_temp <"$sensor_node/temp" || continue
printf '%s: %s\n' "$sensor_type" "$sensor_temp"
done
...which lets you avoid the command substitution overhead and the overhead of cat. However, read reads content only one byte at a time; so while you're not paying that overhead, it's still relatively slow.
If you switch from /bin/sh to bash, you get a faster alternative:
for sensor_node in /sys/class/thermal/thermal_zone*; do
printf '%s: %s\n' "$(<"$sensor_node/type")" "$(<"$sensor_node/temp")"
done
...as $(<file) doesn't need to do the one-byte-at-a-time reads that read does. That's only faster for being bash, though; it doesn't mean it's actually fast. There's a reason modern production monitoring systems are typically written in Go or with a JavaScript runtime like Node.
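That said, if staying in shell is a requirement, one more trick (a sketch of mine, not part of the original answer) is to batch all the reads into a fixed handful of processes per pass instead of two per sensor. It assumes bash process substitution and that no type string contains a tab:
# Two cat calls plus one awk per pass, instead of two command substitutions per sensor.
# Both globs expand in the same (lexicographic) order, so the lines pair up correctly.
paste <(cat /sys/class/thermal/thermal_zone*/type) \
      <(cat /sys/class/thermal/thermal_zone*/temp) |
  awk -F'\t' '{ printf "%s: %s\n", $1, $2 }'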

Related

How can I select() (ie, simultaneously read from) standard input *and* a file in bash?

I have a program that accepts input on one FIFO and emits output to another FIFO. I want to write a small script to control this program. The script needs to listen both to standard input (so I can input commands to adjust things in real time) and the program's output FIFO (so it can respond to events happening there as well).
Essentially my control program needs to select between standard input and a file (my FIFO).
I like learning how to develop simple and elegant bash-based solutions to complex problems, and after a little head-scratching I remembered that tail -f will happily select on multiple files and tell you when one of them changes in real time, so I initially tried
tail -f <(od -An -vtd1 -w1) <(cat fifo)
to read both standard input (I'd previously run stty icanon min 1; this od invocation shows each stdin character on a separate line alongside its ASCII code, and is great for escape sequence parsing) and my FIFO. This failed epically (as does cat <(cat)): od gets run here as a backgrounded task, so it doesn't get access to the controlling TTY, and fails with a cryptic "I/O error" that was explained incredibly well here.
So now I'm a bit stumped. I realize that I can use any scripting language like Perl/Python/Ruby/Tcl to solve this; my compsci/engineering question is whether/how I might be able to solve this using (Linux) shell scripting.
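One pure-shell pattern that does work (a sketch of mine, not from this thread; it assumes a FIFO named fifo, line-oriented data, and that the literal tags never occur in the input) is to let two background readers tag each line with its source and funnel everything into a single pipe, so one loop can react to both:
exec 3<&0                      # keep the real stdin; background jobs get /dev/null on fd 0
{
  while IFS= read -r line <&3; do printf 'TTY %s\n' "$line"; done &
  while IFS= read -r line; do printf 'FIFO %s\n' "$line"; done <fifo &
  wait
} | while IFS= read -r tagged; do
  case $tagged in
    'TTY '*)  printf 'command from stdin: %s\n' "${tagged#TTY }" ;;
    'FIFO '*) printf 'event from fifo: %s\n' "${tagged#FIFO }" ;;
  esac
done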

Bash: How to record highest memory/cpu consumption during execution of a bash script?

I have a function in a bash script that executes a long process called runBatch. Basically runBatch takes a file as an argument and loads the contents into a db. (runBatch is just a wrapper function for a database command that loads the content of the file)
My function has a loop that looks something like the below, where I am currently recording start time and elapsed time for the process to variables.
for batchFile in `ls $batchFilesDir`
do
echo "Batch file is $batchFile"
START_TIME=$(($(date +%s%N)/1000000))
runBatch $batchFile
ELAPSED_TIME=$(($(($(date +%s%N)/1000000))-START_TIME))
IN_SECONDS=$(awk "BEGIN {printf \"%.2f\",${ELAPSED_TIME}/1000}")
done
Then I am writing some information on each batch (such as time, etc.) to a table in a html page I am generating.
How would I go about recording the highest memory/cpu usage while the runBatch is running, along with the time, etc?
Any help appreciated.
Edit: I managed to get this done. I added a wrapper script around this script that runs this script in the background. I pass its PID with $! to another script in the wrapper script that monitors the process's CPU and memory usage with top every second. I compile everything into a html page at the end when the PID is no longer alive. Cheers for the pointers.
You should be able to get the PID of the process using $!,
runBatch $batchFile &
myPID=$!
and then you can run a top -b -p $myPID to print out a ticking summary of CPU.
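To actually record the peak rather than watch it tick by, a small polling loop works too (my sketch, not the answerer's code; the 1-second interval and the temp file are arbitrary, and it samples the backgrounded runBatch process itself):
runBatch "$batchFile" &
myPID=$!
peakFile=$(mktemp)
(
  peakRSS=0
  # ps fails once the PID has been reaped, which ends the loop
  while rss=$(ps -o rss= -p "$myPID"); do
    (( rss > peakRSS )) && peakRSS=$rss
    sleep 1
  done
  echo "$peakRSS" > "$peakFile"
) &
wait "$myPID"    # reap the batch job
wait             # then let the sampler finish writing its result
echo "Peak RSS for $batchFile: $(cat "$peakFile") kB"
rm -f "$peakFile"
The same loop could also track a CPU figure by sampling ps -o pcpu= for the PID.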
For memory:
cat /proc/meminfo
then grep for whatever you want.
CPU is more complicated - see "/proc/stat explained".
For average load:
cat /proc/loadavg
For timing "runBatch" use
time runBatch
like
time sleep 10
Once you've got the PID of your process (e.g. as answered here), you can use the proc(5) file system (with watch(1) and cat(1) or grep(1)), e.g.
watch cat /proc/$myPID/stat
(or use /proc/$myPID/status or /proc/$myPID/statm, or /proc/$myPID/maps for the address space, etc...)
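As an aside (my note, not the original answerer's): for peak memory specifically, /proc/$myPID/status already contains kernel-maintained high-water marks, so no polling is needed:
# VmPeak = peak virtual size, VmHWM = peak resident set size ("high water mark");
# read them while the process is still alive, e.g. just before it exits.
grep -E 'VmPeak|VmHWM' "/proc/$myPID/status"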
BTW, to run batch jobs you should consider batch (and you might look into crontab(5) to run things periodically)

When does the writer of a named pipe do its work?

I'm trying to understand how a named pipe behaves in terms of performance. Say I have a large file I am decompressing that I want to write to a named pipe (/tmp/data):
gzip --stdout -d data.gz > /tmp/data
and then I sometime later run a program that reads from the pipe:
wc -l /tmp/data
When does gzip actually decompress the data, when I run the first command, or when I run the second and the reader attaches to the pipe? If the former, is the data stored on disk or in memory?
Pipes (named or otherwise) have only a very small buffer if any -- so if nothing is reading, then nothing (or very little) can be written.
In your example, gzip will do very little until wc is run, because before that point its efforts to write output will block. Out-of-the-box there is no nontrivial buffer either on-disk or in-memory, though tools exist which will implement such a buffer for you, should you want one -- see pv with its -B argument, or the no-longer-maintained (and, sadly, removed from Debian by folks who didn't understand its function) bfr.
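A quick way to watch this happen (a sketch; the file names and the 10-second sleep are arbitrary):
mkfifo /tmp/data                                  # /tmp/data must be a FIFO, not a regular file
( gzip --stdout -d data.gz > /tmp/data
  echo "gzip finished at $(date +%T)" >&2 ) &     # even the open() for writing blocks until a reader shows up
sleep 10
wc -l /tmp/data                                   # the reader arrives; only now does gzip decompress and finish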

Write a certain hex pattern to a file from bash

I am trying to do some memory tests and am trying to write a certain hex pattern to a regular file from bash. How would I go about doing this without using the xxd or hexdump tool/command?
Thanks,
Neco
The simplest thing is probably:
printf '\xde\xad\xbe\xef' > file
but it is often more convenient to do
perl -e 'print pack "H*", "deadbeef"' > file
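If the pattern needs to fill a larger region (my addition, not part of the answer; the 1 MiB size is arbitrary), the same print idea extends to string repetition in a single process:
# 4-byte pattern repeated 262144 times = exactly 1 MiB
perl -e 'print "\xde\xad\xbe\xef" x 262144' > file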
If I get your question correctly, printf should do:
$ printf %X 256
100
Can you use od -x instead? That's pretty universally available; od has been around since the dawn of time[1]
[1] Not really the dawn of time.
There are multiple ways to do this in bash; one of them is echo with hex escapes like \x31:
echo -en \\x31\\x32\\x33 > test
The doubled '\\' keeps bash itself from consuming the backslash, so echo receives \x31, where 'x' marks a hex number.
-e enables interpretation of the backslash escapes
-n avoids the trailing newline (else 0x0A will be appended at the end)
Memory testing is a much more complex subject than just writing and reading patterns in memory. Memory testing puts pretty hard limits on what a testing program can do and what state the whole system is in. Technically, it's impossible to test 100% of memory while running a regular OS at all.
On the other hand, you can run some real test program from a shell, or schedule test execution on the next boot with some clever hacking around. You might want to take a look at how it's done in Inquisitor, i.e. running memtester for in-OS testing and scheduling a memtest86* run on the next boot.
If you absolutely must remain in your current booted OS, then memtester would probably be your tool of choice - although note that it's not a very precise memory test.
There are a lot of suggestions to use printf and echo, but there's one tiny difference: printf is not capable of producing binary zeros (NUL bytes), while echo does the job properly. Consider these examples:
printf "\x31\x32\x00\x33\x00\x00\x00\x34">printf.txt
echo -en "\x31\x32\x00\x33\x00\x00\x00\x34">echo.txt
As a result, printf.txt has a size of 3 bytes (yep, it writes up to and including the first zero and then stops), while echo.txt is 8 bytes long and contains the actual data.
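You can check the resulting bytes yourself with od (a brief aside, not from the original answer):
od -An -tx1 printf.txt    # shows exactly which bytes printf wrote
od -An -tx1 echo.txt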

Time taken by `less` command to show output

I have a script that produces a lot of output. The script pauses for a few seconds at point T.
Now I am using the less command to analyze the output of the script.
So I execute ./script | less. I leave it running for sufficient time so that the script would have finished executing.
Now I go through the output of the less command by pressing the Pg Down key. Surprisingly, while scrolling at point T of the output, I notice the pause of a few seconds again.
The script does not expect any input and would have definitely completed by the time I start analyzing the output of less.
Can someone explain how the pause of a few seconds is noticeable in the output of less when the script would have finished executing?
Your script is communicating with less via a pipe. A pipe is an in-memory stream of bytes that connects two endpoints: your script and the less program, the former writing output to it, the latter reading from it.
As pipes are in-memory, it would not be pleasant if they grew arbitrarily large. So, by default, there's a limit on how much data can sit inside the pipe (written, but not yet read) at any given moment; on Linux it's 64 KiB. If the pipe is full and your script tries to write to it, the write blocks. So your script isn't actually still working; it is stopped, blocked inside a write() call.
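You can watch this blocking happen with a toy pipeline (a sketch; the numbers and the 5-second sleep are arbitrary):
# seq produces far more than the 64 KiB pipe capacity, so it fills the pipe and stalls
# in write(); it only finishes (note the timestamp) once wc starts reading 5 seconds in.
( seq 1 200000; date +'seq finished at %T' >&2 ) | { sleep 5; wc -l; }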
How to overcome this? Adjusting the defaults is a bad option; what is used instead is allocating a buffer in the reader, so that it reads into the buffer, freeing the pipe and thus letting the writing program continue, while showing you (or handling) only part of the output. less has such a buffer and, by default, expands it automatically. However, it doesn't fill it in the background; it only fills it as you read the input.
So what would solve your problem is reading the file to the end (as you would by pressing G) and then going back to the beginning (as you would by pressing g). The thing is that you may specify these commands on the command line like this:
./script | less +Gg
You should note, however, that you will have to wait until the whole script's output loads into memory, so you won't be able to view it at once. less is insufficiently sophisticated for that. But if that's what you really need (browsing the beginning of the output while the ./script is still computing its end), you might want to use a temporary file:
./script >x & less x ; rm x
The pipe is full at the OS level, so script blocks until less consumes some of it.
Flow control. Your script is effectively being paused while less is paging.
If you want to make sure that your command completes before you use less interactively, invoke less as less +G and it will read to the end of the input; you can then return to the start by typing 1G into less.
For some background information there's also a nice article by Alexander Sandler called "How less processes its input"!
http://www.alexonlinux.com/how-less-processes-its-input
Can I externally enforce line buffering on the script?
Is there an off-the-shelf pseudo-tty utility I could use?
You may try to use the script command to turn on line-buffering output mode.
script -q /dev/null ./script | less # FreeBSD, Mac OS X
script -c "./script" /dev/null | less # Linux
For more alternatives in this respect please see: Turn off buffering in pipe.
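On Linux, GNU coreutils' stdbuf is also worth a try (my aside, not from the answer above; it only helps when ./script relies on default C stdio buffering rather than managing its own output buffering):
# ask the C library inside ./script to line-buffer stdout instead of block-buffering it
stdbuf -oL ./script | less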
