How to get average CPU temperature from bash? - linux

How can I get the average CPU temperature from bash on Linux? Preferably in degrees Fahrenheit. The script should be able to handle different numbers of CPUs.

You do it like so:
Installation
sudo apt install lm-sensors
sudo sensors-detect --auto
get_cpu_temp.sh
#!/bin/bash
# 1. get temperature
## a. split response
## Core 0: +143.6°F (high = +186.8°F, crit = +212.0°F)
IFS=')' read -ra core_temp_arr <<< "$(sensors -f | grep '^Core\s[[:digit:]]\+:')"
## b. sum the core temperatures
total_cpu_temp=0
index=0
for i in "${core_temp_arr[@]}"; do
  temp=$(echo "$i" | sed -n 's/°F.*//; s/.*[+-]//; p; q')
  let index++
  total_cpu_temp=$(echo "$total_cpu_temp + $temp" | bc)
done
avg_cpu_temp=$(echo "scale=2; $total_cpu_temp / $index" | bc)
## c. build entry
temp_status="CPU: $avg_cpu_temp F"
echo "$temp_status"
exit 0
output
CPU: 135.50 F

You can also read CPU temperatures directly from sysfs (the exact paths vary between machines and operating systems):
Bash:
temp_file=$(mktemp -t "temp-"$(date +'%Y%m%d#%H:%M:%S')"-XXXXXX")
ls $temp_file
while true; do
  cat /sys/class/thermal/thermal_zone*/temp | tr '\n' ' ' >> "$temp_file"
  printf "\n" >> "$temp_file"
  sleep 2
done
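Note that those sysfs values are in millidegrees Celsius. If you want a single Fahrenheit average like above, something along these lines should work (a sketch; which thermal zones exist, and which of them belong to the CPU, varies per machine):
avg_f=$(cat /sys/class/thermal/thermal_zone*/temp \
  | awk '{ sum += $1; n++ } END { printf "%.2f", sum / n / 1000 * 9 / 5 + 32 }')
echo "CPU: $avg_f F"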
If you're a fish user, you may add a function to your config dir, let's say: ~/.config/fish/functions/temp.fish
Fish
function temp
    set temp_file (mktemp -t "temp-"(date +'%Y%m%d#%H:%M:%S')"-XXXXXX")
    ls $temp_file
    while true
        cat /sys/class/thermal/thermal_zone*/temp | tr '\n' ' ' >> "$temp_file"
        printf "\n" >> "$temp_file"
        sleep 2
    end
end

Related

How to find gc content of a fasta file using bash script?

I am learning bioinformatics.
I want to find GC content from a fasta file using Bash script.
GC content is basically (number of (g + c)) / (number of (a + t + g + c)).
I am trying to use the wc command, but I was not able to get the answer.
Edit 17th Feb 2023.
After going through documentation and videos, I came up with a solution.
filenames="$@" # collecting all the filenames passed as parameters
for f in $filenames # looping over the files
do
  echo " $f is being processed..."
  gc=$( grep -v ">" "$f" | grep -io '[gc]' | wc -l ) # -v skips the header lines that start with >; grep -io prints each g or c match on its own line, and wc -l counts them
  total=$( grep -v ">" "$f" | tr -d ' \t\r\n' | wc -c ) # spaces, tabs, carriage returns and newlines are removed with tr, then wc -c counts the remaining bases
  percent=$( echo "scale=2; 100*$gc/$total" | bc -l ) # bc -l returns a float; scale=2 sets the number of decimal places
  echo " The GC content of $f is: $percent%"
  echo
done
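For comparison, the same ratio can be computed in a single awk pass (a sketch; it assumes plain a/c/g/t sequences, ignores ambiguity codes, and your.fasta is a placeholder for the input file):
awk '!/^>/ {
  gc += gsub(/[GCgc]/, "")   # gsub returns the number of matches it replaced
  at += gsub(/[ATat]/, "")
} END { printf "%.2f%%\n", 100 * gc / (gc + at) }' your.fasta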
Do not reinvent the wheel. For common bioinformatics tasks, use open-source tools that are specifically designed for these tasks, are well tested, widely used, and handle edge cases. For example, use the EMBOSS infoseq utility. EMBOSS can be easily installed, for example using conda.
Example:
Install EMBOSS package (do once):
conda create --name emboss emboss --channel iuc
Activate the conda environment and use EMBOSS infoseq, here to print the sequence name, length and percent GC:
source activate emboss
cat your_sequence_file_name.fasta | infoseq -auto -only -name -length -pgc stdin
source deactivate
This prints into STDOUT something like this:
Name Length %GC
seq_foo 119 60.50
seq_bar 104 39.42
seq_baz 191 46.60
...
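If you also want a single overall figure from that per-sequence output, a length-weighted average of the %GC column approximates it (a sketch; assumes the column layout shown above with a one-line header):
cat your_sequence_file_name.fasta \
  | infoseq -auto -only -name -length -pgc stdin \
  | awk 'NR > 1 { gc += $2 * $3; len += $2 } END { printf "overall %%GC: %.2f\n", gc / len }'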
This should work:
#!/usr/bin/env sh
# Adapted from https://www.biostars.org/p/17680
# Fail on error
set -o errexit
# Disable undefined variable reference
set -o nounset
# ================
# CONFIGURATION
# ================
# Fasta file path
FASTA_FILE="file.fasta"
# Number of digits after decimal point
N_DIGITS=3
# ================
# LOGGER
# ================
# Fatal log message
fatal() {
printf '[FATAL] %s\n' "$#" >&2
exit 1
}
# Info log message
info() {
printf '[INFO ] %s\n' "$#"
}
# ================
# MAIN
# ================
{
# Check command 'bc' exist
command -v bc > /dev/null 2>&1 || fatal "Command 'bc' not found"
# Check file exist
[ -f "$FASTA_FILE" ] || fatal "File '$FASTA_FILE' not found"
# Count number of sequences
_n_sequences=$(grep --count '^>' "$FASTA_FILE")
info "Analyzing $_n_sequences sequences"
[ "$_n_sequences" -ne 0 ] || fatal "No sequences found"
# Remove sequence wrapping
_fasta_file_content=$(
sed 's/\(^>.*$\)/#\1#/' "$FASTA_FILE" \
| tr --delete "\r\n" \
| sed 's/$/#/' \
| tr "#" "\n" \
| sed '/^$/d'
)
# Vars
_sequence=
_a_count_total=0
_c_count_total=0
_g_count_total=0
_t_count_total=0
# Read line by line
while IFS= read -r _line; do
# Check if header
if printf '%s\n' "$_line" | grep --quiet '^>'; then
# Save sequence and continue
_sequence=${_line#?}
continue
fi
# Count
_a_count=$(printf '%s\n' "$_line" | tr --delete --complement 'A' | wc --bytes)
_c_count=$(printf '%s\n' "$_line" | tr --delete --complement 'C' | wc --bytes)
_g_count=$(printf '%s\n' "$_line" | tr --delete --complement 'G' | wc --bytes)
_t_count=$(printf '%s\n' "$_line" | tr --delete --complement 'T' | wc --bytes)
# Add current count to total
_a_count_total=$((_a_count_total + _a_count))
_c_count_total=$((_c_count_total + _c_count))
_g_count_total=$((_g_count_total + _g_count))
_t_count_total=$((_t_count_total + _t_count))
# Calculate GC content
_gc=$(
printf 'scale = %d; a = %d; c = %d; g = %d; t = %d; (g + c) / (a + c + g + t)\n' \
"$N_DIGITS" "$_a_count" "$_c_count" "$_g_count" "$_t_count" \
| bc
)
# Add 0 before decimal point
_gc="$(printf "%.${N_DIGITS}f\n" "$_gc")"
info "Sequence '$_sequence' GC content: $_gc"
done << EOF
$_fasta_file_content
EOF
# Total data
info "Adenine total count: $_a_count_total"
info "Cytosine total count: $_c_count_total"
info "Guanine total count: $_g_count_total"
info "Thymine total count: $_t_count_total"
# Calculate total GC content
_gc=$(
printf 'scale = %d; a = %d; c = %d; g = %d; t = %d; (g + c) / (a + c + g + t)\n' \
"$N_DIGITS" "$_a_count_total" "$_c_count_total" "$_g_count_total" "$_t_count_total" \
| bc
)
# Add 0 before decimal point
_gc="$(printf "%.${N_DIGITS}f\n" "$_gc")"
info "GC content: $_gc"
}
The "Count number of sequences" and "Remove sequence wrapping" codes are adapted from https://www.biostars.org/p/17680
The script uses only basic commands except for bc to do the precision calculation (See bc installation).
You can configure the script by modifying the variables in the CONFIGURATION section.
Because you haven't indicated which one you want, the GC content is calculated both per sequence and overall, so get rid of anything that isn't necessary :)
Despite my lack of bioinformatics background, the script successfully parses and analyzes a fasta file.

What is the unix/linux way to divide the number of lines contained in two different files?

I have a file and I am processing it line by line and producing another file with the result. I want to monitor the percentage of completion. In my case, it is just the number of lines in the new file divide by the number of lines from the input file. A simple example would be:
$ cat infile
unix
is
awesome
$ cat infile | process.sh >> outfile &
Now, if I run my command, I should get 0.33 if process.sh completed the first line.
Any suggestions?
You can use pv for progress (in debian/ubuntu inside package pv):
pv -l -s $(wc -l < file.txt) file.txt | process.sh
This will use number of lines for progress.
Or you can use just the number of bytes:
pv file.txt | process.sh
The above commands will show you the percentage of completion and ETA.
You can use bc:
echo "scale=2; $(cat outfile | wc -l) / $(cat infile | wc -l) * 100" | bc
In addition, combine this with watch for updated progress:
watch -d "echo \"scale=2; \$(cat outfile | wc -l) / \$(cat infile | wc -l) * 100\" | bc"
TOTAL_LINES=$(wc -l < infile)
LINES=$(wc -l < outfile)
PERCENT=$(echo "scale=2;${LINES}/${TOTAL_LINES}" | bc | sed -e 's_^\.__')
echo "${PERCENT} % Complete"
scale=2 means you get two digits after the decimal point; the sed strips the leading dot so the fraction reads as a percentage.
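If you'd rather have a self-contained poller than watch, a small loop does the same job (a sketch; assumes infile and outfile as above):
#!/bin/bash
# print progress every 2 seconds until outfile catches up with infile
total=$(wc -l < infile)
while :; do
  lines=$(wc -l < outfile 2>/dev/null || echo 0)
  echo "scale=2; 100 * $lines / $total" | bc
  [ "$lines" -ge "$total" ] && break
  sleep 2
done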

Using bc as daemon in BASH shell from awk

# mkfifo inp out
# bc -ql <inp >out &
[1] 6766
#
# exec 3>inp 4<out
# echo "scale=3; 4/5;" >&3
# read a <&4; echo $a
.800
#
# awk ' BEGIN { printf("4/5\n") >"/dev/fd/3"; exit 1;} '
# read a <&4; echo $a
.800
#
# awk ' BEGIN { printf("4/5\n") >"/dev/fd/3"; exit 1;} '
# awk ' BEGIN { getline a <"/dev/fd/4"; printf("%s\n", a); } '
^C
In BASH environment I can communicate with bc program using fifo.
But in awk I can write but no read with getline function.
How can I read from "/dev/fd/4" in awk?
My awk version is: mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan
Thanks
Laci
Continued:
I did some further experiment and I summarize my result.
Awk script language suits best for my task,
and I need to use "bc" because I have to count with very long numbers (about 100 digits).
The next two scripts show that using a named pipe is faster than an unnamed one (about 83 times).
1) With unnamed pipe:
# time for((i=6000;i;i--)); do a=`echo "$i/1"|bc -ql`; done
real 0m13.936s
2) With named pipe:
# mkfifo in out
# bc -ql <in >out &
# exec 3>in 4<out
#
# time for((i=500000;i;i--)); do echo "$i/1" >&3; read a <&4; done
real 0m14.391s
3) In the awk environment, using bc is slower (about 18 times) than in bash, but it works this way:
# time awk ' BEGIN {
# for(i=30000;i;i--){
# printf("%d/1\n",i) >"/dev/fd/3";
# system("read a </dev/fd/4; echo $a >tmp_1");
# getline a <"tmp_1"; close("tmp_1");}
# } '
real 0m14.178s
4) What can be the problem when I try to do it according to "man awk"?:
# awk ' BEGIN {
# for(i=4;i;i--){
# printf("%d/1\n",i) >"/dev/fd/3"; system("sleep .1");
# "read a </dev/fd/4; echo $a" | getline a ;print a;}
# } '
4.000
4.000
4.000
4.000
The above "awk" script was able to pick up only the first number from the pipe.
The other three numbers remained in the pipe.
These will be visible when I'm reading the pipe after the above awk script.
# for((;;)); do read a </dev/fd/4; echo $a; done
3.000
2.000
1.000
Thanks for gawk.
It sounds like you're looking for gawk's co-process ability; see http://www.gnu.org/software/gawk/manual/gawk.html#Getline_002fCoprocess. Given awk's support of math functions, though, I wonder why you'd want to use bc...
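For reference, with gawk the co-process route looks like this (a sketch using gawk's |& operator, which the question's mawk lacks):
gawk 'BEGIN {
  cmd = "bc -ql"                # bc becomes a co-process
  for (i = 1; i <= 3; i++) {
    printf "%d/5\n", i |& cmd   # write to bc's stdin
    cmd |& getline line         # read bc's reply
    print line
  }
  close(cmd)
}'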
Try:
mkfifo inp out
bc -l <inp >out &
awk ' BEGIN { printf("4/5\n"); exit 0;} ' > inp
read a < out; echo $a
awk ' BEGIN { printf("4/5\n"); exit 0;} ' > inp
awk ' BEGIN { getline a; printf("%s\n", a); exit 0 } ' < out
rm inp
rm out

How to get result from background process linux shell script?

For example, let's say I want to count the number of lines of 10 BIG files and print the total.
for f in files
do
#this creates a background process for each file
wc -l $f | awk '{print $1}' &
done
I was trying something like:
for f in files
do
#this does not work :/
n=$( expr $(wc -l $f | awk '{print $1}') + $n ) &
done
echo $n
I finally found a working solution using anonymous pipes and bash:
#!/bin/bash
# this executes a separate shell and opens a new pipe, where the
# reading endpoint is fd 3 in our shell and the writing endpoint is
# the stdout of the other process. Note that you don't need the
# background operator (&), as process substitution already runs the
# command asynchronously.
exec 3< <(./a.sh 2>&1)
# ... do other stuff
# write the contents of the pipe to a variable. If the other process
# hasn't already terminated, cat will block.
output=$(cat <&3)
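Applied to the line-count question, the same trick could look like this (a sketch; *.txt is a stand-in for your file list):
#!/bin/bash
# each wc runs in the background; they all write into the pipe read on fd 3
exec 3< <(for f in *.txt; do wc -l < "$f" & done; wait)
total=0
while read -r n <&3; do
  total=$(( total + n ))
done
exec 3<&-   # close the read end
echo "$total"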
You should probably use GNU parallel:
find . -maxdepth 1 -type f | parallel --gnu 'wc -l' | awk 'BEGIN {n=0} {n += $1} END {print n}'
or else xargs in parallel mode:
find . -maxdepth 1 -type f | xargs -n1 -P4 wc -l | awk 'BEGIN {n=0} {n += $1} END {print n}'
Another option, if this doesn't fit your needs, is to write to temp files. If you don't want to write to disk, just write to /dev/shm. This is a ramdisk on most Linux systems.
#!/bin/bash
declare -a temp_files
count=0
for f in *
do
if [[ -f "$f" ]]; then
temp_files[$count]="$(mktemp /dev/shm/${f}-XXXXXX)"
((count++))
fi
done
count=0
for f in *
do
if [[ -f "$f" ]]; then
cat "$f" | wc -l > "${temp_files[$count]}" &
((count++))
fi
done
wait
cat "${temp_files[#]}" | awk 'BEGIN {n=0} {n += $1} END {print n}'
for tf in "${temp_files[#]}"
do
rm "$tf"
done
By the way, this can be thought of as a map-reduce, with wc doing the mapping and awk doing the reduction.
You could write that to a file or, better, listen on a fifo so you get the data as soon as it arrives.
Here is a small example on how they work:
# create the fifo
mkfifo test
# listen to it
while true; do if read line < test; then echo "$line"; fi; done
# in another shell, write into the fifo
echo 'hi there' > test
# notice 'hi there' being printed in the first shell
So you could
mkfifo fifo
for f in files
do
#this creates a background process for each file
wc -l "$f" | awk '{print $1}' > fifo &
done
and listen on the fifo for sizes.
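The reading side could then collect one count per file, roughly like this (a sketch; files stands for your file list as in the question, and each read reopens the fifo for the next writer):
total=0
for f in files; do
  read n < fifo
  total=$(( total + n ))
done
echo "total lines: $total"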

File that autoruns itself

The same way it’s possible to write a file that extracts itself, I’m looking for a way to have a script run a program embedded within it (or whatever it takes). I want the program to be part of the script, because I just want one file. It’s actually a challenge: I have an xz-compressed program, and I want to be able to run it without the user having to invoke xz at all (just a ./theprogram).
Any idea?
Autorun after doing what? Login? Call it in ~/.bashrc. During boot? Write an appropriate /etc/init.d/yourprog and link it to the desired runlevel. Selfextract? Make it a shell archive (shar file). See the shar utility, http://linux.die.net/man/1/shar
Sorry, but I was just thinking... wouldn't something like this work?
(I am assuming it is a script...)
#!/bin/bash
cat << 'EOF' > yourfile
yourscript
EOF
chmod +x yourfile
./yourfile
Still, it's pretty hard to understand exactly what you are trying to do... it seems to me that "autorun" here is pretty similar to "call the program from within the script"...
I had written a script for this. This should help:
#!/bin/bash
set -e
payload=$(grep --binary-files=text -n ^PAYLOAD: "$0" | cut -d: -f1)
filename=`head $0 -n $payload | tail -n 1 | cut -d: -f2-`
tail -n +$(( $payload + 1 )) $0 > /tmp/$filename
set +e
#Do whatever with the payload
exit 0
#Command to add payload:
#read x; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
#Note: Strictly no characters after "PAYLOAD:", not even a newline...
PAYLOAD:
Sample usage:
Suppose myNestedScript.sh contains the data below:
#!/bin/bash
echo hello world
Then run
x=myNestedScript.sh; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
It will generate the file below, which you can execute directly. Upon execution, it will extract myNestedScript.sh to /tmp and run that script.
#!/bin/bash
set -e
payload=$(grep --binary-files=text -n ^PAYLOAD: "$0" | cut -d: -f1)
filename=`head $0 -n $payload | tail -n 1 | cut -d: -f2-`
tail -n +$(( $payload + 1 )) $0 > /tmp/$filename
set +e
chmod 755 /tmp/$filename
/tmp/$filename
exit 0
PAYLOAD:myNestedScript.sh
#!/bin/bash
echo hello world
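Since the question mentions an xz-compressed program specifically, the same payload trick works for binary data too; a minimal sketch (names and paths are illustrative; build it with something like cat runner.sh theprogram.xz > theprogram):
#!/bin/bash
set -e
# everything after the PAYLOAD: marker line is the raw xz stream
marker=$(grep --binary-files=text -n '^PAYLOAD:$' "$0" | head -n 1 | cut -d: -f1)
tail -n +$(( marker + 1 )) "$0" | xz -dc > /tmp/theprogram
chmod +x /tmp/theprogram
exec /tmp/theprogram "$@"
PAYLOAD:
After chmod +x theprogram, a plain ./theprogram unpacks and runs the embedded binary without the user ever touching xz.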
