Check if .tar.gz file is currently being extracted - linux

Is there a way to see if a .tar.gz file is currently being extracted? I'm currently downloading some very large databases and am unsure whether the process crashed/aborted or is still running. The file in question is almost 300GB and will be 4TB if successfully extracted. However, there is no progress being shown in the bash terminal and it has been at work for over 1 hour now, which is longer than what the download took.
It's also worth noting that the storage this file is on is NVMe, but I don't know the exact hardware installed.

If lsof (list open files) is installed, you could try lsof | grep tar to see if it is still extracting.
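To go one step beyond lsof on Linux, you can read the archive's current read offset from /proc, which gives a rough progress figure. This is a minimal sketch rather than a definitive tool: read_offset is a hypothetical helper name, and it assumes a Linux /proc filesystem and that you already know the PID of the process holding the archive open (tar itself or a decompressor it started; lsof on the archive path shows which).

```shell
#!/bin/bash
# read_offset PID FILE: print how many bytes process PID has read from FILE.
# Hypothetical helper; relies on Linux /proc, so it is not portable.
read_offset() {
    local pid=$1 target fd
    target=$(readlink -f "$2") || return 1
    for fd in /proc/"$pid"/fd/*; do
        if [ "$(readlink -f "$fd" 2>/dev/null)" = "$target" ]; then
            # fdinfo holds a "pos:" line with the current file offset
            awk '$1 == "pos:" { print $2 }' "/proc/$pid/fdinfo/${fd##*/}"
            return 0
        fi
    done
    return 1   # PID does not have FILE open
}
```

Comparing the printed offset with the archive size (for example from stat -c %s) gives an approximate percentage. If the offset keeps growing between two calls, the extraction is still making progress.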

If you want to ensure the integrity of the file transfer, the best way is to use "rsync" with the appropriate options. Otherwise, you need to do a separate checksum at both source and copy to compare and confirm they are identical. Command format would be
rsync --checksum ...
If you are trying to confirm whether the downloading process is still live, then searching for the known process on the "ps" report will confirm that it is still running. Depending on your need, the command format would be
ps -ef | grep rsync
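For a scripted version of the same liveness check, a minimal sketch follows. The function names is_running and wait_for_exit are made up for illustration. kill -0 sends no signal and only tests whether the PID exists; note that it also reports failure for a process owned by another user unless you run it as root.

```shell
#!/bin/sh
# is_running PID: succeed if a process with that PID exists and we may signal it.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# wait_for_exit PID [INTERVAL]: poll every INTERVAL seconds (default 10)
# until the process is gone, then say so.
wait_for_exit() {
    while is_running "$1"; do
        sleep "${2:-10}"
    done
    echo "process $1 has finished"
}
```

You would take the PID from the ps output above and call, for example, wait_for_exit 12345 30 (12345 being a placeholder PID).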
At the end of this, you will find a handy script I wrote to monitor progress of a single active rsync job (no parallel jobs).
If you want to monitor the destination directory (reserved uniquely for this download process) for specific insights of file-related OS actions, you could do that with the "inotifywait" utility, which would also report to you both the temporary and permanent filenames, if you use the correct options. The command format would be something like this:
inotifywait -m --format '%:e %f' /tmp
The output looks like this:
Setting up watches.
Watches established.
CREATE dum.txt
OPEN dum.txt
MODIFY dum.txt
CLOSE_WRITE:CLOSE dum.txt
MOVED_FROM dum.txt (if that was a temporary name)
MOVED_TO dum2.txt
Script to monitor "rsync":
#!/bin/sh
##########################################################################################################
### $Id: OS_Admin__partitionMirror_Monitor.sh,v 1.3 2022/08/05 03:46:55 root Exp root $
###
### This script is intended to perform an ongoing scan to report when an active RSYNC backup process terminates.
##########################################################################################################
redON="\033[91;1m"
redOFF="\033[0m"
greenON="\033[92;1m"
greenOFF="\033[0m"
cyanON="\033[96;1m"
cyanOFF="\033[0m"
italicON="\033[3m"
italicOFF="\033[0m"
yellowON="\033[93;1m"
yellowOFF="\033[0m"
. $Oasis/bin/INCLUDES__TerminalEscape_SGR.bh
BASE=`basename "$0" ".sh" `
TMP="/tmp/tmp.${BASE}.$$"
date | awk '{ printf("\n\t %s\n\n", $0 ) ; }'
if [ "$1" = "--snapshots" ]
then
SNAP=1
else
SNAP=0
fi
rm -f ${TMP}
ps -ef 2>&1 | grep -v grep | grep rsync | sort -r >${TMP}
if [ ! -s ${TMP} ]
then
echo "\t RSYNC process is ${redON}not${redOFF} running (or has already ${greenON}terminated${greenOFF}).\n"
exit 0
fi
awk '{ print $2 }' <${TMP} >${TMP}.pid
awk '{ printf("%s|%s\n", $3, $2) }' <${TMP} >${TMP}.ppid
for pid in `cut -f1 -d\| ${TMP}.ppid `
do
PPID=`grep ${pid} ${TMP}.pid `
PID=`grep '^'${pid} ${TMP}.ppid | cut -f2 -d\| `
PRNT=`grep '^'${pid} ${TMP}.ppid | cut -f1 -d\| `
if [ \( -n "${PPID}" \) -a \( "${PRNT}" -ne 1 \) ]
then
descr="child"
echo "\t PID ${PID} is RSYNC ${cyanON}${italicON}${descr}${italicOFF}${cyanOFF} process ..."
else
descr="MASTER"
echo "\t PID ${PID} is RSYNC ${yellowON}${descr}${yellowOFF} process ..."
fi
done
getRsyncProcessStatus()
{
testor=`ps -ef 2>&1 | awk -v THIS="${PID}" '{ if( $2 == THIS ){ print $0 } ; }' `
MODE=`echo "${testor}" |
awk '{ if( $NF ~ /^[/]DB001_F?[/]/ ){ print "2" }else{ print "1" } ; }' 2>>/dev/null `
}
getRsyncProcessStatus
if [ ${MODE} -eq 2 ]
then
echo "\t RSYNC restore process under way ..."
INTERVAL=60
else
echo "\t RSYNC backup process under way ..."
INTERVAL=10
fi
if [ -n "${testor}" ]
then
echo "\n\t ${testor}\n" | sed 's+--+\n\t\t\t\t--+g' | awk '{
rLOC=index($0,"rsync") ;
if( rLOC != 0 ){
sBeg=sprintf("%s", substr($0,1,rLOC-1) ) ;
sEnd=sprintf("%s", substr($0,rLOC+5) ) ;
sMid="\033[91;1mrsync\033[0m" ;
printf("%s%s%s\n", sBeg, sMid, sEnd) ;
}else{
pLOC=index($0,"/DB001_") ;
if( pLOC != 0 ){
sBeg=sprintf("%s", substr($0,1,pLOC-1) ) ;
sEnd=sprintf("%s", substr($0,pLOC) ) ;
printf("%s\033[1m\033[93;1m%s\033[0m\n", sBeg, sEnd) ;
}else{
print $0 ;
} ;
} ;
}'
echo "\n\t Scanning at ${INTERVAL} second intervals ..."
test ${SNAP} -eq 1 || echo "\t \c"
fi
if [ ${SNAP} -eq 1 ]
then
while true
do
getRsyncProcessStatus
if [ -z "${testor}" ]
then
echo "\n\n\t RSYNC process (# ${PID}) has ${greenON}completed${greenOFF}.\n"
date | awk '{ printf("\t %s\n\n", $0 ) ; }'
exit 0
fi
jobLog=`ls -tr /site/Z_backup.*.err | tail -1 `
echo "\t `tail -1 ${jobLog}`"
sleep ${INTERVAL}
done 2>&1 | uniq
else
while true
do
getRsyncProcessStatus
if [ -z "${testor}" ]
then
echo "\n\n\t RSYNC process (# ${PID}) has ${greenON}completed${greenOFF}.\n"
date | awk '{ printf("\t %s\n\n", $0 ) ; }'
exit 0
fi
echo ".\c"
sleep ${INTERVAL}
done
fi
Snapshot of terminal session while the script is running:

Related

Bash Scripting checking for home directories

I'm trying to create a script to check if user accounts have valid home directories.
This is what I have at the moment:
#!/bin/bash
cat /etc/passwd | awk -F: '{print $1 " " $3 " " $6 }' | while read user userid directory; do
if [ $userid -ge 1000 ] && [ ! -d "$directory ]; then
echo ${user}
fi
done
This works. I get the expected output which is the username of the account with an invalid home directory.
eg. output
student1
student2
However, I am unable to make it print "All home directories are valid" only when every home directory is in fact valid.
Didn't run it, but it should be something like:
#!/bin/bash
users=()
# Read from a process substitution rather than a pipe, so the loop runs in
# the current shell and the array is still populated after "done".
while read -r user userid directory; do
    if [ "$userid" -ge 1000 ] && [ ! -d "$directory" ]; then
        users+=("${user}")
    fi
done < <(awk -F: '{print $1 " " $3 " " $6 }' /etc/passwd)
if [ "${#users[@]}" -eq 0 ]; then
    echo "All home directories are valid"
else
    for (( i=0; i<${#users[@]}; i++ )); do echo "${users[$i]}" ; done
fi
You could set a flag, and unset it if you see an invalid directory. Or you could simply check whether your loop printed anything.
You have a number of common antipatterns which you'll want to avoid, too.
# Avoid useless use of cat
# If you are using Awk anyway,
# use it for user id comparison, too
awk -F: '$3 >= 1000 {print $1, $6 }' /etc/passwd |
# Basically always use read -r
while read -r user directory; do
# Fix missing close quote
if [ ! -d "$directory" ]; then
# Quote user
echo "$user"
fi
done |
# If no output, print default message
grep '^' >&2 || echo "No invalid directories" >&2
A proper tool prints its diagnostic output to standard error, not standard output, so I added >&2 to the end.
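The flag approach mentioned above can be sketched as follows. This is an illustrative variant rather than the answerer's code: check_homes and all_valid are hypothetical names, and the loop reads from a process substitution (a bash feature) so that the flag changed inside the loop is still visible afterwards.

```shell
#!/bin/bash
# check_homes [FILE]: list accounts (UID >= 1000) in a passwd-format FILE whose
# home directory is missing; print a summary line if none are found.
check_homes() {
    local passwd_file=${1:-/etc/passwd} all_valid=1 user directory
    while read -r user directory; do
        if [ ! -d "$directory" ]; then
            echo "$user"
            all_valid=0          # clear the flag once an invalid directory is seen
        fi
    done < <(awk -F: '$3 >= 1000 { print $1, $6 }' "$passwd_file")
    if [ "$all_valid" -eq 1 ]; then
        echo "All home directories are valid"
    fi
}
```

Taking the file name as an argument makes it possible to try the function on a scratch copy before pointing it at /etc/passwd.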

sed is not working for commenting a line in a file using bash script

I have created a bash script that is used to modify the ulimit of open files in the RHEL server.
So I am reading the lines in the file /etc/security/limits.conf, and if the soft/hard limit for open files is less than 10000 for the '*' domain, I comment out the line and add a new line with the soft/hard limit set to 10000.
The Script is working as designed but the sed command to comment a line in the script is not working.
Please find the full script below :-
#!/bin/sh
#This script would be called by '' to set ulimit values for open files in unix servers.
#
configfile=/etc/security/limits.conf
help(){
echo "usage: $0 <LimitValue>"
echo -e "where\t--LimitValue= No of files you want all the users to open"
exit 1
}
modifyulimit()
{
grep '*\s*hard\s*nofile\s*' $configfile | while read -r line ; do
firstChar="$(echo $line | xargs | cut -c1-1)"
if [ "$firstChar" != "#" ];then
hardValue="$(echo $line | rev | cut -d ' ' -f1 | rev)"
if [[ "$hardValue" -ge "$1" ]]; then
echo ""
else
sed -i -e 's/$line/#$line/g' $configfile
echo "* hard nofile $1" >> $configfile
fi
else
echo ""
fi
done
grep '*\s*soft\s*nofile\s*' $configfile | while read -r line ; do
firstChar="$(echo $line | xargs | cut -c1-1)"
if [ "$firstChar" != "#" ];then
hardValue="$(echo $line | rev | cut -d ' ' -f1 | rev)"
if [[ "$hardValue" -ge "$1" ]]; then
echo ""
else
sed -i -e 's/$line/#$line/g' $configfile
echo "* hard nofile $1" >> $configfile
fi
else
echo ""
fi
done
}
deleteEofTag(){
sed -i "/\b\(End of file\)\b/d" $configfile
}
addEofTag()
{
echo "#################End of file###################" >> $configfile
}
#-------------Execution of the script starts here ----------------------
if [ $# -ne 1 ];
then
help
else
modifyulimit $1
deleteEofTag
addEofTag
fi
The command sed -i -e 's/$line/#$line/g' $configfile works absolutely fine when executed from the terminal and comments out the line, but it does not work when I execute it from the shell script.
Variable interpolation does not happen inside single quotes, so sed receives the literal text $line. Use double quotes instead, i.e. change
sed -i -e 's/$line/#$line/g'
to
sed -i -e "s/$line/#$line/g"
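You might also try:
sed -i -e "s/${line}/#${line}/g"
The braces mark exactly where the variable name ends, so the shell substitutes the value of the variable. Keep the double quotes: $line contains spaces, so an unquoted expression would be split into several arguments.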
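The difference between the two quoting styles can be checked on a scratch file. The sketch below is only an illustration under stated assumptions: the two sample lines are invented, and the sample is chosen so that it contains no / and no character that sed would treat specially in that position. A real limits.conf line with such characters would need escaping first.

```shell
#!/bin/sh
# Work on a scratch copy so nothing under /etc is touched.
conf=$(mktemp)
printf '%s\n' '* soft nofile 1024' '* hard nofile 4096' > "$conf"
line='* hard nofile 4096'

# Single quotes: sed receives the literal text $line, so nothing matches.
single=$(sed -e 's/$line/#$line/g' "$conf")

# Double quotes: the shell substitutes the value of $line before sed runs.
# "&" in the replacement stands for whatever the pattern matched.
double=$(sed -e "s/^$line\$/#&/" "$conf")

printf '%s\n' "$double"
rm -f "$conf"
```

Anchoring the pattern with ^ and $ keeps the substitution from touching lines that merely contain the same text.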

bash script run to send process to background

Hi, I'm writing a script to run some rsync jobs. The sysadmin has created a script for the rsync process which prompts for options when it runs, so I want to write a wrapper script that passes those answers in and can be run from cron.
The list of directories to rsync is taken from a file.
filelist=$(cat filelist.txt)
for i in filelist;do
echo -e "3\nY" | ./rsync.sh $i
#This will create a rsync log file
So I check a value in the log file, and if it is empty I move on to the next file. If it is not empty, I have to start the rsync process as below, which will take more than 2 hours.
if [ a != 0 ];then
echo -e "3\nN" | ./rsync.sh $i
The rsync process above needs to be sent to the background so that the loop can take the next file. I looked at the screen command, but screen is not working on the server. I also need to record how long the process takes and write that to the log; when I use the time command I am unable to pipe in the echo output. I would appreciate any suggestions for getting this to work.
Question
1. How do I pass the input when using the time command?
echo -e "3\nY" | time ./rsync.sh $i
The one above is not working.
2. How do I send this to the background and take the next file to rsync while the previous rsync process is still running?
Full Code
#!/bin/bash
filelist=$(cat filelist.txt)
Lpath=/opt/sas/sas_control/scripts/Logs/rsync_logs
date=$(date +"%m-%d-%Y")
timelog="time_result/rsync_time.log-$date"
for i in $filelist;do
#echo $i
b_i=$(basename $i)
echo $b_i
echo -e "3\nY" | ./rsync.sh $i
f=$(cat $Lpath/$(ls -tr $Lpath| grep rsync-dry-run-$b_i | tail -1) | grep 'transferred:' | cut -d':' -f2)
echo $f
if [ $f != 0 ]; then
#date=$(date +"%D : %r")
start_time=`date +%s`
echo "$b_i-start:$start_time" >> $timelog
#time ./rsync.sh $i < echo -e "3\nY" 2> "./time_result/$b_i-$date" &
time { echo -e "3\nY" | ./rsync.sh $i; } 2> "./time_result/$b_i-$date"
end_time=`date +%s`
s_time=$(cat $timelog|grep "$b_i-start" |cut -d ':' -f2)
duration=$(($end_time-$s_time))
echo "$b_i duration:$duration" >> $timelog
fi
done
Your question is not very clear, but I'll try:
(1) If I understand you correctly, you want to time the rsync.
My first attempt would be to use echo xxxx | time rsync. On my bash, however, this was broken (bash only recognizes its time keyword at the start of a pipeline). I normally use Zsh instead of bash, and on zsh this indeed runs fine.
If it is important for you to use bash, an alternative (since the time for the echo can likely be neglected) would be to time the whole pipe, i.e. time (echo xxxx | rsync), or even simpler time rsync < <(echo xxxx)
(2) To send a process to the background, add an & to the line. However, the time command of course produces output (that's its purpose), and you don't want to receive output from a program in the background. The solution is to redirect the output, including the timing report itself:
(time rsync < <(echo xxxx) >output.txt 2>error.txt) 2>time.txt &
If you want to time something, you can use:
time sleep 3
If you want to time two things, you can do a compound statement like this (note semicolon after second sleep):
time { sleep 3; sleep 4; }
So, you can do this to time your echo (which will take no time at all) and your rsync:
time { echo "something" | rsync something ; }
If you want to do that in the background:
time { echo "something" | rsync something ; } &
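To combine the pieces above (piped input, timing, and background execution) and also capture the timing in a file, here is a minimal bash sketch. job is a stand-in function, not the real rsync.sh, and the log path is a temporary file chosen for the example. TIMEFORMAT is bash's variable for shaping the report of the time keyword. One detail to be aware of: a redirection written directly after time { ...; } applies to the timed commands only, not to the timing report, so the sketch wraps the whole time command in an outer group.

```shell
#!/bin/bash
# Stand-in for the real job; it consumes its input and takes about a second.
job() { cat >/dev/null; sleep 1; }

TIMEFORMAT='%R'          # bash: report only elapsed wall-clock seconds
timelog=$(mktemp)

# The outer braces matter: "time" writes its report to the shell's stderr,
# so the redirection has to wrap the whole "time ..." command.
{ time { echo "3" | job; } ; } 2>>"$timelog" &

echo "job started in background as PID $!"
wait                     # in a real loop, wait once after starting all jobs
echo "elapsed: $(cat "$timelog")s"
```

If several jobs run at the same time, giving each its own log file avoids interleaved timing lines.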

Editing a .CSV file through shell automation script

I'm getting an error around the done statement when trying to execute the script below. The point of the code is to have the while loop run once for each file listed in filenames.log, in order to grab the revision number for each file location in my branches folder.
filea=/home/filenames.log
fileb=/home/actions.log
filec=/home/revisions.log
filed=/home/final.log
count=1
while read Path do
Status=`sed -n "$count"p $fileb`
Revision=`svn info ${WORKSPACE}/$Path | grep "Revision" | awk '{print $2}'`
if `echo $Path | grep "UpgradeScript"` then
Results="Reverted - ROkere"
Details="Reverted per process"
else if `echo $Path | grep "tsu_includes/shell_scripts"` then
Results="Reverted - ROkere"
Details="Reverted per process"
else
Results="Verified - ROkere"
Details=""
fi
echo "$Path,$Status,$Revision,$Results,$Details" > $filed
count=`expr $count + 1`
done < $filea
You need a semicolon or newline before do and then.
Change else if to elif.
Change
if `echo $Path | grep "UpgradeScript"` then
to (removing the backticks, using a "here-string", and the -q option for grep)
if grep -q "UpgradeScript" <<< "$Path"; then
"filed" will only ever contain one line. I assume you want to append (>>) instead of overwrite (>).
Actually, here is a quick rewrite. You're reading corresponding lines from 2 files; it is faster to do that completely within the shell instead of invoking sed once for each line in the file.
#!/bin/bash
filea=/home/filenames.log
fileb=/home/actions.log
filec=/home/revisions.log # not used?
filed=/home/final.log
exec 3<"$filea" # open $filea on fd 3
exec 4<"$fileb" # open $fileb on fd 4
while read -u3 Path && read -u4 Status; do
Revision=$(svn info "$WORKSPACE/$Path" | awk '/Revision/ {print $2}')
if [[ "$Path" == *"UpgradeScript"* ]]; then
Results="Reverted - ROkere"
Details="Reverted per process"
elif [[ "$Path" == *"tsu_includes/shell_scripts"* ]]; then
Results="Reverted - ROkere"
Details="Reverted per process"
else
Results="Verified - ROkere"
Details=""
fi
echo "$Path,$Status,$Revision,$Results,$Details"
done > "$filed"
exec 3<&- # close fd 3
exec 4<&- # close fd 4
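The lockstep read on two file descriptors can be tried in isolation before wiring in svn. The sketch below uses invented sample data and a hypothetical function name, pair_lines. It attaches the descriptors to the loop itself instead of using exec, which is a design choice of this example and closes them automatically when the loop ends.

```shell
#!/bin/bash
names=$(mktemp); actions=$(mktemp)      # sample data, not the real logs
printf '%s\n' 'db/UpgradeScript.sql' 'src/main.c' > "$names"
printf '%s\n' 'M' 'A' > "$actions"

# pair_lines FILE1 FILE2: print line N of FILE1 and line N of FILE2 joined by a comma.
pair_lines() {
    local path status
    # fd 3 and fd 4 are open only for the duration of the loop
    while read -r -u3 path && read -r -u4 status; do
        echo "$path,$status"
    done 3<"$1" 4<"$2"
}

pair_lines "$names" "$actions"
rm -f "$names" "$actions"
```

The loop stops as soon as either file runs out of lines, so a length mismatch between the two logs silently truncates the output.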

Why do this sample script, keep outputting error near token?

I was trying to see how shell scripts work and how to run them, so I took some sample code from a book I picked up from the library called "Wicked Cool Shell Scripts".
I rewrote the code verbatim, but I'm getting an error on the Linux machine I ran it on, saying:
'd.sh: line 3: syntax error near unexpected token `{
'd.sh: line 3:`gmk() {
Before this I had the curly bracket on the newline but I was still getting :
'd.sh: line 3: syntax error near unexpected token
'd.sh: line 3:`gmk()
#!/bin/sh
#format directory- outputs a formatted directory listing
gmk()
{
#Give input in Kb, output converted to Kb, Mb, or Gb for best output format
if [$1 -ge 1000000]; then
echo "$(scriptbc -p 2 $1/1000000)Gb"
elif [$1 - ge 1000]; then
echo "$$(scriptbc -p 2 $1/1000)Mb"
else
echo "${1}Kb"
fi
}
if [$# -gt 1] ; then
echo "Usage: $0 [dirname]" >&2; exit 1
elif [$# -eq 1] ; then
cd "$#"
fi
for file in *
do
if [-d "$file"] ; then
size = $(ls "$file"|wc -l|sed 's/[^[:digit:]]//g')
elif [$size -eq 1] ; then
echo "$file ($size entry)|"
else
echo "$file ($size entries)|"
fi
else
size ="$(ls -sk "$file" | awk '{print $1}')"
echo "$file ($(gmk $size))|"
fi
done | \
sed 's/ /^^^/g' |\
xargs -n 2 |\
sed 's/\^\^\^/ /g' | \
awk -F\| '{ printf "%39s %-39s\n", $1, $2}'
exit 0
if [$#-gt 1]; then
echo "Usage :$0 [dirname]" >&2; exit 1
elif [$# -eq 1]; then
cd "$#"
fi
for file in *
do
if [ -d "$file" ] ; then
size =$(ls "$file" | wc -l | sed 's/[^[:digit:]]//g')
if [ $size -eq 1 ] ; then
echo "$file ($size entry)|"
else
echo "$file ($size entries)|"
fi
else
size ="$(ls -sk "$file" | awk '{print $1}')"
echo "$file ($(convert $size))|"
fi
done | \
sed 's/ /^^^/g' | \
xargs -n 2 | \
sed 's/\^\^\^/ /g' | \
awk -F\| '{ printf "%-39s %-39s\n", $1, $2 }'
exit 0
sh is very sensitive to spaces. In particular assignment (no spaces around =) and testing (must have spaces inside the [ ]).
This version runs, although fails on my machine due to the lack of scriptbc.
You put an elif in a spot where it was supposed to be if.
Be careful of column alignment between starts and ends. If you mismatch them it will easily lead you astray in thinking about how this works.
Also, adding a set -x near the top of a script is a very good way of debugging what it is doing - it will cause the interpreter to output each line it is about to run before it does.
#!/bin/sh
#format directory- outputs a formatted directory listing
gmk()
{
#Give input in Kb, output converted to Kb, Mb, or Gb for best output format
if [ $1 -ge 1000000 ]; then
echo "$(scriptbc -p 2 $1/1000000)Gb"
elif [ $1 -ge 1000 ]; then
echo "$(scriptbc -p 2 $1/1000)Mb"
else
echo "${1}Kb"
fi
}
if [ $# -gt 1 ] ; then
echo "Usage: $0 [dirname]" >&2; exit 1
elif [ $# -eq 1 ] ; then
cd "$1"
fi
for file in *
do
if [ -d "$file" ] ; then
size=$(ls "$file"|wc -l|sed 's/[^[:digit:]]//g')
if [ $size -eq 1 ] ; then
echo "$file ($size entry)|"
else
echo "$file ($size entries)|"
fi
else
size="$(ls -sk "$file" | awk '{print $1}')"
echo "$file ($(gmk $size))|"
fi
done | \
sed 's/ /^^^/g' |\
xargs -n 2 |\
sed 's/\^\^\^/ /g' | \
awk -F\| '{ printf "%39s %-39s\n", $1, $2}'
exit 0
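Since scriptbc comes from the book and is not a standard command, the conversion function can be tried on its own with awk doing the arithmetic instead. This is a sketch under that assumption, not the book's implementation; it also shows the spacing rules in practice (spaces inside [ ], none around =).

```shell
#!/bin/sh
# gmk KB: format a size given in kilobytes as Kb, Mb or Gb.
# awk stands in for the book's scriptbc helper, assumed to be unavailable here.
gmk() {
    if [ "$1" -ge 1000000 ]; then
        awk -v kb="$1" 'BEGIN { printf "%.2fGb\n", kb / 1000000 }'
    elif [ "$1" -ge 1000 ]; then
        awk -v kb="$1" 'BEGIN { printf "%.2fMb\n", kb / 1000 }'
    else
        echo "${1}Kb"
    fi
}

size=2500000        # assignment: no spaces around =
gmk 512
gmk 1500
gmk "$size"
```

Running the file with sh -x, or adding set -x near the top, prints each command before it runs, which makes spacing mistakes easy to spot.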
By the way, with respect to the book telling you to modify your PATH variable, that's really a bad idea, depending on what exactly it advised you to do. Just to be clear, never add your current directory to the PATH variable unless you intend on making that directory a permanent location for all of your scripts etc. If you are making this a permanent location for your scripts, make sure you add the location to the END of your PATH variable, not the beginning, otherwise you are creating a major security problem.
Linux and Unix do not add your current location, commonly called your PWD, or present working directory, to the path because someone could create a script called 'ls', for example, which could run something malicious instead of the actual 'ls' command. The proper way to execute something in your PWD, is to prepend it with './' (e.g. ./my_new_script.sh). This basically indicates that you really do want to run something from your PWD. Think of it as telling the shell "right here". The '.' actually represents your current directory, in other words "here".
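The lookup behaviour described above can be observed directly. The sketch below works in a throwaway directory and assumes that PATH contains neither "." nor an empty entry, and that the temporary directory allows execution; the script name is the one used as an example in the text.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '#!/bin/sh\necho "hello from the current directory"\n' > my_new_script.sh
chmod +x my_new_script.sh

# A bare name is looked up in PATH only, so the script is not found ...
if command -v my_new_script.sh >/dev/null 2>&1; then
    echo "found via PATH"
else
    echo "not found via PATH"
fi

# ... while an explicit ./ prefix runs the file that is "right here".
./my_new_script.sh
```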
