Adding a /home directory to an existing user in Linux

The script below (a bash script on CentOS 7) is not working properly:
#!/bin/bash
cat /etc/passwd | egrep -v '^(root|halt|sync|shutdown)' |
awk -F: '($7 != "/sbin/nologin" && $7 != "/bin/false") { print $1 " " $6 }' |
while read user dir; do
    if [ ! -d "$dir" ]; then
        echo "The home directory ($dir) of user $user does not exist."
    fi
done
The script goes into an infinite loop.
Can you please tell me what to do?

As has already been suggested, there is an unnecessary use of cat into grep and then an unnecessary pipe of grep into awk. You can condense the execution with the following:
#!/bin/bash
awk -F: '$1 !~ /^(root|halt|sync|shutdown)$/ && $7 != "/sbin/nologin" && $7 != "/bin/false" {
    print $1 " " $6
}' /etc/passwd |
while read -r user dir; do
    if [[ ! -d "$dir" ]]; then
        echo "The home directory ($dir) of user $user does not exist."
    fi
done
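If the accounts come from a network database (LDAP/NIS), reading /etc/passwd directly will miss them. A minimal variant of the same idea using getent (assuming getent is available, as it is on CentOS 7) would be:
#!/bin/bash
# Same check, but reading accounts through getent so network users are included.
# The exclusion list and shell tests mirror the script above.
getent passwd |
awk -F: '$1 !~ /^(root|halt|sync|shutdown)$/ && $7 != "/sbin/nologin" && $7 != "/bin/false" { print $1, $6 }' |
while read -r user dir; do
    if [ ! -d "$dir" ]; then
        echo "The home directory ($dir) of user $user does not exist."
    fi
done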

Related

Check if .tar.gz file is currently being extracted

Is there a way to see if a .tar.gz file is currently being extracted? I'm currently downloading some very large databases and am unsure whether the process crashed/aborted or is still running. The file in question is almost 300GB and will be 4TB if successfully extracted. However, there is no progress being shown in the bash terminal and it has been at work for over 1 hour now, which is longer than what the download took.
It's also worth noting that the storage this file is on is NVMe, but I don't know the exact hardware installed.
If lsof (list open files) is installed, you could try lsof | grep tar to see if it is still extracting.
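For example, assuming the archive is at /path/to/file.tar.gz (a placeholder path), you could check whether any process still holds it open:
# If some process (e.g. tar or gzip) still has the archive open,
# lsof exits 0 and lists it; otherwise nothing is printed.
if lsof /path/to/file.tar.gz >/dev/null 2>&1; then
    echo "Archive is still open - extraction is probably still running."
else
    echo "No process currently has the archive open."
fi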
If you want to ensure the integrity of the file transfer, the best way is to use "rsync" with the appropriate options. Otherwise, you need to do a separate checksum at both source and copy to compare and confirm they are identical. Command format would be
rsync --checksum ...
If you are trying to confirm whether the download process is still live, searching for the known process in the "ps" output will confirm that it is still running. Depending on your need, the command format would be
ps -ef | grep rsync
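On systems with procps-ng installed, pgrep can do the same in one step (an alternative, not part of the original suggestion):
# -a prints the full command line, -f matches against the full command line
pgrep -af rsync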
At the end of this answer, you will find a handy script I wrote to monitor the progress of a single active rsync job (no parallel jobs).
If you want to monitor the destination directory (reserved uniquely for this download process) for specific insight into file-related OS actions, you could do that with the "inotifywait" utility, which would also report both the temporary and permanent filenames if you use the correct options. The command format would be something like this:
inotifywait -m --format '%:e %f' /tmp
The output looks like this:
Setting up watches.
Watches established.
CREATE dum.txt
OPEN dum.txt
MODIFY dum.txt
CLOSE_WRITE:CLOSE dum.txt
MOVED_FROM dum.txt (if that was a temporary name)
MOVED_TO dum2.txt
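If the goal is just to see whether files are still being written into the extraction directory, limiting the watch to a couple of events keeps the output manageable (a sketch; /path/to/extract-dir is a placeholder):
# Report only newly created files and files that finish being written
inotifywait -m -e create -e close_write --format '%e %f' /path/to/extract-dir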
Script to monitor "rsync":
#!/bin/sh
##########################################################################################################
### $Id: OS_Admin__partitionMirror_Monitor.sh,v 1.3 2022/08/05 03:46:55 root Exp root $
###
### This script is intended to perform an ongoing scan to report when an active RSYNC backup process terminates.
##########################################################################################################
redON="\033[91;1m"
redOFF="\033[0m"
greenON="\033[92;1m"
greenOFF="\033[0m"
cyanON="\033[96;1m"
cyanOFF="\033[0m"
italicON="\033[3m"
italicOFF="\033[0m"
yellowON="\033[93;1m"
yellowOFF="\033[0m"
. $Oasis/bin/INCLUDES__TerminalEscape_SGR.bh
BASE=`basename "$0" ".sh" `
TMP="/tmp/tmp.${BASE}.$$"
date | awk '{ printf("\n\t %s\n\n", $0 ) ; }'
if [ "$1" = "--snapshots" ]
then
SNAP=1
else
SNAP=0
fi
rm -f ${TMP}
ps -ef 2>&1 | grep -v grep | grep rsync | sort -r >${TMP}
if [ ! -s ${TMP} ]
then
    echo "\t RSYNC process is ${redON}not${redOFF} running (or has already ${greenON}terminated${greenOFF}).\n"
    exit 0
fi
awk '{ print $2 }' <${TMP} >${TMP}.pid
awk '{ printf("%s|%s\n", $3, $2) }' <${TMP} >${TMP}.ppid
for pid in `cut -f1 -d\| ${TMP}.ppid `
do
    PPID=`grep ${pid} ${TMP}.pid `
    PID=`grep '^'${pid} ${TMP}.ppid | cut -f2 -d\| `
    PRNT=`grep '^'${pid} ${TMP}.ppid | cut -f1 -d\| `
    if [ \( -n "${PPID}" \) -a \( "${PRNT}" -ne 1 \) ]
    then
        descr="child"
        echo "\t PID ${PID} is RSYNC ${cyanON}${italicON}${descr}${italicOFF}${cyanOFF} process ..."
    else
        descr="MASTER"
        echo "\t PID ${PID} is RSYNC ${yellowON}${descr}${yellowOFF} process ..."
    fi
done
getRsyncProcessStatus()
{
    testor=`ps -ef 2>&1 | awk -v THIS="${PID}" '{ if( $2 == THIS ){ print $0 } ; }' `
    MODE=`echo "${testor}" |
        awk '{ if( $NF ~ /^[/]DB001_F?[/]/ ){ print "2" }else{ print "1" } ; }' 2>>/dev/null `
}
getRsyncProcessStatus
getRsyncProcessStatus
if [ ${MODE} -eq 2 ]
then
    echo "\t RSYNC restore process under way ..."
    INTERVAL=60
else
    echo "\t RSYNC backup process under way ..."
    INTERVAL=10
fi
if [ -n "${testor}" ]
then
    echo "\n\t ${testor}\n" | sed 's+--+\n\t\t\t\t--+g' | awk '{
        rLOC=index($0,"rsync") ;
        if( rLOC != 0 ){
            sBeg=sprintf("%s", substr($0,1,rLOC-1) ) ;
            sEnd=sprintf("%s", substr($0,rLOC+5) ) ;
            sMid="\033[91;1mrsync\033[0m" ;
            printf("%s%s%s\n", sBeg, sMid, sEnd) ;
        }else{
            pLOC=index($0,"/DB001_") ;
            if( pLOC != 0 ){
                sBeg=sprintf("%s", substr($0,1,pLOC-1) ) ;
                sEnd=sprintf("%s", substr($0,pLOC) ) ;
                printf("%s\033[1m\033[93;1m%s\033[0m\n", sBeg, sEnd) ;
            }else{
                print $0 ;
            } ;
        } ;
    }'
    echo "\n\t Scanning at ${INTERVAL} second intervals ..."
    test ${SNAP} -eq 1 || echo "\t \c"
fi
if [ ${SNAP} -eq 1 ]
then
    while true
    do
        getRsyncProcessStatus
        if [ -z "${testor}" ]
        then
            echo "\n\n\t RSYNC process (# ${PID}) has ${greenON}completed${greenOFF}.\n"
            date | awk '{ printf("\t %s\n\n", $0 ) ; }'
            exit 0
        fi
        jobLog=`ls -tr /site/Z_backup.*.err | tail -1 `
        echo "\t `tail -1 ${jobLog}`"
        sleep ${INTERVAL}
    done 2>&1 | uniq
else
    while true
    do
        getRsyncProcessStatus
        if [ -z "${testor}" ]
        then
            echo "\n\n\t RSYNC process (# ${PID}) has ${greenON}completed${greenOFF}.\n"
            date | awk '{ printf("\t %s\n\n", $0 ) ; }'
            exit 0
        fi
        echo ".\c"
        sleep ${INTERVAL}
    done
fi
A snapshot of a terminal session while the script is running was attached to the original answer (image not reproduced here).

Bash Scripting checking for home directories

I'm trying to create a script to check if user accounts have valid home directories.
This is what I have at the moment:
#!/bin/bash
cat /etc/passwd | awk -F: '{print $1 " " $3 " " $6 }' | while read user userid directory; do
    if [ $userid -ge 1000 ] && [ ! -d "$directory ]; then
        echo ${user}
    fi
done
This works. I get the expected output which is the username of the account with an invalid home directory.
eg. output
student1
student2
However, I am unable to make it echo "All home directories are valid" ONLY when there are no issues and every home directory is valid.
Didn't run it, but it should be something like:
#!/bin/bash
users=()
while read user userid directory; do
    if [ $userid -ge 1000 ] && [ ! -d "$directory" ]; then
        users+=("${user}")
    fi
done < <(awk -F: '{print $1 " " $3 " " $6 }' /etc/passwd)
if [ ${#users[@]} -eq 0 ]; then
    echo "All home directories are valid"
else
    for (( i=0; i<${#users[@]}; i++ )); do echo "${users[$i]}" ; done
fi
You could set a flag, and unset it if you see an invalid directory. Or you could simply check whether your loop printed anything.
You have a number of common antipatterns which you'll want to avoid, too.
# Avoid useless use of cat
# If you are using Awk anyway,
# use it for user id comparison, too
awk -F: '$3 >= 1000 {print $1, $6 }' /etc/passwd |
# Basically always use read -r
while read -r user directory; do
    # Fix missing close quote
    if [ ! -d "$directory" ]; then
        # Quote user
        echo "$user"
    fi
done |
# If no output, print default message
grep '^' >&2 || echo "No invalid directories" >&2
A proper tool prints its diagnostic output to standard error, not standard output, so I added >&2 to the end.
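For completeness, a minimal sketch of the flag approach mentioned above (assuming bash, and using process substitution so the flag survives the loop):
#!/bin/bash
# Assume all directories are valid; clear the flag on the first invalid one.
all_valid=1
while read -r user directory; do
    if [ ! -d "$directory" ]; then
        echo "$user" >&2
        all_valid=0
    fi
done < <(awk -F: '$3 >= 1000 { print $1, $6 }' /etc/passwd)
if [ "$all_valid" -eq 1 ]; then
    echo "All home directories are valid" >&2
fi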

Bash function not returning value to the script while running through cron

I have a main script that calls a function which is written in another file. I have sourced this file inside the main script. When I run the main script using cron, the function does not return any value, whereas if I run the same main script in a console, the function returns a value and prints it on the console.
Below is my main script.
source /home/employee/conf/myConf.cfg $0
load_dte=`fnGetDate dir1 dir2/*`
fnLog "load_dte $load_dte"
And the configuration file that contains the function:
function fnGetDate(){
x="2016-06-21"
echo "$x"
}
When this main script is scheduled via cron, the output is:
load_dte
Without cron, the output is:
load_dte 2016-06-21
Any help is appreciated.
Edit:
Below is the full codebase:
$1 and $2 are the HDFS locations /tmp/hdfs/loc1 and /tmp/hdfs/loc2.
main script:
#/bin/bash
PATH=/bin:/usr/bin
dirname=$( cd "$( dirname "$0" )" && pwd )
source `echo ${dirname%/*}`/conf/lib.cfg $
fnGetScriptPath $0
fnCreateLogFile ${0}
fnMain(){
    TBL_BUS="$1"
    TBL_ENT="$2"
    load_dte=`fnGetDate "/tmp/hdfs/loc1" "/tmp/hdfs/loc2*" ""`
    echo load_dte $load_dte
    fnLog "load_dte $load_dte" "$C_INFO"
}
fnMain $1 $2
Below is the config file:
SCRIPT_NAME="$1"
CURRENT_DATE=`date +%Y-%m-%d`
C_INFO=0
C_SUCCESS=1
C_DEBUG=2
C_WARN=3
C_ERROR=4
LOG_STATUS=(INFO SUCCESS DEBUG WARN ERROR)
function fnCreateLogFile(){
    fnGetScriptPath $1
    FILE=`basename $1`
    LOGDIR=$root_dir/logs
    LOGFILE=${LOGDIR}/${FILE/'.sh'/''}_${CURRENT_DATE}.log
    if [[ ! -d $LOGDIR ]]; then
        mkdir -p $LOGDIR
        fnLog "Creating log dir: $LOGDIR" $C_SUCCESS
    else
        fnLog "Log file : $(cd `dirname $LOGFILE` && pwd)/$FILENAME" "$C_INFO"
    fi
}
function fnGetScriptPath(){
    script_name=`basename $1`
    script_base_name=`echo $script_name | cut -d'.' -f1`
    dir_name=`dirname $1`
    if [[ $dir_name == "." ]];then
        work_dir=`pwd`
        root_dir=`pwd | awk 'BEGIN{FS="/"; OFS = "/"} {$NF="";print}' | sed 's/.$//'`
    else
        work_dir=$dir_name
        root_dir=`echo $dir_name | awk 'BEGIN{FS="/"; OFS = "/"} {$NF="";print}' | sed 's/.$//'`
    fi
}
function fnLog() {
    echo `date +"%Y-%m-%d %k:%M:%S"` $1
    echo `date +"%Y-%m-%d %k:%M:%S"` $1 >> $LOGFILE
    if [[ $2 -eq 1 || $2 -eq 2 ]]; then MSG_ACCUM="$MSG_ACCUM \\n $1" ;fi
}
function fnGetDate(){
    HDFS_SRC=$1
    HDFS_TGT=$2
    SKIP=${3:-"0"}
    for X in ${!HDFS_SRC[*]}
    do
        #echo "comm -23 <(hadoop fs -ls ${HDFS_SRC[X]} | awk -F'/' '{match(\$NF, \"[0-9]+-[0-9]+-[0-9]+\", v)}{print v[0]}' | sort) <( hadoop fs -ls $HDFS_TGT | awk -F'/' '{match(\$NF, \"[0-9]+-[0-9]+-[0-9]+\", v)}{print v[0]}'"
        HDFS=("${HDFS[@]}" "`comm -23 <(hadoop fs -ls ${HDFS_SRC[X]} | awk -F'/' '{match($NF, "[0-9]+-[0-9]+-[0-9]+", v)}{print v[0]}' | sort) <( hadoop fs -ls $HDFS_TGT | awk -F'/' '{match($NF, "[0-9]+-[0-9]+-[0-9]+", v)}{print v[0]}' | sort ) | grep "^[0-9]"`")
    done
    if [[ $value == "" ]];then
        DATE=( $(
            for x in "${HDFS[@]}"
            do
                echo "$x"
            done | sort | uniq) )
        echo ${DATE[$SKIP]}
    else
        DATE=( $(
            for f in ${HDFS[@]}
            do
                if [[ "$f" > "$value" || "$f" == "$value" ]];then
                    echo "$f"
                fi
            done | sort | uniq) )
        echo "${DATE[$SKIP]}"
    fi
}
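Not part of the original post, but a common way to see what differs when the script runs from cron is to capture everything it prints, including errors from commands (such as hadoop) that may not be on cron's minimal PATH. A sketch of such a crontab entry (the script path is hypothetical):
# Hypothetical crontab entry: log stdout and stderr of the main script
# so any "command not found" errors from the sourced functions are visible.
* * * * * /home/employee/bin/main.sh >>/tmp/main.cron.log 2>&1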

How to verify the output of a command in shell?

I need to verify if a particular file system has the noexec option set.
For example /dev/shm. I am running the command in the following manner:
get_val=$(mount | grep /dev/shm)
if [[ -z $get_val ]] ; then
    # Error
else
    value=$(echo "${get_val}" | cut -d " " -f 6 | grep noexec)
    if [ "${value}" = "" ]; then
        # Not set
    else
        # Set
    fi
fi
The value of get_val is something like devshm on /dev/shm type devshm (rw,noexec,nosuid,gid=5,mode=0620)
Next what I want to do is check if gid and mode has been set to a certain value. However, with the above procedure, I can only check if an option is set.
So I tried this:
echo "${get_val}"| cut -d " " -f 6 | awk -F, '{
if ($4 == "gid=123"){
print 1;
}
else
{ print 0;}
if ($5 == "mode=123)"){
print 1;
}
else
{ print 0;}'
However, this seems like too much of a hassle and I am not sure what a better way to do this would be.
Also, other options such as nodev could be set on a filesystem, which would make $5 or $2 different.
Any suggestions?
Looks like you really should be turning to Awk even for the basic processing.
if mount | awk '$3 == "/dev/shm" && $6 ~ /[(,]noexec[,)]/ &&
    $6 ~ /[(,]gid=123[,)]/ && $6 ~ /[(,]mode=123[,)]/ { found = 1; exit }
    END { exit !found }'
then
    echo fine
else
    echo fail
fi
The [(,] and [,)] anchors make sure each match is bracketed either by commas or by the surrounding parentheses of the options field, to avoid partial matches (like mode=1234).
Getting Awk to set its exit status so it can be used idiomatically in an if condition is a little bit of a hassle (hence the found flag: an exit in the END block overrides an earlier exit status), but it is a good general idiom to learn.
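The same exit-status idiom in isolation, as a generic sketch unrelated to mount:
# Exit 0 if any input line contains "foo", 1 otherwise
printf 'a\nfoo\nb\n' | awk '/foo/ { found = 1; exit } END { exit !found }' && echo matched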
Why not directly use globbing with [[:
[[ ${get_val} == *\(*gid=5*\)* ]] && echo 'Matched'
Similarly, with "mode=0620":
[[ ${get_val} == *\(*mode=0620*\)* ]] && echo 'Matched'
If you prefer Regex way:
[[ ${get_val} =~ \(.*gid=5.*\) ]] && echo 'Matched'
[[ ${get_val} =~ \(.*mode=0620.*\) ]] && echo 'Matched'
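If findmnt (part of util-linux, present on most modern Linux systems) is available, it avoids parsing mount output entirely; a minimal sketch:
# Print only the mount options of /dev/shm, then test individual options
opts=$(findmnt -n -o OPTIONS /dev/shm)
case ",$opts," in
    *,noexec,*) echo "noexec is set" ;;
    *)          echo "noexec is not set" ;;
esac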
Sorry if this is stupid, but isn't it as simple as
mount | grep -E 'on /dev/shm .*noexec' && value=1
((value)) && #Do something useful
If you wish to check multiple fields you can pipe the grep like below:
mount | grep -E 'on /dev/shm .*noexec' \
| grep -E 'gid=5.*mode=0620|mode=0620.*gid=5' >/dev/null && value=1
((value==1)) && #Do something useful

optimizing the while loop with awk command

Can anyone suggest how to optimize the while loop below, which is part of a shell script?
function setvars() {
    CONN_TSMP="$1"
    USER="$2"
    DB="$3"
    IP="$4"
    HOST="$5"
    return
}
while read line; do
    TST=`grep -w $line $FILE1`
    ID=`echo $line | tr -d '\"'`
    VARS=$(echo ${TST} | awk -F '"' '{print $2 " " $10 " " $22 " " $20 " " $18 }')
    setvars $VARS
    if [ -z "$IP" ]; then
        IP=`echo "$HOST"`
    fi
    if [ "$USER" == "root" ] && [ -z $DB ]; then
        TARGET=/home/database/data1/mysql_audit/sessions/root_sec
        FILE=`echo "$ID-$CONN_TSMP-$USER#$IP.txt"`
    else
        TARGET=/home/database/data1/mysql_audit/sessions/user_sec
        FILE=`echo "$ID-$CONN_TSMP-$USER#$IP.txt"`
    fi
    ls $TARGET/$FILE
    if [ $? -ne 0 ]; then
        echo -e "################################################################ \n" >> "$TARGET/$FILE"
        echo "$TST" | awk -F 'STATUS="0"' '{print $2}'| sed "s/[</>]//g" >> "$TARGET/$FILE"
        echo -e "\n" >> "$TARGET/$FILE"
    fi
    awk -F '"' '/"'$line'"/ {print "\n======================================\nTIMESTAMP=" $2 "\nSQLTEXT=" $10}' $FILE3 >> "$TARGET/$FILE"
done < "$FILE4"
From my observation, awk is taking the most time.
Can anyone help me optimize the above code, either by rewriting it as a single awk program (an awk loop that replaces the entire while loop shown above) or by removing the awk, sed, or grep calls that take the most time?
1) In setvars(), remove the double quotes around the assignments. The double quotes force the shell to rescan the values. This is minor, but in large shell scripts, it can add up to quite a bit of processing time.
2) You have multiple VAR=`echo $SOMEVAL` assignments. Just assign the value directly: IP=$HOST
FILE="$ID-$CONN_TSMP-$USER#$IP.txt"
3) You are running an external program 'ls' to check and see if a file exists. Instead, use the builtin shell commands: if [ ! -f "$TARGET/$FILE" ]; then ...; fi. If you want the output, just do an: echo "$TARGET/$FILE".
4) Open the output file once. This is much faster, but can make maintenance of the script quite difficult. Since you only have 4 echo lines, it may not help that much.
NEWFILE=0
[ -f "$TARGET/$FILE" ] || NEWFILE=1
exec 4>>"$TARGET/$FILE"
if [ ${NEWFILE} -eq 1 ]; then
    echo -e ... >&4
    ...
fi
awk -f ... >&4
exec 4>&-
It's not possible to optimize your awk without seeing the data it is processing. You appear to have a more modern shell as there is a $(...) construct. Replace any backtick usage with $(...).
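Putting points 2) and 3) together, a sketch of the reworked loop might look like the following (paths, field positions, and the "#" in the file name are taken from the original script; a grouped redirection is used here instead of the numbered file descriptor):
while read -r line; do
    TST=$(grep -w "$line" "$FILE1")
    ID=$(echo "$line" | tr -d '"')
    # Word splitting of the awk output is intentional: it supplies the five arguments.
    setvars $(echo "$TST" | awk -F '"' '{print $2, $10, $22, $20, $18}')
    [ -z "$IP" ] && IP=$HOST
    if [ "$USER" = "root" ] && [ -z "$DB" ]; then
        TARGET=/home/database/data1/mysql_audit/sessions/root_sec
    else
        TARGET=/home/database/data1/mysql_audit/sessions/user_sec
    fi
    FILE="$ID-$CONN_TSMP-$USER#$IP.txt"
    if [ ! -f "$TARGET/$FILE" ]; then
        # Write the header block for a new session file in one redirection
        {
            echo -e "################################################################ \n"
            echo "$TST" | awk -F 'STATUS="0"' '{print $2}' | sed "s/[</>]//g"
            echo -e "\n"
        } >> "$TARGET/$FILE"
    fi
    awk -F '"' '/"'"$line"'"/ {print "\n======================================\nTIMESTAMP=" $2 "\nSQLTEXT=" $10}' "$FILE3" >> "$TARGET/$FILE"
done < "$FILE4"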
