Replacing awk with bash built-ins - linux

I have been told to write a bash script that adds up all the GroupIDs in "/etc/passwd". This is my script:
#!/bin/sh
# script input should be (sh groupsum.sh /etc/passwd)
if [ -f $1 ] ; then
awk -F ':' '{print $4}' $1 > /tmp/numb
A=`awk '{s+=$1} END {print s}' /tmp/numb`
echo $A
else
echo "its not a file"
fi
The script works fine, but to make it faster I have been told to use bash built-in commands instead of awk. I need information on how to achieve this with built-ins; it would be great if someone could explain it.

You said "bash built-ins", but your script starts with #!/bin/sh -- which requests POSIX sh, not bash. I'll assume, though, that you really do want bash.
#!/bin/bash
[[ -f "$1" ]] || { echo "Not a file" >&2; exit 1; }
exec <"$1"
total=0
while IFS=':' read -r _ _ _ groupid _; do
    (( total += groupid ))
done
echo "$total"
To explain the specific operations used to replace components of your awk script: the read command iterates through lines (by default), splitting them on the characters in IFS; so IFS=':' read -r _ _ _ groupid _ discards the first three columns, puts the fourth in a variable named groupid, and discards the rest. (( )) is a math context in bash; inside it, C-style syntax is usable for integer arithmetic operations, hence the addition.
By the way, reading /etc/passwd directly is a bad idea -- it won't work on systems using LDAP, or NIS, or any other alternate directory service. If you're on a Linux host, you can use the getent program to do a lookup that works with whatever your current directory service is:
$ yourscript <(getent passwd)
All that said, the premise for this question is a poor one -- though there's overhead for spawning any external program, awk included, once it's running awk is much, much faster than bash. If speed were your only priority, you'd do better to not use a shell at all, and have your script start with a shebang that runs the awk interpreter directly.
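For illustration, a minimal sketch of that last suggestion (the file name is hypothetical, and the awk path in the shebang varies by system):
#!/usr/bin/awk -f
# groupsum.awk -- sum the fourth (GID) field of passwd-format input
# usage: ./groupsum.awk /etc/passwd    or: getent passwd | ./groupsum.awk
BEGIN { FS = ":" }
{ sum += $4 }
END { print sum }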

Related

bash script loop breaks [duplicate]

I have the following shell script. The purpose is to loop thru each line of the target file (whose path is the input parameter to the script) and do work against each line. Now, it seems to work only with the very first line in the target file and stops after that line is processed. Is there anything wrong with my script?
#!/bin/bash
# SCRIPT: do.sh
# PURPOSE: loop thru the targets
FILENAME=$1
count=0
echo "proceed with $FILENAME"
while read LINE; do
let count++
echo "$count $LINE"
sh ./do_work.sh $LINE
done < $FILENAME
echo "\ntotal $count targets"
In do_work.sh, I run a couple of ssh commands.
The problem is that do_work.sh runs ssh commands and by default ssh reads from stdin which is your input file. As a result, you only see the first line processed, because the command consumes the rest of the file and your while loop terminates.
This happens not just for ssh, but for any command that reads stdin, including mplayer, ffmpeg, HandBrakeCLI, httpie, brew install, and more.
To prevent this, pass the -n option to your ssh command to make it read from /dev/null instead of stdin. Other commands have similar flags, or you can universally use < /dev/null.
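As a rough sketch of what that looks like inside a helper such as do_work.sh (the host argument and the remote command here are made up):
#!/bin/sh
# hypothetical do_work.sh -- the -n flag makes ssh take its stdin from /dev/null,
# so it cannot swallow the rest of the calling loop's input file
host=$1
ssh -n "$host" 'uname -a'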
A very simple and robust workaround is to change the file descriptor from which the read command receives input.
This is accomplished by two modifications: the -u argument to read, and the file descriptor used in the redirection < $FILENAME.
In BASH, the default file descriptor values (i.e. values for -u in read) are:
0 = stdin
1 = stdout
2 = stderr
So just choose some other unused file descriptor, like 9 just for fun.
Thus, the following would be the workaround:
while read -u 9 LINE; do
    let count++
    echo "$count $LINE"
    sh ./do_work.sh $LINE
done 9< $FILENAME
Notice the two modifications:
read becomes read -u 9
< $FILENAME becomes 9< $FILENAME
As a best practice, I do this for all while loops I write in BASH.
If you have nested loops using read, use a different file descriptor for each one (9,8,7,...).
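A small sketch of that nesting idea, with two hypothetical input files:
# the outer loop reads hosts.txt on fd 9 and the inner loop reads users.txt on fd 8,
# so neither read competes with the other (or with stdin)
while read -u 9 host; do
    while read -u 8 user; do
        echo "$user @ $host"
    done 8< users.txt
done 9< hosts.txt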
More generally, a workaround which isn't specific to ssh is to redirect standard input for any command which might otherwise consume the while loop's input.
while read -r line; do
    ((count++))
    echo "$count $line"
    sh ./do_work.sh "$line" </dev/null
done < "$filename"
The addition of </dev/null is the crucial point here, though the corrected quoting is also somewhat important for robustness; see also When to wrap quotes around a shell variable?. You will want to use read -r unless you specifically require the slightly odd legacy behavior you get for backslashes in the input without -r. Finally, avoid upper case for your private variables.
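A quick illustration of the -r point, if it helps:
# without -r, read treats backslashes as escape characters and drops them;
# with -r, the line is taken literally
printf 'a\\tb\n' | { read line;    echo "without -r: $line"; }   # without -r: atb
printf 'a\\tb\n' | { read -r line; echo "with -r: $line"; }      # with -r: a\tb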
Another workaround of sorts which is somewhat specific to ssh is to make sure any ssh command has its standard input tied up, e.g. by changing
ssh otherhost some commands here
to instead read the commands from a here document, which conveniently (for this particular scenario) ties up the standard input of ssh for the commands:
ssh otherhost <<'____HERE'
some commands here
____HERE
Note that the -n option to ssh prevents checking the exit status of ssh when using a heredoc while piping output to another program, so using /dev/null as stdin is preferred.
#!/bin/bash
while read ONELINE; do
    ssh ubuntu@host_xyz </dev/null <<EOF 2>&1 | filter_pgm
echo "Hi, $ONELINE. You come here often?"
process_response_pgm
EOF
    # save ssh's status before the [ test overwrites PIPESTATUS
    rc=${PIPESTATUS[0]}
    if [ $rc -ne 0 ]; then
        echo "aborting loop"
        exit $rc
    fi
done < input_list.txt
This was happening to me because I had set -e and a grep in a loop was returning no output (which gives a non-zero exit status).
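For anyone hitting that variant, here is a minimal sketch of the failure and the usual guard (the file names are made up for illustration):
#!/bin/bash
set -e
while read -r host; do
    # grep exits non-zero when nothing matches, which aborts the whole script
    # under set -e unless the status is absorbed with "|| true"
    match=$(grep -F "$host" allowlist.txt || true)
    echo "$host: ${match:-no match}"
done < hosts.txt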


Script parameters in Bash

I'm trying to make a shell script which should be used like this:
ocrscript.sh -from /home/kristoffer/test.png -to /home/kristoffer/test.txt
The script will then ocr convert the image file to a text file. Here is what I have come up with so far:
#!/bin/bash
export HOME=/home/kristoffer
/usr/local/bin/abbyyocr9 -rl Swedish -if ???fromvalue??? -of ???tovalue??? 2>&1
But I don't know how to get the -from and -to values. Any ideas on how to do it?
The arguments that you provide to a bash script will appear in the variables $1, $2, $3, and so on, where the number refers to the position of the argument. $0 is the command itself.
The arguments are separated by spaces, so if you provide -from and -to on the command line, they will end up in these variables too. So for this:
./ocrscript.sh -from /home/kristoffer/test.png -to /home/kristoffer/test.txt
You'll get:
$0 # ocrscript.sh
$1 # -from
$2 # /home/kristoffer/test.png
$3 # -to
$4 # /home/kristoffer/test.txt
It might be easier to omit the -from and the -to, like:
ocrscript.sh /home/kristoffer/test.png /home/kristoffer/test.txt
Then you'll have:
$1 # /home/kristoffer/test.png
$2 # /home/kristoffer/test.txt
The downside is that you'll have to supply them in the right order. There are libraries that can make it easier to parse named arguments on the command line, but for simple shell scripts the plain positional approach is usually easiest, if it isn't a problem.
Then you can do:
/usr/local/bin/abbyyocr9 -rl Swedish -if "$1" -of "$2" 2>&1
The double quotes around $1 and $2 are not always necessary but are advised, because arguments containing spaces or other special characters won't work correctly without them.
If you're not completely attached to using "from" and "to" as your option names, it's fairly easy to implement this using getopts:
while getopts f:t: opts; do
    case ${opts} in
        f) FROM_VAL=${OPTARG} ;;
        t) TO_VAL=${OPTARG} ;;
    esac
done
getopts is a shell builtin that processes command line arguments and conveniently parses them for you.
f:t: specifies that you're expecting 2 options that take values (indicated by the colons). Something like f:t:v says that -v will only be interpreted as a flag.
opts is where the current option letter is stored. The case statement is where you process this.
${OPTARG} contains the value following the parameter. ${FROM_VAL} for example will get the value /home/kristoffer/test.png if you ran your script like:
ocrscript.sh -f /home/kristoffer/test.png -t /home/kristoffer/test.txt
As the others are suggesting, if this is your first time writing bash scripts you should really read up on some basics. This was just a quick tutorial on how getopts works.
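Putting that together with the command from the question, a complete script might look roughly like this (the usage message, lowercase variable names and error handling are my own choices, not part of the original answer):
#!/bin/bash
usage() { echo "usage: $0 -f <input image> -t <output text>" >&2; exit 2; }

while getopts f:t: opt; do
    case $opt in
        f) from_val=$OPTARG ;;
        t) to_val=$OPTARG ;;
        *) usage ;;
    esac
done

# both options are required
[ -n "$from_val" ] && [ -n "$to_val" ] || usage

/usr/local/bin/abbyyocr9 -rl Swedish -if "$from_val" -of "$to_val" 2>&1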
Use the variables "$1", "$2", "$3" and so on to access arguments. To access all of them you can use "$@", and to get the count of arguments use $# (which might be useful to check for too few or too many arguments).
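A short sketch of both, as the kind of sanity check you might put at the top of a script:
# "$#" is the argument count; "$@" expands to all arguments, each individually quoted
if [ "$#" -lt 2 ]; then
    echo "usage: $0 <input> <output>" >&2
    exit 1
fi
printf 'got argument: %s\n' "$@"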
I needed to make sure that my scripts are entirely portable between various machines, shells and even cygwin versions. Further, my colleagues, who were the ones I had to write the scripts for, are programmers, so I ended up using this:
for ((i=1; i<=$#; i++)); do
    if [ "${!i}" = "-s" ]; then
        ((i++))
        var1=${!i}
    elif [ "${!i}" = "-log" ]; then
        ((i++))
        logFile=${!i}
    elif [ "${!i}" = "-x" ]; then
        ((i++))
        var2=${!i}
    elif [ "${!i}" = "-p" ]; then
        ((i++))
        var3=${!i}
    elif [ "${!i}" = "-b" ]; then
        ((i++))
        var4=${!i}
    elif [ "${!i}" = "-l" ]; then
        ((i++))
        var5=${!i}
    elif [ "${!i}" = "-a" ]; then
        ((i++))
        var6=${!i}
    fi
done
Rationale: I included a launcher.sh script as well, since the whole operation had several steps which were quasi-independent of each other (I'm saying "quasi" because, even though each script could be run on its own, they were usually all run together), and within two days I found out that about half of my colleagues, being programmers and all, were too good to use the launcher file, follow the usage notes, or read the HELP which was displayed every time they did something wrong, and they were making a mess of the whole thing, running scripts with arguments in the wrong order and complaining that the scripts didn't work properly. Being the choleric I am, I decided to overhaul all my scripts to make sure that they are colleague-proof. The code segment above was the first thing.
In bash, $1 is the first argument passed to the script, $2 the second, and so on.
/usr/local/bin/abbyyocr9 -rl Swedish -if "$1" -of "$2" 2>&1
So you can use:
./your_script.sh some_source_file.png destination_file.txt
Explanation of double quotes;
consider three scripts:
# foo.sh
bash bar.sh $1
# foo2.sh
bash bar.sh "$1"
# bar.sh
echo "1-$1" "2-$2"
Now invoke:
$ bash foo.sh "a b"
1-a 2-b
$ bash foo2.sh "a b"
1-a b 2-
When you invoke foo.sh "a b", it invokes bar.sh a b (two arguments), and with foo2.sh "a b" it invokes bar.sh "a b" (one argument). Always keep in mind how parameters are passed and expanded in bash; it will save you a lot of headache.

I keep getting a 'while syntax' error on the output of the at job in unix and I have no idea why

#!/usr/dt/bin/dtksh
while getopts w:m: option
do
case $option in
w) wflag=1
wval="$OPTARG";;
m) mflag=1
mval="$OPTARG";;
?) printf 'BAD\n' $0
exit 2;;
esac
done
if [ ! -z "$wflag" ]; then
printf "W and -w arg is $wval\n"
fi
if [ ! -z "$mflag" ]; then
printf "M and -m arg is $mval\n"
fi
shift $(($OPTIND - 1))
printf "Remaining arguments are: $* \n"
at $wval <<ENDMARKER
echo $* >> Search_List
tr " " "\n" <Search_List >Usr_List
while true; do
if [ -s Usr_List ]; then
for i in $(cat Usr_List); do
if finger -m | grep $i; then
echo '$i is online' | elm user
sed '/$i/d' <Usr_List >tmplist
mv tmplist Usr_List
fi
done
else
break
fi
done
ENDMARKER
Essentially I want to keep searching through until it is empty. Each time an element of the list is found, it is deleted. Once the list is empty quit.
There are no error messages when I first run the command; the error only shows up in an email containing the output of the at job.
Thanks in advance for any advice
EDIT: The script uses getopts and takes one argument for -w and one for -m; the w value is set as the time for the at job, and the m value still has to be used. Any arguments after the one for m are sent to a file called Search_List, which is edited and saved as Usr_List. Then in the while loop, while Usr_List is not empty, the script checks the results of finger -m against the names in Usr_List. If a name is found, it is removed from Usr_List. Once Usr_List is empty, the program should stop.
elm is a way to send an email, so elm user sends an email to user.
The error is :
while: Expression syntax
at uses /bin/sh by default.
at now <<ENDMARKER
<code here>
ENDMARKER
All of this executes under /bin/sh, which on some systems can be Bourne Shell (Solaris for example).
You need to figure out what /bin/sh is for your system, then modify things accordingly. Plus, read the guarantees about what is and what is not in your "at" environment. I think the problem lies there. You have both UNIX and Linux tags, so I cannot give a lot more help than that.
You can enable logging -- the way YOU need it -- of the at code chunk:
exec > /tmp/somefile.log 2>&1
Then write debugging messages to stdout or stderr.
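And to see what /bin/sh actually is on a given host (the first suggestion above), a quick check along these lines usually suffices:
# on Linux, /bin/sh is typically a symlink (to dash, bash, busybox, ...);
# on older Solaris it is a real Bourne shell binary
ls -l /bin/sh
/bin/sh -c 'echo "${BASH_VERSION:-this /bin/sh is not bash}"'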
Your HEREDOC is being interpolated. Try quoting the delimiter:
at $wval << 'ENDMARKER'
Although (I haven't looked closely) it appears that you want some interpolation. But you definitely do not want it on the lines that reference $i, so quote that $ if you do not quote the entire heredoc:
if finger -m | grep \$i; then
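A tiny demonstration of the difference between quoted and unquoted heredoc delimiters, for reference:
# unquoted delimiter: the shell expands $HOME before the text is used
cat <<EOF
expanded: $HOME
EOF
# quoted delimiter: the text is passed through literally
cat <<'EOF'
literal: $HOME
EOF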
You need to pass the -k option to at:
...
at -k $wval <<ENDMARKER
...
at is otherwise defaulting to your login shell which is csh or one of its derivatives.
It turns out that the while command and the if command needed to be combined.
while [[ -s Usr_List ]]; do
......
done

Forcing bash to expand variables in a string loaded from a file

I am trying to work out how to make bash (force?) expand variables in a string (which was loaded from a file).
I have a file called "something.txt" with the contents:
hello $FOO world
I then run
export FOO=42
echo $(cat something.txt)
this returns:
hello $FOO world
It didn't expand $FOO even though the variable was set. I can't eval or source the file - as it will try and execute it (it isn't executable as it is - I just want the string with the variables interpolated).
Any ideas?
I stumbled on what I think is THE answer to this question: the envsubst command:
echo "hello \$FOO world" > source.txt
export FOO=42
envsubst < source.txt
This outputs: hello 42 world
If you would like to continue work on the data in a file destination.txt, push this back to a file like this:
envsubst < source.txt > destination.txt
In case it's not already available in your distro, it's in the GNU gettext package.
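If you only want certain variables replaced, GNU envsubst also accepts a list of the variable references to substitute and leaves everything else alone (this is the "feature" mentioned in the next answer). A small sketch, with arbitrary variable names:
export FOO=42 BAR=untouched
printf 'hello $FOO world, leave $BAR alone\n' > source.txt
envsubst '$FOO' < source.txt
# prints: hello 42 world, leave $BAR alone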
@Rockallite: I wrote a little wrapper script to take care of the '\$' problem. (BTW, there is a "feature" of envsubst, explained at https://unix.stackexchange.com/a/294400/7088, for expanding only some of the variables in the input, but I agree that escaping the exceptions is much more convenient.)
Here's my script:
#! /bin/bash
## -*-Shell-Script-*-
CmdName=${0##*/}
Usage="usage: $CmdName runs envsubst, but allows '\$' to keep variables from
being expanded.
With option -sl '\$' keeps the back-slash.
Default is to replace '\$' with '$'
"
if [[ $1 = -h ]]; then echo -e >&2 "$Usage"; exit 1; fi
if [[ $1 = -sl ]]; then sl='\'; shift; fi
sed 's/\\\$/\${EnVsUbDolR}/g' | EnVsUbDolR=$sl\$ envsubst "$@"
Many of the answers using eval and echo kind of work, but break on various things, such as multiple lines, attempting to escaping shell meta-characters, escapes inside the template not intended to be expanded by bash, etc.
I had the same issue, and wrote this shell function, which as far as I can tell, handles everything correctly. This will still strip only trailing newlines from the template, because of bash's command substitution rules, but I've never found that to be an issue as long as everything else remains intact.
apply_shell_expansion() {
    declare file="$1"
    declare data=$(< "$file")
    declare delimiter="__apply_shell_expansion_delimiter__"
    declare command="cat <<$delimiter"$'\n'"$data"$'\n'"$delimiter"
    eval "$command"
}
For example, you can use it like this with a parameters.cfg which is really a shell script that just sets variables, and a template.txt which is a template that uses those variables:
. parameters.cfg
printf "%s\n" "$(apply_shell_expansion template.txt)" > result.txt
In practice, I use this as a sort of lightweight template system.
You can try:
echo $(eval echo $(cat something.txt))
You don't want to print each line; you want to evaluate it so that Bash can perform variable substitutions.
FOO=42
while read; do
    eval echo "$REPLY"
done < something.txt
See help eval or the Bash manual for more information.
Another approach (which seems icky, but I am putting it here anyway):
Write the contents of something.txt to a temp file, with an echo statement wrapped around it:
something=$(cat something.txt)
echo "echo \"" > temp.out
echo "$something" >> temp.out
echo "\"" >> temp.out
then source it back in to a variable:
RESULT=$(source temp.out)
and the $RESULT will have it all expanded. But it seems so wrong!
Single-line solution that doesn't need a temporary file:
RESULT=$(source <(echo "echo \"$(cat something.txt)\""))
#or
RESULT=$(source <(echo "echo \"$(<something.txt)\""))
If you only want the variable references to be expanded (an objective that I had for myself) you could do the below.
contents="$(cat something.txt)"
echo $(eval echo \"$contents\")
(The escaped quotes around $contents are key here.)
If something.txt has only one line, here is a bash method (a shorter version of Michael Neale's "icky" answer) using process and command substitution:
FOO=42 . <(echo -e echo $(<something.txt))
Output:
hello 42 world
Note that export isn't needed.
If something.txt has one or more lines, here is a GNU sed evaluate method:
FOO=42 sed 's/"/\\\"/g;s/.*/echo "&"/e' something.txt
The following solution:
allows replacing variables which are defined
leaves placeholders unchanged for variables which are not defined (this is especially useful during automated deployments)
supports replacement of variables in the following formats:
${var_NAME}
$var_NAME
reports which variables are not defined in the environment, and returns an error code in that case
TARGET_FILE=someFile.txt;
ERR_CNT=0;
for VARNAME in $(grep -P -o -e '\$[\{]?(\w+)*[\}]?' ${TARGET_FILE} | sort -u); do
    VAR_VALUE=${!VARNAME};
    VARNAME2=$(echo $VARNAME | sed -e 's|^\${||g' -e 's|}$||g' -e 's|^\$||g');
    VAR_VALUE2=${!VARNAME2};
    if [ "xxx" = "xxx$VAR_VALUE2" ]; then
        echo "$VARNAME is undefined ";
        ERR_CNT=$((ERR_CNT+1));
    else
        echo "replacing $VARNAME with $VAR_VALUE2";
        sed -i "s|$VARNAME|$VAR_VALUE2|g" ${TARGET_FILE};
    fi
done
if [ ${ERR_CNT} -gt 0 ]; then
    echo "Found $ERR_CNT undefined environment variables";
    exit 1
fi
foo=45
file=something.txt # the file contains: Hello $foo world!
eval echo $(cat $file)
$ eval echo $(cat something.txt)
hello 42 world
$ bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin17)
Copyright (C) 2007 Free Software Foundation, Inc.
envsubst is a great solution (see LenW's answer) if the content you're substituting is of "reasonable" length.
In my case, I needed to substitute a file's entire contents in place of the variable name. envsubst requires that the content be exported as an environment variable, and bash has a problem exporting environment variables that are more than a megabyte or so.
awk solution
Using cuonglm's solution from a different question:
needle="doc1_base64" # The "variable name" in the file. (A $ is not needed.)
needle_file="doc1_base64.txt" # Will be substituted for the needle
haystack=$requestfile1 # File containing the needle
out=$requestfile2
awk "BEGIN{getline l < \"${needle_file}\"}/${needle}/{gsub(\"${needle}\",l)}1" $haystack > $out
This solution works even for large files.
expenv () {
    LF=$'\n'
    echo "cat <<END_OF_TEXT${LF}$(< "$1")${LF}END_OF_TEXT" | bash
    return $?
}
expenv "file name"
The following works: bash -c "echo \"$(cat something.txt)\""
