I want to print the longest and shortest username found in /etc/passwd. If I run the code below, it works fine for the shortest (head -1), but the second command (sort -n | tail -1 | awk '{print $2}') never produces any output. Can anyone help me figure out what's wrong?
#!/bin/bash
grep -Eo '^([^:]+)' /etc/passwd |
while read NAME
do
echo ${#NAME} ${NAME}
done |
sort -n |head -1 | awk '{print $2}'
sort -n |tail -1 | awk '{print $2}'
The issue is this:
The pipeline ends with the first sort -n | head -1 | awk '{print $2}' command. That command receives the loop's output through the pipe and prints its result.
The second command gets no piped input, so its sort reads from STDIN, which is the keyboard. It waits until you type some input and press Ctrl+D.
Run the code as follows to get the desired output:
#!/bin/bash
grep -Eo '^([^:]+)' /etc/passwd |
while read NAME
do
echo ${#NAME} ${NAME}
done |
sort -n |head -1 | awk '{print $2}'
grep -Eo '^([^:]+)' /etc/passwd |
while read NAME
do
echo ${#NAME} ${NAME}
done |
sort -n |tail -1 | awk '{print $2}'
All you need is:
$ awk -F: '
NR==1 { min=max=$1 }
length($1) > length(max) { max=$1 }
length($1) < length(min) { min=$1 }
END { print min ORS max }
' /etc/passwd
No explicit loops or pipelines or multiple commands required.
The problem is that you have two pipelines when you really need one. You have grep | while read do ... done | sort | head | awk and, separately, sort | tail | awk. The first sort has an input (the while loop); the second sort doesn't. So the script hangs because the second sort has no piped input; or rather it does have an input, but it's STDIN.
There are various ways to resolve this:
save the output of the while loop to a temporary file and use that as an input to both sort commands
repeat your while loop
use awk to do both the head and tail
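The first two involve iterating over the password file twice, which may be okay, depending on what you're ultimately trying to do. A small awk script, however, can give you both the first and last line in one pass, by saving the first record in an NR==1 rule and printing from the END block. (A BEGIN block runs before any input is read, so it cannot see the first line.)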
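The first option in the list above (a temporary file that both consumers read) might look like the sketch below. The function name shortest_and_longest and its optional file argument are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (name and argument are assumptions for illustration).
# Prints the shortest, then the longest, first field of a passwd-style file.
shortest_and_longest() {
    local file="${1:-/etc/passwd}"
    local tmp
    tmp=$(mktemp) || return 1

    # Write "length name" pairs once, sorted numerically by length.
    while IFS=: read -r name rest; do
        printf '%s %s\n' "${#name}" "$name"
    done < "$file" | sort -n > "$tmp"

    # Both commands read the saved data, so neither falls back to the keyboard.
    head -n 1 "$tmp" | awk '{print $2}'
    tail -n 1 "$tmp" | awk '{print $2}'
    rm -f "$tmp"
}
```

Calling shortest_and_longest with no argument reads /etc/passwd; passing a path reads that file instead.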
While you already have good answers, you can also use POSIX shell to accomplish your goal without any pipe at all, using the parameter expansion and string length provided by the shell itself (see: POSIX shell specification). For example, you could do the following:
#!/bin/sh
sl=32;ll=0;sn=;ln=; ## short len, long len, short name, long name
while read -r line; do ## read each line
u=${line%%:*} ## get user
len=${#u} ## get length
[ "$len" -lt "$sl" ] && { sl="$len"; sn="$u"; } ## if shorter, save len, name
[ "$len" -gt "$ll" ] && { ll="$len"; ln="$u"; } ## if longer, save len, name
done </etc/passwd
printf "shortest (%2d): %s\nlongest (%2d): %s\n" $sl "$sn" $ll "$ln"
Example Use/Output
$ sh cketcpw.sh
shortest ( 2): at
longest (17): systemd-bus-proxy
Using either pipe/head/tail/awk or the shell itself is fine. It's good to have alternatives.
(note: if several users have names of the same length, this just picks the first. To save all of them, you can write the names to a temp file and use -le and -ge for the comparison.)
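The note above mentions keeping every name of the same length. As a rough sketch, and swapping the temp file for string accumulation in awk (a different technique from the shell loop above), that could look like this. The function name all_shortest_and_longest and the variables s_names and l_names are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: list every name tied for shortest and for longest.
# Takes a passwd-style file as its argument (an assumption for this sketch).
all_shortest_and_longest() {
    awk -F: '
        { len = length($1) }
        NR == 1   { min = max = len }
        # A new record length resets the list of names collected so far.
        len < min { min = len; s_names = "" }
        len > max { max = len; l_names = "" }
        # Append names that tie with the current record lengths.
        len == min { s_names = s_names (s_names == "" ? "" : " ") $1 }
        len == max { l_names = l_names (l_names == "" ? "" : " ") $1 }
        END {
            print "shortest (" min "): " s_names
            print "longest (" max "): " l_names
        }
    ' "$1"
}
```

Names are printed in the order they appear in the file, separated by single spaces.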
If you want both the head and the tail from the same input, you may want something like sed -e 1b -e '$!d' after you sort the data to get the top and bottom lines using sed.
So your script would be:
#!/bin/bash
grep -Eo '^([^:]+)' /etc/passwd |
while read NAME
do
echo ${#NAME} ${NAME}
done |
sort -n | sed -e 1b -e '$!d'
Alternatively, a shorter way:
cut -d":" -f1 /etc/passwd | awk '{ print length, $0 }' | sort -n | cut -d" " -f2- | sed -e 1b -e '$!d'
I am writing a function in a BASH shell script, that should return lines from csv-files with headers, having more commas than the header. This can happen, as there are values inside these files, that could contain commas. For quality control, I must identify these lines to later clean them up. What I have currently:
#!/bin/bash
get_bad_lines () {
local correct_no_of_commas=$(head -n 1 $1/$1_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $1 | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$1/$1_0_${i}_0.csv" ]; then
echo "File: $1_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "$1_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$1/$1_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk '$1 > $correct_no_of_commas {print}'
done
}
get_bad_lines products
get_bad_lines users
The output of this program is now all the comma-counts with all of the line numbers in all the files,
and I suspect this is due to the input $1 (foldername, i.e. products & users) conflicting with the call to awk with reference to $1 as well (where I wish to grab the first column being the count of commas for that line in the current file in the loop).
Is this the issue? and if so, would it be solvable by either referencing the 1.st column or the folder name by different variable names instead of both of them using $1 ?
Example, current output:
5 6667
5 6668
5 6669
5 6670
(should only show lines for that file having more than 5 commas).
I also tried declaring the variable in the call to awk, with the same effect (as in the accepted answer to "Awk field variable clash with function argument"):
get_bad_lines () {
local table_name=$1
local correct_no_of_commas=$(head -n 1 $table_name/${table_name}_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $table_name | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$table_name/${table_name}_0_${i}_0.csv" ]; then
echo "File: ${table_name}_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "${table_name}_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$table_name/${table_name}_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk -v table_name="$table_name" '$1 > $correct_no_of_commas {print}'
done
}
You can use awk all the way to achieve that:
get_bad_lines () {
find "$1" -maxdepth 1 -name "$1_0_*_0.csv" | while read -r my_file ; do
awk -v table_name="$1" '
NR==1 { num_comma=gsub(/,/, ""); }
/,/ { if (gsub(/,/, ",", $0) > num_comma) wrong_array[wrong++]=NR":"$0;}
END { if (wrong > 0) {
print(FILENAME" has over "num_comma" commas in the following lines:");
for (i=0;i<wrong;i++) { print(wrong_array[i]); }
}
}' "${my_file}"
done
}
As for why your original awk command failed to print only the lines with too many commas: you are using the shell variable correct_no_of_commas inside a single-quoted awk program ('$1 > $correct_no_of_commas {print}'). The shell therefore performs no substitution, and awk sees $correct_no_of_commas as is. More precisely, awk looks up its own variable correct_no_of_commas, which is undefined in the awk script and so evaluates to an empty string (0 in a numeric context). The condition becomes $1 > $0, which compares the count in $1 with the full input line. The full line has the form <padding><count> <line_number>, which does not look like a number, so awk compares the two as strings. Because uniq -c pads the count with leading spaces here, $1 (which starts with a digit) always sorts after $0 (which starts with a space). Thus $1 > $correct_no_of_commas is always true. Passing the value in with -v (for example -v limit="$correct_no_of_commas") and comparing against limit, without a $, avoids this.
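To make the fix concrete, here is a minimal sketch of the original grep | cut | uniq -c pipeline with the limit handed to awk through -v. The function name lines_over_limit and the awk variable limit are assumptions chosen for the example.

```shell
#!/bin/bash
# Hypothetical helper: print "count line_number" for every line of file $1
# that contains more than $2 commas.
lines_over_limit() {
    local file="$1"
    local limit="$2"
    # grep -o -n prints one "line_number:," per comma; uniq -c counts them per line.
    grep -o -n ',' "$file" | cut -d : -f 1 | uniq -c |
        awk -v limit="$limit" '$1 > limit {print}'   # limit is an awk variable: no $
}
```

Inside the awk program, limit is a plain awk variable, so the comparison is numeric and does not depend on any shell expansion inside the single quotes.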
You can identify all the bad lines with a single awk command (note that ENDFILE is a GNU awk extension):
awk -F, 'FNR==1{print FILENAME; headerCount=NF;} NF>headerCount{print} ENDFILE{print "#######\n"}' /path/here/*.csv
If you want the line number also to be printed, use this
awk -F, 'FNR==1{print FILENAME"\nLine#\tLine"; headerCount=NF;} NF>headerCount{print FNR"\t"$0} ENDFILE{print "#######\n"}' /path/here/*.csv
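Since ENDFILE is specific to GNU awk, a rough portable variant is sketched below. It drops the per-file separator and instead prefixes each bad line with its file name and line number. The function name bad_csv_lines and the file:line:content output format are assumptions for this example.

```shell
#!/bin/sh
# Hypothetical helper: for each CSV file given, print lines that have more
# comma-separated fields than that file's header line.
bad_csv_lines() {
    awk -F, '
        FNR == 1 { header_fields = NF; next }    # remember each file header width
        NF > header_fields { print FILENAME ":" FNR ":" $0 }
    ' "$@"
}
```

It would be called as bad_csv_lines /path/here/*.csv, matching the commands above.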
I am fairly new here, and looking to learn more about bash programming.
So first I need some help with the finger command.
When I use just "finger", this is what I get as output, along with some rows of data:
Login Name Tty Idle Login Time Where
What I want is to adapt the finger command so that it only outputs the "Name" column with its associated data, like this:
Name
...
You can use awk for this:
finger | awk '{print $2}'
Edit: New approach using a combination of awk and cut that's somewhat more robust to arbitrarily formatted names.
#!/bin/bash
#parse_finger.sh
#Read first line from stdin
#(an empty IFS keeps leading/trailing whitespace; '$\n' would split on the literal characters $, \ and n)
IFS= read -r line
#Count the number of chars until 'Name'
str=$(echo "$line" | awk -F "Name" '{print $1}')
start=${#str}
start=$((start+1))
#Count the number of chars until 'Tty'
str=$(echo "$line" | awk -F "Tty" '{print $1}')
stop=${#str}
stop=$((stop-1))
#Print out the 'Name' header
echo "$line" | cut -c "$start-$stop"
#Read in the rest of our lines and print the cols we care about
while IFS= read -r line; do
echo "$line" | cut -c "$start-$stop"
done
Make it executable and run it with finger | ./parse_finger.sh
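The same column-offset idea can also be sketched as a single awk call, using index() on the header line to find where the Name column starts and where Tty begins. The function name name_column is hypothetical, and the sketch assumes the header contains both "Name" and "Tty", as in the output shown in the question.

```shell
#!/bin/sh
# Hypothetical helper: read finger-style output on stdin and print only the
# fixed-width "Name" column (same character range as the script above).
name_column() {
    awk '
        NR == 1 {
            start = index($0, "Name")                # 1-based column of "Name"
            width = index($0, "Tty") - start - 1     # stop one char short of "Tty"
        }
        { print substr($0, start, width) }
    '
}
```

It would be used as finger | name_column; the output keeps the column's trailing padding, just as cut -c does.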
I have the following script:
#!/bin/bash
TotalMem=$(top -n 1 | grep Mem | awk 'NR==1{print $4}') #integer
UsadoMem=$(top -n 1 | grep Mem | awk 'NR==1{print $8}') #integer
PorcUsado='scale=2;UsadoMem/TotalMem'|bc -l
echo $PorcUsado
The variable PorcUsado comes back empty. I searched for how to use bc, but something is wrong...
You're assigning the string scale=2;UsadoMem/TotalMem to PorcUsado and then piping the output of that assignment (nothing) into bc. You probably want the pipe inside a command substitution, e.g. (using a here string instead of a pipe):
PorcUsado=$(bc -l <<<'scale=2;UsadoMem/TotalMem')
But you'll also need to evaluate those shell variables - bc can't do it for you:
PorcUsado=$(bc -l <<<"scale=2;$UsadoMem/$TotalMem")
Notice the use of " instead of ' and the $ prefix to allow Bash to evaluate the variables.
Also, if this is the whole script, you can skip the PorcUsado variable altogether and let bc write directly to stdout.
#!/bin/bash
TotalMem=$(top -n 1 | grep Mem | awk 'NR==1{print $4}') #integer
UsadoMem=$(top -n 1 | grep Mem | awk 'NR==1{print $8}') #integer
bc -l <<<"scale=2;$UsadoMem/$TotalMem"
Why pipe top output at all? Seems too costly.
$ read used buffers < <(
awk -F':? +' '
{a[$1]=$2}
END {printf "%d %d", a["MemTotal"]-a["MemFree"], a["Buffers"]}
' /proc/meminfo
)
Of course, it can easily be a one-liner if you value brevity over readability.
I think the pipe is the problem. Try something like this:
PorcUsado=$(echo "scale=2;$UsadoMem/$TotalMem" | bc -l)
I haven't tested it yet, but you have to echo the string and pipe the output of echo into bc.
EDIT: Correcting the variable names
You don't need grep or bc, since awk can search and do math all by itself:
top -n 1 -l 1 | awk '/Mem/ {printf "%0.2f\n",$8/$4;exit}'
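As a small sketch of awk doing the arithmetic on its own, the division can be wrapped in a function whose inputs are passed as arguments, so it can be tried without parsing top. The function name ratio and the awk variables used and total are assumptions for this example.

```shell
#!/bin/sh
# Hypothetical helper: print used/total with two decimals, using awk only.
ratio() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%0.2f\n", used / total }'
}
```

In the script from the question this would be called as ratio "$UsadoMem" "$TotalMem".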
For example, I have below log files from the 16th-20th of Feb 2015. Now I want to create a single file named, mainentrywatcherReport_2015-02-16_2015-02-20.log. So in other words, I want to extract the date format from the first and last file of week (Mon-Fri) and create one output file every Saturday. I will be using cron to trigger the script every Saturday.
$ ls -l
mainentrywatcher_2015-02-16.log
mainentrywatcher_2015-02-17.log
mainentrywatcher_2015-02-18.log
mainentrywatcher_2015-02-19.log
mainentrywatcher_2015-02-20.log
$ cat *.log >> mainentrywatcherReport_2015-02-16_2015-02-20.log
$ mv *.log archive/
Can anybody help on how to rename the output file to above format?
Perhaps try this:
parta=`ls mainentrywatcher_*.log | head -n 1 | cut -d'_' -f2 | cut -d'.' -f1`
partb=`ls mainentrywatcher_*.log | tail -n 1 | cut -d'_' -f2 | cut -d'.' -f1`
filename=mainentrywatcherReport_${parta}_${partb}.log
cat mainentrywatcher_*.log >> "${filename}"
"ls mainentrywatcher_*.log" lists the log files one per line, sorted by name (plain ls is used because "ls -l" on a directory starts with a "total" line and adds extra columns)
"head -n 1" takes the first line of the output and "tail -n 1" takes the last ("head -n X" gives the first X lines, not just the Xth)
"cut -d'_' -f2" takes the field after the first underscore
"cut -d'.' -f1" takes everything (that remains) before the first period
both commands are surrounded by ` marks (above tilde ~) to capture the output of the command to a variable
filename assembles the two dates, stripped of everything unnecessary, with the other formatting desired for the final file name
the cat command demonstrates one possible way to use the resulting filename
Happy coding! Leave a comment if you have any questions.
You can try this if you want to introduce simple looping...
FROM=$(ls -lrt mainentrywatcher_* | awk '{print $9}' | head -1 | cut -d"_" -f2 | cut -d"." -f1)
TO=$(ls -lrt mainentrywatcher_* | awk '{print $9}' | tail -1 | cut -d"_" -f2 | cut -d"." -f1)
FINAL_LOG=mainentrywatcherReport_${FROM}_${TO}.log
for i in $(ls -lrt mainentrywatcher_* | awk '{print $9}')
do
cat "$i" >> "$FINAL_LOG"
done
echo "All Logs Stored in $FINAL_LOG"
Another approach, given your daily files with the following test contents:
mainentrywatcher_2015-02-16.log -> a
mainentrywatcher_2015-02-17.log -> b
mainentrywatcher_2015-02-18.log -> c
mainentrywatcher_2015-02-19.log -> d
mainentrywatcher_2015-02-20.log -> e
is a simple loop that uses bash parameter expansion/substring extraction:
#!/bin/bash
declare -i cnt=0 # simple counter to determine begin
for i in mainentrywatcher_2015-02-*; do # loop through each matching file
tmp=${i//*_/} # isolate date
tmp=${tmp//.*/}
[ $cnt -eq 0 ] && begin=$tmp || end=$tmp # assign first to begin, last to end
((cnt++)) # increment counter
done
cmbfname="${i//_*/}_${begin}_${end}.log" # form the combined logfile name
cat ${i//_*/}* > $cmbfname # cat all into combined name
## print out begin/end/cmbfname & contents to verify
printf "\nbegin: %s\nend : %s\nfname: %s\n\n" $begin $end $cmbfname
printf "contents: %s\n\n" $cmbfname
cat $cmbfname
exit 0
use/output:
alchemy:~/scr/tmp/stack/tmp> bash weekly.sh
begin: 2015-02-16
end : 2015-02-20
fname: mainentrywatcher_2015-02-16_2015-02-20.log
contents: mainentrywatcher_2015-02-16_2015-02-20.log
a
b
c
d
e
You can, of course, modify the for loop to accept a positional parameter containing the partial filename and pass the partial file name from the command line.
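A rough sketch of that modification follows, with the partial file name passed as the first argument. The function name combine_logs and the "Report" suffix in the output name (taken from the naming asked for in the question) are assumptions for this example.

```shell
#!/bin/bash
# Hypothetical helper: combine_logs PREFIX
# Concatenates PREFIX_<date>.log files into PREFIXReport_<first>_<last>.log
# and prints the name of the combined file.
combine_logs() {
    local prefix="$1"
    local f date begin="" end=""
    for f in "${prefix}"_*.log; do
        [ -e "$f" ] || continue          # glob matched nothing
        date=${f##*_}                    # strip everything through the last "_"
        date=${date%.log}                # strip the extension
        [ -z "$begin" ] && begin=$date   # first file sets the start date
        end=$date                        # last file seen sets the end date
    done
    [ -n "$begin" ] || return 1
    # "Report" keeps the output from matching PREFIX_*.log on a later run.
    cat "${prefix}"_*.log > "${prefix}Report_${begin}_${end}.log"
    echo "${prefix}Report_${begin}_${end}.log"
}
```

From a script it could be called as combine_logs "$1", or directly as combine_logs mainentrywatcher.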
Something like this:
#!/bin/sh
LOGS="`echo mainentrywatcher_2[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log`"
HEAD=
TAIL=
for logs in $LOGS
do
TAIL=`echo $logs | sed -e 's/^.*mainentrywatcher_//' -e 's/\.log$//'`
test -z "$HEAD" && HEAD=$TAIL
done
cat $LOGS >mainentrywatcherReport_${HEAD}_${TAIL}.log
mv $LOGS archive/
That is:
get a list of the existing logs (which happen to be sorted) in a variable $LOGS
walk through the list, getting just the date according to the example
save the first date as $HEAD
save the last date as $TAIL
after the loop, cat all of those files into the new output file
move the used-up log-files into the archive directory.
If I have
days="1 2 3 4 5 6"
func() {
echo "lSecure1"
echo "lSecure"
echo "lSecure4"
echo "lSecure6"
echo "something else"
}
and do
func | egrep "lSecure[1-6]"
then I get
lSecure1
lSecure4
lSecure6
but what I would like is
lSecure2
lSecure3
lSecure5
which is all the days that don't have an lSecure string.
Question
My current idea is to use awk to split the $days and then loop over all combinations.
Is there a better way?
Note that grep -v inverts the sense of a plain grep and does not solve the problem as it does not generate the required strings.
I usually use the -f flag of grep for similar purposes. The <( ... ) construct produces a file listing all the possibilities; grep -v then keeps only those that are not present in the output of func.
func | grep 'lSecure[1-6]' | grep -v -f- <( for i in $days ; do echo lSecure$i ; done )
Or, you may prefer it the other way round:
for i in $days ; do echo lSecure$i ; done | grep -vf <( func | grep 'lSecure[1-6]' )
F=$(func)
for f in $days; do
if ! echo $F | grep -q lSecure$f; then
echo lSecure$f
fi
done
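A variant of the loop above that matches whole lines only is sketched below. With grep -x -F, a candidate such as lSecure1 cannot accidentally match a longer string like lSecure10, which would matter if the list of days grew past single digits. The function name missing_days is hypothetical; days and func are copied from the question.

```shell
#!/bin/bash
days="1 2 3 4 5 6"
func() {
    echo "lSecure1"
    echo "lSecure"
    echo "lSecure4"
    echo "lSecure6"
    echo "something else"
}

# Hypothetical helper: print the lSecure<day> strings that func does not emit.
missing_days() {
    local present d
    present=$(func | grep -x 'lSecure[1-6]')     # whole lines that name a day
    for d in $days; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$present" | grep -qxF "lSecure$d" || echo "lSecure$d"
    done
}

missing_days
```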
An awk solution:
$ func | awk -v i="${days}" 'BEGIN{split(i,a," ")}{gsub(/lSecure/,"");
for(var in a)if(a[var] == $0){delete a[var];break}}
END{for(var in a) print "lSecure" a[var]}' | sort
We store the days in an awk array a. While reading each line, we strip the lSecure prefix to get the trailing number and, if it is present in the array, delete it. At the end, only the elements that were not found remain in the array. The sort is just there to present the result in order :)
I am not sure exactly what you are trying to achieve, but you might consider using uniq -u which deletes repeated sequences. For example you can do this with it:
( echo "$days" | tr -s ' ' '\n'; func | grep -oP '(?<=lSecure)[1-6]' ) | sort | uniq -u
Output:
2
3
5