Script is re-reading arguments - Linux

When I supply the script with the argument hi[123].txt it does exactly what I want.
But if I use the wildcard pattern ( hi*.txt ) it re-reads some files.
I was wondering how to modify this script to fix that silly problem:
#!/bin/sh
count="0"
total="0"
FILE="$1" #FILE specification is now $1 Specification..
for FILE in $@
do
    #if the file is not readable then say so
    if [ ! -r $FILE ];
    then
        echo "File: $FILE not readable"
        exit 0
    fi
    # Start processing readable files
    while read line
    do
        if [[ "$line" =~ ^Total ]];
        then
            tmp=$(echo $line | cut -d':' -f2)
            total=$(expr $total + $tmp)
            echo "$FILE (s) have a total of:$tmp "
            count=$(expr $count + 1)
        fi
    done < $FILE
done
echo " Total is: $total"
echo " Number of files read is:$count"

This seems redundant:
FILE="$1" #FILE specification is now $1 Specification..
for FILE in $@
...
The initial assignment is promptly overwritten.
On the whole this seems to be a task better suited to a line processing language like awk or perl.
Consider something along the lines of this awk script:
BEGIN{
TOTAL=0;
COUNT=0;
FS=':';
}
/^Total/{
TOTAL += $2;
COUNT++;
printf("File '%s' has a total of %i",FILENAME,TOTAL);
}
END{
printf("Total is %i",TOTAL);
printf("Number of files read is%i",COUNT);
}
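Saved as, say, totals.awk (a hypothetical name), you would run it over the same glob with:
awk -f totals.awk hi*.txt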

I don't know what is wrong with it, but one little point I noticed:
Change for FILE in $@ into for FILE in "$@", because that way you are safe if filenames have embedded spaces. It will then expand into "$1" "$2" ... instead of $1 $2 ... (and note that everywhere you use $FILE, remember to quote it as "$FILE" too).
And as others have said, you don't need to initialize FILE before you enter the loop. The for loop automatically sets it to each of the expanded positional parameters in turn.
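For reference, a minimal sketch of the corrected shell loop (quoting applied throughout; the per-line processing is elided with a no-op here):
for FILE in "$@"        # quoted: safe with spaces in filenames
do
    if [ ! -r "$FILE" ]; then
        echo "File: $FILE not readable"
        exit 1
    fi
    while read line
    do
        :   # process "$line" as in the original script
    done < "$FILE"
done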
However, I would go with an awk script like this:
awk -F: '
/^Total/ {
    total += $2
    # count++ not needed. see below
    print FILENAME "(s) have a total of: " $2
}
END {
    print "Total is: " total
    print "Number of files read is: " (ARGC-1)
}' foo*.txt
Note that when a file contains multiple "^Total" lines, relying on a count variable would claim you read more files than you actually did; that is why this script uses ARGC-1 instead.

On error, exit with a non-zero status. Also on error, report errors to standard error, not standard output - though that may be a bit advanced for you as yet.
echo "$0: file $FILE not readable" 1>&2
The 1 is theoretically unnecessary (though I remember problems with a shell implementation on Windows if it was omitted). Echoing the script name '$0' at the start of the error message is a good idea too - it makes error tracking easier later when your script is used in other contexts.
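Putting both points together, the readability check might look like this (any non-zero status will do; 1 is merely conventional):
if [ ! -r "$FILE" ]; then
    echo "$0: file $FILE not readable" 1>&2
    exit 1
fi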
I believe this Perl one-liner does the job you are after.
perl -na -F: -e '$sum += $F[1] if m/^Total:/; END { print $sum; }' "$@"
I understand that you are learning shell programming, but one of the important things with shell programming is knowing which programs to use.

How about this solution:
for FILE in `/bin/ls $@`
do
. . .
This will effectively eliminate duplicates because /bin/ls hi1.txt hi1.txt hi1.txt should only show hi1.txt once.
Though I'm not sure why it's re-reading files. The wildcard expansion should only include each file once. Do you have some files matched by hi*.txt that are links to files matched by hi[123].txt?
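If you suspect links, comparing inode numbers is one way to check (files that share an inode number are hard links to the same file, and symlinks show up with -> in the long listing):
ls -li hi*.txt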

Related

Linux script reading an ini file and splitting into variables by a specified character

I'm stuck on the following task: let's pretend we have an .ini file in a folder. The file contains lines like this:
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I had to choose the biggest subnet mask. So my attempt was:
declare -A data
for f in datadir/name
do
    while read line
    do
        r=(${line//=/ })
        let data[${r[0]}]=${r[1]}
    done < $f
done
This is how far I got. (Yeah, I know the file named name is not an .ini file but a .txt, since I had problems even creating an .ini file; the teacher didn't even give us a file like that for our exam.)
It splits the line at the =, but it won't read the IP number because of the (first) . character
("invalid arithmetic operator" is the error message I got).
If someone could help me and explain how I can write a script for tasks like this, I would be really thankful!
Both previously presented solutions operate (and do what they're designed to do); I thought I'd add something left-field as the specifications are fairly loose.
$ cat freasy
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I'd argue that the biggest subnet mask is the one with the lowest numerical value (holds the most hosts).
$ sort -t/ -k2,2nr freasy | tail -n1
eno1=10.0.0.254/24
Don't use let. It's for arithmetic.
$ help let
let: let arg [arg ...]
Evaluate arithmetic expressions.
Evaluate each ARG as an arithmetic expression.
Just use straight assignment:
declare -A data
for f in datadir/name
do
    while read line
    do
        r=(${line//=/ })
        data[${r[0]}]=${r[1]}
    done < $f
done
Result:
$ declare -p data
declare -A data=([tun0]="10.10.10.1/32" [eno1]="10.0.0.254/24" [eno2]="172.16.4.129/25" [eno3]="192.168.2.1/25" )
awk provides a simple solution to find the max value following the '/', and it will be orders of magnitude faster than a bash script or a Unix pipeline:
awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
Example Use/Output
$ awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
32
The awk command above separates the fields using either '=' or '/' as the field separator, keeps the max of the 3rd field ($3), and outputs that value in the END {...} rule.
Bash Solution
If you did want a bash script solution, you can isolate the wanted parts of each line using [[ .. =~ .. ]] to populate the BASH_REMATCH array, and then compare ${BASH_REMATCH[3]} against a max variable. The [[ .. ]] expression with =~ treats everything on the right side as an Extended Regular Expression and stores each grouping ((...)) as an element of the array BASH_REMATCH, e.g.
#!/bin/bash

[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }

declare -i max=0
while read -r line; do
    [[ $line =~ ^(.*)=(.*)/(.*)$ ]]
    (( ${BASH_REMATCH[3]} > max )) && max=${BASH_REMATCH[3]}
done < "$1"
printf "max: %s\n" "$max"
Using Only POSIX Parameter Expansions
Using parameter expansion with substring removal supported by POSIX shell (Bourne shell, dash, etc..), you could do:
#!/bin/sh

[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }

max=0
while read line; do
    [ "${line##*/}" -gt "$max" ] && max="${line##*/}"
done < "$1"
printf "max: %s\n" "$max"
Example Use/Output
After making yourscript.sh executable with chmod +x yourscript.sh, you would do:
$ ./yourscript.sh file
max: 32
(same output for both shell script solutions)
Let me know if you have further questions.

Shell script that filters command output and saves it as a JSON formatted list

I've never worked with shell scripts before, but I need to for my current task.
So i have to run a command that returns output like this:
awd54a7w6ds54awd47awd refs/heads/SomeInfo1
awdafawe23413f13a3r3r refs/heads/SomeInfo2
a8wd5a8w5da78d6asawd7 refs/heads/SomeInfo3
g9reh9wrg69egs7ef987e refs/heads/SomeInfo4
And I need to loop over every line of output, get only the "SomeInfo" part, and write it to a file in a format like this:
["SomeInfo1","SomeInfo2","SomeInfo3"]
I've tried things like this:
for i in $(some command); do
    echo $i | cut -f2 -d"heads/" >> text.txt
done
But I don't know how to format it into an array without using a temporary file.
Sorry if the question is dumb and probably too easy; I'm sure I could figure it out on my own, but I just don't have the time, because it's an extra convenience feature that I personally want to implement.
Try this
# json_encoder.sh
arr=()
while read line; do
    arr+=(\"$(basename "$line")\")
done
printf "[%s]" $(IFS=,; echo "${arr[*]}")
And then invoke
./your_command | json_encoder.sh
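Given the sample command output above, that should print:
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]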
PS. I personally do this kind of data massaging with Vim.
Using Perl one-liner
$ cat petar.txt
awd54a7w6ds54awd47awd refs/heads/SomeInfo1
awdafawe23413f13a3r3r refs/heads/SomeInfo2
a8wd5a8w5da78d6asawd7 refs/heads/SomeInfo3
g9reh9wrg69egs7ef987e refs/heads/SomeInfo4
$ perl -ne ' { /.*\/(.*)/ and push(@res,"\"$1\"") } END { print "[".join(",",@res)."]\n" }' petar.txt
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]
While you should rarely use a shell script to format JSON, in your case you are simply parsing output into a comma-separated list with [...] end-caps. You can use bash parameter expansion to avoid spawning any additional subshells, obtaining the last field of each line as follows:
#!/bin/bash

[ -z "$1" -o ! -r "$1" ] && {           ## validate file given as argument
    printf "error: file doesn't exist or not readable.\n" >&2
    exit 1
}

c=0                                     ## simple flag variable
while read -r line; do                  ## read each line
    if [ "$c" -eq '0' ]; then           ## is flag 0?
        printf "[\"%s\"" "${line##*/}"  ## output ["last"
    else                                ## otherwise
        printf ",\"%s\"" "${line##*/}"  ## output ,"last"
    fi
    c=1                                 ## set flag 1
done < "$1"                             ## redirect file argument to loop
echo "]"                                ## append closing ]
Example Use/Output
Using your given data as the input file, you would get the following:
$ bash script.sh file
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]
Look things over and let me know if you have any questions.
You can also use awk without any loops, I guess:
cat prev_output | awk -v ORS=',' -F'/' '{print "\042"$3"\042"}' | \
sed 's/^/[/g ; s/,$/]\n/g' > new_output
cat new_output
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]

Bash scripting: reading numbers from a file

Hello, I need to make a bash script that will read from a file and then add the numbers in the file. For example, the file I'm reading would read as:
cat samplefile.txt
1
2
3
4
The script will use the file name as an argument, then add those numbers and print out the sum. I'm stuck on how I would go about reading the integers from the file and storing them in a variable.
So far what I have is the following:
#! /bin/bash
file="$1"   #first arg is used for file
sum=0       #declaring sum
readnums    #declaring var to store read ints
if [! -e $file] ; do  #checking if file exists
    echo "$file does not exist"
    exit 0
fi
while read line ; do
do < $file
exit
What's the problem? Your code looks fine, except readnums is not a valid command name, you need spaces inside the square brackets in the if condition, an if is followed by then rather than do, and a while loop is closed with done. (Oh, and "$file" should properly be inside double quotes.)
#!/bin/bash
file=$1
sum=0
if ! [ -e "$file" ] ; then                 # spaces inside square brackets; then, not do
    echo "$0: $file does not exist" >&2    # error message includes $0 and goes to stderr
    exit 1                                 # exit code is non-zero for error
fi
while read line ; do
    sum=$((sum + line))
done < "$file"                             # done closes the loop
printf 'Sum is %d\n' "$sum"
# exit                                     # not useful; script will exit anyway
However, the shell is not traditionally a very good tool for arithmetic. Maybe try something like
awk '{ sum += $1 } END { print "Sum is", sum }' "$file"
perhaps inside a snippet of shell script that checks the file exists, etc. (though in that case you'll get a reasonably useful error message from Awk anyway).
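For the samplefile.txt shown above, that gives:
$ awk '{ sum += $1 } END { print "Sum is", sum }' samplefile.txt
Sum is 10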

Finding max lines in a file while printing file name and lines separately?

So I keep messing this up, and I think where I was going wrong is that the code I'm writing needs to return only the file name and the number of lines from an argument.
So using wc, I need something that accepts either 0 or 1 arguments and prints something like "The file findlines.sh has 4 lines", or, if they give ./findlines.sh Desktop/testfile, "the file testfile has 5 lines".
I have a few attempts and all of them have failed. I can't seem to figure out how to approach it at all.
Should I echo "The file", then toss in the argument name, and then add another echo for "has [lines] lines"?
Sample input would be from terminal something like
>findlines.sh
Output:the file findlines.sh has 18 lines
Or maybe
>findlines.sh /home/directory/user/grocerylist
Output: the file grocerylist has 16 lines
#! /bin/sh -
file=${1-findlines.sh}
lines=$(wc -l < "$file") &&
printf 'The file "%s" has %d lines\n' "$file" "$lines"
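For example, with the grocerylist file from the question (assuming it has 16 lines), this prints the path exactly as given; add file=${file##*/} before the printf if you want only the base name:
$ ./findlines.sh /home/directory/user/grocerylist
The file "/home/directory/user/grocerylist" has 16 lines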
This should work:
#!/bin/bash
file="findlines.sh"
if [ $# -ge 1 ]
then
    file=$1
fi

if [ -f "$file" ]
then
    lines=`wc -l "$file" | awk '{print $1}'`
    echo "The file $file has $lines lines"
else
    echo "File not found"
fi
See sch's answer for a shorter example that doesn't use awk.

Awk: loop & save different lines to different files?

I'm looping over a series of large files with a shell script:
i=0
while read line
do
    # get first char of line
    first=`echo "$line" | head -c 1`

    # make output filename
    name="$first"
    if [ "$first" = "," ]; then
        name='comma'
    fi
    if [ "$first" = "." ]; then
        name='period'
    fi

    # save line to new file
    echo "$line" >> "$2/$name.txt"

    # show live counter and inc
    echo -en "\rLines:\t$i"
    ((i++))
done <$file
The first character of each line will either be alphanumeric or one of the characters defined above (which is why I rename those for use in the output file name).
It's way too slow: 5,000 lines takes 128 seconds.
At this rate I've got a solid month of processing.
Will awk be faster here?
If so, how do I fit the logic into awk?
This can certainly be done more efficiently in bash.
To give you an example: echo foo | head does a fork() call, creates a subshell, sets up a pipeline, starts the external head program... and there's no reason for it at all.
If you want the first character of a line, without any inefficient mucking with subprocesses, it's as simple as this:
c=${line:0:1}
I would also seriously consider sorting your input, so that you need to re-open the output file only when a new first character is seen, rather than every time through the loop.
That is -- preprocess with sort (as by replacing <$file with < <(sort "$file")) and do the following each time through the loop, reopening the output file only conditionally:
if [[ $name != "$current_name" ]] ; then
    current_name="$name"
    exec 4>>"$2/$name"  # open the output file on FD 4
fi
...and then append to the open file descriptor:
printf '%s\n' "$line" >&4
(not using echo because it can behave undesirably if your line is, say, -e or -n).
Alternately, if the number of possible output files is small, you can just open them all on different FDs up-front (substituting other, higher numbers where I chose 4), and conditionally output to one of those pre-opened files. Opening and closing files is expensive -- each close() forces a flush to disk -- so this should be a substantial help.
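For example, a minimal sketch of that pre-opened approach (FDs 5 and 6 are arbitrary picks, only two of the possible first characters are shown, and $2 and $file are used as in the question's script):
exec 5>>"$2/comma.txt" 6>>"$2/period.txt"   # open each output file once, before the loop
while read -r line; do
    case ${line:0:1} in
        ,) printf '%s\n' "$line" >&5 ;;
        .) printf '%s\n' "$line" >&6 ;;
    esac
done < "$file"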
A few things to speed it up:
- Don't use echo/head to get the first character. You're spawning at least two additional processes per line. Instead, use bash's parameter expansion facilities to get the first character.
- Use if-elif to avoid checking $first against all the possibilities each time. Even better, if you are using bash 4.0 or later, use an associative array to store the output file names, rather than checking against $first in a big if-statement for each line.
If you don't have a version of bash that supports associative arrays, replace your if statements with the following:
if [[ "$first" = "," ]]; then
name='comma'
elif [[ "$first" = "." ]]; then
name='period'
else
name="$first"
fi
But the following is suggested. Note that $REPLY is the default variable used by read when no name is given (just FYI).
declare -A output
output[","]=comma
output["."]=period
output["?"]=question_mark
output["!"]=exclamation_mark
output["-"]=hyphen
output["'"]=apostrophe

i=0
while read
do
    # get first char of line
    first=${REPLY:0:1}
    # make output filename
    name=${output[$first]:-$first}
    # save line to new file
    echo "$REPLY" >> "$name.txt"
    # show live counter and inc
    echo -en "\r$i"
    ((i++))
done < "$file"
#!/usr/bin/awk -f
BEGIN {
    punctlist = ", . ? ! - '"
    pnamelist = "comma period question_mark exclamation_mark hyphen apostrophe"
    pcount = split(punctlist, puncts)
    ncount = split(pnamelist, pnames)
    if (pcount != ncount) {
        print "error: counts don't match, pcount:", pcount, "ncount:", ncount
        exit
    }
    for (i = 1; i <= pcount; i++) {
        punct_lookup[puncts[i]] = pnames[i]
    }
}
{
    # fall back to the character itself for first characters not in the lookup
    first = substr($0, 1, 1)
    name = (first in punct_lookup) ? punct_lookup[first] : first
    print > (name ".txt")
    printf "\r%6d", NR
}
END {
    printf "\n"
}
The BEGIN block builds an associative array so you can do punct_lookup[","] and get "comma".
The main block simply does the lookups for the filenames and outputs the line to the file. In AWK, > truncates the file the first time and appends subsequently. If you have existing files that you don't want truncated, then change it to >> (but don't use >> otherwise).
Yet another take:
declare -i i=0
declare -A names

while read line; do
    first=${line:0:1}
    if [[ -z ${names[$first]} ]]; then
        case $first in
            ,) names[$first]="$2/comma.txt" ;;
            .) names[$first]="$2/period.txt" ;;
            *) names[$first]="$2/$first.txt" ;;
        esac
    fi
    printf "%s\n" "$line" >> "${names[$first]}"
    printf "\rLine $((++i))"
done < "$file"
and
awk -v dir="$2" '
{
    first = substr($0, 1, 1)
    if (! (first in names)) {
        if (first == ",") names[first] = dir "/comma.txt"
        else if (first == ".") names[first] = dir "/period.txt"
        else names[first] = dir "/" first ".txt"
    }
    print > names[first]
    printf("\rLine %d", NR)
}
' "$file"
