Shell script to count the number of lines in a file - Linux

Write a shell script to count the number of lines, characters, and words in a file (without the use of external commands). Also delete every occurrence of the word “Linux” from the file wherever it appears, and save the result in a new file.

This is the nearest I could get without using any third party packages...
#!/bin/bash
filename="$1"   # file to process, passed as the first argument
count=0
while read -r line
do
    count=$((count + 1))
done < "$filename"
echo "Number of lines: $count"

Sachin Bharadwaj gave a script that counts the lines.
Now, to count the words, we can use set to split the line into $# positional parameters.
And to count the characters, we can use the parameter length: ${#line}.
Finally, to delete every “Linux”, we can use pattern substitution: ${line//Linux}.
(Cf. Shell Parameter Expansion.)
All taken together:
#!/bin/bash
filename="$1"
count=0 wordcount=0 charcount=0
while read -r line
do
    ((++count))
    set -- $line                  # unquoted on purpose: word splitting yields $# words
    ((wordcount += $#))
    ((charcount += ${#line} + 1)) # +1 for the '\n' stripped by read
    echo "${line//Linux}"         # pattern substitution deletes every "Linux"
done < "$filename" > anewfile
echo "Number of lines: $count"
echo "Number of words: $wordcount"
echo "Number of chars: $charcount"

Related

How do I use for to loop over potentially-empty lines of output from egrep?

I'm trying to print the blank lines in a text file, and I also want a count of how many blank lines the egrep returned. I'm using:
for x in $(egrep '$^' txtfile); do echo '$x'; done
but this doesn't echo or return anything. Is there any way to find out how many blank lines the egrep command returned?
for is the wrong tool for this job; the right one (if you don't want to use grep -c but really do want to read each line of output) is a while read loop, as discussed in BashFAQ #1:
#!/usr/bin/env bash
# ^^^^- important: bash, not sh
count=0
while IFS= read -r x; do
    echo "Read a line: <$x>" >&2
    (( ++count ))
done < <(egrep '^$' txtfile)
echo "Read $count lines" >&2

Using lines from a file while looping over an unrelated numeric range in bash

I got a task to display all bash font styles/colors/background colors in a table, where the text is different for each variation and is taken from a file that contains 448 words/random numbers (I'm working with numbers).
Here is my code for displaying all variations
for i in {0..8}; do
    for j in {30..37}; do
        for n in {40..47}; do
            echo -ne "\e[$i;$j;$n""mcolors"
        done
        echo
    done
done
echo ""
Output: (screenshot of the color table omitted)
Code for generating random numbers:
#!/bin/bash
for ((i=0; i<$1; i++))
do
    echo $RANDOM >> randomnumbers.sh
done
So the question is: how can I pass numbers from randomnumbers.sh to my script, so that the "colors" text in the output is replaced by the next number taken, in order, from randomnumbers.sh? Thanks!
One simple approach is to have an open file descriptor with your random numbers, and read a line from that file whenever such a value is required.
Here, we're using FD 3, so that other parts of your script can still read from original stdin:
#!/usr/bin/env bash
# Make file descriptor 3 point to our file of random numbers (don't use .sh for data files)
exec 3< randomnumbers || exit
for i in {0..8}; do
    for j in {30..37}; do
        for n in {40..47}; do
            read -r randomNumber <&3  # read one number from FD 3
            printf '\e[%s;%s;%sm%05d' "$i" "$j" "$n" "$randomNumber"  # & use it in the format string
        done
        printf '\n'
    done
done
printf '\n'
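To feed this, the asker's generator just needs to write to the file name the answer expects; a sketch based on the generator above (576 is simply 9×8×8, the number of reads the color loops perform):
#!/bin/bash
# Write the data file without the .sh suffix, as the answer suggests.
# The color loops above perform 9*8*8 = 576 reads, so generate at least that many.
for ((i = 0; i < ${1:-576}; i++))
do
    echo $RANDOM
done > randomnumbers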

Splitting out timestamp/key/value pairs from bash

Hi, I have this file full of data; the timestamps are basically at the beginning of each record. I need to break the file down and print each record on its own line. How can I accomplish this using only bash and (if needed) standard UNIX tools (sed, awk, etc.)?
The timestamp field goes from 08:30:00:324810: onward, e.g. 17:30:00:324810:. The number of fields following the timestamp varies, so there could be 1 to x fields. So I need to find the timestamp format and then insert a line break before it.
08:30:00:324810: usg_07Y BidYield=1.99788141 Bid=99.20312500 08:30:00:325271: usg_07Y
AskYield=1.98578274 Ask=99.28125000 08:30:00:325535: usg_10Y Ask=0.00000000 08:30:01:324881:
usg_07Y BidYield=2.02938740 AskYield=1.97127853 Bid=99.00000000 Ask=99.37500000 08:30:01:377021:
usg_05Y Bid=0.00000000 Ask=0.00000000
Thanks in advance,
Matt
It is fairly trivial. Read the file into an array, find the timestamp, output a newline before it:
#!/bin/bash
set -f                  # inhibit globbing (filename expansion)
declare -i cnt=0        # simple counter
a=( $(<"$1") )          # read file into array, word by word
for i in "${a[@]}"; do  # for each word in file
    if [ "$cnt" -gt 0 ]; then  # test counter > 0
        # if last char is ':', then output newline before word
        [ "${i:(-1):1}" = ':' ] && printf "\n%s" "$i" || printf " %s" "$i"
    else
        printf "%s" "$i"       # if first word, just print
    fi
    ((cnt++))
done
printf "\n"
Use/output:
$ bash parsedtstamp.sh filename.txt
08:30:00:324810: usg_07Y BidYield=1.99788141 Bid=99.20312500
08:30:00:325271: usg_07Y AskYield=1.98578274 Ask=99.28125000
08:30:00:325535: usg_10Y Ask=0.00000000
08:30:01:324881: usg_07Y BidYield=2.02938740 AskYield=1.97127853 Bid=99.00000000 Ask=99.37500000
08:30:01:377021: usg_05Y Bid=0.00000000 Ask=0.00000000
I added a counter var to only output the newline if not the first word.
Alternate version that avoids temporary array storage (for large files)
While there is no limit on array size in Bash, if you find yourself parsing million-line files, it is probably better to avoid storing all lines in memory. This can be accomplished by simply processing the lines as they are read from the file. It is just a way of doing the same thing without using an array for intermediate storage:
#!/bin/bash
set -f            # inhibit globbing (filename expansion)
declare -i cnt=0  # simple counter
# read each line in file
while read -r line_entries || [ -n "$line_entries" ]; do
    for i in $line_entries; do  # for each word in line (no quotes, so word splitting happens)
        if [ "$cnt" -gt 0 ]; then  # test counter > 0
            # if last char is ':', then output newline before word
            if [ "${i:(-1):1}" = ':' ]; then
                printf "\n%s" "$i"
            else
                printf " %s" "$i"
            fi
        else
            printf "%s" "$i"       # if first word, just print
        fi
        ((cnt++))                  # increment counter
    done
done <"$1"
printf "\n"
An awk way
awk -v ORS="" '
    { for (i = 1; i <= NF; i++) if ($i ~ /:$/ && x++) $i = "\n" $i }
    $NF = $NF " "
    END { print "\n" }
' file
Sets the output record separator to nothing.
Loops through the fields.
If a field's last character is :, it adds a newline before that field (except before the very first timestamp, thanks to x++).
Adds a space to the last field so there is always a space between one record's last value and the next timestamp.
Prints a newline at the end.

Bash reading txt file and storing in array

I'm writing my first Bash script. I have some experience with C and C#, so I think the logic of the program is correct; it's just that the syntax is so complicated, because apparently there are many different ways to write the same thing!
Here is the script. It simply checks whether the argument (a string) is contained in a certain file. If so, it stores each line of the file in an array and writes one item of the array to a file. I'm sure there must be easier ways to achieve this, but I want to get some practice with bash loops.
#!/bin/bash
NOME=$1
c=0
#IF NAME IS FOUND IN THE PHONEBOOK THEN STORE EACH LINE OF THE FILE INTO ARRAY
#ONCE THE ARRAY IS DONE GET THE INDEX OF MATCHING NAME AND RETURN ARRAY[INDEX+1]
if grep "$NOME" /root/phonebook.txt ; then
echo "CREATING ARRAY"
while read line
do
myArray[$c]=$line # store line
c=$(expr $c + 1) # increase counter by 1
done < /root/phonebook.txt
else
echo "Name not found"
fi
c=0
for i in myArray;
do
if myArray[$i]="$NOME" ; then
echo ${myArray[i+1]} >> /root/numbertocall.txt
fi
done
This code returns only the second item of myArray (myArray[2]), i.e. the second line of the file. Why?
The first part (where you build the array) looks ok, but the second part has a couple of serious errors:
for i in myArray; -- this executes the loop once, with $i set to "myArray". In this case, you want $i to iterate over the indexes of myArray, so you need to use
for i in "${!myArray[@]}"
or
for ((i=0; i<${#myArray[@]}; i++))
(although I generally prefer the first, since it'll work with noncontiguous and associative arrays).
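A tiny illustration of that point (the array values are invented for the demo):
myArray=([0]="alice" [5]="bob")   # a sparse array: only indexes 0 and 5 exist
for i in "${!myArray[@]}"; do
    echo "$i -> ${myArray[i]}"    # prints 0 -> alice, then 5 -> bob
done
# a 0..length-1 counting loop would stop after ${#myArray[@]} = 2 iterations
# and miss index 5 entirely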
Also, you don't need the ; unless do is on the same line (in shell, ; is mostly equivalent to a line break so having a semicolon at the end of a line is redundant).
if myArray[$i]="$NOME" ; then -- the if statement takes a command, and will therefore treat myArray[$i]="$NOME" as an assignment command, which is not at all what you wanted. In order to compare strings, you could use the test command or its synonym [
if [ "${myArray[i]}" = "$NOME" ]; then
or a bash conditional expression
if [[ "${myArray[i]}" = "$NOME" ]]; then
The two are very similar, but the conditional expression has much cleaner syntax (e.g. in a test command, > redirects output, while \> is a string comparison; in [[ ]] a plain > is a comparison).
In either case, you need to use an appropriate $ expression for myArray, or it'll be interpreted as a literal. On the other hand, you don't need a $ before the i in "${myArray[i]}" because it's in a numeric expression context and therefore will be expanded automatically.
Finally, note that the spaces between elements are absolutely required -- in shell, spaces are very important delimiters, not just there for readability like they usually are in C.
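A quick demonstration of why those spaces matter (demo values made up):
a=foo b=bar
[ "$a"="$b" ] && echo "always true"          # one argument, the string "foo=bar": non-empty, so true
[ "$a" = "$b" ] || echo "correctly unequal"  # three arguments: a real string comparison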
1. This is what you wrote, with small adjustments:
#!/bin/bash
NOME=$1
#IF NAME IS FOUND IN THE PHONE-BOOK, **THEN** READ THE PHONE BOOK LINES INTO AN ARRAY VARIABLE
#ONCE THE ARRAY IS COMPLETE, GET THE INDEX OF THE MATCHING LINE AND RETURN ARRAY[INDEX+1]
c=0
if grep "$NOME" /root/phonebook.txt ; then
    echo "CREATING ARRAY...."
    while IFS= read -r line   # IFS= preserves leading and trailing spaces
    do
        myArray[c]=$line      # put line in the array
        c=$((c+1))            # increase counter by 1
    done < /root/phonebook.txt
    for i in "${!myArray[@]}"; do
        if [ "${myArray[i]}" = "$NOME" ]; then
            echo "${myArray[i+1]}" >> /root/numbertocall.txt
        fi
    done
else
    echo "Name not found"
fi
2. But you can also read the array and stop looping, like this:
#!/bin/bash
NOME=$1
c=0
if grep "$NOME" /root/phonebook.txt ; then
    echo "CREATING ARRAY...."
    readarray -t myArray < /root/phonebook.txt   # -t strips the trailing newlines
    for i in "${!myArray[@]}"; do
        if [ "${myArray[i]}" = "$NOME" ]; then
            echo "${myArray[i+1]}" >> /root/numbertocall.txt
            break                                # stop looping
        fi
    done
else
    echo "Name not found"
fi
exit 0
3. The following improves things. Supposing that a) $NOME matches the whole line that contains it, and b) there is always one line after a matching $NOME, this will work; if not (if $NOME can be the last line in the phone book), then you need to make small adjustments.
#!/bin/bash
PHONEBOOK="/root/phonebook.txt"
NUMBERTOCALL="/root/numbertocall.txt"
NOME="$1"
myline=""
myline=$(grep -A1 "$NOME" "$PHONEBOOK" | sed '1d')
if [ -z "$myline" ]; then
echo "Name not found :-("
else
echo -n "$NOME FOUND.... "
echo "$myline" >> "$NUMBERTOCALL"
echo " .... AND SAVED! :-)"
fi
exit 0
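For all three versions, the phone book is assumed to alternate name lines and number lines; a hypothetical layout and run (the script name and entries are made up):
# /root/phonebook.txt (hypothetical contents):
#   John
#   555-1234
#   Mary
#   555-5678
$ ./phonebook.sh John
John FOUND....  .... AND SAVED! :-)
$ cat /root/numbertocall.txt
555-1234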

Awk: loop & save different lines to different files?

I'm looping over a series of large files with a shell script:
i=0
while read line
do
    # get first char of line
    first=`echo "$line" | head -c 1`
    # make output filename
    name="$first"
    if [ "$first" = "," ]; then
        name='comma'
    fi
    if [ "$first" = "." ]; then
        name='period'
    fi
    # save line to new file
    echo "$line" >> "$2/$name.txt"
    # show live counter and inc
    echo -en "\rLines:\t$i"
    ((i++))
done <$file
The first character in each line will either be alphanumeric, or one of the above defined characters (which is why I'm renaming them for use in the output file name).
It's way too slow.
5,000 lines take 128 seconds.
At this rate I've got a solid month of processing.
Will awk be faster here?
If so, how do I fit the logic into awk?
This can certainly be done more efficiently in bash.
To give you an example: echo foo | head does a fork() call, creates a subshell, sets up a pipeline, starts the external head program... and there's no reason for it at all.
If you want the first character of a line, without any inefficient mucking with subprocesses, it's as simple as this:
c=${line:0:1}
I would also seriously consider sorting your input, so you only need to re-open the output file when a new first character is seen, rather than re-opening it every time through the loop.
That is -- preprocess with sort (as by replacing <$file with < <(sort "$file")) and do the following each time through the loop, reopening the output file only conditionally:
if [[ $name != "$current_name" ]]; then
    current_name="$name"
    exec 4>>"$2/$name"   # open the output file on FD 4
fi
...and then append to the open file descriptor:
printf '%s\n' "$line" >&4
(not using echo because it can behave undesirably if your line is, say, -e or -n).
Alternately, if the number of possible output files is small, you can just open them all on different FDs up-front (substituting other, higher numbers where I chose 4), and conditionally output to one of those pre-opened files. Opening and closing files is expensive -- each close() forces a flush to disk -- so this should be a substantial help.
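A minimal sketch of that pre-opened-FD idea, assuming only , and . need renaming and that $2 is the output directory as in the original script (FD numbers 4 and 5 are arbitrary choices):
exec 4>>"$2/comma.txt" 5>>"$2/period.txt"   # open both special files once, up-front
while IFS= read -r line; do
    case ${line:0:1} in
        ,) printf '%s\n' "$line" >&4 ;;
        .) printf '%s\n' "$line" >&5 ;;
        *) printf '%s\n' "$line" >>"$2/${line:0:1}.txt" ;;   # alphanumerics: open per write
    esac
done <"$file"
exec 4>&- 5>&-                              # close the descriptors when done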
A few things to speed it up:
Don't use echo/head to get the first character. You're spawning at least two additional processes per line. Instead, use bash's parameter expansion facilities to get the first character.
Use if-elif to avoid checking $first against all the possibilities each time. Even better, if you are using bash 4.0 or later, use an associative array to store the output file names, rather than checking against $first in a big if-statement for each line.
If you don't have a version of bash that supports associative arrays, replace your if statements with the following:
if [[ "$first" = "," ]]; then
name='comma'
elif [[ "$first" = "." ]]; then
name='period'
else
name="$first"
fi
If you do have associative arrays, the following is suggested. Note the use of $REPLY, the default variable used by read when no name is given (just FYI).
declare -A output   # associative array mapping first char -> file name
output[","]=comma
output["."]=period
output["?"]=question_mark
output["!"]=exclamation_mark
output["-"]=hyphen
output["'"]=apostrophe
i=0
while read -r
do
    # get first char of line
    first=${REPLY:0:1}
    # make output filename, falling back to the character itself
    name=${output[$first]:-$first}
    # save line to new file
    echo "$REPLY" >> "$name.txt"
    # show live counter and inc
    echo -en "\r$i"
    ((i++))
done < "$file"
#!/usr/bin/awk -f
BEGIN {
    punctlist = ", . ? ! - '"
    pnamelist = "comma period question_mark exclamation_mark hyphen apostrophe"
    pcount = split(punctlist, puncts)
    ncount = split(pnamelist, pnames)
    if (pcount != ncount) {
        print "error: counts don't match, pcount:", pcount, "ncount:", ncount
        exit
    }
    for (i = 1; i <= pcount; i++) {
        punct_lookup[puncts[i]] = pnames[i]
    }
}
{
    first = substr($0, 1, 1)
    # fall back to the character itself for alphanumeric first chars
    name = (first in punct_lookup) ? punct_lookup[first] : first
    print > (name ".txt")
    printf "\r%6d", i++
}
END {
    printf "\n"
}
The BEGIN block builds an associative array so you can do punct_lookup[","] and get "comma".
The main block simply does the lookups for the filenames and outputs the line to the file. In AWK, > truncates the file the first time and appends subsequently. If you have existing files that you don't want truncated, then change it to >> (but don't use >> otherwise).
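A tiny demo of that > behavior (file names invented):
printf 'a 1\na 2\n' | awk '{ print $2 > ($1 ".txt") }'
# a.txt ends up containing both lines ("1" and "2"): the file is truncated
# only when first opened during the run, and appended to afterwards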
Yet another take:
declare -i i=0
declare -A names
while IFS= read -r line; do
    first=${line:0:1}
    if [[ -z ${names[$first]} ]]; then
        case $first in
            ,) names[$first]="$2/comma.txt" ;;
            .) names[$first]="$2/period.txt" ;;
            *) names[$first]="$2/$first.txt" ;;
        esac
    fi
    printf "%s\n" "$line" >> "${names[$first]}"
    printf "\rLine %d" "$((++i))"
done < "$file"
and
awk -v dir="$2" '
{
    first = substr($0, 1, 1)
    if (! (first in names)) {
        if (first == ",")      names[first] = dir "/comma.txt"
        else if (first == ".") names[first] = dir "/period.txt"
        else                   names[first] = dir "/" first ".txt"
    }
    print > names[first]
    printf("\rLine %d", NR)
}
' "$file"
