BASH SCRIPT: Combining Permutations of Contents in a Directory - linux

I'm currently trying to combine the contents of all .rule files in the rules directory.
For example:
./rules
./numbersFirst.rule
./numbersLast.rule
./lettersFirst.rule
etc.
Each of these files has about 1,000 rules. I need to write a bash script that can output all permutations of each of these rules.
For all the singles, it would just be:
cat rules/*.rule >> ruleSet
Is there any way to do this programmatically and cleverly? For example:
for rule1 in rules/*.rule
do
for rule2 in rules/*.rule
do
if [ "$rule1" != "$rule2" ]
then
#read both files and output "$line_rule1 $line_rule2"
#Magic here?
fi
done
done
What about permutations of 3, 4, ... n files, each with 1,000 lines? Ideally this would work programmatically for n files, so that I can simply add files to the directory and rebuild from this script. Obviously it will be a LOT of combinations!

You can compute the Cartesian product with GNU parallel, if it is available:
#!/bin/bash
YOUR_DIR="./rules"
ARGS="::: "
NUM=0
for file in $YOUR_DIR/*.rule; do
ARGS="$ARGS $(cat $file | tr "\n" " ") ::: "
NUM=$((NUM+1))
INDEX="$INDEX {$NUM}"
done
if [ ! -z "$ARGS" ]; then
parallel --no-notice -P1 echo $INDEX $ARGS
fi
Or using only recursion with an array:
#!/bin/bash
dim=()
YOUR_DIR="./rules"
NUM=0
for file in $YOUR_DIR/*.rule; do
ARGS="$(cat $file | tr "\n" " ")"
dim[$NUM]="$ARGS"
NUM=$((NUM+1))
done
for i in "${!dim[@]}"
do
echo "key : $i"
echo "value: ${dim[$i]}"
done
function iterate {
local index="$2"
if [ "${index}" == "${#dim[@]}" ]; then
for (( i=0; i<${index}; i++ ))
do
echo -n "${items[$i]} "
done
echo ""
else
for element in ${dim[${index}]}; do
items["${index}"]="${element}"
local it=$((index+1))
iterate items[@] "$it"
done
fi
}
declare -a items=("")
iterate "" 0
You can find a generalization here
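The recursive idea can also be sketched without a global array by recursing over the file names themselves. This is only a sketch under assumptions: cross is a hypothetical helper name, each rule is taken to be one whole line, and the files are combined in the order given (a Cartesian product, like the two scripts above, not every ordering of the files).

```bash
# cross PREFIX FILE...
# Prints one line for every way of picking one line from each FILE.
cross() {
    local prefix file line
    prefix=$1; shift
    if [ "$#" -eq 0 ]; then
        # No files left: the prefix is a complete combination.
        printf '%s\n' "$prefix"
        return
    fi
    file=$1; shift
    while IFS= read -r line; do
        # Append this line to the prefix and recurse into the remaining files.
        cross "${prefix:+$prefix }$line" "$@"
    done < "$file"
}

# Usage (assumed layout from the question):
#   cross "" rules/*.rule > ruleSet
```

The output grows as the product of the line counts, so three files of 1,000 rules each already produce a billion lines.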

Related

Bash script: max,min,sum - many sources as parameter

Is it possible to write a script that reads a file containing numbers (one per line) and writes their maximum, minimum and sum? If the file is empty, it should print an appropriate message. The name of the file is to be given as the parameter of the script. I managed to create the script below, but there are 2 errors:
./4.3: line 20: syntax error near unexpected token `done'
./4.3: line 20: `done echo "Max: $max" '
Is it possible to add multiple files as parameter?
lines=`cat "$1" | wc -l`
if [ $lines -eq 0 ];
then echo "File $1 is empty!"
exit fi min=`cat "$1" | head -n 1`
max=$min sum=0
while [ $lines -gt 0 ];
do num=`cat "$1" |
tail -n $lines`
if [ $num -gt $max ];
then max=$num
elif [ $num -lt $min ];
then min=$num fiS
sum=$[ $sum + $num] lines=$[ $lines - 1 ]
done echo "Max: $max"
echo "Min: number $min"
echo "Sum: $sum"
Pretty compelling use of GNU datamash here:
read sum min max < <( datamash sum 1 min 1 max 1 < "$1" )
[[ -z $sum ]] && echo "file is empty"
echo "sum=$sum; min=$min; max=$max"
Or, sort and awk:
sort -n "$1" | awk '
NR == 1 { min = $1 }
{ sum += $1 }
END {
if (NR == 0) {
print "file is empty"
} else {
print "min=" min
print "max=" $1
print "sum=" sum
}
}
'
Here's how I'd fix your original attempt, preserving as much of the intent as possible:
#!/usr/bin/env bash
lines=$(wc -l < "$1")
if [ "$lines" -eq 0 ]; then
echo "File $1 is empty!"
exit
fi
min=$(head -n 1 "$1")
max=$min
sum=0
while [ "$lines" -gt 0 ]; do
num=$(tail -n "$lines" "$1" | head -n 1)
if [ "$num" -gt "$max" ]; then
max=$num
elif [ "$num" -lt "$min" ]; then
min=$num
fi
sum=$(( sum + num ))
lines=$(( lines - 1 ))
done
echo "Max: $max"
echo "Min: number $min"
echo "Sum: $sum"
The dealbreakers were missing linebreaks (can't use exit fi on a single line without ;); other changes are good practice (quoting expansions, useless use of cat), but wouldn't have prevented your script from working; and others are cosmetic (indentation, no backticks).
The overall approach is a massive antipattern, though: you read the whole file for each line being processed.
Here's how I would do it instead:
#!/usr/bin/env bash
for fname in "$@"; do
[[ -s $fname ]] || { echo "file $fname is empty" >&2; continue; }
IFS= read -r min < "$fname"
max=$min
sum=0
while IFS= read -r num; do
(( sum += num ))
(( max = num > max ? num : max ))
(( min = num < min ? num : min ))
done < "$fname"
printf '%s\n' "$fname:" " min: $min" " max: $max" " sum: $sum"
done
This uses the proper way to loop over an input file and utilizes the ternary operator in the arithmetic context.
The outermost for loop loops over all arguments.
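To try the ternary update in isolation, here is a minimal sketch. max_of is a hypothetical helper name, and it uses the $(( )) expansion form of the same operator rather than the (( )) command, so it also runs in plain sh.

```bash
# max_of NUM...
# Prints the largest of its integer arguments.
max_of() {
    local max num
    max=$1
    for num in "$@"; do
        # cond ? a : b keeps max unless num is larger
        max=$(( num > max ? num : max ))
    done
    printf '%s\n' "$max"
}

# Usage:
#   max_of 3 9 -2 7
```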
You can do the whole thing in one while loop inside a shell script. Here's the bash version:
s=0
while read x; do
if [ ! $mi ]; then
mi=$x
elif [ $mi -gt $x ]; then
mi=$x
fi
if [ ! $ma ]; then
ma=$x
elif [ $ma -lt $x ]; then
ma=$x
fi
s=$((s+x))
done
if [ ! $ma ]; then
echo "File is empty."
else
echo "s=$s, mi=$mi, ma=$ma"
fi
Save that script into a file, and then you can use pipes to send as many input files into it as you wish, like so (assuming the script is called "mysum"):
cat file1 file2 file3 | mysum
or for a single file
mysum < file1
(Make sure the script is executable and on the $PATH; otherwise use "./mysum" for the script in the current directory, or indeed "bash mysum" if it isn't executable.)
The script assumes that the numbers are one per line and that there's nothing else on the line. It gives a message if the input is empty.
How does it work? The "read x" will take input from stdin line-by-line. If the file is empty, the while loop will never be run, and thus variables mi and ma won't be set. So we use this at the end to trigger the appropriate message. Otherwise the loop checks first if the mi and ma variables exist. If they don't, they are initialised with the first x. Otherwise it is checked if the next x requires updating the mi and ma found thus far.
Note that this trick ensures that you can feed-in any sequence of numbers. Otherwise you have to initialise mi with something that's definitely too large and ma with something that's definitely too small - which works until you encounter a strange number list.
Note further, that this works for integers only. If you need to work with floats, then you need to use some other tool than the shell, e.g. awk.
Just for fun, here's the awk version, a one-liner, use as-is or in a script, and it will work with floats, too:
cat file1 file2 file3 | awk 'BEGIN{s=0}; {s+=$1; if(length(mi)==0)mi=$1; if(length(ma)==0)ma=$1; if(mi>$1)mi=$1; if(ma<$1)ma=$1} END{print s, mi, ma}'
or for one file:
awk 'BEGIN{s=0}; {s+=$1; if(length(mi)==0)mi=$1; if(length(ma)==0)ma=$1; if(mi>$1)mi=$1; if(ma<$1)ma=$1} END{print s, mi, ma}' < file1
Downside: it doesn't give a decent error message for an empty file.
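If the message matters, the END block can check NR. A sketch along the same lines; stats is a hypothetical wrapper name, and the message text is my own choice:

```bash
# stats [FILE...]
# Prints "sum min max" for the numbers read (one per line), from the
# named files or from stdin, or a message if there was no input.
stats() {
    awk '
        NR == 1 { mi = ma = $1 }      # seed min and max with the first value
        {
            s += $1
            if ($1 < mi) mi = $1
            if ($1 > ma) ma = $1
        }
        END {
            if (NR == 0) print "input is empty"
            else print s, mi, ma
        }
    ' "$@"
}

# Usage:
#   cat file1 file2 file3 | stats
#   stats file1
```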
a script that reads the file containing numbers (one per line) and writes their maximum, minimum and sum
Bash solution using sort:
<file sort -n | {
read -r sum
echo "Min is $sum"
max=$sum
while read -r num; do
sum=$((sum+num))
max=$num # input is sorted, so the last line read is the maximum
done
echo "Max is $max"
echo "Sum is $sum"
}
We can speed this up by splitting the sorted stream with tee, joining the lines with tr and summing with bc, if we don't mind the minimum and maximum being written to stderr. (A fifo could be used to synchronize the tee output instead.) Anyway:
{
<file sort -n |
tee >(echo "Min is $(head -n1)" >&2) >(echo "Max is $(tail -n1)" >&2) |
tr '\n' '+';
echo 0;
} | bc | sed 's/^/Sum is /'
And there is always datamash. The following will output 3 numbers: sum, min and max.
<file datamash sum 1 min 1 max 1
You can try with a shell loop and dc
while [ $# -gt 0 ] ; do
dc -f - -e '
['"$1"' is empty]sa
[la p q ]sZ
z 0 =Z
# if file is empty
dd sb sc
# populate max and min with the first value
[d sb]sY
[d lb <Y ]sM
# if max keep it
[d sc]sX
[d lc >X ]sN
# if min keep it
[lM x lN x ld + sd z 0 <B]sB
lB x
# on each line look for max, min and keep the sum
[max for '"$1"' = ] n lb p
[min for '"$1"' = ] n lc p
[sum for '"$1"' = ] n ld p
# print summary at end of each file
' <"$1"
shift
done

How to make "dictionary" with shell functions?

This is my code:
#!/bin/sh
echo "ARGUMENTS COUNT : " $#
echo "ARGUMENTS LIST : " $*
dictionary=`awk '{ print $1 }'`
function()
{
for i in dictionary
do
for j in $*
do
if [ $j = $i ]
then
;
else
append
fi
done
done
}
append()
{
ls $j > dictionary1.txt
}
function
I need to build a "dictionary" using Unix shell functions. For example, I pass a word such as hello as an argument. My function then checks whether that word already exists in the file dictionary1. If it does not, the word is appended to the file; if it already exists, nothing happens.
For some reason, my script does not work. When I start my script, it waits for something and that's it.
What am I doing wrong? How can I fix it?
An implementation that tries to care about both performance and correctness might look like:
#!/usr/bin/env bash
# ^^^^- NOT sh; sh does not support [[ ]] or <(...)
addWords() {
local tempFile dictFile
tempFile=$(mktemp dictFile.XXXXXX) || return
dictFile=$1; shift
[[ -e "$dictFile" ]] || touch "$dictFile" || return
sort -um "$dictFile" <(printf '%s\n' "$@" | sort -u) >"$tempFile"
mv -- "$tempFile" "$dictFile"
}
addWords myDict beta charlie delta alpha
addWords myDict charlie zulu
cat myDict
...has a final dictionary state of:
alpha
beta
charlie
delta
zulu
...and it rereads the input file only once for each addWords call (no matter how many words are being added!), not once per word to add.
Don't name a function "function".
Don't read in and walk through the whole file - all you need to know is whether the word is there or not. grep does that.
ls lists files. You want to send a word to the file, not a filename. use echo or printf.
sh isn't bash. Use bash unless there's a clear reason not to, and the only reason is because it isn't available.
Try this:
#!/usr/bin/env bash
checkWord() {
grep -qxF -- "$1" dictionary1.txt ||
echo "$1" >> dictionary1.txt
}
for wd
do checkWord "$wd"
done
If that works, you can add more structure and error checking.
You can remove your dictionary=awk... line (as mentioned it's blocking waiting for input) and simply grep your dictionary file for each argument, something like the below :
for i in "$@"
do
if ! grep -qow "$i" dictionary1.txt
then
echo "$i" >> dictionary1.txt
fi
done
With any awk in any shell on any UNIX box:
awk -v words="$*" '
BEGIN {
while ( (getline word < "dictionary1.txt") > 0 ) {
dict[word]++
}
close("dictionary1.txt")
split(words,tmp)
for (i in tmp) {
word = tmp[i]
if ( !dict[word]++ ) {
newWords = newWords word ORS
}
}
printf "%s", newWords >> "dictionary1.txt"
exit
}'

Bash: Counting instances of a string in text file with a loop

I am trying to write a simple bash script in which it takes in a text file, loops through the file and tells me how many times a certain string appears in the file. I want to eventually use this for a custom log searcher (for instance, search for the words 'log in' in a particular log file, etc.), but am having some difficulty as I am relatively new to bash. I want to be able to quickly search different logs for different terms at my will and see how many times they occur. Everything works perfectly until I get down to my loops. I think that I am using grep wrong, but am unsure if that is the issue. My loop codes may seem a little strange because I have been at it for a while and have been constantly tweaking things. I have done a bunch of searching but I feel like I am the only one who has ever had this issue (hopefully not because it is incredibly simple and I just suck). Any and all help is greatly appreciated, thanks in advance everyone.
edit: I would like to account for every instance of the string and not just
one instance per line
#!/bin/bash
echo "This bash script counts the instances of a user-defined string in a file."
echo "Enter a file to search:"
read fileName
echo " "
echo $path
if [ -f "$fileName" ] || [ -d "$fileName" ]; then
echo "File Checker Complete: '$fileName' is a file."
echo " "
echo "Enter a string that you would like to count the occurances of in '$fileName'."
read stringChoice
echo " "
echo "You are looking for '$stringChoice'. Counting...."
#TRYING WITH A WHILE LOOP
count=0
cat $fileName | while read line
do
if echo $line | grep $stringChoice; then
count=$[ count + 1 ]
done
echo "Finished processing file"
#TRYING WITH A FOR LOOP
# count=0
# for i in $(cat $fileName); do
# echo $i
# if grep "$stringChoice"; then
# count=$[ $count + 1 ]
# echo $count
# fi
# done
if [ $count == 1 ] ; then
echo " "
echo "The string '$stringChoice' occurs $count time in '$fileName'."
elif [ $count > 1 ]; then
echo " "
echo "The string '$stringChoice' occurs $count times in '$fileName'."
fi
elif [ ! -f "$fileName" ]; then
echo "File does not exist, please enter the correct file name."
fi
To find and count all occurrences of a string, you could use grep -o which matches only the word instead of the entire line and pipe the result to wc
read string; grep -o "$string" yourfile.txt | wc -l
You made a basic syntax error in the code (the if inside the while loop has no closing fi). Also, the count variable never appeared to update: the while loop was on the receiving end of a pipe, so it ran in a subshell, and the updated value of count was lost when that subshell exited.
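The subshell effect is easy to reproduce on its own. A small sketch (the function names are made up for the demonstration): in bash with default settings, and in most sh implementations, the piped loop leaves count at 0, while the redirected loop reports 3.

```bash
count_piped() {
    count=0
    printf 'a\nb\nc\n' | while read -r line; do
        count=$((count + 1))   # updates a copy inside the pipeline's subshell
    done
    echo "$count"              # the outer count was never touched
}

count_redirected() {
    count=0
    while read -r line; do
        count=$((count + 1))   # runs in the current shell
    done <<EOF
a
b
c
EOF
    echo "$count"
}
```

This is why feeding the loop with done < "$fileName" instead of cat $fileName | while ... makes the counter survive.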
Please change your code to the following one to get desired result.
#!/bin/bash
echo "This bash script counts the instances of a user-defined string in a file."
echo "Enter a file to search:"
read fileName
echo " "
echo $path
if [ -f "$fileName" ] ; then
echo "File Checker Complete: '$fileName' is a file."
echo " "
echo "Enter a string that you would like to count the occurances of in '$fileName'."
read stringChoice
echo " "
echo "You are looking for '$stringChoice'. Counting...."
#TRYING WITH A WHILE LOOP
count=0
while read line
do
if echo "$line" | grep -q -- "$stringChoice"; then
count=`expr $count + 1`
fi
done < "$fileName"
echo "Finished processing file"
echo "The string '$stringChoice' occurs $count time in '$fileName'."
elif [ ! -f "$fileName" ]; then
echo "File does not exist, please enter the correct file name."
fi

Script to calculate odd file size

I need to write a script which will calculate a total size of files which size is odd number; could you help me please?
#!/bin/bash
echo "Directory <$1> contains the following filenames of odd size:"
ls -l $1 |
while read file_parm
do
size=`echo $file_parm | cut -f 5 -d " "`
name=`echo $file_parm | cut -f 9 -d " "`
let "div=size%2"
if [ ! -d $name ]
then
if [ $div -ne 0 ]
then
# this is listing odd numbers from this
# directory; I just need to add them together
# and print result
echo "[$name : $size]"
fi
fi
done
I virtually copied the code from my comment and ran it, and it worked -- I just had to ensure I had $1 set to somewhere sane, rather than empty.
$ set -- "."; totsize=0; for file in "$1"/*; do if [ -f "$file" ]; then size=$(stat -c '%s' "$file"); if ((size % 2 == 1)); then echo "[$file : $size]"; ((totsize += $size)); fi; fi; done; echo "Total size of odd-sized files = $totsize"
[./bash-assoc-arrays.sh : 417]
[./makefile : 1125]
[./xx.pl : 117]
Total size of odd-sized files = 1659
$
Or, formatted for readability:
set -- "."
totsize=0
for file in "$1"/*
do
if [ -f "$file" ]
then
size=$(stat -c '%s' "$file")
if ((size % 2 == 1))
then
echo "[$file : $size]"
((totsize += $size))
fi
fi
done
echo "Total size of odd-sized files = $totsize"
The repeated invocation of stat is a bit expensive. If you don't have files with newlines in their names (most people don't), you can speed it up with a single invocation of stat and some care:
stat -c '%s %A %n' "$1"/* |
{
totsize=0
while read -r size mode name
do
# %A prints the permission string; a leading "-" marks a regular file
if [ "X${mode:0:1}" = "X-" ] && ((size % 2 == 1))
then
((totsize+=$size))
echo "[$name : $size]"
fi
done
echo "Total size of odd-sized files = $totsize"
}
You could use (...) in place of {...} at a marginal (unmeasurable) cost in efficiency.
Answers to other questions explain the [ "X$var" = "X-" ] notation.
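stat -c is specific to GNU coreutils. Where it isn't available, a sketch using wc -c gives the same total, at the cost of one process per file. odd_total is a hypothetical function name, and the per-file listing is left out to keep it short.

```bash
# odd_total DIR
# Prints the combined size in bytes of the regular files in DIR
# whose size is an odd number.
odd_total() {
    dir=$1
    total=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        size=$(wc -c < "$f")             # byte count without the file name
        if [ $((size % 2)) -eq 1 ]; then
            total=$((total + size))
        fi
    done
    echo "$total"
}

# Usage:
#   odd_total .
```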

greping a character from file UNIX.linux bash. Can't pass an argument(file name) through command line

I am having trouble with my newbie linux script which needs to count brackets and tell if they are matched.
#!/bin/bash
file="$1"
x="()(((a)(()))"
left=$(grep -o "(" <<<"$x" | wc -l)
rght=$(grep -o ")" <<<"$x" | wc -l)
echo "left = $left right = $rght"
if [ $left -gt $rght ]
then echo "not enough brackets"
elif [ $left -eq $rght ]
then echo "all brackets are fine"
else echo "too many"
fi
The problem is that I can't pass an argument on the command line so that grep counts the brackets in the file. I tried writing $file in place of $x, but it does not work.
I am executing the script by writing: ./script.h test1.txt. The file test1.txt is in the same folder as script.h.
Any help in explaining how the parameter passing works would be great. Or maybe other way to do this script?
The <<< construct (a here-string) passes the contents of a variable; it does not read the contents of a file. If you execute this snippet, you can see what I mean:
#!/bin/bash
file="()(((a)((a simple test)))"
echo "$( cat <<<"$file" )"
which is also equivalent to just echo "$file". That is, what is being sent to the console are the contents of the variable "file".
To get the contents of a file whose name is stored in a variable called "file", do:
#!/bin/bash
file="test1.txt"
echo "$( cat <"$file" )"
which is exactly equivalent to echo "$( <"$file" )", cat <"$file" or even <"$file" cat
You can use: grep -o "(" <"$file" or <"$file" grep -o "("
But grep could accept a file as a parameter, so this: grep -o "(" "$file" also works.
However, I believe that tr would be a better command, as in: <"$file" tr -cd "(".
It reduces the whole file to a sequence of "(" characters only, so far less data has to be passed to the wc command. Your script would then become:
#!/bin/bash -
file="$1"
[[ $file ]] || exit 1 # make sure the var "file" is not empty.
[[ -r $file ]] || exit 2 # make sure the file exists and is readable.
left=$(<"$file" tr -cd "("| wc -c)
rght=$(<"$file" tr -cd ")"| wc -c)
echo "left = $left right = $rght"
# as $left and $rght are strictly numeric values, this integer tests work:
(( $left > $rght )) && echo "not enough right brackets"
(( $left == $rght )) && echo "all brackets are fine"
(( $left < $rght )) && echo "too many right brackets"
# added as per an additional request of the OP.
if [[ $(<"$file" tr -cd "()"|head -c1) = ")" ]]; then
echo "the first character is (incorrectly) a right bracket"
fi
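Equal counts alone don't prove the brackets are matched: a file containing )( passes the count test. If order matters, one possible extension is to track a running depth. This is a sketch, not part of the original request; balanced is a hypothetical function name, and it only looks at ( and ).

```bash
# balanced FILE
# Succeeds if every ")" in FILE closes an earlier "(" and none stay open.
balanced() {
    # Keep only the brackets, add a final newline, and emit one per line.
    { tr -cd '()' < "$1"; echo; } | fold -w 1 | (
        depth=0
        while read -r c; do
            case $c in
                '(') depth=$((depth + 1)) ;;
                ')') depth=$((depth - 1)) ;;
            esac
            # A negative depth means a ")" appeared before its "(".
            if [ "$depth" -lt 0 ]; then exit 1; fi
        done
        [ "$depth" -eq 0 ]
    )
}

# Usage:
#   balanced test1.txt && echo "brackets are matched"
```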

Resources