How to make "dictionary" with shell functions?

This is my code:
#!/bin/sh
echo "ARGUMENTS COUNT : " $#
echo "ARGUMENTS LIST : " $*
dictionary=`awk '{ print $1 }'`
function()
{
for i in dictionary
do
for j in $*
do
if [ $j = $i ]
then
;
else
append
fi
done
done
}
append()
{
ls $j > dictionary1.txt
}
function
I need to build a "dictionary" using Unix shell functions. For example, I pass a word such as hello as an argument. The function then checks whether that word already exists in the file dictionary1. If it does not, the word is appended to the file; if it already exists, nothing happens.
For some reason, my script does not work. When I start it, it just waits for something and nothing else happens.
What am I doing wrong? How can I fix it?

An implementation that tries to care about both performance and correctness might look like:
#!/usr/bin/env bash
# ^^^^- NOT sh; sh does not support [[ ]] or <(...)
addWords() {
local tempFile dictFile
tempFile=$(mktemp dictFile.XXXXXX) || return
dictFile=$1; shift
[[ -e "$dictFile" ]] || touch "$dictFile" || return
sort -um "$dictFile" <(printf '%s\n' "$@" | sort -u) >"$tempFile"
mv -- "$tempFile" "$dictFile"
}
addWords myDict beta charlie delta alpha
addWords myDict charlie zulu
cat myDict
...has a final dictionary state of:
alpha
beta
charlie
delta
zulu
...and it rereads the input file only once for each addWords call (no matter how many words are being added!), not once per word to add.
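The same sort-and-merge idea can be sketched for plain sh, where process substitution is unavailable, by feeding the new words to sort -m on standard input. The function name add_words is made up for illustration, and the sketch assumes mktemp is installed (it is common but not required by POSIX) and that the dictionary file is already sorted in the current locale.

```shell
# add_words DICT WORD...: merge WORDs into the sorted, de-duplicated file DICT.
# The body runs in a subshell, so its variables do not leak into the caller.
add_words() (
    dict=$1
    shift
    [ "$#" -gt 0 ] || exit 0                 # nothing to add
    [ -e "$dict" ] || : > "$dict" || exit    # create an empty dictionary if missing
    tmp=$(mktemp "$dict.XXXXXX") || exit     # temp file next to DICT
    # Sort the new words, then merge them with the already-sorted dictionary.
    # The "-" operand tells the second sort to read the new words from stdin.
    if printf '%s\n' "$@" | sort -u | sort -um -- "$dict" - > "$tmp"; then
        mv -- "$tmp" "$dict"
    else
        rm -f -- "$tmp"
        exit 1
    fi
)
```

As in the bash version, the dictionary is read once per call, however many words are passed.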

Don't name a function "function".
Don't read in and walk through the whole file - all you need to know is whether the word is there or not. grep does that.
ls lists files. You want to send a word to the file, not a filename. Use echo or printf.
sh isn't bash. Use bash unless there's a clear reason not to; about the only good reason is that it isn't available.
Try this:
#!/usr/bin/env bash
checkWord() {
# -x matches whole lines and -F treats the word literally, so "he" does not match "hello"
grep -qxF -- "$1" dictionary1.txt ||
echo "$1" >> dictionary1.txt
}
for wd
do checkWord "$wd"
done
If that works, you can add more structure and error checking.
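A minimal sketch of what "more structure and error checking" might look like. The function name add_word and the choice to pass the dictionary path as the first argument are assumptions made for illustration, not part of the answer above.

```shell
# add_word DICT WORD: append WORD to DICT unless DICT already has that exact line.
# The body runs in a subshell, so its variables do not leak into the caller.
add_word() (
    dict=$1
    word=$2
    [ -n "$word" ] || { echo "usage: add_word DICT WORD" >&2; exit 2; }
    [ -e "$dict" ] || : > "$dict" || exit    # create the dictionary if missing
    # -q quiet, -x whole-line match, -F literal string (no regex), -- ends options
    grep -qxF -- "$word" "$dict" || printf '%s\n' "$word" >> "$dict"
)
```

Called as add_word dictionary1.txt hello, it leaves the file untouched when hello is already present as a full line.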

You can remove your dictionary=awk... line (as mentioned it's blocking waiting for input) and simply grep your dictionary file for each argument, something like the below :
for i in "$@"
do
if ! grep -qow "$i" dictionary1.txt
then
echo "$i" >> dictionary1.txt
fi
done

With any awk in any shell on any UNIX box:
awk -v words="$*" '
BEGIN {
while ( (getline word < "dictionary1.txt") > 0 ) {
dict[word]++
}
close("dictionary1.txt")
split(words,tmp)
for (i in tmp) {
word = tmp[i]
if ( !dict[word]++ ) {
newWords = newWords word ORS
}
}
printf "%s", newWords >> "dictionary1.txt"
exit
}'

Related

Linux script reading an ini file and splitting into variables by a specified character

I'm stuck on the following task. Let's pretend we have an .ini file in a folder. The file contains lines like this:
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I had to choose the biggest subnet mask. So my attempt was:
declare -A data
for f in datadir/name
do
while read line
do
r=(${line//=/ })
let data[${r[0]}]=${r[1]}
done < $f
done
This is how far I got. (Yes, I know the file named name is not an .ini file but a .txt; I had trouble even creating an .ini file, and the teacher didn't give us a file like that for the exam.)
It splits the line at the =, but it refuses to read the IP address because of the (first) . character.
(The error message I got was "Invalid arithmetic operator".)
If someone could help me and explain how I can write a script for tasks like this, I would be really thankful!

Both previously presented solutions operate (and do what they're designed to do); I thought I'd add something left-field as the specifications are fairly loose.
$ cat freasy
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I'd argue that the biggest subnet mask is the one with the lowest numerical value (holds the most hosts).
$ sort -t/ -k2,2nr freasy| tail -n1
eno1=10.0.0.254/24

Don't use let. It's for arithmetic.
$ help let
let: let arg [arg ...]
Evaluate arithmetic expressions.
Evaluate each ARG as an arithmetic expression.
Just use straight assignment:
declare -A data
for f in datadir/name
do
while read line
do
r=(${line//=/ })
data[${r[0]}]=${r[1]}
done < $f
done
Result:
$ declare -p data
declare -A data=([tun0]="10.10.10.1/32" [eno1]="10.0.0.254/24" [eno2]="172.16.4.129/25" [eno3]="192.168.2.1/25" )

awk provides a simple solution to find the max value following the '/' that will be orders of magnitude faster than a bash script or Unix pipeline using:
awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
Example Use/Output
$ awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
32
The awk command above separates the fields using either '=' or '/' as the field separator, keeps the maximum of the 3rd field $3, and outputs that value in the END {...} rule.
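If the whole matching line is wanted rather than just the number, the same field splitting can remember the line as well. This is a small sketch, not part of the original answer: the function name max_prefix_line is made up, and it assumes every prefix length is greater than 0.

```shell
# max_prefix_line FILE: print the first line of FILE that has the largest
# number after the "/" (fields are split on "=" or "/").
max_prefix_line() {
    awk -F'[=/]' '
        $3 + 0 > max { max = $3 + 0; line = $0 }   # remember the best line so far
        END { if (line != "") print line }
    ' "$1"
}
```

For the sample file in the question this prints the tun0 line, since /32 is the largest value.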
Bash Solution
If you did want a bash script solution, then you can isolate the wanted parts of each line using [[ .. =~ .. ]] to populate the BASH_REMATCH array and then compare ${BASH_REMATCH[3]} against a max variable. The [[ .. ]] expression with =~ considers everything on the right side an Extended Regular Expression and will isolate each grouping ((...)) as an element in the array BASH_REMATCH, e.g.
#!/bin/bash
[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }
declare -i max=0
while read -r line; do
[[ $line =~ ^(.*)=(.*)/(.*)$ ]]
((${BASH_REMATCH[3]} > max)) && max=${BASH_REMATCH[3]}
done < "$1"
printf "max: %s\n" "$max"
Using Only POSIX Parameter Expansions
Using parameter expansion with substring removal supported by POSIX shell (Bourne shell, dash, etc..), you could do:
#!/bin/sh
[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }
max=0
while read line; do
[ "${line##*/}" -gt "$max" ] && max="${line##*/}"
done < "$1"
printf "max: %s\n" "$max"
Example Use/Output
After making yourscript.sh executable with chmod +x yourscript.sh, you would do:
$ ./yourscript.sh file
max: 32
(same output for both shell script solutions)
Let me know if you have further questions.
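The substring-removal expansions used in the POSIX version can be seen in isolation on a single line of the input. This is only an illustration; the variable names name, rest, addr and prefix are made up here.

```shell
line='eno2=172.16.4.129/25'

name=${line%%=*}     # remove the longest suffix matching "=*"  -> eno2
rest=${line#*=}      # remove the shortest prefix matching "*=" -> 172.16.4.129/25
addr=${rest%/*}      # remove the shortest suffix matching "/*" -> 172.16.4.129
prefix=${rest##*/}   # remove the longest prefix matching "*/"  -> 25

printf '%s %s %s\n' "$name" "$addr" "$prefix"
```

A single # or % removes the shortest match and the doubled form removes the longest, which is why ${line##*/} leaves only the text after the last slash.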

bash separate parameters with specific delimiter

I am searching for a command, that separates all given parameters with a specific delimiter, and outputs them quoted.
Example (delimiter is set to be a colon :):
somecommand "this is" "a" test
should output
"this is":"a":"test"
I'm aware that the shell interprets the "" quotes before passing the parameters to the command. So what the command should actually do is to print out every given parameter in quotes and separate all these with a colon.
I'm also not seeking for a bash-only solution, but for the most elegant solution.
It is very easy to just loop over an array of these elements and do that, but the problem is that I have to use this inside a gnu makefile which only allows single line shell commands and uses sh instead of bash.
So the simpler the better.

How about
somecommand () {
printf '"%s"\n' "$@" | paste -s -d :
}
Use printf to add the quotes and print every entry on a separate line, then use paste with the -s ("serial") option and a colon as the delimiter.
Can be called like this:
$ somecommand "this is" "a" test
"this is":"a":"test"

apply_delimiter () {
(( $# )) || return
local res
printf -v res '"%s":' "$@"
printf '%s\n' "${res%:}"
}
Usage example:
$ apply_delimiter hello world "how are you"
"hello":"world":"how are you"

As indicated in a number of the comments, a simple "loop-over" approach, looping over each of the strings passed as arguments is a fairly straight-forward way to approach it:
delimit_colon() {
local first=1
for i in "$@"; do
if [ "$first" -eq 1 ]; then
printf "%s" "$i"
first=0
else
printf ":%s" "$i"
fi
done
printf "\n"
}
Which when combined with a short test script could be:
#!/bin/bash
delimit_colon() {
local first=1
for i in "$@"; do
if [ "$first" -eq 1 ]; then
printf "%s" "$i"
first=0
else
printf ":%s" "$i"
fi
done
printf "\n"
}
[ -z "$1" ] && { ## validate input
printf "error: insufficient input\n"
exit 1
}
delimit_colon "$@"
exit 0
Test Input/Output
$ bash delimitargs.sh "this is" "a" test
this is:a:test

Here a solution using the z-shell:
#!/usr/bin/zsh
# this is "somecommand"
echo '"'${(j_":"_)@}'"'

If you have them in an array already, you can use this command
MYARRAY=("this is" "a" "test")
joined_string=$(IFS=:; echo "${MYARRAY[*]}")
echo "$joined_string"
Setting IFS (the internal field separator) chooses the separator: "${MYARRAY[*]}" joins the array elements with the first character of IFS. Wrapping the commands in $() captures the output of echo in joined_string and confines the IFS change to the subshell.
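The same IFS mechanism works with the positional parameters, so no array is needed and the code also runs in plain sh. A minimal sketch; the function name join_by is made up, and only the first character of the separator argument takes effect.

```shell
# join_by SEP ARG...: print the ARGs joined by the first character of SEP.
# The body runs in a subshell, so the IFS change never reaches the caller.
join_by() (
    IFS=$1
    shift
    printf '%s\n' "$*"    # "$*" joins the parameters with the first character of IFS
)

join_by : "this is" a test
```

This prints this is:a:test. It does not add the quotes asked for in the question.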

How do I indirectly assign a variable in bash to take multi-line data from both Standard In, a File, and the output of execution

I have found many snippets here and in other places that answer parts of this question. I have even managed to do this in many steps in an inefficient manner. If it is possible, I would really like to find single lines of execution that will perform this task, rather than having to assign to a variable and copy it a few times to perform the task.
e.g.
executeToVar ()
{
# Takes Arg1: NAME OF VARIABLE TO STORE IN
# All Remaining Arguments Are Executed
local STORE_INvar="${1}" ; shift
eval ${STORE_INvar}=\""$( "$@" 2>&1 )"\"
}
Overall does work, i.e. $ executeToVar SOME_VAR ls -l * # will actually fill SOME_VAR with the output of the execution of the ls -l * command that is taken from the rest of the arguments. However, if the command was to output empty lines at the end, (for e.g. - echo -e -n '\n\n123\n456\n789\n\n' which should have 2 x new lines at the start and the end ) these are stripped by bash's sub-execution process. I have seen in other posts similar to this that this has been solved by adding a token 'x' to the end of the stream, e.g. turning the sub-execution into something like:
eval ${STORE_INvar}=\""$( "$@" 2>&1 ; echo -n x )"\" # <-- ( Add echo -n x )
# and then if it wasn't an indirect reference to a var:
STORE_INvar=${STORE_INvar%x}
# However no matter how much I play with:
eval "${STORE_INvar}"=\""${STORE_INvar%x}"\"
# I am unable to indirectly remove the x from the end.
Anyway, I also need 2 x other variants on this, one that assigns the STDIN stream to the var and one that assigns the contents of a file to the var which I assume will be variations of this involving $( cat ${1} ), or maybe $( cat ${1:--} ) to give me a '-' if no filename. But, none of that will work until I can sort out the removal of the x that is needed to ensure accurate assignment of multi line variables.
I have also tried (but to no avail):
IFS='' read -d '' "${STORE_INvar}" <<<"$( $@ ; echo -n x )"
eval \"'${STORE_INvar}=${!STORE_INvar%x}'\"

This is close to optimal -- but drop the eval.
executeToVar() { local varName=$1; shift; printf -v "$varName" %s "$("$@")"; }
The one problem this formulation still has is that $() strips trailing newlines. If you want to prevent that, you need to add your own trailing character inside the subshell, and strip it off yourself.
executeToVar() {
local varName=$1; shift;
local val="$(printf %s x; "$@"; printf %s x)"; val=${val#x}
printf -v "$varName" %s "${val%x}"
}
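The effect of the sentinel character can be checked in isolation. The snippet below is a small demonstration that compares the lengths of a plain command substitution and one protected by a trailing x; the variable names plain and kept are made up for the example.

```shell
# Command substitution removes every trailing newline from the captured output.
plain=$(printf 'line\n\n\n')

# Printing a sentinel last means the output no longer ends in a newline,
# so nothing is stripped; the sentinel is then removed by hand.
kept=$(printf 'line\n\n\n'; printf x)
kept=${kept%x}

printf 'plain: %d chars, kept: %d chars\n' "${#plain}" "${#kept}"
```

This prints plain: 4 chars, kept: 7 chars: the three trailing newlines survive only in the second variable.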
If you want to read all content from stdin into a variable, this is particularly easy:
# This requires bash 4.1 for automatic fd allocation
readToVar() {
if [[ $2 && $2 != "-" ]]; then
exec {read_in_fd}<"$2" # copy from named file
else
exec {read_in_fd}<&0 # copy from stdin
fi
IFS= read -r -d '' "$1" <&$read_in_fd # read from the FD
exec {read_in_fd}<&- # close that FD
}
...used as:
readToVar var < <( : "run something here to read its output byte-for-byte" )
...or...
readToVar var filename
Testing these:
bash3-3.2$ executeToVar var printf '\n\n123\n456\n789\n\n'
bash3-3.2$ declare -p var
declare -- var="
123
456
789
"
...and...
bash4-4.3$ readToVar var2 < <(printf '\n\n123\n456\n789\n\n')
bash4-4.3$ declare -p var2
declare -- var2="
123
456
789
"

What's wrong with storing it in a file:
$ stuffToFile filename $(stuff)
where "stuffToFile" tests for a. > 1 argument, b. input on a pipe
$ ... commands ... | stuffToFile filename
and
$ stuffToFile filename < another_file
where "stuffToFile" is a function:
function stuffToFile
{
[[ -f $1 ]] || { echo "$1 is not a file"; return 1; }
[[ $# -lt 2 ]] && { cat - > "$1"; return; }
echo "$*" > "$1"
}
so, if "stuff" has leading and trailing blank lines, then you must:
$ stuff | stuffToFile filename

shell script to find a word in a list of files, all of them given as parameters

I need a simple shell program which has to do something like this:
script.sh word_to_find file1 file2 file3 .... fileN
which will display
word_to_find 3 - if word_to_find appears in 3 files
or
word_to_find 5 - if word_to_find appears in 5 files
This is what I've tried
#!/bin/bash
count=0
for i in $@; do
if [ grep '$1' $i ];then
((count++))
fi
done
echo "$1 $count"
But this message appears:
syntax error: "then" unexpected (expecting "done").
Before this the error was
[: grep: unexpected operator.

Try this:
#!/bin/sh
printf '%s %d\n' "$1" $(grep -hm1 "$@" | wc -l)
Notice how all the script's arguments are passed verbatim to grep -- the first is the search expression, the rest are filenames.
The output from grep -hm1 is a list of matches, one per file with a match, and wc -l counts them.
I originally posted this answer with grep -l but that would require filenames to never contain a newline, which is a rather pesky limitation.
Maybe add an -F option if regular expression search is not desired (i.e. only search literal text).

The code you showed is:
#!/bin/bash
count=0
for i in $@; do
if [ grep '$1' $i ];then
((count++))
fi
done
echo "$1 $count"
When I run it, I get the error:
script.sh: line 5: [: $1: binary operator expected
This is reasonable, but it is not the same as either of the errors reported in the question. There are multiple problems in the code.
The for i in $@; do should be for i in "$@"; do. Always use "$@" so that any spaces in the arguments are preserved. If none of your file names contain spaces or tabs, it is not critical, but it is a good habit to get into. (See How to iterate over arguments in bash script for more information.)
The if statement runs the [ (aka test) command, which is actually a shell built-in as well as a binary in /bin or /usr/bin. The use of single quotes around '$1' means that the value is not expanded, and the command sees its arguments as:
[
grep
$1
current-file-name
]
where the first is the command name, or argv[0] in C, or $0 in shell. The error I got is because the test command expects an operator such as = or -lt at the point where $1 appears (that is, it expects a binary operator, not $1, hence the message).
You actually want to test whether grep found the word in $1 in each file (the names listed after $1). You probably want to code it like this, then:
#!/bin/bash
word="$1"
shift
count=0
for file in "$@"
do
if grep -l "$word" "$file" >/dev/null 2>&1
then ((count++))
fi
done
echo "$word $count"
We can negotiate on the options and I/O redirections used with grep. The POSIX grep options -q and -s provide varying degrees of silence, and -q could be used in place of -l. The -l option simply lists the file name if the word is found, and stops scanning on the first occurrence. The I/O redirection ensures that errors are thrown away, but the test ensures that successful matches are counted.
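As a sketch of the -q variant mentioned above, written so that it also runs in plain sh: the function name count_files_with is made up, the word is treated as a basic regular expression (add -F for literal text), and 2>/dev/null hides errors about unreadable files.

```shell
# count_files_with WORD FILE...: print WORD and the number of FILEs that contain it.
# The body runs in a subshell, so its variables do not leak into the caller.
count_files_with() (
    word=$1
    shift
    count=0
    for file in "$@"; do
        # -q: print nothing and succeed at the first match in this file
        if grep -q -- "$word" "$file" 2>/dev/null; then
            count=$((count + 1))
        fi
    done
    printf '%s %d\n' "$word" "$count"
)
```

With the three test files used in the next section, count_files_with country young.iii fresh.txt little.iii should print country 2.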
Incorrect output claimed
It has been claimed that the code above does not produce the correct answer. Here's the test I performed:
$ echo "This country is young" > young.iii
$ echo "This country is little" > little.iii
$ echo "This fruit is fresh" > fresh.txt
$ bash findit.sh country young.iii fresh.txt little.iii
country 2
$ bash -x findit.sh country young.iii fresh.txt little.iii
+ '[' -f /etc/bashrc ']'
+ . /etc/bashrc
++ '[' -z '' ']'
++ return
+ alias 'r=fc -e -'
+ word=country
+ shift
+ count=0
+ for file in '"$@"'
+ grep -l country young.iii
+ (( count++ ))
+ for file in '"$@"'
+ grep -l country fresh.txt
+ for file in '"$@"'
+ grep -l country little.iii
+ (( count++ ))
+ echo 'country 2'
country 2
$
This shows that for the given files, the output is correct on my machine (Mac OS X 10.10.2; GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14)). If the equivalent test works differently on your machine, then (a) please identify the machine and the version of Bash (bash --version), and (b) please update the question with the output you see from bash -x findit.sh country young.iii fresh.txt little.iii. You may want to create a sub-directory (such as junk), and copy findit.sh into that directory before creating the files as shown, etc.
You could also bolster your case by showing the output of:
$ grep country young.iii fresh.txt little.iii
young.iii:This country is young
little.iii:This country is little
$

#!/usr/bin/perl
use strict;
use warnings;
my $wordtofind = shift(@ARGV);
my $regex = qr/\Q$wordtofind/s;
my @file = ();
my $count = 0;
my $filescount = scalar(@ARGV);
for my $file(@ARGV)
{
if(-e $file)
{
eval { open(FH,'<' . $file) or die "can't open file $file "; };
unless($@)
{
for(<FH>)
{
if(/$regex/)
{
$count++;
last;
}
}
close(FH);
}
}
}
print "$wordtofind $count\n";

You could use an Awk script:
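#!/usr/bin/awk -f
BEGIN {
n=0
} FNR == 1 {
found=0 # new file: nothing matched in it yet
} !found && $0 ~ w {
n++ # count each file once, at its first matching line
found=1
} END {
print w,n
}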
and run it like this:
./script.awk w=word_to_find file1 file2 file3 ... fileN
or if you don't want to worry about assigning a variable (w) on the command line:
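BEGIN {
n=0
w=ARGV[1]
delete ARGV[1]
} FNR == 1 {
found=0 # new file: nothing matched in it yet
} !found && $0 ~ w {
n++ # count each file once, at its first matching line
found=1
} END {
print w,n
}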

Create new file but add number if filename already exists in bash

I found similar questions but not in Linux/Bash
I want my script to create a file with a given name (via user input) but add number at the end if filename already exists.
Example:
$ create somefile
Created "somefile.ext"
$ create somefile
Created "somefile-2.ext"

The following script can help you. Do not run several copies of the script at the same time, because the check-then-create sequence has a race condition.
name=somefile
if [[ -e $name.ext || -L $name.ext ]] ; then
i=0
while [[ -e $name-$i.ext || -L $name-$i.ext ]] ; do
let i++
done
name=$name-$i
fi
touch -- "$name".ext

Easier:
touch file`ls file* | wc -l`.ext
You'll get:
$ ls file*
file0.ext file1.ext file2.ext file3.ext file4.ext file5.ext file6.ext

To avoid the race conditions:
name=some-file
n=
set -o noclobber
until
file=$name${n:+-$n}.ext
{ command exec 3> "$file"; } 2> /dev/null
do
((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
And in addition, you have the file open for writing on fd 3.
With bash-4.4+, you can make it a function like:
create() { # fd base [suffix [max]]]
local fd="$1" base="$2" suffix="${3-}" max="${4-}"
local n= file
local - # ash-style local scoping of options in 4.4+
set -o noclobber
REPLY=
until
file=$base${n:+-$n}$suffix
eval 'command exec '"$fd"'> "$file"' 2> /dev/null
do
((n++))
((max > 0 && n > max)) && return 1
done
REPLY=$file
}
To be used for instance as:
create 3 somefile .ext || exit
printf 'File: "%s"\n' "$REPLY"
echo something >&3
exec 3>&- # close the file
The max value can be used to guard against infinite loops when the files can't be created for other reason than noclobber.
Note that noclobber only applies to the > operator, not >> nor <>.
Remaining race condition
Actually, noclobber does not remove the race condition in all cases. It only prevents clobbering regular files (not other types of files, so that cmd > /dev/null for instance doesn't fail) and has a race condition itself in most shells.
The shell first does a stat(2) on the file to check if it's a regular file or not (fifo, directory, device...). Only if the file doesn't exist (yet) or is a regular file does 3> "$file" use the O_EXCL flag to guarantee not clobbering the file.
So if there's a fifo or device file by that name, it will be used (provided it can be opened write-only), and a regular file may be clobbered if it gets created as a replacement for a fifo/device/directory... in between that stat(2) and open(2) without O_EXCL!
Changing the
{ command exec 3> "$file"; } 2> /dev/null
to
[ ! -e "$file" ] && { command exec 3> "$file"; } 2> /dev/null
would avoid using an already existing non-regular file, but would not address the race condition.
Now, that's only really a concern in the face of a malicious adversary that would want to make you overwrite an arbitrary file on the file system. It does remove the race condition in the normal case of two instances of the same script running at the same time. So, in that, it's better than approaches that only check for file existence beforehand with [ -e "$file" ].
For a working version without race condition at all, you could use the zsh shell instead of bash which has a raw interface to open() as the sysopen builtin in the zsh/system module:
zmodload zsh/system
name=some-file
n=
until
file=$name${n:+-$n}.ext
sysopen -w -o excl -u 3 -- "$file" 2> /dev/null
do
((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
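For plain sh, the noclobber approach can be sketched as a function that prints the name it managed to create, using the numbering from the question (somefile.ext, then somefile-2.ext). The function name create_unique and the cap of 1000 attempts are assumptions made for illustration, and the caveats about noclobber described above still apply.

```shell
# create_unique BASE EXT: create BASE+EXT, or BASE-2+EXT, BASE-3+EXT, ...
# and print the name that was created.
# The body runs in a subshell, so "set -C" does not affect the caller.
create_unique() (
    base=$1
    ext=$2
    n=
    set -C    # noclobber: ">" refuses to overwrite an existing regular file
    while :; do
        file=$base${n:+-$n}$ext
        # "true" is not a special builtin, so a failed redirection
        # only fails this one command instead of aborting the shell
        if { true > "$file"; } 2>/dev/null; then
            printf '%s\n' "$file"
            exit 0
        fi
        n=$(( ${n:-1} + 1 ))           # first retry uses -2
        [ "$n" -le 1000 ] || exit 1    # arbitrary cap against endless loops
    done
)
```

It might be used as file=$(create_unique somefile .ext) || exit, after which "$file" exists and is empty. Unlike the exec 3> version, no file descriptor stays open.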

Try something like this
name=somefile
path=$(dirname "$name")
filename=$(basename "$name")
extension="${filename##*.}"
filename="${filename%.*}"
if [[ -e $path/$filename.$extension ]] ; then
i=2
while [[ -e $path/$filename-$i.$extension ]] ; do
let i++
done
filename=$filename-$i
fi
target=$path/$filename.$extension

Use touch or whatever you want instead of echo:
echo file$((`ls file* | sed -n 's/file\([0-9]*\)/\1/p' | sort -rh | head -n 1`+1))
Parts of expression explained:
list files by pattern: ls file*
take only number part in each line: sed -n 's/file\([0-9]*\)/\1/p'
apply reverse human sort: sort -rh
take only first line (i.e. max value): head -n 1
combine all in pipe and increment (full expression above)

Try something like this (untested, but you get the idea):
filename=$1
# If file doesn't exist, create it
if [[ ! -f $filename ]]; then
touch $filename
echo "Created \"$filename\""
exit 0
fi
# If file already exists, find a similar filename that is not yet taken
digit=1
while true; do
temp_name=$filename-$digit
if [[ ! -f $temp_name ]]; then
touch $temp_name
echo "Created \"$temp_name\""
exit 0
fi
digit=$(($digit + 1))
done
Depending on what you're doing, replace the calls to touch with whatever code is needed to create the files that you are working with.

This is a much better method I've used for creating directories incrementally.
It could be adjusted for filename too.
LAST_SOLUTION=$(echo $(ls -d SOLUTION_[[:digit:]][[:digit:]][[:digit:]][[:digit:]] 2> /dev/null) | awk '{ print $(NF) }')
if [ -n "$LAST_SOLUTION" ] ; then
mkdir SOLUTION_$(printf "%04d\n" $(expr ${LAST_SOLUTION: -4} + 1))
else
mkdir SOLUTION_0001
fi

A simple repackaging of choroba's answer as a generalized function:
autoincr() {
f="$1"
ext=""
# Extract the file extension (if any), with preceding '.'
[[ "$f" == *.* ]] && ext=".${f##*.}"
if [[ -e "$f" ]] ; then
i=1
f="${f%.*}";
while [[ -e "${f}_${i}${ext}" ]]; do
let i++
done
f="${f}_${i}${ext}"
fi
echo "$f"
}
touch "$(autoincr "somefile.ext")"

Without looping, and without regex or shell expr:
last=$(ls "$1"* | tail -n1)
last_wo_ext=$(basename "$last" .ext)
n=$(echo $last_wo_ext | rev | cut -d - -f 1 | rev)
if [ x$n = x ]; then
n=2
else
n=$((n + 1))
fi
echo $1-$n.ext
A simpler variant, without the extension handling and without the special case for the first file:
n=$(ls $1* | tail -n1 | rev | cut -d - -f 1 | rev)
n=$((n + 1))
echo $1-$n.ext
