BASH convert a text file line by line into variables

BASH convert a text file line by line into variables - linux

I want to make a script for an automated setup for a multiseat system. First action is
lspci | grep -i 'vga\|graphic' | cut -b 1-7 > text.txt
Now i want to put the two lines of the file into variables. My dowdy solution was this:
VAR1=$(head -n 1 text.txt)
VAR2=$(tail -n 1 text.txt)
It also works, however, there's probably a better solution to convert a text file line by line into variables.

The following should achieve exactly what you're doing, without the use of a temporary file
#!/bin/bash
{ read -r var1 _ && read -r var2 _; } < <(lspci | grep -i 'vga\|graphics')
Now, if you have several lines from lspci | grep -i 'vga\|graphics' (or just one, or none), you might want something more general, i.e., put the results in an array:
#!/bin/bash
var=()
while read -r f _; do var+=( "$f" ); done < <(lspci | grep -i 'vga\|graphics')
# display the content of var
declare -p var
If you have a recent version of Bash, and you love mapfile and awk (but who doesn't?), you could also do something like this:
#!/bin/bash
mapfile -t var < <(lspci | awk 'tolower($0) ~ /var|graphics/ { print $1 }')
# display the content of var
declare -p var
For a Pure Bash possibility (except for lspci, of course):
#!/bin/bash
shopt -s extglob
var=()
while read -r v rest; do
[[ ${rest,,} = *#(vga|graphics)* ]] && var+=( "$v" )
done < <(lspci)
# display var
declare -p var
This uses:
Lower case conversion of rest with ${rest,,}
Pattern matching and extended globs with *#(vga|graphics)* (to avoid regular expressions altogether).

If you can format your text file into name value pairs, you could use bash associative arrays to store and reference each item. Note in this code = is used as the delimiter to separate the name value pair.
#read in fast config (name value pair file)
declare -A MYMAP
while read item; do
NAME=$(cut -d "=" -f1 <<<"$item")
VALUE=$(cut -d "=" -f2 <<<"$item")
MYMAP["$NAME"]="$VALUE"
done <./config_file.txt
#size of map
MYMAP_N=${#MYMAP[#]}
#make a list of keys
KEYS=("${!MYMAP[#]}")
#dereference map
SELECTION="${MYMAP["my_first_key"]}"

If the values do not contain spaces, in bash using the array variable:
declare -a vars
eval "vars=(`echo line1; echo line2`)" # the `echo ...` simulates your command
echo number of values: ${#vars[#]}
for ((I = 0; I < ${#vars[#]}; ++I )); do
echo value \#$I is ${vars[$I]}
done
echo all values : ${vars[*]}
The trick is to generate the statement initializing the array with the values, and then eval it.
If the values have spaces/special characters, then escaping/quoting might be necessary.

read VAR1 VAR2 < <(sed -n '1p;$p' myfile | tr '\n' ' ')
This ought to do what you need it uses process substitution to print the lines you want and then redirects them to the variables, if you want different lines just build this statement as you need with a for loop putting whatever lines you want if you want them all use wc to count the lines, then build VAR1 .. VAR[n] and sed -n '1p;2p;3p..[n]p' and you then can eval the built statement.

Related

How to search the full string in file which is passed as argument in shell script?

i am passing a argument and that argument i have to match in file and extract the information. Could you please how I can get it?
Example:
I have below details in file-
iMedical_Refined_load_Procs_task_id=970113
HV_Rawlayer_Execution_Process=988835
iMedical_HV_Refined_Load=988836
DHS_RawLayer_Execution_Process=988833
iMedical_DHS_Refined_Load=988834
If I am passing 'hv' as argument so it should to pick 'iMedical_HV_Refined_Load' and give the result - '988836'
If I am passing 'dhs' so it should pick - 'iMedical_DHS_Refined_Load' and give the result = '988834'
I tried below logic but its not giving the result correctly. What Changes I need to do-
echo $1 | tr a-z A-Z
g=${1^^}
echo $g
echo $1
val=$(awk -F= -v s="$g" '$g ~ s{print $2}' /medaff/Scripts/Aggrify/sltconfig.cfg)
echo "TASK ID is $val"

Assuming your matching criteria is the first string after delimiter _ and the output needed is the numbers after the = char, then you can try this sed
$ sed -n "/_$1/I{s/[^=]*=\(.*\)/\1/p}" input_file
$ read -r input
hv
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988836
$ read -r input
dhs
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988834

If I'm reading it right, 2 quick versions -
$: cat 1
awk -F= -v s="_${1^^}_" '$1~s{print $2}' file
$: cat 2
sed -En "/_${1^^}_/{s/^.*=//;p;}" file
Both basically the same logic.
In pure bash -
$: cat 3
while IFS='=' read key val; do [[ "$key" =~ "_${1^^}_" ]] && echo "$val"; done < file
That's a lot less efficient, though.
If you know for sure there will be only one hit, all these could be improved a bit by short-circuit exits, but on such a small sample it won't matter at all. If you have a larger dataset to read, then I strongly suggest you formalize your specs better than "in this set I should get...".

Bash: How to count the number of occurrences of a string within a file?

I have a file that looks something like this:
dog
cat
dog
dog
fish
cat
I'd like to write some kind of code in Bash to make the file formatted like:
dog:1
cat:1
dog:2
dog:3
fish:1
cat:2
Any idea on how to do this? The file is very large (> 30K lines), so the code should be somewhat fast.
I am thinking some kind of loop...
Like this:
while read line;
echo "$line" >> temp.txt
val=$(grep $line temp.txt)
echo "$val" >> temp2.txt
done < file.txt
And then paste -d ':' file1.txt temp2.txt
However, I am concerned that this would be really slow, as you're going line-by-line. What do other people think?

You may use this simple awk to do this job for you:
awk '{print $0 ":" ++freq[$0]}' file
dog:1
cat:1
dog:2
dog:3
fish:1
cat:2

Here's what I came up with:
declare -A arr; while read -r line; do ((arr[$line]++)); echo "$line:${arr[$line]}" >> output_file; done < input_file
First, declare hash table arr. Then read every line in a for loop and increment the value in the array with the key of the read line. Then echo out the line, followed out by the value in the hashtable. Lastly append into a file 'out'.

Awk or sed are very powerful but it's not bash, here is the bash variant
raw=( $(cat file) ) # read file
declare -A index # init indexed array
for item in ${raw[#]}; { ((index[$item]++)); } # 1st loop through raw data to count items
for item in ${raw[#]}; { echo $item:${index[$item]}; } # 2nd loop change data

Iterative Bash Script Bug

Using a bash script, I'm trying to iterate through a text file that only has around 700 words, line-by-line, and run a case-insensitive grep search in the current directory using that word on particular files. To break it down, I'm trying to output the following to a file:
Append a newline to a file, then the searched word, then another newline
Append the results of the grep command using that search
Repeat steps 1 and 2 until all words in the list are exhausted
So for example, if I had this list.txt:
search1
search2
I'd want the results.txt to be:
search1:
grep result here
search2:
grep result here
I've found some answers throughout the stack exchanges on how to do this and have come up with the following implementation:
#!/usr/bin/bash
while IFS = read -r line;
do
"\n$line:\n" >> "results.txt";
grep -i "$line" *.in >> "results.txt";
done < "list.txt"
For some reason, however, this (and the numerous variants I've tried) isn't working. Seems trivial, but I'd it's been frustrating me beyond belief. Any help is appreciated.

Your script would work if you changed it to:
while IFS= read -r line; do
printf '\n%s:\n' "$line"
grep -i "$line" *.in
done < list.txt > results.txt
but it'd be extremely slow. See https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for why you should think long and hard before writing a shell loop just to manipulate text. The standard UNIX tool for manipulating text is awk:
awk '
NR==FNR { words2matches[$0]; next }
{
for (word in words2matches) {
if ( index(tolower($0),tolower(word)) ) {
words2matches[word] = words2matches[word] $0 ORS
}
}
}
END {
for (word in words2matches) {
print word ":" ORS words2matches[word]
}
}
' list.txt *.in > results.txt
The above is untested of course since you didn't provide sample input/output we could test against.

Possible problems:
bash path - use /bin/bash path instead of /usr/bin/bash
blank spaces - remove ' ' after IFS
echo - use -e option for handling escape characters (here: '\n')
semicolons - not required at end of line
Try following script:
#!/bin/bash
while IFS= read -r line; do
echo -e "$line:\n" >> "results.txt"
grep -i "$line" *.in >> "results.txt"
done < "list.txt"

You do not even need to write a bash script for this purpose:
INPUT FILES:
$ more file?.in
::::::::::::::
file1.in
::::::::::::::
abc
search1
def
search3
::::::::::::::
file2.in
::::::::::::::
search2
search1
abc
def
::::::::::::::
file3.in
::::::::::::::
abc
search1
search2
def
search3
PATTERN FILE:
$ more patterns
search1
search2
search3
CMD:
$ grep -inf patterns file*.in | sort -t':' -k3 | awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}'
OUTPUT:
search1
file1.in:2
file2.in:2
file3.in:2
search2
file2.in:1
file3.in:3
search3
file1.in:4
file3.in:5
EXPLANATIONS:
grep -inf patterns file*.in will grep all the file*.in with all the patterns located in patterns file thanks to -f option, using -i forces insensitive case, -n will add the line numbers
sort -t':' -k3 you sort the output with the 3rd column to regroup patterns together
awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}' then awk will print the display that you want by using : as Field Separator and Output Field Separator, you use a buffer variable to save the pattern (3rd field) and you print the pattern whenever it changes ($3!=buffer)

Dynamic indirect Bash array

I have logs in this format:
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
I want to create a list for every person in the format
"name"=("no.lines" "line" "line" ...)
For example:
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
I already have this structure and know how to create variables like
declare "${FIELD[1]}"=1
but I don't know how to increase number of records and I am getting an error if I want to create a list like this and append into it.
#!/bin/bash
F=("log1,john,time,etc" "log2,peter,time,etc" "log3,jack,time,etc" "log4,peter,time,etc")
echo "${F[#]}"
declare -a CLIENTS
for LINE in "${F[#]}"
do
echo "$LINE"
IFS=',' read -ra FIELD < <(echo "$LINE")
if [ -z "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # check if there is already record for given line, if not create
then
CLIENTS=("${CLIENTS[#]}" "${FIELD[1]}") # add person to list of variables records for later access
declare -a "${FIELD[1]}"=("1" "LINE") # ERROR
elif [ -n "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # if already record for client
then
echo "Increase records number" # ???
echo "Append record"
"${FIELD[#]}"=("${FIELD[#]}" "$LINE") # ERROR
else
echo "ELSE"
fi
done
echo -e "CLIENTS: \n ${CLIENTS[#]}"
echo "Client ${CLIENTS[0]} has ${!CLIENTS[0]} records"
echo "Client ${CLIENTS[1]} has ${!CLIENTS[1]} records"
echo "Client ${CLIENTS[2]} has ${!CLIENTS[2]} records"
echo "Client ${CLIENTS[3]} has ${!CLIENTS[3]} records"

Be warned: The below uses namevars, a new bash 4.3 feature.
First: I would strongly suggest namespacing your arrays with a prefix to avoid collisions with unrelated variables. Thus, using content_ as that prefix:
read_arrays() {
while IFS= read -r line && IFS=, read -r -a fields <<<"$line"; do
name=${fields[1]}
declare -g -a "content_${fields[1]}"
declare -n cur_array="content_${fields[1]}"
cur_array+=( "$line" )
unset -n cur_array
done
}
Then:
lines_for() {
declare -n cur_array="content_$1"
printf '%s\n' "${#cur_array[#]}" ## emit length of array for given person
}
...or...
for_each_line() {
declare -n cur_array="content_$1"; shift
for line in "${cur_array[#]}"; do
"$#" "$line"
done
}
Tying all this together:
$ read_arrays <<'EOF'
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
EOF
$ lines_for peter
2
$ for_each_line peter echo
log2,peter,time,etc
log4,peter,time,etc
...and, if you really want the format you asked for, with the number of columns as explicit data, and variable names that aren't safely namespaced, it's easy to convert from one to the other:
# this should probably be run in a subshell to avoid namespace pollution
# thus, (generate_stupid_format) >output
generate_stupid_format() {
for scoped_varname in "${!content_#}"; do
unscoped_varname="${scoped_varname#content_}"
declare -n unscoped_var=$unscoped_varname
declare -n scoped_var=$scoped_varname
unscoped_var=( "${#scoped_var[#]}" "${scoped_var[#]}" )
declare -p "$unscoped_varname"
done
}

Bash with Coreutils, grep and sed
If I understand your code right, you try to have multidimensional arrays, which Bash doesn't support. If I were to solve this problem from scratch, I'd use this mix of command line tools (see security concerns at the end of the answer!):
#!/bin/bash
while read name; do
printf "%s=(\"%d\" \"%s\")\n" \
"$name" \
"$(grep -c "$name" "$1")" \
"$(grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//')"
done < <(cut -d ',' -f 2 "$1" | sort -u)
Sample output:
$ ./SO.sh infile
jack=("1" "log3,jack,time,etc")
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
This uses process substitution to prepare the log file so we can loop over unique names; the output of the substitution looks like
$ cut -d ',' -f 2 "$1" | sort -u
jack
john
peter
i.e., a list of unique names.
For each name, we then print the summarized log line with
printf "%s=(\"%d\" \"%s\")\n"
Where
The %s string is just the name ("$name").
The log line count is the output of a grep command,
grep -c "$name" "$1"
which counts the number of occurrences of "$name". If the name can occur elsewhere in the log line, we can limit the search to just the second field of the log lines with
grep -c "$name" <(cut -d ',' -f 2 "$1")
Finally, to get all log lines on one line with proper quoting and all, we use
grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//'
This gets all lines containing "$name", replaces newlines with spaces, then surrounds the spaces with quotes and removes the extra quotes from the end of the line.
Pure Bash
After initially thinking that pure Bash would be too cumbersome, it turned out to be not all that complicated:
#!/bin/bash
declare -A count
declare -A lines
old_ifs=IFS
IFS=,
while read -r -a line; do
name="${line[1]}"
(( ++count[$name] ))
lines[$name]+="\"${line[*]}\" "
done < "$1"
for name in "${!count[#]}"; do
printf "%s=(\"%d\" %s)\n" "$name" "${count[$name]}" "${lines[$name]% }"
done
IFS="$old_ifs"
This updates two associative arrays while looping over the input file: count keeps track of the number of times a certain name occurs, and lines appends the log lines to an entry per name.
To separate fields by commas, we set the input field separator IFS to a comma (but save it beforehand so it can be reset at the end).
read -r -a reads the lines into an array line with comma separated fields, so the name is now in ${line[1]}. We increase the count for that name in the arithmetic expression (( ... )), and append (+=) the log line in the next line.
${line[*]} prints all fields of the array separated by IFS, which is exactly what we want. We also add a space here; the unwanted space at the end of the line (after the last element) will be removed later.
The second loop iterates over all the keys of the count array (the names), then prints the properly formatted line for each. ${lines[$name]% } removes the space from the end of the line.
Security concerns
As it seems that the output of these scripts is supposed to be reused by the shell, we might want to prevent malicious code execution if we can't trust the contents of the log file.
A way to do that for the Bash solution (hat tip: Charles Duffy) would be the following: the for loop would have to be replaced by
for name in "${!count[#]}"; do
IFS=' ' read -r -a words <<< "${lines[$name]}"
printf -v words_str '%q ' "${words[#]}"
printf "%q=(\"%d\" %s)\n" "$name" "${count[$name]}" "${words_str% }"
done
That is, we split the combined log lines into an array words, print that with the %q formatting flag into a string words_str and then use that string for our output, resulting in escaped output like this:
peter=("2" \"log2\,peter\,time\,etc\" \"log4\,peter\,time\,etc\")
jack=("1" \"log3\,jack\,time\,etc\")
john=("1" \"log1\,john\,time\,etc\")
The analogous could be done for the first solution.

You can use awk. As a demo:
awk -F, '{a1[$2]=a1[$2]" \""$0"\""; sum[$2]++} END{for (e in sum){print e"=(" "\""sum[e]"\""a1[e]")"}}' file
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
jack=("1" "log3,jack,time,etc")

concatenate the result of echo and a command output

I have the following code:
names=$(ls *$1*.txt)
head -q -n 1 $names | cut -d "_" -f 2
where the first line finds and stores all names matching the command line input into a variable called names, and the second grabs the first line in each file (element of the variable names) and outputs the second part of the line based on the "_" delim.
This is all good, however I would like to prepend the filename (stored as lines in the variable names) to the output of cut. I have tried:
names=$(ls *$1*.txt)
head -q -n 1 $names | echo -n "$names" cut -d "_" -f 2
however this only prints out the filenames
I have tried
names=$(ls *$1*.txt
head -q -n 1 $names | echo -n "$names"; cut -d "_" -f 2
and again I only print out the filenames.
The desired output is:
$
filename1.txt <second character>
where there is a single whitespace between the filename and the result of cut.
Thank you.

Best approach, using awk
You can do this all in one invocation of awk:
awk -F_ 'NR==1{print FILENAME, $2; exit}' *"$1"*.txt
On the first line of the first file, this prints the filename and the value of the second column, then exits.
Pure bash solution
I would always recommend against parsing ls - instead I would use a loop:
You can avoid the use of awk to read the first line of the file by using bash built-in functionality:
for i in *"$1"*.txt; do
IFS=_ read -ra arr <"$i"
echo "$i ${arr[1]}"
break
done
Here we read the first line of the file into an array, splitting it into pieces on the _.

Maybe something like that will satisfy your need BUT THIS IS BAD CODING (see comments):
#!/bin/bash
names=$(ls *$1*.txt)
for f in $names
do
pattern=`head -q -n 1 $f | cut -d "_" -f 2`
echo "$f $pattern"
done

If I didn't misunderstand your goal, this also works.
I've always done it this way, I just found out that this is a deprecated way to do it.
#!/bin/bash
names=$(ls *"$1"*.txt)
for e in $names;
do echo $e `echo "$e" | cut -c2-2`;
done

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

BASH convert a text file line by line into variables - linux

Related

How to search the full string in file which is passed as argument in shell script?

Bash: How to count the number of occurrences of a string within a file?

Iterative Bash Script Bug

Dynamic indirect Bash array

concatenate the result of echo and a command output

Categories

Resources