Get Values out of String in bash - string

I hope you can help me.
I am trying to split a string:
#!/bin/bash
file=$(<sample.txt)
echo "$file"
The File itself contains Values like this:
(;FF[4]GM[1]SZ[19]CA[UTF-8]SO[sometext]BC[cn]WC[ja]
What I need is a way to extract the Values between the [ ] and set them as variables, for Example:
$FF=4
$GM=1
$SZ=19
and so on
However, some Files do not contain all Values, so that in some cases there is no FF[*]. In this case the Program should use the Value of "99"
How do I have to do this?
Thank you so much for your help.
Greetings
Chris

It may be a bit overcomplicated, but here is another way:
grep -Po '[-a-zA-Z0-9]*' file | awk '!(NR%2) {printf "declare %s=\"%s\";\n", a,$0; next} {a=$0}' | bash
By steps
Filter file by printing only the needed blocks:
$ grep -Po '[-a-zA-Z0-9]*' a
FF
4
GM
1
SZ
19
CA
UTF-8
SO
sometext
BC
cn
WC
ja
Reformat so that it specifies the declaration:
$ grep -Po '[-a-zA-Z0-9]*' a | awk '!(NR%2) {printf "declare %s=\"%s\";\n", a,$0; next} {a=$0}'
declare FF="4";
declare GM="1";
declare SZ="19";
declare CA="UTF-8";
declare SO="sometext";
declare BC="cn";
declare WC="ja";
And finally pipe to bash so that it is executed.
Note that the second step could also be rewritten as
xargs -n2 | awk '{print "declare "$1"=\""$2"\";"}'

I'd write this, using ; or [ or ] as awk's field separators
$ line='(;FF[4]GM[1]SZ[19]CA[UTF-8]SO[sometext]BC[cn]WC[ja]'
$ awk -F '[][;]' '{for (i=2; i<NF; i+=2) {printf "%s=\"%s\" ", $i, $(i+1)}; print ""}' <<<"$line"
FF="4" GM="1" SZ="19" CA="UTF-8" SO="sometext" BC="cn" WC="ja"
Then, to evaluate the output in your current shell:
$ source <(!!)
source <(awk -F '[][;]' '{for (i=2; i<NF; i+=2) {printf "%s=\"%s\" ", $i, $(i+1)}; print ""}' <<<"$line")
$ echo $SO
sometext
To handle the default FF value:
$ source <(awk -F '[][;]' '{
print "FF=99"
for (i=2; i<NF; i+=2) printf "%s=\"%s\" ", $i, $(i+1)
print ""
}' <<< "(;A[1]B[2]")
$ echo $FF
99
$ source <(awk -F '[][;]' '{
print "FF=99"
for (i=2; i<NF; i+=2) printf "%s=\"%s\" ", $i, $(i+1)
print ""
}' <<< "(;A[1]B[2]FF[3]")
$ echo $FF
3

Per your request:
while IFS=\[ read -r A B; do
[[ -z $B ]] && B=99
eval "$A=\$B"
done < <(exec grep -oE '[[:alpha:]]+\[[^]]*' sample.txt)
Although using an associative array would be better:
declare -A VALUES
while IFS=\[ read -r A B; do
[[ -z $B ]] && B=99
VALUES[$A]=$B
done < <(exec grep -oE '[[:alpha:]]+\[[^]]*' sample.txt)
This way you have access to both the keys ("${!VALUES[@]}") and individual values such as "${VALUES['FF']}".
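A minimal sketch of how the associative-array idea could apply the default of 99 to keys that are missing from the file altogether, not just to keys with an empty value. The names parse_props and props are made up for this illustration, and it assumes bash 4 or later and keys made of letters only.

```shell
#!/usr/bin/env bash
# Collect every NAME[value] pair of one line into the associative array "props".
parse_props() {
    local rest=$1 re='([A-Za-z]+)\[([^]]*)\]'
    props=()
    while [[ $rest =~ $re ]]; do
        props[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
        # Drop everything up to and including the pair that just matched.
        rest=${rest#*"${BASH_REMATCH[0]}"}
    done
}

declare -A props
parse_props '(;GM[1]SZ[19]CA[UTF-8]SO[sometext]'

# ${var:-99} falls back to 99 when the key is absent or empty.
echo "FF=${props[FF]:-99} SZ=${props[SZ]:-99}"
```

To read the real file instead of the literal string, something like parse_props "$(<sample.txt)" should work, assuming the file holds a single header line.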

I would probably do something like this:
set -- $(sed -e 's/^(;//' sample.txt | tr '[][]' ' ')
while (( $# >= 2 ))
do
varname=${1}
varvalue=${2}
# do something to test varname and varvalue to make sure they're sane/safe
declare "${varname}=${varvalue}"
shift 2
done
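The "make sure they're sane/safe" comment in the loop above could be filled in roughly like this. set_checked is a hypothetical helper name; the sketch assumes that only plain identifiers are acceptable as variable names and uses bash's printf -v so the value is never evaluated as code.

```shell
#!/usr/bin/env bash
# Assign $2 to the variable named by $1, but only if $1 is a plain identifier.
set_checked() {
    if [[ ! $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
        printf 'skipping unsafe name: %s\n' "$1" >&2
        return 1
    fi
    printf -v "$1" '%s' "$2"    # stores the value literally, no eval involved
}

set_checked SZ 19
set_checked CA UTF-8
set_checked 'SZ; echo oops' 1 || true   # rejected, SZ keeps its value
echo "SZ=$SZ CA=$CA"
```

Inside the while loop, the declare line would then become a call such as set_checked "$varname" "$varvalue".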


Difficulty to create .txt file from loop in bash

I've this data :
cat >data1.txt <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr
EOF
cat >data2.txt <<'EOF'
2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr
EOF
cat >data3.txt <<'EOF'
2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr
EOF
I would like to create a .txt file for each filesystem (so hd1.txt, hd2.txt, hd3.txt and hd4.txt) and put in each .txt file the sum of the values for that filesystem from each dataX.txt. It is hard for me to explain in English what I want, so here is an example of the desired result.
Expected content for the output file hd1.txt:
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
Expected content for the file hd2.txt:
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
The implementation I've currently tried:
for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i
done
But it doesn't work...
Can you show me how to do that?
Because the format seems to be so constant, you can delimit the input with multiple separators and parse it easily in awk:
awk -v FS='[;/-]' '
prev != $8 {
if (length(output)) {
print output >> fileoutput
}
prev = $8
sum = 0
}
{
sum += $9
output = sprintf("%s-%s;/%s/%s;%d;/%s", $1, $2, $7, $8, sum, $11)
fileoutput = $8 ".txt"
}
END {
print output >> fileoutput
}
' *.txt
Tested on repl generates:
+ cat hd1.txt
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
+ cat hd2.txt
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
Alternatively, you could use -v FS=';' and call split on the first and second columns to extract the year and month and the hdX number.
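For what it's worth, the FS=';' plus split variant might look something like the sketch below. The function name sum_per_device and the output-directory argument are my own additions, and it sums per (month, device) pair regardless of how the lines are ordered, which is a slightly different technique from the streaming version above.

```shell
# Usage: sum_per_device OUTDIR FILE...
# Writes OUTDIR/hdX.txt with one "YYYY-MM;device;sum;mountpoint" line per month.
sum_per_device() {
    outdir=$1; shift
    awk -F';' -v outdir="$outdir" '
        {
            split($1, d, "-")             # d[1] = year, d[2] = month
            n = split($2, p, "/")         # p[n] = device name, e.g. hd1
            key = d[1] "-" d[2] ";" $2
            if (!(key in sum)) {          # remember first-seen order
                order[++count] = key
                file[key] = outdir "/" p[n] ".txt"
            }
            sum[key] += $3
            mount[key] = $4
        }
        END {
            for (i = 1; i <= count; i++) {
                k = order[i]
                print k ";" sum[k] ";" mount[k] > file[k]
            }
        }
    ' "$@"
}

# Small demonstration in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '2020-01-27-06-00;/dev/hd1;100;/' \
              '2020-01-27-12-00;/dev/hd1;100;/' \
              '2020-01-27-06-00;/dev/hd2;200;/usr' > "$workdir/data1.dat"
sum_per_device "$workdir" "$workdir/data1.dat"
cat "$workdir/hd1.txt" "$workdir/hd2.txt"
```

Because the output files are named hdX.txt, it is probably wise to keep them out of the *.txt glob used for the input, for example by writing them to a separate directory as done here.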
If you seek a bash solution, I suggest you invert the loops - first iterate over files, then over identifiers in second column.
for file in *.txt; do
prev=
output=
while IFS=';' read -r date dev num path; do
hd=$(basename "$dev")
if [[ "$hd" != "${prev:-}" ]]; then
if ((${#output})); then
printf "%s\n" "$output" >> "$fileoutput"
fi
sum=0
prev="$hd"
fi
sum=$((sum + num))
output=$(
printf "%s;%s;%d;%s" \
"$(cut -d'-' -f1-2 <<<"$date")" \
"$dev" "$sum" "$path"
)
fileoutput="${hd}.txt"
done < "$file"
printf "%s\n" "$output" >> "$fileoutput"
done
You could also almost translate awk to bash 1:1 by doing IFS='-;/' in while read loop.
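A rough sketch of what that IFS='-;/' splitting could look like. The function name split_records and the variable names are only illustrative; the two _ placeholders soak up the empty fields that the adjacent ";/" produces, and the last variable receives whatever remains of the mount point.

```shell
#!/usr/bin/env bash
# Read records such as 2020-01-27-06-00;/dev/hd1;100;/ from stdin and
# print month, device name, value and mount point.
split_records() {
    local year month day hour minute dir hd num mount
    while IFS='-;/' read -r year month day hour minute _ dir hd num _ mount; do
        printf '%s-%s %s %s /%s\n' "$year" "$month" "$hd" "$num" "$mount"
    done
}

split_records <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
EOF
```

From there the summing logic is the same as in the loop above, just without the calls to basename and cut.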

Hex compare in bash scripting

I am facing an issue when I read the third word (a hex string) of each line in a text file and compare it with a hex number. Can someone please help me with it?
#!/bin/bash
A=$1
cat $A | while read a; do
a1=$(echo \""$a"\" | awk '{ print $3 }')
#echo $a > cut -d " " -f 3
echo $a1
(("$a1" == 0x10F7))
echo $?
done
But when I use the following, the comparison works correctly:
a1=0xADCAFE
(( "$a1" == 0x10F7 ))
echo $?
Then why does it fail when I read the value like below?
a1=$(echo \""$a"\" | awk '{ print $3 }')
or
a1=$(echo $a | awk '{ print $3 }')
echo $a prints the intended hex value, but the comparison does not succeed.
Regards,
Running Awk inside a while read loop is an antipattern. Just do the loop in Awk; it's good at that.
awk '$3 == 4343' "$1"
If you want to compare against a string whose value is "0x10F7" then it's
awk '$3 == "0x10F7"' "$1"
If you want to match either, case insensitively etc, a regex is a good way to do that.
awk '$3 ~ /^(0x10[Ff]7|4343)$/' "$1"
Notice how the $1 in double quotes is handled by the shell, and gets replaced by a (properly quoted!) copy of the script's first command-line argument before Awk runs, while the Awk script in single quotes has its own namespace, so $3 is an Awk variable which refers to the third field in the current input line.
Either way, avoid the useless use of cat and always always always quote variables which contain file names with double quotes.
That's literal double quotes. You seem to have tried both a dangerous bare $a and a doubly double-quoted "\"$a\"" where the simple "$a" would be what you actually want.
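If you would rather stay in bash and compare numerically (so that 0x10F7, 0x10f7 and 4343 all count as equal), a sketch along these lines might do. match_third_field is a name I made up; the sketch lets read split the fields, strips a possible carriage return from DOS line endings, and assumes the value to compare against is trusted, since bash evaluates it arithmetically.

```shell
#!/usr/bin/env bash
# Print the first three fields of every stdin line whose third field
# is numerically equal to $1 (hex with 0x prefix, or decimal).
match_third_field() {
    local want=$1 f1 f2 f3 rest
    while read -r f1 f2 f3 rest; do
        f3=${f3%$'\r'}                                   # tolerate DOS line endings
        # Skip anything that is not a clean hex or decimal literal.
        [[ $f3 =~ ^(0[xX][0-9A-Fa-f]+|0|[1-9][0-9]*)$ ]] || continue
        if (( f3 == want )); then
            printf '%s %s %s\n' "$f1" "$f2" "$f3"
        fi
    done
}

printf 'row1 x 0x10F7\nrow2 y 0xADCAFE\nrow3 z 4343\n' | match_third_field 0x10F7
```

A call such as match_third_field 0x10F7 < "$1" would process the file given as the script's first argument.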
Thank you all for your responses. My script is now working fine. I was trying to match two files; the script below serves the purpose:
#!/bin/bash
A=$1
B=$2
dos2unix -f "$A"
dos2unix -f "$B"
rm search_match.txt search_data_match.txt search_nomatch.txt search_data_nomatch.txt
while read line;do
search_word=$(echo $line | awk '{ print $1 }')
grep "$search_word" $B >> temp_file.txt
while read var;do
file1_hex=$(echo $line | awk '{ print $2 }')
file2_hex=$(echo $var | awk '{ print $3 }')
(("$file1_hex" == "$file2_hex"))
zero=$(echo $?)
if [ "$zero" -eq 0 ] ; then
echo $line >> search_match.txt
echo $var >> search_data_match.txt
else
echo $line >> search_nomatch.txt
echo $var >> search_data_nomatch.txt
fi
done < "temp_file.txt"
rm temp_file.txt
done < "$A"

Bash command rev to reverse delimiters

I am working on a shell script that converts exported Microsoft in-addr.arpa.txt files to a more useful format so that I can use it in other products for automation purposes. Now I have run into a problem which I (not being a programmer) cannot solve in a simple way.
Sample script
x=123.223.224
echo "$x" | rev
gives me
422.322.321
but i want to have the output as follow:
224.223.123
Is there an easy way to do it without rev or putting each group in a variable? Or is there a sample I can use? Or maybe I am using the wrong tools?
Using awk:
x='123.223.224'
awk 'BEGIN{FS=OFS="."} {for (i=NF; i>=2; i--) printf $i OFS; print $1}' <<< "$x"
224.223.123
Use awk for this!
If your text file always contains three octets, simply use . as separator:
echo $x | awk -F. '{ print $3 "." $2 "." $1 }'
For more complex cases, use internal split():
echo $x | awk '{
n = split($0, a, ".");
for(i = n; i > 1; i--) {
printf "%s.", a[i];
}
print a[1]; }'
In this sample, split() splits every line (passed as $0) using the delimiter ., saves the resulting array into a, and returns the length of that array (saved to n). Note that, unlike in C, split() array indexes start at one.
Or python:
python3 -c "print('.'.join(reversed('$x'.split('.'))))"
Here is my script.
#!/bin/sh
value=$1
delim=$2
# Number of fields = number of delimiters + 1
total_fields=$(echo "$value" | tr -cd "$delim" | wc -c)
total_fields=$((total_fields + 1))
reverse_value=""
# Walk the fields from last to first and append them to the result
while [ "$total_fields" -gt 0 ]; do
cur_value=$(echo "$value" | cut -d"$delim" -f"$total_fields")
if [ "$total_fields" -ne 1 ]; then
cur_value="$cur_value$delim"
fi
reverse_value="$reverse_value$cur_value"
total_fields=$((total_fields - 1))
done
echo "$reverse_value"
Using a few small tools.
tr '.' '\n' <<< "$x" | tac | paste -sd.
224.223.123
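For completeness, a bash-only sketch that needs no external tools. reverse_fields is a hypothetical function name, the delimiter defaults to a dot, and it assumes a single-character delimiter and a value without newlines.

```shell
#!/usr/bin/env bash
# Print the fields of $1 in reverse order, split and joined on $2 (default ".").
reverse_fields() {
    local value=$1 delim=${2:-.} out= i
    local -a parts
    IFS=$delim read -r -a parts <<< "$value"    # split into an array
    for (( i = ${#parts[@]} - 1; i >= 0; i-- )); do
        out+=${parts[i]}
        (( i > 0 )) && out+=$delim              # no trailing delimiter
    done
    printf '%s\n' "$out"
}

reverse_fields 123.223.224
```

For the reverse-zone use case the result can be captured with something like rev_net=$(reverse_fields "$x").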

How to re-structure the file using AWK?

I have written code to restructure a CSV file based on a control file. The control file looks like this:
Control file :
1,column1
3,column3
6,column6
4,column4
-1,column9
Based on the above control file, I take the columns at indexes 1, 3, 6 and 4 from the source.csv file and create a new file using the paste command. If the index value in the control file is -1, I have to insert the entire column as null, and the header name will be column9.
Code :
var=1
while read line
do
t=$(echo $line | awk '{ print $1}' | cut -d, -f1)
if [ $t != -1 ]
then
cut -d, -f$t source.csv >file_$var.csv
else
touch file_$var.csv
fi
var=$((var+1))
done < "$file"
ls -v file_*.csv | xargs paste -d, > new_file.csv
Is there a way to convert these lines into AWK? Please suggest some ideas.
Before Running script:
sample.csv
column1,column2,column3,column4,column5,column6,column7
a,b,c,d,e,f,g
Output:
new_file.csv
column1,column3,column6,column4,column9
a,c,f,d,
column9 has index -1, which means null; an empty field between , separators also means null.
Basic intention is to restructure the source file based on the control file.
Script:
#Greenplum Database details to read target file structure from Meta Data Tables.
export PGUSER=xxx
export PGPORT=5432
export PGHOST=10.100.20.10
export PGDATABASE=fff
SCHEMA='jiodba'
##Function to explain usage of this script
usage() {
echo "Usage: program.sh -s <Source_folder> -t <Target_folder> -f <file_name> ";
exit 1; }
source_folder=$1
target_folder=$2
file_name=$3
#removes the existing file from current directory
rm -f file_struct_*.csv
# Reading the Header from the Source file.
v_source_header=`head -1 $file_name`
IFS="," # Set the field separator
set $v_source_header # Breaks the string into $1, $2, ...
i=1
for item # A for loop by default loop through $1, $2, ...
do
echo "$i,$item">>source_header.txt
((i++))
done
sed -e "s/\r//" source_header.txt | sed -e "s/ \{1,\}$//" > source_headers.txt
rm -f source_header.txt
#Get the Target header information from Greenplum Meta data Table and writing into target_header.txt file.
psql -t -A -F "," -c "select Target_column_position,Target_column_name from jiodba.etl_tbl_sequencing where source_file_name='$file_name' order by target_column_position" > target_header.txt
#Removing the trail space and control characters.
sed -e "s/\r//" target_header.txt | sed -e "s/ \{1,\}$//" > target_headers.txt
rm -f target_header.txt
#Compare the Source Header Target Structure and generate the Difference.
awk -F, 'NR==FNR{a[$2]=$1;next} {if ($2 in a) print a[$2]","$2; else print "-1," $2}' source_headers.txt target_headers.txt >>tgt_struct_output.txt
#Loop to Read column index from the tgt_struct_output.txt and cut it in Source file.
file='tgt_struct_output.txt'
var=1
while read line
do
t=$(echo $line | awk '{ print $1}' | cut -d, -f1)
if [ $t != -1 ]
then
cut -d, -f$t $file_name>file_struct_$var.csv
else
touch file_struct_$var.csv
fi
var=$((var+1))
done<"$file"
awk -F, -v OFS=, 'FNR==NR {c[++n]=$2; a[$2]=$1;next} FNR==1{f=""; for (i=1; i<=n; i++)
{printf "%s%s", f, c[i]; b[++k]=i; f=OFS} print "";next}
{for (i=1; i<=n; i++) if(a[c[i]]>0) printf "%s%s", $a[c[i]], OFS; print""
}' tgt_struct_output.txt $file_name
#Paste the different file(columns)into single file
ls -v file_struct_*.csv | xargs paste -d,| sed -e "s/\r//" > new_file.csv
new_header=`cut -d "," -f 2 target_headers.txt | tr "\n" "," | sed 's/,$//'`
#Replace the header with original target header incase if column doesnt exit in the target table structure.
sed "1s/.*/$new_header/" new_file.csv
#Removing the Temp files.
rm -f file_struct_*.csv
rm -f source_headers.txt target_headers.txt tgt_struct_output.txt
touch file_struct_1.csv #Just to avoid the error in shell
Sample.csv
BP ID,Prepaid Account No,CurrentMonetary balance ,charge Plan names ,Provider contract id,Contract Item ID,Start Date,End Date
1100001538,001000002506,251,[B2] R2 LTE CHARGE PLAN ,00000000000000000141,[B2] R2 LTE CHARGE PLAN _00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DB4A,16-Oct-12,18-Oct-12
1100003404,001000004029,45.22,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000009349,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DA2E,16-Nov-13,23-Nov-13
1100006545,001000006620,388.796,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000010477,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155S00E20D1ED398FD63624498DA2E,07-Nov-12,07-Nov-13
You can try this awk:
awk -F, -v OFS=, 'FNR==NR {c[++n]=$2; a[$2]=$1;next} FNR==1{f=""; for (i=1; i<=n; i++)
{printf "%s%s", f, c[i]; b[++k]=i; f=OFS} print "";next}
{for (i=1; i<=n; i++) if(a[c[i]]>0) printf "%s%s", $a[c[i]], OFS; print""
}' ctrl.csv sample.csv
column1,column3,column6,column4,column9
a,c,f,d,
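One thing to be aware of: the awk above prints nothing at all for a -1 column, which happens to give the right result here because column9 is last. A variant that emits an empty field wherever the -1 appears might look like the sketch below. The wrapper name restructure is hypothetical, and like the original it assumes plain comma-separated data without quoted fields.

```shell
# Usage: restructure CONTROL_FILE SOURCE_CSV
# CONTROL_FILE holds "index,name" lines; an index of -1 yields an empty column.
restructure() {
    awk -F, -v OFS=, '
        NR == FNR { idx[++n] = $1; name[n] = $2; next }   # first file: control
        {
            line = ""
            for (i = 1; i <= n; i++) {
                if (FNR == 1)        cell = name[i]       # header from control file
                else if (idx[i] > 0) cell = $(idx[i])     # copy the source column
                else                 cell = ""            # index -1: empty field
                line = line (i > 1 ? OFS : "") cell
            }
            print line
        }
    ' "$1" "$2"
}

# Example call: restructure ctrl.csv sample.csv > new_file.csv
```

With the control file from the question this should again produce a,c,f,d, for the data row, and a control file with -1 in the middle would give two adjacent commas at that position.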

How to make a bash copy of an awk variable?

Here is a simplified version of my problem.
if (echo "AA BB CC" | awk '{ print $1 $2 }' | grep -q "B"); then
echo $2
fi
I would like to make $2 available in bash, so I can use it elsewhere in the script.
Can that be done?
Update
I realized that I had simplified the problem too much. The awk expression should have been awk '{ print $1 $2 }' instead of just awk '{ print $2 }' which I originally posted.
You can use set:
set -- `echo "AA BB CC" | awk '{print $2}'`
case $1 in *B*) echo $1;; esac
... or if you used the awk just to split the output, let set do that part as well:
set -- `echo "AA BB CC"`
case $2 in *B*) echo $2;; esac
Remember the output of awk, test it for the regular expression and print it:
output=$( echo "AA BB CC" | awk '{ print $2 }' )
if grep -q B <<< "$output" ; then echo "$output" ; fi
You can capture stdout into a variable by using the backtick operator, e.g.
a=`echo foo`
echo $a
For your example, it would be something like:
a=`echo "AA BB CC" | awk '{ print $2 }' | grep "B"`
echo $a
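Regarding the update to the question (testing $1 $2 but keeping $2): one possible sketch is to have awk print both pieces and let read pick them apart. The variable names joined and second are made up here, and the approach assumes the fields contain no whitespace.

```shell
#!/usr/bin/env bash
# awk prints the concatenation to test and the field to keep, separated by a space.
read -r joined second < <(echo "AA BB CC" | awk '{ print $1 $2, $2 }')

if [[ $joined == *B* ]]; then
    echo "$second"
fi
```

Process substitution is used instead of a pipe into read so that the variables remain set in the current shell and can be used later in the script.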
