Difficulty creating .txt files from a loop in bash - linux

I have this data:
cat >data1.txt <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr
EOF
cat >data2.txt <<'EOF'
2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr
EOF
cat >data3.txt <<'EOF'
2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr
EOF
I would like to create a .txt file for each filesystem (so hd1.txt, hd2.txt, hd3.txt and hd4.txt) and put in each .txt file the sum of the values for that filesystem from each dataX.txt. I have some difficulty explaining in English what I want, so here is an example of the desired result.
Expected content for the output file hd1.txt:
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
Expected content for the file hd2.txt:
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
The implementation I've currently tried:
for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i
done
But it doesn't work...
Can you show me how to do that?

Because the format is so regular, you can split the input on multiple separators and parse it easily in awk:
awk -v FS='[;/-]' '
# $9 (the value column) changes exactly when the (month, filesystem) group changes in this data
prev != $9 {
    if (length(output)) {
        print output >> fileoutput
    }
    prev = $9
    sum = 0
}
{
    sum += $9
    output = sprintf("%s-%s;/%s/%s;%d;/%s", $1, $2, $7, $8, sum, $11)
    fileoutput = $8 ".txt"
}
END {
    print output >> fileoutput
}
' *.txt
Tested on repl.it, this generates:
+ cat hd1.txt
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
+ cat hd2.txt
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
Alternatively, you could use -v FS=';' and use split() on the first and second columns to extract the year-month and the hdX name.
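For example, a minimal sketch of that variant, assuming the dataX.txt layout shown above; note it aggregates into arrays, so the order of the lines written to each hdX.txt is unspecified:
awk -F';' '
{
    split($1, d, "-")               # d[1]=year, d[2]=month
    n = split($2, p, "/")           # p[n] is the hdX name
    key = d[1] "-" d[2] ";" $2      # e.g. "2020-01;/dev/hd1"
    sum[key] += $3
    path[key] = $4
    file[key] = p[n] ".txt"
}
END {
    for (k in sum)                  # for-in iteration order is unspecified
        print k ";" sum[k] ";" path[k] >> (file[k])
}' *.txt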
If you seek a bash solution, I suggest you invert the loops - first iterate over the files, then over the identifiers in the second column.
for file in *.txt; do
    prev=
    output=
    while IFS=';' read -r date dev num path; do
        hd=$(basename "$dev")
        if [[ "$hd" != "${prev:-}" ]]; then
            if ((${#output})); then
                printf "%s\n" "$output" >> "$fileoutput"
            fi
            sum=0
            prev="$hd"
        fi
        sum=$((sum + num))
        output=$(
            printf "%s;%s;%d;%s" \
                "$(cut -d'-' -f1-2 <<<"$date")" \
                "$dev" "$sum" "$path"
        )
        fileoutput="${hd}.txt"
    done < "$file"
    printf "%s\n" "$output" >> "$fileoutput"
done
You could also translate the awk to bash almost 1:1 by using IFS='-;/' in the while read loop.
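A rough sketch of that translation, assuming the exact field layout shown above:
for file in *.txt; do
    prev= output=
    # IFS splits on '-', ';' and '/' alike; the _ fields swallow day, time and empties
    while IFS='-;/' read -r y m _ _ _ _ dev hd num _ path; do
        if [[ $hd != "$prev" ]]; then
            [[ -n $output ]] && printf '%s\n' "$output" >> "$fileoutput"
            prev=$hd sum=0
        fi
        (( sum += num ))
        output="$y-$m;/$dev/$hd;$sum;/$path"
        fileoutput="$hd.txt"
    done < "$file"
    [[ -n $output ]] && printf '%s\n' "$output" >> "$fileoutput"
done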

Related

Printing awk output in same line after grep

I have a very crude script, getinfo.sh, that gets me information from all files named FILENAME1 and FILENAME2 in all subfolders, along with the path of each subfolder. The awk part should only pick the nth line from FILENAME2 when the script is called as "getinfo.sh n". I want all the info printed on one line!
The problem is that if I use print instead of printf, the info is written on a new line, but my script works. If I use printf, I can see the last bit of the awk output in the command prompt after the script is done, but it is not pasted after the grep output on the same line. All in all the complete line will be pretty long, but that is intentional. Would you be willing to tell me what I am doing wrong?
#!/bin/bash
IFS=$'\n'
while read -r fname; do
    pushd $(dirname "${fname}") > /dev/null
    printf '%q' "${PWD##*/}"
    grep 'Search_term ' FILENAME1 | tail -1
    awk '{ if(NR==n) printf "%s",$0 }' n=$1 $2 FILENAME2
    popd > /dev/null
done < <(find . -type f -name 'FILENAME1')
I would also be happy to grep the nth line if this is easier?
SOLUTION:
#!/bin/bash
IFS=$'\n'
while read -r fname; do
    pushd $(dirname "${fname}") > /dev/null
    {
        printf '%q' "${PWD##*/}"
        grep 'Search_term' FILENAME1 | tail -1
    } | tr -d '\n'
    if [ "$1" -eq "$1" ] 2>/dev/null
    then
        awk '{ if(NR==n) printf "%s",$0 }' n="$1" FILENAME2
    fi
    printf "\n"
    popd > /dev/null
done < <(find . -type f -name 'FILENAME1')
You made it clearer in the comments.
I want the output of printf '%q' "${PWD##*/}" and grep 'Search_term ' FILENAME1 | tail -1 and awk '{ if(NR==n) printf "%s",$0 }' n=$1 $2 FILENAME2 to be printed in one line
So, first, we have three commands that each print a single line of output. As the exact commands do not matter, let's wrap them in functions to simplify the answer:
cmd1() { printf '%q\n' "${PWD##*/}"; }
cmd2() { grep .... ; }
cmd3() { awk ....; }
To print them without newlines between them, we can:
Use a command substitution, which removes trailing newlines. With some printf:
printf "%s%s%s\n" "$(cmd1)" "$(cmd2)" "$(cmd3)"
or some echo:
echo "$(cmd1) $(cmd2) $(cmd3)"
or append to a variable:
str="$(cmd1)"
str+=" $(cmd2)"
str+=" $(cmd3)"
printf" %s\n" "$str"
and so on.
We can remove newlines from the stream, using tr -d '\n':
{
cmd1
cmd2
cmd3
} | tr -d '\n'
echo # newlines were removed, so add one to the end.
or we can also remove the newlines only from the first n-1 commands, but I think this is less readable:
{
cmd1
cmd2
} | tr -d '\n'
cmd3 # the trailing newline will be added by cmd3
If I do not pass a number, the awk command should be omitted.
I see that your awk command expands both $1 and $2, but only $1 is passed to awk, as the n=$1 variable assignment. I don't know what $2 is. You can write ifs on the value of $#, the number of arguments:
if (($# == 2)); then
awk '{ if(NR==n) printf "%s",$0 }' n="$1" "$2" FILENAME2
fi
and similar for each case you want to handle. Remember about proper quoting.
Your command includes the unused parameter $2; I deleted it.
You could add the trailing newline inside awk with an END block, but you also want that newline when you call the script without a line number, so a plain echo will do.
#!/bin/bash
IFS=$'\n'
while read -r fname; do
    pushd "$(dirname "${fname}")" > /dev/null
    # Add result of grep in same printf statement
    printf '%s %s' "${PWD##*/}" "$(grep 'Search_term ' FILENAME1 | tail -1)"
    if (( $# == 1 )); then
        # use $1 as an awk variable, the line number n
        awk -v n="$1" '{ if(NR==n) printf "%s ",$0 }' FILENAME2
    fi
    # Add line-ending
    echo
    popd > /dev/null
done < <(find . -type f -name 'FILENAME1')

Increment variable when matched awk from tail

I'm monitoring an actively written-to file.
My current solution is:
ws_trans=0
sc_trans=0
tail -F /var/log/file.log | \
while read LINE
do
    echo $LINE | grep -q -e "enterpriseID:"
    if [ $? = 0 ]
    then
        ((ws_trans++))
    fi
    echo $LINE | grep -q -e "sc_ID:"
    if [ $? = 0 ]
    then
        ((sc_trans++))
    fi
    printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
However, when attempting to do this with awk I don't get the output; $ws_trans and $sc_trans remain 0:
ws_trans=0
sc_trans=0
tail -F /var/log/file.log | \
while read LINE
do
    echo $LINE | awk '/enterpriseID:/ {++ws_trans} END {print | ws_trans}'
    echo $LINE | awk '/sc_ID:/ {++sc_trans} END {print | sc_trans}'
    printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
I'm attempting to do this to reduce load. I understand that awk doesn't share bash's variables, which can get quite confusing, and the only reference I found covers a non-tail application of awk.
How can I assign the awk variables to the bash ws_trans and sc_trans? Is there a better solution? (There are other search terms being monitored.)
You need to pass the variables using the option -v, for example:
$ var=0
$ printf %d\\n {1..10} | awk -v awk_var=${var} '{++awk_var} {print awk_var}'
To set the variable "back" you could use declare, for example:
$ declare $(printf %d\\n {1..10} | awk -v awk_var=${var} '{++awk_var} END {print "var=" awk_var}')
$ echo $var
10
Your script could be rewritten like this:
ws_trans=0
sc_trans=0
tail -F /var/log/system.log |
while read LINE
do
    declare $(echo $LINE | awk -v ws=${ws_trans} '/enterpriseID:/ {++ws} END {print "ws_trans="ws}')
    declare $(echo $LINE | awk -v sc=${sc_trans} '/sc_ID:/ {++sc} END {print "sc_trans="sc}')
    printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
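Since the goal is to reduce load, also note that the loop above still starts two awk processes per input line; a single long-running awk could keep the counters itself. A minimal sketch, with the counts living inside awk rather than in bash variables:
tail -F /var/log/file.log | awk '
    /enterpriseID:/ { ws++ }
    /sc_ID:/        { sc++ }
    {
        printf "\r WSTRANS: %d \t\t SCTRANS: %d", ws, sc
        fflush()        # flush stdout so the counters update live
    }
'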

Total count of the array values

Here I'm accepting a few mount points from the user and using each value to get the space available on the host.
./user_input.ksh -string /m01,/m02,/m03
#!/bin/ksh
STR=$2
function showMounts {
    echo "$STR"
    arr=($(tr ',' ' ' <<< "$STR"))
    printf "%s\n" "$(arr[@]}"
    for x in "${arr[@]}"
    do
        free_space=`df -h "$x" | grep -v "Avail" | awk '{print $4}'`
        echo "$x": free_space "$free_space"
    done
    #echo "$total_free_space"
}
Problems:
How can I exit the for loop if any of the user-input mount points is not available? Currently it only adds an error to the log.
How do I get total_free_space (i.e. the sum of each free_space)?
If you want to keep your code, test this (no ksh here). If you don't care, read Ed Morton's answer.
./user_input.ksh -string /m01,/m02,/m03
#!/bin/ksh
STR=$2
function showMounts {
    echo "$STR"
    arr=($(tr ',' ' ' <<< "$STR"))
    printf "%s\n" "${arr[@]}"
    for x in "${arr[@]}"; do
        free_space=$(df -P "$x" | awk 'NR > 1 && !/Avail/{print $4}')
        echo "$x: free_space $free_space"
        ((total_free_space+=$free_space))
    done
    echo "$((total_free_space/1024/1000))G"
}
showMounts
Caution:
"${arr[#]}"
not
"$(arr[#]}"
As I said in your last question, you do not need ANY of that, all you need is a one-liner like:
df -h "${STR//,/ }" | awk '/^ /{print $5, $3; sum+=$3} END{print sum}'
I have to say "like" because you haven't shown us the df -h /m01 /m02 /m03 output yet so I don't know exactly how to parse it.

How to re-structure the file using AWK?

I have written code to restructure a csv file based on a control file. The control file looks like below.
Control file :
1,column1
3,column3
6,column6
4,column4
-1,column9
Based on the above control file, I take columns 1, 3, 6, 4 and -1 from source.csv and create a new file using the paste command. If an index value in the control file is -1, I have to insert an entire null column, and its header name will be column9.
Code:
var=1
while read line
do
    t=$(echo $line | awk '{ print $1}' | cut -d, -f1)
    if [ $t != -1 ]
    then
        cut -d, -f$t source.csv > file_$var.csv
    else
        touch file_$var.csv
    fi
    var=$((var+1))
done < "$file"
ls -v file_*.csv | xargs paste -d, > new_file.csv
Is there a way to convert these lines into awk? Please suggest some ideas.
Before running the script:
sample.csv
column1,column2,column3,column4,column5,column6,column7
a,b,c,d,e,f,g
Output:
new_file.csv
column1,column3,column6,column4,column9
a,c,f,d,
column9 has index -1, which indicates null; the trailing comma with nothing after it represents that null column.
Basic intention is to restructure the source file based on the control file.
Script:
#Greenplum Database details to read target file structure from Meta Data Tables.
export PGUSER=xxx
export PGPORT=5432
export PGHOST=10.100.20.10
export PGDATABASE=fff
SCHEMA='jiodba'
##Function to explain usage of this script
usage() {
    echo "Usage: program.sh -s <Source_folder> -t <Target_folder> -f <file_name>"
    exit 1
}
source_folder=$1
target_folder=$2
file_name=$3
#removes the existing file from current directory
rm -f file_struct_*.csv
# Reading the Header from the Source file.
v_source_header=`head -1 $file_name`
IFS="," # Set the field separator
set $v_source_header # Breaks the string into $1, $2, ...
i=1
for item    # a for loop by default loops through $1, $2, ...
do
    echo "$i,$item" >> source_header.txt
    ((i++))
done
sed -e "s/
//" source_header.txt | sed -e "s/ \{1,\}$//" > source_headers.txt
rm -f source_header.txt
#Get the Target header information from Greenplum Meta data Table and writing into target_header.txt file.
psql -t -A -F "," -c "select Target_column_position,Target_column_name from jiodba.etl_tbl_sequencing where source_file_name='$file_name' order by target_column_position" > target_header.txt
#Removing the trail space and control characters.
sed -e "s/
//" target_header.txt | sed -e "s/ \{1,\}$//" > target_headers.txt
rm -f target_header.txt
#Compare the Source Header Target Structure and generate the Difference.
awk -F, 'NR==FNR{a[$2]=$1;next} {if ($2 in a) print a[$2]","$2; else print "-1," $2}' source_headers.txt target_headers.txt >>tgt_struct_output.txt
#Loop to Read column index from the tgt_struct_output.txt and cut it in Source file.
file='tgt_struct_output.txt'
var=1
while read line
do
    t=$(echo $line | awk '{ print $1}' | cut -d, -f1)
    if [ $t != -1 ]
    then
        cut -d, -f$t $file_name > file_struct_$var.csv
    else
        touch file_struct_$var.csv
    fi
    var=$((var+1))
done < "$file"
awk -F, -v OFS=, 'FNR==NR {c[++n]=$2; a[$2]=$1;next} FNR==1{f=""; for (i=1; i<=n; i++)
{printf "%s%s", f, c[i]; b[++k]=i; f=OFS} print "";next}
{for (i=1; i<=n; i++) if(a[c[i]]>0) printf "%s%s", $a[c[i]], OFS; print""
}' tgt_struct_output.txt $file_name
#Paste the different file(columns)into single file
ls -v file_struct_*.csv | xargs paste -d, | sed -e "s/\r//" > new_file.csv
new_header=`cut -d "," -f 2 target_headers.txt | tr "\n" "," | sed 's/,$//'`
#Replace the header with the original target header in case a column doesn't exist in the target table structure.
sed "1s/.*/$new_header/" new_file.csv
#Removing the Temp files.
rm -f file_struct_*.csv
rm -f source_headers.txt target_headers.txt tgt_struct_output.txt
touch file_struct_1.csv #Just to avoid the error in shell
Sample.csv
BP ID,Prepaid Account No,CurrentMonetary balance ,charge Plan names ,Provider contract id,Contract Item ID,Start Date,End Date
1100001538,001000002506,251,[B2] R2 LTE CHARGE PLAN ,00000000000000000141,[B2] R2 LTE CHARGE PLAN _00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DB4A,16-Oct-12,18-Oct-12
1100003404,001000004029,45.22,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000009349,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DA2E,16-Nov-13,23-Nov-13
1100006545,001000006620,388.796,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000010477,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155S00E20D1ED398FD63624498DA2E,07-Nov-12,07-Nov-13
You can try this awk:
awk -F, -v OFS=, '
    # first file (the control file): remember column order and index each name
    FNR==NR { c[++n]=$2; a[$2]=$1; next }
    # header line of the data file: print the target header names
    FNR==1  { f=""; for (i=1; i<=n; i++) { printf "%s%s", f, c[i]; f=OFS } print ""; next }
    # data lines: print the requested source columns; index -1 yields an empty field
    { for (i=1; i<=n; i++) if (a[c[i]]>0) printf "%s%s", $a[c[i]], OFS; print "" }
' ctrl.csv sample.csv
column1,column3,column6,column4,column9
a,c,f,d,
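If the result should land in new_file.csv as in your script, just redirect the output (restruct.awk here is a hypothetical file holding the program above):
awk -F, -v OFS=, -f restruct.awk ctrl.csv sample.csv > new_file.csv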

Get Values out of String in bash

I hope you can help me.
I'm trying to split a string:
#!/bin/bash
file=$(<sample.txt)
echo "$file"
The file itself contains values like this:
(;FF[4]GM[1]SZ[19]CA[UTF-8]SO[sometext]BC[cn]WC[ja]
What I need is a way to extract the values between the [ ] and set them as variables, for example:
$FF=4
$GM=1
$SZ=19
and so on
However, some files do not contain all values, so in some cases there is no FF[*]. In that case the program should use the value 99.
How can I do this?
Thank you so much for your help.
Greetings
Chris
It may be a bit overcomplicated, but here is another way:
grep -Po '[-a-zA-Z0-9]*' file | awk '!(NR%2) {printf "declare %s=\"%s\";\n", a,$0; next} {a=$0}' | bash
By steps
Filter file by printing only the needed blocks:
$ grep -Po '[-a-zA-Z0-9]*' a
FF
4
GM
1
SZ
19
CA
UTF-8
SO
sometext
BC
cn
WC
ja
Reformat so that it specifies the declaration:
$ grep -Po '[-a-zA-Z0-9]*' a | awk '!(NR%2) {printf "declare %s=\"%s\";\n", a,$0; next} {a=$0}'
declare FF="4";
declare GM="1";
declare SZ="19";
declare CA="UTF-8";
declare SO="sometext";
declare BC="cn";
declare WC="ja";
And finally pipe to bash so that it is executed.
Note the 2nd step could also be rewritten as
xargs -n2 | awk '{print "declare "$1"=\""$2"\";"}'
I'd write this, using ;, [ or ] as awk's field separators:
$ line='(;FF[4]GM[1]SZ[19]CA[UTF-8]SO[sometext]BC[cn]WC[ja]'
$ awk -F '[][;]' '{for (i=2; i<NF; i+=2) {printf "%s=\"%s\" ", $i, $(i+1)}; print ""}' <<<"$line"
FF="4" GM="1" SZ="19" CA="UTF-8" SO="sometext" BC="cn" WC="ja"
Then, to evaluate the output in your current shell:
$ source <(!!)
source <(awk -F '[][;]' '{for (i=2; i<NF; i+=2) {printf "%s=\"%s\" ", $i, $(i+1)}; print ""}' <<<"$line")
$ echo $SO
sometext
To handle the default FF value:
$ source <(awk -F '[][;]' '{
print "FF=99"
for (i=2; i<NF; i+=2) printf "%s=\"%s\" ", $i, $(i+1)
print ""
}' <<< "(;A[1]B[2]")
$ echo $FF
99
$ source <(awk -F '[][;]' '{
print "FF=99"
for (i=2; i<NF; i+=2) printf "%s=\"%s\" ", $i, $(i+1)
print ""
}' <<< "(;A[1]B[2]FF[3]")
$ echo $FF
3
Per your request:
while IFS=\[ read -r A B; do
    [[ -z $B ]] && B=99
    eval "$A=\$B"
done < <(exec grep -oE '[[:alpha:]]+\[[^]]*' sample.txt)
Although using an associative array would be better:
declare -A VALUES
while IFS=\[ read -r A B; do
    [[ -z $B ]] && B=99
    VALUES[$A]=$B
done < <(exec grep -oE '[[:alpha:]]+\[[^]]*' sample.txt)
There you have access both to the keys ("${!VALUES[@]}") and to the values ("${VALUES['FF']}").
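For example, a small usage sketch that iterates over the array and applies the question's FF-defaults-to-99 rule:
# list every key/value pair that was parsed
for key in "${!VALUES[@]}"; do
    printf '%s=%s\n' "$key" "${VALUES[$key]}"
done
echo "FF is ${VALUES[FF]:-99}"   # falls back to 99 when no FF[...] was present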
I would probably do something like this:
set $(sed -e 's/^(;//' sample.txt | tr '[][]' ' ')
while (( $# >= 2 ))
do
    varname=${1}
    varvalue=${2}
    # do something to test varname and varvalue to make sure they're sane/safe
    declare "${varname}=${varvalue}"
    shift 2
done
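Note that, as written, this doesn't apply the FF default of 99; assuming that's still wanted, you could set it after the loop:
: "${FF:=99}"   # keep FF if the file declared it, otherwise fall back to 99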
