I have this data:
cat >data1.txt <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr
EOF
cat >data2.txt <<'EOF'
2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr
EOF
cat >data3.txt <<'EOF'
2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr
EOF
I would like to create a .txt file for each filesystem (so hd1.txt, hd2.txt, hd3.txt and hd4.txt) and put in each .txt file the sum of the values for that filesystem from each dataX.txt. I have some difficulty explaining in English what I want, so here is an example of the wanted result.
Expected content for the output file hd1.txt:
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
Expected content for the file hd2.txt:
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
The implementation I've currently tried:
for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i
done
But it doesn't work.
Can you show me how to do that?
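Because the format seems to be so constant, you can split the input on multiple separators and parse it easily in awk (the data*.txt glob keeps the generated hdX.txt files from being read back as input on a later run):
awk -v FS='[;/-]' '
# Fields: $1=year $2=month $8=device name $9=value $11=path without the leading slash
{ key = $1 "-" $2 ";" $8 }
prev != key {
    if (length(output)) {
        print output >> fileoutput
    }
    prev = key
    sum = 0
}
{
    sum += $9
    output = sprintf("%s-%s;/%s/%s;%d;/%s", $1, $2, $7, $8, sum, $11)
    fileoutput = $8 ".txt"
}
END {
    if (length(output)) print output >> fileoutput
}
' data*.txt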
Tested on repl, this generates:
+ cat hd1.txt
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
+ cat hd2.txt
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
Alternatively, you could use -v FS=';' and call split() on the first and second columns to extract the year and month and the hdX name.
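A minimal sketch of that alternative, assuming the same data format as in the question. The array names (sum, line, path, file, order) are illustrative choices, not from the original answer, and a shortened copy of the sample data is recreated in a scratch directory so that the block runs on its own:

```shell
# Work in a scratch directory so the sketch is self-contained.
cd "$(mktemp -d)" || exit 1
printf '%s\n' \
    '2020-01-27-06-00;/dev/hd1;100;/' \
    '2020-01-27-12-00;/dev/hd1;100;/' \
    '2020-01-27-18-00;/dev/hd1;100;/' \
    '2020-01-27-06-00;/dev/hd2;200;/usr' > data1.txt
printf '%s\n' \
    '2020-02-27-06-00;/dev/hd1;120;/' \
    '2020-02-27-06-00;/dev/hd2;230;/usr' > data2.txt

awk -F';' '
{
    split($1, d, "-")              # d[1] = year, d[2] = month
    n = split($2, p, "/")          # p[n] = device name, e.g. hd1
    key = d[1] "-" d[2] SUBSEP p[n]
    if (!(key in sum)) order[++count] = key   # remember first-seen order
    sum[key] += $3
    line[key] = d[1] "-" d[2] ";" $2
    path[key] = $4
    file[key] = p[n] ".txt"
}
END {
    for (i = 1; i <= count; i++) {
        k = order[i]
        print line[k] ";" sum[k] ";" path[k] > file[k]
    }
}' data1.txt data2.txt

hd1=$(cat hd1.txt)
hd2=$(cat hd2.txt)
printf '%s\n' "$hd1" "$hd2"
```

Because the sums are keyed by month and device, the input lines do not have to be grouped or sorted for this variant.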
If you want a bash solution, I suggest you invert the loops: first iterate over the files, then over the identifiers in the second column.
for file in data*.txt; do
    prev=
    output=
    while IFS=';' read -r date dev num path; do
        hd=$(basename "$dev")
        if [[ "$hd" != "${prev:-}" ]]; then
            if ((${#output})); then
                printf "%s\n" "$output" >> "$fileoutput"
            fi
            sum=0
            prev="$hd"
        fi
        sum=$((sum + num))
        output=$(
            printf "%s;%s;%d;%s" \
                "$(cut -d'-' -f1-2 <<<"$date")" \
                "$dev" "$sum" "$path"
        )
        fileoutput="${hd}.txt"
    done < "$file"
    printf "%s\n" "$output" >> "$fileoutput"
done
You could also translate the awk almost 1:1 to bash by setting IFS='-;/' in the while read loop.
I am facing an issue when I read the 3rd word (a hex string) of each line in a text file and compare it with a hex number. Can someone please help me with it?
#!/bin/bash
A=$1
cat $A | while read a; do
a1=$(echo \""$a"\" | awk '{ print $3 }')
#echo $a > cut -d " " -f 3
echo $a1
(("$a1" == 0x10F7))
echo $?
done
But when I use the code below, the comparison works correctly:
a1=0xADCAFE
(( "$a1" == 0x10F7 ))
echo $?
Then why does it fail when I read the value like this:
a1=$(echo \""$a"\" | awk '{ print $3 }')
or
a1=$(echo $a | awk '{ print $3 }')
echo $a prints the intended hex value, but the comparison does not work.
Regards,
Running Awk inside a while read loop is an antipattern. Just do the loop in Awk; it's good at that.
awk '$3 == 4343' "$1"
If you want to compare against a string whose value is "0x10F7" then it's
awk '$3 == "0x10F7"' "$1"
If you want to match either, case insensitively etc, a regex is a good way to do that.
awk '$3 ~ /^(0x10[Ff]7|4343)$/' "$1"
Notice how the $1 in double quotes is handled by the shell, and gets replaced by a (properly quoted!) copy of the script's first command-line argument before Awk runs, while the Awk script in single quotes has its own namespace, so $3 is an Awk variable which refers to the third field in the current input line.
Either way, avoid the useless use of cat and always always always quote variables which contain file names with double quotes.
That's literal double quotes. You seem to have tried both a dangerous bare $a and a doubly double-quoted "\"$a\"" where the simple "$a" would be what you actually want.
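The quoting split described above (the shell expands the double-quoted file name, awk owns $3) can be tried with a small sketch. The temporary file and its sample lines are made up for the demo and are not the asker's data:

```shell
sample=$(mktemp)
printf '%s\n' \
    'id1 foo 0x10F7' \
    'id2 foo 0x10f7' \
    'id3 foo 4343' \
    'id4 foo 0xADCAFE' > "$sample"

# "$sample" is expanded by the shell; $1 and $3 inside the single quotes
# belong to awk and refer to fields of the current input line.
matches=$(awk '$3 ~ /^(0x10[Ff]7|4343)$/ { print $1 }' "$sample")
printf '%s\n' "$matches"
rm -f "$sample"
```

Only the first three sample lines match, since the regex is anchored to the whole third field.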
Thank you all for your responses. My script is now working fine. I was trying to match two files; the script below serves the purpose.
#!/bin/bash
A=$1
B=$2
dos2unix -f "$A"
dos2unix -f "$B"
rm -f search_match.txt search_data_match.txt search_nomatch.txt search_data_nomatch.txt
while read line; do
    search_word=$(echo "$line" | awk '{ print $1 }')
    grep "$search_word" "$B" >> temp_file.txt
    while read var; do
        file1_hex=$(echo "$line" | awk '{ print $2 }')
        file2_hex=$(echo "$var" | awk '{ print $3 }')
        # Arithmetic comparison, so hex strings such as 0x10F7 compare by value
        if (("$file1_hex" == "$file2_hex")); then
            echo "$line" >> search_match.txt
            echo "$var" >> search_data_match.txt
        else
            echo "$line" >> search_nomatch.txt
            echo "$var" >> search_data_nomatch.txt
        fi
    done < "temp_file.txt"
    rm -f temp_file.txt
done < "$A"
I am working on a shell script that converts exported Microsoft in-addr.arpa.txt files to a more useful format so that I can use them in the future in other products for automation purposes. Now I have run into a problem which I (not being a programmer) cannot solve in a simple way.
Sample script
x=123.223.224
echo "$x" | rev
gives me
422.322.321
but I want to have the output as follows:
224.223.123
Is there an easy way to do it without rev or putting each group in a variable? Or is there a sample I can use? Or am I maybe using the wrong tools for this?
Using awk:
x='123.223.224'
awk 'BEGIN{FS=OFS="."} {for (i=NF; i>=2; i--) printf "%s%s", $i, OFS; print $1}' <<< "$x"
224.223.123
Use awk for this!
If your text always contains three octets, simply use . as the separator:
echo "$x" | awk -F. '{ print $3 "." $2 "." $1 }'
For more complex cases, use the built-in split():
echo "$x" | awk '{
    n = split($0, a, ".")
    for (i = n; i > 1; i--) {
        printf "%s.", a[i]
    }
    print a[1]
}'
In this sample, split() splits every line (passed as the argument $0) on the delimiter ., saves the resulting array into a, and returns the length of that array (which is saved to n). Note that, unlike in C, split() array indexes start at one.
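The split() loop above can be wrapped in a small shell function. This is only a sketch: reverse_fields is a hypothetical helper name, and it assumes a single-character delimiter, which awk's split() treats literally:

```shell
# reverse_fields STRING DELIM: hypothetical helper, prints the fields in reverse order.
reverse_fields() {
    printf '%s\n' "$1" | awk -v d="$2" '{
        n = split($0, a, d)        # a[1] .. a[n], indexes start at 1
        out = a[n]
        for (i = n - 1; i >= 1; i--) out = out d a[i]
        print out
    }'
}

reversed=$(reverse_fields '123.223.224' '.')
echo "$reversed"
```

Building the result in a variable instead of calling printf per field avoids a trailing delimiter without special-casing the last field.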
Or Python 3:
python3 -c "print('.'.join(reversed('$x'.split('.'))))"
Here is my script.
#!/bin/sh
value=$1
delim=$2
# Number of fields = number of delimiters + 1
total_fields=$(echo "$value" | tr -cd "$delim" | wc -c)
total_fields=$((total_fields + 1))
reverse_value=""
while [ "$total_fields" -gt 0 ]; do
    cur_value=$(echo "$value" | cut -d"$delim" -f"$total_fields")
    if [ "$total_fields" -ne 1 ]; then
        cur_value="$cur_value$delim"
    fi
    reverse_value="$reverse_value$cur_value"
    total_fields=$((total_fields - 1))
done
echo "$reverse_value"
Using a few small tools.
tr '.' '\n' <<< "$x" | tac | paste -sd.
224.223.123
I have written code to restructure a CSV file based on a control file. The control file looks like this:
Control file:
1,column1
3,column3
6,column6
4,column4
-1,column9
Based on the above control file, I take the columns with indexes 1, 3, 6 and 4 from source.csv and create a new file using the paste command. If the index value in the control file is -1, I have to insert the entire column as null, and the header name will be column9.
Code:
var=1
while read line
do
    t=$(echo "$line" | awk '{ print $1 }' | cut -d, -f1)
    if [ "$t" != -1 ]
    then
        cut -d, -f"$t" source.csv > "file_$var.csv"
    else
        touch "file_$var.csv"
    fi
    var=$((var+1))
done < "$file"
ls -v file_*.csv | xargs paste -d, > new_file.csv
Is there a way to convert these lines into awk? Please suggest some ideas.
Before Running script:
sample.csv
column1,column2,column3,column4,column5,column6,column7
a,b,c,d,e,f,g
Output:
new_file.csv
column1,column3,column6,column4,column9
a,c,f,d,
column9 has index -1, which indicates a null column, so its values are simply left empty between the , separators.
Basic intention is to restructure the source file based on the control file.
Script:
#Greenplum Database details to read target file structure from Meta Data Tables.
export PGUSER=xxx
export PGPORT=5432
export PGHOST=10.100.20.10
export PGDATABASE=fff
SCHEMA='jiodba'
##Function to explain usage of this script
usage() {
echo "Usage: program.sh -s <Source_folder> -t <Target_folder> -f <file_name> ";
exit 1; }
source_folder=$1
target_folder=$2
file_name=$3
#removes the existing file from current directory
rm -f file_struct_*.csv
# Reading the Header from the Source file.
v_source_header=`head -1 $file_name`
IFS="," # Set the field separator
set $v_source_header # Breaks the string into $1, $2, ...
i=1
for item # A for loop by default loop through $1, $2, ...
do
echo "$i,$item">>source_header.txt
((i++))
done
sed -e "s/\r//" source_header.txt | sed -e "s/ \{1,\}$//" > source_headers.txt
rm -f source_header.txt
#Get the Target header information from Greenplum Meta data Table and writing into target_header.txt file.
psql -t -A -F "," -c "select Target_column_position,Target_column_name from jiodba.etl_tbl_sequencing where source_file_name='$file_name' order by target_column_position" > target_header.txt
#Removing the trail space and control characters.
sed -e "s/\r//" target_header.txt | sed -e "s/ \{1,\}$//" > target_headers.txt
rm -f target_header.txt
#Compare the Source Header Target Structure and generate the Difference.
awk -F, 'NR==FNR{a[$2]=$1;next} {if ($2 in a) print a[$2]","$2; else print "-1," $2}' source_headers.txt target_headers.txt >>tgt_struct_output.txt
#Loop to Read column index from the tgt_struct_output.txt and cut it in Source file.
file='tgt_struct_output.txt'
var=1
while read line
do
t=$(echo $line | awk '{ print $1}' | cut -d, -f1)
if [ $t != -1 ]
then
cut -d, -f$t $file_name>file_struct_$var.csv
else
touch file_struct_$var.csv
fi
var=$((var+1))
done<"$file"
awk -F, -v OFS=, 'FNR==NR {c[++n]=$2; a[$2]=$1;next} FNR==1{f=""; for (i=1; i<=n; i++)
{printf "%s%s", f, c[i]; b[++k]=i; f=OFS} print "";next}
{for (i=1; i<=n; i++) if(a[c[i]]>0) printf "%s%s", $a[c[i]], OFS; print""
}' tgt_struct_output.txt $file_name
#Paste the different file(columns)into single file
ls -v file_struct_*.csv | xargs paste -d, | sed -e "s/\r//" > new_file.csv
new_header=`cut -d "," -f 2 target_headers.txt | tr "\n" "," | sed 's/,$//'`
#Replace the header with original target header incase if column doesnt exit in the target table structure.
sed "1s/.*/$new_header/" new_file.csv
#Removing the Temp files.
rm -f file_struct_*.csv
rm -f source_headers.txt target_headers.txt tgt_struct_output.txt
touch file_struct_1.csv #Just to avoid the error in shell
Sample.csv
BP ID,Prepaid Account No,CurrentMonetary balance ,charge Plan names ,Provider contract id,Contract Item ID,Start Date,End Date
1100001538,001000002506,251,[B2] R2 LTE CHARGE PLAN ,00000000000000000141,[B2] R2 LTE CHARGE PLAN _00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DB4A,16-Oct-12,18-Oct-12
1100003404,001000004029,45.22,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000009349,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155D10E20D1ED398FD63624498DA2E,16-Nov-13,23-Nov-13
1100006545,001000006620,388.796,B0.3 ECS_CHARGE_PLAN DROP1 V3,00000000000000010477,B0.3 ECS DROP2 V0.2_00155D10E20D1ED39A8E146EA7169A2E00155S00E20D1ED398FD63624498DA2E,07-Nov-12,07-Nov-13
You can try this awk:
awk -F, -v OFS=, '
FNR == NR { c[++n] = $2; a[$2] = $1; next }    # control file: index,name
FNR == 1  { f = ""; for (i = 1; i <= n; i++) { printf "%s%s", f, c[i]; f = OFS }; print ""; next }
          { for (i = 1; i <= n; i++) if (a[c[i]] > 0) printf "%s%s", $a[c[i]], OFS; print "" }
' ctrl.csv sample.csv
column1,column3,column6,column4,column9
a,c,f,d,
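Note that the command above yields the empty field for column9 only because the -1 entry comes last in the control file; a -1 entry in the middle would be skipped and shift the remaining values. Here is a hedged sketch of a variant that keeps an empty field wherever the -1 entry appears. The array names idx and name and the tiny sample files are made up for the demo:

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' '1,column1' '-1,column9' '3,column3' > ctrl.csv
printf '%s\n' 'column1,column2,column3' 'a,b,c' > sample.csv

awk -F, -v OFS=, '
FNR == NR { idx[++n] = $1; name[n] = $2; next }    # control file: index,name
{
    line = ""
    for (i = 1; i <= n; i++) {
        if (FNR == 1)        val = name[i]         # header row: target names
        else if (idx[i] > 0) val = $(idx[i])       # copy the source column
        else                 val = ""              # index -1: empty column
        line = line (i > 1 ? OFS : "") val
    }
    print line
}' ctrl.csv sample.csv > new_file.csv

result=$(cat new_file.csv)
printf '%s\n' "$result"
```

With this control file the data row becomes a,,c, so the header and the values stay aligned.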
Here is a simplified version of my problem.
if (echo "AA BB CC" | awk '{ print $1 $2 }' | grep -q "B"); then
echo $2
fi
I would like to make $2 available in bash, so I can use it elsewhere in the script.
Can that be done?
Update
I realized that I had simplified the problem too much. The awk expression should have been awk '{ print $1 $2 }' instead of just awk '{ print $2 }' which I originally posted.
You can use set:
set -- `echo "AA BB CC" | awk '{print $2}'`
case $1 in *B*) echo $1;; esac
... or if you used the awk just to split the output, let set do that part as well:
set -- `echo "AA BB CC"`
case $2 in *B*) echo $2;; esac
Store the output of awk in a variable, test it against the regular expression, and print it:
output=$( echo "AA BB CC" | awk '{ print $2 }' )
if grep -q B <<< "$output" ; then echo "$output" ; fi
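If the goal is to have both the combined string and the second field in shell variables, another option is to let awk print both and split them with read. This is a sketch; the variable names combined and second are my own:

```shell
# awk prints "$1$2" and "$2" separated by a space; read splits them again.
read -r combined second <<EOF
$(echo "AA BB CC" | awk '{ print $1 $2, $2 }')
EOF

case $combined in
    *B*) echo "$second" ;;
esac
```

After this, second stays available for the rest of the script, because read runs in the current shell rather than in a pipeline subshell.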
You can capture stdout into a variable with command substitution (backticks or $(...)), e.g.
a=`echo foo`
echo $a
For your example, it would be something like this (without -q, since grep -q prints nothing and a would stay empty):
a=`echo "AA BB CC" | awk '{ print $2 }' | grep "B"`
echo $a