trim ';' and split by ":" - linux

I need to read a file like this:
#sys_platform:top_agent_id:channel
# 2 : 999 : 999
2:10086:10086;
2:999:999;
How can I read sys_platform, top_agent_id, and channel line by line?
I wrote a shell script, but it does not work correctly:
#!/bin/sh
sys_platform=""
top_agent_id=
channel=""
while read p; do
echo "line=$p"
echo $p | awk -F ':' '{print $1 $2 $3}' | read sys_platform top_agent_id channel
echo "sys_platform:${sys_platform}"
echo "top_agent_id:${top_agent_id}"
done < ./channellist.txt
The result is:
line=#sys_platform:top_agent_id:channel
sys_platform:
top_agent_id:
line=# 2 : 999 : 999
sys_platform:
top_agent_id:
line=2:10086:10086;
sys_platform:
top_agent_id:
line=2:999:999;
sys_platform:
top_agent_id:

awk is your friend:
while read -r p; do
    sys_platform=$(echo "$p" | awk -F ':' '{print $1}')
    top_agent_id=$(echo "$p" | awk -F ':' '{print $2}')
    channel=$(echo "$p" | awk -F ':' '{print $3}' | tr -d ';')
done < "$filename"
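Alternatively, you can do it in bash itself by splitting on a custom IFS (set -f turns off globbing while the line is split):
while read -r p; do
    OFS=$IFS
    IFS=':'
    set -f
    splitted=( $p )
    set +f
    IFS=$OFS
    sys_platform="${splitted[0]}"
    top_agent_id="${splitted[1]}"
    channel="${splitted[2]%;}"    # strip the trailing ';'
done < "$filename"
Less readable, but it should be more efficient because no external process is started for each line.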
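As a further sketch, read can split each line itself when IFS is set only for that command, which also sidesteps the subshell problem in the question (a read at the end of a pipeline sets its variables in a subshell, so they are lost). The function name parse_channels is hypothetical, and the sketch assumes comment lines start with '#':

```shell
# parse_channels FILE: print "sys_platform top_agent_id channel" for each record.
# (parse_channels is a hypothetical helper name, not part of the original posts.)
parse_channels() {
    while IFS=':' read -r sys_platform top_agent_id channel; do
        # skip comment lines (starting with '#') and empty lines
        case $sys_platform in
            '#'*|'') continue ;;
        esac
        channel=${channel%;}    # drop the trailing ';'
        printf '%s %s %s\n' "$sys_platform" "$top_agent_id" "$channel"
    done < "$1"
}

# Usage, with the file name from the question:
# parse_channels ./channellist.txt
```

Because the loop reads from a redirection rather than a pipe, the variables stay set in the current shell after each read.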

Related

Difficulty to create .txt file from loop in bash

I have this data:
cat >data1.txt <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr
EOF
cat >data2.txt <<'EOF'
2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr
EOF
cat >data3.txt <<'EOF'
2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr
EOF
I would like to create a .txt file for each filesystem (hd1.txt, hd2.txt, hd3.txt, hd4.txt, and so on) and put in each file the sum of the values for that filesystem from each dataX.txt. It is hard for me to explain in English, so here is an example of the result I want.
Expected content for the output file hd1.txt:
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
Expected content for the file hd2.txt:
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
The implementation I've currently tried:
for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i
done
But it doesn't work.
Can you show me how to do that?
Because the format seems to be so constant, you can delimit the input with multiple separators and parse it easily in awk:
awk -v FS='[-;/]' '
# a new group starts whenever the year-month or the device changes
{ key = $1 "-" $2 ";" $8 }
key != prev {
    if (length(output)) {
        print output >> fileoutput
    }
    prev = key
    sum = 0
}
{
    sum += $9
    output = sprintf("%s-%s;/%s/%s;%d;/%s", $1, $2, $7, $8, sum, $11)
    fileoutput = $8 ".txt"
}
END {
    print output >> fileoutput
}
' data*.txt
Tested on repl generates:
+ cat hd1.txt
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
+ cat hd2.txt
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
Alternatively, you could set -v FS=';' and use split on the first and second columns to extract the year and month and the hdX name.
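A minimal sketch of that alternative, assuming the same four-column input as in the question. The function name sum_by_month and the array names are hypothetical. Unlike the streaming version, it collects the sums in arrays and writes everything at the end, so the input does not need to be grouped:

```shell
# sum_by_month FILE...: append one line per month and device to <device>.txt.
# (sum_by_month is a hypothetical helper name.)
sum_by_month() {
    awk -F ';' '
        {
            split($1, d, "-")          # d[1] = year, d[2] = month
            n = split($2, p, "/")      # p[n] = device name, e.g. hd1
            key = d[1] "-" d[2] ";" $2
            if (!(key in sum)) order[++count] = key   # remember first-seen order
            sum[key] += $3
            mount[key] = $4
            out[key] = p[n] ".txt"
        }
        END {
            for (i = 1; i <= count; i++) {
                key = order[i]
                print key ";" sum[key] ";" mount[key] >> out[key]
            }
        }
    ' "$@"
}

# Usage, with the file names from the question:
# sum_by_month data1.txt data2.txt data3.txt
```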
If you seek a bash solution, I suggest you invert the loops: first iterate over the files, then over the identifiers in the second column.
# data*.txt rather than *.txt, so the generated hdX.txt files are not read back in on a rerun
for file in data*.txt; do
    prev=
    output=
    while IFS=';' read -r date dev num path; do
        hd=$(basename "$dev")
        if [[ "$hd" != "${prev:-}" ]]; then
            if ((${#output})); then
                printf "%s\n" "$output" >> "$fileoutput"
            fi
            sum=0
            prev="$hd"
        fi
        sum=$((sum + num))
        output=$(
            printf "%s;%s;%d;%s" \
                "$(cut -d'-' -f1-2 <<<"$date")" \
                "$dev" "$sum" "$path"
        )
        fileoutput="${hd}.txt"
    done < "$file"
    printf "%s\n" "$output" >> "$fileoutput"
done
You could also translate the awk script to bash almost 1:1 by setting IFS='-;/' in the while read loop.

script to check when a directory is last updated

I am trying to write a script that takes the last update date of a Hadoop path, subtracts it from the current date, and sends out an email if the difference is more than a certain number of days (a variable). It needs to loop through a pipe-delimited config file that holds the number of days, the table name, the Hadoop path, and the email address.
Example of the config file:
30|db.big_leasing_center|/TST/DL/EDGE_BASE/GFDCP-52478/BIG_LEASING_CENTER/Data/|abcd#gmail.com
2|db.event|/TST/DL/EDGE_BASE/GFDCP-52478/GFDCP01P-ITG_FIN_DB/EVENTS/Data/|cab#gmail.com
Below is what I tried:
#!/bin/ksh
check_last_refresh() {
DAYS=$1
TABLE=$2
HDP_PATH=$3
EMAIL_DL=$4
last_refresh_date=$(hdfs dfs -ls $HDP_PATH | grep '^-' | awk '{print $6}' | sort -rh | head -1)
echo $last_refresh_date
diff=$(( ( $(date '+%s') - $(date '+%s' -d "$last_refresh_date") ) / 86400 ))
echo $diff
if [ "$diff" -gt "$DAYS" ]; then
echo "HI for $TABLE has an issue" | mail -s "HI for $TABLE has an issue, Please check" -b $EMAIL_DL
fi
return 0
}
cat /data/scratchSpace/bznhd9/CONFIG.txt | while read line; do
DAYS=$(echo $line|awk -F'|' '{print $1}')
TABLE=$(echo $line|awk -F'|' '{print $2}')
HDP_PATH=$(echo $line|awk -F'|' '{print $3}')
EMAIL_DL=$(echo $line|awk -F'|' '{print $4}')
echo $TABLE
r=$(check_last_refresh $DAYS $TABLE $HDP_PATH $EMAIL_DL)
echo $r
done
Please help
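As a sketch of the config-reading part only, read can split the pipe-delimited line directly when IFS is set to '|', which avoids the four awk calls per line. The function name read_config is hypothetical, and the hdfs and mail steps are left out:

```shell
# read_config FILE: print the four pipe-separated fields of each config line.
# (read_config is a hypothetical helper name; it only demonstrates the parsing.)
read_config() {
    while IFS='|' read -r days table hdp_path email; do
        [ -n "$days" ] || continue    # skip blank lines
        printf 'days=%s table=%s path=%s email=%s\n' \
            "$days" "$table" "$hdp_path" "$email"
    done < "$1"
}

# Usage, with the path from the question:
# read_config /data/scratchSpace/bznhd9/CONFIG.txt
```

Inside the loop, the four variables could be passed (quoted) to check_last_refresh in place of the printf.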

Awk: parse node names out of "40*r13n15:40*r10n61:40*r11n18:40*r09n15"

I have a linux script for selecting the node.
For example:
4
40*r13n15:40*r10n61:40*r11n18:40*r09n15
The correct result should be:
r13n15
r10n61
r11n18
r09n15
My script looks like this:
hostNum=`bjobs -X -o "nexec_host" $1 | grep -v NEXEC`
hostSer=`bjobs -X -o "exec_host" $1 | grep -v EXEC`
echo $hostNum
echo $hostSer
for i in `seq 1 $hostNum`
do
echo $hostSer | awk -F ':' '{print '$i'}' | awk -F '*' '{print $2}'
done
Unfortunately, I get no node information.
I have tried:
echo $hostSer | awk -F ':' '{print "'$i'"}' | awk -F '*' '{print $2}'
and
echo $hostSer | awk -F ':' '{print '"$i"'}' | awk -F '*' '{print $2}'
But these are wrong too. Can anyone help?
One more awk:
$ echo "$variable" | awk 'NR%2==0' RS='[*:\n]'
r13n15
r10n61
r11n18
r09n15
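By setting the record separator (RS) to the regular expression [*:\n], the string is broken into individual tokens, one per record, after which you can just print every second record (NR%2==0). A regular-expression RS is an extension found in gawk, among others.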
You can use multiple separators in awk. Please try below:
h='40*r13n15:40*r10n61:40*r11n18:40*r09n15'
echo "$h"| awk -F '[:*]' '{ for (i=2;i<=NF;i+=2) print $i }'
**edited to make it generic based on the comment from RavinderSingh13.
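The loop in the question prints nothing useful because '{print '$i'}' hands awk the number itself (for example print 1) rather than a field reference. A minimal sketch of the same loop with the index passed in through -v, so that $i inside awk means "field number i". The function name node_names is hypothetical:

```shell
# node_names HOSTS COUNT: print the node name from each of the first COUNT entries.
# (node_names is a hypothetical helper name.)
node_names() {
    hosts=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # -v i=... makes i an awk variable, so $i selects the i-th ':'-separated field
        printf '%s\n' "$hosts" |
            awk -F ':' -v i="$i" '{ split($i, part, "[*]"); print part[2] }'
        i=$((i + 1))
    done
}

node_names '40*r13n15:40*r10n61:40*r11n18:40*r09n15' 4
```

This keeps the structure of the original script. The single-awk answers above are still the shorter option.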

Create file with content, where the content has new line

In Linux, how do I create a file from content that is stored on a single line, where each \n (or any other line separator) should be turned into a real line break?
fileA.txt:
trans_fileA::abcd\ndfghc\n091873\nhhjj
trans_fileB::a11d\n11hc\n73345
Code:
while read line; do
file_name=`echo $line | awk -F'::' '{print $1}' `
file_content=`echo $line | awk -F'::' '{print $2}' `
echo $file_name
echo $(eval echo ${file_content})
echo $(eval echo ${file_content}) > fileA.txt
The trans_fileA should be:
abcd
dfghc
091873
hhjj
You can do it this way (with bash and GNU sed):
# read the input file line by line, without interpreting backslashes
while read -r line
do
    # extract the name (everything before the first ':')
    name=$(printf '%s\n' "$line" | cut -d: -f 1)
    # extract the data (field 3 onward; field 2 is the empty string between the two colons)
    data=$(printf '%s\n' "$line" | cut -d: -f 3-)
    # ask sed to replace each literal \n with a linefeed and store the result in the file named "$name"
    printf '%s\n' "$data" | sed 's/\\n/\n/g' > "$name"
# read the lines from fileA.txt
done < fileA.txt
You can even write it more compactly:
while read -r line
do printf '%s\n' "$line" | cut -d: -f 3- | sed 's/\\n/\n/g' > "$(printf '%s\n' "$line" | cut -d: -f 1)"
done < fileA.txt
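For comparison, a sketch that does the whole job in one awk process instead of several cut and sed calls per line. It assumes every non-empty line has the form name::content. The function name split_escaped is hypothetical:

```shell
# split_escaped FILE: for each "name::content" line, write content to the file
# called name, turning every literal \n into a real newline.
# (split_escaped is a hypothetical helper name.)
split_escaped() {
    awk -F '::' '
        $1 != "" {
            name = $1
            data = substr($0, length(name) + 3)   # everything after the first "::"
            gsub(/\\n/, "\n", data)                # literal backslash-n -> newline
            print data > name
            close(name)
        }
    ' "$1"
}

# Usage, with the file name from the question:
# split_escaped fileA.txt
```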

Increment variable when matched awk from tail

I'm monitoring a file that is actively being written to.
My current solution is:
ws_trans=0
sc_trans=0
tail -F /var/log/file.log | \
while read LINE
do
echo $LINE | grep -q -e "enterpriseID:"
if [ $? = 0 ]
then
((ws_trans++))
fi
echo $LINE | grep -q -e "sc_ID:"
if [ $? = 0 ]
then
((sc_trans++))
fi
printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
However, when attempting to do this with AWK I don't get the output: $ws_trans and $sc_trans remain 0.
ws_trans=0
sc_trans=0
tail -F /var/log/file.log | \
while read LINE
do
echo $LINE | awk '/enterpriseID:/ {++ws_trans} END {print | ws_trans}'
echo $LINE | awk '/sc_ID:/ {++sc_trans} END {print | sc_trans}'
printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
I'm attempting this to reduce load. I understand that AWK doesn't deal with bash variables, and it can get quite confusing, but the only reference I found is for a use of AWK without tail.
How can I assign the AWK Variable to the bash ws_trans and sc_trans? Is there a better solution? (There are other search terms being monitored.)
You need to pass the variables using the option -v, for example:
$ var=0
$ printf %d\\n {1..10} | awk -v awk_var=${var} '{++awk_var} {print awk_var}'
To set the variable "back" you could use declare, for example:
$ declare $(printf %d\\n {1..10} | awk -v awk_var=${var} '{++awk_var} END {print "var=" awk_var}')
$ echo $var
10
Your script could be rewritten like this:
ws_trans=0
sc_trans=0
tail -F /var/log/system.log |
while read LINE
do
declare $(echo $LINE | awk -v ws=${ws_trans} '/enterpriseID:/ {++ws} END {print "ws_trans="ws}')
declare $(echo $LINE | awk -v sc=${sc_trans} '/sc_ID:/ {++sc} END {print "sc_trans="sc}')
printf "\r WSTRANS: $ws_trans \t\t SCTRANS: $sc_trans"
done
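Since the stated goal is to reduce load, another option is to skip the shell loop and let a single awk process keep both counters, so nothing is spawned per line. This is a sketch under that assumption. The function name count_trans is hypothetical, and the counters live in awk rather than in bash variables:

```shell
# count_trans: read log lines on stdin and keep running totals inside awk.
# (count_trans is a hypothetical helper name.)
count_trans() {
    awk '
        /enterpriseID:/ { ws++ }
        /sc_ID:/        { sc++ }
        {
            printf "\r WSTRANS: %d \t\t SCTRANS: %d", ws, sc
            fflush()   # push the counter out immediately instead of buffering it
        }
    '
}

# Usage, with the log path from the question:
# tail -F /var/log/file.log | count_trans
```

Further search terms can be added as extra pattern lines with their own counters.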

Resources