Changing contents of files through shell script - linux

I have a requirement where I need to change the contents of a file say file.xyt. The file contains values like:
21 100 34 82
122 50 75 12
88 10 15 45
I need to check whether the fourth field in every line (82, 12, and 45 in this example) is less than 23.
If it is, I need to delete that line.
For this example, the result will be:
21 100 34 82
88 10 15 45
How can I achieve this using a shell script? Thanks in advance.

You can use awk:
awk '$4 >= 23 {print}' file
which can be shortened to (thanks @RomanPerekhrest):
awk '$4 >= 23' file
If you want to write the file in place, you can use a temporary file:
awk '$4 >= 23' file > tmp && mv tmp file
If you have gawk 4.1.0 or later, you can use the -i inplace option to edit the file in place:
gawk -i inplace '$4 >= 23' file
Or using a Bash loop:
while read -r a b c d; do
[[ $d -ge 23 ]] && echo "$a $b $c $d"
done < file
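The temporary-file variant above can be wrapped in a small function. This is only a sketch, assuming awk and mktemp are available; the function name filter_min_col4 and its arguments are made up for illustration and are not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: keep only lines whose 4th field is >= a threshold,
# rewriting the file in place via a temporary file.
filter_min_col4() {
    local file=$1 min=$2 tmp
    tmp=$(mktemp) || return 1
    # Write the surviving lines to the temp file; replace the original
    # only if awk succeeded, otherwise clean up and report failure.
    if awk -v min="$min" '$4 + 0 >= min + 0' "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example run on a scratch copy of the sample data.
demo=$(mktemp)
printf '%s\n' '21 100 34 82' '122 50 75 12' '88 10 15 45' > "$demo"
filter_min_col4 "$demo" 23
cat "$demo"
rm -f "$demo"
```

Using mktemp instead of a fixed name such as tmp avoids clobbering an unrelated file in the current directory. Note that the replaced file takes on the permissions of the temporary file.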

If each line of the file contains four separate numbers, you can modify the original file in place with the following (GNU) sed approach:
sed -Ei '/\<(1?[0-9]|2[0-2])$/d' file.xyt
Resulting file.xyt contents:
21 100 34 82
88 10 15 45

Related

converting 4 digit year to 2 digit in shell script

I have file as:
$ cat file.txt
1981080512 14 15
2019050612 17 18
2020040912 19 95
Here the 1st column represents dates as YYYYMMDDHH.
I would like to write the dates as YYMMDDHH, so the desired output is:
81080512 14 15
19050612 17 18
20040912 19 95
My script:
while read -r x;do
yy=$(echo $x | awk '{print substr($0,3,2)}')
mm=$(echo $x | awk '{print substr($0,5,2)}')
dd=$(echo $x | awk '{print substr($0,7,2)}')
hh=$(echo $x | awk '{print substr($0,9,2)}')
awk '{printf "%10s%4s%4s\n",'$yy$mm$dd$hh',$2,$3}'
done < file.txt
It is printing
81080512 14 15
81080512 17 18
Any help please. Thank you.
Please don't kill me for this simple answer, but what about this:
cut -c 3- file.txt
You simply cut the first two digits by showing character 3 till the end of every line (the -c switch indicates that you need to cut characters (not bytes, ...)).
You can do it with a single call to awk's substr function. Let the content of file.txt be
1981080512 14 15
2019050612 17 18
2020040912 19 95
then
awk '{$1=substr($1,3);print}' file.txt
output
81080512 14 15
19050612 17 18
20040912 19 95
Explanation: substr($1,3) returns the 1st column from its 3rd character onward; I assign that back to the 1st column and then print the changed line.
(tested in gawk 4.2.1)
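For comparison, the same trimming can be done in plain Bash with substring expansion. As far as I can tell, the script in the question misbehaves because the last awk call has no file argument, so it reads the rest of file.txt from the loop's standard input while reusing the values computed from the first line. The sketch below avoids awk entirely; the function name strip_century is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: drop the first two characters (the century digits)
# of every line read from standard input.
strip_century() {
    local line
    # The "|| [[ -n $line ]]" part also handles a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        printf '%s\n' "${line:2}"    # substring starting at offset 2
    done
}

printf '%s\n' '1981080512 14 15' '2019050612 17 18' '2020040912 19 95' | strip_century
```

Like cut -c 3-, this assumes the date is the first thing on every line.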

Sum all the numbers in a file given by positional parameter

I want to sum all the numbers in a file (columns and lines) given by the first parameter, but my program shows sum=sum+$i instead of the numeric sum:
sum=0;
file=$1
for i in $file
do
sum=sum+$i;
done;
echo "The sum is: " $sum
Input file:
$ cat file.txt
10 20 10
40
50
Expected output:
The sum is: 130
Is there perhaps an awk method to solve this?
Try this:
$ cat file1.txt
10 20 10
40
50
$ awk '{for(i=1;i<=NF;i++) {sum+=$i}} END {print sum}' file1.txt
130
Or:
$ xargs < file1.txt | tr ' ' + | bc
130
Or, with sed instead of tr:
$ cat file.txt | xargs | sed -e 's/ /+/g' | bc
You can also use a simple read loop and an array, relying on word splitting with the default IFS (Internal Field Separator) to separate each line's values, e.g.
#!/bin/bash
declare -i sum=0
fn="${1:-/dev/stdin}" ## read from file as 1st argument (default stdin)
while read -r line; do ## read each line
a=( $line ) ## separate values into array
for i in "${a[@]}"; do ## for each value in array
((sum += i)) ## add to sum
done
done <"$fn"
echo "sum: $sum"
Example Input File
$ cat dat/numfile.txt
10 20 10
40
50
Example Use/Output
$ bash sumnumfile.sh dat/numfile.txt
sum: 130
Another option, for awks that accept a regular expression as RS (at least mawk and gawk):
$ awk -v RS="[^0-9]" '{s+=$1}END{print s}' file
130
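As for why the loop in the question prints sum=sum+$i: in Bash a plain assignment stores a string, so arithmetic needs $(( )), and for i in $file iterates over the file name rather than the file's contents. A minimal corrected sketch, assuming the file holds only whitespace-separated integers without leading zeros (the function name sum_file is made up for illustration):

```shell
#!/bin/bash
# Hypothetical helper: sum every whitespace-separated integer in a file.
sum_file() {
    local file=$1 sum=0 n
    # $(<file) expands to the file's contents; leaving it unquoted lets
    # word splitting break it into individual numbers.
    for n in $(<"$file"); do
        sum=$((sum + n))    # arithmetic expansion, not string assignment
    done
    echo "The sum is: $sum"
}

demo=$(mktemp)
printf '%s\n' '10 20 10' '40' '50' > "$demo"
sum_file "$demo"
rm -f "$demo"
```

For large files the awk one-liners above will be considerably faster than a shell loop.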

Cut a file between two lines numbers using awk

Say I have a file with 100 lines (not including header). I want to cut that file down, only keeping the content between line 51 and 70 (inclusive), as well as the header so that the resulting file is 20+1 lines.
So far, I have this code:
awk 'NR==1 {h=$0; next} (NR-1)>50 && (NR-1)<71 {filename = "file20.csv"; print h >> filename} {print >> filename}' file100.csv
But it's giving me an error:
fatal: expression for `>>' redirection has null string value
Can somebody help me understand where my syntax is wrong?
You can directly use:
awk 'NR==1 || (NR>=51 && NR<=70)'
Note that this evaluates the condition on NR. When it is true, awk performs its default action, {print $0}, so you do not have to write it explicitly.
Then you can redirect to another file:
awk 'NR==1 || (NR>=51 && NR<=70)' file > new_file
Test
$ seq 100 | awk 'NR==1 || (NR>=51 && NR<=70)'
1
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
It returns 21 lines:
$ seq 100 | awk 'NR==1 || (NR>=51 && NR<=70)' | wc -l
21
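About the error in the question: the last action, {print >> filename}, has no pattern, so it runs for every line, including lines read before filename has been assigned, and the redirection target is then an empty string. Also note that the one-liner above counts the header as line 1. If the range is meant to count data rows only, as the question's wording suggests, a sketch along these lines may be closer (the function name header_and_rows and its arguments are made up for illustration):

```shell
#!/bin/bash
# Hypothetical helper: print the header plus data rows first..last,
# where data rows are numbered from 1 and the header is not counted.
header_and_rows() {
    local first=$1 last=$2 file=$3
    awk -v first="$first" -v last="$last" '
        NR == 1 { print; next }                            # always keep the header
        (NR - 1) >= first + 0 && (NR - 1) <= last + 0      # default action: print
    ' "$file"
}

# Example: a header line followed by 100 data rows.
demo=$(mktemp)
{ echo "header"; seq 100; } > "$demo"
header_and_rows 51 70 "$demo" > "$demo.out"
wc -l < "$demo.out"
rm -f "$demo" "$demo.out"
```

Redirecting the function's output in the shell, rather than inside awk, sidesteps the file name problem altogether.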

keep groups of lines with specific keywords (bash)

I have a text file with plenty of lines in this format (the lines between every two # defined as a group):
# some str for test
hdfv 12 9 b
cgj 5 11 t
# another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
---
key.txt:
string to
---
output:
# another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
I need to search the lines that start with # for keywords (here: string, to). If the keywords do not appear in key.txt (a file with two columns), I should remove that line and the following lines of its group. I've written this code, but it doesn't work. (The keywords appear together in the input file, as in the example.)
cat input.txt | while IFS=$'#' read -r -a myarray
do
a=${myarray[1]}
b=${myarray[0]}
unset IFS
read -r a x y z <<< "$a"
key=$(echo "$x $y")
if grep "$key" key.txt > /dev/null
then
echo $key exists
else
grep -v -e "$a" -e "$b" input.txt > $$ && mv $$ input.txt
fi
done
Can someone help me?
A simple way to get the correct block is to use awk with a suitable record separator (RS):
awk 'FNR==NR {a[$0];next} { RS="#";for (i in a) if ($0~i) print}' key.txt input.txt
another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
This should reinsert the # that is used as the separator and remove the extra empty line. There may be simpler ways to do this, but this works.
awk 'FNR==NR {a[$0];next} { RS="#";for (i in a) if ($0~i) {sub(/^ /,RS);sub(/\n$/,x);print}}' key.txt input.txt
#another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
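An alternative that leaves RS alone is to read line by line and keep a flag per group. This is a sketch under a few assumptions: every non-empty line of key.txt is treated as one literal phrase, key.txt is not empty, and a group starts at any line beginning with #. The function name keep_groups is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print only the groups whose "#" header line
# contains at least one of the key phrases listed in the key file.
keep_groups() {
    local keyfile=$1 input=$2
    awk '
        FNR == NR { if ($0 != "") keys[$0]; next }   # load key phrases
        /^#/ {                                       # a new group starts
            keep = 0
            for (k in keys)
                if (index($0, k)) { keep = 1; break }
        }
        keep                                         # print while the flag is set
    ' "$keyfile" "$input"
}

# Example with the data from the question.
dir=$(mktemp -d)
printf '%s\n' 'string to' > "$dir/key.txt"
printf '%s\n' '# some str for test' 'hdfv 12 9 b' 'cgj 5 11 t' \
    '# another string to examine' 'kinj 58 96 f' 'dfg 7 26 u' 'fds 9 76 j' > "$dir/input.txt"
keep_groups "$dir/key.txt" "$dir/input.txt"
rm -rf "$dir"
```

Because index() does a literal substring search, characters that are special in regular expressions need no escaping in key.txt, and the header lines are printed exactly as they appear in the input.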

AWK--Comparing the value of two variables in two different files

I have two text files, A.txt and B.txt. Each line of A.txt contains a single value.
A.txt
100
222
398
B.txt
1 2 103 2
4 5 1026 74
7 8 209 55
10 11 122 78
What I am looking for is something like this:
for each line of A
search B;
if (the value of third column in a line of B - the value of the variable in A > 10)
print that line of B;
Is there an awk way to do that?
How about something like this?
I had some trouble understanding your question, but maybe this will give you some pointers:
#!/bin/bash
# Read interesting values from file2 into an array,
for line in $(cat 2.txt | awk '{print $3}')
do
arr+=($line)
done
# Linecounter,
linenr=0
# Loop through every line in file 1,
for val in $(cat 1.txt)
do
# Increment linecounter,
((linenr++))
# Loop through every index of the array (containing values from the 3rd column of file2)
for el in "${!arr[@]}";
do
# If that value - the value from file 1 is bigger than 10, print values
if [[ $((${arr[$el]} - $val )) -gt 10 ]]
then
sed -n "$(($el+1))p" 2.txt
# echo "Value ${arr[$el]} (on line $(($el+1)) from 2.txt) - $val (on line $linenr from 1.txt) equals $((${arr[$el]} - $val )) and is hence bigger than 10"
fi
done
done
Note:
This is quick and dirty and there is room for improvement, but I think it will do the job.
Use awk like this:
cat f1
1
4
9
16
cat f2
2 4 10 8
3 9 20 8
5 1 15 8
7 0 30 8
awk 'FNR==NR{a[NR]=$1;next} $3-a[FNR] < 10' f1 f2
2 4 10 8
5 1 15 8
UPDATE: Based on the OP's edited question:
awk 'FNR==NR{a[NR]=$1;next} {for (i in a) if ($3-a[i] > 10) print}' f1 f2
Note how simple the awk-based solution is compared to nested for loops.
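One caveat with looping over every value of the first file: a line of the second file is printed once per matching value, so it can show up several times. If each line should appear at most once, a break after the first match may help. A minimal sketch using the A.txt and B.txt data from the question (the function name lines_above and its margin argument are made up for illustration):

```shell
#!/bin/bash
# Hypothetical helper: print each line of B whose third column exceeds
# at least one value from A by more than a given margin.
lines_above() {
    local margin=$1 a_file=$2 b_file=$3
    awk -v margin="$margin" '
        FNR == NR { a[NR] = $1; next }      # first file: remember the values
        {
            for (i in a)
                if ($3 - a[i] > margin + 0) { print; break }   # print once
        }
    ' "$a_file" "$b_file"
}

dir=$(mktemp -d)
printf '%s\n' 100 222 398 > "$dir/A.txt"
printf '%s\n' '1 2 103 2' '4 5 1026 74' '7 8 209 55' '10 11 122 78' > "$dir/B.txt"
lines_above 10 "$dir/A.txt" "$dir/B.txt"
rm -rf "$dir"
```

With this data the first line of B.txt is dropped, since 103 exceeds none of the values in A.txt by more than 10, and the other three lines are printed once each.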
