I'm using the following command to match the following line in a file:
cat file.txt | grep -A 1 'match' > output.txt
This allows me to get the line after 'match' is found in the following file:
match
random text line 1
match
match
match
random text line 2
match
random text line 3
match
match
random text line 4
match
random text line 5
match
random text line 6
match
random text line 7
match
random text line 8
match
match
random text line 9
However, I need to return only the lines after 2 or more consecutive 'match' lines. In this case the output would be:
random text line 2
random text line 4
random text line 9
I have tried using a combination of grep -A 2 'match' | grep -A 1 'match' but it doesn't work as it's redundant. I'm stuck on how to match only if there are two consecutive lines. I'm open to using awk or sed for matching too if it's more efficient. Any direction would be greatly appreciated.
grep stands for g/re/p, i.e. Globally search for a Regular Expression and Print the matching lines, and that is all it does. You are trying to do more than that, so grep is the wrong tool for the job. For general-purpose text manipulation the standard tool is awk. The script below counts consecutive 'match' lines in c, prints the first non-matching line only if c is greater than 1, and then resets c:
$ awk '/match/{c++;next} c>1; {c=0}' file
random text line 2
random text line 4
random text line 9
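The same logic can be spelled out with comments and a configurable run length. This is only a sketch: the function name after_run and the variable min are made up for illustration, and min=2 should behave like the one-liner above.

```shell
# after_run MIN: read stdin and print each non-"match" line that directly
# follows a run of at least MIN consecutive "match" lines.
# (after_run and min are hypothetical names, not part of the original answer.)
after_run() {
    awk -v min="$1" '
        /match/ { run++; next }   # count the matching line and print nothing
        run >= min                # first line after a long enough run: print it
        { run = 0 }               # any non-matching line ends the run
    '
}

result=$(printf '%s\n' match 'text 1' match match 'text 2' \
                       match match match 'text 3' | after_run 2)
printf '%s\n' "$result"
```

With min set to 2 this prints "text 2" and "text 3" and skips "text 1", which follows a single match.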
Is it possible using sed to replace the first occurrence of a character or substring in line of file only if it is the first 2 characters in the line?
For example we have this text file:
15 hello
15 h15llo
1 hello
1 h15loo
Using the following command: sed -i 's/15/0/' file.txt
Will give this output
0 hello
0 h15llo
1 hello
1 h0loo
What I am trying to avoid is it considering the characters past the first 2.
Is this possible?
Desired output:
0 hello
0 h15llo
1 hello
1 h15loo
You can use
sed -i 's/^15 /0 /' file.txt
sed -i 's/^15\([[:space:]]\)/0\1/' file.txt
sed -i 's/^15\(\s\)/0\1/' file.txt
Here, ^ matches the start of the line, 15 matches the literal substring 15, and the space matches a literal space.
The second and third solutions are equivalent to each other: instead of a literal space, they capture a whitespace character into Group 1 and put it back into the result with the \1 placeholder. Note that \s is a GNU sed extension; [[:space:]] is the portable POSIX form.
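Before editing the file in place, it may help to preview the substitution on standard input. The sketch below feeds the four sample lines from the question through the [[:space:]] variant without -i, so nothing on disk is changed.

```shell
# Preview the anchored substitution on the question's sample lines.
# No -i here: sed only writes to stdout, the input is left alone.
result=$(printf '%s\n' '15 hello' '15 h15llo' '1 hello' '1 h15loo' |
    sed 's/^15\([[:space:]]\)/0\1/')
printf '%s\n' "$result"
```

Only a 15 at the very start of a line and followed by whitespace is replaced, so "1 h15loo" comes through unchanged. Also note that -i with no argument is the GNU form; BSD/macOS sed expects a backup suffix after -i (which may be empty, as in sed -i '').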
My text file contains runs of 3 or more spaces. I want to replace each run of 3 or more spaces with a comma, but leave runs of fewer than 3 spaces untouched.
ex:
input:
a b   3   c   d   6   9
output:
a b,3,c,d,6,9
You can do it easily with sed:
$ sed -r 's/ {3,}/,/g' file
a b 3,c,d,6,9
The -r flag tells sed to use extended regular expression syntax, which lets us write the {min,max} interval operator in the s/// search/replace command without backslashes. -r is the GNU spelling; -E does the same thing and also works with BSD/macOS sed. With it we say: for each occurrence (note the g, or global, flag at the end) of the space character repeated 3 or more times (no upper limit), replace it with a comma. All other characters pass through unchanged.
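A quick way to check the behaviour is to run the substitution on a literal string. The input below is an assumed example with one space between a and b and three or more spaces everywhere else; -E is used in place of -r for portability.

```shell
# One space between "a" and "b"; runs of 3 to 5 spaces between the rest.
line='a b   3   c    d   6     9'
result=$(printf '%s\n' "$line" | sed -E 's/ {3,}/,/g')
printf '%s\n' "$result"   # a b,3,c,d,6,9
```

The single space survives because it is shorter than the {3,} minimum, while every longer run collapses to one comma.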
I have a huge file.
On every line from line 3 to the second-to-last line (number of lines in the file - 1),
starting at character position 75, I need to change the string to 123456789.
Any thoughts or suggestions? The existing characters differ from line to line, so I can't search for them.
The joys of hiding PII data.
In vim, you can do this:
:3,$-1s/\(^.\{74\}\)\@<=........./123456789/
which does a lookbehind (\@<=) for the first 74 characters of the line and replaces the nine characters that follow, i.e. positions 75 to 83, with your string. The 3,$-1 range limits the substitution to line 3 through the second-to-last line.
Let's consider this test file:
$ cat testfile
.........-.........-.........-.........-.........-.........-.........-....ReplaceMeKeep
.........-.........-.........-.........-.........-.........-.........-....OldData..Keep
Using sed
This replaces up to nine characters, starting at column 75, with 123456789:
$ sed -E 's/(.{74}).{0,9}/\1123456789/' testfile
.........-.........-.........-.........-.........-.........-.........-....123456789Keep
.........-.........-.........-.........-.........-.........-.........-....123456789Keep
Using awk
This puts the new string in place of the first nine characters starting at position 75:
$ awk '{print substr($0,1,74) "123456789" substr($0,75+9)}' testfile
.........-.........-.........-.........-.........-.........-.........-....123456789Keep
.........-.........-.........-.........-.........-.........-.........-....123456789Keep
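The question asks for line 3 through the second-to-last line only, while the sed and awk commands above touch every line. One possible way to add that restriction in sed is to nest a negated last-line address ($!) inside a 3,$ range. This is a sketch: mask_cols is a hypothetical helper name, and it uses .{9} rather than .{0,9}, so lines shorter than 83 characters are left alone.

```shell
# mask_cols [FILE...]: on lines 3 .. last-1, overwrite columns 75-83.
# (mask_cols is a made-up name for this sketch.)
mask_cols() {
    sed -E '3,${
        $!s/^(.{74}).{9}/\1123456789/
    }' "$@"
}

prefix=$(printf '%074d' 0)        # 74 filler characters (zeros)
line="${prefix}ReplaceMeKeep"
result=$(printf '%s\n' "$line" "$line" "$line" "$line" "$line" | mask_cols)
printf '%s\n' "$result"
```

On this five-line input only lines 3 and 4 change; lines 1, 2 and 5 are printed as they were.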
I'm using grep to display all lines that have ONLY 4,5,6,7 and 9 in the zipcode column.
How do I display only the lines of the file that contain the numbers 4,5,6,7 and 9 in the zipcode field?
A sample row is:
15 m jagger mick 41 4th 95115
Thanks
I am going to assume you meant "How do I use grep to..."
If all of the lines in the file have a 5 digit zip at the end of each line, then:
grep -E '[45679]{5}$' filename
Should give you what you want.
If there might be whitespace between the zip and the end of the line, then:
grep -E '[45679]{5}[[:space:]]*$' filename
would be more robust.
If the problem is more general than that, please describe it more accurately.
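The following regex should fetch the desired result when the zipcode is the last field. The leading (^|[[:space:]]) requires the whole field to consist of the allowed digits; without it, a zipcode such as 95115 would match on its final 5 alone:
grep -E '(^|[[:space:]])[45679]+$' file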
If by "grep" you mean, "the correct tool", then the solution you seek is:
awk '$7 ~ /^[45679]+$/' input
This will print all lines of input in which the 7th field consists only of the characters 4,5,6,7, and 9. The + (rather than *) requires at least one digit, so lines with an empty or missing 7th field are not printed. If you want to specify 'the last column' rather than the 7th, try
awk '$NF ~ /^[45679]+$/' input
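To see the field test in action, here is a small sketch. The first row is the sample from the question; the other two rows are invented purely for illustration. It uses + so that an empty last field cannot match.

```shell
# Row 1 is the question's sample; rows 2 and 3 are made-up test data.
rows='15 m jagger mick 41 4th 95115
16 f doe jane 30 5th 94567
17 m roe rick 52 6th 45679'
result=$(printf '%s\n' "$rows" | awk '$NF ~ /^[45679]+$/')
printf '%s\n' "$result"
```

Only the two invented rows are printed, because 95115 contains a 1, which is outside the allowed set.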
How can I find numbers in format 0.xzy where xzy are numbers and where x is 8-9 and write 5 lines before each match (including) to the outputfile.txt.
To find numbers in the format 0.xzy (using word boundaries rather than forcing a whole-line match), print the 5 lines preceding each match with -B5, and redirect the output to outfile:
$ grep -B5 -Ew '0\.[0-9]{3}' file > outfile
# fix x to 8 or 9
$ grep -B5 -Ew '0\.[8-9][0-9]{2}' file > outfile
Note: the . needs escaping to mean a literal period otherwise 01234 will match. You will find man grep very helpful!
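A small self-contained check of the context option, using made-up input lines and -B2 instead of -B5 to keep the output short:

```shell
# Six invented input lines; only "0.912" has 8 or 9 as its first decimal digit.
result=$(printf '%s\n' 'line 1' 'line 2' 'x 0.123' 'line 4' 'y 0.912 z' 'line 6' |
    grep -B2 -Ew '0\.[89][0-9]{2}')
printf '%s\n' "$result"
```

This prints the matching line "y 0.912 z" together with the two lines before it. "x 0.123" appears only as context, not as a match.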
Find numbers in the format 0.xzy (xzy are digits) that occupy a whole line; the dot is escaped so that it matches only a literal period:
grep '^0\.[0-9][0-9][0-9]$' file
Find cases where x is 8 or 9:
grep '^0\.[89][0-9][0-9]$' file