Grep find lines that have 4,5,6,7 and 9 in zip code column - linux

I'm using grep to display all lines that have ONLY 4,5,6,7 and 9 in the zipcode column.
How do i display only the lines of the file that contain the numbers 4,5,6,7 and 9 in the zipcode field?
A sample row is:
15 m jagger mick 41 4th 95115
Thanks

I am going to assume you meant "How do I use grep to..."
If all of the lines in the file have a 5 digit zip at the end of each line, then:
egrep "[45679]{5}$" filename
Should give you what you want.
If there might be whitespace between the zip and the end of the line, then:
egrep "[45679]{5}[[:space:]]*$" filename
would be more robust.
If the problem is more general than that, please describe it more accurately.

Following regex should fetch you desired result:
egrep "[45679]+$" file

If by "grep" you mean, "the correct tool", then the solution you seek is:
awk '$7 ~ /^[45679]*$/' input
This will print all lines of input in which the 7th field consists only of the characters 4,5,6,7, and 9. If you want to specify 'the last column' rather than the 7th, try
awk '$NF ~ /^[45679]*$/' input

Related

Filtering by author and counting all numbers im txt file - Linux terminal, bash

I need help with two hings
1)the file.txt has the format of a list of films
, in which they are authors in different lines, year of publication, title, e.g.
author1
year1
title1
author2
year2
title2
author3
year3
title3
author4
year4
title4
I need to show only book titles whose author is "Joanne Rowling"
2)
one.txt contains numbers and letters for example like:
dada4dawdaw54 232dawdawdaw 53 34dadasd
77dkwkdw
65 23 laka 23
I need to sum all of them and receive score - here it should 561
I tried something like that:
awk '{for(i=1;i<=NF;i++)s+=$i}END{print s}' plik2.txt
but it doesn't make sense
For the 1st question, the solution of okulkarni is great.
For the 2nd question, one solution is
sed 's/[^0-9]/ /g' one.txt | awk '{for(i=1;i<=NF;i++) sum+= $i} END { print sum}'
The sed command converts all non-numeric characters into spaces, while the awk command sums the numbers, line by line.
For the first question, you just need to use grep. Specifically, you can do grep -A 2 "Joanne Rowling" file.txt. This will show all lines with "Joanne Rowling" and the two lines immediately after.
For the second question, you can also use grep by doing grep -Eo '[0-9]+' | paste -sd+ | bc. This will put a + between every number found by grep and then add them up using bc.

Extract substring from first column

I have a large text file with 2 columns. The first column is large and complicated, but contains a name="..." portion. The second column is just a number.
How can I produce a text file such that the first column contains ONLY the name, but the second column stays the same and shows the number? Basically, I want to extract a substring from the first column only AND have the 2nd column stay unaltered.
Sample data:
application{id="1821", name="app-name_01"} 0
application{id="1822", name="myapp-02", optionalFlag="false"} 1
application{id="1823", optionalFlag="false", name="app_name_public"} 3
...
So the result file would be something like this
app-name_01 0
myapp-02 1
app_name_public 3
...
If your actual Input_file is same as the shown sample then following code may help you in same.
awk '{sub(/.*name=\"/,"");sub(/\".* /," ")} 1' Input_file
Output will be as follows.
app-name_01 0
myapp-02 1
app_name_public 3
Using GNU awk
$ awk 'match($0,/name="([^"]*)"/,a){print a[1],$NF}' infile
app-name_01 0
myapp-02 1
app_name_public 3
Non-Gawk
awk 'match($0,/name="([^"]*)"/){t=substr($0,RSTART,RLENGTH);gsub(/name=|"/,"",t);print t,$NF}' infile
app-name_01 0
myapp-02 1
app_name_public 3
Input:
$ cat infile
application{id="1821", name="app-name_01"} 0
application{id="1822", name="myapp-02", optionalFlag="false"} 1
application{id="1823", optionalFlag="false", name="app_name_public"} 3
...
Here's a sed solution:
sed -r 's/.*name="([^"]+).* ([0-9]+)$/\1 \2/g' Input_file
Explanation:
With the parantheses your store in groups what's inbetween.
First group is everything after name=" till the first ". [^"] means "not a double-quote".
Second group is simply "one or more numbers at the end of the line preceeded with a space".

How to use grep or awk to process a specific column ( with keywords from text file )

I've tried many combinations of grep and awk commands to process text from file.
This is a list of customers of this type:
John,Mills,81,Crescent,New York,NY,john#mills.com,19/02/1954
I am trying to separate these records into two categories, MEN and FEMALES.
I have a list of some 5000 Female Names , all in plain text , all in one file.
How can I "grep" the first column ( since I am only matching first names) but still printing the entire customer record ?
I found it easy to "cut" the first column and grep --file=female.names.txt, but this way it's not going to print the entire record any longer.
I am aware of the awk option but in that case I don't know how to read the female names from file.
awk -F ',' ' { if($1==" ???Filename??? ") print $0} '
Many thanks !
You can do this with Awk:
awk -F, 'NR==FNR{a[$0]; next} ($1 in a)' female.names.txt file.csv
Would print the lines of your csv file that contain first names of any found in your file female.names.txt.
awk -F, 'NR==FNR{a[$0]; next} !($1 in a)' female.names.txt file.csv
Would output lines not found in female.names.txt.
This assumes the format of your female.names.txt file is something like:
Heather
Irene
Jane
Try this:
grep --file=<(sed 's/.*/^&,/' female.names.txt) datafile.csv
This changes all the names in the list of female names to the regular expression ^name, so it only matches at the beginning of the line and followed by a comma. Then it uses process substitution to use that as the file to match against the data file.
Another alternative is Perl, which can be useful if you're not super-familiar with awk.
#!/usr/bin/perl -anF,
use strict;
our %names;
BEGIN {
while (<ARGV>) {
chomp;
$names{$_} = 1;
}
}
print if $names{$F[0]};
To run (assume you named this file filter.pl):
perl filter.pl female.names.txt < records.txt
So, I've come up with the following:
Suppose, you have a file having the following lines in a file named test.txt:
abe 123 bdb 532
xyz 593 iau 591
Now you want to find the lines which include the first field having the first and last letters as vowels. If you did a simple grep you would get both of the lines but the following will give you the first line only which is the desired output:
egrep "^([0-z]{1,} ){0}[aeiou][0-z]+[aeiou]" test.txt
Then you want to the find the lines which include the third field having the first and last letters as vowels. Similary, if you did a simple grep you would get both of the lines but the following will give you the second line only which is the desired output:
egrep "^([0-z]{1,} ){2}[aeiou][0-z]+[aeiou]" test.txt
The value in the first curly braces {1,} specifies that the preceding character which ranges from 0 to z according to the ASCII table, can occur any number of times. After that, we have the field separator space in this case. Change the value within the second curly braces {0} or {2} to the desired field number-1. Then, use a regular expression to mention your criteria.

Increment numbers within string using awk and sed

I have a text file that has about 500 rows of information.
I am adding a few strings to the beginning of each line separated by a comma (Excel recognizes it as another column).
I have this code so far:
sed -e "2,$s#^# =HYPERLINK(B2,C2), https://otrs.city.pittsburgh.pa.us/index.pl?Action=AgentTicketZoom;TicketID=#"** C:\Users\hd\Desktop\newaction.txt > C:\Users\hd\Desktop\test.txt
I have a columns want. Once column is adding on a link to a previous column (easy enough)
Which will be a formula(string) in the first column is =HYPERLINK(B2,C2) and I want to increment the 2's to 3's,4's and so on.
Example:
=HYPERLINK(B2,C2)
=HYPERLINK(B3,C3)
=HYPERLINK(B4,C4)
=HYPERLINK(B5,C5)
=HYPERLINK(B6,C6)
It is my second day coding with sed and awk.
Is there any way I can make this happen using awk and sed?
This Perl one-liner:
perl -pe "BEGIN{$i = 2} s#^#=HYPERLINK(B${i},C${i})#; $i++" "input.txt"
will add =HYPERLINK(B2,C2) to the front of each line and increment the numbers each time.

How to find numbers of specific format and filter out certain range of values?

How can I find numbers in format 0.xzy where xzy are numbers and where x is 8-9 and write 5 lines before each match (including) to the outputfile.txt.
To find numbers in the format 0.xzy (using word boundaries not forcing whole line match) and print the 5 line preceding the match -B5 and redirect the output to outfile:
$ grep -B5 -Ew '0\.[0-9]{3}' file > outfile
# fix x to 8 or 9
$ grep -B5 -Ew '0\.[8-9][0-9]{2}' file > outfile
Note: the . needs escaping to mean a literal period otherwise 01234 will match. You will find man grep very helpful!
Find numbers in format 0.xzy (xzy are numbers)
grep "^0.[0-9][0-9][0-9]$" file
Find cases where x is 8-9
grep "^0.[89][0-9][0-9]$" file

Resources