Create table from multiple columns in linux, but treat fields 2, 3 (and possibly 4) as one column - linux

Let's say I have the following file:
al.pacino Al Pacino
jerry.seinfeld Jerry Seinfeld
chad.murray Chad Michael Murray
I want to create a nice table with only two columns and treat the first name/middle name/last name as one column, like this:
al.pacino       Al Pacino
jerry.seinfeld  Jerry Seinfeld
chad.murray     Chad Michael Murray
The problem is that if I use the "column -t" command, each field will be treated as an individual column, which is not what I want:
al.pacino       Al     Pacino
jerry.seinfeld  Jerry  Seinfeld
chad.murray     Chad   Michael   Murray

Insert a tab between the two columns with sed. Feed this one-liner the input on stdin; the output will be two tab-delimited columns:
sed -r 's/ +/\t/'
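Combined with column -t and a tab delimiter, the same sed substitution keeps the full name intact as a single column. A minimal sketch (the printf lines just recreate the sample input; note that -r is GNU sed, BSD/macOS sed uses -E instead):

```shell
# Replace only the FIRST run of spaces with a tab (no /g flag),
# then tell column to align on tabs only, so the full name
# "Chad Michael Murray" stays in one column.
printf '%s\n' \
  'al.pacino Al Pacino' \
  'jerry.seinfeld Jerry Seinfeld' \
  'chad.murray Chad Michael Murray' |
  sed -r 's/ +/\t/' | column -t -s $'\t'
```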

I managed to do it with AWK, by reading the content from two variables (first one with login names and second with full names):
awk 'NR==FNR { a[FNR] = $0 ; if (length > max) max = length ; next } { printf "%-*s %s\n", max, a[FNR], $0 }' <(echo "${login_names}") <(echo "${full_names}")
The result is a nice looking, clean table no matter how different in length the names are:
christopher.reeve    Christopher Reeve
al.pacino            Al Pacino
jerry.seinfeld       Jerry Seinfeld
benedict.cumberbatch Benedict Cumberbatch
chad.murray          Chad Michael Murray
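If the login and full names start out in a single file rather than in two variables, a hypothetical people.txt can be split with cut before feeding the same awk one-liner (a sketch; the %-*s dynamic width specifier is supported by gawk, and the process substitutions require bash):

```shell
# Hypothetical setup: people.txt holds "login first [middle] last" lines.
# cut splits it into the two streams the awk one-liner expects.
login_names=$(cut -d' ' -f1 people.txt)   # field 1 only
full_names=$(cut -d' ' -f2- people.txt)   # everything after field 1

# Pass 1 stores the logins and tracks the longest; pass 2 pads each
# login to that width before appending the full name.
awk 'NR==FNR { a[FNR] = $0; if (length > max) max = length; next }
     { printf "%-*s %s\n", max, a[FNR], $0 }' \
    <(echo "${login_names}") <(echo "${full_names}")
```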

Related

Filtering by author and counting all numbers in txt file - Linux terminal, bash

I need help with two things.
1) file.txt has the format of a list of films, in which the author, year of publication, and title each appear on a separate line, e.g.
author1
year1
title1
author2
year2
title2
author3
year3
title3
author4
year4
title4
I need to show only book titles whose author is "Joanne Rowling"
2)
one.txt contains numbers and letters, for example:
dada4dawdaw54 232dawdawdaw 53 34dadasd
77dkwkdw
65 23 laka 23
I need to sum all of them and get the total - here it should be 561
I tried something like this:
awk '{for(i=1;i<=NF;i++)s+=$i}END{print s}' plik2.txt
but it doesn't work
For the 1st question, the solution of okulkarni is great.
For the 2nd question, one solution is
sed 's/[^0-9]/ /g' one.txt | awk '{for(i=1;i<=NF;i++) sum+= $i} END { print sum}'
The sed command converts all non-numeric characters into spaces, while the awk command sums the numbers, line by line.
For the first question, you just need to use grep. Specifically, you can do grep -A 2 "Joanne Rowling" file.txt. This will show all lines with "Joanne Rowling" and the two lines immediately after.
For the second question, you can also use grep by doing grep -Eo '[0-9]+' one.txt | paste -sd+ - | bc. This will put a + between every number found by grep and then add them up using bc.
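For reference, here is the grep/paste/bc pipeline run end to end on the sample above (a sketch; the printf lines recreate one.txt, and the explicit - operand keeps BSD paste happy). Note that the digit runs in this exact sample actually sum to 565, so the expected 561 in the question looks like a small miscount:

```shell
# Recreate the sample input.
printf '%s\n' \
  'dada4dawdaw54 232dawdawdaw 53 34dadasd' \
  '77dkwkdw' \
  '65 23 laka 23' > one.txt

# Extract each maximal run of digits, join them with '+',
# and let bc evaluate the resulting expression.
grep -Eo '[0-9]+' one.txt | paste -sd+ - | bc
```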

Compare specific parts of two columns in a text file in Linux

I have a text file with several columns separated by tab character as below:
1 ATGCCCAGA AS:i:10 XS:i:10
2 ATGCTTGA AS:i:10 XS:i:5
3 ATGGGGGA AS:i:10 XS:i:1
4 ATCCCCGA AS:i:20 XS:i:20
I now want to compare the last two columns AS:i:(n1) and XS:i:(n2) to obtain only lines with n1 different to n2. So, my desired output would be:
2 ATGCTTGA AS:i:10 XS:i:5
3 ATGGGGGA AS:i:10 XS:i:1
Could you suggest some ways to compare n1 and n2 and print out the output? Thanks in advance.
As Shawn says, you could do this in awk, perl, or sed.
An AWK example might be
awk '{split($3,a,":");split($4,b,":");if(a[3]!=b[3]) print $0}' infile.txt
If you are familiar with awk, this should be fairly self-explanatory.
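A variant without split(), assuming the AS:i:/XS:i: prefixes are always exactly five characters, compares everything from character six onward:

```shell
# Print lines where the number after "AS:i:" differs from the one
# after "XS:i:"; substr($3, 6) skips the five-character prefix.
# The bare condition with no action block prints matching lines.
awk 'substr($3, 6) != substr($4, 6)' infile.txt
```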

save multiple matches in a list (grep or awk)

I have a file that looks something like this:
# a mess of text
Hello. Student Joe Deere has
id number 1. Over.
# some more messy text
Hello. Student Steve Michael Smith has
id number 2. Over.
# etc.
I want to record the pairs (Joe Deere, 1), (Steve Michael Smith, 2), etc. into a list (or two separate lists with the same order). Namely, I will need to loop over those pairs and do something with the names and ids.
(names and ids are on distinct lines, but come in the order: name1, id1, name2, id2, etc. in the text). I am able to extract the lines of interest with
VAR=$(awk '/Student/,/Over/' filename.txt)
I think I know how to extract the names and ids with grep, but it will give me the result as one big block like
`Joe Deere 1 Steve Michael Smith 2 ...`
(and maybe even with a separator between names and ids). I am not sure at this point how to go forward with this, and in any case it doesn't feel like the right approach.
I am sure that there is a one-liner in awk that will do what I need. The possibilities are infinite and the documentation monumental.
Any suggestion?
$ cat tst.awk
/^id number/ {
gsub(/^([^ ]+ ){2}| [^ ]+$/,"",prev)
printf "(%s, %d)\n", prev, $3
}
{ prev = $0 }
$ awk -f tst.awk file
(Joe Deere, 1)
(Steve Michael Smith, 2)
Could you please try the following too.
awk '
/id number/{
sub(/\./,"",$3)
print val", "$3
val=""
next
}
{
gsub(/Hello\. Student | has.*/,"")
val=$0
}
' Input_file
grep -oP 'Hello. Student \K.+(?= has)|id number \K\d+' file | paste - -
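Since the goal is to loop over the pairs, the tab-separated output of that grep/paste one-liner can drive a while read loop directly (a sketch; the printf body is a placeholder for the real per-pair processing, and -P requires GNU grep built with PCRE support):

```shell
# Each pipeline line is "name<TAB>id"; setting IFS to a tab makes
# read split it into exactly two variables, even for multi-word names.
while IFS=$'\t' read -r name id; do
    printf 'name=%s id=%s\n' "$name" "$id"   # placeholder action
done < <(grep -oP 'Hello\. Student \K.+(?= has)|id number \K\d+' filename.txt | paste - -)
```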

How to sort lines in textfile according to a second textfile

I have two text files.
File A.txt:
john
peter
mary
alex
cloey
File B.txt
peter does something
cloey looks at him
franz is the new here
mary sleeps
I'd like to
merge the two
sort one file according to the other
put the unknown lines of B at the end
like this:
john
peter does something
mary sleeps
alex
cloey looks at him
franz is the new here
$ awk '
NR==FNR { b[$1]=$0; next }
{ print ($1 in b ? b[$1] : $1); delete b[$1] }
END { for (i in b) print b[i] }
' fileB fileA
john
peter does something
mary sleeps
alex
cloey looks at him
franz is the new here
The above will print the remaining items from fileB in a "random" order (see http://www.gnu.org/software/gawk/manual/gawk.html#Scanning-an-Array for details). If that's a problem then edit your question to clarify your requirements for the order those need to be printed in.
It also assumes the keys in each file are unique (e.g. peter only appears as a key value once in each file). If that's not the case then again edit your question to include cases where a key appears multiple times in your sample input/output and additionally explain how you want them handled.
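If the leftover fileB lines should instead come out in their original file order, one variant of the same script records that order in a second array while reading fileB:

```shell
# Pass 1: index fileB by its first word and remember line order.
# Pass 2: print fileA entries, substituting the fileB line when one
# exists. END: print unmatched fileB lines in file order, not awk's
# arbitrary array-scanning order.
awk '
NR==FNR { b[$1] = $0; order[++n] = $1; next }
{ print ($1 in b ? b[$1] : $1); delete b[$1] }
END { for (i = 1; i <= n; i++) if (order[i] in b) print b[order[i]] }
' fileB fileA
```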

Reflecting the sort of one file in another

I have two files, say f1 and f2.
f1 has a list of items that can't be compared (they are all alphanumeric, each on its own line). Its companion file f2 has a list of items that can be compared, each on its own line.
I have sorted f2 in reverse order to produce a file f3. I want to reflect this in f1 to produce a file f4.
Example:
f1:
Dan
Sam
James
f2:
3
1
2
f3 (which is a reverse sort of f2):
3
2
1
I want f4 to be:
Dan
James
Sam
I hope this example illustrates what I'm trying to achieve.
Here's a quick and dirty way using the paste command. It should work if your files are simple.
% cat numbers.txt
3
1
2
% cat names.txt
Dan
Sam
James
% paste numbers.txt names.txt | sort -nr | awk -F'\t' '{print $2}'
Dan
James
Sam
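The awk at the end only drops the key column, so cut does the same job with less quoting (a sketch using the same numbers.txt/names.txt pair):

```shell
# paste glues the sort keys onto the names (tab-separated by default),
# sort -nr orders the joined lines numerically descending,
# and cut keeps only the second (name) column.
paste numbers.txt names.txt | sort -nr | cut -f2
```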
