How to sort a column with $, ',' and '.' signs on the bash command line? - linux

I have a file, and I want to use something like the cat command on that file to print out a sorted list.
For example, a column looks like this:
Mike $1.00
Mason $1,000,000.00
Tyler $100,000.00
Nick $0.10
Result
Nick $0.10
Mike $1.00
Tyler $100,000.00
Mason $1,000,000.00

You can try this:
sort -t'$' -nk2 fileName
Description:
-t'$' : use $ as the field separator
-nk2 : sort numerically using column 2
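One caveat: in the C locale, numeric sort stops reading at the first comma, so "$1,000,000.00" is compared as 1 and may land before "$100,000.00". A sketch that avoids this by building a comma-free sort key, sorting on it, and then discarding it (fileName is the question's file):

```shell
# Build a comma-free numeric key from the amount after the '$',
# sort on that key, then drop the key column.
awk -F'$' '{ key = $2; gsub(/,/, "", key); print key "\t" $0 }' fileName \
  | sort -n \
  | cut -f2-
```

This works regardless of locale because the commas never reach sort.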

Related

Printing First Variable in Awk but Only If It's Less than X

I have a file with words, and I need to print only the lines that are less than or equal to 4 characters, but I'm having trouble with my code. There is other text at the end of the lines, but I shortened it here.
file:
John Doe
Jane Doe
Mark Smith
Abigail Smith
Bill Adams
What I want to do is print the names that are four characters or fewer.
What I've tried:
awk '$1 <= 4 {print $1}' inputfile
What I'm hoping to get:
John
Jane
Mark
Bill
So far, I've got nothing. Either it prints out everything, with no length restriction, or it doesn't print anything at all. Could someone take a look at this and see what they think?
Thanks
First, let's understand why
awk '$1 <= 4 {print $1}' inputfile
gives you the whole inputfile. $1 <= 4 is a numeric comparison, so it prompts GNU AWK to convert the first column's value to a number. But what is the numeric value of, say,
John
? As the GNU AWK manual's Strings And Numbers section puts it:
A string is converted to a number by interpreting any numeric prefix
of the string as numerals (...) Strings that can’t be interpreted as
valid numbers convert to zero.
Therefore the numeric value of John, from GNU AWK's point of view, is zero.
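You can see this conversion rule directly, since awk applies it whenever a string appears in arithmetic:

```shell
# How GNU AWK coerces strings to numbers in a numeric context:
# adding 0 forces the conversion described in the manual.
awk 'BEGIN { print "John" + 0; print "3.5kg" + 0; print "42abc" + 0 }'
# John  -> 0    (no numeric prefix)
# 3.5kg -> 3.5  (numeric prefix "3.5")
# 42abc -> 42   (numeric prefix "42")
```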
In order to get the desired output, you might use the length function, which returns the number of characters, as follows:
awk 'length($1)<=4{print $1}' inputfile
or, alternatively, pattern matching from 0 to 4 characters, that is:
awk '$1~/^.{0,4}$/{print $1}' inputfile
where $1~ means check whether the 1st field matches, . denotes any character, {0,4} means from 0 to 4 repetitions, ^ the beginning of the string, and $ the end of the string (these last two are required, as otherwise longer strings would also match, since they contain a substring matching .{0,4}).
Both commands, for the inputfile
John Doe
Jane Doe
Mark Smith
Abigail Smith
Bill Adams
give output
John
Jane
Mark
Bill
(tested in gawk 4.2.1)

Create table from multiple columns in linux, but treat fields 2, 3 (and possibly 4) as one column

Let's say I have the following file:
al.pacino Al Pacino
jerry.seinfeld Jerry Seinfeld
chad.murray Chad Michael Murray
I want to create a nice table with only two columns and treat the first name/middle name/last name as one column, like this:
al.pacino        Al Pacino
jerry.seinfeld   Jerry Seinfeld
chad.murray      Chad Michael Murray
The problem is that if I use the "column -t" command, each field will be treated as an individual column, which is not what I want:
al.pacino       Al     Pacino
jerry.seinfeld  Jerry  Seinfeld
chad.murray     Chad   Michael   Murray
Insert a tab between the two columns with sed. Feed this one-liner the input on stdin; the output will be two tab-delimited columns:
sed -r 's/ +/\t/'
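If you also want the two columns aligned, a sketch that follows the sed step with column, telling it to split on tabs only (so the multi-word names stay in one column; assumes the util-linux column, which supports -s and -t):

```shell
# Tab-delimit first, then align on tabs only, so "Chad Michael Murray"
# is treated as a single second column.
sed -r 's/ +/\t/' file | column -s $'\t' -t
```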
I managed to do it with AWK, by reading the content from two variables (the first with the login names and the second with the full names):
awk 'NR==FNR { a[FNR] = $0 ; if (length > max) max = length ; next } { printf "%-*s %s\n", max, a[FNR], $0 }' <(echo "${login_names}") <(echo "${full_names}")
The result is a nice-looking, clean table, no matter how different in length the names are:
christopher.reeve    Christopher Reeve
al.pacino            Al Pacino
jerry.seinfeld       Jerry Seinfeld
benedict.cumberbatch Benedict Cumberbatch
chad.murray          Chad Michael Murray
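If both columns already live in one file, as in the original question, the same padding idea can be written as a two-pass read of a single file, avoiding the process substitutions. A sketch, assuming the login is the first whitespace-separated field (file.txt is a stand-in name):

```shell
# Pass 1 (NR==FNR): find the widest login.
# Pass 2: strip the login off the front and reprint it padded to that width.
awk 'NR==FNR { if (length($1) > max) max = length($1); next }
     { login = $1; sub(/^[^ ]+ +/, ""); printf "%-*s %s\n", max, login, $0 }' \
  file.txt file.txt
```

The %-*s dynamic width is the same trick the answer above uses.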

How to sort lines in textfile according to a second textfile

I have two text files.
File A.txt:
john
peter
mary
alex
cloey
File B.txt
peter does something
cloey looks at him
franz is the new here
mary sleeps
I'd like to
merge the two
sort one file according to the other
put the unknown lines of B at the end
like this:
john
peter does something
mary sleeps
alex
cloey looks at him
franz is the new here
$ awk '
NR==FNR { b[$1]=$0; next }
{ print ($1 in b ? b[$1] : $1); delete b[$1] }
END { for (i in b) print b[i] }
' fileB fileA
john
peter does something
mary sleeps
alex
cloey looks at him
franz is the new here
The above will print the remaining items from fileB in a "random" order (see http://www.gnu.org/software/gawk/manual/gawk.html#Scanning-an-Array for details). If that's a problem then edit your question to clarify your requirements for the order those need to be printed in.
It also assumes the keys in each file are unique (e.g. peter only appears as a key value once in each file). If that's not the case then again edit your question to include cases where a key appears multiple times in your sample input/output, and additionally explain how you want them handled.
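If the leftover fileB lines should keep their original fileB order rather than gawk's arbitrary array-scanning order, one sketch is to make a second pass over fileB and print only the keys fileA never mentioned:

```shell
# First: fileA's order, substituting matching fileB lines (as above).
awk 'NR==FNR { b[$1] = $0; next } { print ($1 in b ? b[$1] : $1) }' fileB fileA
# Then: fileB lines whose key never appears in fileA, in fileB's own order.
awk 'NR==FNR { a[$1]; next } !($1 in a)' fileA fileB
```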

Reflecting the sort of one file in another

I have two files, say f1 and f2.
f1 has a list of items that can't be compared (they are all alphanumeric, each on its own line). Its companion file f2 has a list of items that can be compared, each on its own line.
I have sorted f2 in reverse order to produce a file f3. I want to reflect this in f1 to produce a file f4.
Example:
f1:
Dan
Sam
James
f2:
3
1
2
f3 (which is a reverse sort of f2):
3
2
1
I want f4 to be:
Dan
James
Sam
I hope this example illustrates what I'm trying to achieve.
Here's a quick and dirty way using the paste command. It should work if your files are simple.
% cat numbers.txt
3
1
2
% cat names.txt
Dan
Sam
James
% paste numbers.txt names.txt | sort -nr | awk -F'\t' '{print $2}'
Dan
James
Sam
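Since paste's output is tab-delimited, the awk step can also be written with cut, and restricting the sort key to the first field makes the intent explicit. A sketch of the same pipeline:

```shell
# Pair value and name, sort numerically (reverse) on the first field
# only, then keep everything after the tab.
paste numbers.txt names.txt | sort -k1,1nr | cut -f2-
```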

Expand one column while preserving another

I am trying to get column one repeated for every value in column two, with each value on a new line.
cat ToExpand.txt
Pete horse;cat;dog
Claire car
John house;garden
My first attempt:
cat expand.awk
BEGIN {
FS="\t"
RS=";"
}
{
print $1 "\t" $2
}
awk -f expand.awk ToExpand.txt
Pete horse
cat
dog
Claire car
John
garden
The desired output is:
Pete horse
Pete cat
Pete dog
Claire car
John house
John garden
Am I on the right track here or would you use another approach? Thanks in advance.
You could also change the FS value into a regex and do something like this:
awk -F"\t|;" -v OFS="\t" '{for(i=2;i<=NF;i++) print $1, $i}' ToExpand.txt
Pete horse
Pete cat
Pete dog
Claire car
John house
John garden
I'm assuming that:
The first tab is the delimiter for the name
There's only one tab delimiter. If tab-delimited data occurs after the ; section, use fedorqui's implementation.
It uses an alternate form of setting the OFS value (the -v flag) and loops over the fields after the first to print the expected output.
You can think of RS in your example as making "lines" (records, really) out of your data; your print block then acts on those records instead of on newline-delimited lines, and each record is further split into fields by FS. That's why you get the output from your first attempt. You can explore this by printing the value of NF in your example.
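If you'd rather keep the structure of your original script, a sketch that leaves RS at its newline default and instead splits the second field on ";" explicitly:

```shell
# Split field 2 on semicolons and print the name once per piece.
awk 'BEGIN { FS = "\t" }
     { n = split($2, parts, ";")
       for (i = 1; i <= n; i++) print $1 "\t" parts[i] }' ToExpand.txt
```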
Try:
awk '{gsub(/;/,ORS $1 OFS)}1' OFS='\t' file
This replaces every occurrence of a semicolon with a newline, followed by the first field and the output field separator.
Output:
Pete horse
Pete cat
Pete dog
Claire car
John house
John garden
