Reading a column from an Excel file using a shell script in a Linux environment

I need to read the 1st and last columns of an Excel file in a Linux environment.
Can someone help me with examples?

Your best bet would be to export the Excel sheet as a CSV, then manipulate it with awk or similar, such as:
awk -F"," '{print $1, $NF}' file.csv
For example:
# cat test.csv
hello, goodbye, seeya
# awk -F"," '{print $1, $NF}' test.csv
hello seeya
Edit - for info, NF is the Number of Fields, so $NF is essentially 'the last field'.
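If a command-line converter is available, the export step can be scripted too. A rough sketch, assuming LibreOffice is installed (the file name file.xlsx is a placeholder), with the awk step shown against sample data:

```shell
# Export the workbook as CSV (requires LibreOffice; writes file.csv
# to the current directory). Uncomment if LibreOffice is installed:
# libreoffice --headless --convert-to csv file.xlsx

# Sample CSV standing in for the exported sheet:
printf 'alice,42,london\nbob,7,paris\n' > file.csv

# Print the first and last comma-separated field of every row:
awk -F',' '{print $1, $NF}' file.csv
# alice london
# bob paris
```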

Related

Awk: print 3rd column if second column matches a variable

I am new to awk and Linux. I want to print the 3rd column if the 2nd column matches a variable.
file.txt
1;XYZ;123
2;ABC;987
3;ZZZ;999
So I want to print 987 after checking that the 2nd column is ABC:
name="ABC"
awk -F';' '$2==$name { print $3 }' file.txt
But this is not working. Please help. Please note, I want to use awk only, to understand how this can be achieved with awk.
In awk, variables don't work like they do in the shell: shell variables are not expanded inside the awk program, so you have to pass them in explicitly with awk's -v var_name option. The following should work:
name="ABC"
awk -F';' -v name="$name" '$2==name{ print $3 }' file.txt
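A quick end-to-end check with the sample data from the question:

```shell
# Recreate the sample file, then select the 3rd field of the row whose
# 2nd field equals the shell variable passed in via -v.
printf '1;XYZ;123\n2;ABC;987\n3;ZZZ;999\n' > file.txt
name="ABC"
awk -F';' -v name="$name" '$2==name{ print $3 }' file.txt
# 987
```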

Extract domain then paste into the same line using sed/awk/grep/perl

I've started my tech adventure not so long ago, as you will feel from the question, but now I'm stuck: after almost a whole day of thinking and searching I don't know the proper solution for my problem.
Briefly, I have a file with thousands of lines, each containing an email address and a first name. I really need another column with just the domain name, placed next to the email address. Please take a look at the examples below.
This is how it looks now:
something#nothing.tld|:|george|-|
anything#another.tld|:|thomas|-|
third#address.tld|:|kelly|-|
How I want it to look:
something#nothing.tld|:|nothing.tld|--|george|-|
anything#another.tld|:|another.tld|--|thomas|-|
third#address.tld|:|address.tld|--|kelly|-|
My best guess was using sed to extract the domain, but how to paste that extracted domain back into the same line is where I'm stuck:
sed -e 's/.*#\(.*\)|:|*/\1/'
If you could also give a short explanation along with a solution that would be really helpful.
Any help is appreciated.
If you have the following data in a file named file1,
something#nothing.tld|:|george|-|
anything#another.tld|:|thomas|-|
third#address.tld|:|kelly|-|
you can use # and : as delimiters, rebuild the line with the extra column using awk, and save it to a new file:
awk -F '[#:]' '{ print $1"#"$2 ":|" $2"--" $3 }' file1 > file2
The above command saves the following data in a file called file2:
something#nothing.tld|:|nothing.tld|--|george|-|
anything#another.tld|:|another.tld|--|thomas|-|
third#address.tld|:|address.tld|--|kelly|-|
With GNU awk for gensub():
$ awk 'BEGIN{FS=OFS="|"} {print $1, $2, gensub(/.*#/,"",1,$1), "--", $3, $4, $5}' file
something#nothing.tld|:|nothing.tld|--|george|-|
anything#another.tld|:|another.tld|--|thomas|-|
third#address.tld|:|address.tld|--|kelly|-|
With any awk:
$ awk 'BEGIN{FS=OFS="|"} {d=$1; sub(/.*#/,"",d); print $1, $2, d, "--", $3, $4, $5}' file
something#nothing.tld|:|nothing.tld|--|george|-|
anything#another.tld|:|another.tld|--|thomas|-|
third#address.tld|:|address.tld|--|kelly|-|
You can do it like this with sed:
sed -E 's/#([^|]+)\|:\|/&\1|--|/' infile
Note the use of a negated bracket expression, [^|]+, i.e. match one or more of anything except the | character.
Output:
something#nothing.tld|:|nothing.tld|--|george|-|
anything#another.tld|:|another.tld|--|thomas|-|
third#address.tld|:|address.tld|--|kelly|-|
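The key to "pasting back into the same line" is the & in the sed replacement: it re-inserts the entire matched text, so the captured domain and |--| are appended right after the original #domain|:| portion. One line piped through for illustration:

```shell
# & = the whole match "#nothing.tld|:|", \1 = the captured "nothing.tld"
printf 'something#nothing.tld|:|george|-|\n' |
  sed -E 's/#([^|]+)\|:\|/&\1|--|/'
# something#nothing.tld|:|nothing.tld|--|george|-|
```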

How can I get the second column of a very large csv file using linux command?

I was given this question during an interview. I said I could do it with Java or Python, e.g. using a function like xreadlines() to traverse the whole file and fetch the column, but the interviewer wanted me to use only a Linux command. How can I achieve that?
You can use awk. Below is an example of printing the second column of a file:
awk -F, '{print $2}' file.txt
And to store it, you redirect it into a file:
awk -F, '{print $2}' file.txt > output.txt
You can use cut:
cut -d, -f2 /path/to/csv/file
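A quick check with sample data piped in rather than read from a file:

```shell
# Second comma-separated field of each line, using cut:
printf 'id,name,city\n1,alice,london\n' | cut -d, -f2
# name
# alice
```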
I'd add this to Andreas's answer, but can't comment yet.
With CSV, you have to give awk a field separator argument, or it will treat fields as bounded by whitespace instead of commas. (Obviously, a CSV that uses a different field separator will need a different character to be declared.)
awk -F, '{print $2}' file.txt

extracting the column using AWK

I am trying to extract a column using awk.
The source file is a .csv file, and below is the command I am using:
awk -F ',' '{print $1}' abc.csv > test1
Data in file abc.csv is like below:
xyz#yahoo.com,160,1,2,3
abc#ymail.com,1,2,3,160
But the data obtained in test1, when the file is downloaded from the server and opened in Notepad, looks like:
abc#ymail.comxyz#ymail.com
Notepad doesn't show newlines created on Unix. If you want to add carriage returns, try
awk -F ',' '{print $1"\r"}' abc.csv > test1
Since you're using a Windows tool to read the output, you just need to tell awk to use Windows line endings as the Output Record Separator:
awk -v ORS='\r\n' -F',' '{print $1}' file
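To confirm the result, pipe it through od -c: each output line now ends in the \r \n pair that Windows tools expect (the exact od column spacing varies by implementation):

```shell
# Print the first field with CRLF line endings, then dump the bytes:
printf 'xyz#yahoo.com,160,1,2,3\n' |
  awk -v ORS='\r\n' -F',' '{print $1}' | od -c
```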

command to find words and display columns

I want to search for some words in a log file and display only the given column numbers from the matching lines.
E.g., I want to search for "word" in abc.log and print columns 4 and 11:
grep "word" abc.log | awk '{print $4}' | awk '{print $4}'
but this doesn't work out. Can someone please help?
You need to print $4 and $11 together rather than piping $4 into another awk.
Also, you don't need grep because awk can grep.
Try it like this:
awk '/word/{print $4,$11}' abc.log
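With a made-up log line for illustration (its 4th whitespace-separated field is "four" and its 11th is "eleven"):

```shell
# Only the line containing "word" matches; fields 4 and 11 are printed.
printf 'word b c four e f g h i j eleven\nnothing matching here\n' > abc.log
awk '/word/{print $4,$11}' abc.log
# four eleven
```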
