Why does this command add \n at the last line? - linux

I'm using this command to sort and remove duplicate lines from a file.
sort file2.txt | uniq > file2_uniq.txt
After running the command, I find that the last line has this value: \n, which causes me problems. What can I do to avoid it?

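You could also let sort take care of uniquing the output. Empty lines sort first, so omitting the first line of the output removes the empty line (this assumes the file really contains one; otherwise a real line is dropped):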
sort -u file2.txt | tail -n +2
Edit
If you also want to remove all empty lines, I would suggest using:
grep -v '^$' file2.txt | sort -u

Just filter out what you don't want:
sort file2.txt | egrep -v "^$" | uniq > file2_uniq.txt

The problem was solved by removing the last line using:
sed '$d' infile > outfile
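
Putting the answers above together, here is a minimal sketch that drops empty lines before sorting, so no blank entry is left to strip afterwards. The function name sort_unique_nonblank and the sample data are assumptions made for illustration only; they are not part of the original question.

```shell
# Hypothetical helper: read lines on stdin, drop empty ones,
# then print the remaining lines sorted with duplicates removed.
sort_unique_nonblank() {
    grep -v '^$' | sort -u
}

# Sample input containing a duplicate and an empty line.
printf 'pear\n\napple\npear\n' | sort_unique_nonblank
```

For a real file this could be used as: sort_unique_nonblank < file2.txt > file2_uniq.txt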

Related

Find duplicate entries in a text file using shell

I am trying to find duplicate *.sh entries in a text file (test.log) and delete them using a shell program. Since the paths are different, uniq -u always prints the duplicate entry, even though there are two first_prog.sh entries in the text file.
cat test.log
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/first_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
output:
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
I tried a couple of ways using a few commands, but I don't know how to get the above output.
rev test.log | cut -f1 -d/ | rev | sort | uniq -d
Any clue on this?
You can use awk for this by splitting fields on / and using $NF (last field) in an associative array:
awk -F/ '!seen[$NF]++' test.log
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
awk shines for these kinds of tasks, but here is a non-awk solution:
$ sed 's|.*/|& |' file | sort -k2 -u | sed 's|/ |/|'
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
or, if your path is balanced (the same number of parents for all files)
$ sort -t/ -k5 -u file
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
awk '!/my_shellprog\/test\/first/' file
/mnt/abc/shellprog/test/first_prog.sh
/mnt/abc/shellprog/test/second_prog.sh
/mnt/abc/my_shellprog/test/third_prog.sh
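
To see which file names are duplicated before deleting anything, the same $NF idea can be turned around. This is only a sketch; the function name list_duplicate_basenames is a made-up helper, and it assumes one path per line.

```shell
# Hypothetical helper: print each file name (text after the last "/")
# that occurs more than once, reported a single time.
list_duplicate_basenames() {
    awk -F/ '++count[$NF] == 2 { print $NF }' "$@"
}

# Sample data taken from the question.
printf '%s\n' \
    /mnt/abc/shellprog/test/first_prog.sh \
    /mnt/abc/shellprog/test/second_prog.sh \
    /mnt/abc/my_shellprog/test/first_prog.sh \
    /mnt/abc/my_shellprog/test/third_prog.sh |
    list_duplicate_basenames
# prints: first_prog.sh
```

Unlike the rev | cut | rev | sort | uniq -d attempt, this needs no sorting and reports names in the order their second occurrence appears.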

Linux cut string

In Linux (CentOS) I have a file that contains additional information that I want to remove. I want to generate a new file with all characters up to the first |.
The file has the following information:
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
The output expected will be:
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
That is, all characters from the first | onward (inclusive) are removed.
Any suggestion for a script that reads File1 and generates File2 with this specific requirement?
Try
cut -d'|' -f1 oldfile > newfile
And, to round out the "big 3", here's the awk version:
awk -F\| '{print $1}' in.dat
You can use a simple sed script.
sed 's/^\([^|]*\).*/\1/g' in.dat
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
Redirect to a file to capture the output.
sed 's/^\([^|]*\).*/\1/g' in.dat > out.dat
And with grep:
$ grep -o '^[^|]*' file1
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
$ grep -o '^[^|]*' file1 > file2
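
If no external tool is wanted at all, the shell's own parameter expansion can do the same job. The following is a sketch under that assumption; strip_after_pipe is a hypothetical name, and for large files cut or awk will be faster than a shell loop.

```shell
# Hypothetical helper: print everything before the first "|" on each input line.
# ${line%%"|"*} removes the longest suffix that starts with "|".
strip_after_pipe() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%%"|"*}"
    done
}

# Sample data taken from the question.
printf '%s\n' 'ALFA12345|7890' 'Beta0-XPTO-2|30452|90 385|29' | strip_after_pipe
```

Usage with files would look like: strip_after_pipe < File1 > File2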

select multiple lines using the linux command sed

I have an example [file] from which I want to grab lines 3-6 and lines 11-13, sort them with a one-line command, and save the result as 3_6-11_13. These are the commands I have used so far, but I haven't gotten the desired output:
sed -n '/3/,/6/p'/11/,/13/p file_1 > file_2 | sort -k 2 > file_2 & sed -n 3,6,11,13p file_1 > file_2 | sort -k 2 file_2.
Is there a better way to shorten this? I have thought about using awk, but I have stayed with sed so far.
With sed you're allowed to specify addresses by number like so:
sed -n '3,6p'
The -n is to keep sed from automatically printing output.
Then, if you're using gsed, you can run multiple commands by separating them with semicolons (note that the input file must be given):
sed -n '3,6p; 11,13p' file_1 | sort -k2 > 3_6-11_13
sed can also combine multiple commands using the -e option:
$ sed -e 'comm' -e 'comm' file.txt
or you can separate commands using the semicolon
$ sed 'comm;comm;comm' file.txt
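
As a concrete instance of the -e form, here is a small sketch that selects the two ranges from the question. The function name select_lines and the generated sample data are assumptions for illustration.

```shell
# Hypothetical helper: print only lines 3-6 and 11-13 of the input.
# Each -e adds one sed command; -n suppresses the default printing.
select_lines() {
    sed -n -e '3,6p' -e '11,13p' "$@"
}

# Demo: feed 15 numbered lines and keep only the selected ones.
i=1
while [ "$i" -le 15 ]; do
    echo "line $i"
    i=$((i + 1))
done | select_lines
```

To reproduce the task from the question, the output can be piped on: select_lines file_1 | sort -k2 > 3_6-11_13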

egrep: find lines with no characters

I have a text file and I need to search that file and figure out how many blank lines are in it. A blank line is a line with no characters.
I must use egrep.
[aman#localhost ~]$ cat >try
sldjjsd
dkfjkjdf
dfkjdf
[aman#localhost ~]$ egrep '^$' try|wc -l
4
This will do.
egrep -c '^$' blankfile
Another way, without egrep.
echo $(($(cat blank | wc -l)-$(cat blank | tr -s "\n" | wc -l)))
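
A short sketch of the counting approach, written with grep -E, which is the equivalent spelling of egrep. The function name count_blank_lines and the sample input are assumptions for illustration.

```shell
# Hypothetical helper: count lines that contain no characters at all.
# -c prints the number of matching lines instead of the lines themselves.
count_blank_lines() {
    grep -E -c '^$' "$@"
}

# Sample input: three empty lines among three non-empty ones.
printf 'one\n\ntwo\n\n\nthree\n' | count_blank_lines
# prints: 3
```

Note that a line holding only spaces or tabs is not counted here, which matches the question's definition of a blank line.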

Omitting the first line from any Linux command output

I have a requirement to omit the 1st line from the output of ls -latr "some path", since I need to remove the total 136 line from its output.
So I wrote ls -latr /home/kjatin1/DT_901_linux//autoInclude/system | tail -q, which excluded the 1st line, but when the folder is empty it does not omit it. Please tell me how to omit the 1st line of any Linux command's output.
The tail program can do this:
ls -lart | tail -n +2
The -n +2 means “start passing through on the second line of output”.
Pipe it to awk:
awk '{if(NR>1)print}'
or sed
sed -n '1!p'
ls -lart | tail -n +2 #argument means starting with line 2
This is a quick hacky way: ls -lart | grep -v '^total'
Basically, remove any lines that start with "total", which in ls output should only be the first line.
A more general way (for anything):
ls -lart | sed "1 d"
sed "1 d" deletes the first line and prints everything else.
You can use awk command:
For command output use pipe: | awk 'NR>1'
For output of file: awk 'NR>1' file.csv
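
The tail approach can be wrapped in a small function so it can follow any command. This is a sketch only; skip_first_line is a hypothetical name and the sample text stands in for real ls output.

```shell
# Hypothetical helper: pass through everything except the first input line.
# "-n +2" tells tail to start output at line 2.
skip_first_line() {
    tail -n +2
}

# Sample input imitating "ls -l" output with its leading "total" line.
printf 'total 136\nfile_a\nfile_b\n' | skip_first_line

# With empty input nothing is printed and no error is raised.
printf '' | skip_first_line
```

Usage: ls -lart | skip_first_line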
