I have a file containing runs of consecutive pipe symbols ("|"), like this:
ANKRD54,LIAR,allergy,|||
ANKRD54,LIAR,asthma,||20447076||
ANKRD54,LIAR,autism,||||
ANKRD54,LIAR,cancer,|||
ANKRD54,LIAR,chronic_obstructive_pulmonary_disease,|||
ANKRD54,LIAR,dental_caries,||||
Is it possible, using the shell or a sed command, to replace each run of multiple pipes with a single pipe, like this:
ANKRD54,LIAR,allergy,|
ANKRD54,LIAR,asthma,|20447076|
ANKRD54,LIAR,autism,|
ANKRD54,LIAR,cancer,|
ANKRD54,LIAR,chronic_obstructive_pulmonary_disease,|
ANKRD54,LIAR,dental_caries,|
I guess the easiest way is to use the standard tr utility, whose -s option squeezes repeated characters: cat your_file | tr -s '|'
Pass your text to sed (e.g. via a pipe)
cat your_file | sed "s/|\+/|/g"
You can do that with a simple awk gsub, restricted to the fourth comma-separated field:
awk -F"," -v OFS="," '{gsub(/[|]+/,"|",$4)}1' file
See it in action:
$ cat file
ANKRD54,LIAR,allergy,|||
ANKRD54,LIAR,asthma,||20447076||
ANKRD54,LIAR,autism,||||
ANKRD54,LIAR,cancer,|||
ANKRD54,LIAR,chronic_obstructive_pulmonary_disease,|||
ANKRD54,LIAR,dental_caries,||||
$ awk -F"," -v OFS="," '{gsub(/[|]+/,"|",$4)}1' file
ANKRD54,LIAR,allergy,|
ANKRD54,LIAR,asthma,|20447076|
ANKRD54,LIAR,autism,|
ANKRD54,LIAR,cancer,|
ANKRD54,LIAR,chronic_obstructive_pulmonary_disease,|
ANKRD54,LIAR,dental_caries,|
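To compare the approaches side by side, here is a minimal sketch that builds two lines of the sample data in a temporary file (the mktemp path and the variable names are just for illustration) and squeezes the pipes with tr -s and with sed -E. Keep in mind that both of these squeeze pipes anywhere on the line, while the awk version above only touches the fourth field.

```shell
# Build a throwaway copy of the sample data (hypothetical temp file).
sample=$(mktemp)
printf '%s\n' \
  'ANKRD54,LIAR,allergy,|||' \
  'ANKRD54,LIAR,asthma,||20447076||' > "$sample"

# tr -s squeezes every run of '|' into a single '|'.
squeezed_tr=$(tr -s '|' < "$sample")

# sed -E switches to extended regexes, so '+' needs no backslash.
squeezed_sed=$(sed -E 's/[|]+/|/g' "$sample")

printf '%s\n' "$squeezed_tr"
rm -f "$sample"
```

Both variables should end up holding the same two lines, each with its pipe runs collapsed to a single pipe.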
In Linux (CentOS) I have a file that contains additional information that I want to remove. I want to generate a new file that keeps, for each line, only the characters up to the first |.
The file has the following information:
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
The expected output is:
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
That is, everything from the first | onward (including the | itself) is removed.
Any suggestion for a script that reads File1 and generates File2 with this specific requirement?
Try
cut -d'|' -f1 oldfile > newfile
And, to round out the "big 3", here's the awk version:
awk -F\| '{print $1}' in.dat
You can use a simple sed script.
sed 's/^\([^|]*\).*/\1/g' in.dat
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
Redirect to a file to capture the output.
sed 's/^\([^|]*\).*/\1/g' in.dat > out.dat
And with grep:
$ grep -o '^[^|]*' file1
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
$ grep -o '^[^|]*' file1 > file2
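For completeness, the same result can be had without any external tool, using shell parameter expansion. This is only a sketch: the input file is created on the fly with mktemp, and the names infile and result are made up for the example. A read loop is usually slower than cut on large files, so treat it as an alternative rather than a recommendation.

```shell
# Hypothetical input file created just for this sketch.
infile=$(mktemp)
printf '%s\n' \
  'ALFA12345|7890' \
  'Beta0-XPTO-2|30452|90 385|29' \
  'ZETA2334423 435; 2|2|90dd5|dddd29|dqe3' > "$infile"

# ${line%%|*} removes the longest suffix matching '|*',
# i.e. everything from the first '|' to the end of the line.
result=$(while IFS= read -r line; do
  printf '%s\n' "${line%%|*}"
done < "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

IFS= and read -r keep leading whitespace and backslashes in each line intact.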
I have a number of log files in a directory. I am trying to write a script that searches all the log files for a string and echoes the name of each file and the line number where the string is found.
I figure I will probably have to use 2 grep's - piping the output of one into the other since the -l option only returns the name of the file and nothing about the line numbers. Any insight in how I can successfully achieve this would be much appreciated.
Many thanks,
Alex
$ grep -Hn root /etc/passwd
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash
Combining -H and -n does what you expect.
If you want to echo the required information without the matching text:
$ grep -Hn root /etc/passwd | cut -d: -f1,2
/etc/passwd:1
or with awk (FILENAME and FNR stay correct when several files are given):
$ awk -F: '/root/{print "file=" FILENAME "\nline=" FNR}' /etc/passwd
file=/etc/passwd
line=1
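If you want to create shell variables in the current shell, eval the output (piping it to bash would only set them in a child process):
$ eval "$(awk -F: '/root/{print "file=" FILENAME "\nline=" FNR}' /etc/passwd)"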
$ echo $line
1
$ echo $file
/etc/passwd
Use -H. If you are using a grep that does not have -H, specify two filenames. For example:
grep -n pattern file /dev/null
My version of grep kept returning the text of the matching line, which I wasn't sure you were after. You can pipe the output to awk so that it prints ONLY the file name and line number:
grep -Hn "text" * | awk -F: '{print $1 ":" $2}'
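Putting the answers together for the "several log files in one directory" case, a sketch might look like the following. The directory, the file names and the search string ERROR are all invented for the example; it also assumes the file names contain no colons, since cut splits on them.

```shell
# Hypothetical log directory and contents, created for illustration.
logdir=$(mktemp -d)
printf 'start\nERROR disk full\n' > "$logdir/a.log"
printf 'all good\n' > "$logdir/b.log"
printf 'ERROR timeout\nretry\nERROR timeout\n' > "$logdir/c.log"

# -n adds line numbers; the extra /dev/null argument makes grep print
# file names even if the glob matches a single file.
# cut keeps only the "file:line" part of each hit.
hits=$(cd "$logdir" && grep -n 'ERROR' *.log /dev/null | cut -d: -f1,2)

printf '%s\n' "$hits"
rm -rf "$logdir"
```

Files without a match (b.log here) simply produce no output line.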
I am trying to find occurrences of a tab in a file some_file and print those lines with a leading line number.
grep -nP "\t" some_file works well for me, but I want an equivalent sed or awk command.
To emulate: grep -nP "\t" file.txt
Here's one way using GNU awk:
awk '/\t/ { print NR ":" $0 }' file.txt
Here's one way using GNU sed:
< file.txt sed -n '/\t/{ =;p }' | sed '{ N;s/\n/:/ }'
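Well, you can always do it in sed. Note that cat -n itself inserts a tab after the line number, so the pattern has to ask for a second tab:
cat -n test.txt | sed -n "/\t.*\t/p"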
Unfortunately, sed can only print line numbers to stdout followed by a newline, so in any case more than one command is necessary. A lengthier (unnecessarily so) version of the above, but one using only sed, would be:
sed = test.txt | sed -n "N;s/\n/ /;/\t/p"
but I like the one with cat more. CATS ARE NICE.
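If you would rather not depend on the tool understanding "\t" at all, one option is to pass a literal tab in from the shell. The sketch below assumes a small test file built with printf; the names tabfile, tab and numbered are hypothetical.

```shell
# Hypothetical test file: lines 2 and 4 contain a tab.
tabfile=$(mktemp)
printf 'no tab here\nhas\ttab\nplain\n\tleading tab\n' > "$tabfile"

# Keep a literal tab in a variable so no tool has to interpret "\t".
tab=$(printf '\t')

# NR is the current line number; index() returns 0 when no tab is found.
numbered=$(awk -v t="$tab" 'index($0, t) { print NR ":" $0 }' "$tabfile")

printf '%s\n' "$numbered"
rm -f "$tabfile"
```

This prints the same "number:line" layout as grep -n, one line per match.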
In an awk file, e.g. example.awk, should the header be #!/bin/bash or #!/bin/awk -f?
The reason for my question is that if I try this command in the console I receive the correct file.txt with "line of text":
awk 'BEGIN {print "line of text"}' >> file.txt
but if I try to execute the following file with ./example.awk:
#! /bin/awk -f
awk 'BEGIN {print "line of text"}' >> file.txt
it returns an error:
$ ./awk-usage.awk
awk: ./awk-usage.awk:3: awk 'BEGIN {print "line of text"}' >> file.txt
awk: ./awk-usage.awk:3: ^ invalid char ''' in expression
If I change the header to #!/bin/bash or #!/bin/sh it works.
What is my error? What is the reason for that?
Since you explicitly run the awk command, you should use #!/bin/bash. You can use #!/bin/awk -f if you remove the awk command and include only the awk program (e.g. BEGIN {print "line of text"}), but then you need to append to the file using awk syntax (print ... >> "file").
awk -f takes a file containing the awk script, so that is completely wrong here.
Your script is a shell script that happens to contain an awk command.
#! /bin/sh tells your shell to execute the file as a shell command with /bin/sh - and it is a shell command. If you replace that with #! /bin/awk -f then the file is executed with awk, basically the same as executing
/bin/awk -f awk 'BEGIN {print "line of text"}' >> file.txt
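To make the difference concrete, here is a sketch of what a pure awk version of the file could look like. The working directory is a temporary one, and the shebang path /bin/awk is taken from the question (awk may live elsewhere on your system), so the sketch runs the file through awk -f instead of relying on the shebang.

```shell
# Hypothetical working directory for the sketch.
workdir=$(mktemp -d)

# A pure awk program: no "awk '...'" wrapper, and the append
# redirection is written in awk syntax with a quoted file name.
cat > "$workdir/example.awk" <<'EOF'
#!/bin/awk -f
BEGIN { print "line of text" >> "file.txt" }
EOF

# Run it twice via "awk -f" so the append is visible; with a correct
# shebang path, chmod +x and ./example.awk would do the same.
(cd "$workdir" && awk -f example.awk && awk -f example.awk)

appended=$(cat "$workdir/file.txt")
printf '%s\n' "$appended"
rm -rf "$workdir"
```

Inside awk the shebang line is just a comment, which is why the same file works both ways.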