How to extract patterns from a text file in bash

I have a text file that contains:
toto.titi.any=val1
toto.tata.any=val2
toto.tete.any=val2
How can I extract titi, tata and tete from this file?
It should be something like this:
$ cat myfile.txt | sed '......'
and the output should be:
titi
tata
tete

Do you really need sed? You could use cut:
cut -d. -f2 filename

With awk you can do:
awk -F. '{print $2}' file

awk or cut would be a better choice for this problem.
Here are a sed line and a grep option:
sed -r 's/.*\.([^.]*)\..*/\1/'
grep -Po '\.\K[^.]*(?=\.)'

awk and cut are best for this, but you can also read the file line by line and print the portion you need:
$ while IFS=. read -r first interested others; do echo "$interested"; done < file
titi
tata
tete

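sed -n 's/^[^.]*\.\([^.]*\)\..*/\1/p' myfile.txt
This prints the second dot-separated value from every line that contains at least two dots. The dots are escaped so that they match literal dots rather than any character.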
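
To compare the answers above side by side, here is a small sketch that wraps three of them in functions reading from stdin. The function names and the sample variable are made up for illustration; the commands themselves are the ones suggested in this thread.

```shell
#!/usr/bin/env bash

# Print the second dot-separated field of each line read from stdin.
second_field_cut() { cut -d. -f2; }
second_field_awk() { awk -F. '{ print $2 }'; }
# Literal dots are escaped; lines with fewer than two dots print nothing.
second_field_sed() { sed -n 's/^[^.]*\.\([^.]*\)\..*/\1/p'; }

# Sample data from the question.
sample='toto.titi.any=val1
toto.tata.any=val2
toto.tete.any=val2'

for f in second_field_cut second_field_awk second_field_sed; do
    echo "== $f"
    printf '%s\n' "$sample" | "$f"
done
```

One difference worth knowing: cut prints a line unchanged when it contains no delimiter at all, whereas the sed version with -n and the p flag skips lines that do not match.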

Related

Linux cut string

In Linux (CentOS) I have a file that contains additional information that I want to remove. I want to generate a new file containing, for each line, all characters up to the first |.
The file has the following information:
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
The expected output is:
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
That is, everything from the first | character onward (inclusive) is removed.
Any suggestion for a script that reads File1 and generates File2 with this specific requirement?

Try
cut -d'|' -f1 oldfile > newfile

And, to round out the "big 3", here's the awk version:
awk -F\| '{print $1}' in.dat

You can use a simple sed script.
sed 's/^\([^|]*\).*/\1/g' in.dat
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
Redirect to a file to capture the output.
sed 's/^\([^|]*\).*/\1/g' in.dat > out.dat

And with grep:
$ grep -o '^[^|]*' file1
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
$ grep -o '^[^|]*' file1 > file2
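
For completeness, the same thing can be done in bash alone with parameter expansion, without calling cut, awk, sed or grep. This is only a sketch; the function name strip_after_pipe is made up, and a shell loop will be slower than cut on large files.

```shell
#!/usr/bin/env bash

# Print everything before the first '|' on each line read from stdin.
# ${line%%|*} removes the longest suffix matching '|*', i.e. everything
# from the first '|' to the end; lines without '|' are printed unchanged.
strip_after_pipe() {
    local line
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%%|*}"
    done
}

strip_after_pipe <<'EOF'
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
EOF
```

To write a new file, redirect as in the other answers: strip_after_pipe < File1 > File2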

Turning multiple lines into one comma separated line [duplicate]

I have the following data in multiple lines:
foo
bar
qux
zuu
sdf
sdfasdf
What I want to do is to convert them to one comma separated line:
foo,bar,qux,zuu,sdf,sdfasdf
What's the best unix one-liner to do that?

Using paste command:
paste -d, -s file

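Given a file containing:
aaa
bbb
ccc
ddd
xargs joins the lines with spaces:
cat file | xargs
Result:
aaa bbb ccc ddd
Piping that through sed turns the spaces into commas:
cat file | xargs | sed -e 's/ /,/g'
Result:
aaa,bbb,ccc,ddd
Note that this also turns any spaces inside a line into commas.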

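There are many ways to achieve this. The tool you use mostly depends on your own preference or experience.
Using the tr command (note that this leaves a trailing comma and no final newline):
tr '\n' ',' < somefile
Using awk (the data is passed to printf as an argument, so a % in the input is not treated as a format specifier):
awk 'NR > 1 { printf "," } { printf "%s", $0 } END { print "" }' somefile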

xargs -a your_file | sed 's/ /,/g'
This is a shorter way (-a is a GNU xargs option).

Based on your input example, this awk line works (without a trailing comma):
awk -vRS="" -vOFS=',' '$1=$1' file
test:
kent$ echo "foo
bar
qux
zuu
sdf
sdfasdf"|awk -vRS="" -vOFS=',' '$1=$1'
foo,bar,qux,zuu,sdf,sdfasdf

Perl one-liner:
perl -pe'chomp, s/$/,/ unless eof' file
or, if you want to be more cryptic:
perl '-peeof||chomp&&s/$/,/' file

sed -n 's/.*/&,/;H;$x;$s/,\n/,/g;$s/\n\(.*\)/\1/;$s/\(.*\),/\1/;$p'

perl -pi.bak -e 'unless(eof){s/\n/,/g}' your_file
This creates a backup of the original file with a .bak extension and then modifies the original file in place.
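
If you want something to drop into a script, two of the approaches above can be wrapped as functions that read stdin. This is a sketch; the names join_paste and join_awk are made up for illustration.

```shell
#!/usr/bin/env bash

# Join all lines from stdin with commas. paste -s adds no trailing comma;
# the "-" operand tells paste to read standard input.
join_paste() { paste -s -d, -; }

# awk version: print a comma before every line except the first, then a
# final newline. "%s" keeps '%' characters in the data from being
# interpreted as printf format specifiers.
join_awk() {
    awk 'NR > 1 { printf "," } { printf "%s", $0 } END { print "" }'
}

printf '%s\n' foo bar qux zuu sdf sdfasdf | join_paste
printf '%s\n' foo bar qux zuu sdf sdfasdf | join_awk
```

Unlike the xargs variants, neither function touches spaces inside a line.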

sed extract text between two patterns where second pattern may be either of one

I am trying to extract the text between pattern1 (fixed) and pattern2 (which can be either p2-1 or p2-2).
Can you please tell me how to achieve this in a single command?
The section I want starts with start and ends with either end or close.
File1:
======
junktest
data
start
stackoverflow
sed
close
File2:
======
data2
start
stackoverflow
end
I can extract text from File1 with
sed -n "/start/,/close/p"
And from File2 with
sed -n "/start/,/end/p"
I need a single sed command that handles both cases,
something like:
sed -n "/start/, /close or end /p"

Both GNU sed and BSD sed:
sed -nE '/start/,/close|end/p' file

This awk looks better
awk '/start/,/end|close/' file

sed -n -E "/Word1/,/Word2-1/p" | sed -n -E "/Word1/,/Word2-2/p"

Easy with awk:
$ awk '/start/{p=1}p{print}/end|close/{p=0}' file
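
One caveat with all of the above: the patterns are unanchored, so a line such as "weekend" would also end the block. Here is a sketch of the flag-based awk approach with the markers anchored, assuming each marker sits alone on its line (which is how the sample files look, but is my assumption). The function name extract_block is made up.

```shell
#!/usr/bin/env bash

# Print from a line that is exactly "start" through the next line that is
# exactly "end" or "close", inclusive. Reads stdin.
extract_block() {
    awk '/^start$/ { p = 1 } p { print } /^(end|close)$/ { p = 0 }'
}

# File1 from the question (ends with "close").
printf '%s\n' junktest data start stackoverflow sed close | extract_block

# File2 from the question (ends with "end").
printf '%s\n' data2 start stackoverflow end | extract_block
```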

Is there a sed or awk equivalent of grep -nP "\t" some_file?

I am trying to find the occurrences of tab characters in a file some_file and print the matching lines prefixed with their line numbers.
grep -nP "\t" some_file works well for me, but I want an equivalent sed or awk command.

To emulate: grep -nP "\t" file.txt
Here's one way using GNU awk:
awk '/\t/ { print NR ":" $0 }' file.txt
Here's one way using GNU sed:
< file.txt sed -n '/\t/{ =;p }' | sed '{ N;s/\n/:/ }'

Well, you can always do it in sed:
cat -n test.txt | sed -n "/\t/p"
Unfortunately, sed can only print line numbers to stdout followed by a newline, so in any case more than one command is necessary. A lengthier (unnecessarily so) version of the above, but one using only sed, would be:
sed = test.txt | sed -n "N;s/\n/ /;/\t/p"
but I like the one with cat more. CATS ARE NICE.
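
Another option is to avoid regular expressions entirely. The sketch below uses awk's index() function, which does a plain substring search, so it does not rely on \t being understood inside a regex (GNU sed accepts it, other sed implementations may not). The function name tab_lines is made up.

```shell
#!/usr/bin/env bash

# Print "N:line" for every line on stdin that contains a tab character.
# index() returns the position of the substring, or 0 if it is absent.
tab_lines() {
    awk 'index($0, "\t") { print NR ":" $0 }'
}

printf 'no tab here\nhas\ta tab\nplain\n\tleading tab\n' | tab_lines
```

The output format matches grep -n: the line number, a colon, then the line.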

How to filter data out of tabulated stdout stream in Bash?

Here's what output looks like, basically:
? RESTRequestParamObj.cpp
? plugins/dupfields2/_DupFields.cpp
? plugins/dupfields2/_DupFields.h
I need to get the filenames from second column and pass them to rm. There's AWK script that goes like awk '{print $2}' but I was wondering if there's another solution.

If you have spaces between the ? and the filename then:
cut -c9-
If they're tabs then:
cut -f2

I placed your output in a file:
$> cat ./text
? RESTRequestParamObj.cpp
? plugins/dupfields2/_DupFields.cpp
? plugins/dupfields2/_DupFields.h
Then edited it with sed:
$> cat ./text | sed -r -e 's/(\?[\ \t]*)(.*)/\2/g'
RESTRequestParamObj.cpp
plugins/dupfields2/_DupFields.cpp
plugins/dupfields2/_DupFields.h
Here sed matches two parts of the line:
? followed by tabs or spaces
the remaining characters until the end of the line
It then replaces the whole line with the second part.

This might work for you:
echo "? RESTRequestParamObj.cpp" | sed -e 's/^\S\+/rm /' | sh
or using GNU sed
echo "? RESTRequestParamObj.cpp"| sed -r 's/^\S+/rm /e'

Bash-only solution, assuming your output comes from stdin:
while read -r line; do echo "${line##* }"; done

Use cut or perl instead (tab is cut's default delimiter, so no delimiter option is needed):
cut -f2 | xargs rm -rf
<your output> | perl -ne '@cols = split /\t/; print $cols[1]' | xargs rm -rf
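
Since the result is fed to rm, it may be worth handling file names with spaces and previewing the commands first. Below is a sketch under a few assumptions of mine: each line of interest begins with ? followed by spaces or tabs, everything after that is the path, and the names unversioned_paths, remove_listed and DRY_RUN are made up for illustration.

```shell
#!/usr/bin/env bash

# Read status lines such as "?       path/to/file" from stdin and print
# only the path: the leading "?" and the whitespace after it are removed.
# Lines that do not start with "?" are dropped.
unversioned_paths() {
    sed -n 's/^?[[:space:]]\{1,\}//p'
}

# Delete each path listed on stdin, one per line. Reading line by line
# keeps names with spaces intact, which a plain "| xargs rm" would split.
# With DRY_RUN=1 the rm commands are only printed, not executed.
remove_listed() {
    local path
    while IFS= read -r path; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'rm -- %s\n' "$path"
        else
            rm -- "$path"
        fi
    done
}

printf '?       RESTRequestParamObj.cpp\n?       plugins/dupfields2/_DupFields.h\n' |
    unversioned_paths | DRY_RUN=1 remove_listed
```

Once the printed commands look right, drop DRY_RUN=1 to actually delete the files.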
