replace sed command text inline - linux

I have this file
file.txt
unknown#mail.com||unknown#mail.com||
unknown#mail2.com||unknown#mail2.com||
unknown#mail3.com||unknown#mail3.com||
unknown#mail4.com||unknown#mail4.com||
unknownpass
unknownpass2
unknownpass3
unknownpass4
How can I use the sed command to obtain this:
unknown#mail.com|unknownpass|unknown#mail.com|unknownpass|
unknown#mail2.com|unknownpass2|unknown#mail2.com|unknownpass2|
unknown#mail3.com|unknownpass3|unknown#mail3.com|unknownpass3|
unknown#mail4.com|unknownpass4|unknown#mail4.com|unknownpass4|

This might work for you (GNU sed):
sed ':a;N;/\n[^|\n]*$/!ba;s/||\([^|]*\)||\(\n.*\)*\n\(.*\)$/|\3|\1|\3|\2/;P;D' file
Append lines to the pattern space until its last line contains no | (that line is the first replacement value), substitute that value into the empty fields of the first line, print and delete the first line, then repeat.

Well, this does use sed anyway:
{ sed -n 5,\$p file.txt; sed 4q file.txt; } | awk 'NR<5{a[NR]=$0; next}
{$2=a[NR-4]; $4=a[NR-4]} 1' FS=\| OFS=\|

awk to the rescue!
awk 'BEGIN {FS=OFS="|"}
NR==FNR {if(NF==1) a[++c]=$1; next}
NF>4 {$2=a[FNR]; $4=$2; print}' file{,}
This is a two-pass algorithm: the first pass caches the single-field entries, and the second pass inserts them into the empty fields. It assumes the number of entries matches the number of lines to fill.
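Since the two-pass version relies on the counts matching, a quick check before running it may save some confusion. This is only a sketch: the function name counts_match and the sample data are made up for illustration, and "template line" here means a line with more than four |-separated fields, as in the answer above.

```shell
#!/bin/sh
# Hypothetical sample file reproducing the layout from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
unknown#mail.com||unknown#mail.com||
unknown#mail2.com||unknown#mail2.com||
unknownpass
unknownpass2
EOF

# Count single-field lines (values) and lines with more than four
# fields (templates). The exit status is 0 only when the counts agree.
counts_match() {
    awk -F'|' 'NF==1 {v++} NF>4 {t++} END {exit (v != t)}' "$1"
}

if counts_match "$tmp"; then
    echo "counts match"
else
    echo "counts differ" >&2
fi
rm -f "$tmp"
```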
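Here is another approach with one pass, powered by tac-wrapped awk. The cached values are handed out in the same order they were read, so the last entry pairs with the last line:
tac file |
awk 'BEGIN {FS=OFS="|"}
NF==1 {a[++c]=$1}
NF>4 {$2=a[++i]; $4=$2; print}' |
tac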

I would combine the related lines with paste and reshuffle the elements with awk (I assume the related lines are exactly half a file away):
n=$(wc -l < file.txt)
paste -d'|' <(head -n $((n/2)) file.txt) <(tail -n $((n/2)) file.txt) |
awk '{ print $1, $6, $3, $6, "" }' FS='|' OFS='|'
Output:
unknown#mail.com|unknownpass|unknown#mail.com|unknownpass|
unknown#mail2.com|unknownpass2|unknown#mail2.com|unknownpass2|
unknown#mail3.com|unknownpass3|unknown#mail3.com|unknownpass3|
unknown#mail4.com|unknownpass4|unknown#mail4.com|unknownpass4|

Related

awk - Delimiter as combination of number and | (pipe) not working

I have an input file with some records as below,
input.txt
Record|111|aaa|aaa|11|1-bb|bb|1111|cccc|cccc
Record|11|1-aaa|aaa|111|bb|bb|1111|cccc|cccc
Record|111|aaa|aaa|11|1-bb|bb|1111|cccc|cccc
Record|111|aaa|aaa|111|bb|bb|11|1-cccc|cccc
Record|22|aaa|aaa|222|bb|bb|2222|cccc|cccc|11|1-dddd|dd
Record|333|aaa|aaa|11|1-bb|bb|333|cccc|cccc
Record|11|1-aaa|aaa|102|bb|bb|1111|cccc|cccc
I want to use |11| as the delimiter in awk and get the second field. I tried the most common way, as below:
Command
awk -F'|11|' '{print $2}' input.txt
Output
1|aaa|aaa|
|1-aaa|aaa|
1|aaa|aaa|
1|aaa|aaa|
|1-dddd|dd
|1-bb|bb|333|cccc|cccc
|1-aaa|aaa|102|bb|bb|
Expected Output
1-bb|bb|1111|cccc|cccc
1-aaa|aaa|111|bb|bb|1111|cccc|cccc
1-bb|bb|1111|cccc|cccc
1-cccc|cccc
1-dddd|dd
1-bb|bb|333|cccc|cccc
1-aaa|aaa|102|bb|bb|1111|cccc|cccc
Basically, it's not considering the last | of the delimiter |11|; instead it seems to be using |11 as the delimiter.
I tried all of the following, and none gave me the expected output:
awk -F"|11|" '{print $2}' input.txt # gives wrong output
awk -F\|11\| '{print $2}' input.txt # gives Wrong output
awk -v FS='|11|' '{print $2}' input.txt # gives Wrong output
Finally, I had to write a for loop inside awk with | as the delimiter to make it work. I would like to know why the simple solution doesn't work.
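The argument to -F is a regex, and | is the alternation operator in a regex, so each | has to be escaped or placed in a bracket expression to match literally: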
awk -F "\\\|11\\\|" '{print $2}' file
or
awk -F '\\|11\\|' '{print $2}' file
or (Thanks to EdMorton)
awk -F'[|]11[|]' '{print $2}' input.txt
Output:
1-bb|bb|1111|cccc|cccc
1-aaa|aaa|111|bb|bb|1111|cccc|cccc
1-bb|bb|1111|cccc|cccc
1-cccc|cccc
1-dddd|dd
1-bb|bb|333|cccc|cccc
1-aaa|aaa|102|bb|bb|1111|cccc|cccc
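As a small self-contained check of the bracket-expression form, the sketch below runs it on one record from the question. The second function is an alternative I am assuming might be useful, not something from the answers: it treats the delimiter as a plain string via index(), so no regex quoting is involved at all. Note that it prints everything after the first delimiter rather than strictly the second field; for this input the two coincide. The function names are hypothetical.

```shell
#!/bin/sh
# Regex separator with each "|" made literal by a bracket expression.
second_field() {
    awk -F'[|]11[|]' '{print $2}'
}

# Plain-string alternative: find the delimiter with index() and print
# whatever follows its first occurrence. $1 is the literal delimiter.
after_literal() {
    awk -v d="$1" '{ i = index($0, d); if (i) print substr($0, i + length(d)) }'
}

line='Record|111|aaa|aaa|11|1-bb|bb|1111|cccc|cccc'
printf '%s\n' "$line" | second_field
printf '%s\n' "$line" | after_literal '|11|'
```

Both commands should print 1-bb|bb|1111|cccc|cccc for this record.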
Cyrus explained why your delimiter does not work as expected (a combination of regular expression quoting issues).
With sed, removing everything up to and including the |11| on each line:
$ sed 's/.*|11|//' input.txt
1-bb|bb|1111|cccc|cccc
1-aaa|aaa|111|bb|bb|1111|cccc|cccc
1-bb|bb|1111|cccc|cccc
1-cccc|cccc
1-dddd|dd
1-bb|bb|333|cccc|cccc
1-aaa|aaa|102|bb|bb|1111|cccc|cccc

Grep entire line after word

What would be the grep command to get everything in the line after a match?
For example on a file path:
/home/usr/we/This/is/the/file/path
and I want the output to be
/we/This/is/the/file/path
Matching the /we as the regex.
grep -o does what you want.
grep -o '/we.*'
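For a quick test without a file, the path from the question can be piped in directly. This is just a sketch; from_we is a made-up helper name, and -o is assumed to be available (it is in GNU and BSD grep, though it is not required by POSIX).

```shell
#!/bin/sh
# Print only the matched part of each line: "/we" and everything after it.
from_we() {
    grep -o '/we.*'
}

printf '%s\n' '/home/usr/we/This/is/the/file/path' | from_we
```

This should print /we/This/is/the/file/path. Lines without /we produce no output at all.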
The OP would like to use we as the trigger. Using awk:
awk -F/ '{for (i=1;i<=NF;i++) {if ($i~/we/) f=1;if (f) printf "/%s",$i}print ""}' file
/we/This/is/the/file/path
Using gnu awk
awk '{print gensub(/.*(\/we)/,"\\1","g")}' file
/we/This/is/the/file/path
YourInput | sed 's|/home/usr\(/we.*\)|\1|'
assuming it's always (and only) starting with /home/usr
else
YourInput | sed -n 's|^.*\(/we.*\)|\1|p'
This returns only the line(s) containing /we, with the text before /we removed.
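A short sketch of the capture-group idea with sed -n and the p flag, using the question's path plus a non-matching line. The helper name from_last_we is hypothetical. One thing worth knowing: the leading .* is greedy, so if /we occurs more than once, the output starts at the last occurrence.

```shell
#!/bin/sh
# Keep only the captured "/we..." part; print only lines where the
# substitution actually happened (-n plus the p flag).
from_last_we() {
    sed -n 's|^.*\(/we.*\)|\1|p'
}

printf '%s\n' '/home/usr/we/This/is/the/file/path' 'no match here' | from_last_we
```

This should print a single line, /we/This/is/the/file/path.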

replace a character into awk

I use ${var//string1/string2} to replace characters or strings in the shell. Now I need to do the same, but on a specific column in awk.
I tried this to replace each space with '_', but it does not work:
cat file | awk -F',' '{print ${3// /_}'
Use gsub
awk -F, -v OFS=, '{gsub(" ", "_", $3); print}' file.txt
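To try this without a file, here is a minimal sketch on a made-up CSV line (the data and the helper name underscore_col3 are only for illustration). Only the third column is touched; spaces in other columns stay as they are.

```shell
#!/bin/sh
# Replace every space in column 3 with "_" and print the rebuilt line.
# Setting OFS=, keeps the commas when awk reassembles the record.
underscore_col3() {
    awk -F, -v OFS=, '{gsub(" ", "_", $3); print}'
}

printf '%s\n' 'x y,b,hello big world,d' | underscore_col3
```

This should print x y,b,hello_big_world,d.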

Replace from nth occurrence of pattern till the end of line with sed

For example:
/some/long/path/we/need/to/shorten
Need to delete after the 6th occurrence of '/', including itself:
/some/long/path/we/need
Using sed I came up with this solution, but it's kind of workaround-ish:
path=/some/long/path/we/need/to/shorten
slashesToKeep=5
n=$((2 + slashesToKeep))
echo "$path" | sed "s/[^/]*//$n;s/\/\/.*//g"
Cleaner solution much appreciated!
Input
/some/long/path/we/need/to/shorten
Code
Cut Solution
echo '/some/long/path/we/need/to/shorten' | cut -d '/' -f 1-6
AWK Solution
echo '/some/long/path/we/need/to/shorten' | awk -F '/' '{ for(i=1; i<=6; i++) {print $i} }' | tr '\n' '/'|sed 's/.$//'
Output
/some/long/path/we/need
This might work for you (GNU sed):
sed 's/\/[^\/]*//6g' file
Awk:
awk -F'/' 'BEGIN{OFS=FS}{NF=6}1'
In action:
$ echo /some/long/path/we/need/to/shorten | awk -F'/' 'BEGIN{OFS=FS}{NF=6}1'
/some/long/path/we/need
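If the number of components to keep varies, the cut answer can be wrapped in a small function. This is a sketch under the assumption that the path is absolute (starts with /); keep_components is a made-up name.

```shell
#!/bin/sh
# $1: absolute path, $2: number of path components to keep.
# The leading "/" makes field 1 empty, hence the +1 on the field range.
keep_components() {
    printf '%s\n' "$1" | cut -d/ -f"1-$(( $2 + 1 ))"
}

keep_components /some/long/path/we/need/to/shorten 5
```

This should print /some/long/path/we/need, the same as the fixed -f 1-6 above.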

how to get required field from file on linux?

I have one file which contains three fields separated by two spaces. I need to get only third field from file. File content is as in following example:
kuldeep  Mirat  Shakti
balaji  salunke  pune
.
.
.
How can I get the third field?
To get the 3rd field, assuming you don't have any "embedded spaces", just
awk '{print $3}' file
By default awk splits fields on runs of whitespace, so even with two or more spaces between fields, the 3rd field is always $3.
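A quick way to convince yourself of the default splitting, as a sketch with made-up spacing (third_field is a hypothetical helper name): mixed runs of spaces and tabs, and even leading blanks, do not shift the field numbers.

```shell
#!/bin/sh
# Default field splitting: any run of blanks or tabs is one separator,
# and leading whitespace is ignored.
third_field() {
    awk '{print $3}'
}

printf 'kuldeep  Mirat \t Shakti\n' | third_field
printf '   balaji salunke     pune\n' | third_field
```

This should print Shakti and then pune.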
However, if you want to be specific, then specify a Field delimiter
awk -F" " '{print $3}' file
If you have other choices, a Ruby one
ruby -F" " -ane 'print $F[2]' file
ruby -ane 'print $F[2]' file
Update: If you need to get all fields after 3rd,
awk -F" " '{$1=$2=$3=""}1' OFS=" " file # add a pipe to `sed 's/^[ \t]*//'` if desired
ruby -F" " -ane 'puts $F[3..-1].join(" ")' file
Use awk:
awk -F'  ' '{print $3}' file
Because the separator here is exactly two spaces, this also works if fields contain single embedded spaces.
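Here is a sketch of the embedded-spaces case, assuming the fields really are separated by exactly two spaces as the question states. The separator is written as [ ][ ] so that the two spaces stay visible and cannot be collapsed by accident; the sample line and the name two_space_third are made up.

```shell
#!/bin/sh
# Split on exactly two consecutive spaces, so a single space inside
# a field does not start a new field.
two_space_third() {
    awk -F'[ ][ ]' '{print $3}'
}

printf '%s\n' 'first name  middle  last name' | two_space_third
```

This should print last name, whereas the default whitespace splitting would give middle.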
To get the third field of each line, pipe through awk, e.g
cat filename | awk '{print $3}'
If you just want to get the third field of the first line, use head, too:
cat filename | head -n 1 | awk '{print $3}'
Given @balaji's comment to @kurani's answer:
perl -pe 's/^.*? .*? //' filename
awk -F' ' '{for(i=3; i<NF; i++) {printf("%s%s",$i,FS)}; print $NF}' filename
less filename | cut -d" " -f 3
