Using sed to print two different words alternately - Linux

I need to insert two different words alternately into the blank spaces between the lines of a file.
For example:
ABCD
EFGH
IGKL
MNOP
In the above scenario, I want to print ab and /ab alternately, like below:
ab
ABCD
/ab
ab
EFGH
/ab
ab
IGKL
/ab
ab
MNOP
/ab
*I want this one item per line (not in a horizontal format).* I know the command sed 's|^[[:blank:]]*$|</ab>|' comes close to my case, but I don't know how to apply it. Can someone please help me?

With GNU sed:
sed -e 'i\ab' -e 'a\/ab' infile
How does this work?
On each line:
first, insert ab before it with 'i\ab'
next, append /ab after it with 'a\/ab'
You must use two separate commands with '-e' to do that.
You can't use sed 'i\ab;a\/ab' because the first command, i (insert), doesn't know where the text to insert ends and takes the rest of the line.
So the inserted text is ab;a/ab before each line.
Another way to do that, which works with any sed, is:
sed -e 'i\
ab
a\
/ab' infile
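As a quick check, here is a minimal sketch that runs the multi-line form on two lines of the question's sample. The temporary file and the variable names infile and out are only assumptions for illustration.

```shell
# Sample input from the question; the temp file is only for illustration.
infile=$(mktemp)
printf 'ABCD\nEFGH\n' > "$infile"

# Multi-line form: i\ inserts text before each line, a\ appends text after it.
out=$(sed -e 'i\
ab
a\
/ab' "$infile")

printf '%s\n' "$out"
rm -f "$infile"
```

Each input line should come out wrapped between an ab line and a /ab line.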

If you are OK with awk, the following may help.
awk -v start="ab" -v end="/ab" '{print start ORS $0 ORS end}' Input_file
If you need to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the command above.
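The save-back step could look roughly like the sketch below. The working directory, the use of mktemp for the temporary file, and the variable names are assumptions for illustration; only the awk command itself comes from the answer.

```shell
# Hypothetical working directory and sample file, for illustration only.
workdir=$(mktemp -d)
printf 'ABCD\nEFGH\n' > "$workdir/Input_file"

# Write to a temp file first; replace the input only if awk succeeded.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk -v start="ab" -v end="/ab" '{print start ORS $0 ORS end}' \
    "$workdir/Input_file" > "$tmp" && mv "$tmp" "$workdir/Input_file"

result=$(cat "$workdir/Input_file")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Using && means a failed awk run leaves the original file untouched.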

Related

Awk to remove and move rows/records to another file where 3rd field value is empty

I have the pipe-separated file below.
I want to remove the rows where the 3rd field is blank from File1 and write those removed rows to another file (File2).
I tried the code below. It works and removes the rows where the 3rd field is blank, but I can't figure out how to also write the removed rows to another file.
So I need a single command that removes the rows with an empty 3rd column from File1 and writes those removed rows to another file (such as File2).
awk -F"|" -v OFS="|" '$3!=""' File1.txt > test.txt
File1
billingtype|documentnumber|originaldocumentnumber
YMNC|420075416|765467
YMNC|429842808|74646464
YPBC|429842809
INV|430071605|7688888
YPBC|430071609
Output File
billingtype|documentnumber|originaldocumentnumber
YMNC|420075416|765467
YMNC|429842808|74646464
INV|430071605|7688888
File2
billingtype|documentnumber|originaldocumentnumber
YPBC|429842809
YPBC|430071609
$ awk 'BEGIN {FS=OFS="|"}
FNR==1 {print > "File2"}
{if($3!="") print;
else print > "File2"}' File1 > tmp && mv tmp File1
The header is printed to both files. The main output goes to a temp file, which is then moved over the input file. Records with a missing 3rd field are written to the other file.
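A small end-to-end run of the same idea, as a sketch: here the second file's path is passed in through an awk variable named rejects, which is a hypothetical name and not part of the answer, and the files live in a throwaway directory.

```shell
# Throwaway directory with a shortened copy of the question's File1.
dir=$(mktemp -d)
printf '%s\n' \
  'billingtype|documentnumber|originaldocumentnumber' \
  'YMNC|420075416|765467' \
  'YPBC|429842809' \
  'INV|430071605|7688888' > "$dir/File1"

# rejects is an assumed variable name holding the path of the second file.
awk -v rejects="$dir/File2" 'BEGIN { FS = OFS = "|" }
  FNR == 1 { print > rejects }   # header is copied to the rejects file
  $3 != "" { print; next }       # rows with a 3rd field stay on stdout
  FNR > 1  { print > rejects }   # rows with an empty 3rd field
' "$dir/File1" > "$dir/tmp" && mv "$dir/tmp" "$dir/File1"

kept=$(cat "$dir/File1")
moved=$(cat "$dir/File2")
printf 'File1:\n%s\nFile2:\n%s\n' "$kept" "$moved"
rm -rf "$dir"
```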
You can try this GNU sed command:
sed -i -Ee '1{WFile2' -e 'b' -e '}' -e '/(.*\|){2}/!{W File2' -e 'd}' File1

grep string after first occurrence of numbers

How do I get a string after the first occurrence of a number?
For example, I have a file with multiple lines:
34 abcdefg
10 abcd 123
999 abc defg
I want to get the following output:
abcdefg
abcd 123
abc defg
Thank you.
You could use Awk for this: loop through all the fields in each line up to NF (the last field in the line) and, on the first field that contains a digit, print the field next to it. The break statement exits the for loop after that first match. Note that this prints only the single field after the number, not the rest of the line.
awk '{ for(i=1;i<=NF;i++) if ($i ~ /[[:digit:]]+/) { print $(i+1); break } }' file
It is not clear exactly what you want, but you can try to express it in sed.
Remove everything up to the first digit, the digits that follow, and any spaces after them.
sed 's/[^0-9]*[0-9]\+ *//'
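The \+ in that command is a GNU extension. A sketch of the same substitution using only basic regular expression syntax, run on the question's sample, could look like this (the temp file and variable names are assumptions):

```shell
# The three sample lines from the question.
input=$(mktemp)
printf '34 abcdefg\n10 abcd 123\n999 abc defg\n' > "$input"

# [0-9][0-9]* stands in for [0-9]\+ so that no GNU extension is needed.
# Without the g flag, only the first number on each line is removed.
stripped=$(sed 's/[^0-9]*[0-9][0-9]* *//' "$input")

printf '%s\n' "$stripped"
rm -f "$input"
```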
Imagine the following input file:
001 ham
03spam
3 spam with 5 eggs
A quick solution with awk would be:
awk '{sub(/[^0-9]*[0-9]+/,"",$0); print $1}' file
This substitutes the first run of non-digit characters followed by digits with an empty string (""). Because $0 is modified, its fields are re-split, so you can print the first field or the remainder of the line. This line gives exactly the following output:
ham
spam
spam
If you are interested in the remainder of the line:
awk '{sub(/[^0-9]*[0-9]+ */,"",$0); print $0}' file
This outputs:
ham
spam
spam with 5 eggs
Be aware that the extra " *" in the regular expression is needed to remove the spaces after the number. Without it, a leading space is left wherever the number was followed by a space:
awk '{sub(/[^0-9]*[0-9]+/,"",$0); print $0}' file
 ham
spam
 spam with 5 eggs
You can remove the first run of digits and spaces using sed:
sed -E 's/[0-9 ]+//' file
grep can do the job:
$ grep -o -P '(?<=[0-9] ).*' inputFIle
abcdefg
abcd 123
abc defg
For completeness, here is a solution with perl:
$ perl -lne 'print $1 if /[0-9]+\s*(.*)/' inputFIle
abcdefg
abcd 123
abc defg

Swapping the first word with itself 3 times only if there are 4 words only using sed

Hi, I'm trying to solve a problem using only sed commands and without using a pipeline. I am allowed to redirect the output of a sed command to a file or to read from a file.
Ex:
sed s/dog/cat/ >| tmp
or
sed s/dog/cat/ < tmp
Anyway, let's say I have a file F1 with these contents:
Hello hi 123
if a equals b
you
one abc two three four
dany uri four 123
The output should be:
if if if a equals b
dany dany dany uri four 123
Explanation: the program must print only lines that have exactly 4 words, and when it prints them it must print the first word of the line 3 times.
I've tried commands like this:
sed '/[^ ]*.[^ ]*.[^ ]*/s/[^ ]\+/& & &/' F1
or
sed 's/[^ ]\+/& & &/' F1
but I can't figure out how to check with sed that a line has exactly 4 words.
Any help will be appreciated.
$ sed -En 's/^([^[:space:]]+)([[:space:]]+[^[:space:]]+){3}$/\1 \1 &/p' file
if if if a equals b
dany dany dany uri four 123
The above requires a sed that supports EREs via the -E option (e.g. GNU and OSX seds).
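To see the filter at work on the whole sample, the sketch below feeds all five lines of F1 through the same command. The temp file and the variable name tripled are assumptions for illustration.

```shell
# The five sample lines of F1 from the question.
f1=$(mktemp)
printf '%s\n' 'Hello hi 123' 'if a equals b' 'you' \
  'one abc two three four' 'dany uri four 123' > "$f1"

# ^...$ anchors the match to the whole line: one word plus exactly three more.
# -n with the p flag prints only the lines where the substitution happened.
tripled=$(sed -En 's/^([^[:space:]]+)([[:space:]]+[^[:space:]]+){3}$/\1 \1 &/p' "$f1")

printf '%s\n' "$tripled"
rm -f "$f1"
```

Lines with three, one, or five words fail the anchored pattern and are not printed.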
If the fields are separated by single tabs or spaces:
sed 'h;s/[^[:blank:]]//g;s/[[:blank:]]\{3\}//;/^$/!d;x;s/\([^[:blank:]]*[[:blank:]]\)/\1\1\1/' infile

sed command to strip a match found

I have a file "fruit.xml" that looks like the below:
FRUIT="Apples"
FRUIT="Bananas"
FRUIT="Peaches"
I want to use a single sed command to find all occurrences of FRUIT=" and extract the value between the quotes from each match.
So the result should look like:
Apples
Bananas
Peaches
This is the command I am using:
sed 's/.*FRUIT="//' fruit.xml
The problem is that it leaves the trailing " at the end of the value I need, e.g. Apples".
Just capture the group and print it back: capture everything from " until another " is found with () (or \(...\) if you don't use the -r option). Then print it back with \1:
$ sed -r 's/.*FRUIT="([^"]*)"/\1/' file
Apples
Bananas
Peaches
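A variant of the capture-group approach in basic regular expression syntax, so that it does not depend on -r, might look like the sketch below. The extra comment line and the indented line in the sample are made up for illustration, as are the variable names.

```shell
# Sample based on fruit.xml, plus one non-matching line (an assumption).
xml=$(mktemp)
printf 'FRUIT="Apples"\n<!-- no fruit on this line -->\n  FRUIT="Bananas"\nFRUIT="Peaches"\n' > "$xml"

# \( \) captures the quoted value; the trailing .* drops anything after it.
# -n with the p flag prints only lines that actually contained FRUIT="...".
fruits=$(sed -n 's/.*FRUIT="\([^"]*\)".*/\1/p' "$xml")

printf '%s\n' "$fruits"
rm -f "$xml"
```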
You can also use field separators with awk: tell awk that your field separators are either FRUIT=" or ". This way, the desired content becomes the 2nd field. Note that the option is -F followed by the separator, not -FS=:
$ awk -F 'FRUIT="|"' '{print $2}' file
Apples
Bananas
Peaches
To make your command work, just strip the " at the end of the line:
$ sed -e 's/.*FRUIT="//' -e 's/"$//' file
      ^^                 ^^^^^^^^^^^
      |                  replace " at the end of the line with nothing
      -e allows you to use multiple commands
This would be enough if you want to keep the leading spaces,
sed 's/\bFRUIT="\([^"]*\)"/\1/' fruit.xml
OR
sed 's/\bFRUIT="\|"//g' fruit.xml
Try this; it replaces the line with the fruit found in the quotes:
sed 's/.*FRUIT="\(.*\)"/\1/' test.xml
Use a simple cut command:
cut -d '"' -f2 fruit.xml
Output:
Apples
Bananas
Peaches
Assuming one occurrence per line and this format:
sed 's/.*="//;s/".*$//' fruit.xml

How to use Linux command(sed?) to delete specific lines in a file?

I have a file that contains a matrix. For example, I have:
1 a 2 b
2 b 5 b
3 d 4 b
4 b 7 b
I know it's easy to use sed command to delete specific lines with specific strings. But what if I only want to delete those lines where the second field's value is b (i.e., second line and fourth line)?
You can use a regex in sed. The first command only empties the matching lines (leaving blank lines behind); the second deletes them:
sed -i 's/^[0-9]\+\s\+b.*//' xxx_file
or
sed -i '/^[0-9]\+\s\+b.*/d' xxx_file
The "-i" option modifies the file in place; you can remove "-i" and redirect the output to another file instead.
Awk works fine too; just use the code below:
awk '{if ($2 != "b") print $0;}' file
If you want to learn more about awk, see its man page (man awk).
awk:
cat yourfile.txt | awk '{if($2!="b"){print;}}'
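For comparison, a sed sketch that tests the second field with POSIX character classes instead of \s could look like this. It assumes whitespace-separated fields with no leading whitespace; the temp file and variable names are hypothetical.

```shell
# The sample matrix from the question.
matrix=$(mktemp)
printf '1 a 2 b\n2 b 5 b\n3 d 4 b\n4 b 7 b\n' > "$matrix"

# First field, whitespace, then a second field that is exactly "b":
# it must be followed by whitespace or by the end of the line.
remaining=$(sed -E '/^[^[:space:]]+[[:space:]]+b([[:space:]]|$)/d' "$matrix")

printf '%s\n' "$remaining"
rm -f "$matrix"
```

Requiring whitespace or end of line after the b keeps a second field such as "bc" from being deleted by mistake.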

Resources