Enumerate substitutions with sed or awk

Given the plain text file with lines
bli foo bla
abc
dfg
bli foo bla
hik
lmn
what sed or awk magic transforms it to
bli foo_01 bla
abc
dfg
bli foo_02 bla
hik
lmn
so that every occurrence of 'foo' is replaced by 'foo_[occurrence number]'.

awk '!/foo/||sub(/foo/,"&_"++_)' infile
Use gawk, nawk or /usr/xpg4/bin/awk on Solaris.

This probably isn't what you require, but it might give some ideas in the right direction.
Administrator@snadbox3 ~
$ cd c:/tmp
Administrator@snadbox3 /cygdrive/c/tmp
$ cat <<-eof >foo.txt
> foo
> abc
> dfg
> foo
> hik
> lmn
> eof
Administrator@snadbox3 /cygdrive/c/tmp
$ awk '/^foo$/{++fooCount; print($0 "_" fooCount); next} {print}' foo.txt
foo_1
abc
dfg
foo_2
hik
lmn
EDIT:
I'm a day late and a penny short, again ;-(
EDIT2:
Character encodings are another thing to look out for... Java source code isn't necessarily in the system's default encoding... it's quite often UTF-8 encoded, to allow for any embedded "higher order entities" ;-) Many *nix utilities still aren't charset-aware.

This is another way to express radoulov's answer
awk '/foo/ {sub(/foo/, "&_" sprintf("%02d",++c))} 1' infile
You should take care that you don't match "foobar" while looking for "foo":
gawk '/\<foo\>/ {sub(/\<foo\>/, "&_" sprintf("%02d",++c))} 1'
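Both commands above use sub(), which rewrites only the first match on a line, so a line containing 'foo' twice would receive a single number. The following is a minimal sketch of one way to number every occurrence, assuming a POSIX awk with match(), RSTART and RLENGTH. The function name enumerate_foo and the counter c are made up for illustration, and the sketch does not apply the word-boundary check shown above.

```shell
# Number every occurrence of "foo", including repeats on the same line.
# Reads the files given as arguments, or standard input if there are none.
enumerate_foo() {
  awk '{
    out = ""
    rest = $0
    # Consume the line match by match, appending a zero-padded counter.
    while (match(rest, /foo/)) {
      out = out substr(rest, 1, RSTART - 1) sprintf("foo_%02d", ++c)
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }' "$@"
}

printf 'bli foo bla foo\nabc\nfoo\n' | enumerate_foo
```

For the sample above this should print "bli foo_01 bla foo_02", "abc" and "foo_03" on separate lines.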

Related

Replace a line starting with '-' hyphen with a character repeated n times

I have a text file that has some lines like this (hyphen repeated)
-------------------------------------------------------
I need to replace these lines with character 'B' repeated 1500 times. For example, like
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
Any suggestions using 'sed' or 'awk' command?

With awk:
$ awk '/^-+$/ {s = sprintf("% 1500s", ""); gsub(/ /,"B",s); print s; next} 1' file
Or, maybe a bit more efficient if you have many such lines:
$ awk 'BEGIN {s = sprintf("% 1500s", ""); gsub(/ /,"B",s)} \
/^-+$/ {print s; next} 1' file

I think
perl -pe 'my $bb = "B"x1500; s/^-+$/$bb/g'
should do it.

printf+sed variant:
$ cat file
1111
----
2222
$ sed -r 's/^-+$/'"$(printf -- "%.1s" B{1..150})/" file
1111
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
2222
Here sed is used only for the replacement.
printf generates the string of repeated B characters. (The scriptlet above uses 150 instead of 1500, because 1500 required too much scrolling.)
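The brace expansion B{1..150} used above is a bash feature. As a hedged alternative, here is a small sketch that builds the repeated string with printf padding and tr, then hands it to awk with -v. The helper names repeat_char and replace_dashes are hypothetical, and the sketch assumes the character to repeat is a single ordinary character.

```shell
# Print character $1 repeated $2 times by padding an empty string
# with spaces and translating every space to the wanted character.
repeat_char() {
  printf "%${2}s" '' | tr ' ' "$1"
}

# Replace every line consisting only of hyphens with $1 repeated $2 times.
replace_dashes() {
  awk -v s="$(repeat_char "$1" "$2")" '/^-+$/ { print s; next } 1'
}

# Small demonstration with 10 instead of 1500 to keep the output short.
printf '1111\n----\n2222\n' | replace_dashes B 10
```

Calling replace_dashes B 1500 on the real file would follow the same pattern.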

Print between two patterns with filepath/filename in a directory

I need a command that prints data between two strings (Hello and End) along with the file name and file path on each line. Here is the input and output. Appreciate your time and help
Input
file1:
Hello
abc
xyz
End
file2:
Hello
123
456
End
file3:
Hello
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456
I tried awk and sed but was not able to print the file name with the path.
awk '/Hello/{flag=1;next}/End/{flag=0}flag' * 2>/dev/null

With awk:
awk '!/Hello/ && !/End/ {print FILENAME,$0} ' /home/test/seq/file?
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456

If your file contains lines above Hello and/or below End, then you can use a flag to control printing as you had attempted in your question, e.g.
awk -v f=0 '/End/{f=0} f == 1 {print FILENAME, $0} /Hello/{f=1}' file1 file2 file..
This would handle the case where your input file contained, e.g.
$ cat file
some text
some more
Hello
abc
xyz
End
still more text
The flag f is a simple on/off switch that controls printing. Placing the End rule first, with the print rule in the middle, eliminates the need for any next command.
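One case the flag approach does not cover by itself is a file such as file3 in the question, where Hello is never followed by End: the flag would stay on into the next file. A minimal sketch, assuming it is acceptable to reset the flag at the first line of each file (FNR == 1), follows. The function name print_between and the temporary files are invented for the demonstration.

```shell
# Print "FILENAME line" for lines between Hello and End in each file.
# FNR == 1 resets the flag so an unterminated block cannot leak
# into the next file.
print_between() {
  awk 'FNR == 1 { f = 0 }
       /End/    { f = 0 }
       f        { print FILENAME, $0 }
       /Hello/  { f = 1 }' "$@"
}

# Demonstration on throwaway files in a temporary directory.
dir=$(mktemp -d)
printf 'some text\nHello\nabc\nxyz\nEnd\nstill more text\n' > "$dir/file1"
printf 'Hello\n' > "$dir/file2"
printf 'intro\nHello\n123\nEnd\n' > "$dir/file3"
print_between "$dir"/file1 "$dir"/file2 "$dir"/file3
rm -r "$dir"
```

Passing full paths such as /home/test/seq/* makes FILENAME carry the path, which is what the question asks for.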

Extract information (subset) from a main files using a list of identifiers saved in another file

I have one file containing a list of names (referred to as file 1):
Apple
Bat
Cat
I have another file (referred to as file 2) containing a list of names and details:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Aeroplane dsafgeq dasfqw dafsad
vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
fcsadssssssssssssssssssssssssssssssssss
ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
sdddddddddddddddddddddddddddddwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
I need to extract info out of file 2 using the list of names in file 1.
Output file should be something like below:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
Are there any commands for doing this on Linux (Ubuntu)? I am a new Linux user.

This might work for you (GNU sed):
sed 's#.*#/^&/bb#' file1 |
sed -e ':a' -f - -e 'd;:b;n;/^[A-Z]/!bb;ba' file2
Generate a string of sed commands from the first file and pipe them into another sed script which is run against the second file.
The first file creates a regexp for each line which when matched jumps to a piece of common code. If none of the regexps are matched the lines are deleted. If a regexp is matched then further lines are printed until a new delimiter is found at which point the code then jumps to the start and the process is repeated.

$ awk 'NR==FNR{a[$1];next} NF>1{f=($1 in a)} f' file1 file2
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
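The one-liner above is posted without explanation, so here is the same logic spelled out with comments, as a sketch. It assumes, as in the sample data, that header lines have more than one field and detail lines have exactly one. The function name extract_sections and the variable names wanted and keep are illustrative only.

```shell
# $1 = file with one name per line, $2 = file with headers and details.
extract_sections() {
  awk '
    # First file: remember each name as an array key.
    NR == FNR { wanted[$1]; next }
    # Header line (more than one field): decide whether to keep the section.
    NF > 1    { keep = ($1 in wanted) }
    # Print the current line while inside a wanted section.
    keep
  ' "$1" "$2"
}

# Demonstration on throwaway files.
dir=$(mktemp -d)
printf 'Apple\nCat\n' > "$dir/names"
printf 'Apple bla blaa\naaaa\nAeroplane x y\nbbbb\nCat p q\ncccc\n' > "$dir/data"
extract_sections "$dir/names" "$dir/data"
rm -r "$dir"
```

If a detail line ever contained whitespace, it would be mistaken for a header, so the field-count test is the fragile part of this approach.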

Assuming each section is separated by an empty line, this awk solution works:
while read -r pat;do
pat="^\\\<${pat}\\\>"
awk -v pattern="$pat" '$0 ~ pattern{p=1}$0 ~ /^$/{p=0}p==1' file2
done <file1
For this solution to work, the file needs to look like this:
Apple bla blaa
1 aaaaaaaaaggggggggggttttttsssssssvvvvvvv
2 ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
3 sdasasssssssssssssssssssssswwwwwwwwwwww
Aeroplane dsafgeq dasfqw dafsad
4 vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
5 fcsadssssssssssssssssssssssssssssssssss
6 ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
7 sdddddddddddddddddddddddddddddwwwwwwwww
Bat sdasdas dsadw dasd
8 sssssssssssssssssssssssssssssssssssswww
9 ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
10 aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
11 sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
12 sssssssssssssssssssssssssssssssssssssss
13 wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
PS: Numbering has been applied by me in order to be able to "check" that awk will return the correct results per section. Numbering is not required in your real file.
If there are no empty lines separating the sections, it is much harder to achieve the correct result.

Exclude one string from bash output

I'm working on a project in which I need to exclude the first string (line) that matches a pattern from the output (or a file). The difficulty is that I need to exclude just one line: only the first match in the stream.
For example, if I have:
1 abc
2 qwerty
3 open
4 abc
5 talk
After the script has run, I should have this:
2 qwerty
3 open
4 abc
5 talk
NOTE: I don't know anything about the digits before the words, so I can't filter the output based on them.
I've written a small script with grep, but it cuts out every line that matches the pattern:
'some program' | grep -v "abc"
I've read about awk, sed, etc. but couldn't work out whether they can solve my problem.
Any help is appreciated. Thank you.

Using awk:
some program | awk '{ if (/abc/ && !seen) { seen = 1 } else print }'
Alternatively, using only filters:
some program | awk '!/abc/ || seen { print } /abc/ && !seen { seen = 1 }'
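The same idea can be wrapped in a small shell function so that the pattern is passed in rather than hard-coded. This is a sketch only: the name drop_first_match is made up, and the pattern is treated as an awk regular expression handed over with -v (so backslashes in it are subject to awk's escape processing).

```shell
# Drop only the first input line matching the regular expression $1.
drop_first_match() {
  awk -v pat="$1" '!seen && $0 ~ pat { seen = 1; next } { print }'
}

printf '1 abc\n2 qwerty\n3 open\n4 abc\n5 talk\n' | drop_first_match abc
```

With the sample input, the line "1 abc" is dropped and "4 abc" is kept.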

You can use Ex editor. For example to remove the first pattern from the file:
ex +"/abc/d" -scwq file.txt
From the input (replace cat with your program):
ex +"/abc/d" +%p -scq! <(cat file.txt)
You can also read from stdin by replacing cat with /dev/stdin.
Explanation:
+cmd - execute Ex/Vim command
/pattern/d - find the pattern and delete,
%p - print the current buffer
-s - silent mode
-cq! - execute quit without saving (!)
<(cmd) - shell process substitution

With sed, you can delete lines by number:
sed 1,2d
Replace 1 and 2 with the line numbers you want to delete.
Alternatively, delete every line that matches a pattern:
sed '/pattern to match/d'
To delete only the first matching line (GNU sed), use:
sed '0,/abc/{//d;}'

You can also use a list of commands { list; } to read the first line and print the rest:
command | { read first_line; cat -; }
Simple example:
$ cat file
1 abc
2 qwerty
3 open
4 abc
5 talk
$ cat file | { read first_line; cat -; }
2 qwerty
3 open
4 abc
5 talk

awk '!/1/' file
2 qwerty
3 open
4 abc
5 talk
That's all!

Finding the pattern and replacing the pattern inside the file using unix

I need your help with Unix. I have a file in which values are declared, and I have to substitute each value wherever its name is used. For example, the file declares values for &abc and &ccc, and I have to substitute those values in place of &abc and &ccc, as shown in the output file.
Input File
go to &abc=ddd;
if file found &ccc=10;
now the value name is &abc;
and the age is &ccc;
Output:
go to &abc=ddd;
if file found &ccc=10;
now the value name is ddd;
and the age is 10;

Try using sed.
#!/bin/bash
# The input file is a command line argument.
input_file="${1}"
# The map of variables to their values
declare -A value_map=( [abc]=ddd [ccc]=10 )
# Loop over the keys in our map.
for variable in "${!value_map[@]}" ; do
echo "Replacing ${variable} with ${value_map[${variable}]} in ${input_file}..."
sed -i "s|${variable}|${value_map[${variable}]}|g" "${input_file}"
done
This simple bash script will replace abc with ddd and ccc with 10 in the given file. Here is an example of it working on a simple file:
$ cat file.txt
so boo aaa abc
duh
abc
ccc
abcccc
hmm
$ ./replace.sh file.txt
Replacing abc with ddd in file.txt...
Replacing ccc with 10 in file.txt...
$ cat file.txt
so boo aaa ddd
duh
ddd
10
ddd10
hmm
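The script above hard-codes the map and replaces the bare names without the leading &, so it also rewrites the declarations. As a hedged alternative, the sketch below reads the declarations from the file itself. It assumes that a declaration has the form &name=value; as in the question's sample, that a value ends at the first semicolon, that names consist of letters, digits and underscores, and that every declaration appears before its first use. The function name expand_refs is hypothetical.

```shell
# Replace later uses of &name with the value declared earlier as &name=value;
# Declarations themselves are left untouched.
expand_refs() {
  awk '{
    line = $0
    out = ""
    while (match(line, /&[A-Za-z_][A-Za-z0-9_]*/)) {
      name  = substr(line, RSTART + 1, RLENGTH - 1)
      after = substr(line, RSTART + RLENGTH)
      out   = out substr(line, 1, RSTART - 1)
      if (substr(after, 1, 1) == "=") {
        # Declaration: remember the value up to the first semicolon.
        val = substr(after, 2)
        sub(/;.*/, "", val)
        value[name] = val
        out = out "&" name
      } else if (name in value) {
        # Reference to a known name: substitute its value.
        out = out value[name]
      } else {
        # Unknown name: leave it as it is.
        out = out "&" name
      }
      line = after
    }
    print out line
  }' "$@"
}

printf 'go to &abc=ddd;\nif file found &ccc=10;\nnow the value name is &abc;\nand the age is &ccc;\n' | expand_refs
```

Unlike sed -i, this writes the result to standard output, so redirect it to a new file if the change should be kept.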
