replace all lines between 2 matching words [closed] - linux

I have two files;
in1.txt:
bbb
ccc
ddd
aaa
ccc
bbb
ddd
in2.txt:
sss
In in1.txt, I want to replace the lines from aaa to the first occurrence of ddd with the contents of in2.txt.
Desired output:
bbb
ccc
ddd
sss

tl;dr:
$ sed -e "/aaa/,/ddd/c\\$(cat in2.txt)" in1.txt
bbb
ccc
ddd
sss
In detail:
$ sed -e '/START/,/FIN/c\REPLACE_WITH' file
/START/,/FIN/ indicates the range of text to replace - beginning with START and ending with FIN.
The c\ command replaces the lines in that range with REPLACE_WITH.
Hope this helps.
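One caveat with the c\ form: the replacement text is pasted into the sed script itself, so a multi-line in2.txt would need every line but the last to end in a backslash. A possible alternative, sketched here on the assumption that your sed flushes the r queue even when d ends the cycle (GNU sed does), is to queue the file at the closing line of the range and delete the whole range. The scratch directory and the two-line in2.txt are made up for the demonstration.

```sh
#!/bin/sh
# Demo data in a scratch directory (hypothetical; in2.txt has two lines here).
workdir=$(mktemp -d)
printf '%s\n' bbb ccc ddd aaa ccc bbb ddd > "$workdir/in1.txt"
printf '%s\n' sss ttt > "$workdir/in2.txt"

# Within the aaa..ddd range: queue in2.txt for output when the closing
# line is reached, then delete every line of the range.
result=$(sed -e "/aaa/,/ddd/{/ddd/r $workdir/in2.txt" -e 'd;}' "$workdir/in1.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With the demo data this should print bbb, ccc, ddd, sss, ttt, one per line. Like the c\ version, it replaces every aaa..ddd range, not only the first.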

This might work for you (GNU sed):
sed -e 'x;/x/{x;:a;n;ba};x;/^aaa$/{:b;N;/^ddd$/M!bb;x;s/^/x/;x;r file2' -e 'd}' file1
To make the replacement once only, set a flag in the hold space and check it each time a line is read. If the flag is set, print the remainder of the file using the n command in a loop.
If the flag has not been set, on encountering the start delimiter, accumulate lines in the pattern space up to the end delimiter. Then set the once-only flag, read in the contents of the second file, and finally delete the pattern space.

On top of the nice sed solutions that have been provided, I have added an awk one:
Input files:
$ more in*.txt
::::::::::::::
in1.txt
::::::::::::::
bbb
ccc
ddd
aaa
ccc
bbb
ddd
::::::::::::::
in2.txt
::::::::::::::
sss
command:
awk -v delim1="aaa" -v delim2="ddd" -v target=in2.txt '{if($0 == delim1){test=1;system("cat "target);next}if(test !=1) print;if($0 == delim2){test=0};}' in1.txt
output:
bbb
ccc
ddd
sss
Code:
{
if ($0 == delim1) {
test = 1
system("cat " target)
next
}
if (test != 1) {
print $0
}
if ($0 == delim2) {
test = 0
}
}
Explanations:
-v delim1="aaa" -v delim2="ddd" -v target=in2.txt passes the two delimiters to awk as variables, along with the file to read from (in2.txt).
When the first delimiter is reached, test is set to 1, the content of in2.txt is printed, and awk jumps to the next line.
If test is not 1, the line is printed (either the first delimiter has not been seen yet, or the range has already been closed).
When the second delimiter is reached, test is reset to 0 so that the rest of the file is printed.
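The awk program above replaces every delim1..delim2 range it meets. If only the first range should be replaced, one possible variant (a sketch, not the answer's own code) keeps a second flag and reads the replacement file with getline instead of calling cat. The variable names skipping and done, and the demo input with a second aaa..ddd range, are assumptions made for illustration.

```sh
#!/bin/sh
# Demo data (hypothetical): in1.txt contains two aaa..ddd ranges.
workdir=$(mktemp -d)
printf '%s\n' bbb ccc ddd aaa ccc bbb ddd aaa xxx ddd > "$workdir/in1.txt"
printf '%s\n' sss > "$workdir/in2.txt"

result=$(awk -v delim1="aaa" -v delim2="ddd" -v target="$workdir/in2.txt" '
    # First start delimiter only: emit the replacement file, start skipping.
    !done && !skipping && $0 == delim1 {
        skipping = 1
        while ((getline line < target) > 0) print line
        close(target)
        next
    }
    # Skip lines up to and including the end delimiter, then never again.
    skipping {
        if ($0 == delim2) { skipping = 0; done = 1 }
        next
    }
    { print }
' "$workdir/in1.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With this input the second aaa..ddd range is left untouched, so the output is bbb, ccc, ddd, sss, aaa, xxx, ddd.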

Related

Print between two patterns with filepath/filename in a directory

I need a command that prints data between two strings (Hello and End) along with the file name and file path on each line. Here is the input and output. Appreciate your time and help
Input
file1:
Hello
abc
xyz
End
file2:
Hello
123
456
End
file3:
Hello
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456
I tried awk and sed but not able to print the file with the path.
awk '/Hello/{flag=1;next}/End/{flag=0}flag' * 2>/dev/null
With awk:
awk '!/Hello/ && !/End/ {print FILENAME,$0} ' /home/test/seq/file?
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456
If your file contains lines above Hello and/or below End, then you can use a flag to control printing as you had attempted in your question, e.g.
awk -v f=0 '/End/{f=0} f == 1 {print FILENAME, $0} /Hello/{f=1}' file1 file2 file..
This would handle the case where your input file contained, e.g.
$cat file
some text
some more
Hello
abc
xyz
End
still more text
The flag f is a simple ON/OFF flag to control printing and placing the end rule first with the actual print in the middle eliminates the need for any next command.
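One thing the flag approach does not cover is a file such as file3 in the question, where Hello is never followed by End: the flag then stays on when awk moves to the next file. A possible guard, sketched with made-up file names a.txt, b.txt and c.txt, is to reset the flag on the first line of every file.

```sh
#!/bin/sh
# Demo files (hypothetical): b.txt opens a block that is never closed.
workdir=$(mktemp -d)
printf '%s\n' Hello abc End       > "$workdir/a.txt"
printf '%s\n' Hello               > "$workdir/b.txt"
printf '%s\n' stray Hello 123 End > "$workdir/c.txt"

result=$(cd "$workdir" && awk '
    FNR == 1 { f = 0 }              # new file: forget any unterminated block
    /End/    { f = 0 }
    f        { print FILENAME, $0 }
    /Hello/  { f = 1 }
' a.txt b.txt c.txt)
printf '%s\n' "$result"

rm -rf "$workdir"
```

Without the FNR == 1 rule, the line "stray" from c.txt would be printed as well, because the flag set by b.txt would still be on.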

Isolate product names from strings by matching string after (including) first letter in a variable

I have a bunch of strings of following pattern in a text file:
201194_2012110634 Appliance 130 AB i Some optional (Notes )
300723_2017050006(2016111550) Device 16 AB i Note
The first part is serial, the second is date. Device/Appliance name and model (about 10 possible different names) is the string after date number and before (including AB i).
I was able to isolate dates and serials using
SERIAL=${line:0:6}
YEAR=${line:7:4}
I'm trying to isolate Device name and note after that:
#!/bin/bash
while IFS= read line || [[ -n $line ]]; do
NAME=${line#*[a-zA-Z]}
STRINGAP='Appliance '"${line/#*Appliance/}"
The first approach is to take everything after the first letter appearing in line, which gives me
NAME = ppliance 130 AB i Some optional (Notes )
The second approach is to write tests for each of the ~10 possible appliance/device names and then append appliance name after the subtracted test. Then test variable which actually matched Appliance / Device (or other name) and use that to input into the database.
Is it possible to write a line that would select everything, including first letter in a line, in text file? Then I would subtract everything after AB i to get notes and everything before AB i would become appliance name.
Remove the ${line#*[a-zA-Z]} line (which, as you see, also removes the first letter of the name), and instead use
STRINGAP=$(echo "$line" | sed 's/^[0-9_]* \(.*\) AB i.*/\1/')
This drops the leading digits and underscore, and everything from " AB i" to the end.
Edit: The details are unclear - do you want to keep the "AB i", and will it always be "AB i"? If you want it, change the line to
STRINGAP=$(echo "$line" | sed 's/^[0-9_]* \(.* AB i\).*/\1/')
I also forgot the double quotes round the text line.
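If you would rather stay with parameter expansion, as in the question, the same split can probably be done without sed. The sketch below assumes that every line starts with the serial/date field followed by a space and that " AB i" appears exactly as written; split_line is a hypothetical helper name, not something from the question.

```sh
#!/bin/sh
# split_line is a hypothetical helper; it prints "name|note" for one input line.
split_line() {
    rest=${1#* }            # drop the serial/date field (up to the first space)
    name=${rest%% AB i*}    # keep what precedes " AB i"
    note=${rest#* AB i}     # keep what follows "AB i"
    note=${note# }          # trim the single separating space, if any
    printf '%s|%s\n' "$name" "$note"
}

out1=$(split_line '201194_2012110634 Appliance 130 AB i Some optional (Notes )')
out2=$(split_line '300723_2017050006(2016111550) Device 16 AB i Note')
printf '%s\n' "$out1" "$out2"
```

For the two sample lines this gives "Appliance 130|Some optional (Notes )" and "Device 16|Note". A line without " AB i" would come back unsplit, so real input may need a check for that case.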
You can use sed and read to give you more control of parsing.
tmp> line2="300723_2017050006(2016111550) Device 16 AB i Note"
tmp> read serial date type val <<<"$(echo "$line2" | \
sed 's/\([0-9]*\)_\([0-9]*\)[^A-Z]*\(Device\|Appliance\) \([0-9]*\).*/\1 \2 \3 \4/')"
tmp> echo "$serial|$date|$type|$val"
300723|2017050006|Device|16
Basically, read allows you to assign multiple variables in one line. The sed statement parses the line and gives you space-delimited output of its results. You can also read each variable separately if you don't mind running sed a few extra times:
device="$(echo "$line2" | sed -e 's/^.*Device \([0-9]*\).*/\1/;t;d')"
appliance="$(echo "$line2" | sed -e 's/^.*Appliance \([0-9]*\).*/\1/;t;d')"
This way $device is populated with the device number if present, and is blank otherwise (note the ;t;d at the end of the sed script, which prevents sed from printing the line when the substitution doesn't match).
Your question isn't clear but it seems like you might be trying to parse strings into substrings. Try this with GNU awk for the 3rd arg to match() and let us know if there's something else you were looking for:
$ awk 'match($0,/^([0-9]+)_([0-9]+)(\([0-9]+\))?\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)/,a) {
for (i=1; i<=8; i++) {
print i, a[i]
}
print "---"
}' file
1 201194
2 2012110634
3
4 Appliance
5 130
6 AB
7 i
8 Some optional (Notes )
---
1 300723
2 2017050006
3 (2016111550)
4 Device
5 16
6 AB
7 i
8 Note
---
If you wanted a CSV output, for example, then it'd just be:
$ awk -v OFS=',' 'match($0,/^([0-9]+)_([0-9]+)(\([0-9]+\))?\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)/,a) {
for (i=1; i<=8; i++) {
printf "%s%s", a[i], (i<8?OFS:ORS)
}
}' file
201194,2012110634,,Appliance,130,AB,i,Some optional (Notes )
300723,2017050006,(2016111550),Device,16,AB,i,Note
Massage to suit...

Exclude one string from bash output

I'm working on a project in which I need to exclude the first line that matches a pattern from some output (or a file). The difficulty is that only that one line, the first match in the stream, should be excluded.
For example, if I have:
1 abc
2 qwerty
3 open
4 abc
5 talk
After some script working I should have this:
2 qwerty
3 open
4 abc
5 talk
NOTE: I don't know anything about digits before words, so I can't filter the output using knowledge about them.
I've written a small script with grep, but it removes every line that matches the pattern:
'some program' | grep -v "abc"
Read info about awk, sed, etc. but didn't understand if I can solve my problem.
Anything helps, Thank you.
Using awk:
some program | awk '{ if (/abc/ && !seen) { seen = 1 } else print }'
Alternatively, using only filters:
some program | awk '!/abc/ || seen { print } /abc/ && !seen { seen = 1 }'
You can use Ex editor. For example to remove the first pattern from the file:
ex +"/abc/d" -scwq file.txt
From the input (replace cat with your program):
ex +"/abc/d" +%p -scq! <(cat file.txt)
You can also read from stdin by replacing cat with /dev/stdin.
Explanation:
+cmd - execute Ex/Vim command
/pattern/d - find the pattern and delete,
%p - print the current buffer
-s - silent mode
-cq! - quit without saving (!)
<(cmd) - shell process substitution
To delete lines by number, give sed the line numbers:
sed 1,2d
Replace 1 and 2 with the first and last line numbers you want to delete.
Otherwise, you can delete by pattern (this removes every matching line):
sed '/pattern to match/d'
To remove only the first match, with GNU sed:
sed '0,/abc/{//d;}'
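The 0,/abc/ address is a GNU sed extension. For a sed without it, one possible workaround is to use the hold space as a "first match already seen" marker. This is only a sketch; the marker text seen is arbitrary and the demo file is the sample from the question.

```sh
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' '1 abc' '2 qwerty' '3 open' '4 abc' '5 talk' > "$workdir/file"

# On a match, swap in the hold space: if it is still empty this is the first
# match, so mark the hold space, swap back and delete the line. Later matches
# find the mark, swap back and are printed normally.
result=$(sed '/abc/{x;/^$/{s/^/seen/;x;d;};x;}' "$workdir/file")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Only "1 abc" is dropped; "4 abc" is kept because the hold space is no longer empty when it is reached.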
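You can also use a list of commands { list; } to read (and discard) the first line and print the rest. Note that this drops the first line whether or not it matches the pattern, so it only fits when the match is known to be on line 1: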
command | { read first_line; cat -; }
Simple example:
$ cat file
1 abc
2 qwerty
3 open
4 abc
5 talk
$ cat file | { read first_line; cat -; }
2 qwerty
3 open
4 abc
5 talk
awk '!/1/' file
2 qwerty
3 open
4 abc
5 talk
That's all!

find a pattern and print line based on finding the first pattern sed, awk grep

I have a rather large file. What is common to all is the hostname to break each section example :
HOSTNAME:host1
data 1
data here
data 2
text here
section 1
text here
part 4
data here
comm = 2
HOSTNAME:host-2
data 1
data here
data 2
text here
section 1
text here
part 4
data here
comm = 1
The above prints
As you see above, in between each section there are other sections broken down by key words or lines that have specific values
I would like to use a one-liner to print the hostname for each section and then print whichever lines I want to extract under each hostname section.
Can you please help? I am currently using grep -C 10 HOSTNAME | grep -C pattern
but this assumes that there are 10 lines in each section. This is not an optimal way to do it; can someone show a better way? I also need to be able to print more than one line under each pattern that I find. So if I find data 1 and there are additional lines under it, I would like to grab and print them.
So output of command would be like
grep -C 10 HOSTNAME | grep data 1
grep -C 10 HOSTNAME | grep -A 2 data 1
HOSTNAME:Host1
data 1
HOSTNAME:Hoss2
data 1
Beside Grep I use this sed command to print my output
sed -r '/HOSTNAME|shared/!d' filename
The only problem with this sed command is that it only prints the lines that have patterns shared & HOSTNAME in them. I also need to specify the number of lines I like to print in my case under the line that matched patterns shared. So I like to print HOSTNAME and give the number of lines I like to print under second search pattern shared.
Thanks
awk to the rescue!
$ awk -v lines=2 '/HOSTNAME/{c=lines} NF&&c&&c--' file
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
This prints the number of lines given by lines, starting at each pattern match (the matching line included) and skipping empty lines.
If you want to specify a secondary keyword instead of a number of lines:
$ awk -v key='data 1' '/HOSTNAME/{h=1; print} h&&$0~key{print; h=0}' file
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
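The two ideas can probably be combined: always print the HOSTNAME line, then print a chosen number of lines starting at the secondary keyword. The following is a sketch under the assumption that the key is a plain regular expression and that n counts the key line itself; the file name hosts.txt and the shortened demo data are made up.

```sh
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'HOSTNAME:host1' 'data 1' 'data here' 'data 2' 'text here' \
              'HOSTNAME:host-2' 'data 1' 'data here' 'comm = 1' > "$workdir/hosts.txt"

result=$(awk -v key='data 1' -v n=2 '
    /^HOSTNAME:/ { print; c = 0; next }   # always print the section header
    $0 ~ key     { c = n }                # key line found: arm the counter
    c > 0        { print; c-- }           # print the key line and n-1 followers
' "$workdir/hosts.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With n=2 each section yields its HOSTNAME line, the "data 1" line and the one line after it. Since key is used as a regex, 'data 1' would also match a line such as "data 10"; anchor it if that matters.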
Here is a sed twoliner:
sed -n -r '/HOSTNAME/ { p }
/^\s+data 1/ {p }' hostnames.txt
It prints (p)
when the line contains a HOSTNAME
when the line starts with some whitespace (\s+) followed by your search criterion (data 1)
non-matching lines are not printed (due to the sed -n option)
Edit: Some remarks:
this was tested with GNU sed 4.2.2 under linux
if your sed version does not support -r, you don't need it: drop the option and change the second pattern to /^.*data 1/
we can squash everything in one line with ;
Putting it all together, here is a revised version in one line, without the need for the extended regex ( i.e without -r):
sed -n '/HOSTNAME/ { p } ; /^.*data 1/ {p }' hostnames.txt
The OP requirements seem to be very unclear, but the following is consistent with one interpretation of what has been requested, and more importantly, the program has no special requirements, and the code can easily be modified to meet a variety of requirements. In particular, both search patterns (the HOSTNAME pattern and the "data 1" pattern) can easily be parameterized.
The main idea is to print all lines in a specified subsection, or at least a certain number up to some limit.
If there is a limit on how many lines in a subsection should be printed, specify a value for limit, otherwise set it to 0.
awk -v limit=0 '
/^HOSTNAME:/ { subheader=0; hostname=1; print; next}
/^ *data 1/ { subheader=1; print; next }
/^ *data / { subheader=0; next }
subheader && (limit==0 || (subheader++ < limit)) { print }'
Given the lines provided in the question, the output would be:
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
(Yes, I know the variable 'hostname' in the awk program is currently unused, but I included it to make it easy to add a test to satisfy certain obvious requirements regarding the preconditions for identifying a subheader.)
sed -n -e '/hostname/,+p' -e '/Duplex/,+p'
The simplest way to do it is to combine two sed commands ..

Finding the pattern and replacing the pattern inside the file using unix

I need your help with Unix. I have a file in which values are declared, and I have to substitute each value wherever its name is used. For example, the file declares values for &abc and &ccc, and later occurrences of &abc and &ccc must be replaced by those values, as shown in the output below.
Input File
go to &abc=ddd;
if file found &ccc=10;
now the value name is &abc;
and the age is &ccc;
Output:
go to &abc=ddd;
if file found &ccc=10;
now the value name is ddd;
and the age is 10;
Try using sed.
#!/bin/bash
# The input file is a command line argument.
input_file="${1}"
# The map of variables to their values
declare -A value_map=( [abc]=ddd [ccc]=10 )
# Loop over the keys in our map.
for variable in "${!value_map[@]}" ; do
echo "Replacing ${variable} with ${value_map[${variable}]} in ${input_file}..."
sed -i "s|${variable}|${value_map[${variable}]}|g" "${input_file}"
done
This simple bash script will replace abc with ddd and ccc with 10 in the given file. Here is an example of it working on a simple file:
$ cat file.txt
so boo aaa abc
duh
abc
ccc
abcccc
hmm
$ ./replace.sh file.txt
Replacing abc with ddd in file.txt...
Replacing ccc with 10 in file.txt...
$ cat file.txt
so boo aaa ddd
duh
ddd
10
ddd10
hmm
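The script above hard-codes the map and replaces the bare names abc and ccc. If the values should instead be picked up from the &name=value; declarations in the file itself, as the question describes, a single awk pass might do it. This is a sketch that assumes declarations come before their uses, that uses are written exactly as &name; and that values contain no & or backslash; the file name input.txt is made up.

```sh
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'go to &abc=ddd;' 'if file found &ccc=10;' \
              'now the value name is &abc;' 'and the age is &ccc;' > "$workdir/input.txt"

result=$(awk '
    # Declaration line such as "&abc=ddd;": remember the value, print unchanged.
    match($0, /&[A-Za-z_][A-Za-z0-9_]*=[^;]*;/) {
        decl = substr($0, RSTART + 1, RLENGTH - 2)    # e.g. "abc=ddd"
        eq = index(decl, "=")
        value[substr(decl, 1, eq - 1)] = substr(decl, eq + 1)
        print
        next
    }
    # Any other line: replace each known "&name;" with "value;".
    {
        for (name in value) gsub("&" name ";", value[name] ";")
        print
    }
' "$workdir/input.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

The declaration lines pass through unchanged and the last two lines become "now the value name is ddd;" and "and the age is 10;".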
