Print consecutive lines with a pattern - text

I am trying to extract the first group of lines that start with #.
I just wanted to get the first pattern dynamically
example:
Input
# script:
something
## Select the operation ** -
# mkfolders) create -
# copy) copy Overlay -
# bkp) BKP of overlay -
something
# commentary
# commentary
something
Output
## Select the operation **
# mkfolders) create -
# copy) copy Overlay -
# bkp) BKP of overlay -
I was using the following sed command for this
sed -n 5,8p file
The problem with it is that if the file changes at all, the command has to be changed too. A dynamic solution that prints only the first group of consecutive lines would be welcome.
Any solution?
Thanks in advance.

Use awk's paragraph mode
$ awk -v RS= '/^#/{print; exit}' ip.txt
# script:
$ awk -v RS= '/^##/{print; exit}' ip.txt
## Select the operation ** -
# mkfolders) create -
# copy) copy Overlay -
# bkp) BKP of overlay -
-v RS= : when RS is set to the empty string, one or more consecutive empty lines act as the input record separator
/^##/ : matches an input record starting with ##
print; exit : prints the input record and then exits, since only the first such record is needed
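As a quick check of the explanation above, here is a minimal sketch on made-up input (the sample lines are an assumption, not the asker's real file). Paragraph mode splits only on empty lines, so the groups have to be separated by blank lines for it to work.

```shell
# Hypothetical sample: three blocks separated by empty lines.
sample=$(printf '%s\n' '# script:' '' '## Select the operation' '# mkfolders) create' '' '# commentary')

# An empty RS enables paragraph mode: each blank-line-separated block is one record.
first_block=$(printf '%s\n' "$sample" | awk -v RS= '/^##/{print; exit}')
printf '%s\n' "$first_block"
```

If the groups are separated by ordinary text lines instead of blank lines, the whole file is a single record and this approach prints everything.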
The question is somewhat unclear. The code below prints the first record that has a # character on two consecutive lines:
$ awk -v RS= '/#[^\n]+\n#/{print; exit}' ip.txt
## Select the operation ** -
# mkfolders) create -
# copy) copy Overlay -
# bkp) BKP of overlay -

This might work for you (GNU sed):
sed -rn '/^\s*#/!b;:a;N;/^\s[^#]/M!ba;/^(\s*#.*\n#[^\n]*).*/!b;s//\1/p;q' file
Forget lines that do not begin with #. Collect subsequent lines in the pattern space until a line that does not begin with a # or the end-of-file condition. If the collection is less than two lines, forget them and repeat. Otherwise, remove the last line if it does not begin with #, print the collection and quit.
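The same steps can also be sketched in awk, which some may find easier to read than the sed one-liner. This is only a sketch under the assumption that a "group" means two or more consecutive lines starting with #; the helper name first_comment_group and the demo file are hypothetical.

```shell
# first_comment_group FILE: print the first run of two or more
# consecutive lines that start with "#". (Hypothetical helper name.)
first_comment_group() {
  awk '
    /^#/   { buf = buf $0 "\n"; n++; next }          # extend the current run
    n >= 2 { printf "%s", buf; found = 1; exit }     # run just ended and is long enough
           { buf = ""; n = 0 }                       # run was too short: forget it
    END    { if (!found && n >= 2) printf "%s", buf } # run reached the end of the file
  ' "$1"
}

demo=$(mktemp)
printf '%s\n' '# script:' 'something' '## Select the operation ** -' \
  '# mkfolders) create -' 'something' '# commentary' > "$demo"
group=$(first_comment_group "$demo")
rm -f "$demo"
printf '%s\n' "$group"
```

Unlike paragraph mode, this does not need blank lines between the groups, so it also works when the groups are separated by ordinary text.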

Related

Sed variable expansion and \n match when trying to append some text

I've the following template file
# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
# Example rule allowing access from your local networks.
# Adapt localnet in the ACL section to list your (internal) IP networks
# from where browsing should be allowed
#http_access allow localnet
#http_access allow localhost
I want to append some text after the lines below using sed (this is a requirement, I can't use another tool)
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
I tried to use the following command:
sed "/# INSERT YOUR OWN RULE/aTEXTTOAPPEND" test.txt
The result:
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
TEXTTOAPPEND
#
The old code used perl to do the job
perl -i -0pe 's/(#( |\t)*\n# INSERT YOUR OWN RULE.*\n#( |\t)*\n)/\1\n'"$REPLACE"'\n/g' /etc/squid/squid.conf
I'm facing two problems:
1 - Not being able to use a variable in this append command
I tried so far:
$ echo $REPLACE
TEXTTOAPPEND
$ sed "/# INSERT YOUR OWN RULE/a${REPLACE}" test.txt
sed: -e expression #1, char 53: unknown command: `B'
sed "/# INSERT YOUR OWN RULE/a\${REPLACE}" test.txt
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
${REPLACE}
#
2 - Not being able to match the hash character on the line after the INSERT YOUR OWN RULE line (when I add the \n the command stops matching, so nothing is added in this case)
sed "/# INSERT YOUR OWN RULE.*\n#/aTEXTTOAPPEND" test.txt
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
Expected output:
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
TEXTTOAPPEND
Can someone point out what I'm missing here? For variable expansion I thought using double quotes and ${var} would do the job.
Update:
I was trying everything in git-bash; when I tried on a real Linux machine, the command below worked:
echo $REPLACE
ABC
sed "/# INSERT YOUR OWN RULE.*/a${REPLACE}" test.txt
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
ABC
#
The only problem now is number 2: how to work with the \n in this case?
sed "/# INSERT YOUR OWN RULE.*\n#.*/a${REPLACE}" test.txt
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
Can someone point out what I'm missing here?
Commands in sed are separated by a newline. When sed sees a newline, it assumes the command ends there. I can reproduce your sed: -e expression #1, char 53: unknown command: `B' with a simple variable that contains a newline and whose next line starts with B:
replace="something
B something"
sed "/# INSERT YOUR OWN RULE/a${replace}"
sed sees asomething and appends something to the output and newline terminates the a command. Then it sees B something and tries to parse that as a command, but B is invalid.
The safest way to append the content of a variable with sed is to use a temporary file with the r command. Note that you need a newline after the filename given to r, because any ; will be parsed as part of the filename! In bash, you can combine process substitution with a here string to create a temporary file descriptor*. To output the current pattern space and read the next line into the pattern space, use the n command. Like this:
sed '/# INSERT YOUR OWN RULE/{n;r'<(cat <<<"$replace")$'\n}'
It will append the content of replace after the next line after the regex. Note that after r there is $'\n' - a newline.
* Just a <(echo "$replace") would work too, but I somehow feel the cat <<<"$replace" will be better in memory consumption for big strings, I didn't check that in any way.
The following script:
replace="anything
can
be here!"
cat <<EOF |
#http_access deny to_localhost
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
# Example rule allowing access from your local networks.
EOF
sed '/# INSERT YOUR OWN RULE/{n;r'<(cat <<<"$replace")$'\n}'
outputs on repl:
#http_access deny to_localhost
#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
anything
can
be here!
# Example rule allowing access from your local networks.
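For shells without process substitution, the same r idea can be sketched with an ordinary temporary file. This is a minimal sketch, assuming mktemp is available; the variable names and the shortened sample input are made up for illustration. Splitting the script across two -e options supplies the newline that has to follow the file name.

```shell
replace='anything
can
be here!'

tmp=$(mktemp)                       # temporary file holding the text to insert
printf '%s\n' "$replace" > "$tmp"

# Each -e chunk ends with an implicit newline, which terminates the file name given to r.
result=$(printf '%s\n' '#' '# INSERT YOUR OWN RULE(S) HERE' '#' '# Example rule' |
  sed -e '/# INSERT YOUR OWN RULE/{n;r '"$tmp" -e '}')
rm -f "$tmp"
printf '%s\n' "$result"
```

Because the text travels through a file rather than through the sed script, newlines and characters such as / or & in the variable need no escaping.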
how to work with the \n in this case?
Sed reads one line at a time. As this is very similar, I would just point to this answer I wrote yesterday that deals with the same problem. In this case the sed script would need to buffer two lines at a time with the N command:
sed '
: restart
N # buffer two lines
: loop
# match two lines
/# INSERT YOUR OWN RULE.*\n#/{
r'<(cat <<<"$replace")'
# print and start over
n ; b restart
}
# hold, print leading line, change, remove leading line
h ; s/\n.*// ; p ; x ; s/[^\n]*\n//
# append next line and loop
N
b loop
'
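To see why the \n in the original attempt never matched, here is a small sketch on made-up two-line input. It only illustrates the effect of N and is not a replacement for the script above.

```shell
# Without N the pattern space holds a single line, so \n can never match.
single=$(printf '%s\n' 'first' '#' | sed -n '/first\n#/p')

# N appends the next input line, joined by a newline, so the same regex now matches.
joined=$(printf '%s\n' 'first' '#' | sed -n 'N;/first\n#/p')

printf 'single: [%s]\njoined: [%s]\n' "$single" "$joined"
```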
But you can't use rsomething or asomething with sed -z: records are then separated by NUL bytes, so sed reads the whole file as a single record and the appended text is output after the whole file. You can test this with sed -z '/# INSERT YOUR OWN RULE.*\n#.*/r'<(cat <<<$replace); it just prints the content of $replace at the end of the file.
This might work for you (GNU sed):
cat <<! | sed 'N;/^# INSERT YOUR OWN RULE.*\n#/!{P;D};r /dev/stdin' file
Text to be appended or a variable
$var
or both $var
!
Construct a here document to be appended and pipe it through to the sed invocation.
The sed invocation uses the N and the P;D commands to open a two line window throughout the length of the file but on matching the required two lines invokes the r command which appends the former here document.
An alternative:
sed '1{x;s/^/'"$var"'/;x};N;/^# INSERT YOUR OWN RULE.*\n#/!{P;D};G' file
But this should work too:
sed 'N;/^# INSERT YOUR OWN RULE.*\n#/!{P;D};a\'"$var" file
Sometimes the simplest way is the best, maybe this is the case?
sed '8aTEXTTOAPPEND' -i test.txt
This will add 'TEXTTOAPPEND' after the 8th line.
And to automate this:
N=$(grep -n 'INSERT YOUR OWN RULE' test.txt) # get line number
N=${N%%:*} # remove all except line number
((N++)) # increment, because the text goes after the # line that follows the match
sed "${N}aTEXTTOAPPEND" -i test.txt # add text after $Nth line
Oneliner
N=$(grep -n 'INSERT YOUR OWN RULE' test.txt); N=${N%%:*}; ((N++)); sed "${N}aTEXTTOAPPEND" -i test.txt
sed "/INSERT YOUR OWN RULE/a '\n'TEXTTOAPPEND" file-for-change |
sed "/INSERT YOUR OWN RULE/{n;s/'/#/;n;s/^'//;n;d"}
The second sed changes and removes the leading ' at the start of the next two lines.
Or, a simpler variant:
sed -r '/INSERT YOUR/,+1{s/(#$)/\1\nTEXTTOAPPEND/}'
It works with GNU sed.

How do I delete everything until the end of a file after a pattern using sed?

I want to change this:
--- BEGINNING OF file.txt ---
# use another authentication method.
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all
--- END ---
... to this:
--- BEGINNING OF file.txt ---
# use another authentication method.
# TYPE DATABASE USER ADDRESS METHOD
--- END ---
Here's my latest attempt ("-i" to be added):
sed "s/^\(.*type\s*database\s*user\s*address\s*method\).*$/\1/i" file.txt
Thank you!
The s command is indeed sed's Swiss Army knife, but when you want to delete entire lines, it's the wrong tool. You want the d command:
sed '0,/type\s\+database\s\+user\s\+address\s\+method/I!d'
The address 0,/.../I matches all lines from the beginning of the file (0) to the first line that matches the regular expression, case-insensitively (the I flag and the 0 address are available in GNU sed but not POSIX sed). ! inverts the match so d deletes the lines that don't match.
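A different technique that gives the same result on this input is to print everything up to the first match and then quit. This is a sketch rather than the method above: it is case-sensitive and uses a POSIX bracket expression instead of the GNU-only I flag and \s, and the sample lines are abbreviated from the question.

```shell
# q prints the current line and stops, so nothing after the first match is output.
kept=$(printf '%s\n' \
  '# use another authentication method.' \
  '# TYPE  DATABASE  USER  ADDRESS  METHOD' \
  '# "local" is for Unix domain socket connections only' \
  'local  all  all' |
  sed '/TYPE[[:space:]]\{1,\}DATABASE[[:space:]]\{1,\}USER/q')
printf '%s\n' "$kept"
```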
This is also easily done with awk:
$ awk -v f=1 'f
{s=tolower($0)}
s~/type\s+database\s+user\s+address\s+method/{f=!f}' file
Or,
$ awk 'BEGIN{f=1}
f
{s=tolower($0)}
s~/type\s+database\s+user\s+address\s+method/{exit}' file
Outline:
BEGIN{f=1} # set a flag to true value
f # print if that flag is true
{s=tolower($0)} # make the line lowercase
s~/regex/{exit} # at the regex but after printing -- exit
Which can be further simplified to:
$ awk '1
tolower($0)~/type\s+database\s+user\s+address\s+method/{exit}' file

Delete lines from a file matching first 2 fields from a second file in shell script

Suppose I have setA.txt:
a|b|0.1
c|d|0.2
b|a|0.3
and I also have setB.txt:
c|d|200
a|b|100
Now I want to delete from setA.txt the lines whose first 2 fields match those of a line in setB.txt, so the output should be:
b|a|0.3
I tried:
comm -23 <(sort setA.txt) <(sort setB.txt)
But the equality is defined for whole line, so it won't work. How can I do this?
$ awk -F\| 'FNR==NR{seen[$1,$2]=1;next;} !seen[$1,$2]' setB.txt setA.txt
b|a|0.3
This reads through setB.txt just once, extracts the needed information from it, and then reads through setA.txt while deciding which lines to print.
How it works
-F\|
This sets the field separator to a vertical bar, |.
FNR==NR{seen[$1,$2]=1;next;}
FNR is the number of lines read so far from the current file and NR is the total number of lines read. Thus, when FNR==NR, we are reading the first file, setB.txt. If so, set the value of associative array seen to true, 1, for the key consisting of fields one and two. Lastly, skip the rest of the commands and start over on the next line.
!seen[$1,$2]
If we get to this command, we are working on the second file, setA.txt. Since ! means negation, the condition is true if seen[$1,$2] is false which means that this combination of fields one and two was not in setB.txt. If so, then the default action is performed which is to print the line.
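The walkthrough above can be tried end to end with the sample data from the question. This sketch is a small variation, not the exact command: it tests membership with (i,j) in array, which avoids creating empty entries in seen for every line of setA.txt. The temporary directory is an assumption made for the demo.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a|b|0.1' 'c|d|0.2' 'b|a|0.3' > "$dir/setA.txt"
printf '%s\n' 'c|d|200' 'a|b|100'           > "$dir/setB.txt"

# First file (setB.txt): remember each (field1, field2) pair.
# Second file (setA.txt): print only lines whose pair was not remembered.
remaining=$(awk -F'|' 'FNR==NR { seen[$1,$2]; next } !(($1,$2) in seen)' \
  "$dir/setB.txt" "$dir/setA.txt")
rm -rf "$dir"
printf '%s\n' "$remaining"
```

One caveat of the FNR==NR idiom: if the first file is empty, the condition is also true for every line of the second file, so nothing is printed.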
This should work:
sed -n 's#\(^[^|]*|[^|]*\)|.*#/^\1/d#p' setB.txt |sed -f- setA.txt
How this works:
sed -n 's#\(^[^|]*|[^|]*\)|.*#/^\1/d#p'
generates an output:
/^c|d/d
/^a|b/d
which is then used as a sed script for the next sed after the pipe and outputs:
b|a|0.3
(IFS=$'|'; cat setA.txt | while read x y z; do grep -q -P "\Q$x|$y|\E" setB.txt || echo "$x|$y|$z"; done; )
Explanation: grep -q only tests whether grep can find the regexp, without printing anything; -P means use Perl syntax, so that | is matched literally because of the \Q..\E construct.
IFS=$'|' makes bash use | instead of whitespace (space, tab, etc.) as the token separator.

Replace String in File with 1st,2nd... line in another file

I have a string in file1 stored as a variable.
I need to replace the variable in file1 with the first line of another file - file2.
stop for a while(15 seconds or so) So that i use file1 for some
Then replace the variable in file1 with the second line of file2.
stop for a while(15 seconds or so)
Repeat the above step for the third line of file2 and so on. And exit after doing the replacement with the last row in file2.
You can do something like this:
#!/bin/bash
while IFS= read -r line; do
    # Using sed is not a good idea if file2 may contain characters that have
    # meaning in a sed regex (or may be your delimiter), and substituting it
    # directly into the awk code would break if there was a quote in there.
    # Passing it with -v avoids both problems. Note that awk still expands
    # backslash escapes in -v values and treats & in the replacement as the
    # matched text.
    #
    # Also, we'll need the template file later, so we can't replace in-place.
    # Instead, write the result of the substitution to its own file and work
    # on that.
    awk -v SUBST="$line" '{ gsub("VARIABLE", SUBST, $0); print $0 }' file1 > file1.cooked
    # Instead of sleeping, I encourage you to do the actual work here. That
    # way, you will not introduce brittle timing issues that will vex you when
    # things break in non-obvious ways.
    sleep 15
done < file2
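If lines in file2 may contain & or backslashes, gsub with -v is not fully literal: awk expands escape sequences in -v values, and & in a gsub replacement stands for the matched text. The following is a sketch of one way around that, passing both strings through the environment and replacing with index and substr instead of a regex. The helper name substitute_literal and the sample template are hypothetical.

```shell
# substitute_literal PLACEHOLDER REPLACEMENT FILE
# Replace every literal occurrence of PLACEHOLDER in FILE, writing to stdout.
substitute_literal() {
  PLACEHOLDER=$1 SUBST=$2 awk '
    BEGIN { ph = ENVIRON["PLACEHOLDER"]; repl = ENVIRON["SUBST"]; n = length(ph) }
    {
      out = ""; rest = $0
      # index() does a plain string search, so no character is special
      while (n > 0 && (i = index(rest, ph)) > 0) {
        out  = out substr(rest, 1, i - 1) repl
        rest = substr(rest, i + n)
      }
      print out rest
    }
  ' "$3"
}

tpl=$(mktemp)
printf '%s\n' 'value=VARIABLE;' > "$tpl"
cooked=$(substitute_literal VARIABLE 'a&b\n' "$tpl")
rm -f "$tpl"
printf '%s\n' "$cooked"
```

Inside the loop above, the call would take the place of the awk line, for example substitute_literal VARIABLE "$line" file1 > file1.cooked.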

Unix/Linux, Delete comments from lines

I need to delete/remove comments from a user-input line without deleting any code. So for example:
mail -s 'text' brown < text #comments
How do I remove the comments and leave the code intact?
I can delete lines that begin with #, but not if it begins somewhere in the middle of the lines.
I tried:
echo $line | sed -e 's/\
but it does not work. Any idea what I'm doing wrong?
Also, how to detect cases in which # is not used to begin a comment?
For example quoted # and line of code that ends with # since they are not comments.
echo $line | sed -e '/^#/d'
In this line, the # is not used as a comment, but as part of the code. I figured out that I need to detect whether the # is within quotes or does not have a whitespace character before it. How do I leave the output as it is?
You can remove all from # to end of line using this awk
awk '{sub(/#.*$/,"")}1' file
But if you have file like this:
#!/bin/bash
pidof tail #See if tail is running
if [ $? -ne 0 ] ; then #start loop
awk '{print " # "$8}' file >tmp # this is my code
fi # end of loop
awk -F# '{for (i=1;i<=NF;i++) print $i}' file > tmp2
a=a+1 # increment a
There is no way to remove the comments automatically without destroying some code.
Well, consider what almost always comes after a comment in bash.
#comment...
#another comment
A line break! Which is effectively a character. So, all you have to do is add a wildcard after your #, to include the actual comment text, then put a line break 'character' at the end. You'll actually need to use \n rather than trying to hit Enter. Unfortunately I'm not on linux at the moment, and sometimes delimiters (the backslash) don't work properly. Trying something like `\n` might work, or maybe using $'\n'.
EDIT: With regex ^ will indicate the start of a new line, while $ indicates the end.
As for not deleting actual code, matching for a space immediately followed by # should work. I would match for a space OR line break preceding the #.
At any rate, please be sure not to accidentally ruin whatever you're working on, just in case I'm wrong.
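The rule discussed in this thread (a # starts a comment only at the start of the line or after whitespace, and never inside quotes) can be sketched as a small character scanner in awk. This is a rough heuristic under those assumptions, not a shell parser: it ignores backslash escapes and here-documents, and it would also strip a #!/bin/bash line. The function name strip_comments is hypothetical.

```shell
# strip_comments: read lines on stdin, drop "#..." comments that begin at the
# start of a line or after whitespace and are outside single/double quotes.
strip_comments() {
  awk '
  {
    out = ""; quote = ""; prev = " "   # treat line start like whitespace
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (quote != "") {                          # inside a quoted string
        if (c == quote) quote = ""
      } else if (c == "\"" || c == "\047") {      # opening double or single quote
        quote = c
      } else if (c == "#" && (prev == " " || prev == "\t")) {
        break                                     # the comment starts here
      }
      out = out c
      prev = c
    }
    sub(/[ \t]+$/, "", out)                       # trim whitespace left before the comment
    print out
  }'
}

clean1=$(printf '%s\n' "mail -s 'text' brown < text #comments" | strip_comments)
clean2=$(printf '%s\n' 'a=${#b}; echo "x # y" # real comment' | strip_comments)
printf '%s\n' "$clean1" "$clean2"
```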
