Prepend file content to match in sed - text

I am trying to append the content of a file before the closing body tag in an html document. I've tried
cat test.html | sed -e $'/<\/body>/{ r insert.html ... }'
using various combinations of \np, \nd at ..., but everything seems to be inserted after the tag.
It would also be nice if additional string constants could be added around the content of insert.html, such as centering tags etc.

If sed is your hard requirement, you can try this in GNU sed:
sed '/<\/body>/e cat insert.html' test.html
It uses the GNU-specific e command (here, e cat insert.html), which, unlike r filename, is executed before the end of the current cycle, that is, before the </body> line is printed.
Note (from the docs) r filename will:
Queue the contents of filename to be read and inserted into the output stream at the end of the current cycle, or when the next input line is read.
and e command:
[...] unlike the r command, the output of the command will be printed immediately; the r command instead delays the output to the end of the current cycle.
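The question also asks about wrapping the inserted content in extra string constants. Where GNU sed is not available, one possible approach is a small awk program. The following is only a sketch: the wrapper strings (<center> and </center>), the variable names, and the demo files are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# Demo files in a scratch directory (contents are illustrative).
workdir=$(mktemp -d)
printf '%s\n' '<html>' '<body>' '<p>Hello</p>' '</body>' '</html>' > "$workdir/test.html"
printf '%s\n' '<p>Inserted</p>' > "$workdir/insert.html"

# When a line contains </body>, first print the opening wrapper, the file
# contents, and the closing wrapper; then print the line itself unchanged.
result=$(awk -v insert_file="$workdir/insert.html" \
             -v before='<center>' -v after='</center>' '
  /<\/body>/ {
    print before
    while ((getline line < insert_file) > 0) print line
    close(insert_file)
    print after
  }
  { print }
' "$workdir/test.html")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Because the wrapper strings are passed with -v, awk processes backslash escapes in them, so this sketch assumes they contain no backslashes.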

Related

Use bash to find line in java files which include a pattern, and then replace another part of the line

I have a directory that includes a lot of java files, and in each file I have a class variable:
String system = "x";
I want to create a bash script, run from that same directory, that visits only the Java files there and replaces this instance of x with y, where x and y are each a single word. This may not be the only occurrence of the word x in a file, but it will definitely be the first.
I want to be able to execute my script in the command line similar to:
changesystem.sh -x -y
This way I can specify what the x should be, and the y I wish to replace it with. I found a way to find and print the line number at which the first instance of a pattern is found:
awk '$0 ~ /String system/ {print NR}' file
I then found how to replace a substring on a given line using:
awk 'NR==line_number { sub("x", "y") }'
However, I have not found a way to combine them. Maybe there is also an easier way? Or even, a better and more efficient way?
Any help/advice will be greatly appreciated
You may create a changesystem.sh file with the following GNU awk script:
#!/bin/bash
for f in *.java; do
awk -i inplace -v repl="$1" '
!x && /^\s*String\s+system\s*=\s*".*";\s*$/{
lwsp=gensub(/\S.*/, "", 1);
print lwsp"String system = \""repl"\";";
x=1;next;
}1' "$f";
done;
Or, with any awk:
#!/bin/bash
for f in *.java; do
awk -v repl="$1" '
!x && /^[[:space:]]*String[[:space:]]+system[[:space:]]*=[[:space:]]*".*";[[:space:]]*$/{
lwsp=$0; sub(/[^[:space:]].*/, "", lwsp);
print lwsp"String system = \""repl"\";";
x=1;next
}1' "$f" > tmp && mv tmp "$f";
done;
Then, make the file executable:
chmod +x changesystem.sh
Then, run it like
./changesystem.sh 'new_value'
Notes:
for f in *.java; do ... done iterates over all *.java files in the current directory
-i inplace - GNU awk feature to perform replacement inline (not available in a non-GNU awk)
-v repl="$1" passes the first argument of the script to the awk command
!x && /^\s*String\s+system\s*=\s*".*";\s*$/ - if x is false and the record consists of optional leading whitespace (\s* or [[:space:]]*), then String, one or more whitespace characters, system, an = surrounded by zero or more whitespace characters, a " char, any text, and a closing "; followed by zero or more whitespace characters, then
lwsp=gensub(/\S.*/, "", 1); puts the leading whitespace in the lwsp variable (it removes all text starting with the first non-whitespace char from the line matched)
lwsp=$0; sub(/[^[:space:]].*/, "", lwsp); - same as above, just in a different way since gensub is not supported in non-GNU awk and sub modifies the given input string (here, lwsp)
{print lwsp"String system = \""repl"\";";x=1;next}1 - prints the leading whitespace, then String system = " + the replacement string + ";, assigns 1 to x, and moves to the next line; otherwise, the trailing 1 just prints the line as is.
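The question asks for a script that takes both the old and the new value. A minimal sketch of that variant follows, assuming both values are plain words without backslashes (awk -v interprets escape sequences). The helper name replace_system_value and the demo file are hypothetical. It uses index() for a literal search, so the old value is not treated as a regular expression.

```shell
#!/bin/sh
# replace_system_value OLD NEW FILE  (hypothetical helper)
# Rewrites "OLD" to "NEW" on the first "String system =" line that contains it.
replace_system_value() {
  old=$1 new=$2 file=$3
  tmp=$(mktemp) || return 1
  awk -v old="$old" -v new="$new" '
    !replaced && /String[ \t]+system[ \t]*=/ {
      target = "\"" old "\""
      pos = index($0, target)            # literal search, no regex involved
      if (pos) {
        $0 = substr($0, 1, pos - 1) "\"" new "\"" substr($0, pos + length(target))
        replaced = 1
      }
    }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Demo on a throwaway file.
demo=$(mktemp -d)
printf '%s\n' 'class Demo {' '    String system = "x";' '    String other = "x";' '}' > "$demo/Demo.java"
replace_system_value x y "$demo/Demo.java"
result=$(cat "$demo/Demo.java")
printf '%s\n' "$result"
rm -rf "$demo"
```

To apply it to every Java file in the current directory, the call could be wrapped in the same for f in *.java loop used above.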
You don't need to pre-compute the line number. The whole job can be done by one not-too-complicated sed command. You probably do want to script it, though. For example:
#!/bin/bash
[[ $# -eq 3 ]] || {
echo "usage: $0 <context regex> <target regex> <replacement text>" 1>&2
exit 1
}
sed -si -e "/$1/ { s/\\<$2\\>/$3/; t1; p; d; :1; n; b1; }" ./*.java
That assumes that the files to modify are java source files in the current working directory, and I'm sure you understand the (loose) argument check and usage message.
As for the sed command itself,
the -s option instructs sed to treat each argument as a separate stream, instead of operating as if by concatenating all the inputs into one long stream.
the -i option instructs sed to modify the designated files in-place.
the sed expression takes the default action for each line (printing it verbatim) unless the line matches the "context" pattern given by the first script argument.
for lines that do match the context pattern,
s/\\<$2\\>/$3/ - attempt to perform the wanted substitution
the \< and \> match word start and end boundaries, respectively, so that the specified pattern will not match a partial word (though it can match multiple complete words if the target pattern allows)
t1 - if a substitution was made, then branch to label 1, otherwise
p; d - print the current line and immediately start the next cycle
:1; n; b1 - label 1 (reachable only by branching): print the current line and read the next one, then loop back to label 1. This prints the remainder of the file without any more tests or substitutions.
Example usage:
/path/to/replace_first.sh 'String system' x y
It is worth noting that this exposes the user to some details of sed's interpretation of regular expressions and replacement text, though that does not manifest for the example usage.
Note that that could be simplified by removing the context pattern bit if you are sure you want to modify the overall first appearance of the target in each file. You could also hard-code the context, the target pattern, and/or the replacement text. If you hard-code all three then the script would no longer need any argument handling or checking.
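If the arguments should be taken literally rather than as sed syntax, one option is to escape them before building the sed expression. The two helper names below are hypothetical, and the sketch assumes single-line arguments, basic regular expressions, and / as the s command delimiter.

```shell
#!/bin/sh
# Escape characters that are special in a basic regular expression
# (and the / delimiter) so the text matches literally.
escape_sed_pattern() {
  printf '%s\n' "$1" | sed 's|[][\\.*^$/]|\\&|g'
}

# Escape characters that are special in the replacement part of s/// .
escape_sed_replacement() {
  printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# Demo: the replacement text contains both / and &.
pattern=$(escape_sed_pattern 'x')
replacement=$(escape_sed_replacement '1/2 & more')
result=$(printf '%s\n' 'price: x' | sed "s/$pattern/$replacement/")
printf '%s\n' "$result"
```

In the script above, the escaped values would take the place of $2 and $3 inside the sed expression.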

Add pipe delimiter at the end of each row using unix

I am new to unix commands, so please forgive me if the code below is not correct.
I have files (xxxx.txt.date) on winscp with a header and a footer. I want to add N pipes (|) at the end of each row of all files, from the 2nd line to the second-to-last line (I don't want | in the header or the footer).
I have created a script in which I am using the command below:
sed -e "2,\$s/$/|/" $file | column -t
2,$s/$/|/: adds | at the end of every line from line 2
Now below are the issues i am facing
First
The data in the files doesn't change, although I am able to see the pipe added at the end of each row in Hive. How can I change the data in the files themselves?
I don't want | in footer.
Any suggestion or help will be appreciated.
Thanks in advance !!
If you need to append just one "|" at the end of each line except header and footer
sed -i '1n; $n; s/$/|/' file_name
1n; $n; : Just print first and last line as is.
-i : make changes to the file instead of printing to STDOUT.
If you need to append n pipes at the end of each line except Header and Footer. If you use the below awk command, you will have to redirect the output to a temporary file and then rename it.
Assumptions:
I am assuming your Header and Footer are standard and start with some character(e.g., H, F, T etc) or String(Header, Footer, Trailer etc)
I am assuming your original file is delimited with "|". You can specify your actual delimiter in the below awk.
awk -F'|' -v n=7 '{if(/^Header|^Footer/) {print} else {end="";for (i=1;i<=n;i++) end=sprintf("%s%s", end, "|"); rec=sprintf("%s%s", $0, end); print rec}}' file_name
n=number of times you want to repeat | at the end of each line.
^Header|^Footer - If the line starts with "Header" or "Footer", just print the record as it is. You can specify your header and footer strings from file.
for loop - prepares a string "end" which contains "|" n times.
rec - Contains concatenated string of entire record followed by end string
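If the header and footer are simply the first and last lines, the same result can be sketched with sed alone by skipping line 1 and the last line by address. The value of n and the demo file below are illustrative assumptions. The output goes to stdout, so it would still need to be redirected to a temporary file (or combined with -i in GNU sed) to change the file itself.

```shell
#!/bin/sh
n=3   # number of pipes to append (illustrative value)

# Build a string of n pipe characters.
pipes=''
i=0
while [ "$i" -lt "$n" ]; do
  pipes="${pipes}|"
  i=$((i + 1))
done

# Demo file with a header, two data rows, and a footer.
demo=$(mktemp)
printf '%s\n' 'HEADER' 'a|b' 'c|d' 'FOOTER' > "$demo"

# 1!  : not the first line;  $! : not the last line;  then append the pipes.
result=$(sed '1!{$!s/$/'"$pipes"'/;}' "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```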

Using gawk to Replace a Pattern of Text with the Contents of a File Whose Filename is Inside the Text

I am trying to replace text inside a text file according to a certain criteria.
For example, if I have three text files, with outer.txt containing:
Blah Blah Blah
INCLUDE inner1.txt
Etcetera Etcetera
INCLUDE inner2.txt
end of file
And inner1.txt containing:
contents of inner1
And inner2.txt containing:
contents of inner2
At the end of the replacement, the outer.txt file would look like:
Blah Blah Blah
contents of inner1
Etcetera Etcetera
contents of inner2
end of file
The overall pattern would be that for every instance of the word "INCLUDE", replace that entire line with the contents of the file whose filename immediately follows that instance of "INCLUDE", which in one case would be inner1.txt and in the second case would be inner2.txt.
Put more simply, is it possible for gawk to be able to determine which text file is to be embedded into the outer text file based on the very contents to be replaced in the outer text file?
With gnu sed
sed -E 's/( *)INCLUDE(.*)/printf "%s" "\1";cat \2/e' outer.txt
If you set the +x bit on the edit-file ('chmod +x edit-file'), then you can do:
g/include/s//cat/\
.w\
d\
r !%
w
q
Explanation:
g/include/s//cat/\
Starts a global command.
.w\
(from within the global context), overwrites the edit-file with the current line only (effectively: 'cat included_file', where you replace included_file for the filename in question.)
d\
(from within the global context), deletes the current line from the buffer. (i.e. deletes 'include included_file', again, included_file standing for the file in question).
r !%
(from within the global context), reads the output from executing the default file (which is file we are editing, and was overwritten above with 'cat...').
w
(finally, outside the global context). Writes (saves) the buffer back to the edit-file.
q
quit.
With GNU awk:
awk --load readfile '{if ($1=="INCLUDE") {printf "%s", readfile($2)} else print}' outer.txt
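Without the gawk readfile extension, the same idea can be sketched with getline, which should work in any POSIX awk. The demo files mirror the ones in the question. Note the assumption that every included file exists and is readable: if it is not, this sketch silently drops the INCLUDE line.

```shell
#!/bin/sh
# Recreate the example files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'Blah Blah Blah' 'INCLUDE inner1.txt' 'Etcetera Etcetera' \
              'INCLUDE inner2.txt' 'end of file' > "$workdir/outer.txt"
printf '%s\n' 'contents of inner1' > "$workdir/inner1.txt"
printf '%s\n' 'contents of inner2' > "$workdir/inner2.txt"

# For an INCLUDE line, print the named file line by line instead of the line
# itself; every other line is printed unchanged.
result=$(cd "$workdir" && awk '
  $1 == "INCLUDE" {
    included = $2
    while ((getline line < included) > 0) print line
    close(included)
    next
  }
  { print }
' outer.txt)

printf '%s\n' "$result"
rm -rf "$workdir"
```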
Another ed approach would be something like:
#!/bin/sh
ed -s outer.txt <<-'EOF'
/Blah Blah Blah/+1kx
/end of file/-1ky
'xr inner1.txt
'xd
'yr inner2.txt
'yd
%p
Q
EOF
Change Q to w if in-place editing is required
Remove the %p to silence the output.

Output only the first pattern-line and its following line

I need to filter the output of a command.
I tried this.
bpeek | grep nPDE
My problem is that I need all matches of nPDE and the line after each matching line. So the output would be like:
iteration nPDE
1 1
iteration nPDE
2 4
The best case would be if it would show me the found line only once and then only the line after it.
I found solutions with awk, But as far as I know awk can only read files.
There is an option for that.
grep --help
...
-A, --after-context=NUM print NUM lines of trailing context
Therefore:
bpeek | grep -A 1 'nPDE'
With awk (for completeness since you have grep and sed solutions):
awk '/nPDE/{c=2} c&&c--'
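For the "best case" mentioned in the question (the matching line shown only once, then only the lines that follow each match), a possible awk sketch is shown below. The printf stands in for the output of bpeek, and the variable names are illustrative. It also shows that awk reads from a pipe just as well as from a file.

```shell
#!/bin/sh
# printf stands in for the real command (bpeek in the question).
result=$(printf '%s\n' 'iteration nPDE' '1 1' 'unrelated output' 'iteration nPDE' '2 4' |
  awk '
    # Matching line: print it only the first time, and remember to print the next line.
    /nPDE/ { if (!header_shown++) print; want_next = 1; next }
    # Line directly after a match: print it once.
    want_next { print; want_next = 0 }
  ')

printf '%s\n' "$result"
```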
grep -A works if your grep supports it (it's not in POSIX grep). If it doesn't, you can use sed:
bpeek | sed '/nPDE/!d;N'
which does the following:
/nPDE/!d # If the line doesn't match "nPDE", delete it (starts new cycle)
N # Else, append next line and print them both
Notice that this would fail to print the right output for this file
nPDE
nPDE
context line
If you have GNU sed, you can use an address range as follows:
sed '/nPDE/,+1!d'
Addresses of the format addr1,+N define the range between addr1 (in our case /nPDE/) and the following N lines. This solution is easier to adapt to a different number of context lines, but still fails with the example above.
A solution that manages cases like
blah
nPDE
context
blah
blah
nPDE
nPDE
context
nPDE
would look like
sed -n '/nPDE/{$p;:a;N;/\n[^\n]*nPDE[^\n]*$/!{p;b};ba}'
doing the following:
/nPDE/ { # If the line matches "nPDE"
$p # If we're on the last line, just print it
:a # Label to jump to
N # Append next line to pattern space
/\n[^\n]*nPDE[^\n]*$/! { # If appended line does not contain "nPDE"
p # Print pattern space
b # Branch to end (start new loop)
}
ba # Branch to label (appended line contained "nPDE")
}
All other lines are not printed because of the -n option.
As pointed out in Ed's comment, this is neither readable nor easily extended to a larger amount of context lines, but works correctly for one context line.

Replace text between two strings in file using linux bash

i have file "acl.txt"
192.168.0.1
192.168.4.5
#start_exceptions
192.168.3.34
192.168.6.78
#end_exceptions
192.168.5.55
and another file "exceptions"
192.168.88.88
192.168.76.6
I need to replace everything between #start_exceptions and #end_exceptions with content of exceptions file. I have tried many solutions from this forum but none of them works.
EDITED:
Ok, if you want to retain the #start_exceptions and #end_exceptions markers, I will revert to awk:
awk '
BEGIN {p=1}
/^#start/ {print;system("cat exceptions");p=0}
/^#end/ {p=1}
p' acl.txt
Thanks to @fedorqui for tweaks in comments below.
Output:
192.168.0.1
192.168.4.5
#start_exceptions
192.168.88.88
192.168.76.6
#end_exceptions
192.168.5.55
p is a flag that says whether or not to print lines. It starts at the beginning as 1, so all lines are printed till I find a line starting with #start. Then I cat the contents of the exceptions file and stop printing lines till I find a line starting with #end, at which point I set the p flag back to 1 so remaining lines get printed.
If you want output to a file, add "> newfile" to the very end of the command like this:
awk '
BEGIN {p=1}
/^#start/ {print;system("cat exceptions");p=0}
/^#end/ {p=1}
p' acl.txt > newfile
YET ANOTHER VERSION IF YOU REALLY WANT TO USE SED
If you really, really want to do it with sed, you can use nested addresses: first to select the lines from #start_exceptions to #end_exceptions, then again to pick out the first line within that range and to delete every line in it other than the #end_exceptions line:
sed '
/^#start/,/^#end/{
/^#start/{
n
r exceptions
}
/^#end/!d
}
' acl.txt
Output:
192.168.0.1
192.168.4.5
#start_exceptions
192.168.88.88
192.168.76.6
#end_exceptions
192.168.5.55
ORIGINAL ANSWER
I think this will work:
sed -e '/^#end/r exceptions' -e '/^#start/,/^#end/d' acl.txt
When it finds /^#end/ it reads in the exceptions file. And it also deletes everything between /#start/ and /#end/.
I have left the matching slightly "loose" for clarity of expressing the technique.
You can use the following, based on Replace string with contents of a file using sed:
$ sed $'/end/ {r exceptions\n} ; /start/,/end/ {d}' acl.txt
192.168.0.1
192.168.4.5
192.168.88.88
192.168.76.6
192.168.5.55
Explanation
sed $'one_thing; another_thing' acl.txt performs the two actions.
/end/ {r exceptions\n} if the line contains end, then read the file exceptions and append it.
/start/,/end/ {d} from a line containing start to a line containing end, delete all the lines.
I had a problem with Mark Setchell's solution in MINGW: the caret was not matching the beginning of the line. Does the detection of the separator really need to depend on it being at the beginning of the line?
I came up with this awk alternative...
$ awk -v data="$(<exceptions)" '
BEGIN {p=1}
/#start_exceptions/ {print; print data;p=0}
/#end_exceptions/ {p=1}
p
' acl.txt
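Passing the file contents through -v makes awk interpret backslash escapes in the data. A variation that reads the exceptions file with getline avoids that. This is only a sketch: the variable names exc_file and skip are illustrative, and the demo files reproduce the ones from the question.

```shell
#!/bin/sh
# Recreate the example files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '192.168.0.1' '192.168.4.5' '#start_exceptions' '192.168.3.34' \
              '192.168.6.78' '#end_exceptions' '192.168.5.55' > "$workdir/acl.txt"
printf '%s\n' '192.168.88.88' '192.168.76.6' > "$workdir/exceptions"

result=$(awk -v exc_file="$workdir/exceptions" '
  # Start marker: keep it, emit the replacement lines, then start skipping.
  /#start_exceptions/ {
    print
    while ((getline line < exc_file) > 0) print line
    close(exc_file)
    skip = 1
    next
  }
  # End marker: stop skipping, so the marker itself is printed below.
  /#end_exceptions/ { skip = 0 }
  !skip
' "$workdir/acl.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```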

Resources