I want to insert a large block of text between two patterns:
File1.txt
This is text to be inserted into the File.
infile.txt
Some Text here
First
Second
Some Text here
I want to add File1.txt content between First and Second :
Desired Output:
Some Text here
First
This is text to be inserted into the File.
Second
Some Text here
I can select the range between two patterns with sed, but I have no idea how to add content between them:
sed '/First/,/Second/!d' infile
Since the r command stands for reading in a file, use:
sed '/First/r file1.txt' infile.txt
You can find some info here: Reading in a file with the 'r' command.
Add -i (that is, sed -i '/First/r file1.txt' infile.txt) for in-place editing.
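As a sketch of the r command on the question's sample data (run against temporary copies of the files):

```shell
# Recreate the question's files in a scratch directory.
dir=$(mktemp -d)
printf 'This is text to be inserted into the File.\n' > "$dir/file1.txt"
printf 'Some Text here\nFirst\nSecond\nSome Text here\n' > "$dir/infile.txt"

# r queues the named file's contents to be printed after each line matching /First/.
out=$(sed "/First/r $dir/file1.txt" "$dir/infile.txt")
printf '%s\n' "$out"
```

This prints the desired output from the question: the inserted line lands between First and Second.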
To perform this action no matter the case of the characters, use the I mark as suggested in Use sed with ignore case while adding text before some pattern:
sed 's/first/last/Ig' file
As indicated in the comments, the above solution simply appends the file after the first pattern, without taking the second pattern into consideration.
To do so, I'd go for an awk with a flag:
awk -v data="$(<patt_file)" '/First/ {f=1} /Second/ && f {print data; f=0}1' file
Given these files:
$ cat patt_file
This is text to be inserted
$ cat file
Some Text here
First
First
Second
Some Text here
First
Bar
Let's run the command:
$ awk -v data="$(<patt_file)" '/First/ {f=1} /Second/ && f {print data; f=0}1' file
Some Text here
First # <--- no line appended here
First
This is text to be inserted # <--- line appended here
Second
Some Text here
First # <--- no line appended here
Bar
I think you can try this:
sed -n '1h;1!H;${x;s/Second.*\n/This is text to be inserted into the File.\
&/;p;}' infile.txt
An awk flavor:
awk '/First/ { print $0; getline < "File1.txt" }1' infile.txt
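A quick sketch of how the getline variant behaves on the question's sample input (note that each match consumes one unread line from File1.txt):

```shell
dir=$(mktemp -d)
printf 'This is text to be inserted into the File.\n' > "$dir/File1.txt"
printf 'Some Text here\nFirst\nSecond\nSome Text here\n' > "$dir/infile.txt"

# print $0 emits the matched line, getline then overwrites $0 with the next
# unread line of File1.txt, and the trailing 1 prints the (possibly new) $0.
out=$(cd "$dir" && awk '/First/ { print $0; getline < "File1.txt" }1' infile.txt)
printf '%s\n' "$out"
```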
Here's a chunk of bash code that I wrote to insert a pattern from patt_file. Essentially I had to delete some repetitious data using uniq, then add some of it back in. I copy the lines I need to restore using lineNum values and save them to patt_file. Then I match patMatch in the file I'm adding the lines back to.
#This pulls the line number from row k, column 2 of the reduced repetitious file
lineNum1=$(awk -v i=$k -v j=2 'FNR == i {print $j}' test.txt)
#This pulls the line number from row k + 1, column 2 of the reduced repetitious file
lineNum2=$(awk -v i=$((k+1)) -v j=2 'FNR == i {print $j}' test.txt)
#This pulls columns 2 and 3 of row k, tab separated (important), from the reduced repetitious file
awk -v i=$k -v j=2 -v h=3 'FNR == i {print $j"\t"$h}' test.txt > closeJ.txt
#This substitutes every period (dot) with \. so that sed will match them literally
patMatch=$(sed 's/\./\\./g' closeJ.txt)
#This selects the text in the full data file between lineNum1 and lineNum2 and copies it to a file
awk -v awkVar1=$((lineNum1 + 1)) -v awkVar2=$((lineNum2 - 1)) 'NR >= awkVar1 && NR <= awkVar2 { print }' nice.txt > patt_file.txt
#This inserts the contents of the pattern-matched file into the reduced repetitious file
#The reduced repetitious file will now grow
sed -i.bak "/$patMatch/r patt_file.txt" test.txt
Related
I would like to match all lines from a file containing a word, and take all subsequent lines until reaching two newline characters in a row (i.e. a blank line).
I have the following sed code to cut and paste specific lines, but not subsequent lines:
sed 's|.*|/\\<&\\>/{w results\nd}|' teststring | sed -i.bak -f - testfile
How could I modify this to take all subsequent lines?
For example, say I wanted to match lines with 'dog', the following should take the first 3 lines of the 5:
The best kind of an animal is a dog, for sure
-man's best friend
-related to wolves
Racoons are not cute
Is there a way to do this?
This should do:
awk '/dog/ {f=1} /^$/ {f=0} f {print > "new"} !f {print > "tmp"}' file && mv tmp file
It sets f to true if the word dog is found, and back to false when a blank line is found.
If f is true, print to the new file.
If f is false, print to the tmp file.
Finally, the tmp file replaces the original file.
Edit: It can be shortened a bit:
awk '/dog/ {f=1} /^$/ {f=0} {print > (f?"new":"tmp")}' file && mv tmp file
Edit2: as requested, add a blank line before every section in the new file:
awk '/dog/ {f=1;print ""> "new"} /^$/ {f=0} {print > (f?"new":"tmp")}' file && mv tmp file
If the original file contains tabs or spaces instead of just an empty line after each dog section, change /^$/ to /^[ \t]*$/
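A minimal run of the shortened one-liner on the question's sample (in a scratch directory, since it writes files named new and tmp):

```shell
dir=$(mktemp -d) && cd "$dir"
printf "The best kind of an animal is a dog, for sure\n-man's best friend\n-related to wolves\n\nRacoons are not cute\n" > file

# f is raised on /dog/ and lowered on a blank line; every line is routed
# to "new" or "tmp" depending on the flag.
awk '/dog/ {f=1} /^$/ {f=0} {print > (f?"new":"tmp")}' file
out=$(cat new)
printf '%s\n' "$out"
```

The new file holds the dog section; everything else (including the blank separator line) lands in tmp.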
This might work for you (GNU sed):
sed 's|.*|/\\<&\\>/ba|' stringFile |
sed -f - -e 'b;:a;w resultFile' -e 'n;/^$/!ba' file
Build a set of regexps from the stringFile and send matches to :a. Then write the matched line and any further lines until an empty line (or end of file) to the resultFile.
N.B. The results could be sent directly to resultFile, using:
sed 's#.*#/\\<&\\>/ba#' stringFile |
sed -nf - -e 'b;:a;p;n;/^$/!ba' file > resultFile
To cut the matches from the original file use:
sed 's|.*|/\\<&\\>/ba|' stringFile |
sed -f - -e 'b;:a;N;/\n\s*$/!ba;w resultFile' -e 's/.*//p;d' file
Is this what you're trying to do?
$ awk -v RS= '/dog/' file
The best kind of an animal is a dog, for sure
-man's best friend
-related to wolves
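The paragraph-mode trick can be sketched like this: with an empty RS, awk treats blank-line-separated blocks as single records, so /dog/ selects the whole block:

```shell
dir=$(mktemp -d)
printf "The best kind of an animal is a dog, for sure\n-man's best friend\n-related to wolves\n\nRacoons are not cute\n" > "$dir/file"

# RS= switches awk to paragraph mode: records are separated by blank lines,
# so the pattern matches (and prints) the entire dog paragraph.
out=$(awk -v RS= '/dog/' "$dir/file")
printf '%s\n' "$out"
```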
Could you please try the following.
awk '/dog/{count="";found=1} found && ++count<4' Input_file > temp && mv temp Input_file
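As a sketch, note that this counter-based version prints the matched line plus the two lines after it, counting lines rather than looking for the blank separator:

```shell
dir=$(mktemp -d)
printf "The best kind of an animal is a dog, for sure\n-man's best friend\n-related to wolves\n\nRacoons are not cute\n" > "$dir/Input_file"

# On /dog/ the counter is reset; lines then print while ++count<4,
# i.e. the match itself plus the next two lines.
out=$(awk '/dog/{count="";found=1} found && ++count<4' "$dir/Input_file")
printf '%s\n' "$out"
```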
Hi everyone. I have
file 1.log:
text1 value11 text
text text
text2 value12 text
file 2.log:
text1 value21 text
text text
text2 value22 text
I want:
value11;value12
value21;value22
For now I grep the values into separate files and paste them together later, but I think this is not a very elegant solution because I need to read all the files more than once. So I tried to extract all the data with a single cat | grep pipeline, but the result is not what I expected.
I use:
cat *.log | grep -oP "(?<=text1 ).*?(?= )|(?<=text2 ).*?(?= )" | tr '\n' '; '
or
cat *.log | grep -oP "(?<=text1 ).*?(?= )|(?<=text2 ).*?(?= )" | xargs
but I get in each case:
value11;value12;value21;value22
value11 value12 value21 value22
Thank you so much.
Try:
$ awk -v RS='[[:space:]]+' '$0=="text1" || $0=="text2"{getline; printf "%s%s",sep,$0; sep=";"} ENDFILE{if(sep)print""; sep=""}' *.log
value11;value12
value21;value22
For those who prefer their commands spread over multiple lines:
awk -v RS='[[:space:]]+' '
$0=="text1" || $0=="text2" {
getline
printf "%s%s",sep,$0
sep=";"
}
ENDFILE {
if(sep)print""
sep=""
}' *.log
How it works
-v RS='[[:space:]]+'
This tells awk to treat any sequence of whitespace (newlines, blanks, tabs, etc) as a record separator.
$0=="text1" || $0=="text2"{getline; printf "%s%s",sep,$0; sep=";"}
This tells awk to look for records that match either text1 or text2. For those records, and those records only, the commands in curly braces are executed. Those commands are:
getline tells awk to read in the next record.
printf "%s%s",sep,$0 tells awk to print the variable sep followed by the word in the record.
After we print the first match, the command sep=";" is executed which tells awk to set the value of sep to a semicolon.
As we start each file, sep is empty. This means that the first match from any file is printed with no separator preceding it. All subsequent matches from the same file will have a ; to separate them.
ENDFILE{if(sep)print""; sep=""}
After the end of each file is reached, we print a newline if sep is not empty and then we set sep back to an empty string.
Alternative: Printing the second word if the first word ends with a number
In an alternative interpretation of the question (hat tip: David C. Rankin), we want to print the second word on any line for which the first word ends with a number. In that case, try:
$ awk '$1~/[0-9]$/{printf "%s%s",sep,$2; sep=";"} ENDFILE{if(sep)print""; sep=""}' *.log
value11;value12
value21;value22
In the above, $1~/[0-9]$/ selects the lines for which the first word ends with a number and printf "%s%s",sep,$2 prints the second field on that line.
Discussion
The original command was:
$ cat *.log | grep -oP "(?<=text1 ).*?(?= )|(?<=text2 ).*?(?= )" | tr '\n' '; '
value11;value12;value21;value22;
Note that, when using most Unix commands, cat is rarely needed. In this case, for example, grep accepts a list of files, so we can drop the extra cat process and get the same output:
$ grep -hoP "(?<=text1 ).*?(?= )|(?<=text2 ).*?(?= )" *.log | tr '\n' '; '
value11;value12;value21;value22;
I agree with @John1024, and how you approach this problem will really depend on what the actual text is that you are looking for. If, for instance, your lines of concern start with text{1,2,...} and what you want in the second field can be anything, then his approach is optimal. However, if the values in the first field vary and what you are really interested in is records with valueXX in the second field, then an approach keying off the second field may be what you are looking for.
Taking your second field, if the text you are interested in has the form valueXX (where XX is two or more digits at the end of the field), you can process only those records where your second field matches, then use a simple conditional testing whether FNR == 1 to control the ';' delimiter output and ENDFILE to control the newline, similar to:
awk '$2 ~ /^value[0-9][0-9][0-9]*$/ {
printf "%s%s", (FNR == 1) ? "" : ";", $2
}
ENDFILE {
print ""
}' file1.log file2.log
Example Use/Output
$ awk '$2 ~ /^value[0-9][0-9][0-9]*$/ {
printf "%s%s", (FNR == 1) ? "" : ";", $2
}
ENDFILE {
print ""
}' file1.log file2.log
value11;value12
value21;value22
Look things over and consider your actual input files and then either one of these two approaches should get you there.
If I understood you correctly, you want the values but search for text[12], i.e. you want the word after the matching search word, not the matching word itself:
$ awk -v s="^text[12]$" ' # set the search regex *
FNR==1 { # in the beginning of each file
b=b (b==""?"":"\n") # terminate current buffer with a newline
}
{
for(i=1;i<NF;i++) # iterate all but last word
if($i~s) # if current word matches search pattern
b=b (b~/^$|\n$/?"":";") $(i+1) # add following word to buffer
}
END { # after searching all files
print b # output buffer
}' *.log
Output:
value11;value12
value21;value22
* regex could be for example ^(text1|text2)$, too.
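A reproducible run on the sample logs from the question (this variant uses no gawk extensions, so it works in POSIX awk):

```shell
dir=$(mktemp -d)
printf 'text1 value11 text\ntext text\ntext2 value12 text\n' > "$dir/1.log"
printf 'text1 value21 text\ntext text\ntext2 value22 text\n' > "$dir/2.log"

# Collect the word following each matching word into buffer b,
# separating files with a newline and values with a semicolon.
out=$(awk -v s="^text[12]$" '
  FNR==1 { b=b (b==""?"":"\n") }
  { for(i=1;i<NF;i++) if($i~s) b=b (b~/^$|\n$/?"":";") $(i+1) }
  END { print b }' "$dir"/*.log)
printf '%s\n' "$out"
```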
Instead of grep, I used awk here. The file pkg.conf contains the string 'ssl_cipher'; I need to copy the line containing ssl_cipher into another file 'pkg.conf.new' at the same line number (here it's line 20 in pkg.conf):
bash-4.2$ awk '/ssl_cipher/ {print FNR,$(NF-1),$NF}' pkg.conf
20 ssl_cipher 'ECDHJES128:ECDH+AESGCM:ECDH+AES256:DH+AES:DH+AESGCM:DH+AES256:RSA+AES :RSA+AESGCMHaNULL:!RC4:!MD5:!DSS:!3DESHSSLv3');
Is there an awk one liner to do this or should I seek the help of 'sed' ?
You can use this awk script:
awk 'NR==FNR{ # On the first file
if(/ssl_cipher/){ # lookup the string
line_content=$0; # store the content
line_no=NR # and line number
};
next # skip other files
}
FNR==line_no{ # On the second file, at the wanted line
print line_content # append the wanted content
}1 # print the other lines
' pkg.conf pkg.conf.new
Note that this will insert a new line. As mentioned by @Yoric, if you want to replace the line instead, add the next keyword after the print line_content.
The result is written to stdout. If you want to modify pkg.conf.new in place and you have GNU awk, you can add the option -i inplace to the command line.
You can use awk and sed together to generate a sed script:
awk '/ssl_cipher/ {print FNR,$(NF-1),$NF}' pkg.conf \
| sed 's/ /{i/;s/$/\nd}/' \
| sed -f- pkg.conf.new
The first sed script will transform the output to
20{issl_cipher 'ECDHJES128:ECDH+AESGCM:ECDH+AES256:DH+AES:DH+AESGCM:DH+AES256:RSA+AES :RSA+AESGCMHaNULL:!RC4:!MD5:!DSS:!3DESHSSLv3');
d}
This tells sed to treat line 20 specially: i inserts the given text, and d deletes the original contents of the line.
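To see the whole pipeline end to end, here is a small reproducible sketch (file names and contents are made up for illustration; requires GNU sed for -f- and the one-line i form):

```shell
dir=$(mktemp -d)
printf 'preamble\nfoo ssl_cipher NEW\n' > "$dir/pkg.conf"
printf 'keep1\nssl_cipher OLD\nkeep3\n' > "$dir/pkg.conf.new"

# awk emits "2 ssl_cipher NEW"; the first sed turns it into the script
# "2{issl_cipher NEW" / "d}", which inserts the new line at line 2 and
# deletes the old contents of that line.
out=$(awk '/ssl_cipher/ {print FNR,$(NF-1),$NF}' "$dir/pkg.conf" \
  | sed 's/ /{i/;s/$/\nd}/' \
  | sed -f- "$dir/pkg.conf.new")
printf '%s\n' "$out"
```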
I want to subtract one text file's contents from another, on very large data sets.
file 1:
ligand1
ligand6
ligand9
ligand4
File 2:
ligand1
ligand9
Output File
ligand6
ligand4
I've been using grep -v -x -f file1.txt file2.txt > new_file.txt
But on big data sets it crashes.
You can use some simple awk logic for this:-
$ awk 'NR==FNR{list[$0];next} !($0 in list)' file_2 <(tr -d ' ' <file_1)
ligand6
ligand4
which can then be written to a file in some temporary path, e.g.
awk 'NR==FNR{list[$0];next} !($0 in list)' file_2 <(tr -d ' ' <file_1) > /tmp/newFile
The tr command strips the leading white-space from file_1, which would otherwise break the exact-match comparison in awk.
The logic is simple:-
FNR and NR both keep track of the current row. When reading more than one file, NR keeps counting across files while FNR resets for each file (if the 1st input has 5 lines and the 2nd has 10, NR runs 1,2,3...15 while FNR runs 1...5 and then 1...10).
NR==FNR together with next means this part of the code runs only for file_2; basically, all contents of file_2 are stored in the awk array named list.
The !($0 in list) action is then applied to file_1, which prints only those lines not already present in file_2. That's it!
Note:- If no extra leading white-space is expected, you can drop the tr step and the overall command will be a bit faster, since it currently strips the space from every line.
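A reproducible sketch on the question's ligand lists (no leading spaces here, so the tr step is omitted):

```shell
dir=$(mktemp -d)
printf 'ligand1\nligand6\nligand9\nligand4\n' > "$dir/file_1"
printf 'ligand1\nligand9\n' > "$dir/file_2"

# file_2 is loaded into the array first; lines of file_1 absent from it are printed.
out=$(awk 'NR==FNR{list[$0];next} !($0 in list)' "$dir/file_2" "$dir/file_1")
printf '%s\n' "$out"
```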
If your files are sorted, you can use the comm command:
comm -23 file1 file2 prints the lines only in file1.
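Since comm needs its inputs sorted, a sketch with the sample lists looks like this (note the output comes back in sorted order, not the original order):

```shell
dir=$(mktemp -d)
printf 'ligand1\nligand6\nligand9\nligand4\n' > "$dir/file1"
printf 'ligand1\nligand9\n' > "$dir/file2"
sort "$dir/file1" > "$dir/file1.sorted"
sort "$dir/file2" > "$dir/file2.sorted"

# -2 suppresses lines unique to file2, -3 suppresses common lines,
# leaving only lines unique to file1.
out=$(comm -23 "$dir/file1.sorted" "$dir/file2.sorted")
printf '%s\n' "$out"
```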
Is it possible to use grep to match only lines with numbers in a pre-specified range?
For instance I want to list all lines with numbers in the range [1024, 2048] of a log that contain the word 'error'.
I would like to keep the '-n' functionality i.e. have the number of the matched line in the file.
Use sed first:
sed -n '1024,2048p' file | grep ...
-n says don't print lines; '1024,2048p' says print lines 1024-2048 inclusive (the p overrides the -n). Note that a downstream grep -n numbers lines relative to the extracted range, not the original file. To keep the original line numbers, let sed print them:
sed -n '1024,2048{/error/{=;p}}' file | paste - -
Here /error/ is a pattern to match and = prints the line number.
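A small sketch of the = and paste trick, with a toy range of lines 2-4 standing in for 1024-2048:

```shell
dir=$(mktemp -d)
printf 'ok\nerror one\nok\nerror two\nok\n' > "$dir/log"

# = prints each matched line's number on a line of its own;
# paste - - then joins each number/line pair with a tab.
out=$(sed -n '2,4{/error/{=;p}}' "$dir/log" | paste - -)
printf '%s\n' "$out"
```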
Awk is a good tool for the job:
$ awk 'NR>=1024 && NR<=2048 && /error/ {print NR,$0}' file
In awk the variable NR contains the current line number and $0 contains the line itself.
The benefit of using awk is that you can easily change the output to display however you want. For instance, to separate the line number from the line with a colon followed by a TAB:
$ awk 'NR>=1024 && NR<=2048 && /error/ {print NR,$0}' OFS=':\t' file
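For instance, on a toy log with the range shrunk to lines 2-4, the OFS variant produces colon-and-tab separated output while preserving the original line numbers:

```shell
dir=$(mktemp -d)
printf 'ok\nerror one\nok\nerror two\nok\n' > "$dir/log"

# NR is the line number in the original file, so the numbering
# survives the range filter (unlike piping a slice into grep -n).
out=$(awk 'NR>=2 && NR<=4 && /error/ {print NR,$0}' OFS=':\t' "$dir/log")
printf '%s\n' "$out"
```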