I have a file t.txt of the form:
abc 0
file content1
file content2
abc 1
file content3
file content4
abc 2
file content5
file content6
Now I want to retain all the content between abc 1 and abc 2, i.e. I want to retain:
file content3
file content4
For this I am using sed:
sed -e "/abc\s4/, /abc\s5/ p" t.txt > j.txt
But when I do so, j.txt ends up being a copy of t.txt. I don't know where I am making the mistake; can someone please help?
You can use this sed:
$ sed -n '/abc 1/,/abc 2/{/abc 1/d; /abc 2/d; p}' file
file content3
file content4
Explanation
/abc 1/,/abc 2/ selects the range of lines from the one containing abc 1 to the one containing abc 2. It could also be /^abc 1$/ to match the whole line.
p prints the lines. So for example sed -n '/file/p' file will print all the lines containing the string file.
d deletes the lines.
'/abc 1/,/abc 2/p' alone would also print the abc 1 and abc 2 lines themselves:
$ sed -n '/abc 1/,/abc 2/p' file
abc 1
file content3
file content4
abc 2
So you have to explicitly delete them with {/abc 1/d; /abc 2/d;} and then print the rest with p.
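With GNU sed you can also avoid repeating the two patterns: inside the range the empty regex // reuses the last regex tried against the current line, so the delimiter lines can be filtered out in one step. A shorter variant of the same idea (a sketch relying on that GNU behaviour):
$ sed -n '/abc 1/,/abc 2/{//!p}' file
file content3
file content4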
With awk:
$ awk '$0=="abc 2" {f=0} f; $0=="abc 1" {f=1}' file
file content3
file content4
It sets the flag f whenever abc 1 is found and clears it when abc 2 is found. The bare pattern f is true while the flag is set and hence prints those lines; because the abc 2 check comes first and the abc 1 check comes last, the delimiter lines themselves are never printed.
Related
I need a command that prints the data between two strings (Hello and End), along with the file name and file path on each line. Here are the input and output. I appreciate your time and help.
Input
file1:
Hello
abc
xyz
End
file2:
Hello
123
456
End
file3:
Hello
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456
I tried awk and sed but was not able to print the file name with its path.
awk '/Hello/{flag=1;next}/End/{flag=0}flag' * 2>/dev/null
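That prints the matching lines but not the file names. If you also need the name and path in front of each line, as in your desired output, a small variation should do it (a sketch; it assumes the files are passed with their full paths so that FILENAME contains the path):
awk '/Hello/{flag=1;next} /End/{flag=0} flag{print FILENAME, $0}' /home/test/seq/file* 2>/dev/null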
With awk:
awk '!/Hello/ && !/End/ {print FILENAME,$0} ' /home/test/seq/file?
Output:
/home/test/seq/file1 abc
/home/test/seq/file1 xyz
/home/test/seq/file2 123
/home/test/seq/file2 456
If your file contains lines above Hello and/or below End, then you can use a flag to control printing as you had attempted in your question, e.g.
awk -v f=0 '/End/{f=0} f == 1 {print FILENAME, $0} /Hello/{f=1}' file1 file2 file..
This would handle the case where your input file contained, e.g.
$ cat file
some text
some more
Hello
abc
xyz
End
still more text
The flag f is a simple ON/OFF flag that controls printing. Placing the End rule first, with the actual print in the middle and the Hello rule last, eliminates the need for any next command and keeps the Hello and End lines themselves out of the output.
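Run against the sample file above, the delimiter lines and the surrounding text are skipped (the file name printed here is simply file):
$ awk -v f=0 '/End/{f=0} f == 1 {print FILENAME, $0} /Hello/{f=1}' file
file abc
file xyz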
Suppose the input file file.txt is
abc/def/ghi/jkl/mno
pqr/st/u/vwxy/z
bla/123/45678/9
How do I split the lines on the character '/' and write the specified columns (here the second and fourth) to another file, so that the file looks like this:
def jkl
st vwxy
123 9
You can use perl, for example:
cat file.txt | perl -ne 'chomp(@cols = split("/", $_)); print "@cols[1, 3]\n";' > output
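If you would rather avoid perl, roughly equivalent one-liners with awk or GNU cut (the --output-delimiter option is GNU-specific) would be:
awk -F'/' '{print $2, $4}' file.txt > output
cut -d'/' -f2,4 --output-delimiter=' ' file.txt > output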
I have one file containing a list of name (refer as file 1):
Apple
Bat
Cat
I have another file (refer as file 2) containing a list of name and details refer:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Aeroplane dsafgeq dasfqw dafsad
vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
fcsadssssssssssssssssssssssssssssssssss
ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
sdddddddddddddddddddddddddddddwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
I need to extract info out of file 2 using the list of names in file 1.
Output file should be something like below:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
Are there any commands for doing this on Linux (Ubuntu)? I am a new Linux user.
This might work for you (GNU sed):
sed 's#.*#/^&/bb#' file1 |
sed -e ':a' -f - -e 'd;:b;n;/^[A-Z]/!bb;ba' file2
Generate a string of sed commands from the first file and pipe them into a second sed script that is run against the second file.
The first sed turns each line of file1 into a regexp which, when matched, jumps to a piece of common code. If none of the regexps match, the line is deleted. If a regexp matches, the following lines are printed until a new delimiter (a line starting with a capital letter) is found, at which point control jumps back to the start and the process repeats.
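To see what is being fed to the second sed, run the first command on its own; for the sample file1 it emits one branch command per name:
$ sed 's#.*#/^&/bb#' file1
/^Apple/bb
/^Bat/bb
/^Cat/bb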
$ awk 'NR==FNR{a[$1];next} NF>1{f=($1 in a)} f' file1 file2
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
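A commented version of the same one-liner (same logic, just spelled out as a sketch):
awk '
    NR==FNR { a[$1]; next }   # first file: remember each wanted name as an array key
    NF>1    { f=($1 in a) }   # section header (more than one field): set flag if name is wanted
    f                         # print while the flag is on
' file1 file2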
Provided that each section is separated from the next by an empty line, this solution with awk works fine:
while read -r pat;do
pat="^\\\<${pat}\\\>"
awk -vpattern=$pat '$0 ~ pattern{p=1}$0 ~ /^$/{p=0}p==1' file2
done <file1
For this solution to work, the file needs to look like this:
Apple bla blaa
1 aaaaaaaaaggggggggggttttttsssssssvvvvvvv
2 ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
3 sdasasssssssssssssssssssssswwwwwwwwwwww

Aeroplane dsafgeq dasfqw dafsad
4 vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
5 fcsadssssssssssssssssssssssssssssssssss
6 ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
7 sdddddddddddddddddddddddddddddwwwwwwwww

Bat sdasdas dsadw dasd
8 sssssssssssssssssssssssssssssssssssswww
9 ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
10 aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
11 sadddddddddddddddddd

Cat dsafw fasdsa dawwdwaw
12 sssssssssssssssssssssssssssssssssssssss
13 wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
PS: The numbering has been added by me so that I could "check" that awk returns the correct results per section. Numbering is not required in your real file.
If there are no empty lines separating the sections, it is much harder to achieve the correct result.
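One workaround when there are no blank lines is to borrow the assumption from the sed answer above that every section header starts with a capital letter, and close a section whenever a new header appears (a sketch):
while read -r name; do
    awk -v name="$name" '$1 == name {p=1; print; next} /^[A-Z]/ {p=0} p' file2
done < file1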
I have a .txt file with 71 lines and another 12 files (file1 to file12). I want to copy the first 5 lines from the .txt file to file1 at specific line numbers, similarly the next 5 lines from the .txt file to file2, again at specific line numbers, and so on.
This is my current code:
n = 1
sed -i '52,56d' $dumpfile
awk'{print $'"$n"',$'"$n+1"',$'"$n+2"',$'"$n+3"'}' sample.txt > $dumpfile
n=$(($n + 1))
In $dumpfile I have put my 12 files.
Sample file (12 files; file1, file2, ...)
...........
................
..............
abc = 4,1,3
def = 1,2,6
dfg = 28,36,4
tyu = 68,47,6
rty = 65,6,97
file (sample.txt)
abc = 1,2,3
def = 4,5,6
dfg = 2,3,4
tyu = 8,7,6
rty = 5,6,7
abc = 21,2,32
def = 64,53,6
dfg = 28,3,4
tyu = 18,75,6
rty = 5,63,75
...........
...........
I want to replace these five lines in each of file1...file12 with five lines from sample.txt. The line numbers of the lines to be replaced are the same in all 12 files, whereas in sample.txt the first set of 5 lines goes into file1, the second set of 5 lines goes into file2, and so on up to file12.
What you need is something like this (it uses GNU awk for ARGIND and inplace editing):
awk -i inplace -v start=52 '
NR==FNR {new[NR]=$0; next}
FNR==start {print new[ARGIND-1]; c=5}
!(c&&c--)
' RS="" sample.txt RS='\n' file1 file2 ... file12
but until you post some testable sample input and the associated output it's just a guess and, obviously, untested.
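To preview the idea without touching any files, you can drop -i inplace and run it against a single file; this still needs GNU awk for ARGIND, and it assumes the 5-line blocks in sample.txt are separated by blank lines (which is what RS="" relies on):
awk -v start=52 'NR==FNR{new[NR]=$0; next} FNR==start{print new[ARGIND-1]; c=5} !(c&&c--)' RS="" sample.txt RS='\n' file1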
I am trying to create a shell script that pulls a line from a file and checks another file for an instance of the same. If it finds an entry, it adds it to another file, and it loops through the first list until it has gone through the whole file. The data in the first file looks like this:
email@address.com;
email2@address.com;
and so on
The other file, in which I am looking for a match and placing the match in the blank file, looks like this:
12334 email@address.com;
32213 email2@address.com;
I want it to retain the numbers as well as the matching data. I have an idea of how this should work but need to know how to implement it.
My Idea
#!/bin/bash
read -p "enter first file name:" file1
read -p "enter second file name:" file2
FILE_DATA=( $( /bin/cat $file1))
FILE_DATA1=( $( /bin/cat $file2))
for I in $((${#FILE_DATA[@]}))
do
echo $FILE_DATA[$i] | grep $FILE_DATA1[$i] >> output.txt
done
I want the output to look like this, but only for addresses that match:
12334 email@address.com;
32213 email2@address.com;
Thank You
I quite like manipulating text in an SQL-like way:
$ cat file1
b@address.com
a@address.com
c@address.com
d@address.com
$ cat file2
10712 e@address.com
11457 b@address.com
19985 f@address.com
22519 d@address.com
$ join -1 1 -2 2 <(sort file1) <(sort -k2 file2) | awk '{print $2,$1}'
11457 b@address.com
22519 d@address.com
Sort both inputs on the key (we use the email addresses as keys here).
Join on the keys (file1 column 1, file2 column 2).
Format the output (use awk to swap the columns back).
As you've learned about diff and comm, now it's time to learn about another tool in the unix toolbox, join.
join does just what the name indicates: it joins two files together. The join is based on keys embedded in the files.
The number one constraint on using join is that both files must be sorted on the join column.
file1
a abc
b bcd
c cde
file2
a rec1
b rec2
c rec3
join file1 file2
a abc rec1
b bcd rec2
c cde rec3
You can consult the join man page for how to reduce and reorder the output columns. For example:
$ join -o 1.1,2.2 file1 file2
a rec1
b rec2
c rec3
You can use your code for file name input to turn this into a generalizable script.
Your solution using a pipeline inside a for loop will work for small sets of data, but as the size of the data grows, the cost of starting a new process for each word you are searching for will drag down the run time.
I hope this helps.
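Picking up the suggestion to reuse your file-name prompts, a generalised version might look like this (a sketch; output.txt and the output column order are assumptions based on your desired output):
#!/bin/bash
read -p "enter first file name: " file1
read -p "enter second file name: " file2
# join on the address column, then print the number followed by the address
join -1 1 -2 2 -o 2.1,2.2 <(sort "$file1") <(sort -k2 "$file2") > output.txt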
Read file1.txt line by line, assigning each line to the variable ADDR; grep file2.txt for the content of ADDR and append the output to file_result.txt.
(while read ADDR; do grep "${ADDR}" file2.txt >> file_result.txt; done) < file1.txt
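If the first file grows large, an alternative that avoids running one grep per address is to let grep read all the patterns at once (-F treats the addresses as fixed strings rather than regexes):
grep -Ff file1.txt file2.txt >> file_result.txt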
This awk one-liner can help you do that -
awk 'NR==FNR{a[$1]++;next}($2 in a){print $0 > "f3.txt"}' f1.txt f2.txt
NR and FNR are awk's built-in variables that store line numbers. NR does not get reset to 0 when moving on to the second file; FNR does. So while that condition is true we add everything from the first file to an array a. Once the first file is completed, we check the second column of the second file. If a match is present in the array we put the entire line in a file f3.txt; if not, we ignore it.
Using data from Kev's solution:
[jaypal:~/Temp] cat f1.txt
b@address.com
a@address.com
c@address.com
d@address.com
[jaypal:~/Temp] cat f2.txt
10712 e@address.com
11457 b@address.com
19985 f@address.com
22519 d@address.com
[jaypal:~/Temp] awk 'NR==FNR{a[$1]++;next}($2 in a){print $0 > "f3.txt"}' f1.txt f2.txt
[jaypal:~/Temp] cat f3.txt
11457 b@address.com
22519 d@address.com