filter out content between two files - linux

I have two files
conditions.txt
abcd
efgh
logs.txt
efgh
ijkl
mnop
qrst
I am expecting output to be:
ijkl
mnop
qrst
Actual output:
efgh
ijkl
ijkl
mnop
mnop
qrst
qrst
Here's the code I have so far:
func(){
    while read condition
    do
        if [[ $line = $condition ]] ; then
            :
        else
            echo "$line";
        fi
    done < conditions.txt
}
while read line
do
    func $line
done < logs.txt

Try using grep:
$ grep -v -f conditions.txt logs.txt
From the man page for GNU grep:
-v, --invert-match
Invert the sense of matching, to select non-matching lines.
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. If this option is used multiple times or is combined with the -e (--regexp) option, search for all patterns given. The empty file contains zero patterns, and therefore matches nothing.

If you don't feel like re-inventing wheels ...
grep -vf conditions.txt logs.txt
ijkl
mnop
qrst
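One caveat with the grep answers above: -f treats each line of conditions.txt as a regular expression and matches it anywhere in the line, so a condition such as abc would also filter out a log line abcd. If the conditions are meant to be whole, literal lines (an assumption, the question doesn't say so), adding -F and -x is a safer sketch. The scratch directory and the variable names below are only for illustration:

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' abcd efgh > "$dir/conditions.txt"
printf '%s\n' efgh ijkl mnop qrst > "$dir/logs.txt"

# -F: treat each pattern as a fixed string, not a regular expression
# -x: a pattern must match the whole line, not just part of it
# -v: print only the lines that do NOT match
# -f: read the patterns from conditions.txt
filtered=$(grep -vxFf "$dir/conditions.txt" "$dir/logs.txt")
printf '%s\n' "$filtered"

rm -r "$dir"
```

On the sample data this prints ijkl, mnop and qrst, the same as the plain grep -vf, but it keeps doing so if a condition contains characters like . or *.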

Related

Copy a content from one file and need to replace in another file using sed

Here we have two files. We need to copy the key from file 1 and use it to replace the specific string "key" in file 2 using sed. We tried the commands below:
sed -e '3 /key/{r file1' -e 'd}' file2
sed -n "3 s/key/$(cat file 1 |grep ^Key|cut -d ' ' -f2)/" file2
File 1
ABCD
EFGH
Key: qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==
File 2
IJKL
MNOP
secret "key";
MNOP
Expected result:
IJKL
MNOP
secret "qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==";
MNOP

awk
I am not sure how efficient my code will be for your usage.
awk ' /^Key/{q=$2;next} /A|E/{$0=""; next}/^secret/{$2="\""q"\";"}1' $file1 $file2
$ awk ' /^Key/{q=$2;next} /A|E/{$0=""; next}/^secret/{$2="\""q"\";"}1' $file1 $file2
IJKL
MNOP
secret "qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==";
MNOP
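Here, I save the value from the line starting with Key in file1, then substitute it into the second field of the line starting with secret in file2. The /A|E/ rule is there to skip the other lines of file1; note that it would also drop any line of file2 that contains A or E.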
sed
You will need to create a variable to fetch the key first.
key=$(sed '1,2d;s/Key: //' $file1)
or
key=$(awk 'NR==3{print $2}' $file1)
$ echo $key
qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==
The following code will generate your expected result, but once again, I am not sure how efficient it will be for your usage.
sed "/^secret/s|key|$key|" $file2
$ sed "/^secret/s|key|$key|" $file2
IJKL
MNOP
secret "qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==";
MNOP

This might work for you (GNU sed):
sed -nE '/Key: /{s///;s/\W/\\&/g;s#.*#s/"key"/&/#p}' file1 | sed -Ef - file2
Craft a substitution command from file1 not forgetting to quote non-word characters.
Pass the sed substitution command as stdin, to a second invocation of sed via the -f option and use it to edit file2.
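The sed variants above all have to be careful about characters in the key that sed treats specially (an & in the key, or the | used as delimiter, would break the simple s|key|$key| form). A minimal sketch that sidesteps this does the replacement in awk with index and substr, so the key is never interpreted as a pattern or a replacement string. The stand-in files below are modeled on the question, and the variable names dir, replaced, key and pos are made up for the example:

```shell
#!/bin/sh
# Stand-in files modeled on the question (scratch directory, removed at the end).
dir=$(mktemp -d)
printf '%s\n' 'ABCD' 'EFGH' \
    'Key: qvUkD6QaFBA1jYEpynivMoQx+9V71F4+fdn1TIUKPBNny/3zCnjihd1mwxZg==' > "$dir/file1"
printf '%s\n' 'IJKL' 'MNOP' 'secret "key";' 'MNOP' > "$dir/file2"

replaced=$(awk '
    # First file: remember the value on the "Key:" line.
    NR == FNR { if ($1 == "Key:") key = $2; next }
    # Second file: on the secret line, splice the key in place of the
    # literal text key, using index/substr so nothing is parsed as a regex.
    /^secret/ {
        pos = index($0, "key")
        if (pos > 0) { $0 = substr($0, 1, pos - 1) key substr($0, pos + 3) }
    }
    { print }
' "$dir/file1" "$dir/file2")
printf '%s\n' "$replaced"

rm -r "$dir"
```

This assumes the key has no spaces (it is taken from the second whitespace-separated field) and that file1 is not empty, since the NR == FNR test relies on that.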

How to obtain the query order output when we use grep?

I have 2 files
file1.txt
1
3
5
2
File2.txt
1 aaa
2 bbb
3 ccc
4 aaa
5 bbb
Desired output:
1 aaa
3 ccc
5 bbb
2 bbb
Command used: cat file1.txt | grep -wf- File2.txt, but the output was:
1 aaa
2 bbb
3 ccc
5 bbb
Is there a way to return the output in the query order?
Thanks in advance!!!

Important Edit
On second thought, do not run grep once per pattern in a while read loop (the original answer further down), as it's incredibly slow. Run grep once, then use awk to read the original patterns and get the order back.
Use this instead:
grep -f patterns searchdata | awk 'NR==FNR { line[$1] = $0; next } $1 in line { print line[$1] }' - patterns > matched
Benchmark
#!/bin/bash
paste <(shuf -i 1-10000) <(crunch 4 4 2>/dev/null | shuf -n 10000) > searchdata
shuf -i 1-10000 > patterns
printf 'Testing awk:'
time grep -f patterns searchdata | awk 'NR==FNR { line[$1] = $0; next } $1 in line { print line[$1] }' - patterns > matched
wc -l matched
cat /dev/null > matched
printf '\nTesting grep with redirection:'
time {
while read -r pat; do
grep -w "$pat" searchdata >> matched
done < patterns
}
wc -l matched
Output
Testing awk:
real 0m0.022s
user 0m0.017s
sys 0m0.010s
10000 matched
Testing grep with redirection:
real 0m36.370s
user 0m28.761s
sys 0m7.909s
10000 matched
Original
To preserve the query order, read the file line-by-line:
while read -r pat; do grep -w "$pat" file2.txt; done < file1.txt
I don't think grep has an option to support this. Note that this solution runs grep once per pattern, so it will be slow if you have large files to read from.
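For the small files in the question, the same lookup idea can be sketched with awk alone, skipping grep entirely: load File2.txt into an array keyed by its first field, then walk file1.txt in order and print the stored line for each id. This assumes the ids in File2.txt are unique (a later duplicate would overwrite an earlier one); the scratch directory and the variable name ordered are just for the example:

```shell
#!/bin/sh
# Recreate the sample files from the question.
dir=$(mktemp -d)
printf '%s\n' 1 3 5 2 > "$dir/file1.txt"
printf '%s\n' '1 aaa' '2 bbb' '3 ccc' '4 aaa' '5 bbb' > "$dir/File2.txt"

# While reading the first argument (NR == FNR), store each data line under its id.
# While reading the second argument, print the stored line for each id, in query order.
ordered=$(awk 'NR == FNR { line[$1] = $0; next }
               $1 in line { print line[$1] }' "$dir/File2.txt" "$dir/file1.txt")
printf '%s\n' "$ordered"

rm -r "$dir"
```

On the sample data this prints 1 aaa, 3 ccc, 5 bbb, 2 bbb, which is the desired order.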

print a file content side by side bash

I have a file with below contents. I need to print each line side by side
hello
1223
man
2332
xyz
abc
Output desired:
hello 1223
man 2332
xyz abc
Is there any other alternative than paste command?

You can use this awk:
awk '{ORS = (NR%2 ? FS : RS)} 1' file
hello 1223
man 2332
xyz abc
This sets ORS (the output record separator) to the input field separator (FS, a space by default) for odd-numbered lines and to the input record separator (RS, a newline by default) for even-numbered lines, so each odd line is joined to the line that follows it.
To get tabular data use column -t:
awk '{ORS = (NR%2 ? FS : RS)} 1' file | column -t
hello 1223
man 2332
xyz abc

awk/gawk solution:
$ gawk 'BEGIN{ OFS="\t"} { COL1=$1; getline; COL2=$1; print(COL1,COL2)}' file
hello 1223
man 2332
xyz abc
Bash solution (no paste command):
$ echo $(cat file) | while read col1 col2; do printf "%s\t%s\n" $col1 $col2; done
hello 1223
man 2332
xyz abc
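The echo $(cat file) one-liner above only works because printf reuses its format string for the extra arguments. A more explicit sketch of the same idea, still without paste, is to call read twice per loop iteration. The variable names left, right and paired are made up for the example, and a trailing unpaired line (odd line count) would be silently dropped:

```shell
#!/bin/sh
# Recreate the sample file from the question.
dir=$(mktemp -d)
printf '%s\n' hello 1223 man 2332 xyz abc > "$dir/file"

# Read two lines per iteration and print them tab-separated on one line.
paired=$(
    while IFS= read -r left && IFS= read -r right; do
        printf '%s\t%s\n' "$left" "$right"
    done < "$dir/file"
)
printf '%s\n' "$paired"

rm -r "$dir"
```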

replace a word in a string if there is a given string using sed

Consider the following strings:
function 12345 filename.pdf 6789 12
function 12345 filename.doc 7789 4567
Is there a way to search the strings using sed to see if they contain pdf or doc substrings, and replace the strings to the following?
function_pdf 12345 filename.pdf 6789 12
function_doc 12345 filename.doc 7789 4567

You really have not specified the problem adequately, but perhaps you are looking for:
sed -e '/\.pdf/s/function/function_pdf/g' -e '/\.doc/s/function/function_doc/g'

Through sed,
$ sed 's/^\([^[:space:]]\+\)\( [^[:space:]]\+ [^[:space:]]\+\.\)\(pdf\|doc\)/\1_\3\2\3/g' file
function_pdf 12345 filename.pdf 6789 12
function_doc 12345 filename.doc 7789 4567

Using sed :
~$ cat i.txt
function 12345 filename.pdf 6789
function 12345 filename.doc 7789
function 12345 filename.txt 8888
~$ sed -e 's/\(function\) \(.*\)\(pdf\|doc\)\(.*\)/\1_\3 \2\3\4/' i.txt
function_pdf 12345 filename.pdf 6789
function_doc 12345 filename.doc 7789
function 12345 filename.txt 8888
Capture the extension with the regexp you want, then insert it where you want using the \1 to \9 back-reference notation.
From man sed:
the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
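The same capture-and-reinsert idea reads a little more easily with extended regular expressions, where the parentheses and the | need no backslashes. This is only a sketch, assuming a sed that supports -E (GNU and BSD sed both do); the variable name renamed is made up:

```shell
#!/bin/sh
# Recreate i.txt from the answer above.
dir=$(mktemp -d)
printf '%s\n' 'function 12345 filename.pdf 6789' \
              'function 12345 filename.doc 7789' \
              'function 12345 filename.txt 8888' > "$dir/i.txt"

# Group 1: the word "function"
# Group 2: everything from the first space up to and including the extension
# Group 3: the extension itself (pdf or doc), nested inside group 2
renamed=$(sed -E 's/^(function)( .*\.(pdf|doc))/\1_\3\2/' "$dir/i.txt")
printf '%s\n' "$renamed"

rm -r "$dir"
```

Lines whose extension is neither pdf nor doc do not match and are printed unchanged.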

With awk:
awk '$1=="function" && ($3 ~ /\.(pdf|doc)$/) {$1=$1 "_" substr($3,length($3)-2)}7'

sed 's/\( .*\.\)\([^ ]*\)\(.*\)/_\2&/' YourFile
The simplest sed I found for this (sed seems very efficient for this). Note that it appends whatever extension it finds, not only pdf or doc.

Iterate through the lines returned by grep in Shell

Suppose I have a file info.txt. The first column is the id, and the rest are its content.
1 aaa bbb
1 ccc ddd mmm
4 ccc eee
7 ddd fff jjj kkk
I'm only interested in the lines beginning with "1". So I use grep to filter it:
what_I_concern=$(cat info.txt | grep -iw 1 | cut -d ' ' -f 2-)
and then I want to iterate through these lines:
for i in $what_I_concern; do
pass $i to another program #I want to pass one line at a time
done
But what it really did is to iterate through every word in these lines, instead of taking each line as a whole.
How can I solve this problem?

The way you're trying to accomplish what you need is causing word splitting. Instead, say:
while read -r line; do
someprogram "$(cut -d' ' -f2- <<< "$line")"
done < <(grep '^1 ' info.txt)
The <() syntax is known as Process Substitution. In this case, it enables the while loop to read the output of the grep command as a file.

You can avoid using grep and cut altogether in this case (assuming default IFS)
while read -r first rest; do
[ "$first" = "1" ] && pass "$rest" to another program;
done < info.txt
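To see that each matching line really arrives as a single argument, here is a small sketch of the read-based loop with printf standing in for "another program" (the angle brackets just make the argument boundaries visible; the variable name seen is made up):

```shell
#!/bin/sh
# Recreate info.txt from the question.
dir=$(mktemp -d)
printf '%s\n' '1 aaa bbb' '1 ccc ddd mmm' '4 ccc eee' '7 ddd fff jjj kkk' > "$dir/info.txt"

# read puts the id in $first and the remainder of the line in $rest;
# quoting "$rest" passes it on as one argument instead of one per word.
seen=$(
    while read -r first rest; do
        if [ "$first" = "1" ]; then
            printf '<%s>\n' "$rest"
        fi
    done < "$dir/info.txt"
)
printf '%s\n' "$seen"

rm -r "$dir"
```

This prints <aaa bbb> and <ccc ddd mmm>, one bracketed argument per line of input.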
