grep using output from another command - linux

Say I have command1 which outputs this:
b05808aa-c6ad-4d30-a334-198ff5726f7c
59996d37-9008-4b3b-ab22-340955cb6019
2b41f358-ff6d-418c-a0d3-ac7151c03b78
7ac4995c-ff2c-4717-a2ac-e6870a5670f0
I also have command2 which outputs this:
b05808aa-c6ad-4d30-a334-198ff5726f7c
59996d37-9008-4b3b-ab22-340955cb6019
Is there a way to grep the output from command1 to not include any lines matched from command2, so that the final output would look like this?
2b41f358-ff6d-418c-a0d3-ac7151c03b78
7ac4995c-ff2c-4717-a2ac-e6870a5670f0

Issue this grep
command1 | grep -vF -f <(command2)
Here,
-F means Fixed string match*
-v means invert match
-f means the file with patterns
<(command) actually creates a FIFO with that command and use it on redirection.

To get all the lines from the output of command1 that do not appear in the output of command2:
grep -vFf <(command2) <(command1)
-f tells grep to use patterns that come from a file. In this case, that file is the output of command2. -F tells grep that those patterns are to be treated as fixed strings, not regex. -v tells grep to invert its normal behavior and just show lines the lines that do not match.

Related

Move a file list based upon grep pattern in command line [duplicate]

I want to pass each output from a command as multiple argument to a second command, e.g.:
grep "pattern" input
returns:
file1
file2
file3
and I want to copy these outputs, e.g:
cp file1 file1.bac
cp file2 file2.bac
cp file3 file3.bac
How can I do that in one go? Something like:
grep "pattern" input | cp $1 $1.bac
You can use xargs:
grep 'pattern' input | xargs -I% cp "%" "%.bac"
You can use $() to interpolate the output of a command. So, you could use kill -9 $(grep -hP '^\d+$' $(ls -lad /dir/*/pid | grep -P '/dir/\d+/pid' | awk '{ print $9 }')) if you wanted to.
In addition to Chris Jester-Young good answer, I would say that xargs is also a good solution for these situations:
grep ... `ls -lad ... | awk '{ print $9 }'` | xargs kill -9
will make it. All together:
grep -hP '^\d+$' `ls -lad /dir/*/pid | grep -P '/dir/\d+/pid' | awk '{ print $9 }'` | xargs kill -9
For completeness, I'll also mention command substitution and explain why this is not recommended:
cp $(grep -l "pattern" input) directory/
(The backtick syntax cp `grep -l "pattern" input` directory/ is roughly equivalent, but it is obsolete and unwieldy; don't use that.)
This will fail if the output from grep produces a file name which contains whitespace or a shell metacharacter.
Of course, it's fine to use this if you know exactly which file names the grep can produce, and have verified that none of them are problematic. But for a production script, don't use this.
Anyway, for the OP's scenario, where you need to refer to each match individually and add an extension to it, the xargs or while read alternatives are superior anyway.
In the worst case (meaning problematic or unspecified file names), pass the matches to a subshell via xargs:
grep -l "pattern" input |
xargs -r sh -c 'for f; do cp "$f" "$f.bac"; done' _
... where obviously the script inside the for loop could be arbitrarily complex.
In the ideal case, the command you want to run is simple (or versatile) enough that you can simply pass it an arbitrarily long list of file names. For example, GNU cp has a -t option to facilitate this use of xargs (the -t option allows you to put the destination directory first on the command line, so you can put as many files as you like at the end of the command):
grep -l "pattern" input | xargs cp -t destdir
which will expand into
cp -t destdir file1 file2 file3 file4 ...
for as many matches as xargs can fit onto the command line of cp, repeated as many times as it takes to pass all the files to cp. (Unfortunately, this doesn't match the OP's scenario; if you need to rename every file while copying, you need to pass in just two arguments per cp invocation: the source file name and the destination file name to copy it to.)
So in other words, if you use the command substitution syntax and grep produces a really long list of matches, you risk bumping into ARG_MAX and "Argument list too long" errors; but xargs will specifically avoid this by instead copying only as many arguments as it can safely pass to cp at a time, and running cp multiple times if necessary instead.
The above will still work incorrectly if you have file names which contain newlines. Perhaps see also https://mywiki.wooledge.org/BashFAQ/020
#!/bin/bash
for f in files; do
if grep -q PATTERN "$f"; then
echo cp -v "$f" "${f}.bac"
fi
done
files can be *.txt or *.text which basically means files ending in *.txt or *text or replace with something that you want/need, of course replace PATTERN with yours. Remove echo if you're satisfied with the output. For a recursive solution take a look at the bash shell option globstar

grep -o and display part of filenames using ls

I have a directory which has many directories inside it with the pattern of their name as :
YYYYDDMM_HHMISS
Example: 20140102_120202
I want to extract only the YYYYDDMM part.
I tried ls -l|awk '{print $9}'|grep -o ^[0-9]* and got the answer.
However i have following questions:
Why doesnt this return any results: ls -l|awk '{print $9}'|grep -o [0-9]* . Infact it should have returned all the directories.
Strangely just including '^' before [0-9] works fine :
ls -l|awk '{print $9}'|grep -o ^[0-9]*
Any other(simpler) way to achieve the result?
Why doesnt this return any results: ls -l|awk '{print $9}'|grep -o [0-9]*
If there are files in your current directory that start with [0-9], then the shell will expand them before calling grep. For example, if I have two files a1, a2 and a3 and run this:
ls | grep a*
After the filenames are expanded, the shell will run this:
ls | grep a1 a2 a3
The result of which is that it will print the lines in a2 and a3 that match the text "a1". It will also ignore whatever is coming from stdin, because when you specify filenames for grep (2nd argument and beyond), it will ignore stdin.
Next, consider this:
ls | grep ^a*
Here, ^ has no special meaning to the shell, so it uses it verbatim. Since I don't have filenames starting with ^a, it will use ^a* as the pattern. If I did have filenames like ^asomething or ^another, then again, ^a* would be expanded to those filenames and grep would do something I didn't really intend.
This is why you have to quote search patterns, to prevent the shell from expanding them. The same goes for patterns in find /path -name 'pattern'.
As for a simpler way for what you want, I think this should do it:
ls | sed -ne 's/_.*//p'
To show only the YYDDMM part of the directory names:
for i in ./*; do echo $(basename "${i%%_*}"); done
Not sure what you want to do with it once you've got it though...
You must avoid parsing ls output.
Simple is to use this printf:
printf "%s\n" [0-9]*_[0-9]*|egrep -o '^[0-9]+'

linux command grep -is "abc" filename|wc -l

what does the s mean there and also when pipe into wc what is that for? I know it eventually count the number of abc appeared in file filename, but not sure about the option s for and also pipe to wc mean
linux command grep -is "abc" filename|wc -l
output
47
-s means "suppress error messages about unreadable files" and the pipe to wc means "take the output and send it to the wc -l command" which effectively counts the number of lines matched. You can accomplish the same with the -c option to grep: grep -isc "abc" filename
Consider,
command_1 | command_2
Role of the pipe is that- it takes output of command written before it (command_1 here) and supplies that output to the command written after it (command_2 here).
The man page has everything you would want to know about the options for grep:
-s, --no-messages
Suppress error messages about nonexistent or unreadable files.
Portability note: unlike GNU grep, traditional grep did not con-
form to POSIX.2, because traditional grep lacked a -q option and
its -s option behaved like GNU grep's -q option. Shell scripts
intended to be portable to traditional grep should avoid both -q
and -s and should redirect output to /dev/null instead.
The pipe to wc -l is what gives you the count of how many lines the string "abc" appeared on. It isn't necessarily the number of times the string appeared in the file since one line with multiple occurrences is going to be counted as only 1.
grep man page says:
-s, --no-messages suppress error messages
grep returns the lines that have abc (case insensitive) in them. You pipe them to wc to get a count of the number of lines.
From man grep:
-s, --no-messages
Suppress error messages about nonexistent or unreadable files.
The wc command counts line, words and characters. With -l it returns the number of lines.

Preserve colouring after piping grep to grep

There is a simlar question in Preserve ls colouring after grep’ing but it annoys me that if you pipe colored grep output into another grep that the coloring is not preserved.
As an example grep --color WORD * | grep -v AVOID does not keep the color of the first output. But for me ls | grep FILE do keep the color, why the difference ?
grep sometimes disables the color output, for example when writing to a pipe. You can override this behavior with grep --color=always
The correct command line would be
grep --color=always WORD * | grep -v AVOID
This is pretty verbose, alternatively you can just add the line
alias cgrep="grep --color=always"
to your .bashrc for example and use cgrep as the colored grep. When redefining grep you might run into trouble with scripts which rely on specific output of grep and don't like ascii escape code.
A word of advice:
When using grep --color=always, the actual strings being passed on to the next pipe will be changed. This can lead to the following situation:
$ grep --color=always -e '1' * | grep -ve '12'
11
12
13
Even though the option -ve '12' should exclude the middle line, it will not because there are color characters between 1 and 2.
The existing answers only address the case when the FIRST command is grep (as asked by the OP, but this problem arises in other situations too).
More general answer
The basic problem is that the command BEFORE | grep, tries to be "smart" by disabling color when it realizes the output is going to a pipe. This is usually what you want so that ANSI escape codes don't interfere with your downstream program.
But if you want colorized output emanating from earlier commands, you need to force color codes to be produced regardless of the output sink. The forcing mechanism is program-specific.
Git: use -c color.status=always
git -c color.status=always status | grep -v .DS_Store
Note: the -c option must come BEFORE the subcommand status.
Others
(this is a community wiki post so feel free to add yours)
Simply repeat the same grep command at the end of your pipe.
grep WORD * | grep -v AVOID | grep -v AVOID2 | grep WORD

Pipe output to use as the search specification for grep on Linux

How do I pipe the output of grep as the search pattern for another grep?
As an example:
grep <Search_term> <file1> | xargs grep <file2>
I want the output of the first grep as the search term for the second grep. The above command is treating the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.
You need to use xargs's -i switch:
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the option after -i (foo in this case) and replaces every occurrence of it in the command with the output of the first grep.
This is the same as:
grep `grep ...` file_in_which_to_search
Try
grep ... | fgrep -f - file1 file2 ...
If using Bash then you can use backticks:
> grep -e "`grep ... ...`" files
the -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't then interpreted as an option to the second grep.
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
I wanted to search for text in files (using grep) that had a certain pattern in their file names (found using find) in the current directory. I used the following command:
grep -i "pattern1" $(find . -name "pattern2")
Here pattern2 is the pattern in the file names and pattern1 is the pattern searched for
within files matching pattern2.
edit: Not strictly piping but still related and quite useful...
This is what I use to search for a file from a listing:
ls -la | grep 'file-in-which-to-search'
Okay breaking the rules as this isn't an answer, just a note that I can't get any of these solutions to work.
% fgrep -f test file
works fine.
% cat test | fgrep -f - file
fgrep: -: No such file or directory
fails.
% cat test | xargs -ifoo grep foo file
xargs: illegal option -- i
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
fails. Note that a capital I is necessary. If i use that all is good.
% grep "`cat test`" file
kinda works in that it returns a line for the terms that match but it also returns a line grep: line 3 in test: No such file or directory for each file that doesn't find a match.
Am I missing something or is this just differences in my Darwin distribution or bash shell?
I tried this way , and it works great.
[opuser#vjmachine abc]$ cat a
not problem
all
problem
first
not to get
read problem
read not problem
[opuser#vjmachine abc]$ cat b
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser#vjmachine abc]$ grep -e "`grep problem a`" b --col
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser#vjmachine abc]$
You should grep in such a way, to extract filenames only, see the parameter -l (the lowercase L):
grep -l someSearch * | xargs grep otherSearch
Because on the simple grep, the output is much more info than file names only. For instance when you do
grep someSearch *
You will pipe to xargs info like this
filename1: blablabla someSearch blablabla something else
filename2: bla someSearch bla otherSearch
...
Piping any of above line makes nonsense to pass to xargs.
But when you do grep -l someSearch *, your output will look like this:
filename1
filename2
Such an output can be passed now to xargs
I have found the following command to work using $() with my first command inside the parenthesis to have the shell execute it first.
grep $(dig +short) file
I use this to look through files for an IP address when I am given a host name.

Resources