Grep returns the wrong result - Linux

I need to search for lines that do not contain the term \t42\t.
I use:
grep -w -v '\t42\t' file.txt > tmp.txt
Why do I still have lines containing \t42\t in the result file?

You're getting this result because grep interprets the backslash in \t as an escape sequence (in some implementations \t even stands for a tab character) rather than as a literal backslash followed by t. You must escape the backslash characters for them to be treated literally:
grep -w -v '\\t42\\t' file.txt > tmp.txt

Also remove -w, as it doesn't work when the pattern starts or ends with non-word characters:
grep -v '\\t42\\t' file.txt > tmp.txt
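If it helps to verify the behaviour, here is a small self-contained check (demo.txt is a made-up name; assumes a Bourne-like shell and a POSIX grep):
printf 'keep this line\nliteral \\t42\\t here\n' > demo.txt   # second line contains the literal text \t42\t
grep -v '\\t42\\t' demo.txt                                   # prints only "keep this line"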

Related

How to search for an exact phrase in a file consisting of phrases with hyphens

I have a file consisting of a few phrases, as follows. I would like to grep for the exact match among them.
file.txt
abc
abc-def
xyz
xyz-pqr
pqrs
If I search for "abc", I need it to return only abc,
or
if I search for "abc-def", I need it to return only "abc-def".
Preferred output:
$grep -w "abc" file.txt
abc
or
$grep -w "abc-def" file.txt
abc-def
The method below does not work with the hyphens:
$grep -w abc file.txt
With your given data/file you can use the -x flag.
grep -x abc file.txt
grep -x abc-def file.txt
-x, --line-regexp force PATTERN to match only whole lines
The -x flag is defined/required by POSIX grep(1)
In order to match an entire line you need to match the start and end of the line:
grep '^abc$' file.txt
grep '^abc-def$' file.txt
You can use awk this way:
awk -v w="abc" '$1==w' file.txt
abc
Or,
awk '$1==w' w="abc" file.txt
With the == operator, it only returns exact string matches. We set what to match with w="abc", either via the -v switch or via an assignment placed before the file name on the command line.
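If the search term comes from a shell variable and may contain regex metacharacters (a hyphen is harmless, but a dot or bracket is not), a fixed-string whole-line match is a common variation; a minimal sketch, assuming a POSIX-compatible grep:
word="abc-def"                  # hypothetical variable holding the search term
grep -xF -- "$word" file.txt    # -x: match whole lines, -F: treat the pattern as a literal string, --: end of options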

grep regex with hyphens and white spaces

I am trying to use grep to fetch a line containing the following information: -a string1 -b file1.txt -c string3.
I tried
grep -v grep | grep '[b][:space:] *.txt *[c]'
grep -v grep | grep '[b] *.txt *[c]'
string1, string3 and file1 are variables, so I am looking for a solution that uses wildcards.
But nothing is returned. Any help would be appreciated.
You may use this grep:
grep -- '-a [^[:blank:]]* -b [^[:blank:]]*.txt -c [^[:blank:]]*' *.txt
[^[:blank:]] matches any character that is not a space or tab.
-- separates options and pattern arguments.
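To see it work end to end, here is a small self-contained check (sample.txt is a made-up name, and the dot is escaped so it matches a literal '.'):
printf '%s\n' '-a string1 -b file1.txt -c string3' 'some unrelated line' > sample.txt
grep -- '-a [^[:blank:]]* -b [^[:blank:]]*\.txt -c [^[:blank:]]*' sample.txt   # prints only the first line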

grep: Invalid regular expression

I have a text file which looks like this:
haha1,haha2,haha3,haha4
test1,test2,test3,test4,[offline],test5
letter1,letter2,letter3,letter4
output1,output2,[offline],output3,output4
check1,[core],check2
num1,num2,num3,num4
I need to exclude all the lines that contain brackets ("[...]") and output the remaining lines to another file.
I'm currently using this command:
grep ",[" loaded.txt | wc -l > newloaded.txt
But it's giving me an error:
grep: Invalid regular expression
Use grep -F to treat the search pattern as a fixed string. You could also replace wc -l with grep -c.
grep -cF ",[" loaded.txt > newloaded.txt
If you're curious, [ is a special character. If you don't use -F then you'll need to escape it with a backslash.
grep -c ",\[" loaded.txt > newloaded.txt
By the way, I'm not sure why you're using wc -l anyway. From your problem description, it sounds like grep -v might be more appropriate: -v inverts grep's normal output, printing the lines that don't match.
grep -vF ",[" loaded.txt > newloaded.txt
An alternative method to Grep
It's unclear whether you want to remove lines that contain either bracket ([ or ]), or only the lines where the brackets surround characters. Either way, sed can easily remove lines that fit a definite pattern:
To delete only the lines where brackets surround characters ([...]):
sed '/\[.*\]/d' loaded.txt > newloaded.txt
Another approach might be to remove any line that contained either bracket:
sed '/\[/d;/\]/d' loaded.txt > newloaded.txt
(i.e., lines containing either [ or ] are deleted)
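If you prefer a single expression, the two deletes can be combined with a bracket expression (a sketch; the behaviour is the same as the command above):
sed '/[][]/d' loaded.txt > newloaded.txt   # [][] matches either "]" or "[" (a "]" placed right after the opening "[" is literal)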
Your grep command doesn't seem to be excluding anything. Also, why are you using wc? I thought you wanted the lines, not their count.
So if you just want the lines, as you say, that don't have [], then this should work:
grep -v "\[" loaded.txt > new.txt
You can also use awk for this:
awk -F\[ 'NF==1' file > newfile
cat newfile
haha1,haha2,haha3,haha4
letter1,letter2,letter3,letter4
num1,num2,num3,num4
Or this:
awk '!/\[/' file
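In case the awk one-liners are not obvious, here is the same idea with the reasoning spelled out in comments:
awk -F'[' 'NF==1' file    # with "[" as the field separator, a line with no "[" splits into exactly one field
awk '!/\[/' file          # print every line that does not match the regex \[ (a literal "[")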

Grep substrings in string/word

Is there a way on grep or any other unix tool to search for a sequence of substrings in a string?
To clarify:
$ grep "substring1.*subrstring2"
substring1_mySubstring2 # OK substrings forming a single string
substring1 substring2 # WRONG substrings are separated`
You can tell grep to look for substring1 + some characters + substring2:
grep -iE 'substring1\w+substring2' file
Note the use of -i to ignore case and -E for extended regex syntax (the same thing without -E can be done with \w\+ instead).
Test
$ cat a
substring1_mySubstring2
substring1 substring2
substring1_and_other_things12345substring2 blabla
Let's see how this matches only when there are no spaces in between:
$ grep -iE 'substring1\w+substring2' a
substring1_mySubstring2
substring1_and_other_things12345substring2 blabla
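One caveat: \w covers only letters, digits and the underscore, so it will not bridge punctuation such as a hyphen. If the characters between the two substrings may include punctuation, a broader class can be used (a sketch against the same test file a):
grep -iE 'substring1[^[:space:]]+substring2' a   # any run of non-whitespace characters between the two substrings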

Remove blank lines with grep

I tried grep -v '^$' in Linux and that didn't work. This file came from a Windows file system.
Try the following:
grep -v -e '^$' foo.txt
The -e option explicitly marks the next argument as a pattern (useful when the pattern could otherwise be mistaken for an option).
The single quotes around ^$ make it work in csh. Other shells are happy with either single or double quotes.
UPDATE: This works for me for a file with blank lines or "all whitespace" lines (such as Windows lines with \r\n style line endings), whereas the above only removes lines that are completely empty and have Unix-style line endings:
grep -v -e '^[[:space:]]*$' foo.txt
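For a quick reproduction of the Windows line-ending case, the following self-contained test may help (assumes a shell printf and a grep that understands the POSIX [[:space:]] class; crlf.txt is a throwaway name):
printf 'one\r\n\r\ntwo\r\n' > crlf.txt      # the middle line contains only a carriage return
grep -v -e '^$' crlf.txt                    # still prints three lines: the \r-only line is not empty
grep -v -e '^[[:space:]]*$' crlf.txt        # prints only "one" and "two" (each still carrying its trailing \r)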
Keep it simple.
grep . filename.txt
Use:
$ dos2unix file
$ grep -v "^$" file
Or just simply awk:
awk 'NF' file
If you don't have dos2unix, then you can use tools like tr:
tr -d '\r' < "$file" > t ; mv t "$file"
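If you would rather not modify the file in place, the two steps can be combined into one pipeline (a sketch; cleaned.txt is just an illustrative name):
tr -d '\r' < "$file" | grep -v '^$' > cleaned.txt   # strip carriage returns, then drop the now-empty lines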
grep -v "^[[:space:]]*$"
The -v makes it print the lines that do not match the pattern (which, being anchored with ^ and $, must cover the whole line).
Each part explained:
^ match start of line
[[:space:]] match whitespace: spaces, tabs, carriage returns, etc.
* the previous item (whitespace) may occur zero or more times
$ match end of line
Running the code:
$ echo "
> hello
>
> ok" |
> grep -v "^[[:space:]]*$"
hello
ok
To understand more about how/why this works, I recommend reading up on regular expressions. http://www.regular-expressions.info/tutorial.html
If you have sequences of multiple blank lines in a row, and would like only one blank line per sequence, try
grep -v "unwantedThing" foo.txt | cat -s
cat -s suppresses repeated empty output lines.
Your output would go from

match1



match2

to

match1

match2

The three blank lines in the original output would be compressed or "squeezed" into one blank line.
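A reproducible way to see the squeezing (assumes a cat that supports -s, such as GNU coreutils or BSD cat):
printf 'match1\n\n\n\nmatch2\n' | cat -s   # the three blank lines come out as a single blank line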
The same as the previous answers:
grep -v -e '^$' foo.txt
Here, -e explicitly specifies the pattern (note that it is -E, not -e, that selects extended regular expressions). '^$' means that there isn't any character between ^ (start of line) and $ (end of line); '^' and '$' are regex anchors.
So the command with -v prints all the lines that do not match this pattern (no characters between ^ and $).
This way, empty blank lines are eliminated.
I prefer using egrep, though in my test with a genuine file containing blank lines your approach worked fine (though without quotation marks in my test). This worked too:
egrep -v "^(\r?\n)?$" filename.txt
Do lines in the file have whitespace characters?
If so then
grep "\S" file.txt
Otherwise
grep . file.txt
Answer obtained from:
https://serverfault.com/a/688789
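The difference between the two shows up on lines that contain only spaces; a small check (the \S shorthand is a GNU grep extension):
printf 'a\n \nb\n' | grep .        # keeps the space-only line, since it is not empty
printf 'a\n \nb\n' | grep '\S'     # drops it, since it has no non-whitespace character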
This code removes blank lines and lines that start with "#":
grep -v "^#" file.txt | grep -v "^[[:space:]]*$"
awk 'NF' file-with-blank-lines > file-with-no-blank-lines
It's true that grep -v -e '^$' can work; however, it does not remove blank lines that contain one or more spaces. I found that the easiest and simplest way to remove blank lines is to use awk. The following is modified a bit from the awk answers above:
awk 'NF' foo.txt
But since this question is for using grep I'm going to answer the following:
grep -v '^ *$' foo.txt
Note: the blank space between the ^ and *.
Or you can use the \s to represent blank space like this:
grep -v '^\s*$' foo.txt
I tried hard, but this seems to work (assuming \r is biting you here):
printf "\r" | egrep -xv "[[:space:]]*"
Using Perl:
perl -ne 'print if /\S/'
\S matches non-whitespace characters.
egrep -v "^\s\s+"
egrep already does regex matching, and \s matches whitespace.
The + means one or more of the preceding item.
The ^ anchors the match at the start of the line.
Use:
grep pattern filename.txt | uniq
Here is another way of removing the blank lines and the lines starting with the # sign. I think this is quite useful for reading configuration files.
[root@localhost ~]# cat /etc/sudoers | egrep -v '^(#|$)'
Defaults requiretty
Defaults !visiblepw
Defaults always_set_home
Defaults env_reset
Defaults env_keep = "COLORS DISPLAY HOSTNAME HISTSIZE INPUTRC KDEDIR
LS_COLORS"
root ALL=(ALL) ALL
%wheel ALL=(ALL) ALL
stack ALL=(ALL) NOPASSWD: ALL
Read lines from a file, excluding empty lines:
grep -v '^$' folderlist.txt
folderlist.txt:

folder1/test

folder2

folder3

folder4/backup

folder5/backup
Results will be:
folder1/test
folder2
folder3
folder4/backup
folder5/backup
