grep lines with X but not with Y - linux

The problem:
I want to find all lines of code in my project folder that contain ".js", in order to check that I don't have any un-minified JavaScript files.
When I try the following: grep -H ".js\"" *
I get the right matches, but the results also include lines with ".min.js", which I don't want.
Is it possible, using the grep command, to search my project folder for all files/lines that contain ".js" but not ".min.js"?
Thanks.
GalT.

Just pipe the output to another grep:
grep -H ".js" * | grep -v ".min.js"

You can do this with awk (dots escaped so they match literally):
awk '/\.js/ && !/\.min\.js/'
To print the file name of each match:
awk '/\.js/ && !/\.min\.js/ {print FILENAME}' *
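A quick sanity check of the awk filter against a made-up sample file (the file name and contents are invented for the demo):

```shell
# One line that should be kept, one that should be excluded.
tmp=$(mktemp -d)
printf 'src="app.js"\nsrc="lib.min.js"\n' > "$tmp/page.html"

# Keep lines matching ".js" but not ".min.js" (dots escaped to be literal).
kept=$(awk '/\.js/ && !/\.min\.js/' "$tmp/page.html")
echo "$kept"
rm -rf "$tmp"
```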

The following will also work across a whole folder tree.
For the current directory you can use:
find . -type f | xargs grep "\.js" | grep -v "min\.js"
For any specific folder:
find (folder path) -type f | xargs grep "\.js" | grep -v "min\.js"

How to get file name of the linux awk and find results

I have this Linux command using pipes which finds files in directories and runs grep and awk on the results. The command works just fine; what I'm missing is the ability to get the file name of each result, so I know which source file it came from.
$ find . -name "*.log" | xargs grep -i TypeOf | grep -v 'Error=APP1' | awk '{split($0,a,"Name="); print a[2]}' | sort -h
How do I add something to this command to print the file names?
You don't need to add much, but you do need to remove several things. You never need grep when you're using awk, and split() is what awk does by default, so your existing command line should just be:
find . -name "*.log" | xargs awk -F'Name=' 'tolower($0)~/typeof/ && !/Error=APP1/ {print $2}' | sort -h
and then to print the file name and line number just add them to the print statement:
find . -name "*.log" | xargs awk -F'Name=' 'tolower($0)~/typeof/ && !/Error=APP1/ {print FILENAME, FNR, $2}' | sort -h
The problem is your awk program: you keep only the text after "Name=" and discard everything else, including the file name, which grep prints as the first :-separated field. You probably need something like:
awk -F: '{split($0,a,"Name="); print $1, a[2]}'
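The FILENAME/FNR variant can be exercised end to end on two made-up .log files (file names and contents are invented for the demo):

```shell
# Two sample logs: one clean match, one line that must be excluded.
tmp=$(mktemp -d)
printf 'TypeOf request Name=alpha\n' > "$tmp/a.log"
printf 'typeof Error=APP1 Name=skipme\nTYPEOF ok Name=beta\n' > "$tmp/b.log"

# Case-insensitive match on "typeof", drop Error=APP1 lines,
# then print the file name, the line number within it, and field 2.
out=$(find "$tmp" -name '*.log' | xargs awk -F'Name=' \
    'tolower($0) ~ /typeof/ && !/Error=APP1/ {print FILENAME, FNR, $2}' | sort)
echo "$out"
rm -rf "$tmp"
```

Each output line carries its source file, which answers the original question.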

Remove lines containing a string from all files in directory

My server has been infected with malware. I have upgraded my Linux server to the latest version and no new files are being infected, but I need to clean up all the files now.
I can locate all the files doing the following:
grep -H "gzinflate(base64_decode" /home/website/data/private/assets/ -R | cut -d: -f1
But, I want to now delete the line containing gzinflate(base64_decode in every single file.
I'd use sed -i '/gzinflate(base64_decode/d' to delete those matching lines in a file:
... | xargs -I'{}' sed -i '/gzinflate(base64_decode/d' '{}'
Note: You really want to be using grep -Rl not grep -RH .. | cut -d: -f1 as -l lists the matching filenames only so you don't need to pipe to cut.
Warning: You should really be concerned about the deeper issue of security here, I wouldn't trust the system at all now, you don't know what backdoors are open or what files may still be infected.
Once you've got the list of files using your command:
grep -H "gzinflate(base64_decode" /home/website/data/private/assets/ -R | cut -d: -f1
you can loop through the files one by one and use:
grep -v "gzinflate(base64_decode" file > newfile

how to compare output of two ls in linux

So here is the task, which I can't solve. I have a directory with .h files and a directory with .i files, which have the same names as the .h files. I want, just by typing a command, to get all .h files which are not found as .i files. It's not a hard problem; I could do it in some programming language, but I'm curious what it looks like on the command line :). To be more specific, here is the algorithm:
get file names without extensions from ls *.h
get file names without extensions from ls *.i
compare them
print all names from 1 that are not present in 2
Good luck!
diff \
<(ls dir.with.h | sed 's/\.h$//') \
<(ls dir.with.i | sed 's/\.i$//') \
| grep '^<' \
| cut -c3-
diff <(ls dir.with.h | sed 's/\.h$//') <(ls dir.with.i | sed 's/\.i$//') executes ls on the two directories, cuts off the extensions, and compares the two lists. Then grep '^<' keeps the lines that are only in the first listing, and cut -c3- cuts off the "< " prefix that diff inserted.
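The process-substitution approach can be tried end to end in a scratch directory (directory and file names are invented for the demo; process substitution requires bash, not plain sh):

```shell
# Two scratch directories: bar.h has no matching .i file.
tmp=$(mktemp -d)
mkdir "$tmp/h" "$tmp/i"
touch "$tmp/h/foo.h" "$tmp/h/bar.h" "$tmp/i/foo.i"

# Compare the extension-stripped listings; keep names only in the first.
missing=$(diff <(ls "$tmp/h" | sed 's/\.h$//') \
               <(ls "$tmp/i" | sed 's/\.i$//') \
          | grep '^<' | cut -c3-)
echo "$missing"
rm -rf "$tmp"
```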
(Note: this one parses ls error output, so it is fragile and locale-dependent.)
ls ./dir_h/*.h | sed -r -n 's:.*dir_h/([^.]*).h$:dir_i/\1.i:p' | xargs ls 2>&1 | \
grep "No such file or directory" | awk '{print $4}' | sed -n -r 's:dir_i/([^:]*).*:dir_h/\1:p'
ls -1 dir1/*.hh dir2/*.ii | awk -F"/" '{print $NF}' | awk -F"." '{a[$1]++;b[$0]++}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
explanation:
ls -1 dir1/*.hh dir2/*.ii
above will list all the *.hh and *.ii files in both directories.
awk -F"/" '{print $NF}'
above will print just the file name, excluding the full path.
awk -F"." '{a[$1]++;b[$0]++}END{for(i in a)if(a[i]==1 && b[i".hh"]) print i}'
above will create two associative arrays: one keyed by the file name without its extension, and one keyed by the full file name. (Note the b[$0]++ — the count must actually be incremented, or the END-block test on b would always be false.)
If both the .hh and .ii files exist, the count for that name will be 2; if there is only one file, the count will be 1. So we need the entries whose count is 1 and which correspond to a header file (.hh).
That check uses the associative array b, and is done in the END block.
Assuming bash is your shell:
for file in dir_with_h/*.h; do
name=${file%.h}; # trim trailing ".h" file extension
name=${name#dir_with_h/}; # trim leading folder name
if [ ! -e "dir_with_i/${name}.i" ]; then
echo "${name}";
fi
done
Undoubtedly this can be ported to virtually all other shells. I find it less cryptic than some of the other approaches (although that is surely my problem), but it is a little wordy. As such, a shell script might help you recall it. (Iterating the glob directly, rather than the output of ls, also keeps it safe for file names containing spaces.)
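Here is a self-contained sketch of the same loop approach, run against a scratch directory (directory and file names are invented for the demo):

```shell
# baz.h has no matching .i file; foo.h does.
tmp=$(mktemp -d)
mkdir "$tmp/dir_with_h" "$tmp/dir_with_i"
touch "$tmp/dir_with_h/foo.h" "$tmp/dir_with_h/baz.h" "$tmp/dir_with_i/foo.i"

found=""
for file in "$tmp"/dir_with_h/*.h; do
    name=${file%.h}      # trim trailing ".h" extension
    name=${name##*/}     # trim leading directories
    if [ ! -e "$tmp/dir_with_i/$name.i" ]; then
        found="$found $name"
    fi
done
echo "$found"
rm -rf "$tmp"
```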

grep: grep -v to exclude a file but doesn't work

I have three file in a directory:
a.html:
<html>
a
</html>
b.html:
<html>
b
</html>
htmlfile:
this is just a html file
I want to get the files whose filename extension is not .html, so the command I use is:
ls|grep -v *html
but the result is:
<html>
b
</html>
why?
Thank you. But I don't know why ls|grep -v *html prints out the content of b.html. If this command prints the content of files ending with .html, why doesn't it print the content of a.html?
Since you did not put *html in quotes, the shell expands your command to
ls | grep -v a.html b.html
Now, since grep is called with two arguments, it will ignore stdin. So the result is equivalent to
grep -v "a.html" b.html
which prints the contents of b.html.
edit
To make it work, use either
ls | grep -v "html$"
# Note that html$ is the regexp equivalent to the shell pattern *html
or
shopt -s extglob # turn on extended globbing
ls -d !(*html)
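To see both behaviours side by side in a throwaway directory that mirrors the question's three files:

```shell
# Recreate the question's directory: a.html, b.html, htmlfile.
tmp=$(mktemp -d); cd "$tmp"
printf '<html>\na\n</html>\n' > a.html
printf '<html>\nb\n</html>\n' > b.html
printf 'this is just a html file\n' > htmlfile

# Unquoted: the shell expands *html first, so this runs
# "grep -v a.html b.html" and ignores the pipe entirely.
unquoted=$(ls | grep -v *html)
# Quoted regexp: grep really filters the ls output.
quoted=$(ls | grep -v "html$")
echo "$unquoted"; echo "$quoted"
cd /; rm -rf "$tmp"
```

The unquoted form prints the contents of b.html (exactly the surprising output from the question), while the quoted form prints only htmlfile.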
Use
ls | grep -v .html
this will filter out names with a .html extension.
Quick test:
$ ls
a.html b.html htmlfile
$ ls | grep -v .html
htmlfile
This should do the trick:
ls -1 | grep -v '.html$'
This has something to do with shell globbing. I'm not sure exactly why grep -v behaves this way, but here are some similar results on Arch Linux zsh:
ls | grep -v *html
<html>b</html>
grep -v *html
<html>b</html>
Note those have the same result. The grep command is operating on the current working directory and using the shell globbing character (*) as an argument. The pipe from ls has nothing to do with it. It's included in the output, but grep discards it.
To see how this works more clearly, move up a directory and issue:
ls <dirname> | grep -v *html
zsh: no matches found: *html
Edit: See Pumbaa80's answer for why this happens.

grep command working in testdir but not in "real" directory

I just thought I had found my solution because the command works in my test directory.
grep -H -e 'author="[^"].*' *.xml | cut -d: -f1 | xargs -I '{}' mv {} mydir/.
But when I used the command in the non-test directory, it did not work.
This is the error message:
grep: unknown option -- O
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Not even this worked:
$ grep -H author *.xml
or this:
$ grep -H 'author' *.xml
(same error message)
I suspect it has something to do with the file names or the number of files.
I have almost 3000 files in the non-test-directory and only 20 in my test directory.
In both directories almost all file names contain spaces and " - ".
Some more info:
I'm using Cygwin.
I am not allowed to change the filenames
Try this (updated):
grep -HlZ 'author="[^"].*' -- *.xml | xargs -0 -I {} mv -- {} mydir/
EXPLANATION (updated)
In your "real" directory you have a file with name starting with -O.
Your shell expands the file list *.xml, and grep takes your filename that starts with - as an (invalid) option. The same thing happens with mv. As explained in the Common options section of info coreutils, you can use -- to delimit the option list. Whatever comes after -- is treated as an operand, not an option.
Using the -l (lowercase L) option, grep outputs only the filename of matching files, so you don't need to use cut.
To correctly handle every strange filename, you have to use the pair -Z in grep and -0 in xargs.
No need to use -e because your pattern does not begin with -.
Hope this will help!
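The effect of -- is easy to reproduce in a scratch directory (the file names, including the one starting with -, are invented for the demo):

```shell
# A file whose name begins with "-" is expanded by the glob and would
# otherwise be parsed by grep as an option.
tmp=$(mktemp -d); cd "$tmp"
printf '<doc author="x">\n' > ./-Odd.xml
printf '<doc author="y">\n' > normal.xml

# "--" ends option parsing, so -Odd.xml is treated as a file operand.
matches=$(grep -l 'author="' -- *.xml | sort)
echo "$matches"
cd /; rm -rf "$tmp"
```

Without the --, the same grep would fail with "unknown option", which is exactly the error from the question.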
