Question on grep - linux

Out of the many results returned by grepping for a particular pattern, how can I use each result, one after the other, in my script? For example, I grep for .der in a certificate folder, which returns many results. I want to use each and every .der certificate listed by the grep command. How can I use one file after the other from the grep result?

Are you actually grepping content, or just filenames? If it's file names, you'd be better off using the find command:
find /path/to/folder -name "*.der" -exec some other commands {} ";"
It should be quicker in general.

One way is to use grep -l. This ensures you only get every file once. -l is used to print the name of each file only, not the matches.
Then, you can loop on the results:
for file in `grep ....`
do
# work on $file
done
Also note that if you have spaces in your filenames, there are a ton of possible issues. See "Looping through files with spaces in the names" on the Unix & Linux Stack Exchange.

You can use the output as part of a for loop, something like:
for cert in $(grep '\.der' *) ; do
echo ${cert} # or something else
done
Of course, if those der things are actually files (and you're using ls | grep to get them), you can directly use the files:
for cert in *.der ; do
echo ${cert} # or something else
done
In both cases, you may need to watch out for arguments with embedded spaces.
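For the case where grep really is searching file contents, here is a minimal sketch of a loop that copes with spaces in filenames. The function name, the pattern and the directory are placeholders of mine, not from the question, and the printf line stands in for whatever work the script needs to do. It assumes a grep that supports -r (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: act on every file under a directory whose content matches a pattern.
for_each_match() {
    # $1 = pattern to look for in file contents, $2 = directory to search
    # -r: recurse into subdirectories, -l: print each matching file name once
    grep -rl -- "$1" "$2" | while IFS= read -r file; do
        # Quoting "$file" keeps names with spaces in one piece
        printf 'processing: %s\n' "$file"
    done
}

# Example call (placeholder values):
# for_each_match 'BEGIN CERTIFICATE' /path/to/folder
```

Reading line by line handles spaces but not filenames that contain newlines; for those, GNU grep's -Z option together with bash's read -d '' would be one option.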

Related

How to replace text strings (by bulk) after getting the results by using grep

One of my Linux MySQL servers crashed, so I restored a backup. However, this time MySQL is running locally (localhost) instead of remotely (IP address).
Thanks to Stack Overflow users I found an excellent command to find the IP-address in all .php files in a given directory! The command I am using for this is:
grep -r -l --include="*.php" "100.110.120.130" .
This outputs the matching files with their locations, of course. If there were fewer than 10 results, I would simply change them by hand. However, I received over 200 hits.
So now I want to know if there is a safe command that replaces the IP address (example: 100.110.120.130) with the text "localhost" in all .php files in the given directory (/var/www/vhosts/), recursively.
And maybe, if possible and not too much work, also output the changed lines to a file? I don't know if that's even possible.
Maybe someone can provide me with a working solution? To be honest, I don't dare to fool around with this out of the blue. That's why I created a new thread.
The most standard way of replacing a string in multiple files would be to use a tool such as sed. The list of files you've obtained via grep could be read line by line (when output to a file) using a while loop in combination with sed.
$ grep -r -l --include="*.php" "100.110.120.130" . > list.txt
# this will output all matching files to list.txt
Replacing IP in matched files:
while IFS= read -r line ; do echo "$line" >> updated.txt ; sed -i 's/100\.110\.120\.130/localhost/g' "${line}" ; done<list.txt
This will take list.txt and feed it line by line to the sed command, which should replace all occurrences of the IP with "localhost". The dots are escaped so that each one matches only a literal dot rather than any character. The echo command directly before sed writes the names of all files that will be modified to updated.txt (it isn't strictly necessary, as list.txt contains exactly the same filenames, although it could be used as a means of verification).
To do a dry run before modifying all of the matched files, remove the -i from the sed command and it will print the output to stdout instead of modifying the files in place.
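On the "output the changed lines to a file" part of the question, one possible approach is sed's -n option combined with the p flag, which prints only the lines where a substitution happened. The sketch below is a dry run: it never modifies the .php files. The function name and the report file are hypothetical names of mine; the first argument is meant to be a list such as list.txt from above.

```shell
#!/bin/sh
# Hypothetical helper: record "file: line after replacement" for every line sed would change.
preview_replace() {
    list=$1 report=$2
    : > "$report"                      # start with an empty report
    while IFS= read -r file; do
        # -n suppresses normal output; the p flag prints only lines that were substituted
        sed -n 's/100\.110\.120\.130/localhost/gp' "$file" |
            while IFS= read -r changed; do
                printf '%s: %s\n' "$file" "$changed"
            done >> "$report"
    done < "$list"
}

# Example call (placeholder file names):
# preview_replace list.txt changes.txt
```

Once the report looks right, the in-place sed -i run can follow.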

Getting the most recent filename where the extension name is case *in*sensitive

I am trying to get the most recent .CSV or .csv file name among other comma separated value files where the extension name is case insensitive.
I am achieving this with the following command, provided by someone else without any explanation:
ls -t ~(i:*.CSV) | head -1
or
ls -t -- ~(i:*.CSV) | head -1
I have two questions:
What is the use of ~ and -- in this case? Does -- help here?
How can I get a blank response when there is no .csv or .CSV file in the folder? At the moment I get:
/bin/ls: cannot access ~(i:*.CSV): No such file or directory
I know I can test the exit code of the last command, but I was wondering maybe there is a --silent option or something.
Many thanks for your time.
PS: I researched this online quite thoroughly and was unable to find an answer.
The ~ is just a literal character; the intent would appear to be to match filenames starting with ~ and ending with .csv, with i: being a flag to make the match case-insensitive. However, I don't know of any shell that supports that particular syntax. The closest thing I am aware of would be zsh's globbing flags:
setopt extended_glob # Allow globbing flags
ls ~(#i)*.csv
Here, (#i) indicates that anything after it should be matched without regard to case.
Update: as @baptistemm points out, ~(i:...) is syntax defined by ksh.
The -- is a conventional argument, supported by many commands, to mean that any arguments that follow are not options, but should be treated literally. For example, ls -l would mean ls should use the -l option to modify its output, while ls -- -l means ls should try to list a file named -l.
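To make that concrete, here is a small sketch that creates a file literally named -l in a scratch directory and then lists and removes it. The function name is made up for the demo. The same protection is why -- is useful in front of a glob: if a pattern expands to a filename starting with a dash, ls would otherwise try to read it as an option.

```shell
#!/bin/sh
# Hypothetical demo, run in a subshell so the caller's working directory is untouched.
demo_double_dash() (
    dir=$(mktemp -d) && cd "$dir" || exit 1
    touch -- -l     # create a file literally named "-l"
    ls -- -l        # lists that file instead of switching to long format
    rm -- -l        # without "--", rm would try to parse "-l" as an option
    cd / && rmdir "$dir"
)

demo_double_dash
```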
~(i:*.CSV) tells the shell (apparently this is only supported in ksh93) that the enclosed text after the : must be matched case-insensitively, so in this example it could match possibilities such as:
*.csv or
*.Csv or
*.cSv or
*.csV or
*.CSv or
*.CSV
Note this could have been written ls -t *.[Cc][Ss][Vv] in bash.
To silence errors, I suggest searching this site for "standard error /dev/null"; that will help.
I tried running commands like what you have in both bash and zsh and neither worked, so I can't help you out with that, but if you want to discard the error, you can add 2>/dev/null to the end of the ls command, so your command would look like the following:
ls -t ~(i:*.CSV) 2>/dev/null | head -1
This will redirect anything written to STDERR to /dev/null (i.e. throw it out), which, in your case, would be /bin/ls: cannot access ~(i:*.CSV): No such file or directory.
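Putting the pieces from the answers together, a portable sketch that avoids the ksh-only syntax could use a bracket glob for the extension and discard the error when nothing matches. The function name is hypothetical. Note that 2>/dev/null also hides any other error ls might report, and parsing ls output breaks on filenames containing newlines.

```shell
#!/bin/sh
# Hypothetical helper: print the newest .csv file (any letter case) in a directory,
# or nothing at all if there is none.
latest_csv() (
    cd "$1" || exit 1
    # [Cc][Ss][Vv] matches csv, CSV, Csv, ...; -t sorts newest first; -- ends option parsing
    ls -t -- *.[Cc][Ss][Vv] 2>/dev/null | head -n 1
)

# Example call (placeholder path):
# latest_csv /path/to/folder
```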

Using grep to identify a pattern

I have several documents hosted on a cloud instance. I want to extract all words conforming to a specific pattern into a .txt file. This is the pattern:
ABC123A
ABC123B
ABC765A
and so on. Essentially the words start with a specific character string 'ABC', have a fixed number of numerals, and end with a letter. This is my code:
grep -oh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
When I execute the query, it runs for several hours without generating any output. I have over 1100 documents. However, when I run this query:
grep -r ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
the list of files with the strings is generated in a matter of seconds.
What do I need to correct in my query? Also, what is causing the delay?
UPDATE 1
Based on the answers, it's evident that the command is missing the file name on which it needs to be executed. I want to run the code on multiple document files (>1000)
The documents I want searched are in multiple sub-directories within a directory. What is a good way to search through them? Doing
grep -roh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
only returns the file names.
UPDATE 2
If I use the updated code from the answer below:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
I get a no file or directory error
UPDATE 3
The pattern can be anywhere in the line.
You can use this regexp :
~/ grep -E "^ABC[0-9]{3}[A-Z]$" docs > filename
ABC123A
ABC123B
ABC765A
There is no delay; grep is just waiting for input you didn't give it (with no file argument, it reads standard input by default). You can correct your command by supplying a filename argument:
grep -oh "ABC[0-9].*[a-zA-Z]$" file.txt > /home/user/abcLetterMatches.txt
Source (man grep):
SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
To perform the same grepping on several files recursively, combine it with the find command:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
This does what you ask for:
grep -hr '^ABC[0-9]\{3\}[A-Za-z]$'
-h to not get the filenames.
-r to search recursively. If no directory is given (as above) the current one is used. Otherwise just specify one as the last argument.
Quotes around the pattern to avoid accidental globbing, etc.
^ at the beginning of the pattern to — together with $ at the end — only match whole lines. (Not sure if this was a requirement, but the sample data suggests it.)
\{3\} to specify that there should be three digits.
No .* as that would match a whole lot of other things.
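Since UPDATE 3 says the pattern can be anywhere in the line, the anchored version would miss those occurrences. A possible variant, sketched below, drops the ^ and $ anchors and uses -o to print only the matched text and -w to require whole words. It assumes a grep that supports -r, -o and -w (GNU and BSD grep do), and that "a fixed number of numerals" means three, as in the sample data. The function name and arguments are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: pull every ABC + 3 digits + letter word out of all files under a directory.
extract_codes() {
    # $1 = directory to search, $2 = output file (keep it outside the searched directory)
    # -r recurse, -h no file names, -o only the matched text, -w whole words only, -E extended regex
    grep -rhowE 'ABC[0-9]{3}[A-Za-z]' "$1" > "$2"
}

# Example call (placeholder paths):
# extract_codes /path/to/docs /home/user/abcLetterMatches.txt
```

Keeping the output file outside the searched directory avoids grep reading its own results while it writes them.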

Help needed to nab the malware viral activity using awk

I am facing issues with my server: malware sometimes adds its code at the start or end of files. I have fixed the security loopholes to the best of my knowledge, and my hosting provider has informed me that security is adequate now, but I have become paranoid about viral/malware activity on my site. I have a plan, but I am not well versed in Linux tools like sed, awk or gawk, so I need your help. I could do this using my PHP knowledge, but that would be very resource intensive.
Since malware/viruses add code at the start or end of the file (so that the website does not show any error), can you please let me know how to write a command which would recursively look into all .php files (I will use the help to make changes in other types of files) in the parent and all sub-directories and add a particular tag at the start and end of each file, say, XXXXXX_START and YYYYYY_END.
Then I need a script which would read all the .php files, check whether the first line is XXXXXX_START and the last line is YYYYYY_END, and create a report if any file is found to be different.
I will setup a cron to check all the files and email the report to me if any discrepancy found.
I know this is not 100% foolproof as virus may add the data after the commented lines, but this is the best option I could think of.
I have tried the following commands to add data at the start -
sed -i -r '1i add here' *.txt
but this isn't recursive and it adds line to only the parent directory files.
Then I found this -
BEGIN and END are special patterns. They are not used to match input records. Rather, they are used for supplying start-up or clean-up information to your awk script. A BEGIN rule is executed, once, before the first input record has been read. An END rule is executed, once, after all the input has been read. For example:
awk 'BEGIN { print "Analysis of \"foo\"" }
/foo/ { ++foobar }
END { print "\"foo\" appears " foobar " times." }' BBS-list
But unfortunately, I could not decipher anything.
Any help on above mentioned details is highly appreciated. Any other suggestions are welcomed.
Regards,
Nitin
You can use the following to modify the files (also creates backup files called .bak):
find . -name "*.php" -print0 | xargs -0 sed -i.bak -e '1iSTART_XXXX' -e '$aEND_YYYY'
You could use the following shell script for checking the files:
# NUL-delimited names keep files with spaces intact
find . -name "*.php" -print0 | while IFS= read -r -d '' f
do
START_LINE=$(head -n 1 "$f")
END_LINE=$(tail -n 1 "$f")
if [[ $START_LINE != "START_XXXX" ]]
then
echo "$f: Mismatched header!"
fi
if [[ $END_LINE != "END_YYYY" ]]
then
echo "$f: Mismatched footer!"
fi
done
Use version control and/or backups; in the event of suspicious activity, zap the live site and reinstall from backups or your version control source.
$ find . -type f | grep "txt$" | xargs sed -i -r '1i add here'
Will apply that command to all files in or under the current directory. You could probably fold the grep logic into find, but I like simple incantations.

Looking for tool to search text in files on command line

Hello
I'm looking some script or program that use keywords or pattern search in files ex. php, html, etc and show where is this file
I use command cat /home/* | grep "keyword"
but i have too many folders and files and this command causes big uptime :/
I need this script to find fake websites (paypal, ebay, etc)
find /home -exec grep -s "keyword" {} \; -print
You don't really say what OS (and shell) you are using. You might want to retag your question to help us out.
Because you mention cat | ..., I am assuming you are using a Unix/Linux variant, so here are some pointers for looking at files. (bmargulies' solution is good too.)
I'm looking some script or program that use keywords or pattern search in files
grep is the basic program for searching files for text strings. Its usage is
grep [-options] 'search target' file1 file2 .... filen
(Note that 'search target' contains a space, if you don't surround spaces in your searchTarget with double or single quotes, you will have a minor error to debug.)
(Also note that 'search target' can use a wide range of wild-card characters, like ., ?, +, * and many more, which is beyond the scope of your question.) ... anyway ...
As I guess you have discovered, you can only cram so many files at a time into the command line, even when using wild-card filename expansion. Unix/Linux almost always has a utility that can help with that:
startDir=/home
find ${startDir} -print | xargs grep -l 'Search Target'
This, as one person will be happy to remind you, will require further enhancements if your filenames contain whitespace characters or newlines.
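As a sketch of that enhancement: find can emit NUL-terminated names with -print0 and xargs can read them with -0, so spaces and newlines in filenames no longer split arguments. Both options are assumed to be available (they are in GNU and BSD find/xargs). The function name and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that contain a fixed string.
list_matching_files() {
    # $1 = start directory, $2 = fixed string to look for
    # -type f skips directories; -print0 and -0 pass names NUL-terminated
    # -F treats the search target as a fixed string, -l prints only file names
    find "$1" -type f -print0 | xargs -0 grep -lF -- "$2"
}

# Example call (placeholder values):
# list_matching_files /home 'Search Target'
```

An alternative with the same effect and no xargs is find "$1" -type f -exec grep -lF -- "$2" {} +.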
The options available for grep can vary wildly based on which OS you are using. If you're lucky, you type the following to get the man page for your local grep.
man grep
If you don't have your page buffer setup for a large size, you might need to do
man grep | page
so you can see the top of the 'document'. Press any key to advance to the next page and when you are at the end of the document, the last key press returns you to the command prompt.
Some options that most greps have that might be useful to you are
-i (ignore case)
-l (list only the names of files where the text is found)
There is also fgrep (equivalent to grep -F), which treats the search targets as fixed strings rather than regular expressions. With -f you can give it a file of search targets to scan for, and it is used like
fgrep [-other_options] -f srchTargetsFile file1 file2 ... filen
I need this script to find fake websites (paypal, ebay, etc)
Final solution
you can make a srchFile like
paypal.fake.com
ebay.fake.com
etc.fake.com
and then combined with above, run the following
startDir=/home
find ${startDir} -print | xargs fgrep -il -f srchFile
Some greps require that the -fsrchFile be run together.
Now you are finding all files starting at /home, searching with fgrep for paypal, ebay, etc. in all files. The -l says it will ONLY print the filename where a match is found. You can remove the -l and then you will see the output of what is found, prepended with the filename.
IHTH.

Resources