Finding .php files with a certain variable declaration (string search) on the command line (shell) - Linux

I'm attempting to find all .php files that are at a certain depth in a directory tree (at least 4 levels down, but not more than 5 levels in).
I'm logged into my CentOS server as root via a shell.
The string I want to search for is:
$slides='';
Here's what I have in front of me; I would expect it to work. I tried escaping the $ with a \ (I thought perhaps it works like a regex, needing special characters escaped). I tried without the ='' portion, tried adding \'\' to that part, and tried removing the ='' altogether to simplify. Nothing.
find . -maxdepth 5 -mindepth 4 -type f -name '*.php' -print | xargs grep "\$slides=''" *
I'm already running it under the directory under which I want to recursively search.
Also, I have the filter set to look for *.php only, but I still get a bunch of directory names in the output with a warning that says grep: [dir_name]: Is a directory
Clearly I am missing something here, either in the syntax of the grep command or in how the filter works. I use grep more from PHP, so this is quite a transition for me!

So you were almost right. The problem looks to have been the grep part of the command
grep "\$slides=''" *
Namely the * was the issue. From the bash manual
After word splitting, unless the -f option has been set (see The Set
Builtin), Bash scans each word for the characters ‘*’, ‘?’, and ‘[’.
If one of these characters appears, and is not quoted, then the word
is regarded as a pattern, and replaced with an alphabetically sorted
list of filenames matching the pattern
When you piped the files found by find into xargs with * appended to the grep command, the shell expanded the * glob into a list of every file and directory in the current directory and placed them before the filenames that xargs appended, so grep searched those as well. Since grep cannot search a directory without the -r flag, it printed a warning for each one.
Instead, what you wanted to do is pipe the found files with find into xargs so it can add the list of filenames to the grep command, as that's what xargs does. From the xargs manual
xargs reads items from the standard input, delimited by blanks (which
can be protected with double or single quotes or a backslash) or
newlines, and executes the command (default is /bin/echo) one or more
times with any initial-arguments followed by items read from
standard input. Blank lines on the standard input are ignored.
Making the correct command
find . -maxdepth 5 -mindepth 4 -type f -name '*.php' -print0 | xargs -0 grep "\$slides=''"
We use the -print0 flag in find and the -0 flag in xargs so that NUL is the delimiter, in case any filenames contain newlines.
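If you'd rather not involve xargs at all, find can run grep itself and batch the filenames the same way; a sketch of an equivalent command (the -l flag lists only the names of matching files; drop it to see the matching lines):
find . -maxdepth 5 -mindepth 4 -type f -name '*.php' -exec grep -l "\$slides=''" {} +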

If you want to use shell_exec from your PHP code: it is a program execution function that lets you run a command like ls -al in the operating system shell and get the result back in a variable. Query strings are not commands you can use in this way.
Do you mean running PHP from the command line, so that it runs from the shell rather than from the web server?
php -r 'echo "hello world\n";'
If you run PHP 4.3 or above, you can use the PHP Command Line Interface (CLI), which can also execute scripts stored in files. Have a look at the syntax and examples at: http://php.net/features.commandline
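For example, you can put a snippet in a file and run it with the CLI binary (a minimal sketch; the filename hello.php is just an illustration):
$ printf '%s\n' '<?php echo "hello world\n";' > hello.php
$ php hello.php
hello world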

Related

How to use xargs with find?

I have a large number of files on disk and am trying to use xargs with find to get faster output.
find . -printf '%m %p\n'|sort -nr
If I write find . -printf '%m %p\n'|xargs -0 -P 0 sort -nr, it gives the error argument line is too long. Removing the -0 option gives a different error.
Parallelizing commands such as xargs or GNU parallel are applicable only if the task can be divided into multiple independent jobs, e.g. processing multiple files at once with the same command. The sort command cannot be used with these parallelizing commands, because sorting is a single job over all of the input. Although sort has a --parallel option, it may not work well for piped input. (Not fully evaluated.)
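If you want to experiment with it anyway, the option goes directly on sort (a sketch, assuming GNU coreutils for --parallel and nproc; as noted, the benefit for piped input is unverified):
find . -printf '%m %p\n' | sort --parallel="$(nproc)" -nr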
As side notes:
The mechanism of xargs is that it reads items (filenames, in most cases) from standard input and generates individual commands by combining the argument list (the command to be executed) with each item. You'll see, then, that the syntax .. | xargs .. sort is incorrect, because each filename is passed to sort as an argument, and sort then tries to sort the contents of that file.
The -0 option tells xargs that input items are delimited by a NUL character instead of a newline. It is useful when the input filenames contain special characters, including newline characters. To use this feature, you need to handle the piped stream consistently all the way through: pass the -print0 option to find and the -z option to sort. Otherwise the items are wrongly concatenated and cause the argument line is too long error.
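Putting these notes together, a consistently NUL-delimited version of the pipeline might look like this (a sketch, assuming GNU find and GNU sort; xargs is dropped entirely):
find . -printf '%m %p\0' | sort -z -nr | tr '\0' '\n'
The final tr converts the NUL delimiters back into newlines for display.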
As an alternative, consider using the locate command instead of find. You might want to update the file database with the updatedb command first; see man locate for details.
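For example (a sketch; locate and updatedb come from the mlocate or plocate package, and updatedb may need root):
updatedb
locate '*.php'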

Replace spaces in all files in a directory with underscores

I have found some similar questions here, but not this specific one, and I do not want to break all my files. I have a list of files and I simply need to replace all spaces with underscores. I know this is a sed command, but I am not sure how to apply it generically to every file.
I do not want to rename the files, just modify them in place.
Edit: To clarify, just in case it's not clear: I only want to replace whitespace within the files; the file names should not be changed.
find . -type f -exec sed -i -e 's/ /_/g' {} \;
find grabs all items in the directory (and subdirectories) that are files and passes those filenames as arguments to the sed command using the {} \; notation. It appears you already understand the sed command.
If you only want to search the current directory and ignore subdirectories, you can use
find . -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;
This is a two-part problem: step 1 is writing the proper sed command, and step 2 is applying it to every file in a given directory.
Substitution in sed follows the form s/pattern/replacement/flags, where s stands for substitution and the flags control how the operation takes place. According to this Super User post, to match whitespace characters you can use a plain space, \s, or [[:space:]]; the last two also catch tabs, and [[:space:]] is the POSIX-compliant form. Lastly, you need to specify a global operation, which is simply /g at the end. This replaces all spaces in a file with underscores.
Therefore, your sed command is
sed 's/ /_/g' FileNameHere
However, this only accomplishes half of your task. You also need to be able to do this for every file within a directory. Unfortunately, wildcards won't save us in the sed command itself, as * > * would be ambiguous; the solution is to iterate through each file and overwrite it individually. A for loop with a wildcard expands to all files in a directory. However, sed used with a plain redirect back to the same file loses the file's contents, so you must give sed the -i flag to make it edit files in place. Whatever you pass after the -i flag is used as the suffix for a backup of the old file. If an empty suffix is passed (-i '' for instance), no backup is created.
Therefore the final command should simply be
for i in *;do sed -i '' 's/ /_/g' "$i";done
This looks for all files in your current directory and applies the sed edit to each (directories do get listed, but no action occurs on them).
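As an aside, the backup-suffix syntax of -i differs between implementations. A sketch of the same in-place edit keeping a .bak backup under each (the suffix name is just an illustration):
sed -i.bak 's/ /_/g' file.txt    # GNU sed: suffix attached to -i
sed -i .bak 's/ /_/g' file.txt   # BSD/macOS sed: suffix as a separate argument
With GNU sed a bare -i means no backup, while BSD/macOS sed needs the explicit empty suffix -i '' used in the loop above.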
Well... since I was trying to get something running, I found a method that worked for me:
for file in $(ls); do sed -i 's/ /_/g' "$file"; done

Using grep to identify a pattern

I have several documents hosted on a cloud instance. I want to extract all words conforming to a specific pattern into a .txt file. This is the pattern:
ABC123A
ABC123B
ABC765A
and so on. Essentially the words start with a specific character string 'ABC', have a fixed number of numerals, and end with a letter. This is my code:
grep -oh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
When I execute the query, it runs for several hours without generating any output. I have over 1100 documents. However, when I run this query:
grep -r ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
the list of files with the strings is generated in a matter for seconds.
What do I need to correct in my query? Also, what is causing the delay?
UPDATE 1
Based on the answers, it's evident that the command is missing the file name on which it needs to be executed. I want to run the code on multiple document files (>1000)
The documents I want searched are in multiple sub-directories within a directory. What is a good way to search through them? Doing
grep -roh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
only returns the file names.
UPDATE 2
If I use the updated code from the answer below:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
I get a "no file or directory" error.
UPDATE 3
The pattern can be anywhere in the line.
You can use this regexp:
grep -E "^ABC[0-9]{3}[A-Z]$" docs > filename
ABC123A
ABC123B
ABC765A
There is no delay; grep is just waiting for the input you didn't give it (so it waits on standard input, by default). You can correct your command by supplying a filename argument:
grep -oh "ABC[0-9].*[a-zA-Z]$" file.txt > /home/user/abcLetterMatches.txt
Source (man grep):
SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
To perform the same grepping on several files recursively, combine it with the find command:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
This does what you ask for:
grep -hr '^ABC[0-9]\{3\}[A-Za-z]$'
-h to not get the filenames.
-r to search recursively. If no directory is given (as above) the current one is used. Otherwise just specify one as the last argument.
Quotes around the pattern to avoid accidental globbing, etc.
^ at the beginning of the pattern to — together with $ at the end — only match whole lines. (Not sure if this was a requirement, but the sample data suggests it.)
\{3\} to specify that there should be three digits.
No .* as that would match a whole lot of other things.
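Given UPDATE 3 (the pattern can appear anywhere in a line), a variant that drops the anchors and uses -o to extract only the matching words might look like this (a sketch; \< and \> are GNU grep word-boundary escapes, and the docs path is just an illustration):
grep -ohr '\<ABC[0-9]\{3\}[A-Za-z]\>' /path/to/docs > /home/user/abcLetterMatches.txt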

Looking for tool to search text in files on command line

Hello,
I'm looking for a script or program that searches files (e.g. .php, .html, etc.) for keywords or patterns and shows which file they are in.
I use the command cat /home/* | grep "keyword"
but I have too many folders and files, and this command takes a very long time :/
I need this script to find fake websites (paypal, ebay, etc.)
find /home -exec grep -s "keyword" {} \; -print
You don't really say what OS (and shell) you are using. You might want to retag your question to help us out.
Because you mention cat | ..., I am assuming you are using a Unix/Linux variant, so here are some pointers for looking at files. (bmargulies' solution is good too.)
I'm looking some script or program that use keywords or pattern search in files
grep is the basic program for searching files for text strings. Its usage is
grep [-options] 'search target' file1 file2 .... filen
(Note that 'search target' contains a space; if you don't surround the spaces in your search target with double or single quotes, you will have a minor error to debug.)
(Also note that 'search target' can use a wide range of wildcard characters, like ., ?, +, *, and many more, but that is beyond the scope of your question.) ... anyway ...
As I guess you have discovered, you can only cram so many files at a time onto the command line, even when using wild-card filename expansion. Unix/Linux almost always has a utility that can help with that:
startDir=/home
find ${startDir} -print | xargs grep -l 'Search Target'
This, as one person will be happy to remind you, will require further enhancements if your filenames contain whitespace characters or newlines.
The options available for grep can vary wildly based on which OS you are using. If you're lucky, you type the following to get the man page for your local grep.
man grep
If you don't have your page buffer set up for a large size, you might need to do
man grep | page
so you can see the top of the 'document'. Press any key to advance to the next page and when you are at the end of the document, the last key press returns you to the command prompt.
Some options that most greps have that might be useful to you are
-i (ignore case)
-l (list filenames only, i.e. where the text is found)
There is also fgrep, which stands for 'fixed-strings' grep. It is handy here
because you can give it a file of search targets to scan for, and it is used like
fgrep [-other_options] -f srchTargetsFile file1 file2 ... filen
I need this script to find fake websites (paypal, ebay, etc)
Final solution
you can make a srchFile like
paypal.fake.com
ebay.fake.com
etc.fake.com
and then combined with above, run the following
startDir=/home
find ${startDir} -print | xargs fgrep -il -f srchFile
Some greps require that the -f and its filename be written together, as -fsrchFile.
Now you are finding all files under /home and searching each with fgrep for paypal, ebay, etc. The -l says to print ONLY the filenames where a match is found. If you remove the -l, you will see the matched text, prepended with the filename.
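As an aside, on a GNU system a single recursive grep can do the same job without find and xargs (a sketch, assuming GNU grep):
grep -rilF -f srchFile /home
Here -r recurses, -i ignores case, -l lists only matching filenames, and -F (the modern spelling of fgrep) treats each line of srchFile as a fixed string.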
IHTH.

How can I use xargs to copy files that have spaces and quotes in their names?

I'm trying to copy a bunch of files below a directory and a number of the files have spaces and single-quotes in their names. When I try to string together find and grep with xargs, I get the following error:
find .|grep "FooBar"|xargs -I{} cp "{}" ~/foo/bar
xargs: unterminated quote
Any suggestions for a more robust usage of xargs?
This is on Mac OS X 10.5.3 (Leopard) with BSD xargs.
You can combine all of that into a single find command:
find . -iname "*foobar*" -exec cp -- "{}" ~/foo/bar \;
This will handle filenames and directories with spaces in them. You can use -name to get case-sensitive results.
Note: The -- flag passed to cp prevents it from processing files starting with - as options.
find . -print0 | grep -z 'FooBar' | xargs -0 ...
Here -z (--null-data) makes grep treat its input and output as NUL-delimited records, matching find's -print0 and xargs's -0. I don't know whether grep supports -z, nor whether xargs supports -0, on Leopard, but on GNU it's all good.
The easiest way to do what the original poster wants is to change the delimiter from any whitespace to just the end-of-line character like this:
find whatever ... | xargs -d "\n" cp -t /var/tmp
This is more efficient as it does not run "cp" multiple times:
find -name '*FooBar*' -print0 | xargs -0 cp -t ~/foo/bar
I ran into the same problem. Here's how I solved it:
find . -name '*FooBar*' | sed 's/.*/"&"/' | xargs cp -t ~/foo/bar
I used sed to substitute each line of input with the same line, but surrounded by double quotes. From the sed man page, "...An ampersand (``&'') appearing in the replacement is replaced by the string matching the RE..." -- in this case, .*, the entire line.
This solves the xargs: unterminated quote error. Note that cp -t names the target directory up front, which is needed because xargs appends the filenames at the end of the command; -t is a GNU cp extension.
This method works on Mac OS X v10.7.5 (Lion):
find . | grep FooBar | xargs -I{} cp {} ~/foo/bar
I also tested the exact syntax you posted. That also worked fine on 10.7.5.
Just don't use xargs. It is a neat program, but it doesn't go well with find when faced with non-trivial cases.
Here is a portable (POSIX) solution, i.e. one that doesn't require find, xargs or cp GNU specific extensions:
find . -name "*FooBar*" -exec sh -c 'cp -- "$#" ~/foo/bar' sh {} +
Note the ending + instead of the more usual ;.
This solution:
correctly handles files and directories with embedded spaces, newlines or whatever exotic characters.
works on any Unix and Linux system, even those not providing the GNU toolkit.
doesn't use xargs which is a nice and useful program, but requires too much tweaking and non standard features to properly handle find output.
is also more efficient (read: faster) than the accepted answer and most, if not all, of the other answers.
Note also that, despite what is stated in some other replies or comments, quoting {} is useless (unless you are using the exotic fish shell).
Look into using the --null commandline option for xargs with the -print0 option in find.
For those who rely on commands other than find, e.g. ls:
find . | grep "FooBar" | tr \\n \\0 | xargs -0 -I{} cp "{}" ~/foo/bar
find | perl -lne 'print quotemeta' | xargs ls -d
I believe that this will work reliably for any character except line-feed (and I suspect that if you've got line-feeds in your filenames, then you've got worse problems than this). It doesn't require GNU findutils, just Perl, so it should work pretty much anywhere.
I have found that the following syntax works well for me.
find /usr/pcapps/ -mount -type f -size +1000000c | perl -lpe ' s{ }{\\ }g ' | xargs ls -l | sort +4nr | head -200
In this example, I am looking for the largest 200 files over 1,000,000 bytes in the filesystem mounted at "/usr/pcapps".
The Perl one-liner between "find" and "xargs" escapes/quotes each blank so "xargs" passes any filename with embedded blanks to "ls" as a single argument.
Frame challenge — you're asking how to use xargs. The answer is: you don't use xargs, because you don't need it.
The comment by user80168 describes a way to do this directly with cp, without calling cp for every file:
find . -name '*FooBar*' -exec cp -t /tmp -- {} +
This works because:
the cp -t flag allows you to give the target directory near the beginning of cp, rather than near the end. From man cp:
-t, --target-directory=DIRECTORY
copy all SOURCE arguments into DIRECTORY
The -- flag tells cp to interpret everything after it as a filename, not a flag, so files starting with - or -- do not confuse cp; you still need this because the -/-- prefixes are interpreted by cp, whereas any other special characters are interpreted by the shell.
The find -exec command {} + variant essentially does the same as xargs. From man find:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of invoca-
tions of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of `{}'
is allowed within the command, and (when find is being invoked
from a shell) it should be quoted (for example, '{}') to protect
it from interpretation by shells. The command is executed in
the starting directory. If any invocation returns a non-zero
value as exit status, then find returns a non-zero exit status.
If find encounters an error, this can sometimes cause an immedi-
ate exit, so some pending commands may not be run at all. This
variant of -exec always returns true.
By using this in find directly, this avoids the need of a pipe or a shell invocation, such that you don't need to worry about any nasty characters in filenames.
With Bash (not POSIX) you can use process substitution to read each line into a variable, which you can then quote to protect special characters:
while IFS= read -r line; do cp "$line" ~/bar; done < <(find . | grep foo)
Be aware that most of the options discussed in other answers are not standard on platforms that do not use the GNU utilities (Solaris, AIX, HP-UX, for instance). See the POSIX specification for 'standard' xargs behaviour.
I also find the behaviour of xargs whereby it runs the command at least once, even with no input, to be a nuisance.
I wrote my own private version of xargs (xargl) to deal with the problems of spaces in names (it separates items only on newlines). The 'find ... -print0' and 'xargs -0' combination is pretty neat, though, given that file names cannot contain ASCII NUL '\0' characters. My xargl isn't as complete as it would need to be to be worth publishing, especially since GNU has facilities that are at least as good.
For me, I was trying to do something a little different. I wanted to copy my .txt files into my tmp folder. The .txt filenames contain spaces and apostrophe characters. This worked on my Mac.
$ find . -type f -name '*.txt' | sed 's/'"'"'/\'"'"'/g' | sed 's/.*/"&"/' | xargs -I{} cp -v {} ./tmp/
If the find and xargs versions on your system don't support the -print0 and -0 switches (for example, AIX find and xargs), you can use this terrible-looking code:
find . -name "*foo*" | sed -e "s/'/\\\'/g" -e 's/"/\\"/g' -e 's/ /\\ /g' | xargs cp /your/dest
Here sed will take care of escaping the spaces and quotes for xargs.
Tested on AIX 5.3
I created a small portable wrapper script called "xargsL" around "xargs" which addresses most of the problems.
Contrary to xargs, xargsL accepts one pathname per line. The pathnames may contain any character except (obviously) newline or NUL bytes.
No quoting is allowed or supported in the file list - your file names may contain all sorts of whitespace, backslashes, backticks, shell wildcard characters and the like - xargsL will process them as literal characters, no harm done.
As an added bonus feature, xargsL will not run the command even once if there is no input!
Note the difference:
$ true | xargs echo no data
no data
$ true | xargsL echo no data # No output
Any arguments given to xargsL will be passed through to xargs.
Here is the "xargsL" POSIX shell script:
#! /bin/sh
# Line-based version of "xargs" (one pathname per line which may contain any
# amount of whitespace except for newlines) with the added bonus feature that
# it will not execute the command if the input file is empty.
#
# Version 2018.76.3
#
# Copyright (c) 2018 Guenther Brunthaler. All rights reserved.
#
# This script is free software.
# Distribution is permitted under the terms of the GPLv3.
set -e
trap 'test $? = 0 || echo "$0 failed!" >& 2' 0
if IFS= read -r first
then
    {
        printf '%s\n' "$first"
        cat
    } | sed 's/./\\&/g' | xargs ${1+"$@"}
fi
Put the script into some directory in your $PATH and don't forget to
$ chmod +x xargsL
the script there to make it executable.
bill_starr's Perl version won't work well for embedded newlines (it only copes with spaces). For those on e.g. Solaris where you don't have the GNU tools, a more complete version might be (using sed)...
find -type f | sed 's/./\\&/g' | xargs grep string_to_find
Adjust the find and grep arguments or other commands as you require, but the sed will fix your embedded spaces and tabs (sed works line by line, so embedded newlines will still break it).
I used bill_starr's answer, slightly modified, on Solaris:
find . -mtime +2 | perl -pe 's{^}{\"};s{$}{\"}' > ~/output.file
This will put quotes around each line. I didn't use the '-l' option although it probably would help.
The file list I was going through might have '-', but not newlines. I haven't used the output file with any other commands, as I want to review what was found before I start mass-deleting files via xargs.
I played with this a little, started contemplating modifying xargs, and realised that for the kind of use case we're talking about here, a simple reimplementation in Python is a better idea.
For one thing, having ~80 lines of code for the whole thing means it is easy to figure out what is going on, and if different behaviour is required, you can just hack it into a new script in less time than it takes to get a reply on somewhere like Stack Overflow.
See https://github.com/johnallsup/jda-misc-scripts/blob/master/yargs and https://github.com/johnallsup/jda-misc-scripts/blob/master/zargs.py.
With yargs as written (and Python 3 installed) you can type:
find .|grep "FooBar"|yargs -l 203 cp --after ~/foo/bar
to do the copying 203 files at a time. (Here 203 is just a placeholder, of course, and using a strange number like 203 makes it clear that this number has no other significance.)
If you really want something faster and without the need for Python, take zargs and yargs as prototypes and rewrite in C++ or C.
You might need to grep for the FooBar directory, like:
find . -name "file.ext"| grep "FooBar" | xargs -i cp -p "{}" .
If you are using Bash, you can convert stdout to an array of lines by mapfile:
find . | grep "FooBar" | (mapfile -t; cp "${MAPFILE[@]}" ~/foobar)
The benefits are:
It's built-in, so it's faster.
Execute the command with all file names in one time, so it's faster.
You can append other arguments to the file names. For cp, you can also:
find . -name '*FooBar*' -exec cp -t ~/foobar -- {} +
however, some commands don't have such a feature.
The disadvantages:
It may not scale well if there are too many file names. (The limit? I don't know, but I tested with a 10 MB list file containing 10,000+ file names with no problem, under Debian.)
Well... who knows if Bash is available on OS X?
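If your filenames may contain newlines, bash 4.4+ lets mapfile read NUL-delimited input via -d (a sketch, assuming GNU find for -print0; the files array name is just an illustration):
mapfile -d '' -t files < <(find . -name '*FooBar*' -print0)
cp -- "${files[@]}" ~/foobar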
