Using grep to find function - linux

I need to find the usage of functions like system("rm filename") and system("rm -r filename").
I tried grep -r --include=*.{cc,h} "system" . and grep -r --include=*.{cc,h} "rm" . but they give far too many results.
How do I search for all instances of system("rm x") where 'x' can be anything? I'm fairly new to grep.

Try:
grep -E "system\(\"rm [a-zA-Z0-9 ]*\"\)" file.txt
The character class [a-zA-Z0-9 ] tells grep which characters it may match for the x in system("rm x"). Unfortunately, grep doesn't support groups for this kind of matching, so you need to spell out explicitly what to search for.
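Building on that, a variant along these lines might cover the original use case (recursive search over .cc/.h files, with an optional -r flag); the character class is only an assumption about what the file names may contain:
grep -rE --include='*.cc' --include='*.h' 'system\("rm( -r)? [a-zA-Z0-9_./ -]*"\)' .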

A possible way might be to work inside the GCC compiler. You could use the MELT domain-specific language for that. It provides easy matching on the GIMPLE internal representation used by GCC.
It is more complex than textual solutions, but it would also find, for example, calls to system inside functions after inlining and other optimizations.
So customizing the GCC compiler is probably not worth the effort in your case, unless you have a really large code base (e.g. millions of lines of source code).
In a simpler, text-based approach, you might pipe two greps, e.g.
grep -rwn system * | grep -w rm
or perhaps just
grep -rn 'system.*rm' *
BTW, in any big enough piece of software you will probably have a lot of code like, e.g.,
char cmdbuf[128];
snprintf (cmdbuf, sizeof(cmdbuf), "rm %s", somefilepath);
system (cmdbuf);
and in that case a simple textual grep-based approach is not enough (unless you visually inspect the surrounding code).
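One partial mitigation (a suggestion beyond the answer above, not a complete solution) is to list every call to system together with a few lines of context and review those by hand:
grep -rn -B2 -A1 --include='*.cc' --include='*.h' 'system *(' .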

Install ack (http://beyondgrep.com) and your call is:
ack --cc '\bsystem\(.+\brm\b'

Why does the find -regex command differ from find | grep?

The find command below outputs nothing, and does not find any "include" files or directories.
find -regex "*include*" 2>/dev/null
However piping the find command into grep -E seems to find most include files.
find ./ 2>/dev/null | grep -E "*include*"
I've left out the output since the first is blank and the second matches quite a few files.
I'm starting to need to dig through Linux system files to find the answers I need (especially to find macro values). To do that I have been using find | grep -E to find the files that should contain the macro I am looking for.
Below is the line I tried today with find (my root directory is /), and nothing is output. I don't want to run the command as root, so I redirect the errors to /dev/null. I checked the errors for regex syntax problems, but found nothing. It is still looping through all directories, since I still get a "find: /var/lib: Permission denied" error.
find -regex "*include*" 2>/dev/null
However this seems to work and give me everything I want.
find ./ 2>/dev/null | grep -E "*include*"
So my main question is why does find -regex not output the same as find | grep -E ?
Regular expressions are not a language, but a general mathematical construct with many different notations and dialects thereof.
For simple patterns you can often get away with ignoring this fact, since most dialects use very similar notation, but because you are specifying an ill-defined pattern with a leading asterisk, you run into engine-specific behavior.
grep -E uses the GNU implementation of POSIX ERE, and interprets your pattern as ()*includ(e)* and therefore matches includ followed by zero or more es. (POSIX says that the behavior of a leading asterisk is undefined).
find uses Emacs Regex, and interprets it as \*includ(e)* and therefore requires a literal asterisk in the filename.
If you want the same result from both, you can use find -regextype posix-egrep, or you can specify a regex that is equivalent in both such as .*include.* to match include as a substring.
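For instance (assuming GNU find, and remembering that -regex matches against the whole path, which is why the pattern needs the surrounding .*), either of these should then behave consistently:
find . -regextype posix-egrep -regex '.*include.*'
find . | grep -E 'include'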
As I understand your question, you want to find files in Linux directories.
You could use the locate utility for that:
yum install locate
If you use Ubuntu:
sudo apt-get install locate
Prepare the database:
sudo updatedb
Then start the search:
locate include
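By default locate matches the pattern anywhere in the full path; to match only the file or directory name itself (assuming the mlocate implementation), there is a basename option:
locate -b include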

How can I use xargs to run a function in a command substitution for each match?

While writing Bash functions for string replacements I have encountered some strange behaviour when using xargs. It is driving me mad at the moment, as I cannot get it to work.
Fortunately I have been able to nail it down to the following simple example:
Define a simple function which doubles every character of the given parameter:
function subs { echo $1 | sed -E "s/(.)/\1\1/g"; }
Call the function:
echo $(subs "ABC")
As expected the output is:
AABBCC
Now call the function using xargs:
echo "ABC" | xargs -I % echo $(subs "%")
Surprisingly the result now is:
ABCABC
It seems as if the sed command inside the function treats the whole string now as a single character.
Why does this happen and how can it be prevented?
You might ask why I use xargs at all. Of course, this is a simplified example and the actual use case is much more complex.
In the original use case, I have a program which produces lots of output. I pipe the output through several greps to get the lines of interest. Afterwards, I pipe the lines to sed to extract the data I need from them. Because some transformations I need to do on the data are too complex for regular expressions alone, I'd like to use a function for these. So my original idea was to simply pipe into the function, but I couldn't get that to work and ended up with the xargs solution. My original idea was something like this:
command | grep ... | grep ... | grep ... | sed ... | subs
BTW: I do not do this from the command line but from within a script. The function is defined in the very same script in which it is used.
I'm using Bash 3.2 (Mac OS X default), so fancy Bash 4.x stuff won't help me, sorry.
I'll be happy about everything which might shed some light on this topic.
Best regards
Frank
If you really need to do this (and you probably don't, but we can't help without a more representative sample), a better-practice approach might look like:
subs() { sed -E "s/(.)/\1\1/g" <<<"$1"; }
export -f subs
echo "ABC" | xargs bash -c 'for arg; do subs "$arg"; done' _
The use of echo "$(subs "$arg")" instead of just subs "$arg" adds nothing but bugs (consider what happens if one of your arguments is -n -- and that's assuming a relatively tame echo; they're allowed to consume backslashes even without a -e argument and to do all manner of other surprising things). You could do it above, but it slows your program down and makes it more prone to surprising behaviors; there's no point.
Running export -f subs exports your function to the environment, so it can be run by other instances of bash invoked as child processes (all programs invoked by xargs run outside your shell, so they can't see shell-local variables or functions).
Without -I -- which is to say, in its default mode of operation -- xargs appends arguments to the end of the command it's given. This permits a much more efficient usage mode, where instead of invoking one command per line of input, it passes as many arguments as possible to the shortest possible number of subprocesses.
This also avoids major security bugs that can happen when using xargs -I in conjunction with bash -c '...' or sh -c '...'. (If you ever use -I% sh -c '...%...', then your filenames become part of your code and can be used in injection attacks on your system.)
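Since the original goal was to write command | ... | subs, it is also worth noting (as a sketch, not part of the answer above) that a function whose body reads standard input can be used directly in a pipeline, with no xargs at all:
subs() { sed -E 's/(.)/\1\1/g'; }   # reads stdin line by line, doubles every character
printf '%s\n' ABC | subs            # prints AABBCC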
That's because the construct $(subs "%") gets expanded by the shell before xargs ever runs, so xargs actually executes echo %%.
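A step-by-step illustration of that expansion (a hypothetical trace, using the same subs function):
subs "%"                          # runs first and prints %%
echo "ABC" | xargs -I % echo %%   # this is what actually executes: each % becomes ABC
# output: ABCABC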

File Glob Patterns in Linux terminal

I want to search a filename which may contain kavi or kabhi.
I wrote command in the terminal:
ls -l *ka[vbh]i*
Between ka and i there may be either v or bh.
The code I wrote isn't correct. What would be the correct command?
A nice way to do this is to use extended globs. With them, you can use regex-like alternation in Bash patterns.
To start you have to enable the extglob feature, since it is disabled by default:
shopt -s extglob
Then, write a pattern with the required condition: stuff + ka + either v or bh + i + stuff. All together:
ls -l *ka@(v|bh)i*
The syntax is a bit different from normal regular expressions, so you need to read in the Extended Globs documentation that...
@(list): Matches one of the given patterns.
Test
$ ls
a.php AABB AAkabhiBB AAkabiBB AAkaviBB s.sh
$ ls *ka@(v|bh)i*
AAkabhiBB AAkaviBB
A slightly longer command line could use find, grep and xargs. It has the advantage of being easily extended to different search terms (by either extending the grep statement or by using additional options of find), a bit more readability (IMHO), and the flexibility to execute specific commands on the files which are found:
find . | grep -e "kabhi" -e "kavi" | xargs ls -l
You can get what you want by using curly braces in bash:
ls -l *ka{v,bh}i*
Note: this is not a regular expression question so much as a "shell globbing" question. Shell "glob patterns" are different from regular expressions, though they are similar in many ways.
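One difference worth noting (using the same hypothetical test files as in the extglob answer above): brace expansion happens before globbing, so the shell first turns the pattern into two separate globs, and a glob that matches nothing would be passed to ls literally:
echo *ka{v,bh}i*      # expands to *kavi* *kabhi*, then each glob is expanded
# AAkaviBB AAkabhiBB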

Easy replace with/without regex in multiple files

A hundred times a day I need to search for patterns in files, and sometimes I have to replace these patterns with something else. Most of the time it is simple patterns like a word or a short sentence, but sometimes I have to look for more complex regexps. I don't really like sed (at least the sed version I have, because it is not very compliant with the PCRE engine), so I prefer using perl -pi -e.
However, Perl pie is not very attractive on Cygwin because of the mandatory -i.bak temp files. I need to find a way to automatically remove the .bak files after processing. Moreover, if I want to replace recursively in a project, I have to list all the files first:
find . | xargs -n1 perl -pi -e 's/foo/bar/'
This command is quite long to type, especially if you use it a thousand times a month. So I decided to write a more useful tool working in the same way as the great silver searcher ag.
ag 'foo\d{3}[^\w]' # Search for a pattern
# Oh yes this one should be renamed!
replace 's/(foo)\d{3}[^\w]/\U$1\E_bar/g'
I wrote this very primitive bash function
function replace
{
    EXTENSION=.perlpie_tmp
    perl -p -i$EXTENSION -e $1 ${*:2}
    for file in ${*:2}; do
        rm "$file$EXTENSION";
    done;
}
But I am not satisfied at all, because it doesn't automatically search all files recursively when only the substitution argument is given. I may either modify this function to add find . when the number of arguments is 1, or I can write a much more complex program in Perl that supports command line options, pretty output, smart-case search or even plain-text search.
What is the most suitable option for this problem, and is there any advanced search/replace tool in the Linux world? If not, I may try to write my own rip tool (standing for replace-in-place) which can support all the options that I need.
Before that I need some advice...
EDIT
Actually I am thinking of forking https://github.com/petdance/ack2 to add a replacement feature... This may or may not be a good idea...
Here's an alternative to your function (edited to use the suggestion provided by gniourf_gniourf, thanks):
find . -type f -exec sh -c 'perl -pi.bak -e "s/foo/bar/" "$0" && rm -f "$0".bak' {} \;
Using this approach, you can remove the file as you go.
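If the replacement should only touch particular file types (a hypothetical refinement, not part of the answer above), the usual find predicates can be added before -exec:
find . -type f \( -name '*.cc' -o -name '*.h' \) -exec sh -c 'perl -pi.bak -e "s/foo/bar/" "$0" && rm -f "$0".bak' {} \;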
I think you can use
grep -Hrn -e "string" .
to find a pattern, and
find . -type f -exec sed -i "s#string1#string2#g" {} \;
to replace a pattern.
I would slightly modify your existing function:
function replace {
    local perl_code=$1 EXTENSION=.perlpie_tmp file
    shift
    for file; do
        perl -p -i$EXTENSION -e "$perl_code" "$file" && rm "$file$EXTENSION"
    done
}
This will slightly worsen the performance as you're now calling perl multiple times, but I suspect you won't notice.
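If the recursive behaviour asked about in the question is also wanted, one possibility (purely a sketch, with a hypothetical helper name) is to fall back to find when no file arguments are given, reusing the same perl-then-rm pair:
replace_all() {
    # hypothetical helper: apply a perl substitution to every regular file under .
    local perl_code=$1
    find . -type f -exec sh -c 'perl -pi.bak -e "$1" "$2" && rm -f "$2.bak"' _ "$perl_code" {} \;
}
# usage: replace_all 's/foo/bar/'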

grep based on blacklist -- without procedural code?

It's a well-known task, simple to describe:
Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.
A common application is filtering compiler warnings from a build log, but to ignore warnings on files that are not yours. The file foo.txt is the warnings file (itself filtered from the build log), and a blacklist file excluded_filenames.txt with file names, one per line.
I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.
But I feel that I should be really close with xargs, and just can't see the last step.
I know that if excluded_filenames.txt has only 1 file name in it, then
grep -v "`cat excluded_filenames.txt`" foo.txt
will do it.
And I know that I can get the filenames one per line with
xargs -L1 -a excluded_filenames.txt
So how do I combine those two into a single solution, without explicit loops in a procedural language?
Looking for the simple and elegant solution.
You should use the -f option, which reads the patterns from a file, one per line:
grep -vf excluded_filenames.txt foo.txt
You could also use -F which is more directly the answer to what you asked:
grep -vF "`cat excluded_filenames.txt`" foo.txt
from man grep
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
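Since the exclusion strings here are file names rather than regular expressions, the two options also combine naturally (a small usage note, not from the original answer):
grep -vFf excluded_filenames.txt foo.txt > foo_filtered.txt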
