How to grep lines that end with .c or .cpp? - linux

I have a file as below, and I want to grep for lines having a .c or .cpp extension. I have tried using cat file | grep ".c", but I am getting all types of extensions as output. Please shed some light on this. Thanks in advance.
file contents are below:
/dir/a/b/cds/main.c
/dir/a/f/cmdss/file.cpp
/dir/a/b/cds/main.h
/dir/a/f/cmdss/file.hpp
/dir/a/b/cdys/main_abc.c
/dir/a/f/cmfs/file_123.cpp

grep supports regular expressions.
$ grep -E '\.(c|cpp)$' input
-E means 'Interpret PATTERN as an extended regular expression'
\. means a literal dot .
() is a group
c|cpp is an alternation: c or cpp
$ matches the end of the line
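For the sample file above (assuming it was saved as input), this prints only the four .c/.cpp lines:
$ grep -E '\.(c|cpp)$' input
/dir/a/b/cds/main.c
/dir/a/f/cmdss/file.cpp
/dir/a/b/cdys/main_abc.c
/dir/a/f/cmfs/file_123.cpp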

$ grep -E '\.cp{2}?' testfile1
/dir/a/b/cds/main.c
/dir/a/f/cmdss/file.cpp
/dir/a/b/cdys/main_abc.c
/dir/a/f/cmfs/file_123.cpp
$
Maybe this variant will be useful. Here p{2} means 'the symbol p occurs exactly 2 times after the symbol c', and the trailing ? makes that group optional, so the pattern matches both .c and .cpp.
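Note that this pattern has no $ anchor, so it would also match a hypothetical file.cpx; anchoring it restricts the match to extensions at the end of the line:
$ grep -E '\.cp{2}?$' testfile1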

You can also use the --include option, like below:
grep --include \*.hpp --include \*.cpp your_search_pattern
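Note that --include only takes effect when grep selects the files itself, typically with -r; a minimal sketch using the extensions from the question (the path and pattern are placeholders):
$ grep -rn --include='*.c' --include='*.cpp' 'your_search_pattern' .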

The Android framework defines a bash helper function named cgrep; it recurses through the project directory and is much faster than using grep -r.
Usage:
cgrep <expression to find>
It greps only C/C++ header and source files.
function cgrep()
{
    find . -name .repo -prune -o -name .git -prune -o -type f \
        \( -name '*.c' -o -name '*.cc' -o -name '*.cpp' -o -name '*.h' \) -print0 \
        | xargs -0 grep --color -n "$@"
}
You can paste this into your .bashrc file, or use it directly in the shell.
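For example, from the top of a source tree (the path and search term here are hypothetical):
$ cd /path/to/project
$ cgrep 'my_function'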

Related

Sed and grep in multiple files

I want to use "sed" and "grep" to search and replace in multiple files, excluding some directories.
I run this command:
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine1' /projets/ | sed -i 's/chaine1/chaine2/'
I get this message:
sed: pas de fichier d'entrée ("sed: no input files")
I also tried with these two commands:
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine1' . | xargs -0 sed -i 's/chaine2/chaine2/'
$ grep -RnI --exclude-dir={node_modules,install,build} 'chaine2' . -exec sed -i 's/chaine1/chaine2/g' {} \;
But it doesn't work!
Could you help me please?
Thanks in advance.
You want find with -exec. Don't bother running grep; sed will only change lines containing your pattern anyway.
find \( -name node_modules -o -name install -o -name build \) -prune \
-o -type f -exec sed -i 's/chaine1/chaine2/' {} +
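If you want to preview the changes before editing in place, one sketch is to drop -i and let sed print only the substituted lines:
find \( -name node_modules -o -name install -o -name build \) -prune \
 -o -type f -exec sed -n 's/chaine1/chaine2/p' {} +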
First, the direct output of the grep command is not file paths; each line looks like {file_path}:{line_no}:{content}. So the first thing you need to do is extract the file paths. We can do this with the cut command or with the -l option of grep.
# This will print {file_path}
$ echo {file_path}:{line_no}:{content} | cut -f 1 -d ":"
# This is a better solution, because it only prints each file once even though
# the grep pattern appears at many lines of a file.
$ grep -RlI --exclude-dir={node_modules,install,build} "chaine1" /projets/
Second, sed -i does not read from stdin. We can use xargs to read each file path from stdin and then pass it to sed as its argument. You have already done this.
The complete command like this:
$ grep -RlI --exclude-dir={node_modules,install,build} "chaine1" /projets/ | xargs -I {} sed -i 's/chaine1/chaine2/' {}
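If any file path contains spaces, a NUL-separated variant is safer (this assumes GNU grep's -Z/--null option and GNU xargs):
$ grep -RlIZ --exclude-dir={node_modules,install,build} "chaine1" /projets/ | xargs -0 sed -i 's/chaine1/chaine2/'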
Edit: Thanks to @EdMorton's comment, I dug into find. My previous solution walks the files outside the excluded directories once with grep, and then processes the files containing the pattern string a second time with sed. However, we can first use find to filter files according to their path names, and then have sed process each file only once.
My find solution is almost the same as @knittl's, but with a bug fixed. Besides, I try to explain why it gets similar results to grep. (I still have not found how to skip binary files like the -I option of grep does.)
$ find \( \( -name node_modules -o -name install -o -name build \) -prune -type f \
-o -type f \) -exec echo {} +
or
find \( \( -name node_modules -o -name install -o -name build \) -prune \
-o -type f \) -type f -exec echo {} +
\( -name pat1 -o -name pat2 \) gives paths matching pat1 or pat2 (including both files and directories), where -o means logical OR. -prune ignores a directory and the files under it. Together they achieve a function similar to --exclude-dir in grep.
-type f gives paths of regular files.
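Putting the pieces together, the whole replacement can then be done in a single pass (a sketch with the same strings as above):
$ find . \( -name node_modules -o -name install -o -name build \) -prune \
 -o -type f -exec sed -i 's/chaine1/chaine2/g' {} +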

Can't find a file by pattern [duplicate]

I am having a hard time getting find to look for matches in the current directory as well as its subdirectories.
When I run find *test.c it only gives me the matches in the current directory. (does not look in subdirectories)
If I try find . -name *test.c I would expect the same results, but instead it gives me only matches that are in a subdirectory. When there are files that should match in the working directory, it gives me: find: paths must precede expression: mytest.c
What does this error mean, and how can I get the matches from both the current directory and its subdirectories?
Try putting it in quotes -- you're running into the shell's wildcard expansion, so what you're actually passing to find will look like:
find . -name bobtest.c cattest.c snowtest.c
...causing the syntax error. So try this instead:
find . -name '*test.c'
Note the single quotes around your file expression -- these will stop the shell (bash) expanding your wildcards.
What's happening is that the shell is expanding "*test.c" into a list of files. Try escaping the asterisk as:
find . -name \*test.c
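You can see what the shell actually passes to find by expanding the same pattern with echo first (assuming the directory contains the three files from the earlier example):
$ echo find . -name *test.c
find . -name bobtest.c cattest.c snowtest.c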
From find manual:
NON-BUGS
Operator precedence surprises
The command find . -name afile -o -name bfile -print will never print
afile because this is actually equivalent to find . -name afile -o \(
-name bfile -a -print \). Remember that the precedence of -a is
higher than that of -o and when there is no operator specified
between tests, -a is assumed.
“paths must precede expression” error message
$ find . -name *.c -print
find: paths must precede expression
Usage: find [-H] [-L] [-P] [-Olevel] [-D ... [path...] [expression]
This happens because *.c has been expanded by the shell resulting in
find actually receiving a command line like this:
find . -name frcode.c locate.c word_io.c -print
That command is of course not going to work. Instead of doing things
this way, you should enclose the pattern in quotes or escape the
wildcard:
$ find . -name '*.c' -print
$ find . -name \*.c -print
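To sidestep the precedence surprise described above, group the tests explicitly (a sketch):
$ find . \( -name afile -o -name bfile \) -print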
Try putting it in quotes:
find . -name '*test.c'
I see this question is already answered. I just want to share what worked for me: I was missing a space between \( and -name. So the correct way of choosing files while excluding some of them would be like below:
find . -name 'my-file-*' -type f -not \( -name 'my-file-1.2.0.jar' -or -name 'my-file.jar' \)
I came across this question when I was trying to find multiple filenames that I could not combine into a regular expression as described in @Chris J's answer. Here is what worked for me:
find . -name one.pdf -o -name two.txt -o -name anotherone.jpg
-o or -or is logical OR. See Finding Files on Gnu.org for more information.
I was running this on Cygwin.
You can try this:
cat $(file $( find . -readable) | grep ASCII | tr ":" " " | awk '{print $1}')
With that, you can find all readable ASCII files and read them with cat.
If you want to also specify the size and exclude executables:
cat $(file $( find . -readable ! -executable -size 1033c) | grep ASCII | tr ":" " " | awk '{print $1}')
In my case I was missing the trailing / in the path (and the pattern should be quoted so the shell doesn't expand it):
find /var/opt/gitlab/backups/ -name '*.tar'

Regular expression to not match two file types in Bash

I'm trying to do a list in bash of files that are not .html or .js.
I've tried both of the following methods, but neither works:
ls !(*.html|*.js)
ls | grep -v '\.(html|js)$'
There's yet another way to do it. bash has an option for extended glob patterns:
shopt -s extglob
ls !(*.html|*.js)
(Note that this is still a glob pattern, not a regular expression -- for example, * means "any string", not "zero or more of the preceding thing").
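A quick illustration, with three hypothetical files:
$ shopt -s extglob
$ ls
a.html b.js c.txt
$ echo !(*.html|*.js)
c.txt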
If your version of ls supports the -I flag:
ls -I '*.js' -I '*.html'
From the man page:
-I, --ignore=PATTERN
do not list implied entries matching shell PATTERN
Otherwise, use find:
find . -maxdepth 1 -type f ! \( -name "*.html" -o -name "*.js" \)
For formatting add:
-printf "%f\n"
If the filenames need to be piped, you only need to change the -printf action:
-printf '%f\0' | xargs -0 ...
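For example, to run a command on each matching file name (wc -c here is just a stand-in for your real command):
$ find . -maxdepth 1 -type f ! \( -name "*.html" -o -name "*.js" \) -printf '%f\0' | xargs -0 wc -c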
Use extended-regexp with the -E option:
ls | grep -E -v '\.(html|js)$'
The -I flag can filter ls output:
ls -I '*.html' -I '*.js'
or
ls | grep -v -e '\.html' -e '\.js'
From the man page:
-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern; useful to protect patterns beginning with -.

Piping find results into grep for fast directory exclusion

I am successfully using find to create a list of all files in the current subdirectory, excluding those in the subdirectory "cache." Here's my first bit of code:
find . -wholename './cach*' -prune -o -print
I now wish to pipe this into a grep command. It seems like that should be simple:
find . -wholename './cach*' -prune -o -print | xargs grep -r -R -i "samson"
... but this is returning results that are mostly from the cache directory. I've tried removing the xargs reference, but that does what you'd expect, running the grep on the text of the file names rather than on the files themselves. My goal is to find "samson" in any files that aren't cached content.
I'll probably get around this issue by just using doubled greps in this instance, but I'm very curious about why this one-liner behaves this way. I'd love to hear thoughts on a way to modify it while still using these two commands (as there are speed advantages to doing it this way).
(This is in CentOS 5, btw.)
The wholename match may be the reason why it's still including "cache" files. If you're executing the find command in the directory that contains the "cache" folder, it should work. If not, try changing it to -name '*cache*' instead.
Also, you do not need the -r or -R for your grep; those tell it to recurse through directories, but here you're testing individual files.
You can update your command using the piped version, or use a single command:
find . -name '*cache*' -prune -o -print0 | xargs -0 grep -il "samson"
or
find . -name '*cache*' -prune -o -exec grep -iq "samson" {} \; -print
Note, the -l in the first command tells grep to "list the file" and not the line(s) that match. The -q in the second does the same; it tells grep to respond quietly so find will then just print the filename.
You've told grep itself to recurse (twice; in GNU grep -r and -R differ only in how symbolic links are handled). Since one of the arguments you're passing is . (the top directory), grep is searching in every file (some of them twice, or even more if they're in subdirectories).
If you're going to use find and grep, do this:
find . -path './cach*' -prune -o -print0 | xargs -0 grep -i "samson"
Using -print0 and -0 makes your script work even with file names that contain spaces or punctuation characters.
However, you probably don't need to bother with find here, since GNU grep is capable of excluding directories:
grep -R --exclude-dir='cach*' -i "samson" .
(This also excludes ./deeply/nested/directory/cache. If you only want to exclude cache directories at the toplevel, use find as you did.)
Use the -exec option of find instead of piping the results to another command. From there you can use grep "samson" {} \; to look for samson in each file listed.
For example:
find . -wholename './cach*' -prune -o -exec grep "samson" "{}" +

How to list specific type of files in recursive directories in shell?

How can we find specific types of files, i.e. .doc and .pdf files, present in nested directories?
command I tried:
$ ls -R | grep .doc
but if there is a file named alok.doc.txt, the command will display that too, which is obviously not what I want. What command should I use instead?
If you are more comfortable with ls and grep, you can do what you want using a regular expression in the grep command (the ending $ character indicates that .doc must be at the end of the line; that will exclude file.doc.txt):
ls -R |grep "\.doc$"
More information about using grep with regular expressions can be found in the man page.
ls output is mainly intended to be read by humans. For advanced querying and automated processing, you should use the more powerful find command:
find /path -type f \( -iname "*.doc" -o -iname "*.pdf" \)
If you have bash 4.0+:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
for file in **/*.{pdf,doc}
do
echo "$file"
done
find . | grep "\.doc$"
This will show the path as well.
Some of the other methods that can be used:
echo *.{pdf,docx,jpeg}
stat -c %n * | grep 'pdf\|docx\|jpeg'
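One caveat with the echo approach: if a pattern matches nothing, the shell passes it through literally. Enabling nullglob avoids that (a sketch):
$ shopt -s nullglob
$ echo *.{pdf,docx,jpeg}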
We had a similar question. We wanted a list - with paths - of all the config files in the etc directory. This worked:
find /etc -type f \( -iname "*.conf" \)
It gives a nice list of all the .conf files with their paths. Output looks like:
/etc/conf/server.conf
But we wanted to DO something with ALL those files, like grep them to find a word or setting in all of them. So we use
find /etc -type f \( -iname "*.conf" \) -print0 | xargs -0 grep -Hi "ServerName"
to find, via grep, ALL the config files in /etc that contain a setting like "ServerName". Output looks like:
/etc/conf/server.conf: ServerName "default-118_11_170_172"
Hope you find it useful.
Sid
Similarly, if you prefer using the wildcard character * (not quite like the regex suggestions), you can just use ls with both the -l flag, to list one file per line (like grep), and the -R flag like you had. Then you can specify the files you want to search for with *.doc.
I.e. either
ls -l -R *.doc
or, if you want it to list the files on fewer lines,
ls -R *.doc
If you have files with extensions that don't match the file type, you could use the file utility.
find $PWD -type f -exec file -N \{\} \; | grep "PDF document" | awk -F: '{print $1}'
Instead of $PWD you can use the directory you want to start the search in. file even prints out the PDF version.
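A variant that avoids parsing file's free-text description is to match on its --mime-type output instead (a sketch):
$ find . -type f -exec file --mime-type {} + | grep 'application/pdf$' | cut -d: -f1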
