Find a string in files in subdirectories - Linux

I want to find a string in files in subdirectories.
For example:
we are in bundle/,
and bundle/ contains multiple subdirectories and multiple txt files.
I want to do something like
find . -type f -exec grep "\<F8\>" {} \;
to get the files that contain the string <F8>.
This command does work and finds the string, but it never prints the filename.
I hope someone can give me a better solution, one that displays the filename along with the line containing the string.

grep -rl '<F8>' .
The -r option tells grep to search recursively through directories starting at .
The -l option tells it to show you just the filename that's matched, not the line itself.
Your output will look something like this:
./thisfile
./foo/bar/thatfile
If you want to limit this to only one file, append | head -1 to the end of the line.
If you want output like:
./thisfile:My text contains the <F8> string
./foo/bar/thatfile:didn't remember to press the <F8> key on the
then you can just leave off the -l option. Note that this output is not safe to parse, as filenames may contain colons, and colons in filenames are not escaped in grep's output.
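A quick throwaway sketch of the two variants above (the directory layout and file contents here are invented to mirror the example output):

```shell
#!/bin/sh
# Build a small tree with the target string in two files.
mkdir -p demo/foo/bar
printf 'My text contains the <F8> string\n' > demo/thisfile
printf "didn't remember to press the <F8> key\n" > demo/foo/bar/thatfile
printf 'no match here\n' > demo/foo/other

# -r: recurse, -l: print only the names of matching files.
grep -rl '<F8>' demo

# Drop -l to get filename:line output instead.
grep -r '<F8>' demo
```

The first command prints just the two matching paths; the second prints each path followed by a colon and the matching line.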

You can use grep by itself.
grep -r '<F8>' .

This should list out all the files and line numbers that match:
grep -nR '<F8>' *

Personally, I find it easier to just use ack. (ack-grep is the name used in Ubuntu's repos to avoid a clash with another package of the same name.) It's available in most major repositories.
The command would be ack -a "<F8>" (or ack-grep -a "<F8>" in Ubuntu). The -a option tells it to search all file types.
Example:
testfile
Line 1
Line 2
Line 3
<F8>
<F9>
<F10>
Line 4
Line 5
Line 6
Output:
$ ack -a "<F8>"
testfile
4:<F8>

Related

List each file that doesn't match a pattern recursively

I tried the following command; it lists all the lines, including file names, that do not match the given pattern:
grep -nrv "^type.* = .*"
But what I need is the list of file names in a folder whose content does not contain even a single occurrence of the above pattern.
Your help will be really appreciated.
You need the -L option:
grep -rL '^type.* = .*' directory_name
From the GNU grep manual:
-L, --files-without-match
    Suppress normal output; instead print the name of each input file from which no output would normally have been printed. The scanning will stop on the first match.
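As a small sketch of the difference between -L and -l (file names and contents invented):

```shell
#!/bin/sh
mkdir -p demo_L
printf 'type_a = 1\n'        > demo_L/has_match.txt
printf 'nothing relevant\n'  > demo_L/no_match.txt

# -L: files with NO line matching the pattern.
grep -rL '^type.* = .*' demo_L

# -l: files with at least one matching line.
grep -rl '^type.* = .*' demo_L
```

The first command prints demo_L/no_match.txt, the second demo_L/has_match.txt.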

grep recursively for a specific file type on Linux

Can we search a term (eg. "onblur") recursively in some folders only in specific files (html files)?
grep -Rin "onblur" *.html
This returns nothing. But,
grep -Rin "onblur" .
returns "onblur" search result from all available files, like in text(".txt"), .mako, .jinja etc.
Also this might help you: grep certain file types recursively | commandlinefu.com.
The command is:
grep -r --include="*.[ch]" pattern .
And in your case it is:
grep -r --include="*.html" "onblur" .
grep -r --include "*.html" onblur .
Got it from :
How do I grep recursively?
You might also like ag 'the silver searcher' -
ag --html onblur
it searches by regexp and is recursive in the current directory by default, and has predefined sets of extensions to search - in this case --html maps to .htm, .html, .shtml, .xhtml. Also ignores binary files, prints filenames, line numbers, and colorizes output by default.
Some options -
-Q --literal
Do not parse PATTERN as a regular expression. Try to match it literally.
-S --smart-case
Match case-sensitively if there are any uppercase letters in PATTERN,
case-insensitively otherwise. Enabled by default.
-t --all-text
Search all text files. This doesn't include hidden files.
--hidden
Search hidden files. This option obeys ignored files.
For the list of supported filetypes run ag --list-file-types.
The only thing it seems to lack is being able to specify a filetype with an extension, in which case you need to fall back on grep with --include.
To be able to grep only from .py files by typing grepy mystring I added the following line to my bashrc:
alias grepy='grep -r --include="*.py"'
Also note that grep accepts the following:
grep mystring *.html
for .html files in the current folder only, and
grep mystring */*.html
for .html files exactly one directory level down (note this excludes files in the current dir).
Shell globs like these are not truly recursive: they only descend as many levels as you spell out. For arbitrary depth you still need grep -r with --include.
Have a look at this answer instead, to a similar question: grep, but only certain file extensions
This worked for me. In your case just type the following:
grep -inr "onblur" --include \*.html ./
consider that
grep: command
-r: recursively
-i: ignore-case
-n: each output line is preceded by its relative line number in the file
--include \*.html: the backslash escapes the glob so the shell doesn't expand it; grep receives the literal pattern *.html and matches it against file names itself
./: start at current directory.
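A throwaway sketch of the --include behaviour described above (the directory names and file contents are invented):

```shell
#!/bin/sh
mkdir -p site/sub
printf '%s\n' '<input onblur="check()">' > site/form.html
printf '%s\n' '<input onblur="check()">' > site/sub/page.html
printf 'onblur mentioned in notes\n'     > site/notes.txt

# Only *.html files are searched; site/notes.txt is skipped
# even though it contains the word.
grep -rin "onblur" --include '*.html' site
```

Both .html files are reported (at any depth), while the .txt file never appears in the output.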

Listing entries in a directory using grep

I'm trying to list all entries in a directory whose names contain ONLY upper-case letters. Directories need "/" appended.
#!/bin/bash
cd ~/testfiles/
ls | grep -r *.*
Since grep by default looks for upper-case letters only (right?), I'm just recursively searching through the directories under testfiles for all names who contain only upper-case letters.
Unfortunately this doesn't work.
As for appending directories, I'm not sure why I need to do this. Does anyone know where I can start with some detailed explanations on what I can do with grep? Furthermore how to tackle my problem?
No, grep does not only consider uppercase letters.
Your question is a bit unclear. For example:
from your usage of the -r option, it seems you want to search recursively, but you don't say so. For simplicity I assume you don't need to; consider looking into #twm's answer if you need recursion.
you want to look for uppercase letters only. Does that mean you don't want to accept any other (non-letter) characters that are still valid in file names (like digits, dashes, dots, etc.)?
since you don't say that it is not permissible to have only one file per line, I am assuming that is OK (thus using ls -1).
The naive solution would be:
ls -1 | grep "^[[:upper:]]\+$"
That is, print all lines containing only uppercase letters. In my TEMP directory that prints, for example:
ALLBIG
LCFEM
WPDNSE
This however would exclude files like README.TXT or FILE001, which depending on your requirements (see above) should most likely be included.
Thus, a better solution would be:
ls -1 | grep -v "[[:lower:]]\+"
That is, print all lines not containing a lowercase letter. In my TEMP directory that prints for example:
ALLBIG
ALLBIG-01.TXT
ALLBIG005.TXT
CRX_75DAF8CB7768
LCFEM
WPDNSE
~DFA0214428CD719AF6.TMP
Finally, to "properly mark" directories with a trailing '/', you could use the -F (or --classify) option.
ls -1F | grep -v "[[:lower:]]\+"
Again, example output:
ALLBIG
ALLBIG-01.TXT
ALLBIG005.TXT
CRX_75DAF8CB7768
LCFEM/
WPDNSE/
~DFA0214428CD719AF6.TMP
Note that a different option would be to use find (e.g. find ! -regex ".*[a-z].*"), if you can live with its different output format.
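A throwaway sketch of the ls -1F | grep -v approach above (the file names here are invented):

```shell
#!/bin/sh
# A directory, an all-caps file, a mixed-case file, and a digits file.
mkdir -p caps_demo/LCFEM
touch caps_demo/ALLBIG caps_demo/README.TXT \
      caps_demo/lowercase.txt caps_demo/FILE001

# Keep entries with no lowercase letters; -F appends / to directories.
ls -1F caps_demo | grep -v "[[:lower:]]"
```

This prints ALLBIG, FILE001, LCFEM/ and README.TXT, and drops lowercase.txt, matching the behaviour described in the answer.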
The exact regular expression depend on the output format of your ls command. Assuming that you do not use an alias for ls, you can try this:
ls -R | grep -o -w "[A-Z]*"
note that with -R, ls will recursively list directories and files under the current directory. The grep option -o tells grep to print only the matched part of the text. The -w option tells grep to match only whole words. The "[A-Z]*" is a regexp that keeps only upper-case words.
Note that this regexp will print TEST.txt as well as TEXT.TXT. In other words, it will only consider names that are formed by letters.
It's ls which lists the files, not grep, so that is where you need to specify that you want "/" appended to directories. Use ls --classify to append "/" to directories.
grep is used to process the results from ls (or some other source, generally speaking) and only show lines that match the pattern you specify. It is not limited to uppercase characters. You can limit it to just upper-case characters and "/" with grep -E '^[A-Z/]*$', or, if you also want numbers, periods, etc., you could instead filter out lines that contain lowercase characters with grep -v -E '[a-z]'.
As grep is not the program which lists the files, it is not where you want to perform the recursion. ls can list paths recursively if you use ls -R. However, you're just going to get the last component of the file paths that way.
You might want to consider using find to handle the recursion. This works for me:
find . -exec ls -d --classify {} \; | egrep -v '[a-z][^/]*/?$'
I should note, using ls --classify to append "/" to the end of directories may also append some other characters to other types of paths that it can classify. For instance, it may append "*" to the end of executable files. If that's not OK, but you're OK with listing directories and other paths separately, this could be worked around by running find twice - once for the directories and then again for other paths. This works for me:
find . -type d | egrep -v '[a-z][^/]*$' | sed -e 's#$#/#'
find . -not -type d | egrep -v '[a-z][^/]*$'

How to find dos format files in a linux file system

I would like to find out which of my files in a directory are dos text files (as opposed to unix text files).
What I've tried:
find . -name "*.php" | xargs grep ^M -l
It's not giving me reliable results... so I'm looking for a better alternative.
Any suggestions, ideas?
Thanks
Clarification
In addition to what I've said above, the problem is that i have a bunch of dos files with no ^M characters in them (hence my note about reliability).
The way i currently determine whether a file is dos or not is through Vim, where at the bottom it says:
"filename.php" [dos] [noeol]
How about:
find . -name "*.php" | xargs file | grep "CRLF"
I don't think it's reliable to use ^M to find the files.
Not sure what you mean exactly by "not reliable" but you may want to try:
find . -name '*.php' -print0 | xargs -0 grep -l '^M$'
This uses the more atrocious-filenames-with-spaces-in-them-friendly options and only finds carriage returns immediately before the end of line.
Keep in mind that the ^M is a single carriage-return character (typed as Ctrl-V Ctrl-M), not two characters.
And also that it'll list files where even one line is in DOS mode, which is probably what you want anyway since those would have been UNIX files mangled by a non-UNIX editor.
Based on your update that vim is reporting your files as DOS format:
If vim is reporting it as DOS format, then every line ends with CRLF. That's the way vim works. If even one line doesn't have CR, then it's considered UNIX format and the ^M characters are visible in the buffer. If it's all DOS format, the ^M characters are not displayed:
Vim will look for both dos and unix line endings, but Vim has a built-in preference for the unix format.
- If all lines in the file end with CRLF, the dos file format will be applied, meaning that each CRLF is removed when reading the lines into a buffer, and the buffer 'ff' option will be dos.
- If one or more lines end with LF only, the unix file format will be applied, meaning that each LF is removed (but each CR will be present in the buffer, and will display as ^M), and the buffer 'ff' option will be unix.
If you really want to know what's in the file, don't rely on a too-smart tool like vim :-)
Use:
od -xcb input_file_name | less
and check the line endings yourself.
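As a minimal sketch of inspecting endings by hand (using od -c, which every coreutils install has, rather than the od -xcb variant above; the file name is invented):

```shell
#!/bin/sh
# One DOS line, one Unix line.
printf 'dos line\r\nunix line\n' > mixed.txt

# od -c renders control characters symbolically:
# a CRLF ending shows up as "\r  \n", a plain LF as just "\n".
od -c mixed.txt
```

In the dump, any line ending in \r \n came from the DOS side; a bare \n is Unix.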
I had good luck with
find . -name "*.php" -exec grep -Pl "\r" {} \;
This is much like your original solution; therefore, it's possibly more easy for you to remember:
find . -name "*.php" | xargs grep "\r" -l
Thought process:
In Vim, to remove the ^M you type:
:%s/^M//g
where the ^M is entered by pressing Ctrl-V and then Enter (or Ctrl-M). But I could never remember the keys to type to produce that sequence, so I've always removed them using:
:%s/\r//g
So my deduction is that \r and ^M are equivalent, with the former being easier to remember to type.
If your dos2unix command has the -i option, you can use that feature to find files in a directory that have DOS line breaks.
$ man dos2unix
.
.
.
-i[FLAGS], --info[=FLAGS] FILE ...
Display file information. No conversion is done.
The following information is printed, in this order:
number of DOS line breaks,
number of Unix line breaks,
number of Mac line breaks,
byte order mark,
text or binary, file name.
.
.
.
Optionally extra flags can be set to change the (-i) output.
.
.
.
c Print only the files that would be converted.
The following one-liner script reads:
find all files in this directory tree,
run dos2unix on all files to determine the files to be changed,
run dos2unix on files to be changed
$ find . -type f | xargs -d '\n' dos2unix -ic | xargs -d '\n' dos2unix
I've been using cat -e to see what line endings files have.
Using ^M as a single Ctrl-M character didn't really work out for me (it behaved as if I had just pressed Return, without actually inserting the non-printable ^M character — tested by piping the typed Ctrl-V Ctrl-M keystroke through cat -e), so what I ended up doing will probably seem like too much, but it did the job nevertheless:
grep '$' *.php | cat -e | grep '\^M\$' | sed 's/:.*//' | uniq
where
the first grep just prepends filenames to each line of each file (can be replaced with awk '{print FILENAME, $0}', but grep worked faster on my set of files);
cat -e explicitly prints non-printable line endings;
the second grep finds lines ending with ^M$, and ^M are two characters;
the sed part keeps only the file names (can be replaced with cut -d ':' -f 1);
uniq just keeps each file name once.
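The pipeline above can be exercised on a pair of throwaway files (names invented) to see each stage do its job:

```shell
#!/bin/sh
printf 'dos\r\n'  > d.php   # CRLF ending
printf 'unix\n'   > u.php   # LF ending

# grep '$' matches every line and prefixes the file name;
# cat -e renders the CR as ^M and marks line ends with $;
# the second grep keeps lines ending in the two literal
# characters ^M before the $ marker; sed/uniq recover the names.
grep '$' d.php u.php | cat -e | grep '\^M\$' | sed 's/:.*//' | uniq
```

Only d.php survives the pipeline, since only its line ends in a carriage return.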
GNU find
find . -type f -iname "*.php" -exec file "{}" + | grep CRLF
I don't know what you want to do after you find those DOS php files, but if you want to convert them to unix format, then
find . -type f -iname "*.php" -exec dos2unix "{}" +
will suffice. There's no need to specifically check whether they are DOS files or not.
If you prefer vim to tell you which files are in this format you can use the following script:
"use this script to check which files are in dos format according to vim
"use: in the folder that you want to check
"create a file, say res.txt
"> vim -u NONE --noplugins res.txt
"> in vim: source this_script.vim
python << EOF
import os
import vim

cur_buf = vim.current.buffer
IGNORE_START = ''.split()
IGNORE_END = '.pyc .swp .png ~'.split()
IGNORE_DIRS = '.hg .git dd_ .bzr'.split()
for dirpath, dirnames, fnames in os.walk(os.curdir):
    # iterate over a copy, since we prune dirnames in place
    for dirn in list(dirnames):
        for diri in IGNORE_DIRS:
            if dirn.endswith(diri):
                dirnames.remove(dirn)
                break
    for fname in fnames:
        skip = False
        for fstart in IGNORE_START:
            if fname.startswith(fstart):
                skip = True
        for fend in IGNORE_END:
            if fname.endswith(fend):
                skip = True
        if skip:
            continue
        fname = os.path.join(dirpath, fname)
        vim.command('view {}'.format(fname))
        curr_ff = vim.eval('&ff')
        if vim.current.buffer != cur_buf:
            vim.command('bw!')
        if curr_ff == 'dos':
            cur_buf.append('{} {}'.format(curr_ff, fname))
EOF
your vim needs to be compiled with Python support (Python is used to loop over the files in the folder; there is probably an easier way of doing this, but I don't really know it).

How do you search for files containing DOS line endings (CRLF) with grep on Linux?

I want to search for files containing DOS line endings with grep on Linux. Something like this:
grep -IUr --color '\r\n' .
The above seems to match for literal rn which is not what is desired.
The output of this will be piped through xargs into fromdos to convert CRLF to LF, like this:
grep -IUrl --color '^M' . | xargs -ifile fromdos 'file'
grep probably isn't the tool you want for this. It will print a line for every matching line in every file. Unless you want to, say, run fromdos 10 times on a 10-line file, grep isn't the best way to go about it. Using find to run file on every file in the tree, then grepping through that for "CRLF", will get you one line of output for each file that has DOS-style line endings:
find . -not -type d -exec file "{}" ";" | grep CRLF
will get you something like:
./1/dos1.txt: ASCII text, with CRLF line terminators
./2/dos2.txt: ASCII text, with CRLF line terminators
./dos.txt: ASCII text, with CRLF line terminators
Use Ctrl+V, Ctrl+M to enter a literal Carriage Return character into your grep string. So:
grep -IUr --color "^M"
will work - if the ^M there is a literal CR that you input as I suggested.
If you want the list of files, you want to add the -l option as well.
Explanation
-I ignore binary files
-U prevents grep from stripping CR characters. By default it does this if it decides it's a text file.
-r read all files under each directory recursively.
Using RipGrep (depending on your shell, you might need to quote the last argument):
rg -l \r
-l, --files-with-matches
Only print the paths with at least one match.
https://github.com/BurntSushi/ripgrep
If your version of grep supports -P (--perl-regexp) option, then
grep -lUP '\r$'
could be used.
# list files containing dos line endings (CRLF)
cr="$(printf "\r")" # alternative to ctrl-V ctrl-M
grep -Ilsr "${cr}$" .
grep -Ilsr $'\r$' . # yet another & even shorter alternative
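The printf-based variant above can be checked end-to-end on a pair of throwaway files (names invented); this sticks to the POSIX form, since the $'\r$' spelling needs bash's ANSI-C quoting:

```shell
#!/bin/sh
printf 'has crlf\r\n' > crlf.txt
printf 'plain lf\n'   > lf.txt

cr="$(printf '\r')"    # capture a literal carriage return in a variable
# -I skip binary files, -l print names only, -s suppress error messages
grep -Ils "${cr}\$" crlf.txt lf.txt
```

Only crlf.txt is listed, since only its lines end in a carriage return before the newline.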
dos2unix has a file information option which can be used to show the files that would be converted:
dos2unix -ic /path/to/file
To do that recursively you can use bash’s globstar option, which for the current shell is enabled with shopt -s globstar:
dos2unix -ic ** # all files recursively
dos2unix -ic **/file # files called “file” recursively
Alternatively you can use find for that:
find -type f -exec dos2unix -ic {} + # all files recursively (ignoring directories)
find -name file -exec dos2unix -ic {} + # files called “file” recursively
You can use file command in unix. It gives you the character encoding of the file along with line terminators.
$ file myfile
myfile: ISO-8859 text, with CRLF line terminators
$ file myfile | grep -ow CRLF
CRLF
The query was about searching... I have a similar issue: somebody committed mixed line
endings into version control, so now we have a bunch of files with 0x0d
0x0d 0x0a line endings. Note that
grep -P '\x0d\x0a'
finds all lines, whereas
grep -P '\x0d\x0d\x0a'
and
grep -P '\x0d\x0d'
finds no lines so there may be something "else" going on inside grep
when it comes to line ending patterns... unfortunately for me!
If, like me, your minimalist unix doesn't include niceties like the file command, and backslashes in your grep expressions just don't cooperate, try this:
$ for file in `find . -type f` ; do
> dump $file | cut -c9-50 | egrep -m1 -q ' 0d| 0d'
> if [ $? -eq 0 ] ; then echo $file ; fi
> done
Modifications you may want to make to the above include:
tweak the find command to locate only the files you want to scan
change the dump command to od or whatever file dump utility you have
confirm that the cut command includes both a leading and trailing space as well as just the hexadecimal character output from the dump utility
limit the dump output to the first 1000 characters or so for efficiency
For example, something like this may work for you using od instead of dump:
od -t x2 -N 1000 $file | cut -c8- | egrep -m1 -q ' 0d| 0d|0d$'
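Putting the loop and the od variant together, a minimal self-contained sketch (directory and file names invented; uses od -t x1 so each byte is a separate hex pair):

```shell
#!/bin/sh
mkdir -p odscan
printf 'dos\r\n'  > odscan/dos.txt
printf 'unix\n'   > odscan/unix.txt

# Dump the first 1000 bytes of each file as hex and
# report any file containing a 0d (carriage return) byte.
for f in odscan/*; do
    if od -An -t x1 -N 1000 "$f" | grep -q ' 0d'; then
        echo "$f"
    fi
done
```

Only odscan/dos.txt is printed; the space-separated single-byte dump means ' 0d' can only match an actual CR byte.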
