How to use grep to reverse search files in a folder

How to use grep to reverse search files in a folder - linux

I'm trying to create a script which will find missing topics from multiple log files. These logfiles are filled top down, so the newest logs are at the bottom of the file. I would like to grep only the last line from this file which includes UNKNOWN_TOPIC_OR_PARTITION. This should be done in multiple files with completely different names. Is grep the best solution or is there another solution that suits my needs. I already tried adding tail, but that doesn't seem to work.
missingTopics=$(grep -Ri -m1 --exclude=*.{1,2,3,4,5} UNKNOWN_TOPIC_OR_PARTITION /app/tibco/log/tra/domain/)

You could try a combination of find, tac and grep:
find /app/tibco/log/tra/domain -type f ! -name '*.[1-5]' -exec sh -c \
'tac "$1" | grep -im1 UNKNOWN_TOPIC_OR_PARTITION' "sh" '{}' \;
tac prints files in reverse, the -exec sh -c SCRIPT "sh" '{}' \; action of find executes the shell SCRIPT each time a file matching the previous tests is found. The SCRIPT is executed with "sh" as parameter $0 and the path of the found file as parameter $1.
If performance is an issue you can probably improve it with:
find . -type f ! -name '*.[1-5]' -exec sh -c 'for f in "$#"; do \
tac "$f" | grep -im1 UNKNOWN_TOPIC_OR_PARTITION; done' "sh" '{}' +
which will spawn less shells. If security is also an issue you can also replace -exec by -execdir (even if with this SCRIPT I do not immediately see any exploit).

Related

Variable causing issue while doing the test command in unix:: find $PWD -type d -exec sh -c 'test "{}" ">" "$PWD/$VersionFolders"' ; -print|wc -l`

Variable causing issue while doing the test command in unix Command is::
find $PWD -type d -exec sh -c 'test "{}" ">" "$PWD/$VersionFolders"' \; -print|wc -l`
Input Values-
Here $PWD- Current Directory
b1_v.1.0
b1_v.1.2
b1_v.1.3
b1_v.1.4
Given Version folder as $VersionFolders
b1_v.1.2
The Command should check if any folders exist in current directory which is greater than the give version folder and it should count or display.
This approach has to be consider with out date or time created of folders.
Expected Output-
b1_v.1.3
b1_v.1.4
If I give hard code Directories its working fine. But when I pass it like as variable.it give all folders.
working fine this commend-
find $PWD -type d -exec sh -c 'test "{}" ">" "$PWD/b1_v.1.2"' \; -print|wc -l`
Not working this command with variable-
find $PWD -type d -exec sh -c 'test "{}" ">" "$PWD/$VersionFolders"' ; -print|wc -l`

The variable $VersionFolders won't get expanded inside the single quotes, and apparently you did not export it to make it visible to subprocesses.
An obscure but common hack is to put it in $0 (which nominally should contain the name of the shell itself, and is the first argument after sh -c '...') because that keeps the code simple.
find . -type d \
-exec sh -c 'test "$1" ">" "$0"' \
"$VersionFolders" {} \; \
-print | wc -l
But as #chepner remarks, you can run -exec test {} ">" "$VersionFolders" \; directly.
The shell already does everything in the current directory, so you don't need to spell out $PWD. Perhaps see also What exactly is current working directory?

Moving files with a specific modification date; "find | xargs ls | grep | -exec" fails w/ "-exec: command not found"

Iam using centos 7
If I want to find files that have specific name and specific date then moving these files to another folder iam issuing the command
find -name 'fsimage*' | xargs ls -ali | grep 'Oct 20' | -exec mv {} /hdd/fordelete/ \;
with the following error
-bash: -exec: command not found xargs: ls: terminated by signal 13

As another answer already explains, -exec is an action for find, you can't use it as a shell command. On contrary, xargs and grep are commands, and you can't use them as find actions, just like you can't use pipe | inside find.
But more importantly, even though you could use ls and grep on find's result just to move files older than some amount of time, you shouldn't. Such pipeline is fragile and fails on many corner cases, like symlinks, files with newlines in name, etc.
Instead, use find. You'll find it quite powerful.
For example, to mv files modified more than 7 days ago, use the -mtime test:
find -name 'fsimage*' -mtime +7 -exec mv '{}' /some/dir/ \;
To mv files modified on a specific/reference date, e.g. 2017-10-20, you can use the -newerXY test:
find -name 'fsimage*' -newermt 2017-10-20 ! -newermt 2017-10-21 -exec mv '{}' /some/dir/ \;
Also, if your mv supports the -t option (to give target dir first, multiple files after), you can use {} + placeholder in find for multiple files, reducing the total number of mv command invocations (thanks #CharlesDuffy):
find -name 'fsimage*' -mtime +7 -exec mv -t /some/dir/ '{}' +

the -exec as you wrote it is quite meaningless, moreover it seems you are mixing find syntax with shell oe (-exec as you wrote it should be passed to find)
there are probably more concise ways of doing, but this should do what you expect:
find -name 'fsimage*' -type f | xargs ls -ali | grep 'Oct 20' | awk '{ print $NF }' | while read file; do mv "$file" /hdd/fordelete/ ; done
nevertheless, you should take care of not just copy/paste things you do not really understand from the web, you may wreck you system...

Find files in a dir, executing a command with execdir and redirecting

It seems like I am unable to find a direct answer to this question.
I appreciate your help.
I'm trying to find all files with a specific name in a directory, read the last 1000 lines of the file and copy it in to a new file in the same directory. As an example:
Find all files names xyz.log in the current directory, copy the last 1000 lines to file abc.log (which doesn't exist).
I tried to use the following command with no luck:
find . -name "xyz.log" -execdir tail -1000 {} > abc.log \;
The problem I'm having is that for all the files in the current directory, they all write to abc.log in the CURRENT directory and not in the directory where xyz.log resides. Clearly the find with execdir is first executed and then the output is redirected to abc.log.
Can you guys suggest a way to fix this? I appreciate any information/help.
EDIT- I tried find . -name "xyz.log" -execdir sh -c "tail -1000 {} > abc.log" \; as suggested by some of the friends, but it gives me this error: sh: ./tail: No such file or directory error message. Do you guys have any idea what the problem is?
Luckily the solution to use -printf is working fine.

The simplest way is this:
find . -name "xyz.log" -execdir sh -c 'tail -1000 "{}" >abc.log' \;
A more flexible alternative is to first print out the commands and then execute them all with sh:
find . -name "xyz.log" -printf 'tail -1000 "%p" >"%h/abc.log"\n' | sh
You can remove the | sh from the end when you're trying it out/debugging.
There is a bug in some versions of findutils (4.2 and 4.3, though it was fixed in some 4.2.x and 4.3.x versions) that cause execdir arguments that contain {} to be prefixed with ./ (instead of the prefix being applied only to {} it is applied to the whole quoted string). To work around this you can use:
find . -name "xyz.log" -execdir sh -c 'tail -1000 "$1" >abc.log' sh {} \;
sh -c 'script' arg0 arg1 runs the sh script with arg0, arg1, etc. passed to it. By convention, arg0 is the name of the executable (here, "sh"). From the script you can access the arguments using $0 (corresponding to "sh"), $1 (corresponding to find's expansion of {}), etc.

The redirect isn't passed into execdir, so abc.log shows up in the directory you run the command in. -execdir also doesn't like embedded redirects. but you can workaround the problem by passing -execdir a shell command with a redirect embedded, like this:
find . -name "xyz.log" -execdir sh -c '/usr/bin/tail -1000 {} > abc.log' \;
Much credit to this blog post (not mine):
http://www.microhowto.info/howto/act_on_all_files_in_a_directory_tree_using_find.html
Edit
I put the full path to tail in the command (assuming it's in /usr/bin on your system), since sh may load a .profile with a PATH that differs from your current shell.

Here's another non-find (well, sorta - it still uses find but doesn't try to shoehorn find into doing the whole thing):
while read f
do
d=$(dirname "${f}")
tail -n 1000 "${f}" > "${d}/abc.log"
done < <(find . -type f -name xyz.log -print)

Find multiple files and rename them in Linux

I am having files like a_dbg.txt, b_dbg.txt ... in a Suse 10 system. I want to write a bash shell script which should rename these files by removing "_dbg" from them.
Google suggested me to use rename command. So I executed the command rename _dbg.txt .txt *dbg* on the CURRENT_FOLDER
My actual CURRENT_FOLDER contains the below files.
CURRENT_FOLDER/a_dbg.txt
CURRENT_FOLDER/b_dbg.txt
CURRENT_FOLDER/XX/c_dbg.txt
CURRENT_FOLDER/YY/d_dbg.txt
After executing the rename command,
CURRENT_FOLDER/a.txt
CURRENT_FOLDER/b.txt
CURRENT_FOLDER/XX/c_dbg.txt
CURRENT_FOLDER/YY/d_dbg.txt
Its not doing recursively, how to make this command to rename files in all subdirectories. Like XX and YY I will be having so many subdirectories which name is unpredictable. And also my CURRENT_FOLDER will be having some other files also.

You can use find to find all matching files recursively:
find . -iname "*dbg*" -exec rename _dbg.txt .txt '{}' \;
EDIT: what the '{}' and \; are?
The -exec argument makes find execute rename for every matching file found. '{}' will be replaced with the path name of the file. The last token, \; is there only to mark the end of the exec expression.
All that is described nicely in the man page for find:
-exec utility [argument ...] ;
True if the program named utility returns a zero value as its
exit status. Optional arguments may be passed to the utility.
The expression must be terminated by a semicolon (``;''). If you
invoke find from a shell you may need to quote the semicolon if
the shell would otherwise treat it as a control operator. If the
string ``{}'' appears anywhere in the utility name or the argu-
ments it is replaced by the pathname of the current file.
Utility will be executed from the directory from which find was
executed. Utility and arguments are not subject to the further
expansion of shell patterns and constructs.

For renaming recursively I use the following commands:
find -iname \*.* | rename -v "s/ /-/g"

small script i wrote to replace all files with .txt extension to .cpp extension under /tmp and sub directories recursively
#!/bin/bash
for file in $(find /tmp -name '*.txt')
do
mv $file $(echo "$file" | sed -r 's|.txt|.cpp|g')
done

with bash:
shopt -s globstar nullglob
rename _dbg.txt .txt **/*dbg*

find -execdir rename also works for non-suffix replacements on basenames
https://stackoverflow.com/a/16541670/895245 works directly only for suffixes, but this will work for arbitrary regex replacements on basenames:
PATH=/usr/bin find . -depth -execdir rename 's/_dbg.txt$/_.txt' '{}' \;
or to affect files only:
PATH=/usr/bin find . -type f -execdir rename 's/_dbg.txt$/_.txt' '{}' \;
-execdir first cds into the directory before executing only on the basename.
Tested on Ubuntu 20.04, find 4.7.0, rename 1.10.
Convenient and safer helper for it
find-rename-regex() (
set -eu
find_and_replace="$1"
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -execdir rename "${2:--n}" "s/${find_and_replace}" '{}' \;
)
GitHub upstream.
Sample usage to replace spaces ' ' with hyphens '-'.
Dry run that shows what would be renamed to what without actually doing it:
find-rename-regex ' /-/g'
Do the replace:
find-rename-regex ' /-/g' -v
Command explanation
The awesome -execdir option does a cd into the directory before executing the rename command, unlike -exec.
-depth ensure that the renaming happens first on children, and then on parents, to prevent potential problems with missing parent directories.
-execdir is required because rename does not play well with non-basename input paths, e.g. the following fails:
rename 's/findme/replaceme/g' acc/acc
The PATH hacking is required because -execdir has one very annoying drawback: find is extremely opinionated and refuses to do anything with -execdir if you have any relative paths in your PATH environment variable, e.g. ./node_modules/.bin, failing with:
find: The relative path ‘./node_modules/.bin’ is included in the PATH environment variable, which is insecure in combination with the -execdir action of find. Please remove that entry from $PATH
See also: https://askubuntu.com/questions/621132/why-using-the-execdir-action-is-insecure-for-directory-which-is-in-the-path/1109378#1109378
-execdir is a GNU find extension to POSIX. rename is Perl based and comes from the rename package.
Rename lookahead workaround
If your input paths don't come from find, or if you've had enough of the relative path annoyance, we can use some Perl lookahead to safely rename directories as in:
git ls-files | sort -r | xargs rename 's/findme(?!.*\/)\/?$/replaceme/g' '{}'
I haven't found a convenient analogue for -execdir with xargs: https://superuser.com/questions/893890/xargs-change-working-directory-to-file-path-before-executing/915686
The sort -r is required to ensure that files come after their respective directories, since longer paths come after shorter ones with the same prefix.
Tested in Ubuntu 18.10.

Script above can be written in one line:
find /tmp -name "*.txt" -exec bash -c 'mv $0 $(echo "$0" | sed -r \"s|.txt|.cpp|g\")' '{}' \;

If you just want to rename and don't mind using an external tool, then you can use rnm. The command would be:
#on current folder
rnm -dp -1 -fo -ssf '_dbg' -rs '/_dbg//' *
-dp -1 will make it recursive to all subdirectories.
-fo implies file only mode.
-ssf '_dbg' searches for files with _dbg in the filename.
-rs '/_dbg//' replaces _dbg with empty string.
You can run the above command with the path of the CURRENT_FOLDER too:
rnm -dp -1 -fo -ssf '_dbg' -rs '/_dbg//' /path/to/the/directory

You can use this below.
rename --no-act 's/\.html$/\.php/' *.html */*.html

This command worked for me. Remember first to install the perl rename package:
find -iname \*.* | grep oldname | rename -v "s/oldname/newname/g

To expand on the excellent answer #CiroSantilliПутлерКапут六四事 : do not match files in the find that we don't have to rename.
I have found this to improve performance significantly on Cygwin.
Please feel free to correct my ineffective bash coding.
FIND_STRING="ZZZZ"
REPLACE_STRING="YYYY"
FIND_PARAMS="-type d"
find-rename-regex() (
set -eu
find_and_replace="${1}/${2}/g"
echo "${find_and_replace}"
find_params="${3}"
mode="${4}"
if [ "${mode}" = 'real' ]; then
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -name "*${1}*" ${find_params} -execdir rename -v "s/${find_and_replace}" '{}' \;
elif [ "${mode}" = 'dryrun' ]; then
echo "${mode}"
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -name "*${1}*" ${find_params} -execdir rename -n "s/${find_and_replace}" '{}' \;
fi
)
find-rename-regex "${FIND_STRING}" "${REPLACE_STRING}" "${FIND_PARAMS}" "dryrun"
# find-rename-regex "${FIND_STRING}" "${REPLACE_STRING}" "${FIND_PARAMS}" "real"

In case anyone is comfortable with fd and rnr, the command is:
fd -t f -x rnr '_dbg.txt' '.txt'
rnr only command is:
rnr -f -r '_dbg.txt' '.txt' *
rnr has the benefit of being able to undo the command.

On Ubuntu (after installing rename), this simpler solution worked the best for me. This replaces space with underscore, but can be modified as needed.
find . -depth | rename -d -v -n "s/ /_/g"
The -depth flag is telling find to traverse the depth of a directory first, which is good because I want to rename the leaf nodes first.
The -d flag on rename tells it to only rename the filename component of the path. I don't know how general the behavior is but on my installation (Ubuntu 20.04), it could be the file or the directory as long as it is the leaf node of the path.
I recommend the -n (no action) flag first along with -v, so you can see what would get renamed and how.
Using the two flags together, it renames all the files in a directory first and then the directory itself. Working backwards. Which is exactly what I needed.

classic solution:
for f in $(find . -name "*dbg*"); do mv $f $(echo $f | sed 's/_dbg//'); done

Find and basename not playing nicely

I want to echo out the filename portion of a find on the linux commandline. I've tried to use the following:
find www/*.html -type f -exec sh -c "echo $(basename {})" \;
and
find www/*.html -type f -exec sh -c "echo `basename {}`" \;
and a whole host of other combinations of escaping and quoting various parts of the text. The result is that the path isn't stripped:
www/channel.html
www/definition.html
www/empty.html
www/index.html
www/privacypolicy.html
Why not?
Update: While I have a working solution below, I'm still interested in why "basename" doesn't do what it should do.

The trouble with your original attempt:
find www/*.html -type f -exec sh -c "echo $(basename {})" \;
is that the $(basename {}) code is executed once, before the find command is executed. The output of the single basename is {} since that is the basename of {} as a filename. So, the command that is executed by find is:
sh -c "echo {}"
for each file found, but find actually substitutes the original (unmodified) file name each time because the {} characters appear in the string to be executed.
If you wanted it to work, you could use single quotes instead of double quotes:
find www/*.html -type f -exec sh -c 'echo $(basename {})' \;
However, making echo repeat to standard output what basename would have written to standard output anyway is a little pointless:
find www/*.html -type f -exec sh -c 'basename {}' \;
and we can reduce that still further, of course, to:
find www/*.html -type f -exec basename {} \;
Could you also explain the difference between single quotes and double quotes here?
This is routine shell behaviour. Let's take a slightly different command (but only slightly — the names of the files could be anywhere under the www directory, not just one level down), and look at the single-quote (SQ) and double-quote (DQ) versions of the command:
find www -name '*.html' -type f -exec sh -c "echo $(basename {})" \; # DQ
find www -name '*.html' -type f -exec sh -c 'echo $(basename {})' \; # SQ
The single quotes pass the material enclosed direct to the command. Thus, in the SQ command line, the shell that launches find removes the enclosing quotes and the find command sees its $9 argument as:
echo $(basename {})
because the shell removes the quotes. By comparison, the material in the double quotes is processed by the shell. Thus, in the DQ command line, the shell (that launches find — not the one launched by find) sees the $(basename {}) part of the string and executes it, getting back {}, so the string it passes to find as its $9 argument is:
echo {}
Now, when find does its -exec action, in both cases it replaces the {} by the filename that it just found (for sake of argument, www/pics/index.html). Thus, you get two different commands being executed:
sh -c 'echo $(basename www/pics/index.html)' # SQ
sh -c "echo www/pics/index.html" # DQ
There's a (slight) notational cheat going on there — those are the equivalent commands that you'd type at the shell. The $2 of the shell that is launched actually has no quotes in it in either case — the launched shell does not see any quotes.
As you can see, the DQ command simply echoes the file name; the SQ command runs the basename command and captures its output, and then echoes the captured output. A little bit of reductionist thinking shows that the DQ command could be written as -print instead of using -exec, and the SQ command could be written as -exec basename {} \;.
If you're using GNU find, it supports the -printf action which can be followed by Format Directives such that running basename is unnecessary. However, that is only available in GNU find; the rest of the discussion here applies to any version of find you're likely to encounter.

Try this instead :
find www/*.html -type f -printf '%f\n'
If you want to do it with a pipe (more resources needed) :
find www/*.html -type f -print0 | xargs -0 -n1 basename

Thats how I batch resize files with imagick, rediving output filename from source
find . -name header.png -exec sh -c 'convert -geometry 600 {} $(dirname {})/$(basename {} ".png")_mail.png' \;

I had to accomplish something similar, and found following the practices mentioned for avoiding looping over find's output and using find with sh sidestepped these problems with {} and -printfentirely.
You can try it like this:
find www/*.html -type f -exec sh -c 'echo $(basename $1)' find-sh {} \;
The summary is "Don't reference {} directly inside of a sh -c but instead pass it to sh -c as an argument, then you can reference it with a number variable inside of sh -c" the find-sh is just there as a dummy to take up the $0, there is more utility in doing it that way and using {} for $1.
I'm assuming the use of echo is really to simplify the concept and test function. There are easier ways to simply echo as others have mentioned, But an ideal use case for this scenario might be using cp, mv, or any more complex commands where you want to reference the found file names more than once in the command and you need to get rid of the path, eg. when you have to specify filename in both source and destination or if you are renaming things.
So for instance, if you wanted to copy only the html documents to your public_html directory (Why? because Example!) then you could:
find www/*.html -type f -exec sh -c 'cp /var/www/$(basename $1) /home/me/public_html/$(basename $1)' find-sh {} \;
Over on unix stackexchange, user wildcard's answer on looping with find goes into some great gems on usage of -exec and sh -c. (You can find it here: https://unix.stackexchange.com/questions/321697/why-is-looping-over-finds-output-bad-practice)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string