Using Perl-based rename command with find in Bash - linux

I just stumbled upon Perl today while playing around with Bash scripting. When I tried to remove blank spaces in multiple file names, I found this post, which helped me a lot.
After a lot of struggling, I finally understand the rename and substitution commands and their syntax. I wanted to try to replace all "_(x)" at the end of file names with "x", due to duplicate files. But when I try to do it myself, it just does not seem to work. I have three questions with the following code:
Why is nothing executed when I run it?
I used redirection to show me the success note as an error, so I know what happened. What did I do wrong about that?
After a lot of research, I still do not entirely understand file descriptors and redirection in Bash as well as the syntax for the substitute function in Perl. Can somebody give give me a link for a good tutorial?
find -name "*_(*)." -type f | \
rename 's/)././g' && \
find -name "*_(*." -type f | \
rename 's/_(//g' 2>&1

You either need to use xargs or you need to use find's ability to execute commands:
find -name "*_(*)." -type f | xargs rename 's/)././g'
find -name "*_(*." -type f | xargs rename 's/_(//g'
Or:
find -name "*_(*)." -type f -exec rename 's/)././g' {} +
find -name "*_(*." -type f -exec rename 's/_(//g' {} +
In both cases, the file names are added to the command line of rename. As it was, rename would have to read its standard input to discover the file names — and it doesn't.
Does the first find find the files you want? Is the dot at the end of the pattern needed? Do the regexes do what you expect? OK, let's debug some of those too.
You could do it all in one command with a more complex regex:
find . -name "*_(*)" -type f -exec rename 's/_\((\d+)\)$/$1/' {} +
The find pattern is corrected to lose the requirement of a trailing .. If the _(x) is inserted before the extension, then you'd need "*_(*).*" as the pattern for find (and you'll need to revise the Perl regexes).
The Perl substitute needs dissection:
The \( matches an open parenthesis.
The ( starts a capture group.
The \d+ looks for 'one or more digits'.
The ) stops the capture group. It is the first and only, so it is given the number 1.
The \) matches a close parenthesis.
The $ matches the end of the file name.
The $1 in the replacement puts the value of capture group 1 into the replacement text.
In your code, the 2>&1 sent the error messages from the second rename command to standard output instead of standard error. That really doesn't help much here.
You need two separate tutorials; you are not going to find one tutorial that covers I/O redirection in Bash and regular expressions in Perl.
The 'official' Perl regular expression tutorial is:
perlretut, also available as perldoc perlretut on your machine.
The Bash manual covers I/O redirection, but it is somewhat terse:
I/O Redirections.

Related

Simple Bash Script that recursively searches in subdirs for a certain string

i recently started learning linux because a ctf contest is coming in the next months. The problem that I struggle with is that i am trying to make a bash script that starts from a directory, checks if the content is a directory or other kind of file. If it is a file,image etc apply strings $f | grep -i 'abcdef', if it is a directory cd to that directory and start over. i have c++ experience and i understand the logic but i can't really make it work.I can't succesfully implement the loop that goes thru all the subdirectories. All help would be appreciated!
you don not need a loop for this implementation. The find command can do what you are looking after.
for instance:
find /home -type f -exec sh -c " strings {} | grep abcd " \;
explain:
/home is you base directory can be anything
-type f: means a regular file
-exec from the man page:
"Execute command; true if 0 status is returned. All
following arguments to find are taken to be arguments to
the command until an argument consisting of ;' is encountered. The string {}' is replaced by the current
file name being processed everywhere it occurs in the
arguments to the command, not just in arguments where it
is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `') or
quoted to protect them from expansion by the shell. See
the EXAMPLES section for examples of the use of the -exec
option. The specified command is run once for each
matched file. The command is executed in the starting
directory. There are unavoidable security problems
surrounding use of the -exec action; you should use the
-execdir option instead."
If you want to just find the string in a file and you do not HAVE TO first find a directory and then a file and then search, you can just simply find the text with grep.
Go to the the parent directory and execute :
grep -iR "abcd"
Or from any place,
grep -iR "abcd" /var/log/mylogs/
Suggesting a grep command on find filter results:
grep "abcd" $(find . -type f)

Find creates a file when I use {}

I try to use find to create a very simple way to add newline to a file. I know there are tons of other ways to do this but it bugs the hell out of me that I cannot get this way to work.
So - I'm NOT asking how to add newline to a file - I'm asking why find is weird.
find . -type f -iname 'file' -exec echo >> {} \;
results in a new file created named "{}" with the newline in while (to check that find works on my computer):
find . -type f -iname 'file' -exec echo {} \;
prints "./file".
So the >> makes find confused. The question is why and how do I solve that?
I'm asking why find is weird.
It isn't. This has nothing to do with find. In fact, when the file is created, find hasn't even started to run.
>> roughly means "redirect stdout to the end of this file, create a new file when necessary". Note how nothing of this has anything to do with whatever is left of the >>.
Redirection is a feature of the shell, find knows nothing about the redirection and the shell knows nothing about find. >> doesn't magically change its meaning just because you happened to call find. It still means the exact same thing.
If you want to use a shell feature whithin -exec, you need to use a shell within -exec:
find . -type f -iname 'file' -exec sh -c 'echo >> "{}"' \;
While the question itself has already been answered, I'd like to point out that you don't strictly need to make find do everything, rather you can use other available facilities to work together with it, for example:
find . -type f -iname 'file' | while read file; do echo >>"$file"; done
This approach also has the advantage of not executing a new process for every match, which is irrelevant in this case but potentially important if there are thousands of matches and the exec is relatively heavy.

renaming with find

I managed to find several files with the find command.
the files are of the type file_sakfksanf.txt, file_afsjnanfs.pdf, file_afsnjnjans.cpp,
now I want to rename them with the rename and -exec command to
mywish_sakfksanf.txt, mywish_afsjnanfs.pdf, mywish_afsnjnjans.cpp
that only the first prefix is changed. I am trying for some time, so don't blame me for being stupid.
If you read through the -exec section of the man pages for find you will come across the {} string that allows you to use the matches as arguments within -exec. This will allow you to use rename on your find matches in the following way:
find . -name 'file_*' -exec rename 's/file_/mywish_/' {} \;
From the manual:
-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command until an
argument consisting of ;' is encountered. The string{}' is replaced
by the current file name being processed everywhere it occurs in the
arguments to the command, not just in arguments where it is alone, as
in some versions of find. Both of these constructions might need to
be escaped (with a `\') or quoted to protect them from expansion by
the shell. See the EXAMPLES section for examples of the use of the
-exec option. The specified command is run once for each matched file. The command is executed in the starting directory.There are
unavoidable security problems surrounding use of the -exec action;
you should use the -execdir option instead.
Although you asked for a find/exec solution, as Mark Reed suggested, you might want to consider piping your results to xargs. If you do, make sure to use the -print0 option with find and either the -0 or -null option with xargs to avoid unexpected behaviour resulting from whitespace or shell metacharacters appearing in your file names. Also, consider using the + version of -exec (also in the manual) as this is the POSIX spec for find and should therefore be more portable if you are wanting to run your command elsewhere (not always true); it also builds its command line in a way similar to xargs which should result in less invocations of rename.
Don't think there's a way you can do this with just find, you'll need to create a script:
#!/bin/bash
NEW=`echo $1 | sed -e 's/file_/mywish_/'`
mv $1 ${NEW}
THen you can:
find ./ -name 'file_*' -exec my_script {} \;

Find in Linux combined with a search to return a particular line

I'm trying to return a particular line from files found from this search:
find . -name "database.php"
Each of these files contains a database name, next to a php variable like $dname=
I've been trying to use -exec to execute a grep search on this file with no success
-exec "grep {\}\ dbname"
Can anyone provide me with some understanding of how to accomplish this task?
I'm running CentOS 5, and there are about 100 database.php files stored in subdirectories on my server.
Thanks
Jason
You have the arguments to grep inverted, and you need them as separate arguments:
find . -name "database.php" -exec grep '$dbname' /dev/null {} +
The presence of /dev/null ensures that the file name(s) that match are listed as well as the lines that match.
I think this will do it. Not sure if you need to make any adjustments for CentOS.
find . -name "database.php" -exec grep dbname {} \;
I worked it out using xargs
find . -name "database.php" -print | xargs grep \'database\'\=\> > list_of_databases
Feel free to post a better way if you find one (or what some rep for a good answer)
I tend to habitually avoid find because I've never learned how to use it properly, so the way I'd accomplish your task would be:
grep dbname **/database.php
Edit: This command won't be viable in all cases because it can potentially generate a very long argument list, whereas find executes its command on found files one by one like xargs. And, as I noted in my comment, it's possibly not very portable. But it's damn short ;)

How do I write a bash script to replace words in files and then rename files?

I have a folder structure, as shown below:
I need to create a bash script that does 4 things:
It searches all the files in the generic directory and finds the string 'generic' and makes it into 'something'
As above, but changes "GENERIC" to "SOMETHING"
As above, but changes "Generic" to "Something"
Renames any filename that has "generic" in it with "something"
Right now I am doing this process manually by using the search and replace in net beans. I dont know much about bash scripting, but i'm sure this can be done. I'm thinking of something that I would run and it would take "Something" as the input.
Where would I start? what functions should I use? overall guidance would be great. thanks.
I am using Ubuntu 10.5 desktop edition.
Editing
The substitution part is a sed script - call it mapname:
sed -i.bak \
-e 's/generic/something/g' \
-e 's/GENERIC/SOMETHING/g' \
-e 's/Generic/Something/g "$#"
Note that this will change words in comments and strings too, and it will change 'generic' as part of a word rather than just the whole word. If you want just the word, then you use end-word markers around the terms: 's/\<generic\>/something/g'. The -i.bak creates backups.
You apply that with:
find . -type f -exec mapname {} +
That creates a command with a list of files and executes it. Clearly, you can, if you prefer, avoid the intermediate mapname shell/sed script (by writing the sed script in place of the word mapname in the find command). Personally, I prefer to debug things separately.
Renaming
The renaming of the files is best done with the rename command - of which there are two variants, so you'll need to read your manual. Use one of these two:
find . -name '*generic*' -depth -exec rename generic something {} +
find . -name '*generic*' -depth -exec rename s/generic/something/g {} +
(Thanks to Stephen P for pointing out that I was using a more powerful Perl-based variant of rename with full Perl regexp capacity, and to Zack and Jefromi for pointing out that the Perl one is found in the real world* too.)
Notes:
This renames directories.
It is probably worth keeping the -depth in there so that the contents of the directories are renamed before the directories; you could otherwise get messages because you rename the directory and then can't locate the files in it (because find gave you the old name to work with).
The more basic rename will move ./generic/do_generic.java to ./something/do_generic.java only. You'd need to run the command more than once to get every component of every file name changed.
* The version of rename that I use is adapted from code in the 1st Edition of the Camel book.
Steps 1-3 can be done like this:
find .../path/to/generic -type f -print0 |
xargs -0 perl -pi~ -e \
's/\bgeneric\b/something/g;
s/\bGENERIC\b/SOMETHING/g;
s/\bGeneric\b/Something/g;'
I don't understand exactly what you want to happen in step 4 so I can't help with that part.

Resources