How can I update the contents of a file by replacing strings using the grep and find commands - Linux

I am finding XML files under a particular subdirectory that contain the word "responsible". The search works fine, as shown below.
find . -name '*.xml' -exec grep -H 'responsible' {} \;
./dir1/d1.xml:<responsible><></responsible>
./dir2/d2.xml:<responsible><SYSTEM></responsible>
./dir3/d3.xml:<responsible><SYSTEM></responsible>
... and so on.
Is there a way I can replace all occurrences of SYSTEM with a blank?
The result I am looking for is:
./dir1/d1.xml:<responsible><></responsible>
./dir2/d2.xml:<responsible><></responsible>
./dir3/d3.xml:<responsible><></responsible>

I would use a perl "pie" (-p -i -e) one-liner:
perl -p -i -e 's/<responsible><SYSTEM><\/responsible>/<responsible><><\/responsible>/' `find ./ -name '*.xml'`
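If any of the file names might contain spaces, a safer variant of the same substitution lets find hand the files to perl directly instead of relying on backtick expansion:
find . -name '*.xml' -exec perl -pi -e 's/<responsible><SYSTEM><\/responsible>/<responsible><><\/responsible>/' {} +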

bash: filter the files where NOT to search

I have created a script that searches for the specified keywords in specified directories:
find $directory -type f -name "*.properties" -exec grep -Fi "$keyword" {} +
The problem I faced is that $directory contains two types of files, sample files and config files: config / sample.config, where sample.config is only an example, so I'm not interested in including it in the search.
The question is how to exclude these 'sample.*' files from my results?
To exclude the sample.config files, add ! -name sample.config to the find command, for example:
find $(<$SRC) -type f -name "*.properties" ! -name sample.config -exec grep -Fi "$keyword" --color {} +
However, *.properties can't match sample.config, so this will not change the result.
Alternatively, one command can search for $keyword across all four of your file types while excluding sample.*:
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword"
Use -PAC (or -P -A -C) to remove color, line numbers, etc. and get plain results.
Use -l to just list the file paths and show the distribution: count + percentage.
msr.gcc* is a single-executable tool to search/replace files or pipes, found in the tools directory of my open project https://github.com/qualiu/msr, with cross-platform and OS-bit versions. There is built-in documentation, e.g. https://qualiu.github.io/msr/usage-by-running/msr-CentOS-7.html; for a vivid demo, a performance comparison with findstr and grep, tests, etc., see the project home.
Using the suggestion of @Nahuel, I've modified it a bit and it started working for me as:
find $(<$SRC) -type f -name "*.properties" ! -name "sample.*" -exec grep -Fi "$keyword" --color {} +
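For reference, the same exclusion extends to several extensions at once using find's grouped -o tests; a sketch assuming the same $SRC and $keyword variables (the parentheses must be escaped so the shell leaves them alone):
find $(<$SRC) -type f \( -name "*.properties" -o -name "*.xml" \) ! -name "sample.*" -exec grep -Fi "$keyword" --color {} +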

What is the mistake in this find and sed command in Linux?

I want to add a script to my site.
The problem is that the site includes hundreds of HTML files.
So I need a command that inserts the code before the closing body tag. How can I do this?
find . -name '*.html' exec sed -i 's/<\/body>/<script src="1.js"><\/script><\/body>/g' {} \;
But it doesn't work.
Please fix this command.
There is an error in the command: replace exec with -exec and it should be fine.
find . -name '*.html' -exec sed -i 's/<\/body>/<script src="1.js"><\/script><\/body>/g' {} \;
This also works for me:
find * -name "*.html" | xargs -L1 -I{} sed -i 's/<\/body>/<script src="1.js"><\/script><\/body>/g' {}
Changes:
replaced the path . with *
the xargs tool reads all lines from stdin and executes the command separately for each line, with the ability to pass that line as an argument to the command, so
in this case it is the same approach as find -exec, but in general it opens up other possibilities (check out the xargs manual).
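One caveat: find * skips dot-files, and a plain pipe breaks on file names containing spaces. A more robust sketch of the same xargs approach uses null separators:
find . -name '*.html' -print0 | xargs -0 sed -i 's/<\/body>/<script src="1.js"><\/script><\/body>/g'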

Insert a line into multiple specified files

I want to insert a line at the start of multiple files of a specified type, located in the current directory or its subdirectories.
I know that using
find . -name "*.csv"
can help me list the files I want to insert into,
and using
sed -i '1icolumn1,column2,column3' test.csv
can insert one line at the start of a file,
but I do NOT know how to pipe the filenames from the find command to the sed command.
Could anybody give me any suggestion?
Or is there any better solution to do this?
BTW, is it possible to do this in a one-line command?
Try using xargs to pass the output of find as command-line arguments to the next command, here sed:
find . -type f -name '*.csv' -print0 | xargs -0 sed -i '1icolumn1,column2,column3'
Another option would be to use the -exec option of find.
find . -type f -name '*.csv' -exec sed -i '1icolumn1,column2,column3' {} \;
Note: it has been observed that xargs is the more efficient way, and it can run multiple processes using the -P option.
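As an illustration, a sketch of that parallel form (the -P 4 and -n 100 values here are arbitrary examples, not recommendations):
find . -type f -name '*.csv' -print0 | xargs -0 -n 100 -P 4 sed -i '1icolumn1,column2,column3'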
This way:
find . -type f -name "*.csv" -exec sed -i '1icolumn1,column2,column3' {} +
-exec does all the magic here. The relevant part of man find:
-exec command ;
Execute command; true if 0 status is returned. All following arguments
to find are taken to be arguments to the command until an argument consisting
of `;' is encountered. The string `{}' is replaced by the current file name
being processed everywhere it occurs in the arguments to the command, not just
in arguments where it is alone, as in some versions of find. Both of
these constructions might need to be escaped (with a `\') or quoted to protect
them from expansion by the shell. See the EXAMPLES section for examples of
the use of the -exec option. The specified command is run once for each
matched file. The command is executed in the starting directory. There
are unavoidable security problems surrounding use of the -exec action;
you should use the -execdir option instead
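Following that advice, a minimal sketch of the same insertion using -execdir (which runs the command from each file's own directory, so {} expands to a bare file name):
find . -type f -name '*.csv' -execdir sed -i '1icolumn1,column2,column3' {} +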

Piping find results into grep for fast directory exclusion

I am successfully using find to create a list of all files in the current subdirectory, excluding those in the subdirectory "cache." Here's my first bit of code:
find . -wholename './cach*' -prune -o -print
I now wish to pipe this into a grep command. It seems like that should be simple:
find . -wholename './cach*' -prune -o -print | xargs grep -r -R -i "samson"
... but this is returning results that are mostly from the cache directory. I've tried removing the xargs reference, but that does what you'd expect: it runs grep on the text of the file names rather than on the files themselves. My goal is to find "samson" in any files that aren't cached content.
I'll probably get around this issue by just using doubled greps in this instance, but I'm very curious about why this one-liner behaves this way. I'd love to hear thoughts on a way to modify it while still using these two commands (as there are speed advantages to doing it this way).
(This is in CentOS 5, btw.)
The wholename match may be the reason why it's still including "cache" files. If you're executing the find command in the directory that contains the "cache" folder, it should work. If not, try changing it to -name '*cache*' instead.
Also, you do not need the -r or -R for your grep; those tell it to recurse through directories, but you're testing individual files.
You can update your command using the piped version, or a single-command:
find . -name '*cache*' -prune -o -print0 | xargs -0 grep -il "samson"
or
find . -name '*cache*' -prune -o -exec grep -iq "samson" {} \; -print
Note: the -l in the first command tells grep to "list the file" and not the line(s) that match. The -q in the second does much the same; it tells grep to respond quietly, so find will then just print the filename.
You've told grep itself to recurse (twice! -r and -R are nearly synonymous; -R additionally follows all symlinks). Since one of the arguments you're passing is . (the top directory), grep is searching every file (some of them twice, or even more if they're in subdirectories).
If you're going to use find and grep, do this:
find . -path './cach*' -prune -o -print0 | xargs -0 grep -i "samson"
Using -print0 and -0 makes your script work even with file names that contain spaces or punctuation characters.
However, you probably don't need to bother with find here, since GNU grep is capable of excluding directories:
grep -R --exclude-dir='cach*' -i "samson" .
(This also excludes ./deeply/nested/directory/cache. If you only want to exclude cache directories at the toplevel, use find as you did.)
Use the -exec option on find instead of piping the results to another command. From there you can use grep "samson" {} \; (or {} +) to look for samson in each file listed.
For example:
find . -wholename './cach*' -prune -o -exec grep "samson" "{}" +
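If you also want the original case-insensitive match and just the file names, a small tweak of that command adds grep's -i and -l flags:
find . -wholename './cach*' -prune -o -type f -exec grep -il "samson" {} +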

search through files and put into array

cat `find . -name '*.css'`
This will cat every CSS file. I now want to do two things.
1) How do I add *.js to this as well, so that I look inside all CSS and JavaScript files?
2) I want to find any CSS or image files referenced within those CSS or JS files and push them into an array. So I guess: look for .png, .jpg, .gif, .tif, or .css, and capture everything before that extension, back to the preceding double or single quote, into an array. I want an array because this command will go into a shell script; after I get all the names of the files I need, I will loop through them and download those files later.
Any help would be appreciated.
Extra hackery, in case someone needs it:
find ./ -name "*.css" | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'| sed -e 's/\.\./{{url here}}/g'|xargs wget
This will download every missing resource.
Do the command:
find ./ -name "*.css" -or -name "*.js" > fileNames.txt
Then read each line of fileNames.txt in a loop and download them (a sketch of this follows below).
Or if you are going to use wget to download the images you could do:
find ./ -name "*.css" -or -name "*.js" | xargs grep '\.png' | xargs wget
May need a little refinement like a cut after the grep but you get the idea
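A minimal sketch of that read-and-loop step, pulling fileNames.txt into a bash array first (mapfile requires bash 4+; the echo is a placeholder for the real download step):
find ./ -name "*.css" -or -name "*.js" > fileNames.txt
mapfile -t files < fileNames.txt    # one array element per line
for f in "${files[@]}"; do
    echo "would process: $f"        # replace with the actual download command
done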
1) Simple answer: you can add the names of all .js files to your cat command by instructing find to find more files:
cat `find . -name '*.css' -or -name '*.js'`
2) A text-searching tool such as grep is probably what you're after:
find . -name '*.css' -or -name '*.js' | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'
Note: my grep pattern isn't universal or perfect, but it's a starting example. It matches any string made of alphanumerics, colons, dots, slashes, underscores, or hyphens, followed by any one of the given extensions.
The -o option causes grep to output only the parts of the .css/.js files that match the pattern (i.e. only the apparent filenames).
If you want to download them, you could append | xargs wget -v to the command, which would instruct wget to fetch all those filenames.
NOTE: this won't work for relative filenames; some other magic will be required (i.e. you'll have to resolve them with respect to the grepped file's location), perhaps with some extra hackery such as sed or awk.
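As a sketch of that extra hackery, assuming a hypothetical base URL and resolving only root-relative paths (those starting with /), not ../-style ones:
base='http://example.com'    # hypothetical: replace with your site's base URL
find . -name '*.css' -o -name '*.js' | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)' | sort -u | while read -r p; do
    case "$p" in
        http*) wget -nv "$p" ;;          # already absolute
        /*)    wget -nv "$base$p" ;;     # root-relative: prefix the base URL
        *)     echo "skipping relative path: $p" ;;
    esac
done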
Also: How often do you see references to TIFFs in your CSS/JS?
