XARGS, GREP and GNU parallel - linux

Being a linux newbie I am having trouble figuring out some of the elementary aspects of text searching.
What I want to accomplish is as follows:
I have a file containing a list of absolute paths to files.
I want to go through this list of files and grep for a particular pattern
If the pattern is found in that file, I would like to redirect it to a different output file.
Since these files are spread out over NFS, I would like to speed up the lookup using GNU parallel.
So..what I did was as follows:
cat filepaths|xargs -iSomePath echo grep -Pl '\d+,\d+,\d+,\d+' \"SomePath\"> FoundPatternsInFile.out| parallel -v -j 30
When I run this command, I am getting the following error repeatedly:
grep: "/path/to/file/name": No such file or directory
The file and the path exists. Can somebody point out what I might be doing wrong with xargs and grep?
Thanks

cat filepaths | parallel -j 30 grep -Pl '\d+,\d+,\d+,\d+' {} > FoundPatternsInFile.out
In this case you can even leave out {}.
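For comparison, the original xargs attempt fails because the escaped quotes (\"SomePath\") survive as literal characters in the filename grep receives, and the > redirection is grabbed by the outer shell before xargs even runs. An xargs-only sketch of the same job (assuming GNU xargs; filepaths and FoundPatternsInFile.out are the names from the question, and -E with [0-9] stands in for the less portable -P '\d'):

```shell
# run up to 30 greps in parallel, one path per invocation;
# matching filenames are collected in FoundPatternsInFile.out
xargs -n 1 -P 30 grep -El '[0-9]+,[0-9]+,[0-9]+,[0-9]+' \
    < filepaths > FoundPatternsInFile.out 2>/dev/null
```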

Related

Getting the most recent filename where the extension name is case *in*sensitive

I am trying to get the most recent .CSV or .csv file name among other comma separated value files where the extension name is case insensitive.
I am achieving this with the following command, provided by someone else without any explanation:
ls -t ~(i:*.CSV) | head -1
or
ls -t -- ~(i:*.CSV) | head -1
I have two questions:
What is the use of ~ and -- in this case? Does -- help here?
How can I get a blank response when there is no .csv or .CSV file in
the folder? At the moment I get:
/bin/ls: cannot access ~(i:*.CSV): No such file or directory
I know I can test the exit code of the last command, but I was wondering maybe there is a --silent option or something.
Many thanks for your time.
PS: I did quite thorough research online and was unable to find an answer.
The ~ is just a literal character; the intent would appear to be to match filenames starting with ~ and ending with .csv, with i: being a flag to make the match case-insensitive. However, I don't know of any shell that supports that particular syntax. The closest thing I am aware of would be zsh's globbing flags:
setopt extended_glob # Allow globbing flags
ls ~(#i)*.csv
Here, (#i) indicates that anything after it should be matched without regard to case.
Update: as @baptistemm points out, ~(i:...) is syntax defined by ksh.
The -- is a conventional argument, supported by many commands, to mean that any arguments that follow are not options, but should be treated literally. For example, ls -l would mean ls should use the -l option to modify its output, while ls -- -l means ls should try to list a file named -l.
~(i:*.CSV) tells the shell (this is apparently only supported in ksh93) that the text after the : must be matched case-insensitively, so in this example it could match any of these possibilities, among others:
*.csv or
*.Csv or
*.cSv or
*.csV or
*.CSv or
*.CSV
Note this could have been written ls -t *.[Cc][Ss][Vv] in bash.
To silence the errors, I suggest searching this site for "standard error /dev/null"; that will help.
I tried running commands like yours in both bash and zsh and neither worked, so I can't help you out with that, but if you want to discard the error, you can add 2>/dev/null to the end of the ls command, so your command would look like the following:
ls -t ~(i:*.CSV) 2>/dev/null | head -1
This will redirect anything written to STDERR to /dev/null (i.e. throw it out), which, in your case, would be /bin/ls: cannot access ~(i:*.CSV): No such file or directory.
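In bash there is a rough equivalent: the nocaseglob shell option makes pathname expansion case-insensitive (a sketch; unlike ksh's ~(i:...), it affects every glob until you turn it off):

```shell
shopt -s nocaseglob                    # globs now ignore case
ls -t -- *.csv 2>/dev/null | head -1   # newest .csv/.CSV/.Csv/... file
shopt -u nocaseglob                    # restore the default
```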

Finding multiple strings in directory using linux commends

If I have two strings, for example "class" and "btn", what is the linux command that would allow me to search for these two strings in the entire directory.
To be more specific, let's say I have a directory that contains a few folders with a bunch of .php files. My goal is to be able to search throughout those .php files so that it prints out only files that contain "class" and "btn" on one line. Hopefully this clarifies things.
Thanks,
I normally use the following to search for strings inside my source code. It searches for a string and shows the exact line number where that text appears, which is very helpful when searching source files. You can always pipe the output to another grep to filter it further.
grep -rn "text_to_search" directory_name/
example:
$ grep -rn "angular" menuapp
$ grep -rn "angular" menuapp | grep some_other_string
output would be:
menuapp/public/javascripts/angular.min.js:251://# sourceMappingURL=angular.min.js.map
menuapp/public/javascripts/app.js:1:var app = angular.module("menuApp", []);
grep -rE 'class|btn' /path/to/directory
grep is used to search for a string in a file. With the -r flag it searches all files in a directory recursively; -E lets the pattern use | for alternation. (Note that grep takes the pattern before the path.)
Or, alternatively using the find command to "identify" the files to be searched instead of using grep in recursive mode:
find /path/to/your/directory -type f -exec grep "text_to_search" {} +
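Since the question asks for files where both words appear on the same line, alternation alone is not enough. A hedged sketch using an extended regular expression (the class/btn words come from the question):

```shell
# list files containing at least one line with both "class" and "btn",
# in either order
grep -rlE 'class.*btn|btn.*class' /path/to/directory
```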

GNU grep on FreeBSD not working properly

I have a weird problem on FreeBSD 8.4-STABLE with grep (GNU grep) 2.5.1-FreeBSD.
If I try grep -Hnr searchstring, I don't get any output, but according to ps aux grep is running, and it keeps running until I kill the process.
If I copy a testfile in an empty directory and do
cat testfile | grep searchstring it is working.
But if I try to
grep -Hnr searchstring in that directory, I again get no output; grep keeps running and running but doesn't produce any matches.
Anybody knows how to solve this?
Even though you gave -r, you still have to give grep a file argument. Otherwise, as you've discovered, it just sits there waiting for input on stdin.
You want
grep -Hnr searchstring .
# ....................^^
That will recursively find files under the current directory.
Though it doesn't seem to be documented, if you invoke grep with the -r option and no file or directory name arguments, it defaults to the current directory, almost as if you had typed grep -R pattern . except that ./ does not appear in the output.
Apparently this is a fairly new feature.
If you do a recursive grep in a directory with a lot of contents, it could simply take a long time -- perhaps forever if there are device files such as /dev/zero that can produce infinite output.
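A quick way to compare the two forms (assuming a reasonably recent GNU grep; older versions, like the 2.5.1 on that FreeBSD box, do require the explicit directory argument):

```shell
cd "$(mktemp -d)"
echo needle > file.txt
grep -rn needle       # newer GNU grep: searches . by default
grep -rn needle .     # portable form: name the directory explicitly
```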

Terminal command to find lines containing a specific word?

I was just wondering what command I need to put into the terminal to read a text file, eliminate all lines that do not contain a certain keyword, and then print the remaining lines to a new file. For example, if the keyword is "system", I want to print all lines that contain "system" to a new, separate file. Thanks
grep is your friend.
For example, you can do:
grep system <filename> > systemlines.out
Run man grep for additional useful options (e.g. line numbers, one or more lines of context before or after a match, negation, i.e. all lines that do not match, etc.).
If you are running Windows, you can either install cygwin or you can find a win32 binary for grep as well.
grep '\<system\>'
will match lines that contain system as a whole word, not just as a substring.
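The same whole-word match can also be written with the -w option, supported by both GNU and BSD grep:

```shell
printf 'system ok\ndysystemic no\n' | grep -w system
# → system ok
```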
The grep command below will solve your problem:
grep -i yourword filename1 > filename2
Use -i for case-insensitive matching; leave it out for case-sensitive matching.
To learn how grep works on your server, refer to its man page:
man grep
grep "system" filename > new-filename
You might want to make it a bit cleverer to not include lines with words like "dysystemic", but it's a good place to start.

How to execute a command with one parameter at a time in the *nix shell?

Some commands like svn log, for example will only take one input from the command line, so I can't say grep 'pattern' | svn log. It will only return the information for the first file, so I need to execute svn log against each one independently.
I can do this with find using its -exec option: find -name '*.jsp' -exec svn log {} \;. However, grep and find provide different functionality, and the -exec option isn't available for grep or a lot of other tools.
So is there a generalized way to take output from a unix command line tool and have it execute an arbitrary command against each individual output independent of each other like find does?
The answer is xargs -n 1.
echo moo cow boo | xargs -n 1 echo
outputs
moo
cow
boo
try xargs:
grep -l 'pattern' *.jsp | xargs -n 1 svn log
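If the command needs its argument somewhere other than at the end of the line, xargs -I places it explicitly and also implies one invocation per input line (a sketch with echo standing in for svn log):

```shell
printf 'foo.c\nbar.h\n' | xargs -I{} echo "svn log {}"
# → svn log foo.c
# → svn log bar.h
```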
A little one-off shell script (though xargs is much better for a one-off; that's why it exists):
#!/bin/sh
# svn log each file given on the command line
# (no shift needed: "$@" starts at the first argument, not argv[0])
for file in "$@"
do
    svn log "$file"
done
You could name it 'multilog' or something like that. Call it like this:
./multilog.sh foo.c abc.php bar.h Makefile
It allows for a little more sanity when being called by automated build scripts, i.e. test the existence of each before talking to SVN, or redirect each output to a separate file, insert it into a sqlite database, etc.
That may or may not be what you are looking for.
