How to include the path (full or relative) in ls command results - RHEL

This command supplies almost all of the information I need in the proper CSV format (filename,dateModified,size),
but I would also like to include the file's directory as a separate field in each line's output.
Is there a way to do that using ls or another command available on RHEL?
ls * -R --fu | awk '{ print $9","$6","$5}'

I found an answer but I would hope that there's a simpler solution! I was fortunate enough that all of the filenames in this directory and subdirectories begin with the same characters. Here goes:
find . -type f -exec ls --fu {} \; | awk '{ print $9","$6","$5}' | sed "s/ArchiveFile/,ArchiveFile/"
In essence, I use find to locate the files, ls to obtain the information, and then a combination of awk and sed to produce the output; sed inserts the needed comma before the letters "ArchiveFile".
Example Output:
./ISIS_2011FALL/CAT_01College/,ArchiveFile_MUR115.2011FALL.51917.zip,2016-12-11,29484
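There is a simpler option on RHEL: GNU find's -printf can emit the directory, filename, modification date, and size directly, with no need for ls, awk, or the filename-specific sed. A sketch (adjust the format directives to taste):

```shell
# %h = directory, %f = filename, %TY-%Tm-%Td = mtime date, %s = size in bytes
find . -type f -printf '%h/,%f,%TY-%Tm-%Td,%s\n'
```

This also sidesteps the assumption that every filename starts with "ArchiveFile".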

Related

Bash Script to Pull Employee Name

I've recently enrolled in a cybersecurity bootcamp and am having a little trouble figuring out where I'm going wrong writing this script with a grep command. I'm supposed to be pulling employee names from a schedule and the script is supposed to be able to accept 2 arguments representing a specific date and a specific time.
If I type the following line below, it successfully goes into the schedule file with the date of 0310 and pulls the name of the employee that was working at 5am.
find -type f -iname *0310* | grep "05:00:00 AM" ./* | awk -F" " '{print $5, $6}'
However when I turn it into a script like this:
#!/bin/bash
find -type f -iname *$1* | grep $2 ./* | awk -F" " '{print $3, $4}'
And execute like this:
./script.sh 0310 "05:00:00 AM"
It gives me the following error, and it prints the employees who were working at both 5am and 5pm.
grep: AM: No such file or directory
I also get this error if I have another file with "0310" in the name:
find: paths must precede expression: `random_file_with_0310.txt'
find: possible unquoted pattern after predicate `-iname'?
Where am I going wrong with this script? I'm very new to Bash.
I think what you actually want is:
#!/bin/bash
find -type f -iname "*${1}*" -exec awk -v i="${2}" '$0 ~ i {print $5, $6}' "{}" +
Note that awk by default uses any number of whitespace (spaces, tabs) as a separator, so your field-separator may not actually be what you need/want, either.
And a different approach:
#!/bin/bash
grep "${2}" $( find -type f -iname "*${1}*" ) | awk '{print $5, $6}'
Slightly shorter (less typing), but more processes involved.
Your first problem is quoting.
grep: AM: No such file or directory
This is because what grep $2 ./* is running is
grep 05:00:00 AM
making AM the file argument, followed by the expansion of ./* which is every file in whatever directory you ran the command from, which is also not what you want. You quoted it correctly in your CLI example, but you have to quote it in your script.
grep "$2" ./* # still not looking at the right file
This will pass the "05:00:00 AM" correctly, but isn't going to search for it in the file(s) returned from find.
Assuming there is only one file (I wouldn't, but for simplicity's sake...) - try
file=`find -type f -iname *"$1"*` # note the quoting here also
Personally, I prefer the improved syntax for the same thing -
file="$(find -type f -iname *"$1"*)" # note the quoting here also
If there is any chance you are going to get multiple files, then this is likely way beyond the scope of a bootcamp unless they are really doing it right, in which case c.f. this discussion of why filenames are not to be trusted.
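If multiple files are a real possibility, one reasonably safe sketch (assuming GNU find and bash) streams NUL-delimited names into a while loop, so spaces in filenames don't break anything:

```shell
#!/bin/bash
# $1 = date pattern for the filename, $2 = time string to match inside the files
find . -type f -iname "*${1}*" -print0 |
while IFS= read -r -d '' file; do
    awk -v ts="$2" '$0 ~ ts {print $5, $6}' "$file"
done
```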
ANYWAY - once you have your filename, you still don't need grep.
awk -v ts="$2" '$0~ts{print $5, $6}' "$file"
or even, in one step,
awk -v ts="$2" '$0~ts{print $5, $6}' "$(find -type f -iname *"$1"*)"
...but if you just felt the need to add a redundant pattern parser antipattern, then
grep "$2" "$(find -type f -iname *"$1"*)" | awk '{print $5, $6}'
A possible alternative, with no promises on performance...
#!/bin/bash
ts="$1"; # save the string search pattern
shift; # and shift it off the argument list
shopt -s globstar; # make ** match an arbitrary depth of folders
awk -v ts="$ts" '$0~ts{print $5, $6}' "$@" # just use awk
Call it with
./script.sh "05:00:00 AM" **/*0310* # pass search pattern first, then file list
This should let the interpreter locate matching files for you and pass that list to the script. awk itself will only open the files it is passed as arguments, so you no longer need the find. awk can also pattern match for lines in those files, so you no longer need the separate grep.
(This does run the possibility of returning directories and other weirdness as well as just plain files; we can add lines to accommodate that, but I'm trying to keep it fairly simple for the given problem.)
I omitted the -F" " - you probably don't need that, but be sure to test to see if it changes your actual output dataset. If what you literally meant was that you want every space to delimit a field, so that consecutive spaces mean empty fields, use -F'[ ]'.
If that's too fancy for your context, tink's answer is probably what you want.
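The field-separator difference mentioned above is easy to see in a quick demonstration:

```shell
printf 'a  b\n' | awk '{print NF}'          # default FS collapses runs of whitespace: 2 fields
printf 'a  b\n' | awk -F'[ ]' '{print NF}'  # every single space delimits, empty field in middle: 3 fields
```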

How to print only the filename part of files that contain a certain string?

I need to print out the filename (e.g. A001.txt) that contains the string "XYZ".
I tried this:
grep -H -R "XYZ" ~/directory/*.txt | cut -d':' -f1
It would output the entire path (e.g. ~/directory/A001.txt). How can I make it so that it would only output the filename (e.g. A001.txt)?
Why oh why did the GNU guys give grep an option to recursively find files when there's a perfectly good tool designed for the job and with an extremely obvious name. Sigh...
find . -type f -exec awk '/XYZ/{print gensub(/.*\//,"",1,FILENAME); nextfile}' {} +
The above uses GNU awk which I assume you have since you were planning to use GNU grep.
grep -lr term dir/to/search/ | awk -F'/' '{print $NF}' should do the trick.
-l just lists filenames, including their directories.
-r is recursive to go through the directory tree and all files in the dir specified.
This all gets piped to awk, which is told to use / as a delimiter (not allowed in file names, so not as brittle as it could be) and to print the last field (NF is the field count, so $NF is the last field)
grep -Rl "content" | xargs -d '\n' basename -a
This should do the trick and print only the filename without the path.
basename prints filename NAME with any leading directory components
removed.
Reference: https://linux.die.net/man/1/basename
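Another variant (a sketch assuming GNU grep for --include): strip the directory with shell parameter expansion instead of calling an external tool per name:

```shell
grep -lR "XYZ" ~/directory --include='*.txt' | while IFS= read -r path; do
    printf '%s\n' "${path##*/}"   # remove everything up to and including the last /
done
```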

Can find push the filenames of the found files into the pipe?

I would like to do a find in some dir, and do a awk on the files in this direcory, and then replace the original files by each result.
find dir | xargs cat | awk ... | mv ... > filename
So I need the filename (of each of the files found by find) in the last command. How can I do that?
I would use a loop, like:
for filename in `find . -name "*test_file*" -print0 | xargs -0`
do
# some processing, then
echo "what you like" > "$filename"
done
EDIT: as noted in the comments, the benefits of -print0 | xargs -0 are lost because of the for loop, and filenames containing whitespace are still not handled correctly.
The following while loop would not handle unusual filenames either (good to know, though it was not in the question), but it does at least handle filenames with ordinary spaces, so it works better:
find . -name "*test*file*" -print > files_list
while IFS= read -r filename
do
# some process
echo "what you like" > "$filename"
done < files_list
You could do something like this (but I wouldn't recommend it at all).
find dir -print0 |
xargs -0 -n 2 awk -v OFS='\0' '<process the input and write to temporary file>
END {print "temporaryfile", FILENAME}' |
xargs -0 -n 2 mv
This passes the files to awk directly two at a time (which avoids the problem with your original where cat will get hundreds (perhaps more) files as arguments all at once and spit all their content at awk via standard input at once and thus lose their individual contents and filenames entirely).
It then has awk write the processed output to a temporary file and then outputs the temporary filename and the original filename where xargs picks them up (again two at a time) and runs mv on the pairs of temporary file/original file names.
As I said at the beginning however this is a terrible way to do this.
If you have a new enough version of GNU awk (version 4.1.0 or newer) then you could just use the -i inplace (in-place editing) option:
find dir -type f | xargs awk -i inplace '......'
Without that I would use a while loop of the form in Bash FAQ 001 to read the find output line-by-line and operate on it in the loop.
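Without GNU awk's in-place mode, a portable sketch of the same idea writes each processed file to a temporary file and moves it back (the toupper program here is just a stand-in for your real awk script):

```shell
# process each found file with awk, then replace the original with the result
find dir -type f -print0 | while IFS= read -r -d '' f; do
    awk '{ print toupper($0) }' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
```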

grep -o and display part of filenames using ls

I have a directory which has many directories inside it with the pattern of their name as :
YYYYDDMM_HHMISS
Example: 20140102_120202
I want to extract only the YYYYDDMM part.
I tried ls -l|awk '{print $9}'|grep -o ^[0-9]* and got the answer.
However, I have the following questions:
Why doesn't this return any results: ls -l|awk '{print $9}'|grep -o [0-9]* ? In fact, it should have returned all the directories.
Strangely, just including '^' before [0-9] works fine:
ls -l|awk '{print $9}'|grep -o ^[0-9]*
Any other(simpler) way to achieve the result?
Why doesn't this return any results: ls -l|awk '{print $9}'|grep -o [0-9]*
If there are files in your current directory that start with [0-9], then the shell will expand them before calling grep. For example, if I have two files a1, a2 and a3 and run this:
ls | grep a*
After the filenames are expanded, the shell will run this:
ls | grep a1 a2 a3
The result of which is that it will print the lines in a2 and a3 that match the text "a1". It will also ignore whatever is coming from stdin, because when you specify filenames for grep (2nd argument and beyond), it will ignore stdin.
Next, consider this:
ls | grep ^a*
Here, ^ has no special meaning to the shell, so it uses it verbatim. Since I don't have filenames starting with ^a, it will use ^a* as the pattern. If I did have filenames like ^asomething or ^another, then again, ^a* would be expanded to those filenames and grep would do something I didn't really intend.
This is why you have to quote search patterns, to prevent the shell from expanding them. The same goes for patterns in find /path -name 'pattern'.
As for a simpler way for what you want, I think this should do it:
ls | sed -ne 's/_.*//p'
To show only the YYYYDDMM part of the directory names:
for i in ./*; do echo $(basename "${i%%_*}"); done
Not sure what you want to do with it once you've got it though...
You must avoid parsing ls output.
It is simpler to use printf with a glob:
printf "%s\n" [0-9]*_[0-9]*|egrep -o '^[0-9]+'

Pipe output to use as the search specification for grep on Linux

How do I pipe the output of grep as the search pattern for another grep?
As an example:
grep <Search_term> <file1> | xargs grep <file2>
I want the output of the first grep as the search term for the second grep. The above command is treating the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.
You need to use xargs's -i switch:
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the string given after -i (foo in this case) and replaces every occurrence of it in the command with each line of output from the first grep.
This is the same as:
grep `grep ...` file_in_which_to_search
Try
grep ... | fgrep -f - file1 file2 ...
If using Bash then you can use backticks:
> grep -e "`grep ... ...`" files
the -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't then interpreted as an option to the second grep.
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
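A small self-contained demonstration of -F with -f (fixed strings, patterns read from a file); the file names here are made up for the example:

```shell
printf 'alpha\nbeta\n' > /tmp/terms.txt             # one search term per line
printf 'alpha 1\ngamma 2\nbeta 3\n' > /tmp/data.txt
grep -Ff /tmp/terms.txt /tmp/data.txt               # prints the alpha and beta lines
```

With -f - instead of a filename, GNU grep reads the same pattern list from stdin, which is what the fgrep -f - answer above relies on.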
I wanted to search for text in files (using grep) that had a certain pattern in their file names (found using find) in the current directory. I used the following command:
grep -i "pattern1" $(find . -name "pattern2")
Here pattern2 is the pattern in the file names and pattern1 is the pattern searched for
within files matching pattern2.
edit: Not strictly piping but still related and quite useful...
This is what I use to search for a file from a listing:
ls -la | grep 'file-in-which-to-search'
Okay breaking the rules as this isn't an answer, just a note that I can't get any of these solutions to work.
% fgrep -f test file
works fine.
% cat test | fgrep -f - file
fgrep: -: No such file or directory
fails.
% cat test | xargs -ifoo grep foo file
xargs: illegal option -- i
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
fails. Note that a capital I is necessary. If I use that, all is good.
% grep "`cat test`" file
kinda works in that it returns a line for the terms that match but it also returns a line grep: line 3 in test: No such file or directory for each file that doesn't find a match.
Am I missing something or is this just differences in my Darwin distribution or bash shell?
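As noted above, BSD/macOS xargs only accepts the capital -I replacement option, and GNU xargs accepts it too, so this form should work in both places (a sketch with made-up file names):

```shell
printf 'needle\n' > /tmp/seed.txt
printf 'a needle b\nhay\n' > /tmp/target.txt
xargs -I{} grep {} /tmp/target.txt < /tmp/seed.txt   # prints: a needle b
```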
I tried it this way, and it works great.
[opuser@vjmachine abc]$ cat a
not problem
all
problem
first
not to get
read problem
read not problem
[opuser@vjmachine abc]$ cat b
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$ grep -e "`grep problem a`" b --col
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$
You should run grep so that it extracts filenames only; see the -l option (a lowercase L):
grep -l someSearch * | xargs grep otherSearch
Because with plain grep, the output contains much more than just the file names. For instance, when you do
grep someSearch *
You will pipe to xargs info like this
filename1: blablabla someSearch blablabla something else
filename2: bla someSearch bla otherSearch
...
Passing lines like these to xargs makes no sense.
But when you do grep -l someSearch *, your output will look like this:
filename1
filename2
Such an output can be passed now to xargs
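A runnable sketch of the two-stage filter (file names and contents made up for the example):

```shell
mkdir -p /tmp/grdemo && cd /tmp/grdemo
printf 'someSearch\notherSearch\n' > f1
printf 'someSearch only\n'         > f2
grep -l someSearch * | xargs grep -l otherSearch   # prints: f1
```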
I have found the following command to work, using $() with my first command inside the parentheses so the shell executes it first.
grep $(dig +short) file
I use this to look through files for an IP address when I am given a host name.
