Using CAT command read all file content recursively for paths file - linux

So I've a file called allTextFiles.txt in which it has paths of a all regular text file.
eg:
./test/file1.txt
./test/file2.txt
./test/file3.txt
My task is to formulate shell command such that command CAT will go through all of these paths and display content of each file.
Is it possible ?

Using xargs with the allTextFiles.txt
You can use the command cat to list the content and xargs (Xargs command documentation) to process every line of the file. For example:
cat allTextFiles.txt | xargs cat
kk #./test/file1.txt
jj #./test/file2.txt
yy #./test/file3.txt
Using find command without allTextFiles.txt
You can use the command find (find command documentation) to go trough the main folder and search recursively. Once you find a file, to use the cat command.
You achieve the purpose of showing all the content of txt files recursively with only one command.
find . -type f -name "*.txt" -exec cat {} \;
kk #./test/file1.txt
jj #./test/file2.txt
yy #./test/file3.txt
Where the . means current directory. -type f only filters files. -name "*.txt" filters only files ending with .txt extension and the -exec part is where you process all the files found.
Does it cover your need?

You can use a loop; something like
while IFS= read -r file; do
cat "$file"
done < allTextFiles.txt
See Bash FAQ 001 for more discussion on reading a file a line at a time (Much of the content applies to all shells, not just bash).
If you are using bash:
readarray -t files < allTextFiles.txt
cat "${files[#]}"
is an alternative that avoids the explicit loop.

Related

how to read multiple paths from a file and then use the path in find commands

I have a scenario to read multiple paths in find command and to cover that i cannot enter every single path in find command. is there any way to read the paths from the file or to ask find command to read the path from the file and then execute the command to search for a file from that path which are mentioned in the read file.
command:
find path1 path2 path 30 -name \*.jks 2>/dev/null
Using bash you can read the file into an array (one line/item in array), and expand it into the first arguments for find:
readarray -t files < files.txt
find "${files[#]}" -name '*.jks'
The -t flag given to readarray will trim the trailing newline.
Just do this in a shell script :
#!/bin/bash
input="/path/to/txt/file"
paths= ""
while IFS= read -r line
do
paths="$paths $line"
done < "$input"
find $paths -name "*.txt" -print
You can replace -print by -exec to execute à task on each file found

Find all directories containing a file that contains a keyword in linux

In my hierarchy of directories I have many text files called STATUS.txt. These text files each contain one keyword such as COMPLETE, WAITING, FUTURE or OPEN. I wish to execute a shell command of the following form:
./mycommand OPEN
which will list all the directories that contain a file called STATUS.txt, where this file contains the text "OPEN"
In future I will want to extend this script so that the directories returned are sorted. Sorting will determined by a numeric value stored the file PRIORITY.txt, which lives in the same directories as STATUS.txt. However, this can wait until my competence level improves. For the time being I am happy to list the directories in any order.
I have searched Stack Overflow for the following, but to no avail:
unix filter by file contents
linux filter by file contents
shell traverse directory file contents
bash traverse directory file contents
shell traverse directory find
bash traverse directory find
linux file contents directory
unix file contents directory
linux find name contents
unix find name contents
shell read file show directory
bash read file show directory
bash directory search
shell directory search
I have tried the following shell commands:
This helps me identify all the directories that contain STATUS.txt
$ find ./ -name STATUS.txt
This reads STATUS.txt for every directory that contains it
$ find ./ -name STATUS.txt | xargs -I{} cat {}
This doesn't return any text, I was hoping it would return the name of each directory
$ find . -type d | while read d; do if [ -f STATUS.txt ]; then echo "${d}"; fi; done
... or the other way around:
find . -name "STATUS.txt" -exec grep -lF "OPEN" \{} +
If you want to wrap that in a script, a good starting point might be:
#!/bin/sh
[ $# -ne 1 ] && echo "One argument required" >&2 && exit 2
find . -name "STATUS.txt" -exec grep -lF "$1" \{} +
As pointed out by #BroSlow, if you are looking for directories containing the matching STATUS.txt files, this might be more what you are looking for:
fgrep --include='STATUS.txt' -rl 'OPEN' | xargs -L 1 dirname
Or better
fgrep --include='STATUS.txt' -rl 'OPEN' |
sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# simulate `xargs -L 1 dirname` using `sed`
# (no trailing `\`; returns `.` for path without dir part)
Maybe you can try this:
grep -rl "OPEN" . --include='STATUS.txt'| sed 's/STATUS.txt//'
where grep -r means recursive , -l means only list the files matching, '.' is the directory location. You can pipe it to sed to remove the file name.
You can then wrap this in a bash script file where you can pass in keywords such as 'OPEN', 'FUTURE' as an argument.
#!/bin/bash
grep -rl "$1" . --include='STATUS.txt'| sed 's/STATUS.txt//'
Try something like this
find -type f -name "STATUS.txt" -exec grep -q "OPEN" {} \; -exec dirname {} \;
or in a script
#!/bin/bash
(($#==1)) || { echo "Usage: $0 <pattern>" && exit 1; }
find -type f -name "STATUS.txt" -exec grep -q "$1" {} \; -exec dirname {} \;
You could use grep and awk instead of find:
grep -r OPEN * | awk '{split($1, path, ":"); print path[1]}' | xargs -I{} dirname {}
The above grep will list all files containing "OPEN" recursively inside you dir structure. The result will be something like:
dir_1/subdir_1/STATUS.txt:OPEN
dir_2/subdir_2/STATUS.txt:OPEN
dir_2/subdir_3/STATUS.txt:OPEN
Then the awk script will split this output at the colon and print the first part of it (the dir path).
dir_1/subdir_1/STATUS.txt
dir_2/subdir_2/STATUS.txt
dir_2/subdir_3/STATUS.txt
The dirname will then return only the directory path, not the file name, which I suppose it what you want.
I'd consider using Perl or Python if you want to evolve this further, though, as it might get messier if you want to add priorities and sorting.
Taking up the accepted answer, it does not output a sorted and unique directory list. At the end of the "find" command, add:
| sort -u
or:
| sort | uniq
to get the unique list of the directories.
Credits go to Get unique list of all directories which contain a file whose name contains a string.
IMHO you should write a Python script which:
Examines your directory structure and finds all files named STATUS.txt.
For each found file:
reads the file and executes mycommand depending on what the file contains.
If you want to extend the script later with sorting, you can find all the interesting files first, save them to a list, sort the list and execute the commands on the sorted list.
Hint: http://pythonadventures.wordpress.com/2011/03/26/traversing-a-directory-recursively/

Unix: traverse a directory

I need to traverse a directory so starting in one directory and going deeper into difference sub directories. However I also need to be able to have access to each individual file to modify the file. Is there already a command to do this or will I have to write a script? Could someone provide some code to help me with this task? Thanks.
The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: Replace all foo's by bar's in all files in /tmp and subdirectories.
find /tmp -type f -exec sed -i -e 's/foo/bar/' '{}' ';'
for i in `find` ; do
if [ -d $i ] ; then do something with a directory ; fi
if [ -f $i ] ; then do something with a file etc. ; fi
done
This will return the whole tree (recursively) in the current directory in a list that the loop will go through.
This can be easily achieved by mixing find, xargs, sed (or other file modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' | xargs sed -ie '/^#/d'
This will filter all files with file extension .properties.
The xargs command will feed the file path generated by find command into the sed command.
The sed command will delete all lines start with # in the files (feed by xargs).
Command combination in this way is very flexible.
For example, find command have different parameters so you can filter by user name, file size, file path (eg: under /test/ subfolder), file modification time.
Another dimension of flexibility is how and what to change in your file. For ex, sed command allows you to make changes on file in applying substitution (specify via regular expressions). Similarly, you can use gzip to compress the file. And so on ...
You would usually use the find command. On Linux, you have the GNU version, of course. It has many extra (and useful) options. Both will allow you to execute a command (eg a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files with those names provided as arguments. If you want to do it one file at a time, replace the + with ';' (or \;).
I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, the files you want to modify are all text file. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.
Bash 4+
shopt -s globstar
for file in **
do
if [ -f "$file" ];then
# do some processing to your file here
# where the find command can't do conveniently
fi
done

how do i script hunpos through text files?

my purpose is to parse several text files using the POS parser HunPos http://code.google.com/p/hunpos/wiki/UserManualI
is there a way to bash script hunpos through a bunch of text files?
Typical mechanisms look like:
for f in glob; do command $f ; done
I often run commands like: for f in *; do echo -n "$f " ; cat $f ; done to see the contents of all the files in a directory. (Especially nice with /proc/sys/kernel/-style directories, where all the files have very short contents.)
or
find . -type f -exec command {} \;
or
find . -type f -print0 | xargs -0 command parameters
Something like find . -type f -exec file {} \; or find . -type f -print0 | xargs -0 file (only works if the command accepts multiple filenames during input).
Of course, if the program accepts multiple filename arguments (like cat or more or similar Unix shell tools) and all the files are in a single directory, you can very easily run: cat * (show contents of all files in the directory) or cat *.* (show contents of all files with a period in the filename).
If you frequently want "all files in all [sub]*directories", the zsh **/ option can be handy: ls -l **/*.c would show you foo/bar/baz.c and /blort/bleet/boop.c at once. Neat tool, but I usually don't mind writing the find command equivalent, I just don't need it that often. (And zsh isn't installed everywhere, so relying on its features could be frustrating in the future.)

How to list specific type of files in recursive directories in shell?

How can we find specific type of files i.e. doc pdf files present in nested directories.
command I tried:
$ ls -R | grep .doc
but if there is a file name like alok.doc.txt the command will display that too which is obviously not what I want. What command should I use instead?
If you are more confortable with "ls" and "grep", you can do what you want using a regular expression in the grep command (the ending '$' character indicates that .doc must be at the end of the line. That will exclude "file.doc.txt"):
ls -R |grep "\.doc$"
More information about using grep with regular expressions in the man.
ls command output is mainly intended for reading by humans. For advanced querying for automated processing, you should use more powerful find command:
find /path -type f \( -iname "*.doc" -o -iname "*.pdf" \)
As if you have bash 4.0++
#!/bin/bash
shopt -s globstar
shopt -s nullglob
for file in **/*.{pdf,doc}
do
echo "$file"
done
find . | grep "\.doc$"
This will show the path as well.
Some of the other methods that can be used:
echo *.{pdf,docx,jpeg}
stat -c %n * | grep 'pdf\|docx\|jpeg'
We had a similar question. We wanted a list - with paths - of all the config files in the etc directory. This worked:
find /etc -type f \( -iname "*.conf" \)
It gives a nice list of all the .conf file with their path. Output looks like:
/etc/conf/server.conf
But, we wanted to DO something with ALL those files, like grep those files to find a word, or setting, in all the files. So we use
find /etc -type f \( -iname "*.conf" \) -print0 | xargs -0 grep -Hi "ServerName"
to find via grep ALL the config files in /etc that contain a setting like "ServerName" Output looks like:
/etc/conf/server.conf: ServerName "default-118_11_170_172"
Hope you find it useful.
Sid
Similarly if you prefer using the wildcard character * (not quite like the regex suggestions) you can just use ls with both the -l flag to list one file per line (like grep) and the -R flag like you had. Then you can specify the files you want to search for with *.doc
I.E. Either
ls -l -R *.doc
or if you want it to list the files on fewer lines.
ls -R *.doc
If you have files with extensions that don't match the file type, you could use the file utility.
find $PWD -type f -exec file -N \{\} \; | grep "PDF document" | awk -F: '{print $1}'
Instead of $PWD you can use the directory you want to start the search in. file prints even out he PDF version.

Resources