My purpose is to parse several text files using the POS tagger HunPos http://code.google.com/p/hunpos/wiki/UserManualI
Is there a way to bash script hunpos through a bunch of text files?
Typical mechanisms look like:
for f in glob; do command "$f"; done
I often run commands like: for f in *; do echo -n "$f "; cat "$f"; done to see the contents of all the files in a directory. (Especially nice with /proc/sys/kernel/-style directories, where all the files have very short contents.)
or
find . -type f -exec command {} \;
or
find . -type f -print0 | xargs -0 command parameters
Something like find . -type f -exec file {} \; or find . -type f -print0 | xargs -0 file (the xargs form only works if the command accepts multiple filenames as arguments).
Of course, if the program accepts multiple filename arguments (like cat or more or similar Unix shell tools) and all the files are in a single directory, you can very easily run: cat * (show contents of all files in the directory) or cat *.* (show contents of all files with a period in the filename).
If you frequently want "all files in all [sub]*directories", the zsh **/ option can be handy: ls -l **/*.c would show you foo/bar/baz.c and blort/bleet/boop.c at once. It's a neat tool, but I usually don't mind writing the equivalent find command; I just don't need it that often. (And zsh isn't installed everywhere, so relying on its features could be frustrating in the future.)
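The three mechanisms above can be compared side by side in a scratch directory. This is only a sketch: the file names are made up, and the process function is a stand-in (it counts lines) for whatever per-file command you actually need, such as the tagger invocation.

```shell
#!/bin/sh
# Scratch directory with two sample files (names are invented for the demo).
demo=$(mktemp -d)
printf 'one\n'        > "$demo/a.txt"
printf 'two\nthree\n' > "$demo/b c.txt"   # note the space in the name

# Stand-in for the real per-file command: print the file's line count.
process() { wc -l < "$1" | tr -d ' '; }

# 1. Plain glob loop; quoting "$f" keeps names with spaces in one piece.
loop_out=$(for f in "$demo"/*.txt; do process "$f"; done)

# 2. find -exec, one invocation per file (same line count, run via sh -c).
exec_out=$(find "$demo" -type f -name '*.txt' \
    -exec sh -c 'wc -l < "$1" | tr -d " "' sh {} \; | LC_ALL=C sort)

# 3. find | xargs with NUL separators: all names go to a single cat.
xargs_out=$(find "$demo" -type f -name '*.txt' -print0 | xargs -0 cat | wc -l | tr -d ' ')

echo "loop:  $loop_out"
echo "exec:  $exec_out"
echo "xargs: $xargs_out"
rm -rf "$demo"
```

The loop is the easiest form to extend, find -exec handles subdirectories, and the xargs form starts the fewest processes when the command takes many file names at once.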
I'm trying to create a little script using bash in linux. That allows me to find if there is any tag 103=16 inside a log
I have multiple folders named for example l51prdsrv-api1.nebex.local, l51prdsrv-oe1.nebex.local, etc... inside those folders are .log files like TRADX_gsoe3.log, TRADX_gseuoe2.log, etc... .
I need to find if inside those logs there is the tag 103=16
I'm trying this command
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_" -type f | grep -e 103=16
But what it does is that is showing just the logs names and not the content to see if there is a tag 103=16
First of all, you are not searching files of the form TRADX_something.log, but only files which are just named TRADX_ (case-insensitively, so TradX_ would also be found).
Then you are feeding to grep the names of the files, but never look into the content of those files. From the grep man page, you see that the file content can be supplied either via stdin, or by specifying the file name on the command line. In your case, the latter is the way to go. Therefore you can either do a
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*.log" -type f -exec grep -F 103=16 {} \;
if you are only interested in the matching lines, or a
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*.log" -type f -exec grep -F 103=16 {} /dev/null \;
if you also want to see the file names where the pattern matches. The reason is that grep is printing the filename only if it sees more than 1 filename on the command line and the /dev/null provides a second dummy file. find replaces the {} by the filename.
BTW, I used -F for grep instead of your -e, because you don't seem to use any specific regular expression pattern anyway.
But you don't need find for this task. An alternative would be an explicit loop:
shopt -s nocaseglob # make globbing case-insensitive
shopt -s globstar # turn on ** globbing
for f in {.,/opt/FIXLOGS/l51prdsrv*}/**/tradx_*.log
do
[[ -f $f ]] && grep -F 103=16 "$f" /dev/null
done
While the loop looks more complicated at first glance, it is easier to extend the logic in case you want to do more with the files instead of just grepping the lines, for instance taking specific actions on those files which contain the pattern.
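As a sketch of that kind of extension, the loop below collects the files that contain the tag instead of printing the matching lines. It assumes bash 4 or later (for globstar); the directory layout and log contents are invented for the demo, and the matched array is a hypothetical name.

```shell
#!/usr/bin/env bash
shopt -s globstar nocaseglob   # ** recursion and case-insensitive globbing

# Invented sample tree: one log with the tag, one without.
logs=$(mktemp -d)
mkdir -p "$logs/host1" "$logs/host2"
printf '8=FIX.4.2|35=3|103=16|\n' > "$logs/host1/TRADX_a.log"
printf '8=FIX.4.2|35=0|\n'        > "$logs/host2/TRADX_b.log"

matched=()
for f in "$logs"/**/tradx_*.log; do
    [[ -f $f ]] || continue            # skip directories and unmatched globs
    if grep -qF '103=16' "$f"; then    # -q: only the exit status matters
        matched+=("${f#"$logs"/}")     # remember the path relative to $logs
    fi
done

printf 'contains tag: %s\n' "${matched[@]}"
rm -rf "$logs"
```

Inside the if branch you could just as well copy, compress, or mail the file; grep -q keeps the test cheap because it stops at the first match.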
You are doing:
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_" -type f | grep -e 103=16
I propose you do:
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*.log" -type f -exec grep -e "103=16" {} /dev/null \;
What's the difference?
find ... -type f
=> gives you a list of files.
When you add | grep -e 103=16, then you perform that on the filenames.
When you add -exec grep ..., then you perform that on the files themselves.
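The difference is easy to reproduce in a scratch directory. The sketch below uses two invented log files; names_hit and content_hit are hypothetical variable names.

```shell
#!/bin/sh
# Invented sample logs: only TRADX_x.log contains the tag.
logs=$(mktemp -d)
printf '35=D|103=16|\n' > "$logs/TRADX_x.log"
printf '35=D|\n'        > "$logs/TRADX_y.log"

# Pipe: grep reads the list of path names, which never contain the tag.
# (|| true because grep exits non-zero when nothing matches.)
names_hit=$(find "$logs" -type f -name 'TRADX_*.log' | grep -c -F '103=16' || true)

# -exec: grep opens each file; /dev/null forces the "file:" prefix.
content_hit=$(find "$logs" -type f -name 'TRADX_*.log' \
    -exec grep -F '103=16' {} /dev/null \;)
content_hit=${content_hit#"$logs"/}   # drop the temp-dir prefix for display

echo "matches in file names:    $names_hit"
echo "matches in file contents: $content_hit"
rm -rf "$logs"
```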
The user inputs a file type they are looking for; it is stored in $arg1. The file type they would like to change it to is stored in $arg2. I'm able to find what I'm looking for, but I'm unsure of how to keep the filename the same and just change the type, i.e., file1.txt into file1.log.
find . -type f -iname "*.$arg1" -exec mv {} \;
To enable the full power of shell parameter expansions, you can call bash -c in your exec action:
find . -type f -iname "*.$arg1" \
-exec bash -c 'echo mv "$0" "${0%.*}.$1"' {} "$arg2" \;
We add {} and "$arg2" as parameters to bash -c, so they become accessible within the command as $0 and $1. ${0%.*} removes the extension, to be replaced by whatever $arg2 expands to.
As it is, the command just prints the mv commands it would execute; to actually rename the files, the echo has to be removed.
The quoting is relevant: the argument to bash -c is in single quotes to prevent $0 and $1 from being expanded prematurely, and the two arguments to mv as well as "$arg2" are quoted to deal with file names with spaces in them.
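To see the dry run and the real rename back to back, here is a sketch in a scratch directory. The sample file names are invented, and plan and result are hypothetical variable names; the find command itself follows the $0/$1 scheme described above.

```shell
#!/usr/bin/env bash
# Invented sample files, one with a space in its name.
work=$(mktemp -d)
touch "$work/file1.txt" "$work/my notes.txt" "$work/keep.log"
arg1=txt
arg2=log

# Dry run: echo the mv commands instead of executing them.
plan=$(cd "$work" && find . -type f -iname "*.$arg1" \
    -exec bash -c 'echo mv "$0" "${0%.*}.$1"' {} "$arg2" \; | LC_ALL=C sort)
echo "$plan"

# Real run: same command without the echo (-- guards against odd names).
(cd "$work" && find . -type f -iname "*.$arg1" \
    -exec bash -c 'mv -- "$0" "${0%.*}.$1"' {} "$arg2" \;)

result=$(ls "$work" | LC_ALL=C sort)
echo "$result"
rm -rf "$work"
```

Note that the echoed commands are not re-quoted, so the dry-run output is for reading, not for pasting back into a shell.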
Combining the find -exec bash idea with the bash loop idea, you can use the + terminator on the -exec to tell find to pass multiple filenames to a single invocation of the bash command. Pass the new type as the first argument - which shows up in $0 and so is conveniently skipped by a for loop over the rest of the command-line arguments - and you have a pretty efficient solution:
find . -type f -iname "*.$arg1" -exec bash -c \
'for arg; do mv "$arg" "${arg%.*}.$0"; done' "$arg2" {} +
Alternatively, if you have either version of the Linux rename command, you can use that. The Perl one (a.k.a. prename, installed by default on Ubuntu and other Debian-based distributions; also available for OS X from Homebrew via brew install rename) can be used like this:
find . -type f -iname "*.$arg1" -exec rename 's/\Q'"$arg1"'\E$/'"$arg2"'/' {} +
That looks a bit ugly, but it's really just the s/old/new/ substitution command familiar from many UNIX tools. The \Q and \E around $arg1 keep any weird characters inside the suffix from being interpreted as regular expression metacharacters that might match something unexpected; the $ after the \E makes sure the pattern only matches at the end of the filename.
The pattern-based version installed by default on Red Hat-based Linux distros (Fedora, CentOS, etc) is simpler:
find . -type f -iname "*.$arg1" -exec rename ".$arg1" ".$arg2" {} +
but it's also dumber: it replaces the first occurrence of the pattern wherever it appears in the name, so if you rename .com .exe stackoverflow.com_scanner.com, you'll get a file named stackoverflow.exe_scanner.com.
I would do it like so:
find . -type f -iname "*.$arg1" -print0 |\
while IFS= read -r -d '' file; do
mv -- "$file" "${file%$arg1}$arg2"
done
I took your find command and fed its output to a while loop. Within that loop, I am doing the actual renaming. This way I have the name of the file as a variable that I can manipulate using bash's string manipulation operations.
If you have the Perl-based rename command, you can test the substitution with a dry run first.
Sample directory:
$ find
.
./a"bn.txt
./t2.abc
./abc
./abc/t1.txt
./abc/list.txt
./a bc.txt
Sample args:
$ arg1='txt'
$ arg2='log'
Dry run:
$ find -type f -iname "*.$arg1" -exec rename -n "s/$arg1$/$arg2/" {} +
rename(./a"bn.txt, ./a"bn.log)
rename(./abc/t1.txt, ./abc/t1.log)
rename(./abc/list.txt, ./abc/list.log)
rename(./a bc.txt, ./a bc.log)
Remove -n option once it is okay:
$ find -type f -iname "*.$arg1" -exec rename "s/$arg1$/$arg2/" {} +
$ find
.
./a bc.log
./t2.abc
./abc
./abc/list.log
./abc/t1.log
./a"bn.log
I'm trying to count the total lines in the files within a directory. To do this I am trying to use a combination of find and wc. However, when I run find . -exec wc -l {}\;, I receive the error find: missing argument to -exec. I can't see any apparent issues. Any ideas?
You simply need a space between {} and \;
find . -exec wc -l {} \;
Note that if there are any sub-directories below the current location, wc will generate an error message for each of them that looks something like this:
wc: ./subdir: Is a directory
To avoid that problem, you may want to tell find to restrict the search to files:
find . -type f -exec wc -l {} \;
Another note: using the -exec option is a good idea. Too often people pipe commands together expecting the same result; here, for instance, that would be:
find . -type f | xargs wc -l
The problem with piping commands in this manner is that it breaks if any file name has a space in it. For instance, if a file were named "a b", wc would receive "a" and "b" as separate arguments and you would get 2 error messages: a: no such file and b: no such file.
Unless you know for a fact that your file names never have any spaces in them (or non-printable characters), if you do need to pipe commands together, you need to tell all the tools you are piping together to use the NUL character (\0) as a separator instead of whitespace. So the previous command would become:
find . -type f -print0 | xargs -0 wc -l
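A quick way to convince yourself is to try both pipelines on a file whose name contains a space. The sketch below uses invented file names; plain_status and total are hypothetical variable names.

```shell
#!/bin/sh
# Invented sample files: "a b" has a space in its name.
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/a b"
printf 'z\n'    > "$dir/c"

# Whitespace-separated: xargs splits "./a b" into "./a" and "b",
# wc fails on both, and xargs reports a non-zero exit status.
plain_status=0
(cd "$dir" && find . -type f | xargs wc -l) >/dev/null 2>&1 || plain_status=$?

# NUL-separated: every name arrives intact, so the total is correct.
total=$(cd "$dir" && find . -type f -print0 | xargs -0 wc -l |
    awk '$2 == "total" { print $1 }')

echo "plain pipeline exit status: $plain_status"
echo "total lines with -print0:   $total"
rm -rf "$dir"
```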
With version 4.0 or later of bash, you don't need your find command at all:
shopt -s globstar
wc -l **/*
There's no simple way to skip directories, which as pointed out by Gui Rava you might want to do, unless you can differentiate files and directories by name alone. For example, maybe directories never have . in their name, while all the files have at least one extension:
wc -l **/*.*
I am trying to recursively (with sub-directories) read the last line of each file of a certain type (*.log) and write the output into individual files for each of the *.log files
e.g. (tail_"filename").
The closest bit of code I've been able to piece together is the following. I would need to send the information to a file for each of the instances it runs the tail command however.
find -type f | while read filename; do tail -1 $filename; done
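You were almost there with your solution. Just add > "$f.tail" to create the tail file, restrict find to the *.log files, and quote the variable so that file names with spaces survive:
find . -type f -name '*.log' | while IFS= read -r f; do tail -n 1 "$f" > "$f.tail"; done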
Another possibility might be
find . -type f -name '*.log' -exec sh -c 'tail -n 1 "$1" > "$1.tail"' sh {} \;
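If the output should be named tail_"filename" and sit next to each log, as the question asks, the path has to be split into its directory and file name parts. The following is a sketch under that assumption, with an invented sample tree; the ! -name 'tail_*' test is there so that find does not pick up the files it has just created.

```shell
#!/bin/sh
# Invented sample tree, including a directory name with a space.
root=$(mktemp -d)
mkdir -p "$root/sub dir"
printf 'first\nlast A\n' > "$root/app.log"
printf 'only\nlast B\n'  > "$root/sub dir/db.log"
printf 'ignored\n'       > "$root/readme.txt"

# One sh invocation handles a batch of files ({} +).
find "$root" -type f -name '*.log' ! -name 'tail_*' -exec sh -c '
    for path; do
        dir=${path%/*}       # directory part of the path
        base=${path##*/}     # file name part of the path
        tail -n 1 "$path" > "$dir/tail_$base"
    done' sh {} +

a=$(cat "$root/tail_app.log")
b=$(cat "$root/sub dir/tail_db.log")
echo "$a"
echo "$b"
rm -rf "$root"
```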
I have a directory full of subdirectories and each subdirectory has some text files inside of them (i.e. depth is 1).
I'd like to cat all these files (in no particular order) into one file:
cat file1 file2.... fileN >new.txt
Is there a bash shell one-liner that could list all the files inside of these directories and pass them to cat?
How about this?
find . -name '*.txt' -exec cat {} \; > concatenated.txt
Granted it calls cat a bunch of times rather than just once, but the effect is the same.
find . -type f -print0 | xargs -0 cat
find will recursively search for files (-type f) and print their names as null-terminated strings (-print0).
xargs will read null terminated strings (-0) from stdin and pass them as arguments to cat
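A middle ground between the two answers is find's own + terminator, which batches file names into as few cat invocations as possible without needing xargs. The sketch below uses an invented directory tree and writes the result outside the searched tree, so the output file cannot match the -name pattern and be fed back into cat as input.

```shell
#!/bin/sh
# Invented sample tree: two subdirectories with one text file each.
top=$(mktemp -d)
mkdir -p "$top/d1" "$top/d2"
printf 'alpha\n' > "$top/d1/one.txt"
printf 'beta\n'  > "$top/d2/two.txt"

out=$(mktemp)   # output lives outside $top on purpose

# {} + passes many file names to a single cat invocation.
find "$top" -type f -name '*.txt' -exec cat {} + > "$out"

combined=$(LC_ALL=C sort "$out")   # sorted only to make the order predictable
echo "$combined"
rm -rf "$top" "$out"
```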