Need guidance with a bash script to check log files in a certain directory for a certain string - linux

I would like to preface this by saying that I am a complete noob at scripting. I have a situation where I need to look for a phone number that could live in any one of hundreds of files.
The logs live in the following directory:
/actlogs/sbclogger_archive
Inside that directory are 31 subdirectories numbered 01-31, and all the files in them are gzipped.
Each numbered directory holds hundreds of files, but the only ones I want to search are "sipd.logthenthedate.gz" and "sipmsg.logthenthedate.gz".
The script I am using is below. Please let me know what I could do to make this work.
#!/bin/bash
read -p "Enter a phone number: " text
read -p "Enter directory of log file's, Hint it should be /actlogs/sbclogger_archive: " directory
#arr=( $(find $directory -type f -exec grep -l "$text" {} \; | sort -r) )
#find $directory -type f -exec grep -qe "$text" {} \; -exec bash -c '
file=$(find $directory -type f -name 'sipd.log*' -exec grep -qe "$text" {} \; -exec bash -c 'select f; do echo $f; break; done' find-sh {} +;)
if [ -z "$file" ]; then
    echo "No matches found."
else
    echo "select tool:"
    tools=("nano" "less" "vim" "quit")
    select tool in "${tools[@]}"
    do
        case $tool in
            "quit")
                break
                ;;
            *)
                $tool $file
                break
                ;;
        esac
    done
fi

This would give you the list of files matching:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
    -exec sh -c 'gunzip -c "$1" | grep -m1 -q 888333' sh {} \; -print
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Note: -m1 tells grep to stop after the first match. Since you only need the file name in this case, that is enough.
If you have zgrep, you can shorten it to:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
-exec zgrep -l '888333' {} \;
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Also, some of the tools you are suggesting do not support gzip files (nano and some variants of less, for example). In that case you might need to decompress the file and compress it again when done.
You might also want to consider a loop if you want a "quit" option. Feeding the whole file list to the tool doesn't make sense.
Note: AFAIK zgrep doesn't do recursive searches. From the man page:
DESCRIPTION
Zgrep invokes grep on compressed or gzipped files. These grep options will cause zgrep to terminate with an error code: (-[drRzZ]|--di*|--exc*|--inc*|--rec*|--nu*). All other options specified are passed directly to grep. If no file is specified, then the standard input is decompressed if necessary and fed to grep. Otherwise the given files are uncompressed if necessary and fed to grep.
So zgrep -rl "$text" "$directory" or zgrep -rl --include 'sipd.log*.gz' "$text" {01..31} won't work unless you have a special zgrep.
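Since zgrep cannot recurse, find has to do the walking. Here is a minimal sketch of that idea wrapped in a function. The name find_matching_logs and the throwaway demo data are my own assumptions, not part of the original script, and it uses gunzip piped into grep so it does not depend on which zgrep you have.

```shell
#!/usr/bin/env bash
# Sketch: list gzipped SIP logs under a directory that contain a given string.
# find_matching_logs is a hypothetical helper name.
find_matching_logs() {
    local dir=$1 text=$2
    # find does the recursion; each candidate is decompressed to a pipe and grepped.
    # -F treats the phone number as a fixed string, -q stops at the first match.
    find "$dir" -type f \( -name 'sipd.log*.gz' -o -name 'sipmsg.log*.gz' \) \
        -exec sh -c 'gunzip -c "$1" | grep -Fq -- "$2"' sh {} "$text" \; -print | sort
}

# Small self-contained demo on throwaway data (made-up file contents).
demo=$(mktemp -d)
mkdir -p "$demo/07" "$demo/18"
printf 'INVITE sip:888333@example\n' | gzip > "$demo/07/sipd.log20200107.gz"
printf 'nothing here\n'              | gzip > "$demo/18/sipd.log20200118.gz"
printf 'BYE 888333\n'                | gzip > "$demo/18/sipmsg.log20200118.gz"
printf '888333 in another file\n'    | gzip > "$demo/18/other.log20200118.gz"

matches=$(find_matching_logs "$demo" 888333)
printf '%s\n' "$matches"
rm -rf "$demo"
```

The file name and the search text are passed to sh as positional parameters rather than pasted into the command string, so unusual characters in either one cannot change what the inner shell runs.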

As you must unzip before using your tool, I would divide the problem into two blocks.
First, I would expand the paths you need (looking under <directory> for the phone number <text>), and then iterate over them to apply the tool (because some tools like vim or nano cannot be piped).
Try something like this:
#!/bin/bash
#...
# text/directory input stuff
#...
tmpdir=$(mktemp -d)
trap 'rm -rf "${tmpdir}"' EXIT
while IFS= read -r file; do
    unzipped=${tmpdir}/$(basename "${file}" .gz)
    gunzip -c "${file}" > "${unzipped}"
    # the loop's stdin is the file list, so give interactive tools the terminal
    ${tool} "${unzipped}" </dev/tty
done < <(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null)
The above is the inverted form proposed by Charles Duffy, following this Bash FAQ.
If you prefer to iterate an array, you could build in this way:
# shellcheck disable=SC2207
files=( $(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null) )
for file in "${files[@]}"; do
    # etc.
This is acceptable because, in our particular case, the files to match have no spaces in their names, so the shellcheck warning (silenced above) is not so important.
BRs
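If the file names could ever contain spaces, the array can be filled one line per element instead of relying on word splitting. This is only a sketch under my own assumptions: it needs bash 4 or later for mapfile, and it uses plain grep on uncompressed throwaway files so that it runs anywhere. The same pattern should work with zgrep -l as the producer.

```shell
#!/usr/bin/env bash
# Sketch: collect command output into an array, one line per element,
# so a name containing spaces stays a single element (requires bash 4+).
demo=$(mktemp -d)
printf 'call to 888333\n' > "$demo/sipd log one"
printf 'call to 888333\n' > "$demo/sipmsg.log2"
printf 'unrelated\n'      > "$demo/sipd.log3"

# grep -l prints one matching file name per line; mapfile -t strips the newlines.
mapfile -t files < <(grep -l -- 888333 "$demo"/* | sort)

for file in "${files[@]}"; do
    printf 'match: %s\n' "${file##*/}"
done
count=${#files[@]}
rm -rf "$demo"
```

This still breaks on file names that contain newlines, which seems unlikely for log files.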

Related

Bash Globbing Pattern Matching for Imagemagick recursive convert to pdf

I have the following two scripts that recursively convert folders of images to PDFs for my wife's Japanese manga Kindle, using find and ImageMagick convert:
#!/bin/bash
_d="$(pwd)"
echo "$_d"
find . -type d -exec echo "Will convert in the following order: {}" \;
find . -type d -exec echo "Converting: '{}'" \; -exec convert '{}/*.jpg' "$_d/{}.pdf" \;
and the same for PNG
#!/bin/bash
_d="$(pwd)"
echo "$_d"
find . -type d -exec echo "Will convert in the following order: {}" \;
find . -type d -exec echo "Converting: '{}'" \; -exec convert '{}/*.png' "$_d/{}.pdf" \;
Unfortunately I am not able to make one universal script that works for all image formats.
How do I make one script that works for both?
I would also need JPG and PNG as well as jpeg and JPEG.
Thanks in advance.
I wouldn't use find at all, just a loop:
#!/usr/bin/env bash
# enable recursive globs
shopt -s globstar
for dir in **/*/; do
    printf "Converting jpgs in %s\n" "$dir"
    convert "$dir"/*.jpg "$dir/out.pdf"
done
If you want to combine .jpg and .JPG in the same pdf, add nocaseglob to the shopt line. Want to add .jpeg to the mix? Add extglob and change "$dir"/*.jpg to "$dir"/*.@(jpg|jpeg)
You can do more complicated actions if you turn the find exec into a bash function (or even a standalone script).
#!/bin/bash
do_convert()(
    shopt -s nullglob
    for dir in "$@"; do
        files=("$dir"/*.{jpg,JPG,png,PNG,jpeg,JPEG})
        if [[ -z $files ]]; then
            echo 1>&2 "no suitable files in $dir"
            continue
        fi
        echo "Converting $dir"
        convert "${files[@]}" "$dir.pdf"
    done
)
export -f do_convert
pwd
echo "Will convert in the following order:"
find . -type d
# find . -type d -exec bash -c 'do_convert {}' \;
find . -type d -exec bash -c 'do_convert "$@"' -- {} \+
nullglob makes *.xyz return nothing if there is no match, instead of returning the original string unchanged
p/*.{a,b,c} expands into p/*.a p/*.b p/*.c before the * are expanded
x()(...) instead of the more normal x(){...} uses a subshell so we don't have to remember to unset nullglob again or clean up any variable definitions
export -f x makes function x available in subshells
we skip conversion if there are no suitable files
with the slightly more complicated find command, we can reduce the number of invocations of bash (probably doesn't save a great deal in this particular case)
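To make the first two notes concrete, here is a small sketch that counts how many words the same brace-plus-glob pattern produces with and without nullglob. The directory and file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: how nullglob changes the result of a brace-plus-glob pattern.
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.PNG"

# The braces expand first, giving three separate glob patterns.
# Without nullglob, a pattern that matches nothing is kept as a literal string.
shopt -u nullglob
without=("$demo"/*.{jpg,png,PNG})

# With nullglob, an unmatched pattern expands to nothing at all.
shopt -s nullglob
with=("$demo"/*.{jpg,png,PNG})
shopt -u nullglob

printf 'without nullglob: %d words\n' "${#without[@]}"
printf 'with nullglob:    %d words\n' "${#with[@]}"
rm -rf "$demo"
```

Without nullglob the unmatched *.png pattern would be handed to convert as a literal file name, which is why the function turns the option on before building the array.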
how about a one-liner
dry-run
find -name \*.jpg -or -name \*.png | xargs -I xxx echo "xxx =>" xxx.pdf
run
find -name \*.jpg -or -name \*.png | xargs -I xxx convert xxx xxx.pdf
help
-name match name
-or logical or => both jpg and png
xargs maps each input line to a name and executes a command on it
-I selects that placeholder name; it is like {} in find
NOTE
instead of $(pwd) which is a command substitution you can use variable $PWD
xxx maps into a name and xxx.pdf still has the matched extension found by find. which means filename.png becomes filename.png.pdf. If this is not desired, you can sed it
to run convert command in parallel you can use -P 0 with xargs -- see xargs --help
With sed to remove extensions
dry-run
find -name \*.jpg -or -name \*.png | sed 's/\.\(png\|jpg\)$//' | xargs -I xxx echo "xxx =>" xxx.pdf
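As an alternative to sed, bash parameter expansion can strip the extension while the original name stays available for the input side of the command. A minimal sketch follows. pdf_name is a hypothetical helper, and it assumes the file name actually has an extension.

```shell
#!/usr/bin/env bash
# Sketch: derive the PDF name from an image name without calling sed.
# pdf_name is a hypothetical helper; it assumes the name ends in ".something".
pdf_name() {
    local image=$1
    # ${image%.*} removes the shortest trailing ".something", i.e. the last extension.
    printf '%s.pdf\n' "${image%.*}"
}

pdf_name './vol 1/page01.png'
pdf_name './vol 1/page02.JPEG'
```

Because only the last extension is removed, the case of the extension does not matter and no list of known extensions is needed.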
@shawn Your solution works. As I stated in the comments, I am just too stupid to name the resulting pdf properly (after the folder name) and save it in the script caller's directory. Nevertheless, it solves my case-insensitive jpg, jpeg, png problem just fine.
Here is shawn's solution:
#!/bin/bash
# enable recursive globs
shopt -s globstar nocaseglob extglob
for dir in **/*/; do
    printf "Converting (jpg|jpeg|png) in %s\n" "$dir"
    convert "$dir"/*.@(jpg|jpeg|png) "$dir/out.pdf"
done
@jhnc Your solution works out of the box. It does exactly what I intended, and I really like calling functions, or even standalone scripts, to handle more complexity. One drawback is that I cannot Ctrl-C the process. Is that because it is threaded, or because it runs in a subshell? I think you were missing an exit statement at the end of the function; it never stopped.
#!/bin/bash
do_convert()(
    shopt -s nullglob
    for dir in "$@"; do
        files=("$dir"/*.{jpg,JPG,png,PNG,jpeg,JPEG})
        if [[ -z $files ]]; then
            echo 1>&2 "no suitable files in $dir"
            continue
        fi
        echo "Converting $dir"
        convert "${files[@]}" "$dir.pdf"
    done
    exit
)
export -f do_convert
pwd
echo "Will convert in the following order:"
find . -type d
# find . -type d -exec bash -c 'do_convert {}' \;
find . -type d -exec bash -c 'do_convert "$@"' -- {} \+
@everyone else: it's already after midnight again. I guess this is a trivial question for you guys, and I am very grateful for all your answers, but I didn't have the time to try everything.
I find Linux bash very challenging.
There are a lot of ways to skin this cat. My thought is:
for F in `find . -type f -print`
do
    # -b prints only the MIME type, without the file name prefix
    TYPE=`file -b --mime-type "$F"`
    if [ "$TYPE" = image/png ]
    then
        : ## do png conversion here
    elif [ "$TYPE" = image/jpeg ]
    then
        : ## do jpg conversion here
    fi
done

How to find a list of files that are of specific extension but do not contain certain characters in their file name?

I have a folder with files that have extensions, such as .txt, .sh and .out.
However, I want a list of files that have only .txt extension, with the file names not containing certain characters.
For example, the .txt files are named L-003_45.txt and so on, all the way up to L-003_70.txt. Some files have a change in the L-003 part to, let's say, L-004, creating duplicates of, let's say, file 45, so both L-003_45.txt and L-004_45.txt exist. I want to get a list of text files that don't have 45 in their name.
How would I do that?
I tried with find and ls and succeeded but I would like to know how to do a for loop instead.
I tried:
for FILE in *.txt; do ls -I '*45.txt'; done but it failed.
Would be grateful for the help!
Or you can use Bash's extended globbing:
#!/usr/bin/env bash
# Enables extended globbing
shopt -s extglob
# Prevents iterating over the pattern itself if no match is found
shopt -s nullglob
# Iterates over files not having 45 or 57 before .txt
for file in !(*@(45|57)).txt; do
    printf '%s\n' "$file"
done
I would advise you to use the find command to find all files with the required extensions, and then filter out the ones with the "strange" characters. For example, to find the files by extension:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out"
... and now, to hide the ones with "45" in the name, you can do:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45"
... and if you want neither "45" nor "56", you can do:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45" | grep -v "56"
Explanation:
-o stands for OR
grep -v stands for "--invert-match" (it hides the matching lines)
Setup:
$ touch L-004_23.txt L-003_45.txt L-004_45.txt L-003_70.txt
$ ls -1 L*txt
L-003_45.txt
L-003_70.txt
L-004_23.txt
L-004_45.txt
One idea, using ! to negate a criterion:
$ find . -name "*.txt" ! -name "*_45.txt"
./L-003_70.txt
./L-004_23.txt
Feeding the find results to a while loop, eg:
while read -r file
do
    echo "file: ${file}"
done < <(find . -name "*.txt" ! -name "*_45.txt")
This generates:
file: ./L-003_70.txt
file: ./L-004_23.txt
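If the file names might contain spaces or other odd characters, the same loop can read NUL-separated names instead of lines. This is a sketch under my own assumptions: -print0 and sort -z are available in GNU and BSD tools, and the demo files (including the one with a space) are made up.

```shell
#!/usr/bin/env bash
# Sketch: feed find results to a while loop using NUL separators,
# so names with spaces (or even newlines) arrive intact.
demo=$(mktemp -d)
touch "$demo/L-003_45.txt" "$demo/L-003_70.txt" "$demo/L-004 23.txt"

kept=()
while IFS= read -r -d '' file; do
    kept+=("${file##*/}")   # keep only the base name for display
done < <(find "$demo" -name '*.txt' ! -name '*_45.txt' -print0 | sort -z)

printf 'file: %s\n' "${kept[@]}"
rm -rf "$demo"
```

The loop runs in the current shell because of the process substitution, so the kept array is still filled after the loop ends.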
The proposed solution with extglob is a very good one. In case you need to exclude more than one pattern you can also test and continue. Example to exclude all *45.txt and *57.txt:
declare -a excludes=("45" "57")
for f in *.txt; do
    for e in "${excludes[@]}"; do
        [[ "$f" == *"$e.txt" ]] && continue 2
    done
    printf '%s\n' "$f"
done

Bash script to find files in a list, copy them to dest, print files not found

I would like to build on the answer I found here: Bash script to find specific files in a hierarchy of files
find $dir -name $name -exec scp {} $destination \;
I have a file with a list of file names. I need to find those files on a backup disk, copy the files that are found to a destination folder, and lastly print the names of the files that could not be found to a new file.
The last step would be helpful so that I wouldn't need to make another list of the files copied and then compare it with the original list.
If the script can make a list of the copied files, do a compare, and then print the differences, that's exactly what's required. Alternatively, find could print to a file each time it can't find a file.
Assuming that your list is separated by newlines, something like this should work:
#!/bin/bash
dir=someWhere
dest=someWhereElse
toCopyList=filesomewhere
notCopied=filesomewhereElse
while IFS= read -r line; do
    find "$dir" -name "$line" -exec cp '{}' "$dest" \; -printf "%f\n"
done < "$toCopyList" > cpList
#sed -i 's#'$dir'/##' cpList
# I used # instead of / in sed to not confuse sed with / in $dir
# Also, I assumed the string in $dir does not end with a /
cat cpList "$toCopyList" | sort | uniq -c | sed -nr '/^ +1/s/^ +1 +(.*)/\1/p' > "$notCopied"
# Will not work if you give wildcards in your "toCopyList"
Hope it helps.
while IFS= read -r fname ; do
    find /FROM/WHERE/TO/COPY/ \
        -type f \
        -name "$fname" \
        -exec cp \{\} /DESTINATION/DIR/ \; 2>/dev/null
    # find exits 0 even when nothing matches, so test its output instead
    find /DESTINATION/DIR/ \
        -type f \
        -name "$fname" 2>/dev/null | grep -q . || \
        echo "$fname"
done < FILESTOCOPY > MISSEDFILES
Will do.
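Both approaches can also be folded into a single pass that decides per name whether it was found. The following is a minimal sketch, not a drop-in replacement. copy_listed and its four arguments are names I made up, -quit is a GNU/BSD find extension, and each list entry is still treated as a -name pattern, so wildcards in the list would match more than one literal name.

```shell
#!/usr/bin/env bash
# Sketch: copy every listed file that exists under SRC to DEST and
# record the names that were not found.
# copy_listed SRC DEST LIST MISSING is a hypothetical helper.
copy_listed() {
    local src=$1 dest=$2 list=$3 missing=$4 name found
    : > "$missing"                      # start with an empty "missing" file
    while IFS= read -r name; do
        [ -n "$name" ] || continue      # skip blank lines in the list
        # -print -quit stops at the first hit (GNU/BSD find).
        found=$(find "$src" -type f -name "$name" -print -quit)
        if [ -n "$found" ]; then
            cp -- "$found" "$dest"/
        else
            printf '%s\n' "$name" >> "$missing"
        fi
    done < "$list"
}

# Demo on throwaway data.
demo=$(mktemp -d)
mkdir -p "$demo/backup/sub" "$demo/dest"
touch "$demo/backup/a.txt" "$demo/backup/sub/b c.txt"
printf '%s\n' 'a.txt' 'b c.txt' 'gone.txt' > "$demo/list"

copy_listed "$demo/backup" "$demo/dest" "$demo/list" "$demo/missing"
missing=$(cat "$demo/missing")
copied=$(ls "$demo/dest" | grep -c .)
printf 'copied %s file(s), missing: %s\n' "$copied" "$missing"
rm -rf "$demo"
```

Testing the output of find rather than its exit status matters here, because find reports success even when nothing matched.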

A bash script to run a program for directories that do not have a certain file

I need a Bash script to execute a program for all directories that do not have a specific file, and to create the output file in the same directory. This program needs an input file, which exists in every directory with the name *.DNA.fasta. Suppose I have the following directories, which may also contain subdirectories:
dir1/a.protein.fasta
dir2/b.protein.fasta
dir3/anyfile
dir4/x.orf.fasta
I started by finding the directories that don't have that specific file, whose name is *.protein.fasta.
In this case I want dir3 and dir4 to be listed (since they do not contain *.protein.fasta).
I have tried this code:
find . -maxdepth 1 -type d \! -exec test -e '{}/*protein.fasta' \; -print
but it seems I missed something; it does not work.
I also do not know how to proceed with the rest of the task.
This is a tricky one.
I can't think of a good solution. But here's a solution, nevertheless. Note that this is guaranteed not to work if your directory or file names contain newlines, and it's not guaranteed to work if they contain other special characters. (I've only tested with the samples in your question.)
Also, I haven't included a -maxdepth because you said you need to search subdirectories too.
#!/bin/bash
# Create an associative array
declare -A excludes
# Build an associative array of directories containing the file
while IFS= read -r line; do
    excludes[$(dirname "$line")]=1
    echo "excluded: $(dirname "$line")" >&2
done <<EOT
$(find . -name "*protein.fasta" -print)
EOT
# Walk through all directories, print only those not in array
find . -type d \
| while IFS= read -r line ; do
    if [[ ! ${excludes[$line]} ]]; then
        echo "$line"
    fi
done
For me, this returns:
.
./dir3
./dir4
All of which are directories that do not contain a file matching *.protein.fasta. Of course, you can replace the last echo "$line" with whatever you need to do with these directories.
Alternately:
If what you're really looking for is just the list of top-level directories that do not contain the matching file in any subdirectory, the following bash one-liner may be sufficient:
for i in *; do test -d "$i" && ( find "$i" -name '*protein.fasta' | grep -q . || echo "$i" ); done
#!/bin/bash
for dir in *; do
    test -d "$dir" && ( find "$dir" -name '*protein.fasta' | grep -q . || Programfoo "$dir/$dir.DNA.fasta" )
done
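The same test can be packaged as a function that only prints the directory names, so the listing can be checked before any program is run on it. This is a sketch with made-up names: dirs_without is hypothetical, and the demo tree mirrors the example from the question with one nested match added.

```shell
#!/usr/bin/env bash
# Sketch: print the top-level directories under ROOT that contain no file
# matching PATTERN at any depth. dirs_without is a hypothetical helper.
dirs_without() {
    local root=$1 pattern=$2 dir
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue       # the glob may stay literal if ROOT is empty
        # grep -q . succeeds as soon as find prints anything at all.
        if ! find "$dir" -type f -name "$pattern" | grep -q .; then
            basename "$dir"
        fi
    done
}

# Demo tree based on the directories in the question.
demo=$(mktemp -d)
mkdir -p "$demo/dir1" "$demo/dir2/nested" "$demo/dir3" "$demo/dir4"
touch "$demo/dir1/a.protein.fasta" "$demo/dir2/nested/b.protein.fasta" \
      "$demo/dir3/anyfile" "$demo/dir4/x.orf.fasta"

result=$(dirs_without "$demo" '*protein.fasta')
printf '%s\n' "$result"
rm -rf "$demo"
```

Once the list looks right, the basename line could be swapped for the real program call.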

Run FFmpeg from Shell Script

I have found a useful shell script that shows all files in a directory recursively.
Where it prints the file name echo "$i"; #Display File name.
I would instead like to run an ffmpeg command on non MP3 files, how can I do this? I have very limited knowledge of shell scripts so I appreciate if I was spoon fed! :)
//if file is NOT MP3
ffmpeg -i [the_file] -sameq [same_file_name_with_mp3_extension]
//delete old file
Here is the shell script for reference.
DIR="."
function list_files()
{
    if !(test -d "$1")
    then echo $1; return;
    fi
    cd "$1"
    echo; echo `pwd`:; #Display Directory name
    for i in *
    do
        if test -d "$i" #if directory
        then
            list_files "$i" #recursively list files
            cd ..
        else
            echo "$i"; #Display File name
        fi
    done
}
if [ $# -eq 0 ]
then list_files .
    exit 0
fi
for i in $*
do
    DIR="$1"
    list_files "$DIR"
    shift 1 #To read next directory/file name
done
You can do the same with a find one-liner. Assuming the files you want to process are all wav:
find /path/ -type f -name "*wav" -exec ffmpeg -i {} -sameq {}.mp3 \;
If you want to find "rm" files and delete them after conversion, chain a second -exec, which only runs when ffmpeg exits successfully:
find /path/ -type f -name "*.rm" -exec ffmpeg -i {} -sameq {}.mp3 \; -exec rm {} \;
That said, if you want to do it with the shell script you showed, take the line that says
echo "$i";
replace it with this:
ffmpeg -i "$i" -sameq "$i".mp3
$i is a variable. A few lines up, you have:
for i in *
This basically means: for every element in * (which stands for all files in the current directory; this is called a "shell expansion"), put the name of the element/file in the variable i, and then execute all the code between "do" and "done". So for each iteration, i will contain the name of one of the files in this directory.
There's also a section that tests whether i is a directory and if so, it recursively lists its contents.
A quick final note: the \; at the end of the find command IS significant and it NEEDS to have a space before the backslash, otherwise it won't work.
Your shell script seems to be essentially ls -1R, so it's probably easier to just use that. As for running ffmpeg on non-MP3 files, it's probably easier to use find than to write a whole shell script. Assuming you're identifying MP3 files by their extension:
find your-path -type f -not -name "*.mp3" -exec ffmpeg -i '{}' -sameq '{}.mp3' \;
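The find commands above name the output file.wav.mp3. If you would rather replace the extension and see what would happen before touching anything, something like the following sketch might help. mp3_name and convert_tree are hypothetical helpers, the default mode only prints the commands, and I left out -sameq because I am not sure every ffmpeg build still accepts it.

```shell
#!/usr/bin/env bash
# Sketch: plan (or run) ffmpeg conversions for every non-MP3 file under a directory.
# mp3_name and convert_tree are hypothetical helpers.
mp3_name() {
    local dir base
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    # Strip only the last extension of the base name, then add .mp3.
    printf '%s/%s.mp3\n' "$dir" "${base%.*}"
}

convert_tree() {
    local root=$1 mode=${2:-dry-run} src dst
    while IFS= read -r -d '' src; do
        dst=$(mp3_name "$src")
        if [ "$mode" = run ]; then
            # -nostdin keeps ffmpeg from reading the file list on stdin;
            # the source is removed only if the conversion succeeded.
            ffmpeg -nostdin -i "$src" "$dst" && rm -- "$src"
        else
            printf 'ffmpeg -i %q %q\n' "$src" "$dst"
        fi
    done < <(find "$root" -type f ! -iname '*.mp3' -print0)
}

# Dry-run demo on throwaway files; nothing is converted or deleted.
demo=$(mktemp -d)
touch "$demo/a.wav" "$demo/b.mp3"
plan=$(convert_tree "$demo")
printf '%s\n' "$plan"
rm -rf "$demo"
```

Calling convert_tree with a second argument of run would perform the conversions for real, so it is worth reading the dry-run output first.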
