Can I get the name of the file currently being read in a for loop? - linux

I want to write a script that takes a word as an argument and searches the files in the current directory and its subdirectories for that word. If the word is found in any of the files, the script should echo a message containing the file name and the line the word is found on.
This is what I have so far, but I can't find a way to store the name of the file being read or the line number.
word=$1
for var in $(grep -R "$word *")
do
filename=$(find . -type f -name "*") ------- //this doesnt work
linenmbr=$(grep -n "$ord" file) ----------- //this doesnt work
echo found $word in $filename on line number $linenmbr
done

In bash, any time you are looping, you want to avoid calling utilities (e.g. grep and find) within the loop. That is inefficient because every call spawns a separate process on every iteration (for 10 iterations, that is 20 additional processes; it adds up quickly). In your case, you call grep to feed the loop, and then call grep again within the loop, as well as find.
You should think of a way to call grep (or a utility that will provide the needed information) only once, and then parse the output.
If you do want to use grep, then calling grep -rn within a process substitution that feeds a while loop is probably as good as you are going to get. You can then use the bash builtin parameter expansions to isolate the filename and line number, which is about as efficient as bash can get, e.g.
#!/bin/bash
[ -z "$1" ] && { ## validate at least 1 input given
printf "error: insufficient input.\nusage: %s srch_term\n" "${0##*/}"
exit 1
}
while read -r line; do ## read each line of grep output
fn="${line%%:*}" ## isolate filename
no="${line#*:}" ## remove filename
no="${no%%:*}" ## isolate number
printf "found %s in %s on line number %d\n" "$1" "$fn" "$no"
done < <(grep -rn "$1") ## grep in process substitution
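To see what the three parameter expansions in the loop do, here is a minimal sketch that applies them to one hard-coded line in the format grep -rn prints (file:line:text). The sample path and text are made up, and split_grep_line is a hypothetical helper name.

```shell
# bash: split one line of `grep -rn` output (file:line:text) into its parts.
split_grep_line() {
    local line=$1
    local fn="${line%%:*}"      # everything before the first ':'  -> filename
    local rest="${line#*:}"     # drop the filename and its ':'
    local no="${rest%%:*}"      # everything before the next ':'   -> line number
    printf '%s %s\n' "$fn" "$no"
}

split_grep_line './docs/notes.txt:12:some text: with a colon'
```

A file name that itself contains a colon would be split in the wrong place, so this approach assumes file names without colons.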
Choosing A More Efficient Method
If you can accomplish what you are attempting with one of the stream editing tools, e.g. awk or sed, you can often isolate the wanted information much faster than a shell loop can. For example, using awk and setting globstar you could do something similar to the following:
#!/bin/bash
shopt -s globstar ## set globstar
[ -z "$1" ] && { ## validate at least 1 input given
printf "error: insufficient input.\nusage: %s srch_term\n" "${0##*/}"
exit 1
}
## find all matching files and line numbers
awk -v word="$1" '$0 ~ word {
print "found",word,"in",FILENAME,"on line number",FNR; next
}' **/* 2>/dev/null
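If the search term should be treated as a literal string rather than a regular expression, and directories should be skipped rather than silenced with 2>/dev/null, one option is to let find pick the regular files and have awk test each line with index(). This is a sketch under those assumptions, a variation rather than the approach above; search_word is a hypothetical name.

```shell
# Report every line under the current directory that contains the literal word $1.
search_word() {
    find . -type f -exec awk -v word="$1" '
        index($0, word) {            # literal substring test, no regex
            print "found", word, "in", FILENAME, "on line number", FNR
        }' {} +
}

# Usage: search_word needle
```

Note that awk -v still processes backslash escapes in the value, so a search term containing backslashes would need extra care.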
Give both a try and let me know if you have further questions.
If you want to compare and ensure both are producing the same output, you can use diff to confirm, e.g.
$ diff <(grepscript.sh | sort) <(awkscript.sh | sort)
(if no difference is reported, the output is the same)

Related

How to move files using the result as condition after grep command

I have 2 files that I needed to grep in a separate file.
The two files are in this directory /var/list
TB.1234.txt
TB.135325.txt
I have to grep them in another file in another directory which is in /var/sup/. I used the command below:
for i in TB.*; do grep "$i" /var/sup/logs.txt; done
What I want to do is this: if the result of the grep command contains the word "ERROR", the file found in /var/list should be moved to another directory, /var/last.
For example, if I grep for TB.1234.txt in /var/sup/logs.txt and the result is like this:
ERROR: TB.1234.txt
then TB.1234.txt should be moved to /var/last.
Please help. I don't know how to construct the logic to move the files. I am stuck at the command I provided; I also tried using two greps in a for loop, but I get an error.
I am new to coding and would really appreciate any help and suggestions. Thank you so much.
If you are asking how to move files which contain "ERROR", this should be extremely straightforward.
for file in TB.*; do
grep -q 'ERROR' "$file" &&
mv "$file" /var/last/
done
The notation this && that is a convenient shorthand for
if this; then
that
fi
The -q option to grep says to not print the matches, and quit as soon as you find one. Like all well-defined commands, grep sets its exit code to reflect whether it succeeded (the status is visible in $?, but usually you would not examine it directly; perhaps see also Why is testing ”$?” to see if a command succeeded or not, an anti-pattern?)
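A short sketch of branching on grep's exit status directly, with no inspection of $?. The function name has_error and the file name in the comments are hypothetical.

```shell
# Succeeds (exit status 0) if the file $1 contains the string ERROR.
has_error() {
    grep -q 'ERROR' "$1"
}

# Both forms below branch on the exit status of has_error:
#   has_error some.log && echo "some.log has errors"
#   if has_error some.log; then echo "some.log has errors"; fi
```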
Your question is rather unclear, but if you want to find either of the matching files in a third file, perhaps something like
awk 'FNR==1 && (++n < ARGC-1) { a[n] = FILENAME; nextfile }
/ERROR/ { for(j=1; j<=n; ++j) if ($0 ~ a[j]) b[a[j]]++ }
END { for(f in b) print f }' TB*.txt /var/sup/logs.txt |
xargs -r mv -t /var/last/
This is somewhat inefficient in that it will read all the lines in the log file, and brittle in that it will only handle file names which do not contain newlines. (The latter restriction is probably unimportant here, as you are looking for file names which occur on the same line as the string "ERROR" in the first place.)
In some more detail, the Awk script collects the wildcard matches into the array a, then processes all lines in the last file, looking for ones with "ERROR" in them. On these lines, it checks if any of the file names in a are also found, and if so, also adds them to b. When all lines have been processed, print the entries in b, which are then piped to a simple shell command to move them.
xargs is a neat command to read some arguments from standard input, and run another command with those arguments added to its command line. The -r option says to not run the other command if there are no arguments.
(mv -t is a GNU extension; it's convenient, but not crucial to have here. If you need portable code, you could replace xargs with a simple while read -r loop.)
The FNR==1 condition requires that the input files are non-empty.
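For the portable alternative to xargs mv -t mentioned above, a while read -r loop could look roughly like this. It is a sketch that assumes one file name per line on standard input; move_listed is a hypothetical name.

```shell
# Read file names from stdin, one per line, and move each into directory $1.
move_listed() {
    local dest=$1 name
    while IFS= read -r name; do
        if [ -e "$name" ]; then      # skip names that no longer exist
            mv -- "$name" "$dest"/
        fi
    done
}
```

The output of the Awk script would then be piped into move_listed /var/last instead of into xargs.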
If the text file is small, or you expect a match near its beginning most of the time, perhaps just live with grepping it multiple times:
for file in TB.*; do
grep -Eq "ERROR.*$file|$file.*ERROR" /var/sup/logs.txt &&
mv "$file" /var/last/
done
Notice how we now need double quotes, not single, around the regular expression so that the variable $file gets substituted in the string.
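One caveat with interpolating $file into a regular expression is that the dots in a name such as TB.1234.txt match any character. A sketch that avoids this by chaining two fixed-string greps; file_has_error is a hypothetical name, and the log path is passed as a parameter.

```shell
# Succeeds if some line of log file $2 contains both the literal name $1 and ERROR.
file_has_error() {
    grep -F -e "$1" -- "$2" | grep -q -F 'ERROR'
}

# Usage:
#   for file in TB.*; do
#       file_has_error "$file" /var/sup/logs.txt && mv "$file" /var/last/
#   done
```

This costs one extra grep process per file, in exchange for matching the file name literally.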
grep has an -l switch, showing only the filename of the file which contains a pattern. It should not be too difficult to write something like (this is pseudocode, it won't work, it's just for giving you an idea):
if $(grep -l "ERROR" <directory> | wc -l) > 0
then foreach (f in $(grep -l "ERROR")
do cp f <destination>
end if
The wc -l is to check if there are any files which contain the word "ERROR". If not, nothing needs to be done.
Edit after Tripleee's comment:
My proposal can be simplified as:
if grep -lq "ERROR" TB.*;
then foreach (f in $(grep -l "ERROR")
do cp f <destination>
end if
Edit after Tripleee's second comment:
This is even shorter:
for f in $(grep -l "ERROR" TB.*);
do cp "$f" destination;
done

Bash script that counts and prints out the files that start with a specific letter

How do I print out all the files of the current directory that start with the letter "k"? I also need to count these files.
I tried some methods but i only got errors or wrong outputs. Really stuck on this as a newbie in bash.
Try this Shellcheck-clean pure POSIX shell code:
count=0
for file in k*; do
if [ -f "$file" ]; then
printf '%s\n' "$file"
count=$((count+1))
fi
done
printf 'count=%d\n' "$count"
It works correctly (just prints count=0) when run in a directory that contains nothing starting with 'k'.
It doesn't count directories or other non-files (e.g. fifos).
It counts symlinks to files, but not broken symlinks or symlinks to non-files.
It works with 'bash' and 'dash', and should work with any POSIX-compliant shell.
Here is a pure Bash solution.
files=(k*)
printf "%s\n" "${files[@]}"
echo "${#files[@]} files total"
The shell expands the wildcard k* into the array, thus populating it with a list of matching files. We then print out the array's elements, and their count.
The use of an array avoids the various problems with metacharacters in file names (see e.g. https://mywiki.wooledge.org/BashFAQ/020), though the syntax is slightly hard on the eyes.
As remarked by pjh, this will include any matching directories in the count, and fail in odd ways if there are no matches (unless you set nullglob to true). If avoiding directories is important, you basically have to get the directories into a separate array and exclude those.
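A sketch of how the two caveats could be handled in Bash: nullglob for the no-match case, and a -f test to leave out directories. The function name list_k_files is hypothetical; its body runs in a subshell so that the shopt setting does not leak into the caller.

```shell
# bash: print regular files whose names start with "k", then their count.
list_k_files() (
    files=()
    shopt -s nullglob                # an unmatched k* expands to nothing
    for f in k*; do
        if [[ -f $f ]]; then         # keep regular files (and symlinks to them)
            files+=("$f")
        fi
    done
    if (( ${#files[@]} > 0 )); then
        printf '%s\n' "${files[@]}"
    fi
    echo "${#files[@]} files total"
)
```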
To repeat what Dominique also said, avoid parsing ls output.
Demo of this and various other candidate solutions:
https://ideone.com/XxwTxB
To start with: never parse the output of the ls command, but use find instead.
As find descends into all subdirectories by default, you may need to limit it with the -maxdepth switch, using the value 1.
In order to count a number of results, you just count the number of lines in your output (in case your output is shown as one piece of output per line, which is the case of the find command). Counting a number of lines is done using the wc -l command.
So, this comes down to the following command:
find ./ -maxdepth 1 -type f -name "k*" | wc -l
Have fun!
This should work as well:
VAR="k"
COUNT=$(ls -p ${VAR}* | grep -v ":" | wc -w)
echo -e "Total number of files: ${COUNT}\n" 1>&2
echo -e "Files,that begin with ${VAR} are:\n$(ls -p ${VAR}* | grep -v ":" )" 1>&2

How can we increment a string variable within a for loop

#! /bin/bash
for i in $(ls);
do
j=1
echo "$i"
Current output (not what I expect):
autodeploy
bin
config
console-ext
edit.lok
I need output like the numbered directory list below, so that if I give 2 as input it prints "bin":
1.)autodeploy
2.)bin
3.)config
4.)console-ext
5.)edit.lok
If I give 2 as input, it should print "bin".
Per BashFAQ #1, a while read loop is the correct way to read content line-by-line:
#!/usr/bin/env bash
enumerate() {
local line i
i=0
while IFS= read -r line; do
((++i))
printf '%d.) %s\n' "$i" "$line"
done
}
ls | enumerate
However, ls is not an appropriate tool for programmatic use; the above is acceptable if the results of ls are only for human consumption, but not if they're going to be parsed by a machine -- see Why you shouldn't parse the output of ls(1).
If you want to list files and let the user choose among them by number, pass the results of a glob expression to select:
select filename in *; do
echo "$filename" && break
done
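If the number arrives as a script argument rather than interactively, a glob stored in an array can be indexed directly. A minimal sketch, assuming a non-empty current directory; list_entries and nth_entry are hypothetical names.

```shell
# bash: print a numbered list of the entries in the current directory.
list_entries() {
    local entries=(*) i
    for i in "${!entries[@]}"; do
        printf '%d.) %s\n' "$((i + 1))" "${entries[i]}"
    done
}

# bash: print the entry with the given 1-based number, or fail if out of range.
nth_entry() {
    local entries=(*) n=$1
    [[ $n =~ ^[1-9][0-9]*$ ]] || return 1          # positive integer only
    (( n <= ${#entries[@]} )) || return 1          # must not exceed the list
    printf '%s\n' "${entries[n-1]}"
}
```

With the directory from the question, nth_entry 2 would print bin, since globs expand in sorted order.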
I don't understand what you mean in your question by like Directory list, but following your example, you do not need to write a loop:
ls|nl -s '.)' -w 1
If you want to avoid ls, you can do the following, but be careful: this only works if the directory entries do not contain whitespace, because fmt would break such a name across two lines:
echo *|fmt -w 1 |nl -s '.)' -w 1

Why am I getting command not found error on numeric comparison?

I am trying to parse each line of a file and look for a particular string. The script seems to be doing its intended job, however, in parallel it tries to execute the if command on line 6:
#!/bin/bash
for line in $(cat $1)
do
echo $line | grep -e "Oct/2015"
if($?==0); then
echo "current line is: $line"
fi
done
and I get the following (my script is readlines.sh)
./readlines.sh: line 6: 0==0: command not found
First: As Mr. Llama says, you need more spaces. Right now your script tries to look for a file named something like /usr/bin/0==0 to run. Instead:
[ "$?" -eq 0 ] # POSIX-compliant numeric comparison
[ "$?" = 0 ] # POSIX-compliant string comparison
(( $? == 0 )) # bash-extended numeric comparison
Second: Don't test $? at all in this case. In fact, you don't even have good cause to use grep; the following is both more efficient (because it uses only functionality built into bash and requires no invocation of external commands) and more readable:
if [[ $line = *"Oct/2015"* ]]; then
echo "Current line is: $line"
fi
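For shells without [[ ]], the same substring test can be written with case, which is part of POSIX sh. A sketch; contains_date is a hypothetical helper name.

```shell
# POSIX sh: succeeds if the text in $1 contains the literal string Oct/2015.
contains_date() {
    case $1 in
        *"Oct/2015"*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Usage:
#   if contains_date "$line"; then echo "Current line is: $line"; fi
```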
If you really do need to use grep, write it like so:
if echo "$line" | grep -q "Oct/2015"; then
echo "Current line is: $line"
fi
That way if operates directly on the pipeline's exit status, rather than running a second command testing $? and operating on that command's exit status.
@Charles Duffy has a good answer which I have up-voted as correct (and it is), but here's a detailed, line by line breakdown of your script and the correct thing to do for each part of it.
for line in $(cat $1)
As I noted in my comment elsewhere, this should be done as a while read construct instead of a for cat construct.
The for ... $(cat) construct word-splits its input, so every space in the file starts a new "line" in the output.
All empty lines will be skipped.
In addition, when you cat $1 the variable should be quoted. If it is not quoted, spaces and other less-usual characters appearing in the file name will cause the cat to fail and the loop will not process the file.
The complete line would read:
while IFS= read -r line
An illustrative example of the tradeoffs can be found here. The linked test script follows. I tried to include an indication of why IFS= and -r are important.
#!/bin/bash
mkdir -p /tmp/testcase
pushd /tmp/testcase >/dev/null
printf '%s\n' '' two 'three three' '' ' five with leading spaces' 'c:\some\dos\path' '' > testfile
printf '\nwc -l testfile:\n'
wc -l testfile
printf '\n\nfor line in $(cat) ... \n\n'
let n=1
for line in $(cat testfile) ; do
echo line $n: "$line"
let n++
done
printf '\n\nfor line in "$(cat)" ... \n\n'
let n=1
for line in "$(cat testfile)" ; do
echo line $n: "$line"
let n++
done
let n=1
printf '\n\nwhile read ... \n\n'
while read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read ... \n\n'
let n=1
while IFS= read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read -r ... \n\n'
let n=1
while IFS= read -r line ; do
echo line $n: "$line"
let n++
done < testfile
rm -- testfile
popd >/dev/null
rmdir /tmp/testcase
Note that this is a bash-heavy example. read -r is specified by POSIX and widely available, but let, pushd, and popd are not portable. On to the next line of your script.
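A condensed sketch of the same comparison, reading one made-up input line three ways so the effect of IFS= and -r is visible side by side. The function name show_read_variants is hypothetical.

```shell
# bash: read the same input three ways and show what each variant keeps.
show_read_variants() {
    local input=$1 line
    read line <<<"$input"              # trims whitespace, eats backslashes
    printf 'read:         [%s]\n' "$line"
    IFS= read line <<<"$input"         # keeps whitespace, eats backslashes
    printf 'IFS= read:    [%s]\n' "$line"
    IFS= read -r line <<<"$input"      # keeps both
    printf 'IFS= read -r: [%s]\n' "$line"
}

show_read_variants '  c:\some\path'
```

Only the last variant prints the input unchanged; the first two drop the backslashes, and the first also drops the leading spaces.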
do
As a matter of style I prefer do on the same line as the for or while declaration, but there's no convention on this.
echo $line | grep -e "Oct/2015"
The variable $line should be quoted here. In general, meaning always unless you specifically know better, you should double-quote all expansion--and that means subshells as well as variables. This insulates you from most unexpected shell weirdness.
You declared your shell as bash, which means you will have the "here string" operator <<< available to you. When available, it can be used to avoid the pipe; each element of a pipeline executes in a subshell, which incurs extra overhead and can lead to unexpected behavior if you try to modify variables. This would be written as
grep -e "Oct/2015" <<<"$line"
Note that I have quoted the line expansion.
You have called grep with -e, which is not incorrect but is needless since your pattern does not begin with -. In addition you have full-quoted a string in shell but you don't attempt to expand a variable or use other shell interpolation inside of it. When you don't expect and don't want the contents of a quoted string to be treated as special by the shell you should single quote them. Furthermore, your use of grep is inefficient: because your pattern is a fixed string and not a regular expression you could have used fgrep or grep -F, which does string contains rather than regular expression matching (and is far faster because of this). So this could be
grep -F 'Oct/2015' <<<"$line"
Without altering the behavior.
if($?==0); then
This is the source of your original problem. In shell scripts, words are separated by whitespace and by a few metacharacters such as ( and ). When you write if($?==0), the $? expands, probably to 0, and bash treats the parenthesized part as a subshell in which it tries to run a command called 0==0, which is a legal command name; that is the "command not found" in your error message. What you wanted to do was invoke the if command and give it a command to test, which requires more whitespace. I believe others have covered this sufficiently.
You should never need to test the value of $? in a shell script. The if command exists for branching behavior based on the return code of whatever command you pass to it, so you can inline your grep call and have if check its return code directly, thus:
if grep -F 'Oct/2015' <<<"$line" ; then
Note the generous whitespace around the ; delimiter. I do this because in shell whitespace is usually required and can only sometimes be omitted. Rather than try to remember when you can do which, I recommend an extra space of padding around everything. It's never wrong and can make other mistakes easier to notice.
As others have noted, this grep will print matched lines to stdout, which is probably not something you want. The -q switch, which is specified by POSIX and supported by GNU grep (standard on Linux), suppresses the output from grep:
if grep -q -F 'Oct/2015' <<<"$line" ; then
If you are in an environment with a grep that doesn't know -q, the traditional way to achieve this effect is to redirect stdout to /dev/null:
if printf '%s\n' "$line" | grep -F 'Oct/2015' >/dev/null ; then
In this example I also removed the here string bashism just to show a portable version of this line.
echo "current line is: $line"
There is nothing wrong with this line of your script, except that although echo is standard implementations vary to such an extent that it's not possible to absolutely rely on its behavior. You can use printf anywhere you would use echo and you can be fairly confident of what it will print. Even printf has some caveats: Some uncommon escape sequences are not evenly supported. See mascheck for details.
printf 'current line is: %s\n' "$line"
Note the explicit newline at the end; printf doesn't add one automatically.
fi
No comment on this line.
done
In the case where you did as I recommended and replaced the for line with a while read construct this line would change to:
done < "$1"
This directs the contents of the file in the $1 variable to the stdin of the while loop, which in turn passes the data to read.
In the interests of clarity I recommend copying the value from $1 into another variable first. That way when you read this line the purpose is more clear.
I hope no one takes great offense at the stylistic choices made above, which I have attempted to note; there are many ways to do this (but not a great many correct ways).
Be sure to always run interesting snippets through the excellent shellcheck and explain shell when you run into difficulties like this in the future.
And finally, here's everything put together:
#!/bin/bash
input_file="$1"
while IFS= read -r line ; do
if grep -q -F 'Oct/2015' <<<"$line" ; then
printf 'current line is %s\n' "$line"
fi
done < "$input_file"
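Since the loop only filters lines, the same output can also come from a single pass over the file, with no grep call per line. A sketch, assuming the fixed search string from the question and the prefix the question's script prints; print_matching is a hypothetical name.

```shell
# Print every line of file $1 that contains Oct/2015, with a fixed prefix.
print_matching() {
    grep -F 'Oct/2015' -- "$1" | sed 's/^/current line is: /'
}

# Usage: print_matching access.log
```

This starts two processes in total, regardless of how many lines the file has.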
If you like one-liners, you may use AND operator (&&), for example:
echo "$line" | grep -e "Oct/2015" && echo "current line is: $line"
or:
grep -qe "Oct/2015" <<<"$line" && echo "current line is: $line"
Spacing is important in shell scripting.
Also, double-parens is for numerical comparison, not single-parens.
if (( $? == 0 )); then

Attempting to pass two arguments to a called script for a pattern search

I'm having trouble getting a script to do what I want.
I have a script that will search a file for a pattern and print the line numbers and instances of that pattern.
I want to know how to make it print the file name first before it prints the lines found
I also want to know how to write a new script that will call this one and pass two arguments to it.
The first argument being the pattern for grep and the second the location.
If the location is a directory, it will loop and search the pattern on all files in the directory using the script.
#!/bin/bash
if [[ $# -ne 2 ]]
then
echo "error: must provide 2 arguments."
exit -1
fi
if [[ ! -e $2 ]];
then
echo "error: second argument must be a file."
exit -2
fi
echo "------ File =" $2 "------"
grep -ne "$1" "$2"
This is the script i'm using that I need the new one to call. I just got a lot of help from asking a similar question but i'm still kind of lost. I know that I can use the -d command to test for the directory and then use 'for' to loop the command, but exactly how isn't panning out for me.
I think you just want to add the -H option to grep:
-H, --with-filename
Print the file name for each match. This is the default when there is more than one file to search.
grep has an option -r which can help you avoid testing for second argument being a directory and using for loop to iterate all files of that directory.
From the man page:
-R, -r, --recursive
Recursively search subdirectories listed.
It will also print the filename.
Test:
On one file:
[JS웃:~/Temp]$ grep -r '5' t
t:5 10 15
t:10 15 20
On a directory:
[JS웃:~/Temp]$ grep -r '5' perl/
perl//hello.pl:my $age=65;
perl//practice.pl:use v5.10;
perl//practice.pl:@array = (1,2,3,4,5);
perl//temp/person5.pm:#person5.pm
perl//temp/person9.pm: my @date = (localtime)[3,4,5];
perl//text.file:This is line 5
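Putting the two suggestions together, a wrapper that accepts a pattern and a location could be sketched as below. It assumes a grep that supports -r and -H (GNU and BSD grep do; POSIX does not require them), and search_location is a hypothetical name.

```shell
# Search for pattern $1 in location $2, which may be a file or a directory.
search_location() {
    if [ "$#" -ne 2 ]; then
        echo "error: must provide 2 arguments." >&2
        return 1
    fi
    if [ ! -e "$2" ]; then
        echo "error: second argument must be an existing file or directory." >&2
        return 2
    fi
    # -r recurses into directories, -n prints line numbers,
    # -H prints the file name even when only one file is searched
    grep -rnH -e "$1" -- "$2"
}
```

Because grep -r also accepts a plain file, no separate -d test or for loop is needed in this version.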

Resources