Bash == operator in [[ ]] is too smart!

Case in point. I want to know if some set of files have as a first line '------'.
So,
for file in *.txt
do
if [[ `head -1 "$file"` == "------" ]]
then
echo "$file starts with dashes"
fi
done
Thing is, head returns the content with a newline, but "------" does not have a newline.
Why does it work?

The backticks strip the trailing newline. For example:
foo=`echo bar`
echo "<$foo>"
prints
<bar>
even though that first echo printed out "bar" followed by a newline.

Inside [[ ]], bash does not perform word splitting on the result of the command substitution, i.e. head -1 "$file".
The newline is removed by the command substitution itself, which strips all trailing newlines from the captured output.
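To make the newline handling concrete, here is a minimal sketch (the variable names are made up for illustration). It shows that command substitution drops every trailing newline, and that interior whitespace is compared verbatim inside [[ ]]:

```shell
#!/bin/bash
# Command substitution drops all trailing newlines from the captured output.
captured=$(printf 'bar\n\n\n')
printf '<%s>\n' "$captured"

# Inside [[ ]] the expansion is not word-split, so the two interior
# spaces are kept and compared as-is.
spaced=$(printf 'a  b\n')
if [[ $spaced == "a  b" ]]; then match=yes; else match=no; fi
printf 'length of captured: %d, interior spaces kept: %s\n' "${#captured}" "$match"
```

Only trailing newlines are removed; a newline in the middle of the output would survive.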

Related

How Can i validate the first character of a file in shell script?

I am trying to get the script to find the very first character '\' (backslash) and validate that it is the first character of the script; otherwise it should not run.
My file is like test.txt (it intentionally has a lot of leading blank lines):
\c
select * from x;
I came up with this and it works:
cut -c -1 test.txt | grep -w '\\'
However if i changed a file a bit suppose like:
select * from x;
\c
it still reports that the file contains '\', but I want this to fail, because the first character must always be '\' no matter which line the content starts on.
I have tried using head/cut but was not able to validate this. Please suggest some ideas.
This uses a regex to check whether the first line with content starts with \. If that line does not start with \, the script exits.
backslash_pat='^\\'
# a-Z is not a valid range in the C locale, so spell out both cases
num_lett_pat='[0-9a-zA-Z*]'
while IFS= read -r line; do
    if [[ $line =~ $backslash_pat ]]
    then
        echo "$line"
        exit
    elif [[ $line =~ $num_lett_pat ]]
    then
        # first line with content does not start with a backslash
        exit 1
    fi
done < test.txt
You can check whether the first character is a \ (I understand this is what you want) using dd:
if [[ $(dd if=test.txt count=1 bs=1 2>/dev/null| xxd -c1 -ps) == 5c ]]
then
echo 'found'
fi
This also works for binary files (i.e. files not composed of lines).
Check man dd and man xxd to understand the command line options used.
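If you would rather stay inside bash, a similar first-byte check can be sketched with the read builtin. This is only an illustration: the function name starts_with_backslash and the two sample files are made up, and like the dd version it looks at the very first byte of the file, not at the first non-blank line.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the first character of file $1 is a backslash.
starts_with_backslash() {
    local first
    # -n 1 reads a single character; IFS= and -r keep it verbatim.
    IFS= read -r -n 1 first < "$1"
    [[ $first == '\' ]]
}

tmpdir=$(mktemp -d)
printf '\\c\nselect * from x;\n' > "$tmpdir/good.sql"
printf 'select * from x;\n\\c\n' > "$tmpdir/bad.sql"

if starts_with_backslash "$tmpdir/good.sql"; then good=yes; else good=no; fi
if starts_with_backslash "$tmpdir/bad.sql"; then bad=yes; else bad=no; fi
printf 'good.sql: %s, bad.sql: %s\n' "$good" "$bad"
rm -r -- "$tmpdir"
```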

Loop through a file and sed substitute each line

I have the following bash script:
while IFS= read -r line; do
line=$(echo $line | sed "s/\'/\'\'/")
[[ $line =~ ^\<ID\>(.*) ]] && printf "${BASH_REMATCH[1]}"
done < <(dos2unix < file)
EDITED version of script without dos2unix:
while IFS= read -r line && line=${line%$'\r'}; do
[[ $line =~ ^\<ID\>(.*) ]] && printf "${BASH_REMATCH[1]}"
done < file
I want to substitute every apostrophe in "file" with 2 apostrophes BEFORE I loop through it. How can I do this? I'd be grateful for any suggestions concerning any of the 2 versions.
IMPORTANT
Im NOT allowed to modify the original file!!
This is a job for sed alone:
sed 's/\r$//;s/\'/\'\'/g;s/^<ID>\(.*\)/\1/p;d' < file
The steps are:
sed accepts multiple commands separated by newlines or semicolons, or given as multiple -e options.
s/\r$// removes the CR at the end of each line, like dos2unix.
The g flag added to s/\'/\'\'/ means replace all occurrences in the line; the default is to replace just the first one.
s/^<ID>\(.*\)/\1/ does the equivalent of the bash regex match, and the p flag at the end makes sed print the matching lines right away, because
the d command then deletes the line so it won't get printed by default (you could use the -n option instead).
On a side note, a single-quoted string cannot contain \' (my zsh rejects it, and bash does too), so I'd write it as
sed -n -e 's/\r$//' -e "s/'/''/g" -e 's/^<ID>\(.*\)/\1/p' < file
It is equivalent apart from the quote style, the separate -e options, and -n instead of the final d.
While this is not a "solution" (your question is not clear on what is not working in your code), you certainly should avoid calling sed for each individual line. It is not "wrong" in the sense of producing an incorrect result, but it is so much slower that it should be avoided. There are ways to do it that are both faster and simpler to code.
Do it this way:
while IFS= read -r line; do
[[ $line =~ ^\<ID\>(.*) ]] && printf '%s' "${BASH_REMATCH[1]}"
done < <(dos2unix < file | sed "s/'/''/g")
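For comparison, the apostrophe doubling can also be done with bash parameter expansion alone, which leaves the original file untouched and needs neither sed nor dos2unix. This is a sketch under assumptions: the sample data and the names tmpfile, apos and out are invented for the demonstration.

```shell
#!/bin/bash
# Hypothetical sample input with CRLF line endings.
tmpfile=$(mktemp)
printf "<ID>it's Bob's\r\nskip 'me'\r\n<ID>plain\r\n" > "$tmpfile"

apos="'"
out=()
while IFS= read -r line; do
    line=${line%$'\r'}              # drop a trailing CR, like dos2unix
    line=${line//$apos/$apos$apos}  # double every apostrophe in the line
    if [[ $line =~ ^\<ID\>(.*) ]]; then
        out+=("${BASH_REMATCH[1]}")
    fi
done < "$tmpfile"

printf '%s\n' "${out[@]}"
rm -- "$tmpfile"
```

The ${var//pattern/replacement} form replaces all occurrences, which matches the g flag in the sed command.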

Why am I getting command not found error on numeric comparison?

I am trying to parse each line of a file and look for a particular string. The script seems to be doing its intended job, however, in parallel it tries to execute the if command on line 6:
#!/bin/bash
for line in $(cat $1)
do
echo $line | grep -e "Oct/2015"
if($?==0); then
echo "current line is: $line"
fi
done
and I get the following (my script is readlines.sh)
./readlines.sh: line 6: 0==0: command not found
First: As Mr. Llama says, you need more spaces. Right now your script tries to look for a file named something like /usr/bin/0==0 to run. Instead:
[ "$?" -eq 0 ] # POSIX-compliant numeric comparison
[ "$?" = 0 ] # POSIX-compliant string comparison
(( $? == 0 )) # bash-extended numeric comparison
Second: Don't test $? at all in this case. In fact, you don't even have good cause to use grep; the following is both more efficient (because it uses only functionality built into bash and requires no invocation of external commands) and more readable:
if [[ $line = *"Oct/2015"* ]]; then
echo "Current line is: $line"
fi
If you really do need to use grep, write it like so:
if echo "$line" | grep -q "Oct/2015"; then
echo "Current line is: $line"
fi
That way if operates directly on the pipeline's exit status, rather than running a second command testing $? and operating on that command's exit status.
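As a quick illustration of the built-in pattern match, here is a sketch that counts matching lines from a here-document. The log lines are invented sample data, and the counter name matches is arbitrary.

```shell
#!/bin/bash
matches=0
while IFS= read -r line; do
    # The quoted part is matched literally; the unquoted * are wildcards.
    if [[ $line = *"Oct/2015"* ]]; then
        matches=$((matches + 1))
    fi
done <<'EOF'
10.0.0.1 - - [12/Oct/2015:10:00:00] "GET / HTTP/1.1"
10.0.0.2 - - [03/Nov/2015:11:30:00] "GET /a HTTP/1.1"
10.0.0.3 - - [30/Oct/2015:23:59:59] "GET /b HTTP/1.1"
EOF
printf 'matching lines: %d\n' "$matches"
```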
@Charles Duffy has a good answer which I have up-voted as correct (and it is), but here's a detailed, line by line breakdown of your script and the correct thing to do for each part of it.
for line in $(cat $1)
As I noted in my comment elsewhere this should be done as a while read construct instead of a for cat construct.
This construct will wordsplit each line making spaces in the file separate "lines" in the output.
All empty lines will be skipped.
In addition when you cat $1 the variable should be quoted. If it is not quoted spaces and other less-usual characters appearing in the file name will cause the cat to fail and the loop will not process the file.
The complete line would read:
while IFS= read -r line
An illustrative example of the tradeoffs follows as a test script. I tried to include an indication of why IFS= and -r are important.
#!/bin/bash
mkdir -p /tmp/testcase
pushd /tmp/testcase >/dev/null
printf '%s\n' '' two 'three three' '' ' five with leading spaces' 'c:\some\dos\path' '' > testfile
printf '\nwc -l testfile:\n'
wc -l testfile
printf '\n\nfor line in $(cat) ... \n\n'
let n=1
for line in $(cat testfile) ; do
echo line $n: "$line"
let n++
done
printf '\n\nfor line in "$(cat)" ... \n\n'
let n=1
for line in "$(cat testfile)" ; do
echo line $n: "$line"
let n++
done
let n=1
printf '\n\nwhile read ... \n\n'
while read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read ... \n\n'
let n=1
while IFS= read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read -r ... \n\n'
let n=1
while IFS= read -r line ; do
echo line $n: "$line"
let n++
done < testfile
rm -- testfile
popd >/dev/null
rmdir /tmp/testcase
Note that this is a bash-heavy example. let, pushd and popd are not portable to other shells, for example, although read -r is specified by POSIX. On to the next line of your script.
do
As a matter of style I prefer do on the same line as the for or while declaration, but there's no convention on this.
echo $line | grep -e "Oct/2015"
The variable $line should be quoted here. In general (meaning always, unless you specifically know better) you should double-quote all expansions, and that means command substitutions as well as variables. This insulates you from most unexpected shell weirdness.
You declared your shell as bash, which means you have the "here string" operator <<< available to you. When available it can be used to avoid the pipe; each element of a pipeline executes in a subshell, which incurs extra overhead and can lead to unexpected behavior if you try to modify variables. This would be written as
grep -e "Oct/2015" <<<"$line"
Note that I have quoted the line expansion.
You have called grep with -e, which is not incorrect but is needless since your pattern does not begin with -. In addition you have double-quoted a string in shell but you don't attempt to expand a variable or use other shell interpolation inside of it. When you don't expect and don't want the contents of a quoted string to be treated as special by the shell you should single-quote them. Furthermore, your use of grep is inefficient: because your pattern is a fixed string and not a regular expression you could have used fgrep or grep -F, which does a plain substring search rather than regular expression matching (and is usually faster because of this). So this could be
grep -F 'Oct/2015' <<<"$line"
Without altering the behavior.
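The difference between regular expression and fixed-string matching only shows up when the pattern contains regex metacharacters. A small sketch with a made-up pattern containing a dot (Oct/2015 has none, which is why -F does not change the behavior there):

```shell
#!/bin/bash
line='version 1x5'

# As a regular expression, the dot matches any character, so 1.5 matches 1x5.
if grep -q '1.5' <<<"$line"; then as_regex=match; else as_regex=nomatch; fi

# With -F the pattern is a literal string, so only a real "1.5" would match.
if grep -q -F '1.5' <<<"$line"; then as_fixed=match; else as_fixed=nomatch; fi

printf 'regex: %s, fixed string: %s\n' "$as_regex" "$as_fixed"
```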
if($?==0); then
This is the source of your original problem. In shell scripts words are separated by whitespace, and ( ) starts a subshell; when you say if($?==0) the $? expands, probably to 0, and bash tries to execute a command called 0==0 inside a subshell, which is a legal command name but does not exist. What you wanted to do was invoke the if command and give it a test as its condition, which requires more whitespace and different brackets. I believe others have covered this sufficiently.
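To see how bash reads the broken line, the following sketch runs it in a child bash and captures the error message, then shows the arithmetic form. The variable names err and fixed are arbitrary, and the exact wording of the error message may differ between bash versions and locales.

```shell
#!/bin/bash
# Run the broken test in a child bash and capture what it prints on stderr.
err=$(bash -c 'true; if($?==0); then echo yes; fi' 2>&1)
printf 'broken form says: %s\n' "$err"

# With (( )) the same comparison is arithmetic, not a command lookup.
true
if (( $? == 0 )); then fixed=yes; else fixed=no; fi
printf 'fixed form: %s\n' "$fixed"
```

The captured message names 0==0 as the missing command, matching the error in the question.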
You should never need to test the value of $? in a shell script. The if command exists for branching behavior based on the return code of whatever command you pass to it, so you can inline your grep call and have if check its return code directly, thus:
if grep -F 'Oct/2015' <<<"$line" ; then
Note the generous whitespace around the ; delimiter. I do this because in shell whitespace is usually required and can only sometimes be omitted. Rather than try to remember when you can do which I recommend an extra one space padding around everything. It's never wrong and can make other mistakes easier to notice.
As others have noted this grep will print matched lines to stdout, which is probably not something you want. The -q switch, which is specified by POSIX and supported by GNU grep (standard on Linux), suppresses the output from grep:
if grep -q -F 'Oct/2015' <<<"$line" ; then
If you are in an environment with a grep that doesn't know -q, the traditional way to achieve this effect is to redirect stdout to /dev/null:
if printf '%s\n' "$line" | grep -F 'Oct/2015' >/dev/null ; then
In this example I also removed the here string bashism just to show a portable version of this line.
echo "current line is: $line"
There is nothing wrong with this line of your script, except that although echo is standard implementations vary to such an extent that it's not possible to absolutely rely on its behavior. You can use printf anywhere you would use echo and you can be fairly confident of what it will print. Even printf has some caveats: Some uncommon escape sequences are not evenly supported. See mascheck for details.
printf 'current line is: %s\n' "$line"
Note the explicit newline at the end; printf doesn't add one automatically.
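One concrete case where echo and printf differ in bash is a line that looks like an echo option. A minimal sketch, with an invented input line and arbitrary variable names:

```shell
#!/bin/bash
line='-n'

# bash's builtin echo treats -n as an option and prints nothing at all.
via_echo=$(echo "$line")

# printf with a fixed format prints the data verbatim.
via_printf=$(printf '%s\n' "$line")

printf 'echo gave: <%s>, printf gave: <%s>\n' "$via_echo" "$via_printf"
```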
fi
No comment on this line.
done
In the case where you did as I recommended and replaced the for line with a while read construct this line would change to:
done < "$1"
This directs the contents of the file in the $1 variable to the stdin of the while loop, which in turn passes the data to read.
In the interests of clarity I recommend copying the value from $1 into another variable first. That way when you read this line the purpose is more clear.
I hope no one takes great offense at the stylistic choices made above, which I have attempted to note; there are many ways to do this (but not a great many correct ways).
Be sure to always run interesting snippets through the excellent shellcheck and explain shell when you run into difficulties like this in the future.
And finally, here's everything put together:
#!/bin/bash
input_file="$1"
while IFS= read -r line ; do
if grep -q -F 'Oct/2015' <<<"$line" ; then
printf 'current line is: %s\n' "$line"
fi
done < "$input_file"
If you like one-liners, you may use AND operator (&&), for example:
echo "$line" | grep -e "Oct/2015" && echo "current line is: $line"
or:
grep -qe "Oct/2015" <<<"$line" && echo "current line is: $line"
Spacing is important in shell scripting.
Also, double-parens is for numerical comparison, not single-parens.
if (( $? == 0 )); then

Check if line in file contains a pattern in Bash

I'm trying to figure out why this won't check the lines in the file and echo.
How do you compare or check if strings contain something?
#!/bin/bash
while read line
do
#if the line ends
if [[ "$line" == '*Bye$' ]]
then
:
#if the line ends
elif [[ "$line" == '*Fly$' ]]
then
echo "\*\*\*"$line"\*\*\*"
fi
done < file.txt
The problem is that *Bye$ is not a shell pattern (shell patterns don't use the $ notation, they just use the lack of a trailing *) — and even if it were, putting it in single-quotes would disable it. Instead, just write:
if [[ "$line" == *Bye ]]
(and similarly for Fly).
If you want to use proper regular expressions, that's done with the =~ operator, such as:
if [[ "$line" =~ Bye$ ]]
The limited regular expressions you get from shell patterns with == don't include things like the end-line marker $.
Note that you can do something this simple with shell patterns (*Bye) but, if you want the full power of regular expressions (or even just a consistent notation), =~ is the way to go.
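Both notations can be tried side by side. In this sketch the function name classify and the sample strings are made up; the first branch uses a shell pattern and the second a regular expression:

```shell
#!/bin/bash
# Hypothetical helper that classifies one line.
classify() {
    local line=$1
    if [[ $line == *Bye ]]; then          # shell pattern: ends with Bye
        printf 'skip\n'
    elif [[ $line =~ Fly$ ]]; then        # regex: Fly anchored at the end
        printf '***%s***\n' "$line"
    else
        printf 'other\n'
    fi
}

r1=$(classify 'Good Bye')
r2=$(classify 'Time to Fly')
r3=$(classify 'Bye for now')
printf '%s\n' "$r1" "$r2" "$r3"
```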

Simple sed substitution

I have a text file with a list of files with the structure ABC123456A or ABC123456AA. What I would like to do is check whether the file ABC123456ZZP also exists, i.e. I want to substitute the letter(s) after ABC123456 with ZZP.
Can I do this using sed?
Like this?
X=ABC123456 ; echo ABC123456AA | sed -e "s,\(${X}\).*,\1ZZP,"
You could use sed as wilx suggests but I think a better option would be bash.
while read file; do
base=${file:0:9}
[[ -f ${base}ZZP ]] && echo "${base}ZZP exists!"
done < file
This will loop over each line in file
then base is set to the first 9 characters of the line (excluding whitespace)
then check to see if a file exists with ZZP on the end of base and print a message if it does.
Look:
$ shopt -s extglob
$ str="ABC123456AA"
$ echo "${str%%+([[:alpha:]])}"
ABC123456
so do this:
shopt -s extglob
while IFS= read -r tgt; do
    # +([[:alpha:]]) strips the whole trailing run of letters, whether one or two
    tgt="${tgt%%+([[:alpha:]])}ZZP"
    [[ -f "$tgt" ]] && printf "%s exists!\n" "$tgt"
done < file
It will still fail for file names that contain newlines so let us know if you have that situation but unlike the other posted solutions it will work for file names with other than 9 key characters, file names containing spaces, commas, backslashes, globbing characters, etc., etc. and it is efficient.
Since you said now that you only need the first 9 characters of each line and you were happy with piping every line to sed, here's another solution you might like:
cut -c1-9 file |
while IFS= read -r tgt; do
[[ -f "${tgt}ZZP" ]] && printf "%sZZP exists!\n" "$tgt"
done
It'd be MUCH more efficient and more robust than the sed solution, and similar in both contexts to the other shell solutions.
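Here is a self-contained sketch of the trailing-letter approach that can be run in a scratch directory. It uses only standard parameter expansion (no shell options needed) and handles both one and two trailing letters. The file names, tmpdir and the found array are invented for the demonstration.

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
touch "$tmpdir/ABC123456ZZP"    # the only ZZP file that exists

found=()
while IFS= read -r name; do
    # ${name##*[![:alpha:]]} is the trailing run of letters;
    # removing it as a suffix leaves the key part of the name.
    base=${name%"${name##*[![:alpha:]]}"}
    if [[ -f "$tmpdir/${base}ZZP" ]]; then
        found+=("${base}ZZP")
    fi
done <<'EOF'
ABC123456A
ABC123456AA
XYZ000001B
EOF

printf '%s exists!\n' "${found[@]}"
rm -r -- "$tmpdir"
```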
