Shell script (bash) to match a string variable with multiple values - string

I am trying to write a script that compares one string variable to a list of values, i.e. if the variable exactly matches one of the values, then some action needs to be taken.
The script matches Unix pathnames: if the user enters /, /usr, /var, etc., it should give an error so that we do not get accidental corruption from using the script. The list of values may change in the future due to application requirements, so I cannot have a huge "if" statement to check this.
What I intend is that if the user enters any of the forbidden paths, the script gives an error, but sub-paths which are not forbidden should be allowed, i.e. /var should be rejected but /var/opt/app should be accepted.
I cannot use a regex, as a partial match will not work.
I am not sure about using a while loop and an if statement; is there any alternative?
Thanks

I like to use associative arrays for this.
declare -A nonoList=(
[/foo/bar]=1
["/some/other/path with spaces"]=1
[/and/so/on]=1
# as many as you need
)
This can be kept in a file and sourced, if you want to separate it out.
Then in your script, just do a lookup.
if [[ -n "${nonoList[$yourString]}" ]] # -n checks for nonzero length
This also saves you from creating a big file and grep'ing over it redundantly, though that also works.
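Put together as a complete check, the lookup might look like the sketch below. It assumes bash 4 or later (for declare -A); the list contents and the function name is_forbidden are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: exact-match lookup of a path against a forbidden list (needs bash 4+).
# The list contents and the name is_forbidden are illustrative assumptions.
declare -A forbidden=(
  [/]=1
  [/usr]=1
  [/var]=1
)

is_forbidden() {
  local path=$1
  # An empty subscript is an error for associative arrays, so reject it early.
  [[ -n $path ]] || return 1
  [[ -n ${forbidden[$path]:-} ]]
}

for p in /var /var/opt/app; do
  if is_forbidden "$p"; then
    echo "$p: refused"
  else
    echo "$p: ok"
  fi
done
```

Because the key is looked up as a whole, /var is refused while /var/opt/app passes, with no loop over the list.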
As an alternative, if you KNOW there will not be embedded newlines in any of those filenames (it's a valid character, but messy for programming) then you can do this:
$: cat foo
/foo/bar
/some/other/path with spaces
/and/so/on
Just a normal file with one file-path per line. Now,
chkSet=$'\n'"$(<foo)"$'\n' # single var, newlines before & after each
Then in your processing, assuming f=/foo/bar or whatever file you're checking,
if [[ "$chkSet" =~ $'\n'"$f"$'\n' ]] # check for a hit
This won't give you accidental hits on /some/other/path when the actual filename is /some/other/path with spaces because the pattern explicitly checks for a newline character before and after the filename. That's why we explicitly assure they exist at the front and end of the file. We assume they are in between, so make sure your file doesn't have any spaces (or any other characters, like quotes) that aren't part of the filenames.
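A self-contained sketch of the same idea follows. The list is inlined in a variable instead of being read from the file foo, and the function name in_list is an assumption for the example.

```shell
#!/usr/bin/env bash
# Sketch: newline-delimited list held in one variable, matched as a whole line.
# A real script would read the list from a file; it is inlined here so the
# example runs on its own.
list='/foo/bar
/some/other/path with spaces
/and/so/on'
chkSet=$'\n'"$list"$'\n'

in_list() {
  # Quoting "$1" makes bash treat it as a literal string, not as a regex.
  [[ $chkSet =~ $'\n'"$1"$'\n' ]]
}

in_list '/some/other/path with spaces' && echo "hit"
in_list '/some/other/path' || echo "no hit for the shorter prefix"
```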
If you KNOW there will also be no embedded whitespace in your filenames, it's a lot easier.
mapfile -t nopes < foo
if [[ " ${nopes[*]} " =~ " $yourString " ]]; then echo found; else echo no; fi
Note that " ${nopes[*]} " embeds spaces (technically it uses the first character of $IFS, but that's a space by default) into a single flattened string. Again, literal spaces before and behind key and list prevent start/end mismatches.

Paul,
Your alternative work around worked like a charm. I don't have any directories which need embedded space in them. So as long as my script can recognize that there are certain directories to avoid, it does its job.
Thanks

Related

Bash split an array, add a variable and concatenate it back together

I've been trying to figure this out, but unfortunately I can't. I am trying to create a function that finds the ';' character, puts four spaces before it, and then puts the code back together in a neat sentence. I've been cracking at this for a bit and can't figure out a couple of things. I can't get the output to display what I want it to. I've tried finding the index of the ';' character, and it seems I'm going about it the wrong way. The other mistake that I seem to be making is that I'm trying to split an array in a for loop and then split the individual words in the array by letter, but I can't figure out how to do that either. If someone can give me a pointer, it would be greatly appreciated. This is in bash version 4.3.48.
#!commentPlacer()
{
arg=($1) #argument
len=${#arg[@]} #length of the argument
comment=; #character to look for in second loop
commaIndex=(${arg[@]#;}) #the attempted index look up
commentSpace="    ;" #the variable being concatenated into the array
for(( count1=0; count1 <= ${#arg[@]}; count1++ )) #search the argument looking for comment space
do if [[ ${arg[count1]} != commentSpace ]] #if no commentSpace variable then
then for (( count2=0; count2 < ${#arg[count1]} ; count2++ )) #loop through again
do if [[ ${arg[count2]} != comment ]] #if no comment
then A=(${arg[@]:0:commaIndex})
A+=(commentSpace)
A+=(${arg[@]commaIndex:-1}) #concatenate array
echo "$A"
fi
done
fi
done
}
If I understand what you want correctly, it's basically to put 4 spaces in front of each ";" in the argument, and print the result. This is actually simple to do in bash with a string substitution:
commentPlacer() {
echo "${1//;/    ;}"
}
The expansion here has the format ${variable//pattern/replacement}, and it gives the contents of the variable, with each occurrence of pattern replaced by replacement. Note that with only a single / before the pattern, it would replace only the first occurrence.
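To see the difference between the single-slash and double-slash forms, here is a small sketch (the variable names and the sample string are made up for illustration):

```shell
#!/usr/bin/env bash
# Sample input; any string containing semicolons would do.
s='a;b;c'

# Single slash: only the first ";" gets four spaces in front of it.
first_only="${s/;/    ;}"
# Double slash: every ";" gets four spaces in front of it.
all="${s//;/    ;}"

echo "$first_only"   # a    ;b;c
echo "$all"          # a    ;b    ;c
```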
Now, I'm not sure I understand how your script is supposed to work, but I see several things that clearly aren't doing what you expect them to do. Here's a quick summary of the problems I see:
arg=($1) #argument
This doesn't create an array of characters from the first argument. var=(...) treats the thing in ( ) as a list of words, not characters. Since $1 isn't in double-quotes, it'll be split into words based on whitespace (generally spaces, tabs, and linefeeds), and then any of those words that contain wildcards will be expanded to a list of matching filenames. I'm pretty sure this isn't at all what you want (in fact, it's almost never what you want, so variable references should almost always be double-quoted to prevent it). Creating a character array in bash isn't easy, and in general isn't something you want to do. You can access individual characters in a string variable with ${var:index:1}, where index is the character you want (counting from 0).
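If you really do need the position of a character, a loop over ${var:index:1} can find it. This is only a sketch; the function name find_char and the sample string are assumptions, not part of the original script.

```shell
#!/usr/bin/env bash
# Sketch: print the 0-based index of the first occurrence of a character.
# find_char is a hypothetical helper name chosen for this example.
find_char() {
  local str=$1 ch=$2 i
  for (( i = 0; i < ${#str}; i++ )); do
    # ${str:i:1} is the single character at offset i.
    if [[ ${str:i:1} == "$ch" ]]; then
      echo "$i"
      return 0
    fi
  done
  return 1   # character not present
}

find_char 'x=1; # note' ';'   # prints 3
```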
commaIndex=(${arg[@]#;}) #the attempted index look up
This doesn't do a lookup. The substitution ${var#pattern} gives the value of var with pattern removed from the front (if it matches). If there are multiple possible matches, it uses the shortest one. The variant ${var##pattern} uses the longest possible match. With ${array[@]#pattern}, it'll try to remove the pattern from each element -- and since it's not in double-quotes, the result of that gets word-split and wildcard-expanded as usual. I'm pretty sure this isn't at all what you want.
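A quick sketch of what these prefix-removal expansions actually produce (the sample values and variable names are made up for the example):

```shell
#!/usr/bin/env bash
s='one;two;three'

shortest="${s#*;}"    # removes the shortest prefix matching "*;"  -> two;three
longest="${s##*;}"    # removes the longest prefix matching "*;"   -> three

# Applied to an array, the removal happens separately for each element.
arr=('x;1' 'y;2')
stripped=("${arr[@]#*;}")

printf '%s\n' "$shortest" "$longest" "${stripped[@]}"
```

None of these return a position; they return what is left of the string after the removal.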
if [[ ${arg[count1]} != commentSpace ]] #if no commentSpace variable then
Here (and in a number of other places), you're using a variable without $ in front; this doesn't use the variable at all, it just treats "commentSpace" as a static string. Also, in several places it's important to have double-quotes around it, e.g. to keep the spaces in $commentSpace from vanishing due to word splitting. There are some places where it's safe to leave the double-quotes off, but in general it's too hard to keep track of them, so just use double-quotes everywhere.
General suggestions: don't try to write C (or Java or whatever) programs in bash; it works too differently, and you have to think differently. Use shellcheck.net to spot common problems (like non-double-quoted variable references). Finally, you can see what bash is doing by putting set -x before a section that doesn't do what you expect; that'll make bash print each line as it executes it, showing the equivalent of what it's executing.
Make a little function using pattern substitution on stdin:
semicolon4s() { while IFS= read -r x; do echo "${x//;/    ;}"; done; }
semicolon4s <<< 'foo;bar;baz'
Output:
foo    ;bar    ;baz

What do the single quotes in if [ `wc -c $i` -gt 3 ] mean?

$1 is DirectoryName
$2 is txt (file extension)
$3 is 500 (or any other positive integer)
I don't understand the syntax of the single quotes. I think what it's supposed to do is find all text files in the directory name passed in as parameter one, then do a "character count" of the txt files that come up in the search, and if the character count is over the amount passed in parameter 3, change the file permissions.
However, it doesn't actually work. It says "expects an integer". Now it could be that the question is trying to trick me, but I can't get it to work by changing it slightly either. I've tried removing the single quotes (the error says "too much data" or something), using double quotes instead (something about syntax), and I tried using a pipe or >. I read somewhere that single quotes are supposed to make everything inside literal, so that $asdf would be taken as the literal characters $asdf, but then the command wc -c should also have failed; instead I am told it is expecting a non-existent integer.
I even tried to play around with substituting variables like
a = wc -c $i
echo $a
which failed with token / syntax errors.
Could someone please help with any concepts here that I've totally misunderstood? I have an exam tomorrow and this is from past papers, so it's purely for revision.
Those are not single quotes but backticks (`). You might want to search for 'Command Substitution' in the bash man page.
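As a rough illustration of command substitution: backticks, or the equivalent $(...) form, run the command and substitute its output into the line. The temporary file and the variable names below are assumptions for the example, not part of the exam question. Note that wc -c FILE prints the count followed by the file name, so its output is not a bare integer; feeding the file to wc on standard input makes it print only the count.

```shell
#!/usr/bin/env bash
# Create a throwaway 5-byte file so the example is self-contained.
tmp=$(mktemp)
printf 'hello' > "$tmp"

# $(...) does the same job as backticks: run the command, capture its output.
# Redirecting the file into wc makes it print only the byte count.
size=$(wc -c < "$tmp")
size=$((size))   # normalise: some wc implementations pad the number with spaces

if [ "$size" -gt 3 ]; then
  result="larger than 3 bytes"
else
  result="3 bytes or fewer"
fi
echo "$result"
rm -f "$tmp"
```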

How to rename a folder that contains smart quotes

I have a folder that was created automatically. The user unintentionally provided smart (curly) quotes as part of the name, and the process that sanitizes the inputs did not catch these. As a result, the folder name contains the smart quotes. For example:
this-is-my-folder’s-name-“Bob”
I'm now trying to rename/remove said folder on the command line, and none of the standard tricks for dealing with files/folders with special characters (enclosing in quotes, escaping the characters, trying to rename it by inode, etc.) are working. All result in:
mv: cannot move this-is-my-folder’s-name-“Bob” to this-is-my-folders-name-BOB: No such file or directory
Can anyone provide some advice as to how I can achieve this?
To get the name in a format you can copy-and-paste into your shell:
printf '%q\n' this*
...will print out the filename in a manner the shell will accept as valid input. This might look something like:
$'this-is-my-folder\342\200\231s-name-\342\200\234Bob\342\200\235'
...which you can then use as an argument to mv:
mv $'this-is-my-folder\342\200\231s-name-\342\200\234Bob\342\200\235' this-is-my-folders-name-BOB
Incidentally, if your operating system works the same way mine does (when running the test above), this would explain why using single-character globs such as ? for those characters didn't work: They're actually more than one byte long each!
You can use shell globbing token ? to match any single character, so matching the smart quotes using ? should do:
mv this-is-my-folder?s-name-?Bob? new_name
Here each smart quote is replaced with ? so that the pattern matches the file name.
There are several possibilities.
If an initial substring of the file name ending before the first quote is unique within the directory, then you can use filename completion to help you type an appropriate command. Type "mv" (without the quotes) and the unique initial substring, then press the TAB key to request filename completion. Bash will complete the filename with the correct characters, correctly escaped.
Use a graphical file browser. Then you can select the file to rename by clicking on it. (Details of how to proceed from there depend on the browser.) If you don't have a graphical terminal and can't get one, then you may be able to do the same with a text-mode browser such as Midnight Commander.
A simple glob built with the ? or * wildcard should be able to match the filename.
Use a more complex glob to select the filename, and perhaps others with the same problem. Maybe something like *[^a-zA-Z0-9-]* would do. Use a pattern substitution to assign a new name. Something like this:
for f in *[^a-zA-Z0-9-]*; do
mv "$f" "${f//[^a-zA-Z0-9-]/}"
done
The substitution replaces every character that is not a decimal digit, an uppercase or lowercase Latin letter, or a hyphen with nothing (i.e. it strips them). Do take care before you use this, though, to make sure you're not going to make more changes than you intend.

Combining part of bash parameters into a string

Alright, so I'm trying to combine some but not all of my script's parameters into one string. I'm trying to write a script that changes spaces in a file name to underscores, and when the option -r is given, it recursively does it to every file in the folder.
Assuming the file is saved as removespaces.sh, if you run removespaces.sh file with spaces.doc, the script doesn't really have to care about parameters; I can just use $*.
But when I'm trying to do it for an entire folder, I now have -r as $1, so I can't just (be lazy and) use $*. How could I create a string that's equal to $2 through the last parameter?
A string of $2 to the end of the parameters:
"${*:2}"
This differs from "${@:2}" in that it concatenates all the arguments, with one space between each. In general, it is possible that neither form is what you want (if, for example, you have files with more than one consecutive space in their name).
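A short sketch of the two forms side by side. The set -- line simulates the script's parameters, and the variable names are made up for the example; the default IFS is assumed.

```shell
#!/usr/bin/env bash
# Simulate: removespaces.sh -r file with spaces.doc
set -- -r file with spaces.doc

joined="${*:2}"            # one string: "file with spaces.doc"
renamed="${joined// /_}"   # spaces to underscores: "file_with_spaces.doc"
rest=("${@:2}")            # three separate words: file, with, spaces.doc

echo "$joined"
echo "$renamed"
echo "${#rest[@]}"
```

The "${*:2}" form yields a single string, which is what the rename needs here, while "${@:2}" keeps each parameter as its own word.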

How do you pass on filenames to other programs correctly in bash scripts?

What idiom should one use in Bash scripts (no Perl, Python, and such please) to build up a command line for another program out of the script's arguments while handling filenames correctly?
By correctly, I mean handling filenames with spaces or odd characters without inadvertently causing the other program to handle them as separate arguments (or, in the case of < or > — which are, after all, valid if unfortunate filename characters if properly escaped — doing something even worse).
Here's a made-up example of what I mean, in a form that doesn't handle filenames correctly: Let's assume this script (foo) builds up a command line for a command (bar, assumed to be in the path) by taking all of foo's input arguments and moving anything that looks like a flag to the front, and then invoking bar:
#!/bin/bash
# This is clearly wrong
FILES=
FLAGS=
for ARG in "$@"; do
echo "foo: Handling $ARG"
if [ x${ARG:0:1} = "x-" ]; then
# Looks like a flag, add it to the flags string
FLAGS="$FLAGS $ARG"
else
# Looks like a file, add it to the files string
FILES="$FILES $ARG"
fi
done
# Call bar with the flags and files (we don't care that they'll
# have an extra space or two)
CMD="bar $FLAGS $FILES"
echo "Issuing: $CMD"
$CMD
(Note that this is just an example; there are lots of other times one needs to do this and that to a bunch of args and then pass them on to other programs.)
In a naive scenario with simple filenames, that works great. But if we assume a directory containing the files
one
two
three and a half
four < five
then of course the command foo * fails miserably in its task:
foo: Handling four < five
foo: Handling one
foo: Handling three and a half
foo: Handling two
Issuing: bar four < five one three and a half two
If we actually allow foo to issue that command, well, the results won't be what we're expecting.
Previously I've tried to handle this through the simple expedient of ensuring that there are quotes around each filename, but I've (very) quickly learned that that is not the correct approach. :-)
So what is? Constraints:
1. I want to keep the idiom as simple as possible (not least so I can remember it).
2. I'm looking for a general-purpose idiom, hence my making up the bar program and the contrived example above instead of using a real scenario where people might easily (and reasonably) go down the route of trying to use features in the target program.
3. I want to stick to Bash script, I don't want to call out to Perl, Python, etc.
4. I'm fine with relying on (other) standard *nix utilities, like xargs, sed, or tr provided we don't get too obtuse (see #1 above). (Apologies to Perl, Python, etc. programmers who think #3 and #4 combine to draw an arbitrary distinction.)
5. If it matters, the target program might also be a Bash script, or might not. I wouldn't expect it to matter...
6. I don't just want to handle spaces, I want to handle weird characters correctly as well.
7. I'm not bothered if it doesn't handle filenames with embedded nul characters (literally character code 0). If someone's managed to create one in their filesystem, I'm not worried about handling it, they've tried really hard to mess things up.
Thanks in advance, folks.
Edit: Ignacio Vazquez-Abrams pointed me to Bash FAQ entry #50, which after some reading and experimentation seems to indicate that one way is to use Bash arrays:
#!/bin/bash
# This appears to work, using Bash arrays
# Start with blank arrays
FILES=()
FLAGS=()
for ARG in "$@"; do
echo "foo: Handling $ARG"
if [ x${ARG:0:1} = "x-" ]; then
# Looks like a flag, add it to the flags array
FLAGS+=("$ARG")
else
# Looks like a file, add it to the files array
FILES+=("$ARG")
fi
done
# Call bar with the flags and files
echo "Issuing (but properly delimited, not exactly as this appears): bar ${FLAGS[@]} ${FILES[@]}"
bar "${FLAGS[@]}" "${FILES[@]}"
Is that correct and reasonable? Or am I relying on something environmental above that will bite me later? It seems to work and it ticks all the other boxes for me (simple, easy to remember, etc.). It does appear to rely on a relatively recent Bash feature (FAQ entry #50 mentions v3.1, but I wasn't sure whether that was arrays in general or some of the syntax they were using with it), but I think it's likely I'll only be dealing with versions that have it.
(If the above is correct and you want to un-delete your answer, Ignacio, I'll accept it provided I haven't accepted any others yet, although I stand by my statement about link-only answers.)
Why do you want to "build up" a command? Add the files and flags to arrays using proper
quoting and issue the command directly using the quoted arrays as arguments.
Selected lines from your script (omitting unchanged ones):
if [[ ${ARG:0:1} == - ]]; then # using a Bash idiom
FLAGS+=("$ARG") # add an element to an array
FILES+=("$ARG")
echo "Issuing: bar \"${FLAGS[@]}\" \"${FILES[@]}\""
bar "${FLAGS[@]}" "${FILES[@]}"
For a quick demo of using arrays in this manner:
$ a=(aaa 'bbb ccc' ddd); for arg in "${a[@]}"; do echo "..${arg}.."; done
Output:
..aaa..
..bbb ccc..
..ddd..
Please see BashFAQ/050 regarding putting commands in variables. The reason that your script doesn't work is because there's no way to quote the arguments within a quoted string. If you were to put quotes there, they would be considered part of the string itself instead of as delimiters. With the arguments left unquoted, word splitting is done and arguments that include spaces are seen as more than one argument. Arguments with "<", ">" or "|" are not a problem in any case since redirection and piping is performed before variable expansion so they are seen as characters in a string.
By putting the arguments (filenames) in an array, spaces, newlines, etc., are preserved. By quoting the array variable when it's passed as an argument, they are preserved on the way to the consuming program.
Some additional notes:
Use lowercase (or mixed case) variable names to reduce the chance that they will collide with the shell's builtin variables.
If you use single square brackets for conditionals in any modern shell, the archaic "x" idiom is no longer necessary if you quote the variables (see my answer here). However, in Bash, use double brackets. They provide additional features (see my answer here).
Use getopts as Let_Me_Be suggested. Your script, though I know it's only an example, will not be able to handle switches that take arguments.
The loop header for ARG in "$@" can be shortened to for ARG (but I prefer the readability of the more explicit version).
See BashFAQ #50 (and also maybe #35 on option parsing). For the scenario you describe, where you're building a command dynamically, the best option is to use arrays rather than simple strings, as they won't lose track of where the word boundaries are. The general rules are: to create an array, instead of VAR="foo bar baz", use VAR=("foo" "bar" "baz"); to use the array, instead of $VAR, use "${VAR[@]}". Here's a working version of your example script using this method:
#!/bin/bash
# Working version: flags and files are collected in arrays
FILES=()
FLAGS=()
for ARG in "$@"; do
echo "foo: Handling $ARG"
if [ "x${ARG:0:1}" = "x-" ]; then
# Looks like a flag, add it to the flags array
FLAGS=("${FLAGS[@]}" "$ARG") # FLAGS+=("$ARG") would also work in bash 3.1+, as Dennis pointed out
else
# Looks like a file, add it to the files array
FILES=("${FILES[@]}" "$ARG")
fi
done
# Call bar with the flags and files, each element passed as one argument
CMD=("bar" "${FLAGS[@]}" "${FILES[@]}")
echo "Issuing: ${CMD[*]}"
"${CMD[@]}"
Note that in the echo command I used "${VAR[*]}" instead of the [@] form because there's no need/point to preserving word breaks here. If you wanted to print/record the command in unambiguous form, this would be a lot messier.
Also, this gives you no way to build up redirections or other special shell options in the built command -- if you add >outfile to the FILES array, it'll be treated as just another command argument, not a shell redirection. If you need to programmatically build these, be prepared for headaches.
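One possible way to print a command array unambiguously is the %q format of bash's printf builtin. This is only a sketch: the array contents are hypothetical, bar stands in for the real program, and eval is used here purely to demonstrate that the printed form reads back as the same words.

```shell
#!/usr/bin/env bash
# Hypothetical command line; "bar" stands in for the real program.
cmd=(bar -v 'three and a half' 'four < five')

# %q quotes each word so that the shell would read it back as one argument.
printable=$(printf '%q ' "${cmd[@]}")
echo "Issuing: $printable"

# Round-trip check (demonstration only): re-parsing the printable form
# yields the same number of words as the original array.
eval "reparsed=($printable)"
echo "${#reparsed[@]}"   # 4
```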
getopts should be able to handle spaces in arguments correctly ("file name.txt"). Weird characters should work as well, assuming they are correctly escaped (ls -b).
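A minimal getopts sketch follows, assuming a hypothetical -v switch and a hypothetical -o switch that takes an argument. The function name parse_args and the option letters are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: parse a boolean switch (-v) and a switch with an argument (-o FILE).
# parse_args and both option letters are illustrative assumptions.
parse_args() {
  local OPTIND=1 opt verbose=0 outfile=''
  while getopts 'vo:' opt; do
    case $opt in
      v) verbose=1 ;;
      o) outfile=$OPTARG ;;   # OPTARG keeps embedded spaces intact
      *) return 2 ;;          # unknown option or missing argument
    esac
  done
  shift $((OPTIND - 1))       # drop the parsed options, leaving the file names
  printf 'verbose=%s outfile=%s files=%s\n' "$verbose" "$outfile" "$#"
  printf '<%s>\n' "$@"
}

parse_args -v -o 'my out.txt' 'file name.txt' other
```

As long as the caller quotes its arguments, each file name reaches the function as a single word, spaces included.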

Resources