Regular expression to check if brackets are nested - linux

I have a large number of files which contain lines with matched braces. I do not care if the brackets are matched or not.
I'd like to check if any braces are nested, by checking which comes first after an opening bracket - a closing or another opening bracket. I assume that all brackets are matched, and that there is at most one outer bracket pair per line. (I.e., [foo[bar]] is a valid line, [foo][bar] is not, because the second bracket pair isn't nested.)
I can get everything inside a bracket pair from this question using 's/.*\[\([^]]*\)\].*/\1/g', but I'm unsure how to re-test the grabbed string for further matches.
For example, given the following string:
foo [ bar, [baz] ]
the steps I think I would take are:
Traverse from the left-hand side until I see an opening bracket. (If none is found, ignore the line).
Non-greedily search from the opening brace until either [ or ] is encountered. If [, the brackets are nested, so return the line. If ], the pair is not nested, so ignore the line.
Ideally I'd like a sed or unix-tool based solution, but others are acceptable (perl, for example). Any help would be appreciated.

Use a recursive regexp to check that the brackets match AND that they are nested. There is no point checking nesting without also checking the syntax, because unbalanced input can invalidate the result. For example:
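my $regex;   # declare first, so the (??{$regex}) block can see the variable
$regex = qr/\[(?:[^\[\]]+|(??{$regex}))*\]/;   # one balanced [...] group, nested groups allowed
# Valid: a single outer pair that contains at least one nested pair
if ( $line =~ /^[^\[\]]*\[[^\[\]]*$regex(?:[^\[\]]+|$regex)*\][^\[\]]*$/ ) { print $line }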

perl -ne 'print if /\[[^\]]*\[/' your_file
tested below:
> cat temp
foo [ bar, [baz] ]
foo [ bar, baz ]
foo [ bar ]
foo [ bar, baz] ]
foo bar, [baz] ]
> perl -ne 'print if /\[[^\]]*\[/' temp
foo [ bar, [baz] ]
>
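The left-to-right traversal described in the question can also be sketched in plain bash, without a regex, by tracking bracket depth. This is only an illustration: the function names max_bracket_depth and is_nested are made up for this example, and, like the one-liner above, it does not verify that the brackets are balanced (a stray closing bracket is simply ignored).

```bash
#!/bin/bash
# max_bracket_depth LINE: print the deepest [ ... ] nesting level found in LINE.
# (Hypothetical helper name, for illustration only.)
max_bracket_depth() {
    local line=$1 depth=0 max=0 i ch
    for ((i = 0; i < ${#line}; i++)); do
        ch=${line:i:1}
        case $ch in
            '[')
                depth=$((depth + 1))
                if ((depth > max)); then max=$depth; fi
                ;;
            ']')
                # Ignore a closing bracket that has no opening partner.
                if ((depth > 0)); then depth=$((depth - 1)); fi
                ;;
        esac
    done
    echo "$max"
}

# is_nested LINE: succeed when some bracket opens inside another one.
is_nested() {
    [ "$(max_bracket_depth "$1")" -ge 2 ]
}

# Print only the lines of the file given as first argument that are nested.
if [ -n "$1" ]; then
    while IFS= read -r line; do
        if is_nested "$line"; then printf '%s\n' "$line"; fi
    done < "$1"
fi
```

Saved as a script and run with a file name as its argument, it prints the nested lines; a character loop like this will be much slower than perl or sed on large files, so it is mainly useful for seeing the logic spelled out.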

Related

Is $ compulsory in a conditional statement with [ ]? - linux

I have a question. I have started learning Linux and have learned about the ways we can perform arithmetic operations. For example:
a) using expr > $ and spaces are mandatory.
example: sum=`expr $a + $b`
b) using the let keyword > $ is optional, but we should use spaces.
c) using (()) > both $ and spaces are optional.
d) using [ ] > both $ and spaces are optional.
so now I have written one simple if statement.
#! /bin/bash
read -p "Please enter username:" name
if [ name = sunny ]
then
echo "hello Sunny is available. "
fi
echo "Sunny is busy-remaining line code"
Inside the square brackets I am doing an arithmetic operation, right? So why do I need the $ symbol here to get the value of name?
Note: if I use if [ $name = sunny ], I get the expected result.
Any help or suggestions would be highly appreciated.
The rule is relatively simple: $ and spaces are optional in arithmetic context, but not in string contexts.
This is because if you expect a number and see foo, you can safely assume that it must be a variable because it sure isn't a number. This is not possible for strings.
Arithmetic context includes:
Arithmetic expansion $((here)) and arithmetic commands ((here))
Integer comparators in [[ .. ]] (but not [ .. ]), such as [[ here -eq here ]]§. Note in particular that = is a string comparator.
Values assigned to variables declared as integers: declare -i foo=here§
Indices of indexed arrays: ${array[here]}
Arguments to let: let here§
Some other more niche constructs like ${str:here:here}, $[here]
In your case, you are using the test command aka [, which (mostly) does not treat anything as an arithmetic expression. This is why you need the $ to differentiate between the literal string name and the value of the variable name.
§ These words are delimited by spaces so one would terminate the expression, but this does not change the fact that spaces are optional. They just need to be escaped to be considered part of the expression.
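A small bash sketch of the rule, for illustration only. The variable names (a, b, total, letters and so on) are made up for this example; the script just records what each construct does with a bare name.

```bash
#!/bin/bash
a=5
b=7
name=sunny

# Arithmetic expansion: bare names are read as variables, spaces are optional.
sum=$((a+b))

# Integer-declared variable: the assigned text is evaluated as arithmetic.
declare -i total
total=a*b

# The index of an indexed array is an arithmetic context too.
letters=(zero one two three four five)
picked=${letters[a]}

# Integer comparators inside [[ ]] evaluate bare names as well.
if [[ a -lt b ]]; then smaller=a; else smaller=b; fi

# String context: without $, "name" is only the literal word name.
if [ name = sunny ]; then literal=match; else literal=no-match; fi
if [ "$name" = sunny ]; then expanded=match; else expanded=no-match; fi

echo "$sum $total $picked $smaller $literal $expanded"
```

The last two tests are the case from the question: [ name = sunny ] compares the two literal strings name and sunny, while [ "$name" = sunny ] compares the variable's value.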

Detecting a line with open curly brackets

I am parsing a Tcl file line by line and searching for lines with open curly braces so that I can merge them with the next line and read them.
I am struggling to get a single regex to do this. My concern is lines that also contain a closing }, which should be skipped.
Example:
MATCH: test_command -switch1 {
NO MATCH: single_command
NO MATCH: test_tcl -switch2 {arg1 }
Please help with the regex to get the result. I tried this:
% set a "test_command -swithc1 {bye }"
test_command -swithc1 {bye }
% regexp "{" $a match
1
#0 is expected
This is not my intention. I want match only for lines with open curly brace
% set b "test_command -swithc1 {hi"
test_command -swithc1 {hi
% regexp "{" $b match
1
#1 was expected
I'm looking for a regex that will give 0 for the $a and 1 for $b
You really shouldn't be using a regular expression for that; there's a Tcl command specifically for this sort of thing: info complete. Here's how to use it:
set accumulator ""
while {![eof $inputChannel]} {
    # Note well: you *must* add the newline
    append accumulator [gets $inputChannel] "\n"
    if {[info complete $accumulator]} {
        handleCompleteChunk $accumulator
        set accumulator ""
    }
}
This handles various types of bracket matching and the intricacies of backslash sequences, but just to check whether the “line” is complete. (It's also the core of how Tcl's REPL works, except that uses the Tcl C API equivalents.)
You could try a couple "lookarounds", one to say "I see a left bracket" and one to say "I don't see a right bracket":
(?!.*\})(?=.*\{)
https://regex101.com/r/p8bbsF/1/
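For comparison, the same idea (an opening brace appears after the last closing brace) can be sketched in bash without a regex. This is not Tcl and not a replacement for info complete: the function name has_open_brace is made up for this example, and the check ignores backslash-escaped braces, quoting, and deeper nesting.

```bash
#!/bin/bash
# has_open_brace LINE: succeed when a "{" appears after the last "}" in LINE.
# (Hypothetical helper name, for illustration only.)
has_open_brace() {
    local rest
    # Strip everything up to and including the last "}"; if there is no "}",
    # the whole line is kept.
    rest=${1##*\}}
    [[ $rest == *"{"* ]]
}

for line in 'test_command -switch1 {' 'single_command' 'test_tcl -switch2 {arg1 }'; do
    if has_open_brace "$line"; then
        echo "MATCH: $line"
    else
        echo "NO MATCH: $line"
    fi
done
```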

[: ==: unexpected operator, too many arguments, binary operator expected

See below:
1.
if [ $var1 == "result" ]
2,3.
if [ -z $var ]
I get these warnings in bash version 4.4.
Does anyone know what causes them? Please explain in detail. Thanks.
Try:
"$var1" == "result"
And:
-z "$var1"
When $var1 is unset or empty, the unquoted tests do not work. That is easily fixed by surrounding the tested variables with "", so that an unset variable is compared as an empty string.
The problem is that your $var contains spaces. After expansion, those spaces split the value into several words inside the test, as if they were separate parameters. To solve it, use "$var", which keeps the whole value, spaces included, as a single argument.
So, if var1 was bound to foo bar in the shell, in [ $var1 = "result" ] the inside of the test(1) is expanded as four arguments: foo, bar, =, result. But = is binary and wants exactly one argument on each side, so you get an error like "too many arguments" or "binary operator expected".
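A short bash sketch of the difference, recording the exit status of each test (0 is true, 1 is false, 2 means the test itself failed with an error). The variable names are made up for this example. As a side note, and as an assumption about the asker's setup: the "==: unexpected operator" wording is what a plain sh such as dash typically prints, where = is the portable string comparator.

```bash
#!/bin/bash
var1="foo bar"

# Unquoted: expands to [ foo bar = result ], which is a usage error.
[ $var1 = "result" ] 2>/dev/null
unquoted_status=$?

# Quoted: one argument on each side of =, so the test is simply false.
[ "$var1" = "result" ]
quoted_status=$?

empty=""

# Unquoted and empty: expands to [ -n ], which tests the literal string "-n"
# and is therefore true, although the variable is empty.
[ -n $empty ]
unquoted_n_status=$?

# Quoted: correctly reports that the string has zero length.
[ -n "$empty" ]
quoted_n_status=$?

echo "$unquoted_status $quoted_status $unquoted_n_status $quoted_n_status"
```

The third case is the quiet one: no error is printed, but the answer is wrong, which is another reason to quote every expansion inside [ ].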

How to preserve space-separated groups in bash

I want to build a string that contains quoted groups of words.
Each group should be passed as a single function argument.
I tried to play with arrays.
Literally constructed arrays work, but I still hope to find
a magic syntax hack for a bare string.
# literal array
LA=(a "b c")
function printArgs() { # function should print 2 lines
    while [ $# -ne 0 ] ; do echo "$1" ; shift; done
}
printArgs "${LA[@]}" # works fine
# but how to use string to split only unquoted spaces?
LA="a \"b c\""
printArgs "${LA[@]}" # doesn't work :(
LA=($LA)
printArgs "${LA[@]}" # also doesn't work :(
Bash arrays have a problem: they cannot be passed through a pipeline or a command substitution (echo/$()).
A dirty approach would be:
#!/bin/bash
LA=(a "b c")
function printArgs()
{ # function should print 2 lines
    while [ $# -ne 0 ]
    do
        echo "${1//_/ }" # Use parameter expansion to globally replace '_' with a space
        # Double-quote the expansion, as we don't want word splitting here
        shift
    done
}
printArgs "${LA[@]}" # works fine
LA="a b__c" # Use a placeholder '_' for space; note the two '_' for two spaces
printArgs $LA # Don't double-quote '$LA' here. We want word splitting to happen. And it works fine :-)
Sample Output
a
b c
a
b  c
Note that the number of spaces inside grouped entities is preserved.
Sidenote
The choice of place-holder is critical here. Hopefully you could find one that won't appear in the actual string.
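If no safe placeholder exists, one possible alternative is to let xargs do the quote-aware splitting and read the result back into an array. This is a sketch under assumptions: it needs bash 4 or later for mapfile, the groups must not contain newlines, and xargs also treats single quotes and backslashes in the string as special. The variable str is made up for this example.

```bash
#!/bin/bash
printArgs() {
    while [ $# -ne 0 ]; do
        echo "$1"
        shift
    done
}

# A bare string with one quoted group (two spaces inside the group).
str='a "b  c"'

# xargs splits on unquoted blanks and strips the quotes; printf emits one
# argument per line, and mapfile collects the lines into the array LA.
mapfile -t LA < <(xargs printf '%s\n' <<<"$str")

printArgs "${LA[@]}"   # prints two lines: a, then b  c
```

Unlike eval "LA=($str)", this does not execute anything embedded in the string, though it does start an external process for each split.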

Tab on Expect String concatenation

I'm kinda a novice with Expect, and I can't get past a problem I have with a logging/monitoring script I'm writing.
I've spent hours googling on why I can't get this to work:
puts $redirect [concat "${time}\t" "${context}\t" "$id\t" "${eventtype}" "${eventstatus}\t" "${eventcontext}" ]
The \t char ( it does not work even with other \chars ) is not showing up. No matter how and where I place it, I've tried different stuff:
puts $redirect [concat "${time}" "\t" "${context}" [...] ]
puts $redirect [concat "${time}\t" "${context}" [...] ]
puts $redirect [concat "${time}" "\t${context}" [...] ]
puts $redirect [concat "${time}" \t "${context}" [...] ]
*where redirect is set with: set redirect [open $logfile a]
*where [...] are other strings I'm concatenating, in the same way.
From http://tcl.tk/man/tcl8.5/TclCmd/Tcl.htm#M10
[5] Argument expansion.
If a word starts with the string “{*}” followed by a non-whitespace character, then the leading “{*}” is removed and the
rest of the word is parsed and substituted as any other word. After
substitution, the word is parsed as a list (without command or
variable substitutions; backslash substitutions are performed as is
normal for a list and individual internal words may be surrounded by
either braces or double-quote characters), and its words are added to
the command being substituted. For instance, “cmd a {*}{b [c]} d
{*}{$e f "g h"}” is equivalent to “cmd a b {[c]} d {$e} f "g h"”.
[6] Braces.
If the first character of a word is an open brace (“{”) and rule [5] does not apply, then the word is terminated by the matching close
brace (“}”). Braces nest within the word: for each additional open
brace there must be an additional close brace (however, if an open
brace or close brace within the word is quoted with a backslash then
it is not counted in locating the matching close brace). No
substitutions are performed on the characters between the braces
except for backslash-newline substitutions described below, nor do
semi-colons, newlines, close brackets, or white space receive any
special interpretation. The word will consist of exactly the
characters between the outer braces, not including the braces
themselves.
Ironically, I can get this to work:
puts $redirect [concat "${time}\n" "-\t${context}" [...] ]
If I put a char before the TAB, it works, but I can't use it.
Ex output: 2016-06-01 15:43:12 - macro
Wanted output: 2016-06-01 15:43:12 macro
I've also tried building the string with append, but it seems to eat pieces of the string, as if some maximum buffer size were being hit. Is that possible?
Am I missing something?
Thanks in advance.
That is what concat does. It eats whitespace.
From the documentation for concat:
This command joins each of its arguments together with spaces after trimming leading and trailing white-space from each of them. If all the arguments are lists, this has the same effect as concatenating them into a single list. It permits any number of arguments; if no args are supplied, the result is an empty string.
@Etan explained why it's not working for you.
An alternative way to code that is to use format:
puts $redirect [format "%s\t%s\t%s\t%s%s\t%s" $time $context $id $eventtype $eventstatus $eventcontext]
