Pyparsing will always insert a list on 'setParseAction' - python-3.x

Apparently if you add any parse action and return a result in that action, the result will always be encapsulated into a list 'deepening' the output tree.
I suppose this is for returning multiple values but it makes casual use of the library far harder because you then have to remember which parts of the tree you replaced and call result.normalstruct.replaced[0] (or even worse result.normalstruct['replaced'][0])
This is a bit strange and makes refactoring harder, so i'd like a way to avoid it. Any tips?

The problem here is that the argument token of setParseAction is already a list. Instead of operating on str(token_argument) i should operate on str(token_argument[0]) and return that. No more deepening.
edit: though apparently it doesn't happen always. Happened to me with a word action but when i tried to 'unwrap' element zero of a 'And' expression result from a setParseAction functor it gave me the first subexpression.
Man, i'd like consistency here.

Related

Nested fragments UML understanding?

I have a UML book and I am reading about nested fragments. It shows an example of a nested fragment. But what I dont get.. why does it say "If the condition "cancelation not sucessful" is true when entering this fragment (i.e. cancelation was unsuccesful) the interaction within this fragment is no longer executed".
What I have learned before is that a condition should be true before the interaction will be executed? But in this case it says the opposite of it.. (because they say it should be false to execute the interaction)
See for the image: https://ibb.co/CmstLcX
I think this is simply a typo in the book. The diagram makes sense, but the text describes nonsense. While the condition is true, the messages in the loop will happen up to three times.
Maybe the author got confused, because the message immediately before the loop is Cancellation. I assume the loop guard is referring to the success of the Order cancellation message, not to this Cancellation.
By the way, reply messages need to have the name of the original message. Most people get this wrong (and some textbooks). I'll grant that the text on the reply message is often meant to be the return value. As such it should be separated with a colon from the name of the message. In your case it is probably not necessary to repeat the message name, even though techically required. However, the colon is mandatory.
If it is the return value, all reply messages in the loops return the value Acceptance. I wonder, how the guard can then evaluate to true after the first time. Maybe it is only showing the scenario for this case. This is perfectly Ok. A sequence diagram almost never shows all possible scenarios. However, then the loop doesn't make sense. I guess the author didn't mean to return a specific value.
Or maybe it is the assignment target. In this case, it should look like this: Acceptance=Order Cancellation. Then the Acceptance attribute of the Dispatcher Workstation would be filled with whatever gets returned by the Order Cancellation message. Of course, then I would expect this attribute to be used in the guard, like this [not Acceptance].
A third possiblity is, that the author didn't mean synchroneous communication and just wanted to send signals. The Acceptance Signal could well contain an attribute Cancellation not successful. Then of course, no filled arrows and no dashed lines.
I can even think of a fourth possibility. Maybe the author wanted to show the name of an out parameter of the called operation. But this would officially look like this: Order Cancellation(Acceptance). Again the name of the message could be omitted, but the round brackets are needed to make the intention clear.
I think the diagram leaves a lot more questions open than you asked.
It only says "If the cancellation was not successful then" (do exception handling). This is pretty straight forward. if not false is the same as if true. Since it is within a Break fragment, this will be performed (with what ever is inside) and as a result will break the fragment where it is contained (usually some loop). Having a break just stand alone seems a bit odd (not to say wrong).
UML 2.5 p. 581:
17.6.3.9 Break
The interactionOperator break designates that the CombinedFragment represents a breaking scenario in the sense that the operand is a scenario that is performed instead of the remainder of the enclosing InteractionFragment. A break operator with a guard is chosen when the guard is true and the rest of the enclosing Interaction Fragment is ignored. When the guard of the break operand is false, the break operand is ignored and the rest of the enclosing InteractionFragment is chosen. The choice between a break operand without a guard and the rest of the enclosing InteractionFragment is done non-deterministically.
A CombinedFragment with interactionOperator break should cover all Lifelines of the enclosing InteractionFragment.
Except for that: you should not take fragments too serious. Graphical programming is nonsense. You make only spare use of any such constructs and only if they help understanding certain behavior. Do not get tempted to re-document existing code this way. Code is much more dense and better to read. YMMV

Ensure variable is a list

I often find myself in a situation where I have a variable that may or may not be a list of items that I want to iterate over. If it's not a list I'll make a list out of it so I can still iterate over it. That way I don't have to write the code inside the loop twice.
def dispatch(stuff):
if type(stuff) is list:
stuff = [stuff]
for item in stuff:
# this could be several lines of code
do_something_with(item)
What I don't like about this is (1) the two extra lines (2) the type checking which is generally discouraged (3) besides I really should be checking if stuff is an iterable because it could as well be a tuple, but then it gets even more complicated. The point is, any robust solution I can think of involves an unpleasant amount of boilerplate code.
You cannot ensure stuff is a list by writing for item in [stuff] because it will make a nested list if stuff is already a list, and not iterate over the items in stuff. And you can't do for item in list(stuff) either because the constructor of list throws an error if stuff is not an iterable.
So the question I'd like to ask: is there anything obvious I've missed to the effect of ensurelist(stuff), and if not, can you think of a reason why such functionality is not made easily accessible by the language?
Edit:
In particular, I wonder why list(x) doesn't allow x to be non-iterable, simply returning a list with x as a single item.
Consider the example of the classes defined in the io module, which provide separate write and writelines methods for writing a single line and writing a list of lines. Provide separate functions that do different things. (One can even use the other.)
def dispatch(stuff):
do_something_with(item)
def dispatch_all(stuff):
for item in stuff:
dispatch(item)
The caller will have an easier time deciding whether dispatch or dispatch_all is the correct function to use than your do-it-all function will have deciding whether it needs to iterate over its argument or not.

Why would more array accesses perform better?

I'm taking a course on coursera that uses minizinc. In one of the assignments, I was spinning my wheels forever because my model was not performing well enough on a hidden test case. I finally solved it by changing the following types of accesses in my model
from
constraint sum(neg1,neg2 in party where neg1 < neg2)(joint[neg1,neg2]) >= m;
to
constraint sum(i,j in 1..u where i < j)(joint[party[i],party[j]]) >= m;
I dont know what I'm missing, but why would these two perform any differently from eachother? It seems like they should perform similarly with the former being maybe slightly faster, but the performance difference was dramatic. I'm guessing there is some sort of optimization that the former misses out on? Or, am I really missing something and do those lines actually result in different behavior? My intention is to sum the strength of every element in raid.
Misc. Details:
party is an array of enum vars
party's index set is 1..real_u
every element in party should be unique except for a dummy variable.
solver was Gecode
verification of my model was done on a coursera server so I don't know what optimization level their compiler used.
edit: Since minizinc(mz) is a declarative language, I'm realizing that "array accesses" in mz don't necessarily have a direct corollary in an imperative language. However, to me, these two lines mean the same thing semantically. So I guess my question is more "Why are the above lines different semantically in mz?"
edit2: I had to change the example in question, I was toting the line of violating coursera's honor code.
The difference stems from the way in which the where-clause "a < b" is evaluated. When "a" and "b" are parameters, then the compiler can already exclude the irrelevant parts of the sum during compilation. If "a" or "b" is a variable, then this can usually not be decided during compile time and the solver will receive a more complex constraint.
In this case the solver would have gotten a sum over "array[int] of var opt int", meaning that some variables in an array might not actually be present. For most solvers this is rewritten to a sum where every variable is multiplied by a boolean variable, which is true iff the variable is present. You can understand how this is less efficient than an normal sum without multiplications.

How to find the optimal processing order?

I have an interesting question, but I'm not sure exactly how to phrase it...
Consider the lambda calculus. For a given lambda expression, there are several possible reduction orders. But some of these don't terminate, while others do.
In the lambda calculus, it turns out that there is one particular reduction order which is guaranteed to always terminate with an irreducible solution if one actually exists. It's called Normal Order.
I've written a simple logic solver. But the trouble is, the order in which it processes the constraints seems to have a huge effect on whether it finds any solutions or not. Basically, I'm wondering whether something like a normal order exists for my logic programming language. (Or wether it's impossible for a mere machine to deterministically solve this problem.)
So that's what I'm after. Presumably the answer critically depends on exactly what the "simple logic solver" is. So I will attempt to briefly describe it.
My program is closely based on the system of combinators in chapter 9 of The Fun of Programming (Jeremy Gibbons & Oege de Moor). The language has the following structure:
The input to the solver is a single predicate. Predicates may involve variables. The output from the solver is zero or more solutions. A solution is a set of variable assignments which make the predicate become true.
Variables hold expressions. An expression is an integer, a variable name, or a tuple of subexpressions.
There is an equality predicate, which compares expressions (not predicates) for equality. It is satisfied if substituting every (bound) variable with its value makes the two expressions identical. (In particular, every variable equals itself, bound or not.) This predicate is solved using unification.
There are also operators for AND and OR, which work in the obvious way. There is no NOT operator.
There is an "exists" operator, which essentially creates local variables.
The facility to define named predicates enables recursive looping.
One of the "interesting things" about logic programming is that once you write a named predicate, it typically works fowards and backwards (and sometimes even sideways). Canonical example: A predicate to concatinate two lists can also be used to split a list into all possible pairs.
But sometimes running a predicate backwards results in an infinite search, unless you rearrange the order of the terms. (E.g., swap the LHS and RHS of an AND or an OR somehwere.) I'm wondering whether there's some automated way to detect the best order to run the predicates in, to ensure prompt termination in all cases where the solution set is exactually finite.
Any suggestions?
Relevant paper, I think: http://www.cs.technion.ac.il/~shaulm/papers/abstracts/Ledeniov-1998-DCS.html
Also take a look at this: http://en.wikipedia.org/wiki/Constraint_logic_programming#Bottom-up_evaluation

In Functional Programming, is it considered a bad practice to have incomplete pattern matchings

Is it generally considered a bad practice to use non-exhaustive pattern machings in functional languages like Haskell or F#, which means that the cases specified don't cover all possible input cases?
In particular, should I allow code to fail with a MatchFailureException etc. or should I always cover all cases and explicitly throw an error if necessary?
Example:
let head (x::xs) = x
Or
let head list =
match list with
| x::xs -> x
| _ -> failwith "Applying head to an empty list"
F# (unlike Haskell) gives a warning for the first code, since the []-case is not covered, but can I ignore it without breaking functional style conventions for the sake of succinctness? A MatchFailure does state the problem quite well after all ...
If you complete your pattern-matchings with a constructor [] and not the catch-all _, the compiler will have a chance to tell you to look again at the function with a warning the day someone adds a third constructor to lists.
My colleagues and I, working on a large OCaml project (200,000+ lines), force ourselves to avoid partial pattern-matching warnings (even if that means writing | ... -> assert false from time to time) and to avoid so-called "fragile pattern-matchings" (pattern matchings written in such a way that the addition of a constructor may not be detected) too. We consider that the maintainability benefits.
Explicit is better than implicit (borrowed from the Zen of Python ;))
It's exactly the same as in a C switch over an enum... It's better to write all the cases (with a fall through) rather than just putting a default, because the compiler will tell you if you add new elements to the enumeration and you forgot to handle them.
I think that it depends quite a bit on the context. Are you trying to write robust, easy to debug code, or are you trying to write something simple and succinct?
If I were working on a long term project with multiple developers, I'd put in the assert to give a more useful error message. I also agree with Pascal's comment that not using a wildcard would be ideal from a software engineering perspective.
If I were working on a smaller scale project on which I was the only developer, I wouldn't think twice about using an incomplete match. If necessary, you can always check the compiler warnings.
I think it also depends a bit on the types you're matching against. Realistically, no extra union cases will be added to the list type, so you don't need to worry about fragile matching. On the other hand, in code that you control and are actively working on, there may well be types which are in flux and have additional union cases added, which means that protecting against fragile matching may be worth it.
This is a special case of a more general question, which is "should you ever create partial functions". Incomplete pattern matches are only one example of partial functions.
As a rule total functions are preferable. When you find yourself looking at a function that just has to be partial, ask yourself if you can solve the problem in the type system first. Sometimes that is more trouble than its worth (e.g. creating a whole type of lists with known lengths just to avoid the "head []" problem). So its a trade-off.
Or maybe you just asking whether its good practice in partial functions to say things like
head [] = error "head: empty list"
In which case the answer is YES!
The Haskell prelude (standard functions) contains many partial functions, e.g. head and tail only work on non-empty lists, but don't ask me why.
This question has two aspects.
For the user of the API, failwith... simply throws a System.Exception, which is unspecific (and therefore is sometimes considered a bad practice in itself). On the other hand, the implicitly thrown MatchFailureException can be specifically caught using a type test pattern, and therefore is preferrable.
For the reviewer of the implementation code, failwith... clearly documents that the implementer has at least given some thought about the possible cases, and therefore is preferrable.
As the two aspects contradict each other, the right answer depends on the circumstances (see also kvb's answer). A solution which is 100% "correct" from any point of view would have to
deal with every case explicitly,
throw a specific exception where necessary, and
clearly document the exception
Example:
/// <summary>Gets the first element of the list.</summary>
/// <exception cref="ArgumentException">The list is empty.</exception>
let head list =
match list with
| [] -> invalidArg "list" "The list is empty."
| x::xs -> x

Resources