Recursive methods on CUDD - cudd

This is a follow-up to a suggestion by #DCTLib in the post below.
Cudd_PrintMinterm, accessing the individual minterms in the sum of products
I've been pursuing part (b) of the suggestion and will share some pseudo-code in a separate post.
Meanwhile, in his part (b) suggestion, #DCTLib posted a link to https://github.com/VerifiableRobotics/slugs/blob/master/src/BFAbstractionLibrary/BFCudd.cpp. I've been trying to read this program. There is a recursive function in the classic Somenzi paper, Binary Decision Diagrams, which describes an algo to compute the number of satisfying assignments (below, Fig. 7). I've been trying to compare the two, slugs and Fig. 7. But having a hard time seeing any similarities. But then C is mostly inscrutable to me. Do you know if slugs BFCudd is based on Somenze fig 7, #DCTLib?
Thanks,
Gui

It's not exactly the same algorithm.
There are two main differences:
First, the "SatHowMany" function does not take a cube of variables to consider for counting. Rather, that function considers all variables. The fact that "recurse_getNofSatisfyingAssignments" supports cubes manifest in the function potentially returning NaN (not a number) if a variable is found in the BDD that does not appear in the cube. The rest of the differences seem to stem from this support.
Second, SatHowMany returns the number of satisfying assignments to all n variables for a node. This leads, for instance, to the division by 2 in line -4. "recurse_getNofSatisfyingAssignments" only returns the number of assignments for the remaining variables to be considered.
Both algorithms cache information - in "SatHowMany", it's called a table, in "recurse_getNofSatisfyingAssignments" it's called a buffer. Note that in line 24 of "recurse_getNofSatisfyingAssignments", there is a constant string thrown. This means that either the function does not work, or the code is never reached. Most likely it's the latter.
Function "SatHowMany" seems to assume that it gets a BDD node - it cannot be a pointer to a complemented BDD node. Function "recurse_getNofSatisfyingAssignments" works correctly with complemented nodes, as a DdNode* may store a pointer to a complemented node.
Due to the support for cubes, "recurse_getNofSatisfyingAssignments" supports flexible variable ordering (hence the lookup of "cuddI" which denotes for a variable where it is in the current BDD variable ordering). For function SatHowMany, the variable ordering does not make a difference.

Related

Unclear why functions from Data.Ratio are not exposed and how to work around

I am implementing an algorithm using Data.Ratio (convergents of continued fractions).
However, I encounter two obstacles:
The algorithm starts with the fraction 1%0 - but this throws a zero denominator exception.
I would like to pattern match the constructor a :% b
I was exploring on hackage. An in particular the source seems to be using exactly these features (e.g. defining infinity = 1 :% 0, or pattern matching for numerator).
As beginner, I am also confused where it is determined that (%), numerator and such are exposed to me, but not infinity and (:%).
I have already made a dirty workaround using a tuple of integers, but it seems silly to reinvent the wheel about something so trivial.
Also would be nice to learn how read the source which functions are exposed.
They aren't exported precisely to prevent people from doing stuff like this. See, the type
data Ratio a = a:%a
contains too many values. In particular, e.g. 2/6 and 3/9 are actually the same number in ℚ and both represented by 1:%3. Thus, 2:%6 is in fact an illegal value, and so is, sure enough, 1:%0. Or it might be legal but all functions know how to treat them so 2:%6 is for all observable means equal to 1:%3 – I don't in fact know which of these options GHC chooses, but at any rate it's an implementation detail and could change in future releases without notice.
If the library authors themselves use such values for e.g. optimisation tricks that's one thing – they have after all full control over any algorithmic details and any undefined behaviour that could arise. But if users got to construct such values, it would result in brittle code.
So – if you find yourself starting an algorithm with 1/0, then you should indeed not use Ratio at all there but simply store numerator and denominator in a plain tuple, which has no such issues, and only make the final result a Ratio with %.

Compressed trie implementation?

I am going through a Udacity course and in one of the lectures (https://www.youtube.com/watch?v=gPQ-g8xkIAQ&feature=player_embedded), the professor gives the function high_common_bits which (taken verbatim from the lecture) looks like this in pseudocode:
function high_common_bits(a,b):
return:
- high order bits that a+b in common
- highest differing bit set
- all remaining bits clear
As an example:
a = 10101
b = 10011
high_common_bits(a,b) => 10100
He then says that this function is used in highly-optimized implementations of tries. Does anyone happen to know which exact implementation he's referring to?
If you are looking for a highly optimized bitwise compressed trie (aka Radix Tree). The BSD routing table uses one in it's implementation. The code is not easy to read though.
He was talking about Succinct Tries, tries in which each node requires only two bits to store (the theoretical minimum).
Steve Hanov wrote a very approachable blog post on Succinct Tries here. You can also read the original paper by Guy Jacobson (written as recently as 1989) which introduced them here.
A compressed trie stores a prefix in one node, then branches from that node to each possible item that's been seen that starts with that prefix.
In this case he's apparently doing a bit-wise trie, so it's storing a prefix of bits -- i.e., the bits at the beginning that the items have in common go in one node, then there are two branches from that node, one to a node for the next bit being a 0, and the other for the next bit being a 1. Presumably those nodes will be compressed as well, so they won't just store the next single bit, but instead store a number of bits that all matched in the items inserted into the trie so far.
In fact, the next bit following a given node may not be stored in the following nodes at all. That bit can be implicit in the link that's followed, so the next nodes store only bits after that.

How to find the optimal processing order?

I have an interesting question, but I'm not sure exactly how to phrase it...
Consider the lambda calculus. For a given lambda expression, there are several possible reduction orders. But some of these don't terminate, while others do.
In the lambda calculus, it turns out that there is one particular reduction order which is guaranteed to always terminate with an irreducible solution if one actually exists. It's called Normal Order.
I've written a simple logic solver. But the trouble is, the order in which it processes the constraints seems to have a huge effect on whether it finds any solutions or not. Basically, I'm wondering whether something like a normal order exists for my logic programming language. (Or wether it's impossible for a mere machine to deterministically solve this problem.)
So that's what I'm after. Presumably the answer critically depends on exactly what the "simple logic solver" is. So I will attempt to briefly describe it.
My program is closely based on the system of combinators in chapter 9 of The Fun of Programming (Jeremy Gibbons & Oege de Moor). The language has the following structure:
The input to the solver is a single predicate. Predicates may involve variables. The output from the solver is zero or more solutions. A solution is a set of variable assignments which make the predicate become true.
Variables hold expressions. An expression is an integer, a variable name, or a tuple of subexpressions.
There is an equality predicate, which compares expressions (not predicates) for equality. It is satisfied if substituting every (bound) variable with its value makes the two expressions identical. (In particular, every variable equals itself, bound or not.) This predicate is solved using unification.
There are also operators for AND and OR, which work in the obvious way. There is no NOT operator.
There is an "exists" operator, which essentially creates local variables.
The facility to define named predicates enables recursive looping.
One of the "interesting things" about logic programming is that once you write a named predicate, it typically works fowards and backwards (and sometimes even sideways). Canonical example: A predicate to concatinate two lists can also be used to split a list into all possible pairs.
But sometimes running a predicate backwards results in an infinite search, unless you rearrange the order of the terms. (E.g., swap the LHS and RHS of an AND or an OR somehwere.) I'm wondering whether there's some automated way to detect the best order to run the predicates in, to ensure prompt termination in all cases where the solution set is exactually finite.
Any suggestions?
Relevant paper, I think: http://www.cs.technion.ac.il/~shaulm/papers/abstracts/Ledeniov-1998-DCS.html
Also take a look at this: http://en.wikipedia.org/wiki/Constraint_logic_programming#Bottom-up_evaluation

Ordering of parameters to make use of currying

I have twice recently refactored code in order to change the order of parameters because there was too much code where hacks like flip or \x -> foo bar x 42 were happening.
When designing a function signature what principles will help me to make the best use of currying?
For languages that support currying and partial-application easily, there is one compelling series of arguments, originally from Chris Okasaki:
Put the data structure as the last argument
Why? You can then compose operations on the data nicely. E.g. insert 1 $ insert 2 $ insert 3 $ s. This also helps for functions on state.
Standard libraries such as "containers" follow this convention.
Alternate arguments are sometimes given to put the data structure first, so it can be closed over, yielding functions on a static structure (e.g. lookup) that are a bit more concise. However, the broad consensus seems to be that this is less of a win, especially since it pushes you towards heavily parenthesized code.
Put the most varying argument last
For recursive functions, it is common to put the argument that varies the most (e.g. an accumulator) as the last argument, while the argument that varies the least (e.g. a function argument) at the start. This composes well with the data structure last style.
A summary of the Okasaki view is given in his Edison library (again, another data structure library):
Partial application: arguments more likely to be static usually appear before other arguments in order to facilitate partial application.
Collection appears last: in all cases where an operation queries a single collection or modifies an existing collection, the collection argument will appear last. This is something of a de facto standard for Haskell datastructure libraries and lends a degree of consistency to the API.
Most usual order: where an operation represents a well-known mathematical function on more than one datastructure, the arguments are chosen to match the most usual argument order for the function.
Place the arguments that you are most likely to reuse first. Function arguments are a great example of this. You are much more likely to want to map f over two different lists, than you are to want to map many different functions over the same list.
I tend to do what you did, pick some order that seems good and then refactor if it turns out that another order is better. The order depends a lot on how you are going to use the function (naturally).

Misuse of a variables value?

I came across an instance where a solution to a particular problem was to use a variable whose value when zero or above meant the system would use that value in a calculation but when less than zero would indicate that the value should not be used at all.
My initial thought was that I didn't like the multipurpose use of the value of the variable: a.) as a range to be using in a formula; b.) as a form of control logic.
What is this kind of misuse of a variable called? Meta-'something' or is there a classic antipattern that this fits?
Sort of feels like when a database field is set to null to represent not using a value and if it's not null then use the value in that field.
Update:
An example would be that if a variable's value is > 0 I would use the value if it's <= 0 then I would not use the value and decided to perform some other logic.
Values such as these are often called "distinguished values". By far the most common distinguished value is null for reference types. A close second is the use of distinguished values to indicate unusual conditions (e.g. error return codes or search failures).
The problem with distinguished values is that all client code must be aware of the existence of such values and their associated semantics. In practical terms, this usually means that some kind of conditional logic must be wrapped around each call site that obtains such a value. It is far too easy to forget to add that logic, obtaining incorrect results. It also promotes copy-and-paste code as the boilerplate code required to deal with the distinguished values is often very similar throughout the application but difficult to encapsulate.
Common alternatives to the use of distinguished values are exceptions, or distinctly typed values that cannot be accidentally confused with one another (e.g. Maybe or Option types).
Having said all that, distinguished values may still play a valuable role in environments with extremely tight memory availability or other stringent performance constraints.
I don't think what your describing is a pure magic number, but it's kind of close. It's similar to the situation in pre-.NET 2.0 where you'd use Int32.MinValue to indicate a null value. .NET 2.0 introduced Nullable and kind of alleviated this issue.
So you're describing the use of a variable who's value really means something other than it's value -- -1 means essentially the same as the use of Int32.MinValue as I described above.
I'd call it a magic number.
Hope this helps.
Using different ranges of the possible values of a variable to invoke different functionality was very common when RAM and disk space for data and program code were scarce. Nowadays, you would use a function or an additional, accompanying value (boolean, or enumeration) to determine the action to take.
Current OS's suggest 1GiB of RAM to operate correctly, when 256KiB was high very few years ago. Cheap disk space has gone from hundreds of MiB to multiples of TiB in a matter of months. Not too long ago I wrote programs for 640KiB of RAM and 10MiB of disk, and you would probably hate them.
I think it would be good to cope with code like that if it's just a few years old (refactor it!), and denounce it as bad practice if it's recent.

Resources