Regular language demonstration

Consider this function:
faro(x, z), which takes the value z if x = ε, and a · faro(z, y) if x = ay.
So faro is a recursive function; for instance, faro(00110, 0101) = 000110110.
I have to demonstrate that if L and M are regular languages over the same alphabet, then faro(L, M) = {faro(x, z) | x is in L and z is in M} is regular.
I'm not sure which is the best method to demonstrate the result; I'm studying this type of problem for the first time.
Is it more convenient to construct a DFA or to use the pumping lemma?

It looks like what's going on here is that the faro function takes two strings and interleaves them: faro(a1a2...an, b1b2...bm) = a1b1a2b2...anbn b(n+1)...bm (assuming w.l.o.g. that m >= n).
If you can demonstrate that semantic fact about the faro function to your and others' satisfaction, then producing an NFA that does the same thing seems to be within reach:
create a state in the NFA for each triple (q, q', s), where q is a state in L's DFA, q' is a state in M's DFA, and s is one of L, M, L' or M' (so there are four states for each pair of DFA states);
make the start state (q0, q0', L) to signify that we will start reading according to L's DFA with the possibility of alternating still available
probably, the accepting states are just those where q is accepting in L's DFA and q' is accepting in M's DFA... but I will let you work through the details here
the transitions will work as follows:
When you see symbol a in state (q, q', L), then go to states (w, q', M) and (w, q', L'), where L's DFA transitions from q to w on a. This means we can either accept the next interleaved symbol of the string from M, or we can just read the rest of the string from L without a possibility of taking any more of the string from M
When you see symbol a in state (q, q', M), go to states (q, w', L) and (q, w', M'), similar to above
When you see symbol a in state (q, q', L'), go to state (w, q', L'); remember, L' means we gave up on reading interleaved symbols from the string in M
When you see symbol a in state (q, q', M'), go to the state (q, w', M'), similar to the above
This is an NFA so if at least one path through accepts, the string is accepted. This is as it should be since if a string can be interpreted as some result of faro(x, y) for x in L and y in M, our NFA should accept it. Our NFA works by keeping track of:
how much of x has been read so far, by the component q in (q, q', s)
how much of y has been read so far, by the component q' in (q, q', s)
whether we need to read a symbol from x or y next, by the component s in (q, q', s)
whether we will subsequently be able to read from x or y, by the component s in (q, q', s)
I think the next steps are probably doing a couple of simple examples to see whether this construction works and, if so, then proceeding to prove that this NFA's language is the one desired (a proof by induction would be a good idea here). Once you prove there's an NFA for your language, you're done.
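To make this concrete, here is a minimal Haskell sketch of the transition relation described above (the Mode type, the nfaStep name, and the encoding of the DFAs as bare transition functions are my own; as the answer says, edge cases such as x = ε with a non-empty z still need to be checked):

-- s is encoded by Mode: ReadL/ReadM alternate between the two strings,
-- OnlyL/OnlyM mean we have given up on the other string.
data Mode = ReadL | ReadM | OnlyL | OnlyM deriving (Eq, Show)

-- One NFA step: given the transition functions of L's and M's DFAs, a product
-- state (q, q', mode) and an input symbol, return the list of successor states.
nfaStep :: (q -> c -> q)              -- delta of L's DFA
        -> (q' -> c -> q')            -- delta of M's DFA
        -> (q, q', Mode) -> c -> [(q, q', Mode)]
nfaStep dL dM (q, q', mode) a = case mode of
  ReadL -> [(dL q a, q', ReadM), (dL q a, q', OnlyL)]
  ReadM -> [(q, dM q' a, ReadL), (q, dM q' a, OnlyM)]
  OnlyL -> [(dL q a, q', OnlyL)]
  OnlyM -> [(q, dM q' a, OnlyM)]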

Related

Coq - Rewriting a FMap Within a Relation

I am new to Coq, and was hoping that someone with more experience could help me with a problem I am facing.
I have defined a relation to represent the evaluation of a program in an imaginary programming language. The goal of the language is to unify function calls and a constrained subset of macro invocations under a single semantics. Here is the definition of the relation, with its first constructor (I am omitting the rest to save space and avoid unnecessary details).
Inductive EvalExpr:
store -> (* Store, mapping L-values to R-values *)
environment -> (* Local environment, mapping function-local variable names to L-values *)
environment -> (* Global environment, mapping global variable names to L-values *)
function_table -> (* Mapping function names to function definitions *)
macro_table -> (* Mapping macro names to macro definitions *)
expr -> (* The expression to evaluate *)
Z -> (* The value the expression terminates to *)
store -> (* The final state of the program store after evaluation *)
Prop :=
(* Numerals evaluate to their integer representation and do not
change the store *)
| E_Num : forall S E G F M z,
EvalExpr S E G F M (Num z) z S
...
The mappings are defined as follows:
Module Import NatMap := FMapList.Make(OrderedTypeEx.Nat_as_OT).
Module Import StringMap := FMapList.Make(OrderedTypeEx.String_as_OT).
Definition store : Type := NatMap.t Z.
Definition environment : Type := StringMap.t nat.
Definition function_table : Type := StringMap.t function_definition.
Definition macro_table : Type := StringMap.t macro_definition.
I do not think the definitions of the other types are relevant to this question, but I can add them if needed.
Now when trying to prove the following lemma, which seems intuitively obvious, I get stuck:
Lemma S_Equal_EvalExpr_EvalExpr : forall S1 S2,
NatMap.Equal S1 S2 ->
forall E G F M e v S',
EvalExpr S1 E G F M e v S' <-> EvalExpr S2 E G F M e v S'.
Proof.
intros. split.
(* -> *)
- intros. induction H0.
+ (* Num *)
Fail constructor.
Abort.
If I were able to rewrite S2 for S1 in the goal, the proof would be trivial; however, if I try to do this, I get the following error:
H : NatMap.Equal S S2
(* Other premises *)
---------------------
EvalExpr S2 E G F M (Num z) z S
rewrite <- H.
Found no subterm matching "NatMap.find (elt:=Z) ?M2433 S2" in the current goal.
I think this has to do with finite mappings being abstract types, and thus not being rewritable like concrete types are. However, I noticed that I can rewrite mappings within other equations/relations found in Coq.FSets.FMapFacts. How would I tell Coq to let me rewrite mapping types inside my EvalExpr relation?
Update: Here is a gist containing a minimal working example of my problem. The definitions of some of the mapping types have been altered for brevity, but the problem is the same.
The issue here is that the relation NatMap.Equal, which says that two maps have the same bindings, is not the same as the notion of equality in Coq's logic, =. While it is always possible to rewrite with =, rewriting with some other relation R is only possible if you can prove that the property you are trying to show is compatible with it. This is already done for the relations in FMap, which is why rewriting there works.
You have two options:
Replace FMap with an implementation for which the intended map equality coincides with =, a property usually known as extensionality. There are many libraries that provide such data structures, including my own extructures, but also finmap and std++. Then, you never need to worry about a custom equality relation; all the important properties of maps work with =.
Keep FMap, but use the generalized rewriting mechanism to allow rewriting with FMap.Equal. To do this, you probably need to modify the definition of your execution relation so that it is compatible with FMap.Equal. Unfortunately, I believe the only way to do this is by explicitly adding equality hypotheses everywhere, e.g.
Definition EvalExpr' S E G F M e v S' :=
exists S0 S0', NatMap.Equal S S0 /\
NatMap.Equal S' S0' /\
EvalExpr S0 E G F M e v S0'.
Since this will pollute your definitions, I would not recommend this approach.
Arthur's answer explains the problem very well.
One other (?) way to do it could be to modify your Inductive definition of EvalExpr to explicitly use the equality that you care about (NatMap.Equal instead of eq). You will have to say in each rule that it is enough for two maps to be Equal.
For example:
| E_Num : forall S E G F M z,
EvalExpr S E G F M (Num z) z S
becomes
| E_Num : forall S1 S2 E G F M z,
NatMap.Equal S1 S2 ->
EvalExpr S1 E G F M (Num z) z S2
Then when you want to prove your lemma and apply the constructor, you will have to provide a proof that S1 and S2 are equal (you'll have to reason a little, using the fact that NatMap.Equal is an equivalence relation).

How to work around the first-order constraint on arrows?

What I mean by first-order constraint
First, I'll explain what I mean by first-order constraint on arrows:
Due to the way arrows desugar, you cannot use a locally bound name where an arrow command is expected in the arrow do-notation.
Here is an example to illustrate:
proc x -> f -< x + 1 desugars to arr (\x -> x + 1) >>> f and similarly proc x -> g x -< () would desugar to arr (\x -> ()) >>> g x, where the second x is a free variable. The GHC user guide explains this and says that when your arrow is also a monad you may make an instance of ArrowApply and use app to get around this. Something like, proc x -> g x -<< () becomes arr (\x -> (g x, ())) >>> app.
My Question
Yampa defines the accumHold function with this type: a -> SF (Event (a -> a)) a.
Due to this first-order limitation of arrows, I'm struggling to write the following function:
accumHoldNoiseR :: (RandomGen g, Random a) => (a,a) -> g -> SF (Event (a -> a)) a
accumHoldNoiseR r g = proc f -> do
n <- noiseR r g -< ()
accumHold n -< f
The definition above doesn't work because n is not in scope after desugaring.
Or, similarly, this function, where the first component of the input pair is meant to be the initial value passed to accumHold:
accumHold' :: SF (a, Event (a -> a)) a
accumHold' = ...
Is there some combinator or trick that I'm missing? Or is it not possible to write these definitions without an ArrowApply instance?
tl;dr: Is it possible to define accumHoldNoiseR :: (RandomGen g, Random a) => (a,a) -> g -> SF (Event (a -> a)) a or accumHold' :: SF (a, Event (a -> a)) a in Yampa?
Note: There is no instance of ArrowApply for SF. My understanding is that it doesn't make sense to define one either. See "Programming with Arrows" for details.
This is a theoretical answer. Look to Roman Cheplyaka's answer to this question, which deals more with the practical details of what you're trying to achieve.
The reason n is out of scope is that for it to be in scope to use there, you would have the equivalent of bind or >>= from monads. It's the use of the results of a previous computation as a functional input to the next which makes something as powerful as a monad.
Hence you can supply n as a function argument to a subsequent arrow exactly when you can make an ArrowApply instance.
Chris Kuklewicz correctly points out in his comment that -<< would bring n into scope - it also uses app, so you need an ArrowApply instance.
Summary
Not unless you use ArrowApply. This is what ArrowApply is for.
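For reference, this is the class being discussed, as defined in Control.Arrow:

class Arrow a => ArrowApply a where
  app :: a (a b c, b) c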
noiseR is a signal function; it produces a stream of random numbers, not just one random number (for that, you'd just use randomR from System.Random).
On the other hand, the first argument of accumHold is just one, initial, value.
So this is not just some limitation — it actually prevents you from committing a type error.
If I understand correctly what you're trying to do, then simply using randomR should do the trick. Otherwise, please clarify why you need noiseR.
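If a single random draw is what's wanted, a minimal sketch along the lines of that suggestion could look like this (accumHoldRandR is a hypothetical name of mine, not a Yampa function):

import FRP.Yampa (SF, Event, accumHold)
import System.Random (RandomGen, Random, randomR)

-- Draw one random initial value up front, outside the signal function,
-- then accumulate events over it as usual.
accumHoldRandR :: (RandomGen g, Random a) => (a, a) -> g -> SF (Event (a -> a)) a
accumHoldRandR range g = accumHold n0
  where
    (n0, _g') = randomR range g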
To help others understand how I worked around this I'll answer my own question.
I was trying to implement the game pong. I wanted the ball to start with a random velocity each round. I wanted to use accumHold to define the ball's velocity. I had some code like this:
ballPos = proc e -> mdo -- note the recursive do
{- some clipping calculations using (x,y) -}
...
vx <- accumHold 100 -< e `tag` collisionResponse paddleCollision
vy <- accumHold 100 -< e `tag` collisionResponse ceilingFloorCollision
(x,y) <- integral -< (vx,vy)
returnA -< (x,y)
I wanted to replace the 100s with random values (presumably from noiseR).
How I solved this instead is to accumulate over the direction, where collisionResponse just flips the sign (eventually I'll want to use the angle of the velocity relative to wall/paddle):
ballPos = proc (initV, e) -> mdo
{- some clipping calculations using (x,y) -}
...
(iVx,iVy) <- hold (0,0) -< initV
vx <- accumHold 1 -< e `tag` collisionResponse paddleCollision
vy <- accumHold 1 -< e `tag` collisionResponse ceilingFloorCollision
(x,y) <- integral -< (iVx*vx,iVy*vy)
returnA -< (x,y)
Lesson Learned:
You can often separate the value/state you want to accumulate into a behavior describing how it changes and a "magnitude" that describes its current value, taking the behavior as input. In my case, I separate out the magnitude of the initial velocity, pass that as input to the signal function, and use accumHold to compute the effect on the ball (the behavior) of having collisions. So regardless of what the initial velocity was, hitting the walls "reflects" the ball. And that's exactly what the accumHold is accumulating.

Haskell starter questions... please explain it to me

I am supposed to write a Haskell program but I don't really know where to start. I would be really grateful if you could point me to some resources to read, or explain the question to me. I am sure this is something totally amateurish, but I really need a starting point.
data DFA q o = DFA (q -> o -> q) q [q]
data NFA q o = NFA (q -> o -> [q]) [q] [q]
-- I really realy don't understand the declarations here
-- I can guess that q is somewhat related to Q and o to E, but don't get what it really means
data Q = Q0 | Q1 | Q2
deriving (Eq, Enum, Bounded)
data E = A | B
-- what does n1 do ??
n1 :: NFA Q E
n1 = NFA d [Q0] [Q2] -- i see [Q0] refers to set of initial states and [Q2] refers to final states :)
where
d Q0 A = [Q0]
d Q0 B = [Q0, Q1]
d Q1 _ = [Q2]
d Q2 _ = []
-- the following functions are for me to write
starDFA :: Eq q => DFA q o -> [o] -> Bool
--for the above function, what are the arguments the function takes in ?
--how can we relate q with Q and [o] with [E] ??
Any explanations or references to proper starting points will be really helpful to me.
Sorry to ask such a dumb question, but I really don't know where to start :)
Thanks
Learn You a Haskell for Great Good! is probably the best Haskell tutorial at the moment. I recommend reading through the whole thing, but if you are in a hurry, I'll point out some of the more relevant sections for this assignment.
The main part of the code you are given is the data declarations, so once you're familiar with the basics (chapter 2 and the first section of chapter 3), a good place to dive in is the Algebraic data types intro, Type variables and Type parameters.
The above should be enough for deciphering the data declarations and understanding the relationship between q vs Q and o vs E.
Now to implement the actual function, you need to be familiar with how deterministic finite automata work and then know enough Haskell to write the actual implementation. Chapter 4 and chapter 5 are the most relevant chapters for this in the tutorial and you might also find the section on standard library list functions useful.
Once you get to this point, if you are stuck with the implementation, you can post another question with the code you have written so far.
In Haskell we have three ways to define a new type, using three distinct keywords: type, newtype, and data.
In your example it's the data keyword that is in use, so let's focus a bit more on it.
It's better to start with the easiest declaration coming from your code:
data E = A | B
Here, we have defined a new type E which can take only two values.
A type like this is what we call a sum type.
How can we use it?
Mostly with pattern matching.
useE :: E -> String
useE A = "This is A"
useE B = "This is B"
Now, a more complex data declaration from your code.
data Q = Q0 | Q1 | Q2 deriving (Eq, Enum, Bounded)
Again, as said previously, we have a sum type which defines a new type Q, taking three values: Q0, Q1 or Q2. But this time there is also a deriving clause, which tells the compiler that this new type implements the methods (or functions) derived (or inherited) from the Eq, Enum and Bounded classes.
What does that mean?
Let's take a look at a function.
Imagine you want to associate a number with each value of Q; how can we do that?
enumQ :: Q -> Int
enumQ x = fromEnum x
If you want more insight into this particular functionality provided by the deriving clause, read the resources which have been indicated and try :info Enum in GHCi. Note that the previous type E could also derive the same classes. As these types are fully described as the sum of an enumerable set of values (discriminated by |), we can better understand why we call them sum types.
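As a quick illustration of what those derived classes buy you (my own examples, easy to check in GHCi):

-- Enum numbers the constructors starting from 0, so
-- enumQ Q0 == 0, enumQ Q1 == 1 and enumQ Q2 == 2.
-- Bounded (together with Enum) lets you enumerate every state:
allStates :: [Q]
allStates = [minBound .. maxBound]   -- evaluates to [Q0, Q1, Q2]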
Finally, the most difficult data declarations:
data DFA q o = DFA (q -> o -> q) q [q]
data NFA q o = NFA (q -> o -> [q]) [q] [q]
In fact they are almost the same data definition, so I will go through the first one and leave the analysis of the second one to you as an exercise.
data DFA q o = DFA (q -> o -> q) q [q]
This time we must talk about type constructors and data constructors.
On the left-hand side of the equality there is the type constructor: it names the new type and lists the type parameters (here q and o) required to build it.
On the right-hand side of the equality there is the data constructor: this is the explicit plumbing which shows the reader how values of this new type are built from existing types.
Now, keeping in mind that the following are types:
[x] ::: the type of polymorphic lists; for example, [Int] is a list of Int.
x ::: a basic type, one of the existing ones (Int, Char, String, ...).
x -> y ::: the type of a function taking a value of type x and producing a value of type y.
x -> y -> z ::: the type of a function taking a value of type x and a value of type y to produce a value of type z; by currying, this can also be viewed as a function taking an x and producing a function of type y -> z. A function that takes or returns another function is what we call a higher-order function.
Then our data declaration, put in this context, has a type constructor fed two type parameters, q and o, and a data constructor that builds a value of the new type as the product of a higher-order function, a basic type, and a list type. This explains why we call it a product type.
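As a small illustration (my own example, not part of the assignment), here is a value built with the DFA data constructor, reusing the Q and E types from the question:

d1 :: DFA Q E
d1 = DFA delta Q0 [Q2]   -- transition function, start state, list of accepting states
  where
    delta Q0 A = Q1
    delta Q0 B = Q0
    delta Q1 _ = Q2
    delta Q2 _ = Q2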
That should be enough, now, to infer by yourself the answer to your question: what does n1 do?
Good luck.
From the little I understand about Haskell type declarations, the initial statements about DFA and NFA are saying something like (looking at NFA, for example):
(Left hand side:) NFA is a type that utilizes two types (q and o) in its construction.
(Right hand side:) a value of type NFA is built with a constructor also called NFA, and is composed of three components:
(1) "(q -> o -> [q])" , meaning a function that takes two parameters, one of type q and one of type o, and returns a list of q's, ([q])
(2) "[q]" , meaning one list of values of type q
(3) "[q]" , another list of values of type q
n1 seems like an instance construction of NFA, and we see
n1 = NFA d [Q0] [Q2]
So we can infer that:
(1) d is a function that takes two parameters, a 'q' and an 'o' and returns a list of q's
(2) [Q0] is a list of q's, and
(3) [Q2] is a list of q's.
And, indeed, the definition of d follows: d takes two parameters, a Q and an E, and returns a list of Q's (each of which can be Q0, Q1, or Q2) or an empty list.
I hope that helps a little and/or perhaps someone could clarify and correct my vague understanding as well.

Haskell monad return arbitrary data type

I am having trouble defining return for a custom recursive data type.
The data type is as follows:
data A a = B a | C (A a) (A a)
However, I don't know how to define the return statement, since I can't figure out when to return a B value and when to recursively return a C value.
Any help is appreciated!
One way to define a Monad instance for this type is to treat it as a free monad. In effect, this takes A a to be a little syntax with one binary operator C, and variables represented by values of type a embedded by the B constructor. That makes return the B constructor, embedding variables, and >>= the operator which performs substitution.
instance Monad A where
return = B
B x >>= f = f x
C l r >>= f = C (l >>= f) (r >>= f)
It's not hard to see that (>>= B) performs the identity substitution, and that composition of substitutions is associative.
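As an aside (my addition, not part of the original answer): on GHC 7.10 and later, a Monad instance also needs Functor and Applicative instances; they can be filled in mechanically in terms of the instance above.

import Control.Monad (ap)

instance Functor A where
  fmap f x = x >>= (B . f)

instance Applicative A where
  pure  = B
  (<*>) = ap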
Another, more "imperative" way to see this monad is that it captures the idea of computations that can flip coins (or read a bitstream or otherwise have some access to a sequence of binary choices).
data Coin = Heads | Tails
Any computation which can flip coins must either stop flipping and be a value (with B), or flip a coin and carry on (with C) in one way if the coin comes up Heads and another if Tails. The monadic operation which flips a coin and tells you what came up is
coin :: A Coin
coin = C (B Heads) (B Tails)
The >>= of A can now be seen as sequencing coin-flipping computations, allowing the choice of a subsequent computation to depend on the value delivered by an earlier computation.
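For a tiny usage example (mine, assuming the Monad instance above): sequence two coin flips and return True exactly when both come up Heads.

bothHeads :: A Bool
bothHeads =
  coin >>= \c1 ->
  coin >>= \c2 ->
  return (case (c1, c2) of
            (Heads, Heads) -> True
            _              -> False)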
If you have an infinite stream of coins, then (apart from your extraordinary good fortune) you're also lucky enough to be able to run any A-computation to its value, as follows
data Stream x = x :> Stream x -- actually, I mean "codata"
flipping :: Stream Coin -> A v -> v
flipping _ (B v) = v
flipping (Heads :> cs) (C h t) = flipping cs h
flipping (Tails :> cs) (C h t) = flipping cs t
The general pattern in this sort of monad is to have one constructor for returning a value (B here) and a bunch of others which represent the choice of possible operations and the different ways computations can continue given the result of an operation. Here C has no non-recursive parameters and two subtrees, so I could tell that there must be just one operation and that it must have just two possible outcomes, hence flipping a coin.
So, it's substitution for a syntax with variables and one binary operator, or it's a way of sequencing computations that flip coins. Which view is better? Well... they're two sides of the same coin.
A good rule of thumb for return is to make it the simplest possible thing which could work (of course, any definition that satisfies the monad laws is fine, but usually you want something with minimal structure). In this case it's as simple as return = B (now write a (>>=) to match!).
By the way, this is an example of a free monad -- in fact, it's the example given in the documentation, so I'll let the documentation speak for itself.

Finite automaton in Haskell

What is a good way to represent finite automaton in Haskell? How would the data type of it look like?
In our college, automata were defined as a 5-tuple
(Q, X, delta, q_0, F)
where Q is the set of the automaton's states, X is the alphabet (is this part even necessary?), delta is the transition function taking a 2-tuple from (Q, X) and returning a state (or states, in the non-deterministic version), q_0 is the initial state, and F is the set of accepting/end states.
Most importantly, I'm not sure what type delta should have...
There are two basic options:
An explicit function delta :: Q -> X -> Q (or [Q] as appropriate) as Sven Hager suggests.
A map delta :: Map (Q, X) Q e.g. using Data.Map, or if your states/alphabet can be indexed by ascending numbers Data.Array or Data.Vector.
Note that these two approaches are essentially equivalent: one can convert from the map version to a function version relatively easily (the result is slightly different due to an extra Maybe from the lookup call):
delta_func q x = Data.Map.lookup (q,x) delta_map
(Or the appropriately curried version of the look-up function for whatever mapping type you are using.)
If you are constructing the automata at compile time (and so know the possible states and can have them encoded as a data type), then using the function version gives you better type safety, as the compiler can verify that you have covered all cases.
If you are constructing the automata at run time (e.g. from user input), then store delta as a map (possibly doing the function conversion as above) and add input validation that guarantees correctness, so that fromJust is safe (i.e. there is always an entry in the map for any possible (Q, X) tuple, and so the look-up never fails / never returns Nothing).
Non-deterministic automata work well with the map option, because a failed look-up is the same as having no state to go to, i.e. an empty [Q] list, and so there doesn't need to be any special handling of the Maybe beyond a call to join . maybeToList (join is from Control.Monad and maybeToList is from Data.Maybe).
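For instance, a minimal sketch of that non-deterministic look-up, assuming the transition map has type Map (Q, X) [Q] (delta_nondet is my own name):

import Control.Monad (join)
import Data.Maybe (maybeToList)
import qualified Data.Map as Map

-- A missing entry in the map simply means "no successor states".
delta_nondet :: (Ord q, Ord x) => Map.Map (q, x) [q] -> q -> x -> [q]
delta_nondet delta_map q x = join (maybeToList (Map.lookup (q, x) delta_map))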
On a different note, the alphabet is most definitely necessary: it is how the automaton receives input.
Check out the Control.Arrow.Transformer.Automaton module in the "arrows" package. The type looks like this
newtype Automaton a b c = Automaton (a b (c, Automaton a b c))
This is a bit confusing because it's an arrow transformer. In the simplest case you can write
type Auto = Automaton (->)
which uses functions as the underlying arrow. Substituting (->) for "a" in the Automaton definition and using infix notation, you can see this is roughly equivalent to:
newtype Auto b c = Automaton (b -> (c, Auto b c))
In other words an automaton is a function that takes an input and returns a result and a new automaton.
You can use this directly by writing a function for each state that takes an argument and returns the result and the next function. For instance, here is a state machine to recognise the regexp "a+b" (that is, a series of at least one 'a' followed by a 'b'). (Note: untested code)
state1, state2 :: Auto Char Bool
state1 = Automaton $ \c -> if c == 'a' then (False, state2) else (False, state1)
state2 = Automaton $ \c -> case c of
  'a' -> (False, state2)
  'b' -> (True, state1)
  _   -> (False, state1)
In terms of your original question, Q = {state1, state2}, X = Char, delta is function application, and F is the state transition returning True (rather than having an "accepting state" I've used an output transition with an accepting value).
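To see the machine in action, here is a small driver of my own (runAuto is not part of the arrows package) that feeds an automaton a list of inputs and collects the outputs; for example, runAuto state1 "aab" yields [False, False, True]:

runAuto :: Auto b c -> [b] -> [c]
runAuto _ [] = []
runAuto (Automaton f) (b:bs) = let (c, next) = f b in c : runAuto next bs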
Alternatively you can use Arrow notation. Automaton is an instance of all the interesting arrow classes, including Loop and Circuit, so you can get access to previous values by using delay. (Note: again, untested code)
recognise :: Auto Char Bool
recognise = proc c -> do
prev <- delay 'x' -< c -- Doesn't matter what 'x' is, as long as it's not 'a'.
returnA -< (prev == 'a' && c == 'b')
The "delay" arrow means that "prev" is equal to the previous value of "c" rather than the current value. You can also get access to your previous output by using "rec". For instance, here is an arrow that gives you a decaying total over time. (Actually tested in this case)
-- | Inputs are accumulated, but decay over time. Input is a (time, value) pair.
-- Output is a pair consisting
-- of the previous output decayed, and the current output.
decay :: (ArrowCircuit a) => NominalDiffTime -> a (UTCTime, Double) (Double, Double)
decay tau = proc (t2,v2) -> do
rec
(t1, v1) <- delay (t0, 0) -< (t2, v)
let
dt = fromRational $ toRational $ diffUTCTime t2 t1
v1a = v1 * exp (negate dt / tau1)
v = v1a + v2
returnA -< (v1a, v)
where
t0 = UTCTime (ModifiedJulianDay 0) (secondsToDiffTime 0)
tau1 = fromRational $ toRational tau
Note how the input to "delay" includes "v", a value derived from its output. The "rec" clause enables this, so we can build up a feedback loop.
