Why do pushdown automata need an initial stack symbol? - automata-theory

While defining the transitions of a PDA for a CFG (a type-2 grammar), we need an initial stack symbol, usually denoted Z0.
My doubt is: why do we need it, given that we are eventually going to empty the stack anyway?

Pushdown automata need the initial stack symbol because each move is determined by the current input symbol and the symbol at the top of the stack. It follows that no move is possible when the stack is empty.
And yes, the stack can be reduced to just the initial stack symbol. Consider...
L = { a^n b^n : n >= 0 }
I could push a 0 for each a I read - the first such move being (q0, a, Z) - and then, when I read my first b, start popping 0s without pushing anything back. I know that I'm done and the string is accepted when all input is consumed and the stack symbol Z is back atop the stack.
Notice that the very first move is determined by the first input symbol together with the stack symbol. Can you see how, without it, you'd never be able to start?
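To see the whole machine run, here is a quick executable Haskell sketch (my own rendering of the construction above, not the asker's notation): states are q0 and q1, 0 is the symbol pushed per a, and Z is the initial stack symbol. Acceptance means all input is consumed and Z is back on top.

accepts :: String -> Bool
accepts input = go "q0" input "Z"
  where
    -- go state remainingInput stack
    go "q0" ('a' : rest) stack         = go "q0" rest ('0' : stack)  -- push one 0 per a
    go "q0" ('b' : rest) ('0' : stack) = go "q1" rest stack          -- first b: start popping
    go "q1" ('b' : rest) ('0' : stack) = go "q1" rest stack          -- pop one 0 per b
    go _    []           "Z"           = True   -- input consumed, Z atop the stack: accept
    go _    _            _             = False  -- no move possible: reject

-- accepts "" == True, accepts "aabb" == True, accepts "aab" == False, accepts "ba" == False

The final acceptance test relies on Z: it is how the machine sees that the stack is "empty" without ever actually emptying it.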

Related

Creating a diagram for a PDA that recognizes each L

Trying to understand PDAs, but don't really grasp how to draw them.
First of all, I don't really understand how a PDA would look different if the restrictions say that the number of a's, b's, or c's must be "greater than zero" rather than "greater than or equal to zero." I drew the following PDA diagrams for each of the textbook questions; they seem right enough to me, but I'd like a second pair of eyes.
So the logic for my first PDA is that for every a read, four a's are pushed onto the stack. When a b is read, each a on the stack is replaced by the empty string, eventually clearing the stack.
For the second PDA I created a nondeterministic PDA, because there can be zero a's or zero b's. For every run of bb there is no change to the stack, since there can be any number of bb pairs, but the a's are preserved on the stack because they have to be matched again after the bb's, if there are any a's.
If I understand your diagrams correctly, they are not correct and seem to betray a basic misunderstanding of how PDAs work. Rather than try to work with you to understand your thought process in producing these, I will derive new PDAs and let you reconcile them to what you have attempted. I will not provide drawings but I will describe what the drawings would look like; I'll provide tables as these are easier to render.
1) L = {a^i b^j | i > 0, j = 4i + 2}
When you make any automaton, you are trying to find states and transitions. A good first place to start is to wonder about the initial state, since we know we'll need one. Is it the only one we need, or will we need more? Is it accepting? For this language, we can see that it is not accepting (if the initial state were accepting, the PDA would accept the empty string, which cannot be in our language since we require i > 0). Since our language does contain strings, we know we need an accepting state, so we need at least two states; call them q0, the initial state, and q1, the other state we know we'll need. Now that we have a couple of states, we can turn our attention for a moment to the first transitions.
Now, we know we need to allow for some a's at the beginning of our strings. Indeed, we must require at least one a. Let us think for a moment about what this means. Suppose we add a transition from q0 back to q0 accepting a. If we don't put anything on the stack, we lose the information about whether we have seen any a's or not. However, we do have a stack, and so we can remember what we need to know: that we have seen an a before. Therefore, we add a transition from q0 to q0 upon reading an a, and record that fact on our stack. We will later need the ability to check the stack to see whether we have seen some a's. Now we can ask what we shall put on the stack.
We have some options here, and there are no right or wrong answers. What you choose will depend on what automaton formulation you're relying on and what your design goals are for the automaton. Yes, an automaton is designed in the same way as a real machine or a computer program. We want to make a design choice that obeys the rules of our formalism and that is simple and easy to work with. One approach is to push what we expect to see later, corresponding to the symbol we have just consumed: for each a we see now, we need to see four b's later, so we can add bbbb to the stack now and later pop one b from the stack for each b read from the input. This is convenient. Notice that when we read the first a, we can take care of the "+2" requirement by adding bbbbbb instead of bbbb, so long as we know when we are reading the first a - which we know by examining the stack.
Based on all of these considerations, we can produce a partial table for the PDA we have been designing, and then assess our progress and see how much we have left to do:
q    e    S    q'   S'
---  ---  ---  ---  ---
q0   a    Z    q0   bbbbbbZ
q0   a    b    q0   bbbbb
Here q is the current state, e the input symbol consumed, S the symbol popped from the top of the stack, q' the next state, and S' the string pushed in its place. We use Z to represent the bottom of the stack. When we see the first a (we know it is the first because the stack contains nothing but Z, the symbol representing the bottom of the stack), we push bbbbbb. Each successive a adds four more b's to the top of the stack (by replacing the topmost b with bbbbb).
Now we must consider how b's are to be handled. Can we process a b by looping from q0 to q0? A moment's thought should convince you that this is not a good idea. If we loop from q0 to q0 upon seeing a b, then we have no easy way of preventing the PDA from accepting further a's afterwards. But that would allow strings that cannot possibly be in our desired language, since in our language no a may come after a b. It therefore seems necessary that whatever transition accepts b's not have q0 as its target. We choose q1 as the target of the transition under discussion. What shall be the source? So far, only q0 is reachable in our PDA, and we only have q0 and q1 to choose from. We have two choices: either we provide a mechanism for transitioning from q0 to q1 without seeing a b, or we provide a mechanism for transitioning on a b. The first of these approaches requires nondeterminism given our previous design choices, so we might prefer the latter as being more explicit and easier to reason about. This leads to the following transition (added to our previous table):
q    e    S    q'   S'
---  ---  ---  ---  ---
q0   a    Z    q0   bbbbbbZ
q0   a    b    q0   bbbbb
q0   b    b    q1   -
This transition says that when we see a b in q0, and the stack shows that we have seen a's before, transition to q1 and throw away the topmost stack symbol. Remember, we designed the stack in such a way that we should always throw away one b from the stack for each b we read from the input.
Now that we have arrived at state q1, a natural question to ask is whether this state should be accepting or not. The answer is that it must not be, otherwise we could accept a string without enough b's to clear its stack; that is, we would have j < 4i + 2. So we will need a new state, call it q2, and q1 will not be accepting. What shall we use q1 for? We can easily use it to read the remaining b's and pop from the stack by adding this transition:
q    e    S    q'   S'
---  ---  ---  ---  ---
q0   a    Z    q0   bbbbbbZ
q0   a    b    q0   bbbbb
q0   b    b    q1   -
q1   b    b    q1   -
This fourth transition can be taken as long as we have b's to read and b's on the stack. There are three things that can happen during this process:
1) We run out of b's on the stack but still have b's in the input.
2) We run out of b's on the stack at the same time as we finish reading all b's in the input.
3) We still have b's on the stack when the input is exhausted.
Only in the second case should we accept, and we reach that case by repeated applications of the fourth transition. However, q1 is not accepting. Therefore, we require a way to transition from q1 to an accepting state - we might as well call this state q2 - whenever we might be in case 2. Case 2 is signalled by Z being atop the stack, so we add an epsilon transition taken when we see Z atop the stack in q1:
q    e    S    q'   S'
---  ---  ---  ---  ---
q0   a    Z    q0   bbbbbbZ
q0   a    b    q0   bbbbb
q0   b    b    q1   -
q1   b    b    q1   -
q1   -    Z    q2   Z
Now, in q2 (our accepting state), we might be in either case 1 or case 2: the stack holds only Z, but do we have any more symbols to read? It turns out this is not problematic in the least, since any further input symbol will cause the PDA to crash immediately in state q2, which has no outgoing transitions.
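As a sanity check, the finished table transcribes directly into a few lines of Haskell (a sketch of my own, not part of the original answer); acceptance means ending in q2 with no input left:

data St = Q0 | Q1 | Q2

accepts :: String -> Bool
accepts w = go Q0 w "Z"
  where
    go Q0 ('a' : rest) ('Z' : stack)   = go Q0 rest ("bbbbbb" ++ 'Z' : stack)  -- first a: push six b's
    go Q0 ('a' : rest) ('b' : stack)   = go Q0 rest ("bbbbb" ++ stack)         -- later a's: four more b's net
    go Q0 ('b' : rest) ('b' : stack)   = go Q1 rest stack                      -- first b: pop one b
    go Q1 ('b' : rest) ('b' : stack)   = go Q1 rest stack                      -- pop one b per b
    go Q1 input        stack@('Z' : _) = go Q2 input stack                     -- epsilon move when only Z remains
    go Q2 []           _               = True                                  -- accept: q2 at end of input
    go _  _            _               = False                                 -- no move possible: crash/reject

-- accepts "abbbbbb" == True (i = 1, j = 6); accepts "aabbbbbbbbbb" == True (i = 2, j = 10)
-- accepts "abbbbb" == False (case 3); accepts "abbbbbbb" == False (case 1: crashes in q2)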
The second PDA is left as an exercise.

Understanding Haskell's `map` - Stack or Heap?

Given the following function:
f :: [String]
f = map integerToWord [1..999999999]
integerToWord :: Integer -> String
Let's ignore the implementation. Here's a sample output:
ghci> integerToWord 123999
"onehundredtwentythreethousandandninehundredninetynine"
When I execute f, do all the results, i.e. integerToWord 1 through integerToWord 999999999, get stored on the stack or the heap?
Note - I'm assuming that Haskell has a stack and heap.
After running this function for ~1 minute, I don't see the RAM increasing from its original usage.
To be precise: when you "just execute" f, it's not evaluated unless you use its result somehow. And when you do, it's stored according to what the caller requires.
As for this example, it's not stored anywhere: the function is applied to every number, each result is printed to your terminal and then discarded. So at any given moment you only allocate enough memory to hold the current value and its result (which is an approximation, but for this case it's precise enough).
References:
https://wiki.haskell.org/Non-strict_semantics
https://wiki.haskell.org/Lazy_vs._non-strict
First: To split hairs, the following answer applies to GHC. A different Haskell compiler could plausibly implement things differently.
There is indeed a heap and a stack. Almost everything goes on the heap, and hardly anything goes on the stack.
Consider, for example, the expression
let x = foo 17 in ...
Let's assume that the optimiser doesn't transform this into something completely different. The call to foo doesn't appear on the stack at all; instead, we create a note on the heap (a "thunk") saying that we need to do foo 17 at some point, and x becomes a pointer to this note.
So, to answer your question: when you call f, a note that says "we need to execute map integerToWord [1..999999999] someday" gets stored on the heap, and you get a pointer to that. What happens next depends on what you do with that result.
If, for example, you try to print the entire thing, then yes, the result of every call to integerToWord ends up on the heap. At any given moment, only the call currently being evaluated is on the stack.
Alternatively, if you just try to access the 8th element of the result, then a bunch of "compute integerToWord k someday" notes end up on the heap, plus the actual result for the 8th element, plus a note for the rest of the list.
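You can watch these notes with GHCi's :sprint command, which prints a value without forcing it; unevaluated thunks show up as _. With a small list standing in for your example (so the output is predictable), a session should look roughly like this:

ghci> let xs = map (*2) [1..10] :: [Int]
ghci> :sprint xs
xs = _
ghci> xs !! 3
8
ghci> :sprint xs
xs = _ : _ : _ : 8 : _

Indexing forced the spine up to element 3 and that one element; the skipped elements and the rest of the list are still unevaluated notes on the heap.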
Incidentally, there's a package out there ("vacuum"?) which lets you print out the actual object graphs for what you're executing. You might find it interesting.
GHC programs use a stack and a heap... but it doesn't work at all like the eager language stack machines you're familiar with. Somebody else is gonna have to explain this, because I can't.
The other challenge in answering your question is that GHC uses the following two techniques:
Lazy evaluation
List fusion
Lazy evaluation in Haskell means that (as the default rule) expressions are only evaluated when their value is demanded, and even then they may only be partially evaluated - only as far as needed to resolve a pattern match that requires the value. So we can't say what your map example does without knowing what is demanding its value.
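As a tiny illustration (using show in place of integerToWord, whose implementation we don't have): only the demanded prefix of the huge list is ever computed, so this returns immediately:

ghci> take 3 (map show [1 :: Integer .. 999999999])
["1","2","3"]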
List fusion is a set of rewrite rules built into GHC, that recognize a number of situations where the output of a "good" list producer is only ever consumed as the input of a "good" list consumer. In these cases, Haskell can fuse the producer and the consumer into an object-code loop without ever allocating list cells.
In your case:
[1..999999999] is a good producer
map is both a good consumer and a good producer
But you seem to be using ghci, which doesn't do fusion. You need to compile your program with -O for fusion to happen.
You haven't told us what would be consuming the output of the map. If it's a good consumer it will fuse with the map.
But there's a good chance that GHC would eliminate most or all of the list cell allocations if you compiled (with -O) a program that just prints the result of that code. In that case, the list would not exist as a data structure in memory at all—the compiler would generate object code that does something roughly equivalent to this:
for (int i = 1; i <= 999999999; i++) {
    print(integerToWord(i));
}

What type of languages are accepted by a PDA in which stack size is limited?

What type(s) of languages are accepted by a PDA in which stack size is limited to, say 20 items?
In my view it should still accept the CFLs, because there is still a temporary memory to store things in.
A PDA with a stack limited to containing 20 items is equivalent to a DFA. Here's the proof.
Take any PDA-20, and you can make it into an equivalent DFA. Say the stack alphabet is S, with |S| = N. You also have the bottom-of-stack symbol Z. Imagine an additional symbol, -, which can also appear on the stack and which stands for "unused". The stack is then always equivalent to a string of the form x = -* S* Z with |x| = 20. Pushing onto the stack consists of replacing occurrences of -, and popping consists of replacing other symbols with -, in LIFO order. There are then at most (N+2)^20 possible stack configurations for any PDA-20. To construct the DFA, simply replicate each state of the PDA-20 by this factor, and let the transitions between states of the DFA reflect the new configuration of the stack. This way, the information contained in the stack configuration of the PDA-20 is contained in the current state of the DFA.
Take any DFA, and you can make it into an equivalent PDA-20. Simply don't use the stack, and you have a PDA-20 which accepts the same language as the DFA.
Just to illustrate the first part of the proof, consider a PDA-5 with states A, B, C, ..., Z, and a lot of transitions. Let's say the input alphabet is {0, 1}. Then there are 2^5 = 32 different stack configurations, say. The DFA equivalent to this PDA-5 might have states A1, B1, ..., Z1, A2, B2, ..., Z2, ..., A32, B32, ..., Z32, though it will have the same number of transitions as the original. If a transition in the original PDA-5 would have taken the stack from configuration #2 in state R to configuration #17 and the machine to state F, the DFA will go from state R2 to state F17.
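To make the first direction concrete, here is a minimal Haskell sketch (the names and the inlined bound are my own illustration, not from the answer): a PDA control state paired with a stack of bounded depth is drawn from a finite set, so each such pair can serve as a single DFA state, and one PDA move becomes one DFA transition.

-- A composite DFA state: the PDA's control state plus its bounded stack.
type Composite q s = (q, [s])  -- invariant: length of the stack <= 20

-- delta gives the PDA's move: pop the top symbol, push a replacement string.
step :: ((q, Char, s) -> Maybe (q, [s]))
     -> Composite q s -> Char -> Maybe (Composite q s)
step delta (q, top : rest) c = do
  (q', pushed) <- delta (q, c, top)
  let stack' = pushed ++ rest
  if length stack' <= 20
    then Just (q', stack')  -- a legal PDA-20 move, now an ordinary DFA transition
    else Nothing            -- the push would overflow the bounded stack
step _ (_, []) _ = Nothing  -- empty stack: no move possible

Since there are only finitely many composite states (the number of control states times the number of stacks of length at most 20), step is literally a transition function over a finite state set.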

How does ArrowLoop work? Also, mfix?

I'm fairly comfortable now with the rest of the arrow machinery, but I don't get how loop works. It seems magical to me, and that's bad for my understanding. I also have trouble understanding mfix. When I look at a piece of code that uses rec in a proc or do block, I get confused. With regular monadic or arrow code, I can step through the computation and keep an operational picture of what's going on in my head. When I get to rec, I don't know what picture to keep! I get stuck, and I can't reason about such code.
The example I'm trying to grok is from Ross Paterson's paper on arrows, the one about circuits.
counter :: ArrowCircuit a => a Bool Int
counter = proc reset -> do
    rec output <- returnA -< if reset then 0 else next
        next   <- delay 0  -< output + 1
    returnA -< output
I assume that if I understand this example, I'll be able to understand loop in general, and it'll go a great way towards understanding mfix. They feel essentially the same to me, but perhaps there is a subtlety I'm missing? Anyway, what I would really prize is an operational picture of such pieces of code, so I can step through them in my head like 'regular' code.
Edit: Thanks to Pigworker's answer, I have started thinking about rec and such as demands being fulfilled. Taking the counter example, the first line of the rec block demands a value called output. I imagine this operationally as creating a box, labelling it output, and asking the rec block to fill that box. In order to fill that box, we feed in a value to returnA, but that value itself demands another value, called next. In order to use this value, it must be demanded of another line in the rec block but it doesn't matter where in the rec block it is demanded, for now.
So we go to the next line, and we find the box labelled next, and we demand that another computation fill it. Now, this computation demands our first box! So we give it the box, but it has no value inside it, so if this computation demands the contents of output, we hit an infinite loop. Fortunately, delay takes the box, but produces a value without looking inside the box. This fills next, which then allows us to fill output. Now that output is filled, when the next input of this circuit is processed, the previous output box will have its value, ready to be demanded in order to produce the next next, and thus the next output.
How does that sound?
In this code, the key piece is the delay 0 arrow in the rec block. To see how it works, it helps to think of values as varying over time, and of time as chopped into slices. I think of the slices as 'days'. The rec block explains how each day's computation works. It's organised by value, rather than by causal order, but we can still track causality if we're careful. Crucially, we must make sure (without any help from the types) that each day's work relies on the past but not the future. The one-day delay 0 buys us time in that respect: it shifts its input signal one day later, taking care of the first day by giving the value 0. The delay's input signal is 'tomorrow's next'.
rec output <- returnA -< if reset then 0 else next
    next   <- delay 0  -< output + 1
So, looking at the arrows and their outputs, we're delivering today's output but tomorrow's next. Looking at the inputs, we're relying on today's reset and next values. It's clear that we can deliver those outputs from those inputs without time travel. The output is today's next number unless we reset to 0; tomorrow, the next number is the successor of today's output. Today's next value thus comes from yesterday, unless there was no yesterday, in which case it's 0.
At a lower level, this whole setup works because of Haskell's laziness. Haskell computes by a demand-driven strategy, so if there is a sequential order of tasks which respects causality, Haskell will find it. Here, the delay establishes such an order.
Be aware, though, that Haskell's type system gives you very little help in ensuring that such an order exists. You're free to use loops for utter nonsense! So your question is far from trivial. Each time you read or write such a program, you do need to think 'how can this possibly work?'. You need to check that delay (or similar) is used appropriately to ensure that information is demanded only when it can be computed. Note that constructors, especially (:), can act like delays too: it's not unusual to compute the tail of a list apparently given the whole list (but being careful only to inspect the head). Unlike imperative programming, the lazy functional style allows you to organise your code around concepts other than the sequence of events, but it's a freedom that demands a more subtle awareness of time.
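To make that operational picture concrete, here is the same circuit written with plain lazy lists rather than arrows (my own rendering, not from the paper): each list is a signal whose i-th element is the value on day i, the feedback knot is ordinary value recursion, and delay 0 is just consing 0 onto the front of a stream.

counterL :: [Bool] -> [Int]
counterL resets = outputs
  where
    -- today's output: 0 on a reset, otherwise today's 'next'
    outputs = zipWith (\r n -> if r then 0 else n) resets nexts
    -- 'delay 0': day one's next is 0; after that, yesterday's output + 1
    nexts   = 0 : map (+ 1) outputs

-- counterL [False, False, False, True, False] == [0, 1, 2, 0, 1]

The (:) in nexts plays exactly the role described above: it delivers a value for day one without inspecting the feedback, which is what lets Haskell's demand-driven evaluation find a causal order.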

Why is this an invalid Turing machine? [closed]

Whilst doing exam revision I am having trouble answering the following question from Sipser's book, "Introduction to the Theory of Computation". Unfortunately the book gives no solution to this question.
Explain why the following is not a legitimate Turing machine.
M = {
The input is a polynomial p over variables x1, ..., xn
Try all possible settings of x1, ..., xn to integer values
Evaluate p on all of these settings
If any of these settings evaluates to 0, accept; otherwise reject.
}
This is driving me crazy! I suspect it is because the set of integers is infinite? Does this somehow exceed the alphabet's allowable size?
Although this is quite an informal way of describing a Turing machine, I'd say the problem is one of the following:
"Otherwise reject" - I agree with Welbog on that. Since you have a countably infinite set of possible settings, the machine can never know whether a setting on which p evaluates to 0 is still to come, and it will loop forever if it doesn't find one; only when such a setting is encountered may the machine stop. That last step is useless and will never be reached, unless of course you limit the machine to a finite set of integers.
The order of the steps: I would read this pseudocode as "first write down all possible settings, then evaluate p on each one", and there's your problem:
Again, because the set of possible settings is infinite, not even the first step will ever terminate: there is no last setting to write down before continuing to the next step. In this case the machine not only can never say "there is no 0 setting", it can never even start evaluating to find one. This, too, would be solved by limiting the integer set.
Anyway, I don't think the problem is the alphabet's size. You wouldn't need an infinite alphabet, since your integers can be written in decimal / binary / etc., and those use only a (very) finite alphabet.
I'm a bit rusty on Turing machines, but I believe your reasoning is correct, i.e. the set of integers is infinite, therefore you cannot try them all. I am not sure how to prove this formally, though.
However, the easiest way to get your head around Turing machines is to remember: "Anything a real computer can compute, a Turing machine can also compute." So, if you could write a program that, given a polynomial, carries out the steps above, you would be able to find a Turing machine that does it too.
I think the problem is with the very last part: otherwise reject.
According to basic facts about countable sets, a finite Cartesian product of countable sets is itself countable. In your case, the settings are vectors of n integers, i.e. elements of Z^n, which is countable. So the set of settings is countable, and it is therefore possible to try every combination of them. (That is to say, without missing any combination.)
Computing the result of p on a given setting of the inputs is also possible.
And entering an accepting state when p evaluates to 0 is also possible.
However, since there is an infinite number of input vectors, you can never reject the input. Therefore no Turing machine can follow all of the rules defined in the question. Without that last rule, it is possible.
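In other words, M as written describes a recognizer, not a decider. A small Haskell sketch of that recognizer (my own illustration, with hypothetical names allTuples and semiDecide) makes the asymmetry visible: by enumerating settings in rounds of growing bound, every setting is reached eventually, so the search can accept, but it can never return False.

-- Enumerate every length-n tuple of integers eventually, in rounds of
-- increasing bound k (repetitions across rounds are harmless here).
allTuples :: Int -> [[Integer]]
allTuples n = concat [ tuples n k | k <- [0 ..] ]
  where
    tuples 0 _ = [[]]
    tuples m k = [ x : rest | x <- [-k .. k], rest <- tuples (m - 1) k ]

-- True iff some setting is a root of p; diverges (never False) otherwise,
-- which is exactly why the "otherwise reject" step cannot be carried out.
semiDecide :: ([Integer] -> Integer) -> Int -> Bool
semiDecide p n = any (\xs -> p xs == 0) (allTuples n)

-- semiDecide (\[x, y] -> x*x + y*y - 25) 2  returns True (e.g. x = 3, y = 4)
-- semiDecide (\[x] -> x*x + 1) 1            never returns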

Resources