how to deal with this case in transition-based dependency parsing - nlp

I'm reading the "14.4 Transition-Based Dependency Parsing" part of this article, and have some questions about the execution process of the following case.
As described, the parser reads the input buffer from left to right and uses a stack, at each step choosing one operation from left-arc, right-arc, and shift. Each reduce operation (i.e., left-arc or right-arc) is applied to the top two elements of the stack, and the parser can also choose shift to push the next unprocessed word from the input buffer onto the stack.
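For concreteness, here is how I understand those three operations, written as a small Haskell sketch (the Token/Config types and the head-first arc representation are just my own illustration, not from the article; each operation assumes its precondition, i.e. a non-empty buffer for shift and at least two stack items for the arc operations):
type Token  = String
type Arc    = (Token, Token)             -- (head, dependent)
type Config = ([Token], [Token], [Arc])  -- (stack, top first; input buffer; arcs found so far)

shift, leftArc, rightArc :: Config -> Config
shift    (stack,          w:buffer, arcs) = (w:stack,    buffer, arcs)
leftArc  (top:below:rest, buffer,   arcs) = (top:rest,   buffer, (top, below) : arcs)  -- top is head of below
rightArc (top:below:rest, buffer,   arcs) = (below:rest, buffer, (below, top) : arcs)  -- below is head of top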
I'm wondering how the parser would process the following case.
Assume that the input sequence consists of four words: a b c d, and the golden dependency structure is:
ROOT
|
b
|
c
|
d
|
a
which means that b is the head of c (c is the dependent child of b), and so on.
I drew these dependencies on paper and confirmed that this dependency tree is projective, which satisfies the requirement of the parser model. However, I've failed to figure out the correct process for constructing its dependency tree. According to the transition-based parsing algorithm, it seems that I can only apply the shift operation and finally reach the state with the stack [ROOT, a, b, c, d]. What would be the next step of the parsing?

Related

How to generate stable id for AST nodes in functional programming?

I want to substitute a specific AST node with another, and the node to substitute is specified by interactive user input.
In non-functional programming, you can use mutable data structures, and each AST node has an object reference, so when I need to refer to a specific node, I can use that reference.
But in functional programming, using IORef is not recommended, so I need to generate an id for each AST node, and I want this id to be stable, which means:
when a node is not changed, its generated id will also not change;
when a child node is changed, its parent's id will not change.
And, to make it clear that this is an id rather than a hash value:
two different subnodes that compare equal but correspond to different parts of an expression should have different ids.
So, what should I do to approach this?
Perhaps you could use the path from the root to the node as the id of that node. For example, for the datatype
data AST = Lit Int
         | Add AST AST
         | Neg AST
You could have something like
data ASTPathPiece = AddGoLeft
                  | AddGoRight
                  | NegGoDown
type ASTPath = [ASTPathPiece]
This satisfies conditions 2 and 3 but, alas, it doesn't satisfy condition 1 in general: for a node type with list-valued children, the index into that list (and therefore the path) will change if you insert a sibling at an earlier position, for example.
If you are rendering the AST into another format, perhaps you could add hidden attributes to the result nodes that identify which ASTPathPiece led to them. Traversing the result nodes upwards to the root would then let you reconstruct the ASTPath.
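For illustration, here is a small sketch (my own addition, not part of the answer above) that uses the AST and ASTPath types above to pair every subnode with its path from the root:
annotate :: AST -> [(ASTPath, AST)]
annotate = go []
  where
    -- carry the reversed path from the root downwards, un-reversing it when emitting
    go path node = (reverse path, node) : rest
      where
        rest = case node of
          Lit _   -> []
          Add l r -> go (AddGoLeft : path) l ++ go (AddGoRight : path) r
          Neg x   -> go (NegGoDown : path) x
For example, annotate (Add (Lit 1) (Neg (Lit 2))) associates the inner Lit 2 with the path [AddGoRight, NegGoDown].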

Match symbol stream against dynamic patterns

How would you implement a solution to the following problem:
Given a stream of symbols and a set of patterns that can be added at any time, count how often each pattern occurs.
Example:
There is a continuous stream of incoming symbols, let's say A B A C D E ...
The user can at any time register a new pattern, for example (B A C). If the pattern was registered before the second time step, the pattern should be counted once.
In case of overlapping patterns, only the first pattern should be counted, e.g. (B A C) and (A C D) would result in only (B A C) being counted.
Solution approaches:
The trivial solution is to just keep one position per pattern, advance it when the next expected symbol arrives, and reset all positions once one pattern is matched. This would lead to a runtime of O(n * m), where n is the length of the stream and m is the length of the longest pattern (for example, when all patterns share the same prefix of length m - 1).
The alternative approach would be to construct a finite automaton and use the fact that patterns can share prefixes.
However, there are a few problems with that:
How to construct the edges between the patterns? (e.g. B D E from A B)
How to add patterns at runtime? E.g., let's say the stream is A B and at the moment only the pattern (A B C) is registered. Now the user registers (B A C). If the stream continues with A C D E, the pattern should not be matched, since its first symbol occurred before it was registered.
The idea could be linked to the Aho-Corasick algorithm. However, that algorithm matches all occurrences of the patterns, not only the first one, and it does not allow patterns to be added at runtime.
Maintain an initially-empty list of Aho-Corasick FSMs. Whenever a new pattern is registered, create a new FSM for just this pattern, append it to the list, and check whether there are now 2 single-string FSMs at the end of the list: if so, delete them, build a single new FSM for both strings, and put this FSM in place of the original 2. Now check whether there are 2 2-string FSMs, and combine them into a single 4-string FSM if so. Repeat this procedure of combining two k-string FSMs into a single (2k)-string FSM until all FSMs are for distinct numbers of strings. (Notice that any 2 FSMs for the same number of strings must be at adjacent positions in the list.)
Suppose n registrations occur in total. As a result of the above "compacting" procedure, the list will contain at most log2(n)+1 FSMs at all times, so the overall "cost factor" of using each of these FSMs to search the input stream (vs. a single FSM containing all strings) is O(log n). Also, the number of FSM-building processes that a particular string participates in is capped at log2(n)+1, since each new FSM that it participates in building is necessarily twice as large as the previous one that it participated in building. So the overall "cost factor" of building all the FSMs (vs. building a single FSM containing all strings) is also O(log n).
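To make the compaction step concrete, here is a minimal Haskell sketch (my own names; buildFSM is a stand-in for the real Aho-Corasick construction, and the newest FSM is kept at the front of the list instead of the end, which is equivalent):
type Pattern = String

-- Placeholder: an FSM here only remembers the patterns it was built from.
newtype FSM = FSM { fsmPatterns :: [Pattern] }

buildFSM :: [Pattern] -> FSM
buildFSM = FSM   -- a real implementation would build the Aho-Corasick automaton here

-- Register a new pattern: add a 1-pattern FSM, then repeatedly merge adjacent
-- FSMs that were built from the same number of patterns.
register :: Pattern -> [FSM] -> [FSM]
register p fsms = compact (buildFSM [p] : fsms)
  where
    compact (a : b : rest)
      | length (fsmPatterns a) == length (fsmPatterns b)
          = compact (buildFSM (fsmPatterns a ++ fsmPatterns b) : rest)
    compact xs = xs
Searching the stream then means feeding each incoming symbol to every FSM in the list, which the compaction keeps at most log2(n)+1 long.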

How can I find the closure of a DFA

I am trying to implement the closure of a DFA. I have successfully implemented Union, Complement, Intersection, Subtraction and Concatenation of DFAs without using NFAs. Our teacher did not tell us the algorithm to find the closure. I tried to do it by concatenating a DFA to itself, but quite obviously it did not work.
I just need the steps; by the way, I am representing the DFA using a matrix. Also, can you please elaborate on the Kleene closure, though I am sure I can do that once I know how to get the closure.
I am not sure what you mean by closure in general. But for the Kleene closure you can proceed like this: from every final state you add an epsilon-transition (not reading anything) to the start state. So after reading one word of the language, you can start with another one.
Of course, the resulting automaton is not deterministic any more. But there are standard procedures for determinizing it again.
For a direct construction: look at all the transitions that enter a final state, for example (q,a) -> f. From the originating state q, add another transition reading the same symbol and going to the start state: (q,a) -> s. So the automaton has the possibility of finishing reading a word and, just after that, starting again.
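A minimal sketch of that direct construction (the Automaton type and field names are illustrative assumptions, not the asker's matrix representation; the result is in general nondeterministic, as noted above, and the start state is also made final so that the empty word is accepted):
import qualified Data.Set as Set

data Automaton q a = Automaton
  { start  :: q
  , finals :: Set.Set q
  , trans  :: [(q, a, q)]   -- transition relation, possibly nondeterministic
  }

kleene :: Ord q => Automaton q a -> Automaton q a
kleene m = m { trans  = trans m ++ extra
             , finals = Set.insert (start m) (finals m) }  -- also accept the empty word
  where
    -- for every transition (q, a) -> f into a final state, also allow (q, a) -> start
    extra = [ (q, a, start m) | (q, a, f) <- trans m, f `Set.member` finals m ]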

How is insert O(log(n)) in Data.Set?

When looking through the docs of Data.Set, I saw that insertion of an element into the tree is mentioned to be O(log(n)). However, I would intuitively expect it to be O(n*log(n)) (or maybe O(n)?), as referential transparency requires creating a full copy of the previous tree in O(n).
I understand that for example (:) can be made O(1) instead of O(n), as here the full list doesn't have to be copied; the new list can be optimized by the compiler to be the first element plus a pointer to the old list (note that this is a compiler - not a language level - optimization). However, inserting a value into a Data.Set involves rebalancing that looks quite complex to me, to the point where I doubt that there's something similar to the list optimization. I tried reading the paper that is referenced by the Set docs, but couldn't answer my question with it.
So: how can inserting an element into a binary tree be O(log(n)) in a (purely) functional language?
There is no need to make a full copy of a Set in order to insert an element into it. Internally, elements are stored in a tree, which means that you only need to create new nodes along the path of the insertion. Untouched nodes can be shared between the pre-insertion and post-insertion versions of the Set. And as Deitrich Epp pointed out, in a balanced tree O(log(n)) is the length of the path of the insertion. (Sorry for omitting that important fact.)
Say your Tree type looks like this:
data Tree a = Node a (Tree a) (Tree a)
            | Leaf
... and say you have a Tree that looks like this
let t = Node 10 tl (Node 15 Leaf tr')
... where tl and tr' are some named subtrees. Now say you want to insert 12 into this tree. Well, that's going to look something like this:
let t' = Node 10 tl (Node 15 (Node 12 Leaf Leaf) tr')
The subtrees tl and tr' are shared between t and t', and you only had to construct 3 new Nodes to do it, even though the size of t could be much larger than 3.
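To spell that out as code, here is a sketch of plain (unbalanced) insertion for the Tree type above: only the nodes along the search path are rebuilt, and every untouched subtree is reused as-is.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node x Leaf Leaf
insert x t@(Node y l r)
  | x < y     = Node y (insert x l) r   -- one new node; r is shared unchanged
  | x > y     = Node y l (insert x r)   -- one new node; l is shared unchanged
  | otherwise = t                       -- already present; the whole tree is shared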
EDIT: Rebalancing
With respect to rebalancing, think about it like this, and note that I claim no rigor here. Say you have an empty tree. Already balanced! Now say you insert an element. Already balanced! Now say you insert another element. With only two elements, one of them simply hangs below the other, so there's not much you can do there.
Here's the tricky part. Say you insert another element. This could go two ways: left or right; balanced or unbalanced. In the case that it's unbalanced, you can clearly perform a rotation of the tree to balance it. In the case that it's balanced, already balanced!
What's important to note here is that you're constantly rebalancing. It's not like you have a mess of a tree, decide to insert an element, rebalance the whole thing first, and then leave a mess behind once the insertion is done.
Now say you keep inserting elements. The tree's going to get unbalanced, but not by much. And when that does happen, first off you're correcting it immediately, and secondly, the correction occurs along the path of the insertion, which is O(log(n)) in a balanced tree. The rotations in the paper you linked to touch at most three nodes in the tree to perform a rotation, so you're doing O(3 * log(n)) work when rebalancing. That's still O(log(n)).
To add extra emphasis to what dave4420 said in a comment, there are no compiler optimizations involved in making (:) run in constant time. You could implement your own list data type, and run it in a simple non-optimizing Haskell interpreter, and it would still be O(1).
A list is defined to be an initial element plus a list (or it's empty in the base case). Here's a definition that's equivalent to native lists:
data List a = Nil | Cons a (List a)
So if you've got an element and a list, and you want to build a new list out of them with Cons, that's just creating a new data structure directly from the arguments the constructor requires. There is no more need to even examine the tail list (let alone copy it), than there is to examine or copy the string when you do something like Person "Fred".
You are simply mistaken when you claim that this is a compiler optimization and not a language level one. This behaviour follows directly from the language level definition of the list data type.
Similarly, for a tree defined to be an item plus two trees (or an empty tree), when you insert an item into a non-empty tree it must either go in the left or right subtree. You'll need to construct a new version of that tree containing the element, which means you'll need to construct a new parent node containing the new subtree. But the other subtree doesn't need to be traversed at all; it can be put in the new parent tree as is. In a balanced tree, that's a full half of the tree that can be shared.
Applying this reasoning recursively should show you that there's actually no copying of data elements necessary at all; there's just the new parent nodes needed on the path down to the inserted element's final position. Each new node stores 3 things: an item (shared directly with the item reference in the original tree), an unchanged subtree (shared directly with the original tree), and a newly created subtree (which shares almost all of its structure with the original tree). There will be O(log(n)) of those in a balanced tree.

drawing minimal DFA for a given regular expression

What is a direct and easy approach to draw a minimal DFA that accepts the same language as a given regular expression (RE)?
I know it can be done by:
Regex ---to----► NFA ---to-----► DFA ---to-----► minimized DFA
But is there any shortcut way? For example, for (a+b)*ab.
Regular Expression to DFA
Although there is NO algorithmic shortcut to draw a DFA from a regular expression (RE), a shortcut technique is possible by analysis, not by derivation, and it can save you time when drawing a minimized DFA. But of course the technique can only be learned by practice. I take your example to show my approach:
(a + b)*ab
First, think about the language of the regular expression. If it's difficult to state the language description at the first attempt, then find the smallest possible string that can be generated in the language, then find the second smallest, and so on.
Keep memorized solutions for some basic regular expressions. For example, I have written here some basic ideas for writing left-linear and right-linear grammars directly from regular expressions. Similarly, you can do the same for constructing minimized DFAs.
In the RE (a + b)*ab, the smallest possible string is ab, because using (a + b)* one can generate the NULL (^) string. The second smallest string can be either aab or bab. One thing we can easily notice about the language is that any string in the language of this RE always ends with ab (suffix), whereas the prefix can be any possible string consisting of a and b, including ^.
Also, if the current symbol is a, then one possibility is that the next symbol is b and the string ends. Thus, in the DFA we require a transition such that whenever a b symbol comes after a symbol a, the automaton moves to some final state.
Next, if a new symbol comes on the final state, then we should move to some non-final state, because any symbol after b can occur only in the middle of some string in the language, as all strings in the language terminate with the suffix 'ab'.
So with this knowledge, at this stage we can draw an incomplete transition diagram like the one below:
--►(Q0)---a---►(Q1)---b----►((Qf))
Now at this point you need to understand that every state has some meaning, for example:
(Q0) means = Start state
(Q1) means = Last symbol was 'a', and with one more 'b' we can shift to a final state
(Qf) means = Last two symbols were 'ab'
Now think about what happens if a symbol a comes on the final state: just move to state Q1, because this state means the last symbol was a. (Updated transition diagram:)
--►(Q0)---a---►(Q1)---b----►((Qf))
                ▲-----a--------|
But suppose instead of symbol a, a symbol b comes at the final state. Then we should move from the final state to some non-final state. In the present transition graph, in this situation we should make a move from the final state Qf back to the initial state (as we again need ab in the string for acceptance).
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
This graph is still incomplete, because there is no outgoing edge for symbol a from Q1. For symbol a on state Q1, a self-loop is required, because Q1 means the last symbol was an a.
                a-
                ||
                ▼|
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
Now I believe all possible outgoing edges are present from Q1 and Qf in the above graph. The one missing edge is an outgoing edge from Q0 for symbol b, and it must be a self-loop at state Q0, because we again need a sequence ab so that the string can be accepted (from Q0, a shift to Qf is possible with ab).
    b-          a-
    ||          ||
    ▼|          ▼|
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
Now the DFA is complete!
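As a quick check (my own addition, not part of the original answer), the finished three-state DFA can be written directly as a transition function, for example in Haskell:
data State = Q0 | Q1 | Qf deriving (Eq, Show)

-- transition function of the DFA drawn above (alphabet {a, b})
step :: State -> Char -> State
step Q0 'a' = Q1
step Q0 _   = Q0   -- b: still waiting for an a
step Q1 'a' = Q1   -- a: the last symbol is still a
step Q1 _   = Qf   -- b: we have just read the suffix ab
step Qf 'a' = Q1
step Qf _   = Q0   -- b right after ab: we need a fresh ab

accepts :: String -> Bool
accepts = (== Qf) . foldl step Q0
Here accepts "ab" and accepts "bab" are True, while accepts "" and accepts "aba" are False, matching the language of (a+b)*ab.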
Of course, the method might look difficult on the first few tries. But if you learn to draw this way, you will observe an improvement in your analytical skills, and you will find that this method is a quick and objective way to draw a DFA.
* In the link I gave, I described some more regular expressions; I would highly encourage you to learn them and try to make DFAs for those regular expressions too.
