Natural language processing (NLP) - automata-theory

With the advancement in technology, industry has been moving towards automation and intelligence. In this regard, artificial intelligence and machine learning have played a vital role. Natural language processing (NLP) is a field of computer science and linguistics that focuses on methods for processing natural languages. So which is more reliable and efficient for natural language processing: a finite state machine (FSM) or a pushdown automaton?

Even though there are many techniques for NLP, the state-of-the-art approach is to use deep learning. Deep learning techniques have produced many significant improvements in NLP, largely because enormous amounts of processing power are now available at low cost. If you want to read about cutting-edge techniques used in NLP or any other research domain, go to Google Scholar (https://scholar.google.com/).

It seems like the real question you want to be asking is: "What are some efficient techniques in natural language processing?" But I will address your question first.
First of all, neither FSAs (finite state automata) nor PDAs (pushdown automata) are sufficient to model language. FSAs can handle regular languages. They cannot, however, even answer the question of whether a word is a palindrome. PDAs are a little more powerful and can answer such questions. Turing machines give universal computation and are useful for writing programs of arbitrary complexity.
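To make the palindrome point concrete, here is a minimal sketch in Python of how a pushdown automaton can recognise palindromes using nothing but a stack. A real PDA for this language is nondeterministic (it has to guess where the middle of the word is); the sketch simulates that by simply trying every possible guess. The function name pda_accepts_palindrome is made up for illustration.

```python
def pda_accepts_palindrome(word: str) -> bool:
    """Simulate a nondeterministic PDA that accepts palindromes.

    The automaton pushes symbols onto its stack, guesses the midpoint
    (optionally skipping one middle symbol for odd-length words), and
    then pops one symbol per remaining input symbol, requiring a match.
    Nondeterminism is simulated by trying every possible guess.
    """
    n = len(word)
    for mid in range(n + 1):
        for skip in (0, 1):  # skip=1 ignores a single middle symbol
            if mid + skip > n:
                continue
            stack = list(word[:mid])   # push phase
            rest = word[mid + skip:]   # input read during the pop phase
            matched = True
            for symbol in rest:
                if not stack or stack.pop() != symbol:
                    matched = False
                    break
            # Accept only if all input matched and the stack is empty.
            if matched and not stack:
                return True
    return False


if __name__ == "__main__":
    for w in ["abba", "abcba", "ab"]:
        print(w, pda_accepts_palindrome(w))
```

The only memory used is the stack, which is exactly what an FSA lacks: with a fixed number of states it cannot remember an arbitrarily long first half of the word.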
Now to start bridging this gap. Natural languages are not regular languages, so they cannot be handled by FSAs. Context-free languages are handled by PDAs (deterministic PDAs cover the LR(k) subset); however, natural human language is not context-free. As an example, take the following three statements: "Jill drove to the grocery store to meet her friend Sally before she picked up her kids. Sally bought three boxes of cereal. Then she drove to the school." While this is poor grammar, it is "natural" in that these are utterances that people make and that other people can generally parse. The pronoun "she" in the third sentence most plausibly refers to Jill, as she is the one with children. However, it is ambiguous, and we have to infer that association.
The amount of ambiguity in context in natural human language makes it impossible to parse deterministically. Instead, we turn towards the fields of statistics and decision theory to make our inferences about the maximally likely model for the communication.
The locality but non-determinism of speech and writing is one of the things that makes machine learning techniques, such as deep recurrent neural networks, so immensely effective compared to their classical rule-based counterparts.
While the term Neural Network is a bit of a misnomer as ultimately the human brain is far, far more complex than these rudimentary models from a neurological perspective, the general learning through approximate inference is ostensibly close to reality. We might better call these methods "Differentiable Computing" but that is a digression for another time.
In summary, the answer to the question you actually asked is that PDAs are going to produce better models than FSAs, but both are going to be absolutely worthless by comparison to even rudimentary statistical methods.
If you are curious about NLP, I would actually recommend a course in machine learning and a follow up in deep learning.
Andrew Ng has a good series of courses that are targeted toward beginners. After that, I would follow up with Siraj's course on deep learning in TensorFlow.

Related

Manning & Jordan's opinions on the approach to better NLP, based on the article

In this article, Manning discussed deep learning's impact on and contribution to NLP, as well as NLP's future. He quoted Michael Jordan's AMA reply on Reddit,
Although current deep learning research tends to claim to encompass NLP, I'm (1) much less convinced about the strength of the results, compared to the results in, say, vision; (2) much less convinced in the case of NLP than, say, vision, the way to go is to couple huge amounts of data with black-box learning architectures.
and said he agrees with point (1) but feels reserved towards point (2).
The problem is, I am not sure about point (2)'s position.
Later in the article, Manning pointed out that domain knowledge, e.g. linguistics & cognitive psychology, etc., is the center of NLP's recent advancement, and thus, in some sense, downplayed the role of deep learning in advancing NLP.
Therefore, was Manning arguing against mono-focusing on "huge amounts of data" and "black-box learning architectures" since they are often attributed to deep learning? Was Jordan arguing for them as the "way to go" for future NLP?

Anyone know of any real systems using Computational Semantics with Lambda Calculus?

I was wondering if Computational Semantics is actually used in any real-world system? (Simple examples here and here). I would like to see how an actual system works.
It seems like there are a bunch of issues with actually using Computational Semantics in any real world system:
It seems just labeling sentences with part-of-speech tags is error prone.
But you also need a reliable parse tree which is error prone and there can be many valid trees for one sentence.
Finding what pronouns are referring to what entities is error prone.
Word disambiguation is also another source of errors and multiple meanings could be valid in the same context.
Any context-free-grammar of English I can find seems to be incomplete.
Finally, after all these sources of error are dodged, we can finally convert the sentence to FOL with Computational Semantics!
Also, I can't seem to figure out how to deal with prepositions in Computational Semantics.
Is this really just an academic exercise or is Computational Semantics actually useful?
There are several better approaches to natural language than simple lambda calculus and context-free grammars, e.g. HPSG, Montague Grammar, TAG, ...
Word disambiguation can be handled by Markov chains, for example.
Siri, Google Now, Cortana and IBM Watson are some examples of real-world systems.
Google Translate is another application that uses Computational Semantics.
I believe (but don't quote me on this) that the technology spun out of the now-defunct Natural Language Theory and Technology group at the Palo Alto Research Center (PARC, formerly Xerox PARC) uses the lambda calculus to provide inferences about textual entailments. I only worked there for a summer (as a freshman, so I was wonderfully ignorant of most of the goings-on there).
Anyway, that 'technology' was developed over 30 years, and then Powerset bought the rights to all of it for $15 million, attempting to disrupt smart search in general. Then Bing came along, gobbled it up, and went on to absorb the entire research group as a whole. The principal core investigators now work solely as adjunct professors at Stanford. Sad.

How to implement a "Generalisation" in SCL

Is it possible for a generalisation in UML to be implemented in Simatic SCL code (or Structured text code)?
The definition of a Generalisation in UML:
A generalisation is a relationship between a more general classifier and a
more specific classifier. Each instance of the specific classifier is also an
indirect instance of the general classifier. Thus, the specific classifier
inherits the features of the more general classifier.
Features specified for instances of the general classifier are implicitly
specified for instances of the specific classifier. Any constraint applying
to instances of the general classifier also applies to instances of the
specific classifier.
In general the answer to this is no, not really. All means of programming PLCs (ladder, ST, FBD, etc.) are generally only very lightly abstracted from the actual machine code. They are closer to assembly wrappers than to anything we would think of as a modern development language. Structured Text is closer to very primitive Pascal - it lacks almost any object-oriented features.
The notion is that PLCs and PLC programmers have long since been used to an approach of extreme micromanagement when it comes to developing programs for them. The reasons for this are many - some more valid than others. Scott Whitlock wrote a good bit here outlining some of those reasons. A big one is that maintenance guys on the factory floor are often the ones trying to troubleshoot the machines and having clear, non-abstract, state-machine information available to them is much more valuable than the need for an elegant, minimal formulation to stroke the ego of the system developer.
PLC programming is a ruthlessly practical industry. If you have the choice between something 10% more practical and something 90% more elegant, the practical solution will always win.
With that said - there are some who are playing in this area. I suggest a quick read of this article for some examples of trying to make ST work a bit like you are suggesting. Still, I would be cautious before putting anything like this to work in a real factory with real machines that need to be both safe and reliably making money.

machine representation of natural text

I'm currently working on high-level machine representation of natural text.
For example,
"I had one dog but I gave it to Danny who didn't have any"
would be
I.have.dog =1
I.have.dog -=1
Danny.have.dog = 0
Danny.have.dog +=1
something like this....
I'm trying to find resources, but can't really find matching topics..
Is there a valid subject name for this type of research? Any library of resources?
Natural logic sounds like something related but it's not really the same thing I'm working on. Please help me out!
Representing natural language's meaning is the domain of computational semantics. Within that area, lots of frameworks have been developed, though the basic one is still first-order logic.
Specifically, your problem seems to be that of recognizing discourse semantics, which deals with information change brought about by language use. This is pretty much an open area of research, so expect to find a lot of research papers and PhD positions, but little readily-usable software.
As larsmans already said, this is pretty much a really open field of research, called computational semantics (a subfield of computational linguistics.)
There's one important thing that you'll need to understand before starting off in the comp-sem world: most people there use fancy high-level languages. By high-level I don't mean C, but more something like LISP, Prolog, or, as of late, Haskell. Computational semantics is very close to logic, which is why people researching the topic are more comfortable with functional and logical languages — they're closer to what they actually use all day long.
It will also be very useful for you to first look at some foundational course in predicate logic, since that's what the underlying literature usually takes for granted.
A good introduction to the connection between logic and language is L.T.F. Gamut — Logic, Language, and Meaning, volume I. This deals with the linguistic side of semantics, which won't help you implement anything, but it will help you understand the following literature. That said, there are at least some books that will explain predicate logic as they go, but if you ask me, any person really interested in the representation of language as a formal system should take a course in predicate and possibly intuitionist and intensional logic.
To give you a bit of a peek, your example is rather difficult to treat for
current comp-sem approaches. Not impossible, but already pretty high up the
scale of difficulty. What makes it difficult is the tense for one part (dealing
with tense and aspect will typically bring you into event semantics,) but also
that you'd have to define the give and have relations in a way that
works for this example. (An easier example to work with would be, say "I had
a dog, but I gave it to Danny who didn't have any." Can you see why?)
Let's translate "I have a dog."
∃x[dog(x) ∧ have(I,x)]
(There is an object x, such that x is a dog and the have-relation holds between
"I" and x.)
These sentences would then be evaluated against a model, where the "I"
constant might already be defined. By evaluating multiple sentences in sequence,
you could then alter that model so that it keeps track of a conversation.
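As a rough illustration of what "evaluating against a model" and "altering the model" can look like, here is a small Python sketch. Everything in it is an assumption made for the example: the model is just a dictionary of sets, the dog is given the made-up name "rex", and the give update is one simple way to define the relation, not the way any particular comp-sem system does it.

```python
# Toy model: a domain of individuals plus relations stored as sets of tuples.
# The individual "rex" is hypothetical; the sentence only says "a dog".
model = {
    "domain": {"I", "Danny", "rex"},
    "dog": {("rex",)},
    "have": {("I", "rex")},
}


def holds(model, relation, *args):
    """True if the relation holds between the given individuals."""
    return tuple(args) in model[relation]


def exists(model, predicate):
    """Existential quantifier: some individual in the domain satisfies it."""
    return any(predicate(x) for x in model["domain"])


def has_a_dog(model, owner):
    """Evaluate ∃x[dog(x) ∧ have(owner,x)] against the model."""
    return exists(
        model,
        lambda x: holds(model, "dog", x) and holds(model, "have", owner, x),
    )


def give(model, giver, receiver, thing):
    """Return a new model in which `thing` has moved from giver to receiver."""
    have = set(model["have"])
    have.discard((giver, thing))
    have.add((receiver, thing))
    return {**model, "have": have}


if __name__ == "__main__":
    print(has_a_dog(model, "I"), has_a_dog(model, "Danny"))
    after = give(model, "I", "Danny", "rex")
    print(has_a_dog(after, "I"), has_a_dog(after, "Danny"))
```

Note that this sidesteps all the hard parts mentioned above (tense, anaphora resolution for "it" and "any"); it only shows the evaluate-then-update loop.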
Let's give you some suggestions to start you off.
The classic comp-sem system is SHRDLU, which places geometric figures of certain color in a virtual environment. You can play around with it, since there's a Windows-compatible demo online at that page I linked you to.
The best modern book on the topic is probably Blackburn and Bos (2005). It's written in Prolog, but there are sources linked on the page to learn Prolog (now!)
Van Eijck and Unger give a good course on computational semantics in Haskell, which is a bit more recent, but in my eyes not quite as educational in terms of raw computational semantics as Blackburn and Bos.

What are good starting points for someone interested in natural language processing? [closed]

Question
So I've recently come up with some new possible projects that would have to deal with deriving 'meaning' from text submitted and generated by users.
Natural language processing is the field that deals with these kinds of issues, and after some initial research I found the OpenNLP Hub and university collaborations like the attempto project. And stackoverflow has this.
If anyone could link me to some good resources, from research papers and introductory texts to APIs, I'd be happier than a 6-year-old kid opening his Christmas presents!
Update
Through one of your recommendations I've found opencyc ('the world's largest and most complete general knowledge base and commonsense reasoning engine'). Even more amazing still, there's a project that is a distilled version of opencyc called UMBEL. It features semantic data in rdf/owl/skos n3 syntax.
I've also stumbled upon antlr, a parser generator for 'constructing recognizers, interpreters, compilers, and translators from grammatical descriptions'.
And there's a question on here by me, that lists tons of free and open data.
Thanks stackoverflow community!
Tough call, NLP is a much wider field than most people think it is. Basically, language can be split up into several categories, which will require you to learn totally different things.
Before I start, let me tell you that I doubt you'll have any notable success (as a professional, at least) without having a degree in some (closely related) field. There is a lot of theory involved, most of it is dry stuff and hard to learn. You'll need a lot of endurance and most of all: time.
If you're interested in the meaning of text, well, that's the Next Big Thing. Semantic search engines are predicted as initiating Web 3.0, but we're far from 'there' yet. Extracting logic from a text is dependent on several steps:
Tokenization, Chunking
Disambiguation on a lexical level (Time flies like an arrow, but fruit flies like a banana.)
Syntactic Parsing
Morphological analysis (tense, aspect, case, number, whatnot)
A small list, off the top of my head. There's more :-), and many more details to each point. For example, when I say "parsing", what is this? There are many different parsing algorithms, and there are just as many parsing formalisms. Among the most powerful are Tree-adjoining grammar and Head-driven phrase structure grammar. But both of them are hardly used in the field (for now). Usually, you'll be dealing with some half-baked generative approach, and will have to conduct morphological analysis yourself.
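To see why disambiguation and parsing are entangled, here is a small Python sketch that counts the parses of "fruit flies like a banana" with a CYK-style chart. The grammar and lexicon are a toy invented for this example (in Chomsky normal form, with bare "fruit" listed directly as an NP); they are not taken from any real treebank or library.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "N", "N"),     # compound noun: "fruit flies"
    ("NP", "Det", "N"),   # "a banana"
    ("VP", "V", "NP"),    # "like a banana"
    ("VP", "V", "PP"),    # "flies like a banana"
    ("PP", "P", "NP"),
]

# Toy lexicon: each word may carry several categories (lexical ambiguity).
LEXICON = {
    "fruit": {"N", "NP"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "a": {"Det"},
    "banana": {"N"},
}


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` rooted in `start` (CYK chart)."""
    n = len(words)
    # chart[(i, j)][symbol] = number of trees deriving words[i:j] from symbol
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, ()):
            chart[(i, i + 1)][tag] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


if __name__ == "__main__":
    print(count_parses("fruit flies like a banana".split()))
```

Even this five-word sentence gets two trees under the toy grammar: one where "fruit flies" is the subject and "like" is the verb, and one where "fruit" is the subject and "flies" is the verb. Syntax alone cannot choose between them.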
Going from there to semantics is a big step. A syntax/semantics interface depends on both the syntactic and the semantic framework employed, and there is no single working solution yet. On the semantic side, there's classic generative semantics, then there is Discourse Representation Theory, dynamic semantics, and many more. Even the logical formalism everything is based on is still not well-defined. Some say one should use first-order logic, but that hardly seems sufficient; then there is intensional logic, as used by Montague, but that seems overly complex and computationally infeasible. There also is dynamic logic (Groenendijk and Stokhof have pioneered this stuff. Great stuff!) and very recently, this summer actually, Jeroen Groenendijk presented a new formalism, Inquisitive Semantics, also very interesting.
If you want to get started on a very simple level, read Blackburn and Bos (2005); it's great stuff and the de facto introduction to Computational Semantics! I recently extended their system to cover the partition-theory of questions (question answering is a beast!), as proposed by Groenendijk and Stokhof (1982), but unfortunately, the theory has a complexity of O(n²) over the domain of individuals. While doing so, I found B&B's implementation to be a bit, erhm… hackish in places. Still, it is going to really, really help you dive into computational semantics, and it is still a very impressive showcase of what can be done. Also, they deserve extra cool-points for implementing a grammar that is set in Pulp Fiction (the movie).
And while I'm at it, pick up Prolog. A lot of research in computational semantics is based on Prolog. Learn Prolog Now! is a good intro. I can also recommend "The Art of Prolog" and Covington's "Prolog Programming in Depth" and "Natural Language Processing for Prolog Programmers", the former of which is available for free online.
Chomsky is totally the wrong source to look to for NLP (and he'd say as much himself, emphatically)--see: "Statistical Methods and Linguistics" by Abney.
Jurafsky and Martin, mentioned above, is a standard reference, but I myself prefer Manning and Schütze. If you're serious about NLP you'll probably want to read both. There are videos of one of Manning's courses available online.
If you get through Prolog until the DCG chapter in Learn Prolog Now! mentioned by Mr. Dimitrov above, you'll have a good beginning at getting some semantics into your system, since Prolog gives you a very simple way of maintaining a database of knowledge and belief, which can be updated through question-answering.
As regards the literature, I have one major recommendation for you: run out and buy Speech and Language Processing by Jurafsky & Martin. It is pretty much the book on NLP (the first chapter is available online); used in a frillion university courses but also very readable for the non-linguist and practically oriented, while at the same time going fairly deep into the linguistics problems. I really cannot recommend it enough. Chapters 17, 18 and 21 seem to be what you're looking for (14, 15 and 18 in the first edition); they show you simple lambda notation which translates pretty well to Prolog DCG's with features.
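For a taste of what that lambda notation does, here is a minimal Python sketch (not Prolog, and not the code from any of the books mentioned) in which word meanings are functions and the meaning of a sentence is built by function application. The helper names proper_noun, indefinite, transitive and sentence are invented for this illustration, and the output is just a formula string.

```python
# Word meanings as Python lambdas that build first-order-logic strings.

def proper_noun(name):
    """λP.P(name): a proper noun applies a property to its constant."""
    return lambda prop: prop(name)


def indefinite(noun):
    """'a <noun>' = λP.∃x[noun(x) ∧ P(x)].

    The variable name is fixed to 'x', so this sketch would produce
    clashing variables if two indefinites were nested.
    """
    return lambda prop: f"∃x[{noun}(x) ∧ {prop('x')}]"


def transitive(pred):
    """λObj.λsubj.Obj(λobj.pred(subj,obj)): a verb taking an object NP."""
    return lambda obj_np: lambda subj: obj_np(
        lambda obj: f"{pred}({subj},{obj})"
    )


def sentence(subject_np, verb_phrase):
    """S -> NP VP: apply the subject's meaning to the verb phrase's meaning."""
    return subject_np(verb_phrase)


if __name__ == "__main__":
    have_a_dog = transitive("have")(indefinite("dog"))
    print(sentence(proper_noun("I"), have_a_dog))
```

Each grammar rule pairs a syntactic combination with one function application, which is why this style maps so naturally onto DCG rules with an extra semantics argument.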
Oh, btw, on getting the masters in linguistics; if NL semantics is what you're into, I'd rather recommend taking all the AI-related courses you can find (although any courses on "plain" linguistic semantics, logic, logical semantics, DRT, LFG/HPSG/CCG, NL parsing, formal linguistic theory, etc. wouldn't hurt...)
Reading Chomsky's original literature is not really useful; as far as I know there are no current implementations that directly correspond to his theories, all the useful stuff of his is pretty much subsumed by other theories (and anyone who stays near linguists for any amount of time will absorb knowledge of Chomsky by osmosis).
I'd highly recommend playing around with the NLTK and reading the NLTK Book. The NLTK is very powerful and easy to get into.
You could try reading up a bit on phrase structure grammars, which are basically the mathematics behind much language processing. It's actually not that heavy, being largely based on set and graph theory. I studied it many moons ago as part of a discrete math course, and I guess there are many good references available at this stage.
Edit: Not as much as I expected on Google, although this one looks like a good learning source.
One of the early explorers into NLP is Noam Chomsky; he wrote small books on the subject in the 50s through the 70s. You may find that engaging reading.
Cycorp have a short description of how their Cyc knowledge base derives meaning from sentences.
By utilising a massive knowledge base of common facts, the system can determine the most logical parse of a sentence.
A simpler place to begin with the building blocks is to look at the documentation for a package that attempts to do it. I'd recommend the Python Natural Language Toolkit (NLTK), particularly because of their well-written, free book, which is filled with examples. It won't get you all the way to what you want (which is an AI-hard problem), but it will give you a good footing. NLTK has parsers, chunkers, context-free grammars, and more.
This is really hard stuff. I'd start off by getting at least a Masters in Linguistics, and then work towards my PhD in computer science, concentrating on NLP.
The problem is that most of us don't have the understanding of what language is. And without that understanding, it's bloody tough to implement a solution.
Other comments give some readings, which are probably fine if you want to get started playing around with a small subset of the problem, but in order to come up with a really robust solution, then there are no shortcuts. You need the academic background in both disciplines.
A very enjoyable readable introduction is The Language Instinct by Steven Pinker. It goes into the Chomsky stuff and also tells interesting stories from the evolutionary biology angle. Might be worth starting with something like that before diving into Chomsky's papers and related work, if you're new to the subject.

Resources