Automated planning: what makes "blocksworld" a non trivial problem? - planning

Blocksworld is apparently a benchmark domain in automated planning.
This domain consists of a set of blocks, a table and a robot hand.
The blocks can be on top of other blocks or on the table;
a block that has nothing on it is clear;
and the robot hand can hold one block or be empty.
The goal is to find a plan to move from one configuration of blocks to another.
Can someone explain what makes this a non-trivial problem? I can't think of a problem instance where the solution is not trivial (e.g. build the desired towers bottom-up, one block at a time).

There are one historical and two practical reasons for Blocks World being a benchmark of interest.
The historical one is that Blocks World was used to illustrate the so-called Sussman Anomaly. It no longer has any scientific relevance, but it was used to illustrate the limitations and challenges of planning algorithms that approach planning as a search through the space of plans directly. The link points to a chapter of the following book, which is a good introduction to Automated Planning:
Automated Planning and Acting
Malik Ghallab, Dana Nau, Paolo Traverso
Cambridge University Press
Especially in the mid-1990s, when SAT solving really took off, Blocks World served as an example of how limited the state of the art in Automated Planning was at the time.
As you write in your question, solving Blocks World is easy: the algorithm you sketch is well known and clearly runs in polynomial time. Finding an optimal plan, though, is not easy. I refer you to this excellent book
Understanding Planning Tasks: Domain Complexity and Heuristic Decomposition
Malte Helmert
Springer, 2006
or his shorter, classic paper
Complexity results for standard benchmark domains in planning
Malte Helmert
Artificial Intelligence, 2003
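To make the "easy" part concrete, here is a minimal sketch of the polynomial-time strategy from the question: put every block on the table, then build the goal towers bottom-up. The representation (towers as bottom-to-top lists, moves as `(block, destination)` pairs) and the names `naive_plan` and `simulate` are my own assumptions for illustration, not taken from any of the cited works. The plan it returns is valid but generally not optimal, which is exactly where the hardness lies.

```python
def naive_plan(initial, goal):
    """Return a valid (but generally non-optimal) plan.

    `initial` and `goal` are lists of towers; each tower lists its
    blocks from bottom to top. A move is a (block, destination) pair,
    where destination is another block or the string "table".
    """
    plan = []
    # Phase 1: unstack every tower top-down, leaving all blocks on the table.
    for tower in initial:
        for block in reversed(tower[1:]):
            plan.append((block, "table"))
    # Phase 2: build each goal tower bottom-up.
    for tower in goal:
        for below, block in zip(tower, tower[1:]):
            plan.append((block, below))
    return plan


def simulate(initial, plan):
    """Execute a plan, checking that every move is legal; return the final 'on' map."""
    on = {}
    for tower in initial:
        below = "table"
        for block in tower:
            on[block] = below
            below = block
    for block, dest in plan:
        covered = set(on.values())  # everything that has something on top of it
        assert block not in covered, f"{block} is not clear"
        assert dest == "table" or dest not in covered, f"{dest} is not clear"
        on[block] = dest
    return on


# Sussman-style instance: C sits on A, B is alone; the goal is A on B on C.
initial = [["A", "C"], ["B"]]
goal = [["C", "B", "A"]]
plan = naive_plan(initial, goal)
print(plan)
print(simulate(initial, plan))
```

The plan has at most two moves per block, so its length is linear in the number of blocks. It can be up to about twice as long as an optimal plan, because it may move blocks to the table that were already in their final position.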
The second "practical" reason for the relevance of Blocks World is that, even being a "simple" problem, it can defeat planning heuristics and elaborate algorithms or compilations to other computational frameworks such as SAT or SMT.
For instance, it wasn't until relatively recently (2012) that Jussi Rintanen showed good performance on that "simple" benchmark after heavily modifying standard SAT solvers
Planning as satisfiability: Heuristics
Jussi Rintanen
Artificial Intelligence, 2012
by compiling into them heuristics as clauses that the combination of unit propagation, clause learning and variable selection heuristics can exploit to obtain deductive lower bounds quickly.
EDIT: Further details have been requested on the remark above about optimal planning for Blocks World not being easy. From the references provided, this paper
On the complexity of blocks-world planning
Naresh Gupta and Dana S. Nau
Artificial Intelligence, 1992
has the original proof, which reduces HITTING-SET (one of Karp's NP-complete problems) to the problem of computing optimal plans for Blocks World.
A more accessible paper, which looks quite deeply into planning in the Blocks World domain, is
Blocks World revisited
John Slaney, Sylvie Thiébaux
Artificial Intelligence, 2001
Figure 1 in the paper above shows an example of an instance that illustrates the intuition behind Gupta and Nau's complexity proof.

Another Blocksworld-related paper that I found quite interesting is How Good is Almost Perfect? (AAAI 2008) by Helmert and Röger.
It showed that even when using an almost perfect heuristic (one that is off by at most a constant for every possible state), A* search is bound to explore an exponentially large part of the search space. (This shows that even with almost perfect information about the goal distance, search will still get lost in the search space in this domain.)


Natural language processing (NLP)

With the advancement of technology, industry has been moving towards automation and intelligence. In this regard, artificial intelligence and machine learning have played a vital role. Natural language processing (NLP) is a field of computer science and linguistics which focuses on methods for processing natural languages. So, which one is more reliable and efficient in natural language processing: a finite state machine (FSM) or a pushdown automaton?
Even though there are many techniques for doing NLP, the state-of-the-art approach is to use deep learning. Deep learning techniques have produced many significant improvements in NLP. This has happened because of the enormous amount of processing power that is now available at low cost. If you want to read about cutting-edge techniques used in NLP or any other research domain, go to Google Scholar.
It seems like the real question you want to be asking is: "What are some efficient techniques in natural language processing?" But I will address your question first.
First of all, neither FSAs (finite state automata) nor PDAs (pushdown automata) are sufficient to model language. FSAs can handle regular languages. They cannot, however, even answer the question of whether a word is a palindrome. PDAs are a little more powerful, and can answer such questions. Turing machines give universal computation and are useful for writing programs of arbitrary complexity.
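To illustrate the palindrome point, here is a rough sketch of how a pushdown automaton handles it. A PDA for palindromes is nondeterministic: it guesses where the middle of the word is, pushes the first half onto its stack, and then pops while reading the second half. The code below simulates that nondeterminism by simply trying every guess; the function names are made up for this example. A finite automaton has no stack and only a fixed number of states, so it cannot remember an arbitrarily long first half.

```python
def run_branch(word, push_count, skip_middle):
    """Simulate one nondeterministic branch of the PDA.

    The branch pushes the first `push_count` symbols, optionally skips
    one middle symbol (for odd-length words), then pops one stack
    symbol per remaining input symbol and requires them to match.
    """
    skipped = 1 if skip_middle else 0
    if push_count + skipped > len(word):
        return False
    stack = list(word[:push_count])          # push phase
    for symbol in word[push_count + skipped:]:  # pop phase
        if not stack or stack.pop() != symbol:
            return False
    return not stack  # accept only with an empty stack


def pda_accepts_palindrome(word):
    """Accept if any guess of the midpoint leads to an accepting branch."""
    return any(
        run_branch(word, push_count, skip_middle)
        for push_count in range(len(word) + 1)
        for skip_middle in (False, True)
    )


for w in ["", "a", "abba", "racecar", "ab", "abca"]:
    print(repr(w), pda_accepts_palindrome(w))
```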
Now to start bridging this gap. Natural languages are not regular languages, so they cannot be handled by FSAs. Context-free languages are handled by PDAs (deterministic PDAs suffice for LR(k) grammars); however, natural human language is not context-free. As an example, consider the following three statements: "Jill drove to the grocery store to meet her friend Sally before she picked up her kids. Sally bought three boxes of cereal. Then she drove to the school." While this is poor grammar, it is "natural" in that these are utterances that people make and they are generally parseable by other people. The pronoun "she" in the third sentence most plausibly refers to Jill, as she is the one with children. However, the sentence is ambiguous and we have to infer that association.
The amount of ambiguity in context in natural human language makes it impossible to parse deterministically. Instead, we turn towards the fields of statistics and decision theory to make our inferences about the maximally likely model for the communication.
The locality but non-determinism of speech and writing is one of the things that makes machine learning techniques such as deep recurrent neural networks so immensely effective compared to their classical rule-based counterparts.
While the term Neural Network is a bit of a misnomer as ultimately the human brain is far, far more complex than these rudimentary models from a neurological perspective, the general learning through approximate inference is ostensibly close to reality. We might better call these methods "Differentiable Computing" but that is a digression for another time.
In summary, the answer to the question you actually asked is that PDAs are going to produce better models than FSAs, but both are going to be absolutely worthless compared to even rudimentary statistical methods.
If you are curious about NLP, I would actually recommend a course in machine learning and a follow up in deep learning.
Andrew Ng has a good series of courses that are targeted toward beginners. After that, I would follow up with Siraj's course on deep learning in TensorFlow.

Manning & Jordan's opinions on the approach to better NLP, based on the article

In this article, Manning discussed deep learning's impact on and contribution to NLP, as well as NLP's future. He quoted Michael Jordan's AMA reply on Reddit,
Although current deep learning research tends to claim to encompass NLP, I'm (1) much less convinced about the strength of the results, compared to the results in, say, vision; (2) much less convinced in the case of NLP than, say, vision, the way to go is to couple huge amounts of data with black-box learning architectures.
and said he agrees with point (1) but has reservations about point (2).
The problem is, I am not sure about point (2)'s position.
Later in the article, Manning pointed out that domain knowledge, e.g. linguistics & cognitive psychology, etc., is the center of NLP's recent advancement, and thus, in some sense, downplayed the role of deep learning in advancing NLP.
Therefore, was Manning arguing against mono-focusing on "huge amounts of data" and "black-box learning architectures" since they are often attributed to deep learning? Was Jordan arguing for them as the "way to go" for future NLP?

Do limitations of CPU speed and memory prevent us from creating AI systems?

Many technology optimists say that in 15 years the speed of computers will be comparable with the speed of the human brain. This is why they believe that computers will achieve the same level of intelligence as humans.
If Moore's law holds, then we should expect CPU speed to double every 18 months. 15 years is 180 months, so we will see 10 doublings. This means that in 15 years computers will be 1024 times faster than they are now.
But is speed the root of the problem? If it were, we would be able to build an AI system NOW; it would just be 1024 times slower than in 15 years. To answer a question it would need 1024 seconds (about 17 minutes) instead of an acceptable 1 second. But do we have a strong (but slow) AI system now? I think not. Even if now (2015) we give a system 1 hour instead of 17 minutes, or 1 day, or 1 month, or even 1 year, it will still be unable to answer complex questions formulated in natural language. So, it is not the speed that causes problems.
It means that in 15 years our intelligence will not be 1024 times faster than now (because we have no intelligence). Instead our "stupidity" will be 1024 times faster than now.
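For anyone who wants to check the arithmetic in the question, here it is as a few lines of Python. The 18-month doubling period and the 1-second target are the question's own assumptions; the variable names are just for illustration.

```python
years = 15
doubling_period_months = 18  # the question's reading of Moore's law

doublings = (years * 12) // doubling_period_months
speedup = 2 ** doublings

# If a future machine answers in 1 second, today's machine needs `speedup` seconds.
seconds_today = 1 * speedup
minutes_today = seconds_today / 60

print(doublings)                # 10
print(speedup)                  # 1024
print(round(minutes_today, 1))  # 17.1
```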
We need both faster hardware and better algorithms. Of course speed alone is not enough as you pointed out.
We need self-modifying meta-learning algorithms capable of creating hypotheses and performing experiments to verify them (like humans do). Systems that learn to learn and improve themselves. Algorithms that can prove that a given self-modification is optimal in a certain sense and will lead to even better self-modifications in the future. Systems that can reflect on and inspect their own software (can you call it consciousness?). Such research is being done and may create superhuman intelligence in the future, or even a technological singularity, as some believe.
There is one problem with this approach, though. People doing this research usually assume that consciousness is computable, that it is all about intelligence. They don't take into account experiences like pleasure and pain, which (in my opinion) have nothing to do with computation or intellect. You can understand pain only through experience, not intellectual speculation. Setting a variable named pleasure to 5, or behaving like one feels pleasure, is very different from experiencing pleasure. Some people say that feelings originate in the brain, so it is enough to understand the brain. Not necessarily. A child can ask: "How did they put small people inside the TV box?" Of course the TV is just a receiver and there are no small people inside. The brain might be a receiver too. Do we need higher knowledge for feelings and other experiences?
The question has to be answered in the context of computation and complexity.
Every algorithm has its own complexity and running time (see Big O notation). Some problems, such as the halting problem, are non-computable: it has been proven that no algorithm solves them, regardless of the hardware.
The cost of an algorithm for a computable problem is described by the number of steps it requires as a function of the input size. As the input size increases, the execution time also increases. These algorithms can be split into two categories: exponential-time algorithms and non-exponential-time algorithms. The running time of an exponential-time algorithm grows drastically with the input size and quickly becomes intractable.
The execution time of these algorithms can be improved with better hardware, but the complexity always stays the same: no matter which CPU is used, the algorithm requires the same number of steps. Hardware matters for getting an answer in less time, but the hardness of the problem remains the same. Thus, hardware limitations are not what prevents us from creating an AI system. For instance, you can use parallel programming (e.g. on a GPU) to improve the execution time drastically, but the algorithm has the same complexity as the ordinary CPU version.
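As a rough illustration of why faster hardware does not change the hardness of a problem, the sketch below asks: given a fixed budget of steps, what is the largest input size an algorithm can handle, and how does that change if the machine becomes 1024 times faster? The budget of one million steps and the two cost functions are arbitrary assumptions chosen for the example.

```python
def largest_feasible_n(step_budget, cost):
    """Largest n such that cost(n) fits in the step budget (cost must be increasing)."""
    n = 0
    while cost(n + 1) <= step_budget:
        n += 1
    return n


def quadratic(n):
    return n ** 2   # a polynomial-time algorithm


def exponential(n):
    return 2 ** n   # an exponential-time algorithm


budget = 10 ** 6          # steps we can afford today (assumed)
faster = 1024 * budget    # same wall-clock time on a 1024x faster machine

print(largest_feasible_n(budget, quadratic), largest_feasible_n(faster, quadratic))
# 1000 32000
print(largest_feasible_n(budget, exponential), largest_feasible_n(faster, exponential))
# 19 29
```

With the quadratic algorithm, the 1024x speedup multiplies the feasible input size by 32. With the exponential one it only adds 10 to it, which is the sense in which the problem stays just as hard.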
I would say no. As you showed, speed is not the only factor of intelligence. I for one would think Language is, yes language. Language is the primary skill we learn as humans, so why not for computers? Language gives an understanding that can be understood across the globe, given you know that language. Humans use nonverbal and verbal language to communicate. But I honestly think it really works something like this:
Humans go through experiences. These experiences have a bigger impact on our lives the closer we are to our birth date, or the more emotional they are. For example, the first time we are told no means a lot more to us as an infant than as a 70-year-old adult. These experiences get stored in either long-term or short-term memory and are correlated with that event later in life for reference. We mainly store events to learn from them, to prevent negative experiences or promote positive ones.
Think of it as a tag cloud. The more often you do task A, the bigger the cloud is in memory. We then store crucial details such as type of emotion, location, smells etc. Now when we reference them again from memory we pick out those details and create a logical sentence:
Touching that stove hurt me when I was at grandma's house.
All of the bolded words would have to be stored to have a complete memory.
Now, inside this sentence we have learned a lot more than just being hurt by the stove at grandma's house. We have learned that stoves can be hot and dangerous, and that grandma allows one to be in her house. We also learned how long it takes to heal from such an event, emotionally and physically, to gauge how important the event is. And so much more. So we also store this sub-event information inside other knowledge bubbles. And these bubbles continue to grow exponentially.
Now when asked: Are stoves dangerous?
You can identify the words in the sentence:
are, stoves, dangerous, question
and reference the definition of dangerous as: hurt, bad
and then provide more evidence that this is true, such as personal experience to result in:
Yes, stoves are dangerous because I was hurt at grandma's house by one.
So intelligence seems to be a mix of events, correlation and data retrieval used to solve some problem. I'm sure there's a lot more to it than that, but this is just my understanding of intelligence.

Graduate Level Degree for Simulation/Statistics/Prediction?

I am wondering if anyone has any insight into this. I am thinking of going to grad school to get some computer science related degree. I have always been intrigued by people who are working on problems using statistical packages or simulation to solve problems. What would I study to get a good breadth of knowledge of these things? Do they fall into machine learning?
My girlfriend is getting a degree in mathematics with an emphasis in Statistics and Operations Research.
She does a lot of work with SAS and other statistical software to maximize certain functions and predict the likelihood of future events. It may be more mathematics than you like, but you might try looking for master's programs in CS with an emphasis in Operations Research or Statistics.
There's a wide range of possible opportunities here. Let me add the following choices:
Physics with a focus on complex networks. This has applications in biology, epidemiology, sociology, finance, and computer science.
A good machine learning program, with statistics, data mining, text analysis, and computational learning theory.
Industrial engineering/operations research, with simulation, reliability, and process control.
I'd be happy to talk further about this; please put questions in the comments.
I would assume that your school would offer some actual Statistics courses, probably in the Math department, which you could take to learn all about this.
Study a lot of mathematics, especially probability and statistics. I have a graduate simulation course right now, and I wish I knew more probs/stats stuff.
In Biostatistics (at the U of Minnesota), we did a lot of simulation, in areas like Bayesian statistics, genetics, and others. Any strongly analytical program is a good candidate for teaching the skills you want, including: econ, econometrics, agronomics, statistical genetics... etc., etc. :)
While you're waiting, pick up R, Matlab (Octave is the free implementation), or your Turing-Complete language of choice, dig into Wikipedia, and get to work :)
I'd like to second Gregg Lind's recommendation of thinking about statistics in the biological sciences. It's well-funded, there's a lot of interesting work going on (both theoretical and applied!), and you can sound really cool at parties because somehow, someway you can always make some sort of connection from your work back to curing cancer. :)
Seriously though, a lot of great statistical work was done in the early 20th century by people like Haldane, Fisher and Wright. More recent interesting work has been done on the analysis of large data sets, multiple hypothesis testing, and applied machine learning. It's super exciting. Come join us!

canonical problems list

Does anyone know of a good reference for canonical CS problems?
I'm thinking of things like "the sorting problem", "the bin packing problem", "the traveling salesman problem" and whatnot.
edit: websites preferred
You can probably find the best in an algorithms textbook like Introduction to Algorithms. Though I've never read that particular book, it's quite renowned for being thorough and would probably contain most of the problems you're likely to encounter.
"Computers and Intractability: A guide to the theory of NP-Completeness" by Garey and Johnson is a great reference for this sort of thing, although the "solved" problems (in P) are obviously not given much attention in the book.
I'm not aware of any good on-line resources, but Karp's seminal paper Reducibility among Combinatorial Problems (1972) on reductions and complexity is probably the "canonical" reference for Hard Problems.
Have you looked at Wikipedia's Category:Computational problems and Category:NP Complete Problems pages? It's probably not complete, but they look like good starting points. Wikipedia seems to do pretty well in CS topics.
I don't think you'll find the answers to all those problems in only one book. I've never seen any decent, comprehensive website on algorithms, so I'd recommend sticking to books. That said, you can always get some introductory material from the canonical algorithms texts. There are three I usually recommend: CLRS, Manber, and Aho, Hopcroft and Ullman (this one is a bit out of date on some key topics, but it's so formal and well written that it's a must-read). All of them contain important combinatorial problems that are, in some sense, canonical problems in computer science.
After learning some fundamentals of graph theory you'll be able to move on to Network Flows and Linear Programming. These comprise a set of techniques that will ultimately solve most problems you'll encounter (note, though, that linear programming with the variables restricted to integer values is NP-hard). Network flows deal with problems defined on graphs (with weighted/capacitated edges), with very interesting applications in fields that seemingly have no relationship to graph theory whatsoever. THE textbook on this is Ahuja, Magnanti and Orlin's. Linear programming is a kind of superset of network flows, and deals with optimizing a linear function of variables subject to restrictions in the form of a system of linear equations and inequalities. A book that emphasizes the relationship to network flows is Bazaraa's.
Then you can move on to integer programming, a very valuable tool that offers many natural techniques for modelling problems like bin packing, task scheduling, the knapsack problem, and so on. A good reference would be L. Wolsey's book.
You definitely want to look at NIST's Dictionary of Algorithms and Data Structures. It's got the traveling salesman problem, the Byzantine generals problem, the dining philosophers' problem, the knapsack problem (related to, but not the same as, your "bin packing problem"), the cutting stock problem, the eight queens problem, the knight's tour problem, the busy beaver problem, the halting problem, etc. etc.
It doesn't have the firing squad synchronization problem (I'm surprised about that omission) or the Jeep problem (more logistics than computer science).
Interestingly enough, there's a blog which talks about some of these in puzzle form. (I can't remember whether I've read Smullyan's book cited in the blog, but he is a good compiler of puzzles & philosophical musings. Martin Gardner, Douglas Hofstadter and H.E. Dudeney are others.)
Also maybe check out the Stony Brook Algorithm Repository.
(Or look up "combinatorial problems" on google, or search for "problem" in Wolfram Mathworld or look at Hilbert's problems, but in all these links many of them are more pure-mathematics than computer science.)
#rcreswick those sound like good references but fall a bit shy of what I'm thinking of. (However, for all I know, it's the best there is)
I'm going to not mark anything as accepted in hopes people might find a better reference.
Meanwhile, I'm going to list a few problems here; feel free to add more.
The sorting problem: find an order for a set that is monotonic in a given way.
The bin packing problem: partition a set into a minimum number of subsets where each subset is "smaller" than some limit.
The traveling salesman problem: find a Hamiltonian cycle in a weighted graph with the minimum total weight.
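To make the three statements above a bit more concrete, here is a small Python sketch of each. These are only illustrations under my own assumptions: `first_fit_decreasing` is a well-known heuristic that is not guaranteed to find the minimum number of bins, and `brute_force_tsp` tries every tour, so it is only usable on tiny graphs. The function names and the sample data are made up for this example.

```python
from itertools import permutations

# Sorting: find an order that is monotonic under a given key.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)


def first_fit_decreasing(sizes, capacity):
    """Bin packing heuristic: place each item (largest first) in the first bin it fits."""
    bins = []
    for size in sorted(sizes, reverse=True):
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)
                break
        else:
            bins.append([size])  # no existing bin had room, so open a new one
    return bins


def brute_force_tsp(weights):
    """Exact TSP on a complete graph given as a weight matrix; O(n!) time."""
    n = len(weights)
    best_cost, best_tour = float("inf"), None
    for rest in permutations(range(1, n)):  # fix vertex 0 as the start
        tour = (0,) + rest
        cost = sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour


weights = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]

print(by_length)                                        # ['fig', 'pear', 'banana']
print(first_fit_decreasing([4, 8, 1, 4, 2, 1], 10))     # [[8, 2], [4, 4, 1, 1]]
print(brute_force_tsp(weights))                         # (7, (0, 1, 2, 3))
```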
