I am researching formal and informal search heuristics. One of the best books on the subject I've found is Judea Pearl's Heuristics. Embarrassingly, I find myself unable to find a good search strategy that returns more material in this vein.
Things I am looking for:
Summary papers about advances in search
Books/papers that cover some of the history of the development of methods
Some idea about who is currently producing research in this space and their specialization
Additional keywords, search methods, and items that should appear on this list to broaden the search
I'm looking for non-technical material. Most works contain a lot of specific implementation detail and only small, short bits about where the research came from and what it led to (which leads to me chasing citation trails). This is totally fine, but I'm hoping to find works that collect more of the non-technical info in one place.
Some works I've canvassed so far:
Search and Optimization by Metaheuristics: Techniques and Algorithms Inspired by Nature
Metaheuristics: from design to implementation
Artificial Intelligence, Evolutionary Computing and Metaheuristics: In the Footsteps of Alan Turing
Essays and Surveys in Metaheuristics
Essentials of Metaheuristics
Handbook of Approximation Algorithms and Metaheuristics
Heuristics, Metaheuristics and Approximate Methods in Planning and Scheduling
Recent Advances on Meta-Heuristics and Their Application to Real Scenarios
Advances in Knowledge Representation
Applications of Conceptual Spaces: The Case for Geometric Knowledge Representation
Concepts, Ontologies, and Knowledge Representation
Handbook of Knowledge Representation
I realize this is more an academically oriented question and am also open to suggestions of where else to post such a question.
I've noticed that a number of top universities offer Computer Graphics courses to their CS majors. Sadly this is something not offered by my university, and something I would really like to get into sometime in the next couple of years.
A couple of the projects I've found from some universities are great, although I'm mostly interested in two things:
Raytracing:
I want to write a Raytracer within the next two years. What do I need to know? I'm not a fantastic programmer yet (Java, C and Prolog are my main languages as of today) but I'm slowly learning every day. Also, my Math background isn't all that great, so any pointers on books to read or advice on writing such a program would be fantastic. I tend to pick these things up pretty quickly so feel free to chuck references at me.
Programming 3D Rendered Models
I've looked at a couple of projects where students have developed models and used them in games. I've made a couple of 2D games with raster images but have never worked with 3D models. What would I need to learn in regards to programming these models? If it helps I used to be okay with 3D Studio Max and Cinema4D (although every single course seems to use Maya), but haven't touched it in about four years.
Sorry for posting such vague and, let's be honest, stupid questions. It's just something I've wanted to do for a while and something that'd be good as a large project for me to develop in my own time.
Related Questions
Literature and Tutorials for Writing a Ray Tracer
I can recommend pbrt; it's a book and a physically-based renderer used to teach computer science graduates. The description of the maths used is nice and clear, and since it is written in the 'literate programming' style you can see the corresponding code (in C++) too.
The book "Computer Graphics: Principles and Practice" (known in the Computer Graphics circles as the "Foley-VanDam") is the basic for most computer graphics courses, and it covers the topic of implementing a ray-tracer in much detail. It is quite dated, but it's still the best, afaik, and the basic principles remain the same.
I also second the recommendation for Eric Lengyel's Mathematics for 3D Game Programming and Computer Graphics. It's not as thorough, but it's a wonderful review of the math basics you need for 3D programming, it has very useful summaries at the end of each chapter, and it's written in an approachable, not too scary way.
In addition, you'll probably want some OpenGL or DirectX basics. It's easier to start working with a 3D API and then learn the underlying maths than the other way around (in my opinion), but both options are possible. Just look for OpenGL on SO and you should find a couple of good references as well.
The 2000 ICFP Programming Contest asked participants to build a ray tracer in three days. They have a good specification for a simple ray tracer, and you can get code for the winning entries and some other entries as well. There were entries in a large number of different programming languages. This might be a nice way for you to get started.
The briefest useful answer I can give is that most of the important algorithms can be found in Real-Time Rendering by Tomas Akenine-Möller, Eric Haines, and Naty Hoffman, and the bibliography at the end has references to the necessary maths. Their website has a recommended reading list as well.
The most useful math book I've read on the subject is Eric Lengyel's Mathematics for 3D Game Programming and Computer Graphics. The maths you need most are geometry (obviously) and linear algebra (for dealing with all the matrices).
I took such a class last year, and I believe that the class was wonderful for forcing students to learn the math behind computer graphics - not just the commands for making a computer do what you want.
My professor has a site located here and it has his lecture notes and problem sets that you can take a look through.
Our final project was indeed a raytracer, and once you know the mathematics behind it, coding one (an inefficient one, at least) is trivial; see the sketch below.
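To make that concrete, here is a minimal sketch of the geometric core of a raytracer: intersecting a ray with a sphere by solving the resulting quadratic. All names here are illustrative, and a real raytracer adds shading, reflection, and scene traversal on top of this test.

```python
# Minimal sketch: ray-sphere intersection, the core test in a raytracer.
# A ray is origin + t * direction; substituting it into the sphere equation
# |p - center|^2 = r^2 gives a quadratic in t.
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the nearest positive t at which the ray hits the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                             # ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)      # try the nearer root first
    if t > 0:
        return t
    t = (-b + math.sqrt(disc)) / (2.0 * a)      # farther root (origin inside sphere)
    return t if t > 0 else None

# Example: a ray from the origin along +z hits a sphere centered at z=5
# with radius 1 at t=4.
print(intersect_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))
```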
For a mathematical introduction to these topics, see
http://graphics.idav.ucdavis.edu/education/GraphicsNotes/homepage.html
Check http://www.scratchapixel.com/lessons/3d-basic-lessons/lesson-1-writing-a-simple-raytracer/
This is a very good place to learn about ray tracing and rendering in general.
I am wondering if anyone has any insight into this. I am thinking of going to grad school to get some computer science related degree. I have always been intrigued by people who use statistical packages or simulation to solve problems. What would I study to get a good breadth of knowledge of these things? Do they fall under machine learning?
Thanks
My girlfriend is getting a degree in mathematics with an emphasis in Statistics and Operations Research.
She does a lot of work with SAS and other statistical software to maximize certain functions and predict the likelihood of future events. It may be more mathematics than you'd like, but you might try looking for masters of CS programs with an emphasis in Operations Research or Statistics.
There's a wide range of possible opportunities here. Let me add the following choices:
Physics with a focus on complex networks. This has applications in biology, epidemiology, sociology, finance, and computer science.
A good machine learning program, with statistics, data mining, text analysis, and computational learning theory.
Industrial engineering/operations research, with simulation, reliability, and process control.
I'd be happy to talk further about this, please put questions in comments.
I would assume that your school would offer some actual Statistics courses, probably in the Math department, which you could take to learn all about this.
Study a lot of mathematics, especially probability and statistics. I'm taking a graduate simulation course right now, and I wish I knew more probs/stats stuff.
In Biostatistics (at the U of Minnesota), we did a lot of simulation, in areas like Bayesian statistics, genetics, and others. Any strongly analytical program is a good candidate for teaching the skills you want, including econ, econometrics, agronomics, statistical genetics... etc., etc. :)
While you're waiting, pick up R, Matlab (Octave is the free implementation), or your Turing-Complete language of choice, dig into Wikipedia, and get to work :)
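In that spirit, here is a toy Monte Carlo simulation in Python, just to show how little code "get to work" requires. The example (estimating pi from random points) is my own illustration, not from any particular course.

```python
# Toy Monte Carlo simulation: estimate pi by sampling random points in the
# unit square and counting how many land inside the quarter circle.
import random

def estimate_pi(samples=1_000_000):
    inside = sum(
        1 for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    # Area of quarter circle / area of square = pi/4.
    return 4 * inside / samples

print(estimate_pi())
```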
I'd like to second Gregg Lind's recommendation of thinking about statistics in the biological sciences. It's well-funded, there's a lot of interesting work going on (both theoretical and applied!), and you can sound really cool at parties because somehow, someway you can always make some sort of connection from your work back to curing cancer. :)
Seriously though, a lot of great statistical work was done in the early 20th century by people like Haldane, Fisher, and Wright. More recent interesting work has been done on analysis of large data sets, multiple hypothesis testing, and applied machine learning. It's super exciting. Come join us!
Question
So I've recently come up with some possible new projects that would involve deriving 'meaning' from text submitted and generated by users.
Natural language processing is the field that deals with these kinds of issues, and after some initial research I found the OpenNLP Hub and university collaborations like the Attempto project. And stackoverflow has this.
If anyone could link me to some good resources, from research papers and introductory texts to APIs, I'd be happier than a 6-year-old kid opening his Christmas presents!
Update
Through one of your recommendations I've found OpenCyc ('the world's largest and most complete general knowledge base and commonsense reasoning engine'). Even more amazing still, there's a project that is a distilled version of OpenCyc called UMBEL. It features semantic data in RDF/OWL/SKOS N3 syntax.
I've also stumbled upon ANTLR, a parser generator for 'constructing recognizers, interpreters, compilers, and translators from grammatical descriptions'.
And there's a question on here by me, that lists tons of free and open data.
Thanks stackoverflow community!
Tough call; NLP is a much wider field than most people think it is. Basically, language can be split up into several categories, which will require you to learn totally different things.
Before I start, let me tell you that I doubt you'll have any notable success (as a professional, at least) without having a degree in some (closely related) field. There is a lot of theory involved, most of it is dry stuff and hard to learn. You'll need a lot of endurance and most of all: time.
If you're interested in the meaning of text, well, that's the Next Big Thing. Semantic search engines are predicted to usher in Web 3.0, but we're far from 'there' yet. Extracting logic from a text depends on several steps:
Tokenization, Chunking
Disambiguation on a lexical level (Time flies like an arrow, but fruit flies like a banana.)
Syntactic Parsing
Morphological analysis (tense, aspect, case, number, whatnot)
A small list, off the top of my head. There's more :-), and many more details to each point. For example, when I say "parsing", what is this? There are many different parsing algorithms, and there are just as many parsing formalisms. Among the most powerful are Tree-adjoining grammar and Head-driven phrase structure grammar. But both of them are hardly used in the field (for now). Usually, you'll be dealing with some half-baked generative approach, and will have to conduct morphological analysis yourself.
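As a taste of the first steps in that list, here is a minimal sketch using Python's NLTK: tokenization, part-of-speech tagging, and a toy noun-phrase chunker. The chunk grammar is a made-up illustration, and the tokenizer/tagger models need a one-time nltk.download().

```python
# Minimal sketch of tokenization, tagging, and chunking with NLTK.
# One-time setup: nltk.download('punkt'), nltk.download('averaged_perceptron_tagger').
import nltk

sentence = "Time flies like an arrow."
tokens = nltk.word_tokenize(sentence)   # tokenization
tagged = nltk.pos_tag(tokens)           # part-of-speech tagging (lexical level)

# A toy grammar: a noun phrase is an optional determiner,
# any number of adjectives, then one or more nouns.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
print(chunker.parse(tagged))            # shallow parse tree with NP chunks
```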
Going from there to semantics is a big step. A syntax/semantics interface depends on both the syntactic and the semantic framework employed, and there is no single working solution yet. On the semantic side, there's classic generative semantics, then there is Discourse Representation Theory, dynamic semantics, and many more. Even the logical formalism everything is based on is still not well-defined. Some say one should use first-order logic, but that hardly seems sufficient; then there is intensional logic, as used by Montague, but that seems overly complex and computationally infeasible. There is also dynamic logic (Groenendijk and Stokhof have pioneered this stuff. Great stuff!) and very recently, this summer actually, Jeroen Groenendijk presented a new formalism, Inquisitive Semantics, also very interesting.
If you want to get started on a very simple level, read Blackburn and Bos (2005); it's great stuff, and the de-facto introduction to Computational Semantics! I recently extended their system to cover the partition theory of questions (question answering is a beast!), as proposed by Groenendijk and Stokhof (1982), but unfortunately, the theory has a complexity of O(n²) over the domain of individuals. While doing so, I found B&B's implementation to be a bit, erhm… hackish in places. Still, it is going to really, really help you dive into computational semantics, and it is still a very impressive showcase of what can be done. Also, they deserve extra cool-points for implementing a grammar that is set in Pulp Fiction (the movie).
And while I'm at it, pick up Prolog. A lot of research in computational semantics is based on Prolog. Learn Prolog Now! is a good intro. I can also recommend "The Art of Prolog" and Covington's "Prolog Programming in Depth" and "Natural Language Processing for Prolog Programmers", the former of which is available for free online.
Chomsky is totally the wrong source to look to for NLP (and he'd say as much himself, emphatically)--see: "Statistical Methods and Linguistics" by Abney.
Jurafsky and Martin, mentioned above, is a standard reference, but I myself prefer Manning and Schütze. If you're serious about NLP you'll probably want to read both. There are videos of one of Manning's courses available online.
If you work through Learn Prolog Now! (mentioned by Mr. Dimitrov above) up to the DCG chapter, you'll have a good start on getting some semantics into your system, since Prolog gives you a very simple way of maintaining a database of knowledge and belief, which can be updated through question-answering.
As regards the literature, I have one major recommendation for you: run out and buy Speech and Language Processing by Jurafsky & Martin. It is pretty much the book on NLP (the first chapter is available online); used in a frillion university courses but also very readable for the non-linguist and practically oriented, while at the same time going fairly deep into the linguistics problems. I really cannot recommend it enough. Chapters 17, 18 and 21 seem to be what you're looking for (14, 15 and 18 in the first edition); they show you simple lambda notation which translates pretty well to Prolog DCG's with features.
Oh, btw, on getting the masters in linguistics; if NL semantics is what you're into, I'd rather recommend taking all the AI-related courses you can find (although any courses on "plain" linguistic semantics, logic, logical semantics, DRT, LFG/HPSG/CCG, NL parsing, formal linguistic theory, etc. wouldn't hurt...)
Reading Chomsky's original literature is not really useful; as far as I know there are no current implementations that directly correspond to his theories, all the useful stuff of his is pretty much subsumed by other theories (and anyone who stays near linguists for any matter of time will absorb knowledge of Chomsky by osmosis).
I'd highly recommend playing around with the NLTK and reading the NLTK Book. The NLTK is very powerful and easy to get into.
You could try reading up a bit on phrase structure grammars, which are basically the mathematics behind much language processing. It's actually not that heavy, being largely based on set and graph theory. I studied it many moons ago as part of a discrete math course, and I guess there are many good references available at this stage.
Edit: Not as much as I expected on Google, although this one looks like a good learning source.
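To give a feel for what a phrase structure grammar looks like in practice, here is a toy context-free grammar parsed with NLTK; the grammar and sentence are invented purely for illustration.

```python
# A toy phrase structure (context-free) grammar, parsed with NLTK.
import nltk

grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> Det N
    VP  -> V NP
    Det -> 'the'
    N   -> 'dog' | 'cat'
    V   -> 'chased'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased the cat".split()):
    # Prints: (S (NP (Det the) (N dog)) (VP (V chased) (NP (Det the) (N cat))))
    print(tree)
```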
One of the early explorers into NLP is Noam Chomsky; he wrote small books on the subject in the 50s through the 70s. You may find that engaging reading.
Cycorp have a short description of how their Cyc knowledge base derives meaning from sentences.
By utilising a massive knowledge base of common facts, the system can determine the most logical parse of a sentence.
A simpler place to begin with the building blocks is to look at the documentation for a package that attempts to do it. I'd recommend the Python Natural Language Toolkit (NLTK), particularly because of their well-written, free book, which is filled with examples. It won't get you all the way to what you want (which is an AI-hard problem), but it will give you a good footing. NLTK has parsers, chunkers, context-free grammars, and more.
This is really hard stuff. I'd start off by getting at least a Masters in Linguistics, and then work towards my PhD in computer science, concentrating on NLP.
The problem is that most of us don't really understand what language is. And without that understanding, it's bloody tough to implement a solution.
Other comments give some readings, which are probably fine if you want to get started playing around with a small subset of the problem, but in order to come up with a really robust solution, then there are no shortcuts. You need the academic background in both disciplines.
A very enjoyable readable introduction is The Language Instinct by Steven Pinker. It goes into the Chomsky stuff and also tells interesting stories from the evolutionary biology angle. Might be worth starting with something like that before diving into Chomsky's papers and related work, if you're new to the subject.
Does anyone know of a good reference for canonical CS problems?
I'm thinking of things like "the sorting problem", "the bin packing problem", "the travelling salesman problem" and what not.
edit: websites preferred
You can probably find the best in an algorithms textbook like Introduction to Algorithms. Though I've never read that particular book, it's quite renowned for being thorough and would probably contain most of the problems you're likely to encounter.
"Computers and Intractability: A guide to the theory of NP-Completeness" by Garey and Johnson is a great reference for this sort of thing, although the "solved" problems (in P) are obviously not given much attention in the book.
I'm not aware of any good on-line resources, but Karp's seminal paper Reducibility among Combinatorial Problems (1972) on reductions and complexity is probably the "canonical" reference for Hard Problems.
Have you looked at Wikipedia's Category:Computational problems and Category:NP Complete Problems pages? It's probably not complete, but they look like good starting points. Wikipedia seems to do pretty well in CS topics.
I don't think you'll find the answers to all those problems in only one book. I've never seen any decent, comprehensive website on algorithms, so I'd recommend you to stick to the books. That said, you can always get some introductory material from the canonical algorithm texts; there are three I usually recommend: CLRS, Manber, and Aho, Hopcroft and Ullman (this last one is a bit out of date in some key topics, but it's so formal and well-written that it's a must-read). All of them contain important combinatorial problems that are, in some sense, canonical problems in computer science.

After learning some fundamentals in graph theory you'll be able to move on to Network Flows and Linear Programming. These comprise a set of techniques that will ultimately solve most problems you'll encounter (linear programming with the variables restricted to integer values is NP-hard). Network flows deals with problems defined on graphs (with weighted/capacitated edges) with very interesting applications in fields that seemingly have no relationship to graph theory whatsoever. THE textbook on this is Ahuja, Magnanti and Orlin's.

Linear programming is some kind of superset of network flows, and deals with optimizing a linear function on variables subject to restrictions in the form of a linear system of equations. A book that emphasizes the relationship to network flows is Bazaraa's. Then you can move on to integer programming, a very valuable tool that presents many natural techniques for modelling problems like bin packing, task scheduling, the knapsack problem, and so on. A good reference would be L. Wolsey's book.
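As a tiny illustration of what linear programming looks like in code, here is a toy LP solved with SciPy. The problem and its numbers are invented for the example, and scipy is assumed to be installed.

```python
# Toy linear program with SciPy: maximize 3x + 2y
# subject to x + y <= 4, x + 3y <= 6, x >= 0, y >= 0.
# linprog minimizes, so the objective is negated.
from scipy.optimize import linprog

result = linprog(
    c=[-3, -2],                      # objective coefficients (negated to maximize)
    A_ub=[[1, 1], [1, 3]],           # left-hand sides of the <= constraints
    b_ub=[4, 6],                     # right-hand sides of the <= constraints
    bounds=[(0, None), (0, None)],   # x, y >= 0
)
print(result.x, -result.fun)         # optimal point (4, 0) and objective value 12
```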
You definitely want to look at NIST's Dictionary of Algorithms and Data Structures. It's got the traveling salesman problem, the Byzantine generals problem, the dining philosophers' problem, the knapsack problem (= your "bin packing problem", I think), the cutting stock problem, the eight queens problem, the knight's tour problem, the busy beaver problem, the halting problem, etc. etc.
It doesn't have the firing squad synchronization problem (I'm surprised about that omission) or the Jeep problem (more logistics than computer science).
Interestingly enough there's a blog on codinghorror.com which talks about some of these in puzzle form. (I can't remember whether I've read Smullyan's book cited in the blog, but he is a good compiler of puzzles & philosophical musings. Martin Gardner and Douglas Hofstadter and H.E. Dudeney are others.)
Also maybe check out the Stony Brook Algorithm Repository.
(Or look up "combinatorial problems" on google, or search for "problem" in Wolfram Mathworld or look at Hilbert's problems, but in all these links many of them are more pure-mathematics than computer science.)
#rcreswick those sound like good references but fall a bit shy of what I'm thinking of. (However, for all I know, it's the best there is)
I'm going to not mark anything as accepted in hopes people might find a better reference.
Meanwhile, I'm going to list a few problems here; feel free to add more.
The sorting problem: find an ordering of a set that is monotonic under a given comparison.
The bin packing problem: partition a set of items into the minimum number of subsets such that each subset fits within some limit.
The travelling salesman problem: find a Hamiltonian cycle in a weighted graph with the minimum total weight (a brute-force sketch of this one is below).
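For concreteness, here is a brute-force sketch of the travelling salesman problem in Python. It runs in exponential time and is only meant to illustrate the problem statement; the distance matrix is made up.

```python
# Brute-force travelling salesman: try every tour starting and ending at
# city 0 and keep the one with the smallest total weight.
from itertools import permutations

def tsp_brute_force(dist):
    """dist[i][j] is the edge weight between cities i and j."""
    n = len(dist)
    best_tour, best_cost = None, float("inf")
    # Fix city 0 as the start to avoid counting rotations of the same tour.
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

print(tsp_brute_force([[0, 2, 9], [2, 0, 6], [9, 6, 0]]))
```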