I am working on a graduation project related to "Aspect extraction (AE)".
I'm pretty confused about POS tagging, syntax trees, grammar rules, and other low-level NLP topics. I need a reference that teaches these things in detail, so if any of you know of one, I would appreciate a pointer.
I know my question is not directly about programming and may not fit the site, but I really need the help.
This is not an easy one! I assume that you are trying to understand the underlying 'why' and 'what', so, if I were you I would start with the one and only "Speech and Language Processing" by Daniel Jurafsky and James H. Martin. They have a whole section (Section 17 in my Second edition) on the representation of meaning, and state representation, with a whole subsection on Aspect.
In addition to that, the book will also help you understand various existing approaches to POS-tagging and the other topics you mentioned above, and, the book is available online for free! There is even the draft of the 3rd edition out there.
Additionally, after reading the chapters above, you can check out how other people do aspect extraction here
I found the geoquery program that can answer some simple questions, but I can't find any demo or instructions of how to process questions.
So my question is how can I get this:
[what,is,the,shortest,river,?]
into this:
answer(A,shortest(A,river(A))).
Moreover, how is that process done?
In SWI-Prolog, there is something you should try:
?- pack_install(chat80).
...
?- edit(library(chat80)).
It's a revived edition of the CHAT-80 system.
There you will find both the parser and the processor to answer your queries.
Two answers:
It's complex!
Use a library to generate a data structure from the question string (I have not tried that ever).
Here is a comment I put up on the SWI-Prolog doc site. It may be of some help.
Start with an overview:
https://plato.stanford.edu/entries/computational-linguistics/
In a recent heavy textbook:
Introduction to Natural Language Processing
by Jacob Eisenstein, 2019
https://mitpress.mit.edu/books/introduction-natural-language-processing
the author goes far beyond "NLP and Parsing" (indeed the part on
Formal Language Theory, Context-Free Parsing, Dependency Parsing is
only 80 pages of 450) and throws statistics and neural networks at the
problem.
Prolog related works by reverse year of publication
An Introduction to Language Processing with Perl and Prolog
by Pierre M. Nugues, 2nd edition, 2014
https://link.springer.com/book/10.1007%2F3-540-34336-9
"An Outline of Theories, Implementation, and Application with Special Consideration of English, French, and German"
Contains an intro to Prolog, specifically SWI-Prolog.
Natural Language Processing Techniques in Prolog
by Patrick Blackburn and Kristina Striegnitz, 2002
http://cs.union.edu/~striegnk/courses/nlp-with-prolog/html/index.html
Prolog and Natural-Language Analysis
by Fernando C. N. Pereira and Stuart M. Shieber (original 1987, Millennial reissue 2002)
http://www.mtome.com/Publications/PNLA/prolog-digital.pdf
So my question is how can I get this:
[what,is,the,shortest,river,?]
into this:
answer(A,shortest(A,river(A))).
The research at the University of Texas (2005-2009) is in machine learning: learning what the Geobase program (published in 1988) does. The downloaded Prolog programs and data allow you to query its internal DB in Prolog. The semantic parser, which maps English queries into logic queries (which can then be executed by said programs), is missing.
The program comes with 2 sets of data pairs (250 and 880 pairs) of English queries and logic queries. This data is used for research in machine learning.
So if the user types in the preparsed query, you can use the geoquery program to retrieve the logic query, run the logic query and get the results.
Moreover, how is that process done?
That is called semantic parsing. It has a long history, going back to Alain Colmerauer's work in 1972 producing the QA application Orbis on an astronomy DB, the Chat80 system in 1980, and so on.
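To make the idea of mapping a token list to a logical form concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how geoquery or CHAT-80 actually work: the `LEXICON` table, the `STOP_WORDS` set and the `parse_question` function are all hypothetical names invented for this illustration, and real systems use proper grammars (for example DCGs in Prolog) or learned parsers.

```python
# Toy lexicon (hypothetical, not taken from geoquery): word -> (kind, predicate).
LEXICON = {
    "river": ("noun", "river"),
    "state": ("noun", "state"),
    "city": ("noun", "city"),
    "shortest": ("superlative", "shortest"),
    "longest": ("superlative", "longest"),
    "largest": ("superlative", "largest"),
}

# Words this toy parser simply skips.
STOP_WORDS = {"what", "is", "the", "?"}


def parse_question(tokens):
    """Map a token list to a logical-form string, or raise ValueError."""
    modifiers = []
    noun = None
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        if tok not in LEXICON:
            raise ValueError(f"unknown word: {tok}")
        kind, pred = LEXICON[tok]
        if kind == "noun":
            noun = pred
        else:
            modifiers.append(pred)
    if noun is None:
        raise ValueError("no noun found")
    # Start from the noun and wrap it in modifiers, innermost first.
    form = f"{noun}(A)"
    for pred in reversed(modifiers):
        form = f"{pred}(A,{form})"
    return f"answer(A,{form})"


print(parse_question(["what", "is", "the", "shortest", "river", "?"]))
```

The printed string has the same shape as the logic query in the question (without the trailing period). Anything beyond this tiny vocabulary needs a real grammar, which is what the books listed in this thread cover.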
I recommend these books by Prof. Covington (taught at the University of Georgia)
NLP for Prolog Programmers (1994)
Prolog Programming in Depth (1997)
Available free from his site, these books will help you go a long way into NLP.
There are a lot of resources out there that address a couple of aspects of ES, but most of them are pros/cons lists or example snippets. Terms such as projection, apply, and replay are also used (mostly) without explanation or in slightly different contexts.
The best sample implementation with corresponding documentation and extra resources is the CQRS Journey from Microsoft, and one can learn from it a lot, but it is not authoritative.
The closest to an informal spec I could find is Leif Battermann's concise summary, but his entire site has been down for a couple weeks now.
It may be that such a comprehensive guideline does not exist, because it is a concept that became popular, people picked it up and started using it as they saw fit, and no one will ever agree on the details anymore.
UPDATE (2/16/2018 11:03)
I somehow missed Greg Young's (who coined the term CQRS) book, Event Centric: Finding Simplicity in Complex Systems. Would this be a publication to ES as Eric Evans' book is to Domain-Driven Design?
The closest thing to what you are looking for is probably Martin Fowler's 2005 essay; Martin's description predates Young's introduction of DDDD/CQRS by a few years.
Would this be a publication to ES as Eric Evans' book is to Domain-Driven Design?
No; among other things, Evans actually wrote the blue book. Event Centric has not been written (as of October 2017).
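For the terms the question mentions (apply, replay, projection), here is a minimal sketch of one common reading of them. It is not an authoritative definition, and as the question notes, usage varies between sources. The bank-account events and every name in the block (`Deposited`, `Withdrawn`, `apply`, `replay`, `project_totals`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance, event):
    """Apply: fold one event into the current state and return the new state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def replay(events, initial=0):
    """Replay: rebuild the state by applying every stored event in order."""
    state = initial
    for event in events:
        state = apply(state, event)
    return state


def project_totals(events):
    """Projection: a separate read model derived from the same event log."""
    totals = {"deposited": 0, "withdrawn": 0}
    for event in events:
        if isinstance(event, Deposited):
            totals["deposited"] += event.amount
        elif isinstance(event, Withdrawn):
            totals["withdrawn"] += event.amount
    return totals


# The append-only event log is the source of truth; state is derived from it.
log = [Deposited(100), Withdrawn(30), Deposited(5)]
print(replay(log))
print(project_totals(log))
```

The design point is that nothing stores the balance directly: both the current state and any read model can be recomputed from the log at any time.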
I've noticed that a number of top universities are offering courses where students are taught subjects relating to Computer Graphics for their CS majors. Sadly this is something not offered by my university and something I would really like to get into sometime in the next couple of years.
A couple of the projects I've found from some universities are great, although I'm mostly interested in two things:
Raytracing:
I want to write a Raytracer within the next two years. What do I need to know? I'm not a fantastic programmer yet (Java, C and Prolog are my main languages as of today) but I'm slowly learning every day. Also, my Math background isn't all that great, so any pointers on books to read or advice on writing such a program would be fantastic. I tend to pick these things up pretty quickly so feel free to chuck references at me.
Programming 3D Rendered Models
I've looked at a couple of projects where students have developed models and used them in games. I've made a couple of 2D games with raster images but have never worked with 3D models. What would I need to learn in regards to programming these models? If it helps I used to be okay with 3D Studio Max and Cinema4D (although every single course seems to use Maya), but haven't touched it in about four years.
Sorry for posting such vague and, let's be honest, stupid questions. It's just something I've wanted to do for a while and something that'd be good as a large project for me to develop in my own time.
Related Questions
Literature and Tutorials for Writing a Ray Tracer
I can recommend pbrt; it's both a book and a physically-based renderer used to teach computer science graduates. The description of the maths used is nice and clear, and since it is written in the 'literate programming' style you can see the appropriate code (in C++) too.
The book "Computer Graphics: Principles and Practice" (known in computer graphics circles as the "Foley-van Dam") is the basis for most computer graphics courses, and it covers the topic of implementing a ray tracer in much detail. It is quite dated, but it's still the best, afaik, and the basic principles remain the same.
I also second the recommendation for Eric Lengyel's Mathematics for 3D Game Programming and Computer Graphics. It's not as thorough, but it's a wonderful review of the math basics you need for 3D programming, it has very useful summaries at the end of each chapter, and it's written in an approachable, not too scary way.
In addition, you'll probably want some OpenGL or DirectX basics. It's easier to start working with a 3D API and then learn the underlying maths than the other way around (in my opinion), but both options are possible. Just look for OpenGL on SO and you should find a couple of good references as well.
The 2000 ICFP Programming Contest asked participants to build a ray tracer in three days. They have a good specification for a simple ray tracer, and you can get code for the winning entries and some other entries as well. There were entries in a large number of different programming languages. This might be a nice way for you to get started.
The briefest useful answer I can give is that most of the important algorithms can be found in Real-Time Rendering by Tomas Akenine-Möller, Eric Haines, and Naty Hoffman, and the bibliography at the end has references to the necessary maths. Their website has a recommended reading list as well.
The most useful math book I've read on the subject is Eric Lengyel's Mathematics for 3D Game Programming and Computer Graphics. The maths you need most are geometry (obviously) and linear algebra (for dealing with all the matrices).
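As a taste of the geometry and linear algebra involved, here is a minimal sketch of the ray-sphere intersection test that sits at the core of most simple ray tracers. It is not taken from any of the books mentioned; the function names (`dot`, `sub`, `ray_sphere_hit`) are hypothetical, vectors are plain 3-tuples, and the direction is assumed to be non-zero.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a, b):
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest t >= 0 where origin + t * direction meets the sphere, else None.

    Substituting the ray into |p - center|^2 = radius^2 gives a quadratic
    a*t^2 + b*t + c = 0 in t; its real roots are the intersection points.
    """
    oc = sub(origin, center)
    a = dot(direction, direction)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # no real roots: the ray misses the sphere
    sqrt_disc = math.sqrt(disc)
    # Try the smaller root first; skip roots behind the ray origin.
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t >= 0.0:
            return t
    return None


# A ray along +z from the origin towards a unit sphere centred at z = 5.
print(ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))
```

A full ray tracer repeats this test for one ray per pixel against every object, then shades the nearest hit; the rest is mostly more vector arithmetic of the same kind.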
I took such a class last year, and I believe that the class was wonderful for forcing students to learn the math behind the computer graphics - not just the commands for making a computer do what you want.
My professor has a site located here and it has his lecture notes and problem sets that you can take a look through.
Our final project was indeed a raytracer, but once you know the mathematics behind it, coding (an inefficient one) is trivial.
For a mathematical introduction into these topics, see
http://graphics.idav.ucdavis.edu/education/GraphicsNotes/homepage.html
Check http://www.scratchapixel.com/lessons/3d-basic-lessons/lesson-1-writing-a-simple-raytracer/
This is a very good place to learn about ray tracing and rendering in general.
On my reading spree, I stumbled upon something called Intentional Programming.
I understood it somewhat, but not fully. If anyone can explain it in better detail, please do. Is it being used in any real application?
You got me started on this one...
Looks like C. Simonyi wanted to step up to the next level of abstraction above high-level languages: reduce customers' dependency on developers for every change in code (which is cryptic for people outside development).
So he invents this new product called IP, which has a WYSIWYG type GUI editor to create a domain specific model. (i.e. IP has a GUI to create the building blocks for your app.. LISP allowed you to create the meta/building blocks but not in a way that domain experts could easily do it.)
Like the models in UML, the promise is that you can auto-generate the corresponding source code at the "push of a button". So the domain experts can tweak the model in the future and press the Bake button to deliver the next version of the app.
It seems to utilise DSLs however with the added benefit that multiple user-created DSLs can talk with each other via a built-in IP mechanism... which means the finance model and sales model can interact and reuse blocks as needed. As with DSLs, you get the benefit of code that conveys developer intent rather than appeases implementation language constraints.
The idea being to give greater control to the BA and domain experts who actually know what's needed...
Update:
Real world use looks like 'not yet'.. although Simonyi believes 'absolutely in the long term'.
Short Story: MS squished IP in favor of .Net framework, Simonyi left MS and formed his own company 'Intentional Software'.. with the contract that he could use the IP ideas but he would have to rewrite his working proto from the ground up.. (that should slow him down). It's still Work-In-Progress I think.. and being written in C# (to boot)
Sources:
Anything you can do, I can do meta by Scott Rosenberg, MIT Tech Review (2007)
To think till yesterday.. I didn't know a thing about this. Investigative reporter signing off. Going back to day job :)
It's the opposite of what happens when I come home at 2am after a pub crawl and fire up the laptop "just to check my email real quick, hon."
Then, the next day, when I peel open one eye and find my way to the bathroom at the crack of noon, I start brushing my teeth and realize, toothpaste dribbling out of my mouth, that last night I made 4 SVN commits, closed 3 bugs, and figured out how to solve the starvation problem on our distributed locking protocol. And I have no idea how the hell any of it works, anymore.
Or maybe it's what workmad3 said.
It appears to be a method of programming that allows the programmer to expand what is actually in the language to more closely follow their original intent, rather than forcing the programmer's intent into the constrained syntax of the language.
It explicitly mentions LISP as a language that supports this, so I'd suggest you read up on this great language :) LISP Macros are exactly what are described in the article, allowing you to indefinitely expand the language to cover almost anything you would care to express. (A fairly common outcome of large LISP systems is that you end up with a domain specific language that is very good for writing specific applications, i.e. writing a word processor ends up with a word processor specific language).
For your last part, yes LISP (and thus Intentional Programming) is used in some projects. Paul Graham is a great proponent of LISP, and other examples of it include the original Crash Bandicoot (a game object creation system was created in LISP for this, including a LISP PlayStation compiler)
I have a slightly different understanding of Intentional Programming (as a more general term, not just what Charles Simonyi is doing). It is closely linked to fluent interfaces and can be achieved, with varying degrees of difficulty, in modern object-oriented languages.
Some of these concepts come from Domain Driven Design (in fact the term "fluent interface" has been popularised by Eric Evans, the author of "the" blue book - Domain Driven Design: Tackling Complexity in the Heart of Software).
The aim is to make business layer code readable by a non-programmer (i.e. a business person). This can be achieved by class and method names that explicitly state the intent of the operation. In my opinion, being explicit and being intentional produces highly readable and maintainable code.
Consider the two examples below that achieve the same thing - creating an order for a customer with 10% discount and adding a couple of products to it.
//C#, Normal version
Customer customer = CustomerService.Get(23);
Order order = new Order();
//What is 0.1? Need to look at Discount property to understand
order.Discount = 0.1;
order.Customer = customer;
//What's 34?
Product product = ProductService.Get(34);
//Do we really care that Order stores OrderLines?
order.OrderLines.Add(new OrderLine(product, 1));
Product product2 = ProductService.Get(54);
order.OrderLines.Add(new OrderLine(product2, 2)); //What's 2?
order.Submit();
//C#, Fluent version
//byId is named parameter, states that this method looks up customer by Id
ICustomerForOrderCreation customer =
CustomerService.GetCustomerForOrderCreation(byId: 23);
//Explicit method to create a discount order and explicit percentage
Order order = customer.CreateDiscountOrder(10.Percent())
.WithProduct(ProductService.Get(byId: 34))
.WithProduct(ProductService.Get(byId: 54))
.WithQuantity(2); //Explicit quantity
order.Submit();
By changing your programming style slightly, you are able to communicate your intent more clearly and reduce how often a reader has to look at code elsewhere to understand what's going on.
Seems to me like yet another fad of software engineering. We've seen thousands of them already: meta programming, generative programming, visual programming, and so on. For a short time they get very fashionable, people use it everywhere, and then they invariably go back to old ways of creating software.
Why? Frederick Brooks already answered this question over 20 years ago: there's No Single Silver Bullet to kill the werewolf...
Intentional Programming is encoding your intent, or goals. Thus it is Goal-Oriented Programming or Planning. Step up to management.
It's where you intend to program, you don't just accidentally do it. ;)