Creating a Neural Machine Translation basics - nlp

I'm currently working on a project in which I will create a program/model to translate my native dialect into English. Are there any books or other resources you can recommend for this project?

On the NLP side of things there's this course: Natural Language Processing with spaCy & Python - Course for Beginners and this older course: Natural Language Processing (NLP) Tutorial with Python & NLTK on Free Code Camp, which is generally a good place to start. Their courses provide in depth explanations of concepts and provide good examples.
On the translation side of things, the DeepL translator is easy to use, supports multiple languages, and offers a free API. It also offers an incredibly easy-to-use Python library if that's the language you intend to use (which you should, because Python is the best out there for NLP).
I hope this helps, but dennlinger is right - you shouldn't typically ask broad recommendation questions on StackOverflow!

Related

Which language or tools to learn for natural language processing?

I am French, and am a former Certified Network Security Administrator.
I went back to university 3 years ago to achieve a Bachelor's degree in linguistics, and I am now going to enroll in a Masters Degree in Computer Science applied to Linguistics, with the objective of eventually trying to go through a Doctorate (but I'm not there yet :-) ).
The course will focus on speech recognition, automatic language translation, statistical analysis of texts, speech encoding and decoding, and information abstraction from textual sources.
The professors will let us use any computer language we want to use to code the algorithms and programs we will develop during the curriculum.
I used to develop web apps as a side gig for about 3-4 years and I am proficient in Javascript as I wrote software that used node.js at the server end and the browser at the client. I also have some familiarity with postgresql.
My current style of coding (if we can call that a style) is mainly procedural and I use object prototyping as my main way to create/manage objects in my code. I don't have much experience with object-oriented languages that use the concept of classes to manage objects. Therefore I am pretty confident my current coding skills fall short of what is required to write efficient code for this kind of work.
So my question is this: what would be the best computer language for me to learn in order to be effective in writing algorithms and data structures suited to the above-mentioned linguistic areas?
Thanks in advance for your enlightened answers.
Sat Cit Ananda.
Your question is opinion based, so probably off-topic here.
In France, you have a lot of good courses on OCaml, which is developed at INRIA, and several good books (notably, both in French, Developpement d'Applications en Ocaml by Chailloux, Manoury, Pagano; and Programmation de Droite à Gauche & vice versa by Manoury). J. Pitrat also wrote Textes, Ordinateurs et Compréhension; his latest book, Artificial Beings: The Conscience of a Conscious Machine, will also interest you.
Learning several programming languages, not only one, is always useful (a single programming language is not enough to do natural language processing; you need to learn several languages and several paradigms: both the functional and object-oriented paradigms are useful, and so is Prolog). You could also start reading SICP while learning Scheme. Learning more about Lisp-like languages through Queinnec's book Principe d'implementation de Scheme et Lisp (the updated version of Lisp In Small Pieces) will also teach you a great deal.
Java might also be useful (because some NLP libraries are available in Java), as might Common Lisp, C++11, Haskell, and others.
Also take time to use and master Linux (and its programming) and free software.
In general, natural language processing requires a lot of computer science (and math).
For production NLP systems, Java seems to be the most common choice. It is a nice and safe language for beginner/intermediate programmers that scales well with codebase size, has a simple grammar and a vast standard library, and it is one of the most commonly used languages where software performance isn't the absolute top priority (or where performance can be scaled horizontally/distributed). I believe for example most of the higher layers of IBM Watson are written in Java. You'll also find it as one of the primary teaching languages in CS courses.

What are the steps to create domain specific query language?

I want to create a domain-specific query language.
I need the steps to create it, and how to translate queries written in that language into normal SQL queries so they can be executed.
Are there any recommended tools?
DSLs are not much related to SQL.
You first need to specify your DSL on paper. I strongly recommend reading good books about programming languages while doing that. (e.g. Lisp in Small Pieces by C.Queinnec).
Then you need to implement your DSL as an interpreter. You'll use standard lexing, parsing, and interpreter (or possibly compiler) techniques. You will very probably need to use or implement a garbage collector (or use Boehm's GC). Parser generators like ANTLR could help you.
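To make the lexing / parsing / translation steps a bit more concrete, here is a minimal sketch in Python of a toy query DSL that is translated to SQL. The DSL syntax (`find <table> where <field> <op> <value> and ...`) and all function names are made up for illustration; a real DSL would need a proper grammar, error reporting, and probably a parser generator. The output uses `?` placeholders (the style Python's sqlite3 module accepts), so values are passed separately rather than pasted into the SQL text.

```python
import re

# Hypothetical toy DSL:  find <table> where <field> <op> <value> [and ...]
TOKEN_RE = re.compile(r'\s*(?:(\d+)|"([^"]*)"|(>=|<=|!=|=|<|>)|(\w+))')


def lex(text):
    """Split the query text into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, string, op, word = m.groups()
        if number is not None:
            tokens.append(("NUM", int(number)))
        elif string is not None:
            tokens.append(("STR", string))
        elif op is not None:
            tokens.append(("OP", op))
        else:
            tokens.append(("WORD", word))
        pos = m.end()
    return tokens


def parse(tokens):
    """Parse tokens into (table, [(field, op, value), ...])."""
    def expect(i, kind, value=None):
        ok = i < len(tokens) and tokens[i][0] == kind
        if ok and value is not None:
            ok = tokens[i][1] == value
        if not ok:
            raise SyntaxError(f"expected {value or kind} at token {i}")
        return tokens[i][1]

    expect(0, "WORD", "find")
    table = expect(1, "WORD")
    conditions, i = [], 2
    if i < len(tokens):
        expect(i, "WORD", "where")
        i += 1
        while True:
            field = expect(i, "WORD")
            op = expect(i + 1, "OP")
            if i + 2 >= len(tokens) or tokens[i + 2][0] not in ("NUM", "STR"):
                raise SyntaxError(f"expected a value at token {i + 2}")
            conditions.append((field, op, tokens[i + 2][1]))
            i += 3
            if i == len(tokens):
                break
            expect(i, "WORD", "and")
            i += 1
    return table, conditions


def to_sql(table, conditions):
    """Emit a parameterized SQL string plus the list of parameter values."""
    sql = f"SELECT * FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(f"{f} {op} ?" for f, op, _ in conditions)
    return sql, [value for _, _, value in conditions]


def translate(query):
    """Run the whole pipeline: text -> tokens -> parsed query -> SQL."""
    return to_sql(*parse(lex(query)))


print(translate('find users where age > 30 and city = "Paris"'))
```

Keeping the three stages as separate functions means you can later swap the hand-written parser for a generated one without touching the SQL emitter.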
Co-designing and implementing your DSL in parallel is usually a good way of working.
You really should read several books & papers on several languages before designing & implementing your own DSL.
A practical way to do that is to embed an existing interpreter like Lua into your application, or to embed your application inside an interpreter like OCaml or Python.
Designing and implementing a good DSL is not trivial (several months or years of work), and requires some computer science & programming culture & know-how. Perhaps reading proceedings of conferences like DSL2011 will help you.
In addition to C. Queinnec's book, you could also read Programming Languages: Principles and Paradigms (by Maurizio Gabbrielli & Simone Martini), Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages (by Tate), and Programming Language Pragmatics (by M. Scott).

Natural Language Processing Package

I have started working on a project which requires Natural Language Processing. We have to do spell checking as well as mapping sentences to phrases and their synonyms. I first thought of using GATE, but I am confused about what to use. I found an interesting post here which got me even more confused.
http://lordpimpington.com/codespeaks/drupal-5.1/?q=node/5
Please help me decide what suits my purpose best. I am working on a web application which will use this NLP tool as a service.
You didn't really give much info, but try this: http://www.nltk.org/
I don't think NLTK does spell checking (I could be wrong on this), but it can do part-of-speech tagging for text input.
For finding/matching synonyms you could use something like WordNet http://wordnet.princeton.edu/
If you're doing something really domain specific: I would recommend coming up with your own ontology for domain specific terms.
If you are using Python you can develop a spell checker with Python Enchant.
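If you just want to see the basic idea of suggestion-based spell checking before pulling in a library, here is a rough sketch using only Python's standard `difflib` module. Note that this is not Enchant's API; it is a stand-in technique (fuzzy string matching against a word list), and the tiny `VOCABULARY` list and the `suggest` function are hypothetical names used only for illustration. A real checker would load a full dictionary.

```python
import difflib

# Hypothetical tiny word list; a real checker would load a full dictionary.
VOCABULARY = ["language", "natural", "processing", "sentence", "synonym", "phrase"]


def suggest(word, vocabulary=VOCABULARY, max_suggestions=3):
    """Return the word itself if it is known, else the closest known words."""
    lowered = word.lower()
    if lowered in vocabulary:
        return [lowered]
    # get_close_matches ranks candidates by similarity ratio, best first,
    # and drops anything scoring below the cutoff.
    return difflib.get_close_matches(
        lowered, vocabulary, n=max_suggestions, cutoff=0.7
    )


print(suggest("langauge"))
print(suggest("Phrase"))
print(suggest("zzzz"))
```

An empty list means no vocabulary entry was similar enough, which you could treat as "unknown word, no suggestion".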
NLTK is also good for developing sentiment analysis systems. I have some prototypes of that too.
Jaggu
If you are using deep learning based models and have sufficient data, you can implement task-specific models for any purpose. With the development of deep learning based language models, you can use word embedding based models with lexicon resources to obtain synonyms and antonyms. You can also follow the links below to obtain more resources.
https://stanfordnlp.github.io/CoreNLP/
https://www.nltk.org/
https://wordnet.princeton.edu/

A common set of problems to learn new languages

With "Polyglot" programming techniques becoming more relevant, it is almost a necessity to use the "right" PL for the problem. However, learning new languages takes time, which most project teams usually can't afford. What is the best way to learn a new programming language? Is there a common set of problems that can be solved to reach a certain level of competence?
Well, it depends what you want to do. (web, db, whatever).
Generally I'd want to know:
What's the library like, how do I reference it
What ORMs are there
What build/deployment platforms exist for it
How does it handle updates
How do I do general things, like:
DB Access
File things
Display UI's
and so on.
Really, learning is only by doing -- you need a project that you can use the given language for.
Project Euler is the first thing to come to mind as an oft-used set of problems to try in a new language, even if it's not something I've ever tried.
If the language is another JVM or CLR hosted one, the issues about learning the environment can be set aside -- you can use all your familiar APIs in your Clojure/Scala/F#... code -- and concentrate on the syntax and idiom.
Otherwise, you're probably using the new language because it has a good fit for the particular problem you want to solve (e.g. native code and functional -> Haskell; distributed and concurrent -> Erlang) so the fit of the feature set is known in advance but you have the extra load of learning the standard APIs. And that's what prototyping is for.
The book Programming Challenges and the associated website provide a large list of algorithmic problems, with automatic online judging in several languages (Java, C, C++). Any algorithm textbook can give you lots of examples of basic data structures and procedures to try and implement, which is often a nice way to get some practice with basic language syntax and features. My personal favourite for this is The Algorithm Design Manual, which is language agnostic, but there are plenty of good language-specific books available as well (Mastering Algorithms in Perl or Data Structures and Algorithms in Java, for example).
If you're interested in a general set of mathematical problems to try and solve, Project Euler is a great resource.
For more day to day problems, I find the cookbook approach most helpful. For example, both Perl and Python have excellent O'Reilly cookbooks, as well as online resources, which provide short examples of many common and important problems. As mentioned in another answer, the key here is to find canonical examples of basic features you will need, particularly by leveraging what's available in standard libraries. I usually try and build up my own small library of examples as I go along, e.g. a socket example, a DB access example, a file reading example, a simple numerical solver, etc, which I then pillage for ideas when it's time to write production code.
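As an illustration of the kind of small warm-up problem mentioned above, here is a sketch of Project Euler's first problem (sum all the natural numbers below a limit that are multiples of 3 or 5), written in Python. The function name and the `divisors` parameter are my own choices; the point is that a few lines like this already exercise loops or generators, conditionals, functions, and default arguments in whatever language you are learning.

```python
def sum_of_multiples(limit, divisors=(3, 5)):
    """Sum every natural number below `limit` divisible by any of `divisors`."""
    return sum(
        n for n in range(1, limit)
        if any(n % d == 0 for d in divisors)
    )


# Small case that is easy to check by hand: 3 + 5 + 6 + 9 = 23
print(sum_of_multiples(10))
# The limit used in the actual problem statement
print(sum_of_multiples(1000))
```

Rewriting the same snippet in each new language, then saving it alongside a file reading example, a DB access example, and so on, is one way to build up the personal example library described above.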

Language Books/Tutorials for popular languages

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
It wasn't that long ago that I was a beginning coder, trying to find good books/tutorials on languages I wanted to learn. Even still, there are times I need to pick up a language relatively quickly for a new project I am working on. The point of this post is to document some of the best tutorials and books for these languages. I will start the list with the best I can find, but hope you guys out there can help with better suggestions/new languages. Here is what I found:
Since this is now wiki editable, I am giving control up to the community. If you have a suggestion, please put it in this section. I decided to also add a section for general be a better programmer books and online references as well. Once again, all recommendations are welcome.
General Programming
Online Tutorials
Foundations of Programming by Karl Seguin - From CodeBetter; it's C#-based but the ideas ring true across the board. Can't believe no one's posted this yet, actually.
How to Write Unmaintainable Code - An anti-manual that teaches you how to write code in the most unmaintainable way possible. It would be funny if a lot of these suggestions didn't ring so true.
The Programming Section of Wiki Books - suggested by Jim Robert as having a large amount of books/tutorials on multiple languages in various stages of completion
Just the Basics To get a feel for a language.
Books
Code Complete - This book goes without saying; it is truly brilliant in too many ways to mention.
The Pragmatic Programmer - The next best thing to working with a master coder, teaching you everything they know.
Mastering Regular Expressions - Regular expressions are an essential tool in every programmer's toolbox. This book, recommended by Patrick Lozzi, is a great way to learn what they are capable of.
Algorithms in C, C++, and Java - A great way to learn all the classic algorithms if you find Knuth's books a bit too in depth.
C
Online Tutorials
This tutorial seems to be pretty concise and thorough; I looked over the material and it seems pretty good. Not sure how friendly it would be to new programmers though.
Books
K&R C - a classic for sure. It might be argued that all programmers should read it.
C Primer Plus - Suggested by Imran as being the ultimate C book for beginning programmers.
C: A Reference Manual - A great reference recommended by Patrick Lozzi.
C++
Online Tutorials
The tutorial on cplusplus.com seems to be the most complete. I found another tutorial here but it doesn't include topics like polymorphism, which I believe is essential. If you are coming from C, this tutorial might be the best for you.
Another useful tutorial is C++ Annotation. On the Ubuntu family you can get the ebook in multiple formats (PDF, txt, PostScript, and LaTeX) by installing the c++-annotation package from Synaptic (the installed package can be found in /usr/share/doc/c++-annotation/).
Books
The C++ Programming Language - crucial for any C++ programmer.
C++ Primer Plus - Originally added as a typo, but the Amazon reviews are so good, I am going to keep it here until someone says it is a dud.
Effective C++ - Ways to improve your C++ programs.
More Effective C++ - Continuation of Effective C++.
Effective STL - Ways to improve your use of the STL.
Thinking in C++ - Great book, both volumes. Written by Bruce Eckel and Chuck Allison.
Programming: Principles and Practice Using C++ - Stroustrup's introduction to C++.
Accelerated C++ - Andy Koenig and Barbara Moo - An excellent introduction to C++ that doesn't treat C++ as "C with extra bits bolted on", in fact you dive straight in and start using STL early on.
Forth
Books
FORTH, a text and reference. Mahlon G. Kelly and Nicholas Spies. ISBN 0-13-326349-5 / ISBN 0-13-326331-2. 1986 Prentice-Hall. Leo Brodie's books are good but this book is even better. For instance it covers defining words and the interpreter in depth.
Java
Online Tutorials
Sun's Java Tutorials - An official tutorial that seems thorough, but I am not a Java expert. You guys know of any better ones?
Books
Head First Java - Recommended as a great introductory text by Patrick Lozzi.
Effective Java - Recommended by pek as a great intermediate text.
Core Java Volume 1 and Core Java Volume 2 - Suggested by FreeMemory as some of the best java references available.
Java Concurrency in Practice - Recommended by MDC as great resource for concurrent programming in Java.
The Java Programming Language
Python
Online Tutorials
Python.org - The online documentation for this language is pretty good. If you know of any better let me know.
Dive Into Python - Suggested by Nickola. Seems to be a python book online.
Perl
Online Tutorials
perldoc perl - This is how I personally got started with the language, and I don't think you will be able to beat it.
Books
Learning Perl - a great way to introduce yourself to the language.
Programming Perl - widely referred to as the Perl Bible. An essential reference for any serious Perl programmer.
Perl Cookbook - A great book that has solutions to many common problems.
Modern Perl Programming - newly released, contains the latest wisdom on modern techniques and tools, including Moose and DBIx::Class.
Ruby
Online Tutorials
Adam Mika suggested Why's (Poignant) Guide to Ruby but after taking a look at it, I don't know if it is for everyone.
Found this site which seems to offer several tutorials for Ruby on Rails.
Books
Programming Ruby - suggested as a great reference for all things ruby.
Visual Basic
Online Tutorials
Found this site which seems to devote itself to visual basic tutorials. Not sure how good they are though.
PHP
Online Tutorials
The main PHP site - A simple tutorial that allows user comments for each page, which I really like.
PHPFreaks Tutorials - Various tutorials of different difficulty lengths.
Quakenet/PHP tutorials - PHP tutorial that will guide you from ground up.
JavaScript
Online Tutorials
Found a decent tutorial here geared toward non-programmers. Found another more advanced one here. Nickolay suggested A reintroduction to javascript as a good read here.
Books
Head first JavaScript
JavaScript: The Good Parts (with a Google Tech Talk video by the author)
C#
Online Tutorials
C# Station Tutorial - Seems to be a decent tutorial that I dug up, but I am not a C# guy.
C# Language Specification - Suggested by tamberg. Not really a tutorial, but a great reference on all the elements of C#
Books
C# to the point - suggested by tamberg as a short text that explains the language in amazing depth
ocaml
Books
nlucaroni suggested the following:
OCaml for Scientists
Introduction to ocaml
Using, Understanding, and Unraveling the OCaml Language: From Practice to Theory and Vice Versa
Developing Applications using Ocaml - O'Reilly
The Objective Caml System - Official Manual
Haskell
Online Tutorials
nlucaroni suggested the following:
Explore functional programming with Haskell
Books
Real World Haskell
Total Functional Programming
LISP/Scheme
Books
wfarr suggested the following:
The Little Schemer - Introduction to Scheme and functional programming in general
The Seasoned Schemer - Followup to Little Schemer.
Structure and Interpretation of Computer Programs - The definitive book on Lisp (also available online).
Practical Common Lisp - A good introduction to Lisp with several examples of practical use.
On Lisp - Advanced Topics in Lisp
How to Design Programs - An Introduction to Computing and Programming
Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp - an approach to high quality Lisp programming
What about you guys? Am I totally off on some of these? Did I leave out your favorite language? I will take the best comments and modify the question with the suggestions.
I know this is going to seem old-fashioned, but I don't think much of using online tutorials to learn programming languages or platforms. These generally give you no more than a little taste of the language. To really learn a language, you need the equivalent of a "book", and in many cases, this means a real dead-tree book.
If you want to learn C, read K&R. If you want to learn C++, read Stroustrup. If you want to learn Lisp/Scheme, read SICP. Etc.
If you're not willing to spend more than $30 and a few hours to learn a language, you probably aren't going to learn it.
These are all really good and written by academics, and some are books (for example, an unpublished O'Reilly book, translated from French, with no issues that I've found). I've starred (*) my favorites, the ones that helped me the most.
ocaml :
*Introduction to ocaml
Using, Understanding, and Unraveling the OCaml Language: From Practice to Theory and Vice Versa
*Developing Applications using Ocaml - O'Reilly
The Objective Caml System - Official Manual
A Concise Introduction to Objective Caml
Practical Ocaml
Haskell :
Explore functional programming with Haskell
*Real World Haskell
*Total Functional Programming
For C#:
CLR via C#
C# in Depth
For C++, I suggest Accelerated C++ by Koenig and Moo as a beginning text, though I don't know how it would be for an absolute novice. It focuses on using the STL right away, which makes getting things done much easier.
Haskell:
O'Reilly Book:
Real World Haskell, a great tutorial-oriented book on Haskell, available online and in print.
My favorite general, less academic online tutorials:
The Haskell wikibook which contains all of the excellent Yet Another Haskell Tutorial. (This tutorial helps with specifics of setting up a Haskell distro and running example programs, for example.)
Learn you a Haskell for Great Good, in the spirit of Why's Poignant Guide to Ruby but more to the point.
Write yourself a Scheme in 48 hours. Get your hands dirty learning Haskell with a real project.
Books on Functional Programming with Haskell:
Lambda calculus, combinators, more theoretical, but in a very down to earth manner: Davie's Introduction to Functional Programming Systems Using Haskell
Laziness and program correctness, thinking functionally: Bird's Introduction to Functional Programming Using Haskell
Effective Java is a must but I recommend being comfortable with Java first to fully understand the examples.
Ruby
The Free Ruby on Rails Training Online Course by Sang Shin isn't too bad. It also has a decent amount of further reading links on each subject in the course.
I'd add Bruce Eckel's programming books:
Thinking in Java (print version: 4th edition; 3rd. ed. is online: http://www.mindview.net/Books/TIJ/)
Thinking in C++ (2nd ed, freely available online: http://mindview.net/Books/TICPP/ThinkingInCPP2e.html)
In general, his "Books" page (http://mindview.net/Books/) is a good resource. The freely available books can also be found at http://www.ibiblio.org/pub/docs/books/eckel/
Can't believe nobody has mentioned the Perl Best Practices. There's also a Twitter feed that delivers one PBP per day.
I learned Perl from Robert's Perl Tutorial, which I recommend, but it hasn't been updated since 1999. A newer recommended tutorial is Steve's Perl Tutorial.
For web development with Perl, the clear winner is Catalyst, and the Catalyst wiki is the starting point for learning.
For Lisp and Scheme (hell, functional programming in general), there are few things that provide a more solid foundation than The Little Schemer and The Seasoned Schemer. Both provide a very simple and intuitive introduction to both Scheme and functional programming that proves far simpler for new students or hobbyists than any of the typical volumes that rub off like a nonfiction rendition of War & Peace.
Once they've moved beyond the Schemer series, SICP and On Lisp are both fantastic choices.
Check out the programming section of Wikibooks.
Many of them are fully formed, and quite a few have more advanced sections (which are in varying states of completion) on specific functionality.
Also, W3Schools has a great PHP tutorial and reference section.
Their HTML and CSS sections are good for reference too.
C++
Thinking in C++ by Bruce Eckel
C++ Coding Standards by Herb Sutter & Andrei Alexandrescu
The first one is good for beginners and the second one requires a more advanced level of C++.
C - The C Programming Language - Obviously I had to reference K&R, one of the best programming books out there full stop.
C++ - Accelerated C++ - This clear, well written introduction to C++ goes straight to using the STL and gives nice, clear, practical examples. Lives up to its name.
C# - Pro C# 2008 and the .NET 3.5 Platform - Bit of a mouthful but wonderfully written and huge depth.
F# - Expert F# - Designed to take experienced programmers from zero to expert in F#. Very well written, and one of the authors invented F# so you can't go far wrong!
Scheme - The Little Schemer - Really unique approach to teaching a programming language done really well.
Ruby - Programming Ruby - Affectionately known as the 'pick axe' book, this is THE de facto introduction to Ruby. Very well written, clear and detailed.
For Javascript:
Javascript: The Definitive Guide
Pro Javascript Techniques
For PHP:
PHP Objects, Patterns, and Practice
For OO design & programming, patterns:
Object-Oriented Software Construction (a bible, maybe the Head First OO would be nice, I don't know it)
Head First Design Patterns (I so love this book)
Design Patterns
For Refactoring:
Refactoring: Improving the Design of Existing Code
Working Effectively with Legacy Code
For SQL/MySQL:
Joe Celko: Trees and Hierarchies in SQL (only on a specific subject, but I found it interesting)
Pro MySQL
C Primer Plus, 5th Edition - The C book to get if you're learning C without any prior programming experience. It's a personal favorite of mine as I learned to program from this book. It has all the qualities a beginner friendly book should have:
Doesn't assume any prior exposure to programming
Enjoyable to read (without becoming annoying like For Dummies /
Doesn't oversimplify
Let's not forget Head First Java, which could be considered the essential first step in this language or maybe the step after the online tutorials by Sun. It's great for the purpose of grasping the language concisely, while adding a bit of fun, serving as a stepping stone for the more in-depth books already mentioned.
Sedgewick offers great series on Algorithms which are a must-have if you find Knuth's books to be too in-depth. Knuth aside, Sedgewick brings a solid approach to the field and he offers his books in C, C++ and Java. The C++ books could be used backwardly on C since he doesn't make a very large distinction between the two languages in his presentation.
Whenever I'm working in C, C: A Reference Manual, by Harbison and Steele, goes with me everywhere. It's concise and efficient while being extremely thorough, making it priceless (to me anyway).
Languages aside, if this thread is to become a go-to for references, which I think it is given the number of solid contributions, please include Mastering Regular Expressions, for reasons I think most of us are aware of. Some would also say that regex can be considered a language in its own right. Further, its usefulness in a wide array of languages makes it invaluable.
Common Lisp
For a good reference of CL check out Common Lisp the Language, 2nd Edition
For Objective C:
Cocoa Programming for Mac OSX - Third Edition
Aaron Hillegass
Published by Addison Wesley
Programming in Objective C,
Stephen G Kochan,
Head First Javascript is a good intro to JS for beginning programmers - it creatively explains basic programming concepts using JS syntax. The Head First series is based on researched techniques for helping you learn and remember new information. They have you do a lot of exercises and puzzles which might seem juvenile, but really help cement the knowledge in your brain.
One exercise I really liked was after they explained data types, they show a picture of a city street and say "label all the data types you can find in this picture." So the blinker on a car is a boolean, the sign on the store is a string, and the address is a number. That helped me get the idea of how to translate real information into a program.
Based only on this book, I'd say the Head First series is a great way to learn something the first time, but the story-like format they have would make them difficult to use as references.
The Ruby Way by Hal Fulton
Python: http://diveintopython.net/
JS: a re-introduction to JavaScript is the introduction to the language (not the browser specifics) for programmers. Don't know a good tutorial on JS in browser.
Great idea by the way!
Given recent developments I think it's important to include the recent explosion of free online course offerings from universities and private companies. The New Boston is a tutorial site I've used for numerous languages for years; it's a great starting point for beginners.
http://www.udacity.com/
https://www.coursera.org/
http://www.coursehero.org/
http://www.codecademy.com/
http://mitx.mit.edu/
http://www.khanacademy.org/
http://thenewboston.org/
I second Kristopher's recommendation of K&R for C.
I've found the "Essential Actionscript 2.0" book quite useful for AS coding (there's an AS3 version out now I believe).
I've found that having real books to thumb through is more helpful than an online reference in some cases. Not really sure why though.
Hmm, I don't know if I would say that online materials are useless, but I do agree that there is something about books. Maybe they are better written, or maybe it is the act of forking over $50 that makes you more inclined to study the material.
Either way, I agree that books should be part of this question. If anyone has any suggestions for books for languages I will edit the post with the best suggestions.
The reference you have listed for Ruby is for Ruby on Rails. While still ruby deep down, it is definitely not a place to start for people wanting to learn Ruby.
For Ruby tutorials, I would suggest Why's (Poignant) Guide to Ruby as a great starting point for anyone interested in the language.
If you would want to get into more detail, I would recommend the book Programming Ruby, which has become the standard for all things Ruby. The third edition is currently being written, highlighting Ruby 1.9 features, so I would hold off for a while if anyone is considering buying this book.
For J2EE you have a very comprehensive tutorial at:
http://java.sun.com/javaee/5/docs/tutorial/doc/
For Java, I highly recommend Core Java. It's a large tome (or two large tomes), but I've found it to be one of the best references on Java I've read.
I know this is a cross post from here... but, I think one of the best Java books is Java Concurrency in Practice by Brian Goetz. A rather advanced book - but, it will wear well on your concurrent code and Java development in general.
The de facto standard for learning Grails is the excellent Getting Started with Grails by Jason Rudolph. You can debate whether it is an online tutorial or a book since it can be purchased but is available as a free download. There are more "real" books being published and I recommend Beginning Groovy and Grails.
C#
C# to the Point by Hanspeter Mössenböck. In a mere 200 pages he explains C# in astonishing depth, focusing on underlying concepts and concise examples rather than hand waving and Visual Studio screenshots.
For additional information on specific language features, check the C# language specification ECMA-334.
Framework Design Guidelines, a book by Krzysztof Cwalina and Brad Abrams from Microsoft, provides further insight into the main design decisions behind the .NET library.

Resources