The Microsoft Translator API defaults to statistical machine translation (SMT). This is lower quality than the more modern deep neural network (DNN) translation, which is also available for many languages.
I am able to get translations to work with SMT. However, I'm unable to figure out how to get DNN to work. Microsoft's own documentation provides no information on this.
Does anyone have experience getting this to work?
Translations to and from Chinese and Hindi are NN by default. For the other 18 languages supported, just add the parameter “category=generalnn” to tell our service to use the NN models instead of the SMT ones. More details on the languages supported and on hybrid translations can be found on the Translator blog: https://blogs.msdn.microsoft.com/translation/2017/11/15/microsoft-translator-accelerates-use-of-neural-networks-across-its-offerings/
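As a rough illustration of where that parameter goes, here is a minimal sketch that only builds the request URL. The endpoint shown, the helper name build_translate_url, and the "text", "to" and "from" parameter names are my assumptions about the V2 HTTP interface, not something stated in the answer; only category=generalnn comes from it. Authentication is left out entirely.

```python
from urllib.parse import urlencode

# Assumed V2 HTTP endpoint; verify against the current Translator documentation.
TRANSLATE_ENDPOINT = "https://api.microsofttranslator.com/V2/Http.svc/Translate"


def build_translate_url(text, to_lang, from_lang=None, use_neural=True):
    """Build a Translate request URL, optionally asking for the NN models.

    Hypothetical helper: it only assembles the query string and sends nothing.
    """
    params = {"text": text, "to": to_lang}
    if from_lang:
        params["from"] = from_lang
    if use_neural:
        # The parameter named in the answer above.
        params["category"] = "generalnn"
    return TRANSLATE_ENDPOINT + "?" + urlencode(params)


url = build_translate_url("Hello world", "de", from_lang="en")
print(url)
```

You would still need to send this with your usual authentication header and parse the response as you already do for SMT.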
I want to build my own language model using OpenAI GPT. How can I build domain-specific language models with OpenAI GPT for natural language generation?
You have to fill out the waitlist form and be accepted by OpenAI.
More than that, however, you likely need an idea that OpenAI finds worth considering, a personal connection to the team, or a history of building interesting things in the field of natural language generation.
There are tens of thousands of people who would like access to GPT-3, so you have to think carefully about why OpenAI would grant you, rather than someone else, access to a scarce resource. Do you have a PhD? Are you improving the world somehow? Will you make them tons of money? These are some of the points you may need to address when communicating your idea.
Has anyone implemented a chatbot using Dialogflow for Finnish?
I know it is not supported natively yet, but I haven't seen any clear roadmap for the languages that will be supported in the coming months. Any feedback or information on this?
If not Dialogflow, what other NLP platform would you recommend for implementing a Finnish chatbot?
Also, would it be a good idea to administer the intents in English and use a translator API to translate user text from Finnish to English, then do the intent matching in Dialogflow? Obviously the matched intent response would have to be translated back to Finnish before it is delivered to the customer.
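To make the idea concrete, the round trip could be sketched roughly like this. Everything here is hypothetical: answer_in_finnish, fake_translate and fake_match_intent are toy stand-ins for a real translation service and for Dialogflow intent matching, not calls from either product.

```python
def answer_in_finnish(user_text, translate, match_intent):
    """Finnish in, Finnish out, with intent matching done in English."""
    english_text = translate(user_text, source="fi", target="en")
    english_reply = match_intent(english_text)
    return translate(english_reply, source="en", target="fi")


# Toy lookup tables standing in for a real translation API (assumption).
_FI_EN = {"hei": "hello"}
_EN_FI = {"hello! how can i help?": "Hei! Kuinka voin auttaa?"}


def fake_translate(text, source, target):
    table = _FI_EN if (source, target) == ("fi", "en") else _EN_FI
    # Fall back to the untranslated text when the toy table has no entry.
    return table.get(text.lower(), text)


def fake_match_intent(english_text):
    # Stand-in for the English intent matching step (assumption).
    replies = {"hello": "Hello! How can I help?"}
    return replies.get(english_text.lower(), "Sorry, I did not understand.")


reply = answer_in_finnish("Hei", fake_translate, fake_match_intent)
print(reply)
```

Passing the two services in as functions keeps the flow testable without network access and makes it easy to swap providers.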
Regards,
Ujjwal
Edit: Support for Finnish has now been added.
Dialogflow currently doesn't list Finnish in its language list. (If your use case is urgent, you shouldn't wait for support to be added in the near future.)
Necessity is the mother of invention
Before proceeding: I wouldn't recommend using translator APIs to convert between English and Finnish. The model will not be trained the way you want, because the relationships between words differ greatly across languages.
NLTK is a great NLP library with all the features you need to develop the chatbot in Finnish, including stemming for Finnish.
Note: Apart from NLTK, spaCy and TextBlob are also great NLP libraries you can use. If a library doesn't support a particular language, you can train on Unicode text yourself.
Also, you could use various openly available modules to develop your bot, like this one: https://github.com/TurkuNLP/Finnish-dep-parser
With this in mind, NLP combined with basic Word2Vec and Markov models (you can find many options on the internet) will help you build the chatbot you want.
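As a minimal sketch of the Markov model part, here is a first-order word chain in plain Python. The helper names build_chain and generate and the two-sentence Finnish corpus are made up for illustration; a real bot would need far more data and some notion of context.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=10, seed=None):
    """Walk the chain from `start` until a dead end or max_words is reached."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Tiny illustrative corpus (assumption); Python strings are Unicode by default.
corpus = ["hyvää huomenta kaikille", "hyvää iltaa kaikille"]
chain = build_chain(corpus)
sentence = generate(chain, "hyvää", seed=0)
print(sentence)
```

Word2Vec would then come in on the understanding side, for example to match a user sentence to the closest known intent, which this sketch does not cover.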
Cheers to building your chatbot
Dialogflow ES and Dialogflow CX now support Finnish.
I am about to create a chatbot using NLP or DNN methods. The chatbot should reply to the user's sentence based on a knowledge base (possibly stored in a database).
I found that LSA/PLSA can be used, but I want to explore other methods as well. Recently I came across DSSM (Deep Structured Semantic Model) as a possible alternative. For anyone with expertise in this area: is this a method I can use, or would you suggest other methods?
By the way, after reading some articles about DSSM, I am confused about the negative samples used when training it. If you suggest DSSM, please help explain them.
Thank you all very much.
Correct me if I'm wrong, but I don't think we have any DSSM platforms available for general use to us as chatbot developers, just yet.
At least not like the currently available language interpreter platforms (like IBM Watson, Microsoft LUIS).
I would suggest starting with one of the popular platforms available now and continuing research / watching for developments in deep learning.
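On the negative-samples question, as far as I understand the DSSM training objective: each query is paired with one relevant (positive) document and a few documents sampled from the ones that were not relevant, and the model is trained so that a softmax over scaled cosine similarities puts most of its mass on the positive one. The sketch below computes that loss for already-embedded vectors in plain Python. The function names and the gamma value are my own assumptions, and the deep network that produces the vectors is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dssm_loss(query_vec, positive_vec, negative_vecs, gamma=10.0):
    """Negative log-probability of the positive document.

    The softmax runs over the positive plus the sampled negatives;
    gamma is a smoothing factor (the value here is an assumption).
    """
    sims = [cosine(query_vec, positive_vec)]
    sims += [cosine(query_vec, neg) for neg in negative_vecs]
    scaled = [gamma * s for s in sims]
    top = max(scaled)  # subtract the max for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scaled))
    return log_denominator - scaled[0]


# Positive close to the query, negatives far away: loss near zero.
good = dssm_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
# "Positive" unrelated to the query while a negative matches it: large loss.
bad = dssm_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(good, bad)
```

Without the negatives, the denominator would contain only the positive term and the loss would be zero whatever the network outputs, which is why they are needed.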
I'm looking for something along the lines of a domain model for a Wikipedia-style site. I'm interested in learning more about how it works under the hood, and how I can borrow some of its versioning principles for use in my own products.
Please note I am not looking for a versioning class or plugin that I can incorporate into my project. I would rather find some high-level literature that explains the objects in the domain and how they interact with each other to create a wiki-style platform.
I've done a lot of work myself in crafting a domain model, but I'm at a point where I would like some reinforcement that what I'm doing is on the right path.
I am a newbie when it comes to information extraction. For the past several days, I have read a lot of academic papers and ordered a book on NLP. I want to figure out how I can build a FlipDog.com-like system (hopefully not from scratch). They extract job openings from more than 60,000 company web sites. How do I get started?
I am open to learning any programming language. Has anybody used Mallet/GATE/MinorThird or RoadRunner? Ideally, I want to be able to train a system with the data set particular to my domain and have it extract information based on that. Which platform would you recommend for this purpose?
Thanks!
The fastest way to extract job offerings is to use dapper.net (a web scraping service). You can easily teach Dapper to extract data using its visual editor. It works very well when your target websites present the data in tables.
To learn information extraction, I suggest starting with LingPipe. It is a Java framework for information extraction, and it does not require you to learn a framework-specific architecture first, as GATE or Apache UIMA do. On the LingPipe website you will find many tutorials that will help you learn various information extraction approaches. After that, I suggest learning GATE and UIMA.
If you want to build such a website, you also need to learn how to use web crawler frameworks (e.g., Nutch), web search engines (Yahoo, Google, Bing), and information retrieval engines (such as Apache Lucene) to provide a search service on top of the extracted data.
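To give a feel for what an information retrieval engine does with the extracted data, here is a toy inverted index in plain Python. It is only a sketch of the general idea, not how Lucene is implemented, and the build_index and search helpers and the sample job postings are invented for illustration.

```python
from collections import defaultdict


def build_index(postings):
    """Map each lowercase token to the set of posting ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in postings.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of postings that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Sample extracted job postings (made up).
jobs = {
    1: "Java developer Helsinki",
    2: "Python developer Berlin",
    3: "Python data engineer Helsinki",
}
index = build_index(jobs)
hits = search(index, "python helsinki")
print(hits)
```

A real engine adds tokenization rules, ranking and persistent storage on top of this basic structure.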
Update:
For Python, it is best to start with NLTK: http://www.nltk.org/