Matchbox recommender in azure machine learning workspace - azure-machine-learning-service

The Matchbox recommender available in http://studio.azureml.net doesn't seem to have a counterpart in http://ml.azure.com (which appears to be the newer portal for Azure ML). There, only the plain SVD recommender is available, which doesn't take user or item features, so functionality has been lost relative to Matchbox.
Is there an ETA for when Matchbox will be made available in Azure Machine Learning service, either via the SDK or the designer?
Thanks.

We don't have a plan to bring back Matchbox yet. If you are looking for recommender algorithms, this repo could be a good reference for best practices: https://github.com/Microsoft/Recommenders. Please let me know if this unblocks you or if you are looking for something specific in Matchbox.

Related

Can I build ML models with Microsoft Azure and then export them for use off the cloud?

I have recently been doing some work with time-series analysis and Microsoft Azure has some good resources for building models. I've never worked on anything like this, or for that matter, with Microsoft Azure before (I'm a student - sorry for the lack of experience!)
Is it possible to build a model on Azure (specifically, I'm interested in building a multivariate time-series analysis model) and then export it to be run on my own hardware? I'm not really interested in renting cloud space to run it.
Any advice or insight would be great - thanks!
Yes, you can do that. Once you build a model with an experiment, there is a model tab on the portal that allows you to download the model.
The following examples provide some guidance on deploying to local machines:
How to deploy models trained with Azure Machine Learning on your local machines, and Introducing Multivariate Anomaly Detection.
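As a rough illustration of what "running it on your own hardware" can look like, here is a minimal sketch. It assumes the downloaded artifact is a pickled Python object, which is common but depends on how the model was saved; the file name `model.pkl`, the `window` parameter and the `predict` helper are hypothetical stand-ins, not anything Azure produces.

```python
import os
import pickle
import tempfile

# Hypothetical "trained model": parameters of a tiny moving-average forecaster.
trained_model = {"window": 3}

# Stand-in for the download step: write the artifact to a local file.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)


def load_model(path):
    # Only unpickle files you trust; pickle can execute arbitrary code.
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, series):
    # Forecast the next value as the mean of the last `window` points.
    recent = series[-model["window"]:]
    return sum(recent) / len(recent)


local_model = load_model(model_path)
forecast = predict(local_model, [10.0, 12.0, 11.0, 13.0])
print(forecast)
```

Nothing in the scoring path touches the cloud; the only requirement is that the local environment has the same libraries the model was trained with.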

Design for a Cloud Native Application in Azure for ML Insights and Actions

I have an idea whereby I intend to build a cloud native application for algorithmic trading, ideally by consuming all PaaS and SaaS (no IaaS), and I'd like to get some feedback on how I intend to build it. The concept is pretty straight-forward in that I intend to consume financial trading data from an external SaaS solution via an API query, feed that data into various Azure PaaS solutions (most notably ML for modeling), and then take some action. Here is a high-level diagram I've come up with so far:
[Diagram: Solution Overview]
As a note, while I'm familiar with Azure, I'm not an Azure cloud engineer and have limited experience in actually building solutions myself. Consequently, I intend to use this project as a foundation to further educate myself.
When starting on the build, I immediately questioned whether or not I should use Event Hubs. Conceptually it makes sense, in that I'm decoupling the production of a data stream from its consumption. Presumably, this means fewer complications when or if I need to update the data feed(s) in the future. I also thought about where the data is stored: should it be a SQL database or, more simply, an Azure Table? The idea here is that the trading data will need to be stored for regression testing as I iterate through my models. All that said, I'm looking for insights from anybody who has experience in this space.
Thanks!
There's no real question in here. Take a look at the reference architectures provided by Microsoft: https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/
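For what it's worth, the decoupling idea from the question can be sketched in a few lines. This is only an in-process illustration: `queue.Queue` stands in for an event stream such as Event Hubs and an in-memory SQLite table stands in for the SQL database; the `ticks` table and the sample data are made up.

```python
import queue
import sqlite3
import threading

# In-process stand-in for an event stream (assumption: a real deployment
# would use an Event Hubs client here instead of queue.Queue).
stream = queue.Queue()
STOP = object()  # sentinel telling the consumer to shut down


def produce(ticks):
    """Producer: pushes raw ticks and knows nothing about storage."""
    for tick in ticks:
        stream.put(tick)
    stream.put(STOP)


def consume(conn):
    """Consumer: reads ticks and persists them for later regression testing."""
    while True:
        tick = stream.get()
        if tick is STOP:
            break
        conn.execute(
            "INSERT INTO ticks (symbol, price) VALUES (?, ?)",
            (tick["symbol"], tick["price"]),
        )
    conn.commit()


conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE ticks (symbol TEXT, price REAL)")

sample = [{"symbol": "ABC", "price": 10.5}, {"symbol": "ABC", "price": 10.7}]
worker = threading.Thread(target=consume, args=(conn,))
worker.start()
produce(sample)
worker.join()

stored = conn.execute("SELECT symbol, price FROM ticks ORDER BY rowid").fetchall()
print(stored)
```

Because the producer only knows about the stream, swapping the data feed later means changing `produce` and nothing else, which is the benefit the question was after.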

Productionizing Spark Pipeline [duplicate]

I'm evaluating tools for production ML-based applications, and one of our options is Spark MLlib, but I have some questions about how to serve a model once it's trained.
For example in Azure ML, once trained, the model is exposed as a web service which can be consumed from any application, and it's a similar case with Amazon ML.
How do you serve/deploy ML models in Apache Spark?
On the one hand, a machine learning model built with Spark can't be served the way you serve models in Azure ML or Amazon ML.
Databricks claims to be able to deploy models using its notebooks, but I haven't actually tried that yet.
On the other hand, you can use a model in three ways:
Training on the fly inside an application then applying prediction. This can be done in a spark application or a notebook.
Train a model and save it (if it implements an MLWriter), then load it in an application or a notebook and run it against your data.
Train a model with Spark and export it to PMML format using jpmml-spark. PMML allows different statistical and data mining tools to speak the same language, so a predictive solution can be moved among tools and applications without custom coding, e.g. from Spark ML to R.
Those are the three possible ways.
Of course, you can think of an architecture in which you have a RESTful service, built for example with spark-jobserver, to train and deploy, but that needs some development. It's not an out-of-the-box solution.
You might also use projects like Oryx 2 to create a full lambda architecture to train, deploy and serve a model.
Unfortunately, describing each of the solutions mentioned above is quite broad and doesn't fit in the scope of SO.
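To make the "RESTful service in front of a model" idea a bit more concrete, here is a minimal standard-library sketch. It does not use Spark or spark-jobserver; the `predict` function is a hypothetical stand-in for a model loaded at startup, and the `/score` path and payload shape are assumptions made for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Hypothetical stand-in for a loaded model: a fixed linear scorer.
    weights = [2.0, 1.0]
    return sum(w * x for w, x in zip(weights, features))


class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and score it.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), ScoreHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: any application that can speak HTTP can consume the model.
client = http.client.HTTPConnection("127.0.0.1", server.server_port)
client.request(
    "POST",
    "/score",
    body=json.dumps({"features": [1.0, 3.0]}),
    headers={"Content-Type": "application/json"},
)
result = json.loads(client.getresponse().read())
client.close()
server.shutdown()
server.server_close()
print(result)
```

A real service would add input validation, error handling and a proper WSGI/ASGI server, which is the "some development" mentioned above.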
One option is to use MLeap to serve a Spark PipelineModel online with no dependencies on Spark/SparkContext. Not having to use the SparkContext is important as it will drop scoring time for a single record from ~100ms to single-digit microseconds.
In order to use it, you have to:
Serialize your Spark Model with MLeap utilities
Load the model in MLeap (does not require a SparkContext or any Spark dependencies)
Create your input record in JSON (not a DataFrame)
Score your record with MLeap
MLeap is well integrated with all the Pipeline Stages available in Spark MLlib (with the exception of LDA at the time of this writing). However, things might get a bit more complicated if you are using custom Estimators/Transformers.
Take a look at the MLeap FAQ for more info about custom transformers/estimators, performances, and integration.
You are comparing two rather different things. Apache Spark is a computation engine, while the Amazon and Microsoft solutions you mention are services. These services might well have Spark with MLlib behind the scenes. They save you the trouble of building a web service yourself, but you pay extra.
A number of companies, like Domino Data Lab, Cloudera or IBM, offer products that you can deploy on your own Spark cluster to easily build a service around your models (with various degrees of flexibility).
Naturally, you can build a service yourself with various open source tools. Which ones specifically? It all depends on what you are after. How should users interact with the model? Should there be some sort of UI or just a REST API? Do you need to change some parameters of the model, or the model itself? Are the jobs more of a batch or a real-time nature? You can of course build an all-in-one solution, but that's going to be a huge effort.
My personal recommendation would be to take advantage, if you can, of one of the available services from Amazon, Google, Microsoft or whoever. Need on-premises deployment? Check Domino Data Lab; their product is mature and makes it easy to work with models (from building through deployment). Cloudera is more focused on cluster computing (including Spark), but it will take a while before they have something mature.
[EDIT] I'd recommend having a look at Apache PredictionIO, an open source machine learning server. It's an amazing project with lots of potential.
I have just been able to get this to work. Caveats: Python 3.6 + using the Spark ML API (not MLlib, but I'm sure it should work the same way)
Basically, follow this example provided on MSFT's AzureML github.
Word of warning: the code as-is will provision but there is an error in the example run() method at the end:
#Get each scored result
preds = [str(x['prediction']) for x in predictions]
result = ",".join(preds)
# you can return any data type as long as it is JSON-serializable
return result.tolist()
Should be:
#Get each scored result
preds = [str(x['prediction']) for x in predictions]
#result = ",".join(preds)
# you can return any data type as long as it is JSON-serializable
output = dict()
output['predictions'] = preds
return json.dumps(output)
Also, I completely agree with the assessment in the MLeap answer: it can make the process run much faster, but I thought I would answer the question specifically.
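For reference, here is a self-contained version of the corrected return logic that can be run without Spark. The original snippet fails because `result` is a plain `str` after `",".join(...)`, and strings have no `tolist()` method. The `rows` list below is a hypothetical stand-in for the collected prediction rows; in the real scoring script they come from the model output.

```python
import json


def run(predictions):
    """Serialize scored rows; `predictions` stands in for collected Spark rows."""
    # Get each scored result as a string.
    preds = [str(x["prediction"]) for x in predictions]
    # Return a JSON document rather than calling .tolist() on a str.
    output = {"predictions": preds}
    return json.dumps(output)


# Hypothetical scored rows, for illustration only.
rows = [{"prediction": 1.0}, {"prediction": 0.0}]
response = run(rows)
print(response)
```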

Azure Machine Learning Studio vs. Workbench

What is the difference between Azure Machine Learning Studio and Azure Machine Learning Workbench? What is the intended difference? And is it expected that Workbench is heading towards deprecation in favor of Studio?
I have gathered an assorted collection of differences:
Studio has a hard limit of 10 GB total input of training data per module, whereas Workbench has a variable limit by price.
Studio appears to have a more fully-featured GUI and user-friendly deployment tools, whereas Workbench appears to have more powerful / customizable deployment tools.
etc.
However, I have also found several scattered references claiming that Studio is a renamed update of Workbench, even though both services appear to still be offered.
For a fresh Data Scientist looking to adopt the Microsoft stack (potentially on an enterprise scale within the medium-term and for the long-term), which offering should I prefer?
Azure Machine Learning Workbench is a downloadable application, currently in preview. It provides a UI for many of the Azure Machine Learning CLI commands, particularly around experiment submission for Python-based jobs to DSVM or HDI. The Azure Machine Learning CLI is made up of many key functions, such as job submission and the creation of real-time web services. The Workbench installer provides a way to install everything required to participate in the preview.
Azure Machine Learning Studio is an older product, and provides a drag-and-drop interface for creating simple machine learning processes. It has limitations on the size of the data that can be handled (about 10 GB of processing). Lessons learned and customer requests based on this service have contributed to the design of the new Azure Machine Learning CLI mentioned above.
It should be added that Azure Machine Learning Workbench has been deprecated since September 2018 and has been replaced by the Azure Machine Learning service, which was made generally available in December 2018. The core functionality is still intact, but some major changes to point out about the architecture are:
A simplified Azure resources model
New portal UI to manage your experiments and compute targets
A new, more comprehensive Python SDK
A new expanded Azure CLI extension for machine learning

How to implement a bot engine like Wit.ai for an on-premise solution?

I want to build a chatbot for a customer service application. I tried SaaS services like Wit.Ai, Motion.Ai, Api.Ai, LUIS.ai etc. These cognitive services find the "intent" and "entities" when trained with the typical interactions model.
I need to build chatbot for on-premise solution, without using any of these SaaS services.
E.g., a typical conversation would be as follows:
Can you book me a ticket?
Is my ticket booked?
What is the status of my booking BK02?
I want to cancel the booking BK02.
Book the tickets
The Stanford NLP toolkit looks promising, but there are licensing constraints. Hence I started experimenting with OpenNLP. I assume there are two OpenNLP tasks involved:
Use 'Document Categorizer' to find out the intent
Use 'Named Entity Recognition' to find out entities
Once the context is identified, I will call my application APIs to build the response.
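The two-step approach (categorize the intent, then pull out the entities) can be sketched as follows. This is not OpenNLP: a keyword lookup stands in for a trained document categorizer and a regular expression stands in for named entity recognition. The intent names, keyword lists and the `BK` booking-id pattern are assumptions based on the sample conversation above.

```python
import re

# Hypothetical keyword lists per intent; a trained categorizer would replace these.
INTENT_KEYWORDS = {
    "cancel_booking": ["cancel"],
    "booking_status": ["status", "is my ticket booked"],
    "book_ticket": ["book"],
}

# Assumed booking-id format, taken from the example "BK02".
BOOKING_ID = re.compile(r"\bBK\d+\b")


def detect_intent(text):
    lowered = text.lower()
    # Dict order matters: more specific intents are checked before "book".
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "unknown"


def extract_entities(text):
    return {"booking_ids": BOOKING_ID.findall(text)}


def understand(text):
    # The resulting dict is what the application API layer would act on.
    return {"intent": detect_intent(text), **extract_entities(text)}


for utterance in [
    "Can you book me a ticket?",
    "Is my ticket booked?",
    "What is the status of my booking BK02?",
    "I want to cancel the booking BK02.",
]:
    print(utterance, "->", understand(utterance))
```

Keyword matching like this breaks down quickly on real user input, which is why a trained classifier is the usual choice for the first step.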
Is this the right approach?
How good is OpenNLP at parsing text?
Can I use Facebook FASTTEXT library for Intent identification?
Is there any other open source library which can be helpful in building the BOT?
Will "SyntaxNet" be useful for my adventure?
I prefer to do this in Java, but I'm open to a Node or Python solution too.
PS - I am new to NLP.
Have a look at this. It says it is an Open-source language understanding for bots and a drop-in replacement for popular NLP tools like wit.ai, api.ai or LUIS
https://rasa.ai/
Have a look at my other answer for a plan of attack when using Luis.ai:
Creating an API for LUIS.AI or using .JSON files in order to train the bot for non-technical users
In short, use Luis.ai and set up some intents; start with one or two and train it based on your domain. I am using ASP.NET to call the Cognitive Service API as outlined above. Then customize the response via some jQuery: you could search a list of your rules in a JavaScript array when each intent or action is raised by the response from Luis.
If your bot is English-based, then I would use OpenNLP's sentence parser to dump the customer input into a database (I do this today). I then use the OpenNLP tokenizer and push the keywords (less the stop words) and parts of speech into a database table for keyword analysis. I have a custom sentiment model built for OpenNLP that tags each sentence with a Pos, Neg or Neutral sentiment. You can then use this to identify negative customer service feedback. To build your own sentiment model, have a look at SentiWord.net and download their domain-agnostic data file to build and train an OpenNLP model, or have a look at this Node version:
https://www.npmjs.com/package/sentiword
Hope that helps.
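The pipeline described in this answer (split into sentences, tokenize, drop stop words, tag sentiment) can be illustrated with a small standard-library sketch. It does not use OpenNLP or a trained model; the stop-word set and the tiny sentiment lexicon are made up for the example, and a real lexicon would come from a data file like the one mentioned above.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "was", "my", "and", "to", "i", "it"}
# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"great": 1, "helpful": 1, "love": 1, "late": -1, "rude": -1, "broken": -1}


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keywords(sentence):
    # Lowercase word tokens with the stop words removed.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def sentiment(sentence):
    # Sum the lexicon scores of the keywords and map the total to a tag.
    score = sum(LEXICON.get(t, 0) for t in keywords(sentence))
    if score > 0:
        return "Pos"
    if score < 0:
        return "Neg"
    return "Neutral"


feedback = "The agent was rude and my ticket was late. The app is great! I booked a seat."
tagged = [(s, sentiment(s)) for s in split_sentences(feedback)]
for sentence, tag in tagged:
    print(tag, "|", sentence)
```

In the setup described above, the keywords and tags would be written to database tables rather than printed.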
I'd definitely recommend Rasa, it's great for your use case, working on-premise easily, handling intents and entities for you and on top of that it has a friendly community too.
Check out my repo for an example of how to build a chatbot with Rasa that interacts with a simple database: https://github.com/nmstoker/lockebot
I tried Rasa, but one glitch I found was its inability to answer unmatched/untrained user texts.
Now I'm using ChatterBot and I'm totally in love with it.
Use "ChatterBot", and host it locally using "flask-chatterbot-master".
Links:
ChatterBot Installation: https://chatterbot.readthedocs.io/en/stable/setup.html
Host Locally using - flask-chatterbot-master: https://github.com/chamkank/flask-chatterbot
Cheers,
Ratnakar
With the help of the Rasa and Botkit frameworks we can build an on-premise chatbot and NLP engine for any channel. Please follow this link for end-to-end steps on building one. It's an awesome blog that helped me create one for my office:
https://creospiders.blogspot.com/2018/03/complete-on-premise-and-fully.html
First of all, any chatbot is a program that runs alongside an NLP engine; it is the NLP that brings the knowledge to the chatbot, and the NLP in turn relies on machine learning techniques.
There are a few reasons why on-premise chatbots are less common:
We need to build the infrastructure
We need to train the model often
However, cloud-based NLP may not provide the required data privacy and security, and it leaves little flexibility for including your own business logic.
Altogether, the choice between on-premise and cloud depends on the needs and the use case.
In any case, please refer to these links for end-to-end guidance on building a fully customisable on-premise chatbot in very few steps:
Complete On-Premise and Fully Customisable Chat Bot - Part 1 - Overview
Complete On-Premise and Fully Customisable Chat Bot - Part 2 - Agent Building Using Botkit
Complete On-Premise and Fully Customisable Chat Bot - Part 3 - Communicating to the Agent that has been built
Complete On-Premise and Fully Customisable Chat Bot - Part 4 - Integrating the Natural Language Processor NLP
Disclaimer: I am the author of this package.
Abodit NLP (https://nlp.abodit.com) can do what you want but it's .NET only at present.
In particular you can easily connect it to databases and can provide custom Tokens that are queries against a database. It's all strongly-typed and adding new rules is as easy as adding a method in C#.
It's also particularly adept at turning date time expressions into queries. For example "next month on a Thursday after 4pm" becomes ((((DatePart(year,[DATEFIELD])=2019) AND (DatePart(month,[DATEFIELD])=7)) AND (DatePart(dw,[DATEFIELD])=4)) AND DatePart(hour,[DATEFIELD])>=16)
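To show the shape of that output, here is a small sketch that assembles a similar predicate from date components that have already been resolved. It is not Abodit's implementation and does no natural-language parsing; the helper names are hypothetical, and it parenthesizes every clause uniformly, so the result differs slightly from the expression quoted above.

```python
from functools import reduce


def date_part(part, field, op, value):
    # One comparison on a single component of the date column.
    # int() keeps the value numeric; `field` is assumed to be a trusted name.
    return f"(DatePart({part},[{field}]){op}{int(value)})"


def all_of(clauses):
    # Left-nested AND, mirroring the grouping in the expression above.
    return reduce(lambda left, right: f"({left} AND {right})", clauses)


def build_filter(field, year, month, weekday, min_hour):
    return all_of([
        date_part("year", field, "=", year),
        date_part("month", field, "=", month),
        date_part("dw", field, "=", weekday),
        date_part("hour", field, ">=", min_hour),
    ])


# Values taken from the example above ("next month on a Thursday after 4pm").
sql = build_filter("DATEFIELD", year=2019, month=7, weekday=4, min_hour=16)
print(sql)
```

The hard part, turning the phrase into those four numbers relative to today's date, is what the library itself is for.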

Resources