The Azure F0 (free) plan allows users to train their own model but does not allow them to deploy and use the trained model. So what is the purpose of allowing users to train a model if it cannot be used?
You can see the results of your training, as long as your training data is under 2 million characters. The results include a BLEU score and the translation of your test set.
I have this dataset in which the positive class consists of component failures for a specific component of the APS system.
I am doing Predictive Maintenance using Microsoft Azure Machine Learning Studio.
As you can see from the pictures below, I am using 4 algorithms: Logistic Regression, Random Forest, Decision Tree and SVM. You can also see that the output dataset of the Score Model node consists of 16k rows. However, when I look at the output of the Evaluate Model, the confusion matrix contains only 160 observations for Logistic Regression, while Random Forest shows the correct number, 16k. I have the same problem, only 160 observations, with the Decision Tree and SVM models. The same problem repeats in other experiments (for example after feature selection, normalization, etc.): some Evaluate Model nodes do not use all the rows of the test dataset, while others do.
How can I fix this problem? I am interested in the real numbers of false positives and false negatives.
The output metrics shown are computed on the validation set (hence names such as “validation metric” or “val-accuracy”), not on the original training set. They are calculated over the validation data only; otherwise we would inflate the model's apparent performance by evaluating it on data that was already used to train it.
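As a rough illustration of the point (generic scikit-learn with synthetic data, not the Azure ML Studio modules themselves):

    # Hypothetical sketch: metrics are reported on a held-out validation split,
    # not on the rows the model was fitted on.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, confusion_matrix

    X, y = make_classification(n_samples=16000, n_features=20, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Scoring on X_train would be optimistic; the displayed metrics correspond
    # to the validation split only.
    val_pred = model.predict(X_val)
    print(accuracy_score(y_val, val_pred))
    print(confusion_matrix(y_val, val_pred))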
I was taking a Udemy course, which made a strong case for normalizing only the training data (after the split from the test data), since the model will typically be used on fresh data with features on the scale of the original set. And if you scale the test data, then you are not scoring the model properly.
On the other hand, what I found was that my two-class logistic regression model (created with Azure Machine Learning Studio) was getting terrible results after Z-score scaling only the training data.
a. Is this a problem only with Azure's tools?
b. What is a good rule of thumb for when feature data needs to be scaled (one, two, or three orders of magnitude in difference)?
The idea that you are not scoring the model properly because the test set was normalized doesn't seem to make sense: you would presumably also normalize the data you use for predictions in the future.
I found a similar question on the Data Science Stack Exchange, and the top answer suggests not only that the test data has to be normalized, but that you need to apply exactly the same scaling as you applied to the training data, because the scale of your data is also taken into account by your model: differently scaled test/prediction data could lead to over- or under-exaggeration of a feature.
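A minimal sketch of that approach, assuming scikit-learn and synthetic data (none of these names come from the question):

    # Fit the scaler on the training split only, then apply the same fitted
    # transform to test/prediction data so everything stays on one scale.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    scaler = StandardScaler().fit(X_train)   # mean/std learned from training data only
    model = LogisticRegression().fit(scaler.transform(X_train), y_train)

    # Future prediction data gets the same treatment as the test set here.
    print(model.score(scaler.transform(X_test), y_test))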
I have built a machine learning model that tries to predict weather data; in this case I am predicting whether or not it will rain tomorrow (a binary Yes/No prediction).
In the dataset there are about 50 input variables, and I have 65,000 entries.
I am currently running an RNN with a single hidden layer of 35 nodes. I am using PyTorch's NLLLoss as my loss function and Adaboost for the optimization function. I've tried many different learning rates, and 0.01 seems to be working fairly well.
After running for 150 epochs, I notice that I converge to around 0.80 accuracy on my test data. I would like this to be even higher, but it seems the model is stuck oscillating around some sort of saddle point or local minimum. (A graph of this is below.)
What are the most effective ways to get out of this "valley" that the model seems to be stuck in?
I'm not sure why exactly you are using only one hidden layer, or what the shape of your history data is, but here are some things you can try:
Try more than one hidden layer
Experiment with LSTM and GRU layers, and combinations of these layers together with the RNN (see the sketch after this list).
Reconsider the shape of your data, i.e. how much history you look at to predict the weather.
Make sure your features are scaled properly since you have about 50 input variables.
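A minimal sketch of the first three suggestions, assuming PyTorch and that each sample is a short history window over the ~50 features (layer sizes and window length are illustrative only):

    import torch
    import torch.nn as nn

    # Hypothetical model: a 2-layer LSTM over a window of past observations,
    # with a linear head and log-softmax so it pairs with nn.NLLLoss.
    class RainLSTM(nn.Module):
        def __init__(self, n_features=50, hidden=64, n_layers=2):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, num_layers=n_layers, batch_first=True)
            self.head = nn.Linear(hidden, 2)

        def forward(self, x):                 # x: (batch, history_len, n_features)
            out, _ = self.lstm(x)
            return torch.log_softmax(self.head(out[:, -1]), dim=1)

    model = RainLSTM()
    dummy = torch.randn(8, 7, 50)             # batch of 8, 7-step history window
    print(model(dummy).shape)                 # torch.Size([8, 2])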
Your question is a little ambiguous, since you mention an RNN with a single hidden layer. Also, without knowing the entire neural network architecture, it is tough to say how you can bring in improvements. So I would like to add a few points.
You mentioned that you are using "Adaboost" as the optimization function, but PyTorch doesn't have any such optimizer. Did you try the SGD or Adam optimizers, which are very widely used?
Do you have any regularization term in the loss function? Are you familiar with dropout? Did you check the training performance? Does your model overfit?
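For the optimizer and regularization points, a hedged PyTorch sketch (the layer sizes echo the question; everything else is illustrative):

    import torch
    import torch.nn as nn

    # Dropout between layers plus Adam with weight_decay (L2 regularization),
    # in place of the non-existent "Adaboost" optimizer.
    model = nn.Sequential(
        nn.Linear(50, 35),
        nn.ReLU(),
        nn.Dropout(p=0.3),        # randomly zeroes 30% of activations during training
        nn.Linear(35, 2),
        nn.LogSoftmax(dim=1),     # pairs with nn.NLLLoss
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    criterion = nn.NLLLoss()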
Do you have a baseline model/algorithm so that you can compare whether 80% accuracy is good or not?
150 epochs just for a binary classification task looks like too much. Why don't you start from an off-the-shelf classifier model? You can find several examples of regression and classification in this tutorial.
So I have been playing around with Azure ML lately, and I have one dataset with multiple values I want to predict. Each of them uses a different algorithm, and when I try to train multiple models within one experiment, it says “train model can only predict one value”, and there are not enough input ports on the Train Model module to take in multiple values even if I were to use the same algorithm for each measure. I tried launching the column selector and making rules, but I get the same error. How do I predict multiple values and later put the predicted columns together for the web service output, so I don't have to have multiple APIs?
What you want to do is train each model and save them as already-trained models.
So create a new experiment, train your models, and save them by right-clicking on each model; they will show up in the left nav bar in the Studio. Now you can drag your models onto the canvas and have them score predictions, eventually combining them into the same output, as I have done in my example through the “Add Columns” module. I made this example for Ronaldo (the Real Madrid CF player), predicting how he will perform in a match after a training day. You can see my demo at http://ronaldoinform.azurewebsites.net
For a more detailed explanation of how to save the models and train for multiple values, you can check out Raymond Langaeian's (MSFT) answer in the comment section at this link:
https://azure.microsoft.com/en-us/documentation/articles/machine-learning-convert-training-experiment-to-scoring-experiment/
You have to train a model for each variable that you are going to predict. Then add all of those predicted columns together and return them as a single output for the web service.
The algorithms available in Azure ML are only capable of predicting a single variable at a time based on the inputs they receive.
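Outside the Studio GUI, the same idea looks roughly like this (a scikit-learn sketch with made-up column names, not an Azure API):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression

    # One model per target variable; the per-target predictions are then
    # concatenated into a single output, mirroring the "Add Columns" step.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y_a = X @ rng.normal(size=5)              # hypothetical target 1
    y_b = X @ rng.normal(size=5) + 1.0        # hypothetical target 2

    models = {
        "predicted_a": RandomForestRegressor(random_state=0).fit(X, y_a),
        "predicted_b": LinearRegression().fit(X, y_b),
    }
    combined = np.column_stack([m.predict(X) for m in models.values()])
    print(combined.shape)                     # (200, 2): one column per predicted value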
I am new to machine learning. I did a test but do not know how to explain and evaluate the results.
Case 1:
I first randomly divide the data (data A, about 8,000 words) into 10 groups (a1..a10). Within each group, I use 90% of the data to build an n-gram model, which is then tested on the remaining 10% of that group. The result is below 10% accuracy. The other 9 groups are handled the same way (each gets its own model, tested on its own remaining 10%). All results are about 10% accuracy. (Is this 10-fold cross-validation?)
Case 2:
I first build an n-gram model on the entire dataset (data A) of about 8,000 words. Then I divide A into 10 groups (a1, a2, ..., a10), randomly of course. I then use this n-gram model to test each of a1..a10 in turn, and I find the model reaches almost 96% accuracy on all groups.
How can I explain these situations?
Thanks in advance.
Yes, Case 1 is 10-fold cross-validation.
The testing method in Case 2 has the common flaw of testing on the training set. That is why the accuracy is inflated. It is unrealistic because, in real life, your test instances are novel and previously unseen by the system.
N-fold cross validation is a valid evaluation method used in many works.
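A hedged sketch of the difference between the two cases, using scikit-learn and a simple classifier as a stand-in for the n-gram model:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=8000, n_features=20, random_state=0)
    clf = GaussianNB()

    # Case 1: 10-fold cross-validation -- every fold is scored on data the
    # model never saw during fitting, so the estimate is honest.
    print(cross_val_score(clf, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0)).mean())

    # Case 2: fit on everything, then score on subsets of the same data --
    # this tests on the training set and inflates the accuracy.
    clf.fit(X, y)
    print(clf.score(X, y))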
You need to read up on the topic of overfitting.
The situation you describe gives the impression that your n-gram model is heavily overfitted: it can "memorize" 96% of the training data, but when trained on a proper subset, it only reaches about 10% accuracy on the unseen data.
This latter procedure is called 10-fold cross-validation.