Microsoft Azure Machine Learning - azure

I built an ML linear regression model.
[Screenshots: trainning_exp, score_model]
When I preview the output of Score Model, a lot of the fields are empty.
Does anyone know what the problem might be?

Related

How can I do unsupervised learning with LSTM in tensorflow (Keras)?

I am trying to use UNSW-NB15 to train a model. After the model is trained, I would like to use it on live network data. I began with a supervised LSTM, but then started wondering how to handle the live data: I would need a pipeline that preprocesses network traffic into a form similar to the UNSW-NB15 dataset. This seemed impractical, as it would most likely mean going through the data manually for each network data source. I am thinking that an unsupervised model may be better for my purposes. I still want to use an LSTM, but I am finding very little information on creating an unsupervised LSTM model in Keras. I read a paper suggesting BINGO (Binary Information Gain Optimization) or NEO (Nonparametric Entropy Optimization) to train the LSTM model. I am not certain how this can be done in Keras, and I am unable to find such functions there (I will search Python libraries, though). Any suggestions?
I am still researching.

Can Azure Machine Learning be applied for manipulations?

I am going through the samples for Azure Machine Learning. The examples suggest that ML is used for classification problems such as ranking, classifying, or detecting a category with a model trained on sample data.
Now I am wondering whether ML can be trained on computational problems such as multiplication, division, or other series problems. Does this kind of problem fit within the scope of ML?
MULTIPLICATION DATASET:
Num01,Num02,Result
1,1,1
1,2,2
1,3,3
1,4,4
1,5,5
1,6,6
1,7,7
1,8,8
1,9,9
1,10,10
1,11,11
1,12,12
1,13,13
1,14,14
2,1,2
2,2,4
2,3,6
2,4,8
2,5,10
2,6,12
2,7,14
2,8,16
2,9,18
2,10,20
2,11,22
2,12,24
2,13,26
2,14,28
3,1,3
3,2,6
SCORING DATASET:
Num01,Num02
1,5
3,1
2,16
3,15
1,32
It seems like you are looking for regression, which is supported by almost every machine learning library, including Azure's services. In layman's terms, the goal of regression is to approximate an unknown function that maps data X to a continuous value y.
This can be any function, including multiplication or division. However, note that these cases are usually far too simple to be worth solving with machine learning. Most machine learning algorithms (except maybe linear regression) do many more internal computations and will therefore be slower than a native implementation on your device.
As an extra point of clarification, most of the actual machine learning (ML) in Azure ML is done by great open source libraries such as scikit-learn or Keras. Azure mainly provides compute power and higher-level management tools, such as experiment tracking and efficient hyperparameter tuning.
If you are just getting started with ML and want to go more in-depth, this extra functionality might be overkill or confusing, so I would advise starting with one of the packages mentioned above. You would also need to combine that with some more formal training, which will explain most of the important concepts.
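To make the regression idea concrete, here is a minimal sketch in plain Python (no Azure, no ML library) that fits ordinary least squares to the multiplication dataset from the question. The log transform of the inputs and the output is my own assumption, not something the question uses: a linear model on the raw numbers cannot represent a product exactly, but log(Result) = log(Num01) + log(Num02) is linear. The helper names (`solve`, `fit`, `predict`) are just for illustration.

```python
import math

# Training rows as in the question: (Num01, Num02, Result).
train = [(a, b, a * b) for a in (1, 2) for b in range(1, 15)]
train += [(3, 1, 3), (3, 2, 6)]


def solve(A, y):
    """Solve the square system A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit(rows):
    """Least squares for log(Result) ~ w0 + w1*log(Num01) + w2*log(Num02)."""
    X = [[1.0, math.log(a), math.log(b)] for a, b, _ in rows]
    t = [math.log(r) for _, _, r in rows]
    # Normal equations: (X^T X) w = X^T t
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    Xtt = [sum(x[i] * ti for x, ti in zip(X, t)) for i in range(3)]
    return solve(XtX, Xtt)


def predict(w, a, b):
    """Map the log-space prediction back to the original scale."""
    return math.exp(w[0] + w[1] * math.log(a) + w[2] * math.log(b))


weights = fit(train)
scoring = [(1, 5), (3, 1), (2, 16), (3, 15), (1, 32)]  # scoring dataset from the question
predictions = [predict(weights, a, b) for a, b in scoring]
for (a, b), p in zip(scoring, predictions):
    print(a, b, round(p, 3))
```

With this feature choice the fitted weights come out close to (0, 1, 1), so the model also extrapolates to scoring rows outside the training range. A more generic regressor trained on the raw inputs would typically only approximate the product, which is part of why a native `a * b` is the better tool here.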

Visualizing decision jungle in Azure Machine Learning Studio

I have trained a decision jungle model on Azure Machine Learning, and now I want to visualize the trees, to see if I can identify the root nodes that are the most determinant in the decision.
When I right-click and click Visualize on the Train Model, what is shown is the parameter set used for the training. How can I either visualize the jungle, or identify the features with highest information gain from this?
Thanks in advance!

Hyperparameters tuning with Google Cloud ML Engine and XGBoost

I am trying to replicate the hyperparameter tuning example reported at this link, but I want to use scikit-learn XGBoost instead of TensorFlow in my training application.
I am able to run multiple trials in a single job, one for each hyperparameter combination. However, the Training output object returned by ML-Engine does not include the finalMetric field, which reports metric information (see the differences in the pictures below).
What I get with the example of the link above:
Training output object with Tensorflow training app
What I get running my Training application with XGBoost:
Training output object with XGBoost training app
Is there a way for XGBoost to return training metrics to ML-Engine?
It seems that this process is automated for tensorflow, as specified in the documentation:
How Cloud ML Engine gets your metric
You may notice that there are no instructions in this documentation
for passing your hyperparameter metric to the Cloud ML Engine training
service. That's because the service monitors TensorFlow summary events
generated by your training application and retrieves the metric.
Is there a similar mechanism for XGBoost?
Now, I can always dump each trial's metric results to a file at the end of the trial and then analyze them manually to select the best parameters. But by doing so, am I losing the automated mechanism offered by Cloud ML Engine, especially concerning the "ALGORITHM_UNSPECIFIED" hyperparameter search algorithm?
i.e.,
ALGORITHM_UNSPECIFIED: [...] applies Bayesian optimization to search
the space of possible hyperparameter values, resulting in the most
effective technique for your set of hyperparameters.
Hyperparameter tuning support for XGBoost was implemented in a different way. We created the cloudml-hypertune Python package to help with it. We're still working on the public documentation for it. In the meantime, you can refer to this staging sample to learn how to use it.
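As a rough sketch of what reporting a metric with cloudml-hypertune can look like: the `report_metric` wrapper, its `reporter` parameter, and the `"rmse"` tag below are hypothetical names chosen for illustration; only `hypertune.HyperTune()` and `report_hyperparameter_tuning_metric(...)` come from the package. The tag is assumed to match the `hyperparameterMetricTag` set in the job's tuning configuration.

```python
def report_metric(metric_value, step, tag="rmse", reporter=None):
    """Report one evaluation metric for the current trial to the tuning service.

    `reporter` is a hypothetical hook so the function can be exercised without
    the cloudml-hypertune package; by default a real HyperTune object is used.
    """
    if reporter is None:
        # Imported lazily so this module still loads where the package is missing.
        import hypertune
        reporter = hypertune.HyperTune()
    reporter.report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag=tag,  # assumed to match the job's metric tag
        metric_value=metric_value,
        global_step=step,
    )
    return reporter
```

In a training application you would call something like `report_metric(rmse, step=1)` after evaluating the fitted XGBoost model on held-out data, so that each trial hands its metric back to the service instead of writing it to a file.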
Sara Robinson over at Google put together a good post on how to do this. Rather than regurgitate it and claim it as my own, I'll post it here for anyone else who comes across this question:
https://sararobinson.dev/2019/09/12/hyperparameter-tuning-xgboost.html

Azure ML: What is the confidence level setting for Azure ML prediction? And can it be tuned?

I have built a couple of models on some data using Boosted Trees and hyperparameter setting.
However, when I try to use the models for prediction, they return no prediction for many records, in some cases up to 75% of the data. I am guessing this has something to do with the model: since it does not predict for some records, I suspect it is related to a confidence threshold on the prediction.
Please correct me if I am wrong somewhere.
Guide me, in any case.
After many attempts, the only thing that worked was imputation. As suggested in the question comments, the issue was missing data: as soon as we handled the missing values, Azure ML predicted results for all the records.
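For anyone unsure what imputation means here, a minimal sketch in plain Python follows. It assumes mean imputation on numeric columns with `None` marking a missing value; the answer above does not say which strategy was used, and the helper name `impute_column_means` is hypothetical. In Azure ML Studio the equivalent step would be done with a data-cleaning module before Score Model, not with hand-written code.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        # A column with no observed values cannot be imputed; leave it as None.
        col_means.append(mean(observed) if observed else None)
    return [
        [col_means[c] if value is None else value for c, value in enumerate(row)]
        for row in rows
    ]


rows = [[1.0, 10.0], [None, 20.0], [3.0, None]]
filled = impute_column_means(rows)
print(filled)
```

The means should be computed on the training data and reused when scoring, so that the scored records are filled in the same way the model saw during training.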
