Visualizing decision jungle in Azure Machine Learning Studio - decision-tree

I have trained a decision jungle model in Azure Machine Learning, and now I want to visualize the trees to see whether I can identify the root nodes that matter most to the decision.
When I right-click the Train Model module and click Visualize, all that is shown is the parameter set used for training. How can I either visualize the jungle or identify the features with the highest information gain?
Thanks in advance!

Related

Neural network graph visualization

I would like to generate visualization of my neural network (PyTorch or ONNX model) similar to this using Graphcore Poplar.
I have looked in the documentation but I cannot find where this visualization feature is.
How can I achieve this? Is there any other existing library?
That visualization is not part of the Graphcore Poplar software. It is "data art" generated by the team at Graphcore.
Getting to that quality is hard work and takes many hours, but if you are determined, I would suggest starting with graph visualization tools: search for "graph network visualization" and take inspiration from galleries like https://cytoscape.org/screenshots.html.
The NN architecture can be converted into a common graph format (neurons as nodes, connections as edges), and then you can start experimenting.
Some ideas:
Start with a simple NN with three layers. Place the input layer on the outer circle, the hidden layer on an inner circle, and the output layer in the center. Each neuron is a dot whose radius reflects the weight and whose color reflects the bias, and you can displace it towards or away from the neurons in the previous layer based on the weight. Check this image for inspiration if you are looking for a "biological" style: https://cytoscape.org/images/screenshots/edge_bundling3_1400px.png
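The circular layout described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the layer sizes are hypothetical, the layers are assumed to be fully connected, and the function names radial_layout and dense_edges are made up for this example. The output is a set of node coordinates and an edge list that you could then hand to whatever graph drawing tool you prefer.

```python
import math

def radial_layout(layer_sizes, outer_radius=1.0):
    """Place each layer on its own concentric circle.

    The input layer sits on the outer circle and the last layer
    collapses onto the centre (radius 0).
    """
    n_layers = len(layer_sizes)
    positions = {}
    for layer, size in enumerate(layer_sizes):
        # Radius shrinks linearly from outer_radius down to 0.
        radius = outer_radius * (n_layers - 1 - layer) / max(n_layers - 1, 1)
        for i in range(size):
            angle = 2 * math.pi * i / size
            positions[(layer, i)] = (radius * math.cos(angle),
                                     radius * math.sin(angle))
    return positions

def dense_edges(layer_sizes):
    """One edge per connection between consecutive fully connected layers."""
    return [
        ((layer, i), (layer + 1, j))
        for layer in range(len(layer_sizes) - 1)
        for i in range(layer_sizes[layer])
        for j in range(layer_sizes[layer + 1])
    ]

# Hypothetical three-layer network: 8 inputs, 4 hidden neurons, 1 output.
layer_sizes = [8, 4, 1]
positions = radial_layout(layer_sizes)
edges = dense_edges(layer_sizes)
```

Nodes are keyed as (layer index, neuron index). With more than one output neuron they would all land on the same centre point, so you would want to give the last layer a small radius of its own; dot size and color from weights and biases are left out here.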

How to prioritize few Neural Network inputs?

I have a Neural Network with five inputs for a classification task. Two inputs out of those five are very important and have a direct relationship to the classification task. Therefore, I need to prioritize those two inputs within the network and give less priority to the other three. Is there a way in the neural network to facilitate my requirement?
If training works well, the NN should automatically pick up what's most important for your classification. That's the entire point of a NN (or ML in general); so that you don't have to manually tell it what's more important and what's not. After learning, you can verify that the model indeed does learn the correct order of importance between the features.
You can use any model explanation technique for this; ELI5, SHAP and LIME are some examples. Each of these will tell you whether the features that you know are important are actually important to the network.
You probably shouldn't try to manually incorporate such biases into the network (unless you have a very good reason for doing so, like incorporating spatial information of images via CNNs). Trust the learning xD
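To make the "verify after learning" step concrete, here is a minimal sketch of permutation importance, a simple model-agnostic check that is not the same method as SHAP or LIME. Everything in it is an assumption for illustration: the predict function is a hypothetical stand-in for your trained network, and the data is synthetic, with five inputs of which only the first two influence the label.

```python
import random

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one input column is shuffled."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle a single column, leaving the others untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [v] + row[col + 1:]
                        for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in for a trained classifier: uses only inputs 0 and 1.
def predict(row):
    return int(row[0] + row[1] > 1.0)

# Synthetic data: five inputs per row, labels produced by the rule above.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(5)] for _ in range(200)]
y = [predict(row) for row in X]

importances = permutation_importance(predict, X, y)
```

In this toy setup the last three importances are zero because the stand-in model ignores those inputs, while the first two are positive. With a real network you would pass its prediction function and a held-out dataset instead.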

Microsoft Azure Machine Learning

I built an ML linear regression model.
[screenshots: trainning_exp, score_model]
When I 'preview' in Score Model I have a lot of empty fields.
Does anyone know what the problem might be?

Can Azure Machine Learning be applied for manipulations?

I am going through the samples for Azure Machine Learning. The examples seem to suggest that ML is used for classification-style problems such as ranking, classifying, or detecting a category, using a model trained on sample data.
Now I am wondering whether ML can be trained on computational problems such as multiplication, division, or other series problems. Do these problems fall within the scope of ML?
MULTIPLICATION DATASET:
Num01,Num02,Result
1,1,1
1,2,2
1,3,3
1,4,4
1,5,5
1,6,6
1,7,7
1,8,8
1,9,9
1,10,10
1,11,11
1,12,12
1,13,13
1,14,14
2,1,2
2,2,4
2,3,6
2,4,8
2,5,10
2,6,12
2,7,14
2,8,16
2,9,18
2,10,20
2,11,22
2,12,24
2,13,26
2,14,28
3,1,3
3,2,6
SCORING DATASET:
Num01,Num02
1,5
3,1
2,16
3,15
1,32
It seems like you are looking for regression, which is supported by almost every machine learning library, including Azure's services. In layman's terms, the goal of regression is to approximate an unknown function that maps data X to a continuous value y.
This can be any function, including multiplication or division. However, do note that these cases are usually far too simple to be worth solving with machine learning. Most machine learning algorithms (except maybe linear regression) do a lot more internal computation and will, as a result, be slower than a native implementation on your device.
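As a rough illustration of regression recovering multiplication, the sketch below fits a linear model in log space, where log(Result) = log(Num01) + log(Num02) is exactly linear. The log transform and the hand-written least-squares solve are assumptions made for this example, not something Azure ML does for you; the training rows reproduce the multiplication dataset from the question.

```python
import math

# Same rows as the question's dataset: 1 x 1..14, 2 x 1..14, 3 x 1, 3 x 2.
train = [(a, b, a * b) for a in (1, 2) for b in range(1, 15)]
train += [(3, 1, 3), (3, 2, 6)]

def fit_log_linear(rows):
    """Least-squares fit of log(Result) = w1*log(Num01) + w2*log(Num02).

    Solves the 2x2 normal equations with Cramer's rule (no intercept term).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for a, b, result in rows:
        x1, x2, y = math.log(a), math.log(b), math.log(result)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    det = s11 * s22 - s12 * s12
    w1 = (t1 * s22 - t2 * s12) / det
    w2 = (t2 * s11 - t1 * s12) / det
    return w1, w2

def predict(weights, a, b):
    """Map the log-space prediction back to the original scale."""
    w1, w2 = weights
    return math.exp(w1 * math.log(a) + w2 * math.log(b))

weights = fit_log_linear(train)

# Rows from the question's scoring dataset, including values outside the training range.
scoring = [(1, 5), (3, 1), (2, 16), (3, 15), (1, 32)]
predictions = [predict(weights, a, b) for a, b in scoring]
```

Both weights should come out very close to 1, so the predictions match the true products up to floating-point error, even for inputs such as 2 and 16 that never appear in training. A linear model on the raw inputs could not do this, since a plane cannot represent a product; the result here depends entirely on choosing a feature transform that suits the problem.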
As an extra point of clarification, most of the actual machine learning (ML) in Azure ML is done by great open source libraries such as scikit-learn or Keras. Azure mainly provides compute power and higher-level management tools, such as experiment tracking and efficient hyperparameter tuning.
If you are just getting started with ML and want to go more in-depth, this extra functionality might be overkill or confusing, so I would advise starting with one of the packages mentioned above. You would also need to combine that with some more formal training, which will explain most of the important concepts.

Azure Machine Learning Decision Tree output

Is there any way to get the output of the Boosted Decision Tree module in ML Studio? To analyze the learned tree, like in Weka.
Update: visualization of decision trees is available now! Right-click on the output node of the "Train Model" module and select "Visualize".
My old answer:
I'm sorry; visualization of decision trees isn't available yet. (I really want it too! You can upvote this feature request at http://feedback.azure.com/forums/257792-machine-learning/suggestions/7419469-show-variable-importance-after-experiment-runs, but they are currently working on it.)
Just FYI, you can currently see what the model builds for linear algorithms by right-clicking on the "Train Model" module output node and selecting "Visualize". It will show the initial parameter values and the feature weights. But for non-linear algorithms like decision trees, that visibility is still forthcoming.
Yes. I don't know your structure, but you should have your dataset and the algorithm going into a Train Model module, and then feed the output of Train Model, together with the other half of the dataset (if you used Split), into a Score Model module. You can see the scored labels and scored probabilities there when you click Visualize.
Your experiment should look a bit like this. Connect the boosted decision tree and the dataset to a Train Model module; you can then see the results in the Score Model module.
