How can I implement a decision tree that performs Gaussian process classification in the leaves? I want to do this in the context of image segmentation.
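One way to prototype this, assuming plain scikit-learn components are acceptable (a sketch, not an established recipe): fit a shallow DecisionTreeClassifier, group the training samples by the leaf that apply() assigns them to, and fit a separate GaussianProcessClassifier per leaf. The helper names below (leaf_models, predict_leafwise) are illustrative, not library API.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for image-segmentation features and per-pixel labels.
rng = np.random.RandomState(0)
X = rng.rand(300, 5)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# A shallow tree partitions the feature space into leaves.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=30).fit(X, y)

# Fit one GP classifier per leaf, falling back to the constant label
# when a leaf is pure (a GP classifier needs at least two classes).
leaf_models = {}
leaf_ids = tree.apply(X)
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    if len(np.unique(y[mask])) > 1:
        leaf_models[leaf] = GaussianProcessClassifier().fit(X[mask], y[mask])
    else:
        leaf_models[leaf] = int(y[mask][0])

def predict_leafwise(X_new):
    # Route each sample to its leaf, then ask that leaf's GP for a label.
    preds = np.empty(len(X_new), dtype=int)
    for i, leaf in enumerate(tree.apply(X_new)):
        model = leaf_models[leaf]
        if isinstance(model, GaussianProcessClassifier):
            preds[i] = model.predict(X_new[i:i + 1])[0]
        else:
            preds[i] = model
    return preds

print(predict_leafwise(X[:5]))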
With sklearn's RandomForest, you can get the prediction of each tree like so:
per_tree_pred = [tree.predict(X) for tree in clf.estimators_]
xgboost has the option of fitting a random forest, which is quite nice since we can leverage the GPU, but is there any way to get the prediction from each tree like we can in sklearn? This can be very useful for getting a sense of the uncertainty of the predictions.
Thanks,
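I am not aware of a one-line equivalent in xgboost's sklearn wrapper, but the sketch below is one way to recover per-tree leaf values. It assumes XGBRFClassifier, Booster.predict(..., pred_leaf=True), and Booster.trees_to_dataframe(), in which leaf values land in the 'Gain' column; treat those details as version-dependent.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# XGBRFClassifier fits all trees in a single boosting round, i.e. a random forest.
clf = xgb.XGBRFClassifier(n_estimators=50, random_state=0).fit(X, y)
booster = clf.get_booster()

# pred_leaf=True gives, per sample, the index of the leaf reached in each tree.
leaf_idx = booster.predict(xgb.DMatrix(X), pred_leaf=True).astype(int)

# Map (tree, node) -> raw leaf value; for leaf rows the value sits in 'Gain'.
df = booster.trees_to_dataframe()
leaves = df[df["Feature"] == "Leaf"]
leaf_value = {(t, n): g for t, n, g in
              zip(leaves["Tree"], leaves["Node"], leaves["Gain"])}

# Per-tree raw margins for sample 0; their spread is a rough uncertainty proxy,
# analogous to the sklearn per-tree loop above.
per_tree = [leaf_value[(t, leaf_idx[0, t])] for t in range(leaf_idx.shape[1])]
print(np.mean(per_tree), np.std(per_tree))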
I am building a decision tree in scikit-learn. Searching Stack Overflow, one can find a way to extract the rules associated with each leaf. Now my goal is to apply these rules to a new observation and see in which leaf the new observation will end up.
Here is an abstract example. Suppose we have the rule for leaf #1: if a < 5 and b > 7, then the observation belongs to leaf #1. I would now like to take a new observation, apply these rules to it, and check in which leaf it ends up.
I am trying to use a decision tree for the purpose of segmentation.
You can use the apply method of DecisionTreeClassifier to get the index of the leaf that each sample is predicted as:
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier()
clf.fit([[1, 2, 3], [10, 19, 20], [6, 7, 7]], [1, 1, 0])

# apply() returns the index of the leaf each sample lands in
clf.apply([[6, 7, 7]])
# array([3])
An example of using a decision tree classifier with scikit-learn can be found here. This example includes training the classifier and validating the results on a second data set.
The predict function can be used to return the result for a new data sample when applying the trained decision tree to it:
predict(X, check_input=True)
where X is the feature vector of the new data sample under examination.
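To make that concrete, a minimal self-contained example (the tiny tree memorizes its three training samples, so the prediction matches the training label):
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier()
clf.fit([[1, 2, 3], [10, 19, 20], [6, 7, 7]], [1, 1, 0])

# predict() returns the class label the tree assigns to each new sample.
print(clf.predict([[6, 7, 7]]))  # array([0])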
This link might help you to understand how to output the rules of your decision tree classifier.
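As a sketch of what that looks like in practice, assuming a scikit-learn version that ships sklearn.tree.export_text, you can print the learned rules and trace which path a new sample takes (the feature names a, b, c are toy labels):
from sklearn.tree import DecisionTreeClassifier, export_text

clf = DecisionTreeClassifier()
clf.fit([[1, 2, 3], [10, 19, 20], [6, 7, 7]], [1, 1, 0])

# Human-readable if/else rules for every leaf of the fitted tree.
print(export_text(clf, feature_names=["a", "b", "c"]))

# decision_path() marks the nodes a sample visits on the way to its leaf.
print(clf.decision_path([[6, 7, 7]]).toarray())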
I have satellite data that provides radiance, which I use to compute flux (using surface and cloud information). Using a regression method, I can build a mathematical model relating radiance and flux, which can then be used to predict the flux for new radiance values without the other inputs.
Is it possible to do the same using decision trees or regression trees? In regression there is a mathematical equation connecting the dependent and independent variables. Using decision trees, how do you develop such a model?
It's best if you ask this on stats.stackexchange.com. A simple global regression model is a special case of a regression tree where there is only one node, so you can definitely apply a regression tree to your data. Decision trees are generally used for classification, not regression.
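To illustrate with scikit-learn's DecisionTreeRegressor (a sketch; the variable names and synthetic data are purely illustrative): there is no single closed-form equation, but the fitted tree is itself the model, a piecewise-constant map from radiance to flux.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
radiance = rng.rand(500, 1) * 100                   # synthetic radiance values
flux = 2.5 * radiance[:, 0] + rng.randn(500) * 5.0  # synthetic flux target

# The tree learns thresholds on radiance; each leaf stores a constant flux.
reg = DecisionTreeRegressor(max_depth=4).fit(radiance, flux)

print(reg.predict([[42.0]]))  # flux estimate for a new radiance value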
I am trying to draw the decision boundary obtained with the scikit-learn LDA classifier.
I understand that you can transform your multivariate data using the transform method to project the data onto the first component line (two-class case). How do I obtain the value on the first component that acts as the classification pivot, i.e., the value that serves as the decision boundary?
Thanks!
LDA performs a PCA-like operation on $Cov_{between}/Cov_{within}$. The classification pivots are just the first $n-1$ eigenvectors of $Cov_{between}/Cov_{within}$.
The eigenmatrix is stored in lda.scalings_, so the first component is the first vector of lda.scalings_. It gives the decision boundary for the two-class case, but not for the multi-class case.
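A hedged two-class sketch with scikit-learn: project the data with transform(), then take the midpoint of the projected class means as the pivot. Note this midpoint is the exact boundary only under equal class priors; otherwise the threshold shifts.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_blobs(n_samples=200, centers=2, n_features=4, random_state=0)
lda = LinearDiscriminantAnalysis().fit(X, y)

proj = lda.transform(X)                 # 1-D scores (n_classes - 1 = 1 column)
means_proj = lda.transform(lda.means_)  # the two class means, projected
pivot = means_proj.mean()               # decision boundary under equal priors

# Classify by which side of the pivot a projected score falls on; the sign of
# the projection axis is arbitrary, so agreement with lda is ~1.0 or ~0.0.
side = proj[:, 0] > pivot
match = np.mean((lda.predict(X) == lda.classes_[1]) == side)
print(max(match, 1.0 - match))  # close to 1.0: the pivot reproduces lda's decisions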
I am attempting 3-class classification using an SVM classifier. How do we interpret the probability estimates predicted by LIBSVM? Are they based on the perpendicular distance of the instance from the maximal-margin hyperplane?
Kindly throw some light on the interpretation of the probability estimates predicted by the LIBSVM classifier. Parameters C and gamma are first tuned, and then probability estimates are output by using the -b option with both training and testing.
Multiclass SVM is always decomposed into several binary classifiers (typically a set of one-vs-all or one-vs-one classifiers). Any binary SVM classifier's decision function outputs a (signed) distance to the separating hyperplane. In short, an SVM maps the input domain to a one-dimensional real number (the decision value). The predicted label is determined by the sign of the decision value. The most common technique to obtain probabilistic output from SVM models is so-called Platt scaling (see the paper by the LIBSVM authors).
Is it based on perpendicular distance of the instance from the maximal margin hyperplane?
Yes. Any classifier that outputs such a one-dimensional real value can be post-processed to yield probabilities, by calibrating a logistic function on the decision values of the classifier. This is the exact same approach as in standard logistic regression.
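For context, scikit-learn exposes the same idea: CalibratedClassifierCV with method="sigmoid" fits exactly such a logistic function on the decision values, and SVC(probability=True) does the analogous thing internally via LIBSVM. A minimal sketch:
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Raw SVM: decision_function() returns signed distances, not probabilities.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")

# Platt scaling: fit a logistic function on cross-validated decision values.
calibrated = CalibratedClassifierCV(svm, method="sigmoid", cv=3).fit(X, y)

print(calibrated.predict_proba(X[:3]))  # calibrated class probabilities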
SVM performs binary classification. In order to achieve multiclass classification, LIBSVM performs what is called one-vs-one (pairwise) classification. What you get when you invoke -b is the probability related to this technique, which you can find explained here.