Tensorflow Extracting Classification Predictions - python-3.x

I've a tensorflow NN model for classification of one-hot-encoded group labels (groups are exclusive), which ends with (layerActivs[-1] are the activations of the final layer):
probs = sess.run(tf.nn.softmax(layerActivs[-1]),...)
classes = sess.run(tf.round(probs))
preds = sess.run(tf.argmax(classes))
The tf.round is included to force any low probabilities to 0. If all probabilities are below 50% for an observation, this means that no class will be predicted. I.e., if there are 4 classes, we could have probs[0,:] = [0.2,0,0,0.4], so classes[0,:] = [0,0,0,0]; preds[0] = 0 follows.
Obviously this is ambiguous, as it is the same result that would occur if we had probs[1,:]=[.9,0,.1,0] -> classes[1,:] = [1,0,0,0] -> 1 preds[1] = 0. This is a problem when using the tensorflow builtin metrics class, as the functions can't distinguish between no prediction, and prediction in class 0. This is demonstrated by this code:
import numpy as np
import tensorflow as tf
import pandas as pd
''' prepare '''
classes = 6
n = 100
# simulate data
np.random.seed(42)
simY = np.random.randint(0,classes,n) # pretend actual data
simYhat = np.random.randint(0,classes,n) # pretend pred data
truth = np.sum(simY == simYhat)/n
tabulate = pd.Series(simY).value_counts()
# create placeholders
lab = tf.placeholder(shape=simY.shape, dtype=tf.int32)
prd = tf.placeholder(shape=simY.shape, dtype=tf.int32)
AM_lab = tf.placeholder(shape=simY.shape,dtype=tf.int32)
AM_prd = tf.placeholder(shape=simY.shape,dtype=tf.int32)
# create one-hot encoding objects
simYOH = tf.one_hot(lab,classes)
# create accuracy objects
acc = tf.metrics.accuracy(lab,prd) # real accuracy with tf.metrics
accOHAM = tf.metrics.accuracy(AM_lab,AM_prd) # OHE argmaxed to labels - expected to be correct
# now setup to pretend we ran a model & generated OHE predictions all unclassed
z = np.zeros(shape=(n,classes),dtype=float)
testPred = tf.constant(z)
''' run it all '''
# setup
sess = tf.Session()
sess.run([tf.global_variables_initializer(),tf.local_variables_initializer()])
# real accuracy with tf.metrics
ACC = sess.run(acc,feed_dict = {lab:simY,prd:simYhat})
# OHE argmaxed to labels - expected to be correct, but is it?
l,p = sess.run([simYOH,testPred],feed_dict={lab:simY})
p = np.argmax(p,axis=-1)
ACCOHAM = sess.run(accOHAM,feed_dict={AM_lab:simY,AM_prd:p})
sess.close()
''' print stuff '''
print('Accuracy')
print('-known truth: %0.4f'%truth)
print('-on unprocessed data: %0.4f'%ACC[1])
print('-on faked unclassed labels data (s.b. 0%%): %0.4f'%ACCOHAM[1])
print('----------\nTrue Class Freqs:\n%r'%(tabulate.sort_index()/n))
which has the output:
Accuracy
-known truth: 0.1500
-on unprocessed data: 0.1500
-on faked unclassed labels data (s.b. 0%): 0.1100
----------
True Class Freqs:
0 0.11
1 0.19
2 0.11
3 0.25
4 0.17
5 0.17
dtype: float64
Note freq for class 0 is same as faked accuracy...
I experimented with setting a value of preds to np.nan for observations with no predictions, but tf.metrics.accuracy throws ValueError: cannot convert float NaN to integer; also tried np.inf but got OverflowError: cannot convert float infinity to integer.
How can I convert the rounded probabilities to class predictions, but appropriately handle unpredicted observations?

This has gone long enough without an answer, so I'll post here as the answer my solution. I convert belonging probabilities to class predictions with a new function that has 3 main steps:
set any NaN probabilities to 0
set any probabilities below 1/num_classes to 0
use np.argmax() to extract predicted classes, then set any unclassed observations to a uniformly selected class
The resultant vector of integer class labels can be passed to the tf.metrics functions. My function below:
def predFromProb(classProbs):
'''
Take in as input an (m x p) matrix of m observations' class probabilities in
p classes and return an m-length vector of integer class labels (0...p-1).
Probabilities at or below 1/p are set to 0, as are NaNs; any unclassed
observations are randomly assigned to a class.
'''
numClasses = classProbs.shape[1]
# zero out class probs that are at or below chance, or NaN
probs = classProbs.copy()
probs[np.isnan(probs)] = 0
probs = probs*(probs > 1/numClasses)
# find any un-classed observations
unpred = ~np.any(probs,axis=1)
# get the predicted classes
preds = np.argmax(probs,axis=1)
# randomly classify un-classed observations
rnds = np.random.randint(0,numClasses,np.sum(unpred))
preds[unpred] = rnds
return preds

Related

Predicting classes in MNIST dataset with a Gaussian- the same prediction errors with different paramemters?

I am trying to find the best c parameter following the instructions to a task that asks me to ' Define a function, fit_generative_model, that takes as input a training set (train_data, train_labels) and fits a Gaussian generative model to it. It should return the parameters of this generative model; for each label j = 0,1,...,9, where
pi[j]: the frequency of that label
mu[j]: the 784-dimensional mean vector
sigma[j]: the 784x784 covariance matrix
It is important to regularize these matrices. The standard way of doing this is to add cI to them, where c is some constant and I is the 784-dimensional identity matrix. c is now a parameter, and by setting it appropriately, we can improve the performance of the model.
%matplotlib inline
import sys
import matplotlib.pyplot as plt
import gzip, os
import numpy as np
from scipy.stats import multivariate_normal
if sys.version_info[0] == 2:
from urllib import urlretrieve
else:
from urllib.request import urlretrieve
# Downloads the dataset
def download(filename, source='http://yann.lecun.com/exdb/mnist/'):
print("Downloading %s" % filename)
urlretrieve(source + filename, filename)
# Invokes download() if necessary, then reads in images
def load_mnist_images(filename):
if not os.path.exists(filename):
download(filename)
with gzip.open(filename, 'rb') as f:
data = np.frombuffer(f.read(), np.uint8, offset=16)
data = data.reshape(-1,784)
return data
def load_mnist_labels(filename):
if not os.path.exists(filename):
download(filename)
with gzip.open(filename, 'rb') as f:
data = np.frombuffer(f.read(), np.uint8, offset=8)
return data
## Load the training set
train_data = load_mnist_images('train-images-idx3-ubyte.gz')
train_labels = load_mnist_labels('train-labels-idx1-ubyte.gz')
## Load the testing set
test_data = load_mnist_images('t10k-images-idx3-ubyte.gz')
test_labels = load_mnist_labels('t10k-labels-idx1-ubyte.gz')
train_data.shape, train_labels.shape
So I have written this code for three different C-values. they each give me the same error?
def fit_generative_model(x,y):
lst=[]
for c in [20,200, 4000]:
k = 10 # labels 0,1,...,k-1
d = (x.shape)[1] # number of features
mu = np.zeros((k,d))
sigma = np.zeros((k,d,d))
pi = np.zeros(k)
for label in range(0,k):
indices = (y == label)
mu[label] = np.mean(x[indices,:], axis=0)
sigma[label] = np.cov(x[indices,:], rowvar=0, bias=1) + c*np.identity(784) # I define the identity matrix
predictions = np.argmax(score, axis=1)
errors = np.sum(predictions != y)
lst.append(errors)
print(c,"Model makes " + str(errors) + " errors out of 10000", lst)
Then I fit it to the training data and get these same errors:
mu, sigma, pi = fit_generative_model(train_data, train_labels)
20 Model makes 1 errors out of 10000 [1]
200 Model makes 1 errors out of 10000 [1, 1]
4000 Model makes 1 errors out of 10000 [1, 1, 1]
and to the test data:
mu, sigma, pi = fit_generative_model(test_data, test_labels)
20 Model makes 9020 errors out of 10000 [9020]
200 Model makes 9020 errors out of 10000 [9020, 9020]
4000 Model makes 9020 errors out of 10000 [9020, 9020, 9020]
What is it I'm doing wrong? the correct answer is c=4000 which yields an error of ~4.3%.

How to run a proper Bayesian Logistic Regression

I'm trying to run a bayesian logistic regression on the wine dataset provided from the sklearn package. As variables, I decided to use alcohol, color_intensity, flavanoids, hue and magnesium where alcohol is my response variable and the rest the predictors. To do so, I'm using pyro and torch packages:
import pyro
import torch
import pyro.distributions as dist
import pyro.optim as optim
from pyro.infer import SVI, Trace_ELBO
import pandas as pd
import numpy as np
from pyro.infer import Predictive
import torch.distributions.constraints as constraints
from sklearn import datasets
pyro.set_rng_seed(0)
#loading data and prepearing dataframe
wine = datasets.load_wine()
data = pd.DataFrame(columns = wine['feature_names'], data=wine['data'] )
#choosiing variables: response and predictors
variables = data[['alcohol', 'color_intensity', 'flavanoids', 'hue', 'magnesium']]
#standardization
variables = (variables-variables.min())/(variables.max()-variables.min())
#tensorizing
alcohol = torch.tensor(variables['alcohol'].values, dtype=torch.float)
predictors = torch.stack([torch.tensor(variables[column].values, dtype=torch.float)
for column in ['alcohol', 'color_intensity', 'flavanoids', 'hue', 'magnesium']], 1)
#splitting data
k = int(0.8 * len(variables))
x_train, y_train = predictors[:k], alcohol[:k]
x_test, y_test = predictors[k:], alcohol[k:]
#modelling
def model_alcohol(predictors, alcohol):
n_observations, n_predictors = predictors.shape
#weights
w = pyro.sample('w', dist.Normal(torch.zeros(n_predictors), torch.ones(n_predictors)))
epsilon = pyro.sample('epsilon', dist.Normal(0.,1.))
#non-linearity
y_hat = torch.sigmoid((w*predictors).sum(dim=1) + epsilon)
sigma = pyro.sample("sigma", dist.Uniform(0.,3.))
with pyro.plate('alcohol', len(alcohol)):
y=pyro.sample('y', dist.Normal(y_hat, sigma), obs=alcohol)
def guide_alcohol(predictors, alcohol=None):
n_observations, n_predictors = predictors.shape
w_loc = pyro.param('w_loc', torch.rand(n_predictors))
w_scale = pyro.param('w_scale', torch.rand(n_predictors), constraint=constraints.positive)
w = pyro.sample('w', dist.Normal(w_loc, w_scale))
epsilon_loc = pyro.param('b_loc', torch.rand(1))
epsilon_scale = pyro.param('b_scale', torch.rand(1), constraint=constraints.positive)
epsilon = pyro.sample('epsilon', dist.Normal(epsilon_loc, epsilon_scale))
sigma_loc = pyro.param('sigma_loc', torch.rand(n_predictors))
sigma_scale = pyro.param('sigma_scale', torch.rand(n_predictors),
constraint=constraints.positive)
sigma = pyro.sample('sigma', dist.Normal(sigma_loc, sigma_scale))
alcohol_svi = SVI(model=model_alcohol, guide=guide_alcohol, optim=optim.ClippedAdam({'lr' : 0.0002}),
loss=Trace_ELBO())
losses = []
for step in range(10000):
loss = alcohol_svi.step(x_train, y_train)/len(x_train)
losses.append(loss)
As I have to use Stochastic Variational Inference, I have defined both the model and the guide. My problem is now at matching tensor sizes, as I now I get the error:
RuntimeError: The size of tensor a (142) must match the size of tensor b (5) at non-singleton
dimension 0
Trace Shapes:
Param Sites:
Sample Sites:
w dist 5 |
value 5 |
epsilon dist |
value 1 |
sigma dist |
value 5 |
alcohol dist |
value 142 |
I'm kinda new to the idea of modelling on my own, so clearly there are mistakes around the code (hopefully not on the theory behind it). Still, I see I should adjust dimension on the guide maybe? I'm not entirely sure on how to honestly.
Your main problem is that w is not declared as a single event (.to_event(1)), and your variance (sigma) should have the same dim as your observations (()). The model and guide below fix this; I suggest you look at auto-generated guides in Pyro, and a different prior on sigma.
def model_alcohol(predictors, alcohol):
n_observations, n_predictors = predictors.shape
# weights
# w is a single event
w = pyro.sample('w', dist.Normal(torch.zeros(n_predictors), torch.ones(n_predictors)).to_event(1))
epsilon = pyro.sample('epsilon', dist.Normal(0., 1.))
# non-linearity
y_hat = torch.sigmoid(predictors # w + epsilon) # (predictors * weight).sum(1) == predictors # w
sigma = pyro.sample("sigma", dist.Uniform(0., 3.))
with pyro.plate('alcohol', len(alcohol)):
pyro.sample('y', dist.Normal(y_hat, sigma), obs=alcohol)
def guide_alcohol(predictors, alcohol=None):
n_observations, n_predictors = predictors.shape
w_loc = pyro.param('w_loc', torch.rand(n_predictors))
w_scale = pyro.param('w_scale', torch.rand(n_predictors), constraint=constraints.positive)
pyro.sample('w', dist.Normal(w_loc, w_scale).to_event(1))
epsilon_loc = pyro.param('b_loc', torch.rand(1))
epsilon_scale = pyro.param('b_scale', torch.rand(1), constraint=constraints.positive)
epsilon = pyro.sample('epsilon', dist.Normal(epsilon_loc, epsilon_scale))
sigma_loc = pyro.param('sigma_loc', torch.rand(1))
sigma_scale = pyro.param('sigma_scale', torch.rand(1),
constraint=constraints.positive)
pyro.sample('sigma', dist.HalfNormal(sigma_loc, sigma_scale)) # MUST BE POSITIVE

Find wrongly categorized samples from validation step

I am using a keras neural net for identifying category in which the data belongs.
self.model.compile(loss='categorical_crossentropy',
optimizer=keras.optimizers.Adam(lr=0.001, decay=0.0001),
metrics=[categorical_accuracy])
Fit function
history = self.model.fit(self.X,
{'output': self.Y},
validation_split=0.3,
epochs=400,
batch_size=32
)
I am interested in finding out which labels are getting categorized wrongly in the validation step. Seems like a good way to understand what is happening under the hood.
You can use model.predict_classes(validation_data) to get the predicted classes for your validation data, and compare these predictions with the actual labels to find out where the model was wrong. Something like this:
predictions = model.predict_classes(validation_data)
wrong = np.where(predictions != Y_validation)
If you are interested in looking 'under the hood', I'd suggest to use
model.predict(validation_data_x)
to see the scores for each class, for each observation of the validation set.
This should shed some light on which categories the model is not so good at classifying. The way to predict the final class is
scores = model.predict(validation_data_x)
preds = np.argmax(scores, axis=1)
be sure to use the proper axis for np.argmax (I'm assuming your observation axis is 1). Use preds to then compare with the real class.
Also, as another exploration you want to see the overall accuracy on this dataset, use
model.evaluate(x=validation_data_x, y=validation_data_y)
I ended up creating a metric which prints the "worst performing category id + score" on each iteration. Ideas from link
import tensorflow as tf
import numpy as np
class MaxIoU(object):
def __init__(self, num_classes):
super().__init__()
self.num_classes = num_classes
def max_iou(self, y_true, y_pred):
# Wraps np_max_iou method and uses it as a TensorFlow op.
# Takes numpy arrays as its arguments and returns numpy arrays as
# its outputs.
return tf.py_func(self.np_max_iou, [y_true, y_pred], tf.float32)
def np_max_iou(self, y_true, y_pred):
# Compute the confusion matrix to get the number of true positives,
# false positives, and false negatives
# Convert predictions and target from categorical to integer format
target = np.argmax(y_true, axis=-1).ravel()
predicted = np.argmax(y_pred, axis=-1).ravel()
# Trick from torchnet for bincounting 2 arrays together
# https://github.com/pytorch/tnt/blob/master/torchnet/meter/confusionmeter.py
x = predicted + self.num_classes * target
bincount_2d = np.bincount(x.astype(np.int32), minlength=self.num_classes**2)
assert bincount_2d.size == self.num_classes**2
conf = bincount_2d.reshape((self.num_classes, self.num_classes))
# Compute the IoU and mean IoU from the confusion matrix
true_positive = np.diag(conf)
false_positive = np.sum(conf, 0) - true_positive
false_negative = np.sum(conf, 1) - true_positive
# Just in case we get a division by 0, ignore/hide the error and set the value to 0
with np.errstate(divide='ignore', invalid='ignore'):
iou = false_positive / (true_positive + false_positive + false_negative)
iou[np.isnan(iou)] = 0
return np.max(iou).astype(np.float32) + np.argmax(iou).astype(np.float32)
~
usage:
custom_metric = MaxIoU(len(catagories))
self.model.compile(loss='categorical_crossentropy',
optimizer=keras.optimizers.Adam(lr=0.001, decay=0.0001),
metrics=[categorical_accuracy, custom_metric.max_iou])

h2o vs scikit learn confusion matrix

Anyone able to match the sklearn confusion matrix to h2o?
They never match....
Doing something similar with Keras produces a perfect match.
But in h2o they are always off. Tried it every which way...
Borrowed some code from:
Any difference between H2O and Scikit-Learn metrics scoring?
# In[30]:
import pandas as pd
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o.init()
# Import a sample binary outcome train/test set into H2O
train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")
# Identify predictors and response
x = train.columns
y = "response"
x.remove(y)
# For binary classification, response should be a factor
train[y] = train[y].asfactor()
test[y] = test[y].asfactor()
# Train and cross-validate a GBM
model = H2OGradientBoostingEstimator(distribution="bernoulli", seed=1)
model.train(x=x, y=y, training_frame=train)
# In[31]:
# Test AUC
model.model_performance(test).auc()
# 0.7817203808052897
# In[32]:
# Generate predictions on a test set
pred = model.predict(test)
# In[33]:
from sklearn.metrics import roc_auc_score, confusion_matrix
pred_df = pred.as_data_frame()
y_true = test[y].as_data_frame()
roc_auc_score(y_true, pred_df['p1'].tolist())
#pred_df.head()
# In[36]:
y_true = test[y].as_data_frame().values
cm = pd.DataFrame(confusion_matrix(y_true, pred_df['predict'].values))
# In[37]:
print(cm)
0 1
0 1354 961
1 540 2145
# In[38]:
model.model_performance(test).confusion_matrix()
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.353664307031828:
0 1 Error Rate
0 964.0 1351.0 0.5836 (1351.0/2315.0)
1 274.0 2411.0 0.102 (274.0/2685.0)
Total 1238.0 3762.0 0.325 (1625.0/5000.0)
# In[39]:
h2o.cluster().shutdown()
This does the trick, thx for the hunch Vivek. Still not an exact match but extremely close.
perf = model.model_performance(train)
threshold = perf.find_threshold_by_max_metric('f1')
model.model_performance(test).confusion_matrix(thresholds=threshold)
I also meet the same issue. Here is what I would do to make a fair comparison:
model.train(x=x, y=y, training_frame=train, validation_frame=test)
cm1 = model.confusion_matrix(metrics=['F1'], valid=True)
Since we train the model using training data and validation data, then the pred['predict'] will use the threshold which maximizes the F1 score of validation data. To make sure, one can use these lines:
threshold = perf.find_threshold_by_max_metric(metric='F1', valid=True)
pred_df['predict'] = pred_df['p1'].apply(lambda x: 0 if x < threshold else 1)
To get another confusion matrix from scikit learn:
from sklearn.metrics import confusion_matrix
cm2 = confusion_matrix(y_true, pred_df['predict'])
In my case, I don't understand why I get slightly different results. Something like, for example:
print(cm1)
>> [[3063 176]
[ 94 146]]
print(cm2)
>> [[3063 176]
[ 95 145]]

Ensemble model in H2O with fold_column argument

I am new to H2O in python. I am trying to model my data using ensemble model following the example codes from H2O's web site. (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/stacked-ensembles.html)
I have applied GBM and RF as base models. And then using stacking, I tried to merge them in ensemble model. In addition, in my training data I created one additional column named 'fold' to be used in fold_column = "fold"
I applied 10 fold cv and I observed that I received results from cv1. However, all the predictions coming from other 9 cvs, they are empty. What am I missing here?
Here is my sample data:
code:
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator
from h2o.grid.grid_search import H2OGridSearch
from __future__ import print_function
h2o.init(port=23, nthreads=6)
train = h2o.H2OFrame(ens_df)
test = h2o.H2OFrame(test_ens_eq)
x = train.drop(['Date','EQUITY','fold'],axis=1).columns
y = 'EQUITY'
cat_cols = ['A','B','C','D']
train[cat_cols] = train[cat_cols].asfactor()
test[cat_cols] = test[cat_cols].asfactor()
my_gbm = H2OGradientBoostingEstimator(distribution="gaussian",
ntrees=10,
max_depth=3,
min_rows=2,
learn_rate=0.2,
keep_cross_validation_predictions=True,
seed=1)
my_gbm.train(x=x, y=y, training_frame=train, fold_column = "fold")
Then when I check cv results with
my_gbm.cross_validation_predictions():
Plus when I try the ensemble in the test set I get the warning below:
# Train a stacked ensemble using the GBM and GLM above
ensemble = H2OStackedEnsembleEstimator(model_id="mlee_ensemble",
base_models=[my_gbm, my_rf])
ensemble.train(x=x, y=y, training_frame=train)
# Eval ensemble performance on the test data
perf_stack_test = ensemble.model_performance(test)
pred = ensemble.predict(test)
pred
/mgmt/data/conda/envs/python3.6_4.4/lib/python3.6/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'fold': substituting in a column of NaN
warnings.warn(w)
Am I missing something about fold_column?
Here is an example of how to use a custom fold column (created from a list). This is a modified version of the example Python code in the Stacked Ensemble page in the H2O User Guide.
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator
from h2o.grid.grid_search import H2OGridSearch
from __future__ import print_function
h2o.init()
# Import a sample binary outcome training set into H2O
train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
# Identify predictors and response
x = train.columns
y = "response"
x.remove(y)
# For binary classification, response should be a factor
train[y] = train[y].asfactor()
# Add a fold column, generate from a list
# The list has 10 unique values, so there will be 10 folds
fold_list = list(range(10)) * 1000
train['fold_id'] = h2o.H2OFrame(fold_list)
# Train and cross-validate a GBM
my_gbm = H2OGradientBoostingEstimator(distribution="bernoulli",
ntrees=10,
keep_cross_validation_predictions=True,
seed=1)
my_gbm.train(x=x, y=y, training_frame=train, fold_column="fold_id")
# Train and cross-validate a RF
my_rf = H2ORandomForestEstimator(ntrees=50,
keep_cross_validation_predictions=True,
seed=1)
my_rf.train(x=x, y=y, training_frame=train, fold_column="fold_id")
# Train a stacked ensemble using the GBM and RF above
ensemble = H2OStackedEnsembleEstimator(base_models=[my_gbm, my_rf])
ensemble.train(x=x, y=y, training_frame=train)
To answer your second question about how to view the cross-validated predictions in a model. They are stored in two places, however, the method that you probably want to use is: .cross_validation_holdout_predictions() This method returns a single H2OFrame of the cross-validated predictions, in the original order of the training observations:
In [11]: my_gbm.cross_validation_holdout_predictions()
Out[11]:
predict p0 p1
--------- -------- --------
1 0.323155 0.676845
1 0.248131 0.751869
1 0.288241 0.711759
1 0.407768 0.592232
1 0.507294 0.492706
0 0.6417 0.3583
1 0.253329 0.746671
1 0.289916 0.710084
1 0.524328 0.475672
1 0.252006 0.747994
[10000 rows x 3 columns]
The second method, .cross_validation_predictions() is a list which stores the predictions from each fold in an H2OFrame that has the same number of rows as the original training frame, but the rows that are not active in that fold have a value of zero. This is not usually the format that people find most useful, so I'd recommend using the other method instead.
In [13]: type(my_gbm.cross_validation_predictions())
Out[13]: list
In [14]: len(my_gbm.cross_validation_predictions())
Out[14]: 10
In [15]: my_gbm.cross_validation_predictions()[0]
Out[15]:
predict p0 p1
--------- -------- --------
1 0.323155 0.676845
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
[10000 rows x 3 columns]

Resources