How to calculate Schoenfeld residuals from competing risk regression using a Fine-and-Gray model? - survival-analysis

I am trying to calculate Schoenfeld residuals from a competing-risks regression using a Fine-and-Gray model. The model works fine, but I can't find a way to calculate the Schoenfeld residuals. The model looks like this:
fg.multi <- FGR(formula = Hist(FUdur30, cause.mort) ~ emp + AGE_JR + CCMTOTAL + CLIN_SEV_SEPS + def.focus + Bacteraemiaclass, data = total.pop, cause = 2)
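For reference, a hedged sketch of one way to get them (not from the original post): FGR() in riskRegression is a formula interface to cmprsk::crr(), and a crr() fit stores Schoenfeld-type residual contributions, one row per unique failure time of the event of interest, in its res component. The sketch below refits the same model with crr() directly, assuming the data set and variable names from the call above, no missing values, and censoring coded as 0 (crr's default).

library(cmprsk)

# design matrix for the covariates (drop the intercept column)
covs <- model.matrix(~ emp + AGE_JR + CCMTOTAL + CLIN_SEV_SEPS +
                       def.focus + Bacteraemiaclass, data = total.pop)[, -1]

fg.crr <- crr(ftime    = total.pop$FUdur30,
              fstatus  = total.pop$cause.mort,
              cov1     = covs,
              failcode = 2)          # cause = 2, as in the FGR() call

sch <- fg.crr$res      # Schoenfeld-type residuals: rows = unique failure times, cols = covariates
uft <- fg.crr$uftime   # the corresponding failure times

# e.g. plot the residuals of the first covariate against time to inspect
# the proportional subdistribution hazards assumption
plot(uft, sch[, 1], xlab = "Time", ylab = "Schoenfeld-type residual")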

Related

What is the output of predict.coxph() using type = "survival"?

I am trying to learn what the various outputs of predict.coxph() mean. I am currently attempting to fit a Cox model on a training set and then use the resulting coefficients to make predictions on a test set (a new set of data).
I see from the predict.coxph() help page that I could use type = "survival" to extract an individual's survival probability, which is equal to exp(-expected).
Here is a code block of what I have attempted so far, using the ISLR2 BrainCancer data.
library(survival)
library(ISLR2)  # provides the BrainCancer data

set.seed(123)
n.training = round(nrow(BrainCancer) * 0.70) # 70:30 split
idx = sample(1:nrow(BrainCancer), size = n.training)
d.training = BrainCancer[idx, ]
d.test = BrainCancer[-idx, ]
# fit a model using the training set
fit = coxph(Surv(time, status) ~ sex + diagnosis + loc + ki + gtv + stereo, data = d.training)
# get predicted survival probabilities for the test set
pred = predict(fit, type = "survival", newdata = d.test)
The predictions generated:
predict(fit, type = "survival", newdata = d.test)
[1] 0.9828659 0.8381164 0.9564982 0.2271862 0.2883800 0.9883625 0.9480138 0.9917512 1.0000000 0.9974775 0.7703657 0.9252100 0.9975044 0.9326234 0.8718161 0.9850815 0.9545622 0.4381646 0.8236644
[20] 0.2455676 0.7289031 0.9063336 0.9126897 0.9988625 0.4399697 0.9360874
Are these survival probabilities associated with a specific time point? From the help page, it sounds like these are survival probabilities at the follow-up times in the newdata argument. Is this correct?
Additional questions:
How is the baseline hazard estimated in predict.coxph? Is it using the Breslow estimator?
If type = "expected" is used, are these values the cumulative hazard? If yes, what are the relevant time points for these?
Thank you!
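For reference, a hedged sketch (not part of the original question) using the fit, pred, and d.test objects defined above. It checks the exp(-expected) relationship the question quotes from the help page, and shows the survfit() route for survival probabilities at explicit time points, whereas predict(type = "survival") evaluates each subject's curve at that subject's own follow-up time in newdata. The time points passed to summary() are arbitrary examples.

pred_expected <- predict(fit, type = "expected", newdata = d.test)
all.equal(unname(exp(-pred_expected)), unname(pred))   # should be TRUE

# full predicted survival curves for the test subjects, from which
# probabilities at chosen times can be read off
sf <- survfit(fit, newdata = d.test)
summary(sf, times = c(10, 20, 40))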

Running multiple inferences in parallel with PyTorch

I'm trying to implement Double DQN (not to be confused with DQN with a slightly delayed Q-target network) in PyTorch to train an agent to play an Atari OpenAI Gym game. Here I discuss the implementation of the following formula:
[Image: update formula for the Q-network, taken from Sutton & Barto; it corresponds to the target rewards + GAMMA * Q_2(s', argmax_a Q_1(s', a)) computed in the code below.]
My first implementation is:
Q_pred = self.Q_1.forward(s_now)[T.arange(batch_size), actions.long()]  # Q_1(s, a) for the actions taken
Q_next_all = self.Q_1.forward(s_next)                                   # Q_1(s', .), used only to select the argmax action
maxA_id = T.argmax(Q_next_all, dim=1)
Q_pred2 = self.Q_2.forward(s_next)[T.arange(batch_size), maxA_id]       # Q_2(s', argmax_a Q_1(s', a))
Q_target = (rewards + (~dones) * self.GAMMA * Q_pred2).detach()         # bootstrap target, zeroed for terminal states
self.Q_1.optimizer.zero_grad()
self.Q_1.loss(Q_target, Q_pred).backward()
self.Q_1.optimizer.step()
(Q_1 and Q_2 are nn.Module classes, and all of the variables involved here are already torch tensors residing on the GPU.)
I noticed that my program ran much slower than a previous implementation which used plain DQN.
I realized that I can combine the two batches entering Q_1, so that a single combined batch is forwarded through the network instead of two batches in sequence. The code becomes:
s_combined = T.cat((s_now, s_next))
Q_combined = self.Q_1.forward(s_combined)
Q_pred = Q_combined[T.arange(batch_size), actions.long()]
Q_next_all = Q_combined[batch_size:]
Q_pred2_all = self.Q_2.forward(s_next)
maxA_id = T.argmax(Q_next_all, dim=1)
Q_pred2 = Q_pred2_all[T.arange(batch_size), maxA_id]
Q_target = (rewards + (~dones) * self.GAMMA * Q_pred2).detach()
self.Q_1.optimizer.zero_grad()
self.Q_1.loss(Q_target, Q_pred).backward()
self.Q_1.optimizer.step()
(This proves that I understand how to do batch training in PyTorch, so don't mark this as a duplicate of this question.)
Furthermore, I realized that Q_1 and Q_2 can process their batches in parallel. So I looked up how to do multiprocessing in PyTorch. Unfortunately, I couldn't find a good example. I tried to adapt code from an example that looked similar to my scenario, and my code became:
def spawned():
    s_combined = T.cat((s_now, s_next))
    Q_combined = self.Q_1.forward(s_combined)
    Q_pred = Q_combined[T.arange(batch_size), actions.long()]
    Q_next_all = Q_combined[batch_size:]
mp.set_start_method('spawn', force=True)
p = mp.Process(target=spawned)
p.start()
Q_pred2_all = self.Q_2.forward(s_next)
p.join()
maxA_id = T.argmax(Q_next_all, dim=1)
Q_pred2 = Q_pred2_all[T.arange(batch_size), maxA_id]
Q_target = (rewards + (~dones) * self.GAMMA * Q_pred2).detach()
self.Q_1.optimizer.zero_grad()
self.Q_1.loss(Q_target, Q_pred).backward()
self.Q_1.optimizer.step()
This crashes with the error message:
AttributeError: Can't pickle local object 'Agent.learn.<locals>.spawned'
So how do I make this work?
(Achieving this in CUDA programming is trivial: one simply launches two device kernels from sequential host code, and the two kernels can execute concurrently on the GPU.)
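For what it's worth, a hedged sketch of a stream-based analogue of that CUDA approach, and not the only possible fix: the AttributeError comes from the 'spawn' start method trying to pickle a function defined inside a method, which is not possible. An alternative that avoids multiprocessing altogether is to launch the two forward passes on separate CUDA streams and let their kernels overlap on the GPU. The names Q_1, Q_2, s_now, s_next, actions, and batch_size are taken from the snippets above.

import torch as T   # the snippets above already alias torch as T

stream_1 = T.cuda.Stream()
stream_2 = T.cuda.Stream()

T.cuda.synchronize()                       # make sure the input tensors are ready
with T.cuda.stream(stream_1):
    s_combined = T.cat((s_now, s_next))
    Q_combined = self.Q_1.forward(s_combined)
with T.cuda.stream(stream_2):
    Q_pred2_all = self.Q_2.forward(s_next)
T.cuda.synchronize()                       # wait for both streams to finish

Q_pred = Q_combined[T.arange(batch_size), actions.long()]
Q_next_all = Q_combined[batch_size:]
maxA_id = T.argmax(Q_next_all, dim=1)
Q_pred2 = Q_pred2_all[T.arange(batch_size), maxA_id]

Whether this actually speeds things up depends on how much each forward pass already saturates the GPU; if one network's batch fills the device, the kernels will largely serialize anyway.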

QuantLib-python pricing barrier option using Heston model

I have recently started exploring the QuantLib option pricing libraries for Python and have come across an error that I don't understand. Basically, I am trying to price an up-and-out barrier option using the Heston model. The code below has been taken from examples found online and adapted to my specific case. The problem is that when I run it I get an error that I believe is triggered at the last line, i.e. the european_option.NPV() call:
*** RuntimeError: wrong argument type
Can someone please explain what I am doing wrong?
# option inputs
maturity_date = ql.Date(30, 6, 2020)
spot_price = 969.74
strike_price = 1000
volatility = 0.20
dividend_rate = 0.0
option_type = ql.Option.Call
risk_free_rate = 0.0016
day_count = ql.Actual365Fixed()
calculation_date = ql.Date(26, 6, 2020)
ql.Settings.instance().evaluationDate = calculation_date
# construct the option payoff
european_option = ql.BarrierOption(ql.Barrier.UpOut, Barrier, Rebate,
                                   ql.PlainVanillaPayoff(option_type, strike_price),
                                   ql.EuropeanExercise(maturity_date))
# set the Heston parameters
v0 = volatility*volatility # spot variance
kappa = 0.1
theta = v0
hsigma = 0.1
rho = -0.75
spot_handle = ql.QuoteHandle(ql.SimpleQuote(spot_price))
# construct the Heston process
flat_ts = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date,
                                                     risk_free_rate, day_count))
dividend_yield = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date,
                                                            dividend_rate, day_count))
heston_process = ql.HestonProcess(flat_ts, dividend_yield,
                                  spot_handle, v0, kappa,
                                  theta, hsigma, rho)
# run the pricing engine
engine = ql.AnalyticHestonEngine(ql.HestonModel(heston_process),0.01, 1000)
european_option.setPricingEngine(engine)
h_price = european_option.NPV()
The problem is that the AnalyticHestonEngine is not able to price barrier options.
Check https://www.quantlib.org/reference/group__barrierengines.html for the list of barrier-option pricing engines.
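For reference, a hedged sketch using one of the finite-difference engines from that list, reusing the objects defined in the question. barrier_level and rebate below are illustrative stand-ins for the undefined Barrier and Rebate variables, and the engine's default grid settings are kept.

barrier_level = 1100.0   # hypothetical up-and-out barrier level
rebate = 0.0             # hypothetical rebate

barrier_option = ql.BarrierOption(ql.Barrier.UpOut, barrier_level, rebate,
                                  ql.PlainVanillaPayoff(option_type, strike_price),
                                  ql.EuropeanExercise(maturity_date))

heston_model = ql.HestonModel(heston_process)
fd_engine = ql.FdHestonBarrierEngine(heston_model)   # finite-difference Heston barrier engine
barrier_option.setPricingEngine(fd_engine)
print(barrier_option.NPV())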

Joining multiple Keras models

I'm trying to replicate the network described in this link: https://arxiv.org/pdf/1806.07492.pdf. I've been able to replicate the patch networks of the model so far, so now I have 9 models of (32,32,3) input size each.
However, the next step is to merge those 9 models into a single one to obtain a single (96,96,3) input network. The idea is that of the 27 neurons of the first layer, the first 3 correspond to the pretrained model of the first patch, and so on. Here is an image of how it should result (each colored row is a different model):
[Image: the final concatenated model]
In this image, each row before the fully-connected layer represents a previously trained model with the following characteristics:
[Image: architecture of a single (32,32,3) patch model]
As you can see, this network analyzes images of (32,32,3). However, the complete model needs to follow this structure that uses a (96,96,3) model:
[Image: architecture of the full (96,96,3) model]
I already have the (32,32,3) models in their respective hdf5 file, but I do not know how to merge them into a single model to obtain the one of size (96,96,3). I tried using the concatenate function of Keras, but I got an error that said that my inputs needed to be tensors.
Here is the code that I used:
in1 = Input(shape=(32,32,3))
model_patch1 = load_model('patch1.hdf5')
in2 = Input(shape=(32,32,3))
model_patch2 = load_model('patch2.hdf5')
in3 = Input(shape=(32,32,3))
model_patch3 = load_model('patch3.hdf5')
in4 = Input(shape=(32,32,3))
model_patch4 = load_model('patch4.hdf5')
in5 = Input(shape=(32,32,3))
model_patch5 = load_model('patch5.hdf5')
in6 = Input(shape=(32,32,3))
model_patch6 = load_model('patch6.hdf5')
in7 = Input(shape=(32,32,3))
model_patch7 = load_model('patch7.hdf5')
in8 = Input(shape=(32,32,3))
model_patch8 = load_model('patch8.hdf5')
in9 = Input(shape=(32,32,3))
model_patch9 = load_model('patch9.hdf5')
model_final_concat = Concatenate(axis=-1)([model_patch1, model_patch2,
                                           model_patch3, model_patch4, model_patch5, model_patch6,
                                           model_patch7, model_patch8, model_patch9])
model_final_dense_1 = Dense(1, activation='sigmoid')(model_final_concat)
lsCnn_faces = Model(inputs=[in1,in2,in3,in4,in5,in6,in7,in8,in9],
                    outputs=model_final_dense_1)
lsCnn_faces.summary()
Any help will be very much appreciated.
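For reference, a hedged sketch of the usual fix (not part of the original question): Concatenate operates on tensors, so each loaded model must first be called on its Input tensor, and the resulting output tensors are what get concatenated. The imports assume tf.keras; the file names and the final Dense layer follow the question's code.

from tensorflow.keras.layers import Input, Concatenate, Dense
from tensorflow.keras.models import Model, load_model

inputs, patch_outputs = [], []
for i in range(1, 10):
    inp = Input(shape=(32, 32, 3), name='patch_input_%d' % i)
    patch_model = load_model('patch%d.hdf5' % i)   # the pretrained (32,32,3) model
    inputs.append(inp)
    patch_outputs.append(patch_model(inp))         # calling the model yields an output tensor

merged = Concatenate(axis=-1)(patch_outputs)       # now the inputs are tensors, not Model objects
prediction = Dense(1, activation='sigmoid')(merged)
lsCnn_faces = Model(inputs=inputs, outputs=prediction)
lsCnn_faces.summary()

If the nine saved models reuse layer names, Keras may complain about duplicate names; renaming each loaded model (or its layers) before building the combined graph is a common workaround.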

Using multiple self-defined metrics in LightGBM

LightGBM lets us use a self-defined metric and call it during training through the feval parameter.
For built-in metrics, we can list several of them in the parameter dict, e.g. metric: ['l1', 'l2'].
My question is: how do I evaluate several self-defined metrics at the same time? Passing feval=(my_metric1, my_metric2) does not give me the result.
params = {}
params['learning_rate'] = 0.003
params['boosting_type'] = 'goss'
params['objective'] = 'multiclassova'
params['metric'] = ['multi_error', 'multi_logloss']
params['sub_feature'] = 0.8
params['num_leaves'] = 15
params['min_data'] = 600
params['tree_learner'] = 'voting'
params['bagging_freq'] = 3
params['num_class'] = 3
params['max_depth'] = -1
params['max_bin'] = 512
params['verbose'] = -1
params['is_unbalance'] = True
evals_result = {}
aa = lgb.train(params,
               d_train,
               valid_sets=[d_train, d_dev],
               evals_result=evals_result,
               num_boost_round=4500,
               feature_name=f_names,
               verbose_eval=10,
               categorical_feature=f_names,
               learning_rates=lambda iter: (1 / (1 + decay_rate * iter)) * params['learning_rate'])
Let's discuss the code I share here. d_train is my training set and d_dev is my validation set (I have a separate test set). evals_result will record our multi_error and multi_logloss per iteration as a list. verbose_eval = 10 makes LightGBM print the multi_error and multi_logloss of both the training set and the validation set every 10 iterations. If you want to plot multi_error and multi_logloss as a graph:
lgb.plot_metric(evals_result, metric='multi_error')
plt.show()
lgb.plot_metric(evals_result, metric='multi_logloss')
plt.show()
You can find other useful functions in the LightGBM documentation. If you can't find what you need there, a simple trick is to look at the XGBoost documentation, which often covers the equivalent concept. If something is still missing, please do not hesitate to ask.
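For completeness, a hedged sketch of two ways to evaluate several custom metrics at once; which one works depends on your LightGBM version. my_metric1 and my_metric2 are hypothetical stand-ins for the question's functions (their bodies are placeholders), and d_train, d_dev, and params are the objects defined above. Each custom metric must return a tuple (eval_name, eval_result, is_higher_better).

import lightgbm as lgb

def my_metric1(preds, eval_data):
    y_true = eval_data.get_label()
    value = 0.0   # placeholder: compute your first metric from preds and y_true here
    return 'my_metric1', value, False   # False = lower is better

def my_metric2(preds, eval_data):
    y_true = eval_data.get_label()
    value = 0.0   # placeholder: compute your second metric from preds and y_true here
    return 'my_metric2', value, False

# Option 1 (recent LightGBM versions): pass a list of callables to feval.
bst = lgb.train(params, d_train, valid_sets=[d_train, d_dev],
                feval=[my_metric1, my_metric2])

# Option 2: wrap both metrics in one function that returns a list of result tuples.
def combined_metrics(preds, eval_data):
    return [my_metric1(preds, eval_data), my_metric2(preds, eval_data)]

bst = lgb.train(params, d_train, valid_sets=[d_train, d_dev],
                feval=combined_metrics)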
