CoxPH model with Frailty and L1 regularization

This question stems from an approach proposed by Dr. Silverman in "Predicting Horse Race Winners Through a Regularized Conditional Logistic Regression with Frailty".

In this paper, he proposes a modified Cox proportional hazards model that includes a frailty parameter, taken from Muriel Gillick's article "Guest Editorial: Pinning Down Frailty".

The log-likelihood with frailty has the form:

Where:

$X^{w}_{rh}$ are the characteristics of the horse that won race $r$

$\beta$ are the parameters to be estimated

$w^{w}_{rh}$ is the frailty indicator of the horse that won race $r$

$X_{rh}$ are the characteristics of horse $h$ in race $r$

$w_{rh}$ are the frailty indicators of the horses in race $r$
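For reference, here is a reconstruction of that log-likelihood from the symbols listed above. It assumes the standard conditional-logit (stratified Cox partial likelihood) form with the frailty entering additively in the linear predictor. The index $R$ for the number of races and the name $\ell(\beta)$ are introduced here for illustration, so treat this as a sketch rather than the paper's exact notation:

$$
\ell(\beta) = \sum_{r=1}^{R} \left[ X^{w}_{rh}\beta + w^{w}_{rh} - \log \sum_{h \in r} \exp\left( X_{rh}\beta + w_{rh} \right) \right]
$$

Each race contributes the winner's linear predictor minus the log of the summed exponentiated predictors of every horse in that race.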

The frailty log-likelihood with L1 regularization is:

Where:

$\lambda$ is the L1 regularization parameter

$\beta_j$ are the estimated parameters
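Under the same assumptions (additive frailty, standard conditional-logit form), the penalized version would subtract an L1 penalty on the coefficients. The symbols $R$, $p$ (number of coefficients) and $\ell_{\lambda}$ are illustrative names, not taken from the paper:

$$
\ell_{\lambda}(\beta) = \sum_{r=1}^{R} \left[ X^{w}_{rh}\beta + w^{w}_{rh} - \log \sum_{h \in r} \exp\left( X_{rh}\beta + w_{rh} \right) \right] - \lambda \sum_{j=1}^{p} \left| \beta_j \right|
$$

Larger values of $\lambda$ shrink more of the $\beta_j$ to exactly zero.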

I'm programming this in Python. The only relevant packages I know of are lifelines and rpy2.

I'm currently attempting to solve this with Python's lifelines package, as its API allows custom models.

However, if there is an R package that would allow me to use both L1 regularization and frailty, that would likely be easier.

My current attempt looks as follows:

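```python
import numpy as np
from lifelines import CoxPHFitter

# NOTE: `w` is not defined when the class is created, so listing it as a
# base class raises NameError; it only marks where the frailty should enter.
class FrailtyCPH(CoxPHFitter, w):

    _fitted_parameter_names = ['lambda_']

    def _cumulative_hazard(self, params, t, Xs, w):
        beta = params['lambda_']
        x = Xs['lambda_']  # same key as in _fitted_parameter_names
        # linear predictor plus frailty, exponentiated
        lambda_ = np.exp(np.dot(x, beta) + w)
        return lambda_
```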

However, I don't believe the above code will actually allow me to pass the frailty parameter $w$ to the algorithm.

Likewise, I'm not sure how to implement the L1 regularization.

My code above is adapted directly from the custom model page of the lifelines API reference, but I'm not sure it's implemented correctly at all. I know my function needs to return the cumulative hazard for the custom model, but I am not familiar with survival models or the Cox PH model.
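To make the target concrete, here is a minimal NumPy sketch of the L1-penalized conditional-logit log-likelihood described earlier, written independently of lifelines. The function name `penalized_frailty_loglik` and the argument layout are hypothetical, and it assumes the frailty `w` is a known per-row offset (in a full frailty model it would be estimated as a random effect). It only evaluates the objective; it does not fit anything.

```python
import numpy as np


def penalized_frailty_loglik(beta, X, w, race, won, lam):
    """L1-penalized conditional-logit log-likelihood with a frailty offset.

    beta : (p,) coefficients
    X    : (n, p) horse characteristics, one row per horse per race
    w    : (n,) frailty value per row (assumed known here)
    race : (n,) race identifier per row
    won  : (n,) 1 for the winner of the race, 0 otherwise
    lam  : L1 penalty weight (lambda)
    """
    eta = X @ beta + w  # linear predictor plus frailty
    total = 0.0
    for r in np.unique(race):
        in_race = race == r
        eta_r = eta[in_race]
        # log of the sum of exp(eta) over the race, shifted for stability
        m = eta_r.max()
        log_denom = m + np.log(np.exp(eta_r - m).sum())
        # winner's predictor minus the race-level normalizer
        total += eta_r[won[in_race] == 1].sum() - log_denom
    # subtract the L1 penalty on the coefficients
    return total - lam * np.abs(beta).sum()


# Toy data (made up for illustration): two races with 3 and 2 horses.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2], [0.1, 0.9]])
w = np.zeros(5)
race = np.array([0, 0, 0, 1, 1])
won = np.array([1, 0, 0, 0, 1])

value_at_zero = penalized_frailty_loglik(np.zeros(2), X, w, race, won, lam=0.5)
print(value_at_zero)
```

With all coefficients and frailties at zero, every horse in a race is equally likely to win, so each race contributes minus the log of its field size. A general-purpose optimizer would struggle with the non-smooth penalty, so a proximal or coordinate-descent method is the usual choice for actually maximizing this.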

The alternative I have considered is using R's `coxph` or `coxme`, but I'm not sure how to format the inputs so that the race ID takes the place of the time variable and the probability of winning is conditional on the other horses in race $r$.

If anyone knows of a preexisting package that would allow me to introduce both frailty and L1 regularization, or the proper formatting for the lifelines custom model API, that would be greatly appreciated.

Thanks in advance!
