Survival Analysis: Pseudo-Observations vs. Stratified Cox Regression. Which one is better?

I've been looking into Cox regression for survival analysis in churn prediction. Cox regression models the hazard rate, i.e. the instantaneous rate at which a subscriber who is still active at time $t$ unsubscribes:

$$ h(t \mid \boldsymbol{X}_i) = h_0(t)\exp\big( \boldsymbol{\beta}^T \boldsymbol{X}_i \big) $$

Where

  • $h_0(t)$: the baseline hazard, i.e. the hazard at time $t$ for a customer whose covariates are all 0.

  • $\boldsymbol{\beta} \in \mathbb{R}^D$: the coefficients. The exponential of each coefficient gives a hazard ratio, which is assumed to be constant w.r.t. time (the proportional hazards assumption).

  • $\boldsymbol{X}\in \mathbb{R}^{N\times D}$: the covariates of the $N$ sample customers, with row $\boldsymbol{X}_i$ holding the $D$ covariates of customer $i$.


Problem: the proportional hazards assumption. Cox regression assumes that the hazard ratios remain constant over time $t$. For example, for a covariate $X_1$ = "gender" (coded 1 for male, 0 for female), say $\exp(\beta_1)=1.8$. In plain English, this means that at any time $t$, male subscribers who are still subscribed leave the service at a rate $80\%$ higher than female subscribers. The assumption requires this $80\%$ to hold for every $t$.
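
To make the link between the coefficient and the hazard ratio explicit, here is a short derivation. It assumes $X_1$ is coded 1 for male and 0 for female and that all other covariates are equal for the two subscribers, so their terms cancel:

$$ \frac{h(t \mid X_1 = 1)}{h(t \mid X_1 = 0)} = \frac{h_0(t)\exp(\beta_1 \cdot 1)}{h_0(t)\exp(\beta_1 \cdot 0)} = \exp(\beta_1) $$

The baseline hazard cancels, so the ratio does not depend on $t$. A hazard ratio of $1.8$ therefore corresponds to $\beta_1 = \ln 1.8 \approx 0.59$.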

This is often an unreasonable constraint for many variables. However, there are methods that can incorporate variables that don't follow the proportional hazards assumption:

  • stratified Cox regression
  • pseudo-observations
  • Cox regression with time-dependent covariates
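
As a rough illustration of the pseudo-observation idea, the sketch below computes jackknife pseudo-values for the survival probability $S(t)$ from a Kaplan-Meier estimate. The function names and the toy data are hypothetical and only meant to show the mechanics; the resulting values would then serve as the response in an ordinary regression on the covariates (commonly a generalized estimating equation), which is not shown here.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) = P(T > t).

    times  : observed durations (time of churn or of censoring)
    events : 1 if churn was observed, 0 if the customer was censored
    """
    surv = 1.0
    # Walk through the distinct observed churn times up to t, in order.
    churn_times = {ti for ti, ei in zip(times, events) if ei == 1 and ti <= t}
    for u in sorted(churn_times):
        at_risk = sum(1 for ti in times if ti >= u)
        churned_at_u = sum(
            1 for ti, ei in zip(times, events) if ti == u and ei == 1
        )
        surv *= 1.0 - churned_at_u / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for S(t), one value per customer."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    pseudo = []
    for i in range(n):
        # Leave customer i out and re-estimate S(t).
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        loo = kaplan_meier(loo_times, loo_events, t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo


# Hypothetical toy data: tenure in months and churn indicator (0 = censored).
tenure = [2, 3, 3, 5, 8, 9, 12, 12]
churned = [1, 1, 0, 1, 0, 1, 1, 0]
pseudo_at_6 = pseudo_observations(tenure, churned, t=6)
```

Without censoring, each pseudo-value reduces to the indicator that the customer survived past $t$; with censoring, it acts as a stand-in for that unobserved indicator. Because the regression is fitted separately at chosen time points, no proportional hazards assumption is needed for the covariates.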

I was just reading up on stratified Cox regression. The apparent downsides are:

  • Variables used for stratification need to be categorical, so continuous variables must be binned first.
  • The stratification variables should not have too many levels, since each stratum gets its own baseline hazard to estimate (the coefficients are shared across strata).
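
For reference, the stratified model is usually written as follows, with $s$ indexing the strata (the symbols $h_{0s}$ and $S$ are introduced here for illustration):

$$ h_s(t \mid \boldsymbol{X}_i) = h_{0s}(t)\exp\big( \boldsymbol{\beta}^T \boldsymbol{X}_i \big), \quad s = 1, \dots, S $$

Each stratum has its own baseline hazard $h_{0s}(t)$ while $\boldsymbol{\beta}$ is shared. The stratification variable therefore no longer has to satisfy proportional hazards, but no coefficient is estimated for it.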

Question: Are pseudo-observations similar? Do they have less or more rigid constraints? And how well do they perform, considering I have copious amounts of data?



I suggest using a model with more relaxed assumptions about the proportionality of hazards. In my work I use a piecewise constant hazard model, which works wonderfully. It assumes that the hazards are proportional within each time interval. It allows numerical covariates to be modeled with splines, and it supports time-dependent covariates. Moreover, in my experience the model is usually very well calibrated and does not overfit much.
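
A minimal sketch of the idea, assuming a baseline hazard that is constant within each interval and a single coefficient vector shared across intervals (a simplification; the coefficients could also differ per interval). The cut points, rates and coefficients below are made-up illustrative values, not fitted ones.

```python
import math


def cumulative_hazard(t, cut_points, rates):
    """Integrate a piecewise constant baseline hazard from 0 to t.

    cut_points : increasing interval boundaries, e.g. [3, 12] gives the
                 intervals [0, 3), [3, 12) and [12, inf)
    rates      : one baseline hazard per interval (len(cut_points) + 1)
    """
    edges = [0.0] + list(cut_points) + [math.inf]
    total = 0.0
    for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
        if t <= lo:
            break
        # Time spent in this interval before t, times the interval's hazard.
        total += rate * (min(t, hi) - lo)
    return total


def survival(t, x, beta, cut_points, rates):
    """S(t | x) = exp(-H0(t) * exp(beta . x))."""
    risk = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    return math.exp(-cumulative_hazard(t, cut_points, rates) * risk)


# Hypothetical values: monthly churn hazard drops after months 3 and 12.
cut_points = [3, 12]
rates = [0.10, 0.05, 0.02]
beta = [0.5]  # assumed effect of one binary covariate

p_stay_5_months = survival(5, [1], beta, cut_points, rates)
```

In practice the rates and coefficients are estimated from data, for example by Poisson regression on exposure split at the cut points.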
