When the cause of death is unknown, the most common way to estimate cancer-related survival is net survival. Its estimation assumes that the observed hazard λobs is the sum of the known background mortality hazard in the general population λpop (obtained from national statistical institutes such as INSEE in France) and the excess hazard due to cancer λexc. For an individual i, this relation can be expressed as:
λobs(ti|zi) = λpop(ti + ai|zpop,i) + λexc(ti|zi), where ai is the age at diagnosis and Zpop ⊂ Z are the variables of the mortality table and of the model, respectively.
The cumulative observed hazard can be written as Λobs(t|z) = Λpop(t + a|z) + Λexc(t|z), and the net survival is obtained as follows: Sn(t|z) = exp(−Λexc(t|z))
The TNEH model is a relatively recent excess hazard model developed by Boussari et al. $\\$
Its particular feature is that it estimates, together with the usual parameters of an excess hazard model, a quantity that classical models only provide by post-estimation: the time after which the excess hazard becomes null, i.e. the cure point.
The proposed excess hazard can be expressed as follows:
$$ \lambda_{exc}(t|z;\theta) = \left(\dfrac{t}{\tau(z;\tau*)}\right)^{\alpha(z;\alpha*)-1} \left(1 - \dfrac{t}{\tau(z;\tau*)}\right)^{\beta-1} 1_{\left\{0 \le t \le \tau(z;\tau*)\right\}} $$
where : $\\$
τ(z; τ*) > 0 is the time to cure (the time to null excess hazard); it depends on the covariates z and on the vector of parameters τ*. $\\$
α(z; α*) > 0 and β > 1 are shape parameters. With β > 1, the excess hazard is forced to be null and continuous at τ(z; τ*). $\\$
The vector of parameters to be estimated is θ = (α*, β, τ*). The corresponding cumulative excess hazard is:
$$ \Lambda_{exc}(t|z;\theta) = \tau(z;\tau*) B \left( \alpha(z;\alpha*), \beta \right) F_{Be} \left( \dfrac{t}{\tau(z;\tau*)} ; \alpha(z;\alpha*) , \beta \right) $$
where
B is the beta function $\\$ FBe is the cumulative distribution function (cdf) of the beta distribution
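As a minimal sketch of the two formulas above, the excess hazard and its closed-form cumulative version can be written in base R and checked against numerical integration. The function names `tneh_hazard` and `tneh_cumhazard` are hypothetical helpers introduced for illustration (they are not part of curesurv), and the parameter values are arbitrary.

```r
# TNEH excess hazard: (t/tau)^(a-1) * (1 - t/tau)^(b-1) on [0, tau], 0 elsewhere.
# a and b play the roles of alpha and beta in the text.
tneh_hazard <- function(t, a, b, tau) {
  u <- pmin(pmax(t / tau, 0), 1)            # clamp so the powers stay well defined
  inside <- as.numeric(t >= 0 & t <= tau)   # indicator 1{0 <= t <= tau}
  inside * u^(a - 1) * (1 - u)^(b - 1)
}

# Closed form: tau * B(a, b) * F_Be(t / tau; a, b)
tneh_cumhazard <- function(t, a, b, tau) {
  tau * beta(a, b) * pbeta(t / tau, a, b)
}

# Arbitrary illustrative values (assumptions, not estimates)
a <- 2; b <- 4; tau <- 5; t0 <- 2

# Numerical integral of the hazard from 0 to t0
numeric_cumhaz <- integrate(function(s) tneh_hazard(s, a, b, tau), 0, t0)$value
closed_cumhaz  <- tneh_cumhazard(t0, a, b, tau)

c(numeric = numeric_cumhaz, closed_form = closed_cumhaz)
```

The two values should agree up to the tolerance of `integrate()`, which follows from the change of variable u = s/τ in the integral of the hazard.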
$$ S_n(t|z) = \exp(-\Lambda_{exc}(t|z)) = \exp\left(-\tau(z;\tau*) B \left( \alpha(z;\alpha*), \beta \right) F_{Be} \left( \dfrac{t}{\tau(z;\tau*)} ; \alpha(z;\alpha*),\beta \right)\right) $$
The cure fraction corresponds to the net survival at t = τ in the TNEH model. It can be expressed as:
π(z|θ) = exp (−Λexc(τ(z; τ*)|z)) = exp (−τ(z; τ*)B(α(z; α*), β))
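The cure fraction formula can be evaluated directly with base R. The sketch below plugs in the point estimates reported by the no-covariate fit in this vignette (alpha0 = 2.1841, beta = 4.4413, tau0 = 5.1018); the helper names `net_surv` and `cure_fraction` are hypothetical and only serve the illustration.

```r
# Net survival under TNEH: exp(-tau * B(a, b) * F_Be(t / tau; a, b))
net_surv <- function(t, a, b, tau) {
  exp(-tau * beta(a, b) * pbeta(t / tau, a, b))
}

# Cure fraction: net survival evaluated at t = tau, i.e. exp(-tau * B(a, b))
cure_fraction <- function(a, b, tau) {
  exp(-tau * beta(a, b))
}

# Point estimates copied from the no-covariate summary() output
a_hat <- 2.1841; b_hat <- 4.4413; tau_hat <- 5.1018

pi_hat <- cure_fraction(a_hat, b_hat, tau_hat)
round(pi_hat, 4)

# Net survival is flat after tau and equals the cure fraction there
net_surv(c(1, 3, tau_hat, 8), a_hat, b_hat, tau_hat)
```

The rounded value should be close to the cured proportion printed by `summary()` for that model (0.8474).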
The probability P(t) of being cured at a given time t after diagnosis, given that the patient is still alive at time t, can be expressed as follows:
$$ P(t|z) = \dfrac{\pi(z|\theta)}{S_n(t|z)} = \exp \left( \tau(z;\tau*) \left( B \left( \dfrac{t}{\tau(z;\tau*)} ; \alpha(z;\alpha*) , \beta \right) - B(\alpha(z;\alpha*), \beta) \right) \right) $$
where B(x; α, β) = B(α, β) FBe(x; α, β) is the incomplete beta function. $\\$
The confidence intervals of P(t|z) can be obtained using the delta method. Applying this method requires the partial derivatives of P(t|z) with respect to the parameters of the model, which can be written as:
$$ \dfrac{\partial P(t|z)}{\partial \theta} = \dfrac{1}{S_n(t|z)^2} \left( \dfrac{\partial \pi(z|\theta)}{\partial \theta} S_n(t|z) - \dfrac{\partial S_n(t|z)}{\partial \theta} \pi(z|\theta) \right) $$
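The quotient-rule expression above can be checked numerically. The sketch below is an illustration in base R, not the package's implementation: it uses central finite differences (the hypothetical helper `num_grad`) in place of analytical derivatives, and the no-covariate point estimates from this vignette as θ.

```r
# theta = (alpha, beta, tau), no covariates
net_surv <- function(t, theta) {
  a <- theta[1]; b <- theta[2]; tau <- theta[3]
  exp(-tau * beta(a, b) * pbeta(t / tau, a, b))
}
cure_frac <- function(theta) exp(-theta[3] * beta(theta[1], theta[2]))
p_cure    <- function(t, theta) cure_frac(theta) / net_surv(t, theta)

# Central finite-difference gradient of a scalar function f at theta
num_grad <- function(f, theta, h = 1e-5) {
  sapply(seq_along(theta), function(j) {
    e <- replace(numeric(length(theta)), j, h)
    (f(theta + e) - f(theta - e)) / (2 * h)
  })
}

theta_hat <- c(2.1841, 4.4413, 5.1018)  # (alpha0, beta, tau0) from the summary output
t0  <- 2                                # arbitrary time point (assumption)
S   <- net_surv(t0, theta_hat)
pi0 <- cure_frac(theta_hat)

# Gradient of P(t) computed directly ...
grad_direct <- num_grad(function(th) p_cure(t0, th), theta_hat)

# ... and through the quotient rule (dpi * S - dS * pi) / S^2
grad_quotient <- (num_grad(cure_frac, theta_hat) * S -
                  num_grad(function(th) net_surv(t0, th), theta_hat) * pi0) / S^2

rbind(grad_direct, grad_quotient)
```

Both rows should coincide up to finite-difference error, and `p_cure(theta_hat[3], theta_hat)` equals 1 because the net survival reaches the cure fraction at t = τ.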
There are no covariates acting on parameters τ (τ = τ0) and α (α = α0)
fit_ad_tneh_nocov <- curesurv(Surv(time_obs, event) ~ 1,
pophaz = "ehazard",
cumpophaz = "cumehazard",
model = "nmixture", dist = "tneh",
link_tau = "linear",
data = testiscancer,
method_opt = "L-BFGS-B")
summary(fit_ad_tneh_nocov)
#> Call:
#> curesurv(formula = Surv(time_obs, event) ~ 1, data = testiscancer,
#> pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture",
#> dist = "tneh", link_tau = "linear", method_opt = "L-BFGS-B")
#>
#> coef se(coef) z p
#> alpha0 2.1841 0.1032 21.166 <2e-16
#> beta 4.4413 0.5178 8.577 <2e-16
#> tau0 5.1018 0.5397 9.452 <2e-16
#>
#> Estimates and their 95% CI after back-transformation
#> estimates LCI UCI
#> alpha0 2.184 1.982 2.386
#> beta 4.441 3.426 5.456
#> tau0 5.102 4.044 6.160
#>
#> Cured proportion exp[-ζ0* B((α0+α*Z)β)]
#> For the reference individual (each Z at 0)
#> [1] 0.8474
#>
#> log-likelihood: -2633.903 (for 3 degree(s) of freedom)
#> AIC: 5273.806
#>
#> n= 2000 , number of events= 949
We can see that the time to null excess hazard τ = τ0 is estimated at 5.102 years, and the cure proportion is estimated at 84.74%.
We predict for varying time after diagnosis
The confidence intervals at 1 − α level for the cure fraction π can be written as:
$$ \left[\hat{\pi} \pm z_{1 - \alpha / 2} \sqrt{Var(\hat{\pi})}\right] $$
where
$$ Var(\hat{\pi}) = \dfrac{\partial \hat{\pi}}{\partial \theta} Var(\theta) \left(\dfrac{\partial \hat{\pi}}{\partial \theta}\right)^T $$
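The delta-method computation above can be sketched in a few lines of base R. This is an illustration only, not the interval computed by the package: the full covariance matrix of the estimates is not shown in this vignette, so the sketch ASSUMES zero covariances and builds Var(θ) from the reported standard errors alone. The helper `num_grad` is hypothetical, and the resulting interval should not be read as a result of the analysis.

```r
# Cure fraction for theta = (alpha, beta, tau), no covariates
cure_frac <- function(theta) exp(-theta[3] * beta(theta[1], theta[2]))

# Central finite-difference gradient (stand-in for analytical derivatives)
num_grad <- function(f, theta, h = 1e-5) {
  sapply(seq_along(theta), function(j) {
    e <- replace(numeric(length(theta)), j, h)
    (f(theta + e) - f(theta - e)) / (2 * h)
  })
}

theta_hat <- c(2.1841, 4.4413, 5.1018)  # estimates from the no-covariate summary
se_hat    <- c(0.1032, 0.5178, 0.5397)  # their reported standard errors

# ASSUMPTION: off-diagonal covariances set to 0 for illustration only
V <- diag(se_hat^2)

pi_hat <- cure_frac(theta_hat)
grad   <- num_grad(cure_frac, theta_hat)
var_pi <- as.numeric(t(grad) %*% V %*% grad)   # gradient * Var(theta) * gradient^T

level <- 0.95
z     <- qnorm(1 - (1 - level) / 2)
ci_pi <- pi_hat + c(-1, 1) * z * sqrt(var_pi)
ci_pi
```

With a real fit, V would be replaced by the estimated variance-covariance matrix of the parameters, including its off-diagonal terms.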
We search for the time t = TTCi from which Pi(t) ≥ 1 − ϵ. The threshold 1 − ϵ can be fixed to 0.95 (i.e. ϵ = 0.05).
The variance of TTC can be approximated with the delta method as:
$$ Var(TTC) = Var(g(\theta;z_i)) \simeq \left(\dfrac{\partial P(t|z_i;\theta)}{\partial t}_{|t = TTC}\right)^{-2} Var(P(t|z_i;\theta))_{|t=TTC} $$
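A minimal sketch of the TTC search, assuming the no-covariate point estimates from this vignette and ϵ = 0.05: `uniroot()` solves P(t) = 1 − ϵ on (0, τ), and a finite difference gives the slope ∂P/∂t at t = TTC that enters the variance formula. The helper `var_ttc` is hypothetical and takes Var(P) as an input, since that variance is not reproduced here.

```r
# P(t) for theta = (alpha, beta, tau): exp(tau * B(a, b) * (F_Be(t / tau) - 1))
p_cure <- function(t, theta) {
  a <- theta[1]; b <- theta[2]; tau <- theta[3]
  exp(tau * beta(a, b) * (pbeta(t / tau, a, b) - 1))
}

theta_hat <- c(2.1841, 4.4413, 5.1018)  # (alpha0, beta, tau0) from the summary output
eps <- 0.05                             # so that the target is P(t) = 0.95

# P(0) equals the cure fraction (below 0.95 here) and P(tau) = 1,
# so the root is bracketed by (0, tau)
ttc <- uniroot(function(t) p_cure(t, theta_hat) - (1 - eps),
               interval = c(0, theta_hat[3]), tol = 1e-10)$root

# Slope of P at t = TTC by central finite difference
h <- 1e-5
slope <- (p_cure(ttc + h, theta_hat) - p_cure(ttc - h, theta_hat)) / (2 * h)

# Delta-method variance of TTC for a given Var(P(t)) evaluated at t = TTC
var_ttc <- function(var_p) var_p / slope^2

c(ttc = ttc, slope = slope)
```

Since P(t) = π / Sn(t), the slope also equals P(t) × λexc(t), which gives an analytical alternative to the finite difference.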
We create a new age variable, age_crmin: age reduced (divided by its standard deviation) and “centered” around the age of the youngest patient
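The construction of this variable is not shown here; the sketch below illustrates it on a toy vector, assuming the definition implied by the back-transformation used later for plotting (age_crmin × sd(age) + min(age)). The ages are hypothetical values standing in for `testiscancer$age`.

```r
# Hypothetical ages at diagnosis, standing in for testiscancer$age
age <- c(20, 35, 50, 65, 80)

# Shift by the youngest age and scale by the standard deviation:
# the youngest patient gets 0, and one unit corresponds to one sd of age
age_crmin <- (age - min(age)) / sd(age)

# Back-transformation to years
age_back <- age_crmin * sd(age) + min(age)

data.frame(age, age_crmin, age_back)
```

On the vignette data, the analogous assignment would presumably be `testiscancer$age_crmin <- (testiscancer$age - min(testiscancer$age)) / sd(testiscancer$age)`.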
This model has the variable age_crmin acting on both α and τ (α = α0 + Zage_crmin × α1 and τ = τ0 + Zage_crmin × τ1)
fit_m1_ad_tneh <- curesurv(Surv(time_obs, event) ~ z_tau(age_crmin) +
z_alpha(age_crmin),
pophaz = "ehazard",
cumpophaz = "cumehazard",
model = "nmixture", dist = "tneh",
link_tau = "linear",
data = testiscancer,
method_opt = "L-BFGS-B")
summary(fit_m1_ad_tneh)
#> Call:
#> curesurv(formula = Surv(time_obs, event) ~ z_tau(age_crmin) +
#> z_alpha(age_crmin), data = testiscancer, pophaz = "ehazard",
#> cumpophaz = "cumehazard", model = "nmixture", dist = "tneh",
#> link_tau = "linear", method_opt = "L-BFGS-B")
#>
#> coef se(coef) z p
#> alpha0 2.87676 0.24110 11.932 < 2e-16
#> alpha_1_(age_crmin) -0.50110 0.07506 -6.676 2.46e-11
#> beta 5.15288 1.04648 4.924 8.48e-07
#> tau0 3.25984 0.55516 5.872 4.31e-09
#> tau_1_(age_crmin) 3.46629 1.24242 2.790 0.00527
#>
#> Estimates and their 95% CI after back-transformation
#> estimates LCI UCI
#> alpha0 2.877 2.404 3.349
#> alpha_1_(age_crmin) 2.376 2.229 2.523
#> beta 5.153 3.102 7.204
#> tau0 3.260 2.172 4.348
#> tau_1_(age_crmin) 6.726 4.291 9.161
#>
#> Cured proportion exp[-(ζ0+ζ*Z)* B((α0+α*Z)β)] and its 95% CI
#> For the reference individual (each Z at 0)
#> [1] 0.9675
#>
#> log-likelihood: -2544.1 (for 5 degree(s) of freedom)
#> AIC: 5098.2
#>
#> n= 2000 , number of events= 949
For the reference individual (the patient with age_crmin = 0, i.e. the youngest patient, approximately 20 years old at diagnosis), the cure proportion is estimated at 96.75% and the time to null excess hazard at 3.260 years. For an individual whose age_crmin is 1 (one standard deviation older than the youngest patient, i.e. approximately 39 years old at diagnosis), the time to null excess hazard is 6.726 years.
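The figures quoted above follow from the linear predictors τ = τ0 + Zage_crmin × τ1 and α = α0 + Zage_crmin × α1. A short sketch using the coefficients printed by `summary(fit_m1_ad_tneh)`; the helper names are hypothetical, and the minimum age (20.02) and standard deviation of age (18.96) are the values quoted elsewhere in this vignette.

```r
# Coefficients copied from the summary() output of fit_m1_ad_tneh
coefs <- c(alpha0 = 2.87676, alpha1 = -0.50110,
           tau0   = 3.25984, tau1   =  3.46629)

# Linear predictors evaluated at a given value z of age_crmin
tau_at   <- function(z) unname(coefs["tau0"]   + z * coefs["tau1"])
alpha_at <- function(z) unname(coefs["alpha0"] + z * coefs["alpha1"])

# Age in years corresponding to age_crmin = z
age_at <- function(z, age_min = 20.02, age_sd = 18.96) age_min + z * age_sd

z <- c(0, 1)
data.frame(age_crmin = z,
           age   = age_at(z),
           alpha = round(alpha_at(z), 3),
           tau   = round(tau_at(z), 3))
```

The rows for age_crmin = 0 and 1 reproduce the "estimates" column of the back-transformed table (2.877 and 2.376 for α, 3.260 and 6.726 for τ).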
With varying time for patient of mean age
#time varying prediction for patient with age mean
newdata1 <- with(testiscancer,
expand.grid(event = 0,
age_crmin = mean(age_crmin),
time_obs = seq(0.001,10,0.1)))
pred_agemean <- predict(object = fit_m1_ad_tneh, newdata = newdata1)
With varying time for oldest patient
#time varying prediction for patient with age max
newdata2 <- with(testiscancer,
expand.grid(event = 0,
age_crmin = max(age_crmin),
time_obs = seq(0.001,10,0.1)))
pred_agemax <- predict(object = fit_m1_ad_tneh, newdata = newdata2)
At 2 years after diagnosis for patients of different ages
# predictions at time 2 years with varying age
newdata3 <- with(testiscancer,
expand.grid(event = 0,
age_crmin = seq(min(testiscancer$age_crmin),
max(testiscancer$age_crmin), 0.1),
time_obs = 2))
pred_age_val <- predict(object = fit_m1_ad_tneh, newdata = newdata3)
val_age <- seq(min(testiscancer$age_crmin),
max(testiscancer$age_crmin),
0.1) * sd(testiscancer$age) + min(testiscancer$age)
oldpar <- par(no.readonly = TRUE) # save graphical settings so par(oldpar) can restore them
par(mfrow = c(2, 2),cex = 1.0)
plot(pred_agemax$time,pred_agemax$ex_haz,type = "l",lty = 1,lwd = 2,
xlab = "Time since diagnosis", ylab = "excess hazard")
lines(pred_agemean$time,pred_agemean$ex_haz,type = "l",lty = 2,lwd = 2)
legend("topright",horiz = FALSE,
legend = c("hE(t) age.max = 79.9", "hE(t) age.mean = 50.8"),
col = c("black", "black"),lty = c(1, 2))
grid()
plot(pred_agemax$time,pred_agemax$netsurv,type = "l",lty = 1,lwd = 2,
ylim = c(0, 1),xlab = "Time since diagnosis",ylab = "net survival")
lines(pred_agemean$time,pred_agemean$netsurv,type = "l",lty = 2,lwd = 2)
legend("bottomleft",horiz = FALSE,
legend = c("Sn(t) age.max = 79.9", "Sn(t) age.mean = 50.8"),
col = c("black", "black"),lty = c(1, 2))
grid()
plot(pred_agemax$time,pred_agemax$pt_cure,type = "l",lty = 1,lwd = 2,ylim = c(0, 1),
xlab = "Time since diagnosis",ylab = "probability of being cured P(t)")
lines(pred_agemean$time,pred_agemean$pt_cure,type = "l",lty = 2,lwd = 2)
abline(v = pred_agemean$tau[1],lty = 2,lwd = 2,col = "blue")
abline(v = pred_agemean$time_to_cure_ttc[1],lty = 2,lwd = 2,col = "red")
abline(v = pred_agemax$tau[1],lty = 1,lwd = 2,col = "blue")
abline(v = pred_agemax$time_to_cure_ttc[1],lty = 1,lwd = 2,col = "red")
grid()
legend("bottomright",horiz = FALSE,
legend = c("P(t) age.max = 79.9","P(t) age.mean = 50.8",
"TNEH age.max = 79.9","TTC age.max = 79.9",
"TNEH age.mean = 50.8","TTC age.mean = 50.8"),
col = c("black", "black", "blue", "red", "blue", "red"),
lty = c(1, 2, 1, 1, 2, 2))
par(oldpar)
par(mfrow=c(2,2))
plot(val_age,pred_age_val$ex_haz,
type = "l",lty=1, lwd=2,
xlab = "age",ylab = "excess hazard 2y after diagnosis")
grid()
plot(val_age,pred_age_val$netsurv,
type = "l", lty=1,lwd=2,ylim=c(0,1),
xlab = "age", ylab = "net survival 2y after diagnosis")
grid()
plot(val_age,pred_age_val$pt_cure, type = "l", lty=1, lwd=2,ylim=c(0,1),
xlab = "age",ylab = "P(t) 2y after diagnosis")
grid()
plot(val_age,pred_age_val$cured, type = "l", lty=1, lwd=2,ylim=c(0,1),
xlab = "age", ylab = "cure proportion")
grid()
Effects of centered and reduced age on τ (τ = τ0 + Zage_cr × τ1, α = α0)
fit_ad_tneh_covtau <- curesurv(
Surv(time_obs, event) ~ z_tau(age_cr),
pophaz = "ehazard",
cumpophaz = "cumehazard",
model = "nmixture",
dist = "tneh",
link_tau = "linear",
data = testiscancer,
method_opt = "L-BFGS-B"
)
#> Warning in diag(varcov_star): NAs introduced by coercion
#> Warning in diag(varcov): NAs introduced by coercion
#> Warning in diag(varcov_star): NAs introduced by coercion
#> Warning in diag(varcov): NAs introduced by coercion
#> Warning in par(oldpar): calling par(new=TRUE) with no plot
#> Warning in diag(varcov_star): NAs introduced by coercion
#> Warning in diag(varcov): NAs introduced by coercion
#> Warning in par(oldpar): calling par(new=TRUE) with no plot
#> Warning in diag(varcov_star): NAs introduced by coercion
#> Warning in diag(varcov): NAs introduced by coercion
#> Warning in par(oldpar): calling par(new=TRUE) with no plot
#> Warning in par(oldpar): calling par(new=TRUE) with no plot
summary(fit_ad_tneh_covtau)
#> Call:
#> curesurv(formula = Surv(time_obs, event) ~ z_tau(age_cr), data = testiscancer,
#> pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture",
#> dist = "tneh", link_tau = "linear", method_opt = "L-BFGS-B")
#>
#> coef se(coef) z p
#> alpha0 1.9753 0.1299 15.206 < 2e-16
#> beta 5.3066 1.1286 4.702 2.58e-06
#> tau0 7.4380 1.8109 4.107 4.00e-05
#> tau_1_(age_cr) 2.3159 0.8529 2.715 0.00662
#>
#> Estimates and their 95% CI after back-transformation
#> estimates LCI UCI
#> alpha0 1.975 1.721 2.230
#> beta 5.307 3.095 7.519
#> tau0 7.438 3.889 10.987
#> tau_1_(age_cr) 9.754 8.082 11.426
#>
#> Cured proportion exp[-(ζ0+ζ*Z)* B((α0+α*Z)β)] and its 95% CI
#> For the reference individual (each Z at 0)
#> [1] 0.794
#>
#> log-likelihood: -2610.768 (for 4 degree(s) of freedom)
#> AIC: 5229.537
#>
#> n= 2000 , number of events= 949
This model estimates a cure proportion of 79.4% for individuals of mean age (51.05 years old at diagnosis), with a time to null excess hazard of 7.44 years. For an individual aged 70.01 at diagnosis (age_cr = 1), the time to null excess hazard is estimated at 9.754 years. In other words, the time to null excess hazard increases by 2.3159 years when age_cr increases by one unit (that is, when age at diagnosis increases by 18.96 years).
summary(testiscancer$age_cr)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -1.6362 -0.9358 0.0276 0.0000 0.9525 1.5240
summary(testiscancer$age)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 20.02 33.30 51.57 51.05 69.11 79.95
newdata2 <- with(testiscancer,
expand.grid(event = 0,
time_obs = seq(0.1, 10, 0.1),
age_cr = c(-0.9358, 0.0276, 0.9525) ))
newdata2_1stqu <- newdata2[newdata2$age_cr==-0.9358,]
newdata2_2rdqu <- newdata2[newdata2$age_cr==0.0276,]
newdata2_3rdqu <- newdata2[newdata2$age_cr==0.9525,]
p1stqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_1stqu)
p2rdqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_2rdqu)
p3rdqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_3rdqu)
Effect of age_cr on only α (α = α0 + Zage_cr * α1, τ = τ0)
fit_ad_tneh_covalpha <-
curesurv(
Surv(time_obs, event) ~ z_alpha(age_cr),
pophaz = "ehazard",
cumpophaz = "cumehazard",
model = "nmixture",
dist = "tneh",
link_tau = "linear",
data = testiscancer,
method_opt = "L-BFGS-B"
)
summary(fit_ad_tneh_covalpha)
#> Call:
#> curesurv(formula = Surv(time_obs, event) ~ z_alpha(age_cr), data = testiscancer,
#> pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture",
#> dist = "tneh", link_tau = "linear", method_opt = "L-BFGS-B")
#>
#> coef se(coef) z p
#> alpha0 2.06862 0.11152 18.550 < 2e-16
#> alpha_1_(age_cr) -0.46785 0.06331 -7.389 1.48e-13
#> beta 4.77703 0.81573 5.856 4.74e-09
#> tau0 6.09881 1.11797 5.455 4.89e-08
#>
#> Estimates and their 95% CI after back-transformation
#> estimates LCI UCI
#> alpha0 2.069 1.850 2.287
#> alpha_1_(age_cr) 1.601 1.477 1.725
#> beta 4.777 3.178 6.376
#> tau0 6.099 3.908 8.290
#>
#> Cured proportion exp[-ζ0* B((α0+α*Z)β)]
#> For the reference individual (each Z at 0)
#> [1] 0.8181
#>
#> log-likelihood: -2586.138 (for 4 degree(s) of freedom)
#> AIC: 5180.275
#>
#> n= 2000 , number of events= 949
The time to null excess hazard is estimated at 6.099 years for all individuals, and the cure proportion is estimated at 81.81% for patients of mean age (51.05 years old at diagnosis).
We fitted 4 models; to compare them we can use the AIC criterion
AIC(fit_ad_tneh_nocov,fit_ad_tneh_covalpha,fit_ad_tneh_covtau,fit_m1_ad_tneh)
#> fit_ad_tneh_nocov fit_ad_tneh_covalpha fit_ad_tneh_covtau fit_m1_ad_tneh
#> 1 5273.806 5180.275 5229.537 5098.2
The best model is the one with the lowest AIC: it is the one with an effect of age on both τ and α.
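The AIC values can be recomputed by hand from the log-likelihoods and degrees of freedom printed in each summary, using AIC = −2 × log-likelihood + 2 × (number of parameters). The sketch below copies those printed values; the short model labels are hypothetical names chosen for the illustration.

```r
# Log-likelihoods and numbers of parameters copied from the summary() outputs
loglik <- c(nocov = -2633.903, covalpha = -2586.138,
            covtau = -2610.768, both = -2544.1)
n_par  <- c(nocov = 3, covalpha = 4, covtau = 4, both = 5)

# AIC = -2 * logLik + 2 * k
aic <- -2 * loglik + 2 * n_par

# Difference to the best (lowest) AIC
delta_aic <- aic - min(aic)

data.frame(loglik, n_par, aic, delta_aic)
names(which.min(aic))
```

Because the log-likelihoods are rounded to three decimals in the printed output, the recomputed values can differ from those returned by `AIC()` in the last digit.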
date()
#> [1] "Sun Nov 17 05:11:16 2024"
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] curesurv_0.1.1 survival_3.7-0 stringr_1.5.1 rmarkdown_2.29
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.5 cli_3.6.3 knitr_1.49
#> [4] rlang_1.1.4 xfun_0.49 Formula_1.2-5
#> [7] stringi_1.8.4 jsonlite_1.8.9 glue_1.8.0
#> [10] buildtools_1.0.0 htmltools_0.5.8.1 maketools_1.3.1
#> [13] sys_3.4.3 sass_0.4.9 grid_4.4.2
#> [16] evaluate_1.0.1 jquerylib_0.1.4 fastmap_1.2.0
#> [19] numDeriv_2016.8-1.1 yaml_2.3.10 lifecycle_1.0.4
#> [22] compiler_4.4.2 rngWELL_0.10-10 randtoolbox_2.0.5
#> [25] lattice_0.22-6 digest_0.6.37 R6_2.5.1
#> [28] splines_4.4.2 magrittr_2.0.3 bslib_0.8.0
#> [31] Matrix_1.7-1 tools_4.4.2 cachem_1.1.0