17.6.2.2 Algorithms (Cox Proportional Hazard Regression)

Let t_i\,\!,for i = 1, 2, ?, n, be the failure time or censored time for the ith observation with the vector of p covariates Z_j(j=1,2,\ldots ,p). It is assumed that the failure and censored mechanisms are independent. The hazard function, \lambda (z,t)\,\! , is the probability that an individual with covariates z fails at time t given that the individual survived up to time t. In the Cox proportional model is of the form:

\lambda (z,t)=\lambda _0(t)\exp (z^{T}\beta +\omega )\,\!

where \lambda _0\,\! is the base-line hazard function, an unspecified function of time, \beta \,\! is a vector of unknown parameters and \omega\,\! is a known offset.

Assuming there are ties in the failure time giving n_d < n\,\! distinct failure times, t_{(1)} < t_{(2)} < ?< t_{(nd)} , such that d_i\,\! individuals fail at t_{(i)}\,\! , it follows that the marginal likelihood for \beta is well approximated by:

L=\prod_{i=1}^{n_d}\frac{\exp (s_i^{T}\beta +\omega _i)}{[\sum_{l\in R(t_{(1)})}\exp (z_i^{T}\beta +\omega _i)]^{d_{i}}}

(1)

where s_i\,\! is the sum of covariate of individuals observed fail at t_{(i)}\,\! and R(t_{(i)})\,\! is the set of individuals at risk just prior to t_{(i)}\,\! , that is it is all the individuals that fail or censored at time t_{(i)} along with all individuals survived beyond the time t_{(i)}\,\! . The MLE (maximum likelihood estimates) of \beta\,\!, given by\hat \beta\,\!, are obtained by maximizing (1) using a Newton-Raphson iteration technique that includes step having and utilizes the first and second partial derivatives of (1) which are given by (2) and (3) below:

U_j(\beta )=\frac{\partial Ln(L)}{\partial \beta _j}=\sum_{i=1}^{n_d}[s_{ji}-d_i\alpha _{ji}(\beta )]=0

(2)

for j = 1, 2, ?, p, where s_{ji}\,\! is the jth element in the vector s_i\,\! and

\alpha _{ji}(\beta )=\frac{\sum_{l\in R(t_{(1)})}z_{jl}\exp (z_l^{T}\beta +\omega _l)}{\sum_{l\in R(t_{(1)})}\exp (z_l^{T}\beta +\omega _l)}

Similarly,

I_{hj}(\beta )=-\frac{\partial ^2Ln(L)}{\partial \beta _h\partial \beta _j}=\sum_{i=1}^{n_d}d_i\gamma _{hji}

(3)

where \gamma _{hji}=\frac{\sum_{l\in R(t_{(1)})}z_{hl}z_{jl}\exp (z_l^{T}\beta +\omega _l)}{\sum_{l\in R(t_{(1)})}\exp (z_l^{T}\beta +\omega _l)}-\alpha _{hi}(\beta )\alpha _{ji}(\beta ) h, j = 1, ? p.

U_j(\beta )\,\! is the jth component of a score vector I_{hi}(\beta )\,\! is the (h, j) element of the observed information matrix I(\beta )\,\! whose inverse I(\beta )^{-1}=I_{hi}(\beta )^{-1}\,\! gives the variance-covariance matrix of \beta\,\!.

It should be noted that if a covariate or a linear combination of covariates is monotonically increasing or decreasing with time ,then one or more of the \beta _j^{\prime }s will be infinite.

If \lambda _0(t)\,\! varies across \nu\,\! strata, where the number of individuals in the kth stratum is n_k\,\!, k = 1, ?, \nu\,\!, with n=\sum_{k=1}^\nu n_k , then rather than maximizing (1) to obtain \hat \beta\,\!, the following marginal likelihood is maximized:

L=\prod_{k=1}^\nu L_k

(4)

where L_k\,\! is the contribution to likelihood for the n_k\,\! observations in the kth stratum treated as a single sample in (1). When strata are concluded the covariate coefficients are constant across strata but there is a different base-line hazard function \lambda _0(t)\,\!.

The base-line survival function associated with a failure time t_{(i)}\,\! , is estimated as

exp(-\hat H(t_{(i)})) ,

where \hat H(t_{(i)})=\sum_{t(j)\leq t(i)}(\frac{d_i}{\sum_{l\in R(t_{(j)})}\exp (z_l^T\hat \beta +\omega _l)})

and d_i\,\! is the number of failures at time t_{(i)}\,\! . The residual of the lth observation is computed as:

r(t_l)=\hat H(t_l)\exp (-z_l^T\hat \beta +\omega _l)

where \hat H(t_l)=\hat H(t_{(i)}),t_{(i)}\leq t_l<t_{(i+1)} .

The deviance is defined as -2^*\,\! (logarithm of marginal likelihood). There are two ways to test whether individual covariates are significant: the differences between the deviances of nested models can be compared with appropriate \chi ^2\,\!-distribution; or, the asymptotic normality of the parameter estimates can be used to form z-test by dividing the estimates by their standard errors or the score function for the model under the null hypothesis can be used to form z-test.