17.10.2 Algorithms (ROC Curve)

1 ROC Values
2 The area under the ROC curve
3 The SE of the area under the ROC curve statistic
4 The asymptotic confidence interval of the area under the ROC curve
5 The asymptotic P-value under the null hypothesis that $\theta=0.5\,\!$ vs. the alternative hypothesis that $\theta \neq 0.5\,\!$
6 Optimal Cut-Point Value

The following notation is used in this section.

$x_i\,\!$ : Test result score for case $i\,\!$

$n_{TP}\,\!$ : Number of true positive decisions

$n_{FN}\,\!$ : Number of false negative decisions

$n_{TN}\,\!$ : Number of true negative decisions

$n_{FP}\,\!$ : Number of false positive decisions

$n_{-}\,\!$ : Number of cases with negative actual state

$n_{+}\,\!$ : Number of cases with positive actual state

$n_{-=j}\,\!$ : Number of cases with negative actual state and test result equal to $j\,\!$

$n_{+>j}\,\!$ : Number of cases with positive actual state and test result greater than $j\,\!$

$n_{+=j}\,\!$ : Number of cases with positive actual state and test result equal to $j\,\!$

$n_{-<j}\,\!$ : Number of cases with negative actual state and test result less than $j\,\!$

ROC Values

1- Specificity (X): $1-\frac{n_{TN}}{n_{TN}+n_{FP}}\,\!$

Sensitivity (Y): $\frac{n_{TP}}{n_{TP}+n_{FN}}\,\!$
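As an illustration, the two ROC coordinates for a single cut-off can be computed directly from the four decision counts. This is a minimal sketch; the function name `roc_point` and the example counts are hypothetical and chosen only for demonstration.

```python
def roc_point(n_tp, n_fn, n_tn, n_fp):
    """Return (x, y) = (1 - specificity, sensitivity) for one cut-off."""
    sensitivity = n_tp / (n_tp + n_fn)   # Y coordinate
    specificity = n_tn / (n_tn + n_fp)
    return 1.0 - specificity, sensitivity  # X coordinate, Y coordinate


# Hypothetical counts: 50 positive cases and 50 negative cases.
x, y = roc_point(n_tp=40, n_fn=10, n_tn=45, n_fp=5)
print(round(x, 6), y)  # 0.1 0.8
```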

The area under the ROC curve

Let $x\,\!$ be the scale of the test result variable. Denote by $x_{-}\,\!$ the $x\,\!$ values for cases with negative actual states and by $x_{+}\,\!$ the values for cases with positive actual states. Then the nonparametric approximation of the "true" area under the ROC curve, $\theta \,\!$, is

$A_Z=\frac{1}{n_{+}n_{-}}\sum_{j=1}^{n_{-}}\sum _{i=1}^{n_{+}}\Psi (x_{+},x_{-})\,\!$

where $n_{+}\,\!$ is the sample size of $D_{+}\,\!$ (the cases with positive actual state), $n_{-}\,\!$ is the sample size of $D_{-}\,\!$ (the cases with negative actual state), and

$\Psi (x_{+},x_{-})=\,\!$ $\begin{cases} 1, & \mbox{if }x_{+}>x_{-} \\ 0.5, & \mbox{if }x_{+}=x_{-} \\ 0, & \mbox{if }x_{+}<x_{-} \end{cases}$

Note that $A_Z\,\!$ is the observed area under the ROC curve when successive points are connected by straight lines, i.e., the area given by the trapezoidal rule.
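The pairwise definition can be sketched in a few lines. The names `psi` and `auc_pairwise` and the sample scores are hypothetical; the code simply averages $\Psi\,\!$ over every (positive, negative) pair, which is quadratic in the sample size and intended for illustration rather than for large data sets.

```python
def psi(x_pos, x_neg):
    """Kernel Psi: 1 if the positive case scores higher, 0.5 on a tie, else 0."""
    if x_pos > x_neg:
        return 1.0
    if x_pos == x_neg:
        return 0.5
    return 0.0


def auc_pairwise(pos_scores, neg_scores):
    """Average Psi over all n_+ * n_- (positive, negative) pairs."""
    total = sum(psi(xp, xn) for xn in neg_scores for xp in pos_scores)
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical test-result scores.
pos = [3, 4, 4, 5]   # cases with positive actual state
neg = [1, 2, 3, 4]   # cases with negative actual state
print(auc_pairwise(pos, neg))  # 0.84375
```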

An alternative way to compute $A_Z\,\!$ is as follows:

$A_Z=\frac{1}{n_{+}n_{-}}\sum_j \left\{ n_{-=j}n_{+>j}+\frac{n_{-=j}n_{+=j}}2\right\} \,\!$

where the sum runs over the distinct test result values $j\,\!$.
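The count-based form can be sketched as below. The sketch assumes the sum is normalized by the product $n_{+}n_{-}\,\!$, which is what makes it agree with the pairwise definition of $A_Z\,\!$ given earlier; the function name and sample scores are hypothetical.

```python
from collections import Counter


def auc_by_counts(pos_scores, neg_scores):
    """A_Z from tie counts: sum over distinct negative-case values j."""
    pos_counts = Counter(pos_scores)
    neg_counts = Counter(neg_scores)
    total = 0.0
    for j, n_neg_eq in neg_counts.items():
        # Positive cases scoring strictly above j, and exactly at j.
        n_pos_gt = sum(c for value, c in pos_counts.items() if value > j)
        n_pos_eq = pos_counts.get(j, 0)
        total += n_neg_eq * n_pos_gt + n_neg_eq * n_pos_eq / 2
    # Assumption: normalize by the product n_+ * n_-.
    return total / (len(pos_scores) * len(neg_scores))


print(auc_by_counts([3, 4, 4, 5], [1, 2, 3, 4]))  # 0.84375
```

Values of $j\,\!$ that occur only among positive cases contribute nothing to the sum because $n_{-=j}=0\,\!$ there, so iterating over the distinct negative-case values is sufficient.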

The SE of the area under the ROC curve statistic

The standard error of $A_Z\,\!$ is estimated by:

$SE(A_Z)=\sqrt{\frac{A_Z(1-A_Z)+(n_{+}-1)(Q_1-A_Z^2)+(n_{-}-1)(Q_2-A_Z^2)}{n_{+}n_{-}}} \,\!$

where

$Q_1=\frac 1{n_{-}n_{+}^2}\sum_j n_{-=j}\left[ n_{+>j}^2+n_{+>j}n_{+=j}+\frac{n_{+=j}^2}3\right] \,\!$

and

$Q_2=\frac 1{n_{-}^2n_{+}}\sum_j n_{+=j}\left[ n_{-<j}^2+n_{-<j}n_{-=j}+\frac{n_{-=j}^2}3\right] \,\!$
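The following sketch computes $A_Z\,\!$, $Q_1\,\!$, $Q_2\,\!$ and the standard error from raw scores. It rests on two stated assumptions about the formulas: the last term of $Q_1\,\!$ is taken as $n_{+=j}^2/3\,\!$ (mirroring the $n_{-=j}^2/3\,\!$ term of $Q_2\,\!$), and $Q_2\,\!$ counts negative cases below $j\,\!$, i.e. $n_{-<j}\,\!$ as defined in the notation list. The function name `auc_and_se` is hypothetical.

```python
import math
from collections import Counter


def auc_and_se(pos_scores, neg_scores):
    """Return (A_Z, SE(A_Z)) using the Q1/Q2 count formulas."""
    pos_counts, neg_counts = Counter(pos_scores), Counter(neg_scores)
    n_pos, n_neg = len(pos_scores), len(neg_scores)

    auc_sum = q1_sum = q2_sum = 0.0
    for j, n_neg_eq in neg_counts.items():
        n_pos_gt = sum(c for v, c in pos_counts.items() if v > j)
        n_pos_eq = pos_counts.get(j, 0)
        auc_sum += n_neg_eq * (n_pos_gt + n_pos_eq / 2)
        # Assumption: last term is n_{+=j}^2 / 3.
        q1_sum += n_neg_eq * (n_pos_gt**2 + n_pos_gt * n_pos_eq + n_pos_eq**2 / 3)
    for j, n_pos_eq in pos_counts.items():
        # Assumption: Q2 uses negatives strictly below j (n_{-<j}).
        n_neg_lt = sum(c for v, c in neg_counts.items() if v < j)
        n_neg_eq = neg_counts.get(j, 0)
        q2_sum += n_pos_eq * (n_neg_lt**2 + n_neg_lt * n_neg_eq + n_neg_eq**2 / 3)

    auc = auc_sum / (n_pos * n_neg)
    q1 = q1_sum / (n_neg * n_pos**2)
    q2 = q2_sum / (n_neg**2 * n_pos)
    variance = (auc * (1 - auc)
                + (n_pos - 1) * (q1 - auc**2)
                + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    # Guard against a tiny negative value caused by rounding.
    return auc, math.sqrt(max(variance, 0.0))


auc, se = auc_and_se([3, 4, 4, 5], [1, 2, 3, 4])
print(auc, se)
```

As a sanity check, perfectly separated scores give $A_Z=1\,\!$ and $Q_1=Q_2=1\,\!$, so the estimated standard error is zero.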

The asymptotic confidence interval of the area under the ROC curve

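A 2-sided asymptotic $c\%=(100-\alpha )\%\,\!$ confidence interval for the true area under the ROC curve is

$A_Z\pm z_{\alpha /200}\,SE(A_Z)\,\!$

where $z_{\alpha /200}\,\!$ is the upper $\alpha /200\,\!$ critical value of the standard normal distribution (for example, $z_{0.025}\approx 1.96\,\!$ for a 95% interval, where $\alpha =5\,\!$).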
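A minimal sketch of the interval computation follows. It assumes the usual asymptotic-normal construction, in which the standard error is multiplied by the standard normal critical value for the chosen confidence level; the function name and the input values are hypothetical. `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist


def auc_confidence_interval(auc, se, confidence=0.95):
    """Two-sided asymptotic interval: auc +/- z * se."""
    # Assumption: z is the upper (1 - confidence) / 2 normal critical value.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return auc - z * se, auc + z * se


# Hypothetical area and standard error.
lower, upper = auc_confidence_interval(0.80, 0.05)
print(round(lower, 3), round(upper, 3))  # 0.702 0.898
```

The sketch does not clip the limits to the range [0, 1]; whether to do so is a presentation choice left to the caller.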

The asymptotic P-value under the null hypothesis that $\theta=0.5\ \,\!$ vs. the alternative hypothesis that $\theta \neq 0.5\ \,\!$

Since $A_Z\,\!$ is asymptotically normal under the null hypothesis that $\theta=0.5\ \,\!$, the asymptotic P-value for testing it against the alternative hypothesis that $\theta \neq 0.5\ \,\!$ is:

$P\left( \left| Z\right| >\left| \frac{A_Z-0.5}{SD(A_Z)|_{\theta =0.5}}\right| \right) =2P\left( Z>\left| \frac{A_Z-0.5}{SD(A_Z)\mid _{\theta =0.5}}\right| \right)$

In the nonparametric case,

$SD(A_Z)|_{\theta =0.5}=\sqrt{\frac{\theta (1-\theta )+(n_{+}-1)(Q_1-\theta ^2)+(n_{-}-1)(Q_2-\theta ^2)}{n_{+}n_{-}}}|_{\theta =0.5}\,\!$

$=\sqrt{\frac{0.5(1-0.5)+(n_{+}-1)(\frac 13-0.5^2)+(n_{-}-1)(\frac 13-0.5^2)}{n_{+}n_{-}}}$
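The numerator above simplifies: $0.25+(n_{+}+n_{-}-2)/12=(n_{+}+n_{-}+1)/12\,\!$, so the null standard deviation is $\sqrt{(n_{+}+n_{-}+1)/(12n_{+}n_{-})}\,\!$. The sketch below uses that simplified form to compute the two-sided P-value; the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def auc_p_value(auc, n_pos, n_neg):
    """Two-sided asymptotic P-value for H0: theta = 0.5."""
    # SD of A_Z under the null, with Q1 = Q2 = 1/3 and theta = 0.5.
    sd_null = math.sqrt((n_pos + n_neg + 1) / (12 * n_pos * n_neg))
    z = (auc - 0.5) / sd_null
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: observed area 0.9 with 20 cases in each group.
print(auc_p_value(0.9, 20, 20))
```

An observed area of exactly 0.5 gives a P-value of 1, and the result is symmetric about 0.5 because only the absolute value of the test statistic is used.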

Optimal Cut-Point Value

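The optimal cut-point is the test result value at which sensitivity and specificity are closest to equal (the SpEqualSe criterion). In terms of the ROC coordinates, with $x\,\!$ = 1 - Specificity and $y\,\!$ = Sensitivity, it is the point on the ROC curve that minimizes $\left| 1-x-y\right| \,\!$, i.e. min( abs(1-x-y) ).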
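The selection rule can be sketched as a single minimization over the ROC points. The tuple layout `(cutoff, x, y)`, the function name, and the sample points are hypothetical; when several points tie, this sketch returns the first one encountered.

```python
def optimal_cut_point(roc_points):
    """Pick the (cutoff, x, y) tuple that minimizes abs(1 - x - y).

    x is 1 - specificity and y is sensitivity, so abs(1 - x - y)
    equals abs(specificity - sensitivity).
    """
    return min(roc_points, key=lambda p: abs(1 - p[1] - p[2]))


# Hypothetical ROC points: (cutoff, 1 - specificity, sensitivity).
points = [
    (1, 1.0, 1.0),
    (2, 0.6, 0.9),
    (3, 0.2, 0.8),
    (4, 0.1, 0.5),
    (5, 0.0, 0.0),
]
cutoff, x, y = optimal_cut_point(points)
print(cutoff)  # 3
```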


© OriginLab Corporation. All rights reserved.