# 17.5.4.2 Algorithms (Mann-Whitney Test)

Consider two independent samples drawn from populations with distribution functions $F(x)\,$ and $G(y)\,$, of sizes $n_1\,\!$ and $n_2\,\!$. The sample data are denoted $x_1,x_2,\ldots ,x_{n_1}\,\!$ and $y_1,y_2,\ldots ,y_{n_2}\,\!$, respectively.

The null hypothesis, $H_0: F(x) = G(y)\,$, is that the two distributions are the same. It is tested against one of the following alternative hypotheses $H_1\,$: $H_1: F(x) \neq G(y)\,$ (two-sided); or $H_1: F(x) < G(y)\,\!$, meaning the $x\,$'s tend to be greater than the $y\,$'s; or $H_1: F(x) > G(y)\,\!$, meaning the $x\,$'s tend to be less than the $y\,$'s.

The test procedure includes the following steps:

• Combine the $x_i \,\!$ and $y_j\,\!$ into a single group.
• Rank them in ascending order. Tied values receive the average of their ranks. Let $r_{1i}\,\!$ be the rank assigned to $x_i \,\!$, for $i=1,2,\ldots ,n_1$, and let $r_{2j}\,\!$ be the rank assigned to $y_j\,\!$, for $j=1,2,\ldots ,n_2$.
• Calculate the rank sums: $S_1=\sum_{i=1}^{n_1}r_{1i}\,\!$ and $S_2=\sum_{j=1}^{n_2}r_{2j}\,\!$.
• The test statistic $U\,$ is defined as follows: $U=S_1-\frac{n_1(n_1+1)}2\,$
• The approximate Normal test statistic $z\,$ is calculated as: $z=\frac{U-M(U)\pm \frac 12}{\sqrt{Var(U)}} \,$
where the $\pm \frac 12\,$ term is a continuity correction, $M(U)=\frac{n_1n_2}2 \,$
and $Var(U)=\frac{n_1n_2(n_1+n_2+1)}{12}-\frac{n_1n_2}{(n_1+n_2)(n_1+n_2-1)}\times TS \,$
where $TS=\sum_{j=1}^\tau \frac{(t_j)(t_j-1)(t_j+1)}{12}\,$. $\tau \,$ is the number of groups of tied values in the combined sample and $t_j\,$ is the number of tied values in the $j$th group.
Note that if no ties are present, $TS = 0\,$ and the variance of $U \,$ reduces to $\frac{n_1n_2(n_1+n_2+1)}{12}\,$.
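
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the software: the function names `average_ranks` and `mann_whitney_u_z` are hypothetical, and the sign of the continuity correction (chosen here to shrink $|U-M(U)|$ by $\frac 12$, with no correction when $U=M(U)$) is an assumption, since the formula above only shows $\pm \frac 12$.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_u_z(x, y):
    """Return (U, z, Var(U)) following the formulas in this section."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)

    s1 = sum(ranks[:n1])              # rank sum of the first sample
    u = s1 - n1 * (n1 + 1) / 2        # test statistic U
    mean_u = n1 * n2 / 2              # M(U)

    # TS: sum over tied groups of t(t-1)(t+1)/12; groups of size 1 add 0.
    ts = sum(t * (t - 1) * (t + 1) / 12 for t in Counter(combined).values())
    var_u = n1 * n2 * (n + 1) / 12 - n1 * n2 / (n * (n - 1)) * ts

    # Assumed convention: the correction moves U toward M(U) by 1/2.
    diff = u - mean_u
    correction = -0.5 if diff > 0 else (0.5 if diff < 0 else 0.0)
    z = (diff + correction) / sqrt(var_u)  # fails if every value is tied
    return u, z, var_u


u, z, var_u = mann_whitney_u_z([1, 2, 3], [4, 5, 6])
print(u, z, var_u)
```

For the samples `[1, 2, 3]` and `[4, 5, 6]` the first sample takes ranks 1, 2 and 3, so $S_1=6$ and $U=0$, with $M(U)=4.5$ and $Var(U)=5.25$. With ties, for example `[1, 2, 2]` against `[2, 3, 4]`, the three values equal to 2 each receive rank 3 and the variance is reduced by the $TS\,$ term.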

For more details on the algorithm, please refer to nag_mann_whitney (g08amc).