For a set of $n$ observations classified by two variables, with $r$ and $c$ levels respectively, a two-way table of frequencies with $r$ rows and $c$ columns can be computed.
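To measure the association between the two classification variables, two statistics can be used: the Pearson $\chi^2$ statistic,

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - f_{ij})^2}{f_{ij}},$$

and the likelihood ratio test statistic,

$$G = 2 \sum_{i=1}^{r} \sum_{j=1}^{c} n_{ij} \log\left(\frac{n_{ij}}{f_{ij}}\right),$$

where $n_{ij}$ is the observed frequency in row $i$ and column $j$, and $f_{ij}$ are the fitted values from the model that assumes the effects due to the classification variables are additive, i.e., there is no association. These values are the expected cell frequencies and are given by

$$f_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},$$

where $n_{i\cdot}$ and $n_{\cdot j}$ are the row and column totals.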
Under the hypothesis of no association between the two classification variables, both these statistics have, approximately, a $\chi^2$ distribution with $(r-1)(c-1)$ degrees of freedom. This distribution is arrived at under the assumption that the expected cell frequencies, $f_{ij}$, are not too small. For a discussion of this point see
Everitt (1977). He concludes by saying, '... in the majority of cases the chi-square criterion may be used for tables with expectations in excess of 0.5 in the smallest cell'.
In the case of the $2 \times 2$ table, i.e., $c = 2$ and $r = 2$, the $\chi^2$ approximation can be improved by using Yates' continuity correction factor. This decreases the absolute value of $(n_{ij} - f_{ij})$ by $1/2$. For $2 \times 2$ tables with a small value of $n$ the exact probabilities from Fisher's test are computed. These are based on the hypergeometric distribution and are computed using nag_hypergeom_dist (g01blc). A two-tail probability is computed as $\min(1, 2p_u, 2p_l)$, where $p_u$ and $p_l$ are the upper and lower one-tail probabilities from the hypergeometric distribution.
- 1: nrow – Integer, Input
  On entry: the number of rows in the contingency table, $r$.
  Constraint: nrow ≥ 2.
- 2: ncol – Integer, Input
  On entry: the number of columns in the contingency table, $c$.
  Constraint: ncol ≥ 2.
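- 3: nobst[nrow × tdt] – const Integer, Input
  On entry: the contingency table, nobst[(i-1)*tdt + j-1] must contain $n_{ij}$, for $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$.
  Constraint: nobst[(i-1)*tdt + j-1] ≥ 0, for $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$.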
- 4: tdt – Integer, Input
  On entry: the stride separating matrix column elements in the arrays nobst, expt, chist.
  Constraint: tdt ≥ ncol.
- 5: expt[nrow × tdt] – double, Output
  On exit: the table of expected values, expt[(i-1)*tdt + j-1] contains $f_{ij}$, for $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$.
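- 6: chist[nrow × tdt] – double, Output
  On exit: the table of $\chi^2$ contributions, chist[(i-1)*tdt + j-1] contains $(n_{ij} - f_{ij})^2 / f_{ij}$, for $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$.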
- 7: prob – double *, Output
  On exit: if $c = 2$, $r = 2$ and $n \le 40$ then prob contains the two-tail significance level for Fisher's exact test, otherwise prob contains the significance level from the Pearson $\chi^2$ statistic.
- 8: chi – double *, Output
  On exit: the Pearson $\chi^2$ statistic.
- 9: g – double *, Output
  On exit: the likelihood ratio test statistic.
- 10: df – double *, Output
  On exit: the degrees of freedom for the statistics.
- 11: fail – NagError *, Input/Output
  The NAG error argument (see Section 3.7 in How to Use the NAG Library and its Documentation).
For the accuracy of the probabilities for Fisher's exact test see 
nag_hypergeom_dist (g01blc).
Multidimensional contingency tables can be analysed using log-linear models fitted by 
nag_glm_binomial (g02gbc).
The data below, taken from 
Everitt (1977), is from 141 patients with brain tumours.  The row classification variable is the site of the tumour: frontal lobes, temporal lobes and other cerebral areas.  The column classification variable is the type of tumour: benign, malignant and other cerebral tumours.
The data is read in and the statistics computed and printed.