Given an input contingency table, fun.chisq.test()
offers three quantities to evaluate non-parametric functional dependency
of the column variable \(Y\) on the row
variable \(X\). They include the
functional chi-squared test statistic (\(\chi^2_f\)), statistical significance
(\(p\)-value), and effect size
(function index \(\xi_f\)).
We explain their differences in analogy to those statistics returned
from cor.test()
, the R function for the test of
correlation, and the \(t\)-test. We
chose both tests because they are widely used and well understood.
Another choice could be the Pearson’s chi-squared test plus a statistic
called Cramer’s V, analogous to correlation coefficient, but not as
popularly used. The table below summarizes the differences among the
quantities and their analogous counterparts in correlation and \(t\) tests.
Quantity | Measure functional dependency? | Affected by sample size? | Affected by table size? | Measure statistical significance? | Counterpart in correlation test | Counterpart in two-sample \(t\)-test |
---|---|---|---|---|---|---|
\(\chi^2_f\) | Yes | Yes | Yes | No | \(t\)-statistic | \(t\)-statistic |
\(p\)-value | Yes | Yes | Yes | Yes | \(p\)-value | \(p\)-value |
\(\xi_f\) | Yes | No | No | No | correlation coefficient | mean difference |
The test statistic \(\chi^2_f\)
measures deviation of \(Y\) from a
uniform distribution contributed by \(X\). It is maximized when there is a
functional relationship from \(X\) to
\(Y\). This statistic is also affected
by sample size and the size of the contingency table. It summarizes the
strength of both functional dependency and support from the sample. A
strong function supported by few samples may have equal \(\chi^2_f\) to a weak function supported by
many samples. It is analogous to the test statistic (not to be confused
with correlation coefficient) in cor.test()
, or the \(t\) statistic from the \(t\)-test.
The \(p\)-value of \(\chi^2_f\) overcomes the table size factor
and making tables of different sizes or sample sizes comparable.
However, its null distribution (chi-squared or normalized) is only
asymptotically true. It is analogous to the role of the \(p\)-value of cor.test()
.
The function index \(\xi_f\)
measures only the strength of functional dependency normalized
by sample and table sizes without considering statistical significance.
When the sample size is small, the index can be unreliable; when the
sample size is large, it is a direct measure of functional dependency
and is comparable across tables. It is analogous to the role of
correlation coefficient in cor.test()
, or fold change in
\(t\)-test for differential gene
expression analysis.