# Chi-Square Test in R and Interpretation – R tutorial


R provides several methods for testing the independence of categorical variables. In my tutorials, I show three such tests: the chi-square test of independence, the Fisher exact test, and the Cochran-Mantel–Haenszel test. This one covers the chi-square test.

The chi-square test of independence is a statistical method used to determine whether two categorical variables are significantly associated. Both variables are measured on the same population, and each one takes a small set of categories, such as Male/Female or True/False.

The function `chisq.test()` performs this test. I will show an example using the built-in `Arthritis` data from the `vcd` package, but you can also import your own data into R from a CSV, Excel or SPSS file. We will also see how to interpret the results of the chi-square test.
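Before running the test, it can help to look at the contingency table that `chisq.test()` works on. The following is a minimal sketch, assuming the `vcd` package is already installed; the names `tab` and `res` are illustrative only.

```r
library(vcd)  # provides the Arthritis data frame

# Cross-tabulate treatment against level of improvement
tab <- xtabs(~ Treatment + Improved, data = Arthritis)
print(tab)

# chisq.test() also accepts a two-way table directly
res <- chisq.test(tab)
print(res)
```

Passing the table gives the same test as passing the two factor columns separately; the table form simply makes the observed counts visible first.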

#### Hypotheses of Chi-Square test

Null hypothesis – Assumes that there is no association between the two variables.

Alternative hypothesis – Assumes that there is an association between the two variables.

Let us see an example now.

#### Example

To install the `vcd` package, use the command `install.packages("vcd")`. Then use the following code to perform the chi-square test in R for two different pairs of variables, and to see when to reject the null hypothesis and when not to.

```
> library(vcd)
> chisq.test(Arthritis$Treatment, Arthritis$Improved)

	Pearson's Chi-squared test

data:  Arthritis$Treatment and Arthritis$Improved
X-squared = 13.055, df = 2, p-value = 0.001463

> chisq.test(Arthritis$Improved, Arthritis$Sex)

	Pearson's Chi-squared test

data:  Arthritis$Improved and Arthritis$Sex
X-squared = 4.8407, df = 2, p-value = 0.08889

Warning message:
In chisq.test(Arthritis$Improved, Arthritis$Sex) :
  Chi-squared approximation may be incorrect
```

From the result of `chisq.test(Arthritis$Treatment, Arthritis$Improved)`, there appears to be a relationship between treatment received and level of improvement. We come to this conclusion because the p-value (0.001463) is less than 0.01. Hence, we reject the null hypothesis in favour of the alternative hypothesis.
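To see where the reported statistic, degrees of freedom and p-value come from, the first test can be reproduced by hand. This is only a sketch of the standard Pearson calculation, assuming `vcd` is installed; the variable names are illustrative.

```r
library(vcd)

observed <- table(Arthritis$Treatment, Arthritis$Improved)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected
x_squared <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

round(c(x_squared = x_squared, df = df, p_value = p_value), 4)
```

The rounded values should agree with the `X-squared`, `df` and `p-value` fields printed by `chisq.test()` for treatment and improvement.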

The result of `chisq.test(Arthritis$Improved, Arthritis$Sex)`, on the other hand, shows no evidence of a relationship between patient sex and improvement, because the p-value (0.08889) is greater than 0.05. Hence, we fail to reject the null hypothesis. Note that this is not the same as proving the null hypothesis: the data simply do not provide enough evidence of an association.
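The decision rule applied to both results can be written out in code. This is a rough sketch: the significance level `alpha <- 0.05` is an assumption chosen for illustration, and `suppressWarnings()` is used only to keep the output quiet.

```r
library(vcd)

alpha <- 0.05  # assumed significance level; choose it before looking at the data

res_treatment <- chisq.test(Arthritis$Treatment, Arthritis$Improved)
res_sex <- suppressWarnings(chisq.test(Arthritis$Improved, Arthritis$Sex))

# TRUE means "reject the null hypothesis of independence"
reject_treatment <- res_treatment$p.value < alpha  # TRUE
reject_sex <- res_sex$p.value < alpha              # FALSE

c(treatment = reject_treatment, sex = reject_sex)
```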

The warning message is produced because one of the six cells in the table (male, some improvement) has an expected count of less than five, which may invalidate the chi-square approximation. To check this, look at the `expected` component of the test result, for example `chisq.test(Arthritis$Improved, Arthritis$Sex)$expected`.
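The expected counts behind the warning can be inspected directly. The sketch below assumes `vcd` is installed and also runs the Fisher exact test, mentioned at the start of this tutorial, as one possible alternative when expected counts are small; whether it suits your data is a judgment call.

```r
library(vcd)

tab <- table(Arthritis$Improved, Arthritis$Sex)

# Expected counts under independence; the warning is suppressed here on purpose
expected <- suppressWarnings(chisq.test(tab))$expected
print(round(expected, 2))

# Row and column positions of cells with an expected count below 5
which(expected < 5, arr.ind = TRUE)

# Fisher's exact test does not rely on the large-sample approximation
fisher.test(tab)
```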

So, this is how you can perform a Chi-Square test in R and interpret the result.