Overview of Two Way ANOVA in R
Two-way ANOVA (two-way analysis of variance) is a statistical method for examining how two categorical independent variables relate to one continuous dependent variable. It is usually applied to samples drawn from several populations and framed in terms of null and alternative hypotheses. The variables of interest should satisfy certain assumptions: independent samples, approximately normal distribution, equal variances across groups, and no extreme outliers. R provides built-in functions that make the test straightforward to run.
Example of Two Way ANOVA in R
Let’s first perform a one-way ANOVA test on the cancer levels data set, which contains 48 rows and 3 variables:
Time Taken: survival time of an animal
Cancer levels: level of cancer, from 1 to 3
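Treatment: the treatment used (coded A, B, … in the data below; the 3 degrees of freedom reported for treatment in the two-way table imply four levels, not three)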
Before running the test, we need to prepare the data:
- Import the data
- Remove the unnecessary variable
- Convert the cancer levels variable to an ordered factor
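The preparation steps above can be sketched roughly as follows. This is only an illustration: the article does not give the file path, so the data frame is built in memory from the first nine rows visible in the preview, and the column `X` is a hypothetical row-index column standing in for the "unnecessary variable". The column names `time`, `cancerlevels` and `treat` follow the names used in the code later in the article.

```r
library(dplyr)

# First nine rows visible in the article's data preview (the full file has 48 rows).
# X is a hypothetical row-index column that plays the role of the
# "unnecessary variable" to be removed.
raw <- data.frame(
  X            = 1:9,
  time         = c(0.31, 0.45, 0.46, 0.43, 0.36, 0.29, 0.40, 0.23, 0.22),
  cancerlevels = c(1, 1, 1, 1, 2, 2, 2, 2, 3),
  treat        = rep("A", 9)
)

df <- raw %>%
  select(-X) %>%                                          # remove the unneeded column
  mutate(
    cancerlevels = factor(cancerlevels, ordered = TRUE),  # ordered factor
    treat        = factor(treat)                          # plain (unordered) factor
  )

glimpse(df)
```

With a real file, the `data.frame(...)` call would be replaced by an import step such as `read.csv()` pointed at the actual path.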
Below is the data set.
time for survival <dbl> 0.31, 0.45, 0.46, 0.43, 0.36, 0.29, 0.40, 0.23, 0.22, 0…
cancer levels <ord> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2…
Treatment <fctr> A, A, A, A, A, A, A, A, A, A, A, A, B, B, B, B, B, B,…
- H₀: there is no difference in average survival time between the groups.
- H₁: the average survival time is different for at least one group.
- Check the cancer levels with levels(df$cancerlevels). We see three character values because the variable was converted to a factor with the mutate verb.
output:  "1" "2" "3"
- Compute the count, mean, and standard deviation of survival time for each cancer level
library(dplyr)
df %>%
  group_by(cancerlevels) %>%
  summarise(
    count_cancerlevels = n(),
    mean_time = mean(time, na.rm = TRUE),
    sd_time = sd(time, na.rm = TRUE)
  )
A tibble: 3 x 4
cancerlevels count_cancerlevels mean_time sd_time
<ord> <int> <dbl> <dbl>
1 1 16 0.617500 0.20942779
2 2 16 0.544375 0.28936641
3 3 16 0.276250 0.06227627
- Next, check graphically whether the distribution of survival time differs between the groups, for example with a box plot per cancer level overlaid with jittered points.
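A minimal sketch of such a plot is shown here. It uses base R graphics rather than any particular plotting package, and it runs on simulated data: the group means are loosely based on the summary table above, but the individual values are random and are not the article's data.

```r
set.seed(42)

# Simulated stand-in for the article's data: 16 observations per cancer level.
# Means are loosely based on the summary table; values are NOT the real data.
df <- data.frame(
  cancerlevels = factor(rep(1:3, each = 16), ordered = TRUE),
  time = pmax(
    c(rnorm(16, 0.62, 0.15), rnorm(16, 0.54, 0.15), rnorm(16, 0.28, 0.06)),
    0.05  # keep simulated survival times positive
  )
)

# One box per cancer level
bp <- boxplot(time ~ cancerlevels, data = df,
              xlab = "Cancer level", ylab = "Survival time")

# Overlay the individual observations with horizontal jitter
stripchart(time ~ cancerlevels, data = df,
           vertical = TRUE, method = "jitter",
           pch = 16, add = TRUE)
```

Boxes that barely overlap suggest that the group means differ, which the ANOVA test then checks formally.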
- Run the test with the aov() function. Its basic syntax is aov(formula, data), where:
- formula: the equation you want to estimate
- data: the dataset used
The formula can take these forms:
y ~ X1 + X2 + … + Xn: X1, X2, … are the independent variables
y ~ .: use all the remaining variables as independent variables
Save the model in an object and then print its summary:
- anova_one_way <- aov(time ~ cancerlevels, data = df): run the ANOVA test with this formula and store the result
- summary(anova_one_way): print the summary of the test
Df Sum Sq Mean Sq F value Pr(>F)
Cancerlevels 2 1.033 0.5165 11.79 7.66e-05 ***
Residuals 45 1.972 0.0438
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
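The p-value (7.66e-05) is lower than the usual threshold of 0.05, so the difference in mean survival time between cancer levels is statistically significant. The ‘***’ next to the p-value marks significance at the 0.001 level.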
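To see where the numbers in the table come from, the short sketch below recomputes the mean squares, the F value and the p-value from the sums of squares and degrees of freedom printed above. The variable names are chosen here for illustration; only the four input numbers are taken from the table.

```r
# Inputs copied from the one-way ANOVA table above
ss_between <- 1.033   # Sum Sq for cancer levels
df_between <- 2       # Df for cancer levels
ss_resid   <- 1.972   # Sum Sq for residuals
df_resid   <- 45      # Df for residuals

ms_between <- ss_between / df_between   # Mean Sq = Sum Sq / Df
ms_resid   <- ss_resid / df_resid       # residual Mean Sq
f_value    <- ms_between / ms_resid     # F = between-group MS / residual MS

# Upper-tail probability of the F distribution gives Pr(>F)
p_value <- pf(f_value, df_between, df_resid, lower.tail = FALSE)

round(c(ms_between = ms_between, ms_resid = ms_resid, f_value = f_value), 4)
p_value
```

After rounding, the mean squares and the F value agree with the 0.5165, 0.0438 and 11.79 shown in the table.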
One Way Test to Two Way ANOVA in R
Let’s see how the one-way test can be extended to two-way ANOVA. The test is similar to one-way ANOVA, but the formula includes a second group variable.
y ~ x1 + x2
- H₀: The means are equal for both variables (factor variables)
- H₁: The means are different for both variables
We add the treat variable to the model. This variable indicates the treatment given to the patient. We want to see whether survival time depends on the cancer level and on the treatment given.
We adjust the code by adding treat as a second independent variable: aov(time ~ cancerlevels + treat, data = df).
Df Sum Sq Mean Sq F value Pr(>F)
Cancer levels 2 1.0330 0.5165 20.64 5.7e-07 ***
Treat 3 0.9212 0.3071 12.27 6.7e-06 ***
Residuals 42 1.0509 0.0250
The effects of both cancer level and treatment are statistically significant (both p-values are far below 0.05), so we can reject the null hypothesis. We conclude that changing the treatment or the cancer level affects survival time.
One-way ANOVA: H₁: the average is different for at least one group.
Two-way ANOVA: H₁: the average is different for both groups.
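The step from the one-way to the two-way model can be tried end to end with the sketch below. It does not use the article's data: it simulates a balanced design with 3 cancer levels, 4 treatments and 4 replicates (48 rows), and the effect sizes are made up purely for illustration, so the printed numbers will differ from the tables above.

```r
set.seed(1)

# Simulated balanced design: 4 replicates x 3 cancer levels x 4 treatments = 48 rows.
# All effect sizes are invented for illustration only.
df <- expand.grid(
  replicate    = 1:4,
  cancerlevels = 1:3,
  treat        = c("A", "B", "C", "D")
)

level_effect <- c(0.30, 0.25, 0.00)[df$cancerlevels]
treat_effect <- unname(c(A = 0.00, B = 0.35, C = 0.08, D = 0.22)[as.character(df$treat)])

df$time         <- 0.25 + level_effect + treat_effect + rnorm(nrow(df), sd = 0.15)
df$cancerlevels <- factor(df$cancerlevels, ordered = TRUE)
df$treat        <- factor(df$treat)

# One-way model: cancer level only
anova_one_way <- aov(time ~ cancerlevels, data = df)

# Two-way model: cancer level plus treatment
anova_two_way <- aov(time ~ cancerlevels + treat, data = df)

summary(anova_one_way)
summary(anova_two_way)
```

In the two-way summary, treat takes 3 degrees of freedom out of the residuals (45 becomes 42), the same pattern as in the article's tables.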
Difference Between One way and Two way ANOVA
The differences between one-way ANOVA and two-way ANOVA are given below.
|One-way ANOVA|Two-way ANOVA|
|---|---|
|Designed to enable equality testing between 3 or more means|Designed to assess the interrelationship of two independent variables on a dependent variable|
|Involves one independent variable|Involves two independent variables|
|Analyzed in 3 or more categorical groups|Compares multiple groups of two factors|
|Has to satisfy two principles: replication and randomization|Has to satisfy three principles: replication, randomization, and local control|
Advantages of Two way ANOVA
Some advantages are as follows.
- Including a second factor (treatment, in the example above) helps to reduce error variation, making the design more efficient.
- Two-Way ANOVA enables us to test the effect of two factors at the same time.
Applications of ANOVA
The applications of ANOVA are listed below.
- Comparing the mileage of different vehicles, fuel and road types.
- Getting to know the impact of temperature, pressure or chemical concentration on some chemical reaction (power reactors, chemical plants, etc.)
- Impact of different catalysts on chemical reaction rates
- Understanding the impact of different commercials on the number of customer responses.
- Impact of performance, quality, and speed in manufacturing, and of process conditions in biology (for example, processes measured by the number of cells produced by division)
This is a guide to Two Way ANOVA in R. Here, we discuss the examples, objectives, steps, and differences between one and two-way ANOVA.