Overview of Two Way ANOVA in R
Two way ANOVA (Analysis of Variance) helps us to understand the relationship between one continuous dependent variable and two categorical independent variables. In this topic, we are going to learn about Two Way ANOVA in R.
A two-way ANOVA tests three null hypotheses:
- H₀ (main effect of the first factor): the mean of the continuous dependent variable is the same across all levels of the first factor.
- H₀ (main effect of the second factor): the mean of the dependent variable is the same across all levels of the second factor.
- H₀ (interaction): there is no interaction, that is, the effect of the first factor on the dependent variable does not depend on the level of the second factor.
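These hypotheses are usually written against the standard two-way model with interaction. The symbols below follow common textbook notation and are an assumption here, since the article does not define any: $y_{ijk}$ is observation $k$ at level $i$ of the first factor and level $j$ of the second, $\mu$ is the overall mean, and $\varepsilon_{ijk}$ is the error term.

```latex
\begin{align}
y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \\
H_0^{(A)} &: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0 \\
H_0^{(B)} &: \beta_1 = \beta_2 = \dots = \beta_b = 0 \\
H_0^{(AB)} &: (\alpha\beta)_{ij} = 0 \quad \text{for all } i, j
\end{align}
```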
Below are the assumptions that a two-way ANOVA has to satisfy.
- Observations must be independent.
- The dependent variable should be approximately normally distributed within each group (equivalently, the residuals should be approximately normal).
- The variance should be equal across groups.
- There should be no extreme outliers in the data.
- Errors should be independent.
If normality or equal variance is violated, we need to transform the data before running the test.
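A minimal sketch of how these checks might look in base R. The data frame is simulated and is not the article's data set, and the choice of the Shapiro-Wilk test, the Bartlett test, and a log transform is an assumption made for illustration; other tests and transformations are just as valid.

```r
# Simulated stand-in data: 3 cancer levels x 4 treatments x 4 replicates = 48 rows.
set.seed(42)
df <- expand.grid(rep = 1:4,
                  cancerlevels = factor(1:3),
                  treat = factor(LETTERS[1:4]))
df$time <- round(runif(nrow(df), min = 0.2, max = 1.2), 2)

fit <- aov(time ~ cancerlevels + treat, data = df)

# Normality of the residuals (Shapiro-Wilk test)
normality <- shapiro.test(residuals(fit))

# Equal variance across the 12 factor-level combinations (Bartlett test)
homogeneity <- bartlett.test(df$time, interaction(df$cancerlevels, df$treat))

print(normality$p.value)
print(homogeneity$p.value)

# If either check fails, one option is to refit on a transformed response
fit_log <- aov(log(time) ~ cancerlevels + treat, data = df)
summary(fit_log)
```

Small p-values in either test suggest that the corresponding assumption does not hold for the untransformed data.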
Example of Two Way ANOVA in R
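Let’s first perform a one-way ANOVA test on the cancer levels data set, which contains 48 rows and 3 variables:
- Time taken: survival time of an animal
- Cancer levels: the level of cancer, from 1 to 3
- Treatment: the treatment used, recorded as a factor with lettered levels (A, B, …); the 3 degrees of freedom reported for it in the two-way table further down imply four levels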
Before running the test, we need to prepare the data:
- Import the data.
- Remove any unnecessary variables.
- Convert the cancer levels variable to an ordered factor.
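The preparation steps above can be sketched as follows. This is an illustration under stated assumptions: the raw data frame is simulated in place of the imported file, the extra column X is hypothetical, and base R is used here although the article itself mentions the dplyr mutate verb.

```r
# Hypothetical raw data standing in for the imported file. With a real file
# you would start from something like read.csv() on your own path instead.
set.seed(1)
raw <- expand.grid(rep = 1:4, cancerlevels = 1:3, treat = LETTERS[1:4],
                   stringsAsFactors = FALSE)
raw$time <- round(runif(nrow(raw), min = 0.2, max = 1.2), 2)
raw$X <- seq_len(nrow(raw))  # hypothetical index column we do not need

# Remove unnecessary variables by keeping only the three we use
df <- raw[, c("time", "cancerlevels", "treat")]

# Convert cancer levels to an ordered factor and treatment to a factor
df$cancerlevels <- factor(df$cancerlevels, ordered = TRUE)
df$treat <- factor(df$treat)

str(df)
levels(df$cancerlevels)
```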
Below is the data set.
```
time for survival <dbl>  0.31, 0.45, 0.46, 0.43, 0.36, 0.29, 0.40, 0.23, 0.22, 0…
cancer levels     <ord>  1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2…
Treatment         <fctr> A, A, A, A, A, A, A, A, A, A, A, A, B, B, B, B, B, B,…
```
The hypotheses for the one-way test are:
- H₀: there is no difference in average survival time between the groups.
- H₁: the average survival time is different for at least one group.
- Check the cancer levels with levels(df$cancerlevels). It returns three character values because the variable was converted to a factor with the mutate verb.
Output: "1" "2" "3"
- Compute the count, mean, and standard deviation of survival time for each cancer level.

```r
library(dplyr)

df %>%
  group_by(cancerlevels) %>%
  summarise(
    count_cancerlevels = n(),
    mean_time = mean(time, na.rm = TRUE),
    sd_time = sd(time, na.rm = TRUE)
  )
```
```
# A tibble: 3 x 4
  cancerlevels count_cancerlevels mean_time    sd_time
  <ord>                     <int>     <dbl>      <dbl>
1 1                            16  0.617500 0.20942779
2 2                            16  0.544375 0.28936641
3 3                            16  0.276250 0.06227627
```
- In step three, check graphically whether the distributions differ between groups, for example with a box plot per cancer level. Include jittered points so that the individual observations stay visible.
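One way to draw such a plot is sketched below. It uses base R graphics (boxplot plus stripchart with jitter), which is a substitution: the article does not say which plotting library it used. The data are simulated stand-ins, and the output file name is hypothetical.

```r
# Simulated stand-in data with the same layout as the article's data set
set.seed(1)
df <- expand.grid(rep = 1:4,
                  cancerlevels = factor(1:3, ordered = TRUE),
                  treat = factor(LETTERS[1:4]))
df$time <- round(runif(nrow(df), min = 0.2, max = 1.2), 2)

# Write the plot to a temporary PDF (hypothetical file name)
plot_file <- file.path(tempdir(), "time_by_cancerlevels.pdf")
pdf(plot_file)

# Box plot per cancer level; outline = FALSE avoids drawing outliers twice
boxplot(time ~ cancerlevels, data = df, outline = FALSE,
        xlab = "Cancer level", ylab = "Survival time")

# Overlay every observation as a jittered dot
stripchart(time ~ cancerlevels, data = df, vertical = TRUE,
           method = "jitter", pch = 16, add = TRUE)

dev.off()
```

In an interactive session the pdf() and dev.off() calls can be dropped so that the plot appears in the graphics window.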
- Run the test with the aov() function. Its two main arguments are:
- formula: the equation you want to estimate
- data: the dataset used
The formula is written in one of these forms:
y ~ X1 + X2 + … + Xn (X1, X2, … are the independent variables)
y ~ . (use all the remaining variables as independent variables)
Make sure you save the model to an object and print its summary.
- anova_one_way <- aov(time ~ cancerlevels, data = df): run the ANOVA test with this formula and save the result
- summary(anova_one_way): print the summary of the test
```
             Df Sum Sq Mean Sq F value   Pr(>F)
cancerlevels  2  1.033  0.5165   11.79 7.66e-05 ***
Residuals    45  1.972  0.0438
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```
The p-value (7.66e-05) is lower than the usual threshold of 0.05, so there is a statistically significant difference between the groups. The significance level is flagged by ‘***’ in the output above.
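The one-way test described above can be sketched end to end as follows. The data are simulated, with made-up group means loosely modelled on the summary table, so the numbers printed will not match the article's output.

```r
# Simulated stand-in data: 48 rows, 3 cancer levels, 4 treatments
set.seed(1)
df <- expand.grid(rep = 1:4,
                  cancerlevels = factor(1:3, ordered = TRUE),
                  treat = factor(LETTERS[1:4]))

# Hypothetical group means, one per cancer level
group_means <- c(0.62, 0.54, 0.28)
df$time <- rnorm(nrow(df),
                 mean = group_means[as.integer(df$cancerlevels)],
                 sd = 0.1)

# Fit the one-way model and print the ANOVA table
anova_one_way <- aov(time ~ cancerlevels, data = df)
summary(anova_one_way)

# Pull the p-value for cancerlevels out of the summary table
p_value <- summary(anova_one_way)[[1]][["Pr(>F)"]][1]
p_value < 0.05
```

Extracting the p-value from the summary object is handy when the decision has to be made in code rather than by reading the printed table.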
From One Way Test to Two Way ANOVA in R
Let’s see how the one-way test can be extended to a two-way ANOVA. The test is similar to the one-way ANOVA, but the formula adds a second group variable.
y ~ x1 + x2
- H₀: the means are equal across the levels of each factor variable
- H₁: the means are different across the levels of at least one factor variable
We add the treat variable to the model. This variable indicates the treatment given to the patient. We are interested in whether survival time depends on both the cancer level and the treatment given.
We adjust the code by adding treat alongside the other independent variable: aov(time ~ cancerlevels + treat, data = df).
```
             Df Sum Sq Mean Sq F value  Pr(>F)
cancerlevels  2 1.0330  0.5165   20.64 5.7e-07 ***
treat         3 0.9212  0.3071   12.27 6.7e-06 ***
Residuals    42 1.0509  0.0250
```
The effects of both cancer level and treatment are statistically significant, so we can reject the null hypothesis and conclude that changing the treatment or the cancer level affects survival time.
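A sketch of the two-way fit on simulated stand-in data follows. The group means and treatment shifts are made up, so the printed numbers will differ from the table above. The second model, with an interaction term, is an addition for illustration; the article's own fit uses only the additive formula.

```r
# Simulated stand-in data: 3 cancer levels x 4 treatments x 4 replicates
set.seed(1)
df <- expand.grid(rep = 1:4,
                  cancerlevels = factor(1:3, ordered = TRUE),
                  treat = factor(LETTERS[1:4]))

# Hypothetical effects for each cancer level and each treatment
level_means <- c(0.62, 0.54, 0.28)
treat_shift <- c(-0.10, 0.15, -0.05, 0.10)
df$time <- rnorm(nrow(df),
                 mean = level_means[as.integer(df$cancerlevels)] +
                        treat_shift[as.integer(df$treat)],
                 sd = 0.1)

# Additive two-way model: main effects only
anova_two_way <- aov(time ~ cancerlevels + treat, data = df)
summary(anova_two_way)

# Model with interaction: a * b expands to a + b + a:b
anova_interaction <- aov(time ~ cancerlevels * treat, data = df)
summary(anova_interaction)
```

The interaction model is the one that addresses the third null hypothesis listed in the overview, at the cost of six extra degrees of freedom in this layout.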
One-way ANOVA: H₁: the average is different for at least one group.
Two-way ANOVA: H₁: the average is different across the levels of at least one of the two factors.
Difference Between One way and Two way ANOVA
| One-way ANOVA | Two-way ANOVA |
| --- | --- |
| Designed to test the equality of three or more means | Designed to assess the effect of two independent variables, and their interaction, on a dependent variable |
| Involves one independent variable | Involves two independent variables |
| Compares three or more groups of a single factor | Compares multiple groups of two factors |
| Has to satisfy two principles: replication and randomization | Has to satisfy three principles: replication, randomization, and local control |
Advantages of Two way ANOVA
- Including a second factor (treatment, in the example above) helps to reduce error variation, making the design more efficient: the residual mean square fell from 0.0438 in the one-way model to 0.0250 in the two-way model.
- Two-Way ANOVA enables us to test the effect of two factors at the same time.
Applications of ANOVA
- Comparing mileage across different vehicles, fuels, and road types.
- Getting to know the impact of temperature, pressure or chemical concentration on some chemical reaction (power reactors, chemical plants, etc.)
- Impact of different catalysts on chemical reaction rates
- Understanding the impact of different commercials on the number of customer responses.
- In biology, assessing the impact on performance, quality, and manufacturing speed (for processes based on the number of cells they divide into).
This is a guide to Two Way ANOVA in R. Here we discussed the examples, objectives, steps, and the difference between one-way and two-way ANOVA.