Updated July 6, 2023

Overview of Two Way ANOVA in R

Two-way ANOVA (two-way analysis of variance) is a statistical technique that helps us understand the relationship between one continuous dependent variable and two categorical independent variables. It is usually studied over samples from various populations by formulating null and alternative hypotheses. The variables of interest must satisfy certain considerations, such as independence of samples, normal distribution, equality of variance, and the handling of outliers. R provides handy functionality for applying the technique.

Note: We must transform our data if normality and equal variance are violated.
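As a rough illustration of the note above, the following sketch checks normality and equal variance and then refits the model on a transformed response. The data are simulated and the names `d`, `y`, `a`, and `b` are hypothetical. The choice of the Shapiro-Wilk test, Bartlett's test, and a log transform is an assumption; other checks and transformations may suit your data better.

```r
# Simulated, hypothetical data: a positive response y and two factors a, b
set.seed(1)
d <- data.frame(
  a = factor(rep(rep(1:3, each = 4), times = 4)),
  b = factor(rep(c("A", "B", "C", "D"), each = 12))
)
d$y <- exp(rnorm(48, mean = -0.3 * as.integer(d$a), sd = 0.3))

fit <- aov(y ~ a + b, data = d)

# Normality of the model residuals (Shapiro-Wilk test)
normality <- shapiro.test(residuals(fit))

# Equality of variance across the levels of one factor (Bartlett's test)
equal_var <- bartlett.test(y ~ a, data = d)

print(normality$p.value)
print(equal_var$p.value)

# If either assumption looks violated, one option is to transform the
# response and refit; a log transform is used here purely as an example
fit_log <- aov(log(y) ~ a + b, data = d)
summary(fit_log)
```

Small p-values from either test suggest the corresponding assumption may not hold for the untransformed response.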

Example of Two Way ANOVA in R

Let’s start with a one-way ANOVA test on the cancer levels data set, which contains 48 rows and 3 variables:

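Time: Survival time of an animal

Cancer levels: Level of cancer, from 1 to 3

Treatment: Treatment used, a factor labeled A, B, and so on (four levels, consistent with the 3 degrees of freedom reported for it in the two-way ANOVA table)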

Before we run the test, we need to prepare the data:

Import the data
Remove the unnecessary variable
Convert the levels of cancer variable to an ordered factor
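The three preparation steps might look like the following sketch in base R. The file is a temporary CSV filled with simulated values, and the extra column `id` is a hypothetical stand-in for the unnecessary variable; only the column names `time`, `cancerlevels`, and `treat` follow the code used later in this article.

```r
# Hypothetical raw file: simulated values written to a temporary CSV
set.seed(42)
raw <- data.frame(
  id           = 1:48,                       # stand-in for the unnecessary variable
  time         = round(runif(48, 0.2, 1.2), 2),
  cancerlevels = rep(rep(1:3, each = 4), times = 4),
  treat        = rep(c("A", "B", "C", "D"), each = 12)
)
path <- tempfile(fileext = ".csv")
write.csv(raw, path, row.names = FALSE)

# 1. Import the data
df <- read.csv(path)

# 2. Remove the unnecessary variable
df$id <- NULL

# 3. Convert cancer levels to an ordered factor (and treatment to a factor)
df$cancerlevels <- factor(df$cancerlevels, levels = 1:3, ordered = TRUE)
df$treat <- factor(df$treat)

str(df)
```

With dplyr, the same conversion is typically written with `mutate()`; base R is used here only to keep the sketch free of package dependencies.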

Below is the data set.

Observations: 48

Variables: 3

time for survival <dbl> 0.31, 0.45, 0.46, 0.43, 0.36, 0.29, 0.40, 0.23, 0.22, 0…

cancer levels<ord> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2…

Treatment <fctr> A, A, A, A, A, A, A, A, A, A, A, A, B, B, B, B, B, B,…

Objectives

H₀: There is no difference in average survival time between groups.
H₁: Average survival time is different for at least one group.

Steps

Check the cancer levels. You should see three character values, because the variable was converted into a factor with the mutate verb.

levels(df$cancerlevels)

Output:

[1] "1" "2" "3"

Compute the mean and standard deviation of survival time for each cancer level.

library(dplyr)

df %>%
  group_by(cancerlevels) %>%
  summarise(
    count_cancerlevels = n(),
    mean_time = mean(time, na.rm = TRUE),
    sd_time = sd(time, na.rm = TRUE)
  )

Output:

A tibble: 3 x 4

  cancerlevels count_cancerlevels mean_time    sd_time
1 1                            16  0.617500 0.20942779
2 2                            16  0.544375 0.28936641
3 3                            16  0.276250 0.06227627

In step three, you can graphically check whether the distributions differ between the groups. Note that the plot includes jittered dots, so the individual observations remain visible.
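A plot of this kind can be sketched with base R graphics, as below. The data are simulated placeholders, and base `boxplot()` with `stripchart()` is an assumption here; a ggplot2 box plot with a jitter layer would serve the same purpose.

```r
# Simulated, hypothetical data with the same layout as the article's data set
set.seed(42)
df <- data.frame(
  cancerlevels = factor(rep(rep(1:3, each = 4), times = 4), ordered = TRUE)
)
df$time <- 0.7 - 0.15 * as.integer(df$cancerlevels) + rnorm(48, sd = 0.1)

# Box plot of survival time per cancer level; outliers are not drawn
# separately because every observation is added as a point below
bp <- boxplot(time ~ cancerlevels, data = df, outline = FALSE,
              xlab = "Cancer level", ylab = "Survival time")

# Overlay the individual observations as jittered dots
stripchart(time ~ cancerlevels, data = df, vertical = TRUE,
           method = "jitter", add = TRUE, pch = 16, col = "steelblue")
```

Boxes that barely overlap hint at a difference between groups, which the ANOVA test then assesses formally.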
Run the test with the aov() function.

aov(formula, data)

Arguments:
- formula: The equation you want to estimate
- data: The dataset used

Syntax:

y ~ X1 + X2 + … + Xn: X1, X2, … are the independent variables

y ~ .: Use all the remaining variables as independent variables

Make sure you save the model and print the summary.

Code

anova_one_way <- aov(time ~ cancerlevels, data = df)
summary(anova_one_way)

aov(time ~ cancerlevels, data = df): Run the ANOVA test with the given formula
summary(anova_one_way): Print the summary of the test

             Df Sum Sq Mean Sq F value   Pr(>F)
cancerlevels  2  1.033  0.5165   11.79 7.66e-05 ***
Residuals    45  1.972  0.0438
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The p-value (7.66e-05) is lower than the usual threshold of 0.05, so we reject the null hypothesis. The statistically significant difference is indicated by ‘***’ in the above output.
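The comparison against the threshold can also be done in code. The sketch below pulls the p-value out of the summary table; the data are simulated, so the numbers will not match the output above, and the 0.05 threshold is simply the conventional choice.

```r
# Simulated, hypothetical data with the same layout as the article's data set
set.seed(42)
df <- data.frame(
  cancerlevels = factor(rep(rep(1:3, each = 4), times = 4), ordered = TRUE)
)
df$time <- 0.7 - 0.15 * as.integer(df$cancerlevels) + rnorm(48, sd = 0.1)

anova_one_way <- aov(time ~ cancerlevels, data = df)

# summary() returns a list; its first element is the ANOVA table
anova_table <- summary(anova_one_way)[[1]]

# The first row belongs to cancerlevels, the second to the residuals
p_value <- anova_table[["Pr(>F)"]][1]
reject_null <- p_value < 0.05

print(p_value)
print(reject_null)
```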

One-Way Test to Two Way Anova in R

Let’s see how the one-way test can be extended to two-way ANOVA. The test is similar to one-way ANOVA, except that the formula includes a second group variable.

y ~ x1 + x2

H₀: The means are equal for both variables (factor variables)
H₁: The means are different for both variables

We add the treat variable to our model. This variable indicates the treatment given to the animal. We are interested in whether survival time depends on both the cancer level and the treatment given.

We adjust our code by adding treat alongside the other independent variable.
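The adjusted call could look like the following sketch. The data frame is simulated with the same layout (3 cancer levels, 4 treatments, 48 rows), so the degrees of freedom match the output below but the sums of squares and p-values will not; the name `anova_two_way` is hypothetical.

```r
# Simulated, hypothetical data: 3 cancer levels x 4 treatments x 4 replicates
set.seed(42)
df <- data.frame(
  cancerlevels = factor(rep(rep(1:3, each = 4), times = 4), ordered = TRUE),
  treat        = factor(rep(c("A", "B", "C", "D"), each = 12))
)
treat_effect <- c(A = 0, B = 0.3, C = 0.1, D = 0.2)
df$time <- 0.7 - 0.15 * as.integer(df$cancerlevels) +
  unname(treat_effect[as.character(df$treat)]) +
  rnorm(48, sd = 0.1)

# Two-way ANOVA: treat is added to the right-hand side of the formula
anova_two_way <- aov(time ~ cancerlevels + treat, data = df)
summary(anova_two_way)
```

Using `+` fits the two main effects only, which is the model shown in this article.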

             Df Sum Sq Mean Sq F value  Pr(>F)
cancerlevels  2 1.0330  0.5165   20.64 5.7e-07 ***
treat         3 0.9212  0.3071   12.27 6.7e-06 ***
Residuals    42 1.0509  0.0250

Both cancer levels and treatment are statistically significant, with p-values far below 0.05. We can therefore reject the null hypothesis and conclude that changing the treatment or the cancer level affects survival time.

Test

One-way ANOVA: H₁: The average is different for at least one group

Two-way ANOVA: H₁: The average is different for both groups

Difference Between One way and Two way ANOVA

The differences between one-way ANOVA and two-way ANOVA are given below.

One-way ANOVA	Two-way ANOVA
Designed to test the equality of three or more means	Designed to assess the effect of two independent variables on a dependent variable
Involves one independent variable	Involves two independent variables
Compares three or more categorical groups of one factor	Compares multiple groups of two factors
Must satisfy two principles: replication and randomization	Must satisfy three principles: replication, randomization, and local control

Advantages of Two way ANOVA

Some advantages are as follows.

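Including a second factor (cancer level and treatment in the above example) helps to reduce error variation, making the design more efficient. In the example, the residual sum of squares drops from 1.972 to 1.0509 once treatment is added.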
Two-Way ANOVA enables us to test the effect of two factors simultaneously.

Applications of ANOVA

The applications of ANOVA are listed below.

Comparing the mileage of different vehicles, fuels, and road types.
Learning the impact of temperature, pressure, or chemical concentration on some chemical reactions (power reactors, chemical plants, etc.)
Impact of different catalysts on chemical reaction rates
Understanding the impact of commercials on the number of customer responses.
Impact of performance, quality, and speed in manufacturing, and in biology (processes based on the number of cells they get divided into)