ANOVA (Analysis of Variance)
ANOVA stands for Analysis Of Variance. It was developed by Ronald Fisher in 1918. The name comes from the way the method works: it analyzes variances in order to determine whether group means are equal or different.
It is a statistical method used to test the differences between two or more means. It is used to test general differences rather than specific differences among means. It assesses the significance of one or more factors by comparing the response variable means at different factor levels.
The null hypothesis states that all population means are equal. The alternative hypothesis states that at least one population mean is different.
ANOVA thus provides a way to test the equality of several means at the same time.
General purpose of ANOVA
The reason for performing ANOVA is to see whether any difference exists between the groups on some variable. Today researchers are using ANOVA in many ways. The usage of ANOVA totally depends on the research design.
You can use a t-test to compare the means of two samples, but when more than two samples are to be compared, ANOVA is the appropriate method.
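For exactly two groups the two approaches agree: the square of the equal-variance (pooled) t statistic equals the ANOVA F ratio. The sketch below illustrates this with made-up numbers. The function names and the data are assumptions for illustration, and the t-test is assumed to be the pooled-variance version.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic, assuming equal population variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def f_ratio(groups):
    """One way ANOVA F ratio (between-group MS over within-group MS)."""
    all_obs = [x for g in groups for x in g]
    n_total, k = len(all_obs), len(groups)
    cm = sum(all_obs) ** 2 / n_total                      # correction for the mean
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm  # between groups
    sse = sum(x * x for x in all_obs) - cm - sst          # within groups
    return (sst / (k - 1)) / (sse / (n_total - k))


# Hypothetical data for two groups
a, b = [2, 3, 4], [4, 5, 6]
t = pooled_t(a, b)
f = f_ratio([a, b])
print(t ** 2, f)  # the two values agree up to rounding
```

With three or more groups there is no single t statistic, which is where the F ratio becomes necessary.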
Assumptions of ANOVA
There are four main assumptions:
 The expected values of the errors are zero
 The variances of all the errors are equal to each other
 The errors are independent
 The errors are normally distributed
ANOVA Types

One Way between groups
One Way ANOVA is used to check whether there is any significant difference between the means of three or more unrelated groups. It mainly tests the null hypothesis.
H₀: µ₁ = µ₂ = µ₃ = ….. = µₓ
Where µ is a group mean and x is the number of groups. If the One Way ANOVA gives a significant result, at least two of the group means differ. However, One Way ANOVA is an omnibus test, so it does not tell you which specific groups differed from each other. To find out which group or groups differed from the others, you need to run a post hoc test.
Example of One Way ANOVA
20 people are selected to test the effect of four different exercises. The 20 people are divided into 4 groups with 5 members each, and each group follows one exercise. Their weights are recorded after a few days, and the effect of the exercises on the 4 groups is compared. Here the type of exercise is the only factor, and weight is the dependent variable.
Assumptions
The dependent variable is normally distributed in each group
There is homogeneity of variances
Independence of observations
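A rule of thumb for the homogeneity assumption compares the largest and the smallest sample standard deviation: if their ratio is not greater than two, equal population variances are usually assumed. The sketch below applies that rule to made-up data. The function name `sd_ratio` and the numbers are assumptions for illustration, and the rule is a rough screen, not a formal test.

```python
from statistics import stdev


def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation.

    Raises ZeroDivisionError if a group has no spread at all.
    """
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)


# Hypothetical samples for three groups
groups = [[2, 3, 4], [3, 4.5, 6], [5, 6, 7]]
ratio = sd_ratio(groups)

# Rule of thumb: a ratio of at most 2 is taken as compatible with equal variances
equal_variances_plausible = ratio <= 2
print(ratio, equal_variances_plausible)
```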

One Way ANOVA repeated measures
Repeated measures ANOVA is the equivalent of One Way ANOVA for related rather than independent groups. Repeated measures designs investigate either
1. changes in mean scores over three or more time points
2. differences in mean scores under different conditions.
Example of Repeated measures
You might research the effect of a 6 month exercise programme on weight reduction in a group of individuals. You measure their weight at three different points of time during the training period to develop a time course for any exercise effect.
You might ask the same individuals to eat different types of weight reducing food and rate each one for taste.
In both examples the same set of people is measured more than once on the same dependent variable.

Two way between groups
The two way ANOVA compares the mean difference between groups that have been split on two factors. The main objective of a two way ANOVA is to find out if there is any interaction between the two independent variables on the dependent variable. It also lets you know whether the effect of one of your independent variables on the dependent variable is the same for all the values of your other independent variable.
Example
Consider research on the effect of fertilizers on the yield of rice. You apply five fertilizers of different quality to five plots of land, each growing rice. The yield from each plot is recorded and the differences between the plots are observed. Here the effect of the fertility of the plots can also be studied. Thus there are two factors, Fertilizer and Fertility.
Assumptions
Before starting your two way ANOVA, check that your data satisfies six assumptions, to make sure it is suitable for a two way ANOVA. The six assumptions are listed below
 Your dependent variable should be measured at the continuous level
 Each of your two independent variables should consist of two or more categorical, independent groups
 You should have independence of observations
 There should be no significant outliers
 Your dependent variable should be normally distributed for each combination of the groups of the two independent variables
 Homogeneity of variances

Two way repeated measures
A two way repeated measures ANOVA compares the mean differences between groups that have been split on two within subject factors. It is often used in research where a dependent variable is measured two or more times under two or more conditions.
Example
A health researcher wants to find the best way to reduce the chronic joint pain suffered by people. The researcher selects two different types of treatment to reduce the level of pain. The two types of treatment are known as ‘conditions’. Treatment A is a massage programme and Treatment B is an acupuncture programme. Both treatments are given to all the patients for 8 weeks.
The patients are tested at three points of time – at the beginning of the programme, at the middle of the programme and at the end of the programme.
The researcher selects 30 patients to take part in the research. While the first 15 patients undergo Treatment A, the other 15 patients undergo Treatment B, and then the two groups swap, so that every patient receives both treatments.
At the end of 8 weeks, the researcher uses a two way repeated measures ANOVA to find out whether there is any change in pain as a result of the interaction between the type of treatment and the point of time.
Assumptions
Your data should satisfy five assumptions for a two way repeated measures ANOVA to give a valid result.
 Your dependent variable should be measured at the continuous level
 Your two within subject factors should consist of at least two categorical related groups
 There should be no outliers
 The dependent variable should be normally distributed among each combination of the related groups
 The variances of the differences between all combinations of related groups should be equal
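The last assumption in the list above can be inspected by computing the variance of the pairwise differences between related groups. The sketch below does this for made-up pain scores at three time points. The function name, the condition labels and the data are assumptions for illustration. It only describes the data; a formal check (for example Mauchly's test in statistical software) is not implemented here.

```python
from itertools import combinations
from statistics import variance


def difference_variances(scores):
    """Variance of the paired differences for every pair of conditions.

    scores maps a condition name to a list of scores, with the same
    subject order in every list.
    """
    result = {}
    for a, b in combinations(scores, 2):
        diffs = [x - y for x, y in zip(scores[a], scores[b])]
        result[(a, b)] = variance(diffs)
    return result


# Hypothetical pain scores for four patients at three time points
scores = {
    "start": [8, 7, 9, 6],
    "middle": [6, 6, 7, 5],
    "end": [4, 5, 4, 3],
}
diff_vars = difference_variances(scores)
for pair, var in diff_vars.items():
    print(pair, round(var, 3))
```

The assumption is about these variances being similar to each other. Large discrepancies suggest that the results should be read with care or corrected by the software.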
Parametric and Non Parametric ANOVA test
If the population is assumed to follow a known distribution that is fully described by its parameters (for example a normal distribution with a mean and a variance), the statistical test performed is called a parametric test. ANOVA is a parametric test.
If the distribution of the population or its parameters is not known but a hypothesis still has to be tested, the test is called a non parametric test.
When your dependent variable is categorical you cannot use ANOVA; you need to use a Chi square test, which tests the association between categorical variables.
Hypothesis testing procedure – One way ANOVA
 Check any necessary assumptions and write the null and alternative hypotheses
To perform a one way ANOVA certain assumptions should hold. The assumptions are as follows
 Each sample is an independent random sample
 The distribution of the response variable follows a normal distribution
 The population variances are equal across the group levels. To check this, divide the largest sample standard deviation by the smallest sample standard deviation; if the ratio is not greater than two, assume that the population variances are equal.
 Calculate an appropriate test statistic
One way ANOVA uses the F test statistic. Hand calculation requires many steps to compute the F ratio, but statistical software like SPSS will compute the F ratio for you and produce the ANOVA source table.
The ANOVA table gives you information about the variability between groups and within groups, and it summarizes all of the formulas. Below is an example of a one way ANOVA table
Source             SS   DF     MS                 F
Treatments         SST  k – 1  MST = SST/(k – 1)  MST/MSE
Error              SSE  N – k  MSE = SSE/(N – k)
Total (Corrected)  SS   N – 1
SST means Sum of squares of treatments, SSE means Sum of squares of errors. k is the number of groups and N is the total number of observations.
DFT, which is k – 1, means degrees of freedom for treatments; DFE, which is N – k, means degrees of freedom for errors.
 Determine a p value associated with the test statistic
 Decide between the null and alternative hypotheses
If the null hypothesis is false, then MST should be larger than MSE, so a large F ratio (and a small p value) leads you to reject the null hypothesis.
 Give a conclusion
Based on your result, write a conclusion that answers your ANOVA research question.
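The p value step in the procedure above normally relies on software, because the F distribution has no simple formula in general. For the special case of three groups (numerator degrees of freedom equal to 2), however, the upper tail probability has a closed form, which makes a small standard-library sketch possible. The function name and the input values are assumptions for illustration, and the formula must not be used for any other number of groups.

```python
def f_pvalue_three_groups(f_stat, df_error):
    """P(F > f_stat) for an F distribution with (2, df_error) degrees of freedom.

    Valid only when the numerator degrees of freedom are exactly 2,
    that is, when three groups are compared.
    """
    return (df_error / (df_error + 2.0 * f_stat)) ** (df_error / 2.0)


# Hypothetical result: F = 12 from three groups with N - k = 6
f_stat = 12.0
df_error = 6
alpha = 0.05  # assumed significance level

p_value = f_pvalue_three_groups(f_stat, df_error)
reject_null = p_value < alpha
print(p_value, reject_null)
```

For other designs a library routine such as `scipy.stats.f.sf` or the p value printed by SPSS or Excel would be used instead.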
Multiple comparison tests
If you find a significant difference between the groups that is not due to sampling error, you need to compare the group means pairwise to find where the difference lies. Running several ordinary t tests inflates the type one error rate, so several tests have been designed to control it.
 Scheffé’s test
 Modified Bonferroni test
 Dunnett’s test
 Tukey’s test
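To see why the type one error rate needs controlling, the sketch below counts the pairwise comparisons among four groups and applies the plain Bonferroni adjustment, which divides the significance level by the number of comparisons. This is the simple Bonferroni rule used as a stand-in, not the modified version named in the list, and the group names and alpha are assumptions for illustration.

```python
from itertools import combinations


def bonferroni_pairs(group_names, alpha=0.05):
    """Return all pairwise comparisons and the Bonferroni-adjusted alpha."""
    pairs = list(combinations(group_names, 2))
    adjusted_alpha = alpha / len(pairs)
    return pairs, adjusted_alpha


# Hypothetical group labels
pairs, adjusted_alpha = bonferroni_pairs(["A", "B", "C", "D"])

# Chance of at least one false positive if every pair were tested at
# alpha = 0.05 without correction (assuming independent tests)
familywise_error = 1 - (1 - 0.05) ** len(pairs)

print(len(pairs), adjusted_alpha, familywise_error)
```

Each pairwise p value would then be compared with the adjusted alpha instead of 0.05.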
Calculations
ANOVA calculations can be done in three ways – by hand, in an Excel sheet and with SPSS software. Let's look at each of them in detail below

ANOVA hand calculations
 Step 1
Compute CM (the correction for the mean)
CM = (Total of all observations)^{2}/N_{Total}
 Step 2
Compute the total SS
Total SS = Sum of squares of all observations – CM
 Step 3
Compute SST (Sum of Squares for Treatment)
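SST = ∑^{k}_{i=1} T_{i}^{2}/n_{i} – CM, where T_{i} is the total of the observations in group i, n_{i} is the number of observations in group i and k is the number of groups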
 Step 4
Compute SSE (Sum of Squares for errors)
SSE = SS (Total) – SST
 Step 5
Compute MST, MSE and their ratio F
MST = SST/(k – 1)
MSE = SSE/(N – k)
F = MST/MSE
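The five steps above can be sketched in a few lines of Python. The data are made up for illustration and the function name `one_way_anova` is an assumption, not part of any library. The code follows the hand formulas directly and does not compute a p value.

```python
def one_way_anova(groups):
    """Follow the five hand-calculation steps for a list of groups."""
    all_obs = [x for g in groups for x in g]
    n_total = len(all_obs)  # N
    k = len(groups)         # number of groups

    # Step 1: CM = (total of all observations)^2 / N
    cm = sum(all_obs) ** 2 / n_total
    # Step 2: total SS = sum of squares of all observations - CM
    ss_total = sum(x * x for x in all_obs) - cm
    # Step 3: SST = sum over groups of (group total)^2 / group size - CM
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm
    # Step 4: SSE = SS(Total) - SST
    sse = ss_total - sst
    # Step 5: mean squares and their ratio
    mst = sst / (k - 1)
    mse = sse / (n_total - k)
    return {"CM": cm, "SS_total": ss_total, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}


# Hypothetical observations for three groups
groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
result = one_way_anova(groups)
for name, value in result.items():
    print(name, value)
```

For these numbers CM is 225, the total SS is 30, SST is 24 and SSE is 6, which gives MST = 12, MSE = 1 and F = 12.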

ANOVA using Excel
To perform a single factor ANOVA in Excel follow these simple steps
 Go to the Data tab
 Click Data Analysis
 Select Anova: Single Factor and click OK (there are also other options like Anova: Two-Factor With Replication and Anova: Two-Factor Without Replication)
 Click the Input Range box and select the range
 Click the Output Range box, select the output range and click OK
 The result will be displayed in the Excel sheet
 If F is greater than F crit then the null hypothesis is rejected

ANOVA using SPSS
First install the SPSS software to perform the ANOVA. Here we can see how to perform a One way ANOVA using SPSS
SPSS always assumes that the independent variable is represented numerically. In the sample data set, MAJOR is a string, so first convert the string variable into a numerical variable. Once the conversion is done you are ready to do the ANOVA
 Open the SPSS software
 Click Analyze → Compare Means → One Way ANOVA
 One way ANOVA dialog box appears on the screen
 On the left side of the dialog box you will see a list of all the variables that you measured. Move the dependent variable into the Dependent List on the right side by using the upper arrow button
 In the same way move the independent variable from the left side list to the Factor box on the right side.
 Click on the Post Hoc button to select the type of multiple comparison you want to do.
 Select any Post hoc test that suits your research by clicking on the check box next to the test
 Click Continue and it will take you back to the One way ANOVA dialog box
 Select any statistics you want by clicking the check box to the left of each option
 Click Means plot to get a graph of the means of the conditions
 Click Continue and Click Ok
The SPSS output window will appear with six major sections
 Descriptive section
 Test of Homogeneity of Variances
 ANOVA
 Multiple Comparisons
 Grade Point Average
 Graph
Things to be considered when running an ANOVA
Data level and assumptions play a crucial role in ANOVA.
The researcher should find out whether the data is crossed or nested. If the data is crossed, every group receives every level of each factor.
If the data is nested, each group receives a different set of levels, and the ANOVA method must be chosen accordingly.
It is also important to calculate the ANOVA effect size. The effect size tells you the degree to which the null hypothesis is false, that is, how large the differences between the groups are, which a significant p value alone does not show.
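One common effect size for a one way ANOVA is eta squared, the share of the total sum of squares that is explained by the treatments (SST divided by the total SS). The article does not name a specific measure, so the choice of eta squared, the function name and the data below are assumptions for illustration.

```python
def eta_squared(groups):
    """Eta squared = SST / SS(Total) for a one way layout."""
    all_obs = [x for g in groups for x in g]
    cm = sum(all_obs) ** 2 / len(all_obs)                 # correction for the mean
    ss_total = sum(x * x for x in all_obs) - cm           # total sum of squares
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm  # treatment sum of squares
    return sst / ss_total


# Hypothetical observations for three groups
groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
effect = eta_squared(groups)
print(effect)  # share of the total variability explained by group membership
```

A value close to 1 means that most of the variability lies between the groups, while a value close to 0 means that the groups barely differ.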
Hope this article gave you a brief overview of ANOVA and interpreting results using it.