Correlation Coefficient Formula
What is the Correlation Coefficient Formula?
In statistics, some outcomes are directly related to other variables or situations, and the correlation coefficient measures the strength and direction of that association between two variables. Two variables have a positive correlation coefficient when they move in the same direction at the same time. If they move in opposite directions, they have a negative correlation coefficient. For example, if the interest rate in the market falls, corporate loans become cheaper and the economy gets a boost. The interest rate and economic growth move in opposite directions, so they have a negative correlation coefficient.
The magnitude of the correlation coefficient indicates the strength of the relationship between the variables. Its value ranges from -1 to +1. If the correlation coefficient is +1, the variables are perfectly positively correlated, and if it is -1, they are perfectly negatively correlated.
Suppose that we have 2 sets of data given by X (X1, X2 … Xn) and Y (Y1, Y2 … Yn).
The formula for the correlation coefficient is given by:
Correlation Coefficient = Σ [(X – Xm) * (Y – Ym)] / √ [Σ (X – Xm)² * Σ (Y – Ym)²]
Where:
- X – Data points in Data set X
- Y – Data points in Data set Y
- Xm – Mean of Data set X
- Ym – Mean of Data set Y
This formula can seem time consuming and confusing at first.
Another way to calculate the correlation coefficient is to use the CORREL() function in Excel. I will explain both approaches using examples.
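The formula can also be sketched in a few lines of Python. This is a minimal illustration, not part of the Excel template: the function name `correlation_coefficient` and the sample data are assumptions made for the example.

```python
import math

def correlation_coefficient(xs, ys):
    """Compute Σ[(X - Xm)(Y - Ym)] / √[Σ(X - Xm)² * Σ(Y - Ym)²]."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two data sets of equal length with at least 2 points")
    x_mean = sum(xs) / len(xs)  # Xm
    y_mean = sum(ys) / len(ys)  # Ym
    # Numerator: sum of the products of the paired deviations from the means
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations
    sum_sq_x = sum((x - x_mean) ** 2 for x in xs)
    sum_sq_y = sum((y - y_mean) ** 2 for y in ys)
    # Note: raises ZeroDivisionError if either data set is constant
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)

# Illustrative data only (hypothetical): Y is exactly 2 * X
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(correlation_coefficient(xs, ys))  # 1.0, a perfect positive correlation
```

Because Y rises in exact proportion to X in this sample, the result is +1. Reversing one of the lists would give -1.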
Examples of Correlation Coefficient Formula (With Excel Template)
Let’s take an example to understand the calculation of the Correlation Coefficient in a better manner.
Correlation Coefficient Formula – Example #1
Let’s say we have two data sets, X and Y, and each contains 20 random data points. Calculate the Correlation Coefficient for data sets X and Y.
Solution:
Mean is calculated as:
- Mean of Data Set X = 15.6
- Mean of Data Set Y = 13.8
Now, we need to calculate the difference between the data points and the mean value.
Similarly, calculate for all values of the data set X.
Similarly, calculate for all values of the data set Y.
Calculate the square of the difference for both the data sets X and Y.
Multiply each difference in X by the corresponding difference in Y.
Correlation Coefficient is calculated using the formula given below
Correlation Coefficient = Σ [(X – Xm) * (Y – Ym)] / √ [Σ (X – Xm)² * Σ (Y – Ym)²]
Correlation Coefficient = 0.343264
This means that the two data sets have a positive correlation, with a coefficient of 0.343264.
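The steps of this example (means, differences, squares, products, final ratio) can be mirrored column by column in Python. The sketch below uses a small hypothetical data set of 5 points, not the 20-point data sets of the example, so its result differs from 0.343264.

```python
import math

# Hypothetical data, NOT the data sets used in the example above
x = [10, 12, 15, 18, 20]
y = [8, 14, 11, 17, 15]

# Step 1: mean of each data set
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Step 2: difference between each data point and its mean
dx = [xi - x_mean for xi in x]
dy = [yi - y_mean for yi in y]

# Step 3: square of each difference
dx_sq = [d ** 2 for d in dx]
dy_sq = [d ** 2 for d in dy]

# Step 4: product of the paired differences
products = [a * b for a, b in zip(dx, dy)]

# Print the working columns, similar to a spreadsheet layout
print(f"{'X':>4} {'Y':>4} {'X-Xm':>7} {'Y-Ym':>7} {'(X-Xm)^2':>9} {'(Y-Ym)^2':>9} {'product':>8}")
for row in zip(x, y, dx, dy, dx_sq, dy_sq, products):
    print(f"{row[0]:>4} {row[1]:>4} {row[2]:>7.1f} {row[3]:>7.1f} {row[4]:>9.1f} {row[5]:>9.1f} {row[6]:>8.1f}")

# Step 5: apply the formula to the column totals
r = sum(products) / math.sqrt(sum(dx_sq) * sum(dy_sq))
print(f"Correlation Coefficient = {r:.6f}")
```

Each list corresponds to one column of the manual calculation, which makes it easy to check intermediate values against a spreadsheet.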
Correlation Coefficient Formula – Example #2
Let’s say you are looking to invest money in the stock market. You want to invest in 2 stocks and choose them in such a way that your portfolio is diversified. This means that if one stock gives you a negative return, the other will help you get a positive return, and vice versa. So basically you want to invest in stocks that have a negative correlation. You have 2 stocks and information on their historical returns for the last 15 years.
Solution:
Correlation Coefficient is calculated using the Excel CORREL() function.
Correlation Coefficient = -0.45986
Here we have used the CORREL() function of Excel to find the correlation coefficient for the 2 stocks. The correlation coefficient is negative, which means that the two stocks have a negative correlation. So your choice is apt as per your requirements.
Explanation
As discussed, the correlation coefficient is a measure of the extent of the relationship between two variables, but the catch is that it can only measure a relationship that is linear. It is not effective at capturing nonlinear relationships. There are a few other properties of the correlation coefficient:
- The correlation coefficient is unit-less. This is a very useful property since it allows you to compare data measured in different units. For example, stock prices depend on parameters such as inflation and interest rates, which are measured in different units, so we can use public information to determine the correlation between them.
- As discussed above, its value lies between -1 and +1. A value of +1 means perfectly positively correlated and -1 means perfectly negatively correlated.
Relevance and Uses of Correlation Coefficient Formula
The correlation coefficient helps us understand data sets and their relationships better, and it has many applications in finance and economics. Financial institutions, banks, companies, and even governments use the correlation coefficient to track historical data, extract meaningful information, and predict market trends efficiently. The correlation coefficient is a very powerful tool, but it should not be used in a silo. It should be applied along with other tools. The reason is simple: we cannot rely on data alone, because data sometimes gives us meaningless information. For example, suppose you have collected information and found a positive correlation between rainfall and the death of dogs. It means that in years with more rain, more dogs died. Although there is a correlation, it is not meaningful at all. This is called a spurious correlation. So be very careful when making decisions based only on data.