EDUCBA

What is Regression Analysis?

By Priya Pedamkar

Introduction to Regression Analysis

How Does Regression Analysis Work?

There are many types of regression techniques. Which one to use depends on the factors involved and on the type of outcome being predicted.

  • Linear Regression
  • Logistic Regression
  • Lasso/Ridge Regression
  • Polynomial Regression

Some of the important regression techniques used across various sectors are described below:

1. Linear Regression

Linear regression is used when the outcome variable depends linearly on the independent variables. It is commonly used when the data set is not very large. It is also sensitive to outliers, so if the data set contains outliers, it is better to treat them before applying linear regression. There are single-variable and multi-variable regression techniques. Simple Linear Regression is the analysis used when the outcome variable depends linearly on a single independent variable. It follows the equation of a straight line, given below:

Y=mx+c

Where,

  • Y= Target, Dependent, or Criterion Variable
  • x= Independent or predictor variable
  • m= Slope or Regression Coefficient
  • c= constant

Multi-Variable Linear Regression defines the relationship between the outcome variable and more than one independent variable. It follows the equation below, in which the dependent variable is a linear combination of all the independent variables:

Y = m1x1 + m2x2 + m3x3 + … + mnxn + c

Where,

  • Y= Target, Dependent, or Criterion Variable
  • x1, x2, x3…xn= Independent or predictor variables
  • m1, m2, m3…mn= Slope or Regression Coefficients of respective variables
  • c= constant
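
As a minimal sketch of the multi-variable equation above, the prediction is the sum of each coefficient times its variable, plus the constant. The function name `predict` and the example numbers are illustrative assumptions, not values from a real data set.

```python
def predict(coefficients, constant, features):
    """Compute Y = m1*x1 + m2*x2 + ... + mn*xn + c for one observation."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return sum(m * x for m, x in zip(coefficients, features)) + constant


# Hypothetical coefficients m1..m3 and constant c, chosen only for illustration.
coefficients = [2.0, 0.5, -1.0]
constant = 3.0
features = [1.0, 4.0, 2.0]  # x1, x2, x3 for a single observation

y = predict(coefficients, constant, features)
print(y)  # 2.0*1.0 + 0.5*4.0 - 1.0*2.0 + 3.0 = 5.0
```

With a single coefficient and a single feature, the same function reduces to the simple form Y = mx + c.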

Linear Regression follows the principle of the Least Squares method. Under this method, the line of best fit is the one that minimizes the sum of squared errors between the observed data and the line.
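
For the single-variable case, the least squares line can be computed in closed form. The sketch below is one way to do it in plain Python, under the assumption that the x values are not all identical. The function names and the toy data are illustrative.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (m, c) that minimize the sum of squared errors for Y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx  # assumes the xs are not all equal (sxx > 0)
    c = mean_y - m * mean_x
    return m, c


def sum_squared_error(xs, ys, m, c):
    """Sum of squared differences between observed ys and the line m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))


# Toy data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

m, c = fit_simple_linear_regression(xs, ys)
print(m, c)  # slope 2.0 and constant 1.0 for this data
print(sum_squared_error(xs, ys, m, c))
```

Any other choice of slope and constant gives a larger sum of squared errors on the same data, which is what makes this the line of best fit.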

Some assumptions should be checked before applying linear regression to a dataset:

  • There should be a linear relationship between independent and dependent variables.
  • There should be little or no multicollinearity between the independent variables. Multicollinearity is a phenomenon where there is a high correlation between the independent variables. We can treat multicollinearity by dropping one of the correlated variables or by combining the two variables into one.
  • Homoscedasticity: the error terms should be randomly scattered around the regression line with roughly constant spread. There should not be any pattern in the errors; if there is an identifiable pattern, then the data is said to be heteroscedastic.
  • The error terms (residuals) should be approximately normally distributed, which we can check by plotting a Q-Q plot. If they are not, a nonlinear transformation of the variables (for example, a log transformation) can help.

So, it is always advisable to test these assumptions when applying linear regression, in order to get good accuracy and reliable results.
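
One of these checks, multicollinearity, can be sketched by computing the Pearson correlation between every pair of independent variables and flagging the pairs that are highly correlated. The 0.9 cutoff, the function names, and the toy columns below are assumptions made for illustration; an appropriate cutoff depends on the problem.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name1, name2, r) for each pair of columns with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_correlation(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:  # threshold is an illustrative assumption
                pairs.append((names[i], names[j], r))
    return pairs


# Hypothetical independent variables: two of them measure the same thing.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30.0, 22.0, 41.0, 25.0, 35.0],
}

pairs = highly_correlated_pairs(columns)
for name1, name2, r in pairs:
    print(name1, name2, round(r, 3))
```

Here only the two height columns are flagged, so one of them could be dropped before fitting the model.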

2. Logistic Regression

This regression technique is used when the target or outcome variable is categorical, specifically binary. The main difference between linear and logistic regression lies in the target variable: in linear regression it should be continuous, whereas in logistic regression it should be categorical. The outcome variable should have exactly two classes. Examples include spam filters in email (Spam/Not Spam) and fraud detection (Fraud/Not Fraud). Logistic regression works on the principle of probability: observations are classified into one of the two categories by setting a threshold value on the predicted probability.

For example, if there are two categories A and B and we set the threshold value to 0.5, then a probability above 0.5 is assigned to one category and a probability below 0.5 to the other. Logistic Regression follows an S-shaped curve. Before building the logistic regression model, we have to split the data set into training and testing sets. Since the target variable is binary, we also have to make sure that the classes are properly balanced in the training set.
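
The S-shaped curve and the threshold rule can be sketched as follows. The sigmoid function maps any real-valued score to a probability between 0 and 1, and the threshold turns that probability into a category. The function names and the treatment of a probability of exactly 0.5 (assigned to B here) are assumptions, since the text does not specify that case.

```python
import math


def sigmoid(z):
    """S-shaped curve mapping a real-valued score to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form for negative z that avoids overflow in exp().
    e = math.exp(z)
    return e / (1.0 + e)


def classify(probability, threshold=0.5):
    """Assign category A above the threshold, otherwise category B."""
    return "A" if probability > threshold else "B"


# Hypothetical model scores (the linear combination of the inputs).
scores = [-2.0, 0.0, 2.0]
labels = [classify(sigmoid(z)) for z in scores]
print(labels)  # ['B', 'B', 'A']
```

Raising or lowering the threshold trades one kind of misclassification for the other, so 0.5 is a starting point rather than a fixed rule.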

If there is class imbalance, it can be treated using methods such as the following:

  • Up Sampling: In this technique, rows of the class which has fewer rows are sampled repeatedly to match the number of rows of the majority class.
  • Down Sampling: In this technique, the class which has more rows is sampled down to match the number of rows of the minority class.
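
Both techniques can be sketched with Python's standard `random` module. This is a minimal illustration, assuming the rows of each class are held in separate lists and that resampling is applied to the training set. The function names, row labels, and the fixed seed are hypothetical.

```python
import random


def up_sample(minority_rows, target_size, seed=0):
    """Keep all minority rows and draw extra ones with replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    extra = rng.choices(minority_rows, k=target_size - len(minority_rows))
    return list(minority_rows) + extra


def down_sample(majority_rows, target_size, seed=0):
    """Draw a subset of the majority rows without replacement."""
    rng = random.Random(seed)
    return rng.sample(majority_rows, k=target_size)


# Hypothetical training rows: 2 minority-class rows, 6 majority-class rows.
minority = ["fraud_1", "fraud_2"]
majority = ["ok_1", "ok_2", "ok_3", "ok_4", "ok_5", "ok_6"]

up = up_sample(minority, len(majority))      # 6 rows, minority rows repeated
down = down_sample(majority, len(minority))  # 2 distinct majority rows
print(len(up), len(down))  # 6 2
```

Up sampling keeps all the original information but repeats rows, while down sampling discards majority rows, so the choice depends on how much data is available.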

Some points are important to understand before applying the logistic regression model to a data set:

  • The target variable should be binary in nature. If there are more than two classes in the target variable, the technique is known as Multinomial Logistic Regression.
  • There should be little or no multicollinearity between the independent variables.
  • It generally requires a large sample size to work well.
  • There should be a linear relationship between the independent variables and the log of odds.

Benefits of Regression

There are many benefits of regression analysis. Instead of relying on gut feeling to predict an outcome, we can use regression analysis to back predictions of possible outcomes with evidence from the data.

Some of its uses are listed below:

  • To predict the sales and revenue in any sector for shorter or longer periods.
  • To predict the customer churn rate in any industry and find suitable measures to reduce it.
  • To understand and predict the inventory levels of the warehouse.
  • To find whether introducing a new product in the market will be successful or not.
  • To predict whether a customer will default on a loan or not.
  • To predict whether any customer will buy a product or not.
  • To detect fraud or spam.

Conclusion

Various evaluation metrics are considered after applying the model. Although there are assumptions that need to be tested before applying the model, we can often transform the variables using mathematical methods to better satisfy them and improve model performance.
