EDUCBA

EDUCBA

MENUMENU
  • Free Tutorials
  • Free Courses
  • Certification Courses
  • 360+ Courses All in One Bundle
  • Login
Home Data Science Data Science Tutorials R Programming Tutorial Predict Function in R
Secondary Sidebar
R programming Tutorial
  • Advanced
    • Statistical Analysis with R
    • R String Functions
    • Data Exploration in R
    • R CSV Files
    • KNN Algorithm in R
    • Sorting in R
    • lm Function in R
    • Hierarchical Clustering in R
    • R Normal Distribution
    • Binomial Distribution in R
    • Decision Tree in R
    • GLM in R
    • Arima Model in R
    • Linear Model in R
    • Predict Function in R
    • Survival Analysis in R
    • Standard Deviation in R
    • Statistical Analysis in R
    • Predictive Analysis?in R
    • T-test in R
    • Database in R
  • Basic
    • What is R Programming Language
    • Careers in R Programming
    • Install R
    • List of R Packages
    • Introduction of R Tools Technology
    • R Programming Language
    • DataSet in R
    • What is RStudio?
    • R-studio-Functions
    • R Packages
    • Time series?in R
    • R Data Types
    • R for data science
    • R Operators
    • R Data Frame
    • R Analytics Tool
    • R Tree Package
    • Vectors in R
  • Control statement
    • If Statement in R
    • If Else Statement in R
    • Else if in R
    • Switch Statement in R
  • Loops
    • Loops in R
    • For Loop in R
    • Nested For Loop in R
    • While Loop in R
    • Next in R
  • Chart/graphs
    • Graphs in R
    • Bar Charts in R
    • Pie Chart in R
    • Histogram in R
    • Line Graph in R
    • Plot Function in R
    • Scatterplot in R
    • R Boxplot labels
  • Regression in R
    • Simple Linear Regression in R
    • Linear Regression in R
    • Multiple Linear Regression in R
    • Logistic Regression in R
    • Poisson Regression in R
    • OLS Regression in R
    • P-Value in Regression
  • Anova in R
    • ANOVA in R
    • One Way ANOVA in R
    • Two Way ANOVA in R
  • Data Structure
    • R list
    • Arrays in R
    • Data Frames in R
    • Factors in R
    • R Vectors
  • Programs
    • Functions in R
    • Boxplot in R
    • R Program Functions
    • Factorial in R
    • Random Number Generator in R
  • Interview question
    • R Interview Questions

Related Courses

R Programming Certification Course

Statistical Analysis Course Training

All in One Data Science Courses

Predict Function in R

By Priya PedamkarPriya Pedamkar

predict function in r

Introduction to Predict function in R

A function in R programming which is syntactically represented as predict(model, data) that is used to apply an already obtained model to another section of the dataset over the portion of which the model used in it was trained, with the data over which the model was built being referred to as train dataset and the data over which the model is to be applied referred to as test dataset, is referred to as predict() function in R programming

Predictive Analytics (Machine Learning)

It covers every time frame, basically, it will consider historical data as well as current data and on the basis of it frame a model which will predict the data or you can say forecast the data. It will include statistical techniques, predictive modeling, machine learning, etc.

For Example, in this, we will continue the same scenario which we used in descriptive analytics, like once we fit in the historical or current data into our model and we pass on the predict command on our new input data, automatically model will tell us that which of the new customer has a chance to default on loans. This technique gives the company a good heads up that in which direction they have to work.

Descriptive Analytics (Business Intelligence)

In this branch of analytics, we will interpret the historical data to understand the changes that occurred in the business. The main types of descriptive analytics techniques include data aggregation and data mining which can provide us the knowledge about past events. The best example of this is let’s suppose you are studying the data of the people who took a loan and you want to specifically study which type of people default on loans. By studying closely, we can identify which kind of people default on loans like what is their age or whether they belong to the same location or whether they are into the same occupation or they work under the same industry sector. This will help us to learn from historical mistakes.

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

Prescriptive Analytics (Decision Science)

It is the combination of both descriptive and predictive analytics, it will help the company to make effective decisions. It can provide an answer to a question like which type of customer will default on the loan, and at the same time suggest the ways like what should a company do to reduce the number of defaults.

All in One Data Science Bundle(360+ Courses, 50+ projects)
Python TutorialMachine LearningAWSArtificial Intelligence
TableauR ProgrammingPowerBIDeep Learning
Price
View Courses
360+ Online Courses | 50+ projects | 1500+ Hours | Verifiable Certificates | Lifetime Access
4.7 (86,241 ratings)

In this blog, we will talk about predictive analytics more where we will develop data science models which will help us to predict “what next”. In prediction, there are different types of already existing models in Rstudio like lm, glm or random forest. We will talk about “lm” here.

Predict function syntax in R looks like this:

Predict Function in R syntax

Arguments

  • The object is a class inheriting from “lm”
  • Newdata is a new data frame wherein we have to predict the value
  • Se.fit is used when standard errors are required
  • The scale is generally NULL, but it is used for standard error calculation
  • Df is degrees of freedom
  • Interval, here we have mentioned the type of interval for the calculation
  • Level, here we have to mention the confidence level which is fine to the researcher. Like some studies are conducted with 95% confidence and some are done on 99%.
  • Type, basically the type of prediction (response or model)
  • Na.action is a function which instructs what to do with missing values, the default here is NA
  • Pred.var is the variance for future observation which needs to be assumed for the prediction interval
  • Weights are the variance weights for prediction

We will work on the dataset which already exists in R known as “Cars”. And we will build a linear regression model that will predict the distance on the basis of the speed.

This dataset has 50 observations of 2 variables.

  • The first variable is speed (mph) which has numeric figures
  • The second variable is Distance (ft) which also has numeric figures

A dataset “cars” look like this.

Case Number Speed Distance
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
7 10 18
8 10 26
9 10 34
10 11 17
11 11 28
12 12 14
13 12 20
14 12 24
15 12 28
16 13 26
17 13 34
18 13 34
19 13 46
20 14 26
21 14 36
22 14 60
23 14 80
24 15 20
25 15 26
26 15 54
27 16 32
28 16 40
29 17 32
30 17 40
31 17 50
32 18 42
33 18 56
34 18 76
35 18 84
36 19 36
37 19 46
38 19 68
39 20 32
40 20 48
41 20 52
42 20 56
43 20 64
44 22 66
45 23 54
46 24 70
47 24 92
48 24 93
49 24 120
50 25 85

Now we will build the linear regression model because to predict something we need a model that has both input and output. Once the model learns that how data works, it will also try to provide predicted figures based on the input supplied, we will come to the prediction part in a while, first, we will make a model.

linear_model = lm(dist~speed, data = cars)
linear_model

Predict Function in R output 1

The Linear regression model equation is:

Y = β1 + β2X + ϵ

  • X = Independent Variable
  • Y = Dependent Variable
  • Β1 = Intercept of the regression model
  • β2 = Slope of the regression model
  • ϵ = error term

When we fit variables of our model then the equation looks like:

Dist = β1 + β2(Speed) + ϵ

And when we fit the outcome of our model into this equation it looks like:

Dist = -17.579 + 3.932(Speed)

Now we have a model, we can predict the value of the new dataset by giving inputs to our model.

Case Number Speed Distance
51 10 To be predicted
52 12 To be predicted
53 15 To be predicted
54 18 To be predicted
55 10 To be predicted
56 14 To be predicted
57 20 To be predicted
58 25 To be predicted
59 14 To be predicted
60 12 To be predicted

We will provide the above speed variable data as an input to our model.

We can predict the value by using function Predict() in Rstudio.

Example:

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed)

output 2

Now we have predicted values of the distance variable. We have to incorporate confidence level also in these predictions, this will help us to see how sure we are about our predicted values.

Output with predicted values.

Case Number Speed Distance
51 10 21.74499
52 12 29.60981
53 15 41.40704
54 18 53.20426
55 10 21.74499
56 14 37.47463
57 20 61.06908
58 25 80.73112
59 14 37.47463
60 12 29.60981

Confidence interval of Predict Function in R

It will helps us to deal with the uncertainty around the mean predictions. By using interval command in Predict() function we can get 95% of the confidence interval. This 95% of confidence level is pre-fitted in the function.

Example

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed, interval = "confidence")

Output:

output 3

The 95% confidence intervals associated with a speed of 10 are (15.46, 28.02). This means that, according to our model, 95% of the cars with a speed of 10 mph have a stopping distance between 15.46 and 28.02.

Recommended Articles

This is a guide to Predict Function in R. Here we discuss the three types of Predict Analytics along with the Examples and Arguments. You can also go through our other suggested articles to learn more –

  1. Predictive Modeling
  2. What is Predictive Analytics?
  3. Predictive Analytics Tool
  4. Decision-Making Techniques
Popular Course in this category
R Programming Training (13 Courses, 20+ Projects)
  13 Online Courses |  20 Hands-on Projects |  120+ Hours |  Verifiable Certificate of Completion
4.5
Price

View Course

Related Courses

Statistical Analysis Training (15 Courses, 10+ Projects)4.9
All in One Data Science Bundle (360+ Courses, 50+ projects)4.8
0 Shares
Share
Tweet
Share
Primary Sidebar
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Corporate Training
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2022 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA
Free Data Science Course

SPSS, Data visualization with Python, Matplotlib Library, Seaborn Package

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more