Predict Function in R

Article by Priya Pedamkar

Updated March 24, 2023

Introduction to Predict function in R

The predict() function in R, called as predict(model, data), applies an already fitted model to new data. The data on which the model was built is referred to as the train dataset, and the data to which the model is then applied is referred to as the test dataset.
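As a minimal sketch of this train/test idea, the snippet below splits R's built-in cars dataset (the one used later in this article) into two parts, fits a model on one part and applies it to the other. The 40/10 split, the seed and the names train_data, test_data and fit are illustrative assumptions, not part of the original example.

```r
# Illustrative split: 40 rows to train on, the remaining 10 to test on
set.seed(42)                                   # arbitrary seed, for reproducibility
train_rows <- sample(nrow(cars), 40)
train_data <- cars[train_rows, ]
test_data  <- cars[-train_rows, ]

# Fit the model on the training data only
fit <- lm(dist ~ speed, data = train_data)

# Apply the fitted model to the held-out test data
test_pred <- predict(fit, newdata = test_data)

# Put predicted and observed stopping distances side by side
data.frame(speed     = test_data$speed,
           observed  = test_data$dist,
           predicted = test_pred)
```

Because the test rows were not used for fitting, comparing the observed and predicted columns gives a rough impression of how the model behaves on unseen data.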

Predictive analytics is one of three broad branches of analytics, so it helps to see where it sits among them.

Descriptive Analytics (Business Intelligence)

This branch interprets historical data to understand the changes that have occurred in the business. Its main techniques are data aggregation and data mining, which give us knowledge about past events. For example, suppose you are studying data on people who took out loans and you want to know which kinds of people default. By studying the data closely you can see what the defaulters have in common: their age, their location, their occupation, or the industry sector they work in. This helps the business learn from historical mistakes.

Predictive Analytics (Machine Learning)

This branch uses historical as well as current data to build a model that predicts, or forecasts, new data. It draws on statistical techniques, predictive modeling, machine learning, and so on.

For example, continuing the loan scenario from descriptive analytics: once we fit a model to the historical or current data and run the predict command on new input data, the model tells us which of the new customers are likely to default on their loans. This gives the company an early indication of where it needs to focus its effort.

Prescriptive Analytics (Decision Science)

This branch combines descriptive and predictive analytics to help the company make effective decisions. It can answer a question such as which type of customer will default on a loan and, at the same time, suggest what the company should do to reduce the number of defaults.

In this blog, we focus on predictive analytics: we will develop a data science model that helps us predict “what next”. R offers several modeling functions whose fitted models can be passed to predict(), such as lm and glm in base R, or random forests from add-on packages. We will talk about “lm” here.

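The syntax of predict() for an “lm” model, as documented in ?predict.lm, looks like this (newer R versions may list further arguments):

predict(object, newdata, se.fit = FALSE, scale = NULL, df = Inf,
        interval = c("none", "confidence", "prediction"),
        level = 0.95, type = c("response", "terms"),
        terms = NULL, na.action = na.pass,
        pred.var = res.var/weights, weights = 1, ...)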

Arguments

  • object is a fitted model object of a class inheriting from “lm”
  • newdata is an optional data frame containing the values for which we want predictions; if it is omitted, the fitted values are returned
  • se.fit is set to TRUE when standard errors are required
  • scale is the scale parameter used in the standard error calculation; it is generally NULL
  • df is the degrees of freedom for scale
  • interval is the type of interval to calculate (none, confidence or prediction)
  • level is the confidence level acceptable to the researcher; some studies are conducted with 95% confidence and some with 99%
  • type is the type of prediction (response or terms)
  • na.action is a function that determines what to do with missing values in newdata; the default, na.pass, predicts NA for them
  • pred.var is the variance for future observations, which needs to be assumed for prediction intervals
  • weights are the variance weights for prediction
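The defaults behind these arguments can be checked from R itself. The sketch below looks up the “lm” method of predict() with getS3method() and inspects its formal arguments. The variable names are only illustrative, and the exact argument list may differ slightly between R versions.

```r
# Look up the method that predict() dispatches to for "lm" objects
predict_lm <- getS3method("predict", "lm")

# Argument names, in the order they are declared
arg_names <- names(formals(predict_lm))
arg_names

# Default confidence level and the available interval types
default_level  <- formals(predict_lm)$level
interval_types <- eval(formals(predict_lm)$interval)
default_level
interval_types
```

The first entry of interval_types is the one used when interval is not supplied, which is why a plain predict() call returns point predictions only.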

We will work with a dataset that already exists in R, known as “cars”, and build a linear regression model that predicts the stopping distance on the basis of the speed.

This dataset has 50 observations of 2 variables.

  • The first variable is speed (mph), which is numeric
  • The second variable is dist, the stopping distance (ft), which is also numeric

The “cars” dataset looks like this.

Case Number Speed Distance
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
7 10 18
8 10 26
9 10 34
10 11 17
11 11 28
12 12 14
13 12 20
14 12 24
15 12 28
16 13 26
17 13 34
18 13 34
19 13 46
20 14 26
21 14 36
22 14 60
23 14 80
24 15 20
25 15 26
26 15 54
27 16 32
28 16 40
29 17 32
30 17 40
31 17 50
32 18 42
33 18 56
34 18 76
35 18 84
36 19 36
37 19 46
38 19 68
39 20 32
40 20 48
41 20 52
42 20 56
43 20 64
44 22 66
45 23 54
46 24 70
47 24 92
48 24 93
49 24 120
50 25 85
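The same data can be inspected directly in R, since cars ships with the built-in datasets package. A short sketch:

```r
# cars is available without loading any extra package
str(cars)        # structure: a data frame with 50 obs. of 2 variables (speed, dist)
head(cars, 5)    # first five rows, matching cases 1 to 5 in the table above
summary(cars)    # range and quartiles of each variable
```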

Now we will build the linear regression model, because to predict something we need a model fitted on data that contains both the input and the output. Once the model has learned how the two are related, it can produce predicted values for any input we supply. We will come to the prediction part in a while; first, we build the model.

linear_model = lm(dist~speed, data = cars)
linear_model

Output:

Call:
lm(formula = dist ~ speed, data = cars)

Coefficients:
(Intercept)        speed
    -17.579        3.932

The Linear regression model equation is:

Y = β1 + β2X + ϵ

  • X = Independent Variable
  • Y = Dependent Variable
  • β1 = Intercept of the regression model
  • β2 = Slope of the regression model
  • ϵ = error term

When we substitute the variables of our model, the equation looks like:

Dist = β1 + β2(Speed) + ϵ

And when we substitute the coefficients estimated by our model, it becomes:

Dist = -17.579 + 3.932(Speed)
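To see that predict() simply evaluates this equation, the following sketch computes a prediction by hand from the fitted coefficients and compares it with the result of predict(). The names b, manual and from_predict are illustrative. With the rounded coefficients shown above, a speed of 10 gives -17.579 + 3.932 × 10 = 21.741, which differs slightly from the unrounded result.

```r
linear_model <- lm(dist ~ speed, data = cars)

# Extract the unrounded intercept and slope
b <- coef(linear_model)

# Evaluate Dist = intercept + slope * Speed by hand for speed = 10
manual <- unname(b["(Intercept)"] + b["speed"] * 10)

# The same value obtained from predict()
from_predict <- unname(predict(linear_model, newdata = data.frame(speed = 10)))

manual
from_predict
all.equal(manual, from_predict)
```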

Now that we have a model, we can predict distances for a new dataset by supplying its speeds as inputs to the model.

Case Number Speed Distance
51 10 To be predicted
52 12 To be predicted
53 15 To be predicted
54 18 To be predicted
55 10 To be predicted
56 14 To be predicted
57 20 To be predicted
58 25 To be predicted
59 14 To be predicted
60 12 To be predicted

We will provide the above speed variable data as an input to our model.

We can predict the values by using the predict() function in R. Note that R is case-sensitive, so the function name is written in lowercase.

Example:

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed)

Now we have predicted values of the distance variable, listed in the table below. To see how sure we are about these predictions, we also need to attach a confidence level to them.

Output with predicted values.

Case Number Speed Distance
51 10 21.74499
52 12 29.60981
53 15 41.40704
54 18 53.20426
55 10 21.74499
56 14 37.47463
57 20 61.06908
58 25 80.73112
59 14 37.47463
60 12 29.60981
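In practice it is convenient to keep each prediction next to the input that produced it. A small sketch, reusing the objects from the example above and adding a hypothetical column name, predicted_dist:

```r
Input_variable_speed <- data.frame(speed = c(10, 12, 15, 18, 10, 14, 20, 25, 14, 12))
linear_model <- lm(dist ~ speed, data = cars)

# Store the predictions as a new column beside the speeds
Input_variable_speed$predicted_dist <- predict(linear_model,
                                               newdata = Input_variable_speed)
Input_variable_speed
```

Rows with the same speed (for example the two rows with speed 10) receive the same predicted distance, because the model has speed as its only input.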

Confidence interval of Predict Function in R

A confidence interval helps us deal with the uncertainty around the mean predictions. Setting the interval argument of predict() to "confidence" returns a confidence interval for each prediction. The default confidence level is 95% (level = 0.95), and it can be changed through the level argument.

Example

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed, interval = "confidence")

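Output: a matrix with one row per input speed and three columns, fit (the predicted value), lwr (lower bound) and upr (upper bound).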

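The 95% confidence interval associated with a speed of 10 is (15.46, 28.02). This means that, according to our model, we are 95% confident that the mean stopping distance of cars travelling at 10 mph lies between 15.46 and 28.02 ft. It does not mean that 95% of individual cars stop within this range; that is described by a prediction interval, obtained with interval = "prediction", which is wider.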
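A confidence interval describes the mean response, whereas the range for one individual new observation is a prediction interval, which predict() returns with interval = "prediction". The sketch below compares the two for a speed of 10 mph and also shows the level argument. The object names are illustrative.

```r
linear_model <- lm(dist ~ speed, data = cars)
new_speed <- data.frame(speed = 10)

# Interval for the mean stopping distance at 10 mph (default level = 0.95)
conf_int <- predict(linear_model, newdata = new_speed, interval = "confidence")

# Interval for the stopping distance of one individual car at 10 mph
pred_int <- predict(linear_model, newdata = new_speed, interval = "prediction")

# A 99% confidence interval, set through the level argument
conf_int99 <- predict(linear_model, newdata = new_speed,
                      interval = "confidence", level = 0.99)

# Stack the three results: same fit, different lower and upper bounds
rbind(confidence_95 = conf_int[1, ],
      prediction_95 = pred_int[1, ],
      confidence_99 = conf_int99[1, ])
```

All three rows share the same fit value; the prediction interval is the widest because it also accounts for the scatter of individual cars around the mean.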
