Predict Function in R

Article by Priya Pedamkar

Updated March 24, 2023

Introduction to Predict function in R

The predict() function in R, written as predict(model, data), applies an already fitted model to new data. The data on which the model was built is called the training dataset, and the data to which the model is then applied is called the test dataset.
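As a minimal sketch of the train/test idea, the snippet below splits the built-in cars dataset (used later in this article), fits a model on one part, and applies it to the other. The 40/10 split, the seed, and the variable names are arbitrary choices made for illustration only.

```r
# Split the built-in cars data into a training part and a test part.
# The 40/10 split and the seed are arbitrary, illustrative choices.
set.seed(1)
train_rows <- sample(nrow(cars), 40)
train <- cars[train_rows, ]
test  <- cars[-train_rows, ]

# Fit the model on the training rows only.
fit <- lm(dist ~ speed, data = train)

# Apply the fitted model to the held-out test rows.
test_pred <- predict(fit, newdata = test)

# Compare actual and predicted stopping distances side by side.
head(data.frame(speed = test$speed, actual = test$dist, predicted = test_pred))
```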

Predictive Analytics (Machine Learning)

Predictive analytics uses both historical and current data to build a model that forecasts future outcomes. It draws on statistical techniques, predictive modeling, and machine learning.

For example, take the loan scenario described under descriptive analytics below: once we fit a model to the historical and current data and call predict() on new input data, the model indicates which new customers are likely to default on their loans. This gives the company an early indication of where to focus its efforts.

Descriptive Analytics (Business Intelligence)

In this branch of analytics, we interpret historical data to understand the changes that have occurred in a business. The main techniques are data aggregation and data mining, which provide knowledge about past events. For example, suppose you are studying data on people who took a loan and want to know which type of people default. By studying the data closely, we can identify common traits among defaulters, such as their age, location, occupation, or industry sector. This helps us learn from historical mistakes.

Prescriptive Analytics (Decision Science)

Prescriptive analytics combines descriptive and predictive analytics to help a company make effective decisions. It can answer a question such as which type of customer will default on a loan and, at the same time, suggest what the company should do to reduce the number of defaults.

In this blog, we focus on predictive analytics, where we develop data science models that help us predict “what next”. R offers several modeling functions that work with predict(), such as lm, glm, or random forest models from add-on packages. We will talk about “lm” here.

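For a model fitted with lm, the syntax of predict() (as documented in ?predict.lm; the exact argument list can vary slightly between R versions) looks like this:

predict(object, newdata, se.fit = FALSE, scale = NULL, df = Inf,
        interval = c("none", "confidence", "prediction"),
        level = 0.95, type = c("response", "terms"),
        terms = NULL, na.action = na.pass,
        pred.var = res.var/weights, weights = 1, ...)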

Arguments

  • object is a fitted model object of a class inheriting from “lm”
  • newdata is an optional data frame holding the predictor values for which to predict; if it is omitted, the fitted values are returned
  • se.fit is a logical switch; set it to TRUE when standard errors are required
  • scale is the scale parameter used for the standard error calculation; it is generally left as NULL
  • df is the degrees of freedom for scale
  • interval is the type of interval to calculate: "none", "confidence", or "prediction"
  • level is the confidence level chosen by the researcher (default 0.95); some studies use 95% confidence and some use 99%
  • type is the type of prediction: "response" or "terms"
  • na.action is a function that determines what to do with missing values in newdata; the default, na.pass, returns NA for those rows
  • pred.var is the variance for future observations, which is assumed for prediction intervals
  • weights are the variance weights for prediction
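The arguments above can be inspected directly in an R session. The sketch below looks up the lm method of predict(), lists its argument names, and shows what se.fit = TRUE returns; the variable names arg_names, fit, and res are illustrative.

```r
# Look up the lm method of predict() and list its argument names.
predict_lm <- getS3method("predict", "lm")
arg_names <- names(formals(predict_lm))
print(arg_names)

# With se.fit = TRUE, predict() returns a list instead of a plain vector.
fit <- lm(dist ~ speed, data = cars)
res <- predict(fit, newdata = data.frame(speed = 10), se.fit = TRUE)

names(res)   # components: fit, se.fit, df, residual.scale
res$se.fit   # standard error of the predicted mean at speed = 10
```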

We will work with “cars”, a dataset that ships with R, and build a linear regression model that predicts stopping distance on the basis of speed.

This dataset has 50 observations of 2 variables.

  • The first variable is speed (mph), which is numeric
  • The second variable is dist, the stopping distance (ft), which is also numeric

The “cars” dataset looks like this.

Case Number Speed Distance
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
7 10 18
8 10 26
9 10 34
10 11 17
11 11 28
12 12 14
13 12 20
14 12 24
15 12 28
16 13 26
17 13 34
18 13 34
19 13 46
20 14 26
21 14 36
22 14 60
23 14 80
24 15 20
25 15 26
26 15 54
27 16 32
28 16 40
29 17 32
30 17 40
31 17 50
32 18 42
33 18 56
34 18 76
35 18 84
36 19 36
37 19 46
38 19 68
39 20 32
40 20 48
41 20 52
42 20 56
43 20 64
44 22 66
45 23 54
46 24 70
47 24 92
48 24 93
49 24 120
50 25 85
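The same table can be inspected from an R session rather than read off the page. This is a small sketch using only base R; dims is an illustrative variable name.

```r
# cars ships with base R (datasets package), so no import is needed.
str(cars)       # structure: 50 observations of 2 numeric variables

dims <- dim(cars)
dims            # number of rows and columns

head(cars)      # first six rows: columns speed and dist
summary(cars)   # range and quartiles of each variable
```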

Now we will build the linear regression model, because to predict something we need a model that relates an input to an output. Once the model has learned the relationship from the data, it can provide predicted figures for any input we supply. We will come to the prediction part in a while; first, we will fit the model.

linear_model = lm(dist~speed, data = cars)
linear_model

Output: the printed model reports an intercept of -17.579 and a speed coefficient of 3.932.
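If the coefficients are needed as numbers rather than as printed text, they can be pulled out of the model object. A minimal sketch, with model_coefs as an illustrative name:

```r
linear_model <- lm(dist ~ speed, data = cars)

# coef() returns a named numeric vector: "(Intercept)" and "speed".
model_coefs <- coef(linear_model)
round(model_coefs, 3)
# (Intercept)       speed
#     -17.579       3.932

# summary() adds standard errors, t-values, and R-squared.
summary(linear_model)
```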

The Linear regression model equation is:

Y = β1 + β2X + ϵ

  • X = Independent Variable
  • Y = Dependent Variable
  • β1 = Intercept of the regression model
  • β2 = Slope of the regression model
  • ϵ = error term

When we fit variables of our model then the equation looks like:

Dist = β1 + β2(Speed) + ϵ

When we substitute the estimated coefficients into this equation (the fitted line itself has no error term), it looks like:

Dist = -17.579 + 3.932(Speed)
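To see that predict() does nothing more mysterious than evaluate this fitted equation, the sketch below computes one prediction by hand and compares it with the function's result. The speed of 10 mph and the names b, manual, and auto are illustrative.

```r
linear_model <- lm(dist ~ speed, data = cars)
b <- coef(linear_model)

# Evaluate the fitted equation by hand for a speed of 10 mph.
new_speed <- 10
manual <- b[["(Intercept)"]] + b[["speed"]] * new_speed

# Let predict() do the same calculation.
auto <- predict(linear_model, newdata = data.frame(speed = new_speed))

# The two results agree up to floating-point tolerance.
all.equal(manual, unname(auto))
```

Using the full-precision coefficients from coef() avoids the small rounding error that comes from plugging in -17.579 and 3.932 directly.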

Now that we have a model, we can predict the distance for new observations by giving their speeds as inputs to the model.

Case Number Speed Distance
51 10 To be predicted
52 12 To be predicted
53 15 To be predicted
54 18 To be predicted
55 10 To be predicted
56 14 To be predicted
57 20 To be predicted
58 25 To be predicted
59 14 To be predicted
60 12 To be predicted

We will provide the above speed values as input to our model.

We can predict the values by using the predict() function in R. The column name in newdata must match the predictor name used in the model, here speed.

Example:

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed)

Now we have predicted values of the distance variable. We should also attach a confidence level to these predictions, which helps us see how sure we are about the predicted values.

Output with predicted values.

Case Number Speed Distance
51 10 21.74499
52 12 29.60981
53 15 41.40704
54 18 53.20426
55 10 21.74499
56 14 37.47463
57 20 61.06908
58 25 80.73112
59 14 37.47463
60 12 29.60981
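A table like the one above can be assembled in R by binding the predictions to their inputs. This is a sketch only; the results data frame and its case column are illustrative names that mirror the case numbers 51 to 60 used in the table.

```r
Input_variable_speed <- data.frame(speed = c(10, 12, 15, 18, 10, 14, 20, 25, 14, 12))
linear_model <- lm(dist ~ speed, data = cars)

# Combine case numbers, input speeds, and predicted distances in one table.
results <- data.frame(
  case  = 51:60,
  speed = Input_variable_speed$speed,
  dist  = predict(linear_model, newdata = Input_variable_speed)
)
results
```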

Confidence interval of Predict Function in R

It helps us deal with the uncertainty around the mean predictions. By using the interval argument of the predict() function, we can get a confidence interval for each prediction. The 95% confidence level is the default (level = 0.95).

Example

Input_variable_speed <- data.frame(speed = c(10,12,15,18,10,14,20,25,14,12))
linear_model = lm(dist~speed, data = cars)
predict(linear_model, newdata = Input_variable_speed, interval = "confidence")

Output:

(The result is a matrix with the columns fit, lwr, and upr, with one row per input speed.)

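The 95% confidence interval associated with a speed of 10 is (15.46, 28.02). This means that, according to our model, we can be 95% confident that the mean stopping distance of cars travelling at 10 mph lies between 15.46 ft and 28.02 ft. It is not a range that contains 95% of individual cars; that would call for a prediction interval (interval = "prediction").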
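The difference between interval types and confidence levels can be checked with a short sketch. It compares the confidence interval with a prediction interval and with a 99% confidence interval at a speed of 10 mph; the variable names are illustrative.

```r
linear_model <- lm(dist ~ speed, data = cars)
new_speed <- data.frame(speed = 10)

# 95% confidence interval for the mean stopping distance at 10 mph.
conf_int <- predict(linear_model, newdata = new_speed, interval = "confidence")

# 95% prediction interval for a single new car at 10 mph (wider).
pred_int <- predict(linear_model, newdata = new_speed, interval = "prediction")

# 99% confidence interval: a higher level also widens the interval.
conf_int_99 <- predict(linear_model, newdata = new_speed,
                       interval = "confidence", level = 0.99)

# Each result is a matrix with columns fit, lwr, and upr.
rbind(confidence_95 = conf_int[1, ],
      prediction_95 = pred_int[1, ],
      confidence_99 = conf_int_99[1, ])
```

All three rows share the same fit value; only the lower and upper bounds change.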
