EDUCBA

Data Frames in R

By Priya Pedamkar

Introduction to Data Frames in R

A data frame in R is a data structure that stores data in a two-dimensional, tabular form. It is a special kind of list in which every component has the same length. R provides the built-in function data.frame() to create a data frame and fill it with data, and the data frame's name is then used to retrieve and modify its elements. Each component of the list becomes a column, named after the component, and the component's values fill the rows. Data frames are widely used when building machine learning models in data science projects.

A data frame has the following characteristics:

  • Column names should be non-empty
  • Row names should be unique
  • Every column should contain the same number of items
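
These characteristics can be checked directly. The block below is a minimal sketch using a small hypothetical data frame named demo (the names demo, id and score are illustrative and not part of the article's example):

```r
# Hypothetical example data; demo, id and score are illustrative names only.
demo <- data.frame(id = 1:3, score = c(90, 85, 70))

# A data frame is a list whose components are its columns.
is_list <- is.list(demo)
col_lengths <- lengths(demo)   # number of items in each column

# Columns whose lengths cannot be matched up are rejected with an error.
bad <- tryCatch(
  data.frame(id = 1:3, score = c(90, 85)),
  error = function(e) "error"
)

# Duplicate row names are rejected as well.
dup <- tryCatch(
  suppressWarnings({
    rownames(demo) <- c("a", "a", "b")
    "ok"
  }),
  error = function(e) "error"
)
```

Here bad and dup both end up holding the string "error", because data.frame() and the row-name assignment each stop with an error.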

Steps For Creating Data Frames in R

Let's start by creating a data frame, as explained in the steps below.

Step 1: Create a Data Frame of a Class in a School.

Code:

tenthclass = data.frame(roll_number = c(1:5),Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass)

Running this code prints the data frame.

Output:

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

In our example the data frame is very small, but real-world problems usually involve much more data. To understand the structure of the data, we use the str() function.

Step 2: Add the line below to the code.

Code:

str(tenthclass)

Running the whole code now also prints the structure of the data frame.

Output:

'data.frame':    5 obs. of  3 variables:
 $ roll_number: int  1 2 3 4 5
 $ Name       : chr  "John" "Sam" "Casey" "Ronald" ...
 $ Marks      : num  77 87 45 68 95

The output shows 5 observations of 3 variables, followed by the data type of each variable. In our example roll_number is an integer, Name is character and Marks is numeric.
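
The same facts that str() reports can also be retrieved programmatically. A minimal sketch, assuming the tenthclass data frame from Step 1 (it is recreated here so that the block runs on its own):

```r
# Recreate the data frame from Step 1 so the example is self-contained.
tenthclass <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

dims <- dim(tenthclass)                 # number of rows, number of columns
n_rows <- nrow(tenthclass)
n_cols <- ncol(tenthclass)
col_types <- sapply(tenthclass, class)  # class of each column, by name
```

col_types is a named character vector, so a single column's type can be looked up with, for example, col_types["Marks"].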

Once we understand the structure of the data, we can summarize it statistically.

Step 3: Use the summary() function.

Code:

summary(tenthclass)

The summary gives a better understanding of the data. For each numeric column it reports the minimum, first quartile, median, mean, third quartile and maximum, which helps in making better decisions. For a character column such as Name it reports only the length, class and mode.
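
The individual statistics can also be computed one at a time. A small sketch for the Marks column, assuming the tenthclass data frame from Step 1 (recreated here so the block is self-contained):

```r
# Recreate the data frame from Step 1 so the example is self-contained.
tenthclass <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

mean_marks <- mean(tenthclass$Marks)       # arithmetic mean
median_marks <- median(tenthclass$Marks)   # middle value
range_marks <- range(tenthclass$Marks)     # minimum and maximum
quartiles <- quantile(tenthclass$Marks)    # 0%, 25%, 50%, 75%, 100%
```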

Structure

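To see the structure of a particular data frame, use the str() function.

str(Data_frame)

For the three-row Data_frame built in the Inspecting Data Frames section of this article, the output is:

Output:

'data.frame':    3 obs. of  3 variables:
 $ Number  : num  2 3 4
 $ alpha   : Factor w/ 3 levels "x","y","z": 1 2 3
 $ Booleans: logi  TRUE TRUE FALSE

The alpha column appears as a factor because older versions of R converted strings to factors by default. From R 4.0.0 onward the default is stringsAsFactors = FALSE, so alpha is reported as chr unless the data frame is created with stringsAsFactors = TRUE.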

How to Extract Data from Data Frames in R?

We continue with the example above. Suppose we want only the names of the students in class tenth. How do we extract them?

Our data frame looks like this.

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

To get just the names as output, we run the following code.

Code:

onlyname = tenthclass$Name
print(onlyname)

Output:

[1] "John"   "Sam"    "Casey"  "Ronald" "Mathew"

To break the code down: we put a dollar sign between the name of the data frame and the name of the variable we want as output.
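
The dollar sign is one of several equivalent ways to pull out a column. The sketch below compares them, assuming the same tenthclass data frame (recreated here so the block runs on its own):

```r
# Recreate the data frame so the example is self-contained.
tenthclass <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

by_dollar <- tenthclass$Name        # vector, column chosen with $
by_double <- tenthclass[["Name"]]   # vector, column chosen by name as a string
by_matrix <- tenthclass[, "Name"]   # vector, matrix-style indexing
as_frame  <- tenthclass["Name"]     # still a data frame with one column
```

The first three forms return the same character vector; single square brackets without a comma keep the result as a one-column data frame.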

Now suppose the teacher wants to know everything about roll number 2: his name and how much he scored.

Since we need the whole row for roll number 2, we run the code below, which selects row 2 and columns 1 to 3.

Code:

result_rollnumber2 = tenthclass[c(2),c(1:3)]
print(result_rollnumber2)

Output:

  roll_number Name Marks
2           2  Sam    87
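
Selecting by position works only while roll number 2 happens to sit in row 2. A sketch of selecting by condition instead, assuming the same tenthclass data frame (the threshold of 75 marks is an arbitrary illustrative value):

```r
# Recreate the data frame so the example is self-contained.
tenthclass <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

# Keep the rows where roll_number equals 2, whatever their position.
roll2 <- tenthclass[tenthclass$roll_number == 2, ]

# subset() filters rows and picks columns in one call.
# The cut-off of 75 is only an illustrative value.
high <- subset(tenthclass, Marks > 75, select = c(Name, Marks))
```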

Expanding Data Frames

A data frame can grow or shrink in size when columns and rows are added or deleted.

1. Add Row

We have two data frames: one for class tenth section A and one for class tenth section B. These two sections are now being merged into a single class.

Example #1

Class 10 A

Code:

tenthclass_sectionA = data.frame(roll_number = c(1:5),
Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass_sectionA)

Output:

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

Example #2

Class 10 B

Code:

tenthclass_sectionB = data.frame(roll_number = c(6:10),Name = c("Ria","Justin","Bon","Tim","joe"),
Marks = c(68,98,54,68,42), stringsAsFactors = FALSE)
print(tenthclass_sectionB)

Output:

  roll_number   Name Marks
1           6    Ria    68
2           7 Justin    98
3           8    Bon    54
4           9    Tim    68
5          10    joe    42

Example #3

rbind() function

Now we have to merge both sections into a single class, and we use the rbind() function to do it. The only restriction when adding rows is that the new rows must have the same structure (the same columns) as the existing data frame.

Code:

new_tenthclass = rbind(tenthclass_sectionA,tenthclass_sectionB)
print(new_tenthclass)

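Output:

   roll_number   Name Marks
1            1   John    77
2            2    Sam    87
3            3  Casey    45
4            4 Ronald    68
5            5 Mathew    95
6            6    Ria    68
7            7 Justin    98
8            8    Bon    54
9            9    Tim    68
10          10    joe    42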
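
The "same structure" requirement is about column names. The sketch below, which uses small hypothetical data frames (section_a, reordered and mismatched are illustrative names), shows that rbind() lines columns up by name and stops with an error when a name does not match:

```r
# Hypothetical data frames for illustration only.
section_a <- data.frame(
  roll_number = 1:2,
  Name = c("John", "Sam"),
  Marks = c(77, 87),
  stringsAsFactors = FALSE
)

# Same columns in a different order: rbind() matches them by name.
reordered <- data.frame(
  Marks = 68, Name = "Ria", roll_number = 6L,
  stringsAsFactors = FALSE
)
combined <- rbind(section_a, reordered)

# A differently named column (Score instead of Marks) raises an error.
mismatched <- data.frame(
  roll_number = 7L, Name = "Tim", Score = 54,
  stringsAsFactors = FALSE
)
result <- tryCatch(
  rbind(section_a, mismatched),
  error = function(e) "error"
)
```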

2. Add Column

Now suppose we have to record the blood group of every student in class 10. We add a new column for it and name it "Blood_group".

Our data frame looks like this.

Code:

tenthclass = data.frame(roll_number = c(1:5),Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass)

Output:

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

Code:

tenthclass$Blood_group = c("O","AB","B+","A+","AB")
print(tenthclass)

Output:

  roll_number   Name Marks Blood_group
1           1   John    77           O
2           2    Sam    87          AB
3           3  Casey    45          B+
4           4 Ronald    68          A+
5           5 Mathew    95          AB
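
A new column does not have to be typed in by hand; it can be derived from existing columns. A minimal sketch on a separate copy of the data (tenthclass_copy, the Result and Section columns, and the pass mark of 50 are all illustrative assumptions, not part of the article's example):

```r
# Work on a separate copy so the article's tenthclass is left untouched.
tenthclass_copy <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

# Derive a column from Marks; the pass mark of 50 is an assumed value.
tenthclass_copy$Result <- ifelse(tenthclass_copy$Marks >= 50, "Pass", "Fail")

# cbind() also adds columns; a single value is repeated for every row.
with_section <- cbind(tenthclass_copy, Section = "A")
```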

3. Delete Column

Code:

print(tenthclass)

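Output:

  roll_number   Name Marks Blood_group
1           1   John    77           O
2           2    Sam    87          AB
3           3  Casey    45          B+
4           4 Ronald    68          A+
5           5 Mathew    95          AB

To delete the Blood_group variable (the rightmost column) from this data frame, we run the code below.

Code:

tenthclass$Blood_group = NULL
print(tenthclass)

Output:

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

Assigning NULL to a column removes that variable from the data frame directly.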

4. Delete Row

Code:

print(tenthclass)

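Output:

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

Now suppose we no longer need John's marks, so we remove the topmost row.

Code:

tenthclass = tenthclass[-1,]
print(tenthclass)

Output:

  roll_number   Name Marks
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95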
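
Rows can also be dropped by condition rather than by position, and the row names can be renumbered afterwards. A sketch on a separate copy of the data (tenthclass_copy and the result names are illustrative):

```r
# Work on a separate copy so the article's tenthclass is left untouched.
tenthclass_copy <- data.frame(
  roll_number = 1:5,
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

# Drop the row by condition instead of by position.
without_john <- tenthclass_copy[tenthclass_copy$Name != "John", ]

# Drop several rows at once with a negative index vector.
no_first_two <- tenthclass_copy[-c(1, 2), ]

# Deleting rows keeps the old row names; NULL resets them to 1, 2, 3, ...
rownames(without_john) <- NULL
```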

5. Update Data in Data Frame

Code:

print(tenthclass)

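Output:

  roll_number   Name Marks
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

Suppose Sam actually scored 98 marks, but the data frame shows 87. We can correct it with the code below. Because John's row was deleted earlier, Sam is now in the first position, so we select his row by name rather than by a fixed position.

Code:

tenthclass$Marks[tenthclass$Name == "Sam"] = 98
print(tenthclass)

Output:

  roll_number   Name Marks
2           2    Sam    98
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95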

Inspecting Data Frames

The functions below are other ways to inspect a data frame. Like the str() function above, they provide information about a data frame.

1. Names: Provides the names of the variables in the data frame.

Syntax: names(data frame name)

Example

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
names(Data_frame)

Output: [1] "Number"   "alpha"    "Booleans"

2. Summary: Provides the statistics of the data frame.

Syntax: summary(data frame name)

Example

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
summary(Data_frame)    

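Output:

     Number    alpha  Booleans
 Min.   :2.0   x:1   Mode :logical
 1st Qu.:2.5   y:1   FALSE:1
 Median :3.0   z:1   TRUE :2
 Mean   :3.0         NA's :0
 3rd Qu.:3.5
 Max.   :4.0

This output comes from a version of R in which data.frame() converted strings to factors by default. In R 4.0.0 and later, alpha is summarized as a character column (Length, Class and Mode) unless the data frame is created with stringsAsFactors = TRUE.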

3. Head: Provides the data for the first few rows (six by default).

Syntax: head(name of the data frame)

Example

Number <- c(2,3,4,5,6,7,8,9,10,11)
alpha <- c("x","y","z","a","b","c","d","f","g","j")
Booleans <- c(TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
head(Data_frame)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
4      5     a     TRUE
5      6     b    FALSE
6      7     c    FALSE

4. Tail: Prints the last few rows of the data frame (six by default).

Syntax: tail(name of the data frame)

Example

Number <- c(2,3,4,5,6,7,8,9,10,11)
alpha <- c("x","y","z","a","b","c","d","f","g","j")
Booleans <- c(TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
tail(Data_frame)

Output:

   Number alpha Booleans
5       6     b    FALSE
6       7     c    FALSE
7       8     d    FALSE
8       9     f    FALSE
9      10     g    FALSE
10     11     j    FALSE
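
Both head() and tail() accept an n argument that controls how many rows are returned. A short sketch using the same ten-row Data_frame (recreated here so the block runs on its own):

```r
# Recreate the ten-row example so the block is self-contained.
Number <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
alpha <- c("x", "y", "z", "a", "b", "c", "d", "f", "g", "j")
Booleans <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
Data_frame <- data.frame(Number, alpha, Booleans)

first_three <- head(Data_frame, n = 3)        # first 3 rows
last_two <- tail(Data_frame, n = 2)           # last 2 rows, row names kept
all_but_last_two <- head(Data_frame, n = -2)  # negative n drops rows from the end
```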

Extracting Specific Data from the Data Frame

The following examples show specific ways to extract data from a data frame.

1. Using the Column name

We can extract a particular set of data from the data frame by column name.

From the example above, let's extract only the first column of the data frame, which is Number.

output <- Data_frame["Number"]
print(output)

Output:

  Number
1      2
2      3
3      4

2. Using the rows

We can extract rows as shown in the example below.

Suppose we want to print only the first two rows of the data frame.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
print(Data_frame)
output <- Data_frame[1:2,]
print(output)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
----------------------------------
  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE

The second output contains only the first two rows of the first.

3. Printing specific rows and columns

We can also print specific rows and columns.

In the example below, we print the 1st and 2nd rows of the 1st and 2nd columns.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
print(Data_frame)
output <- Data_frame[c(1,2),c(1,2)]
print(output)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
----------------------------------
  Number alpha
1      2     x
2      3     y
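
Rows and columns can also be chosen by column name or by a logical condition instead of by position. A minimal sketch on the same three-row example (stringsAsFactors = FALSE is set explicitly here so the result is the same on old and new versions of R):

```r
# Recreate the three-row example so the block is self-contained.
Data_frame <- data.frame(
  Number = c(2, 3, 4),
  alpha = c("x", "y", "z"),
  Booleans = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Rows by position, columns by name.
by_name <- Data_frame[1:2, c("Number", "alpha")]

# A logical vector keeps the rows where it is TRUE.
true_rows <- Data_frame[Data_frame$Booleans, ]

# One row and one column give a single value.
single_cell <- Data_frame[3, "alpha"]
```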

4. Adding another column to the data frame

We can add another column along with values to the data frame.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
Data_frame$class <- c("A","B","C")
out <- Data_frame
print(out)

Output:

  Number alpha Booleans class
1      2     x     TRUE     A
2      3     y     TRUE     B
3      4     z    FALSE     C

5. Adding a row to the data frame

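We use the rbind() function to add a new row to the existing data frame. The new row is passed as a one-row data frame so that each value keeps its own type; a vector built with c() would coerce every value to character.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
Data_frame$class <- c("A","B","C")
new_row <- data.frame(Number = 5, alpha = "x", Booleans = FALSE, class = "D")
out <- rbind(Data_frame, new_row)
print(out)

Output:

  Number alpha Booleans class
1      2     x     TRUE     A
2      3     y     TRUE     B
3      4     z    FALSE     C
4      5     x    FALSE     D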
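
One thing to watch when adding a row is the type of each column afterwards: c() turns mixed values into a single character vector before rbind() ever sees them. The sketch below compares the two ways of passing the new row (coerced and typed are illustrative names):

```r
Data_frame <- data.frame(
  Number = c(2, 3, 4),
  alpha = c("x", "y", "z"),
  Booleans = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# c() coerces 5 and FALSE to the strings "5" and "FALSE",
# so every column of the result becomes character.
coerced <- rbind(Data_frame, c(5, "x", FALSE))

# A one-row data frame keeps a separate type for each column.
typed <- rbind(
  Data_frame,
  data.frame(Number = 5, alpha = "x", Booleans = FALSE,
             stringsAsFactors = FALSE)
)

type_coerced <- class(coerced$Number)
type_typed <- class(typed$Number)
```

The two results print identically, which is why the difference is easy to miss; it only shows up when the column is used in arithmetic or inspected with str().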

6. Combining both data frames

We can also combine two data frames into a single data frame.

To combine two data frames with rbind(), both must have the same columns.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame1 <- data.frame(Number,alpha,Booleans)
print(Data_frame1)
Number <- c(4,5,6)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame2 <- data.frame(Number,alpha,Booleans)
print(Data_frame2)
out <- rbind(Data_frame1,Data_frame2)
print(out)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
-----------------------------------------
  Number alpha Booleans
1      4     x     TRUE
2      5     y     TRUE
3      6     z    FALSE
-----------------------------------------
  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
4      4     x     TRUE
5      5     y     TRUE
6      6     z    FALSE

Conclusion

Data frames are one of the most common ways to represent data in R. A data frame is a list of variables with the same number of rows and unique row names. This article showed how to create and inspect a data frame, how to extract data from it, how to add or delete rows and columns, and how to update its data.

