
Data Frames in R

Article by Priya Pedamkar

Updated March 22, 2023

Introduction to Data Frames in R

A data frame in R is a two-dimensional data structure that stores data in tabular form. It is a special kind of list in which every component has the same length. R provides the built-in function data.frame() to create a data frame and assign its elements, and the elements are retrieved or modified through the data frame's name. Each component name becomes a column name, and the component values fill the rows. Data frames are widely used when building machine learning models in data science projects.
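Because a data frame is a list whose components share one length, the ordinary list functions work on it. The following is a minimal sketch of that idea; the data frame df and its column names are made up for illustration and do not come from the article.

```r
# Hypothetical two-column data frame, used only for this illustration
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

is.list(df)   # a data frame is a list of columns
length(df)    # number of components, i.e. columns
lengths(df)   # each component has the same length (the row count)
dim(df)       # the two-dimensional view: rows, then columns
```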

A data frame has the following characteristics:

  • Column names should be non-empty
  • Row names should be unique
  • The number of items in each column should be the same
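The last two characteristics can be checked directly, since R raises an error when they are violated. This is a small sketch with made-up column and row names; note that data.frame() recycles a shorter column when the longest column is a whole multiple of its length, so the example uses lengths 3 and 2, which cannot be recycled.

```r
# Columns of unequal, non-recyclable length are rejected
bad_columns <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) conditionMessage(e)
)
print(bad_columns)

# Duplicate row names are rejected as well
df <- data.frame(a = 1:3)
bad_rownames <- tryCatch(
  {
    rownames(df) <- c("r1", "r1", "r2")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(bad_rownames)
```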

Steps For Creating Data Frames in R

Let's start by creating a data frame, step by step.

Step 1: Create a Data Frame of a Class in a School.

Code:

tenthclass = data.frame(roll_number = c(1:5),Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass)

When we run this code we will get a data frame like this.

Output:

[Figure: console output showing the tenthclass data frame]

In our example the data frame is very small, but real-world problems usually involve a lot of data. To understand the structure of the data, we pass the data frame to the function str(). R is case-sensitive, so the function name must be written in lowercase.

Step 2: We add the below line in our code.

Code:

str(tenthclass)

When we run the whole code we will get output.

Output:

[Figure: console output of str(tenthclass)]

The above output means we have 5 observations of 3 variables. It then lists the data type of each variable: in our example roll_number is an integer, Name is character, and Marks is numeric.
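The same facts that str() reports can also be retrieved programmatically. The sketch below rebuilds the tenthclass data frame from Step 1 and uses dim() and sapply() with class(); this is one possible way to query the structure, not the only one.

```r
# Rebuild the example data frame from Step 1
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

dim(tenthclass)             # number of observations and variables
sapply(tenthclass, class)   # data type of each variable
```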

Once we understand the structure of the data, we can run the code below to understand the data statistically.

Step 3: Now, we will use the summary() function.

Code:

summary(tenthclass)

Output:

[Figure: console output of summary(tenthclass)]

The summary gives a better understanding of our data. For each numeric column it reports the mean, median, quartiles, maximum, and minimum. These statistics help us make better decisions.
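The individual statistics that summary() reports for a numeric column can also be computed one at a time. A short sketch using the Marks values from the example:

```r
# Marks column from the tenthclass example
marks <- c(77, 87, 45, 68, 95)

mean(marks)      # arithmetic mean: 74.4
median(marks)    # middle value: 77
quantile(marks)  # minimum, quartiles, and maximum: 45 68 77 87 95
range(marks)     # minimum and maximum: 45 95
```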

Structure

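When we want to know the structure of a particular data frame, we can use the str() function. Here Data_frame is the three-column example data frame built in the Inspecting Data Frames section.

str()

str(Data_frame)

Output:

'data.frame':	3 obs. of  3 variables:
 $ Number  : num  2 3 4
 $ alpha   : Factor w/ 3 levels "x","y","z": 1 2 3
 $ Booleans: logi  TRUE TRUE FALSE

The alpha column is shown as a Factor because versions of R before 4.0.0 convert strings to factors by default. In R 4.0.0 and later it appears as chr unless stringsAsFactors = TRUE is passed to data.frame().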

How to Extract Data from Data Frames in R?

Let's continue with the case above. Suppose we want only the names of the students in class ten. How do we extract them?

Our data frame looks like this.

  roll_number   Name Marks
1           1   John    77
2           2    Sam    87
3           3  Casey    45
4           4 Ronald    68
5           5 Mathew    95

To just get the name as an output we will pass on the following code.

Code:

onlyname = tenthclass$Name
print(onlyname)

Output:

[Figure: console output listing the five student names]

Breaking the code down: we place a dollar sign between the name of the data frame and the name of the variable we want as output.
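The dollar sign is not the only way to pull out a column. The sketch below, which rebuilds the tenthclass example, compares it with double-bracket and single-bracket indexing: the first two return a plain vector, while single brackets return a one-column data frame.

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

by_dollar <- tenthclass$Name        # character vector
by_double <- tenthclass[["Name"]]   # the same character vector
as_frame  <- tenthclass["Name"]     # data frame with a single column

print(by_dollar)
print(as_frame)
```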

Now suppose the teacher wants to know everything about roll number 2, such as the student's name and marks.

Since we need everything about roll number 2, we select the second row and all three columns with the code below.

Code:

result_rollnumber2 = tenthclass[c(2),c(1:3)]
print(result_rollnumber2)

Output:

[Figure: console output showing the row for roll number 2]
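Selecting row 2 by position works here because roll numbers happen to match row positions. As an alternative sketch, the row can be selected by a condition on roll_number, and the columns by name, which does not depend on the order of the rows.

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

# All columns for the student whose roll number is 2
by_condition <- tenthclass[tenthclass$roll_number == 2, ]
print(by_condition)

# Only the name and marks for that student
name_and_marks <- tenthclass[tenthclass$roll_number == 2, c("Name", "Marks")]
print(name_and_marks)
```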

Expanding Data Frames

A data frame can grow or shrink in size by adding or deleting columns and rows.

1. Add Row

We have two data frames. One belongs to class tenth section A and the other to class tenth section B. These two sections are now being merged into a single class.

Example #1

Class 10 A

Code:

tenthclass_sectionA = data.frame(roll_number = c(1:5),
Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass_sectionA)

Output:

[Figure: console output showing the section A data frame]

Example #2

Class 10 B

Code:

tenthclass_sectionB = data.frame(roll_number = c(6:10),Name = c("Ria","Justin","Bon","Tim","joe"),
Marks = c(68,98,54,68,42), stringsAsFactors = FALSE)
print(tenthclass_sectionB)

Output:

[Figure: console output showing the section B data frame]

Example #3

rbind() function

Now we have to merge both classes into a single class. We will use the rbind() function here. The only limitation when adding new rows is that they must have the same structure, that is, the same column names, as the existing data frame.

Code:

new_tenthclass = rbind(tenthclass_sectionA,tenthclass_sectionB)
print(new_tenthclass)

Output:

[Figure: console output showing the combined data frame with all ten students]
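To illustrate the structural requirement of rbind(), the sketch below tries to combine two small data frames whose column names differ. The data frames section_a and section_c and the misspelled column roll_no are hypothetical; once the column is renamed, the same call succeeds.

```r
section_a <- data.frame(roll_number = 1:2, Name = c("John", "Sam"),
                        stringsAsFactors = FALSE)
# Hypothetical section whose first column is named differently
section_c <- data.frame(roll_no = 3:4, Name = c("Casey", "Ronald"),
                        stringsAsFactors = FALSE)

# rbind() refuses to combine frames with mismatched column names
attempt <- tryCatch(
  rbind(section_a, section_c),
  error = function(e) conditionMessage(e)
)
print(attempt)

# After renaming the column, the rows can be appended
names(section_c)[names(section_c) == "roll_no"] <- "roll_number"
combined <- rbind(section_a, section_c)
print(combined)
```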

2. Add Column

Now consider a case wherein we have to add blood group details of each and every student in class 10. We will add a new column for it and name it as “Blood_group”.

Our data frame looks like this.

Code:

tenthclass = data.frame(roll_number = c(1:5),Name = c("John","Sam","Casey","Ronald","Mathew"),
Marks = c(77,87,45,68,95), stringsAsFactors = FALSE)
print(tenthclass)

Output:

[Figure: console output showing tenthclass before the new column is added]

Code:

tenthclass$Blood_group = c("O","AB","B+","A+","AB")
print(tenthclass)

Output:

[Figure: console output showing tenthclass with the Blood_group column]
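A new column does not have to be typed in by hand; it can also be computed from existing columns. The sketch below adds a logical column named Passed, assuming a pass mark of 50. Both the column name and the threshold are made up for illustration.

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

pass_mark <- 50   # hypothetical threshold, not from the article
tenthclass$Passed <- tenthclass$Marks >= pass_mark
print(tenthclass)
```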

3. Delete Column

Code:

print(tenthclass)

Output:

[Figure: console output showing tenthclass with the Blood_group column]

In this data frame if we have to delete the blood group variable (Rightmost column) we will pass the below code.

Code:

tenthclass$Blood_group = NULL
print(tenthclass)

Output:

[Figure: console output showing tenthclass without the Blood_group column]

By assigning NULL to a column, we can remove that variable directly from our data frame.
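Assigning NULL changes the data frame in place. If the original should be kept, one alternative is to build a copy that leaves the column out. A minimal sketch, in which the name without_blood_group is made up for illustration:

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  Blood_group = c("O", "AB", "B+", "A+", "AB"),
  stringsAsFactors = FALSE
)

# Keep every column except Blood_group; tenthclass itself is unchanged
keep <- setdiff(names(tenthclass), "Blood_group")
without_blood_group <- tenthclass[keep]
print(without_blood_group)
```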

4. Delete Row

Code:

print(tenthclass)

Output:

[Figure: console output showing tenthclass before the row is deleted]

Now consider a situation where we don’t need marks of John, so we have to remove the topmost row.

Code:

tenthclass = tenthclass[-1,]
print(tenthclass)

Output:

[Figure: console output showing tenthclass without John's row]
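After a row is removed with a negative index, the remaining rows keep their original row names, so the first row is still labelled 2. The sketch below shows this, deletes the same row by condition instead of by position, and resets the row names.

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)

trimmed <- tenthclass[-1, ]   # drop the first row by position
rownames(trimmed)             # "2" "3" "4" "5"

# Dropping the row by condition gives the same rows
by_name <- tenthclass[tenthclass$Name != "John", ]

# Reset row names so they run from 1 again
rownames(trimmed) <- NULL
print(trimmed)
```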

5. Update Data in Data Frame

Code:

print(tenthclass)

Output:

[Figure: console output showing tenthclass before the update]

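Suppose Sam actually scored 98 marks, but our data frame records 87. We can run the code below to correct it. Selecting the row by name rather than by position keeps the update correct even after rows have been deleted, as in the previous step.

Code:

tenthclass$Marks[tenthclass$Name == "Sam"] = 98
print(tenthclass)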

Output:

[Figure: console output showing tenthclass with Sam's marks updated to 98]
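A positional index such as tenthclass$Marks[2] refers to whichever row is currently second, and that changes once rows are deleted. The sketch below rebuilds the example, assumes John's row was removed first as in the Delete Row step, and shows a name-based update as one way to avoid touching the wrong student.

```r
tenthclass <- data.frame(
  roll_number = c(1:5),
  Name = c("John", "Sam", "Casey", "Ronald", "Mathew"),
  Marks = c(77, 87, 45, 68, 95),
  stringsAsFactors = FALSE
)
tenthclass <- tenthclass[-1, ]   # John's row removed, as in the previous step

tenthclass$Name[2]               # position 2 is now Casey, not Sam

# Update Sam's marks by matching on the name instead of the position
tenthclass$Marks[tenthclass$Name == "Sam"] <- 98
print(tenthclass)
```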

Inspecting Data Frames

Below are different ways to inspect a data frame. Like the str() function above, each provides information about the data frame.

1. Names: Provides the names of the variables in the dataframe

Syntax : names(data frame name)

Example

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
names(Data_frame)

Output: [1] "Number"   "alpha"    "Booleans"
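names() can also appear on the left-hand side of an assignment, which renames columns. In this brief sketch the new name Letter is made up for illustration.

```r
Number <- c(2, 3, 4)
alpha <- c("x", "y", "z")
Booleans <- c(TRUE, TRUE, FALSE)
Data_frame <- data.frame(Number, alpha, Booleans)

# Rename the column called "alpha" to the hypothetical name "Letter"
names(Data_frame)[names(Data_frame) == "alpha"] <- "Letter"
names(Data_frame)
```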

2. Summary: Provides the statistics of the data frame.

Syntax: summary(data frame name)

Example

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
summary(Data_frame)    

Output:

     Number     alpha  Booleans
 Min.   :2.0    x:1    Mode :logical
 1st Qu.:2.5    y:1    FALSE:1
 Median :3.0    z:1    TRUE :2
 Mean   :3.0           NA's :0
 3rd Qu.:3.5
 Max.   :4.0

3. Head:  Provides the data for the first few rows.

Syntax: head(name of the data frame)

Example

Number <- c(2,3,4,5,6,7,8,9,10,11)
alpha <- c("x","y","z","a","b","c","d","f","g","j")
Booleans <- c(TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
head(Data_frame)

Output: 

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
4      5     a     TRUE
5      6     b    FALSE
6      7     c    FALSE

4. Tail: Prints the last few rows in the data frame.

Syntax: tail( name of the data frame)

Number <- c(2,3,4,5,6,7,8,9,10,11)
alpha <- c("x","y","z","a","b","c","d","f","g","j")
Booleans <- c(TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
tail(Data_frame)

Output:

   Number alpha Booleans
5       6     b    FALSE
6       7     c    FALSE
7       8     d    FALSE
8       9     f    FALSE
9      10     g    FALSE
10     11     j    FALSE
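Both head() and tail() show six rows by default. The number of rows can be changed with the n argument, as in this short sketch built on the same ten-row example:

```r
Number <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
alpha <- c("x", "y", "z", "a", "b", "c", "d", "f", "g", "j")
Booleans <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
Data_frame <- data.frame(Number, alpha, Booleans)

first_three <- head(Data_frame, n = 3)   # first three rows
last_two <- tail(Data_frame, n = 2)      # last two rows
print(first_three)
print(last_two)
```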

Extracting Specific Data from the Data Frame

Below are some specific ways to extract data from a data frame:

1. Using the Column name

We can extract a particular set of data from the data frame.

From our example above, let’s extract only the first column from the data frame which is Number.

Data_frame <- data.frame(Number = c(2,3,4), alpha = c("x","y","z"), Booleans = c(TRUE,TRUE,FALSE))
print(Data_frame["Number"])

Output:

  Number
1      2
2      3
3      4

2. Using the rows

We can extract the data from the rows just like the below example.

Suppose we want to print only the first two rows of the data frame.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
print(Data_frame)
output <- Data_frame[1:2,]
print(output)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
----------------------------------
  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE

We can observe the difference in the first and second outputs.

3. Printing specific rows and columns

We can also print specific rows and columns.

In the example below, we print the 1st and 2nd rows of the 1st and 2nd columns.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
print(Data_frame)
output <- Data_frame[c(1,2),c(1,2)]
print(output)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
-------------------------------------
  Number alpha
1      2     x
2      3     y
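One detail worth knowing when selecting rows and columns together: if only a single column is selected, R simplifies the result to a plain vector unless drop = FALSE is given. The sketch below shows the difference and also selects columns by name rather than by position.

```r
Number <- c(2, 3, 4)
alpha <- c("x", "y", "z")
Booleans <- c(TRUE, TRUE, FALSE)
Data_frame <- data.frame(Number, alpha, Booleans)

as_vector <- Data_frame[1:2, 1]                  # simplified to a numeric vector
as_frame <- Data_frame[1:2, 1, drop = FALSE]     # stays a one-column data frame
by_name <- Data_frame[c(1, 2), c("Number", "alpha")]

print(as_vector)
print(as_frame)
print(by_name)
```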

4. Adding another column to the data frame

We can add another column along with values to the data frame.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
Data_frame$class <- c("A","B","C")
out <- Data_frame
print(out)

Output:

  Number alpha Booleans class
1      2     x     TRUE     A
2      3     y     TRUE     B
3      4     z    FALSE     C

5. Adding a row to the data frame

We use the rbind function to add a new row to the existing data frame.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame <- data.frame(Number,alpha,Booleans)
Data_frame$class <- c("A","B","C")
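# Pass the new row as a one-row data frame so each column keeps its type;
# c(5,"x",FALSE,"D") would coerce every value to character first
out <- rbind(Data_frame, data.frame(Number = 5, alpha = "x", Booleans = FALSE, class = "D"))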
print(out)

Output:

  Number alpha Booleans class
1      2     x     TRUE     A
2      3     y     TRUE     B
3      4     z    FALSE     C
4      5     x    FALSE     D
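When a new row is supplied as a plain vector built with c(), mixed values are first coerced to a single type (character here), and that affects the column types of the result. The sketch below compares this with passing a one-row data frame. It sets stringsAsFactors = FALSE explicitly so that it behaves the same across R versions, which is an assumption added for this illustration.

```r
Data_frame <- data.frame(
  Number = c(2, 3, 4),
  alpha = c("x", "y", "z"),
  Booleans = c(TRUE, TRUE, FALSE),
  class = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# New row as a vector: c() turns 5 and FALSE into the strings "5" and "FALSE"
coerced <- rbind(Data_frame, c(5, "x", FALSE, "D"))
sapply(coerced, class)

# New row as a one-row data frame: column types are preserved
typed <- rbind(
  Data_frame,
  data.frame(Number = 5, alpha = "x", Booleans = FALSE, class = "D",
             stringsAsFactors = FALSE)
)
sapply(typed, class)
```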

6. Combining both data frames

We can also combine two data frames to produce a single output.

To combine two data frames with rbind(), both must have the same column names.

Number <- c(2,3,4)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame1 <- data.frame(Number,alpha,Booleans)
print(Data_frame1)
Number <- c(4,5,6)
alpha <- c("x","y","z")
Booleans <- c(TRUE,TRUE,FALSE)
Data_frame2 <- data.frame(Number,alpha,Booleans)
print(Data_frame2)
out <- rbind(Data_frame1,Data_frame2)
print(out)

Output:

  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
-----------------------------------------
  Number alpha Booleans
1      4     x     TRUE
2      5     y     TRUE
3      6     z    FALSE
-----------------------------------------
  Number alpha Booleans
1      2     x     TRUE
2      3     y     TRUE
3      4     z    FALSE
4      4     x     TRUE
5      5     y     TRUE
6      6     z    FALSE
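When combining data frames, rbind() matches columns by name, not by position. In the sketch below the second data frame has its columns in a different order, yet the values still end up in the right columns. The reordered frame is made up for illustration.

```r
Data_frame1 <- data.frame(Number = c(2, 3, 4), alpha = c("x", "y", "z"),
                          Booleans = c(TRUE, TRUE, FALSE),
                          stringsAsFactors = FALSE)
# Same columns as Data_frame1, but listed in a different order
Data_frame2 <- data.frame(Booleans = c(TRUE, TRUE, FALSE),
                          Number = c(4, 5, 6), alpha = c("x", "y", "z"),
                          stringsAsFactors = FALSE)

out <- rbind(Data_frame1, Data_frame2)
print(out)   # columns follow the order of the first data frame
```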

Conclusion

Data frames are one of the most common ways to hold tabular data in R. A data frame is a list of variables that all have the same number of rows, with unique row names. This article showed how to create a data frame, inspect it, extract data from it, add or delete rows and columns, and update the data it contains.
