Introduction to R Packages
An R package is a collection of predefined functions, bundled as a library, that an R program can reuse instead of reimplementing the same code. Many packages are developed externally and can be imported into the R environment to make their functions available. Packages are distributed mainly through CRAN, the archive network maintained by the R community. Apart from the standard packages that ship with R, many external packages are available for use in R programs. One of the most popular graphics packages in R is ggplot2.
Where do we Find Packages?
Packages are available on the internet through different sources. However, there are certain trusted repositories from where we can download the packages.
Here are the two important repositories that are available online.
- CRAN (Comprehensive R Archive Network): This is the official R repository, a network of FTP and web servers that hosts the latest code and documentation for R. Before a package is published on CRAN, it goes through a series of tests to ensure that it adheres to CRAN policy.
- GitHub: GitHub is another popular hosting platform, though it is not specific to R. The online community can share packages with other people there, and it is used for version control as well. GitHub hosts many open-source projects and does not have any review process.
List of Useful R Packages
Many R packages can be downloaded from CRAN or GitHub. Below are packages that are useful for specific purposes.
1. Loading the Data from External Sources
- haven: Reads and writes data files from SAS, SPSS, and Stata.
- DBI: Establishes communication between relational databases and R.
- RSQLite: Reads data from SQLite databases.
2. Data Manipulation
- dplyr: Used for data manipulation such as subsetting; it provides shortcuts to access data and can generate SQL queries when working with databases.
- tidyr: Used to convert data into tidy formats.
- stringr: Manipulates regular expressions and character strings.
- lubridate: Works with dates and times.
3. Data Visualization
- rgl: To work on 3D visualizations.
- ggvis: To build interactive graphics based on the grammar of graphics.
- googleVis: To use Google visualization tools in R.
4. Web-Based Packages
- XML: To read and write XML documents in R.
- httr: To work with HTTP connections.
- jsonlite: To read and write JSON data.
Obtaining R Packages
We can check which packages are available from CRAN by using the function below.
- available.packages(): Returns the packages available from the configured repositories. There were approximately 5200 packages on CRAN when this was written, and the number has grown considerably since.
CRAN also has task views that group packages under a particular topic.
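As a minimal sketch of the check described above, the following counts the rows of the matrix that available.packages() returns, one row per package. The mirror URL is an assumption (any CRAN mirror should do), and the call needs network access; when the index cannot be reached the count is simply 0.

```r
# Assumed mirror; substitute any CRAN mirror you prefer.
cran_mirror <- "https://cloud.r-project.org"

# available.packages() returns a character matrix with one row per package.
# Warnings and errors are swallowed so the sketch degrades quietly offline.
pkgs <- tryCatch(
  suppressWarnings(available.packages(repos = cran_mirror)),
  error = function(e) matrix(character(), nrow = 0, ncol = 0)
)

# Number of packages listed on the mirror (0 if the index was unreachable).
n_available <- nrow(pkgs)
print(n_available)

# Packages already installed locally, for comparison.
n_installed <- nrow(installed.packages())
print(n_installed)
```

The exact numbers depend on the mirror, the R version, and the date, so no fixed output is shown here.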
Installing R Packages
We can install packages either through an IDE or through commands. To install a package, we pass its name to the install.packages() function. For example, install.packages("ggplot2") installs the ggplot2 package and any packages it depends on.
We can install several packages at a time by passing the package names as a character vector.
install.packages(c("package 1", "package 2", "package 3"))
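One possible refinement, sketched below, is to install only the packages that are not already present. The helper name find_missing and the example package names are illustrative assumptions; the install call itself is left commented out because it needs network access.

```r
# Return the subset of `pkgs` that is not yet installed locally.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Example names; "stats" ships with every R installation.
wanted <- c("stats", "ggplot2", "dplyr")
to_install <- find_missing(wanted)
print(to_install)

# Uncomment to install whatever is missing (requires network access):
# if (length(to_install) > 0) install.packages(to_install)
```

Checking first avoids re-downloading packages that are already in the library.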
Installing using RStudio
The advantage of using RStudio is that it provides a GUI (graphical user interface). We can choose the packages to install and their source.
We can go to Tools -> Install Packages.
Loading R Packages
After installing an R package, we need to load it into R before we can use it.
We use the library() function to load a package, for example library(ggplot2).
Some packages display messages when loaded; others do not. We can see which packages are currently attached by calling search(), which produces output like the following.
"package:lattice" "package:ggplot2" "package:makeslides"
"package:knitr" "package:slidify" "tools:rstudio"
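A small sketch of loading and inspecting packages follows. It uses the tools package because it ships with base R and so needs no installation; the check for ggplot2 is only an illustration and returns FALSE if that package is not installed.

```r
# `tools` ships with base R but is usually not attached by default.
library(tools)

# search() lists attached packages and environments in lookup order.
attached <- search()
print(attached)

# Check for an optional package without raising an error if it is absent.
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)
print(has_ggplot2)
```

After the library() call, "package:tools" appears among the entries returned by search().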
Creating Your own Package
Before we create our own package, we should keep the checklist below in mind.
- Organizing the code is one of the most important things while writing code in the package. We lose half the time searching for the code location instead of improving the code. Put all the files in a folder that is easily accessible.
- Documenting the code helps you understand the purpose of the code. When we don’t revisit the code often, we forget why we have written the code in a certain way. It can also help people to understand your code better when shared with them.
- Sharing the scripts through email has become archaic. The easy way is to upload your code and distribute it on GitHub. It is possible you get feedback that can help you enhance the code.
To create your own package, first install the devtools package with install.packages("devtools").
To help with documentation, we can also install a documentation package such as roxygen2.
After installing devtools, you can create your own package with its create() function, for example devtools::create("packagename").
In place of "packagename", you can give any name you wish. You can now add your functions to this package.
A common convention is to save each function in a file that has the same name as the function, inside the package's R directory.
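As an illustration of that convention, the sketch below shows what a file such as R/add_two.R might contain. The function add_two is hypothetical; the lines starting with #' are roxygen2-style comments, which that package can turn into help pages.

```r
#' Add two to a number
#'
#' @param x A numeric vector.
#' @return `x` plus 2.
#' @export
add_two <- function(x) {
  stopifnot(is.numeric(x))
  x + 2
}

# Quick check when sourcing the file directly.
add_two(3)
```

Keeping one function per file makes it easy to find the code later, which ties back to the checklist above.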
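You can distribute your package through GitHub with the help of the devtools package.
Once the package has been pushed to a GitHub repository, other people can install it with the install_github() function from devtools.
The argument combines your GitHub username and the package name you created above, in the form "username/packagename".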
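A rough sketch of that step is shown below. The username and package name are placeholders, and the helper repo_spec is hypothetical; the actual install call is commented out because it needs devtools, network access, and a repository that really exists.

```r
# Build the "username/packagename" string that install_github() expects.
repo_spec <- function(user, pkg) {
  paste(user, pkg, sep = "/")
}

# Placeholder values; replace with your own GitHub username and package name.
spec <- repo_spec("yourusername", "packagename")
print(spec)

# Uncomment once the repository exists and devtools is installed:
# devtools::install_github(spec)
```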
Here are the Required Files for a Package
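As a rough illustration of a conventional package layout, the sketch below uses base R's package.skeleton() (from the utils package, not devtools) to generate a skeleton in a temporary directory and then lists what was created. The package name demoPkg and the function add_two are hypothetical.

```r
# Put one example function into a fresh environment.
pkg_env <- new.env()
pkg_env$add_two <- function(x) x + 2

# Generate the skeleton inside a temporary directory.
out_dir <- tempfile("pkgdemo")
dir.create(out_dir)
utils::package.skeleton(name = "demoPkg", environment = pkg_env, path = out_dir)

# List the top-level entries of the generated package.
pkg_files <- list.files(file.path(out_dir, "demoPkg"))
print(pkg_files)
```

The listing typically includes a DESCRIPTION file, a NAMESPACE file, an R directory for code, and a man directory for documentation.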
Once we have all of these files, the package is ready to be posted to the repository.