## Introduction to Functions in R

The function is defined as a set of statements, to perform and accomplish any specific logical task. Function takes some input parameters which are known as arguments to perform that task. Functions help in breaking the code, into simpler chunks by orchestrating it logically, which is easier to read and understand. In this topic, we are going to learn about Functions in R.

### How to write Functions in R?

To write the function in R, here is the syntax:

`Fun_name <- function (argument) {`

Function body

}

Here, one can see “function” specific reserved word is used in R, to define any function. The function takes input which is in the form of arguments. The function body is a set of logical statements that are performed over arguments and then it returns the output. “Fun_name” is the name given to the function, through which it can be called anywhere in the R program.

Let’s see an example, which will be more lucid in understanding the concept of function in R.

**R code**

`Multi <- function(x, y) {`

# function to print x multiply y

result <- x*y

print(paste(x,"Multiply", y, "is", result))

}

**output:**

Here we created the function name “Multi”, which takes two arguments as inputs and provides the multiplied output. The first argument is x and the second argument is y. As you can see, we have called the function by the name “Multi”. Here if someone wants, arguments can also be set to the default value.

### Different Types of Functions in R

Different R functions with Syntax and examples (Built-in, Math, statistical, etc.)

#### 1) Built-in Function –

These are the functions that come with R to address a specific task by taking an argument as input and giving an output based on the given input. Let’s discuss some important general functions of R here:

**a) Sort: **Data can be of the sort to ascending or descending order. Data can be whether a vector of continues variable or factor variable.

**Syntax:**

Here is the explanation of its parameters:

- x: This is a vector of the continuous variable or factor variable
- decreasing: This can be set either True/False to control order by ascending or descending. By default, it’s FALSE`.
- last: If the vector has NA values, should it be put last or not

**R code and output:**

Here one can notice, how “NA” values get aligned at the end. As our parameter na.last = True was true.

**b) Seq: **It generates a sequence of the number between two specified numbers.

**Syntax**

Here is the explanation of its parameters:

- from, to start and end value of the sequence.
- by: Increment/gap between two consecutive numbers in sequence
- length.out: the required length of the sequence.
- Along.with: Refers to the length from the length of this argument

**R code and output:**

Here one can notice the sequence generated is having incrementation of 2 because by is defined as 2.

**c) Toupper, tolower: **The two functions: toupper and tolower are functions applied on the string to change the cases of the letters in sentences.

**R code and output:**

One can notice, how the cases of letters get changed when applied to the function.

**d) Rnorm: **This is a built-in function that generates random numbers.

**R code and output:**

The function rnorm takes the first argument which says how many numbers need to be generated.

**e) Rep: **This function replicates the value as many times as specified.

**R syntax:** rnorm(x,n)

Here x represents value to replicate, and n represents the number of times it has to be replicated.

**R code and output:**

**f) Paste: **This function is to concatenate strings together with some specific character in between.

**syntax**

`paste(x,sep = “”, collapse = NULL)`

**R code**

`paste("fish", "water", sep=" - ")`

**R output:**

As you can see, we can paste more than two strings as well. Sep is that specific character that we added in between strings. By default, sep is space.

One more similar function exists like this, which everyone should be aware of is paste0.

The function paste0(x,y,collapse) works similar to paste(x,y,sep = “”,collapse)

Please see the example below:

In simple words, to summarize paste and paste0:

Paste0 is faster than paste when it comes to the concatenation of strings without any separator. As paste always looks for “sep” and which is space by default in it.

**g) Strsplit: **This function is to split the string. Let’s see the simple cases:

**h) Rbind: **The function rbind helps in combing vectors with the same number of columns, one over other.

**Example**

**i) cbind: **This combines vectors with the same number of rows, side by side.

**Example**

In case the number of rows doesn’t match, below is the error you will find:

Both cbind and rbind helps in data manipulation and reshaping.

#### 2) Math Function –

R provides a wide variety of Math functions. Let’s see a few of them in detail:

**a) Sqrt: **This function computes the square root of a number or numeric vector.

**R code and output:**

One can see how to square root of a number, complex number and a sequence of numeric vector has been calculated.

**b) Exp: **This function calculates the exponential value of a number or a numeric vector.

**R code and output:**

**c) Cos, Sin, Tan: **These are trigonometry functions implemented in R here.

**R code and output:**

**d) Abs: **This function returns the absolute positive value of a number.

As you can see, the negative or positive of a number will be returned in its absolute form. Let’s see it for a complex number:

**e) Log: **This is to find the logarithm of a number.

Here is the example is shown below:

Here one gets the flexibility to change the base, as per requirement.

**f) Cumsum: **This is a mathematical function that gives cumulative sums. Here is the example below:

**g) Cumprod: **Like Cumsum mathematical function, we have cumprod where cumulative multiplication happens.

Please see the example below:

**h) Max, Min: **This will help you find the maximum/minimum value in the set of numbers. See below the examples related to this:

**i) Ceiling: **The ceiling is a mathematical function returns the smallest of the integer higher than specified.

Let looks at an example:

ceiling(2.67)

As you can notice, the ceiling is applied over a number as well as over a list, and the output came is the smallest of the next higher integer.

**j) Floor: **The floor is a mathematical function returns the least value integer of the number specified.

The example shown below will help you understand it better:

It works the same way for negative values as well. Please take a look:

#### 3) Statistical Functions –

These are the functions that describe the related probability distribution.

**a) Median: **This calculated the median from the sequence of numbers.

**Syntax**

**R code and output:**

**b) Dnorm: **This refers to the normal distribution. The function dnorm returns the value of the probability density function, for the normal distribution given parameters for x, μ, and σ.

**R code and output:**

**c) Cov: **Covariance tells if two vectors are positively, negatively or totally non-related.

**R code**

`x_new = c(1., 5.5, 7.8, 4.2, -2.7, -5.5, 8.9)`

y_new = c(0.1, 2.0, 0.8, -4.2, 2.7, -9.4, -1.9)

cov(x_new,y_new)

**R output:**

As you can see two vectors are positively related, which means both vectors move in the same direction. If the covariance is negative, it means x and y are inversely related and hence moves in the opposite direction.

**d) Cor: **This is a function to find the correlation between vectors. It actually gives the association factor between the two vectors which is known as the “correlation coefficient”. Correlation adds a degree factor over covariance. If two vectors are positively correlated, the correlation will also tell you with how much extend they are positively related.

These three types of methods that can be used to find a correlation between two vectors:

- Pearson correlation
- Kendall correlation
- Spearman correlation

In simple R format, it looks like:

`cor(x, y, method = c("pearson", "kendall", "spearman"))`

Here x and y are vectors.

Let’s see the practical example of correlation over an inbuilt dataset.

So, here you can see “cor()” function gave the correlation coefficient 0.41 between “qsec” and “mpg”. However, one more function has also been showcased i.e. “cor.test()” which not only tells the correlation coefficient but also p-value and t value related to it. Interpretation becomes far easier with cor.test function.

Similar can be done with the other two methods of correlation:

**R code for the Pearson method:**

`my_data <- mtcars`

cor(my_data$qsec, my_data$mpg, method = " pearson ")

cor.test(my_data$qsec, my_data$mpg, method = " pearson")

**R code for Kendall method:**

`my_data <- mtcars`

cor(my_data$qsec, my_data$mpg, method = " kendall")

cor.test(my_data$qsec, my_data$mpg, method = " kendall")

**R code for Spearman method:**

`my_data <- mtcars`

cor(my_data$qsec, my_data$mpg, method = "spearman")

cor.test(my_data$qsec, my_data$mpg, method = "spearman")

The correlation coefficient ranges between -1 and 1.

If the Correlation coefficient is negative, that implies when x increases y decreases.

If the Correlation coefficient is zero, that implies there exists no association between x and y.

If the Correlation coefficient is positive, that implies when x increases y also tends to increase.

**e) T-test: **The T-test will tell you if two data sets are coming from the same (assuming) normal distributions or not.

Here you should reject the null hypothesis that the two means are equal because the p-value is less than 0.05.

This shown instance is of type: unpaired data sets with unequal variances. Similarly, can be tried with the paired dataset.

**f) Simple Linear Regression: **This shows the relationship between the predictor/independent and response/dependent variable.

A simple practical example could be predicting the weight of a person if the height is known.

**R syntax**

`lm(formula,data)`

Here formula depicts the relation between output i.e. y and input variable i.e. x. Data represent the dataset, on which the formula needs to be applied.

Let’s see one practical example, where the floor area is the input variable and rent is the output variable.

x <- c(1510, 1000, 600, 500, 1280, 136, 1790, 1630)

y <- c(15000, 10000, 6000, 5000, 12800, 13600,17900, 16300)

Here P-value is not less than 5%. Hence the null hypothesis cannot be rejected. There is not much significance to prove the relationship between the floor area and rent.

Here R-square value is 0.4813. That implies only 48% of the variance in the output variable can be explained by the input variable.

Let’s say now we need to predict for a value of floor area, based on the above-fitted model.

**R code**

`x_new <- data.frame(x = 1700)`

result <- predict(relation,x_new)

print(result)

**R output:**

After the execution of the above R code, the output will look like the following:

One can fit and visualize regression. Here is the R code for that:

**# Give the png chart file a name.**

`png(file = "LinearRegressionSample.png")`

**# Plot the chart.**

`plot(y,x,col = "green",main = "Floor Area & Rent Regression",`

abline(lm(x~y)),cex = 1.3,pch = 16,xlab = "Floor area in sq m",ylab = "Rent in Rs")

**# Save the file.**

`dev.off()`

This “LinearRegressionSample.png” graph will be generated in your present working directory.

**g) Chi-Square test**

This is a statistical function in R. This test holds its significance in order to prove if the correlation exists between two categorical variables.

This test also works like any other statistical tests were based on p-value, one can accept or reject the null hypothesis.

**R syntax**

`chisq.test(data),/code>`

Let’s see one practical example of it.

**R code**

**# Load the library.**

`library(datasets)`

data(iris)

**# Create a data frame from the main data set.**

`iris.data <- data.frame(iris$Sepal.Length, iris$Sepal.Width)`

**# Create a table with the needed variables.**

`iris.data = table(iris$Sepal.Length, iris$Sepal.Width)`

print(iris.data)

**# Perform the Chi-Square test.**

`print(chisq.test(iris.data))`

**R output:**

As one can see, the chi-square test has been performed over an iris dataset, considering its two variables “Sepal. Length” and “Sepal.Width”.

The p-value is not less than 0.05, hence correlation doesn’t exist between these two variables. Or we can say these two variables are not dependent on each other.

### Conclusion

Functions in R are simple, easy to fit, easy to grasp and yet very powerful. We saw a variety of functions that are used as part of basics in R. Once one gets comfortable with these functions discussed above, one can explore other varieties of functions. Functions help you, make your code run in simple and in a concise manner. Functions can be inbuilt or user-defined, all depends on the need while addressing a problem. Functions give a good shape to a program.

### Recommended Articles

This is a guide to Functions in R. here we discuss how to write Functions in R and different types of Functions in R with syntax and examples. You may also look at the following article to learn more –