Introduction to Supervised Learning
Supervised learning is an area of machine learning in which we predict values using labeled datasets. The inputs are called the independent variables, while the predicted results are called the dependent variable because they depend on the independent variables. For example, most email accounts (e.g. Gmail) have a spam folder that automatically catches most spam and fraudulent emails, with accuracy greater than 95%. It works on a supervised learning model: the training set consists of labeled data, which in this case is the emails that users have flagged as spam. The model learns from this training set and later uses what it has learned to categorize a new email as spam if it fits the category.
How Supervised Machine Learning Works
Let us understand supervised machine learning with the help of an example. Say we have a fruit basket filled with different kinds of fruit. Our job is to sort the fruits by category.
In our case, we consider four types of fruit: apples, bananas, grapes, and oranges.
Now let us list some of the characteristics that distinguish these fruits from one another.
S No. | Size | Color | Shape | Fruit Name |
1 | Small | Green | Round to oval, bunch shape, cylindrical | Grape |
2 | Big | Red | Rounded shape with a depression at the top | Apple |
3 | Big | Yellow | Long curving cylinder | Banana |
4 | Big | Orange | Rounded shape | Orange |
Now suppose you pick a fruit from the basket and look at its features: its shape, size, and color. You find that the color is red, the size is big, and the shape is rounded with a depression at the top, so you conclude that it is an apple.
- Likewise, you do the same for all the remaining fruits.
- The rightmost column (“Fruit Name”) is known as the response variable.
- This is how we formulate a supervised learning model. Once the properties are given, it becomes quite easy for anybody new (say, a robot or an alien) to group fruits of the same type together.
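The fruit example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature strings are simplified from the table, and the matching rule (pick the labeled fruit that shares the most feature values) is a stand-in chosen for brevity, not a method the article prescribes. The names `TRAINING_DATA`, `count_matches`, and `predict_fruit` are hypothetical.

```python
# Labeled training data based on the fruit table:
# (size, color, shape) -> fruit name (the response variable).
TRAINING_DATA = [
    (("small", "green", "round to oval"), "Grape"),
    (("big", "red", "rounded with a depression at the top"), "Apple"),
    (("big", "yellow", "long curving cylinder"), "Banana"),
    (("big", "orange", "rounded"), "Orange"),
]


def count_matches(known, observed):
    """Count how many feature values two fruits have in common."""
    return sum(1 for a, b in zip(known, observed) if a == b)


def predict_fruit(features):
    """Return the label of the training example sharing the most features."""
    _, best_label = max(
        TRAINING_DATA,
        key=lambda example: count_matches(example[0], features),
    )
    return best_label


# A fruit that is big, red, and rounded with a depression at the top.
print(predict_fruit(("big", "red", "rounded with a depression at the top")))  # Apple
```

With only one labeled example per fruit, this behaves like a lookup table; a real model would learn from many labeled examples per class.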
Types of Supervised Machine Learning Algorithms
Let us look at the different types of supervised machine learning algorithms:
Regression:
Regression is used to predict a single continuous output value from the training data. The output value is called the dependent variable, while the inputs are known as the independent variables. There are different types of regression in supervised learning, for example:
- Linear Regression – Here we have only one independent variable, which is used to predict the output, i.e. the dependent variable. This case is also called simple linear regression.
- Multiple Regression – Here we have more than one independent variable, all of which are used to predict the output, i.e. the dependent variable.
- Polynomial Regression – Here the relationship between the dependent and independent variables follows a polynomial function. For example, memory first improves with age, peaks at a certain age, and then declines as we grow old.
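As a minimal sketch of the single-variable case, the following fits a straight line by ordinary least squares using only the standard library. The data (hours studied versus exam score) is made up for illustration, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data: hours studied (independent variable)
# and exam score (dependent variable).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)
predicted_score = slope * 6.0 + intercept  # prediction for an unseen input
print(slope, intercept, predicted_score)
```

Multiple and polynomial regression follow the same idea with more input columns (the other variables, or powers of x), which is usually handled with a linear algebra library rather than by hand.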
Classification:
Classification algorithms in supervised learning are used to assign objects to distinct classes.
- Binary classification – If the algorithm assigns objects to one of 2 distinct classes, it is called binary classification. For example, an email spam classifier.
- Multiclass classification – If the algorithm assigns objects to one of more than 2 classes, it is called multiclass classification.
- Strength – Classification algorithms usually perform very well.
- Drawbacks – They are prone to overfitting and might be unconstrained.
- Logistic regression/classification – When the Y variable is binary categorical (i.e. 0 or 1), we use logistic regression for the prediction. For example, predicting whether a given credit card transaction is fraudulent or not.
- Naïve Bayes Classifiers – The Naïve Bayes classifier is based on Bayes' theorem. This algorithm is usually well suited when the dimensionality of the inputs is high. It can be represented as a directed acyclic graph with one parent node (the class) and many child nodes (the features). The child nodes are assumed to be independent of each other given the parent.
- Decision Trees – A decision tree is a flowchart-like tree structure made up of internal nodes (each a test on an attribute), branches (each denoting an outcome of the test), and leaf nodes (each representing a distribution of classes). The topmost node is the root node. It is a very widely used classification technique.
- Support Vector Machine – A support vector machine (SVM) performs classification by finding the hyperplane that maximizes the margin between 2 classes. SVMs are closely tied to kernel functions, which let them separate classes that are not linearly separable in the original feature space. SVMs are used extensively in fields such as biometrics and pattern recognition.
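To make binary classification concrete, here is a rough sketch of logistic regression trained with stochastic gradient descent, written from scratch for readability. Everything specific in it is an assumption for illustration: the six transactions, the two features (amount in thousands and a foreign-transaction flag), the learning rate, and the number of epochs. A production system would normally rely on an established library instead.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Learn weights and a bias by stochastic gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # gradient of log loss w.r.t. score
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Return 1 (fraud) if the predicted probability is at least 0.5, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up labeled transactions: (amount in thousands, is_foreign flag).
transactions = [(0.2, 0), (0.5, 0), (0.3, 1), (4.0, 1), (5.0, 1), (4.5, 0)]
is_fraud = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(transactions, is_fraud)
print(predict(weights, bias, (0.4, 0)))
print(predict(weights, bias, (4.8, 1)))
```

The Y variable here is binary (0 or 1), which matches the logistic regression case above; the other algorithms in the list would consume the same labeled rows but learn a different kind of decision rule.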
Advantages
Below are some of the advantages of supervised machine learning models:
- The performance of a model can be optimized using experience.
- Supervised learning lets you collect data and produce outputs based on previous experience.
- Supervised machine learning algorithms can be used to solve a number of real-world problems.
Disadvantages
The disadvantages of supervised learning are as follows:
- Training supervised machine learning models may take a lot of time if the dataset is large.
- Classifying big data can sometimes pose a bigger challenge.
- One may have to deal with the problem of overfitting.
- The model needs lots of good examples during training if we want the classifier to perform well.
Good Practices while Building Learning Models
The following are good practices when building supervised machine learning models:
- Before building any machine learning model, the data must be preprocessed.
- One must decide which algorithm is best suited to the given problem.
- We need to decide what type of data will be used for the training set.
- We need to decide on the structure of the algorithm and the function.
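Two of these practices, preprocessing the data and choosing the training set, can be sketched as follows. This is one possible approach under stated assumptions, not the only one: the 25% hold-out fraction, the fixed seed, min-max scaling as the preprocessing step, and the helper names `train_test_split` and `min_max_scale` are all illustrative choices.

```python
import random


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the labeled examples and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    test_size = max(1, int(len(rows) * test_fraction))
    test_idx, train_idx = indices[:test_size], indices[test_size:]

    def pick(idx):
        return [rows[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


def min_max_scale(train_rows, other_rows):
    """Scale features using ranges learned from the training rows only.

    Training values land in [0, 1]; held-out values may fall slightly outside.
    """
    columns = list(zip(*train_rows))
    lows = [min(column) for column in columns]
    # Fall back to 1.0 when a feature is constant to avoid dividing by zero.
    spans = [(max(column) - min(column)) or 1.0 for column in columns]

    def scale(rows):
        return [
            tuple((value - low) / span
                  for value, low, span in zip(row, lows, spans))
            for row in rows
        ]

    return scale(train_rows), scale(other_rows)


# Made-up labeled data: (height in cm, weight in kg) -> class label.
rows = [(150.0, 50.0), (160.0, 55.0), (165.0, 62.0), (170.0, 68.0),
        (175.0, 72.0), (180.0, 80.0), (185.0, 85.0), (190.0, 95.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

(train_rows, train_labels), (test_rows, test_labels) = train_test_split(rows, labels)
scaled_train, scaled_test = min_max_scale(train_rows, test_rows)
print(len(scaled_train), len(scaled_test))
```

Evaluating the model on the held-out rows, which it never saw during training, is a common way to notice the overfitting problem mentioned under the disadvantages.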
Conclusion
In this article, we learned what supervised learning is and saw that the model is trained using labeled data. We then looked at how these models work and at their different types. Finally, we saw the advantages and disadvantages of supervised machine learning algorithms.