Machine Learning Algorithms


Introduction to Machine Learning Algorithms

Machine Learning is a branch of predictive analytics built on the principle that computers learn from past data and then make predictions on new data. Machine Learning algorithms are not new; they date back to the early days of computing. The recent surge of interest, however, is a result of the large amounts of data now being generated and the huge computational power that modern-day computers possess. This has resulted in the emergence of Deep Learning, a sub-field of Machine Learning which thrives on more data. Deep Learning is loosely modeled on the human brain, where neurons work together to make decisions: its neural networks are organized into layers that use forward and backward propagation to make better decisions.


Categories of Machine Learning Algorithms

Machine Learning algorithms could be grouped into three categories –

  • Supervised Learning – In Supervised Learning, the data set is labeled, i.e., every set of feature values (independent variables) comes with a corresponding target value, which is used to train the model.
  • Unsupervised Learning – Unlike in Supervised Learning, the data set is not labeled in this case. Techniques such as clustering are therefore used to group the data points based on their similarity to one another.
  • Reinforcement Learning – A special type of Machine Learning where the model learns from each action it takes. The model is rewarded for every correct decision and penalized for every wrong one, which allows it to learn the patterns and make more accurate decisions on unknown data.

Division of Machine Learning Algorithms

The problems that Machine Learning algorithms solve could be divided into –

  • Regression – The model captures the relationship between a continuous target (dependent) variable and the independent variables. The target variable is numeric in nature, while the independent variables could be numeric or categorical.
  • Classification – The most common problem statement you would find in the real world is classifying a data point into a binary, multinomial or ordinal class. In a Binary Classification problem, the target variable has only two outcomes (Yes/No, 0/1, True/False). In a Multinomial Classification problem, the target variable has more than two classes (Apple/Orange/Mango, and so on). In an Ordinal Classification problem, the classes of the target variable are ordered (e.g., the grades of students).

To solve these kinds of problems, programmers and scientists have developed algorithms that could be applied to the data to make predictions. These algorithms could be divided into linear and non-linear or tree-based algorithms. Linear algorithms such as Linear Regression and Logistic Regression are generally used when there is a linear relationship between the features and the target variable, whereas tree-based methods such as Decision Tree, Random Forest and Gradient Boosting are preferred for data that exhibits non-linear patterns.

So far, we have built a brief intuition about Machine Learning. Next, you will learn about some of its ready-made algorithms that you could use in your next project.

Algorithms

Numerous Machine Learning algorithms are available today, and the number is only going to increase considering the amount of research being done in this field. Linear and Logistic Regression are generally the first algorithms you learn as a Data Scientist, followed by more advanced algorithms.


Below are some of the Machine Learning algorithms, along with sample code snippets in Python.

1. Linear Regression

As the name suggests, this algorithm could be used in cases where the target variable, which is continuous in nature, is linearly dependent on the independent variables. It is represented by –

y = a*x + b + e, where y is the target variable we are trying to predict, a is the slope, b is the intercept, e is the error term, and x is the independent variable used to make the prediction. This is a Simple Linear Regression as there is only one independent variable. In the case of Multiple Linear Regression, the equation would be –

y = a1*x1 + a2*x2 + …… + a(n)*x(n) + b + e

Here, e is the error term, b is the intercept, and a1, a2, …, a(n) are the coefficients of the independent variables x1, x2, …, x(n).

To evaluate the performance of the model, a metric is used, which in this case could be the Root Mean Square Error (RMSE). It is the square root of the mean of the squared differences between the actual and the predicted values:

RMSE = sqrt( (1/n) * Σ (y_i - ŷ_i)^2 ), where y_i is the actual value, ŷ_i is the predicted value, and n is the number of data points.

The goal of Linear Regression is to find the best fit line which would minimize the difference between the actual and the predicted data points.


Linear Regression could be written in Python as below –

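The original code listing did not survive on this page, so the following is only a minimal sketch of Simple Linear Regression, assuming scikit-learn and NumPy are installed. The synthetic data (true slope 3, true intercept 2), the 75/25 split and all variable names are illustrative assumptions, not part of the original article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): y = 3*x + 2 plus Gaussian noise e.
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)

# Hold out 25% of the data to evaluate the fitted line.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the best-fit line on the training data.
model = LinearRegression()
model.fit(X_train, y_train)

# RMSE: square root of the mean of the squared differences.
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print("slope a:", model.coef_[0])
print("intercept b:", model.intercept_)
print("RMSE:", rmse)
```

The fitted slope and intercept should land close to the values used to generate the data, and the RMSE should be close to the standard deviation of the noise.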

2. Logistic Regression

Like Linear Regression, Logistic Regression models a linear relationship between the features and its output. However, unlike in Linear Regression, the target variable in Logistic Regression is categorical, i.e., binary, multinomial or ordinal in nature. The choice of the activation function is important in Logistic Regression. For binary classification problems, the model expresses the log of the odds in favor of the positive class as a linear function z of the features, and the sigmoid function (the inverse of the log-odds) converts z into a probability between 0 and 1:

sigmoid(z) = 1 / (1 + e^(-z))

In the case of a multi-class problem, the softmax function is preferred because it generalizes the sigmoid to several classes: it converts the scores z_1, …, z_K of the K classes into probabilities that sum to 1:

softmax(z_i) = e^(z_i) / (e^(z_1) + … + e^(z_K))
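To make the two activation functions concrete, here is a small NumPy sketch. The function names and the example scores are illustrative assumptions; subtracting the maximum inside softmax is a common numerical-stability trick, not something the article prescribes.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score (the log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Turn a vector of class scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max so np.exp cannot overflow
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

# Binary case: log-odds of 0 means both outcomes are equally likely.
p_binary = sigmoid(0.0)

# Multi-class case: three hypothetical class scores.
p_multi = softmax(np.array([2.0, 1.0, 0.1]))

print(p_binary)
print(p_multi, p_multi.sum())
```

The class with the highest score always receives the highest probability, and taking the log of the odds of a sigmoid output returns the original score.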

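The metrics generally used to evaluate a classification model are Accuracy and the ROC curve, which plots the true positive rate against the false positive rate at different classification thresholds. The larger the area under the ROC curve (AUC), the better the model. A classifier that guesses at random would have an AUC of 0.5. A value of 1 indicates a perfect classifier, whereas a value of 0 indicates one whose ranking of the two classes is completely reversed.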

In sklearn, Logistic Regression could be written as –

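The original screenshots are not available, so this is a hedged sketch of binary Logistic Regression in scikit-learn rather than the article's own listing. The breast cancer dataset bundled with scikit-learn, the feature scaling step and the `max_iter` value are assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Binary target: malignant (0) vs benign (1).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features first so the solver converges quickly.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# Accuracy uses the predicted classes; AUC uses the predicted probabilities.
accuracy = accuracy_score(y_test, clf.predict(X_test))
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

print("Accuracy:", accuracy)
print("ROC AUC:", auc)
```

Note that the AUC is computed from the probability of the positive class, not from the hard class predictions, so it reflects how well the model ranks the two classes.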

3. K-Nearest Neighbors

The K-Nearest Neighbors (KNN) algorithm could be used for both classification and regression problems. The idea behind the KNN method is that it predicts the value of a new data point based on its K nearest neighbors. An odd value of K is generally preferred to avoid tied votes in binary classification. When classifying a new data point, the most frequent class (the mode) among its neighbors is assigned. For a regression problem, the mean of the neighbors' target values is taken as the prediction.

In sklearn, KNN is written as –

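As the original listing is missing, the sketch below shows one plausible way to run KNN classification with scikit-learn. The Iris dataset and K = 5 are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# K = 5 (an odd number); each prediction is the most frequent
# class among the 5 nearest training points.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

accuracy = accuracy_score(y_test, knn.predict(X_test))
print("Accuracy:", accuracy)
```

For a regression problem, `KNeighborsRegressor` from the same module could be used instead; it averages the target values of the neighbors.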

KNN is often used in building recommendation engines.

4. Support Vector Machines

Support Vector Machine (SVM) is a classification algorithm in which a hyperplane separates the classes. In a binary classification problem, the data points of each class that lie closest to the other class are known as the support vectors, and the hyperplane is drawn at the maximum distance (margin) from these support vectors.

In the simplest case, a single line separates the two classes. However, in most cases, the data would not be so clean and a simple hyperplane would not be able to separate the classes. Hence, you need to tune parameters such as Regularization, Kernel, Gamma, and so on.

The kernel could be linear or polynomial, among others, depending on how the data is separated. When a straight line separates the classes, a linear kernel is sufficient. In the case of Regularization, you need to choose an optimum value of C, as a high value could lead to overfitting while a small value could underfit the model. Gamma defines how far the influence of a single training example reaches. With a high gamma, only points close to the decision boundary are considered; with a low gamma, points farther away are considered as well.

In sklearn, SVM is written as –

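The code screenshots are no longer available, so here is a minimal sketch using scikit-learn's `SVC`. The two synthetic, roughly linearly separable classes and the parameter values (`C=1.0`, `gamma="scale"`) are assumptions for illustration only.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two synthetic classes centered at (-2, -2) and (2, 2).
X, y = make_blobs(
    n_samples=200, centers=[[-2, -2], [2, 2]], cluster_std=1.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Linear kernel: the decision boundary is a straight line.
linear_svm = SVC(kernel="linear", C=1.0)
linear_svm.fit(X_train, y_train)

# RBF kernel: gamma controls how far each training point's influence reaches.
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale")
rbf_svm.fit(X_train, y_train)

linear_accuracy = accuracy_score(y_test, linear_svm.predict(X_test))
rbf_accuracy = accuracy_score(y_test, rbf_svm.predict(X_test))

print("Support vectors (linear):", len(linear_svm.support_vectors_))
print("Linear accuracy:", linear_accuracy)
print("RBF accuracy:", rbf_accuracy)
```

Only the support vectors determine the position of the hyperplane, which is why the fitted model stores them in `support_vectors_`.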

5. Naive Bayes

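It works on the principle of Bayes Theorem, which gives the probability of an event given that some condition is known to be true. Bayes Theorem is represented as –

P(A|B) = P(B|A) * P(A) / P(B), where P(A|B) is the probability of event A given that B is true, P(B|A) is the probability of B given that A is true, and P(A) and P(B) are the probabilities of A and B on their own.

The algorithm is called Naive because it assumes that all variables are independent of one another given the class, i.e., the presence of one variable has no relation to the other variables, which is rarely the case in real life. Naive Bayes could be used in email spam classification and in text classification in general.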

Naïve Bayes code in Python –

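Since the original listing is missing, the following is a small sketch of spam classification with Multinomial Naive Bayes in scikit-learn. The six training messages and the two test messages are made up for illustration; a real spam filter would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set of labeled messages.
messages = [
    "win free money now",
    "free prize claim now",
    "win cash prize",
    "meeting at noon tomorrow",
    "project report attached",
    "lunch tomorrow at noon",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# CountVectorizer turns each message into word counts;
# MultinomialNB applies Bayes Theorem to those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

predictions = model.predict(["claim free cash", "report for tomorrow meeting"])
print(predictions)
```

Each word contributes to the class probability independently of the others, which is exactly the naive assumption described above.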

6. Decision Tree

Used for both classification and regression problems, the Decision Tree algorithm is one of the simplest and most easily interpretable Machine Learning algorithms. It is relatively robust to outliers and, in some implementations, to missing values in the data, and it could capture the non-linear relationships between the dependent and the independent variables.

To build a Decision Tree, all features are considered at first, and the feature with the maximum information gain is taken as the root node, from which the successive splitting is done. This splitting continues on the child nodes, based on the same maximum information gain criterion, until all the instances have been classified or the data cannot be split further. Decision Trees are often prone to overfitting, and thus it is necessary to tune hyperparameters such as the maximum depth, the minimum samples per leaf node, the minimum samples required for a split, the maximum number of features, and so on. One way to reduce overfitting is to set such constraints while the tree is grown greedily, choosing the best possible split at each step. Another, often better, approach is called Pruning, where the tree is first built up to a certain pre-defined depth and then, starting from the bottom, nodes are removed if they do not improve the model.

In sklearn, Decision Trees are coded as –

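The original screenshots did not survive, so this is a hedged sketch of a constrained Decision Tree in scikit-learn. The Iris dataset and the specific hyperparameter values are assumptions for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Constrain the tree while it is grown to limit overfitting.
tree = DecisionTreeClassifier(
    criterion="entropy",   # choose splits by information gain
    max_depth=3,           # maximum depth of the tree
    min_samples_leaf=5,    # minimum samples in each leaf node
    random_state=0,
)
tree.fit(X_train, y_train)

accuracy = accuracy_score(y_test, tree.predict(X_test))
print("Depth:", tree.get_depth())
print("Accuracy:", accuracy)
```

scikit-learn also supports a form of pruning after training through the `ccp_alpha` parameter (cost-complexity pruning), which could be tuned instead of, or together with, the constraints above.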

7. Random Forest

To reduce overfitting in a Decision Tree, it is required to reduce the variance of the model, and thus the concept of bagging came into place. Bagging is a technique where the outputs of several models are combined to form the final output. Random Forest is one such bagging method, where the dataset is sampled with replacement into multiple datasets and a random subset of the features is considered when building each tree. A Decision Tree is then trained on each sampled dataset to get the output of each model. In the case of a regression problem, the mean of the outputs of all the models is taken, whereas in the case of a classification problem, the class which gets the most votes is assigned to the data point. Random Forest is relatively robust to outliers and, in some implementations, to missing values in the data, and its feature importance scores can also help in dimensionality reduction. However, it is not easily interpretable, which is a drawback of Random Forest. In Python, you could code Random Forest as –

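With the original listing unavailable, the sketch below illustrates a Random Forest classifier in scikit-learn. The dataset, the number of trees and the `max_features` setting are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 100 trees, each trained on a bootstrap sample of the training data;
# a random subset of features is considered at each split.
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0
)
forest.fit(X_train, y_train)

# The predicted class is decided by a vote across all trees.
accuracy = accuracy_score(y_test, forest.predict(X_test))

# Importance scores (they sum to 1) show which features the trees relied on.
importances = forest.feature_importances_
print("Accuracy:", accuracy)
print("Most important feature index:", importances.argmax())
```

Features with very low importance scores are candidates for removal, which is one way the model can support dimensionality reduction.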

8. K-means Clustering

So far, we have worked with supervised learning problems, where for every input there is a corresponding output. Now, we turn to unsupervised learning, where the data is unlabelled and needs to be clustered into specific groups. There are several clustering techniques available. However, the most common of them is K-means clustering. In k-means, k refers to the number of clusters, which needs to be set in advance. Once k is set, the centroids are initialized. The centroids are then adjusted repeatedly so that the distance between the data points within a cluster is minimum and the distance between two separate clusters is maximum. Euclidean distance, Manhattan distance, etc., are some of the distance formulas used for this purpose.

The value of k could be found with the elbow method: the within-cluster sum of squared distances is plotted against k, and the value of k at which the curve bends (the elbow) is chosen.
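The elbow method can be sketched as follows with scikit-learn's `KMeans`, whose `inertia_` attribute holds the within-cluster sum of squared distances. The synthetic data with three groups and the range of k values are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups.
X, _ = make_blobs(
    n_samples=300, centers=[[0, 0], [6, 6], [0, 6]], cluster_std=1.0, random_state=0
)

# Fit k-means for k = 1..6 and record the within-cluster sum of squares.
k_values = list(range(1, 7))
inertias = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

for k, inertia in zip(k_values, inertias):
    print(k, round(inertia, 1))
```

The printed values drop sharply up to k = 3 and only slowly afterwards; plotting `inertias` against `k_values` would show the bend at the number of groups used to generate the data.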

K-means clustering is used in e-commerce industries, where customers are grouped together based on their behavioral patterns. It could also be used in Risk Analytics. Below is the Python code –

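The original screenshots are missing, so this is only a minimal sketch of K-means with scikit-learn. The synthetic two-feature data stands in for hypothetical customer behavior, and k = 3 is an assumption that matches how the data was generated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical customer data: two behavioral features per customer,
# generated around three known group centers.
true_centers = np.array([[0, 0], [6, 6], [0, 6]])
X, _ = make_blobs(
    n_samples=300, centers=true_centers, cluster_std=1.0, random_state=0
)

# k is set in advance; n_init=10 repeats the centroid initialization 10 times
# and keeps the best result.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
centroids = kmeans.cluster_centers_

print("Cluster sizes:", np.bincount(labels))
print("Centroids:\n", centroids)
```

No labels are passed to the model; the cluster assignments in `labels` come purely from the distances between the data points and the centroids.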

Conclusion: Machine Learning Algorithms

Data Scientist has been called the sexiest job of the 21st century, and Machine Learning is certainly one of its key areas of expertise. To be a Data Scientist, one needs an in-depth understanding of all these algorithms, as well as of newer techniques such as Deep Learning.
