
Cluster Analysis and Unsupervised Machine Learning - Basic Concepts

BESTSELLER
4.7 (89578 ratings)

What you'll get

  • 1h 41m
  • 12 Videos
  • Course Level - Intermediate | English [Auto-generated]
  • Course Completion Certificates
  • One-Year Access
  • Mobile App Access

Curriculum:

    What is Cluster Analysis?

    Cluster analysis is a statistical technique for sorting objects into groups called clusters, such that objects in the same cluster are more similar to one another than to objects in other clusters. In simple words, cluster analysis divides data into clusters that are meaningful and useful. Clustering is used mainly for two purposes: clustering for understanding and clustering for utility.

    Application of cluster analysis

    • Cluster analysis is used in many fields, including machine learning, market research, pattern recognition, data analysis, information retrieval, image processing and data compression.
    • It helps marketers identify distinct groups within their customer base.
    • In biology, it is used to derive plant and animal taxonomies and to categorize genes with similar characteristics.
    • In an earth observation database, it is used to group the houses in a city by house type, value and location.
    • It can be used to segment documents on the web according to specific criteria.
    • In data mining, it is used to gain an in-depth understanding of the characteristics of the data in each cluster.

    Clustering Methods

    Clustering methods can be divided into the following categories:

    • Partitioning method
    • Hierarchical method
    • Density-based method
    • Grid-based method
    • Model-based method
    • Constraint-based method

    Advantages of Cluster Analysis

    The advantages of cluster analysis are given below:

    • Cluster analysis gives a quick overview of the data.
    • It can be used when the data contains many groups.
    • It can be used with unusual similarity measures.
    • Cluster results can be added to ordination plots, and the method is good at identifying nearest neighbours.

    Approaches to cluster analysis

    There are a number of different approaches to cluster analysis, which fall into two groups:

    • Hierarchical methods: agglomerative methods and divisive methods
    • Non-hierarchical methods, of which K-means clustering is the best known

    Cluster Analysis Course Objectives

    By the end of this course you will know:

    • How to use cluster analysis in data mining
    • The various types of clusters
    • The marketing applications of cluster analysis
    • The implications of a wide variety of clustering techniques
    • How to use clustering in statistical analysis

    Prerequisites for the Cluster Analysis course

    Basic knowledge of statistics is required. Some familiarity with data analysis is an added advantage, though it is not a necessity.

    Target Audience

    The target audience of this course is listed below:

    • Students
    • Research professionals
    • Data Analysts
    • Data Miners
    • And anyone who is interested in learning about cluster analysis

    Cluster Analysis Course Description

    Section 1: Introduction

    Meaning of Cluster Analysis

    The term cluster analysis covers a number of different algorithms and methods for grouping data and objects. It is an exploratory data analysis tool: cluster analysis is used to discover structures in data without explaining why they exist. This section includes a brief introduction to cluster analysis, along with its history and benefits.

    Understanding of Cluster Analysis

    In this section you will learn what makes a good clustering, one that produces high-quality clusters, and how to measure the quality of a clustering. The other topics in this section are the major clustering approaches, the techniques of cluster analysis, and its basic concepts and algorithms.

    Example of Cluster Analysis

    Clustering appears in many aspects of daily life. In this chapter you will see illustrations and practical applications of cluster analysis in various fields. One example involves a retail chain with stores across various locations. Another is based on market segmentation. Finally, a simple numerical example explains the objectives of cluster analysis. This section also gives an example from each field where cluster analysis is used, such as marketing, land use, biology, psychology, medicine and information retrieval.

    Section 2: Types of Clustering

    Hierarchical method of Clustering

    Hierarchical clustering produces a set of nested clusters organized in the form of a tree. It includes several methods for deciding which clusters should be joined at each stage. There are two main types of hierarchical clustering: agglomerative and divisive. The agglomerative clustering algorithm is explained in detail, with an example, in this section.

    The main methods of hierarchical clustering are also explained briefly in this section:

    • Nearest Neighbour Method (Single Linkage Method)
    • Furthest Neighbour Method (Complete Linkage Method)
    • Average Linkage Method (Between Groups)
    • Centroid Method
    • Ward's Method

    Single linkage clustering

    The single linkage method is also known as the nearest neighbour method. It measures the distance between two clusters as the distance between their closest pair of members, which is needed once a cluster holds more than one observation. The major topics in this section are listed below:

    • Spanning tree
    • Contracting Space
    • Chaining
    • Dendrogram or tree diagram
    • Example of nearest neighbour method using diagrams
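
The nearest neighbour idea can be sketched in a few lines of Python. This is a minimal, naive illustration and not the course's own material: the function name `single_linkage_merges` and the sample points are hypothetical, and the quadratic search is chosen for readability rather than speed. The merge history it returns is the information a dendrogram draws.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def single_linkage_merges(points):
    """Naive agglomerative clustering with single linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]  # start: one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical one-dimensional observations.
points = [(0.0,), (1.0,), (2.5,), (7.0,)]
merges = single_linkage_merges(points)
for left, right, d in merges:
    print(left, right, d)
```

In this sample, point 2 joins the cluster {0, 1} at distance 1.5 because it is close to point 1, even though it is 2.5 away from point 0. That link-by-link growth is the chaining behaviour associated with single linkage.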

    Linkage methods, Ward's method

    The single linkage method is explained in detail in the previous chapter. This section deals with the remaining methods: complete linkage, average linkage, the centroid method and Ward's method.

    In the complete linkage method, the distance between two clusters is the maximum distance between any member of one cluster and any member of the other. The formula is explained in this section, and a detailed example is given to make it easy to understand.

    In the average linkage method, the distance between two clusters is the average distance over all pairs made up of one member from each cluster. This method is explained in detail in this section with an example.

    In the centroid method, the mean value of each variable is computed for each cluster, and the distance between these centroids is used to decide which clusters to merge. This method is also explained with an example.

    In Ward's method, each candidate pair of clusters is combined and the sum of squared distances within the resulting clusters is computed. The merge that gives the lowest sum of squares is chosen. This method is widely used. This section contains examples of this method.
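
To make the differences between these definitions concrete, the sketch below computes each between-cluster distance for two small clusters. It assumes Euclidean distance, and the function names and sample clusters are hypothetical. For Ward's method it uses a common formulation, the increase in the within-cluster sum of squares caused by a merge, which may differ in detail from the formula presented in the course.

```python
from itertools import product
from math import dist  # Euclidean distance (Python 3.8+)

def single_linkage(a, b):
    """Smallest distance between a member of a and a member of b."""
    return min(dist(p, q) for p, q in product(a, b))

def complete_linkage(a, b):
    """Largest distance between a member of a and a member of b."""
    return max(dist(p, q) for p, q in product(a, b))

def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid(points):
    """Mean of each variable across the points of one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def centroid_linkage(a, b):
    """Distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))

def within_ss(points):
    """Sum of squared distances from each point to its cluster centroid."""
    c = centroid(points)
    return sum(dist(p, c) ** 2 for p in points)

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

# Hypothetical clusters: two vertical pairs of points, 4 units apart.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(4.0, 0.0), (4.0, 2.0)]

for name, fn in [("single", single_linkage), ("complete", complete_linkage),
                 ("average", average_linkage), ("centroid", centroid_linkage),
                 ("ward", ward_cost)]:
    print(name, round(fn(cluster_a, cluster_b), 3))
```

At each stage, an agglomerative algorithm would evaluate the chosen measure for every pair of current clusters and merge the pair with the smallest value.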

    K-means clustering

    K-means clustering is a non-hierarchical clustering method. The desired number of clusters is specified beforehand, and the best solution with that number of clusters is chosen. The steps for carrying out K-means clustering are given in this chapter.
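
As an illustration of those steps, here is a minimal sketch of the standard iteration often called Lloyd's algorithm: assign every point to its nearest centroid, move each centroid to the mean of its members, and repeat until nothing changes. The function name `kmeans`, the `max_iter` parameter, the sample data and the choice to pass initial centroids explicitly are all assumptions made for this example, not details taken from the course.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its closest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        if new_centroids == centroids:  # convergence: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

The number of clusters is fixed by the number of initial centroids supplied, which mirrors the requirement that K be chosen before the algorithm runs.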

    K-means and example of K-means, difference between hierarchical and non-hierarchical clustering

    The important points of K-means clustering are covered in this chapter, including the partitional clustering approach, centroids and the K-means algorithm. The details of K-means clustering are explained under the following points:

    • Initial centroids
    • Closeness
    • Similarity measures
    • Convergence
    • Complexity of K-means
    • Types of K-means clustering: sub-optimal clustering and optimal clustering
    • Solutions to the initial centroids problem
    • Evaluating K-means clusters
    • Difference between hierarchical clustering and K-means clustering
    • Strengths of K-means clustering
    • Limitations of K-means clustering
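
The points on sub-optimal clustering and on evaluating clusters can be illustrated with the within-cluster sum of squares, a common way to score a K-means result (lower is tighter). The sketch below is a hypothetical example: the function name `wcss` and the four sample points are assumptions. Both labelings shown are stable under the K-means update, yet one is far worse than the other, which is why the choice of initial centroids matters.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster's centroid."""
    total = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        center = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(dist(p, center) ** 2 for p in members)
    return total

# Hypothetical data: two tight vertical pairs, 10 units apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

left_right = [0, 0, 1, 1]  # groups each tight pair together
top_bottom = [0, 1, 0, 1]  # what starting centroids (5, 0) and (5, 1) converge to

good = wcss(points, left_right)
poor = wcss(points, top_bottom)
print(good, poor)
```

One widely used remedy, in line with the "solutions to the initial centroids problem" point, is to run K-means from several different starting centroids and keep the run with the lowest score.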

    Example of K-means, number of clusters, statistical tests, dendrogram, scree plot

    In its computation, K-means clustering can be thought of as an analysis of variance (ANOVA) in reverse: ANOVA tests whether known groups differ on a variable, whereas K-means looks for the groups that differ as much as possible. A physical fitness example is given to explain the K-means clustering method. K-means clustering is also explained with other examples using plots and graphs.

    Dendrogram: the result of a hierarchical cluster analysis can be represented as a diagram known as a dendrogram. The diagram shows which clusters were joined at each stage of the analysis and the distance between them at the time of joining. This helps in selecting the optimum number of clusters. An example of a dendrogram is given under this heading.

    A scree plot displays the eigenvalue associated with each component, in descending order, against the component number. The pattern and properties of the scree plot in cluster analysis are discussed in this section.
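
The ANOVA comparison can be made concrete by computing a one-way F statistic for a single variable, treating the clusters as the groups. A large F means the clusters differ much more between one another than within themselves on that variable. The function name `f_statistic` and the sample values below are hypothetical and are not the course's physical fitness data.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for one variable split into groups."""
    n = sum(len(g) for g in groups)  # total number of observations
    k = len(groups)                  # number of groups (clusters)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical values of one variable in two clusters.
f_value = f_statistic([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])
print(f_value)
```

Because the clusters were chosen to be as different as possible, such F values describe the solution rather than test a hypothesis in the usual sense.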
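
One simple heuristic for reading the number of clusters off a dendrogram is to cut the tree just before the largest jump in joining distance. The sketch below assumes that heuristic, which is only one of several in use and is not stated in the course. The function name and the sample distances are hypothetical, and at least two merges are required.

```python
def clusters_from_merge_distances(merge_distances):
    """Suggest a cluster count by cutting before the largest distance jump.

    `merge_distances` lists the joining distance of each merge in the order
    the merges happened, so n points produce n - 1 entries.
    """
    n_points = len(merge_distances) + 1
    gaps = [later - earlier
            for earlier, later in zip(merge_distances, merge_distances[1:])]
    biggest = max(range(len(gaps)), key=gaps.__getitem__)
    # Merges 0..biggest are kept; every kept merge removes one cluster.
    return n_points - (biggest + 1)

# Hypothetical joining distances for four observations.
suggested = clusters_from_merge_distances([1.0, 1.5, 4.5])
print(suggested)
```

Here the last merge happens at a much larger distance than the earlier ones, so the heuristic stops before it and suggests two clusters.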

    Two-step cluster analysis, evaluation

    Two-step cluster analysis is used to reveal natural clusters within a data set. It runs a pre-clustering method first and then a hierarchical method. This section contains the following topics:

    • Algorithm of two-step cluster analysis
    • The two steps of two-step cluster analysis
    • Case study: classifying motor vehicles using two-step cluster analysis

    Example for listwise and pairwise deletion of missing values, SPSS windows of output

    Listwise and pairwise deletion are techniques for handling missing data. They are appropriate when data is missing completely at random. Listwise deletion drops an entire case if it has one or more missing values. Pairwise deletion tries to minimize the data loss caused by listwise deletion by using, for each analysis, every case that has values for the variables involved. Each technique has its own advantages and disadvantages. This section includes the following topics:

    • What is listwise deletion
    • Example of listwise deletion
    • What is pairwise deletion
    • Example of pairwise deletion
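
The difference between the two techniques shows up clearly on a small table. In this minimal sketch, missing values are represented by `None`; the function names and the sample rows are hypothetical and are not the course's SPSS example.

```python
def listwise(rows):
    """Keep only the cases (rows) that have no missing values at all."""
    return [r for r in rows if None not in r]

def pairwise(rows, i, j):
    """Keep every case where both column i and column j are present."""
    return [(r[i], r[j]) for r in rows
            if r[i] is not None and r[j] is not None]

# Hypothetical cases with three variables each; None marks a missing value.
rows = [
    (1.0, 2.0, 3.0),
    (2.0, None, 1.0),
    (3.0, 4.0, None),
    (4.0, 5.0, 6.0),
]

complete_cases = listwise(rows)
pairs_01 = pairwise(rows, 0, 1)
pairs_02 = pairwise(rows, 0, 2)
print(len(complete_cases), len(pairs_01), len(pairs_02))
```

Listwise deletion leaves two of the four cases, while pairwise deletion keeps three cases for each of the two variable pairs shown. The trade-off is that different statistics are then based on different subsets of cases.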

    SPSS windows of output

    In SPSS, cluster analysis can be found under Analyze → Classify. SPSS offers three methods of cluster analysis: hierarchical, K-means and two-step cluster. This section includes examples of performing cluster analysis in SPSS.

    K means cluster theory, SPSS windows for k means

    This section explains what the K-means clustering method is, along with its history, algorithm, initialization methods, applications and description.

    SPSS is a statistical software package that can be used to perform cluster analysis. The steps to conduct cluster analysis in SPSS are simple, and it lets you choose the variables on which the cluster analysis is performed. You can perform K-means in SPSS by going to Analyze → Classify → K-Means Cluster. The steps for performing K-means cluster analysis in SPSS are given in this chapter. Screenshots are also provided for easy reference.

    FAQs: General Questions

    • What technical support will be provided?

    Our customer support centre is available 24/7. Through it you can ask questions and contact your instructors. You can also email your questions to the address provided on the site for technical support.

    • How can I get access to my course?

    You will be sent an email with your user name and password, along with a link to your course.

    • How much time commitment is required for each course?

    Each course requires at least 8 hours every week. You can choose your own schedule and complete the course at your convenience. Flexibility to learn in your own time is an advantage of taking an online course with educba.

    Testimonials

    Samuel

    This is an excellent introductory course on Cluster analysis. The course covers mainly two types of cluster analysis - Hierarchical and K means. The quality of the material in this course are of high standards. The course flow from one topic into another is best. The examples under each section makes the learning and understanding process easy. Thanks to educba for offering this course.

    Henry Mark

    This is my first online course and it provided me a good experience. The syllabus of this course makes it more interesting. It is not stuffed with content. The content is good and self explanatory. It gave me a greater overview of the clustering methods and techniques which I was not aware of before taking this course. This course is recommended to someone who is new to the concept of cluster analysis as well as to one who knows how to apply cluster analysis to data. Overall a great course to begin with cluster analysis.

    Richard

    This is a good course on cluster analysis. It covers all the important topics and gives good examples to understand the methods and algorithms. It also gives some real life applications of clustering as examples and thus it makes the content more interesting and engaging. I loved this course and would definitely recommend.

    Where do our learners come from?
    Professionals from around the world have benefited from eduCBA's Cluster Analysis courses. Some of the top places that our learners come from include New York, Dubai, San Francisco, Bay Area, New Jersey, Houston, Seattle, Toronto, London, Berlin, UAE, Chicago, UK, Hong Kong, Singapore, Australia, New Zealand, India, Bangalore, New Delhi, Mumbai, Pune, Kolkata, Hyderabad and Gurgaon among many.

    Course Overview

    This online course provides a fundamental understanding of data mining using cluster analysis. The aim is to learn how cluster analysis and its features can be used in data mining. The tutorials will help you learn the meaning of cluster analysis through examples and the various types of clustering.


