Chapter 4 — Clustering

K-Means, hierarchical, dendrograms

Prof. Xuhu Wan

Chapter 4 · Introduction to Business Analytics

Clustering

Finding hidden groups in unlabelled data.

ISOM, HKUST Business School · Wan Academy · 2026 Edition

The Clustering Workflow

Collect → Standardise → Choose K → Run → Interpret

Clustering is unsupervised learning: no target column, no “right answer”. The algorithm discovers structure in the features alone. Business uses include customer segmentation, anomaly detection, document grouping, and exploratory data analysis.

Important

The single biggest mistake is skipping standardisation. If you cluster Age (20–70) and Income ($15K–$140K) without standardising, the income axis dominates the Euclidean distance: its raw spread (~$125K) is thousands of times the age spread (~50 years), so age is effectively ignored.

Why Standardise — Visual Proof
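The figure is not reproduced here, but the effect is easy to show numerically. A minimal numpy sketch (the customer values are made up for illustration):

```python
import numpy as np

# Toy customers: column 0 = age (years), column 1 = income (dollars)
X = np.array([[25.0,  40_000.0],
              [65.0,  42_000.0],
              [45.0,  90_000.0],
              [30.0, 120_000.0]])

# Raw Euclidean distance between the first two customers is driven
# almost entirely by the $2,000 income gap, not the 40-year age gap
raw = np.linalg.norm(X[0] - X[1])   # ~2000.4

# Z-score standardisation: subtract each column's mean, divide by its std
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Now both columns have mean 0 and std 1, so age and income
# contribute to the distance on the same scale
std = np.linalg.norm(Z[0] - Z[1])
print(round(raw, 1), round(std, 2))
```

After standardising, the large age difference dominates the distance between those two customers, and the near-identical incomes contribute almost nothing, which is what a human reading the raw table would expect.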

K-Means in One Sentence

Repeatedly: (1) assign each point to its nearest centroid, then (2) move each centroid to its cluster’s mean. Stop when centroids stop moving.
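That loop is short enough to write out in full. A from-scratch numpy sketch of Lloyd's algorithm (a teaching toy with only a minimal empty-cluster guard; in practice you would use a library implementation such as scikit-learn's KMeans):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Repeat (1) assign, (2) update until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # init at k data points
    for _ in range(n_iter):
        # (1) assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (2) move each centroid to the mean of its cluster
        #     (keep the old centroid if its cluster ends up empty)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):   # stop when centroids stop moving
            return labels, new
        centroids = new
    return labels, centroids

# Two well-separated blobs; K-Means should recover them exactly
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
labels, centroids = kmeans(X, k=2)
```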

Choosing K — The Elbow Plot

Plot the within-cluster sum of squares (inertia) against K. The elbow, where the curve bends and flattens, is a heuristic, not a proof: it tells you where adding clusters stops giving meaningful improvement.
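A sketch of how the numbers behind an elbow plot are produced, on toy data with three planted blobs (plain numpy; random restarts keep the best run, in the spirit of scikit-learn's n_init):

```python
import numpy as np

def kmeans_sse(X, k, seed, n_iter=50):
    """One K-Means run; returns the within-cluster sum of squares (inertia)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        centroids = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                              else centroids[j] for j in range(k)])
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return (d.min(axis=1) ** 2).sum()

def inertia(X, k, n_init=10):
    """Keep the best of several random restarts."""
    return min(kmeans_sse(X, k, seed) for seed in range(n_init))

# Three well-separated blobs: the elbow should appear at K = 3
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(c, 0.5, (40, 2)) for c in [(0, 0), (8, 0), (4, 7)]])
for k in range(1, 7):
    print(k, round(inertia(X, k), 1))  # SSE drops sharply up to K = 3, then flattens
```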

Hierarchical Clustering — Bottom Up

  1. Start with each point as its own cluster
  2. Merge the closest pair
  3. Repeat until one big cluster

The full tree (dendrogram) lets you read off any K after the fact by cutting horizontally.
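A minimal sketch of those three steps using SciPy's hierarchical-clustering routines (assuming scipy is installed; the data is a toy example):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two obvious groups of ten points each
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(5, 0.5, (10, 2))])

# Build the full merge tree bottom-up with Ward linkage;
# each row of Z records one merge and the distance at which it happened
Z = linkage(X, method="ward")

# "Cut" the tree horizontally to read off any K after the fact
labels = fcluster(Z, t=2, criterion="maxclust")   # ask for K = 2
```

Passing Z to scipy.cluster.hierarchy.dendrogram draws the tree itself (plotting requires matplotlib).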

Note

Ward linkage picks the merge that produces the smallest increase in within-cluster variance — the most common choice in practice. It tends to produce compact, balanced clusters.

Chapter Summary

Method               Use when
K-Means              You know K · fast on large N · roughly spherical clusters
Hierarchical (Ward)  Small-to-mid N · don’t know K up front · want a tree

Always: standardise features first.
Choose K: elbow plot + business knowledge.

Full Forbes financial and customer-segmentation case studies in the book — Chapter 4.

This concludes the course. Capstone projects use everything from Chapters 1–4 together.