K-Means, hierarchical, dendrograms
Finding hidden groups in unlabelled data.
Prof. Xuhu Wan
ISOM, HKUST Business School · Wan Academy · 2026 Edition
Collect → Standardise → Choose K → Run → Interpret
Clustering is unsupervised learning: no target column, no “right answer”. The algorithm discovers structure in the features alone. Business uses include customer segmentation, anomaly detection, document grouping, and exploratory data analysis.
Important
The single biggest mistake is skipping standardisation. If you cluster Age (20–70) and Income ($15K–$140K) without standardising, income differences are measured in tens of thousands while age differences are measured in tens, so the income axis dominates the Euclidean distance by a factor of roughly 1,000 and age becomes effectively invisible.
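A minimal sketch of the fix, using scikit-learn's StandardScaler on a small made-up table (the Age and Income values below are illustrative, not from the case studies). Each column is rescaled to mean 0 and standard deviation 1, so both features contribute comparably to the distance:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative data: ages in tens, incomes in tens of thousands.
df = pd.DataFrame({
    "Age":    [23, 35, 48, 62, 29, 55],
    "Income": [18_000, 42_000, 95_000, 130_000, 27_000, 110_000],
})

scaler = StandardScaler()      # z-score each column: (x - mean) / std
X = scaler.fit_transform(df)   # both features now have mean 0, std 1
print(X.round(2))
```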
Start from K initial centroids (e.g. K randomly chosen points). Then repeat: (1) assign each point to its nearest centroid; (2) move each centroid to the mean of the points assigned to it. Stop when the centroids stop moving.
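To make the two steps concrete, here is a bare-bones NumPy sketch of the loop. It is for illustration only, not scikit-learn's implementation (which adds k-means++ seeding and multiple restarts):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise: k randomly chosen data points as centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # (1) Assignment step: each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (2) Update step: move each centroid to its cluster's mean
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:                   # guard against empty clusters
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):  # converged: centroids stopped moving
            return labels, centroids
        centroids = new_centroids
    return labels, centroids
```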
Plot the within-cluster sum of squared distances (inertia) against K and look for the bend. The elbow is a heuristic, not a proof: it marks where adding clusters stops giving meaningful improvement.
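A sketch of the elbow plot with scikit-learn, using made-up blob data as a stand-in for a real standardised feature matrix:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)  # toy data

ks = range(1, 11)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in ks]   # inertia: within-cluster sum of squared distances

plt.plot(ks, inertias, marker="o")
plt.xlabel("K (number of clusters)")
plt.ylabel("Inertia")
plt.title("Elbow plot: look for the bend")
plt.show()
```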
The full tree (dendrogram) lets you read off any K after the fact by cutting the tree horizontally at a chosen height; see the sketch after the note below.
Note
Ward linkage picks the merge that produces the smallest increase in within-cluster variance — the most common choice in practice. It tends to produce compact, balanced clusters.
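A minimal SciPy sketch, again on made-up blob data: `linkage(..., method="ward")` builds the full merge tree, `dendrogram` draws it, and `fcluster` cuts it horizontally to recover flat labels for any chosen K after the fact:

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=50, centers=3, random_state=0)   # toy data

# Each merge chosen to minimise the growth in within-cluster variance
Z = linkage(X, method="ward")

dendrogram(Z)                    # the full tree: cut horizontally to read off any K
plt.ylabel("Merge distance")
plt.show()

labels = fcluster(Z, t=3, criterion="maxclust")   # cut into K = 3 flat clusters
```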
| Decision | Rule of thumb |
|---|---|
| K-Means | You know K · fast on large N · roughly spherical clusters |
| Hierarchical (Ward) | Small-to-mid N · don’t know K up front · want a tree |
| Standardisation | Always, before any distance is computed |
| Choosing K | Elbow plot + business knowledge |
Full Forbes financial and customer-segmentation case studies appear in Chapter 4 of the book.
This concludes the course. Capstone projects use everything from Chapters 1–4 together.
Prof. Xuhu Wan · HKUST ISOM · Introduction to Business Analytics