Cluster Analysis of a Data Set

Choose a data set that interests you and perform a cluster analysis on it.

The data should have at least two numeric variables that vary substantially across observations.

Perform k-means clustering using R, Python, or an online tool. Some online tools are also great at explaining and visualizing what happens. Try https://datatab.net/statistics-calculator/cluster and https://biit.cs.ut.ee/clustvis/
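If you go the Python route, a minimal sketch might look like the following. It assumes scikit-learn and NumPy are installed, and the data is made up (two synthetic blobs) purely so the example runs on its own; swap in your own two columns.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data (an assumption for illustration): two blobs of 50 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Ask for 2 clusters. n_init=10 reruns the algorithm from 10 random
# starting points and keeps the run with the lowest within-cluster variance.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

print("Cluster centers:")
print(model.cluster_centers_)
print("Points per cluster:", np.bincount(labels))
```

Each row of `X` gets a label (0 or 1 here), and `cluster_centers_` holds the mean of each cluster in the original variable units.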

Start with two dimensions, which are easier to visualize and understand, then add more variables.

Start by asking for 2 clusters, then increase the number and see how the clusters change.
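One way to see what happens as the number of clusters grows is to refit for several values of k and compare a couple of summary numbers. The sketch below assumes scikit-learn and uses made-up data with three blobs. Inertia (the within-cluster sum of squares) and the silhouette score are only two of several possible diagnostics, chosen here for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Made-up data (an assumption for illustration): three blobs of 40 points.
rng = np.random.default_rng(1)
centers = [[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]]
X = np.vstack([rng.normal(loc=c, scale=0.6, size=(40, 2)) for c in centers])

results = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # Inertia: total squared distance from each point to its cluster center.
    # Silhouette: ranges from -1 to 1, higher means better-separated clusters.
    results[k] = (model.inertia_, silhouette_score(X, model.labels_))
    print(f"k={k}  inertia={results[k][0]:.1f}  silhouette={results[k][1]:.3f}")
```

Inertia tends to keep falling as k increases, so it is usually more informative to look for the point where the drop levels off than to pick the smallest value.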

Do the clusters make sense? Can you name the clusters in a meaningful way?
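A simple aid for naming clusters is to look at the average of each variable within each cluster. The sketch below assumes scikit-learn; the variable names `income` and `spending` and the data itself are hypothetical, invented only to make the example self-contained.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical variable names and made-up data, for illustration only.
feature_names = ["income", "spending"]
rng = np.random.default_rng(2)
low = rng.normal(loc=[30.0, 20.0], scale=3.0, size=(60, 2))
high = rng.normal(loc=[80.0, 70.0], scale=3.0, size=(60, 2))
X = np.vstack([low, high])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Profile each cluster by its size and per-variable mean.
profiles = {}
for c in np.unique(labels):
    members = X[labels == c]
    means = members.mean(axis=0)
    profiles[int(c)] = {name: round(float(m), 1) for name, m in zip(feature_names, means)}
    print(f"cluster {c}: n={len(members)}, means={profiles[int(c)]}")
```

If one cluster shows low means on both variables and the other shows high means, names such as "low income, low spending" and "high income, high spending" would be a reasonable starting point. Whether such labels are meaningful depends on your own data.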

Tags: Classification Practice
