Q&A 31 How do you visualize individual observations using a swarm plot?
31.1 Explanation
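A swarm plot displays every individual data point along a categorical axis, shifting points sideways so that they do not overlap. This makes the distribution and clustering of observations within each group directly visible.
- Unlike strip plots, which spread points with random jitter, swarm plots place each point deterministically along the categorical axis so that points avoid overlapping.
- Best suited for small to medium datasets where every point matters; with many observations the layout becomes crowded and slow to compute.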
- Commonly used to complement boxplots or violin plots.
Swarm plots help you:
- Visualize the spread of values within each group
- Detect patterns, outliers, or group separation
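To make the overlap-avoiding layout described above more concrete, here is a minimal pure-Python sketch of one way such a layout could work. It is a simplified illustration, not seaborn's actual implementation; the function name `swarm_offsets` and the `diameter` parameter are hypothetical. Each point is treated as a circle and is pushed sideways only as far as needed to clear the points already placed near it.

```python
import math


def swarm_offsets(values, diameter):
    """Return one sideways offset per value so that no two circles overlap.

    Simplified illustration only; not seaborn's actual algorithm.
    """
    # Place points in order of their value so neighbors are handled together
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []  # (x, y) centers of the points positioned so far
    offsets = [0.0] * len(values)
    for i in order:
        y = values[i]
        # Only points closer than one diameter in y can collide with this one
        neighbors = [(px, py) for px, py in placed if abs(y - py) < diameter]
        # Candidates: the center line, plus spots that just touch each neighbor
        candidates = [0.0]
        for px, py in neighbors:
            dx = math.sqrt(diameter ** 2 - (y - py) ** 2)
            candidates += [px - dx, px + dx]
        # Use the candidate nearest the center line that clears every neighbor
        for x in sorted(candidates, key=abs):
            if all(math.hypot(x - px, y - py) >= diameter - 1e-9
                   for px, py in neighbors):
                offsets[i] = x
                placed.append((x, y))
                break
    return offsets


# Three tied values, one near-tie, and one isolated value (made-up numbers)
values = [5.0, 5.0, 5.0, 5.1, 6.0]
offsets = swarm_offsets(values, diameter=0.2)
```

Tied or nearly tied values fan out to either side of the center line, while an isolated value stays on it. A strip plot would instead draw every offset at random, which is why its points can land on top of each other.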
31.2 Python Code
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load iris dataset
df = pd.read_csv("data/iris.csv")
df["species"] = df["species"].astype("category")  # Ensure species is categorical
# Swarm plot
plt.figure(figsize=(6, 4))
sns.swarmplot(data=df, x="species", y="sepal_length", hue="species",
              palette="Set2", dodge=False, legend=False)
# Customize plot
plt.title("Swarm Plot of Sepal Length by Species")
plt.xlabel("Species")
plt.ylabel("Sepal Length")
plt.tight_layout()
plt.show()
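Because swarm plots are often used to complement boxplots, the following sketch shows one way to layer the two on the same axes. It assumes seaborn and matplotlib are installed. The helper name `box_with_swarm` and the synthetic `demo` data are hypothetical stand-ins, not the iris measurements.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


def box_with_swarm(df, x, y):
    """Draw a boxplot with the raw observations swarmed on top; return the Axes."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # Hide boxplot fliers: the swarm already shows every point, outliers included
    sns.boxplot(data=df, x=x, y=y, color="lightgray", showfliers=False, ax=ax)
    sns.swarmplot(data=df, x=x, y=y, color="black", size=3, ax=ax)
    ax.set_title("Boxplot with Swarm Overlay")
    fig.tight_layout()
    return ax


# Synthetic stand-in data (hypothetical values chosen for illustration)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 30),
    "value": np.concatenate([rng.normal(loc, 0.4, 30) for loc in (5.0, 5.9, 6.6)]),
})
ax = box_with_swarm(demo, x="group", y="value")
```

Call `plt.show()` afterwards to display the figure. With the iris data loaded as `df`, the same helper could be called as `box_with_swarm(df, x="species", y="sepal_length")`. The box summarizes the quartiles, and the swarm shows how the individual observations are spread around them.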
31.3 R Code
library(ggplot2)
library(readr)
# Load iris dataset
df <- read_csv("data/iris.csv")
df$species <- as.factor(df$species)
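# Swarm-like plot using jitter: points are offset randomly, so some may still overlap
# (the ggbeeswarm package's geom_beeswarm() provides a true non-overlapping layout)
ggplot(df, aes(x = species, y = sepal_length, color = species)) +
  geom_jitter(width = 0.2, height = 0, size = 2, alpha = 0.8) +  # height = 0 keeps y values exact
  labs(title = "Swarm-like Plot of Sepal Length by Species",
       x = "Species", y = "Sepal Length") +
  theme_minimal()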