Q&A 24 How do you use a histogram to visualize numerical distributions?

24.1 Explanation

A histogram divides numerical data into intervals (bins) and shows how many observations fall into each bin. It helps reveal the shape of the distribution, such as whether it is:

  • Symmetric, skewed, or bimodal
  • Uniform, peaked, or flat

Histograms are useful for detecting:

  • Data spread and central tendency
  • Potential outliers or gaps
  • Whether transformation might be needed
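
To make the binning step concrete, here is a minimal sketch of the counting a histogram does before anything is drawn. It uses NumPy's `np.histogram`; the sample values are made up for illustration and are not taken from the iris dataset.

```python
import numpy as np

# Hypothetical measurements, chosen only to illustrate binning
values = np.array([4.9, 5.0, 5.1, 5.4, 5.8, 6.0, 6.1, 6.4, 6.9, 7.7])

# Split the range from min to max into 4 equal-width bins
# and count how many observations fall into each one
counts, edges = np.histogram(values, bins=4)

# Each bin is described by its left and right edge;
# NumPy includes the right edge only in the last bin
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.2f} to {right:.2f}: {n} observations")
```

The bar heights in a histogram are these counts, and the bar widths are the gaps between consecutive edges. Every observation lands in exactly one bin, so the counts always add up to the number of observations.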

24.2 Python Code


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv("data/iris.csv")

# Plot histogram for sepal length
plt.figure(figsize=(6, 4))
sns.histplot(df["sepal_length"], bins=10, kde=False, color="steelblue")
plt.title("Histogram of Sepal Length")
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Frequency")
plt.tight_layout()
plt.show()
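
The `bins=10` setting above is a choice rather than a rule, and the apparent shape of a distribution can change with it. The sketch below is a rough illustration under stated assumptions: the data are synthetic (a hypothetical bimodal sample generated with a fixed seed, not the iris measurements), and the candidate bin counts are arbitrary. It prints the counts for several bin settings and asks NumPy for its own suggestion via `np.histogram_bin_edges`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical bimodal sample: two groups centered near 5.0 and 6.5
sample = np.concatenate([
    rng.normal(5.0, 0.3, 200),
    rng.normal(6.5, 0.3, 200),
])

# Compare how the counts look with few, moderate, and many bins
for bins in (3, 10, 30):
    counts, _ = np.histogram(sample, bins=bins)
    print(f"bins={bins}: {counts.tolist()}")

# Let NumPy pick bin edges with its built-in "auto" rule
auto_edges = np.histogram_bin_edges(sample, bins="auto")
print("bins chosen by 'auto':", len(auto_edges) - 1)
```

With too few bins, two separate groups may merge into one bar; with too many, the counts become noisy. Trying a few settings before settling on one is a reasonable habit.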

24.3 R Code

library(ggplot2)
library(readr)

# Load dataset
df <- read_csv("data/iris.csv", show_col_types = FALSE)

# Plot histogram
ggplot(df, aes(x = sepal_length)) +
  geom_histogram(bins = 10, fill = "steelblue", color = "white") +
  labs(title = "Histogram of Sepal Length",
       x = "Sepal Length (cm)", y = "Frequency") +
  theme_minimal()