Q&A 12 How do you sort rows based on a variable in Python and R?

12.1 Explanation

Sorting reorders the rows of a dataset by the values in one or more columns, whether those values are numeric, text, or another type. It’s useful for:

  • Finding top/bottom performers
  • Preparing plots or tables
  • Detecting outliers and patterns

In this example, we’ll sort the Iris dataset by petal_length in descending order, so the rows with the longest petals appear first.


12.2 Python Code

import pandas as pd

# Load dataset
df = pd.read_csv("data/iris.csv")

# Sort by petal_length (descending)
sorted_df = df.sort_values(by="petal_length", ascending=False)

# View result
print(sorted_df.head())
     sepal_length  sepal_width  petal_length  petal_width    species
118           7.7          2.6           6.9          2.3  virginica
122           7.7          2.8           6.7          2.0  virginica
117           7.7          3.8           6.7          2.2  virginica
105           7.6          3.0           6.6          2.1  virginica
131           7.9          3.8           6.4          2.0  virginica
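
In the output above, rows 122 and 117 are tied at petal_length 6.7. The default sorting algorithm in sort_values (quicksort) does not guarantee the relative order of tied rows, which is likely why the R output in 12.3 lists the same two rows in the opposite order. A minimal sketch, assuming you want tied rows kept in their original order: pass kind="stable". The sample frame is a hypothetical stand-in built from the five rows printed above, so the snippet runs without data/iris.csv.

```python
import pandas as pd

# Stand-in data: the five rows shown in the output above, in original index order
sample = pd.DataFrame(
    {
        "sepal_length": [7.6, 7.7, 7.7, 7.7, 7.9],
        "sepal_width": [3.0, 3.8, 2.6, 2.8, 3.8],
        "petal_length": [6.6, 6.7, 6.9, 6.7, 6.4],
        "petal_width": [2.1, 2.2, 2.3, 2.0, 2.0],
        "species": ["virginica"] * 5,
    },
    index=[105, 117, 118, 122, 131],
)

# kind="stable" keeps tied rows (117 and 122) in their original relative order
stable_df = sample.sort_values(by="petal_length", ascending=False, kind="stable")

print(stable_df.index.tolist())
```

With the stable algorithm, row 117 stays ahead of row 122, which is the order shown in the R output.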

12.3 R Code

library(readr)
library(dplyr)

# Load dataset
df <- read_csv("data/iris.csv")

# Sort by petal_length (descending)
sorted_df <- df %>%
  arrange(desc(petal_length))

# View result
head(sorted_df)
# A tibble: 6 × 5
  sepal_length sepal_width petal_length petal_width species  
         <dbl>       <dbl>        <dbl>       <dbl> <chr>    
1          7.7         2.6          6.9         2.3 virginica
2          7.7         3.8          6.7         2.2 virginica
3          7.7         2.8          6.7         2   virginica
4          7.6         3            6.6         2.1 virginica
5          7.9         3.8          6.4         2   virginica
6          7.3         2.9          6.3         1.8 virginica

✅ Sorting helps highlight extremes and trends. You can also sort by multiple columns, or filter the rows first and then sort the result, for more advanced workflows.
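
As a rough sketch of those two extensions in pandas: the by and ascending arguments of sort_values accept lists, and a boolean filter can be applied before sorting. The sample frame reuses the five rows from the Python output in 12.2 as a stand-in for the full dataset; the column choices and the 2.9 threshold are arbitrary and for illustration only.

```python
import pandas as pd

# Stand-in data: five rows taken from the Python output in 12.2
sample = pd.DataFrame(
    {
        "sepal_length": [7.6, 7.7, 7.7, 7.7, 7.9],
        "sepal_width": [3.0, 3.8, 2.6, 2.8, 3.8],
        "petal_length": [6.6, 6.7, 6.9, 6.7, 6.4],
        "petal_width": [2.1, 2.2, 2.3, 2.0, 2.0],
        "species": ["virginica"] * 5,
    },
    index=[105, 117, 118, 122, 131],
)

# Multiple columns: sepal_length descending, then petal_width ascending
# to order the rows that share the same sepal_length
multi_sorted = sample.sort_values(
    by=["sepal_length", "petal_width"], ascending=[False, True]
)

# Filter first, then sort: keep rows with sepal_width above 2.9,
# and order the remaining rows by petal_length (descending)
filtered_sorted = sample[sample["sepal_width"] > 2.9].sort_values(
    by="petal_length", ascending=False
)

print(multi_sorted.index.tolist())
print(filtered_sorted.index.tolist())
```

Passing one True/False per column in ascending lets each sort key run in its own direction.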