Q&A 6 How do you load a pre-cleaned dataset in Python and R?
6.1 Explanation
Loading a dataset is one of the first steps in any data analysis project — especially when you’re working from a previously saved, cleaned version.
In this guide, we assume that the dataset (*.csv) has been saved in your data/ folder. We’ll now load it using:
- Python: the pandas library and its read_csv() function
- R: the readr package and its read_csv() function
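Before loading, it can help to confirm that the file is actually where the code expects it. The following is a minimal sketch using the standard-library pathlib module. The helper name `list_csv_files` is hypothetical and not part of this guide, and the temporary directory only stands in for your data/ folder so that the example runs anywhere.

```python
from pathlib import Path
import tempfile


def list_csv_files(folder):
    """Return the sorted names of all .csv files directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Data folder not found: {folder}")
    return sorted(p.name for p in folder.glob("*.csv"))


# Demo on a throwaway folder; in a real project you would call
# list_csv_files("data") instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "data"
    demo_dir.mkdir()
    (demo_dir / "iris_seaborn.csv").write_text("sepal_length,species\n5.1,setosa\n")
    (demo_dir / "notes.txt").write_text("not a dataset\n")
    found = list_csv_files(demo_dir)
    print(found)  # ['iris_seaborn.csv']
```

If the folder is missing, the helper raises FileNotFoundError with the path it looked for, which is usually easier to diagnose than an error raised deep inside a loading call.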
6.3 Python Code
import pandas as pd
# Load the pre-cleaned iris dataset
df = pd.read_csv("data/iris_seaborn.csv")
# Preview the data
print(df.head())
# Confirm the shape
print("Rows and columns:", df.shape)
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa
Rows and columns: (150, 5)
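The preview and shape above are checked by eye. The same checks can be written as code so that a changed or truncated file fails loudly. This is a sketch, assuming the five column names shown in the output above. The helper `check_columns` is hypothetical, and a three-row inline sample stands in for data/iris_seaborn.csv so that the block runs without the file.

```python
import io

import pandas as pd

# Column names taken from the preview above (an assumption about your file).
EXPECTED_COLUMNS = [
    "sepal_length", "sepal_width", "petal_length", "petal_width", "species",
]


def check_columns(df, expected):
    """Return df unchanged, or raise ValueError if an expected column is missing."""
    missing = [col for col in expected if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df


# Inline stand-in for the real file; replace with pd.read_csv("data/iris_seaborn.csv").
sample_csv = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "4.7,3.2,1.3,0.2,setosa\n"
)
sample = check_columns(pd.read_csv(sample_csv), EXPECTED_COLUMNS)

# Count missing values across the whole table and show the inferred column types.
n_missing = int(sample.isna().sum().sum())
print(sample.dtypes)
print("Missing values:", n_missing)
```

On the full dataset you could additionally compare `df.shape` against the expected `(150, 5)` reported above.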
6.4 R Code
library(readr)
# Load the pre-cleaned iris dataset
df <- read_csv("data/iris_rbase.csv")
# Preview the data
head(df)
# Confirm the shape
cat("Rows and columns:", dim(df), "\n")
# A tibble: 6 × 5
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
         <dbl>       <dbl>        <dbl>       <dbl> <chr>
1          5.1         3.5          1.4         0.2 setosa
2          4.9         3            1.4         0.2 setosa
3          4.7         3.2          1.3         0.2 setosa
4          4.6         3.1          1.5         0.2 setosa
5          5           3.6          1.4         0.2 setosa
6          5.4         3.9          1.7         0.4 setosa
Rows and columns: 150 5
✅ Once loaded, you’re ready to continue with data wrangling, visualization, or modeling — all based on your clean, reusable dataset.