Q&A 15 How do you export a cleaned dataset in Python and R?

15.1 Explanation

After cleaning and transforming your data (renaming columns, removing duplicates, creating new variables), it’s good practice to save the final version to a file.

This ensures:

  • You don’t have to redo your work
  • Others can use the clean data
  • You can start fresh from a reliable version for future steps (like visualization or modeling)

In this example, we’ll export our cleaned Iris dataset to a CSV file called iris_cleaned.csv in the data/ folder.


15.2 Python Code

import pandas as pd

# Load the raw dataset, then drop exact duplicate rows
df = pd.read_csv("data/iris.csv")
df_cleaned = df.drop_duplicates()

# Export to a new CSV file; index=False keeps the row index out of the file
df_cleaned.to_csv("data/iris_cleaned.csv", index=False)
print("Cleaned dataset saved as data/iris_cleaned.csv")
Cleaned dataset saved as data/iris_cleaned.csv
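
A quick way to gain confidence in an export is to read the file back and compare it with the data frame you wrote. The sketch below is one possible approach, not part of the original example: the helper name export_cleaned and the three-row sample frame are hypothetical stand-ins, and a temporary directory is used so the snippet runs without data/iris.csv. It also creates the output folder if it does not exist yet, which to_csv does not do by default.

```python
from pathlib import Path
import tempfile

import pandas as pd


def export_cleaned(df: pd.DataFrame, path: Path) -> pd.DataFrame:
    """Drop duplicate rows, write the result to CSV, and return it.

    Hypothetical helper for illustration only.
    """
    path.parent.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    cleaned = df.drop_duplicates().reset_index(drop=True)
    cleaned.to_csv(path, index=False)
    return cleaned


# Small stand-in for the Iris data (assumed values; rows 0 and 2 are duplicates)
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 5.1],
    "species": ["setosa", "setosa", "setosa"],
})

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "iris_cleaned.csv"
    cleaned = export_cleaned(sample, target)

    # Read the file back and confirm it matches what was written
    reloaded = pd.read_csv(target)
    pd.testing.assert_frame_equal(cleaned, reloaded)
    print(f"Round trip OK: {len(reloaded)} rows, {len(reloaded.columns)} columns")
```

In a real project you would pass Path("data/iris_cleaned.csv") instead of the temporary path. assert_frame_equal is used rather than a strict equality check because it tolerates tiny floating-point differences that can appear when numbers are parsed back from text.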

15.3 R Code

library(readr)
library(dplyr)

# Load the raw dataset, then drop exact duplicate rows
df <- read_csv("data/iris.csv")
df_cleaned <- df %>%
  distinct()

# Export to a new CSV file (write_csv does not write row names)
write_csv(df_cleaned, "data/iris_cleaned.csv")
cat("Cleaned dataset saved as data/iris_cleaned.csv\n")
Cleaned dataset saved as data/iris_cleaned.csv

✅ With your cleaned dataset saved, you’re ready to begin visualizing, modeling, or sharing your data with confidence.