Q&A 15 How do you export a cleaned dataset in Python and R?
15.1 Explanation
After cleaning and transforming your data (renaming columns, removing duplicates, creating new variables), it’s good practice to save the final version.
This ensures:
- You don’t have to redo your work
- Others can use the clean data
- You can start fresh from a reliable version for future steps (like visualization or modeling)
In this example, we’ll export our cleaned Iris dataset to a CSV file called iris_cleaned.csv in the data/ folder.
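The export will fail if the data/ folder does not exist yet. Here is a minimal sketch of one way to guard against that, assuming pandas is installed. The helper name export_csv and the two-row demo frame are hypothetical, introduced only for illustration, and a temporary directory stands in for a real project folder.

```python
from pathlib import Path
import tempfile

import pandas as pd


def export_csv(df: pd.DataFrame, path: Path) -> Path:
    """Write df to path as CSV, creating parent folders if needed.

    Hypothetical helper for illustration; not part of pandas.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return path


# Tiny stand-in for the cleaned Iris data (illustrative values only)
demo = pd.DataFrame(
    {"sepal_length": [5.1, 4.9], "species": ["setosa", "setosa"]}
)

# A temporary directory keeps the demo away from any real project files
with tempfile.TemporaryDirectory() as tmp:
    out_path = export_csv(demo, Path(tmp) / "data" / "iris_cleaned.csv")
    file_exists = out_path.exists()
```

In a real project you would pass a path such as Path("data") / "iris_cleaned.csv" instead of the temporary one.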
15.2 Python Code
import pandas as pd
# Load and optionally clean dataset
df = pd.read_csv("data/iris.csv")
df_cleaned = df.drop_duplicates()
# Export to a new CSV file; index=False leaves out the row index column
df_cleaned.to_csv("data/iris_cleaned.csv", index=False)
print("Cleaned dataset saved as data/iris_cleaned.csv")
Cleaned dataset saved as data/iris_cleaned.csv
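A quick way to confirm that an export like the one above worked is to read the file back and compare it with the data frame in memory. The sketch below assumes pandas is installed and uses a small made-up frame with one duplicated row instead of the real Iris file. The variable names are illustrative.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Small stand-in with one duplicated row (illustrative values, not real Iris data)
raw = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 5.1],
        "species": ["setosa", "setosa", "setosa"],
    }
)

# Drop the duplicate and renumber the rows so the index matches a fresh read
cleaned = raw.drop_duplicates().reset_index(drop=True)

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "iris_cleaned.csv"
    cleaned.to_csv(out_path, index=False)
    reloaded = pd.read_csv(out_path)

# Raises AssertionError if columns, dtypes, or values differ
pd.testing.assert_frame_equal(cleaned, reloaded)
rows_written = len(reloaded)
```

The reset_index(drop=True) step matters for the comparison: drop_duplicates keeps the original row labels, while a file written with index=False comes back with a fresh 0-based index.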
15.3 R Code
library(readr)
library(dplyr)
# Load and optionally clean dataset
df <- read_csv("data/iris.csv")
df_cleaned <- df %>%
  distinct()
# Export to new CSV file
write_csv(df_cleaned, "data/iris_cleaned.csv")
cat("Cleaned dataset saved as data/iris_cleaned.csv\n")
Cleaned dataset saved as data/iris_cleaned.csv
✅ With your cleaned dataset saved, you’re ready to begin visualizing, modeling, or sharing your data with confidence.