Q&A 19 How do you create a simple dataset to test variable type conversion?
19.1 Explanation
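You can create a small dataset by hand to simulate variables that will end up as different types, such as character, integer, boolean, date, and categorical. In the examples below every column starts out as a string, mimicking a raw import, so each one can later be converted to its intended type.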
This is useful when:
- Practicing how to convert between types (e.g., character → factor or object → datetime)
- Testing how functions behave with different variable classes
- Debugging type-specific behavior before applying to large datasets
Working with a controlled sample helps you understand how tools like pandas (Python) or the tidyverse (R) handle different types by default.
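As a minimal sketch of that default handling, the snippet below parses a few CSV rows held in memory with `pd.read_csv`. The inline text and the names `raw_csv`, `inferred`, and `as_text` are illustrative assumptions, not part of the original example. Under pandas defaults, `age` should be inferred as an integer column, while `member` and `joined` remain text until they are converted explicitly.

```python
import io

import pandas as pd

# Hypothetical raw CSV text that mirrors a small hand-made sample
raw_csv = """name,age,member,joined
Alice,30,yes,2022-01-01
Bob,25,no,2021-07-15
Carol,28,yes,2023-03-20
"""

# read_csv infers a dtype per column; dates are not parsed unless requested
inferred = pd.read_csv(io.StringIO(raw_csv))
print(inferred.dtypes)

# Forcing every column to string reproduces an "untyped" raw import
as_text = pd.read_csv(io.StringIO(raw_csv), dtype=str)
print(as_text.dtypes)
```

Comparing the two printouts is a quick way to see which conversions the reader performs automatically and which ones are left to you.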
19.2 Python Code
# ✅ Import modern Python libraries
from pathlib import Path
import pandas as pd
# Create a test dataset with various types
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],                      # string / object
    "age": ["30", "25", "28"],                              # string (simulate raw import)
    "member": ["yes", "no", "yes"],                         # string (convert to bool/cat later)
    "joined": ["2022-01-01", "2021-07-15", "2023-03-20"],   # string (convert to datetime)
})
# Make sure the output folder exists, then save to CSV for testing workflows
Path("data").mkdir(exist_ok=True)
df.to_csv("data/test_conversion.csv", index=False)
# Preview structure
print(df.dtypes)
print("\n", df.head())
name object
age object
member object
joined object
dtype: object
name age member joined
0 Alice 30 yes 2022-01-01
1 Bob 25 no 2021-07-15
2 Carol 28 yes 2023-03-20
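The comments in the code above mention converting these columns later. One possible way to do that is sketched here; the name `converted`, the extra `member_cat` column, and the yes/no mapping are assumptions made for illustration, not the only reasonable choices.

```python
import pandas as pd

# Same sample as above: every column starts as a string
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": ["30", "25", "28"],
    "member": ["yes", "no", "yes"],
    "joined": ["2022-01-01", "2021-07-15", "2023-03-20"],
})

# Convert each column to a more specific type (assign returns a new DataFrame)
converted = df.assign(
    age=pd.to_numeric(df["age"]),                            # string -> integer
    member=df["member"].map({"yes": True, "no": False}),     # string -> boolean
    joined=pd.to_datetime(df["joined"], format="%Y-%m-%d"),  # string -> datetime
)

# Categorical is an alternative target for a column with few distinct values
converted["member_cat"] = df["member"].astype("category")

print(converted.dtypes)
```

The printed dtypes should now report numeric, boolean, datetime, and categorical columns in place of the original strings, while `df` itself is left unchanged for further experiments.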
19.3 R Code
# ✅ Load modern R packages
library(tidyverse)
# Create a test dataset
df <- tibble(
  name   = c("Alice", "Bob", "Carol"),                   # character
  age    = c("30", "25", "28"),                          # character (simulate untyped input)
  member = c("yes", "no", "yes"),                        # character (can convert to logical)
  joined = c("2022-01-01", "2021-07-15", "2023-03-20")   # character (convert to Date)
)
# Make sure the output folder exists, then save to CSV for downstream testing
dir.create("data", showWarnings = FALSE)
write_csv(df, "data/test_conversion.csv")
# Preview structure
glimpse(df)
Rows: 3
Columns: 4
$ name <chr> "Alice", "Bob", "Carol"
$ age <chr> "30", "25", "28"
$ member <chr> "yes", "no", "yes"
$ joined <chr> "2022-01-01", "2021-07-15", "2023-03-20"