Q&A 16 What are common data types in Python and R?
16.1 Explanation
Understanding basic data types is crucial when working with data. Both Python and R offer a set of fundamental types to represent different kinds of values, which affect how data is stored, displayed, and processed.
Common Data Types:
Concept | Python (pandas/base) | R (base) | Notes |
---|---|---|---|
Integer | int | integer | Use astype(int) in pandas, as.integer() in R |
Decimal Number | float | numeric, double | numeric is usually stored as double in R |
Text / String | str, object (pandas) | character | Use astype(str) in pandas, as.character() in R |
Logical / Boolean | bool | logical | True/False in Python, TRUE/FALSE in R |
Date / Time | datetime64[ns] | Date, POSIXct | Use pd.to_datetime() in pandas, as.Date() or as.POSIXct() in R |
Category | category (pandas) | factor | Good for grouping and modeling |
Missing Values | NaN (numpy) | NA | Use pd.isna() in Python, is.na() in R |
Complex Numbers | complex | complex | Less common in typical data science |
List | list | list | R lists can contain different types, like Python lists |
Dictionary | dict | named list, list() | R lists with names can mimic Python dictionaries |
Tuple | tuple | c(), list() | No direct equivalent; use c() for vectors or list() for mixed types |
Knowing these types helps with data cleaning, conversion, and model preparation.
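As a quick check of the base Python column in the table, the sketch below asks Python for the type of one value per concept. The sample values and the names samples and type_names are illustrative assumptions, not part of the original material.

```python
# One illustrative value per concept from the table (values are made up).
samples = {
    "integer": 42,
    "decimal": 3.14,
    "text": "Alice",
    "boolean": True,
    "complex": 2 + 3j,
    "list": [1, "two", 3.0],                     # mixed types are allowed
    "dictionary": {"name": "Alice", "age": 30},
    "tuple": (1, "two"),
}

# Map each concept to the name of its built-in Python type.
type_names = {concept: type(value).__name__ for concept, value in samples.items()}

for concept, type_name in type_names.items():
    print(f"{concept}: {type_name}")
```

The same idea carries over to R, where class() or typeof() reports the type of a value.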
16.2 Python Code
import pandas as pd
# Create a simple DataFrame to examine data types
df = pd.DataFrame({
    "name": ["Alice", "Bob"],  # object (string)
    "age": [30, 25],  # int64
    "joined": pd.to_datetime(["2022-01-01", "2021-07-15"]),  # datetime64[ns]
})
# Display data types for each column
print(df.dtypes)
name              object
age                int64
joined    datetime64[ns]
dtype: object
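The conversion helpers named in the table (pd.to_datetime(), the category dtype, pd.isna()) can be combined when data arrives as plain text. The following is a minimal sketch under assumed inputs: the raw DataFrame, its column names, and its values are hypothetical. It also uses pd.to_numeric() rather than astype(int), because a column with a missing value cannot be cast directly to a plain integer type.

```python
import pandas as pd

# Hypothetical raw data: every column arrives as text, with one missing age.
raw = pd.DataFrame({
    "age": ["30", "25", None],
    "group": ["a", "b", "a"],
    "joined": ["2022-01-01", "2021-07-15", "2023-03-10"],
})

clean = pd.DataFrame({
    # to_numeric maps the missing entry to NaN, so the result is a float column
    "age": pd.to_numeric(raw["age"]),
    # category is the pandas counterpart of an R factor
    "group": raw["group"].astype("category"),
    # parse ISO date strings into a datetime column
    "joined": pd.to_datetime(raw["joined"]),
})

# Count missing values with isna(), the pandas analogue of is.na() in R
missing_ages = int(clean["age"].isna().sum())

print(clean.dtypes)
print("missing ages:", missing_ages)
```

Converting types explicitly like this, before any grouping or modeling, makes it easier to spot columns that were read in as text by accident.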
16.3 R Code
# Create a simple data frame to examine R data types
df <- data.frame(
  name = c("Alice", "Bob"),                        # character
  age = c(30, 25),                                 # numeric (stored as double)
  joined = as.Date(c("2022-01-01", "2021-07-15"))  # Date
)
# Print structure of the data frame
str(df)
'data.frame':	2 obs. of  3 variables:
 $ name  : chr  "Alice" "Bob"
 $ age   : num  30 25
 $ joined: Date, format: "2022-01-01" "2021-07-15"