• Data Science Foundations
  • Welcome
  • I GETTING STARTED
  • Setting Up Your Analysis Environment
    • Explanation
    • Who This Guide Is For
    • Install Python
    • Install R
    • Install RStudio
    • Install Visual Studio Code (VSCode)
      • Installing Extensions in VSCode
      • Verify Installation
  • How to Navigate This Guide
    • Best Practices for Using Python & R Side by Side
      • ✅ Run Each Language Independently
      • ✅ Modify and Experiment
      • ✅ Compare Results
      • ✅ Use the Same Dataset
    • What’s Next?
  • 1 How do you create a project directory ready for analysis?
    • 1.1 Explanation
    • 1.2 Bash (Terminal)
    • 1.3 Python Code
    • 1.4 R Code
  • 2 How do you install basic tools and libraries for Python and R?
    • 2.1 Explanation
    • 2.2 Python Code
    • 2.3 R Code
  • 3 What are common sources of datasets for Python and R?
    • 3.1 Explanation
  • 4 What are common sources of datasets for Python and R?
    • 4.1 Explanation
    • 4.2 Built-in or Package-Based Datasets
    • 4.3 ✅ Python
    • 4.4 R
    • 4.5 Online Public Data Sources
    • 4.6 Python Code
  • 5 How do you save a dataset in Python and R?
    • 5.1 Explanation
    • 5.2 Python Code
    • 5.3 R Code
  • 6 How do you load a pre-cleaned dataset in Python and R?
    • 6.1 Explanation
    • 6.2 Using consistent file paths (like data/*.csv) ensures reproducibility across environments.
    • 6.3 Python Code
    • 6.4 R Code
  • 7 How do you rename column names in Python and R?
    • 7.1 Explanation
    • 7.2 Python Code
    • 7.3 R Code
  • 8 How do you examine the structure and types of variables in Python and R?
    • 8.1 Explanation
      • 8.1.1 ✅ Common Data Types in Python and R
    • 8.2 Python Code
    • 8.3 R Code
  • 9 How do you check for missing values in Python and R?
    • 9.1 Explanation
    • 9.2 Python Code
    • 9.3 R Code
  • 10 How do you get summary statistics for numeric variables in Python and R?
    • 10.1 Explanation
    • 10.2 Python Code
    • 10.3 R Code
  • 11 How do you filter rows based on a condition in Python and R?
    • 11.1 Explanation
    • 11.2 Python Code
    • 11.3 R Code
  • 12 How do you sort rows based on a variable in Python and R?
    • 12.1 Explanation
    • 12.2 Python Code
    • 12.3 R Code
  • 13 How do you create a new variable in Python and R?
    • 13.1 Explanation
    • 13.2 Python Code
    • 13.3 R Code
  • 14 How do you detect and remove duplicate rows in Python and R?
    • 14.1 Explanation
    • 14.2 Python Code
    • 14.3 R Code
  • 15 How do you export a cleaned dataset in Python and R?
    • 15.1 Explanation
    • 15.2 Python Code
    • 15.3 R Code
  • General Data Science EDA Summary
    • 🧱 What You’ve Accomplished
    • 📈 Up Next: Data Visualization (VIZ)
      • 📚 Your CDI Learning Path
    • 🚀 Continue Learning
  • II DATA VISUALIZATION
  • 16 What are common data types in Python and R?
    • 16.1 Explanation
    • 16.2 Python Code
    • 16.3 R Code
  • 17 What is the difference between categorical and numerical variables?
    • 17.1 Explanation
      • 17.1.1 🔷 Categorical Variables
      • 17.1.2 🔶 Numerical Variables
  • 18 How do you inspect variable types in a dataset?
    • 18.1 Explanation
    • 18.2 Python Code
    • 18.3 R Code
  • 19 How do you create a simple dataset to test variable type conversion?
    • 19.1 Explanation
    • 19.2 Python Code
    • 19.3 R Code
  • 20 How do you convert variable types in a dataset?
    • 20.1 Explanation
    • 20.2 Python Code
    • 20.3 R Code
  • 21 How do you summarize numerical and categorical variables?
    • 21.1 Explanation
    • 21.2 Python Code
    • 21.3 R Code
  • 22 How do you visualize the frequency of categorical variables?
    • 22.1 Explanation
    • 22.2 Python Code
    • 22.3 R Code
  • 23 How do you visualize distributions of numerical variables?
    • 23.1 Explanation
  • 24 How do you use a histogram to visualize numerical distributions?
    • 24.1 Explanation
    • 24.2 Python Code
    • 24.3 R Code
  • 25 How do you visualize smooth distributions using density plots?
    • 25.1 Explanation
    • 25.2 Python Code
    • 25.3 R Code
  • 26 What are the best plots to compare groups across a categorical variable?
    • 26.1 Explanation
  • 27 How do you use boxplots to compare groups across a categorical variable?
    • 27.1 Explanation
    • 27.2 Python Code
    • 27.3 R Code
  • 28 How do you use violin plots to compare groups across a categorical variable?
    • 28.1 Explanation
    • 28.2 Python Code
    • 28.3 R Code
  • 29 How do you use ridge plots to compare distributions across a categorical variable?
    • 29.1 Explanation
    • 29.2 Python Code
    • 29.3 R Code
  • 30 How do you visualize individual data points by group using a strip plot?
    • 30.1 Explanation
    • 30.2 Python Code
    • 30.3 R Code
  • 31 How do you visualize individual observations using a swarm plot?
    • 31.1 Explanation
    • 31.2 Python Code
    • 31.3 R Code
  • 32 How do you visualize group summaries using a dot plot?
    • 32.1 Explanation
    • 32.2 Python Code
    • 32.3 R Code
  • 33 How do you visualize group summaries using bar plots with error bars?
    • 33.1 Explanation
    • 33.2 Python Code
    • 33.3 R Code
  • 34 How do you visualize relationships between two numerical variables using a scatter plot?
    • 34.1 Explanation
    • 34.2 Python Code
    • 34.3 R Code
  • 35 How do you visualize trends across an ordered variable using a line plot?
    • 35.1 Explanation
    • 35.2 Python Code
    • 35.3 R Code
  • III PATTERN RECOGNITION AND RELATIONSHIPS
  • 36 How do you visualize patterns and relationships in multivariate data?
    • 36.1 Explanation
  • 37 How do you visualize all pairwise relationships using a pair plot?
    • 37.1 Explanation
    • 37.2 Python Code
    • 37.3 R Code
  • 38 How do you visualize correlation between numerical variables?
    • 38.1 Explanation
    • 38.2 Python Code
    • 38.3 R Code
  • 39 How do you visualize the relationship between two numerical variables using a scatter plot?
    • 39.1 Explanation
    • 39.2 Python Code
    • 39.3 R Code
  • 40 How do you enhance scatter plots by adding group color and trend lines?
    • 40.1 Explanation
    • 40.2 Python Code
    • 40.3 R Code
  • IV SPECIALTY VISUALS
  • 41 How do you visualize simple proportions using a pie chart?
    • 41.1 Explanation
    • 41.2 Python Code
    • 41.3 R Code
  • 42 How do you create a donut chart to show part-to-whole proportions?
    • 42.1 Explanation
    • 42.2 Python Code
    • 42.3 R Code
  • 43 How do you visualize hierarchical part-to-whole relationships using a treemap?
    • 43.1 Explanation
    • 43.2 Python Code
    • 43.3 Note
  • 44 How do you create a static treemap in Python?
    • 44.1 Explanation
    • 44.2 Python Code
    • 44.3 R Code
  • 45 How do you visualize overlaps using a Venn diagram?
    • 45.1 Explanation
    • 45.2 Python Code
    • 45.3 R Code
  • How do you choose the right visualization for your data?
    • Explanation
  • What domain-specific visualizations should I learn?
    • Explanation
  • V REFERENCES
  • Core References for Further Learning
    • 📘 General & Framework
    • 🐍 Python Tools
    • 🅡 R Ecosystem
    • 📊 Statistical Learning
  • Full Linked References
  • Explore More Guides

General Data Science – Free Edition


Last updated: June 10, 2025