Welcome to Chapter 15! In the previous chapter, 9-Real-World, we finished our journey through the core Machine Learning curriculum. We learned how to build, deploy, and use models in the real world using Python.
But Python isn't the only language spoken in the land of Data Science.
Just like in the human world, where you might say "Hello" in English or "Hola" in Spanish, in Data Science, you can write code in Python or R.
This brings us to the folder solution/R/.
Imagine you are a chef.
The solution/R/ directory contains the exact same lessons you just learned, but translated into the R Programming Language.
Why does this exist? While Python is great for general programming (building apps, websites, and scripts), R was invented specifically by statisticians for statistics. If you go into academic research or specialized data analysis, you will see R everywhere.
If you open this folder, you will see files that look slightly different from what we used before.
R Markdown (.Rmd): In notebook.ipynb, we used Jupyter Notebooks. The R equivalent is R Markdown.
The assignment arrow (<-): This is the first thing that confuses beginners.
Python: x = 5
R: x <- 5 (It looks like an arrow, pointing the number 5 into the variable x).
The Tidyverse: In Python, we used Pandas to manipulate data. In R, we use a collection of tools called the Tidyverse. It is designed to be very easy to read, almost like English sentences.
To work with the files in solution/R/, most people use a tool called RStudio. It is the "Command Center" for R, similar to what VS Code is for Python.
Let's see how the code looks different for the exact same task: Loading our Pumpkin Data.
Python Version (What you know):
import pandas as pd
df = pd.read_csv("pumpkins.csv")
print(df.head())
R Version (What is in this folder):
library(tidyverse)
# The arrow puts data into 'df'
df <- read_csv("pumpkins.csv")
# Show the first few rows
head(df)
Explanation:
They look almost identical! read_csv does the same thing in both languages (in R it comes from the readr package, which library(tidyverse) loads). The main difference is the arrow <-.
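To make the comparison concrete without either library installed, here is a minimal sketch in plain Python of what a "read the CSV, then show the first rows" step does conceptually. The inline rows are invented for illustration and are not the contents of pumpkins.csv, and the helper names load_rows and head are hypothetical, not part of Pandas or the Tidyverse.

```python
import csv
import io

# Hypothetical stand-in for pumpkins.csv (values invented for illustration).
SAMPLE_CSV = """Size,Price
100,1.5
200,2.5
300,3.5
400,4.5
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row, keyed by column name."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"Size": float(r["Size"]), "Price": float(r["Price"])} for r in reader]


def head(rows, n=6):
    """Return the first n rows. R's head() defaults to 6; Pandas' .head() defaults to 5."""
    return rows[:n]


rows = load_rows(SAMPLE_CSV)
for row in head(rows, n=2):
    print(row)
```

Both read_csv functions do considerably more than this (type guessing, missing values, and so on), but the basic shape is the same: text in, a table of named columns out.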
R is famous for its beautiful graphs using a tool called ggplot2.
# Start a plot using the pumpkin data
ggplot(data = df, aes(x = Size, y = Price)) +
# Draw dots (points)
geom_point()
Output: A scatter plot appears showing the relationship between Size and Price.
Explanation:
aes: Aesthetics. We map the X axis to Size and Y axis to Price.
geom_point: Geometric Object. We tell R to draw points (dots).
How does the project organize these two languages? They run on parallel tracks.
When you look at the solution/R/ folder, you will see subfolders that match the main chapters perfectly.
solution/R/regression/
solution/R/classification/
Let's revisit the Pumpkin Price prediction problem from Chapter 7, but solve it the "Statistician's Way" using R.
In Python, we used Scikit-learn. In R, regression is built into the core language.
# 1. Train the model (Linear Model = lm)
# Formula: "Price depends on Size" (Price ~ Size)
model <- lm(Price ~ Size, data = df)
# 2. Look at the results
summary(model)
Output: R produces a very detailed statistical report, showing "P-values," "Residuals," and "F-statistics."
Explanation:
lm(): Linear Model.
~ (Tilde): This symbol means "Depends on." We are saying Price depends on Size.
summary(): Prints the statistical report for the model. Where our Python code went straight to .predict(), R puts the emphasis on explaining how well the model fits, using statistics.
Just like in Python, we can ask our R model to guess the future.
# Create a new pumpkin of size 450
new_pumpkin <- data.frame(Size = 450)
# Predict the price
predict(model, new_pumpkin)
Output:
1
8.504
Explanation:
The result is $8.50, just like our Python model! Math is universal, even if the programming language changes.
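For readers curious about what lm(Price ~ Size) and predict() are doing underneath, here is a rough sketch of a one-variable least-squares fit in plain Python. The sizes and prices are invented so that they fall exactly on a straight line; they are not the pumpkin data, so the result will not reproduce the 8.504 above. The helper names fit_line and predict_price are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_price(model, size):
    """Apply the fitted line to a new size, like predict(model, new_pumpkin) in R."""
    intercept, slope = model
    return intercept + slope * size


# Made-up data lying on the line Price = 0.5 + 0.01 * Size.
sizes = [100, 200, 300, 400]
prices = [1.5, 2.5, 3.5, 4.5]

model = fit_line(sizes, prices)
print(round(predict_price(model, 450), 2))
```

R's lm() also computes everything summary() reports (standard errors, P-values, the F-statistic); this sketch covers only the fitted line and the prediction.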
You might ask, "I just spent 14 chapters learning Python. Why should I care about R?"
ggplot2 is widely used for publication-quality graphs, and many people find its output more polished than the defaults of Python's plotting tools.
In this chapter, we explored solution/R/. We learned that:
R lessons use .Rmd files instead of .ipynb.
R code uses arrows <- and tildes ~, but the logic remains the same.
We have now seen how to do Machine Learning in English (Python) and French (R). But what if you want to read this tutorial in actual human languages, like Spanish, Chinese, or Hindi?
Global knowledge should be accessible to everyone, regardless of what language they speak.
Generated by Code IQ