In the previous chapter, Lesson Structure, we opened the "Cookbook" and looked at how a lesson is organized. We saw that the core of the learning experience is the Jupyter Notebook.
However, a cookbook is useless without a kitchen. If you try to open a Notebook file (.ipynb) right now without installing anything, your computer won't know what to do with it.
This chapter guides you through setting up your "Kitchen": installing Python and the necessary tools to run the code.
Your computer understands basic commands (like "open file" or "play music"), but it does not understand Data Science out of the box. We need to install a "translator" that turns human-readable code into instructions the processor can understand.
Imagine you have navigated to the Regression lesson (from Chapter 3). You see a file called notebook.ipynb.
The Goal: You want to double-click this file, write print("Hello World"), and see the result on your screen.
The Solution: To make this happen, we need to set up three layers of software: Python itself, a virtual environment (venv), and the packages installed by pip.
Setting up a coding environment can be scary for beginners. Let's simplify the concepts using our kitchen analogy.
Python is the electricity and gas. It powers everything. Without Python installed, nothing else in this course works. We specifically use Python 3, which is the modern standard.
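If you are not sure whether the "electricity" is switched on, a small check like the one below can confirm that Python 3 is the version actually running. This is a minimal sketch; the helper name is_python3 is made up for illustration and is not part of the course materials.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given interpreter version is Python 3 or newer."""
    # version_info[0] is the major version number (2, 3, ...).
    return version_info[0] >= 3


# Show the exact version string of the interpreter running this code.
print("Running Python", sys.version.split()[0])
print("Python 3 available:", is_python3())
```

Save it to a file and run it with python, or paste it into an interactive Python session.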
venv
Imagine you are cooking a spicy curry in one pot and a delicate dessert in another. You wouldn't want the curry spices to spill into the dessert.
In coding, we use Virtual Environments. These are isolated bubbles. We create a specific bubble just for ML-For-Beginners so the tools we install here don't mess up other projects on your computer.
pip
Python comes with a tool called pip. Think of pip as a personal shopper. You give it a shopping list, and it goes to the internet, finds the tools, and installs them for you.
To solve our use case (running the notebook), follow these steps.
Go to python.org and download the latest version of Python 3.
Open your command line (Terminal on Mac/Linux, Command Prompt or PowerShell on Windows). Navigate to the folder where you downloaded this project.
# 1. Go to the project folder
cd ML-For-Beginners
# 2. Create the environment (we'll call it 'ml-env')
python -m venv ml-env
Explanation: python -m venv tells Python to make a new virtual environment. We named it ml-env. You will see a new folder appear with that name. On some Mac and Linux systems the command is python3 rather than python.
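To see what that new folder contains, the same step can be reproduced from Python using the standard library's venv module. The sketch below builds a throwaway environment in a temporary directory so it does not touch your project; the function name create_demo_environment is hypothetical, and pip is left out here only to keep the demo quick.

```python
import tempfile
import venv
from pathlib import Path


def create_demo_environment(parent_dir, name="ml-env"):
    """Create a virtual environment folder inside parent_dir and return its path."""
    env_path = Path(parent_dir) / name
    # with_pip=False skips installing pip, which keeps this demo fast.
    # The command-line form (python -m venv ml-env) installs pip by default.
    venv.create(env_path, with_pip=False)
    return env_path


# Build the environment in a scratch folder that is deleted afterwards.
with tempfile.TemporaryDirectory() as scratch:
    env_path = create_demo_environment(scratch)
    # List the top-level entries of the new environment folder.
    for entry in sorted(env_path.iterdir()):
        print(entry.name)
```

Among the entries you should find pyvenv.cfg, the small configuration file that marks a folder as a virtual environment.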
Now we need to step inside the bubble.
# On Windows:
ml-env\Scripts\activate
# On Mac / Linux:
source ml-env/bin/activate
Explanation: After running this, your terminal prompt should change to show (ml-env). This means you are now working inside the safe bubble.
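If the prompt does not change, or you simply want a second opinion, Python itself can tell you whether it is running inside the bubble. The following is a small sketch using only the standard library; in_virtual_environment is an illustrative name, not something the course defines.

```python
import sys


def in_virtual_environment():
    """Return True when Python is running from inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the original Python installation.
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_environment())
print("Environment location:", sys.prefix)
```

When the environment is active, the location printed should end with ml-env.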
Now we use the "Shopper" (pip) to buy the tools listed in Key Technologies.
# Install everything listed in requirements.txt
pip install -r requirements.txt
Explanation: The -r flag tells pip to read a file. It looks at requirements.txt, sees pandas, scikit-learn, and jupyter, and downloads them all automatically.
What happens when you type that pip install command? It triggers a conversation between your computer and the cloud.
Here is the sequence of events that turns a text file into working software.
requirements.txt File
The magic behind the setup is the requirements.txt file located in the root of the Repository Structure.
We discussed this file in Chapter 2, but let's look at how it relates to the setup process specifically.
# A snippet of requirements.txt
jupyter>=1.0.0
pandas
numpy
scikit-learn
seaborn
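Explanation: This is a plain text file with one package per line. When you run pip install -r requirements.txt, pip reads each line, starting with jupyter, where >=1.0.0 means "version 1.0.0 or newer". For every package it checks whether a suitable version is already installed and downloads it if not. Then it does the same for pandas, and so on.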
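The read-a-line, check, then download idea can be imitated in a few lines of Python. This is only a simplified sketch of the concept, not how pip is actually implemented: the parse_requirements and installed_version helpers are hypothetical, the parser handles just the simple lines shown in the snippet, and nothing is downloaded. It assumes Python 3.8 or newer, where importlib.metadata is part of the standard library.

```python
import re
from importlib import metadata

# The same lines as the requirements.txt snippet shown earlier.
SNIPPET = """\
jupyter>=1.0.0
pandas
numpy
scikit-learn
seaborn
"""


def parse_requirements(text):
    """Return (name, version_spec) pairs from simple requirements-style text."""
    pairs = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The name is the leading run of letters, digits, dots, dashes, underscores;
        # whatever follows (for example ">=1.0.0") is the version requirement.
        match = re.match(r"^([A-Za-z0-9._-]+)\s*(.*)$", line)
        if match:
            pairs.append((match.group(1), match.group(2).strip()))
    return pairs


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Report, line by line, what is already present in this environment.
for name, spec in parse_requirements(SNIPPET):
    found = installed_version(name)
    status = f"installed ({found})" if found else "missing, pip would download it"
    print(f"{name:<14}{spec:<10}{status}")
```

Running it before and after pip install -r requirements.txt is a simple way to watch the "missing" entries turn into version numbers.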
Once everything is installed, the final piece of the puzzle is opening the lesson.
# Run this command in your terminal
jupyter notebook
Explanation: This command starts a local web server. Your default web browser will pop up showing the file directory. You can now click on 2-Regression and then notebook.ipynb.
To verify that your setup is correct, type print("Hello World") into a notebook cell and press Shift + Enter.
If a number appears next to the cell (e.g., [1]) and the code runs, congratulations! Your kitchen is fully operational.
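For a slightly more thorough check, you could run a cell like the one below, which asks Python whether it can find the course libraries without actually loading them. It is a sketch rather than an official test: the missing_modules helper and the COURSE_MODULES list are assumptions based on the requirements.txt snippet shown earlier.

```python
import importlib.util

# Import names mostly match the install names, with one exception:
# the scikit-learn package is imported as sklearn.
COURSE_MODULES = ["pandas", "numpy", "sklearn", "seaborn"]


def missing_modules(names):
    """Return the module names that Python cannot find in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(COURSE_MODULES)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All course libraries are importable.")
```

If anything is reported as not installed, make sure the ml-env environment is active and run the pip install step again.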
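In this chapter, we built our Python environment: we installed Python 3, created and activated a virtual environment, installed the course libraries with pip, and launched Jupyter Notebook.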
However, not everyone uses Python. Some Data Scientists prefer a language built specifically for statistics. If that sounds like you, or if you just want to see the alternative, read on.