Generated by Code IQ · v1.0

LLMs-from-scratch
Knowledge Tutorial

This project serves as a comprehensive educational guide for building Large Language Models (LLMs) from the ground up. It demonstrates how to transform raw text into numerical data using a Tokenizer, construct the core GPT Architecture and modern variants like Llama and Qwen, and implement efficient training loops to teach the model to generate coherent text.
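The first of those steps, turning raw text into numbers, can be illustrated with a minimal byte-pair-encoding sketch. This is a toy illustration of the BPE idea (repeatedly merge the most frequent adjacent pair), not the repository's actual tokenizer code; the text and merge count are arbitrary:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# Start from raw UTF-8 bytes and perform a few BPE merges.
text = "the theme of the thesis"
tokens = list(text.encode("utf-8"))
merges = {}
next_id = 256  # byte values occupy 0..255, so new tokens start at 256
for _ in range(3):
    pair = most_frequent_pair(tokens)
    merges[pair] = next_id
    tokens = merge_pair(tokens, pair, next_id)
    next_id += 1

print(len(text.encode("utf-8")), "bytes ->", len(tokens), "tokens")
# → 23 bytes -> 12 tokens
```

Each merge learns one vocabulary entry (here `th`, then `the`, then `␣the`), which is why frequent words end up as single tokens.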

System Architecture

How the pieces fit

LLMs-from-scratch is organized as connected concepts and components. Start broad, then drill down chapter by chapter.

- Tokenizer (Byte Pair Encoding)
- Data Loading and Formatting
- Attention Mechanisms (Self & Grouped Query)
- The GPT Architecture (Transformer Block)
- Training and Finetuning Loops
- Modern Model Variations (Llama & Qwen)
- Mixture of Experts (MoE)
- Inference Optimization (KV Cache)
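The attention components above all center on one computation: scaled dot-product attention with a causal mask. A minimal NumPy sketch with toy sizes and random weights (not the book's PyTorch implementation):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product attention where position i attends only to positions <= i."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf          # block attention to future positions
    return softmax(scores) @ v

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # → (4, 8)
```

The causal mask is what makes this a decoder: perturbing a later token leaves all earlier outputs unchanged, which is also the property the KV-cache chapter exploits.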
open tutorial
◆ Scanning numbered chapters
◆ Building navigation and Mermaid diagrams
◆ Generating chapter and subsystem pages
✓ 8 chapter pages built
✓ Theme toggle enabled
Repository Overview

Intro and Architecture Diagram

Source Repository: https://github.com/rasbt/LLMs-from-scratch

flowchart TD
    A0["Tokenizer (Byte Pair Encoding)"]
    A1["The GPT Architecture (Transformer Block)"]
    A2["Attention Mechanisms (Self & Grouped Query)"]
    A3["Training and Finetuning Loops"]
    A4["Inference Optimization (KV Cache)"]
    A5["Modern Model Variations (Llama & Qwen)"]
    A6["Mixture of Experts (MoE)"]
    A7["Data Loading and Formatting"]
    A7 -->|"Uses for encoding"| A0
    A3 -->|"Iterates over"| A7
    A3 -->|"Optimizes weights of"| A1
    A1 -->|"Contains"| A2
    A1 -->|"Integrates"| A6
    A2 -->|"Implements"| A4
    A5 -->|"Uses Grouped Query Attention"| A2
    A5 -->|"Defines special tokens for"| A0
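The "Implements KV Cache" edge in the diagram refers to a generation-time optimization: instead of recomputing keys and values for the whole sequence at every step, each decode step computes projections only for the newest token and appends them to a cache. A minimal sketch of that idea with toy dimensions and random vectors (the repository's implementation differs):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Append-only cache of key/value rows, one per generated position."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Store the new position's key/value, then attend over everything cached.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)    # (t, d): all keys so far
        V = np.stack(self.values)  # (t, d): all values so far
        weights = softmax(q @ K.T / np.sqrt(K.shape[-1]))
        return weights @ V

rng = np.random.default_rng(1)
d = 8
cache = KVCache()
# One decode step per token: only the newest q/k/v is computed each time.
for t in range(5):
    q, k, v = rng.normal(size=(3, d))
    out = cache.step(q, k, v)
print(len(cache.keys))  # → 5
```

Because past keys and values never change under a causal mask, caching them turns each generation step's attention cost from quadratic in sequence length to linear.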
Tutorial Chapters

All 8 chapters

Follow sequentially or jump to any topic. Start with Tokenizer (Byte Pair Encoding).

About This Project

Generated by Code IQ

This tutorial was generated automatically by Code IQ and rendered with the shared tutorial site builder, which can build a site for any repository tutorial folder that follows the numbered Markdown chapter layout.

View Code IQ ↗
python build_site.py '/home/runner/work/Code-IQ/Code-IQ/output/LLMs-from-scratch'

// → 8 chapters
// → source: rasbt/LLMs-from-scratch