Generated by Code IQ · v1.0

LLMs-from-scratch
Knowledge Tutorial

This project serves as a comprehensive educational guide for building Large Language Models (LLMs) from the ground up. It demonstrates how to transform raw text into numerical data using a Tokenizer, construct the core GPT Architecture and modern variants like Llama and Qwen, and implement efficient training loops to teach the model to generate coherent text.
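The first of those steps, turning raw text into numbers, can be illustrated with a minimal byte-pair-encoding sketch. This is a toy illustration of the BPE idea (repeatedly merge the most frequent adjacent pair), not the repository's actual tokenizer code; the text and merge count are arbitrary:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# Start from raw UTF-8 bytes and perform a few BPE merges.
text = "the theme of the thesis"
tokens = list(text.encode("utf-8"))
merges = {}
next_id = 256  # byte values occupy 0..255, so new tokens start at 256
for _ in range(3):
    pair = most_frequent_pair(tokens)
    merges[pair] = next_id
    tokens = merge_pair(tokens, pair, next_id)
    next_id += 1

print(len(text.encode("utf-8")), "bytes ->", len(tokens), "tokens")
# → 23 bytes -> 12 tokens
```

Each merge learns one vocabulary entry (here `th`, then `the`, then `␣the`), which is why frequent words end up as single tokens.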

System Architecture

How the pieces fit

LLMs-from-scratch is organized as connected concepts and components. Start broad, then drill down chapter by chapter.

- Tokenizer (Byte Pair Encoding)
- Data Loading and Formatting
- Attention Mechanisms (Self & Grouped Query)
- The GPT Architecture (Transformer Block)
- Training and Finetuning Loops
- Modern Model Variations (Llama & Qwen)
- Mixture of Experts (MoE)
- Inference Optimization (KV Cache)
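The attention components above all center on one computation: scaled dot-product attention with a causal mask. A minimal NumPy sketch with toy sizes and random weights (not the book's PyTorch implementation):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product attention where position i attends only to positions <= i."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf          # block attention to future positions
    return softmax(scores) @ v

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # → (4, 8)
```

The causal mask is what makes this a decoder: perturbing a later token leaves all earlier outputs unchanged, which is also the property the KV-cache chapter exploits.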
open tutorial
◆ Scanning numbered chapters
◆ Building navigation and Mermaid diagrams
◆ Generating chapter and subsystem pages
✓ 8 chapter pages built
✓ Theme toggle enabled
Repository Overview

Intro and Architecture Diagram

Source Repository: https://github.com/rasbt/LLMs-from-scratch

flowchart TD
    A0["Tokenizer (Byte Pair Encoding)"]
    A1["The GPT Architecture (Transformer Block)"]
    A2["Attention Mechanisms (Self & Grouped Query)"]
    A3["Training and Finetuning Loops"]
    A4["Inference Optimization (KV Cache)"]
    A5["Modern Model Variations (Llama & Qwen)"]
    A6["Mixture of Experts (MoE)"]
    A7["Data Loading and Formatting"]
    A7 -->|"Uses for encoding"| A0
    A3 -->|"Iterates over"| A7
    A3 -->|"Optimizes weights of"| A1
    A1 -->|"Contains"| A2
    A1 -->|"Integrates"| A6
    A2 -->|"Implements"| A4
    A5 -->|"Uses Grouped Query Attention"| A2
    A5 -->|"Defines special tokens for"| A0
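The "Implements KV Cache" edge in the diagram refers to a generation-time optimization: instead of recomputing keys and values for the whole sequence at every step, each decode step computes projections only for the newest token and appends them to a cache. A minimal sketch of that idea with toy dimensions and random vectors (the repository's implementation differs):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Append-only cache of key/value rows, one per generated position."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Store the new position's key/value, then attend over everything cached.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)    # (t, d): all keys so far
        V = np.stack(self.values)  # (t, d): all values so far
        weights = softmax(q @ K.T / np.sqrt(K.shape[-1]))
        return weights @ V

rng = np.random.default_rng(1)
d = 8
cache = KVCache()
# One decode step per token: only the newest q/k/v is computed each time.
for t in range(5):
    q, k, v = rng.normal(size=(3, d))
    out = cache.step(q, k, v)
print(len(cache.keys))  # → 5
```

Because past keys and values never change under a causal mask, caching them turns each generation step's attention cost from quadratic in sequence length to linear.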
Tutorial Chapters

All 8 chapters

Follow sequentially or jump to any topic. Start with Tokenizer (Byte Pair Encoding).

About This Project

Generated by Code IQ

This tutorial was generated automatically by Code IQ and rendered with the shared tutorial site builder, which can build a site for any repository tutorial folder that follows the numbered Markdown chapter layout.

View Code IQ ↗
python build_site.py '/home/runner/work/Code-IQ/Code-IQ/output/LLMs-from-scratch'

// → 8 chapters
// → source: rasbt/LLMs-from-scratch