Chapter 1: File System & Indexing

Welcome to the internal workings of Filebrowser!

In this first chapter, we are going to explore the foundation of the entire application: How the system manages, finds, and modifies files.

The Problem: The "Library" Dilemma

Imagine walking into a massive library with millions of books, but no organization system. If you wanted to find "Harry Potter," you would have to walk through every single aisle, looking at every single book until you found it. That is slow.

Computers face the same problem. Reading from a physical disk is slow compared with reading from memory. If Filebrowser scanned your actual hard drive every time you clicked a folder, the interface would feel sluggish.

The Solution: The "Librarian" & The "Card Catalog"

Filebrowser solves this using two main concepts:

  1. The File System Adapter (The Librarian): This is the code that actually performs physical tasks: moving a book, throwing one away, or writing a new page.
  2. The Index (The Card Catalog): This is a smart list kept in memory (RAM) and a local database. It remembers where files are, how big they are, and what type they are.

When you browse files in the app, you are mostly looking at the Index (fast). When you upload or delete a file, you are using the File System Adapter (physical).


Key Concepts

1. The Index (indexing package)

The Index is the brain. It maps a "virtual path" (what you see in the browser, like /photos/vacation.jpg) to a "real path" on your server's hard drive (like /var/lib/data/users/admin/photos/vacation.jpg).

It is responsible for resolving these paths and for remembering where each file is, how big it is, and what type it is.
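The virtual-to-real mapping can be sketched in a few lines of Go. This is a minimal illustration, not Filebrowser's actual implementation: resolveRealPath is a hypothetical helper, and the assumption is that a source has a single root directory that every virtual path is joined onto.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// resolveRealPath is a hypothetical helper (not part of Filebrowser).
// It treats virtualPath as rooted at "/", cleans it so that any ".."
// segments collapse at that virtual root, and then joins it onto root.
func resolveRealPath(root, virtualPath string) string {
	cleaned := path.Clean("/" + virtualPath)
	return filepath.Join(root, filepath.FromSlash(cleaned))
}

func main() {
	root := "/var/lib/data/users/admin"

	// A normal lookup.
	fmt.Println(resolveRealPath(root, "/photos/vacation.jpg"))

	// A path that tries to climb out of the root stays inside it.
	fmt.Println(resolveRealPath(root, "/photos/../../../etc/passwd"))
}
```

Cleaning the virtual path before joining is one simple way to keep a request from escaping the source root; the real Index may enforce this differently.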

2. The File Operations (files package)

This layer handles the heavy lifting. It uses the Index to find where a file really is, and then calls standard Go library functions such as os.Remove (delete) or os.Rename (move) on that real path.


How it Works: A Use Case

Let's look at a simple scenario: A user wants to read the contents of a text file.

  1. Request: The user asks for /notes.txt.
  2. Lookup: The system asks the Index: "Where is /notes.txt really located?"
  3. Resolution: The Index checks its map and says: "That is actually at C:\data\notes.txt."
  4. Action: The system reads the bytes from that physical path and sends them back.

Usage Example

Here is how the code handles retrieving file information. This function is called FileInfoFaster because it relies heavily on the cached Index rather than scanning the disk from scratch every time.

// backend/adapters/fs/files/files.go

// This function gets file details efficiently
func FileInfoFaster(opts utils.FileOptions, user *users.User) (*iteminfo.ExtendedFileInfo, error) {
    // 1. Get the Index responsible for this file source
    idx := indexing.GetIndex(opts.Source)

    // 2. Ask the Index for info (it checks cache rules first)
    info, err := idx.GetFileInfo(indexing.FileInfoRequest{
        IndexPath:     opts.Path,
        IsRoutineScan: false,
    })
    if err != nil {
        // Return early: info must not be dereferenced after a failed lookup
        return nil, err
    }

    // 3. Wrap the result in a response object
    response := &iteminfo.ExtendedFileInfo{FileInfo: *info}
    return response, nil
}

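Explanation:

  1. indexing.GetIndex(opts.Source): Fetches the Index that manages the requested file source.
  2. idx.GetFileInfo: Asks that Index for the file's details. As the code comment notes, it checks its cache rules first. IsRoutineScan is false, which suggests the request is not part of a scheduled scan.
  3. iteminfo.ExtendedFileInfo: Wraps the result in the response type that is returned to the caller.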


Code Walkthrough: Physical Operations

Reading is safe, but what about writing? Let's look at how Filebrowser creates a new file.

// backend/adapters/fs/files/files.go

func WriteFile(source, path string, in io.Reader) error {
    // 1. Find where the file should physically go
    idx := indexing.GetIndex(source)
    realPath, _, _ := idx.GetRealPath(path)

    // 2. Open (or create) the file on the Operating System
    file, err := os.OpenFile(realPath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0644)
    if err != nil {
        return err
    }
    defer file.Close()

    // 3. Copy the data from the upload to the disk
    _, err = io.Copy(file, in)
    return err
}

Explanation:

  1. idx.GetRealPath(path): This converts the request path (/new.txt) into the server's absolute path (/home/user/files/new.txt).
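  2. os.OpenFile: Standard Go library function that opens a file. The flags used here create the file if it does not exist (os.O_CREATE) and empty it if it does (os.O_TRUNC); 0644 is the permission mode applied to a newly created file.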
  3. io.Copy: Streams the data from the user (the browser) directly into the file on the disk.

Under the Hood: The Indexing Process

How does the Index know what files exist? It performs Scanning.

When Filebrowser starts (or when you trigger a refresh), it runs a "Scanner" that walks through your folders.

The Scanning Flow

sequenceDiagram
    participant S as Scanner
    participant FS as File System (Disk)
    participant M as Memory (Index)
    participant DB as Database
    S->>FS: 1. Read Directory (/photos)
    FS-->>S: Return: [cat.jpg, dog.png]
    loop For Each File
        S->>M: 2. Calculate Stats (Size, Type)
        S->>DB: 3. Save Metadata
    end
    M->>M: 4. Mark folder as "Indexed"

  1. Read Directory: The scanner asks the OS for a list of files.
  2. Calculate Stats: It checks the file size, modification time, and determines if it is an image, video, or text.
  3. Save: It updates the persistent database so we remember this file after a restart.
  4. Mark: Once every file has been processed, the folder is marked as "Indexed" in memory.

Deep Dive: The Index Structure

Let's look at the Index struct in backend/indexing/indexingFiles.go. This is the object that holds the state of your file system.

// backend/indexing/indexingFiles.go

type Index struct {
    // Shared database connection
    db *dbsql.IndexDB

    // Active scanners (workers looking for file changes)
    scanners map[string]*Scanner 

    // In-memory cache of folder sizes to speed up browsing
    folderSizes map[string]uint64   

    // Basic configuration (where the files are stored)
    settings.Source 
}

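Explanation:

  1. db: A shared database connection, used for the metadata that must survive a restart.
  2. scanners: The active scanners, stored in a map keyed by string, that look for file changes.
  3. folderSizes: An in-memory cache of folder sizes, kept so that browsing does not have to recompute them.
  4. settings.Source: An embedded struct holding the basic configuration, such as where the files are stored.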

Deep Dive: Processing a Directory

When the scanner looks at a folder, it uses GetDirInfoCore. This function determines what is inside a folder and filters out things we shouldn't see.

// backend/indexing/indexingFiles.go

func (idx *Index) GetDirInfoCore(dirInfo *os.File, ...) (*iteminfo.FileInfo, error) {
    // 1. Read all files in the directory
    files, err := dirInfo.Readdir(-1)
    if err != nil {
        return nil, err
    }

    // 2. Loop through them
    for _, file := range files {
        // 3. Check if hidden or ignored by rules
        if idx.ShouldSkip(file.IsDir(), file.Name(), ...) {
            continue
        }

        // 4. Add to list and calculate size
        // ... (logic to create itemInfo)
    }
    
    // 5. Return the full list
    return dirFileInfo, nil
}

Explanation: This loop is the "gatekeeper." Even if a file exists on the disk, if ShouldSkip returns true (e.g., it's a hidden system file like .DS_Store), it will not be added to the Index, and the user will never know it exists.
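The gatekeeper pattern can be sketched on its own. The rules below are assumptions chosen for illustration (skip dotfiles, skip a list of ignored file names, skip a list of ignored folder names); skipRules, shouldSkip, and visible are hypothetical, and the real ShouldSkip takes more arguments than are shown in the excerpt above.

```go
package main

import (
	"fmt"
	"strings"
)

// skipRules is a hypothetical rule set for deciding what to leave out.
type skipRules struct {
	hideDotfiles bool
	ignoredFiles map[string]bool
	ignoredDirs  map[string]bool
}

// shouldSkip reports whether an entry should be left out of the index.
func (r skipRules) shouldSkip(isDir bool, name string) bool {
	if r.hideDotfiles && strings.HasPrefix(name, ".") {
		return true
	}
	if isDir {
		return r.ignoredDirs[name]
	}
	return r.ignoredFiles[name]
}

// entry is a minimal stand-in for a directory entry.
type entry struct {
	name  string
	isDir bool
}

// visible returns the names of the entries that pass the rules.
func visible(r skipRules, entries []entry) []string {
	var kept []string
	for _, e := range entries {
		if r.shouldSkip(e.isDir, e.name) {
			continue
		}
		kept = append(kept, e.name)
	}
	return kept
}

func main() {
	rules := skipRules{
		hideDotfiles: true,
		ignoredFiles: map[string]bool{"Thumbs.db": true},
		ignoredDirs:  map[string]bool{"node_modules": true},
	}
	entries := []entry{
		{".DS_Store", false},
		{"cat.jpg", false},
		{"node_modules", true},
		{"Thumbs.db", false},
		{"notes.txt", false},
	}
	fmt.Println(visible(rules, entries)) // prints [cat.jpg notes.txt]
}
```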


Summary

In this chapter, we learned:

  1. The Index acts as a high-speed cache and map for your files.
  2. The Files Adapter performs the actual physical operations (reading/writing) using paths resolved by the Index.
  3. Scanning is the process of updating the Index to match reality.

Now that we can access and list files, we need to make sure only the right people can see them.

Next Chapter: Authentication & User Storage


Generated by Code IQ