17 April 2026

Do I dare / Disturb the universe?

A first post, which is also an excuse to make sure the typography holds.

This is the first post on this blog, and also the post I use to verify that every small typographic choice I made actually works when real words flow through it. If you’re reading this, one of two things has happened: either I forgot to delete it, or I decided it’s a fine enough opening that it can stay.

What this is for

I’m a software engineer, writing mostly about machine learning infrastructure and the systems I find interesting. Sometimes I’ll write about other things. There is no content calendar. There is no newsletter. Posts appear when I finish them.

If you want to follow along, the RSS feed is the right way.

What the prose should feel like

Body text is set in Source Serif 4 at weight 330, which is lighter than the browser default and, to my eye, reads better on screen. A default 400-weight serif looks smudged at body size — like a photocopy of a photocopy. 330 has air.

Italics are for emphasis and titles and the occasional aside. Bold is for when I mean it. Links look like this: one accent color, an oxidized red. The dot in the masthead is the only other place it appears.

Here is a pull quote, which is really just a blockquote:

The posts you finish are not the posts you set out to write. They are the ones that survived contact with the sentence-by-sentence business of figuring out what you actually think.

Ordered lists:

1. First, identify the bottleneck.
2. Then, measure it, because you will be wrong about where it is.
3. Then, fix it, and watch the bottleneck move somewhere else.

Unordered lists:

- A short item.
- A longer item that runs past the end of the first line, which is useful because I want to see how list items wrap in the measure I chose and whether the left alignment on the second line reads as intended.
- A final short one.

Code

Inline code looks like this — set in JetBrains Mono, on a subtle warm-gray background, slightly tinted so it reads as set apart from prose without shouting. Now for a block. Here’s some Python, which is the language most of the real posts will lean on:

import torch
from torch import nn

class KVCache(nn.Module):
    """A minimal KV cache for decoder-only inference.

    The cache grows one token at a time during autoregressive
    generation. Its memory footprint, not compute, is usually the
    binding constraint for long-context workloads.
    """

    def __init__(self, n_layers: int, n_heads: int, head_dim: int,
                 max_seq_len: int, dtype=torch.float16):
        super().__init__()
        shape = (n_layers, 2, max_seq_len, n_heads, head_dim)
        self.register_buffer("store", torch.zeros(shape, dtype=dtype))
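        # Each layer is written once per decoding step, so track the
        # number of cached positions per layer.
        self.lengths = [0] * n_layers

    def update(self, layer: int, k: torch.Tensor, v: torch.Tensor) -> None:
        # k and v have shape (new_tokens, n_heads, head_dim).
        start = self.lengths[layer]
        end = start + k.size(0)
        if end > self.store.size(2):
            raise ValueError("KV cache overflow: max_seq_len exceeded")
        self.store[layer, 0, start:end] = k
        self.store[layer, 1, start:end] = v
        self.lengths[layer] = end  # advance, so get() sees the new tokens

    def get(self, layer: int) -> tuple[torch.Tensor, torch.Tensor]:
        # Return only the filled prefix of the buffer for this layer.
        n = self.lengths[layer]
        return self.store[layer, 0, :n], self.store[layer, 1, :n]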
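
To put a rough number on the footprint claim in that docstring, here is a back-of-the-envelope sketch. The model dimensions are hypothetical, picked only for illustration, and the helper name is mine; the formula just multiplies out the shape of the store buffer and the size of one element.

```python
def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   max_seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed for one sequence's KV cache.

    The factor of 2 covers keys and values, mirroring the buffer shape
    (n_layers, 2, max_seq_len, n_heads, head_dim). bytes_per_elem=2
    corresponds to float16.
    """
    return n_layers * 2 * max_seq_len * n_heads * head_dim * bytes_per_elem


# Hypothetical configuration: 32 layers, 32 heads, head_dim 128,
# 32,768 tokens of context, float16 storage.
size = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128, max_seq_len=32_768)
print(f"{size / 2**30:.1f} GiB")  # prints "16.0 GiB"
```

That figure is per sequence, so it scales linearly with batch size as well as context length.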

A little Rust, because I’ve been playing with it:

fn phase_portrait(points: &[Point]) -> Vec<Trajectory> {
    points
        .iter()
        .map(|p| integrate(p, STEP, HORIZON))
        .filter(|t| !t.diverged())
        .collect()
}

And a shell one-liner, which should also look right:

$ rg -l 'TODO' --type rust | xargs -I{} echo 'fix: {}'

Math

Math is rendered with KaTeX. Inline: the Gaussian density is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. Display, for something more spacious:

$$
\mathcal{L}(\theta) \;=\; \mathbb{E}_{x \sim \mathcal{D}} \Big[ -\log p_\theta(x) \Big] \;+\; \lambda \, \lVert \theta \rVert_2^2
$$
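
For a concrete reading of this objective, here is a minimal numerical sketch. It assumes, purely for illustration, that the model is the one-dimensional Gaussian density from the inline example with parameters θ = (μ, log σ), and that the expectation over the data is estimated by a sample mean. The function names and the parameterization are my assumptions, not anything the formula itself fixes.

```python
import math
from typing import Sequence


def gaussian_nll(x: float, mu: float, log_sigma: float) -> float:
    """Negative log-density of N(mu, sigma^2) at x, with sigma = exp(log_sigma)."""
    sigma = math.exp(log_sigma)
    return 0.5 * math.log(2 * math.pi) + log_sigma + (x - mu) ** 2 / (2 * sigma ** 2)


def regularized_loss(samples: Sequence[float], mu: float,
                     log_sigma: float, lam: float) -> float:
    """Sample-mean estimate of E[-log p_theta(x)] plus lam * ||theta||_2^2."""
    nll = sum(gaussian_nll(x, mu, log_sigma) for x in samples) / len(samples)
    penalty = lam * (mu ** 2 + log_sigma ** 2)  # squared L2 norm of theta
    return nll + penalty


# Hypothetical data; with lam = 0 this reduces to the plain average NLL.
data = [-0.5, 0.1, 0.4, 1.2]
print(regularized_loss(data, mu=0.0, log_sigma=0.0, lam=0.0))
print(regularized_loss(data, mu=0.0, log_sigma=0.0, lam=0.1))
```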

A small table

Technique	Peak mem.	Throughput	Caveat
FP16 KV cache	100%	1.0×	baseline
INT8 (per-head)	52%	1.3×	slight quality drop
INT4 (grouped)	28%	1.6×	noticeable on long ctx

Why “phase space”

Systems have histories. A phase space is the abstract room those histories move through — every axis a degree of freedom, every point a complete specification of where things stand right now. The trajectory is what a system actually did, traced against the much larger space of what it could have done.

Most writing about technical systems describes the trajectory and forgets the room. You get a sequence of decisions, each sounding inevitable in retrospect, and no sense of the adjacent choices that were considered and abandoned, or never considered at all. The interesting part of the history is usually the shape of the room.

That shape is what I’d like to write toward.

More soon.