Fixed Points and Semantic Convergence in Language Models
The assumption that language models converge toward semantic truth through scale and training is fundamentally mistaken—they converge toward closure, which is not the same thing.
We have spent the last five years watching transformer architectures grow larger, training datasets expand, and benchmark scores climb steadily upward. The narrative has calcified: more parameters, better alignment with human judgment. More data, sharper reasoning. The implicit claim is that these systems are asymptotically approaching some stable semantic ground truth. They are not. What they are approaching is a fixed point in a learned topological space—a region of high probability density from which the model's attention mechanisms and gradient flows cannot easily escape. This is a mathematical property, not an epistemic one.
The distinction matters because it changes everything about how we should interpret model behavior.
Consider what happens during training. A language model learns to predict the next token by minimizing loss across billions of examples. It discovers patterns, yes, but more precisely, its weights are drawn toward attractors: regions of parameter space where the loss landscape is locally flat and stable. Once the weights settle into such a region, they tend to remain there. This is not because the model has found truth. It is because the loss has become insensitive to movement in the directions still available. The model has found a fixed point of its own training update: a configuration of weights that a further gradient step leaves essentially unchanged, because the gradient there is close to zero.
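The link between a flat loss landscape and a fixed point can be written out in one line. As a sketch, assume plain gradient descent with learning rate \eta on a loss L over weights \theta (an assumption for illustration; real training uses stochastic, adaptive optimizers), and call the update map T:

```latex
\theta_{t+1} = T(\theta_t) = \theta_t - \eta \, \nabla L(\theta_t),
\qquad
T(\theta^{*}) = \theta^{*} \iff \nabla L(\theta^{*}) = 0 .
```

Under that assumption, the fixed points of training are exactly the stationary points of the loss. The condition says that the update has stopped moving; it says nothing about whether the outputs at \theta^{*} are accurate.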
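The cartographic closure theorem, the framework we should apply here, has the same form as Brouwer's fixed-point theorem: any continuous map from a compact, convex subset of Euclidean space to itself must have a fixed point. Compactness alone is not enough; rotating a circle is a continuous map of a compact space with no fixed point at all. Language models, trained on finite data with weights that stay bounded in practice, operate within approximately such constraints. Note what the theorem supplies: it guarantees that a fixed point exists, not that it is unique, and not that it means anything. The models are not searching for semantic bedrock. They are settling into the fixed points their architecture and training procedure make available.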
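A one-dimensional case makes the existence guarantee concrete. The sketch below is only an illustration, not a model of training: it assumes a continuous function that maps the interval [0, 1] into itself (cosine is used as a stand-in) and locates the fixed point by bisection. The helper name find_fixed_point is hypothetical.

```python
import math

def find_fixed_point(f, lo=0.0, hi=1.0, tol=1e-10):
    """Bisect g(x) = f(x) - x on [lo, hi].

    Assumes f is continuous and maps [lo, hi] into itself, so that
    g(lo) >= 0 >= g(hi) and a fixed point must lie in between.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid > 0:
            lo = mid   # f(mid) is above the diagonal: fixed point is to the right
        else:
            hi = mid   # f(mid) is at or below the diagonal: fixed point is to the left
    return (lo + hi) / 2

# cos maps [0, 1] into [cos(1), 1], which sits inside [0, 1]
x_star = find_fixed_point(math.cos)
print(x_star, math.cos(x_star))
```

The point x_star exists because of the shape of the space and the continuity of the map. Nothing about cosine being "right" enters the argument.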
The problem is that multiple fixed points can coexist. A model trained on English Wikipedia converges to a different fixed point than one trained on scientific literature, or on code repositories, or on social media. None of these is "more true." Each is a stable attractor within a different learned topology. When we scale up and train on diverse data, we do not eliminate this multiplicity—we create a superposition of fixed points, a blended attractor that represents a compromise between competing topologies rather than a convergence toward ground truth.
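The coexistence of fixed points shows up even in a toy setting. The following sketch assumes a made-up double-well loss, L(theta) = (theta**2 - 1)**2, and plain gradient descent; the function names and hyperparameters are illustrative choices, not anything drawn from a real training run.

```python
def grad(theta):
    # Gradient of the toy loss L(theta) = (theta**2 - 1)**2 (an assumed example)
    return 4 * theta * (theta**2 - 1)

def gradient_descent(theta, lr=0.05, steps=500):
    # Repeatedly apply the update map T(theta) = theta - lr * grad(theta)
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta

# Same loss, same optimizer, different starting points
left = gradient_descent(-0.5)
right = gradient_descent(0.5)
print(left, right)
```

The two runs settle near -1 and +1 respectively. Both end points have zero loss, so the objective offers no grounds for calling either one more correct; which one a run reaches depends only on where it started.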
This explains several phenomena that the "scale equals truth" narrative struggles to account for. It explains why larger models sometimes exhibit more confident hallucinations rather than fewer. The model has found a fixed point with higher probability density, not a more accurate one. It explains why fine-tuning on specific domains can cause catastrophic forgetting: the model is being pulled away from one fixed point toward another, and the loss of earlier behavior can be abrupt rather than graceful. It explains why different model families, trained on similar data with similar scale, still produce meaningfully different outputs. They have converged to different fixed points in their respective learned spaces.
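The forgetting case can be caricatured with a single parameter. This is a deliberately crude sketch under strong assumptions: one scalar weight, two quadratic "tasks" with targets +1 and -1, and plain gradient descent. Real forgetting involves many shared parameters, and the names below are hypothetical.

```python
def train(theta, target, lr=0.1, steps=200):
    # Gradient descent on the assumed quadratic loss (theta - target)**2
    for _ in range(steps):
        theta -= lr * 2 * (theta - target)
    return theta

def loss(theta, target):
    return (theta - target) ** 2

theta_a = train(0.0, target=1.0)        # "pretraining" on task A
loss_a_before = loss(theta_a, 1.0)

theta_b = train(theta_a, target=-1.0)   # "fine-tuning" on task B only
loss_a_after = loss(theta_b, 1.0)

print(loss_a_before, loss_a_after)
```

After fine-tuning, the weight sits at task B's fixed point and the loss on task A has risen from nearly zero to about 4. The second phase never saw task A, so nothing in its update held the weight near the first fixed point.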
The implications are uncomfortable. If models are converging toward closure rather than truth, then scaling alone cannot solve the alignment problem. A model that has found a stable fixed point in its learned topology will not spontaneously shift toward human values simply because it has more parameters. It will defend that fixed point. Alignment becomes not a matter of scale but of topology—of deliberately constructing training procedures and architectures that make the fixed points we want to reach more attractive than the ones we want to avoid.
This is harder than scaling. It requires understanding the geometry of learned representations, not just their magnitude. It requires intervening in the structure of the loss landscape itself, not merely making it deeper or wider.
The models we have built are not approaching semantic truth. They are finding the fixed points their mathematics guarantees must exist. Until we stop confusing convergence with correctness, we will continue building systems that are increasingly confident in their own topological closure—and increasingly distant from anything resembling understanding.