A vector is just a list of numbers. That's it. In 2D space, two numbers place a point on a grid. In 3D, three numbers. A language model works with vectors that have hundreds or thousands of numbers -- but the idea is the same.
Each number is a dimension. You can think of each dimension as a dial, and the combination of all the dial settings together describes a unique position in space. That position is the concept.
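A minimal sketch of the idea in numpy (the sizes are illustrative; real models pick their own dimension):

```python
import numpy as np

point_2d = np.array([3.0, 4.0])        # two numbers place a point on a grid
point_3d = np.array([3.0, 4.0, 5.0])   # three numbers place it in 3D
concept = np.random.randn(4096)        # a model-scale vector: 4096 "dials"

print(point_2d.shape, concept.shape)   # (2,) (4096,)
```

Same object either way -- just a list of numbers. Only the count changes.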
Vectors also have direction and magnitude. The direction (the angle it points) carries meaning. The length tells you how strongly something is expressed. Two vectors pointing the same direction represent similar things. Pointing opposite? Opposites.
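Direction is usually compared with cosine similarity, and magnitude with the vector norm. A small sketch:

```python
import numpy as np

def cosine_similarity(a, b):
    # Direction comparison: +1 = same direction, -1 = opposite, 0 = unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0])
b = np.array([2.0, 4.0])    # same direction, twice the magnitude
c = np.array([-1.0, -2.0])  # opposite direction

print(np.linalg.norm(b))          # magnitude: how strongly it's expressed
print(cosine_similarity(a, b))    # 1.0 -- similar things
print(cosine_similarity(a, c))    # -1.0 -- opposites
```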
The process has two steps. First, text gets tokenized -- chopped into chunks (tokens). These might be whole words, parts of words, or single characters. Each token gets a unique integer ID. Nothing semantic yet, just an index.
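A toy sketch of that first step. Real tokenizers (e.g. BPE) learn their chunks from data; the token strings and IDs below are invented for illustration:

```python
# Invented toy vocabulary -- real tokenizers learn theirs during training.
vocab = {"un": 11, "believ": 57, "able": 23, "!": 4}

def tokenize(text):
    # Greedy longest-match split against the toy vocabulary.
    tokens = []
    while text:
        for piece in sorted(vocab, key=len, reverse=True):
            if text.startswith(piece):
                tokens.append(vocab[piece])
                text = text[len(piece):]
                break
        else:
            raise ValueError("no matching token")
    return tokens

print(tokenize("unbelievable!"))  # [11, 57, 23, 4] -- just indices, no meaning
```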
Then each token ID gets looked up in an embedding table -- a giant matrix where each row is a learned vector. Row 4291 might be the vector for "king". That vector was learned during training by adjusting those numbers until the model got good at predicting text.
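The lookup itself is just row indexing. A sketch with a random table and toy sizes (a real table's values are learned, not random):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10000, 64              # toy sizes; real models are far larger
embedding_table = rng.standard_normal((vocab_size, dim))  # random stand-in

token_id = 4291                          # e.g. the ID the tokenizer assigned
vector = embedding_table[token_id]       # lookup = grab one row of the matrix

print(vector.shape)  # (64,)
```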
The genius is that this training process forces geometry to emerge. The model never gets told "make 'dog' and 'cat' close together." It just learns to predict text, and the most efficient way to do that turns out to encode semantic relationships as spatial proximity. The structure is an emergent side effect of compression under pressure.
When a language model learns from text, it encodes concepts as directions in space. Not metaphorically -- literally. Each word or concept gets a vector: a list of thousands of numbers that places it at a specific point in a high-dimensional space.
The wild part: the relationships between concepts are preserved as geometric relationships. Gender isn't stored as a rule. It's just a direction. A consistent arrow you can add or subtract.
This is the Linear Representation Hypothesis -- the idea that semantic relationships map cleanly to vector arithmetic. It mostly works. But it's probably just the surface of something deeper.
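The classic illustration is king − man + woman ≈ queen. A hand-built 2D sketch (real embeddings are learned and thousands of dimensions, but the geometry works the same way):

```python
import numpy as np

# Toy space built by hand: axis 0 = "royalty", axis 1 = "gender"
# (+1 masculine, -1 feminine). These coordinates are invented.
man   = np.array([0.0,  1.0])
woman = np.array([0.0, -1.0])
king  = np.array([1.0,  1.0])
queen = np.array([1.0, -1.0])

result = king - man + woman   # subtract the masculine arrow, add the feminine
print(result)                 # [ 1. -1.] -- lands exactly on "queen"
```

In learned embeddings the match is approximate rather than exact, which is part of why the hypothesis is "mostly works" rather than a law.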
Here's the problem: a model with 4096 dimensions can only have 4096 truly orthogonal (non-interfering) directions. But language has millions of concepts. How do they all fit?
The answer is superposition -- features get packed at angles that are almost orthogonal, like cramming too many vectors into a space where they slightly overlap. The geometry that emerges looks like polytopes and sphere-packing.
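You can see the slack directly: in high dimensions, random directions are already almost orthogonal, so far more than d of them fit with only small overlap. A sketch (sizes chosen to keep it fast):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 5000                         # 5000 "features" in a 512-dim space
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # normalize to unit vectors

sample = V[:200]                          # check a subset of pairs
overlaps = np.abs(sample @ sample.T)      # |cosine| between every pair
np.fill_diagonal(overlaps, 0)
print(overlaps.max())                     # small but nonzero -- near-orthogonal
```

Superposition goes further than random packing -- the model arranges features deliberately -- but this is the geometric headroom it exploits.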
For small numbers of features, the optimal packing isn't random -- it produces pentagons, tetrahedra, icosahedra. In cases like the pentagon and the icosahedron, the angles are tied to the golden ratio φ. The model discovers these configurations automatically because they minimize cross-feature interference.
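The pentagon case is easy to check by hand. Five features packed into two dimensions spread out evenly, and the overlap between neighbors is cos(72°) = 1/(2φ):

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2                 # the golden ratio
angles = 2 * np.pi * np.arange(5) / 5      # five evenly spaced directions
V = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # pentagon of unit vectors

overlaps = V @ V.T                         # pairwise cosines
print(overlaps[0, 1])                      # neighbor interference: cos(72°)
print(np.isclose(np.cos(2 * np.pi / 5), 1 / (2 * phi)))  # True
```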
Not every concept is linear. Some things wrap around. The days of the week form a cycle. Musical pitch repeats at the octave. The months loop back. Hue circles around the color wheel.
For these, the model can't use a line -- it uses a ring. Topologically S¹. Researchers have found evidence that models actually do this: temporal and periodic concepts cluster on circular manifolds embedded in the high-dimensional space.
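A sketch of why the ring works where a line fails, using the week as the example:

```python
import numpy as np

# Embed the 7 days on a circle (S¹) so the week wraps around.
# Index 0 = Monday ... 6 = Sunday.
days = 7
angles = 2 * np.pi * np.arange(days) / days
ring = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# On a number line, Sunday (6) and Monday (0) are maximally far apart.
# On the ring, they're nearest neighbors.
sun_mon = np.linalg.norm(ring[6] - ring[0])   # Sunday <-> Monday
mon_thu = np.linalg.norm(ring[3] - ring[0])   # Monday <-> Thursday

print(sun_mon < mon_thu)  # True -- the cycle is captured
```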
For music specifically: the chromatic scale seems to form a helix, where pitch increases along the axis and octave equivalence creates the wrap. The same note in different octaves is "close" in a way that a flat line can't capture.
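A hand-built helix shows the effect (the parameterization below is invented for illustration, not extracted from any model):

```python
import numpy as np

def pitch_vec(midi_note, radius=1.0, rise=0.1):
    # Angle encodes the note within the octave (so octaves wrap to the
    # same angle); height encodes absolute pitch along the helix axis.
    angle = 2 * np.pi * (midi_note % 12) / 12
    return np.array([radius * np.cos(angle),
                     radius * np.sin(angle),
                     rise * midi_note])

c4, c5, fs4 = pitch_vec(60), pitch_vec(72), pitch_vec(66)

# C4 and C5 are 12 semitones apart but share an angle; C4 and F#4 are
# only 6 semitones apart but sit on opposite sides of the circle.
print(np.linalg.norm(c4 - c5) < np.linalg.norm(c4 - fs4))  # True
```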
Topological Data Analysis (TDA) lets you ask a brutal question about a cloud of points in high-dimensional space: how many holes does it have? Not visually -- mathematically. Loops, voids, tunnels.
A loop is a "1-cycle" -- you can draw a closed path through the data that can't be shrunk to a point without leaving the data cloud. Researchers have found these in embedding spaces. They correspond to cyclic relationships -- exactly the periodic concepts above.
A void is a "2-cycle" -- an empty interior surrounded by the data, like a bubble. These might correspond to conceptual boundaries, category edges, or things the model refuses to interpolate through.
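A sketch of the intuition behind detecting a 1-cycle: sample points on a circle, connect any pair closer than a threshold, and check that the resulting graph closes into a single loop. (Real TDA uses persistent homology, which runs this construction across all threshold scales at once -- this is just the single-scale version.)

```python
import numpy as np

n = 40
angles = 2 * np.pi * np.arange(n) / n
pts = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # points on a circle

dists = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
eps = 1.5 * dists[0, 1]            # threshold just above the nearest spacing
adj = (dists > 0) & (dists < eps)  # connect points within the threshold

degrees = adj.sum(axis=1)
print(degrees.min(), degrees.max())  # 2 2 -- every point links to both neighbors
# Every vertex has degree 2 and the chain is connected, so the graph is
# one closed loop: a cycle that can't be shrunk without leaving the data.
print(adj.sum() // 2)                # n edges on n vertices
```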
Raw activation space is messy because of superposition. Sparse Autoencoders (SAEs) are a technique for disentangling it -- expanding the compressed representation into a much larger space where features are forced to be sparse: only a few active at a time.
The result is a cleaner feature space where individual directions correspond to interpretable concepts. The geometry of this expanded space is what researchers are now studying for complex structure -- looking for the polytopes, rings, and topological features described above, but in a cleaner signal.
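The shape of the computation can be sketched in a few lines. This is an untrained forward pass with random weights and toy sizes -- a real SAE learns W_enc and W_dec by minimizing reconstruction error plus a sparsity penalty:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_features = 512, 8192          # expand 512 dims into 8192 features

W_enc = rng.standard_normal((d_model, d_features)) * 0.01
b_enc = np.zeros(d_features)
W_dec = rng.standard_normal((d_features, d_model)) * 0.01
b_dec = np.zeros(d_model)

def sae(x):
    f = np.maximum(0, x @ W_enc + b_enc)   # ReLU encoder; training adds an
    x_hat = f @ W_dec + b_dec              # L1 penalty on f to force sparsity
    return f, x_hat

x = rng.standard_normal(d_model)           # a raw, superposed activation
features, reconstruction = sae(x)
print(features.shape, reconstruction.shape)  # (8192,) (512,)
```

After training, each of the 8192 feature directions tends to fire for one interpretable concept, and only a handful are nonzero for any given input.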
This is where the real work is happening right now. The question of whether golden ratio structures, higher-genus surfaces, or other exotic geometry emerge in SAE feature space at scale is genuinely open.