Reservoir computing inverts the conventional wisdom about neural network training. Instead of training a complex recurrent network end-to-end (which is notoriously difficult due to vanishing/exploding gradients), you use a fixed, randomly initialized, high-dimensional dynamical system as your computational substrate - the "reservoir" - and only train a simple linear readout layer on top.
The reservoir does all the complex nonlinear computation. You never touch its weights. The readout layer learns to extract the relevant information from the reservoir's rich dynamical state. Training is as simple as linear regression.
The surprising result: this works extremely well for temporal processing tasks. The reservoir's internal dynamics create an echo of input history in its current state - past inputs leave traces that affect present activations. Hence "Echo State Network" (ESN), the most common formulation.
Independently discovered by Wolfgang Maass and colleagues (Liquid State Machines, 2002) and Herbert Jaeger (Echo State Networks, 2001; later popularized by Jaeger & Haas's 2004 Science paper). Maass's formulation was explicitly inspired by cortical microcircuits.
Only W_out is trained - simple linear regression or ridge regression. W_in and W_res are set randomly and never updated. This makes training trivially fast and avoids all gradient-related problems.
The reservoir is a recurrent network whose state evolves over time according to a fixed rule. The key insight is that the current state encodes a nonlinear function of the entire input history.
The Echo State Property (ESP): the reservoir must "forget" its initial state - its response to any input sequence must become asymptotically independent of the initial conditions. In practice this is ensured by scaling the recurrent weight matrix so that its spectral radius (largest absolute eigenvalue) sits slightly below 1, keeping the dynamics stable but still rich.
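Pulling the last three paragraphs together, here is a minimal NumPy sketch of the ESN recipe - random W_in and W_res (with W_res rescaled to a target spectral radius), a fixed tanh update, and ridge regression for W_out. The sizes, the leak term, and the helper names are illustrative assumptions, not a canonical implementation.

```python
import numpy as np

def make_esn(n_in, n_res, spectral_radius=0.95, seed=0):
    """Randomly initialize input and reservoir weights; these are never trained."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W_res = rng.normal(0.0, 1.0, (n_res, n_res))
    # Rescale so the largest absolute eigenvalue sits just below 1 (ESP heuristic).
    W_res *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W_res)))
    return W_in, W_res

def run_reservoir(W_in, W_res, inputs, leak=1.0):
    """Iterate x_{t+1} = (1-leak)*x_t + leak*tanh(W_in u_{t+1} + W_res x_t)."""
    x = np.zeros(W_res.shape[0])
    states = []
    for u in inputs:                      # inputs: array of shape (T, n_in)
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W_res @ x)
        states.append(x.copy())
    return np.array(states)               # (T, n_res)

def fit_readout(states, targets, ridge=1e-6):
    """The only learning step: ridge regression for W_out."""
    S, Y = states, targets                # (T, n_res), (T, n_out)
    W_out = np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ Y)
    return W_out                          # predictions are states @ W_out
```

In practice you also discard a short "washout" of initial states before fitting, so the readout only ever sees states that have forgotten the arbitrary initial condition - the ESP at work.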
Reservoir computing works best when the reservoir operates near the edge of chaos - the phase transition between ordered (damped) and chaotic (unstable) dynamics. This is not just engineering: it may be how real brains work.
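A quick way to see the ordered/chaotic transition in simulation (a rough sketch, not a proper Lyapunov-exponent estimate): run two copies of the same input-free tanh reservoir from states that differ by a tiny perturbation and check whether the difference shrinks or grows as the spectral radius crosses 1. All parameters below are arbitrary choices for illustration.

```python
import numpy as np

def divergence_after(spectral_radius, steps=200, n_res=300, eps=1e-8, seed=1):
    """Distance between two reservoir trajectories started eps apart (no input)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    x_a = rng.uniform(-1, 1, n_res)
    x_b = x_a + eps * rng.normal(size=n_res)
    for _ in range(steps):
        x_a, x_b = np.tanh(W @ x_a), np.tanh(W @ x_b)
    return np.linalg.norm(x_a - x_b)

for rho in (0.8, 0.95, 1.05, 1.3):
    # Roughly: the perturbation contracts below 1 and can grow above it,
    # though the exact behavior depends on the random draw.
    print(rho, divergence_after(rho))
```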
The criticality hypothesis (Beggs & Plenz 2003) proposes that cortical networks self-organize to a critical state, evidenced by neuronal avalanches - cascades of activity that follow power-law size distributions, a hallmark of criticality in physical systems.
At criticality: information transmission is maximized, dynamic range is maximized, sensitivity to weak inputs is maximized, and the system's "memory" of past inputs is longest. Exactly what you want in a reservoir.
The brain may actively regulate its proximity to criticality via homeostatic plasticity - adjusting synaptic strengths to maintain the right dynamical regime. Seizures may be the brain crossing into supercritical chaos. Depression and anesthesia may reflect subcritical states.
Wolfgang Maass's Liquid State Machine (LSM) is the biologically-grounded version. The "liquid" is a recurrent circuit of spiking neurons (leaky integrate-and-fire or similar). Inputs cause ripples through the liquid - like dropping a stone in water. The state of the liquid at any moment is a complex nonlinear function of all past inputs, fading with time.
The LSM is explicitly modeled on the cortical microcircuit - the canonical six-layer cortical column that is repeated across the neocortex with similar connectivity statistics. The idea: cortex is a reservoir. The basal ganglia and other structures learn to read it out.
Key differences from ESN: spiking neurons, biological synapse models (AMPA, NMDA, GABA), realistic timing. More biologically plausible, harder to tune, harder to analyze mathematically.
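For concreteness, a heavily simplified discrete-time LIF "liquid" might look like the sketch below: plain leaky integrate-and-fire units with sparse random coupling, driven by a toy input, with an exponentially filtered spike trace serving as the readable state. Real LSMs use conductance-based AMPA/NMDA/GABA synapses, distance-dependent connectivity, and carefully tuned parameters; every number here is an illustrative assumption and may need adjusting to get interesting activity.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 200, 1000, 1e-3                 # neurons, steps, step size (s)
tau_m, tau_trace = 0.02, 0.05              # membrane and trace time constants (s)
v_thresh, v_reset = 1.0, 0.0
W = rng.normal(0.0, 2.0, (N, N)) * (rng.random((N, N)) < 0.1)   # sparse recurrence
W_in = rng.normal(0.0, 3.0, N)             # input weights (scale chosen ad hoc)
u = 0.5 * (1 + np.sin(2 * np.pi * 5 * dt * np.arange(T)))       # toy 5 Hz drive

v = np.zeros(N)                            # membrane potentials
trace = np.zeros(N)                        # filtered spikes = "liquid state"
states = np.empty((T, N))
for t in range(T):
    spikes = (v >= v_thresh).astype(float)
    v[spikes > 0] = v_reset                           # reset neurons that spiked
    I = W @ spikes + W_in * u[t]                      # recurrent + input current
    v += (dt / tau_m) * (-v + I)                      # leaky integration
    trace = trace * np.exp(-dt / tau_trace) + spikes  # exponential spike trace
    states[t] = trace

# A linear readout (exactly as in the ESN) would now be fit on `states`.
```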
Reservoir computing excels at temporal tasks where the history of inputs matters - exactly the regime RNNs were designed for but have historically struggled to train on. The toy delay-recall sketch below makes this concrete.
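As a toy illustration of "history matters", reusing the hypothetical make_esn / run_reservoir / fit_readout helpers sketched earlier: train the readout to reproduce the input from k steps in the past. A linear readout on the raw input alone could never do this; it only works because the reservoir state carries an echo of recent inputs.

```python
import numpy as np

k = 10                                        # recall the input from 10 steps ago
rng = np.random.default_rng(2)
u = rng.uniform(-1, 1, (2000, 1))             # random scalar input stream
target = np.roll(u, k, axis=0)                # desired output: u delayed by k steps

W_in, W_res = make_esn(n_in=1, n_res=300)
states = run_reservoir(W_in, W_res, u)

washout = 100                                 # drop early states (arbitrary init)
W_out = fit_readout(states[washout:], target[washout:])
pred = states[washout:] @ W_out

err = np.sqrt(np.mean((pred - target[washout:]) ** 2))
print("delay-%d recall RMSE: %.3f" % (k, err))
```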
The wildest extension: the reservoir doesn't have to be a simulated neural network. Any sufficiently complex physical dynamical system works. This opens physical reservoir computing:
Fernando & Sojakka 2003. Literally a bucket. Input = vibration patterns from motors. Reservoir = water surface waves. Readout = camera + linear weights on pixel values. Successfully performed XOR and speech classification.
Single nonlinear optical node with delayed feedback creates a virtual network of 50-400 nodes. Processes at light speed. Demonstrated wideband signal classification at GHz rates. (A toy emulation of the virtual-node idea follows after these examples.)
Quantum systems as reservoirs. Exponentially large Hilbert space = exponentially rich reservoir states. Active research area. May offer exponential advantage for certain temporal tasks.
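Back to the delay-line idea from the photonic example above: the time-multiplexing trick can be caricatured in a few lines of discrete-time code. Each input sample is spread across N "virtual nodes" by a fixed random mask, and a single nonlinear function applied to masked input plus delayed feedback stands in for the whole network. This ignores the analog node's real dynamics and the coupling between neighboring virtual nodes, so treat it as a cartoon, not a model of any published system.

```python
import numpy as np

def delay_reservoir(u, n_virtual=100, eta=0.5, gamma=0.5, seed=3):
    """One nonlinear node + delayed feedback, unrolled into virtual nodes.

    u: 1-D input sequence. Each sample is multiplied by a fixed random mask of
    length n_virtual; each virtual slot also feeds back its own value from one
    delay-loop earlier. The tanh stands in for the physical nonlinearity.
    """
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=n_virtual)   # fixed input mask
    x = np.zeros(n_virtual)                          # states of the virtual nodes
    states = np.empty((len(u), n_virtual))
    for t, u_t in enumerate(u):
        x = np.tanh(eta * x + gamma * mask * u_t)    # single node, reused N times
        states[t] = x
    return states                                    # feed these to a linear readout
```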
If the cortex is a reservoir and subcortical structures (basal ganglia, cerebellum) are the readout layers - a provocative but not insane hypothesis - then the brain sidesteps the credit assignment problem by never having to propagate credit back through the cortical mass at all.
The cortex projects via learned connections to striatum, which selects actions via dopaminergic reinforcement. The cortical representations are rich, the downstream learning is simple and local.
This maps surprisingly well onto the anatomy. The cortex → striatum projection is massive and mostly one-directional, and dopamine provides the reward signal - exactly the reservoir → readout pattern, with neuromodulatory training of the readout.