Transformer
The neural network architecture that powers almost every modern AI language model.
A transformer is a type of neural network architecture introduced in 2017 that processes all words in a sequence simultaneously rather than one at a time. It uses attention mechanisms to weigh the importance of each word relative to every other word, making it exceptionally good at understanding context and meaning in language.
Older models read sentences like a person reading a book — one word at a time, left to right, trying to remember what came before. A transformer is more like a team of editors reviewing the entire manuscript at once, each one focused on different relationships between words — who does what to whom, which adjective belongs to which noun, what "it" refers to three sentences back.
Transformer doesn't refer to the full AI model — it's the underlying architecture. GPT, Claude, Gemini, and LLaMA are all transformer-based models. Saying "a transformer" is like saying "a combustion engine" — it describes the mechanism, not the car.