Joe Wood Notes
AI
Attention Heads
Some notes on how attention heads in a transformer model develop during training, how they are used in the model, and how they are combined to provide the final weights.
January 12, 2025