Dopamine encodes deep network teaching signals for individual learning trajectories.
Becoming an expert in a task, such as interpretation of x-ray images, requires extensive training. Different individuals may progress through the corresponding learning process in different ways, e.g. by starting to recognize different features of x-ray images at different times during training. This paper shows why such diversity of learning trajectories arises, and how the responses of neurons involved in learning evolve over extended training.
Striatal dopamine plays fundamental roles in fine-tuning learned decisions. However, when learning from naive to expert, individuals often exhibit diverse learning trajectories, defying understanding of the underlying dopaminergic mechanisms. Here, we longitudinally measure and manipulate dorsal striatal dopamine signals in mice learning a decision task from naive to expert. The mice's learning trajectories transitioned through sequences of strategies, showing substantial individual diversity. Remarkably, the transitions were systematic; each mouse's early strategy determined its strategy weeks later. Dopamine signals reflected the strategies each animal transitioned through, encoding a subset of stimulus-choice associations. Optogenetic manipulations selectively updated these associations, leading to learning effects distinct from those of reward. A deep neural network using heterogeneous teaching signals, each updating a subset of network association weights, captured our results. Analyzing the model's fixed points explained learning diversity and systematicity. Altogether, this work provides insights into the biological and mathematical principles underlying individual long-term learning trajectories.

2025. Cell (e-Pub ahead of print).
2024. Cell Rep, 43(4):114080.