



The buzz of a notification or the ping of an email can provoke excitement or fear. In a famous experiment, Ivan Pavlov showed that dogs can be taught to salivate at the sound of a metronome or harmonium. Known as associative learning or reinforcement learning, this connection of cause with effect is central to the way most animals deal with the world.

Since the early 1970s, the dominant theory of what is happening has been that animals learn by trial and error. The association between a cue (the metronome) and a reward (food) is formed as follows. When the cue comes, the animal predicts when the reward will occur. Then it waits to see what arrives. Next it calculates the difference between its prediction and the outcome: the error. Finally, it uses that error estimate to update its expectations and make better predictions in future.
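The four steps above (predict, wait, compute the error, update) can be sketched in a few lines of Python. This is a minimal illustration of error-driven learning in general, not the specific model any of the researchers used. The function name, the `learning_rate` parameter and the reward values are all assumptions made for the example.

```python
def learn_by_prediction_error(rewards, learning_rate=0.1):
    """Return the predicted reward after each trial of a cue-reward pairing.

    `rewards` holds the reward actually received on each trial.
    `learning_rate` (an assumed value) sets how strongly each error
    shifts the next prediction.
    """
    prediction = 0.0  # at first the animal expects nothing after the cue
    history = []
    for reward in rewards:
        error = reward - prediction          # outcome minus prediction
        prediction += learning_rate * error  # nudge the prediction toward the outcome
        history.append(prediction)
    return history


# Fifty trials in which the cue is always followed by one unit of reward.
predictions = learn_by_prediction_error([1.0] * 50)
print(predictions[0], predictions[-1])
```

On each trial the error shrinks, so the prediction climbs toward the true reward and then stops changing much. A reward that is larger than expected gives a positive error, and one that is smaller gives a negative error.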

Belief in this approach was reinforced in the late 20th century by two things. One was the discovery that it is also good at solving engineering problems related to artificial intelligence (AI). Deep neural networks learn by minimizing the error of their predictions.

The other reinforcing observation was a paper published in Science in 1997. It pointed out that fluctuations in brain levels of dopamine, a chemical that carries signals between some neurons and is known to be associated with the experience of reward, look like a prediction-error signal. Dopamine-producing cells become more active when rewards come earlier than expected or are not expected at all, and are inhibited when rewards arrive later or not at all.

A great story about how science works, then. But if a new paper, also published in Science, turns out to be correct, that story is wrong.

Researchers have known for some time that some aspects of dopamine activity are inconsistent with the prediction-error model. These inconsistencies were swept under the carpet, until now. A new study by Huijeong Jeong and Vijay Namboodiri of the University of California, San Francisco, and a team of collaborators has shaken up the world of neuroscience. It proposes a model of associative learning which suggests that researchers have got things backwards. Moreover, the proposal is supported by a series of experiments.

The old model is forward-looking: it associates causes with effects. The new one does just the opposite: it associates effects with causes. The researchers believe that when an animal receives a reward (or punishment), it looks back through its memory to work out what might have caused this event. Dopamine's role in the model is to flag events meaningful enough to act as possible causes of future rewards or punishments.
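The backward-looking idea can be illustrated with a toy sketch in Python. This is not the model proposed in the paper, whose details are not given here. It only shows the direction of the inference: nothing is computed until a reward arrives, and then memory is searched for what came before. The function name, the event labels and the fixed look-back `window` are all assumptions made for the example.

```python
from collections import Counter


def retrospective_associations(events, reward="reward", window=3):
    """Score each cue by the fraction of rewards it preceded.

    `events` is a list of event names in the order they happened.
    `window` (an assumed simplification) is how many earlier events
    the learner inspects each time a reward arrives.
    """
    preceded = Counter()
    n_rewards = 0
    for i, event in enumerate(events):
        if event != reward:
            continue  # no learning happens until a reward arrives
        n_rewards += 1
        # Look back through recent memory for candidate causes.
        recent = set(events[max(0, i - window):i]) - {reward}
        preceded.update(recent)
    if n_rewards == 0:
        return {}
    return {cue: count / n_rewards for cue, count in preceded.items()}


# A hypothetical stream of experience: the buzzer precedes every reward.
stream = ["buzzer", "reward", "light", "noise", "buzzer", "reward",
          "light", "buzzer", "reward"]
scores = retrospective_associations(stream, window=3)
print(scores)
```

In this stream the buzzer comes before all three rewards and so scores highest, while the light and the noise score lower. The learner never has to decide in advance which cues to track or how far ahead to predict.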

Looking at things this way addresses two issues that have always plagued older models. One is sensitivity to the timescale. Another is ease of computation.

The timescale problem is that cause and effect may be separated by milliseconds (switching on a light bulb and seeing the light), minutes (having a drink and feeling tipsy) or even hours (eating something bad and getting food poisoning). Looking back, Dr. Namboodiri explains, allows an arbitrarily long list of possible causes to be explored. Looking ahead, without always knowing in advance how far to look, is much trickier.

This leads to the second problem. Sensory experience is rich, and everything in it can potentially predict an outcome. Making predictions based on every possible cue is somewhere between difficult and impossible. When a meaningful event occurs, it is much easier to look back through other meaningful events to identify the cause.

In practice, however, it is difficult to distinguish between the two models. And that is especially true when nobody bothers to look, which, until now, nobody had. Dr. Jeong and Dr. Namboodiri did. They devised and conducted 11 experiments, designed specifically for the purpose, involving mice, buzzers and drops of sugar solution. During these they measured, in real time, the amount of dopamine released in the nucleus accumbens. All of the experiments favored the new model.

The 180-degree flip from forward-looking to backward-looking thinking that these experiments imply has caused quite a stir in the world of neuroscience. Witten says it is thought-provoking and represents an exciting new direction.

Further experiments are needed to confirm the new findings. But confirmation would have ramifications beyond neuroscience. It would suggest that the way AI works does not, as is currently argued, have even the slightest connection with how the brain works, but was actually a lucky guess.

But it may also suggest better ways to do AI. Dr. Namboodiri thinks so and is exploring the possibilities. Evolution has had hundreds of millions of years to optimize the learning process. So learning from nature is rarely a bad idea.

