How a subfield of physics led to breakthroughs in AI – and from there to this year’s Nobel Prize

John J. Hopfield and Geoffrey E. Hinton received the Nobel Prize in Physics on October 8, 2024 for their research into machine learning algorithms and neural networks that help computers learn. Their work has been fundamental in developing neural network theories that underlie generative artificial intelligence.

A neural network is a computer model consisting of layers of interconnected neurons. Like the neurons in your brain, these neurons process and transmit information. Each layer receives data, processes it and passes the result on to the next layer. By the end of the series, the network has refined the data into something useful.

While it may seem surprising that Hopfield and Hinton received the physics prize for their contributions to neural networks used in computer science, their work is deeply rooted in the principles of physics, particularly in a subfield called statistical mechanics.

As a computational materials scientist, I was pleased that this area of research was recognized with the award. Hopfield and Hinton’s work has allowed my colleagues and me to study a process called generative learning for materials science, a method that underlies many popular technologies such as ChatGPT.

What is statistical mechanics?

Statistical mechanics is a branch of physics that uses statistical methods to explain the behavior of systems composed of a large number of particles.

Instead of focusing on individual particles, researchers use statistical mechanics to look at the collective behavior of many particles. By seeing how they all work together, researchers can better understand the large-scale macroscopic properties of the system, such as temperature, pressure and magnetization.

For example, physicist Ernst Ising developed a statistical mechanical model of magnetism in the 1920s. The model pictures magnetism as the collective behavior of atomic spins interacting with their neighbors.

In Ising’s model, there are higher and lower energy states for the system, and the material is more likely to be in the lowest energy state.
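
In the simplest textbook form of the model (a standard expression, not taken from this article), each atomic spin takes the value +1 or -1, and the energy of a configuration is lower when neighboring spins agree:

```latex
E = -J \sum_{\langle i,\, j \rangle} s_i s_j , \qquad s_i \in \{+1, -1\},
```

where the sum runs over neighboring pairs of spins and $J > 0$ measures how strongly neighbors prefer to align. The lowest-energy state has every spin pointing the same way, which is the magnetized state.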

A key idea in statistical mechanics is the Boltzmann distribution, which quantifies how likely a given state is. This distribution describes the probability that a system is in a particular state – such as solid, liquid or gas – based on its energy and temperature.
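
In its standard form, the Boltzmann distribution assigns a state $x$ with energy $E(x)$ the probability

```latex
P(x) = \frac{e^{-E(x)/k_B T}}{Z}, \qquad Z = \sum_{x'} e^{-E(x')/k_B T},
```

where $T$ is the temperature and $k_B$ is Boltzmann’s constant. Low-energy states are exponentially more likely, and raising the temperature flattens the distribution, which is why heat favors disordered states.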

Using the Boltzmann distribution, the Ising model accurately predicts the phase transition of a magnet: the temperature at which the material changes from magnetic to non-magnetic.
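
To make the idea concrete, here is a minimal Monte Carlo sketch of the two-dimensional Ising model in Python, using the standard Metropolis rule. The lattice size, temperatures and number of steps are arbitrary demonstration choices, not a reconstruction of Ising’s original calculation.

```python
import numpy as np

def ising_magnetization(n=20, temperature=2.0, steps=200_000, seed=0):
    """Average magnetization per spin of a 2D Ising model, sampled with the
    Metropolis rule (illustrative parameters, units where J = k_B = 1)."""
    rng = np.random.default_rng(seed)
    spins = np.ones((n, n), dtype=int)  # start from the fully magnetized state
    for _ in range(steps):
        i, j = rng.integers(n, size=2)
        # Energy change if spin (i, j) is flipped, with periodic neighbors.
        neighbors = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
                     + spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
        delta_e = 2 * spins[i, j] * neighbors
        # Metropolis rule: always accept moves that lower the energy; accept
        # energy-raising moves with the Boltzmann probability exp(-dE/T).
        if delta_e <= 0 or rng.random() < np.exp(-delta_e / temperature):
            spins[i, j] *= -1
    return abs(spins.mean())

# Below the critical temperature (about 2.27 in these units) the spins stay
# mostly aligned; well above it, thermal fluctuations destroy the order.
for t in (1.5, 3.5):
    print(f"T = {t}: |magnetization| ~ {ising_magnetization(temperature=t):.2f}")
```

The only physics in the sketch is the Boltzmann probability in the acceptance rule; everything else is bookkeeping.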

Phase changes occur at predictable temperatures. Ice melts into water at a specific temperature because the Boltzmann distribution predicts that as the temperature rises, the water molecules become more likely to adopt a disordered (or liquid) state.

In materials, atoms arrange themselves into the specific crystal structures that use the least energy. When it is cold, water molecules freeze into ice crystals in a low-energy state.

Similarly, in biology, proteins fold into low-energy shapes, allowing them to function as specific antibodies – like a lock and key – that target a virus.

Neural networks and statistical mechanics

The neural networks Hopfield and Hinton developed work on a similar principle: minimizing energy. They use this principle to solve computing problems.

For example, imagine an image made up of pixels where you can only see part of the image. Some pixels are visible, while the rest are hidden. To determine what the image shows, you would consider all the possible ways the hidden pixels could fill in the visible parts, and then pick the arrangements that statistical mechanics says are the most likely ones.

[Figure: In statistical mechanics, researchers look for the most stable, lowest-energy structure of a material; neural networks use the same principle to solve complex computing problems, such as completing a partly hidden image of a tree. Veera Sundararaghavan]

Hopfield and Hinton developed a theory of neural networks based on the ideas of statistical mechanics. Just as Ising modeled the collective interactions of atomic spins, Hopfield and Hinton proposed collective interactions of pixels to solve the image-completion problem with a neural network. They represented these pixels as neurons.

As in statistical physics, the energy of an image refers to how likely a particular configuration of pixels is. A Hopfield network would solve this problem by finding the lowest energy arrangements of hidden pixels.
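
Here is a minimal sketch of that idea in Python, under simplifying assumptions (a single stored 25-pixel pattern, Hebbian connection weights and a plain update rule): scramble some of the pixels and let the network update its neurons until it settles into a low-energy state that restores the stored image.

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny "image": 25 pixels, each either +1 (on) or -1 (off).
pattern = rng.choice([-1, 1], size=25)

# Hebbian rule: pixels that are "on together" get a positive connection.
weights = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(weights, 0.0)

def energy(state):
    """Hopfield energy: lowest when the state matches the stored pattern."""
    return -0.5 * state @ weights @ state

# Hide roughly half the pixels by scrambling them.
state = pattern.copy()
state[12:] = rng.choice([-1, 1], size=13)
print("energy before:", energy(state))

# Asynchronous updates: set each neuron to whichever value lowers the energy.
for _ in range(5):
    for i in rng.permutation(25):
        state[i] = 1 if weights[i] @ state >= 0 else -1

print("energy after: ", energy(state))
print("stored pattern recovered:", bool(np.array_equal(state, pattern)))
```

In this toy example the connection weights were written down by hand from a single stored pattern; real networks have to learn them.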

But unlike statistical mechanics – where energy is determined by known atomic interactions – neural networks learn these energies from data.

Hinton popularized the development of a technique called backpropagation. This technique helps the model learn the interaction energies between neurons, and the algorithm underlies much of modern AI learning.
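
As a rough illustration of what backpropagation does (a toy curve-fitting example, not Hinton’s original formulation and not an energy-based model), the sketch below pushes the prediction error backwards through a small network with the chain rule and nudges every weight downhill.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = x^2 on a handful of points (illustrative only).
x = np.linspace(-1, 1, 20).reshape(-1, 1)
y = x ** 2

# A small network: 1 input -> 8 hidden neurons (tanh) -> 1 output.
w1 = rng.normal(scale=0.5, size=(1, 8))
b1 = np.zeros(8)
w2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

learning_rate = 0.1
for step in range(2000):
    # Forward pass: compute the prediction layer by layer.
    hidden = np.tanh(x @ w1 + b1)
    prediction = hidden @ w2 + b2
    error = prediction - y
    loss = (error ** 2).mean()

    # Backward pass: the chain rule sends the error back through each layer,
    # giving the gradient of the loss with respect to every weight.
    grad_prediction = 2 * error / len(x)
    grad_w2 = hidden.T @ grad_prediction
    grad_b2 = grad_prediction.sum(axis=0)
    grad_hidden = grad_prediction @ w2.T * (1 - hidden ** 2)  # tanh derivative
    grad_w1 = x.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0)

    # Gradient descent: nudge every weight downhill in the loss.
    w1 -= learning_rate * grad_w1
    b1 -= learning_rate * grad_b1
    w2 -= learning_rate * grad_w2
    b2 -= learning_rate * grad_b2

    if step % 500 == 0:
        print(f"step {step}: loss = {loss:.4f}")
```

The printed loss shrinks as training proceeds; modern AI frameworks automate exactly this bookkeeping for networks with billions of weights.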

The Boltzmann machine

Building on Hopfield’s work, Hinton envisioned another neural network, the Boltzmann machine. It consists of visible neurons, which we can observe, and hidden neurons, which help the network learn complex patterns.

A Boltzmann machine assigns a probability to each possible image. To calculate the probability that the visible pixels are in a specific arrangement, you add up the probabilities over all the possible states the hidden neurons could be in.
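
As a concrete illustration, here is that calculation done by brute force for a toy machine with made-up weights. It uses the restricted variant, in which connections run only between visible and hidden neurons; that choice, and every number in it, is an assumption for demonstration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 3, 2
W = rng.normal(scale=0.5, size=(n_visible, n_hidden))  # made-up weights
a = np.zeros(n_visible)  # visible biases
b = np.zeros(n_hidden)   # hidden biases

def binary_states(n):
    """All 2**n on/off patterns of length n."""
    return [np.array(s) for s in itertools.product([0, 1], repeat=n)]

def energy(v, h):
    """Energy of a joint (visible, hidden) configuration."""
    return -(v @ W @ h + a @ v + b @ h)

# Normalizing constant: sum of Boltzmann weights over ALL joint states.
Z = sum(np.exp(-energy(v, h))
        for v in binary_states(n_visible) for h in binary_states(n_hidden))

def probability_visible(v):
    """Marginal probability of a visible pattern: sum over hidden states."""
    return sum(np.exp(-energy(v, h)) for h in binary_states(n_hidden)) / Z

for v in binary_states(n_visible):
    print(v, f"probability = {probability_visible(v):.3f}")

# The probabilities of all eight visible patterns add up to 1.
print("total:", round(sum(probability_visible(v) for v in binary_states(n_visible)), 6))
```

Real Boltzmann machines have far too many hidden states to enumerate like this, which is why training and sampling rely on approximations.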

My group has been working on implementing Boltzmann machines on quantum computers for generative learning.

In generative learning, the network learns to generate new data samples that resemble the data the researchers gave the network to train it. For example, it can generate new images of handwritten numbers after being trained on similar images. The network can generate these by sampling from the learned probability distribution.
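
One common way to draw such samples is Gibbs sampling, which alternates between the hidden and visible layers. The sketch below applies it to the same kind of toy restricted Boltzmann machine as above, with made-up weights, so the “generated” patterns are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy restricted Boltzmann machine with made-up weights.
W = rng.normal(scale=0.5, size=(3, 2))
a = np.zeros(3)  # visible biases
b = np.zeros(2)  # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(steps=200):
    """Generate one visible pattern by alternating hidden/visible updates."""
    v = rng.integers(0, 2, size=3)
    for _ in range(steps):
        # Sample the hidden neurons given the visible ones ...
        h = (rng.random(2) < sigmoid(v @ W + b)).astype(int)
        # ... then sample the visible neurons given the hidden ones.
        v = (rng.random(3) < sigmoid(h @ W.T + a)).astype(int)
    return v

# Draw a few samples; the patterns that appear most often are the
# low-energy, high-probability ones under the machine's distribution.
for _ in range(5):
    print(gibbs_sample())
```

A trained machine works the same way, except that its weights have been fitted to real data, so the samples resemble that data.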

Generative learning underpins modern AI – it’s what makes the generation of AI art, videos and text possible.

Hopfield and Hinton have significantly influenced AI research by using tools from statistical physics. Their work draws parallels between how nature determines the physical states of a material and how neural networks predict the likelihood of solutions to complex computer science problems.

This article is republished from The Conversation, an independent nonprofit organization providing facts and trusted analysis to help you understand our complex world. It was written by Veera Sundararaghavan, University of Michigan.

Veera Sundararaghavan receives external funding for research unrelated to the content of this article.
