The Star That Makes Algorithms Work Better
Designing and training optimal neural networks is a very complex task. An important clue to making them more efficient comes from a new paper by Brandon Livio Annesi, Carlo Lucibello, Enrico Malatesta, Gabriele Perugini, and Luca Saglietti of Bocconi's Department of Computing Sciences together with Clarissa Lauditi of the Politecnico di Torino and Fabrizio Pittorino of the Politecnico di Milano and Bocconi Institute for Data Science and Analytics.
Neural networks are called "neural" because they resemble the networks of neurons in the brain. After undergoing a training process during which they adjust their parameters, they can make decisions without being expressly programmed to do so. Those parameters, in turn, can number in the millions or even billions. The quality of a network, roughly speaking, depends on how quickly it performs its tasks and how consistently correct its outputs are.
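To make "adjusting parameters during training" concrete, the sketch below trains the simplest such network, a single perceptron, with the classic perceptron update rule. This is a minimal illustration in Python; the data, sizes, and names are ours, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 points in 20 dimensions, labeled by a hidden "teacher" vector
X = rng.standard_normal((100, 20))
teacher = rng.standard_normal(20)
y = np.sign(X @ teacher)

# The network's parameters: one weight per input dimension
w = np.zeros(20)

# Training: nudge the parameters whenever an example is misclassified
for epoch in range(50):
    for x_i, y_i in zip(X, y):
        if np.sign(x_i @ w) != y_i:
            w += y_i * x_i  # classic perceptron update

print("training accuracy:", np.mean(np.sign(X @ w) == y))
```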
The solution space is the set of all parameter configurations that allow a neural network to perform its task correctly. Even for smaller networks this set is embedded in an enormously high-dimensional configuration space, so researchers want to understand the properties of neural network solution spaces and how these properties can be exploited to improve the training process. One possible approach is to use techniques from statistical physics to study the dynamics of the optimization algorithm as it explores the solution space.
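As a hedged illustration of how the solution space sits inside the far larger configuration space, the sketch below samples random weight vectors on the sphere for a toy perceptron task and counts how many of them classify every example correctly; the Monte Carlo sampling is our own device, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy task: a hidden "teacher" perceptron labels 10 random 20-dimensional inputs
X = rng.standard_normal((10, 20))
teacher = rng.standard_normal(20)
y = np.sign(X @ teacher)

def is_solution(w):
    """A parameter configuration is a solution if it labels every example correctly."""
    return bool(np.all(np.sign(X @ w) == y))

# Monte Carlo estimate: what fraction of configuration space solves the task?
n_samples = 200_000
hits = 0
for _ in range(n_samples):
    w = rng.standard_normal(20)
    w /= np.linalg.norm(w)  # spherical weight constraint, as in the paper's model
    hits += is_solution(w)

print(f"estimated solution-space fraction: {hits / n_samples:.2e}")
```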
The authors have devised a relatively simple neural network model, the spherical negative perceptron, precisely to understand what "shape" its solution space has and what implications that shape has for optimal algorithm design and for more efficient training. A "good" algorithm, quick and resource-efficient, will be able to find solutions that can be connected by simple straight paths, that is, paths lying entirely within the solution space.
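The "straight path" criterion can be tested directly: walk along the segment between two solutions and check whether every intermediate configuration still solves the task. The sketch below is our illustration of that test (the function name and step count are arbitrary); intermediate points are projected back onto the sphere to respect the model's spherical weight constraint.

```python
import numpy as np

def straight_path_in_solution_space(w_a, w_b, X, y, n_steps=200):
    """Return True if the segment from solution w_a to solution w_b
    stays inside the solution space of the task defined by (X, y)."""
    for t in np.linspace(0.0, 1.0, n_steps):
        w = (1.0 - t) * w_a + t * w_b
        w /= np.linalg.norm(w)  # project back onto the sphere
        if not np.all(np.sign(X @ w) == y):
            return False  # the path hit a barrier and left the solution space
    return True
```

Applied to pairs of solutions, for instance those found by the sampler above, this test separates pairs connected by a straight path from pairs divided by a barrier.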
What Malatesta and his colleagues have found is that the solution space for this kind of network is actually shaped like a star, with an inner kernel and a number of outer regions ending in spikes. Solutions in or near the kernel are connected by straight paths to almost all other solutions, but as a solution gets closer to a "spike" it more frequently faces barriers, because straight paths to other solutions tend to fall outside the star-shaped solution space. This is an unprecedented and potentially highly important development, as it tells algorithm designers where in the solution space to look for well-connected, easily reachable solutions.
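In geometric terms this is the textbook notion of a star-shaped set: a set with a nonempty kernel from which every other point of the set can be reached along a straight segment. Stated in symbols (a standard definition; in the spherical model the straight segments are replaced by paths on the sphere):

```latex
S \subseteq \mathbb{R}^N \text{ is star-shaped} \iff
K(S) = \bigl\{\, w_0 \in S : (1-t)\,w_0 + t\,w \in S
\ \ \forall\, w \in S,\ \forall\, t \in [0,1] \,\bigr\} \neq \varnothing
```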
"We really did not expect this outcome, since no one so far had a clear idea of what solution spaces look like," says Enrico Malatesta. "The challenge now is to investigate under what conditions (apart from the number of parameters) the star-shape model holds and if its properties are universal. There is still a lot of work to do."
Brandon Livio Annesi, Clarissa Lauditi, Carlo Lucibello, Enrico M. Malatesta, Gabriele Perugini, Fabrizio Pittorino, Luca Saglietti, "Star-Shaped Space of Solutions of the Spherical Negative Perceptron," Physical Review Letters 131, 227301 (1 December 2023). DOI: https://doi.org/10.1103/PhysRevLett.131.227301