Publication:

Odors as “natural language”: sparse neural networks in mammalian olfactory systems and large language models

Date

2025-05-06

Citation

Liu, Bo. 2025. Odors as “Natural Language”: Sparse Neural Networks in Mammalian Olfactory Systems and Large Language Models. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Physics, neuroscience, and artificial intelligence (AI) have a long, intertwined history. In particular, sparse connectivity is both a common feature of neural networks in the brain and a key focus in AI for efficient computation; pruning trained networks to obtain sparse connectivity itself has a long history, partially inspired by neuroscience. This thesis explores sparse neural networks through two linked research topics: one focused on the brain (bilateral alignment in olfactory systems) and the other on AI (pruning large language models for on-device AI assistants).

For the first topic, inspired by the fact that mammals' two nostrils create two cortical neural representations of odors, in Chapter 1 we studied how to construct the inter-hemispheric projections that align these representations. We hypothesized that this construction arises from online learning, since mammals breathe constantly. Using a local Hebbian rule, we found that sparse inter-hemispheric projections suffice for bilateral alignment and discovered an inverse scaling: the more cortical neurons there are, the sparser the projections can be. The local Hebbian rule was also found to approximate the global stochastic gradient descent (SGD) rule because their update vectors align, suggesting that biologically plausible learning rules can approximate global learning rules if they contain the gradient information of the latter.
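As a rough illustration of what update-vector alignment means, the sketch below compares a Hebbian update with an SGD update for a single linear inter-hemispheric projection under a sparse connectivity mask. Everything in it is an assumption made for illustration rather than the model used in the thesis: the linear map, the squared-error loss, the random binary mask, and the names `hebbian_update`, `sgd_update`, and `cosine` are all hypothetical.

```python
import numpy as np

def hebbian_update(x_left, x_right, mask, lr=0.01):
    # Local rule (assumed form): product of post- and pre-synaptic activity,
    # restricted to the connections that exist in the sparse mask.
    return lr * mask * np.outer(x_left, x_right)

def sgd_update(W, x_left, x_right, mask, lr=0.01):
    # Negative gradient of the assumed loss 0.5 * ||x_left - W @ x_right||^2,
    # restricted to the same sparse mask.
    error = x_left - W @ x_right
    return lr * mask * np.outer(error, x_right)

def cosine(a, b):
    # Cosine similarity between two update matrices viewed as flat vectors.
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
n = 50                                            # neurons per hemisphere (illustrative)
mask = (rng.random((n, n)) < 0.2).astype(float)   # roughly 20% of connections kept
x_left = rng.random(n)                            # stand-in odor representation, left cortex
x_right = rng.random(n)                           # stand-in odor representation, right cortex

# With zero weights the error equals x_left, so both rules give the same update.
W_zero = np.zeros((n, n))
cos_at_zero = cosine(hebbian_update(x_left, x_right, mask),
                     sgd_update(W_zero, x_left, x_right, mask))

# With small weights the two updates differ slightly but still point the same way.
W_small = 0.01 * mask * rng.standard_normal((n, n))
cos_small = cosine(hebbian_update(x_left, x_right, mask),
                   sgd_update(W_small, x_left, x_right, mask))

print(cos_at_zero, cos_small)
```

In this toy setting the two updates coincide when the projection weights are zero and drift apart as `W @ x_right` grows relative to `x_left`, which is one simple way to see how alignment can depend on the network parameters.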

The next chapter extends Chapter 1 in four directions: an analysis of the update-vector alignment between the Hebbian and SGD rules and its dependence on the network parameters; a simple theory in which recurrent connections in the olfactory cortex may improve bilateral alignment, inspired by Hopfield networks (associative memory) and similar to the design of Google's Titans model, which combines recurrent neural networks with Transformers; the dynamical properties of Hebbian learning; and, finally, the geometric landscape of Hebbian learning.

A similar inverse scaling has been discovered in the Transformer attention matrices used in large language models (LLMs), which motivated the second topic. Concretely, we pruned pretrained Meta Llama-2 and Llama-3 models to obtain models with fewer parameters for developing on-device AI assistants, explored their sparsity limits, and compared their performance at those limits. We found that more than 50% of the parameters in both models could be pruned, and that Llama-3 produced fewer factual errors at its sparsity limit but required more parameters, presumably because of its training settings and dataset.
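The abstract does not state which pruning criterion was used. As a generic illustration of pruning to a target sparsity, the sketch below applies unstructured magnitude pruning to a single weight matrix; the criterion, the function name `magnitude_prune`, and the toy matrix are assumptions for illustration and not the thesis's procedure.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of entries to remove, in [0, 1).
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(sparsity * weights.size)      # number of entries to zero out
    pruned = weights.copy()
    if k > 0:
        # Flat indices of the k entries with the smallest absolute value.
        smallest = np.argsort(np.abs(weights), axis=None)[:k]
        pruned.flat[smallest] = 0.0
    return pruned

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))           # toy stand-in for one weight matrix
W_half = magnitude_prune(W, 0.5)          # remove 50% of the entries

print(np.count_nonzero(W_half), "of", W.size, "weights kept")
```

A sparsity limit in this sense would be found by increasing `sparsity` until some task metric degrades past a chosen tolerance; that evaluation loop is omitted here because it depends on the model and benchmark.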

In summary, by studying sparsity in both biological and artificial neural networks, this thesis may provide insights into the general bilateral alignment problem in neuroscience (across different modalities and brain regions, such as the frontal cortex, responsible for short-term memory and motor response, and the medial entorhinal cortex, responsible for spatial memory), open the door to interesting theoretical questions, and inspire more efficient AI algorithms or applications.

Keywords

Biophysics, Neurosciences, Artificial intelligence

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
