Date of Award

2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Nicholas Cheney

Abstract

Over the last decade, neural networks have powered the emergence of artificial intelligence (AI) as a technology reshaping life on every scale, from everyday work and leisure to global finance and politics. A key driver of neural networks’ present capabilities has been the development of model architectures. In theory, simple designs can represent any desired function, but, in practice, training a model tractably requires designing an appropriate, more complex architecture for the given task. While theoretical explanations exist for appropriate high-level designs, low-level design decisions are typically made through an engineering process of trial and error. Developing a model for a novel application can therefore be a costly process requiring significant expertise, inhibiting the adoption of this technology as a method in broader scientific research.

Neural architecture search (NAS) attempts to overcome this by automating the design of a neural network for given data. Implementing this search in a practical and sustainable manner requires estimating the performance of candidate architectures without fully training each one. One approach trains a single neural network, a supernet, whose weights are shared across all candidates; another uses “zero-shot” measures to estimate performance without any training at all.
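To illustrate the zero-shot style of estimate, the following is a minimal sketch, not a measure studied in this thesis: it assumes PyTorch, a toy search space of MLPs differing only in hidden width, and a gradient-magnitude proxy at initialization standing in for a generic zero-shot score.

import torch
import torch.nn as nn

def make_candidate(width: int) -> nn.Module:
    # Toy search space: two-layer MLPs that differ only in hidden width.
    return nn.Sequential(nn.Linear(32, width), nn.ReLU(), nn.Linear(width, 10))

def zero_shot_score(model: nn.Module, batch: torch.Tensor) -> float:
    # Score an untrained candidate from a single backward pass: here, the
    # total gradient magnitude at initialization (one common style of proxy).
    model.zero_grad()
    model(batch).sum().backward()
    return sum(p.grad.abs().sum().item()
               for p in model.parameters() if p.grad is not None)

batch = torch.randn(8, 32)
scores = {w: zero_shot_score(make_candidate(w), batch) for w in (16, 64, 256)}
print(scores)  # ranks candidates without training any of them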

In the first portion of this thesis, we analyze algorithms for training supernets, developing measures to characterize their behavior. We show that effective training algorithms provide useful performance estimates by specializing the supernet weights to better suit high-performing candidate architectures. In the second portion, we analyze how the final architecture is selected from a trained supernet, explaining the major failure modes of the default approach. By incorporating zero-shot measures into the architecture selection process, we discover better architectures than those found using either the supernet or the zero-shot measures alone. In the final portion, we demonstrate the application of an efficient NAS approach to discover an improved architecture for the cryo-electron microscopy reconstruction pipeline. Through these studies, we illuminate a path toward practical automation of neural network architecture design.
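One way such a combination could work, as a minimal sketch under stated assumptions (the candidate names, scores, and average-rank rule are illustrative, not the selection method developed in this thesis):

def rank(values):
    # Map each candidate to its rank under one measure (0 = best).
    order = sorted(values, key=values.get, reverse=True)
    return {cand: i for i, cand in enumerate(order)}

def select(supernet_est, zero_shot_est):
    # Combine the two estimators by average rank rather than raw values,
    # so neither estimator's scale dominates the selection.
    r1, r2 = rank(supernet_est), rank(zero_shot_est)
    return min(supernet_est, key=lambda c: r1[c] + r2[c])

supernet_est = {"arch_a": 0.71, "arch_b": 0.74, "arch_c": 0.73}
zero_shot_est = {"arch_a": 12.0, "arch_b": 9.5, "arch_c": 14.1}
print(select(supernet_est, zero_shot_est))  # "arch_c" wins on combined rank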

Language

en

Number of Pages

169 p.

Available for download on Friday, May 15, 2026
