Publication: Structuring Incentives in the Development and Use of Artificial Intelligence
Abstract
The success of machine learning (ML) has, in addition to its direct impacts, significantly reshaped the incentives of the humans developing, using, and being targeted by ML systems. This dissertation explores technical methods for identifying and reshaping these incentives so that they serve the interests of users and society. The first two chapters examine the incentives created by narrow ML systems that make consequential decisions about users. Chapter 1 provides a method for identifying the incentives produced by black-box models like neural networks. Chapter 2 provides methods for selecting linear regression rules in the presence of decision-recipients who will actively game the chosen rule, and whose true outcomes may change in turn. The last two chapters concern the incentives for companies and governments to develop and misuse powerful general-purpose AI systems. They cover technical approaches for identifying misconduct by ML developers, in order to strengthen those developers' incentives for honesty. Chapter 3 proposes a framework for international inspectors to verify a company or government’s large-scale ML development via data center hardware inspections, inspired by the model of the IAEA. Chapter 4 examines how an auditor could efficiently verify which training data was used to produce a large ML model, and provides an efficient method that works on current open-source large language models.