Publication:

Fidelity, Fairness and Responsibility through the Lens of Sequential Decision Making

Date

2024-01-19

Citation

Sun, He. 2024. Fidelity, Fairness and Responsibility through the Lens of Sequential Decision Making. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Methods of artificial intelligence are becoming increasingly important for supporting robust decision making: deciding how to act on the basis of the right data, learning to act over time while supporting fairness to participants, and helping individuals make better sequential decisions.

This thesis expands in these directions, developing algorithms for enhancing decision-making processes, ensuring fairness in automated decisions, and optimizing user engagement. Motivating settings come from financial time series generation and portfolio optimization, the study of reinforcement learning with fairness constraints in the context of making loans, and the formulation of user engagement optimization in online platforms.

First, I introduce the decision-aware time-series conditional generative adversarial network (DAT-CGAN), a new method for time-series generation that is aware of the way in which the data will be used. In particular, the framework adopts a multi-Wasserstein loss on decision-related quantities and is designed to support decision making. DAT-CGAN uses an overlapped block-sampling approach for sample efficiency. The main results characterize the generalization properties of DAT-CGAN and are applied to financial time series and a multi-period portfolio choice problem. The proposed method demonstrates better training stability and generative quality than GAN-based baselines, on both raw data and decision-related quantities.
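As a rough illustration of what overlapped block sampling can look like, the sketch below cuts a series into fixed-length windows whose start points advance by less than the window length, so one series yields many training blocks. The abstract does not give the details of DAT-CGAN's sampler; the function name `overlapped_blocks`, the `stride` parameter, and the toy return values are assumptions made for illustration only.

```python
from typing import List, Sequence


def overlapped_blocks(
    series: Sequence[float], block_len: int, stride: int = 1
) -> List[List[float]]:
    """Return every window of length block_len, starting every `stride` steps.

    With stride < block_len, consecutive windows share observations
    (hypothetical helper, not taken from the thesis).
    """
    if block_len <= 0 or stride <= 0:
        raise ValueError("block_len and stride must be positive")
    last_start = len(series) - block_len
    return [list(series[i:i + block_len]) for i in range(0, last_start + 1, stride)]


# Toy daily returns (made-up numbers, for illustration only).
returns = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012]

overlapping = overlapped_blocks(returns, block_len=4, stride=1)
disjoint = overlapped_blocks(returns, block_len=4, stride=4)

print(len(overlapping), "overlapping blocks vs", len(disjoint), "disjoint block")
```

On this toy series, a stride of 1 gives three blocks where disjoint blocks give one, which is the sample-efficiency motivation in its simplest form.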

Second, I introduce the study of reinforcement learning (RL) with stepwise fairness constraints, which require group fairness at each time step. This problem is motivated by the increasing use of AI methods in societally important settings, ranging from credit to employment to housing, where fairness in automated decision making is crucial. Moreover, many such settings are dynamic, with populations responding to sequential decision policies. In the case of tabular episodic RL, I provide a learning algorithm with a strong theoretical guarantee in regard to policy optimality and fairness violations. The experimental results also show that the proposed algorithm outperforms strong learning-based baselines.
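To make "group fairness at each time step" concrete, the sketch below measures a fairness gap separately for every step of an episode rather than once over the whole episode. It assumes demographic parity (equal approval rates across groups) as the fairness notion and a tolerance `eps`; the abstract does not say which notion or tolerance the thesis uses, and the names `stepwise_parity_gaps` and `violations` are hypothetical.

```python
from typing import Dict, List, Tuple

# One logged decision: (group label, 1 if the loan was approved else 0).
Decision = Tuple[str, int]


def stepwise_parity_gaps(episode: List[List[Decision]]) -> List[float]:
    """For each time step, return max minus min approval rate across groups."""
    gaps = []
    for step in episode:
        counts: Dict[str, List[int]] = {}  # group -> [approved, total]
        for group, approved in step:
            tally = counts.setdefault(group, [0, 0])
            tally[0] += approved
            tally[1] += 1
        rates = [a / n for a, n in counts.values()]
        gaps.append(max(rates) - min(rates) if rates else 0.0)
    return gaps


def violations(gaps: List[float], eps: float) -> List[bool]:
    """Flag the steps whose gap exceeds the tolerance eps (assumed threshold)."""
    return [g > eps for g in gaps]


# Toy two-step episode (made-up decisions, for illustration only).
episode = [
    [("A", 1), ("A", 0), ("B", 1), ("B", 1)],  # A: 1/2 approved, B: 2/2
    [("A", 1), ("B", 1)],                      # both groups: 1/1
]
gaps = stepwise_parity_gaps(episode)
flags = violations(gaps, eps=0.1)
print(gaps, flags)
```

A constraint of this stepwise form is stricter than an episode-level one, because a large gap at one step cannot be offset by a small gap at another.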

Third, I formulate and solve a learning problem that handles content recommendation while also learning when to recommend that users take a break during a session. User engagement optimization plays a crucial role in online platforms, with platform designers putting great effort into recommending interesting content to attract users. At the same time, blindly pushing users to extend a session can lead to burnout and regret, which is harmful to users’ long-term well-being. In response, many platforms now provide a service that reminds users to take a break. However, the timing of this reminder is typically set manually, which motivates an interest in algorithms that automatically decide when to show it. Technically, I formulate the problem as an optimal stopping problem for a Markov decision process, and give an offline Q-learning based algorithm with a rigorous theoretical guarantee. I demonstrate the effectiveness of the algorithm on click-stream data from an online shopping setting.
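The optimal stopping view can be sketched with a tabular example in which the only choice at each state is to continue the session or to stop and suggest a break. The code below runs plain Q-learning over a fixed set of logged transitions; it is not the thesis's algorithm and carries none of its guarantees. The two states (0 for "fresh", 1 for "tired"), the rewards, and the function names are all assumptions chosen to keep the example small.

```python
from typing import List, Tuple

CONTINUE, STOP = 0, 1
# One logged transition: (state, action, reward, next_state, done).
Transition = Tuple[int, int, float, int, bool]


def offline_q_learning(
    data: List[Transition],
    n_states: int,
    gamma: float = 0.9,
    alpha: float = 0.5,
    sweeps: int = 200,
) -> List[List[float]]:
    """Repeatedly sweep a fixed dataset, applying the Q-learning update."""
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s_next, done in data:
            # Terminal transitions (including every STOP) have no future value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


def break_policy(q: List[List[float]]) -> List[int]:
    """Suggest a break in exactly the states where stopping has higher value."""
    return [STOP if row[STOP] > row[CONTINUE] else CONTINUE for row in q]


# Toy log (made-up rewards): continuing is good when fresh, costly when tired.
log: List[Transition] = [
    (0, CONTINUE, 1.0, 1, False),
    (0, STOP, 0.0, 0, True),
    (1, CONTINUE, -2.0, 1, True),
    (1, STOP, 0.5, 1, True),
]
q = offline_q_learning(log, n_states=2)
policy = break_policy(q)
print(policy)
```

In this toy log the learned rule is to keep recommending content while the user is fresh and to show the reminder once the user is tired, which is the kind of automatically timed reminder the paragraph above motivates.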

Keywords

fairness, generative adversarial network, marketing, optimal stopping, reinforcement learning, time series, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
