Publication: Beyond FICO: Default Prediction and Optimal Lending Strategies in Online P2P Investing
Abstract
The rise of online peer-to-peer (P2P) lending has produced a large community of investors who disagree widely on the best practices for maximizing returns. Despite this, no cohesive system for analyzing the merits of various lending strategies has yet taken shape. Using publicly available data from LendingClub, this thesis proposes a framework for analyzing investments using modern portfolio theory and expected value, fundamentally viewing an investment decision as the purchase of a portfolio of assets held until maturity. Doing so requires a measure of default probability, which leads to the second major aim of the project: analyzing the drivers of default in online lending. We analyze defaults by separating the features of the dataset into categories and sets and then finding the ranked optimal subset of each feature set using several machine learning techniques, which refines our understanding of the interplay between the features. Specifically, we aim to understand more clearly the effects of the FICO Score and the LendingClub-assigned subgrade. Our results show that while the LendingClub-assigned subgrade is the most valuable feature in the dataset for predicting default, FICO Scores provide limited marginal predictive value over other groups of features. In our analysis of lending strategies, we find that forgoing both ultra-safe and ultra-risky loans in favor of moderate classes, combined with effective filtering, provides the highest returns on a risk-adjusted basis.
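To make the link between default probability and expected value concrete, the following is a minimal sketch and not the thesis's actual model. It assumes a single-period loan that either repays in full or defaults, a fixed recovery fraction, and independent defaults across loans. The function names, the example loans, and their rates and probabilities are all hypothetical.

```python
from math import sqrt


def loan_return_stats(rate, p_default, recovery=0.0):
    """Mean and variance of a one-period loan return (illustrative assumption).

    With probability 1 - p_default the loan repays and returns `rate`;
    with probability p_default it defaults and returns `recovery - 1`.
    """
    r_repaid = rate
    r_default = recovery - 1.0
    mean = (1 - p_default) * r_repaid + p_default * r_default
    # Variance of a two-outcome (Bernoulli) return.
    var = p_default * (1 - p_default) * (r_repaid - r_default) ** 2
    return mean, var


def portfolio_stats(loans, weights):
    """Expected return and standard deviation of a weighted loan portfolio.

    Assumes defaults are independent, so variances add with squared weights.
    """
    stats = [loan_return_stats(*loan) for loan in loans]
    mean = sum(w * m for w, (m, _) in zip(weights, stats))
    var = sum(w ** 2 * v for w, (_, v) in zip(weights, stats))
    return mean, sqrt(var)


# Hypothetical loans: (interest rate, default probability, recovery fraction)
loans = [(0.07, 0.02, 0.0), (0.13, 0.08, 0.0), (0.22, 0.20, 0.0)]
weights = [1 / 3, 1 / 3, 1 / 3]

mean, std = portfolio_stats(loans, weights)
print(f"expected return: {mean:.4f}, std: {std:.4f}")
```

Under these assumptions, a higher interest rate does not guarantee a higher expected return once default risk is priced in, which is the kind of trade-off a risk-adjusted comparison of loan classes has to weigh.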