Publication: Using Machine Learning to Predict Future Points in the NHL
Abstract
Goal creation in ice hockey is complicated and difficult to understand. Unlike in baseball, the players and the puck are always moving, and a goal is produced by a combination of 5 players rather than by one individual.
Recently, the NHL has published new data about each game. With statistical tools such as neural networks, k-nearest neighbors, and random forest regression now available, it has become possible to analyze goals and assists more holistically, using a wide range of statistical methods.
The aim of this study was to predict each player's average number of points over his next 5 games from the statistics of the player, his team, and his opponents over the previous 10 games. Through this study, I also identified the variables that are most important for predicting points. Random forest regression predicted the average number of points in the next 5 games with a mean squared error of 0.0675. This is about a 70% improvement over the mean squared error of a baseline model that predicts the overall average number of points in the next 5 games for every player.
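The prediction setup and the baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used in the study. The helper names (`make_windows`, `mse`, `relative_improvement`), the made-up point totals, and the stand-in "model" are all assumptions. The sketch also assumes the improvement is measured as one minus the ratio of the model's error to the baseline's error.

```python
from statistics import mean


def make_windows(points, history=10, horizon=5):
    """Pair each run of `history` games with the mean points over the next `horizon` games."""
    samples = []
    for start in range(len(points) - history - horizon + 1):
        past = points[start:start + history]
        future = points[start + history:start + history + horizon]
        samples.append((past, mean(future)))
    return samples


def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return mean((p - t) ** 2 for p, t in zip(predictions, targets))


def relative_improvement(model_mse, baseline_mse):
    """Fraction by which the model's error is lower than the baseline's (assumed definition)."""
    return 1.0 - model_mse / baseline_mse


# Made-up per-game point totals for one hypothetical player (not NHL data).
points = [0, 1, 0, 2, 0, 0, 1, 1, 0, 3, 0, 1, 0, 0, 2, 1, 0, 0, 1, 0]
samples = make_windows(points)
targets = [target for _, target in samples]

# Baseline: predict the same overall average for every sample.
# For brevity the average is taken over these same samples, not a separate training set.
baseline_mse = mse([mean(targets)] * len(targets), targets)

# Stand-in "model": the player's mean over the previous 10 games.
# The study uses random forest regression on many more features.
model_mse = mse([mean(past) for past, _ in samples], targets)

print(f"baseline MSE: {baseline_mse:.4f}, stand-in model MSE: {model_mse:.4f}")
```

Under the assumed definition of improvement, the reported figures (a mean squared error of 0.0675 and an improvement of about 70%) would imply a baseline mean squared error of roughly 0.0675 / (1 - 0.70) = 0.225. That value is inferred here and is not stated in the abstract.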
This paper starts with an exploration of relevant sports statistics and their history, and then reviews related work by other researchers (Chapter 1). Next, it explains the machine learning algorithms used in this research: neural network regression, random forest regression, and k-nearest neighbors regression (Chapter 2). It then applies these methods to minimize prediction error and to find significant features (Chapter 3). Finally, it summarizes the main findings (Chapter 4).