https://dash.harvard.edu:443
The DASH digital repository system captures, stores, indexes, preserves, and distributes digital research material.
2020-12-05T00:48:18Z
Investigating the Causal Effects of Climate on Lyme Disease Incidence
https://nrs.harvard.edu/1/37366471
Chea, Jacqueline
Climate influences the spread of many infectious diseases, especially vector-borne diseases. For instance, temperature and humidity are crucial to understanding transmission patterns of malaria because of their influence on the abundance of mosquitoes. Lyme disease, an infection transmitted by ticks, is the most commonly reported vector-borne disease in North America. It has been spreading rapidly from two major focal points, in New England and the north-central United States, since it was first discovered in the late 1970s. The Centers for Disease Control and Prevention estimates that over 300,000 people are treated for Lyme disease each year in the United States, with the total cost of Lyme disease testing alone estimated at $492 million. To understand how climatic conditions affect the spread of Lyme disease, we performed causal analysis on time series data on weather, tick populations in Maine, and Lyme disease incidence in New England states using a recently developed method called convergent cross mapping (CCM). CCM is based on nonlinear state space reconstruction, and tests whether X causally influences Y by determining how well the historical record of Y values can estimate X. CCM analysis supports the use of some predictors drawn from previous works, such as annual minimum temperature, while casting doubt on others, such as the Palmer Hydrological Drought Index. Surprisingly, CCM suggests the value of climatic variables never before considered in modelling, such as wind speed.
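The cross-mapping step described above can be illustrated with a minimal sketch. It is not the thesis's code or data: the function names, the embedding dimension `E`, the lag `tau`, and the coupled logistic maps used as a toy system are all assumptions chosen for illustration. The sketch builds a time-delay embedding of Y, estimates X from the nearest neighbours on that embedding, and reports the correlation between the estimate and the true X.

```python
import math

def embed(series, E, tau):
    """Time-delay embedding: one row (x[t], x[t-tau], ..., x[t-(E-1)*tau]) per valid t."""
    start = (E - 1) * tau
    return [[series[t - j * tau] for j in range(E)] for t in range(start, len(series))]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def cross_map_skill(source, target, E=2, tau=1):
    """Estimate `target` from the delay embedding of `source`.

    Returns the correlation between the estimates and the actual values.
    Assumes both series have the same length.
    """
    start = (E - 1) * tau
    manifold = embed(source, E, tau)
    actual = target[start:]
    estimates = []
    for i, point in enumerate(manifold):
        # E + 1 nearest neighbours of this point, excluding the point itself
        neighbours = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(manifold) if j != i
        )[:E + 1]
        d_min = max(neighbours[0][0], 1e-12)  # guard against a zero distance
        weights = [math.exp(-d / d_min) for d, _ in neighbours]
        estimates.append(
            sum(w * actual[j] for w, (_, j) in zip(weights, neighbours)) / sum(weights)
        )
    return pearson(estimates, actual)

def coupled_logistic(n, x0=0.4, y0=0.2):
    """Toy system (an assumption, not thesis data): x drives y more strongly than y drives x."""
    x, y = [x0], [y0]
    for _ in range(n - 1):
        x_next = x[-1] * (3.8 - 3.8 * x[-1] - 0.02 * y[-1])
        y_next = y[-1] * (3.5 - 3.5 * y[-1] - 0.1 * x[-1])
        x.append(x_next)
        y.append(y_next)
    return x, y

x, y = coupled_logistic(400)
# To probe "x influences y", estimate x from the history of y at growing library sizes.
for library_size in (50, 100, 200, 400):
    skill = cross_map_skill(y[:library_size], x[:library_size])
    print(library_size, round(skill, 3))
```

In CCM, evidence for a causal link comes from the skill improving and levelling off as the library grows, not from a single correlation value, which is why the usage loop evaluates several library sizes.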
2020-12-03T05:00:00Z
Predicting Mood in College Students: Developing a Predictive Model From Multivariate Time Series
https://nrs.harvard.edu/1/37366470
Aguilar, Marianne
Sleep diaries often collect useful information regarding students' sleep duration, timing, moods, and relevant daytime activities. The abundance of data in these multivariate time series provides a basis for predicting end-of-month results. In particular, end-of-month moods are interesting to predict since they can be indicators of larger health problems, such as depression or anxiety. This paper models the clusters students fall into based on sleep variables, along with the time-dependent network that contributes to end-of-month mood ratings, in order to find important variables on certain days to target for treatment. It concludes that, depending on the cluster a student falls into, wake time, first event timing, or biological determinants are most important in predicting 28th-day moods.
2020-12-03T05:00:00Z
A Fork's Impact: The Reach of Mission-Driven Fine Dining
https://nrs.harvard.edu/1/37366469
Mondavi, Lia Jean
This thesis reviews the efforts of mission-driven fine dining restaurants to encourage shifts in our food system and food culture. Through an analysis of their dishes and a proposed modified SIR model for tracking how ideas spread through the general population, I highlight some of the goals of these restaurants and their effectiveness. Ultimately, more accurate data sources are needed to apply the modified SIR model in a robust manner, but I do show that it can be used to track ideas originating from these restaurants.
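For readers unfamiliar with SIR models, the following is a minimal sketch of the standard (unmodified) SIR dynamics, read here as idea spreading. The abstract does not describe the thesis's modifications, so none are reproduced. The function name, the parameter values, and the reading of the compartments (S: not yet exposed to the idea, I: actively spreading it, R: no longer spreading it) are assumptions for illustration only.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the standard SIR equations with forward Euler steps.

    dS/dt = -beta * S * I
    dI/dt =  beta * S * I - gamma * I
    dR/dt =  gamma * I
    S, I, R are population fractions; beta and gamma are hypothetical rates.
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = s0, i0, r0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        newly_spreading = beta * s * i * dt  # S -> I during this step
        newly_stopped = gamma * i * dt       # I -> R during this step
        s -= newly_spreading
        i += newly_spreading - newly_stopped
        r += newly_stopped
        history.append((k * dt, s, i, r))
    return history

# Hypothetical scenario: 1% of the population starts out spreading the idea.
trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, days=60)
peak_time, _, peak_spreaders, _ = max(trajectory, key=lambda row: row[2])
print("peak share of active spreaders:", round(peak_spreaders, 3), "at day", round(peak_time, 1))
```

Fitting `beta` and `gamma` to real observations is the step that depends on data quality, which is the limitation the abstract raises.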
2020-12-03T05:00:00Z
Using Machine Learning to Predict Future Points in the NHL
https://nrs.harvard.edu/1/37366468
Matsuzawa, Takehiro
Goal creation in ice hockey is complicated and difficult to understand. Unlike in baseball, the players and the puck are always moving, and a goal is produced by a combination of five players.
Recently, the NHL has published new data about each game. As statistical tools such as neural networks, k-nearest neighbors, and random forest regression become available, it becomes possible to analyze goals and assists more holistically with a wide range of statistical methods.
The aim of this study was to predict each player's average number of points in the next 5 games from the statistics of the individual player, his team, and his opponents in the previous 10 games. Through this study, I identified variables that are important for predicting the number of points. The random forest regression predicted the average number of points in the next 5 games with a mean squared error of 0.0675. This is about a 70% improvement over the mean squared error of a baseline model that predicts that every player gets the average number of points in the next 5 games.
This paper starts with an exploration of relevant sports statistics and their history, then turns to related work by other researchers (Chapter 1). Next, the paper explains the machine learning algorithms used in this research, namely neural network regression, random forest regression, and k-nearest neighbors regression (Chapter 2). The paper then seeks to minimize prediction errors and find significant features with these methods (Chapter 3). Finally, the paper summarizes the main findings (Chapter 4).
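The comparison against a mean-predicting baseline can be sketched as follows. This is an illustration under stated assumptions, not the thesis's pipeline: it uses a small k-nearest neighbors regressor (one of the methods named above) in place of the random forest, the feature columns are hypothetical, and the numbers are made-up toy values rather than NHL data.

```python
import math

def mse(predictions, actual):
    """Mean squared error between two equal-length sequences."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actual)) / len(actual)

def knn_predict(train_X, train_y, query, k=3):
    """k-nearest neighbors regression: average the targets of the k closest rows."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)

def relative_improvement(model_mse, baseline_mse):
    """Fraction by which the model's MSE undercuts the baseline's MSE."""
    return 1 - model_mse / baseline_mse

# Toy rows (made up): (avg points over previous 10 games, avg shots over previous 10 games)
train_X = [(0.1, 1.0), (0.2, 1.5), (0.4, 2.0), (0.5, 2.5), (0.8, 3.0), (1.0, 3.5)]
train_y = [0.0, 0.2, 0.4, 0.4, 0.8, 1.0]   # avg points over the next 5 games
test_X = [(0.3, 1.8), (0.9, 3.2)]
test_y = [0.2, 1.0]

# Baseline: every player is predicted to score the training-set average.
baseline_value = sum(train_y) / len(train_y)
baseline_mse = mse([baseline_value] * len(test_y), test_y)

model_mse = mse([knn_predict(train_X, train_y, q, k=2) for q in test_X], test_y)
print("baseline MSE:", baseline_mse)
print("model MSE:", model_mse)
print("relative improvement:", relative_improvement(model_mse, baseline_mse))
```

Under this definition, an improvement of about 70% means the model's mean squared error is roughly 30% of the baseline's.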
2017-11-07T05:00:00Z