Swersky, Kevin, Jasper Snoek, and Ryan P. Adams. 2013. "Multi-Task Bayesian Optimization." In Advances in Neural Information Processing Systems 26, edited by C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, 2004-2012. Red Hook, NY: Curran Associates, Inc. Paper given at Neural Information Processing Systems 2013, Lake Tahoe, Nevada, December 5-8, 2013.
Bayesian optimization has recently been proposed as a framework for automatically tuning the hyperparameters of machine learning models and has been shown to yield state-of-the-art performance with impressive ease and efficiency. In this paper, we explore whether it is possible to transfer the knowledge gained from previous optimizations to new tasks in order to find optimal hyperparameter settings more efficiently. Our approach is based on extending multi-task Gaussian processes to the framework of Bayesian optimization. We show that this method significantly speeds up the optimization process when compared to the standard single-task approach. We further propose a straightforward extension of our algorithm in order to jointly minimize the average error across multiple tasks and demonstrate how this can be used to greatly speed up \(k\)-fold cross-validation. Lastly, our most significant contribution is an adaptation of a recently proposed acquisition function, entropy search, to the cost-sensitive and multi-task settings. We demonstrate the utility of this new acquisition function by utilizing a small dataset in order to explore hyperparameter settings for a large dataset. Our algorithm dynamically chooses which dataset to query in order to yield the most information per unit cost.
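The core idea in the abstract, extending a multi-task Gaussian process to Bayesian optimization so that cheap evaluations on one task inform hyperparameter search on another, can be illustrated with a minimal sketch. This is not the authors' implementation: the squared-exponential kernel, the fixed inter-task covariance matrix `B`, expected improvement as the acquisition function, and all function names here are illustrative assumptions (the paper's main contribution is a multi-task entropy-search acquisition, which is more involved).

```python
import numpy as np
from math import erf, sqrt, pi, exp

def icm_kernel(X1, t1, X2, t2, B, ls=0.5):
    # Intrinsic coregionalization: K((x,t),(x',t')) = B[t,t'] * k_SE(x,x')
    sq = (X1[:, None] - X2[None, :]) ** 2
    return B[np.ix_(t1, t2)] * np.exp(-0.5 * sq / ls**2)

def mt_gp_posterior(X, t, y, Xs, ts, B, noise=1e-6):
    # Standard GP conditioning; the task indices are folded into the kernel.
    K = icm_kernel(X, t, X, t, B) + noise * np.eye(len(X))
    Ks = icm_kernel(Xs, ts, X, t, B)
    mu = Ks @ np.linalg.solve(K, y)
    var = np.diag(icm_kernel(Xs, ts, Xs, ts, B) - Ks @ np.linalg.solve(K, Ks.T))
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    # EI for minimization: E[max(best - f, 0)] under the Gaussian posterior.
    s = np.sqrt(var)
    z = (best - mu) / s
    cdf = np.array([0.5 * (1 + erf(zi / sqrt(2))) for zi in z])
    pdf = np.array([exp(-zi**2 / 2) / sqrt(2 * pi) for zi in z])
    return (best - mu) * cdf + s * pdf

# Two correlated 1-D toy tasks: task 0 is a cheap proxy, task 1 the target.
f = lambda x, t: np.sin(3 * x) + (0.1 * x if t == 1 else 0.0)
B = np.array([[1.0, 0.9], [0.9, 1.0]])   # assumed inter-task covariance

# Mostly cheap-task observations plus one observation on the target task.
X = np.array([0.0, 0.5, 1.0, 1.5, 0.8])
t = np.array([0, 0, 0, 0, 1])
y = np.array([f(x, ti) for x, ti in zip(X, t)])

# Score candidate hyperparameter settings on the target task.
Xs = np.linspace(0, 2, 50)
ts = np.ones(50, dtype=int)
mu, var = mt_gp_posterior(X, t, y, Xs, ts, B)
best = y[t == 1].min()
ei = expected_improvement(mu, var, best)
x_next = Xs[np.argmax(ei)]  # next point to evaluate on the target task
```

Because `B[0, 1] > 0`, the four cheap-task observations shrink the posterior variance on the target task, so the acquisition function concentrates on genuinely promising regions rather than re-exploring them.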
PDF is from Prof. Adams's website. It is hard to determine exactly which version of the article this is: there is no pagination consistent with the publisher's citation, nor is there a copyright notice on the file. Prof. Adams stated in an e-mail that this was an open-access proceedings.