Person:

Cox, David

Last Name

Cox

First Name

David

Name

Cox, David

Search Results

  • Publication

    Perceptual Annotation: Measuring Human Vision to Improve Computer Vision

    (Institute of Electrical & Electronics Engineers (IEEE), 2014) Scheirer, Walter; Anthony, Samuel English; Nakayama, Ken; Cox, David

    For many problems in computer vision, human learners are considerably better than machines. Humans possess highly accurate internal recognition and learning mechanisms that are not yet understood, and they frequently have access to more extensive training data through a lifetime of unbiased experience with the visual world. We propose to use visual psychophysics to directly leverage the abilities of human subjects to build better machine learning systems. First, we use an advanced online psychometric testing platform to make new kinds of annotation data available for learning. Second, we develop a technique for harnessing these new kinds of information – “perceptual annotations” – for support vector machines. A key intuition for this approach is that while it may remain infeasible to dramatically increase the amount of data and high-quality labels available for the training of a given system, measuring the exemplar-by-exemplar difficulty and pattern of errors of human annotators can provide important information for regularizing the solution of the system at hand. A case study for the problem of face detection demonstrates that this approach yields state-of-the-art results on the challenging FDDB data set.

  • Publication

    Rats maintain a binocular field centered on the horizon

    (F1000Research, 2013) Meister, Markus; Cox, David

    In this letter, we attempt to correct a potentially serious misperception arising from the paper “Rats maintain an overhead binocular field at the expense of constant fusion”. While the authors repeatedly emphasize that the animal’s binocular field is overhead, the authors’ own data show that the truth is quite different, even orthogonal: the binocular field is in fact centered dead-ahead in front of the animal, tapering to a sliver both above and below the animal. We predict that this paper will be widely cited for something that it does not demonstrate, a concern that is borne out by the paper’s earliest citation.

  • Publication

    Hyperparameter Optimization and Boosting for Classifying Facial Expressions: How good can a “Null” Model be?

    (ICML, 2013) Bergstra, James; Cox, David

    One of the goals of the ICML workshop on representation and learning is to establish benchmark scores for a new data set of labeled facial expressions. This paper presents the performance of a "Null" model consisting of convolutions with random weights, PCA, pooling, normalization, and a linear readout. Our approach focused on hyperparameter optimization rather than novel model components. On the Facial Expression Recognition Challenge held by the Kaggle website, our hyperparameter optimization approach achieved a score of 60% accuracy on the test data. This paper also introduces a new ensemble construction variant that combines hyperparameter optimization with the construction of ensembles. This algorithm constructed an ensemble of four models that scored 65.5% accuracy. These scores rank 12th and 5th respectively among the 56 challenge participants. It is worth noting that our approach was developed prior to the release of the data set, and applied without modification; our strong competition performance suggests that the TPE hyperparameter optimization algorithm and domain expertise encoded in our Null model can generalize to new image classification data sets.

  • Publication

    Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures

    (JMLR, 2013) Bergstra, J.; Yamins, D.; Cox, David

    Many computer vision algorithms depend on configuration settings that are typically hand-tuned in the course of evaluating the algorithm for a particular data set. While such parameter tuning is often presented as being incidental to the algorithm, correctly setting these parameter choices is frequently critical to realizing a method’s full potential. Compounding matters, these parameters often must be re-tuned when the algorithm is applied to a new problem domain, and the tuning process itself often depends on personal experience and intuition in ways that are hard to quantify or describe. Since the performance of a given technique depends on both the fundamental quality of the algorithm and the details of its tuning, it is sometimes difficult to know whether a given technique is genuinely better, or simply better tuned. In this work, we propose a meta-modeling approach to support automated hyperparameter optimization, with the goal of providing practical tools that replace hand-tuning with a reproducible and unbiased optimization process. Our approach is to expose the underlying expression graph of how a performance metric (e.g. classification accuracy on validation examples) is computed from hyperparameters that govern not only how individual processing steps are applied, but even which processing steps are included. A hyperparameter optimization algorithm transforms this graph into a program for optimizing that performance metric. Our approach yields state-of-the-art results on three disparate computer vision problems: a face-matching verification task (LFW), a face identification task (PubFig83) and an object recognition task (CIFAR-10), using a single broad class of feed-forward vision architectures.

  • Publication

    Editorial: What can simple brains teach us about how vision works

    (Frontiers Media S.A., 2015) Zoccolan, Davide; Cox, David; Benucci, Andrea
  • Publication

    Using human brain activity to guide machine learning

    (Nature Publishing Group UK, 2018) Fong, Ruth C.; Scheirer, Walter; Cox, David

    Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of “neurally-weighted” machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.

  • Publication

    Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

    (2016) Lotter, William; Kreiman, Gabriel; Cox, David

    While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning - leveraging unlabeled examples to learn about the structure of a domain - remains a difficult unsolved challenge. Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the visual world. We describe a predictive neural network ("PredNet") architecture that is inspired by the concept of "predictive coding" from the neuroscience literature. These networks learn to predict future frames in a video sequence, with each layer in the network making local predictions and only forwarding deviations from those predictions to subsequent network layers. We show that these networks are able to robustly learn to predict the movement of synthetic (rendered) objects, and that in doing so, the networks learn internal representations that are useful for decoding latent object parameters (e.g. pose) that support object recognition with fewer training views. We also show that these networks can scale to complex natural image streams (car-mounted camera videos), capturing key aspects of both egocentric movement and the movement of objects in the visual scene, and the representation learned in this setting is useful for estimating the steering angle. Altogether, these results suggest that prediction represents a powerful framework for unsupervised learning, allowing for implicit learning of object and scene structure.

  • Publication

    A neural network trained for prediction mimics diverse features of biological neurons and perception

    (Springer Science and Business Media LLC, 2020-04-20) Lotter, William; Kreiman, Gabriel; Cox, David
  • Publication

    Input-aware auto-tuning of compute-bound HPC kernels

    (ACM, 2017-11-12) Tillet, Philippe; Cox, David

    Efficient implementations of HPC applications for parallel architectures generally rely on external software packages (e.g., BLAS, LAPACK, CUDNN). While these libraries provide highly optimized routines for certain characteristics of inputs (e.g., square matrices), they generally do not retain optimal performance across the wide range of problems encountered in practice. In this paper, we present an input-aware auto-tuning framework for matrix multiplications and convolutions, ISAAC, which uses predictive modeling techniques to drive highly parameterized PTX code templates towards not only hardware-, but also application-specific kernels. Numerical experiments on the NVIDIA Maxwell and Pascal architectures show up to 3x performance gains over both cuBLAS and cuDNN after only a few hours of auto-tuning.