Complexity Reduction for Near Real-Time High Dimensional Filtering and Estimation Applied to Biological Signals

Gupta, Manish

Author

Gupta, Manish (Harvard)

ORCID: 0000-0003-1006-6559

Citation

Gupta, Manish. 2016. Complexity Reduction for Near Real-Time High Dimensional Filtering and Estimation Applied to Biological Signals. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

Continuous health monitoring requires real-time processing of physiological signals from wearable sensors using low computational power. Such processing involves identifying an underlying physiological state x from a measured biomedical signal y, where the two are related stochastically: y = f(x; e) (here e is a random variable). Often the state space of x is large and the dimensionality of y is low: if y has dimension N and S is the state space of x, then |S| >> N, since the purpose is to infer a complex physiological state from minimal measurements. This makes real-time inference a challenging task. We present algorithms that address this problem by using lower-dimensional approximations of the state. Our algorithms are based on two techniques often used for state dimensionality reduction: (a) decomposition, where variables can be grouped into smaller sets, and (b) factorization, where variables can be factored into smaller sets. The algorithms are computationally inexpensive and permit online application. We demonstrate their use in dimensionality reduction by successfully solving two complex real-world problems in medicine and public safety.

Motivated originally by the problem of predicting cognitive fatigue state from EEG (Chapter 1), we developed the Correlated Sparse Signal Recovery (CSSR) algorithm and successfully applied it to the elimination of blink artifacts in EEG from awake subjects (Chapter 2). Finding the decomposition x = x1 + x2 into a low-dimensional representation of the artifact signal x1 is a non-trivial problem, and currently there are no online real-time methods that accurately solve it for small N (the dimensionality of y). By using a skew-Gaussian dictionary and a novel method to represent group statistical structure, CSSR is able to identify and remove blink artifacts even from few (e.g. 4-6) channels of EEG recordings in near real-time. The method uses a Bayesian framework. It results in more effective decomposition, as measured by spectral and entropy properties of the decomposed signals, than some state-of-the-art artifact subtraction and structured sparse recovery methods. CSSR is novel in structured sparsity: unlike existing group sparse methods (such as block sparse recovery), it does not rely on the assumption of a common sparsity profile. It is also a novel EEG denoising method: unlike state-of-the-art artifact removal techniques such as independent component analysis, it does not require manual intervention, long recordings or high-density (e.g. 32 or more channels) recordings. This method of denoising is potentially of tremendous utility to the medical community, since EEG artifact removal is usually done manually, a lengthy and tedious process that requires trained technicians and often renders entire epochs of data unusable. Identification of the artifact can itself be used to infer a physiological state from the artifact's properties (for example, blink duration and frequency can be used as a marker of fatigue). A potential application of CSSR is to determine whether a structurally decomposed (i.e. non-spectral) representation of cortical EEG can instead be used for fatigue prediction.
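
The idea of splitting a measurement into an artifact part x1 built from dictionary atoms and a remainder x2 can be illustrated with a deliberately simplified stand-in. The sketch below is not CSSR: it omits the Bayesian framework and the group statistical structure described above and uses a single greedy matching-pursuit step on one channel instead. The function names, the atom parameters (centre, width, skew) and the synthetic signal are all assumptions made for illustration.

```python
import math


def skew_gaussian_atom(n, centre, width, skew):
    """Unit-norm atom shaped like a skew-normal density (hypothetical parameters)."""
    vals = []
    for t in range(n):
        z = (t - centre) / width
        bump = math.exp(-0.5 * z * z)  # Gaussian bump, unnormalised
        tilt = 0.5 * (1.0 + math.erf(skew * z / math.sqrt(2.0)))  # skewing factor
        vals.append(bump * tilt)
    norm = math.sqrt(sum(v * v for v in vals))
    return [v / norm for v in vals]


def best_atom(y, dictionary):
    """Index and coefficient of the atom most correlated with y."""
    scores = [sum(a * b for a, b in zip(y, atom)) for atom in dictionary]
    k = max(range(len(scores)), key=lambda i: abs(scores[i]))
    return k, scores[k]


def split_signal(y, dictionary):
    """Return (x1, x2): x1 is a one-atom artifact estimate and x2 = y - x1."""
    k, coef = best_atom(y, dictionary)
    x1 = [coef * v for v in dictionary[k]]
    x2 = [yi - ai for yi, ai in zip(y, x1)]
    return x1, x2


# Hypothetical setup: atoms centred every 20 samples in a 100-sample window.
dictionary = [skew_gaussian_atom(100, c, 6.0, 3.0) for c in range(10, 100, 20)]
y = [3.0 * v for v in dictionary[2]]  # synthetic blink-like artifact only
x1, x2 = split_signal(y, dictionary)
```

With a real recording, x2 would hold the remaining EEG rather than being close to zero, and a single atom per window would rarely be enough; the sketch only shows how an asymmetric dictionary atom can capture a blink-shaped transient.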

A new E-M based active learning algorithm for ensemble classification is presented in Chapter 3 and applied to the detection of artifactual epochs based upon several criteria, including the sparse features obtained from CSSR. When tested on data comprising noisy Gaussian mixtures, the algorithm offers higher accuracy than existing ensemble methods for unsupervised learning, such as similarity- and graph-based ensemble clustering, as well as higher accuracy and lower computational complexity than several active learning methods, such as Query-by-Committee and Importance-Weighted Active Learning. In one case we were able to identify artifacts with approximately 98% accuracy based upon 31-dimensional data from 700,000 epochs in a matter of seconds on a personal laptop, using fewer than 10% active labels. This compares with a maximum of 94% from other methods. As far as we know, active learning for ensemble-based classification has not previously been applied to biomedical signal classification, including artifact detection; it can also be applied to other medical areas, including classification of polysomnographic signals into sleep stages.

Algorithms based upon state-space factorization in the case where there is unidirectional dependence among the dynamics of groups of variables (the "Cascade Markov Model") are presented in Chapter 4. An algorithm for estimating, from observations, a factored state whose dynamics follow a Markov model is developed using E-M (i.e. a version of the Baum-Welch algorithm on factored state spaces) and applied to real-time human gait and fall detection. The application of factored HMMs to gait and fall detection is novel; falls in the elderly are a major safety issue. Results from the algorithm show higher fall detection accuracy (95%) than that achieved with PCA-based estimation (70%). A new algorithm for optimal control on factored Markov decision processes is also derived in this chapter. The algorithm, in the form of decoupled matrix differential equations, (i) is computationally efficient, requiring solution of a one-point instead of a two-point boundary value problem, and (ii) obviates the "curse of dimensionality" inherent in HJB equations, thereby facilitating real-time solution. The algorithm may have applications in medicine, such as finding optimal schedules of light exposure for correction of circadian misalignment and optimal schedules for drug intervention in patients.
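
To make the benefit of unidirectional dependence concrete, here is a minimal sketch of one filtering step under an assumed two-group structure: group A evolves on its own, and group B's transition depends on the current value of A. This is the generic HMM forward recursion reorganised to exploit that structure, not the dissertation's Baum-Welch variant or its control algorithm; the function names, argument layout and example numbers are hypothetical.

```python
def forward_step(alpha, TA, TB, obs_lik):
    """One filtering step that exploits the assumed cascade structure.

    alpha[a][b]     : current belief over the joint state (a, b)
    TA[a][a2]       : P(a2 | a), group A evolves on its own
    TB[a][b][b2]    : P(b2 | a, b), group B is driven by A (one-way dependence)
    obs_lik[a2][b2] : likelihood of the new observation in state (a2, b2)
    """
    nA, nB = len(TA), len(TB[0])
    # Stage 1: propagate group B with a held fixed, cost about |A||B|^2.
    m = [[sum(alpha[a][b] * TB[a][b][b2] for b in range(nB))
          for b2 in range(nB)] for a in range(nA)]
    # Stage 2: propagate group A, cost about |A|^2|B|.
    pred = [[sum(TA[a][a2] * m[a][b2] for a in range(nA))
             for b2 in range(nB)] for a2 in range(nA)]
    # Measurement update and normalisation.
    post = [[pred[a2][b2] * obs_lik[a2][b2] for b2 in range(nB)]
            for a2 in range(nA)]
    z = sum(map(sum, post))
    return [[p / z for p in row] for row in post]


def forward_step_joint(alpha, TA, TB, obs_lik):
    """Reference: the same update on the unfactored joint chain, cost about |A|^2|B|^2."""
    nA, nB = len(TA), len(TB[0])
    post = [[obs_lik[a2][b2] * sum(alpha[a][b] * TA[a][a2] * TB[a][b][b2]
                                   for a in range(nA) for b in range(nB))
             for b2 in range(nB)] for a2 in range(nA)]
    z = sum(map(sum, post))
    return [[p / z for p in row] for row in post]


# Hypothetical 2 x 2 example.
TA = [[0.9, 0.1], [0.2, 0.8]]
TB = [[[0.7, 0.3], [0.4, 0.6]], [[0.1, 0.9], [0.5, 0.5]]]
alpha0 = [[0.25, 0.25], [0.25, 0.25]]
obs_lik = [[0.5, 0.1], [0.2, 0.9]]
fast = forward_step(alpha0, TA, TB, obs_lik)
slow = forward_step_joint(alpha0, TA, TB, obs_lik)
```

Both functions compute the same posterior; the two-stage version only reorders the sums, which reduces the per-step work from roughly |A|^2|B|^2 to |A||B|(|A| + |B|) multiplications under this assumed structure.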

The thesis demonstrates the development of new methods for complexity reduction in high-dimensional systems and shows that their application solves some problems in medicine and public safety more efficiently than state-of-the-art methods.

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

Citable link to this page

http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493389

Collections

FAS Theses and Dissertations [6136]
