Publication: Trustworthy Machine Learning Through Interpretability and Fairness
Abstract
The deployment of machine learning (ML) systems across diverse domains has led to groundbreaking advancements in vision and language tasks. However, ensuring that these systems are trustworthy (encompassing attributes such as interpretability, fairness, reliability, and alignment with human values) remains a significant challenge. This dissertation develops novel methodologies to enhance trustworthiness throughout the ML pipeline, addressing critical issues in data preparation, model training, model evaluation, and model deployment. To improve data preparation, this dissertation introduces concept-based auditing frameworks that systematically identify harmful biases and misaligned associations in large-scale and synthetic datasets. In the model training stage, two architectural innovations are proposed: channel embeddings that improve interpretability in multiplexed biological data, and architectural enhancements for super-resolution tasks that ensure semantic consistency by preserving high-frequency details. In the evaluation stage, traditional performance metrics are expanded with a texture-based evaluation framework that provides more human-understandable, context-sensitive insights. Lastly, in the deployment stage, synthetic counterfactual generation and fine-tuning techniques are introduced to mitigate biases in deployed models, enhancing fairness while preserving performance. These contributions address the trustworthiness of ML systems holistically, bridging the gap between low-level computational processes and high-level human understanding. This dissertation advances ML research by providing a comprehensive framework that improves transparency, reliability, and alignment with societal values across the entire machine learning pipeline.