Beyond Machine Learning Accuracy: Shifting Paradigms of Neural Network Explainability and Reasoning
Abstract
As AI systems become more tightly integrated with every major aspect of our society, the field of AI explainability grows and becomes more urgent. The need for AI explainability is multifaceted. First, it is crucial for ensuring the ethical and legal compliance of AI systems, as it allows users and regulators to scrutinize AI decision-making processes for potential biases, discrimination, and other unintended consequences. Second, explainability can foster trust and user acceptance of AI technologies, as individuals are more likely to rely on systems whose reasoning they can understand and verify. Finally, explainable AI can facilitate the identification and correction of errors in AI models, thereby contributing to their robustness, reliability, and generalizability. In this research, we focus on leveraging the foundational mathematical concept of continued fractions to envision and implement CoFrNets, a novel neural network architecture that is fully interpretable by design and addresses the "black-box" problem. Moreover, we prove that such architectures are universal approximators using a novel proof strategy that differs from the typical approach and is likely to be of independent interest. We experiment on nonlinear synthetic functions and are able to accurately model them as well as estimate feature attributions and, in some cases, even higher-order interaction terms, demonstrating both the representational power and the interpretability of such architectures. To further showcase the power of CoFrNets, we experiment on seven real datasets spanning tabular, text, and image modalities, and demonstrate the advantages of this new architecture over existing neural network architectures in both interpretability and accuracy. Subsequently, we discuss why existing explainability methods lose applicability with revolutionary new large language models like GPT-4. We argue that eliciting reasoning from language models is the new "explainability method" and introduce CReDETS, a novel, first-of-its-kind causal reasoning dataset with hand-annotated explanations. We benchmark the latest and most powerful generation of transformer-based language models, GPT-3, GPT-3.5 (ChatGPT), and GPT-4, evaluating their accuracy, coherence, and consistency, and show that even the most recent LLMs have stark weaknesses in reasoning ability.
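For intuition, the continued-fraction structure behind a CoFrNets-style "ladder" can be sketched as below. This is a minimal illustration only, not the exact formulation from the thesis: the function name `cofrnet_ladder`, the ladder depth, and the `eps` guard against division by zero are assumptions made for readability.

```python
import numpy as np

def cofrnet_ladder(x, weights, eps=1e-6):
    """Evaluate one continued-fraction 'ladder' on an input vector x.

    weights: list of weight vectors [w_0, ..., w_K]. Each rung contributes a
    linear term w_k^T x, and deeper rungs enter through nested reciprocals:
        f(x) = w_0^T x + 1 / (w_1^T x + 1 / (w_2^T x + ...))
    Because each rung is linear in x, per-feature contributions remain
    traceable, which is the interpretability-by-design idea sketched here.
    """
    # Start from the deepest rung and fold the reciprocals upward.
    value = weights[-1] @ x
    for w in reversed(weights[:-1]):
        denom = value if abs(value) > eps else eps  # guard against division by zero
        value = w @ x + 1.0 / denom
    return value

# Usage: a 3-rung ladder on a 2-dimensional input (hypothetical weights).
x = np.array([0.5, -1.0])
weights = [np.array([1.0, 0.0]), np.array([0.2, 0.3]), np.array([0.5, 0.5])]
print(cofrnet_ladder(x, weights))
```

In practice several such ladders would be summed and their weights learned by gradient descent; the single-ladder sketch above is only meant to convey how continued fractions yield a nonlinear yet structurally transparent model.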