This webpage collects publications and software produced as part of a joint project at Fraunhofer HHI, TU Berlin and SUTD Singapore on developing new methods to understand nonlinear predictions of state-of-the-art machine learning models.

Machine learning models, in particular deep neural networks (DNNs), are characterized by very high predictive power but are, in many cases, not easily interpretable by a human. Interpreting a nonlinear classifier is important for gaining trust in its predictions and for identifying potential data selection biases or artefacts.

In particular, the project studies techniques that decompose the prediction into contributions of individual input variables, so that the resulting decomposition (i.e. the explanation) can be visualized in the same way as the input data.
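As a minimal illustration of what such a decomposition looks like, consider the simplest possible case of a linear model f(x) = Σ w_i x_i + b. This linear setting is an assumption made here for illustration only; the project itself targets nonlinear models. Each input variable contributes w_i x_i to the score, and the array of contributions has the same shape as the input, so it can be displayed as a heatmap. The weights and input values below are arbitrary.

```python
import numpy as np

def linear_contributions(w, x):
    """Contribution of each input variable to the score of a linear model."""
    return w * x

# Hypothetical 2x2 "image" and weights, chosen only for illustration.
x = np.array([[1.0, 0.0], [2.0, 3.0]])
w = np.array([[0.5, -1.0], [-0.25, 1.0]])
b = 0.0

score = float(np.sum(w * x) + b)        # model output f(x)
heatmap = linear_contributions(w, x)    # one contribution per input variable

print("score:", score)
print("heatmap:\n", heatmap)
print("sum of contributions:", heatmap.sum())
```

The contributions add up to the score minus the bias, which is the sense in which the prediction is "decomposed" over the input variables.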

Draw a handwritten digit and see the heatmap being formed in real time. Create your own heatmap for natural images or text. These demos are based on the Layer-wise Relevance Propagation (LRP) technique by Bach et al. (2015).

Layer-wise Relevance Propagation (LRP) is a method that identifies important pixels by running a backward pass in the neural network. The backward pass is a conservative relevance redistribution procedure, in which the neurons that contribute most to the higher layer receive the most relevance from it. The LRP procedure is shown graphically in the figure below.

The method can be easily implemented in most programming languages and integrated into existing neural network frameworks. For many architectures, including deep rectifier networks and LSTMs, the propagation rules used by LRP can be understood as a Deep Taylor Decomposition of the prediction.
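The backward pass described above can be sketched in a few lines of NumPy. This is a minimal sketch, not the code of the toolboxes listed on this page: it assumes a small fully connected ReLU network without biases, uses the ε-stabilized propagation rule (one of several LRP rules), and all names (`forward`, `lrp_epsilon`, the toy weights) are hypothetical.

```python
import numpy as np

def forward(x, weights):
    """Forward pass through a bias-free ReLU network; returns all layer activations."""
    activations = [x]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        # ReLU on hidden layers, linear output layer
        activations.append(np.maximum(z, 0.0) if i < len(weights) - 1 else z)
    return activations

def lrp_epsilon(activations, weights, relevance, eps=1e-9):
    """Propagate relevance from the output back to the input, layer by layer."""
    for a, W in zip(reversed(activations[:-1]), reversed(weights)):
        z = a @ W                                   # pre-activations of the upper layer
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer, avoids division by zero
        s = relevance / z                           # relevance per unit of pre-activation
        relevance = a * (W @ s)                     # share it in proportion to a_j * w_jk
    return relevance

# Hypothetical toy network: 4 inputs, 3 hidden ReLU units, 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]
x = rng.normal(size=4)

activations = forward(x, weights)
output = activations[-1]
target = int(np.argmax(output))

# Initialize the relevance with the score of the output neuron being explained.
R_out = np.zeros_like(output)
R_out[target] = output[target]

R_in = lrp_epsilon(activations, weights, R_out)
print("explained score:        ", output[target])
print("sum of input relevances:", R_in.sum())
```

With a small `eps` and no biases, the input relevances sum to the explained output score up to numerical error, which is the conservation property mentioned above. Larger values of `eps`, or bias terms, absorb part of the relevance.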

- Keras Explanation Toolbox (LRP and other Methods)
- GitHub project page for the LRP Toolbox
- TensorFlow LRP Wrapper
- LRP Code for LSTM

- CVPR Tutorial 2018: [Video Part 1] [Video Part 2] [Introduction] [Methods] [Applications 1] [Applications 2]
- Tutorial on Implementing LRP

- W Samek, G Montavon, A Vedaldi, LK Hansen, KR Müller (Eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning

Springer LNCS 11700, 2019

- S Lapuschkin, S Wäldchen, A Binder, G Montavon, W Samek, KR Müller. Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

Nature Communications, 10:1096, 2019 [preprint, bibtex]

- G Montavon, W Samek, KR Müller. Methods for Interpreting and Understanding Deep Neural Networks

Digital Signal Processing, 73:1-15, 2018 [bibtex]

- W Samek, T Wiegand, KR Müller. Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models

ITU Journal: ICT Discoveries - Special Issue 1 - The Impact of AI on Communication Networks and Services, 1(1):39-48, 2018 [bibtex]

- W Samek, KR Müller. Towards Explainable Artificial Intelligence

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [bibtex]

- G Montavon, A Binder, S Lapuschkin, W Samek, KR Müller. Layer-Wise Relevance Propagation: An Overview

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [bibtex]

- S Bach, A Binder, G Montavon, F Klauschen, KR Müller, W Samek. On Pixel-wise Explanations for Non-Linear Classifier Decisions by Layer-wise Relevance Propagation

PLOS ONE, 10(7):e0130140, 2015 [preprint, bibtex]

- G Montavon, S Lapuschkin, A Binder, W Samek, KR Müller. Explaining NonLinear Classification Decisions with Deep Taylor Decomposition

Pattern Recognition, 65:211–222, 2017 [preprint, bibtex]

- L Arras, J Arjona, M Widrich, G Montavon, M Gillhofer, KR Müller, S Hochreiter, W Samek. Explaining and Interpreting LSTMs

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [bibtex]

- J Kauffmann, M Esders, G Montavon, W Samek, KR Müller. From Clustering to Cluster Explanations via Neural Networks

arXiv:1906.07633, 2019

- J Kauffmann, KR Müller, G Montavon. Towards Explaining Anomalies: A Deep Taylor Decomposition of One-Class Models

arXiv:1805.06230, 2018

- W Samek, A Binder, G Montavon, S Bach, KR Müller. Evaluating the Visualization of What a Deep Neural Network has Learned

IEEE Transactions on Neural Networks and Learning Systems, 28(11):2660-2673, 2017 [preprint, bibtex]

- L Arras, A Osman, KR Müller, W Samek. Evaluating Recurrent Neural Network Explanations

ACL Workshop on BlackboxNLP, 2019 [preprint, bibtex]

- G Montavon. Gradient-Based Vs. Propagation-Based Explanations: An Axiomatic Comparison

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [bibtex]

- M Alber, S Lapuschkin, P Seegerer, M Hägele, KT Schütt, G Montavon, W Samek, KR Müller, S Dähne, PJ Kindermans. iNNvestigate neural networks!

Journal of Machine Learning Research, 20(93):1-8, 2019 [preprint]

- M Alber. Software and Application Patterns for Explanation Methods

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [bibtex]

- S Lapuschkin, A Binder, G Montavon, KR Müller, W Samek. The Layer-wise Relevance Propagation Toolbox for Artificial Neural Networks

Journal of Machine Learning Research, 17(114):1-5, 2016 [preprint, bibtex]

- I Sturm, S Bach, W Samek, KR Müller. Interpretable Deep Neural Networks for Single-Trial EEG Classification

Journal of Neuroscience Methods, 274:141–145, 2016 [preprint, bibtex]

- A Binder, M Bockmayr, M Hägele, S Wienert, D Heim, K Hellweg, A Stenzinger, L Parlow, J Budczies, B Goeppert, D Treue, M Kotani, M Ishii, M Dietel, A Hocke, C Denkert, KR Müller, F Klauschen. Towards computational fluorescence microscopy: Machine learning-based integrated prediction of morphological and molecular tumor profiles

arXiv:1805.11178, 2018

- F Horst, S Lapuschkin, W Samek, KR Müller, WI Schöllhorn. Explaining the Unique Nature of Individual Gait Patterns with Deep Learning

Scientific Reports, 9:2391, 2019 [preprint, bibtex]

- AW Thomas, HR Heekeren, KR Müller, W Samek. Analyzing Neuroimaging Data Through Recurrent Deep Learning Models

arXiv:1810.09945, 2018

- L Arras, F Horn, G Montavon, KR Müller, W Samek. "What is Relevant in a Text Document?": An Interpretable Machine Learning Approach

PLOS ONE, 12(8):e0181142, 2017 [preprint, bibtex]

- L Arras, G Montavon, KR Müller, W Samek. Explaining Recurrent Neural Network Predictions in Sentiment Analysis

EMNLP Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, 159-168, 2017 [preprint, bibtex]

- L Arras, F Horn, G Montavon, KR Müller, W Samek. Explaining Predictions of Non-Linear Classifiers in NLP

ACL Workshop on Representation Learning for NLP, 1-7, 2016 [preprint, bibtex]

- F Horn, L Arras, G Montavon, KR Müller, W Samek. Exploring text datasets by visualizing relevant words

arXiv:1707.05261, 2017

- S Lapuschkin, A Binder, G Montavon, KR Müller, W Samek. Analyzing Classifiers: Fisher Vectors and Deep Neural Networks

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2912-2920, 2016 [preprint, bibtex]

- S Lapuschkin, A Binder, KR Müller, W Samek. Understanding and Comparing Deep Neural Networks for Age and Gender Classification

IEEE International Conference on Computer Vision Workshops (ICCVW), 1629-1638, 2017 [preprint, bibtex]

- C Seibold, W Samek, A Hilsmann, P Eisert. Accurate and Robust Neural Networks for Security Related Applications Exampled by Face Morphing Attacks

arXiv:1806.04265, 2018

- C Anders, G Montavon, W Samek, KR Müller. Understanding Patch-Based Learning of Video Data by Explaining Predictions

in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer LNCS 11700, 2019 [preprint, bibtex]

- V Srinivasan, S Lapuschkin, C Hellge, KR Müller, W Samek. Interpretable human action recognition in compressed domain

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1692-1696, 2017 [preprint, bibtex]

- S Becker, M Ackermann, S Lapuschkin, KR Müller, W Samek. Interpreting and Explaining Deep Neural Networks for Classification of Audio Signals

arXiv:1807.03418, 2018

- Pascal VOC 2012 Multilabel Model (see paper): [caffemodel] [prototxt]
- Age and Gender Classification Models (see paper): [data and models]