Revealing the Dark Secrets of BERT
pbiecek committed Mar 10, 2020
1 parent 1330bd0 commit 0e0a587
Showing 3 changed files with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions README.md
@@ -20,10 +20,17 @@

### 2019


* [Revealing the Dark Secrets of BERT](https://arxiv.org/abs/1908.08593); Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky. BERT-based architectures currently give state-of-the-art performance on many NLP tasks, but little is known about the exact mechanisms that contribute to their success. In this work, we focus on the interpretation of self-attention, one of the fundamental underlying components of BERT. Using a subset of GLUE tasks and a set of handcrafted features of interest, we propose a methodology and carry out a qualitative and quantitative analysis of the information encoded by individual BERT heads. Our findings suggest that a limited set of attention patterns is repeated across different heads, indicating overall model overparametrization. While different heads consistently use the same attention patterns, their impact on performance varies across tasks. We show that manually disabling attention in certain heads leads to a performance improvement over regular fine-tuned BERT models. (A minimal sketch of head masking follows the figure below.)

![DarkSecrets](images/DarkSecrets.png)
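
The head-disabling experiment can be approximated with the Hugging Face `transformers` library, which accepts a `head_mask` argument at inference time. The snippet below is a minimal illustrative sketch, not the authors' code; the model name, the choice of layer/head to disable, and the example sentence are arbitrary assumptions.

```python
# Minimal sketch: disable selected BERT attention heads via head_mask.
# Assumptions (not from the paper): bert-base-uncased, head (10, 3), example sentence.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# One entry per (layer, head): 1.0 keeps the head, 0.0 silences it.
num_layers = model.config.num_hidden_layers    # 12 for bert-base
num_heads = model.config.num_attention_heads   # 12 for bert-base
head_mask = torch.ones(num_layers, num_heads)
head_mask[10, 3] = 0.0  # illustrative choice of a single head to disable

inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, head_mask=head_mask)
print(outputs.logits)  # compare against logits from an all-ones head_mask
```

In the paper, the comparison is done on fine-tuned GLUE models and measured with task metrics; the sketch above only shows the masking mechanism itself.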

* [Explanation in Artificial Intelligence: Insights from the Social Sciences](https://arxiv.org/pdf/1706.07269.pdf); Tim Miller. There has been a recent resurgence in the area of explainable artificial intelligence as researchers and practitioners seek to make their algorithms more understandable. Much of this research focuses on explicitly explaining decisions or actions to a human observer, and it should not be controversial to say that looking at how humans explain to each other can serve as a useful starting point for explanation in artificial intelligence. However, it is fair to say that most work in explainable artificial intelligence uses only the researchers' intuition of what constitutes a "good" explanation. There exist vast and valuable bodies of research in philosophy, psychology, and cognitive science on how people define, generate, select, evaluate, and present explanations, which argue that people employ certain cognitive biases and social expectations in the explanation process. This paper argues that the field of explainable artificial intelligence should build on this existing research, and it reviews relevant papers from philosophy, cognitive psychology/science, and social psychology that study these topics.

![SocialSciences4XAI](images/SocialSciences4XAI.png)

![SocialSciences4XAI2](images/SocialSciences4XAI2.png)

* [AnchorViz: Facilitating Semantic Data Exploration and Concept Discovery for Interactive Machine Learning](https://www.microsoft.com/en-us/research/publication/anchorviz-facilitating-semantic-data-exploration-and-concept-discovery-for-interactive-machine-learning/); Jina Suh et al. When building a classifier in interactive machine learning (iML), human knowledge about the target class can be a powerful reference for making the classifier robust to unseen items. The main challenge lies in finding unlabeled items that can either help discover or refine concepts for which the current classifier has no corresponding features (i.e., it has feature blindness). Yet it is unrealistic to ask humans to come up with an exhaustive list of items, especially for rare concepts that are hard to recall. This article presents AnchorViz, an interactive visualization that facilitates the discovery of prediction errors and previously unseen concepts through human-driven semantic data exploration.

![AnchorViz](images/AnchorViz.png)
Binary file added images/DarkSecrets.png
Binary file added images/SocialSciences4XAI2.png
