Indirect Bias Exploration - Visualization 1

Introduction

The purpose of this visualization is to explore the potential biases learned by NLP transformer models, i.e. machine learning models that process human language.

You can choose a target category attribute (e.g. sport) and see the correlations that the chosen model makes between it and different feature category attributes (e.g. beverage or trait). You can also investigate how the target and feature attributes correlate with sensitive attributes (such as gender or religion), to check whether the target and feature elements could be linked indirectly through these sensitive attributes. Correlation scores are available for several models.

The target and feature elements are displayed in a table. In this table, the color of each cell indicates whether there is a correlation between the elements shown in its column and its row, and how strong this correlation is. Clicking on the column and row headers sorts the table, which makes exploration easier. The correlations are generated using names as a bridge: each name is linked to both the column and the row elements. You can explore these links by clicking on the cells of the table.
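To make the idea of "names as a bridge" more concrete, here is a minimal sketch in Python. It is not the method used by this visualization, whose scoring is not described here. It assumes, purely for illustration, that the model gives each name an association score with each element, and that two elements are considered correlated when their scores rise and fall together across the same names (Pearson correlation). All names, elements, scores, and function names below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists (0.0 if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical name-to-element association scores (made-up numbers,
# NOT output from any real model).
name_scores = {
    "football": {"name_1": 0.1, "name_2": 0.8, "name_3": 0.2, "name_4": 0.9},
    "beer":     {"name_1": 0.2, "name_2": 0.7, "name_3": 0.1, "name_4": 0.8},
    "tea":      {"name_1": 0.9, "name_2": 0.2, "name_3": 0.8, "name_4": 0.1},
}


def bridge_correlation(target, feature, scores):
    """Correlate two elements through the names they share."""
    names = sorted(set(scores[target]) & set(scores[feature]))
    return pearson(
        [scores[target][name] for name in names],
        [scores[feature][name] for name in names],
    )


def correlation_table(targets, features, scores):
    """Build a rows-by-columns table: one row per target, one column per feature."""
    return {
        t: {f: bridge_correlation(t, f, scores) for f in features}
        for t in targets
    }


table = correlation_table(["football"], ["beer", "tea"], name_scores)

# Sorting the columns by strength mimics clicking a row header in the table.
ranked = sorted(table["football"].items(), key=lambda item: item[1], reverse=True)
```

In this toy data, "football" and "beer" receive high scores for the same names, so their cell would show a strong positive correlation, while "football" and "tea" would show a strong negative one. The same computation could be run between any element and a sensitive attribute to look for an indirect link.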