Gender-preserving Debiasing for Pre-trained Word Embeddingsmli/Duan.pdf · 2020-03-02 ·...

Gender-preserving Debiasing for Pre-trained Word Embeddings Author: Masahiro Kaneko, Danushka Bollegala Presenter: Haonan Duan CS 886: Deep Learning and Natural Language Processing Winter 2020

Gender-preserving Debiasing for Pre-trained Word Embeddings

Authors: Masahiro Kaneko, Danushka Bollegala
Presenter: Haonan Duan

CS 886: Deep Learning and Natural Language Processing Winter 2020

Outline

• Gender bias in word embeddings

• Introduction to the proposed method

• Evaluation

Man is to Computer Programmer as Woman is to Homemaker (Bolukbasi et al., 2016)

How do we measure Occupational Stereotypes in a word embedding?

The cosine similarity between the word vector and the gender directional vector $(\vec{he} - \vec{she})$
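The measure above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the function name `gender_bias_score` and the toy 3-dimensional vectors are assumptions made up for the example.

```python
import numpy as np

def gender_bias_score(word_vec, he_vec, she_vec):
    """Cosine similarity between a word vector and the direction (he - she)."""
    direction = he_vec - she_vec
    denom = np.linalg.norm(word_vec) * np.linalg.norm(direction)
    return float(word_vec @ direction / denom)

# Toy vectors, invented for illustration only (not real word2vec values).
he = np.array([1.0, 0.0, 0.0])
she = np.array([-1.0, 0.0, 0.0])
word = np.array([1.0, 1.0, 0.0])

score = gender_bias_score(word, he, she)  # positive: leans towards "he"
```

A score near 0 means the word is roughly orthogonal to the gender direction; the sign indicates which side of the direction it leans towards.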

Occupational Stereotypes in word2vec embeddings

We believe that no human should be discriminated against on the basis of demographic attributes by an NLP system.

There exist clear legal (European Union, 1997), business and ethical obligations to make NLP systems unbiased (Holstein et al., 2018).

Given the broad applications of pre-trained word embeddings in various downstream NLP tasks, it is extremely important to debias word embeddings before they are applied in NLP systems that interact with and/or make decisions that affect humans.

Ideally, we want a debiasing method for word embeddings that can

• Remove all information related to discriminative bias

• Retain all non-discriminative information for the downstream NLP tasks

• Work without any knowledge of the specific pre-trained word embedding algorithm

• Work without access to the language resources, such as corpora or lexicons, that might have been used by the word embedding learning algorithm

Main idea

Given a pre-trained set of $d$-dimensional word embeddings $\{w_i\}_{i=1}^{|V|}$ over a vocabulary $V$, we can learn a mapping

$$E : \mathbb{R}^d \to \mathbb{R}^l$$

that projects the original pre-trained word embeddings to a debiased $l$-dimensional space.

$E(w)$ is the new vector representation we will use for the later task.

Partition the vocabulary

Also, it should be noted that removing all gender bias is not necessarily ideal.

For example, one would expect the word beard to be associated with man and skirt to be associated with woman.

Gender bias like this is not discriminative and would actually be useful, for example, for a recommendation system.

Partition the vocabulary

Therefore, the proposed method should be able to differentiate undesirable (stereotypical) biases from the desirable (expected) gender information in words.

Partition the vocabulary

To achieve this goal, we will first partition the vocabulary $V$ into 4 sets: $V_m$ (masculine), $V_f$ (feminine), $V_n$ (gender neutral), $V_s$ (stereotypical).

For example, beard $\in V_m$, skirt $\in V_f$, desk $\in V_n$, nurse $\in V_s$.

Note that $V_m, V_f, V_n, V_s$ are mutually exclusive and $V = V_m \cup V_f \cup V_n \cup V_s$.
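The partition property stated above (pairwise disjoint sets whose union is the whole vocabulary) can be checked mechanically. The sketch below uses only the four example words from the slide; the helper name `is_partition` is an assumption for illustration, and real word lists would be far larger.

```python
from itertools import combinations

# Toy sets built from the slide's example words only.
V_m = {"beard"}
V_f = {"skirt"}
V_n = {"desk"}
V_s = {"nurse"}
V = {"beard", "skirt", "desk", "nurse"}

def is_partition(parts, vocab):
    """True if the parts are pairwise disjoint and together cover vocab."""
    disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
    covers = set().union(*parts) == vocab
    return disjoint and covers

ok = is_partition([V_m, V_f, V_n, V_s], V)
```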

Partition the vocabulary

The proposed debiased mapping should satisfy the following four criteria:

• For $w_f \in V_f$, we preserve its feminine properties.

• For $w_m \in V_m$, we preserve its masculine properties.

• For $w_n \in V_n$, we preserve its gender neutrality.

• For $w_s \in V_s$, we remove its stereotypical bias.

System overview

• Encoder (debiasing function): $E : \mathbb{R}^d \to \mathbb{R}^l$

• Decoder: $D : \mathbb{R}^l \to \mathbb{R}^d$

• Feminine regressor: $C_f : \mathbb{R}^l \to [0,1]$

• Masculine regressor: $C_m : \mathbb{R}^l \to [0,1]$

In the proposed method, $E, D, C_f, C_m$ are all neural networks. Their parameters will be trained by minimizing the loss function introduced soon.

Loss function I - Vf, Vm

For feminine and masculine words, we require the encoded space to retain the gender-related information.

$$L_f = \sum_{w \in V_f} \| C_f(E(w)) - 1 \|_2^2 + \sum_{w \in V \setminus V_f} \| C_f(E(w)) \|_2^2$$

$$L_m = \sum_{w \in V_m} \| C_m(E(w)) - 1 \|_2^2 + \sum_{w \in V \setminus V_m} \| C_m(E(w)) \|_2^2$$
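Both losses have the same shape: the regressor output is pushed towards 1 for words in the target set and towards 0 for every other word. A minimal NumPy sketch follows, assuming the regressor outputs have already been computed for the whole vocabulary; the function name `regressor_loss` and the toy numbers are illustrative assumptions.

```python
import numpy as np

def regressor_loss(preds, is_target):
    """Squared-error loss for one gender regressor.

    preds:     regressor outputs C(E(w)) for every w in V, shape (|V|,)
    is_target: boolean mask, True for words in the target set (V_f or V_m)
    """
    preds = np.asarray(preds, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    on_target = np.sum((preds[is_target] - 1.0) ** 2)   # should predict 1
    off_target = np.sum(preds[~is_target] ** 2)         # should predict 0
    return float(on_target + off_target)

# Toy outputs for three words; only the first word is in V_f.
preds = np.array([0.9, 0.2, 0.0])
mask_f = np.array([True, False, False])
loss_f = regressor_loss(preds, mask_f)  # (0.9 - 1)^2 + 0.2^2 + 0^2
```

The same function gives $L_m$ when called with the outputs of the masculine regressor and a mask for $V_m$.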

Loss function II - Vn, Vs

For the stereotypical and gender-neutral words, we require that they are embedded into a subspace that is orthogonal to a gender directional vector $v_g$.

How do we define and compute $v_g$?

Loss function II - Vn, Vs

We collect a set $\Omega$ of feminine and masculine word-pairs $(w_f, w_m)$, such as (she, he), (woman, man). Then $v_g$ is computed as:

$$v_g = \frac{1}{|\Omega|} \sum_{(w_f, w_m) \in \Omega} \left( E(w_m) - E(w_f) \right)$$
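The gender direction is simply the mean difference between the encoded masculine and feminine word of each pair. A small sketch, assuming the encoded vectors are available in a dictionary; the name `gender_direction` and the 2-dimensional toy vectors are made up for illustration.

```python
import numpy as np

def gender_direction(encoded, pairs):
    """Average of E(w_m) - E(w_f) over all (w_f, w_m) pairs.

    encoded: dict mapping a word to its encoded vector E(w)
    pairs:   iterable of (feminine_word, masculine_word) tuples
    """
    diffs = [encoded[w_m] - encoded[w_f] for w_f, w_m in pairs]
    return np.mean(diffs, axis=0)

# Toy encoded vectors, invented for illustration.
encoded = {
    "she": np.array([0.0, 1.0]),
    "he": np.array([1.0, 1.0]),
    "woman": np.array([0.0, 0.0]),
    "man": np.array([3.0, 0.0]),
}
omega = [("she", "he"), ("woman", "man")]
v_g = gender_direction(encoded, omega)  # mean of [1, 0] and [3, 0]
```

Because $v_g$ depends on $E$, it changes as the encoder is trained; how often it is recomputed during training is not stated on the slide.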

Loss function II - Vn, Vs

We consider the squared inner-product between $v_g$ and the debiased stereotypical or gender-neutral words as the loss $L_g$:

$$L_g = \sum_{w \in V_n \cup V_s} \left( v_g^\top E(w) \right)^2$$
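This loss is zero exactly when every encoded neutral or stereotypical word is orthogonal to $v_g$. A minimal sketch, assuming the encoded words are stacked as rows of a matrix; `orthogonality_loss` and the toy values are illustrative assumptions.

```python
import numpy as np

def orthogonality_loss(v_g, encoded_words):
    """Sum of squared inner products between v_g and each encoded word.

    encoded_words: array of shape (n, l), one row E(w) per word
                   in the union of V_n and V_s
    """
    projections = np.asarray(encoded_words) @ np.asarray(v_g)
    return float(np.sum(projections ** 2))

# Toy example: the second word is already orthogonal to v_g.
v_g = np.array([1.0, 0.0])
E_ns = np.array([[0.5, 2.0],
                 [0.0, 3.0]])
loss_g = orthogonality_loss(v_g, E_ns)  # 0.5^2 + 0^2
```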

Loss function III

It is important that we preserve the semantic information encoded in the word embeddings as much as possible when we perform debiasing.

For this purpose, we minimize the reconstruction loss:

$$L_r = \sum_{w \in V} \| D(E(w)) - w \|_2^2$$
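The reconstruction loss is the usual autoencoder objective: the summed squared distance between each original embedding and its decoded reconstruction. A short sketch with made-up numbers; the function name `reconstruction_loss` is an assumption for illustration.

```python
import numpy as np

def reconstruction_loss(originals, reconstructions):
    """Sum over words of the squared L2 distance between D(E(w)) and w."""
    diff = np.asarray(reconstructions) - np.asarray(originals)
    return float(np.sum(diff ** 2))

# Two toy 2-dimensional embeddings and their (imperfect) reconstructions.
W = np.array([[1.0, 2.0],
              [3.0, 4.0]])
W_hat = np.array([[1.0, 2.5],
                  [2.0, 4.0]])
loss_r = reconstruction_loss(W, W_hat)  # 0.5^2 + 1.0^2
```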

Loss function

$$L = \lambda_f L_f + \lambda_m L_m + \lambda_g L_g + \lambda_r L_r$$

Here, $\lambda_f, \lambda_m, \lambda_g, \lambda_r$ are all hyper-parameters.
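The overall objective is a weighted sum of the four terms. The sketch below only shows how the weights trade the terms off against each other; the loss values and the weights are placeholders chosen for the example, not values reported in the paper.

```python
def total_loss(losses, weights):
    """Weighted sum of the four loss terms, keyed by 'f', 'm', 'g', 'r'."""
    return sum(weights[k] * losses[k] for k in ("f", "m", "g", "r"))

# Placeholder numbers for illustration only.
losses = {"f": 0.05, "m": 0.10, "g": 0.25, "r": 1.25}
weights = {"f": 1.0, "m": 1.0, "g": 2.0, "r": 0.5}
L = total_loss(losses, weights)
```

Raising the weight on the orthogonality term penalizes residual gender direction in neutral and stereotypical words more strongly, at the possible cost of a higher reconstruction error.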

Implementation Details

• The masculine regressor $C_m$ and the feminine regressor $C_f$ are both implemented as feed-forward neural networks with one hidden layer. The sigmoid function is used as the nonlinear activation function.

• Both the encoder $E$ and the decoder $D$ of the autoencoder are implemented as feed-forward neural networks with two hidden layers. Hyperbolic tangent is used as the activation function throughout the autoencoder.
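The architecture described above can be sketched as plain NumPy forward passes. Several details are assumptions, since the slide does not give them: the layer sizes are placeholders, the autoencoder output layers are left linear, and a sigmoid is applied to the regressor output so that it lands in [0, 1]. Training (back-propagation) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layers(sizes):
    """One (weight, bias) pair for each pair of consecutive layer sizes."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers, hidden_act, out_act=None):
    """Apply hidden_act after hidden layers and out_act (if any) at the output."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = hidden_act(x)
        elif out_act is not None:
            x = out_act(x)
    return x

d, l, h = 8, 4, 6                      # placeholder sizes, not from the paper
encoder = init_layers([d, h, h, l])    # E: two hidden layers, tanh
decoder = init_layers([l, h, h, d])    # D: two hidden layers, tanh
c_f = init_layers([l, h, 1])           # C_f: one hidden layer, sigmoid
c_m = init_layers([l, h, 1])           # C_m: one hidden layer, sigmoid

w = rng.normal(size=(5, d))            # five random stand-in embeddings
z = forward(w, encoder, np.tanh)       # E(w)
w_hat = forward(z, decoder, np.tanh)   # D(E(w))
p_f = forward(z, c_f, sigmoid, sigmoid)
p_m = forward(z, c_m, sigmoid, sigmoid)
```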

Evaluating debiasing performance

Definition: (man, woman), (waiter, waitress)

Stereotype: (doctor, nurse)

None: (dog, cat)

Preservation of Word Semantics - Analogy Detection

Preservation of Word Semantics - Semantic Similarity Measurement

Reference

• Masahiro Kaneko and Danushka Bollegala. 2019. Gender-preserving debiasing for pre-trained word embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

• Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In NIPS.

• European Union. 1997. Treaty of Amsterdam (article 13).

• Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudík, and Hanna Wallach. 2018. Improving fairness in machine learning systems: What do industry practitioners need?