BERT memorisation and pitfalls in low-resource scenarios

2021 
State-of-the-art pre-trained models have been shown to memorise facts and perform well with limited amounts of training data. To gain a better understanding of how these models learn, we study their generalisation and memorisation capabilities in noisy and low-resource scenarios. We find that the training of these models is almost unaffected by label noise and that it is possible to reach near-optimal performance even on extremely noisy datasets. Conversely, we also find that they completely fail when tested on low-resource tasks such as few-shot learning and rare entity recognition. To mitigate such limitations, we propose a novel architecture based on BERT and prototypical networks that improves performance in low-resource named entity recognition tasks.
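The paper itself details the proposed architecture; as a rough illustration of the general idea only (not the authors' exact method), a prototypical classifier over BERT token embeddings computes one prototype per entity class from a few labelled support tokens and labels query tokens by nearest prototype. The sketch below assumes standard PyTorch and uses random tensors in place of real BERT encoder outputs; all names and shapes are illustrative.

```python
import torch

def class_prototypes(support_emb, support_labels, num_classes):
    """Mean token embedding per class over the support set.

    support_emb:    (N, H) token embeddings (e.g. from a BERT encoder)
    support_labels: (N,)   integer class label for each support token
    """
    return torch.stack([
        support_emb[support_labels == c].mean(dim=0)
        for c in range(num_classes)
    ])  # (num_classes, H)

def classify(query_emb, prototypes):
    """Label each query token by its nearest (Euclidean) prototype."""
    dists = torch.cdist(query_emb, prototypes)  # (M, num_classes)
    return dists.argmin(dim=1)                  # predicted class per query token

# Toy usage: random embeddings stand in for BERT outputs.
H, num_classes = 768, 3
support_emb = torch.randn(12, H)
support_labels = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
query_emb = torch.randn(5, H)

prototypes = class_prototypes(support_emb, support_labels, num_classes)
print(classify(query_emb, prototypes))
```

In a few-shot setting this kind of classifier needs no new output layer per label set: adding a class only requires a handful of support examples to form its prototype, which is what makes the combination attractive for low-resource named entity recognition.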