Extrapolation in NLP
2018
We argue that extrapolation to unseen data will often be easier for models that capture global structures rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.