General terms
autoencoding models
: see MLM

autoregressive models
: see CLM

CLM
: causal language modeling, a pretraining task where the model reads the text in order and has to predict the next word. It's usually done by reading the whole sentence, but using a mask inside the model to hide the future tokens at a certain timestep.
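As a rough illustration of that mask, here is a minimal PyTorch sketch (with made-up attention scores) of how the future positions can be hidden at each timestep:

```python
import torch

seq_len = 5
# Causal mask: position i may only look at positions 0..i, never at the future.
causal_mask = torch.tril(torch.ones(seq_len, seq_len)).bool()

# Toy attention scores; future positions are set to -inf so that
# softmax gives them zero weight.
scores = torch.randn(seq_len, seq_len)
scores = scores.masked_fill(~causal_mask, float("-inf"))
attention_weights = scores.softmax(dim=-1)
```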
MLM
: masked language modeling, a pretraining task where the model sees a corrupted version of the text, usually done by masking some tokens randomly, and has to predict the original text.
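A minimal sketch of that random corruption, assuming a BERT-style tokenizer with a `[MASK]` token (bert-base-uncased is just an example checkpoint):

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
input_ids = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")["input_ids"]

# Corrupt roughly 15% of the tokens by replacing them with [MASK];
# the model is trained to predict the original tokens at those positions.
special = (input_ids == tokenizer.cls_token_id) | (input_ids == tokenizer.sep_token_id)
mask = (torch.rand(input_ids.shape) < 0.15) & ~special
corrupted = input_ids.masked_fill(mask, tokenizer.mask_token_id)
print(tokenizer.decode(corrupted[0]))
```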
multimodal
: a task that combines text with another kind of input (for instance images).

NLG
: natural language generation, all tasks related to generating text (for instance talking with a transformer, translation).

NLP
: natural language processing, a generic way to say “deal with texts”.

NLU
: natural language understanding, all tasks related to understanding what is in a text (for instance classifying the whole text or individual words).

pretrained model
: a model that has been pretrained on some data (for instance all of Wikipedia). Pretraining methods involve a self-supervised objective, which can be reading the text and trying to predict the next word (see CLM) or masking some words and trying to predict them (see MLM).
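In the library, such a pretrained checkpoint is typically downloaded and loaded with `from_pretrained`; a minimal sketch, using bert-base-uncased purely as an example checkpoint:

```python
from transformers import AutoModel, AutoTokenizer

# Load a checkpoint that was pretrained with a self-supervised objective (MLM for BERT).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```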
RNN
: recurrent neural network, a type of model that uses a loop over a layer to process texts.
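A minimal sketch of that loop, applying a single PyTorch `RNNCell` token by token (the sizes and inputs are made up):

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 1000, 128
embedding = nn.Embedding(vocab_size, hidden_size)
cell = nn.RNNCell(hidden_size, hidden_size)

token_ids = torch.randint(0, vocab_size, (10,))  # a toy sequence of 10 token ids
hidden = torch.zeros(1, hidden_size)

# The same layer is applied in a loop, once per token, carrying a hidden
# state from one step to the next.
for token_id in token_ids:
    hidden = cell(embedding(token_id).unsqueeze(0), hidden)
```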
seq2seq or sequence-to-sequence
: models that generate a new sequence from an input, like translation models, or summarization models (such as Bart or T5).
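For instance, the `pipeline` API can use a T5 checkpoint as a sequence-to-sequence translator (t5-small is just a small example checkpoint):

```python
from transformers import pipeline

# T5 reads the input sequence and generates a brand new sequence,
# here an English-to-French translation.
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("The glossary defines the most common NLP terms."))
```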
token
: a part of a sentence, usually a word, but can also be a subword (non-common words are often split into subwords) or a punctuation symbol.
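A quick way to see this splitting is to call a tokenizer directly; bert-base-uncased is used here only as an example (it uses WordPiece, which marks subword pieces with `##`):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Common words stay whole, while a rarer word like "GPU" is split into
# subword tokens (WordPiece prefixes the continuation pieces with "##").
print(tokenizer.tokenize("I have a new GPU!"))
```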