
Coursera: Transformer Models and BERT Model

Just a record of a quiz I took as a test of knowledge.

Question 1

What is the name of the language modeling technique that is used in Bidirectional Encoder Representations from Transformers (BERT)?

  • Long Short-Term Memory (LSTM)
  • Transformer
  • Recurrent Neural Network (RNN)
  • Gated Recurrent Unit (GRU)
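
For the record, BERT's masked-language-modeling behavior can be poked at directly. A minimal sketch, assuming the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint:

```python
# Minimal sketch: BERT's masked-language-modeling objective in action,
# via the Hugging Face `transformers` fill-mask pipeline (assumed installed).
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for pred in unmasker("Paris is the [MASK] of France."):
    print(f"{pred['token_str']:>10s}  {pred['score']:.3f}")
```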

Question 2

What is a transformer model?

  • A deep learning model that uses self-attention to learn relationships between different parts of a sequence.
  • A natural language processing model that uses convolutions to learn relationships between different parts of a sequence.
  • A computer vision model that uses fully connected layers to learn relationships between different parts of an image.
  • A machine learning model that uses recurrent neural networks to learn relationships between different parts of a sequence.
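
A minimal NumPy sketch of the self-attention idea in the first option, with randomly initialized projection matrices standing in for learned weights:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # project into query/key/value spaces
    scores = q @ k.T / np.sqrt(k.shape[-1])       # pairwise relevance between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                            # each output mixes all positions

rng = np.random.default_rng(0)
seq_len, d = 4, 8
x = rng.normal(size=(seq_len, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)     # (4, 8)
```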

Question 3

What kind of transformer model is BERT?

  • Encoder-only model
  • Decoder-only model
  • Encoder-decoder model
  • Recurrent Neural Network (RNN) encoder-decoder model
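
A quick way to see the encoder-only nature: the model's output is a sequence of contextual hidden states, not generated text. A sketch, again assuming `transformers` and `bert-base-uncased`:

```python
# BERT exposes only the encoder stack, so the forward pass returns
# one hidden state per input token rather than a generated sequence.
from transformers import AutoTokenizer, BertModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT is encoder-only.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size), e.g. (1, 8, 768)
```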

Question 4

What does fine-tuning a BERT model mean?

  • Training the model on a specific task by using a large amount of unlabeled data
  • Training the model on a specific task and not updating the pre-trained weights
  • Training the hyper-parameters of the model on a specific task
  • Training the model and updating the pre-trained weights on a specific task by using labeled data
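
A minimal sketch of what fine-tuning means in practice, assuming `transformers` and PyTorch. The labels below are made up; the key point is that the optimizer covers all pre-trained weights, not just the new classification head:

```python
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # note: *all* parameters

batch = tokenizer(["great movie", "terrible movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])  # hypothetical sentiment labels

optimizer.zero_grad()
loss = model(**batch, labels=labels).loss  # cross-entropy against the labels
loss.backward()
optimizer.step()  # the pre-trained weights are updated, not frozen
```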

Question 5

What is the attention mechanism?

  • A way of determining the importance of each word in a sentence for the translation of another sentence
  • A way of identifying the topic of a sentence
  • A way of predicting the next word in a sentence
  • A way of determining the similarity between two sentences
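
In the translation setting, attention weights quantify how much each source word matters for each target word. A NumPy sketch of those cross-attention weights, with random vectors standing in for real encoder and decoder states:

```python
# Each row of the returned matrix says how important each source word is
# for the target word currently being produced.
import numpy as np

def attention_weights(queries, keys):
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)  # softmax over source positions

rng = np.random.default_rng(1)
src_states = rng.normal(size=(5, 8))   # encoder states for 5 source words
tgt_queries = rng.normal(size=(3, 8))  # decoder queries for 3 target words
print(attention_weights(tgt_queries, src_states).shape)  # (3, 5): one row per target word
```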

Question 6

What are the encoder and decoder components of a transformer model?

  • The encoder ingests an input sequence and produces a sequence of hidden states. The decoder takes in the hidden states from the encoder and produces an output sequence.
  • The encoder ingests an input sequence and produces a sequence of tokens. The decoder takes in the tokens from the encoder and produces an output sequence.
  • The encoder ingests an input sequence and produces a single hidden state. The decoder takes in the hidden state from the encoder and produces an output sequence.
  • The encoder ingests an input sequence and produces a sequence of images. The decoder takes in the images from the encoder and produces an output sequence.
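
PyTorch's built-in `nn.Transformer` makes the split in the first option visible; a minimal sketch with arbitrary hyper-parameters:

```python
# The encoder turns the source into a sequence of hidden states ("memory");
# the decoder consumes that memory while producing the output sequence.
# Shapes are (seq, batch, d_model) in PyTorch's default layout.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=32, nhead=4, num_encoder_layers=2, num_decoder_layers=2)
src = torch.randn(10, 1, 32)  # source sequence, 10 positions
tgt = torch.randn(7, 1, 32)   # target sequence so far, 7 positions

memory = model.encoder(src)       # one hidden state per source position: (10, 1, 32)
out = model.decoder(tgt, memory)  # decoder attends over the memory: (7, 1, 32)
print(memory.shape, out.shape)
```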

Question 7

BERT is a transformer model that was developed by Google in 2018. What is BERT used for?

  • It is used to solve many natural language processing tasks, such as question answering, text classification, and natural language inference.
  • It is used to diagnose and treat diseases.
  • It is used to train other machine learning models, such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks.
  • It is used to generate text, translate languages, and write different kinds of creative content.
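
As an illustration of the first option, a hedged sketch of extractive question answering with a BERT checkpoint fine-tuned on SQuAD (checkpoint name assumed):

```python
from transformers import pipeline

qa = pipeline("question-answering",
              model="bert-large-uncased-whole-word-masking-finetuned-squad")
result = qa(question="Who developed BERT?",
            context="BERT is a transformer model that was developed by Google in 2018.")
print(result["answer"])  # expected: "Google"
```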

Question 8

What are the two sublayers of each encoder in a Transformer model?

  • Self-attention and feedforward
  • Recurrent and feedforward
  • Embedding and classification
  • Convolution and pooling
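
A minimal PyTorch sketch of one encoder layer with exactly those two sublayers, each wrapped in a residual connection and layer normalization (dimensions arbitrary):

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, d_model=32, nhead=4, d_ff=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.norm1(x + self.attn(x, x, x)[0])  # sublayer 1: self-attention
        return self.norm2(x + self.ff(x))          # sublayer 2: feedforward

x = torch.randn(1, 5, 32)
print(EncoderLayer()(x).shape)  # (1, 5, 32)
```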

Question 9

What are the three different embeddings that are generated from an input sentence in a Transformer model?

  • Convolution, pooling, and recurrent embeddings
  • Token, segment, and position embeddings
  • Recurrent, feedforward, and attention embeddings
  • Embedding, classification, and next sentence embeddings
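
A minimal PyTorch sketch of how the three embeddings combine in BERT-style models: three separate lookup tables whose outputs are summed position by position (all sizes and ids below are made up):

```python
import torch
import torch.nn as nn

vocab_size, max_len, d_model = 100, 16, 8
tok_emb = nn.Embedding(vocab_size, d_model)  # token embeddings
seg_emb = nn.Embedding(2, d_model)           # segment (sentence A/B) embeddings
pos_emb = nn.Embedding(max_len, d_model)     # learned position embeddings

token_ids = torch.tensor([[5, 42, 7, 99]])   # hypothetical token ids
segment_ids = torch.zeros_like(token_ids)    # all from "sentence A"
positions = torch.arange(token_ids.shape[1]).unsqueeze(0)

x = tok_emb(token_ids) + seg_emb(segment_ids) + pos_emb(positions)
print(x.shape)  # (1, 4, 8): one summed embedding per input token
```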
