Question 1
In what year did Google introduce its original Transformer?
Question 2
When sampling next-word predictions, what does the temperature parameter affect?
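For reference while answering: temperature divides the logits before the softmax, so higher values flatten the distribution (more random picks) and lower values sharpen it. A minimal sketch with hypothetical next-word scores:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    # Divide logits by temperature before softmax: T > 1 flattens the
    # distribution (more randomness), T < 1 sharpens it (more greedy).
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()            # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]              # hypothetical next-word scores
print(softmax_with_temperature(logits, temperature=0.5))   # peaked
print(softmax_with_temperature(logits, temperature=2.0))   # flatter
```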
Question 3
What does the embedding matrix (W_e) contain?
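For reference: the embedding matrix W_e holds one learned vector per vocabulary token, and embedding a token amounts to looking up its row. A toy sketch with a hypothetical three-word vocabulary:

```python
import numpy as np

vocab = ["the", "cat", "sat"]                 # hypothetical toy vocabulary
d_model = 4
rng = np.random.default_rng(0)
W_e = rng.normal(size=(len(vocab), d_model))  # one row per vocabulary token

def embed(token):
    # Embedding a token is just a row lookup in W_e.
    return W_e[vocab.index(token)]

print(embed("cat").shape)                     # (4,)
```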
Question 4
Which of the following is not an application of transformers?
Question 5
What is the purpose of the softmax function in transformers?
Question 6
What operation is done in parallel in feed-forward layers?
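For reference: the same two-layer MLP is applied to every sequence position at once, so a single matrix multiply processes all positions in parallel. A minimal sketch with hypothetical dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 3, 4, 8
X  = rng.normal(size=(seq_len, d_model))   # one row per token position
W1 = rng.normal(size=(d_model, d_ff))
W2 = rng.normal(size=(d_ff, d_model))

# One matrix multiply applies the same MLP to all rows (positions)
# simultaneously; no position waits on another.
H = np.maximum(X @ W1, 0.0)                # ReLU nonlinearity
Y = H @ W2
print(Y.shape)                             # (3, 4)
```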
Question 7
What does the 'P' in GPT stand for?
Question 8
What allows transformers to handle various NLP tasks after initial training?
Question 9
Which model size is mentioned as a reference for the scalability of transformers?
Question 10
What is the main function of the Attention Mechanism in a transformer?
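For reference: scaled dot-product attention lets each position's output become a weighted average of value vectors, weighted by how well its query matches every key. A minimal sketch with random placeholder matrices:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    # Scores = query-key dot products, scaled by sqrt(d_k); softmax turns
    # each row of scores into weights that mix the value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(attention(Q, K, V).shape)    # (3, 4): one updated vector per position
```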
Question 11
Which mechanism updates word vectors with contextual meaning?
Question 12
What steps are repeated to generate text during text prediction?
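For reference: generation repeats a loop of predict, sample, append, and feed the longer context back in. A toy sketch that stands in for the model with hypothetical bigram probabilities:

```python
import numpy as np

vocab = ["the", "cat", "sat"]      # hypothetical toy vocabulary
P = np.array([[0.1, 0.6, 0.3],    # next-word probabilities after "the"
              [0.2, 0.1, 0.7],    # after "cat"
              [0.8, 0.1, 0.1]])   # after "sat"

def generate(start, steps, rng):
    # Repeat: predict a distribution over the next word, sample from it,
    # append the sample, and condition the next prediction on it.
    tokens = [start]
    for _ in range(steps):
        probs = P[vocab.index(tokens[-1])]
        tokens.append(str(rng.choice(vocab, p=probs)))
    return tokens

print(generate("the", 5, np.random.default_rng(0)))
```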
Question 13
What operation is critical for predicting the next word in transformers?
Question 14
What do dot products in transformers measure?
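For reference: a large positive dot product means two vectors point in similar directions, near zero means they are roughly unrelated, and negative means they point apart. A sketch with hypothetical embedding values:

```python
import numpy as np

def dot_similarity(a, b):
    # The dot product grows when vectors align and shrinks (or goes
    # negative) when they do not, so it acts as a similarity score.
    return float(np.dot(a, b))

king  = np.array([1.0, 0.9, 0.1])   # hypothetical word embeddings
queen = np.array([0.9, 1.0, 0.2])
apple = np.array([-0.8, 0.1, 1.0])

print(dot_similarity(king, queen))  # large and positive: similar
print(dot_similarity(king, apple))  # negative: dissimilar
```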
Question 15
How is a word's position in high-dimensional space used in word embeddings?