A5 Q3
c5samuel (Student)
Post #1: A5 Q3
I'm trying to understand what exactly is inside the theta matrix. It sounds like each row is a vector representation of a word in the vocabulary, V. But how would these vectors be different for a context word vs. a current word?

Also, if I understand correctly, the softmax function is applied to only 2 words from the vocabulary. But we only get theta and delta as input. Does that mean our function should compute softmax for all possible word pairs?

Thanks!
2015-12-06 23:57
t4peruma (TA)
Post #2: RE: A5 Q3
(2015-12-06 23:57)c5samuel Wrote:  I'm trying to understand what exactly is inside the theta matrix. It sounds like each row is a vector representation of a word in the vocabulary, V. But how would these vectors be different for a context word vs. a current word?

See slide 21 of lecture 10.

The input vector of a word is its representation at position 't'; the output vector is the representation that is predicted for each word at position 't'.
Basically, in the case of skip-gram, at every layer of the network you're trying to predict what the output vectors* of the context words should be, given the input vector of the current word.

*(which would become input vectors in the next layer)

These details are explained much more thoroughly in the papers mentioned on the Word2Vec website, but for the purposes of part 3 (without the bonus) you don't need them.
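To make the input/output distinction concrete, here is a minimal NumPy sketch. It assumes (this is an illustration, not the assignment's actual data layout) that theta stacks the input vectors row by row and delta stacks the output vectors, so each word in V has one row in each matrix:

```python
import numpy as np

# Toy sizes: vocabulary of 5 words, 3-dimensional embeddings.
V, d = 5, 3
rng = np.random.default_rng(0)

theta = rng.normal(size=(V, d))  # input vectors: row i = word i as the current word
delta = rng.normal(size=(V, d))  # output vectors: row i = word i as a context word

current = 2                       # index of the current (center) word
scores = delta @ theta[current]   # one score per word in V: how likely is each as context?

# Softmax over the vocabulary (shifted by the max for numerical stability)
probs = np.exp(scores - scores.max())
probs /= probs.sum()              # probs[j] = P(context word j | current word 2)
```

The same word thus gets two different vectors: `theta[i]` is used when word i is the current word, and `delta[i]` is used when word i is being predicted as context.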

(2015-12-06 23:57)c5samuel Wrote:  Also, if I understand correctly, the softmax function is applied to only 2 words from the vocabulary. But we only get theta and delta as input. Does that mean our function should compute softmax for all possible word pairs?

Yes; see slide 12 of the A5 tutorial.
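Computing the softmax for all pairs at once can be done with a single matrix product. This is a hedged sketch under the same assumed layout as above (theta holding input vectors, delta holding output vectors, one row per word); the assignment's actual function signature may differ:

```python
import numpy as np

V, d = 4, 3
rng = np.random.default_rng(1)
theta = rng.normal(size=(V, d))  # input vectors (current words)
delta = rng.normal(size=(V, d))  # output vectors (context words)

# scores[i, j] = theta[i] . delta[j], the score of word j as context for current word i
scores = theta @ delta.T

# Row-wise softmax: subtract each row's max for stability, then normalize.
scores -= scores.max(axis=1, keepdims=True)
probs = np.exp(scores)
probs /= probs.sum(axis=1, keepdims=True)  # row i = P(context | current word i), sums to 1
```

Each row of `probs` is a full distribution over the vocabulary, so every (current, context) pair is covered in one shot.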
2015-12-07 11:25