'How does word2vec learn word relations?
Which part of the algorithm specifically makes the embeddings to have the king - boy + girl = queen ability? Did they just did this by accident?
Edit :
Take the CBOW as an example. I know about they use embeddings instead of one-hot vectors to encode the words and made the embeddings trainable instead of how we do when using one hot vectors that the data itself is not trainable. Then the output is a one-hot vector for target word. They just average all the surrounding word embeddings at some point then put some lego layers afterwards. So at the end they find the mentioned property by surprise, or is there a training procedure or network structure that gave the embeddings that property?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
