Word embeddings and nearest neighbors

[00:00 - 00:09] We'll dive into this transformer in more detail in the very next lesson. Okay, so after we transformed this vector, (0, 1) became (1, 0).
[00:10 - 00:16] It's not clear how that happened, and that's okay. But what we want to understand now is, how do we go from (1, 0) back into a word?
[00:17 - 00:23] So let me visualize that for you again. Here, we've plotted (1, 0) on a plot.
[00:24 - 00:26] You can ignore the dotted line circle. That was just to help me keep organized.
[00:27 - 00:31] So here we have (1, 0). But the question is, what word does that correspond to?
[00:32 - 00:42] Before, I told you that there were three words, each of them with its own corresponding vector. So what we can do is plot those three words here as well.
[00:43 - 00:47] So here we have the three words from before: "you", "are", and "cold". And we have this mystery word right over here.
[00:48 - 00:59] So how would we determine what word this new (1, 0) vector corresponds to? How do we associate a word with it? Well, one way that we could do that is just to look at the nearest neighbor.
[01:00 - 01:11] So here, this point, (1, 0), is closest to the point at about 0.7. So it stands to reason that this is probably "are", right?
[01:12 - 01:20] So we're going to use that idea. We're actually going to look up what is the closest point that has a word associated with it, and that's going to be the word that we output.
[01:21 - 01:31] So (1, 0) translates into "are", and that's this last step, which we call nearest neighbors. So to recap, we have three different steps.
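(For reference, here is a minimal sketch of that nearest-neighbor lookup in Python. The 2D coordinates below are illustrative stand-ins, not the exact numbers from the lesson's plot.)

```python
import numpy as np

# Illustrative 2D vectors for the three vocabulary words from the example.
# These coordinates are stand-ins, not the exact numbers on the lesson's plot.
word_vectors = {
    "you":  np.array([0.0, 1.0]),
    "are":  np.array([0.7, 0.7]),
    "cold": np.array([0.0, -1.0]),
}

def nearest_word(vector):
    """Return the word whose vector is closest (Euclidean distance) to `vector`."""
    return min(word_vectors, key=lambda w: np.linalg.norm(word_vectors[w] - vector))

output = np.array([1.0, 0.0])   # the vector the transformer produced
print(nearest_word(output))     # -> "are"
```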
[01:32 - 01:37] The first is to convert a word into a vector. This is just a lookup in a big list of vectors that we have, or a big dictionary of vectors.
[01:38 - 01:49] The second is to actually transform that vector using a transformer, which we'll talk about in more detail in a second. And once we have that output vector, we then convert that vector back into a word using nearest neighbor.
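(Putting the three steps together, here is a rough end-to-end sketch under the same assumptions: the embedding values are illustrative, and the transformer is replaced by a hard-coded stand-in since the real model is covered later.)

```python
import numpy as np

# Step 1: word -> vector, a lookup in a dictionary (or table) of embeddings.
embeddings = {
    "you":  np.array([0.0, 1.0]),
    "are":  np.array([0.7, 0.7]),
    "cold": np.array([0.0, -1.0]),
}

# Step 2: the transformer, treated here as a black box that maps one vector
# to another. This stand-in just returns the output from the example.
def transformer(vector):
    return np.array([1.0, 0.0])

# Step 3: vector -> word, by finding the nearest embedding.
def nearest_word(vector):
    return min(embeddings, key=lambda w: np.linalg.norm(embeddings[w] - vector))

print(nearest_word(transformer(embeddings["you"])))  # "you" -> (0, 1) -> (1, 0) -> "are"
```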
[01:50 - 01:58] OK, so now that I've talked about all of these, we can answer Maya's question from before. Maya's question was, are tokens related to features?
[01:59 - 02:07] So right here, this input, "you", would have been some integer, right? Like we saw before, it could have been 2, 500, or whatever it is.
[02:08 - 02:16] It's a single integer, and that single integer is the token's ID. Right here, (0, 1) is what we would call the token.
[02:17 - 02:24] And it's also what we would call the feature. So to Maya's point, tokens basically are features, right?
[02:25 - 02:37] But features are broader. Features apply to any neural network, in between layers, whereas tokens are very, very specific to transformer models, or transformer-based models.
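(A small sketch of that distinction: the token's ID is a single integer used to index into an embedding table, and the row it selects is the vector the network works with, the feature. The IDs and values here are made up for illustration.)

```python
import numpy as np

# Made-up tokenizer output: each piece of text maps to a single integer ID.
token_ids = {"you": 2, "are": 500, "cold": 7}

# Embedding table: row i holds the vector (the "feature") for token ID i.
embedding_table = np.zeros((1000, 2))
embedding_table[2]   = [0.0, 1.0]    # vector for "you"
embedding_table[500] = [0.7, 0.7]    # vector for "are"
embedding_table[7]   = [0.0, -1.0]   # vector for "cold"

token_id = token_ids["you"]           # the token's ID: one integer (2)
feature  = embedding_table[token_id]  # the feature: the vector [0.0, 1.0]
print(token_id, feature)
```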
[02:38 - 02:47] OK. All right, so this is the summary in diagram form.
[02:48 - 02:53] Now, let me re-summarize, but using the points in 2D space. So to start off, we had the word "you".
[02:54 - 03:06] The word "you" was translated into the coordinate (0, 1). Then we ran the transformer, and that produced a new coordinate, (1, 0). The question was, what does this coordinate correspond to?
[03:07 - 03:12] What is the word that the coordinate (1, 0) gives us? And we simply plot all of the words that we know about.
[03:13 - 03:18] And we look for the closest one, which was "are". And that's how we produced "are" from this entire process.
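(To make "closest" concrete, here is the distance arithmetic behind that choice, again with illustrative coordinates for the three words.)

```python
import numpy as np

word_vectors = {"you": (0.0, 1.0), "are": (0.7, 0.7), "cold": (0.0, -1.0)}  # illustrative
target = np.array([1.0, 0.0])  # the transformer's output

for word, coords in word_vectors.items():
    print(f"{word:>5}: distance {np.linalg.norm(np.array(coords) - target):.2f}")
# you: 1.41, are: 0.76, cold: 1.41 -> "are" is nearest, so (1, 0) reads out as "are"
```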
[03:19 - 03:27] OK, so the question is, are "token" and "embedding" synonyms? Isn't a token a word fragment or a punctuation mark?
[03:28 - 03:32] So it's a good question. Let me go back to that diagram here.
[03:33 - 03:41] OK, so in this diagram, embeddings and features are direct synonyms. Tokens can refer to either.
[03:42 - 03:49] So a token can refer to either the text or the vector. They correspond to the same thing.
[03:50 - 03:55] Right. So when you say word fragment or punctuation mark, that's absolutely correct. That is definitely the case; a token corresponds to that.
[03:56 - 04:00] But the token can also refer to its vector representation. So there are two representations of the same thing.
[04:01 - 04:12] And the underlying thing is the token. Yeah, and then feel free to leave more questions in the chat if that was confusing.