## Introduction to PyTorch Embedding

A PyTorch embedding maps discrete items, such as words represented by integer indices, into a low-dimensional space of dense vectors. Items that would otherwise need high-dimensional sparse vectors get compact representations, and embeddings learned on one problem can be reused on new problems.

### What is PyTorch Embedding?

- An embedding layer is created with `nn.Embedding`, and its weight tensor is initialized with the vocabulary size and vector dimension the task requires.
- As the model trains, the vectors are updated so that words or numbers used in similar ways end up with similar vectors.
- If pretrained vectors are available, they can be loaded into the layer so that the model starts from what was learned earlier instead of from random values.
- Alongside the embedding layer, a vocabulary that maps each word to an index must be defined; that index is used to fetch the correct vector from the layer.
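The points above can be sketched in a few lines. This is a minimal illustration, not a definitive recipe: the vocabulary `vocab` and the random `pretrained` tensor are made-up stand-ins, while `nn.Embedding` and `nn.Embedding.from_pretrained` are the actual PyTorch calls.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy vocabulary: word -> integer index
vocab = {"hello": 0, "world": 1, "pytorch": 2}

# One trainable 4-dimensional vector per vocabulary entry
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

# Fetch the vector for a single word through its index
idx = torch.tensor([vocab["pytorch"]], dtype=torch.long)
vector = embedding(idx)
print(vector.shape)  # torch.Size([1, 4])

# Load existing vectors (random stand-ins here) instead of a random init;
# freeze=True keeps them fixed during training
pretrained = torch.randn(len(vocab), 4)
frozen = nn.Embedding.from_pretrained(pretrained, freeze=True)
print(torch.equal(frozen(idx)[0], pretrained[2]))  # True
```

Setting `freeze=False` instead would let the loaded vectors keep training with the rest of the model.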

### How does PyTorch Embedding Work?

- An embedding layer can be viewed as a linear layer without bias, defined as `nn.Linear(number of words, vector dimension)`. In that view each word is a one-hot vector whose length equals the vocabulary size (1000, for example), with a 1 at the word's position. If the vector has v[2] = 1, multiplying it by the weight matrix returns the row at index 2 of the embedding matrix. Because the result is always a single row, we can pass just the index instead of the whole vector; the index gives the position of the 1 in the one-hot vector.
- The input to `forward()` is a tensor of word indices, and each index maps to exactly one word. With `nn.Linear` the inputs would have to be one-hot vectors, whereas `nn.Embedding` takes the indices directly. There is no further architecture behind an embedding: it is a single layer, and functionally a linear one. Its weight is an M x N matrix where M is the number of words and N is the vector size, so there is a one-to-one correspondence between words and rows.
- The embedding is usually the first layer of a model, where it reduces the input dimension. One-hot encoded matrices are not needed, only the word indices, which is why embedding layers are so common in NLP. Multiplying a one-hot vector by the embedding matrix gives the same result as the lookup. Backpropagation works through the embedding matrix as it does through any other weight, and only the rows that were looked up receive gradients.
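The equivalence between an index lookup and a one-hot multiplication can be checked directly. The sketch below uses arbitrary sizes (`vocab_size = 5`, `dim = 3`) chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

vocab_size, dim = 5, 3
embedding = nn.Embedding(vocab_size, dim)

indices = torch.tensor([2, 4])

# Path 1: direct row lookup by index
looked_up = embedding(indices)

# Path 2: one-hot vectors multiplied by the same weight matrix
one_hot = F.one_hot(indices, num_classes=vocab_size).float()
multiplied = one_hot @ embedding.weight

print(torch.allclose(looked_up, multiplied))  # True

# Backpropagation: only rows 2 and 4 receive a non-zero gradient
looked_up.sum().backward()
print(embedding.weight.grad)
```

The lookup avoids building the one-hot matrix at all, which matters once the vocabulary has tens of thousands of entries.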

### Uses of PyTorch Embedding

- An embedding converts each word in the vocabulary to a vector of a fixed, chosen size. Unlike a one-hot vector, which contains only 0s and 1s, the result is a dense vector of real numbers, so far fewer dimensions are needed to represent a word.
- The embedding layer works like a lookup table: each word is converted to a number, and these numbers index the table. In effect, the keys are words and the values are word vectors.
- In NLP, embeddings reduce dimensionality, which lets us control the number of features the model works with. Fewer features mean fewer parameters in the following layers, and training is typically faster than with large sparse inputs.
- Embeddings also connect words with contexts. Words with similar meanings that occur in similar contexts receive similar vectors, so the model does not have to learn the relationships of every word separately.
- Large inputs in machine learning are hard to manage because the number of distinct words and related contexts grows quickly. Embeddings map words to indices and indices to vectors, so the entire input stream is reduced to a compact numeric form.
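As a rough sketch of the lookup-table view, the snippet below turns two short sentences into index tensors and then into vectors. The vocabulary and sentences are invented for the example, and the similarity score is only meaningful once the layer has been trained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical vocabulary; words and indices are illustrative only
vocab = {"the": 0, "cat": 1, "dog": 2, "sat": 3, "ran": 4}
embedding = nn.Embedding(len(vocab), 8)

sentences = [["the", "cat", "sat"], ["the", "dog", "ran"]]

# Words become integer keys into the lookup table
indices = torch.tensor([[vocab[w] for w in s] for s in sentences])
vectors = embedding(indices)
print(indices.shape)  # torch.Size([2, 3])
print(vectors.shape)  # torch.Size([2, 3, 8])

# Cosine similarity between two word vectors (random at this point;
# training is what pulls related words together)
cat = embedding.weight[vocab["cat"]]
dog = embedding.weight[vocab["dog"]]
similarity = F.cosine_similarity(cat, dog, dim=0)
print(similarity.item())
```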

### Parameters of PyTorch Embedding

Given below are the parameters of PyTorch Embedding:

- **num_embeddings:** An integer giving the size of the embedding dictionary, that is, the number of distinct indices.
- **embedding_dim:** An integer giving the size of each embedding vector.
- **max_norm:** An optional float. Any embedding vector whose norm is larger than max_norm is renormalized to have norm max_norm when it is looked up.
- **scale_grad_by_freq:** An optional Boolean. When set, gradients are scaled by the inverse of the frequency of the words in the mini-batch.
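The effect of the two optional parameters can be observed on a small layer. This is a sketch with arbitrary sizes; the weights are filled with a constant only so that the renormalization is easy to see.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# max_norm: rows are rescaled in place at the moment they are looked up
emb = nn.Embedding(num_embeddings=10, embedding_dim=4, max_norm=1.0)
with torch.no_grad():
    emb.weight.fill_(5.0)  # every row now has norm 10

idx = torch.tensor([0, 3])
out = emb(idx)
print(out.norm(dim=1))       # looked-up rows are scaled down to about 1.0
print(emb.weight[1].norm())  # a row that was not looked up keeps norm 10

# scale_grad_by_freq: gradients are divided by how often an index occurs
scaled = nn.Embedding(10, 4, scale_grad_by_freq=True)
plain = nn.Embedding(10, 4)
batch = torch.tensor([1, 1, 1, 2])  # index 1 appears three times
scaled(batch).sum().backward()
plain(batch).sum().backward()
print(plain.weight.grad[1])   # 3.0 per element, one contribution per occurrence
print(scaled.weight.grad[1])  # 1.0 per element after dividing by the count
```

Because `max_norm` modifies the weight in place, any differentiable operation on `emb.weight` should be applied to a clone taken before the lookup.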

### Example of PyTorch Embedding

Given below is an example of PyTorch Embedding:

**Code:**

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(2)

# Map each word to an integer index
word_conversion = {"hey": 0, "there": 1}

# 2 words in the vocabulary, 3 dimensions per word vector
embeddings = nn.Embedding(2, 3)
lookup = torch.tensor([word_conversion["hey"]], dtype=torch.long)
hey_embeddings = embeddings(lookup)
print(hey_embeddings)

# Using max_norm together with a differentiable operation on the weight
n, d, m = 2, 4, 6
embedding = nn.Embedding(n, d, max_norm=1.0)
weight = torch.randn((m, d), requires_grad=True)
index = torch.tensor([0, 1])  # every index must be smaller than n
# Clone first: the lookup below renormalizes embedding.weight in place
x = embedding.weight.clone() @ weight.t()   # shape (n, m)
y = embedding(index) @ weight.t()           # shape (len(index), m)
output = x.unsqueeze(0) + y.unsqueeze(1)    # shape (len(index), n, m)
loss_factor = output.sigmoid().prod()
loss_factor.backward()


class NewModel(nn.Module):
    """Skip-gram style model with negative sampling."""

    def __init__(self, embed_size, embed_dimension):
        super().__init__()
        self.embed_size = embed_size
        self.embed_dimension = embed_dimension
        # u: vectors for centre words, v: vectors for context words
        self.u_embeddings = nn.Embedding(embed_size, embed_dimension, sparse=True)
        self.v_embeddings = nn.Embedding(embed_size, embed_dimension, sparse=True)
        self.init_embed()

    def init_embed(self):
        initrange = 0.75 / self.embed_dimension
        self.u_embeddings.weight.data.uniform_(-initrange, initrange)
        # Context vectors start at zero
        self.v_embeddings.weight.data.zero_()

    def forward(self, pos_u, pos_v, neg_v):
        embed_u = self.u_embeddings(pos_u)            # (batch, dim)
        embed_v = self.v_embeddings(pos_v)            # (batch, dim)
        # Dot product of each positive (centre, context) pair
        score = torch.sum(torch.mul(embed_u, embed_v), dim=1)
        score = F.logsigmoid(score)
        neg_embed_v = self.v_embeddings(neg_v)        # (batch, negatives, dim)
        # Dot product of each centre word with its negative samples
        negtv_score = torch.bmm(neg_embed_v, embed_u.unsqueeze(2)).squeeze(2)
        negtv_score = F.logsigmoid(-1 * negtv_score)
        # Negative log-likelihood over positive and negative pairs
        return -1 * (torch.sum(score) + torch.sum(negtv_score))
```
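The model above creates its embeddings with `sparse=True`, which produces sparse gradients and therefore needs an optimizer that accepts them, such as `torch.optim.SparseAdam`. The following is a hedged sketch of a training loop. `TinySkipGram` is a hypothetical, compact stand-in written for this example, and the batch of indices is random, made-up data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)


class TinySkipGram(nn.Module):
    """Hypothetical compact skip-gram model with negative sampling."""

    def __init__(self, vocab_size, dim):
        super().__init__()
        self.u = nn.Embedding(vocab_size, dim, sparse=True)  # centre words
        self.v = nn.Embedding(vocab_size, dim, sparse=True)  # context words

    def forward(self, pos_u, pos_v, neg_v):
        eu = self.u(pos_u)                                    # (B, D)
        pos = F.logsigmoid((eu * self.v(pos_v)).sum(dim=1))   # (B,)
        neg = torch.bmm(self.v(neg_v), eu.unsqueeze(2)).squeeze(2)  # (B, K)
        return -(pos.sum() + F.logsigmoid(-neg).sum())


model = TinySkipGram(vocab_size=20, dim=8)
# SparseAdam handles the sparse gradients produced by sparse=True
optimizer = optim.SparseAdam(list(model.parameters()), lr=0.01)

# Made-up batch: 4 centre/context pairs and 5 negative samples per pair
pos_u = torch.randint(0, 20, (4,))
pos_v = torch.randint(0, 20, (4,))
neg_v = torch.randint(0, 20, (4, 5))

losses = []
for _ in range(50):
    optimizer.zero_grad()
    loss = model(pos_u, pos_v, neg_v)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses[0], losses[-1])  # the loss on this fixed batch should go down
```

In a real setting the index tensors would come from a corpus and a negative sampler rather than from `torch.randint`.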

### Conclusion

Embeddings connect words with numbers, which takes care of much of the input-representation work in NLP and machine learning. However, it is important to choose the dimensions carefully and to keep the vocabulary properly indexed.
