
7/15/2023

Combine a custom FC layer with a Hugging Face model; good to remember and modify for your own needs.

Refer to the code:


.

    def model_forward(self, pixel_values, labels):
        # Original ViT encoder-decoder outputs
        outputs = self.model(pixel_values=pixel_values, labels=labels, output_hidden_states=True)
        # Get the last decoder hidden state
        last_hidden_state = outputs.decoder_hidden_states[-1]  # (batch_size, seq_len, hidden_size), e.g. (5, 15, 768)
        return last_hidden_state

    def fc_part(self, last_hidden_state):
        # Reshape the last hidden state
        reshaped_logits = last_hidden_state.view(-1, self.model.config.decoder.hidden_size)  # (batch_size*seq_len, hidden_size)
        # Apply the fully connected layer
        new_logits = self.custom_decoder_fc(reshaped_logits)  # (batch_size*seq_len, vocab_size)
        return new_logits

    def compute_loss(self, new_logits, labels):
        # Reshape labels to match the logits dimension
        reshaped_labels = labels.view(-1)  # (batch_size, seq_len) -> (batch_size*seq_len)
        # Calculate loss
        # [batch_size*seq_len, vocab_size] vs [batch_size*seq_len], e.g. [70, 13] vs [70]
        loss = self.loss_f(new_logits, reshaped_labels)  # scalar tensor
        return loss

    def forward_pass(self, pixel_values, labels):
        last_hidden_state = self.model_forward(pixel_values, labels)  # (batch_size, seq_len, hidden_size)
        new_logits = self.fc_part(last_hidden_state)  # (batch_size*seq_len, vocab_size)
        loss = self.compute_loss(new_logits, labels)  # scalar tensor
        # Reshape new_logits to match the labels dimension
        new_logits = new_logits.view(labels.shape[0], labels.shape[1], -1)  # (batch_size, seq_len, vocab_size)

        return {'logits': new_logits, 'loss': loss}

..


forward_pass runs the whole process step by step: forward the model, apply the custom FC layer, and compute the loss.

At the end it returns the new logits (reshaped back to batch_size, seq_len, vocab_size) together with the loss.
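
For context, these methods assume a wrapper class that defines self.model, self.custom_decoder_fc and self.loss_f. Below is a minimal sketch of such a wrapper; the class name, the checkpoint and the new vocab size are my own assumptions for illustration, not from the original code.

.

import torch.nn as nn
from transformers import VisionEncoderDecoderModel

class CustomFCModel(nn.Module):  # hypothetical wrapper class
    def __init__(self, model_name="nlpconnect/vit-gpt2-image-captioning", new_vocab_size=13):
        super().__init__()
        # Pretrained ViT encoder-decoder model from Hugging Face
        self.model = VisionEncoderDecoderModel.from_pretrained(model_name)
        # Custom FC head: decoder hidden size -> new vocabulary size
        self.custom_decoder_fc = nn.Linear(self.model.config.decoder.hidden_size, new_vocab_size)
        # Loss between (batch_size*seq_len, vocab_size) logits and (batch_size*seq_len) labels
        self.loss_f = nn.CrossEntropyLoss()

    # model_forward, fc_part, compute_loss and forward_pass from above go here.

..

With such a wrapper, outputs = model.forward_pass(pixel_values, labels) gives the dict of logits and loss shown above.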


Thank you.

www.marearts.com

πŸ™‡πŸ»‍♂️

7/04/2023

CrossEntropyLoss example code using input shaped like NLP tokens.

Refer to the code:

.

import torch
import torch.nn as nn

# Assume a batch size of 2 and a sequence length of 3, and the model's vocabulary size is 5.
# So, your predicted logits would have a shape of (batch size, sequence length, vocab size)

logits = torch.tensor([
    [[0.1, 0.2, 0.3, 0.4, 0.5], [0.5, 0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4, 0.5]],
    [[0.5, 0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4, 0.5], [0.5, 0.4, 0.3, 0.2, 0.1]]
])
logits = logits.view(-1, logits.shape[-1]) # Reshape logits to be 2D (N, C), where N is batch_size*seq_length, C is vocab_size

# Similarly, your labels would have a shape of (batch size, sequence length).
# These are example labels.

labels = torch.tensor([
    [0, 1, 2],
    [2, 1, 0]
])
labels = labels.view(-1) # Reshape labels to be 1D (N)

loss_function = nn.CrossEntropyLoss() # Initialize loss function
loss = loss_function(logits, labels) # Compute the loss

print(loss) # Print the loss

..




In this example, logits and labels are explicitly defined tensors. The values in logits represent the output from your model for each token in the sequence for each example in your batch, and the labels tensor represents the correct labels or classes for each of these tokens. nn.CrossEntropyLoss() is then used to compute the loss between the predicted logits and the actual labels.
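
If you want to see what nn.CrossEntropyLoss() is doing under the hood, the same number can be reproduced manually with log-softmax plus negative log-likelihood. This is just a sketch that reuses the reshaped logits and labels from the snippet above:

.

import torch
import torch.nn.functional as F

# CrossEntropyLoss = log_softmax over the class dimension, then mean negative log-likelihood
log_probs = F.log_softmax(logits, dim=-1)                 # (N, C)
nll = -log_probs[torch.arange(labels.shape[0]), labels]   # pick the log-prob of each true class
manual_loss = nll.mean()                                  # default reduction='mean'

print(manual_loss)  # same value as the loss above

..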




Thank you.

πŸ™‡πŸ»‍♂️

12/02/2021

How to combine multiple criterions into one loss function? Multiple loss functions for a single model.

You can simply refer to the code below:


ex1)

b = nn.MSELoss()(output_x, x_labels)
a = nn.CrossEntropyLoss()(output_y, y_labels)
loss = a + b
loss.backward()


ex2)

b = nn.MSELoss()
a = nn.CrossEntropyLoss()
loss_a = a(output_x, x_labels)
loss_b = b(output_y, y_labels)
loss = loss_a + loss_b
loss.backward()


And there are many opinions here:

https://discuss.pytorch.org/t/how-to-combine-multiple-criterions-to-a-loss-function/348/27
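
A common variant is to weight each loss before summing, so one criterion does not dominate the other. A minimal sketch; the weight values here are hypothetical hyperparameters to tune for your task:

w_mse, w_ce = 0.5, 1.0  # hypothetical weights

loss_mse = nn.MSELoss()(output_x, x_labels)
loss_ce = nn.CrossEntropyLoss()(output_y, y_labels)
loss = w_mse * loss_mse + w_ce * loss_ce
loss.backward()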


Thank you.

www.marearts.com

πŸ™‡πŸ»‍♂️