```python
import random
from typing import List

import numpy as np
import torch

# `Tokenizer` and `device` were not defined in the original snippet.
# `Tokenizer` is assumed to live in your own code (hence the quoted annotation);
# `device` is assumed to be the usual "CUDA if available" choice.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class Trainer:
    def __init__(self, model, tokenizer: "Tokenizer", optimizer=None):
        self.model = model
        if optimizer is None:
            self.optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
        else:
            self.optimizer = optimizer
        self.tokenizer = tokenizer
        self.loss_function = torch.nn.CrossEntropyLoss()

    def train(self, data: List[torch.Tensor], epochs, batch_size):
        loss_per_epoch = []
        # Pad with the tokenizer's own pad id so the mask below matches the padding
        # (pad_sequence pads with 0 by default, which may not be the '<pad>' id).
        pad_token = self.tokenizer.character_to_token('<pad>')
        for epoch in range(epochs):
            losses = []
            # Shuffle the sequences (in place, so the caller's list is reordered too)
            random.shuffle(data)
            # Create batches of sequences and their respective masks
            batches = []
            for i in range(0, len(data), batch_size):
                batch_data = data[i: i + batch_size]
                # Pad the batch to its longest sequence: shape (batch, max_len)
                sequence_tensor = torch.nn.utils.rnn.pad_sequence(
                    batch_data, batch_first=True, padding_value=pad_token
                ).to(device)
                # True on real tokens, False on padding
                mask_tensor = sequence_tensor != pad_token
                batches.append((sequence_tensor, mask_tensor))
            # Train the model on each batch
            self.model.train()
            for input_tensor, mask_tensor in batches:
                # The model is expected to return both its output and the target
                model_output, target = self.model(x=input_tensor, mask=mask_tensor)
                # CrossEntropyLoss expects the class dimension second, so
                # (batch, seq, vocab) is transposed to (batch, vocab, seq)
                loss = self.loss_function(model_output.transpose(1, 2), target)
                # Backpropagate the loss
                loss.backward()
                # Clip the gradient norm to 0.5 to guard against exploding gradients
                torch.nn.utils.clip_grad_norm_(self.model.parameters(), 0.5)
                # Update the parameters using the clipped gradients
                self.optimizer.step()
                # Reset the gradients so they do not accumulate into the next batch
                self.optimizer.zero_grad()
                # Record the loss so the epoch average can be computed
                losses.append(loss.item())
            # Report the average loss for this epoch
            epoch_loss = np.average(losses)
            loss_per_epoch.append(epoch_loss)
            print('Epoch:', epoch, 'Loss:', epoch_loss)
        return loss_per_epoch
```
Please make sure to adjust the import statements and other relevant parts of your code as needed.
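To see what the batching step in `train` produces, here is a minimal plain-Python sketch of the same idea: slice the data into batches, pad each batch to its longest sequence, and build a mask. It does not use PyTorch. The names `make_batches` and `PAD_ID` are hypothetical, and the pad id of 0 is an assumption; in the real code the id comes from `tokenizer.character_to_token('<pad>')`.

```python
from typing import List, Tuple

PAD_ID = 0  # assumed id of the '<pad>' token; the real value comes from the tokenizer


def make_batches(
    sequences: List[List[int]], batch_size: int, pad_id: int = PAD_ID
) -> List[Tuple[List[List[int]], List[List[bool]]]]:
    """Split sequences into batches, pad each batch to its longest sequence,
    and build a mask that is True on real tokens and False on padding."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        max_len = max(len(seq) for seq in chunk)
        # Right-pad every sequence in the chunk to the same length
        padded = [seq + [pad_id] * (max_len - len(seq)) for seq in chunk]
        # Same comparison as `sequence_tensor != pad_token` in the Trainer
        mask = [[token != pad_id for token in row] for row in padded]
        batches.append((padded, mask))
    return batches


example = [[5, 6, 7], [8, 9], [4], [3, 2, 1, 7]]
batches = make_batches(example, batch_size=2)
for padded, mask in batches:
    print(padded, mask)
```

Note that the mask is derived by comparing against the pad id, so a real token that happens to share that id would be masked as well. This is why the pad id should be reserved for padding only.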