How to use metrics with PyTorch?

Hello, I’m used to TensorFlow and Keras, where metrics are logged in a very simple way, like this:
model.compile(config.optimizer, config.loss_function, metrics=['accuracy', 'recall', 'AUC'])
But I want to use ResNet-18, which isn’t available in TensorFlow, so I decided to migrate to PyTorch. I was using this tutorial as a guideline:

The problem is that I can’t find an intuitive way to log the metrics as I used to. Metrics for my training and test data are not being logged. Do I need to write out the metric equations myself and push them to the log?
In this tutorial, only the test metrics are logged, and this is the function:
def test(model, test_loader):
    model.eval()

    # Run the model on some test examples
    with torch.no_grad():
        correct, total = 0, 0
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        print(f"Accuracy of the model on the {total} " +
              f"test images: {correct / total:%}")

        wandb.log({"test_accuracy": correct / total})

    # Save the model in the exchangeable ONNX format
    torch.onnx.export(model, images, "model.onnx")
    wandb.save("model.onnx")

Hi @flora-ufsc24, thank you for writing in! Specifying the metrics in wandb.log will do the trick and should add them to your wandb workspace. Could you please show me how you used to log them with TF to get the desired outcome?
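
As an illustration of the reply above, here is a minimal sketch of how Keras-style metrics could be computed by hand and collected into the dictionary that wandb.log accepts. It assumes binary labels (0 or 1) and that predictions and labels are already plain Python lists (for tensors, something like predicted.tolist()). The function name classification_metrics and the "test/" key prefix are hypothetical and not part of wandb or PyTorch.

```python
def classification_metrics(predicted, labels, prefix=""):
    """Compute accuracy, precision and recall for binary labels (0 or 1).

    Hypothetical helper: `predicted` and `labels` are equal-length
    sequences of ints, e.g. obtained from tensors with `.tolist()`.
    """
    if len(predicted) != len(labels):
        raise ValueError("predicted and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)

    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    return {
        f"{prefix}accuracy": (tp + tn) / len(pairs),
        f"{prefix}precision": precision,
        f"{prefix}recall": recall,
    }


metrics = classification_metrics(
    [1, 0, 1, 1, 0, 0],  # predicted classes
    [1, 0, 0, 1, 1, 0],  # true labels
    prefix="test/",
)
print(metrics)

# With an active run (after wandb.init), the dictionary can be logged as-is:
# wandb.log(metrics)
```

Calling the same helper with prefix="train/" on the training predictions would give a second dictionary, and the two can be merged into one wandb.log call per epoch.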

Hey Artsiom!
Yes, this is what I’m using in TF and it works:

#Initialize wandb with your project name
run = wandb.init(project='Test-Metrics', #Where the project goes
                 config={#and include hyperparameters and metadata
                         "optimizer": "adam",
                         "epochs":50,
                         "batch_size":32,
                         "loss_function":"binary_crossentropy",
                         "architecture": "Transferlearing_ResNet50",#Not necessary but good practice
                         "dataset": "My 1000 data 70/20/10"
                         })
config = wandb.config #We'll use this to configure our experiment

#Initialize model like you usually do
tf.keras.backend.clear_session()
model = create_model()
model.summary()

#Compile model like you usually do
#Notice that we use config, so our metadata matches what gets executed
#optimizer = tf.keras.optimizers.Adam(config.learning_rate)
model.compile(config.optimizer, config.loss_function, metrics=['accuracy','recall','AUC','precision','R2Score','TruePositives','FalsePositives'])

#We train with our beloved model.fit
#Notice WandbMetricsLogger is used as a regular callback
#We again use config

_=model.fit(x_train_features,y_train,
            epochs=config.epochs,
            batch_size=config.batch_size,
            validation_data=(x_val_features,y_val),
            callbacks=[WandbMetricsLogger()])

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(x_test_preprocessed, y_test)

And this is how I’m doing it with PyTorch right now:

# Function to compute accuracy
def calculate_accuracy(outputs, labels):
    _, predicted = torch.max(outputs, 1)
    correct = (predicted == labels).sum().item()
    accuracy = correct / labels.size(0)
    return accuracy

# Variables for early-stopping control
early_stopping_counter = 0
best_test_loss = float('inf')

# Training loop
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    train_accuracy = 0.0

    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        train_loss += loss.item() * inputs.size(0)
        train_accuracy += calculate_accuracy(outputs, labels) * inputs.size(0)

    # Compute training metrics
    train_loss /= len(train_loader.dataset)
    train_accuracy /= len(train_loader.dataset)

    # Evaluation on the test set
    model.eval()
    test_loss = 0.0
    test_accuracy = 0.0

    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)

            outputs = model(inputs)
            loss = criterion(outputs, labels)

            test_loss += loss.item() * inputs.size(0)
            test_accuracy += calculate_accuracy(outputs, labels) * inputs.size(0)

    # Compute test metrics
    test_loss /= len(test_loader.dataset)
    test_accuracy /= len(test_loader.dataset)

    # Log metrics to wandb only at the end of each epoch
    wandb.log({
        'epoch': epoch + 1,
        'train_loss': train_loss,
        'train_accuracy': train_accuracy,
        'test_loss': test_loss,
        'test_accuracy': test_accuracy
    }, step=epoch + 1)

    # Print metrics
    print(f'Epoch [{epoch + 1}/{epochs}], '
          f'Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.4f}, '
          f'Test Loss: {test_loss:.4f}, Test Accuracy: {test_accuracy:.4f}')

    # Check whether the test loss decreased
    if test_loss < best_test_loss:
        best_test_loss = test_loss
        early_stopping_counter = 0
    else:
        early_stopping_counter += 1

    # Stop training if the test loss has not decreased for 10 epochs
    if early_stopping_counter >= 10:
        print("Early stopping triggered.")
        break

# Close the wandb session at the end of training
wandb.finish()

I know that I’m not using the same metrics in both code samples, but it’s just an example. I still have a lot of questions about training in PyTorch, because in Keras you just need to use model.compile, and Lucas’ YouTube videos are amazing. If you know of any other materials that teach PyTorch with Weights & Biases, I would love to hear about them. Thanks for your time.
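
One detail that differs from the accuracy bookkeeping in the loop above: a metric such as AUC cannot be averaged batch by batch, because it depends on how all scores in the epoch rank against each other. A minimal sketch of a stateful accumulator follows, assuming binary labels and one positive-class score per example (for instance a softmax probability converted with .tolist()). The class name EpochAUC and its methods are hypothetical; they only loosely mirror the update/result/reset pattern of Keras metrics, and the pairwise computation is chosen for clarity, not speed.

```python
class EpochAUC:
    """Collect scores over an epoch and compute ROC AUC at the end.

    Hypothetical helper, not part of PyTorch or wandb.
    """

    def __init__(self):
        self.reset()

    def reset(self):
        # Call at the start of every epoch.
        self.scores = []
        self.labels = []

    def update(self, batch_scores, batch_labels):
        # Call once per batch with plain Python sequences.
        self.scores.extend(float(s) for s in batch_scores)
        self.labels.extend(int(y) for y in batch_labels)

    def compute(self):
        pos = [s for s, y in zip(self.scores, self.labels) if y == 1]
        neg = [s for s, y in zip(self.scores, self.labels) if y == 0]
        if not pos or not neg:
            raise ValueError("AUC needs at least one positive and one negative")

        # AUC equals the probability that a random positive example is
        # scored above a random negative one (ties count as half).
        wins = 0.0
        for p in pos:
            for n in neg:
                if p > n:
                    wins += 1.0
                elif p == n:
                    wins += 0.5
        return wins / (len(pos) * len(neg))


# Usage: two simulated batches standing in for a test_loader.
auc_metric = EpochAUC()
auc_metric.update([0.1, 0.4], [0, 0])
auc_metric.update([0.35, 0.8], [1, 1])
test_auc = auc_metric.compute()
print(test_auc)

# With an active run, this value could join the other per-epoch metrics:
# wandb.log({"test_auc": test_auc})
```

The pairwise loop is quadratic in the number of examples, so for large test sets a library implementation would be the more practical choice.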

Thank you so much for sending it over, @flora-ufsc24! Taking a look :)