Log default metrics #20418

Open

ierezell opened this issue Nov 13, 2024 · 2 comments
Labels
feature (Is an improvement or enhancement) · logger (Related to the Loggers)

Comments


ierezell commented Nov 13, 2024

Description & Motivation

When training a model, I have to specify dataloaders, epochs, the learning rate, etc., and I would like them to be logged by default (as Hugging Face does).
(Could be a DeviceStatsMonitor + batch throughput + dataset metrics)

Pitch

When training a model, many metrics are accessible, and it would be really nice to log them directly, like:

Pseudo-code

def Trainer.fit():
    for metric in ["learning_rate", "train_dataloader_len", "precision", "epochs", "limit_batches", ...]:
        for logger in configured_loggers:
            logger.log(metric, value)

Alternatives

Log all the metrics myself for all the loggers, as sketched above (long and tedious...). A flag like log_default_metrics=True would be a nice alternative; a rough version of the manual workaround is sketched below.
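For context, here is roughly what that manual workaround looks like with a custom Callback today. This is a sketch, not an official API: the DefaultMetricsLogger name is made up, and it assumes the on_train_start hook and the Trainer attributes used here (precision, loggers, optimizers, etc.) behave as in current Lightning.

import pytorch_lightning as pl

class DefaultMetricsLogger(pl.Callback):
    """Hypothetical callback: push trainer-level settings to every configured logger."""

    def on_train_start(self, trainer, pl_module):
        # By on_train_start, the optimizers and dataloaders have been set up
        params = {
            "max_epochs": trainer.max_epochs,
            "precision": trainer.precision,
            "limit_train_batches": trainer.limit_train_batches,
            "num_training_batches": trainer.num_training_batches,
            "learning_rate": trainer.optimizers[0].param_groups[0]["lr"],
        }
        # Send the same parameters to every logger configured on the Trainer
        for logger in trainer.loggers:
            logger.log_hyperparams(params)

# Usage: trainer = pl.Trainer(logger=..., callbacks=[DefaultMetricsLogger()])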

Additional context

I'm using Databricks (MLflow), and I can log my custom model metrics, but nothing shows up in the system metrics, default parameters, or default model metrics.

Thanks for the framework, it's really nice!

cc @Borda @awaelchli

ierezell added the feature and needs triage labels on Nov 13, 2024
lantiga added the logger label and removed the needs triage label on Nov 18, 2024
lantiga (Collaborator) commented Nov 18, 2024

Hey @ierezell, this is a good idea in general. Right now we have save_hyperparameters, which automatically logs the hyperparameters passed to the constructor. We could do something similar for DataModules after all.

import pytorch_lightning as pl
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

class MyModel(pl.LightningModule):
    def __init__(self, learning_rate=0.001, hidden_size=128, input_size=28, output_size=10):
        super().__init__()
        # Logs all constructor arguments (learning_rate, hidden_size, ...) as hyperparameters
        self.save_hyperparameters()

        self.model = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size)
        )
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = self.criterion(logits, y)
        self.log('train_loss', loss)
        return loss

    def configure_optimizers(self):
        # The saved hyperparameters are available via self.hparams
        return optim.Adam(self.parameters(), lr=self.hparams.learning_rate)

from pytorch_lightning.loggers import TensorBoardLogger

logger = TensorBoardLogger("logs", name="my_model")

trainer = pl.Trainer(logger=logger, max_epochs=5)
model = MyModel(learning_rate=0.01, hidden_size=256)

# Dummy dataset so the example runs end-to-end
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 28), torch.randint(0, 10, (64,))),
    batch_size=16,
)

trainer.fit(model, train_loader)
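If I'm not mistaken, LightningDataModule exposes save_hyperparameters as well, so data-side settings can be captured the same way. A minimal sketch, assuming a toy TensorDataset (shapes chosen to match the model above):

import pytorch_lightning as pl
import torch
from torch.utils.data import DataLoader, TensorDataset

class MyDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=32, num_samples=64):
        super().__init__()
        # Captures batch_size and num_samples under self.hparams,
        # analogous to save_hyperparameters() in the LightningModule
        self.save_hyperparameters()

    def train_dataloader(self):
        dataset = TensorDataset(
            torch.randn(self.hparams.num_samples, 28),
            torch.randint(0, 10, (self.hparams.num_samples,)),
        )
        return DataLoader(dataset, batch_size=self.hparams.batch_size)

# Usage: trainer.fit(model, datamodule=MyDataModule())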

ierezell (Author) commented

Hello @lantiga, thanks for the fast reply!

Indeed, I'm already using save_hyperparameters; I should have mentioned it.

The request is about "non-init" parameters like precision (fp16 or fp32), epochs, batch_size, etc., which I don't pass to the model's init but to the Trainer arguments.

Maybe a Trainer.fit(log_user_defined_args=True)?

And you're right, the same goes for the dataset (length, size, etc.), so maybe Trainer.fit(log_dataset_metrics=True)? (A manual version is sketched below.)
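Until a flag like that exists, one workaround is to log these by hand from a hook. A sketch only, assuming a single map-style train dataloader (so len(...) is well-defined) and that trainer.train_dataloader is populated by the time on_train_start fires:

class MyModel(pl.LightningModule):
    ...

    def on_train_start(self):
        # Log dataset and trainer settings once, at the start of training
        dl = self.trainer.train_dataloader
        self.logger.log_hyperparams({
            "train_dataset_len": len(dl.dataset),
            "batch_size": dl.batch_size,
            "precision": self.trainer.precision,
        })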

Thanks again,
Have a great day
