mdlearn.utils

Configurations and utilities for model building and training.

Functions

get_torch_optimizer(name, hparams, parameters)

Construct a PyTorch optimizer specified by name and hparams.

get_torch_scheduler(name, hparams, optimizer)

Construct a PyTorch lr_scheduler specified by name and hparams.

log_checkpoint(checkpoint_file, epoch, ...)

Write a torch .pt file containing the epoch, model, optimizer, and scheduler.

parse_args()

Parse command line arguments using the argparse library.

resume_checkpoint(checkpoint_file, model, ...)

Modify model, optimizers, and scheduler with values stored in the torch .pt file checkpoint_file to resume from a previous training checkpoint.

pydantic settings mdlearn.utils.BaseSettings

JSON schema:
{
   "title": "BaseSettings",
   "description": "Base class for settings, allowing values to be overridden by environment variables.\n\nThis is useful in production for secrets you do not wish to save in code, it plays nicely with docker(-compose),\nHeroku and any 12 factor app design.",
   "type": "object",
   "properties": {},
   "additionalProperties": false
}

dump_yaml(cfg_path: Union[str, pathlib.Path])

Dump the current settings to a YAML file at cfg_path.

classmethod from_yaml(filename: Union[str, pathlib.Path]) → mdlearn.utils._T

Construct a settings instance from a YAML file.
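
Example

A minimal round-trip sketch (MyConfig and its learning_rate field are illustrative, not part of mdlearn):

>>> from mdlearn.utils import BaseSettings
>>> class MyConfig(BaseSettings):
...     learning_rate: float = 1e-4
>>> cfg = MyConfig(learning_rate=1e-3)
>>> cfg.dump_yaml("config.yaml")
>>> cfg = MyConfig.from_yaml("config.yaml")
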
pydantic settings mdlearn.utils.OptimizerConfig

Pydantic schema for a PyTorch optimizer, allowing arbitrary optimizer hyperparameters.

JSON schema:
{
   "title": "OptimizerConfig",
   "description": "pydantic schema for PyTorch optimizer which allows\nfor arbitrary optimizer hyperparameters.",
   "type": "object",
   "properties": {
      "name": {
         "title": "Name",
         "default": "Adam",
         "env_names": "{'name'}",
         "type": "string"
      },
      "hparams": {
         "title": "Hparams",
         "default": {},
         "env_names": "{'hparams'}",
         "type": "object"
      }
   }
}

Config
  • extra: str = allow

Fields
field hparams: Dict[str, Any] = {}
field name: str = 'Adam'
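
Example

A short sketch of building an optimizer config (the SGD settings are illustrative):

>>> from mdlearn.utils import OptimizerConfig
>>> opt_cfg = OptimizerConfig(name="SGD", hparams={"lr": 0.01, "momentum": 0.9})
>>> opt_cfg.name
'SGD'
>>> opt_cfg.hparams
{'lr': 0.01, 'momentum': 0.9}
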
pydantic settings mdlearn.utils.SchedulerConfig

Pydantic schema for a PyTorch learning-rate scheduler, allowing arbitrary scheduler hyperparameters.

JSON schema:
{
   "title": "SchedulerConfig",
   "description": "pydantic schema for PyTorch scheduler which allows for arbitrary\nscheduler hyperparameters.",
   "type": "object",
   "properties": {
      "name": {
         "title": "Name",
         "default": "ReduceLROnPlateau",
         "env_names": "{'name'}",
         "type": "string"
      },
      "hparams": {
         "title": "Hparams",
         "default": {},
         "env_names": "{'hparams'}",
         "type": "object"
      }
   }
}

Config
  • extra: str = allow

Fields
field hparams: Dict[str, Any] = {}
field name: str = 'ReduceLROnPlateau'
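
Example

A short sketch of building a scheduler config (the patience value is illustrative):

>>> from mdlearn.utils import SchedulerConfig
>>> sched_cfg = SchedulerConfig(name="ReduceLROnPlateau", hparams={"patience": 5})
>>> sched_cfg.name
'ReduceLROnPlateau'
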
pydantic settings mdlearn.utils.WandbConfig

Settings for logging training runs to Weights & Biases (wandb).

JSON schema:
{
   "title": "WandbConfig",
   "description": "Base class for settings, allowing values to be overridden by environment variables.\n\nThis is useful in production for secrets you do not wish to save in code, it plays nicely with docker(-compose),\nHeroku and any 12 factor app design.",
   "type": "object",
   "properties": {
      "wandb_project_name": {
         "title": "Wandb Project Name",
         "env_names": "{'wandb_project_name'}",
         "type": "string"
      },
      "wandb_entity_name": {
         "title": "Wandb Entity Name",
         "env_names": "{'wandb_entity_name'}",
         "type": "string"
      },
      "model_tag": {
         "title": "Model Tag",
         "env_names": "{'model_tag'}",
         "type": "string"
      }
   },
   "additionalProperties": false
}

Fields
field model_tag: Optional[str] = None
field wandb_entity_name: Optional[str] = None
field wandb_project_name: Optional[str] = None
init(cfg: mdlearn.utils.BaseSettings, model: torch.nn.Module, wandb_path: Union[str, pathlib.Path]) → Optional[wandb.config]

Initialize wandb with model and config.

Parameters
  • cfg (BaseSettings) – Model configuration with hyperparameters and training settings.

  • model (torch.nn.Module) – Model to train, passed to wandb.watch(model) for logging.

  • wandb_path (PathLike) – Path to write wandb/ directory containing training logs.

Returns

Optional[wandb.config] – The wandb config object, or None if wandb_project_name is None.
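
Example

A hedged sketch of initializing wandb logging (assumes wandb is installed; the project name, model, and log path are illustrative, and in a real training script cfg would typically be the full training configuration rather than the WandbConfig itself):

>>> import torch
>>> from mdlearn.utils import WandbConfig
>>> cfg = WandbConfig(wandb_project_name="my-project")
>>> model = torch.nn.Linear(10, 2)
>>> wandb_cfg = cfg.init(cfg, model, "./wandb_logs")  # returns None if wandb_project_name is None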

mdlearn.utils.get_torch_optimizer(name: str, hparams: Dict[str, Any], parameters) → torch.optim.Optimizer

Construct a PyTorch optimizer specified by name and hparams.
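
Example

A minimal usage sketch (the model and hyperparameters are illustrative):

>>> import torch
>>> from mdlearn.utils import get_torch_optimizer
>>> model = torch.nn.Linear(10, 2)
>>> optimizer = get_torch_optimizer("Adam", {"lr": 1e-4}, model.parameters())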

mdlearn.utils.get_torch_scheduler(name: Optional[str], hparams: Dict[str, Any], optimizer: torch.optim.Optimizer) → Optional[torch.optim.lr_scheduler._LRScheduler]

Construct a PyTorch lr_scheduler specified by name and hparams.

Parameters
  • name (Optional[str]) – Name of PyTorch lr_scheduler class to use. If name is None, simply return None.

  • hparams (Dict[str, Any]) – Hyperparameters to pass to the lr_scheduler.

  • optimizer (torch.optim.Optimizer) – The initialized optimizer.

Returns

Optional[torch.optim.lr_scheduler._LRScheduler] – The initialized PyTorch scheduler, or None if name is None.
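
Example

A minimal usage sketch (the model, optimizer, and hyperparameters are illustrative):

>>> import torch
>>> from mdlearn.utils import get_torch_scheduler
>>> model = torch.nn.Linear(10, 2)
>>> optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
>>> scheduler = get_torch_scheduler("ReduceLROnPlateau", {"patience": 5}, optimizer)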

mdlearn.utils.log_checkpoint(checkpoint_file: Union[str, pathlib.Path], epoch: int, model: torch.nn.Module, optimizers: Dict[str, torch.optim.Optimizer], scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None)

Write a torch .pt file containing the epoch, model, optimizer, and scheduler.

Parameters
  • checkpoint_file (PathLike) – Path to save checkpoint file.

  • epoch (int) – The current training epoch.

  • model (torch.nn.Module) – The model whose parameters are saved.

  • optimizers (Dict[str, torch.optim.Optimizer]) – The optimizers whose parameters are saved.

  • scheduler (Optional[torch.optim.lr_scheduler._LRScheduler]) – Optional scheduler whose parameters are saved.
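
Example

A minimal sketch of saving a checkpoint at the end of an epoch (the "optimizer" key and file name are the caller's choice; the model is illustrative):

>>> import torch
>>> from mdlearn.utils import log_checkpoint
>>> model = torch.nn.Linear(10, 2)
>>> optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
>>> log_checkpoint("checkpoint-epoch-10.pt", 10, model, {"optimizer": optimizer})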

mdlearn.utils.parse_args() → argparse.Namespace

Parse command line arguments using the argparse library.

Returns

argparse.Namespace – A dict-like object containing the path to a YAML configuration file, accessed via the config attribute.

Example

>>> from mdlearn.utils import parse_args
>>> args = parse_args()
>>> # MyConfig should inherit from BaseSettings
>>> cfg = MyConfig.from_yaml(args.config)
mdlearn.utils.resume_checkpoint(checkpoint_file: Union[str, pathlib.Path], model: torch.nn.Module, optimizers: Dict[str, torch.optim.Optimizer], scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None) → int

Modify model, optimizers, and scheduler with values stored in the torch .pt file checkpoint_file to resume from a previous training checkpoint.

Parameters
  • checkpoint_file (PathLike) – Path to checkpoint file to resume from.

  • model (torch.nn.Module) – Module to update the parameters of.

  • optimizers (Dict[str, torch.optim.Optimizer]) – Optimizers to update.

  • scheduler (Optional[torch.optim.lr_scheduler._LRScheduler]) – Optional scheduler to update.

Returns

int – The epoch at which the checkpoint was saved, plus one, i.e. the current training epoch to start from.
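
Example

A minimal sketch of resuming training (assumes checkpoint-epoch-10.pt was written earlier by log_checkpoint with the same "optimizer" key; the model is illustrative):

>>> import torch
>>> from mdlearn.utils import resume_checkpoint
>>> model = torch.nn.Linear(10, 2)
>>> optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
>>> start_epoch = resume_checkpoint("checkpoint-epoch-10.pt", model, {"optimizer": optimizer})
>>> # start_epoch is the saved epoch plus one, e.g. 11 for this checkpoint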