mdlearn.nn.models.ae.lstm
Warning
LSTM models are still under development, use with caution!
Classes
LSTMAE – LSTM model to predict the dynamics for a time series of feature vectors.
LSTMAETrainer – Trainer class to fit an LSTM model to a time series of feature vectors.
- class mdlearn.nn.models.ae.lstm.LSTMAE(*args: Any, **kwargs: Any)
LSTM model to predict the dynamics for a time series of feature vectors.
- __init__(input_dim: int, latent_dim: int = 8, hidden_neurons: list[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True)
- Parameters:
input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. The length of this list sets the depth of the encoder, that is, how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
- forward(x: torch.Tensor) → tuple[torch.Tensor, torch.Tensor]
- Parameters:
x (torch.Tensor) – Tensor of shape (B, N, D): a batch of B sequences, each of length N with D feature dimensions.
- Returns:
torch.Tensor – The latent embedding of size (B, latent_dim).
torch.Tensor – The predicted future time step of size (B, D).
- mse_loss(y_true: torch.Tensor, y_pred: torch.Tensor, reduction: str = 'mean') → torch.Tensor
Compute the MSE loss between y_true and y_pred.
- Parameters:
y_true (torch.Tensor) – The true data.
y_pred (torch.Tensor) – The prediction.
reduction (str, default="mean") – The reduction strategy for the F.mse_loss function.
- Returns:
torch.Tensor – The MSE loss between y_true and y_pred.
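To make the reduction argument concrete, the sketch below reimplements the three reduction strategies accepted by F.mse_loss ("mean", "sum" and "none") in plain Python on flat lists. It is an illustration only, not the mdlearn implementation; the helper name mse_sketch is hypothetical.

```python
def mse_sketch(y_true, y_pred, reduction="mean"):
    """Plain-Python illustration of MSE reductions (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length.")
    # Element-wise squared errors; this is what reduction="none" returns.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    if reduction == "none":
        return squared
    if reduction == "sum":
        return sum(squared)
    if reduction == "mean":
        return sum(squared) / len(squared)
    raise ValueError(f"Unknown reduction: {reduction!r}")


y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
per_element = mse_sketch(y_true, y_pred, reduction="none")
total = mse_sketch(y_true, y_pred, reduction="sum")
average = mse_sketch(y_true, y_pred, reduction="mean")
```

With "none" the caller receives one squared error per element, which is useful for inspecting which time steps are predicted poorly; "mean" and "sum" collapse those errors to a single number.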
- class mdlearn.nn.models.ae.lstm.LSTMAETrainer(input_dim: int, latent_dim: int = 8, hidden_neurons: list[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = numpy.random.default_rng.integers, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, inference_batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: str | None = None, scheduler_hparams: dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: str | None = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)
Trainer class to fit an LSTM model to a time series of feature vectors.
- __init__(input_dim: int, latent_dim: int = 8, hidden_neurons: list[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = numpy.random.default_rng.integers, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, inference_batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: str | None = None, scheduler_hparams: dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: str | None = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)
- Parameters:
input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. The length of this list sets the depth of the encoder, that is, how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
window_size (int, default=10) – Number of timesteps considered for prediction.
horizon (int, default=1) – How many time steps to predict ahead.
seed (int, default=np.random.default_rng().integers(2**31 - 1, dtype=int)) – Random seed for torch, numpy, and random module.
in_gpu_memory (bool, default=False) – If True, will pre-load the entire data array to GPU memory.
num_data_workers (int, default=0) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.
prefetch_factor (int, default=2) – Number of samples loaded in advance by each worker. 2 means there will be a total of 2 * num_workers samples prefetched across all workers.
split_pct (float, default=0.8) – Proportion of data set to use for training. The rest goes to validation.
split_method (str, default="partition") – Method to split the data. For a random split use "random"; for a simple partition use "partition".
batch_size (int, default=128) – Mini-batch size for training.
inference_batch_size (int, default=128) – Mini-batch size for inference.
shuffle (bool, default=True) – Whether to shuffle training data or not.
device (str, default="cpu") – Specify training hardware, either cpu or cuda for GPU devices.
optimizer_name (str, default="RMSprop") – Name of the PyTorch optimizer to use. Matches PyTorch optimizer class name.
optimizer_hparams (Dict[str, Any], default={"lr": 0.001, "weight_decay": 0.00001}) – Dictionary of hyperparameters to pass to the chosen PyTorch optimizer.
scheduler_name (Optional[str], default=None) – Name of the PyTorch learning rate scheduler to use. Matches PyTorch scheduler class name.
scheduler_hparams (Dict[str, Any], default={}) – Dictionary of hyperparameters to pass to the chosen PyTorch learning rate scheduler.
epochs (int, default=100) – Number of epochs to train for.
verbose (bool, default=False) – If True, will print training and validation loss at each epoch.
clip_grad_max_norm (float, default=10.0) – Max norm of the gradients for gradient clipping. For more information, see the torch.nn.utils.clip_grad_norm_ documentation.
checkpoint_log_every (int, default=10) – Epoch interval to log a checkpoint file containing the model weights, optimizer, and scheduler parameters.
plot_log_every (int, default=10) – Epoch interval to log a visualization plot of the latent space.
plot_n_samples (int, default=10000) – Number of validation samples to use for plotting.
plot_method (Optional[str], default="TSNE") – The method for visualizing the latent space. To skip visualization, set plot_method=None. If using "TSNE", it will attempt to use the RAPIDS.ai GPU implementation and will fall back to the sklearn CPU implementation if RAPIDS.ai is unavailable.
train_subsample_pct (float, default=1.0) – Percentage of training data to use during hyperparameter sweeps.
valid_subsample_pct (float, default=1.0) – Percentage of validation data to use during hyperparameter sweeps.
use_wandb (bool, default=False) – If True, will log results to wandb.
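The interplay of window_size and horizon can be sketched in plain Python. The example assumes that each training example pairs window_size consecutive time steps with the single step lying horizon steps after the end of the window; the exact indexing used inside mdlearn is not stated above, so treat this convention and the helper name make_windows as assumptions.

```python
def make_windows(series, window_size=10, horizon=1):
    """Pair each input window with the step `horizon` steps after it.

    Hypothetical helper: the target-index convention is an assumption.
    """
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be positive.")
    pairs = []
    # Last valid start index so that the target still lies inside the series.
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start : start + window_size]
        target = series[start + window_size + horizon - 1]
        pairs.append((window, target))
    return pairs


series = list(range(8))  # stand-in for 8 time steps of feature vectors
pairs = make_windows(series, window_size=3, horizon=2)
first_window, first_target = pairs[0]
```

Under this convention a series of length T yields T - window_size - horizon + 1 examples, so larger windows or horizons reduce the number of usable training pairs.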
- Raises:
ValueError – split_pct should be between 0 and 1.
ValueError – train_subsample_pct should be between 0 and 1.
ValueError – valid_subsample_pct should be between 0 and 1.
ValueError – Specified device as cuda, but it is unavailable.
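The difference between the two split_method values, together with the split_pct range check, can be illustrated with a small standard-library sketch. It assumes that "partition" keeps the original order and assigns the first split_pct of the examples to training, and that the endpoints 0 and 1 are accepted; both points, and the helper name split_indices, are assumptions rather than the mdlearn implementation.

```python
import random


def split_indices(n_examples, split_pct=0.8, method="partition", seed=0):
    """Return (train, valid) index lists; hypothetical helper."""
    # Assumed to mirror the documented ValueError for split_pct.
    if not 0.0 <= split_pct <= 1.0:
        raise ValueError("split_pct should be between 0 and 1.")
    indices = list(range(n_examples))
    if method == "random":
        # Shuffle reproducibly before cutting.
        random.Random(seed).shuffle(indices)
    elif method != "partition":
        raise ValueError(f"Unknown split method: {method!r}")
    n_train = int(n_examples * split_pct)
    return indices[:n_train], indices[n_train:]


train_part, valid_part = split_indices(10, split_pct=0.8, method="partition")
train_rand, valid_rand = split_indices(10, split_pct=0.8, method="random")
```

For time series, a partition split keeps the validation examples later in time than the training examples, whereas a random split mixes windows from the whole trajectory.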
- fit(X: numpy.ndarray, scalars: dict[str, numpy.ndarray] = {}, output_path: str | Path = './', checkpoint: str | Path | None = None)
Trains the LSTMAE on the input data X.
- Parameters:
X (np.ndarray) – Input feature vectors of shape (N, D) where N is the number of data examples, and D is the dimension of the feature vector.
scalars (Dict[str, np.ndarray], default={}) – Dictionary of scalar arrays. For instance, the root mean squared deviation (RMSD) for each feature vector can be passed via {"rmsd": np.array(...)}. The dimension of each scalar array should match the number of input feature vectors N.
output_path (PathLike, default="./") – Path to write training results to. Makes an output_path/checkpoints folder to save model checkpoint files, and an output_path/plots folder to store latent space visualizations.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file to restore training.
- Raises:
ValueError – If X does not have two dimensions. For scalar time series, please reshape to (N, 1).
TypeError – If scalars is not type dict. A common error is to pass output_path as the second argument.
NotImplementedError – If using a learning rate scheduler other than ReduceLROnPlateau, a step function will need to be implemented.
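The input requirements above can be mimicked with nested Python lists standing in for NumPy arrays. The sketch shows how a scalar time series is reshaped to (N, 1) and how the two most common mistakes surface as exceptions. The helper check_fit_inputs is hypothetical, and the length check on scalars is an assumption based on the parameter description, not a documented exception.

```python
def check_fit_inputs(X, scalars=None):
    """Validate list-based stand-ins for the fit() inputs (hypothetical)."""
    scalars = {} if scalars is None else scalars
    if not isinstance(scalars, dict):
        raise TypeError("scalars must be a dict; was output_path passed second?")
    # X must be two-dimensional: N rows, each holding D features.
    if not X or not all(isinstance(row, (list, tuple)) for row in X):
        raise ValueError("X must have shape (N, D); reshape scalars to (N, 1).")
    n_examples, n_features = len(X), len(X[0])
    # Assumed check: one scalar value per input feature vector.
    for name, values in scalars.items():
        if len(values) != n_examples:
            raise ValueError(f"scalars[{name!r}] must have length {n_examples}.")
    return n_examples, n_features


series = [0.1, 0.4, 0.9]  # scalar time series of length N = 3
X = [[value] for value in series]  # reshaped to (N, 1)
shape = check_fit_inputs(X, {"rmsd": [1.0, 1.2, 1.5]})
```

Passing output_path by keyword, for example fit(X, output_path="./run"), avoids the TypeError described above because the path can no longer land in the scalars position.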
- predict(X: numpy.ndarray, inference_batch_size: int | None = None, checkpoint: str | Path | None = None) → tuple[numpy.ndarray, numpy.ndarray, float]
Predict using the LSTMAE.
- Parameters:
X (np.ndarray) – The input data to predict on.
inference_batch_size (Optional[int], default=None) – The batch size for inference. If None, uses the value specified during Trainer construction.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file.
- Returns:
np.ndarray – The predictions.
np.ndarray – The latent embeddings.
float – The average MSE loss.