mdlearn.nn.models.ae.lstm

Warning

LSTM models are still under development; use with caution!

Classes

`LSTMAE`(*args, **kwargs)	LSTM model to predict the dynamics for a time series of feature vectors.
`LSTMAETrainer`(input_dim[, latent_dim, ...])	Trainer class to fit an LSTM model to a time series of feature vectors.

class mdlearn.nn.models.ae.lstm.LSTMAE(*args: Any, **kwargs: Any)

LSTM model to predict the dynamics for a time series of feature vectors.

__init__(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True)

Parameters

input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. This list defines how deep the encoder is i.e. how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
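
As a rough illustration of how hidden_neurons shapes both halves of the model, the sketch below lists the layer widths implied by the parameter descriptions above. The helper name sketch_layer_widths is hypothetical, and the exact wiring (a final projection to latent_dim in the encoder, an output layer of width input_dim in the decoder) is an assumption rather than something this page confirms.

```python
from typing import Dict, List, Sequence


def sketch_layer_widths(
    input_dim: int, latent_dim: int = 8, hidden_neurons: Sequence[int] = (128,)
) -> Dict[str, List[int]]:
    """Hypothetical helper: list the layer widths implied by the parameters."""
    # Encoder: one LSTM block per entry of hidden_neurons, in the order given.
    # The final step down to latent_dim is an assumption of this sketch.
    encoder = [input_dim, *hidden_neurons, latent_dim]
    # Decoder: the same hidden widths reversed. Ending at input_dim is assumed
    # because forward() documents a predicted step of size (B, D).
    decoder = [latent_dim, *reversed(hidden_neurons), input_dim]
    return {"encoder": encoder, "decoder": decoder}


plan = sketch_layer_widths(input_dim=40, latent_dim=8, hidden_neurons=[256, 128])
print(plan["encoder"])  # [40, 256, 128, 8]
print(plan["decoder"])  # [8, 128, 256, 40]
```

A longer hidden_neurons list therefore deepens the encoder and the decoder together.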

forward(x: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]

Parameters

x (torch.Tensor) – Tensor of shape (B, N, D): a batch of B sequences, each of length N with D feature dimensions.

Returns

torch.Tensor – The latent embedding of size (B, latent_dim).
torch.Tensor – The predicted future time step of size (B, D).

mse_loss(y_true: torch.Tensor, y_pred: torch.Tensor, reduction: str = 'mean') → torch.Tensor

Compute the MSE loss between y_true and y_pred.

Parameters

y_true (torch.Tensor) – The true data.
y_pred (torch.Tensor) – The prediction.
reduction (str, default="mean") – The reduction strategy for the F.mse_loss function.

Returns

torch.Tensor – The MSE loss between y_true and y_pred.
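
The reduction argument is forwarded to F.mse_loss, which in PyTorch accepts "none", "mean", and "sum". The pure-Python stand-in below (the function mse is illustrative, not part of mdlearn) shows what each mode computes on a small example.

```python
from typing import List, Union


def mse(
    y_true: List[float], y_pred: List[float], reduction: str = "mean"
) -> Union[float, List[float]]:
    """Pure-Python stand-in for an MSE loss with a reduction argument."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Element-wise squared errors.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    if reduction == "none":
        return squared                      # one value per element
    if reduction == "sum":
        return sum(squared)                 # total squared error
    if reduction == "mean":
        return sum(squared) / len(squared)  # average squared error
    raise ValueError(f"unknown reduction: {reduction!r}")


y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 4.0, 0.0]
print(mse(y_true, y_pred, "none"))  # [0.0, 4.0, 9.0]
print(mse(y_true, y_pred, "sum"))   # 13.0
print(mse(y_true, y_pred, "mean"))  # 13.0 / 3
```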

class mdlearn.nn.models.ae.lstm.LSTMAETrainer(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = 42, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: Dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: Optional[str] = None, scheduler_hparams: Dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: Optional[str] = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)

Trainer class to fit an LSTM model to a time series of feature vectors.

__init__(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = 42, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: Dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: Optional[str] = None, scheduler_hparams: Dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: Optional[str] = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)

Parameters

input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. This list defines how deep the encoder is i.e. how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
window_size (int, default=10) – Number of timesteps considered for prediction.
horizon (int, default=1) – How many time steps to predict ahead.
seed (int, default=42) – Random seed for torch, numpy, and random module.
in_gpu_memory (bool, default=False) – If True, will pre-load the entire data array to GPU memory.
num_data_workers (int, default=0) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.
prefetch_factor (int, default=2) – Number of batches loaded in advance by each worker. 2 means there will be a total of 2 * num_data_workers batches prefetched across all workers.
split_pct (float, default=0.8) – Proportion of data set to use for training. The rest goes to validation.
split_method (str, default="partition") – Method to split the data. Use "random" for a random split or "partition" for a simple partition.
batch_size (int, default=128) – Mini-batch size for training.
shuffle (bool, default=True) – Whether to shuffle training data or not.
device (str, default="cpu") – Specify training hardware, either cpu or cuda for GPU devices.
optimizer_name (str, default="RMSprop") – Name of the PyTorch optimizer to use. Matches PyTorch optimizer class name.
optimizer_hparams (Dict[str, Any], default={"lr": 0.001, "weight_decay": 0.00001}) – Dictionary of hyperparameters to pass to the chosen PyTorch optimizer.
scheduler_name (Optional[str], default=None) – Name of the PyTorch learning rate scheduler to use. Matches PyTorch scheduler class name.
scheduler_hparams (Dict[str, Any], default={}) – Dictionary of hyperparameters to pass to the chosen PyTorch learning rate scheduler.
epochs (int, default=100) – Number of epochs to train for.
verbose (bool, default=False) – If True, will print training and validation loss at each epoch.
clip_grad_max_norm (float, default=10.0) – Max norm of the gradients for gradient clipping. For more information, see the torch.nn.utils.clip_grad_norm_ documentation.
checkpoint_log_every (int, default=10) – Epoch interval to log a checkpoint file containing the model weights, optimizer, and scheduler parameters.
plot_log_every (int, default=10) – Epoch interval to log a visualization plot of the latent space.
plot_n_samples (int, default=10000) – Number of validation samples to use for plotting.
plot_method (Optional[str], default="TSNE") – The method for visualizing the latent space. Set plot_method=None to skip visualization. If using "TSNE", it will attempt to use the RAPIDS.ai GPU implementation and will fall back to the sklearn CPU implementation if RAPIDS.ai is unavailable.
train_subsample_pct (float, default=1.0) – Percentage of training data to use during hyperparameter sweeps.
valid_subsample_pct (float, default=1.0) – Percentage of validation data to use during hyperparameter sweeps.
use_wandb (bool, default=False) – If True, will log results to wandb.
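
To make window_size and horizon concrete, the sketch below enumerates which time steps would form each input window and which step would be its prediction target. It assumes a plain sliding window with stride 1; the helper window_indices is hypothetical and the trainer's actual dataset may slice the series differently.

```python
from typing import List, Tuple


def window_indices(
    n_steps: int, window_size: int = 10, horizon: int = 1
) -> List[Tuple[int, int, int]]:
    """Return (start, stop, target) index triples for a series of n_steps."""
    samples = []
    # The last usable window must leave room for a target that lies
    # `horizon` steps after the window's final time step.
    for start in range(n_steps - window_size - horizon + 1):
        stop = start + window_size   # window covers steps [start, stop)
        target = stop + horizon - 1  # index of the step to predict
        samples.append((start, stop, target))
    return samples


pairs = window_indices(n_steps=6, window_size=3, horizon=1)
print(pairs)  # [(0, 3, 3), (1, 4, 4), (2, 5, 5)]
```

Under this assumption, a series of N steps yields N - window_size - horizon + 1 training samples, so short trajectories may need a smaller window_size.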

Raises

ValueError – split_pct should be between 0 and 1.
ValueError – train_subsample_pct should be between 0 and 1.
ValueError – valid_subsample_pct should be between 0 and 1.
ValueError – Specified device as cuda, but it is unavailable.
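
The difference between the two split_method values, together with the split_pct range check, can be sketched as follows. The helper split_indices is hypothetical, and both the inclusive bounds and the rounding of the training size are assumptions; the trainer's own implementation may differ.

```python
import random
from typing import List, Tuple


def split_indices(
    n: int, split_pct: float = 0.8, method: str = "partition", seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: split range(n) into train and validation indices."""
    # Inclusive bounds are an assumption of this sketch.
    if not 0.0 <= split_pct <= 1.0:
        raise ValueError("split_pct should be between 0 and 1.")
    indices = list(range(n))
    if method == "random":
        # Shuffle before cutting, so both sets sample the whole series.
        random.Random(seed).shuffle(indices)
    elif method != "partition":
        raise ValueError(f"unknown split method: {method!r}")
    # "partition" keeps time order: the first chunk trains, the rest validates.
    n_train = int(n * split_pct)
    return indices[:n_train], indices[n_train:]


train, valid = split_indices(10, split_pct=0.8, method="partition")
print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(valid)  # [8, 9]
```

For time series, a partition split keeps the validation data contiguous in time, whereas a random split mixes early and late frames into both sets.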

fit(X: numpy.ndarray, scalars: Dict[str, numpy.ndarray] = {}, output_path: Union[str, pathlib.Path] = './', checkpoint: Optional[Union[str, pathlib.Path]] = None)

Trains the LSTMAE on the input data X.

Parameters

X (np.ndarray) – Input feature vectors of shape (N, D) where N is the number of data examples, and D is the dimension of the feature vector.
scalars (Dict[str, np.ndarray], default={}) – Dictionary of scalar arrays. For instance, the root mean squared deviation (RMSD) for each feature vector can be passed via {"rmsd": np.array(...)}. The dimension of each scalar array should match the number of input feature vectors N.
output_path (PathLike, default="./") – Path to write training results to. Makes an output_path/checkpoints folder to save model checkpoint files, and output_path/plots folder to store latent space visualizations.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file to restore training.

Raises

ValueError – If X does not have two dimensions. For scalar time series, please reshape to (N, 1).
TypeError – If scalars is not of type dict. A common error is to pass output_path as the second argument.
NotImplementedError – If using a learning rate scheduler other than ReduceLROnPlateau, a step function will need to be implemented.
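
A minimal sketch of calling fit while avoiding the errors listed above. The wrapper names make_trainer and fit_on_series, the synthetic data, the chosen hyperparameter values, and the ./lstm_run path are all illustrative assumptions; only the LSTMAETrainer constructor arguments and the fit signature come from this page.

```python
from pathlib import Path

import numpy as np


def make_trainer(input_dim: int):
    """Build a trainer; imported lazily so this sketch loads without mdlearn."""
    from mdlearn.nn.models.ae.lstm import LSTMAETrainer

    return LSTMAETrainer(
        input_dim=input_dim, latent_dim=8, window_size=10, horizon=1,
        epochs=20, device="cpu", plot_method=None,
    )


def fit_on_series(trainer, X: np.ndarray, rmsd: np.ndarray, output_path: Path) -> None:
    """Hypothetical wrapper that guards against the documented fit() errors."""
    if X.ndim == 1:
        X = X.reshape(-1, 1)  # scalar time series -> shape (N, 1)
    if len(rmsd) != len(X):
        raise ValueError("each scalar array needs one entry per feature vector")
    # Pass scalars and output_path by keyword so a path can never end up
    # in the scalars slot (the TypeError case described above).
    trainer.fit(X, scalars={"rmsd": rmsd}, output_path=output_path)


# Synthetic stand-in data: N=500 time steps, D=40 features.
# Casting to float32 is a cautious choice of this sketch, not a documented need.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40)).astype(np.float32)
rmsd = rng.uniform(size=500)

# To train for real (requires mdlearn and PyTorch):
# fit_on_series(make_trainer(X.shape[1]), X, rmsd, Path("./lstm_run"))
```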

predict(X: numpy.ndarray, inference_batch_size: int = 512, checkpoint: Optional[Union[str, pathlib.Path]] = None) → Tuple[numpy.ndarray, numpy.ndarray, float]

Predict using the LSTMAE.

Parameters

X (np.ndarray) – The input data to predict on.
inference_batch_size (int, default=512) – The batch size for inference.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file.

Returns

np.ndarray – The predictions.
np.ndarray – The latent embeddings.
float – The average MSE loss.
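
As a usage sketch, predict returns three values, unpacked here in the order of the Returns list above. The helper summarize_predictions is hypothetical; it accepts any object exposing the predict signature documented on this page, such as a fitted LSTMAETrainer.

```python
import numpy as np


def summarize_predictions(trainer, X: np.ndarray, checkpoint=None) -> dict:
    """Call predict() and unpack its three return values in documented order."""
    preds, embeddings, avg_loss = trainer.predict(
        X, inference_batch_size=512, checkpoint=checkpoint
    )
    return {
        "preds_shape": preds.shape,            # predicted time steps
        "embeddings_shape": embeddings.shape,  # latent embeddings
        "avg_mse": float(avg_loss),            # average MSE loss
    }
```

Lowering inference_batch_size is one way to reduce memory use during inference when X is large.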