mdlearn.nn.models.ae.lstm
Warning
LSTM models are still under development; use with caution!
Classes
LSTMAE | LSTM model to predict the dynamics for a time series of feature vectors.
LSTMAETrainer | Trainer class to fit an LSTM model to a time series of feature vectors.
- class mdlearn.nn.models.ae.lstm.LSTMAE(*args: Any, **kwargs: Any)
LSTM model to predict the dynamics for a time series of feature vectors.
- __init__(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True)
- Parameters
input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. This list defines how deep the encoder is, i.e., how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
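The relationship between hidden_neurons, the encoder depth, and the mirrored decoder can be sketched without any deep learning dependency. The helper below, layer_plan, is hypothetical and not part of mdlearn; it assumes the encoder ends in a projection to latent_dim and the decoder maps from latent_dim back to input_dim, which the parameter descriptions suggest but do not state outright.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def layer_plan(
    input_dim: int,
    latent_dim: int = 8,
    hidden_neurons: Sequence[int] = (128,),
) -> Tuple[List[Pair], List[Pair]]:
    """Return (encoder, decoder) lists of (in_features, out_features) pairs.

    Hypothetical helper: the exact wiring inside LSTMAE is an assumption.
    """
    hidden = list(hidden_neurons)
    # Assumed encoder path: input_dim -> hidden[0] -> ... -> hidden[-1] -> latent_dim
    enc_sizes = [input_dim] + hidden + [latent_dim]
    # Assumed decoder path: latent_dim -> reversed(hidden) -> input_dim
    dec_sizes = [latent_dim] + hidden[::-1] + [input_dim]
    encoder = list(zip(enc_sizes[:-1], enc_sizes[1:]))
    decoder = list(zip(dec_sizes[:-1], dec_sizes[1:]))
    return encoder, decoder


encoder, decoder = layer_plan(input_dim=40, latent_dim=8, hidden_neurons=[128, 64])
print(encoder)  # [(40, 128), (128, 64), (64, 8)]
print(decoder)  # [(8, 64), (64, 128), (128, 40)]
```

Under these assumptions, adding an entry to hidden_neurons adds one block to the encoder and one layer to the decoder.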
- forward(x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]
- Parameters
x (torch.Tensor) – Tensor of shape (B, N, D): a batch of B sequences, each of length N, with D feature dimensions.
- Returns
torch.Tensor – The latent embedding of size (B, latent_dim).
torch.Tensor – The predicted future time step of size (B, D).
- mse_loss(y_true: torch.Tensor, y_pred: torch.Tensor, reduction: str = 'mean') -> torch.Tensor
Compute the MSE loss between y_true and y_pred.
- Parameters
y_true (torch.Tensor) – The true data.
y_pred (torch.Tensor) – The prediction.
reduction (str, default="mean") – The reduction strategy for the F.mse_loss function: one of "none", "mean", or "sum".
- Returns
torch.Tensor – The MSE loss between y_true and y_pred.
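To make the effect of reduction concrete, here is a dependency-free sketch of the three modes that F.mse_loss accepts. The function below is a plain-Python stand-in operating on flat lists, not the mdlearn method itself, and is meant only to show what each mode returns.

```python
from typing import List, Union


def mse_loss(
    y_true: List[float], y_pred: List[float], reduction: str = "mean"
) -> Union[float, List[float]]:
    """Plain-Python stand-in illustrating the reduction modes of F.mse_loss."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Element-wise squared error
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    if reduction == "none":
        return squared  # one value per element
    if reduction == "sum":
        return sum(squared)
    if reduction == "mean":
        return sum(squared) / len(squared)
    raise ValueError(f"unknown reduction: {reduction!r}")


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.5, 2.0, 4.0]
print(mse_loss(y_true, y_pred, "none"))  # [0.0, 0.25, 1.0, 0.0]
print(mse_loss(y_true, y_pred, "sum"))   # 1.25
print(mse_loss(y_true, y_pred))          # 0.3125
```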
- class mdlearn.nn.models.ae.lstm.LSTMAETrainer(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = 42, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: Dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: Optional[str] = None, scheduler_hparams: Dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: Optional[str] = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)
Trainer class to fit an LSTM model to a time series of feature vectors.
- __init__(input_dim: int, latent_dim: int = 8, hidden_neurons: List[int] = [128], lstm_bias: bool = True, dropout: float = 0.0, relu_slope: float = 0.0, inplace_activation: bool = False, dense_bias: bool = True, window_size: int = 10, horizon: int = 1, seed: int = 42, in_gpu_memory: bool = False, num_data_workers: int = 0, prefetch_factor: int = 2, split_pct: float = 0.8, split_method: str = 'partition', batch_size: int = 128, shuffle: bool = True, device: str = 'cpu', optimizer_name: str = 'RMSprop', optimizer_hparams: Dict[str, Any] = {'lr': 0.001, 'weight_decay': 1e-05}, scheduler_name: Optional[str] = None, scheduler_hparams: Dict[str, Any] = {}, epochs: int = 100, verbose: bool = False, clip_grad_max_norm: float = 10.0, checkpoint_log_every: int = 10, plot_log_every: int = 10, plot_n_samples: int = 10000, plot_method: Optional[str] = 'TSNE', train_subsample_pct: float = 1.0, valid_subsample_pct: float = 1.0, use_wandb: bool = False)
- Parameters
input_dim (int) – The number of expected features in the input x.
latent_dim (int, default=8) – Dimension of the latent space.
hidden_neurons (List[int], default=[128]) – The dimension of the hidden states for each LSTM block in the stacked LSTM encoder. This list defines how deep the encoder is, i.e., how many LSTM blocks to use. The reverse of this list also defines the shape of the DenseNet decoder.
lstm_bias (bool, default=True) – If False, then the stacked LSTM encoder does not use bias weights b_ih and b_hh.
dropout (float, default=0.0) – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout.
relu_slope (float, default=0.0) – If greater than 0.0, will use LeakyReLU activation in the DenseNet decoder with negative_slope set to relu_slope.
inplace_activation (bool, default=False) – Sets the inplace option for the activation function in the DenseNet decoder.
dense_bias (bool, default=True) – If False, then the DenseNet decoder does not use bias.
window_size (int, default=10) – Number of timesteps considered for prediction.
horizon (int, default=1) – How many time steps to predict ahead.
seed (int, default=42) – Random seed for torch, numpy, and random module.
in_gpu_memory (bool, default=False) – If True, will pre-load the entire data array to GPU memory.
num_data_workers (int, default=0) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.
prefetch_factor (int, default=2) – Number of batches loaded in advance by each worker. 2 means there will be a total of 2 * num_data_workers batches prefetched across all workers.
split_pct (float, default=0.8) – Proportion of data set to use for training. The rest goes to validation.
split_method (str, default="partition") – Method to split the data. Use "random" for a random split or "partition" for a simple partition.
batch_size (int, default=128) – Mini-batch size for training.
shuffle (bool, default=True) – Whether to shuffle training data or not.
device (str, default="cpu") – Specify training hardware, either cpu or cuda for GPU devices.
optimizer_name (str, default="RMSprop") – Name of the PyTorch optimizer to use. Matches PyTorch optimizer class name.
optimizer_hparams (Dict[str, Any], default={"lr": 0.001, "weight_decay": 0.00001}) – Dictionary of hyperparameters to pass to the chosen PyTorch optimizer.
scheduler_name (Optional[str], default=None) – Name of the PyTorch learning rate scheduler to use. Matches PyTorch scheduler class name.
scheduler_hparams (Dict[str, Any], default={}) – Dictionary of hyperparameters to pass to the chosen PyTorch learning rate scheduler.
epochs (int, default=100) – Number of epochs to train for.
verbose (bool, default=False) – If True, will print training and validation loss at each epoch.
clip_grad_max_norm (float, default=10.0) – Max norm of the gradients for gradient clipping. For more information, see the torch.nn.utils.clip_grad_norm_ documentation.
checkpoint_log_every (int, default=10) – Epoch interval to log a checkpoint file containing the model weights, optimizer, and scheduler parameters.
plot_log_every (int, default=10) – Epoch interval to log a visualization plot of the latent space.
plot_n_samples (int, default=10000) – Number of validation samples to use for plotting.
plot_method (Optional[str], default="TSNE") – The method for visualizing the latent space. To skip visualization, set plot_method=None. If using "TSNE", it will attempt to use the RAPIDS.ai GPU implementation and will fall back to the sklearn CPU implementation if RAPIDS.ai is unavailable.
train_subsample_pct (float, default=1.0) – Fraction of training data (between 0 and 1) to use during hyperparameter sweeps.
valid_subsample_pct (float, default=1.0) – Fraction of validation data (between 0 and 1) to use during hyperparameter sweeps.
use_wandb (bool, default=False) – If True, will log results to wandb.
- Raises
ValueError – split_pct should be between 0 and 1.
ValueError – train_subsample_pct should be between 0 and 1.
ValueError – valid_subsample_pct should be between 0 and 1.
ValueError – Specified device as cuda, but it is unavailable.
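The window_size and horizon parameters describe a sliding-window forecasting setup: each training example is a run of window_size consecutive feature vectors, and the target is the vector horizon steps after the end of that run. The sketch below is a minimal illustration with a hypothetical make_windows helper; the exact indexing used inside the trainer is an assumption.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def make_windows(
    series: Sequence[Vector], window_size: int = 10, horizon: int = 1
) -> Tuple[List[List[Vector]], List[Vector]]:
    """Build (inputs, targets) pairs from an (N, D) series.

    Hypothetical helper; assumes the target is the single vector that lies
    `horizon` steps after the last timestep of each window.
    """
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be >= 1")
    inputs: List[List[Vector]] = []
    targets: List[Vector] = []
    # The last valid start leaves room for the full window plus the horizon
    for start in range(len(series) - window_size - horizon + 1):
        stop = start + window_size
        inputs.append(list(series[start:stop]))
        targets.append(series[stop + horizon - 1])
    return inputs, targets


# A scalar time series reshaped to (N, 1), here with N = 6
series = [[float(t)] for t in range(6)]
inputs, targets = make_windows(series, window_size=3, horizon=1)
print(len(inputs))  # 3
print(inputs[0])    # [[0.0], [1.0], [2.0]]
print(targets[0])   # [3.0]
```

Under this indexing, a series of N vectors yields N - window_size - horizon + 1 examples, so larger windows or horizons leave fewer examples to train on.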
- fit(X: numpy.ndarray, scalars: Dict[str, numpy.ndarray] = {}, output_path: Union[str, pathlib.Path] = './', checkpoint: Optional[Union[str, pathlib.Path]] = None)
Trains the LSTMAE on the input data X.
- Parameters
X (np.ndarray) – Input feature vectors of shape (N, D) where N is the number of data examples, and D is the dimension of the feature vector.
scalars (Dict[str, np.ndarray], default={}) – Dictionary of scalar arrays. For instance, the root mean squared deviation (RMSD) for each feature vector can be passed via {"rmsd": np.array(...)}. The dimension of each scalar array should match the number of input feature vectors N.
output_path (PathLike, default="./") – Path to write training results to. Makes an output_path/checkpoints folder to save model checkpoint files, and an output_path/plots folder to store latent space visualizations.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file to restore training.
- Raises
ValueError – If X does not have two dimensions. For scalar time series, please reshape to (N, 1).
TypeError – If scalars is not type dict. A common error is to pass output_path as the second argument.
NotImplementedError – If using a learning rate scheduler other than ReduceLROnPlateau, a step function will need to be implemented.
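The trainer's split_pct and split_method settings determine which examples from X are used for training and which for validation. The following sketch shows the difference between the two documented methods using a hypothetical split_indices helper; whether the trainer splits raw rows or windowed examples, and whether the bounds on split_pct are exclusive, are assumptions here.

```python
import random
from typing import List, Tuple


def split_indices(
    n: int, split_pct: float = 0.8, split_method: str = "partition", seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train, valid) index lists for n examples.

    Hypothetical helper, not the mdlearn implementation.
    """
    # Bounds are treated as exclusive here, which is an assumption
    if not 0.0 < split_pct < 1.0:
        raise ValueError("split_pct should be between 0 and 1.")
    indices = list(range(n))
    if split_method == "random":
        # Shuffle with a fixed seed so the split is reproducible
        random.Random(seed).shuffle(indices)
    elif split_method != "partition":
        raise ValueError(f"unknown split_method: {split_method!r}")
    n_train = int(n * split_pct)
    return indices[:n_train], indices[n_train:]


train, valid = split_indices(10, split_pct=0.8, split_method="partition")
print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(valid)  # [8, 9]

train_r, valid_r = split_indices(10, split_pct=0.8, split_method="random")
print(len(train_r), len(valid_r))  # 8 2
```

For time series, a partition split keeps the validation examples strictly after the training examples in time, whereas a random split interleaves them.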
- predict(X: numpy.ndarray, inference_batch_size: int = 512, checkpoint: Optional[Union[str, pathlib.Path]] = None) -> Tuple[numpy.ndarray, numpy.ndarray, float]
Predict using the LSTMAE.
- Parameters
X (np.ndarray) – The input data to predict on.
inference_batch_size (int, default=512) – The batch size for inference.
checkpoint (Optional[PathLike], default=None) – Path to a specific model checkpoint file.
- Returns
np.ndarray – The predictions.
np.ndarray – The latent embeddings.
float – The average MSE loss.