mdlearn.visualize

Functions to visualize modeling results.

Functions

log_latent_visualization(data, colors, ...)

Make scatter plots of the latent space using the specified method of dimensionality reduction.

plot_scatter(data[, color_dict, color])

mdlearn.visualize.log_latent_visualization(data: numpy.ndarray, colors: Dict[str, numpy.ndarray], output_path: Union[str, pathlib.Path], epoch: int = 0, n_samples: Optional[int] = None, method: str = 'raw') → Dict[str, str]

Make scatter plots of the latent space using the specified method of dimensionality reduction.

Parameters
  • data (np.ndarray) – The latent embeddings to visualize, of shape (N, D), where N is the number of examples and D is the number of dimensions.

  • colors (Dict[str, np.ndarray]) – Each item in the dictionary generates a separate plot labeled with the key name. Each array should have length N.

  • output_path (PathLike) – The output directory path to save plots to.

  • epoch (int, default=0) – The current epoch of training to label plots with.

  • n_samples (Optional[int], default=None) – Number of samples to plot. If n_samples < N, a random sample of the data is plotted. If n_samples is None, all the data is used.

  • method (str, default="raw") – Method of dimensionality reduction applied before plotting. Currently supports "PCA", "TSNE", "LLE", or "raw" for plotting the raw embeddings (only the first 3 dimensions if D > 3). If "TSNE" is specified, the GPU-accelerated RAPIDS.ai implementation is tried first; if it is unavailable, the sklearn version is used instead.
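The n_samples and "raw" rules described above can be illustrated with a small NumPy sketch. This is not mdlearn's implementation: preview_selection is a hypothetical helper, and sampling without replacement with a fixed seed is an assumption made only for illustration.

```python
from typing import Optional

import numpy as np


def preview_selection(
    data: np.ndarray, n_samples: Optional[int] = None, seed: int = 0
) -> np.ndarray:
    """Hypothetical helper mirroring the documented rules (not mdlearn code)."""
    n = data.shape[0]
    # Take a random subset of rows only when n_samples < N.
    if n_samples is not None and n_samples < n:
        rng = np.random.default_rng(seed)
        # Assumption: rows are drawn without replacement.
        idx = rng.choice(n, size=n_samples, replace=False)
        data = data[idx]
    # "raw" method: keep at most the first 3 dimensions.
    return data[:, :3]


embeddings = np.zeros((10, 5))  # N = 10 examples, D = 5 dimensions
subset = preview_selection(embeddings, n_samples=4)
print(subset.shape)
```

With n_samples=None, or n_samples greater than or equal to N, every row is kept and only the column truncation applies.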

Returns

Dict[str, str] – A dictionary mapping each key in colors to a raw HTML string containing the scatter plot data. These strings can be saved directly for visualization and logged to wandb during training.
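Because the return value is a plain mapping from names to HTML strings, saving it needs only the standard library. The sketch below is one possible way to do this; save_html_plots and the file naming pattern are hypothetical and not part of mdlearn.

```python
from pathlib import Path
from typing import Dict, List, Union


def save_html_plots(
    html_strings: Dict[str, str], out_dir: Union[str, Path], epoch: int = 0
) -> List[Path]:
    """Write each HTML string to its own file and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, html in html_strings.items():
        # Hypothetical naming scheme: <key>-epoch-<epoch>.html
        path = out_dir / f"{name}-epoch-{epoch}.html"
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written
```

Each written file can then be opened in a browser to inspect the corresponding plot.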

Raises

ValueError – If dimensionality reduction method is not supported.
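A call following the signature above might look like the sketch below. It assumes mdlearn and the backend for the chosen method are installed. The synthetic data, the color keys ("index", "norm"), and the helper names make_example_inputs and visualize are illustrative assumptions, and creating the output directory beforehand is a precaution rather than a documented requirement.

```python
from pathlib import Path

import numpy as np


def make_example_inputs(n: int = 200, d: int = 8, seed: int = 0):
    """Build synthetic embeddings of shape (N, D) and per-example color arrays."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n, d)).astype(np.float32)
    # One plot is generated per key; every array has length N.
    colors = {"index": np.arange(n), "norm": np.linalg.norm(data, axis=1)}
    return data, colors


def visualize(output_path: str = "plots", method: str = "PCA"):
    # Imported lazily so the helper above works without mdlearn installed.
    from mdlearn.visualize import log_latent_visualization

    Path(output_path).mkdir(parents=True, exist_ok=True)
    data, colors = make_example_inputs()
    try:
        return log_latent_visualization(
            data, colors, output_path, epoch=1, n_samples=100, method=method
        )
    except ValueError as err:
        # Raised when the method name is not supported.
        print(f"Unsupported method {method!r}: {err}")
        return {}
```

Catching ValueError here keeps a training loop running if a method name is mistyped; re-raising may be preferable when a missing plot should be treated as a hard failure.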

mdlearn.visualize.plot_scatter(data: numpy.ndarray, color_dict: Dict[str, numpy.ndarray] = {}, color: Optional[str] = None) → plotly.graph_objects._figure.Figure