madnis.nn package
This module contains functions and classes that implement the different types of neural network architectures necessary for (multi-channel) neural importance sampling.
- class madnis.nn.DiscreteMADE(dims_in, dims_c=0, channels=None, prior_prob_function=None, channel_remap_function=None, **mlp_kwargs)
Bases:
Module, Distribution
- class madnis.nn.DiscreteTransformer(dims_in, dims_c=0, prior_prob_function=None, embedding_dim=64, feedforward_dim=256, heads=8, transformer_layers=3, mlp_layers=3, mlp_units=256, mlp_activation=<class 'torch.nn.modules.activation.ReLU'>)
Bases:
Module, Distribution
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- log_prob(x, c=None, channel=None)
Computes the log-probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
- Return type:
Tensor
- Returns:
log-probabilities with shape (n, )
- sample(n=None, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None)
Draws samples following the distribution.
- Parameters:
n (Optional[int]) – number of samples
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.
- Return type:
Tensor | tuple[Tensor, ...]
- Returns:
samples with shape (n, dims_in). Depending on the arguments return_log_prob and return_prob, this function also returns the log-probabilities with shape (n, ) and the probabilities with shape (n, ).
- class madnis.nn.Distribution(*args, **kwargs)
Bases:
Protocol
Protocol for a (potentially learnable) distribution that can be used for sampling and density estimation, like a normalizing flow.
- log_prob(x, c=None, channel=None)
Computes the log-probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
- Return type:
Tensor
- Returns:
log-probabilities with shape (n, )
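The count-based forms of the channel argument (list, int, None) can be illustrated with a small tensor-free sketch. The helper names `channel_counts` and `indices_from_counts` are hypothetical and not part of madnis. The sketch also assumes that samples are ordered by channel whenever counts are given, which the text above does not state explicitly.

```python
def channel_counts(channel, n, n_channels):
    """Turn a list/int/None channel spec into samples-per-channel counts.

    Hypothetical helper for illustration only; not the madnis implementation.
    """
    if channel is None:
        # All channels contain the same number of samples.
        if n % n_channels != 0:
            raise ValueError("n must be divisible by the number of channels")
        return [n // n_channels] * n_channels
    if isinstance(channel, int):
        # A single channel contains all the samples.
        if not 0 <= channel < n_channels:
            raise ValueError("channel index out of range")
        return [n if ch == channel else 0 for ch in range(n_channels)]
    # Otherwise: explicit list with the number of samples in each channel.
    counts = list(channel)
    if len(counts) != n_channels or sum(counts) != n:
        raise ValueError("counts must have one entry per channel and sum to n")
    return counts


def indices_from_counts(counts):
    """Per-sample channel indices (the analogue of the Tensor form),
    assuming the samples are sorted by channel."""
    return [ch for ch, count in enumerate(counts) for _ in range(count)]
```

For example, `channel_counts([1, 0, 3], 4, 3)` describes one sample in channel 0 and three in channel 2, which corresponds to the per-sample indices `[0, 2, 2, 2]`.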
- prob(x, c=None, channel=None)
Computes the probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
- Return type:
Tensor
- Returns:
probabilities with shape (n, )
- sample(n, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None)
Draws samples following the distribution.
- Parameters:
n (int) – number of samples
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.
- Return type:
Tensor | tuple[Tensor, ...]
- Returns:
samples with shape (n, dims_in). Depending on the arguments return_log_prob and return_prob, this function should also return the log-probabilities with shape (n, ) and the probabilities with shape (n, ).
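To make the shape of this protocol concrete, here is a toy, tensor-free analogue: a uniform distribution on the unit hypercube that exposes the same three methods. The class `UnitUniform` is hypothetical, uses nested Python lists in place of tensors, and ignores `c`, `channel`, `device` and `dtype`. The order of the returned tuple (samples, log-probabilities, probabilities) is an assumption based on the order in which the description lists them.

```python
import math
import random


class UnitUniform:
    """Toy stand-in for a Distribution: uniform density on [0, 1]^dims_in."""

    def __init__(self, dims_in, seed=None):
        self.dims_in = dims_in
        self._rng = random.Random(seed)

    def log_prob(self, x, c=None, channel=None):
        # Density is 1 inside the unit hypercube and 0 outside.
        return [
            0.0 if all(0.0 <= v <= 1.0 for v in row) else -math.inf
            for row in x
        ]

    def prob(self, x, c=None, channel=None):
        return [math.exp(lp) for lp in self.log_prob(x, c, channel)]

    def sample(self, n, c=None, channel=None, return_log_prob=False,
               return_prob=False, device=None, dtype=None):
        x = [[self._rng.random() for _ in range(self.dims_in)]
             for _ in range(n)]
        if not (return_log_prob or return_prob):
            return x
        # Assumed ordering: samples first, then the requested extras.
        out = [x]
        if return_log_prob:
            out.append(self.log_prob(x))
        if return_prob:
            out.append(self.prob(x))
        return tuple(out)
```

Calling `UnitUniform(3).sample(5, return_log_prob=True)` returns a pair of samples and log-probabilities, while `sample(5)` returns only the samples.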
- class madnis.nn.Flow(dims_in, dims_c=0, uniform_latent=True, permutations='log', condition_masks=None, blocks=None, subnet_constructor=None, layers=3, units=32, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>, channels=None, channel_remap_function=None, mapping=None, bins=10, spline_bounds=10.0, min_bin_width=0.001, min_bin_height=0.001, min_bin_derivative=0.001)
Bases:
Module, Distribution
Coupling-block based normalizing flow (arXiv:1605.08803) using rational quadratic spline transformations (arXiv:1906.04032). Both conditional and non-conditional flows are supported. The class can also build multi-channel flows, i.e. an efficient implementation of multiple independent flows with the same hyperparameters.
- Parameters:
dims_in (int) – input dimension
dims_c (int) – condition dimension
uniform_latent (bool) – If True, encode mapping from [0,1]^d to [0,1]^d and use a uniform latent space distribution. If False, encode mapping from R^d to R^d and use a Gaussian latent space distribution.
permutations (Literal['log', 'random', 'exchange']) – Defines the strategy to permute the input dimensions between coupling blocks. “log”: logarithmic decomposition, so that every dimension is conditioned on every other dimension at least once. “random”: randomly permute dimensions. “exchange”: condition the first half of the input on the second half, then the other way around, repeatedly.
condition_masks (Optional[Tensor]) – Overrides the permutation strategy with a custom conditioning mask with shape (blocks, dims_in). Components where the mask is True are used as condition, and components where it is False are transformed.
blocks (Optional[int]) – number of coupling blocks. Only needed if permutations is “random” or “exchange”.
subnet_constructor (Optional[Callable[[int, int], Module]]) – function used to construct the flow sub-networks, with the number of input features and output features of the subnet as arguments. If None, the MLP (single channel) or StackedMLP (multi-channel) classes are used.
layers (int) – number of subnet layers. Only relevant if subnet_constructor=None.
units (int) – number of subnet hidden nodes. Only relevant if subnet_constructor=None.
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function. Only relevant if subnet_constructor=None.
layer_constructor (Callable[[int, int], Module]) – function used to construct the subnet layers, given the number of input and output features. Only relevant if subnet_constructor=None.
channels (Optional[int]) – If None, build single-channel flow. If integer, build multi-channel flow with this number of channels.
channel_remap_function (Optional[Callable[[Tensor], Tensor]]) – TODO
mapping (Union[Callable[[Tensor, bool], tuple[Tensor, Tensor]], list[Callable[[Tensor, bool], tuple[Tensor, Tensor]]], None]) – Specifies a single mapping function or a list of mapping functions (one per channel) that are applied to the input before it enters the flow (forward direction) or after drawing samples using the flow (inverse direction). The arguments of the function are the input data and a boolean indicating whether the transformation is inverted. It must return the transformed value and the logarithm of the Jacobian determinant of the transformation.
bins (int) – number of RQ spline bins
spline_bounds (float) – If uniform_latent=False, the splines are defined on the interval [-spline_bounds, spline_bounds].
min_bin_width (float) – minimal width of the spline bins
min_bin_height (float) – minimal height of the spline bins
min_bin_derivative (float) – minimal derivative at the spline bin edges
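The “exchange” and “log” strategies can be pictured as boolean masks in the condition_masks convention (True means used as condition, False means transformed). The sketch below is an illustration under assumptions, not the madnis code: the function names are hypothetical, and the binary-digit construction for “log” is only one way to obtain a logarithmic number of blocks in which every dimension is conditioned on every other dimension at least once.

```python
def exchange_masks(dims_in, blocks):
    """Alternate between conditioning on the second half and the first half."""
    half = dims_in // 2
    # Block 0: the first half is transformed, conditioned on the second half.
    base = [i >= half for i in range(dims_in)]
    return [
        base if block % 2 == 0 else [not m for m in base]
        for block in range(blocks)
    ]


def log_masks(dims_in):
    """Masks built from the binary digits of the dimension index.

    Two distinct indices differ in at least one binary digit, so using each
    digit mask together with its complement conditions every dimension on
    every other one. This yields 2 * ceil(log2(dims_in)) blocks.
    """
    masks = []
    for bit in range((dims_in - 1).bit_length()):
        mask = [bool((i >> bit) & 1) for i in range(dims_in)]
        masks.append(mask)
        masks.append([not m for m in mask])
    return masks
```

For `dims_in=5` this sketch produces six masks, whereas covering all pairs with masks that transform a single dimension at a time would need five blocks that each transform very little.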
- init_with_grid(grid)
Initializes the flow using a VEGAS grid, i.e. from bins with varying width and equal probability. The number of bins of this grid should be larger than the number of RQ spline bins. This function then performs the bin reduction algorithm as described in arXiv:2311.01548.
- Parameters:
grid (Tensor) – edges of the VEGAS grid bins with shape (dims_in, vegas_bins+1) for single-channel flows or (channels, dims_in, vegas_bins+1) for multi-channel flows
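A short one-dimensional sketch can show what “bins with varying width and equal probability” implies and why a reduction step is needed. Both functions are hypothetical. In particular, `merge_bins` is a deliberately naive reduction that keeps every k-th edge; it is not the bin reduction algorithm of arXiv:2311.01548 and only works when the number of VEGAS bins is a multiple of the target number of bins.

```python
def grid_density(edges):
    """Piecewise-constant density implied by equal-probability bins.

    Each bin carries probability 1 / n_bins, so narrow bins mark regions
    of high density.
    """
    n_bins = len(edges) - 1
    return [1.0 / (n_bins * (hi - lo)) for lo, hi in zip(edges, edges[1:])]


def merge_bins(edges, target_bins):
    """Naive reduction: merge groups of adjacent bins of equal count.

    The merged bins still have equal probability, but fine structure inside
    each group is lost.
    """
    n_bins = len(edges) - 1
    if n_bins % target_bins != 0:
        raise ValueError("vegas_bins must be a multiple of target_bins")
    step = n_bins // target_bins
    return edges[::step]
```

With eight VEGAS bins and four spline bins, every second edge is kept, so each remaining bin carries probability 1/4.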
- log_prob(x, c=None, channel=None, return_latent=False)
Computes the log-probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_latent (bool) – if True, also return the latent space vector
- Return type:
Tensor | tuple[Tensor, Tensor]
- Returns:
log-probabilities with shape (n, ). If return_latent is True, it also returns the latent space vector with shape (n, dims_in).
- prob(x, c=None, channel=None, return_latent=False)
Computes the probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_latent (bool) – if True, also return the latent space vector
- Return type:
Tensor | tuple[Tensor, Tensor]
- Returns:
probabilities with shape (n, ). If return_latent is True, it also returns the latent space vector with shape (n, dims_in).
- sample(n=None, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None, return_latent=False)
Draws samples from the probability distribution encoded by the flow.
- Parameters:
n (Optional[int]) – number of samples. Only required if no condition is given.
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.
return_latent (bool) – if True, also return the latent space vector
- Return type:
Tensor | tuple[Tensor, ...]
- Returns:
samples with shape (n, dims_in). Depending on the arguments return_log_prob, return_prob and return_latent, this function will also return the log-probabilities with shape (n, ), the probabilities with shape (n, ) and the latent space vector with shape (n, dims_in).
- transform(x, c=None, channel=None, inverse=False)
Transforms the input data into the latent space or back.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
inverse (bool) – if True, use inverted transformation (i.e. the sampling direction)
- Return type:
tuple[Tensor, Tensor]
- Returns:
tuple containing the transformed values with shape (n, dims_in), and log Jacobian determinants with shape (n, ) of the transformation
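The pair returned by a transformation (value and log Jacobian determinant) is what the change-of-variables formula needs to turn a latent density into a density on the data. The following scalar sketch uses a hypothetical toy bijection of (0, 1], `power_transform`, with the same `(input, inverse) -> (output, log-Jacobian)` shape. It assumes a uniform latent distribution and is not taken from the madnis source.

```python
import math


def power_transform(x, inverse=False):
    """Toy bijection on (0, 1]: z = x**2 forward, x = sqrt(z) inverse.

    Returns the transformed value and log|d(output)/d(input)|.
    """
    if inverse:
        out = math.sqrt(x)
        # d sqrt(z) / dz = 1 / (2 sqrt(z))
        return out, -math.log(2.0 * out)
    # d x**2 / dx = 2 x
    return x * x, math.log(2.0 * x)


def log_prob(x):
    """Change of variables: log p(x) = log p_latent(z) + log|dz/dx|."""
    z, log_jac = power_transform(x)
    log_latent = 0.0  # uniform latent density on [0, 1]
    return log_latent + log_jac
```

Applying the forward and then the inverse direction returns the original point, and the two log-Jacobians cancel, which is a useful sanity check for any custom mapping function.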
- class madnis.nn.MLP(features_in, features_out, layers=3, units=32, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>)
Bases:
Module
Class implementing a standard fully-connected network.
- Parameters:
features_in (int) – number of input features
features_out (int) – number of output features
layers (int) – number of layers
units (int) – number of hidden nodes
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function
layer_constructor – function used to construct the network layers, given the number of input and output features
- class madnis.nn.MaskedMLP(input_dims, output_dims, layers=3, nodes_per_feature=8, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, channels=None)
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, channel=None)
Define the computation performed at every call.
Should be overridden by all subclasses.
- Return type:
Tensor
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class madnis.nn.MixedFlow(dims_in_continuous, dims_in_discrete, dims_c=0, discrete_dims_position='first', discrete_model='made', channels=None, continuous_kwargs={}, discrete_kwargs={})
Bases:
Module, Distribution
- log_prob(x, c=None, channel=None)
Computes the log-probabilities of the input data.
- Parameters:
x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
- Return type:
Tensor
- Returns:
log-probabilities with shape (n, )
- class madnis.nn.StackedMLP(features_in, features_out, channels, layers, units, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>)
Bases:
Module
Builds multiple independent MLPs that can be efficiently evaluated in parallel.
- Parameters:
features_in (int) – number of input features
features_out (int) – number of output features
channels (int) – number of channels
layers (int) – number of layers
units (int) – number of hidden nodes
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function
layer_constructor – function used to construct the network layers, given the number of input and output features
- forward(x, channel=None)
Evaluates the network.
- Parameters:
x (Tensor) – network input, shape (n, features_in)
channel (Union[list[int], int, None]) – encodes the channel of the samples. It must have one of the following types:
list: list of integers, specifying the number of samples in each channel;
int: integer specifying a single channel containing all the samples;
None: all channels contain the same number of samples.
- Return type:
Tensor
- Returns:
network output, shape (n, features_out)
- madnis.nn.rational_quadratic_spline(inputs, unnormalized_widths, unnormalized_heights, unnormalized_derivatives, inverse=False, left=0.0, right=1.0, bottom=0.0, top=1.0, min_bin_width=0.001, min_bin_height=0.001, min_derivative=0.001)
Constrained rational quadratic spline transformations as introduced in arXiv:1906.04032. The input points have to be within the spline boundaries.
- Parameters:
inputs (Tensor) – input tensor, shape (…, )
unnormalized_widths (Tensor) – unnormalized spline bin widths, shape (…, n_bins)
unnormalized_heights (Tensor) – unnormalized spline bin heights, shape (…, n_bins)
unnormalized_derivatives (Tensor) – unnormalized derivatives at spline bin edges, shape (…, n_bins + 1)
inverse (bool) – if True, perform inverse transformation
left (float) – lower bound of inputs
right (float) – upper bound of inputs
bottom (float) – lower bound of outputs
top (float) – upper bound of outputs
min_bin_width (float) – minimal bin width
min_bin_height (float) – minimal bin height
min_derivative (float) – minimal derivative at bin edges
- Return type:
tuple[Tensor, Tensor]
- Returns:
tuple containing the output tensor and the log-Jacobian determinants of the transformation, both with shape (…, )
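For readers who want to see the spline formula itself, the following scalar, forward-only sketch evaluates a rational quadratic spline following the expressions in arXiv:1906.04032. It is not the madnis implementation. The function name `rq_spline_scalar` is hypothetical, and the normalization of the unnormalized parameters (softmax for widths and heights, softplus for derivatives, each with a minimum added) is an assumption borrowed from common implementations.

```python
import math
from bisect import bisect_right


def _softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def _softplus(v):
    # Numerically stable log(1 + exp(v)).
    return math.log1p(math.exp(-abs(v))) + max(v, 0.0)


def _knots(unnormalized, lo, hi, min_size):
    """Knot positions from unnormalized bin sizes (assumed normalization)."""
    n = len(unnormalized)
    sizes = [min_size + (1.0 - min_size * n) * s for s in _softmax(unnormalized)]
    knots = [lo]
    for s in sizes:
        knots.append(knots[-1] + (hi - lo) * s)
    knots[-1] = hi  # remove accumulated rounding error at the upper edge
    return knots


def rq_spline_scalar(x, unnorm_w, unnorm_h, unnorm_d,
                     left=0.0, right=1.0, bottom=0.0, top=1.0,
                     min_bin_width=1e-3, min_bin_height=1e-3,
                     min_derivative=1e-3):
    """Forward RQ spline for one point; returns (y, log dy/dx)."""
    if not left <= x <= right:
        raise ValueError("input outside the spline boundaries")
    xs = _knots(unnorm_w, left, right, min_bin_width)
    ys = _knots(unnorm_h, bottom, top, min_bin_height)
    ds = [min_derivative + _softplus(d) for d in unnorm_d]

    k = min(bisect_right(xs, x) - 1, len(xs) - 2)  # index of the bin containing x
    w = xs[k + 1] - xs[k]
    h = ys[k + 1] - ys[k]
    s = h / w                       # slope of the bin's secant
    xi = (x - xs[k]) / w            # relative position inside the bin
    xi1 = xi * (1.0 - xi)
    denom = s + (ds[k + 1] + ds[k] - 2.0 * s) * xi1
    y = ys[k] + h * (s * xi * xi + ds[k] * xi1) / denom
    dydx = s * s * (ds[k + 1] * xi * xi + 2.0 * s * xi1
                    + ds[k] * (1.0 - xi) ** 2) / (denom * denom)
    return y, math.log(dydx)
```

With equal bin sizes and all knot derivatives equal to one, the spline reduces to the identity map, which is a convenient check. Because every bin has a positive secant slope and positive knot derivatives, the map is strictly increasing and therefore invertible.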
- madnis.nn.unconstrained_rational_quadratic_spline(inputs, unnormalized_widths, unnormalized_heights, unnormalized_derivatives, inverse=False, left=0.0, right=1.0, bottom=0.0, top=1.0, min_bin_width=0.001, min_bin_height=0.001, min_derivative=0.001)
Unconstrained rational quadratic spline transformations as introduced in arXiv:1906.04032. Points outside the bounds are mapped onto themselves.
- Parameters:
inputs (Tensor) – input tensor, shape (…, )
unnormalized_widths (Tensor) – unnormalized spline bin widths, shape (…, n_bins)
unnormalized_heights (Tensor) – unnormalized spline bin heights, shape (…, n_bins)
unnormalized_derivatives (Tensor) – unnormalized derivatives at spline bin edges, shape (…, n_bins + 1)
inverse (bool) – if True, perform inverse transformation
left (float) – lower bound of inputs
right (float) – upper bound of inputs
bottom (float) – lower bound of outputs
top (float) – upper bound of outputs
min_bin_width (float) – minimal bin width
min_bin_height (float) – minimal bin height
min_derivative (float) – minimal derivative at bin edges
- Return type:
tuple[Tensor, Tensor]
- Returns:
tuple containing the output tensor and the log-Jacobian determinants of the transformation, both with shape (…, )
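The difference from the constrained variant can be sketched as a thin wrapper: points outside the interval pass through unchanged with a log-Jacobian of zero, and points inside are handed to the spline. The names `unconstrained` and `toy_spline` are hypothetical, and `toy_spline` is only a stand-in for a real spline evaluation. The sketch assumes that the input and output intervals coincide, since the identity tails only join the spline continuously in that case.

```python
import math


def toy_spline(x):
    """Hypothetical stand-in for a constrained spline on (0, 1]: y = x**2."""
    return x * x, math.log(2.0 * x)


def unconstrained(spline, x, left=0.0, right=1.0):
    """Apply `spline` inside [left, right] and the identity outside."""
    if x < left or x > right:
        # Identity map: the derivative is 1, so the log-Jacobian is 0.
        return x, 0.0
    return spline(x)
```

Implementations of this idea commonly also fix the spline derivative at the two outermost knots to one, so that the transformation has a continuous derivative where the identity tails begin; the toy function above does not have that property.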