madnis.nn package

This module contains the functions and classes that implement the neural network architectures needed for (multi-channel) neural importance sampling.

class madnis.nn.DiscreteMADE(dims_in, dims_c=0, channels=None, prior_prob_function=None, channel_remap_function=None, **mlp_kwargs)

Bases: Module, Distribution

class madnis.nn.DiscreteTransformer(dims_in, dims_c=0, prior_prob_function=None, embedding_dim=64, feedforward_dim=256, heads=8, transformer_layers=3, mlp_layers=3, mlp_units=256, mlp_activation=<class 'torch.nn.modules.activation.ReLU'>)

Bases: Module, Distribution

Initialize internal Module state, shared by both nn.Module and ScriptModule.

log_prob(x, c=None, channel=None)

Computes the log-probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.

Return type:

Tensor

Returns:

log-probabilities with shape (n, )

sample(n=None, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None)

Draws samples from the distribution.

Parameters:

n (Optional[int]) – number of samples
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.

Return type:

Tensor | tuple[Tensor, ...]

Returns:

samples with shape (n, dims_in). Depending on the arguments return_log_prob and return_prob, this function also returns the log-probabilities with shape (n, ) and the probabilities with shape (n, ).

class madnis.nn.Distribution(*args, **kwargs)

Bases: Protocol

Protocol for a (potentially learnable) distribution that can be used for sampling and density estimation, like a normalizing flow.

log_prob(x, c=None, channel=None)

Computes the log-probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.

Return type:

Tensor

Returns:

log-probabilities with shape (n, )

prob(x, c=None, channel=None)

Computes the probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.

Return type:

Tensor

Returns:

probabilities with shape (n, )

sample(n, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None)

Draws samples from the distribution.

Parameters:

n (int) – number of samples
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.

Return type:

Tensor | tuple[Tensor, ...]

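Returns:

samples with shape (n, dims_in). Depending on the arguments return_log_prob and return_prob, this function also returns the log-probabilities and the probabilities, each with shape (n, ).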
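The following is a minimal sketch of a class that provides the three methods of this protocol, using a fixed uniform density on a box. The class name `UniformBox` and its `low`/`high` arguments are hypothetical and not part of madnis; the order of the extra return values (log-probabilities before probabilities) is also an assumption.

```python
import math

import torch


class UniformBox:
    """Uniform density on [low, high]^dims_in (hypothetical example class)."""

    def __init__(self, dims_in, low=0.0, high=1.0):
        self.dims_in = dims_in
        self.low = low
        self.high = high
        # log of the constant density 1 / (high - low)^dims_in
        self.log_density = -dims_in * math.log(high - low)

    def log_prob(self, x, c=None, channel=None):
        # c and channel are accepted for interface compatibility but ignored
        inside = ((x >= self.low) & (x <= self.high)).all(dim=1)
        log_p = torch.full_like(x[:, 0], self.log_density)
        return torch.where(inside, log_p, torch.full_like(log_p, -math.inf))

    def prob(self, x, c=None, channel=None):
        return self.log_prob(x, c, channel).exp()

    def sample(self, n, c=None, channel=None, return_log_prob=False,
               return_prob=False, device=None, dtype=None):
        u = torch.rand(n, self.dims_in, device=device, dtype=dtype)
        x = self.low + (self.high - self.low) * u
        extras = []
        if return_log_prob:
            extras.append(self.log_prob(x))
        if return_prob:
            extras.append(self.prob(x))
        # a bare tensor if nothing extra was requested, otherwise a tuple
        return (x, *extras) if extras else x
```

Because Distribution is a Protocol, a class like this does not need to inherit from it; providing methods with matching signatures is enough for structural typing.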

class madnis.nn.Flow(dims_in, dims_c=0, uniform_latent=True, permutations='log', condition_masks=None, blocks=None, subnet_constructor=None, layers=3, units=32, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>, channels=None, channel_remap_function=None, mapping=None, bins=10, spline_bounds=10.0, min_bin_width=0.001, min_bin_height=0.001, min_bin_derivative=0.001)

Bases: Module, Distribution

Coupling-block based normalizing flow (1605.08803) using rational quadratic spline transformations (1906.04032). Both conditional and non-conditional flows are supported. The class also supports multi-channel flows, i.e. an efficient implementation of multiple independent flows with the same hyperparameters.

Parameters:

dims_in (int) – input dimension
dims_c (int) – condition dimension
uniform_latent (bool) – If True, encode mapping from [0,1]^d to [0,1]^d and use a uniform latent space distribution. If False, encode mapping from R^d to R^d and use Gaussian latent space distribution.
permutations (Literal['log', 'random', 'exchange']) – Defines the strategy to permute the input dimensions between coupling blocks. “log”: logarithmic decomposition, so that every dimension is conditioned on every other dimension at least once. “random”: randomly permute dimensions. “exchange”: condition the first half of the input on the second half, then the other way around, repeatedly.
condition_masks (Optional[Tensor]) – Overwrites the permutation strategy with a custom conditioning mask with shape (blocks, dims_in). Components where the mask is True are used as condition, and components where it is False are transformed.
blocks (Optional[int]) – number of coupling blocks. Only needed if permutations is “random” or “exchange”
subnet_constructor (Optional[Callable[[int, int], Module]]) – function used to construct the flow sub-networks, with the number of input features and output features of the subnet as arguments. If None, the MLP (single channel) or StackedMLP (multi-channel) classes are used.
layers (int) – number of subnet layers. Only relevant if subnet_constructor=None.
units (int) – number of subnet hidden nodes. Only relevant if subnet_constructor=None.
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function. Only relevant if subnet_constructor=None.
layer_constructor (Callable[[int, int], Module]) – function used to construct the subnet layers, given the number of input and output features. Only relevant if subnet_constructor=None.
channels (Optional[int]) – If None, build single-channel flow. If integer, build multi-channel flow with this number of channels.
channel_remap_function (Optional[Callable[[Tensor], Tensor]]) – TODO
mapping (Union[Callable[[Tensor, bool], tuple[Tensor, Tensor]], list[Callable[[Tensor, bool], tuple[Tensor, Tensor]]], None]) – Specifies a single mapping function or a list of mapping functions (one per channel) that are applied to the input before it enters the flow (forward direction) or after drawing samples using the flow (inverse direction). The arguments of the function are the input data and a boolean indicating whether the transformation is inverted. It must return the transformed value and the logarithm of the Jacobian determinant of the transformation.
bins (int) – number of RQ spline bins
spline_bounds (float) – If uniform_latent=False, the splines are defined on the interval [-spline_bounds, spline_bounds].
min_bin_width (float) – minimal width of the spline bins
min_bin_height (float) – minimal height of the spline bins
min_bin_derivative (float) – minimal derivative at the spline bin edges
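As an illustration of the mask convention used by condition_masks (True marks a condition component, False a transformed one), the sketch below builds masks for an "exchange"-style and a "log"-style strategy in plain Python. Both helper functions are hypothetical. The binary-digit construction is only one way to ensure every dimension is conditioned on every other one; this reference does not state how madnis builds its masks, so the actual block order and count may differ.

```python
def exchange_masks(dims_in, blocks):
    """Alternate between conditioning on the second and the first half."""
    half = dims_in // 2
    second_half = [i >= half for i in range(dims_in)]
    masks = []
    for b in range(blocks):
        if b % 2 == 0:
            # first half is transformed, second half is the condition
            masks.append(list(second_half))
        else:
            masks.append([not m for m in second_half])
    return masks


def binary_masks(dims_in):
    """Two masks per binary digit of the dimension index (assumed scheme)."""
    n_bits = max(1, (dims_in - 1).bit_length())
    masks = []
    for k in range(n_bits):
        bit_set = [bool((i >> k) & 1) for i in range(dims_in)]
        masks.append(bit_set)
        # the inverted mask covers the opposite conditioning direction
        masks.append([not m for m in bit_set])
    return masks
```

Two distinct indices always differ in at least one binary digit, so the second scheme needs only about 2 * log2(dims_in) blocks. A nested list like this can be converted with torch.tensor(masks, dtype=torch.bool) to obtain the (blocks, dims_in) shape described for condition_masks.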

init_with_grid(grid)

Initializes the flow using a VEGAS grid, i.e. from bins with varying width and equal probability. The number of bins of this grid should be larger than the number of RQ spline bins. This function then performs the bin reduction algorithm as described in (2311.01548).

Parameters:

grid (Tensor) – edges of the VEGAS grid bins with shape (dims_in, vegas_bins+1) for single-channel flows or (channels, dims_in, vegas_bins+1) for multi-channel flows
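To make "bins with varying width and equal probability" concrete, the sketch below evaluates the piecewise-linear cumulative distribution and the piecewise-constant density that such a grid defines along one dimension. It uses a plain Python list of edges in place of a tensor, the function names are hypothetical, and it does not reproduce the bin reduction step.

```python
import bisect


def vegas_bin(x, edges):
    """Index of the bin containing x, clamped to the valid range."""
    n_bins = len(edges) - 1
    i = bisect.bisect_right(edges, x) - 1
    return min(max(i, 0), n_bins - 1)


def vegas_cdf(x, edges):
    """Each bin carries probability 1 / n_bins, spread uniformly inside it."""
    n_bins = len(edges) - 1
    i = vegas_bin(x, edges)
    frac = (x - edges[i]) / (edges[i + 1] - edges[i])
    return (i + frac) / n_bins


def vegas_density(x, edges):
    """Narrow bins correspond to a high density, wide bins to a low one."""
    n_bins = len(edges) - 1
    i = vegas_bin(x, edges)
    return 1.0 / (n_bins * (edges[i + 1] - edges[i]))
```

For the edges [0.0, 0.1, 0.3, 1.0], the first bin is the narrowest and therefore has the highest density, while all three bins carry the same probability of 1/3.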

log_prob(x, c=None, channel=None, return_latent=False)

Computes the log-probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_latent (bool) – if True, also return the latent space vector

Return type:

Tensor | tuple[Tensor, Tensor]

Returns:

log-probabilities with shape (n, ). If return_latent is True, it also returns the latent space vector with shape (n, dims_in).
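The log-probability of a normalizing flow follows from the change-of-variables formula: the latent log-density evaluated at the transformed point plus the log-Jacobian determinant of the transformation into the latent space. The sketch below checks this in one dimension with a hand-written affine map and a Gaussian latent space; the function names are hypothetical and the affine map only stands in for the learned spline transformation.

```python
import math


def affine_to_latent(x, mu, sigma):
    """Map x to the latent variable z and return log|dz/dx|."""
    z = (x - mu) / sigma
    log_jac = -math.log(sigma)
    return z, log_jac


def log_prob_gaussian_latent(x, mu, sigma):
    """log p(x) = log N(z; 0, 1) + log|dz/dx|."""
    z, log_jac = affine_to_latent(x, mu, sigma)
    latent_log_density = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return latent_log_density + log_jac
```

The result agrees with the log-density of a normal distribution with mean mu and standard deviation sigma. With a uniform latent space on the unit hypercube, the latent log-density is zero, so the log-probability reduces to the log-Jacobian determinant alone.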

prob(x, c=None, channel=None, return_latent=False)

Computes the probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_latent (bool) – if True, also return the latent space vector

Return type:

Tensor | tuple[Tensor, Tensor]

Returns:

probabilities with shape (n, ). If return_latent is True, it also returns the latent space vector with shape (n, dims_in).

sample(n=None, c=None, channel=None, return_log_prob=False, return_prob=False, device=None, dtype=None, return_latent=False)

Draws samples from the probability distribution encoded by the flow.

Parameters:

n (Optional[int]) – number of samples. Only required if no condition is given.
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
return_log_prob (bool) – if True, also return the log-probabilities
return_prob (bool) – if True, also return the probabilities
device (Optional[device]) – device of the returned tensor. Only required if no condition is given.
dtype (Optional[dtype]) – dtype of the returned tensor. Only required if no condition is given.
return_latent (bool) – if True, also return the latent space vector

Return type:

Tensor | tuple[Tensor, ...]

Returns:

samples with shape (n, dims_in). Depending on the arguments return_log_prob, return_prob and return_latent, this function will also return the log-probabilities with shape (n, ), the probabilities with shape (n, ) and the latent space vector with shape (n, dims_in).

transform(x, c=None, channel=None, inverse=False)

Transforms the input data into the latent space or back.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.
inverse (bool) – if True, use inverted transformation (i.e. the sampling direction)

Return type:

tuple[Tensor, Tensor]

Returns:

tuple containing the transformed values with shape (n, dims_in), and log Jacobian determinants with shape (n, ) of the transformation
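The same pair of return values (transformed values and log-Jacobian determinants) is what the mapping argument of the constructor expects from a user-supplied function. The sketch below shows one such function that maps positive inputs to the unit interval, which would suit a flow with uniform_latent=True. The function name is hypothetical, and returning the log-Jacobian of whichever direction is being applied is an assumption based on the parameter description.

```python
import torch


def positive_to_unit(x, inverse=False):
    """Forward: (0, inf)^d -> (0, 1)^d via y = x / (1 + x).

    Returns the transformed tensor with shape (n, d) and the log-Jacobian
    determinant of the applied direction with shape (n, ).
    """
    if inverse:
        # x = y / (1 - y), dx/dy = 1 / (1 - y)^2 per component
        out = x / (1 - x)
        log_jac = -2 * torch.log1p(-x).sum(dim=1)
    else:
        # y = x / (1 + x), dy/dx = 1 / (1 + x)^2 per component
        out = x / (1 + x)
        log_jac = -2 * torch.log1p(x).sum(dim=1)
    return out, log_jac
```

A quick consistency check for any such function is that applying it forward and then inverted recovers the input, and that the two log-Jacobian determinants sum to zero.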

class madnis.nn.MLP(features_in, features_out, layers=3, units=32, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>)

Bases: Module

Class implementing a standard fully-connected network.

Parameters:

features_in (int) – number of input features
features_out (int) – number of output features
layers (int) – number of layers
units (int) – number of hidden nodes
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function
layer_constructor – function used to construct the network layers, given the number of input and output features

forward(x)

Evaluates the network.

Parameters:

x (Tensor) – network input, shape (n, features_in)

Return type:

Tensor

Returns:

network output, shape (n, features_out)

class madnis.nn.MaskedMLP(input_dims, output_dims, layers=3, nodes_per_feature=8, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, channels=None)

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x, channel=None)

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class madnis.nn.MixedFlow(dims_in_continuous, dims_in_discrete, dims_c=0, discrete_dims_position='first', discrete_model='made', channels=None, continuous_kwargs={}, discrete_kwargs={})

Bases: Module, Distribution

log_prob(x, c=None, channel=None)

Computes the log-probabilities of the input data.

Parameters:

x (Tensor) – input data, shape (n, dims_in)
c (Optional[Tensor]) – condition, shape (n, dims_c) or None for an unconditional flow
channel (Union[Tensor, list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- Tensor: integer tensor of shape (n, ), containing the channel index for every input sample;
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: used in the single-channel case or to indicate that all channels contain the same number of samples in the multi-channel case.

Return type:

Tensor

Returns:

log-probabilities with shape (n, )

class madnis.nn.StackedMLP(features_in, features_out, channels, layers, units, activation=<class 'torch.nn.modules.activation.ReLU'>, layer_constructor=<class 'torch.nn.modules.linear.Linear'>)

Bases: Module

Builds multiple independent MLPs that can be efficiently evaluated in parallel.

Parameters:

features_in (int) – number of input features
features_out (int) – number of output features
channels (int) – number of channels
layers (int) – number of layers
units (int) – number of hidden nodes
activation (Callable[[], Module]) – function that builds a nn.Module used as activation function
layer_constructor – function used to construct the network layers, given the number of input and output features

forward(x, channel=None)

Evaluates the network.

Parameters:

x (Tensor) – network input, shape (n, features_in)
channel (Union[list[int], int, None]) –
encodes the channel of the samples. It must have one of the following types:
- list: list of integers, specifying the number of samples in each channel;
- int: integer specifying a single channel containing all the samples;
- None: all channels contain the same number of samples.

Return type:

Tensor

Returns:

network output, shape (n, features_out)
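The three accepted forms of the channel argument can all be reduced to a list of per-channel sample counts. The sketch below does this in plain Python and also expands the counts into one channel index per sample. Both helper functions are hypothetical, and the expansion assumes that the samples are ordered by channel in contiguous blocks, which this reference does not state explicitly.

```python
def channel_sizes(channel, n, n_channels):
    """Normalize a list, int or None channel argument to per-channel counts."""
    if channel is None:
        # all channels contain the same number of samples
        if n % n_channels != 0:
            raise ValueError("n must be divisible by the number of channels")
        return [n // n_channels] * n_channels
    if isinstance(channel, int):
        # a single channel contains all the samples
        sizes = [0] * n_channels
        sizes[channel] = n
        return sizes
    sizes = list(channel)
    if len(sizes) != n_channels or sum(sizes) != n:
        raise ValueError("counts must cover all channels and sum to n")
    return sizes


def channel_indices(channel, n, n_channels):
    """One channel index per sample, assuming samples are sorted by channel."""
    indices = []
    for index, size in enumerate(channel_sizes(channel, n, n_channels)):
        indices.extend([index] * size)
    return indices
```

The per-sample index form corresponds to the additional Tensor option that the log_prob and sample methods of the distribution classes accept.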

reset_parameters()

Initializes the network parameters. The parameters of the last layer are initialized to zero. Kaiming uniform initialization is used for the other layers.
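The effect of this initialization scheme can be reproduced on an ordinary torch.nn.Sequential network, as sketched below. The helper name is hypothetical, and the gain argument a=sqrt(5) is an assumption taken from the default initialization of torch.nn.Linear, not from madnis.

```python
import math

import torch
from torch import nn


def reset_with_zero_last_layer(linears):
    """Kaiming uniform for all but the last layer, zeros for the last one."""
    for layer in linears[:-1]:
        nn.init.kaiming_uniform_(layer.weight, a=math.sqrt(5))
    nn.init.zeros_(linears[-1].weight)
    nn.init.zeros_(linears[-1].bias)


net = nn.Sequential(
    nn.Linear(4, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 3),
)
linears = [m for m in net if isinstance(m, nn.Linear)]
reset_with_zero_last_layer(linears)

# with a zero last layer the output vanishes for every input
out = net(torch.randn(8, 4))
```

Right after such an initialization the network output does not depend on the input, so every sample starts from the same predicted parameters.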

madnis.nn.rational_quadratic_spline(inputs, unnormalized_widths, unnormalized_heights, unnormalized_derivatives, inverse=False, left=0.0, right=1.0, bottom=0.0, top=1.0, min_bin_width=0.001, min_bin_height=0.001, min_derivative=0.001)

Constrained rational quadratic spline transformations as introduced in 1906.04032. The input points have to be within the spline boundaries.

Parameters:

inputs (Tensor) – input tensor, shape (…, )
unnormalized_widths (Tensor) – unnormalized spline bin widths, shape (…, n_bins)
unnormalized_heights (Tensor) – unnormalized spline bin heights, shape (…, n_bins)
unnormalized_derivatives (Tensor) – unnormalized derivatives at spline bin edges, shape (…, n_bins + 1)
inverse (bool) – if True, perform inverse transformation
left (float) – lower bound of inputs
right (float) – upper bound of inputs
bottom (float) – lower bound of outputs
top (float) – upper bound of outputs
min_bin_width (float) – minimal bin width
min_bin_height (float) – minimal bin height
min_derivative (float) – minimal derivative at bin edges

Return type:

tuple[Tensor, Tensor]

Returns:

tuple containing the output tensor and the log-Jacobian determinants of the transformation, both with shape (…, )
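For reference, the forward direction of a rational quadratic spline from 1906.04032 can be written for a single scalar input as sketched below, in plain Python so that the formula is visible. The function names are hypothetical. The normalization of the raw parameters (softmax for widths and heights, softplus for derivatives, each with a minimum added) follows a common convention for this spline and is an assumption; the madnis implementation may normalize differently.

```python
import bisect
import math


def _softplus(u):
    # numerically stable log(1 + exp(u))
    return math.log1p(math.exp(-abs(u))) + max(u, 0.0)


def _knots(unnormalized, lo, hi, min_size):
    """Softmax-normalized bin sizes with a lower bound, as knot positions."""
    n = len(unnormalized)
    shift = max(unnormalized)
    exps = [math.exp(u - shift) for u in unnormalized]
    total = sum(exps)
    sizes = [min_size + (1.0 - min_size * n) * e / total for e in exps]
    knots = [lo]
    for size in sizes:
        knots.append(knots[-1] + (hi - lo) * size)
    knots[-1] = hi  # remove accumulated rounding error
    return knots


def rq_spline_forward(x, unnorm_widths, unnorm_heights, unnorm_derivatives,
                      left=0.0, right=1.0, bottom=0.0, top=1.0,
                      min_bin_width=1e-3, min_bin_height=1e-3,
                      min_derivative=1e-3):
    """Returns the spline output and the log of its derivative at x."""
    if not left <= x <= right:
        raise ValueError("input outside of the spline boundaries")
    xs = _knots(unnorm_widths, left, right, min_bin_width)
    ys = _knots(unnorm_heights, bottom, top, min_bin_height)
    ds = [min_derivative + _softplus(u) for u in unnorm_derivatives]

    # bin k contains x; the right boundary belongs to the last bin
    k = min(bisect.bisect_right(xs, x) - 1, len(unnorm_widths) - 1)
    w = xs[k + 1] - xs[k]
    h = ys[k + 1] - ys[k]
    delta = h / w                # average slope of bin k
    theta = (x - xs[k]) / w      # relative position inside the bin
    t1 = theta * (1.0 - theta)

    denom = delta + (ds[k] + ds[k + 1] - 2.0 * delta) * t1
    y = ys[k] + h * (delta * theta**2 + ds[k] * t1) / denom
    dydx = delta**2 * (ds[k + 1] * theta**2 + 2.0 * delta * t1
                       + ds[k] * (1.0 - theta)**2) / denom**2
    return y, math.log(dydx)
```

With equal bin widths, equal bin heights and all knot derivatives equal to one, this reduces to the identity map, which is a convenient sanity check.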

madnis.nn.unconstrained_rational_quadratic_spline(inputs, unnormalized_widths, unnormalized_heights, unnormalized_derivatives, inverse=False, left=0.0, right=1.0, bottom=0.0, top=1.0, min_bin_width=0.001, min_bin_height=0.001, min_derivative=0.001)

Unconstrained rational quadratic spline transformations as introduced in 1906.04032. Points outside the bounds are mapped onto themselves.

Parameters:

inputs (Tensor) – input tensor, shape (…, )
unnormalized_widths (Tensor) – unnormalized spline bin widths, shape (…, n_bins)
unnormalized_heights (Tensor) – unnormalized spline bin heights, shape (…, n_bins)
unnormalized_derivatives (Tensor) – unnormalized derivatives at spline bin edges, shape (…, n_bins + 1)
inverse (bool) – if True, perform inverse transformation
left (float) – lower bound of inputs
right (float) – upper bound of inputs
bottom (float) – lower bound of outputs
top (float) – upper bound of outputs
min_bin_width (float) – minimal bin width
min_bin_height (float) – minimal bin height
min_derivative (float) – minimal derivative at bin edges

Return type:

tuple[Tensor, Tensor]

Returns:

tuple containing the output tensor and the log-Jacobian determinants of the transformation, both with shape (…, )
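The statement that points outside the bounds are mapped onto themselves amounts to identity tails with a vanishing log-Jacobian. The sketch below wraps an arbitrary scalar spline function in this way. Both function names are hypothetical, and `toy_spline` is a stand-in monotone map of [0, 1] onto itself with slope 1 at both ends, not an actual rational quadratic spline.

```python
import math


def with_identity_tails(spline, x, left=0.0, right=1.0):
    """Apply spline inside [left, right], the identity map outside."""
    if left <= x <= right:
        return spline(x)
    # the identity has derivative 1, so the log-Jacobian is zero
    return x, 0.0


def toy_spline(x):
    """Monotone stand-in map on [0, 1] with slope 1 at both boundaries."""
    y = x + 0.2 * math.sin(math.pi * x) ** 2
    dydx = 1.0 + 0.2 * math.pi * math.sin(2.0 * math.pi * x)
    return y, math.log(dydx)
```

For the combined map to be continuous, the input and output intervals of the inner spline have to coincide. In 1906.04032 the boundary derivatives are additionally set to one so that the slope matches the linear tails; this reference does not state how madnis treats the boundary derivatives.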