madnis.integrator package

This module contains functions and classes to train neural importance sampling networks and evaluate the integration and sampling performance.

class madnis.integrator.Buffer(capacity, shapes, persistent=True, dtypes=None)

Bases: Module

Circular buffer for multiple tensors with different shapes. The class is a torch.nn.Module to allow for simple storage.

Parameters:

capacity (int) – maximum number of samples stored in the buffer
shapes (list[tuple[int, ...]]) – shapes of the tensors to be stored, without batch dimension. If a shape is None, no tensor is stored at that position. This allows for simpler handling of optional stored fields.
persistent (bool) – if True, the content of the buffer is part of the module’s state_dict
dtypes (Optional[list[dtype | None]]) – if different from None, specifies the tensors which have a non-standard dtype

filter(predicate, batch_size=100000)

Removes samples from the buffer that do not fulfill the criterion given by the predicate function.

Parameters:

predicate (Callable[[tuple[Tensor | None, ...]], Tensor]) – function that returns a mask for a batch of samples, given a tuple with all the buffered fields as argument
batch_size (int) – maximal batch size to limit memory usage

sample(count)

Returns a batch of samples drawn from the buffer without replacement.

Parameters: count (int) – number of samples
Return type: list[Tensor | None]
Returns: samples drawn from the buffer

store(*tensors)

Adds the given tensors to the buffer. If the buffer is full, the oldest stored samples are overwritten.

Parameters: tensors (Tensor | None) – samples to be stored. The shapes of the tensors after the batch dimension must match the shapes given during initialization. The argument can be None if the corresponding shape was None during initialization.
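The overwrite-oldest behaviour of store, together with filter and sample, can be illustrated with a minimal plain-Python sketch. The class RingBuffer and its attributes are hypothetical stand-ins that hold single Python values; the real Buffer stores batched tensors and is not reproduced here.

```python
import random


class RingBuffer:
    """Plain-Python stand-in for a circular sample buffer (not the MadNIS class)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []     # stored samples, at most `capacity` entries
        self.next_slot = 0  # position the next stored sample is written to

    def store(self, *samples):
        for sample in samples:
            if len(self.items) < self.capacity:
                self.items.append(sample)
            else:
                # Buffer is full: the oldest sample sits at next_slot
                self.items[self.next_slot] = sample
            self.next_slot = (self.next_slot + 1) % self.capacity

    def filter(self, predicate):
        # Reorder oldest-first, then keep only samples passing the predicate
        ordered = self.items[self.next_slot:] + self.items[:self.next_slot]
        self.items = [s for s in ordered if predicate(s)]
        self.next_slot = len(self.items) % self.capacity

    def sample(self, count, rng=random):
        # Draw without replacement
        return rng.sample(self.items, count)


buf = RingBuffer(capacity=3)
buf.store(1, 2, 3, 4)         # 4 overwrites the oldest sample, 1
buf.filter(lambda s: s != 2)  # drops sample 2
buf.store(5, 6)               # 5 fills the free slot, 6 overwrites 3
```

After these calls the sketch holds the samples 4, 5 and 6.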

class madnis.integrator.ChannelData(channel_index, target_index, group, remapped, position_in_group)

Bases: object

Information about a single channel

Parameters:

channel_index (int) – index of the channel
target_index (int) – index of the channel that it is mapped to
group (ChannelGroup) – channel group that the channel belongs to
remapped (bool) – True if the channel is remapped to another channel
position_in_group (int) – index of the channel within its group

class madnis.integrator.ChannelGroup(group_index, target_index, channel_indices)

Bases: object

A group of channels

Parameters:

group_index (int) – index of the group in the list of groups
target_index (int) – index of the channel that all other channels in the group are mapped to
channel_indices (list[int]) – indices of the channels in the group

class madnis.integrator.ChannelGrouping(channel_assignment)

Bases: object

Class that encodes how channels are grouped together for a multi-channel integrand

Parameters: channel_assignment (list[int | None]) – list with an entry for each channel. If an entry is None, the channel is not remapped; otherwise the entry is the index of the channel to which it is mapped.
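As an illustration of the channel_assignment format, the following plain-Python sketch derives groups and a channel-to-group lookup from such a list. The helper names build_groups and channel_to_group_index are hypothetical and not part of MadNIS; how the library orders its groups internally is an assumption here.

```python
def build_groups(channel_assignment):
    """Map each target channel to the list of channels that share it."""
    groups = {}
    for channel, target in enumerate(channel_assignment):
        # None means "not remapped": the channel is its own target
        key = channel if target is None else target
        groups.setdefault(key, []).append(channel)
    return groups


def channel_to_group_index(groups):
    """Map every channel index to the position of its group."""
    return {
        channel: group_index
        for group_index, members in enumerate(groups.values())
        for channel in members
    }


# Channels 1 and 4 are mapped to channel 0, channel 3 is mapped to channel 2
assignment = [None, 0, None, 2, 0]
groups = build_groups(assignment)
lookup = channel_to_group_index(groups)
```

Here five channels collapse into two groups, which is the kind of count that unique_channel_count reports for a grouped integrand.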

class madnis.integrator.Integrand(function, input_dim, bounds=None, channel_count=None, remapped_dim=None, has_channel_weight_prior=False, channel_grouping=None, function_includes_sampling=False, update_active_channels_mask=None, discrete_dims=[], discrete_dims_position='first', discrete_prior_prob_function=None)

Bases: Module

Class that wraps an integrand function and meta-data necessary to use advanced MadNIS features like learnable multi-channel weights, grouped channels and channel weight priors.

Parameters:

function (Callable) –
integrand function. The signature depends on the other arguments:
- single-channel integration, channel_count=None: x -> f
- basic multi-channel integration, remapped_dim=None, has_channel_weight_prior=False: (x, c) -> f
- with channel weights, remapped_dim=None, has_channel_weight_prior=True: (x, c) -> (f, alpha) (no trainable channel weights possible)
- with channel-dependent mapping, remapped_dim: int, has_channel_weight_prior=False: (x, c) -> (f, y)
- all features, remapped_dim: int, has_channel_weight_prior=True: (x, c) -> (f, y, alpha)
with the following tensors:
- x is a point generated by the importance sampling, shape (n, input_dim),
- c is the channel index, shape (n, ),
- f is the integrand value, shape (n, ),
- y is the point after applying a channel-dependent mapping, shape (n, remapped_dim)
- alpha is the prior channel weight, shape (n, channel_count).
input_dim (int) – dimension of the integration space
bounds (Optional[list[list[float]]]) – List of pairs [lower bound, upper bound] of the integration interval for all dimensions. The integrand is rescaled so that the MadNIS training can be performed on the unit hypercube. If None, the unit hypercube is used as integration domain.
channel_count (Optional[int]) – None in the single-channel case, specifies the number of channels otherwise.
remapped_dim (Optional[int]) – If different from None, it gives the dimension of a remapped space, with a channel-dependent mapping computed as part of the integrand function.
has_channel_weight_prior (bool) – If True, the integrand returns channel weights
channel_grouping (Optional[ChannelGrouping]) – ChannelGrouping object or None if all channels are independent
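The rescaling implied by the bounds argument can be sketched in plain Python: points from the unit hypercube are mapped linearly into the integration domain and the function value is multiplied by the domain volume. The helper rescale_to_unit_cube is a hypothetical illustration, not the library's implementation.

```python
def rescale_to_unit_cube(function, bounds):
    """Wrap `function` so that it can be integrated over the unit hypercube."""
    volume = 1.0
    for lower, upper in bounds:
        volume *= upper - lower

    def rescaled(u):
        # Map u in [0, 1]^d to x inside the original integration bounds
        x = [lower + (upper - lower) * ui for (lower, upper), ui in zip(bounds, u)]
        # The Jacobian of the linear map is the volume of the original domain
        return function(x) * volume

    return rescaled


def product(x):
    return x[0] * x[1]


g = rescale_to_unit_cube(product, [[0.0, 2.0], [1.0, 3.0]])
midpoint_value = g([0.5, 0.5])
```

For this bilinear example the value at the midpoint of the unit square equals the exact integral of x0 * x1 over [0, 2] x [1, 3], which is 8.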

forward(x, channels)

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type: tuple[Tensor, Tensor | None, Tensor | None]

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

remap_channels(channels)

Remaps channel indices to the indices of their respective channel groups if a ChannelGrouping object was provided, otherwise returns the indices unchanged.

Parameters: channels (Tensor | int) – channel indices, tensor with shape (n, ) or integer
Return type: Tensor | int
Returns: remapped channel indices, tensor with shape (n, ) or integer

unique_channel_count()

Returns the number of channels, or, if some channels are grouped together, the number of channel groups

Return type: int

class madnis.integrator.IntegrationMetrics(integral, count, error, rel_error, rel_stddev, rel_stddev_opt, channel_integrals, channel_counts, channel_errors, channel_rel_errors, channel_rel_stddevs)

Bases: object

Metrics for the integration performance

Parameters:

integral (float) – total integration result
count (int) – number of integration samples
error (float) – Monte Carlo integration error
rel_error (float) – relative integration error
rel_stddev (float) – relative standard deviation (does not scale with number of samples)
rel_stddev_opt (float) – optimal relative standard deviation that would have been possible with stratified sampling
channel_integrals (list[float]) – channel-wise integrals
channel_counts (list[int]) – channel-wise number of samples
channel_errors (list[float]) – channel-wise integration errors
channel_rel_errors (list[float]) – channel-wise relative integration errors
channel_rel_stddevs (list[float]) – channel-wise relative standard deviations

class madnis.integrator.Integrator(integrand, dims=0, flow=None, flow_kwargs={}, discrete_flow_kwargs={}, discrete_model='made', train_channel_weights=True, cwnet=None, cwnet_kwargs={}, loss=None, optimizer=None, batch_size=1024, batch_size_per_channel=0, learning_rate=0.001, scheduler=None, uniform_channel_ratio=1.0, integration_history_length=20, drop_zero_integrands=False, batch_size_threshold=0.5, buffer_capacity=0, minimum_buffer_size=50, buffered_steps=0, max_stored_channel_weights=None, channel_dropping_threshold=0.0, channel_dropping_interval=100, channel_grouping_mode='none', freeze_cwnet_iteration=None, device=None, dtype=None)

Bases: Module

Implements MadNIS training and integration logic. MadNIS integrators are torch modules, so their state can easily be saved and loaded using the torch.save and torch.load functions.

Parameters:

integrand (Callable[[Tensor], Tensor] | Integrand) – the function to be integrated. In the case of a simple single-channel integration, the integrand function can directly be passed to the integrator. In more complicated cases, like multi-channel integrals, use the Integrand class.
dims (int) – dimension of the integration space. Only required if a simple function is given as integrand.
flow (Optional[Distribution]) – sampling distribution used for the integration. If None, a flow is constructed using the Flow class. Otherwise, it has to be compatible with a normalizing flow, i.e. have the interface defined in the Distribution class.
flow_kwargs (dict[str, Any]) – If flow is None, these keyword arguments are passed to the Flow constructor.
discrete_flow_kwargs (dict[str, Any]) – If flow is None, these keyword arguments are passed to the MixedFlow or DiscreteMADE constructor.
train_channel_weights (bool) – If True, construct a channel weight network and train it. Only necessary if cwnet is None.
cwnet (Optional[Module]) – network used for the trainable channel weights. If None and train_channel_weights is True, the cwnet is built using the MLP class.
cwnet_kwargs (dict[str, Any]) – If cwnet is None and train_channel_weights is True, these keyword arguments are passed to the MLP constructor.
loss (Optional[Callable[[Tensor, Tensor, Tensor | None, Tensor | None], Tensor]]) – Loss function used for training. If not provided, the KL divergence is chosen in the single-channel case and the stratified variance is chosen in the multi-channel case.
optimizer (Union[Optimizer, Callable[[Iterable[Parameter]], Optimizer], None]) – optimizer for the training. Can be an optimizer object or function that is called with the model parameters as argument and returns the optimizer. If None, the Adam optimizer is used.
batch_size (int) – Training batch size
batch_size_per_channel (int) – used to compute the batch size as a function of the number of active channels, batch_size + n_active_channels * batch_size_per_channel
learning_rate (float) – learning rate used for the Adam optimizer
scheduler (Union[LRScheduler, Callable[[Optimizer], LRScheduler], None]) – learning rate scheduler for the training. Can be a learning rate scheduler object or a function that gets the optimizer as argument and returns the scheduler. If None, a constant learning rate is used.
uniform_channel_ratio (float) – part of samples in each batch that will be distributed equally between all channels, value has to be between 0 and 1.
integration_history_length (int) – number of batches for which the channel-wise means and variances are stored. This is used for stratified sampling during integration, and during the training if uniform_channel_ratio is different from one.
drop_zero_integrands (bool) – If True, points with integrand zero are dropped and not used for the optimization.
batch_size_threshold (float) – New samples are drawn until the number of samples is at least batch_size_threshold * batch_size.
buffer_capacity (int) – number of samples that are stored for buffered training
minimum_buffer_size (int) – minimal size of the buffer to run buffered training
buffered_steps (int) – number of optimization steps on buffered samples after every online training step
max_stored_channel_weights (Optional[int]) – number of prior channel weights that are buffered for each sample. If None, all prior channel weights are saved, otherwise only those for the channels with the largest contributions.
channel_dropping_threshold (float) – all channels with a cumulated contribution to the integrand that is smaller than this threshold are dropped
channel_dropping_interval (int) – number of training steps after which channel dropping is performed
channel_grouping_mode (Literal['none', 'uniform', 'learned']) – If “none”, all channels are treated as separate channels in the loss and integration, even when they are grouped together. If “uniform”, the channels within each group are sampled with equal probability. If “learned”, a discrete normalizing flow is used to sample the channel index within a group.
freeze_cwnet_iteration (Optional[int]) – If not None, specifies the training iteration after which the channel weight network is frozen
device (Optional[device]) – torch device used for training and integration. If None, use default device.
dtype (Optional[dtype]) – torch dtype used for training and integration. If None, use default dtype.
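The batch size rule given for batch_size_per_channel and the stopping rule given for batch_size_threshold can be written out as a small plain-Python sketch. Both helper names are hypothetical, and applying the threshold to the channel-dependent batch size rather than to the bare batch_size argument is an assumption.

```python
def training_batch_size(batch_size, batch_size_per_channel, n_active_channels):
    """Batch size as a function of the number of active channels."""
    return batch_size + n_active_channels * batch_size_per_channel


def needs_more_samples(n_samples, batch_size, batch_size_threshold):
    """True while fewer than batch_size_threshold * batch_size samples exist."""
    return n_samples < batch_size_threshold * batch_size


# 1024 base samples plus 64 extra samples for each of 10 active channels
size = training_batch_size(1024, 64, 10)
```

With batch_size_threshold=0.5, sampling in this sketch continues until at least half of that batch size has been collected.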

integral()

Returns the current estimate of the integral based on previous training iterations and calls to the integrate function.

Return type: tuple[float, float]
Returns: tuple with the value of the integral and the MC integration error

integrate(n, batch_size=100000)

Draws samples and computes the integral.

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand

Return type:

tuple[float, float]

Returns:

tuple with the value of the integral and the MC integration error
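The returned pair (integral, MC error) follows the usual importance sampling estimate: each sample gets the weight f(x)/q(x), the integral is the mean weight and the error is the standard error of that mean. The sketch below shows this in plain Python with a fixed analytic density in place of a trained flow; the function importance_sampling_integral and its arguments are hypothetical.

```python
import math
import random


def importance_sampling_integral(f, draw, density, n, rng):
    """Return (integral estimate, Monte Carlo error) from n weighted samples."""
    weights = []
    for _ in range(n):
        x = draw(rng)
        weights.append(f(x) / density(x))  # integration weight f(x) / q(x)
    mean = sum(weights) / n
    variance = sum((w - mean) ** 2 for w in weights) / (n - 1)
    return mean, math.sqrt(variance / n)


rng = random.Random(0)
value, error = importance_sampling_integral(
    f=lambda x: 3.0 * x * x,                     # exact integral on [0, 1] is 1
    draw=lambda r: math.sqrt(1.0 - r.random()),  # inverse-CDF sampling of q(x) = 2x
    density=lambda x: 2.0 * x,
    n=10_000,
    rng=rng,
)
```

The closer the sampling density follows the integrand, the smaller the spread of the weights and hence the error, which is the quantity the training aims to reduce.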

integration_metrics(n, batch_size=100000)

Draws samples and computes metrics for the total and channel-wise integration quality.

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand

Return type:

IntegrationMetrics

Returns:

IntegrationMetrics object, see its documentation for details

sample(n, batch_size=100000, channel_weight_mode='variance', channel=None, evaluate_integrand=True)

Draws samples and computes their integration weight

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform', 'mean', 'variance']) – specifies whether the channels are weighted by their mean, variance or uniformly. Note that weighting by mean can lead to problems for non-positive functions
channel (Optional[int]) – if different from None, samples are only generated for this channel

Return type:

SampleBatch

Returns:

SampleBatch object, see its documentation for details

train(steps, callback=None, capture_keyboard_interrupt=False)

Performs multiple training steps

Parameters:

steps (int) – number of training steps
callback (Optional[Callable[[TrainingStatus], None]]) – function that is called after each training step with the training status as argument
capture_keyboard_interrupt (bool) – If True, a keyboard interrupt does not raise an exception. Instead, the current training step is finished and the training is aborted afterwards.

train_step()

Performs a single training step

Return type: TrainingStatus
Returns: Training status

unweighting_metrics(n, batch_size=100000, channel_weight_mode='mean')

Draws samples and computes metrics for the unweighting performance. This function is only suitable for functions that are non-negative everywhere.

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform', 'mean', 'variance']) – specifies whether the channels are weighted by their mean, variance or uniformly.

Return type:

UnweightingMetrics

Returns:

UnweightingMetrics object, see its documentation for details

class madnis.integrator.SampleBatch(x, y, q_sample, func_vals, channels, alphas_prior=None, alpha_channel_indices=None, integration_channels=None, weights=None, alphas=None, zero_counts=None)

Bases: object

Contains a batch of samples

Parameters:

x (Tensor) – samples generated by the flow, shape (n, dim)
y (Tensor | None) – remapped samples returned by the integrand, shape (n, remapped_dim)
q_sample (Tensor) – probabilities of the samples, shape (n, )
func_vals (Tensor) – integrand value, shape (n, )
channels (Tensor | None) – channels indices for multi-channel integration, shape (n, ), otherwise None
alphas_prior (Optional[Tensor]) – prior channel weights, shape (n, channels), or None for single-channel integration
alpha_channel_indices (Optional[Tensor]) – channel indices if not all prior channel weights are stored, otherwise None
integration_channels (Optional[Tensor]) – index of the channel group in case the integration is performed at the level of channel groups, shape (n, ), otherwise None
weights (Optional[Tensor]) – integration weight, shape (n, ). Only set when returned from Integrator.sample function, otherwise None.
alphas (Optional[Tensor]) – channel weights including learned correction, shape (n, channels). Only set when returned from Integrator.sample function, otherwise None.
zero_counts (Optional[Tensor]) – channel-wise counts of samples with zero weights that are not included in the batch, shape (channels, ). This field is ignored by most methods, as it does not have the batch size as its first dimension

static cat(batches)

Concatenates multiple batches. If the field zero_counts is not None, the zero_counts of all batches are added.

Parameters: batches (Iterable[SampleBatch]) – Iterable over SampleBatch objects
Return type: SampleBatch
Returns: New SampleBatch object containing the concatenated batches

map(func)

Applies function to all fields in the batch that are not None and returns a new SampleBatch

Parameters: func (Callable[[Tensor], Tensor]) – function that is applied to all fields in the batch. Expects a tensor as argument and returns a new tensor
Return type: SampleBatch
Returns: Transformed SampleBatch

split(batch_size)

Splits up the fields into batches and yields SampleBatch objects for every batch.

Parameters: batch_size (int) – maximal size of the batches
Return type: Iterable[SampleBatch]
Returns: Iterator over the batches
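The interplay of map, split and cat on a container with optional fields can be sketched with a small dataclass. The class Batch below is a hypothetical stand-in that uses Python lists instead of tensors and carries only three of the documented fields; the special treatment of zero_counts is left out.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Batch:
    """List-based stand-in for a sample container with optional fields."""
    x: list
    func_vals: list
    channels: Optional[list] = None

    def map(self, func):
        # Apply func to every field that is not None
        values = {}
        for f in fields(self):
            value = getattr(self, f.name)
            values[f.name] = None if value is None else func(value)
        return Batch(**values)

    def split(self, batch_size):
        # Yield consecutive slices of at most batch_size samples
        for start in range(0, len(self.x), batch_size):
            yield self.map(lambda field: field[start:start + batch_size])

    @staticmethod
    def cat(batches):
        # Concatenate field by field, keeping None fields as None
        batches = list(batches)
        merged = {}
        for f in fields(Batch):
            parts = [getattr(b, f.name) for b in batches]
            merged[f.name] = None if parts[0] is None else [v for p in parts for v in p]
        return Batch(**merged)


batch = Batch(x=[1, 2, 3, 4, 5], func_vals=[0.1, 0.2, 0.3, 0.4, 0.5])
parts = list(batch.split(2))
restored = Batch.cat(parts)
```

Splitting and concatenating again gives back the original batch in this sketch, and the channels field stays None throughout.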

class madnis.integrator.TrainingStatus(step, loss, buffered, learning_rate, dropped_channels)

Bases: object

Contains the MadNIS training status to pass it to a callback function.

Parameters:

step (int) – optimization step
loss (float) – loss from the optimization step
buffered (bool) – whether the optimization was performed on buffered samples
learning_rate (float | None) – current learning rate if learning rate scheduler is present
dropped_channels (int) – number of channels dropped after this optimization step
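A training callback only needs to accept an object with the fields listed above. The following sketch uses a hypothetical stand-in dataclass, StatusStandIn, so that it runs without MadNIS; LossLogger is likewise an illustrative name, and nothing here depends on whether step numbering starts at 0 or 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusStandIn:
    """Hypothetical stand-in with the documented TrainingStatus fields."""
    step: int
    loss: float
    buffered: bool
    learning_rate: Optional[float]
    dropped_channels: int


class LossLogger:
    """Callback that keeps the loss history and reports a running mean."""

    def __init__(self, interval):
        self.interval = interval
        self.losses = []
        self.messages = []

    def __call__(self, status):
        self.losses.append(status.loss)
        if len(self.losses) % self.interval == 0:
            # Average over the most recent `interval` optimization steps
            recent = self.losses[-self.interval:]
            mean_loss = sum(recent) / len(recent)
            self.messages.append(f"step {status.step}: mean loss {mean_loss:.3f}")


logger = LossLogger(interval=2)
for step, loss in enumerate([4.0, 2.0, 1.5, 0.5]):
    logger(StatusStandIn(step, loss, False, 1e-3, 0))
```

Given the documented signature train(steps, callback=None, ...), such an object would be passed as the callback argument of Integrator.train.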

class madnis.integrator.UnweightingMetrics(cut_eff, uweff_before_cuts, uweff_before_cuts_partial, uweff_after_cuts, uweff_after_cuts_partial, over_weight_rate)

Bases: object

Metrics for the unweighting performance

Parameters:

cut_eff (float) – cut efficiency
uweff_before_cuts (float) – unweighting efficiency before cuts (computed as uweff_after_cuts * cut_eff)
uweff_before_cuts_partial (float) – unweighting efficiency without over-weights before cuts (computed as uweff_after_cuts_partial * cut_eff)
uweff_after_cuts (float) – unweighting efficiency after cuts
uweff_after_cuts_partial (float) – unweighting efficiency without over-weights after cuts
over_weight_rate (float) – fraction of over-weight samples
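The relation uweff_before_cuts = uweff_after_cuts * cut_eff can be checked with a simplified plain-Python sketch. It assumes that samples failing the cuts carry weight zero and uses the plain maximum weight as reference, so it does not reproduce the replica-based treatment of over-weights; unweighting_summary is a hypothetical helper.

```python
def unweighting_summary(weights):
    """Return (cut_eff, uweff_before_cuts, uweff_after_cuts) for a list of weights."""
    passed = [w for w in weights if w != 0.0]  # assumption: cut events have weight 0
    cut_eff = len(passed) / len(weights)
    w_max = max(passed)                        # plain maximum, no bootstrap replicas
    uweff_after_cuts = sum(passed) / len(passed) / w_max
    uweff_before_cuts = uweff_after_cuts * cut_eff
    return cut_eff, uweff_before_cuts, uweff_after_cuts


cut_eff, before, after = unweighting_summary([0.0, 0.0, 1.0, 2.0, 1.0, 4.0])
```

In this example four of six samples pass the cuts, the efficiency after cuts is 0.5 and the efficiency before cuts is one third, which equals the mean over all six weights divided by the maximum weight.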

class madnis.integrator.VegasPreTraining(integrator, bins=64, damping=0.7)

Bases: object

Implements VEGAS pre-training. It wraps around an Integrator object and uses its integrand, sample buffer and integration history. In addition, it also defines the functions sample, integrate, integration_metrics and unweighting_metrics to allow for comparisons between VEGAS and MadNIS.

initialize_integrator()

Initializes the flows in the integrator object using the trained VEGAS grid

integrate(n)

Draws samples and computes the integral.

Parameters:

n (int) – number of samples
batch_size – batch size used for sampling and calling the integrand

Return type:

tuple[float, float]

Returns:

tuple with the value of the integral and the MC integration error

integration_metrics(n, batch_size=100000)

Draws samples and computes metrics for the total and channel-wise integration quality.

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand

Return type:

IntegrationMetrics

Returns:

IntegrationMetrics object, see its documentation for details

sample(n, batch_size=100000, channel_weight_mode='variance', channel=None, evaluate_integrand=True)

Draws samples and computes their integration weight

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform', 'mean', 'variance']) – specifies whether the channels are weighted by their mean, variance or uniformly. Note that weighting by mean can lead to problems for non-positive functions
channel (Optional[int]) – if different from None, samples are only generated for this channel
evaluate_integrand (bool) – If False, the integrand is not evaluated and func_vals, y and alphas are set to dummy values. This can be used when the integrand is expensive and one only needs the mapping Jacobian.

Return type:

SampleBatch

Returns:

SampleBatch object, see its documentation for details

train(samples_per_channel, callback=None, capture_keyboard_interrupt=False)

Performs multiple training steps

Parameters:

samples_per_channel (list[int]) – list of the number of samples per channel, with one entry for every training iteration
callback (Optional[Callable[[VegasTrainingStatus], None]]) – function that is called after each training step with the training status as argument
capture_keyboard_interrupt (bool) – If True, a keyboard interrupt does not raise an exception. Instead, the current training step is finished and the training is aborted afterwards.

train_step(samples_per_channel)

Performs a single VEGAS training iteration

Parameters: samples_per_channel (int) – number of training samples per channel
Return type: VegasTrainingStatus
Returns: VegasTrainingStatus object containing metrics of the training progress

unweighting_metrics(n, batch_size=100000, channel_weight_mode='mean')

Draws samples and computes metrics for the unweighting performance. This function is only suitable for functions that are non-negative everywhere.

Parameters:

n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform', 'mean', 'variance']) – specifies whether the channels are weighted by their mean, variance or uniformly.

Return type:

UnweightingMetrics

Returns:

UnweightingMetrics object, see its documentation for details

madnis.integrator.integration_metrics(channel_means, channel_variances, channel_counts)

Calculate metrics for the integration performance

Parameters:

channel_means (Tensor) – channel-wise integrals
channel_variances (Tensor) – channel-wise variances
channel_counts (Tensor) – channel-wise sample counts

Return type:

IntegrationMetrics

Returns:

An IntegrationMetrics object
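How channel-wise means, variances and counts could combine into the total metrics is sketched below in plain Python. The formulas are assumptions based on standard stratified Monte Carlo: independent channel estimates add, the error is the square root of the summed variances of the channel means, and the optimal allocation of samples proportional to the channel standard deviations yields a total standard deviation equal to their sum. The function combine_channels is hypothetical and the library may differ in detail.

```python
import math


def combine_channels(channel_means, channel_variances, channel_counts):
    """Combine per-channel estimates into total integration metrics."""
    integral = sum(channel_means)
    count = sum(channel_counts)
    # Variance of a sum of independent channel means
    error = math.sqrt(sum(v / n for v, n in zip(channel_variances, channel_counts)))
    rel_error = error / abs(integral)
    # Multiply by sqrt(count) to remove the dependence on the sample size
    rel_stddev = rel_error * math.sqrt(count)
    # Optimal stratification: samples per channel proportional to its stddev
    rel_stddev_opt = sum(math.sqrt(v) for v in channel_variances) / abs(integral)
    return {
        "integral": integral,
        "count": count,
        "error": error,
        "rel_error": rel_error,
        "rel_stddev": rel_stddev,
        "rel_stddev_opt": rel_stddev_opt,
    }


metrics = combine_channels([1.0, 3.0], [4.0, 1.0], [100, 100])
```

In this example the two channels received equal numbers of samples although their standard deviations differ, so rel_stddev (about 0.79) is larger than rel_stddev_opt (0.75).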

madnis.integrator.kl_divergence(f_true, q_test, q_sample)

Computes the Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample.

Parameters:

f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case

Return type:

Tensor

Returns:

computed KL divergence
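One way to read the description is as the estimator mean(f_true / q_sample * (log f_true - log q_test)) over points drawn from q_sample. The plain-Python sketch below implements that reading on lists of floats; it is an assumption about the estimator's form, not the library code, and kl_divergence_estimate is a hypothetical name.

```python
import math


def kl_divergence_estimate(f_true, q_test, q_sample):
    """Importance-sampled estimate of KL(f || q_test) from points drawn from q_sample."""
    total = 0.0
    for f, qt, qs in zip(f_true, q_test, q_sample):
        if f > 0.0:  # terms with f = 0 contribute nothing (0 * log 0 -> 0)
            total += f / qs * (math.log(f) - math.log(qt))
    return total / len(f_true)


# Two points sampled uniformly (q_sample = 1) from the unit interval
perfect = kl_divergence_estimate([0.5, 1.5], [0.5, 1.5], [1.0, 1.0])
mismatch = kl_divergence_estimate([0.5, 1.5], [1.0, 1.0], [1.0, 1.0])
```

The estimate vanishes when q_test matches f_true and is positive for the mismatched example.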

madnis.integrator.kl_divergence_softclip(f_true, q_test, q_sample, threshold=30.0)

Computes the Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample. A soft clipping function is applied to the sample weights.

Parameters:

f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case
threshold (Tensor) – approximate point of transition between linear and logarithmic behavior

Return type:

Tensor

Returns:

computed KL divergence
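A function with the stated behaviour, linear below the threshold and logarithmic far above it, is the scaled inverse hyperbolic sine. Whether MadNIS uses this particular function is an assumption; the sketch only illustrates what such a soft clipping does to large sample weights.

```python
import math


def softclip(x, threshold=30.0):
    """Approximately x for |x| << threshold, logarithmic growth for |x| >> threshold."""
    return threshold * math.asinh(x / threshold)


small = softclip(0.3)     # stays close to 0.3
large = softclip(3000.0)  # compressed to roughly 159
```

Because asinh is odd, negative weights are clipped symmetrically, so rare samples with very large weights of either sign cannot dominate the loss.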

madnis.integrator.multi_channel_loss(loss)

Turns a single-channel loss function into a multi-channel loss function by evaluating it for each channel separately and then adding up the weighted per-channel results.

Parameters: loss (Callable[[Tensor, Tensor, Tensor], Tensor]) – single-channel loss function, that expects the integrand value, test probability and sampling probability as arguments
Return type: Callable[[Tensor, Tensor, Tensor | None, Tensor | None], Tensor]
Returns: multi-channel loss function, that expects the integrand value, test probability and, optionally, sampling probability and channel indices as arguments.

madnis.integrator.rkl_divergence(f_true, q_test, q_sample)

Computes the reverse Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample.

Parameters:

f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case

Return type:

Tensor

Returns:

computed reverse KL divergence

madnis.integrator.stratified_variance(f_true, q_test, q_sample=None, channels=None)

Computes the stratified variance as introduced in [2311.01548] for two given sets of probabilities, f_true and q_test. It uses importance sampling with a sampling probability specified by q_sample.

Parameters:

f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Optional[Tensor]) – sampling probability
channels (Optional[Tensor]) – channel indices or None in the single-channel case

Returns:

computed stratified variance

madnis.integrator.stratified_variance_softclip(f_true, q_test, q_sample=None, channels=None, threshold=30.0)

Parameters:

f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Optional[Tensor]) – sampling probability
channels (Optional[Tensor]) – channel indices or None in the single-channel case
threshold (Tensor) – approximate point of transition between linear and logarithmic behavior

Returns:

computed stratified variance

madnis.integrator.unweighting_metrics(weights, channels=None, channel_count=None, replica_count=1000)

Calculate the unweighting efficiency as discussed in arXiv:2001.10028

Parameters:

weights (Tensor) – weights of the samples
channels (Optional[Tensor]) – channel indices of the samples
channel_count (Optional[int]) – number of channels
replica_count (int) – number of replicas, called m in the reference

Return type:

UnweightingMetrics | tuple[UnweightingMetrics, list[UnweightingMetrics]]

Returns:

An UnweightingMetrics object. In the multi-channel case, it also returns a list of UnweightingMetrics objects for all channels.