madnis.integrator package
This module contains functions and classes to train neural importance sampling networks and evaluate the integration and sampling performance.
- class madnis.integrator.Buffer(capacity, shapes, persistent=True, dtypes=None)
Bases: Module
Circular buffer for multiple tensors with different shapes. The class is a torch.nn.Module to allow for simple storage.
- Parameters:
capacity (int) – maximum number of samples stored in the buffer
shapes (list[tuple[int,...]]) – shapes of the tensors to be stored, without batch dimension. If a shape is None, no tensor is stored at that position. This allows for simpler handling of optional stored fields.
persistent (bool) – if True, the content of the buffer is part of the module’s state_dict
dtypes (Optional[list[dtype|None]]) – if different from None, specifies the tensors which have a non-standard dtype
- filter(predicate, batch_size=100000)
Removes samples from the buffer that do not fulfill the criterion given by the predicate function.
- Parameters:
predicate (Callable[[tuple[Tensor|None,...]],Tensor]) – function that returns a mask for a batch of samples, given a tuple with all the buffered fields as argument
batch_size (int) – maximal batch size to limit memory usage
- sample(count)
Returns a batch of samples drawn from the buffer without replacement.
- Parameters:
count (int) – number of samples
- Return type:
list[Tensor|None]
- Returns:
samples drawn from the buffer
- store(*tensors)
Adds the given tensors to the buffer. If the buffer is full, the oldest stored samples are overwritten.
- Parameters:
tensors (Tensor|None) – samples to be stored. The shapes of the tensors after the batch dimension must match the shapes given during initialization. The argument can be None if the corresponding shape was None during initialization.
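The overwrite-oldest behaviour of store and the sampling without replacement of sample can be illustrated with a minimal pure-Python sketch. The class RingBuffer and its attributes are hypothetical and only mimic the documented behaviour for plain Python values; this is not the MadNIS implementation, which stores torch tensors.

```python
import random


class RingBuffer:
    """Toy circular buffer for plain Python values (hypothetical, not the MadNIS class)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = []       # stored samples, at most `capacity` entries
        self.next_slot = 0   # position that the next stored sample will occupy

    def store(self, batch):
        for row in batch:
            if len(self.rows) < self.capacity:
                self.rows.append(row)            # buffer not yet full: grow
            else:
                self.rows[self.next_slot] = row  # full: overwrite the oldest sample
            self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, count):
        # random.sample draws without replacement
        return random.sample(self.rows, min(count, len(self.rows)))


buf = RingBuffer(capacity=3)
buf.store(range(5))      # samples 0 and 1 are overwritten by 3 and 4
drawn = buf.sample(2)    # two distinct samples out of the three stored ones
```

After storing five samples in a buffer of capacity three, only the three most recent samples remain.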
- class madnis.integrator.ChannelData(channel_index, target_index, group, remapped, position_in_group)
Bases: object
Information about a single channel
- Parameters:
channel_index (int) – index of the channel
target_index (int) – index of the channel that it is mapped to
group (ChannelGroup) – channel group that the channel belongs to
remapped (bool) – True if the channel is remapped to another channel
position_in_group (int) – index of the channel within its group
- class madnis.integrator.ChannelGroup(group_index, target_index, channel_indices)
Bases: object
A group of channels
- Parameters:
group_index (int) – index of the group in the list of groups
target_index (int) – index of the channel that all other channels in the group are mapped to
channel_indices (list[int]) – indices of the channels in the group
- class madnis.integrator.ChannelGrouping(channel_assignment)
Bases: object
Class that encodes how channels are grouped together for a multi-channel integrand
- Parameters:
channel_assignment (list[int|None]) – list with one entry per channel. An entry is None if the channel is not remapped, otherwise it is the index of the channel to which it is mapped.
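How a channel_assignment list translates into groups can be sketched in a few lines of plain Python. The helper group_channels is hypothetical and not part of MadNIS. The sketch assumes that a channel with entry None acts as its own target and that every target is itself a channel that is not remapped.

```python
def group_channels(channel_assignment):
    """Group channels by their target (hypothetical helper, not part of MadNIS)."""
    groups = {}
    for channel, target in enumerate(channel_assignment):
        # None means the channel is not remapped, so it is its own target
        key = channel if target is None else target
        groups.setdefault(key, []).append(channel)
    return groups


# channels 1 and 4 are mapped to channel 0, channel 3 is mapped to channel 2
assignment = [None, 0, None, 2, 0]
groups = group_channels(assignment)
# groups == {0: [0, 1, 4], 2: [2, 3]}
```

In the terms used above, each key plays the role of a target_index and each value the role of the channel_indices of one ChannelGroup.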
- class madnis.integrator.Integrand(function, input_dim, bounds=None, channel_count=None, remapped_dim=None, has_channel_weight_prior=False, channel_grouping=None, function_includes_sampling=False, update_active_channels_mask=None, discrete_dims=[], discrete_dims_position='first', discrete_prior_prob_function=None)
Bases: Module
Class that wraps an integrand function and meta-data necessary to use advanced MadNIS features like learnable multi-channel weights, grouped channels and channel weight priors.
- Parameters:
function (Callable) – integrand function. The signature depends on the other arguments:
  - single-channel integration, channel_count=None: x -> f
  - basic multi-channel integration, remapped_dim=None, has_channel_weight_prior=False: (x, c) -> f
  - with channel weights, remapped_dim=None, has_channel_weight_prior=True: (x, c) -> (f, alpha) (no trainable channel weights possible)
  - with channel-dependent mapping, remapped_dim: int, has_channel_weight_prior=False: (x, c) -> (f, y)
  - all features, remapped_dim: int, has_channel_weight_prior=True: (x, c) -> (f, y, alpha)
with the following tensors:
  - x is a point generated by the importance sampling, shape (n, input_dim),
  - c is the channel index, shape (n, ),
  - f is the integrand value, shape (n, ),
  - y is the point after applying a channel-dependent mapping, shape (n, remapped_dim),
  - alpha is the prior channel weight, shape (n, channel_count).
input_dim (int) – dimension of the integration space
bounds (Optional[list[list[float]]]) – List of pairs [lower bound, upper bound] of the integration interval for all dimensions. The integrand is rescaled so that the MadNIS training can be performed on the unit hypercube. If None, the unit hypercube is used as integration domain.
channel_count (Optional[int]) – None in the single-channel case, specifies the number of channels otherwise.
remapped_dim (Optional[int]) – If different from None, it gives the dimension of a remapped space, with a channel-dependent mapping computed as part of the integrand function.
has_channel_weight_prior (bool) – If True, the integrand returns channel weights
channel_grouping (Optional[ChannelGrouping]) – ChannelGrouping object or None if all channels are independent
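The rescaling to the unit hypercube mentioned for bounds amounts to an affine change of variables whose Jacobian is the volume of the integration box. The following sketch shows the idea in plain Python. The helper rescale_to_unit_cube is hypothetical, works on lists of floats instead of tensors, and is not the MadNIS implementation.

```python
import math


def rescale_to_unit_cube(function, bounds):
    """Return g(u) on [0, 1]^d with the same integral as function(x) on the box."""
    volume = math.prod(hi - lo for lo, hi in bounds)

    def rescaled(u):
        # affine map from the unit hypercube to the integration box
        x = [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, bounds)]
        # the Jacobian of the affine map is the volume of the box
        return function(x) * volume

    return rescaled


bounds = [[0.0, 2.0], [1.0, 3.0]]
g = rescale_to_unit_cube(lambda x: x[0] * x[1], bounds)

# midpoint rule on a 10 x 10 grid in the unit square as a deterministic check
n = 10
mids = [(i + 0.5) / n for i in range(n)]
estimate = sum(g([u, v]) for u in mids for v in mids) / n**2
```

The exact integral of x[0] * x[1] over this box is 2 * 4 = 8, which the midpoint rule reproduces here because the integrand is linear in each variable.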
- forward(x, channels)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type:
tuple[Tensor,Tensor|None,Tensor|None]
- remap_channels(channels)
Remaps channel indices to the indices of their respective channel groups if a ChannelGrouping object was provided, otherwise returns the indices unchanged.
- Parameters:
channels (Tensor|int) – channel indices, tensor with shape (n, ) or integer
- Return type:
Tensor|int
- Returns:
remapped channel indices, tensor with shape (n, ) or integer
- class madnis.integrator.IntegrationMetrics(integral, count, error, rel_error, rel_stddev, rel_stddev_opt, channel_integrals, channel_counts, channel_errors, channel_rel_errors, channel_rel_stddevs)
Bases: object
Metrics for the integration performance
- Parameters:
integral (float) – total integration result
count (int) – number of integration samples
error (float) – Monte Carlo integration error
rel_error (float) – relative integration error
rel_stddev (float) – relative standard deviation (does not scale with number of samples)
rel_stddev_opt (float) – optimal relative standard deviation that could be achieved with stratified sampling
channel_integrals (list[float]) – channel-wise integrals
channel_counts (list[int]) – channel-wise number of samples
channel_errors (list[float]) – channel-wise integration errors
channel_rel_errors (list[float]) – channel-wise relative integration errors
channel_rel_stddevs (list[float]) – channel-wise relative standard deviations
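The relation between error, rel_error and rel_stddev can be illustrated with the standard single-channel Monte Carlo estimators. The sketch below assumes integration weights w_i = f(x_i) / q(x_i) and the unbiased sample variance. The function mc_metrics is hypothetical, and the exact conventions used by MadNIS (for example the variance normalization) may differ.

```python
import math


def mc_metrics(weights):
    """Standard Monte Carlo estimates from integration weights (hypothetical helper)."""
    n = len(weights)
    integral = sum(weights) / n
    # unbiased sample variance of the weights
    variance = sum((w - integral) ** 2 for w in weights) / (n - 1)
    stddev = math.sqrt(variance)
    error = stddev / math.sqrt(n)  # shrinks like 1 / sqrt(n)
    return {
        "integral": integral,
        "count": n,
        "error": error,
        "rel_error": error / abs(integral),
        # property of the weight distribution, does not shrink with n
        "rel_stddev": stddev / abs(integral),
    }


metrics = mc_metrics([1.0, 2.0, 3.0, 4.0])
```

With these definitions, rel_error equals rel_stddev divided by the square root of the sample count, which is why rel_stddev is the more useful number for comparing samplers evaluated with different sample sizes.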
- class madnis.integrator.Integrator(integrand, dims=0, flow=None, flow_kwargs={}, discrete_flow_kwargs={}, discrete_model='made', train_channel_weights=True, cwnet=None, cwnet_kwargs={}, loss=None, optimizer=None, batch_size=1024, batch_size_per_channel=0, learning_rate=0.001, scheduler=None, uniform_channel_ratio=1.0, integration_history_length=20, drop_zero_integrands=False, batch_size_threshold=0.5, buffer_capacity=0, minimum_buffer_size=50, buffered_steps=0, max_stored_channel_weights=None, channel_dropping_threshold=0.0, channel_dropping_interval=100, channel_grouping_mode='none', freeze_cwnet_iteration=None, device=None, dtype=None)
Bases: Module
Implements MadNIS training and integration logic. MadNIS integrators are torch modules, so their state can easily be saved and loaded using the torch.save and torch.load functions.
- Parameters:
integrand (Callable[[Tensor],Tensor] | Integrand) – the function to be integrated. In the case of a simple single-channel integration, the integrand function can directly be passed to the integrator. In more complicated cases, like multi-channel integrals, use the Integrand class.
dims (int) – dimension of the integration space. Only required if a simple function is given as integrand.
flow (Optional[Distribution]) – sampling distribution used for the integration. If None, a flow is constructed using the Flow class. Otherwise, it has to be compatible with a normalizing flow, i.e. have the interface defined in the Distribution class.
flow_kwargs (dict[str,Any]) – If flow is None, these keyword arguments are passed to the Flow constructor.
discrete_flow_kwargs (dict[str,Any]) – If flow is None, these keyword arguments are passed to the MixedFlow or DiscreteMADE constructor.
train_channel_weights (bool) – If True, construct a channel weight network and train it. Only necessary if cwnet is None.
cwnet (Optional[Module]) – network used for the trainable channel weights. If None and train_channel_weights is True, the cwnet is built using the MLP class.
cwnet_kwargs (dict[str,Any]) – If cwnet is None and train_channel_weights is True, these keyword arguments are passed to the MLP constructor.
loss (Optional[Callable[[Tensor,Tensor,Tensor|None,Tensor|None],Tensor]]) – Loss function used for training. If not provided, the KL divergence is chosen in the single-channel case and the stratified variance is chosen in the multi-channel case.
optimizer (Union[Optimizer,Callable[[Iterable[Parameter]],Optimizer],None]) – optimizer for the training. Can be an optimizer object or a function that is called with the model parameters as argument and returns the optimizer. If None, the Adam optimizer is used.
batch_size (int) – Training batch size
batch_size_per_channel (int) – used to compute the batch size as a function of the number of active channels, batch_size + n_active_channels * batch_size_per_channel
learning_rate (float) – learning rate used for the Adam optimizer
scheduler (Union[LRScheduler,Callable[[Optimizer],LRScheduler],None]) – learning rate scheduler for the training. Can be a learning rate scheduler object or a function that gets the optimizer as argument and returns the scheduler. If None, a constant learning rate is used.
uniform_channel_ratio (float) – fraction of the samples in each batch that is distributed equally between all channels. The value has to be between 0 and 1.
integration_history_length (int) – number of batches for which the channel-wise means and variances are stored. This is used for stratified sampling during integration, and during the training if uniform_channel_ratio is different from one.
drop_zero_integrands (bool) – If True, points with integrand zero are dropped and not used for the optimization.
batch_size_threshold (float) – New samples are drawn until the number of samples is at least batch_size_threshold * batch_size.
buffer_capacity (int) – number of samples that are stored for buffered training
minimum_buffer_size (int) – minimal size of the buffer to run buffered training
buffered_steps (int) – number of optimization steps on buffered samples after every online training step
max_stored_channel_weights (Optional[int]) – number of prior channel weights that are buffered for each sample. If None, all prior channel weights are saved, otherwise only those for the channels with the largest contributions.
channel_dropping_threshold (float) – all channels with a cumulative contribution to the integrand that is smaller than this threshold are dropped
channel_dropping_interval (int) – number of training steps after which channel dropping is performed
channel_grouping_mode (Literal['none','uniform','learned']) – If “none”, all channels are treated as separate channels in the loss and integration, even when they are grouped together. If “uniform”, the channels within each group are sampled with equal probability. If “learned”, a discrete normalizing flow is used to sample the channel index within a group.
freeze_cwnet_iteration (Optional[int]) – If not None, specifies the training iteration after which the channel weight network is frozen
device (Optional[device]) – torch device used for training and integration. If None, use default device.
dtype (Optional[dtype]) – torch dtype used for training and integration. If None, use default dtype.
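The formula given for batch_size_per_channel means that the training batch shrinks when channels are dropped. A small sketch of this arithmetic follows. The function training_batch_size is hypothetical and only evaluates the documented expression; the numbers are made up for illustration.

```python
def training_batch_size(batch_size, batch_size_per_channel, n_active_channels):
    """Evaluate batch_size + n_active_channels * batch_size_per_channel."""
    return batch_size + n_active_channels * batch_size_per_channel


# illustrative numbers: 100 channels at the start, 40 left after channel dropping
initial = training_batch_size(1024, 16, 100)
after_dropping = training_batch_size(1024, 16, 40)
```

With the default batch_size_per_channel=0 the batch size is independent of the number of active channels.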
- integral()
Returns the current estimate of the integral based on previous training iterations and calls to the integrate function.
- Return type:
tuple[float,float]
- Returns:
tuple with the value of the integral and the MC integration error
- integrate(n, batch_size=100000)
Draws samples and computes the integral.
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
- Return type:
tuple[float,float]
- Returns:
tuple with the value of the integral and the MC integration error
- integration_metrics(n, batch_size=100000)
Draws samples and computes metrics for the total and channel-wise integration quality.
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
- Return type:
IntegrationMetrics
- Returns:
IntegrationMetrics object, see its documentation for details
- sample(n, batch_size=100000, channel_weight_mode='variance', channel=None, evaluate_integrand=True)
Draws samples and computes their integration weight
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform','mean','variance']) – specifies whether the channels are weighted by their mean, variance or uniformly. Note that weighting by mean can lead to problems for non-positive functions
channel (Optional[int]) – if different from None, samples are only generated for this channel
- Return type:
SampleBatch
- Returns:
SampleBatch object, see its documentation for details
- train(steps, callback=None, capture_keyboard_interrupt=False)
Performs multiple training steps
- Parameters:
steps (int) – number of training steps
callback (Optional[Callable[[TrainingStatus],None]]) – function that is called after each training step with the training status as argument
capture_keyboard_interrupt (bool) – If True, a keyboard interrupt does not raise an exception. Instead, the current training step is finished and the training is aborted afterwards.
- unweighting_metrics(n, batch_size=100000, channel_weight_mode='mean')
Draws samples and computes metrics for the total and channel-wise unweighting performance. This function is only suitable for functions that are non-negative everywhere.
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform','mean','variance']) – specifies whether the channels are weighted by their mean, variance or uniformly.
- Return type:
UnweightingMetrics
- Returns:
UnweightingMetrics object, see its documentation for details
- class madnis.integrator.SampleBatch(x, y, q_sample, func_vals, channels, alphas_prior=None, alpha_channel_indices=None, integration_channels=None, weights=None, alphas=None, zero_counts=None)
Bases: object
Contains a batch of samples
- Parameters:
x (Tensor) – samples generated by the flow, shape (n, dim)
y (Tensor|None) – remapped samples returned by the integrand, shape (n, remapped_dim)
q_sample (Tensor) – probabilities of the samples, shape (n, )
func_vals (Tensor) – integrand value, shape (n, )
channels (Tensor|None) – channel indices for multi-channel integration, shape (n, ), otherwise None
alphas_prior (Optional[Tensor]) – prior channel weights, shape (n, channels), or None for single-channel integration
alpha_channel_indices (Optional[Tensor]) – channel indices if not all prior channel weights are stored, otherwise None
integration_channels (Optional[Tensor]) – index of the channel group in case the integration is performed at the level of channel groups, shape (n, ), otherwise None
weights (Optional[Tensor]) – integration weight, shape (n, ). Only set when returned from the Integrator.sample function, otherwise None.
alphas (Optional[Tensor]) – channel weights including learned correction, shape (n, channels). Only set when returned from the Integrator.sample function, otherwise None.
zero_counts (Optional[Tensor]) – channel-wise counts of samples with zero weights that are not included in the batch, shape (channels, ). This field is ignored by most methods, as it does not have the batch size as its first dimension
- static cat(batches)
Concatenates multiple batches. If the field zero_counts is not None, the zero_counts of all batches are added.
- Parameters:
batches (Iterable[SampleBatch]) – Iterable over SampleBatch objects
- Return type:
SampleBatch
- Returns:
New SampleBatch object containing the concatenated batches
- map(func)
Applies a function to all fields in the batch that are not None and returns a new SampleBatch
- Parameters:
func (Callable[[Tensor],Tensor]) – function that is applied to all fields in the batch. Expects a tensor as argument and returns a new tensor
- Return type:
SampleBatch
- Returns:
Transformed SampleBatch
- class madnis.integrator.TrainingStatus(step, loss, buffered, learning_rate, dropped_channels)
Bases: object
Contains the MadNIS training status to pass it to a callback function.
- Parameters:
step (int) – optimization step
loss (float) – loss from the optimization step
buffered (bool) – whether the optimization was performed on buffered samples
learning_rate (float|None) – current learning rate if learning rate scheduler is present
dropped_channels (int) – number of channels dropped after this optimization step
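A callback only needs to read the fields listed above. The following sketch defines a callable that records the losses of online (non-buffered) steps and counts dropped channels. To keep it self-contained, it is exercised with StatusStandIn, a hypothetical dataclass that merely mirrors the documented fields of TrainingStatus. The class LossLogger and all numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusStandIn:
    """Hypothetical stand-in with the same fields as TrainingStatus."""
    step: int
    loss: float
    buffered: bool
    learning_rate: Optional[float]
    dropped_channels: int


class LossLogger:
    """Callback that records online losses and counts dropped channels."""

    def __init__(self):
        self.online_losses = []
        self.total_dropped = 0

    def __call__(self, status):
        # skip optimization steps that were performed on buffered samples
        if not status.buffered:
            self.online_losses.append((status.step, status.loss))
        self.total_dropped += status.dropped_channels


logger = LossLogger()
for status in [
    StatusStandIn(0, 2.5, False, 1e-3, 0),
    StatusStandIn(1, 2.1, True, 1e-3, 0),
    StatusStandIn(2, 1.8, False, 1e-3, 3),
]:
    logger(status)
```

With a real integrator, an object like logger would be passed as the callback argument of Integrator.train, which calls it after each training step.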
- class madnis.integrator.UnweightingMetrics(cut_eff, uweff_before_cuts, uweff_before_cuts_partial, uweff_after_cuts, uweff_after_cuts_partial, over_weight_rate)
Bases: object
Metrics for the unweighting performance
- Parameters:
cut_eff (float) – cut efficiency
uweff_before_cuts (float) – unweighting efficiency before cuts (computed as uweff_after_cuts * cut_eff)
uweff_before_cuts_partial (float) – unweighting efficiency without over-weights before cuts (computed as uweff_after_cuts_partial * cut_eff)
uweff_after_cuts (float) – unweighting efficiency after cuts
uweff_after_cuts_partial (float) – unweighting efficiency without over-weights after cuts
over_weight_rate (float) – fraction of over-weight samples
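The relation uweff_before_cuts = uweff_after_cuts * cut_eff can be checked with the textbook definition of the unweighting efficiency as mean weight divided by maximum weight. The sketch below rests on two assumptions that are not taken from MadNIS: samples failing the cuts are those with weight zero, and the plain maximum weight is used instead of a replica-based estimate. The function simple_unweighting_metrics is hypothetical and expects non-negative weights with at least one non-zero entry.

```python
def simple_unweighting_metrics(weights):
    """Textbook unweighting efficiencies with the plain maximum weight (a sketch)."""
    n = len(weights)
    # assumption: samples that fail the cuts carry weight zero
    nonzero = [w for w in weights if w != 0.0]
    w_max = max(nonzero)
    cut_eff = len(nonzero) / n
    # mean weight of the surviving samples relative to the maximum weight
    uweff_after_cuts = sum(nonzero) / len(nonzero) / w_max
    return {
        "cut_eff": cut_eff,
        "uweff_after_cuts": uweff_after_cuts,
        "uweff_before_cuts": uweff_after_cuts * cut_eff,
    }


weights = [0.0, 0.0, 1.0, 2.0, 1.0, 4.0]
result = simple_unweighting_metrics(weights)
```

In this example four of six samples survive, the efficiency after cuts is 0.5, and the efficiency before cuts equals the mean over all six weights divided by the maximum weight.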
- class madnis.integrator.VegasPreTraining(integrator, bins=64, damping=0.7)
Bases: object
Implements VEGAS pre-training. It wraps around an Integrator object and uses its integrand, sample buffer and integration history. In addition, it also defines the functions sample, integrate, integration_metrics and unweighting_metrics to allow for comparisons between VEGAS and MadNIS.
- initialize_integrator()
Initializes the flows in the integrator object using the trained VEGAS grid
- integrate(n)
Draws samples and computes the integral.
- Parameters:
n (int) – number of samples
batch_size – batch size used for sampling and calling the integrand
- Return type:
tuple[float,float]
- Returns:
tuple with the value of the integral and the MC integration error
- integration_metrics(n, batch_size=100000)
Draws samples and computes metrics for the total and channel-wise integration quality.
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
- Return type:
IntegrationMetrics
- Returns:
IntegrationMetrics object, see its documentation for details
- sample(n, batch_size=100000, channel_weight_mode='variance', channel=None, evaluate_integrand=True)
Draws samples and computes their integration weight
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform','mean','variance']) – specifies whether the channels are weighted by their mean, variance or uniformly. Note that weighting by mean can lead to problems for non-positive functions
channel (Optional[int]) – if different from None, samples are only generated for this channel
evaluate_integrand (bool) – If False, the integrand is not evaluated and func_vals, y and alphas are set to dummy values. This can be used when the integrand is expensive and one only needs the mapping Jacobian.
- Return type:
SampleBatch
- Returns:
SampleBatch object, see its documentation for details
- train(samples_per_channel, callback=None, capture_keyboard_interrupt=False)
Performs multiple training steps
- Parameters:
samples_per_channel (list[int]) – list of the number of samples per channel, with one entry for every training iteration
callback (Optional[Callable[[VegasTrainingStatus],None]]) – function that is called after each training step with the training status as argument
capture_keyboard_interrupt (bool) – If True, a keyboard interrupt does not raise an exception. Instead, the current training step is finished and the training is aborted afterwards.
- train_step(samples_per_channel)
Performs a single VEGAS training iteration
- Parameters:
samples_per_channel (int) – number of training samples per channel
- Return type:
VegasTrainingStatus
- Returns:
VegasTrainingStatus object containing metrics of the training progress
- unweighting_metrics(n, batch_size=100000, channel_weight_mode='mean')
Draws samples and computes metrics for the total and channel-wise unweighting performance. This function is only suitable for functions that are non-negative everywhere.
- Parameters:
n (int) – number of samples
batch_size (int) – batch size used for sampling and calling the integrand
channel_weight_mode (Literal['uniform','mean','variance']) – specifies whether the channels are weighted by their mean, variance or uniformly.
- Return type:
UnweightingMetrics
- Returns:
UnweightingMetrics object, see its documentation for details
- madnis.integrator.integration_metrics(channel_means, channel_variances, channel_counts)
Calculate metrics for the integration performance
- Parameters:
channel_means (Tensor) – channel-wise integrals
channel_variances (Tensor) – channel-wise variances
channel_counts (Tensor) – channel-wise sample counts
- Return type:
IntegrationMetrics
- Returns:
An IntegrationMetrics object
- madnis.integrator.kl_divergence(f_true, q_test, q_sample)
Computes the Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample.
- Parameters:
f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case
- Return type:
Tensor
- Returns:
computed KL divergence
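The role of the additional factor of q_sample can be seen in a plain-Python version of the importance-sampled estimator: for points drawn from q_sample, the average of f_true / q_sample * log(f_true / q_test) estimates the integral of f_true * log(f_true / q_test). The function kl_divergence_estimate below is a hypothetical sketch working on lists of floats. It is not the MadNIS implementation, which operates on tensors and may treat zeros and normalization differently.

```python
import math


def kl_divergence_estimate(f_true, q_test, q_sample):
    """Importance-sampled estimate of KL(f_true || q_test) (hypothetical sketch)."""
    total = 0.0
    for f, qt, qs in zip(f_true, q_test, q_sample):
        if f > 0.0:  # terms with f = 0 contribute nothing to the sum
            total += f / qs * math.log(f / qt)
    return total / len(f_true)


# Deterministic stand-in for uniform samples on [0, 1]: midpoints of n bins.
n = 1000
xs = [(i + 0.5) / n for i in range(n)]
f_true = [2.0 * x for x in xs]   # normalized density f(x) = 2x
q_sample = [1.0] * n             # points are spread uniformly, q_sample = 1

kl_uniform = kl_divergence_estimate(f_true, [1.0] * n, q_sample)  # q_test uniform
kl_perfect = kl_divergence_estimate(f_true, f_true, q_sample)     # q_test = f_true
```

For this example the exact value for a uniform q_test is log(2) - 1/2, and the estimate vanishes when q_test matches f_true.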
- madnis.integrator.kl_divergence_softclip(f_true, q_test, q_sample, threshold=30.0)
Computes the Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample. A soft clipping function is applied to the sample weights.
- Parameters:
f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case
threshold (Tensor) – approximate point of transition between linear and logarithmic behavior
- Return type:
Tensor
- Returns:
computed KL divergence
- madnis.integrator.multi_channel_loss(loss)
Turns a single-channel loss function into a multi-channel loss function by evaluating it for each channel separately and then adding them weighted by TODO weighted by what?
- Parameters:
loss (Callable[[Tensor,Tensor,Tensor],Tensor]) – single-channel loss function that expects the integrand value, test probability and sampling probability as arguments
- Return type:
Callable[[Tensor,Tensor,Tensor|None,Tensor|None],Tensor]
- Returns:
multi-channel loss function that expects the integrand value, test probability and, optionally, sampling probability and channel indices as arguments.
- madnis.integrator.rkl_divergence(f_true, q_test, q_sample)
Computes the reverse Kullback-Leibler divergence for two given sets of probabilities, f_true and q_test. It uses importance sampling, i.e. the estimator is divided by an additional factor of q_sample.
- Parameters:
f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Tensor) – sampling probability
channels – channel indices or None in the single-channel case
- Return type:
Tensor
- Returns:
computed reverse KL divergence
- madnis.integrator.stratified_variance(f_true, q_test, q_sample=None, channels=None)
Computes the stratified variance as introduced in [2311.01548] for two given sets of probabilities, f_true and q_test. It uses importance sampling with a sampling probability specified by q_sample.
- Parameters:
f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Optional[Tensor]) – sampling probability
channels (Optional[Tensor]) – channel indices or None in the single-channel case
- Returns:
computed stratified variance
- madnis.integrator.stratified_variance_softclip(f_true, q_test, q_sample=None, channels=None, threshold=30.0)
Computes the stratified variance as introduced in [2311.01548] for two given sets of probabilities, f_true and q_test. It uses importance sampling with a sampling probability specified by q_sample. A soft clipping function is applied to the sample weights.
- Parameters:
f_true (Tensor) – normalized integrand values
q_test (Tensor) – estimated function/probability
q_sample (Optional[Tensor]) – sampling probability
channels (Optional[Tensor]) – channel indices or None in the single-channel case
threshold (Tensor) – approximate point of transition between linear and logarithmic behavior
- Returns:
computed stratified variance
- madnis.integrator.unweighting_metrics(weights, channels=None, channel_count=None, replica_count=1000)
Calculate the unweighting efficiency as discussed in arXiv:2001.10028
- Parameters:
weights (Tensor) – weights of the samples
channels (Optional[Tensor]) – channel indices of the samples
channel_count (Optional[int]) – number of channels
replica_count (int) – number of replicas, called m in the reference
- Return type:
UnweightingMetrics|tuple[UnweightingMetrics,list[UnweightingMetrics]]
- Returns:
An UnweightingMetrics object. In the multi-channel case, it also returns a list of UnweightingMetrics objects for all channels.