Bounds on Representation and Arithmetic Propagation Errors for Dirac Mixture Representations

General

General

Signaloid's UxHw technology supports multiple families of mechanisms for representing and performing arithmetic on probability distributions. One of the families of representations supported by UxHw is the family of Dirac mixture representations, with the telescoping torques representation (TTR) being one member of the family. Because all the representations supported by Signaloid's UxHw technology can be used to represent any parametric/analytic or empirical distribution, while at the same time being limited in representation size to a bounded number of bits, the representations are by definition approximations. When approximating mathematical objects such as functions, measures, or sets, it is important to assess the quality of the approximation and how it behaves under various transformations. Typically, this involves selecting a family of objects that meet certain criteria, along with a metric to quantify the "closeness" between them. In the case of Signaloid's Dirac mixture representations, the objects they approximate are probability distributions, usually with finite means, and the metric used to gauge their proximity is the Wasserstein distance. The transformations under consideration range from basic operations such as addition and multiplication, to more complex processes, such as those defined by numerical schemes for stochastic differential equations (SDEs).

Figure 1: Normal distribution (black, dashed) and the binned representation (green bars) of the TTR representation comprising 128 Dirac deltas.

Why It Matters

Quantifying the error of representation and stability under transformation of an approximation scheme, such as the Dirac mixture representation for approximating probability distributions, is an important prerequisite to benchmarking its performance and to identifying scenarios where it outperforms existing methods. To quantify error requires a reference to compare to, and in the case of approximations for representations and arithmetic for probability distributions, the obvious reference to which to compare is the Monte Carlo method.

The Technical Details

Wasserstein Metric

The Wasserstein metric is a family of metrics that originates from the mathematical field of optimal transportation theory. Informally, this distance between two probability distributions measures the minimal cost of transporting one distribution into the other. This cost relates to the geometry of the underlying space and is given by a distance or a norm. A transport plan or coupling between the distributions, is a function that transports one distribution into the other.

Mathematically, we may define this metric in the following way. Consider a metric space (Ω,d), let μ and ν be probability measures and let Π(μ,ν) be all the possible couplings between the probability measures, all defined over that same metric space. Then, then the family of Wasserstein distances between μ and ν, with underlying metric d is given by

Wasserstein-1 Distance

In general, it is not possible to obtain the optimal coupling between two distributions, thus it is not possible to obtain the Wasserstein distance between them. However, in the one-dimensional case, i.e., when dealing with probability distribution of random variables, the optimal coupling has an explicit expression:

It is possible to compute the Wasserstein-1 distance between one-dimensional probability measures in terms of their respective cumulative distribution function (CDF) while only requiring that the distributions have finite mean.

Representation Error of TTR

Let μT be the target distribution that we are approximating according to the W1 distance. For the TTR distribution, the error depends on the number of Dirac deltas, Ndd. Let μTTR be the TTR distribution approximation. Then the distance to the target distribution is

The representation error of TTR distribution decreases as the reciprocal of its representation size, Ndd.

In the case of Monte Carlo, because the samples used to represent a distribution are randomly generated, each set of Ns samples will result in a slightly different representation error. We therefore use the expected value of the Wasserstein distance to quantify the representation error. This implies that we will have to generate a number of sets of Ns random samples and then extract the average error. Let Ns be the number of random samples used for a single set of samples. Let μMC be the Monte Carlo distribution approximation. Then,

Because the representation error for TTR decreases as the reciprocal of Ndd while the representation error of Monte Carlo decreases as the reciprocal of the square root of Ns, to achieve a given target approximation error, the Monte Carlo method requires a significantly larger number of samples compared to the TTR representation.

Operation Propagation Errors

Operations using TTR representation first perform the operation (e.g., addition) on all possible pairs of Dirac deltas in the Dirac mixtures which form the operands; starting from operands each represented with Ndd Dirac deltas, this first step results in Ndd2 Dirac deltas. Next, the operation reduces these Ndd2 Dirac deltas back down to Ndd using the TTR algorithm [2].

Because there are no representation errors introduced in performing arithmetic operations on Dirac deltas, the error of an arithmetic operation (e.g., addition) will simply be the representation error of the individual TTR representations (e.g., of the summands in the case of addition).

The Takeaway

The telescoping torques representation (TTR) is one of the representations provided by Signaloid's UxHw technology. Like all representations that can be used to represent any parametric/analytic or empirical distribution while at the same time being limited in representation size to a bounded number of bits, TTR is by definition an approximation. The Wasserstein-1 metric provides an effective way to quantify and compare the quality of approximations to an unknown probability distribution. This technology explainer detailed how TTR outperforms the Monte Carlo method in terms of accuracy for a given number of discrete points.

References
  1. F. Santambrogio (2015). "Wasserstein distances and curves in the Wasserstein spaces". In Optimal Transport for Applied Mathematicians. Progress in Nonlinear Differential Equations and Their Applications, vol 87. Birkhäuser, Cham. Doi: 10.1007/978-3-319-20828-2_5.

  2. Tsoutsouras, Vasileios, Orestis Kaparounakis, Bilgesu Bilgin, Chatura Samarakoon, James Meech, Jan Heck, and Phillip Stanley-Marbell. "The Laplace Microarchitecture for Tracking Data Uncertainty and its Implementation in a RISC-V Processor." In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1254-1269. 2021.

  3. E. del Barrio, E. Giné, C. Matrán (1999). "Central Limit Theorems for the Wasserstein Distance Between the Empirical and the True Distributions," The Annals of Probability, 27(2), 1009-1071.

Schedule a Demo Call
Request Whitepaper
Schedule a Demo Call
Request Whitepaper
Schedule a Demo Call
Request Whitepaper