Easily Implementing Gaussian Processes with Uncertain Inputs, Using Signaloid's UxHw Technology

Gaussian processes are machine learning models that intrinsically model uncertainty along with the underlying process [1]. Practitioners and researchers use Gaussian processes in applications where well-calibrated uncertainty is useful. An example of such an application is Bayesian Optimization (BO) [2], which uses information on the uncertainty of values to efficiently optimize expensive-to-evaluate expressions such as those appearing in models of fluid dynamics [3]. When the exact input to a Gaussian process is known, it is possible to calculate its predicted output distribution exactly, using known techniques in the research literature. When there is uncertainty in the inputs, there are no known analytical closed-form solutions [4] for the distribution of the output; an evaluation of a Gaussian process with uncertain inputs must then usually resort to approximations such as Monte Carlo evaluation of the Gaussian process. Signaloid's UxHw technology makes it possible to compute this difficult-to-compute output distribution directly [5]. The Pareto plot below shows how UxHw can also be orders-of-magnitude faster than Monte Carlo simulation regardless of the required accuracy (as measured by the Wasserstein distance). This technology explainer provides an overview of how Signaloid's UxHw technology makes it easy to evaluate Gaussian processes with uncertain inputs and to do so much faster and more accurately than on any competing computing platform.
Figure 1: A comparison of the Pareto frontiers of uncertain Gaussian process predictions, comparing an implementation run on a platform with Signaloid's UxHw technology against a Monte Carlo evaluation run on a traditional computing platform. The evaluations use a simple benchmark problem (Figure 2 below). The results show that, at any chosen accuracy (Wasserstein distance; lower is better), the implementation running on the Signaloid-UxHw-accelerated platform is faster. Furthermore, the UxHw implementation does not suffer from the large run-to-run variations in accuracy that repeated Monte Carlo executions of the same model exhibit, visible as the high variance in accuracy across the sets of Monte Carlo evaluations.
Why It Matters
Gaussian processes are indispensable in machine learning applications where well-calibrated uncertainty in a model's output is useful, for example when the model controls a physical system such as a robot arm. When there is uncertainty in the inputs, there are no known analytical closed-form solutions for the distribution of the output, and practitioners must resort to approximations such as Monte Carlo methods, which are non-trivial to implement and slow to execute. Signaloid's UxHw technology, combined with a useful technique borrowed from the machine learning literature (the so-called "reparameterization trick" [6]), makes it easy to implement Gaussian process prediction with uncertain inputs and delivers performance over 100-fold faster than running the equivalent Monte Carlo evaluation on hardware with similar resources.
The Technical Details
The equations for the mean and variance of the output of a Gaussian process are known, exactly, in the research literature, but they are only valid when the input is a known point value. Consider instead an uncertain input (i.e., an input that has a distribution), as the bottom panel of the figure below shows. The input could be any possible value on the horizontal axis; since there is no single input point, there is no single Gaussian distribution that is the output of the Gaussian process. Instead, the true output distribution is the cumulative effect of the different Gaussian distributions that would result from each of the possible input points, weighted by their probability of occurring. There are no known closed-form analytical solutions to this problem, even for simple cases.
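For reference, for a point-valued input $x_*$, training inputs $X$, training targets $\mathbf{y}$, kernel $k$ with kernel matrix $K = k(X, X)$, and noise variance $\sigma_n^2$, the standard predictive equations (following, e.g., [1]) are

$$\mu(x_*) = k(x_*, X)\left[K + \sigma_n^2 I\right]^{-1}\mathbf{y},$$
$$\sigma^2(x_*) = k(x_*, x_*) - k(x_*, X)\left[K + \sigma_n^2 I\right]^{-1}k(X, x_*).$$

These are precisely the equations that the uncertain-input case invalidates: they assume $x_*$ is a single point.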
To find a good approximation of the output of a Gaussian process with an uncertain input, practitioners often resort to Monte Carlo evaluation: repeatedly sampling from the input distribution and passing these samples through the Gaussian process to obtain samples from the output distribution, then building the output distribution from those samples. The accuracy and precision of the Monte Carlo evaluation depend on the number of samples passed through the Gaussian process: the more samples used, the closer the estimated output distribution will be to the true output distribution.
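As an illustration, the following minimal C sketch shows what such a Monte Carlo evaluation could look like. The kernel hyperparameters, the single training point, and the precomputed weights inside gpMean and gpVariance are illustrative placeholders, not the benchmark model used in this explainer:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Illustrative single-training-point GP with squared-exponential kernel
 * k(x, x') = exp(-(x - x')^2 / 2); the weight 0.8 stands in for K^{-1} y
 * and the factor 0.9 for [K + noise]^{-1}. Placeholder values only.
 */
static double kernel(double a, double b)
{
	double d = a - b;
	return exp(-0.5 * d * d);
}

static double gpMean(double x)
{
	return 0.8 * kernel(x, 0.0);
}

static double gpVariance(double x)
{
	double k = kernel(x, 0.0);
	return kernel(x, x) - 0.9 * k * k;
}

/* One standard-normal sample via the Box-Muller transform. */
static double sampleStandardNormal(void)
{
	double u1 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
	double u2 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
	return sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
}

int main(void)
{
	const double inputMean = 0.5, inputStd = 0.2;
	const int numSamples = 100000;
	double sum = 0.0;

	for (int i = 0; i < numSamples; i++)
	{
		/* Sample an input, evaluate the GP there, sample the output. */
		double x = inputMean + inputStd * sampleStandardNormal();
		sum += gpMean(x) + sqrt(gpVariance(x)) * sampleStandardNormal();
	}

	printf("Monte Carlo estimate of the output mean: %lf\n", sum / numSamples);

	return 0;
}
```

In a real application the samples would be accumulated into a histogram or empirical distribution rather than a running mean; the loop structure, and its cost growing linearly with the number of samples, stays the same.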
As an alternative to iterative sample-based Monte Carlo evaluations, algorithms running on computing platforms implementing Signaloid's UxHw technology can take advantage of UxHw's ability to perform arithmetic on probability distributions. In the context of the Gaussian process with uncertain input example, an algorithm implementing the Gaussian process prediction can directly compute the output distribution without resorting to Monte Carlo. The code snippet below highlights the key idea (you can find the full source code implementation in the accompanying repository [7]). The algorithm builds on a technique from the machine learning literature called the "reparameterization trick" [6], which separates the mean and the variance of a Gaussian distribution from its stochastic component. The reparameterization trick, combined with the fact that UxHw-enhanced computing platforms propagate the input distribution through all arithmetic performed on that input, allows the implementation to use the standard equations for the mean and variance of a Gaussian process output to compute the output for an uncertain input.
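Concretely, the reparameterization trick expresses a draw $y$ from the predictive Gaussian as

$$y = \mu(x) + \sigma(x)\,\varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, 1),$$

so that the deterministic quantities $\mu(x)$ and $\sigma(x)$, which UxHw can evaluate directly on the distribution-valued input $x$, are cleanly separated from the stochastic component $\varepsilon$.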
Using the Gaussian process and uncertain input in the figure above, we can compute the output distribution using Signaloid's UxHw API or Monte Carlo simulation. The following figures show example results from such experiments. Comparing the outputs from Signaloid's UxHw API and the Monte Carlo simulation to the ground-truth distribution shows that both capture the same output distribution; the Monte Carlo evaluation, however, took approximately 100× longer.
Figure 2: The input distribution (bottom) and the Gaussian process (top) used as an example in this technology explainer. The example evaluates the output distribution by passing the input distribution (bottom) through the Gaussian process (top). The triptych below shows examples of the output distributions.
Relevant Code Example
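The sketch below condenses the key idea into a minimal, self-contained form; the full implementation is in the accompanying repository [7]. The training inputs and the precomputed quantities (alpha = K⁻¹y and K⁻¹) are illustrative placeholders, and the sketch assumes the UxHwDoubleGaussDist distribution constructor from Signaloid's UxHw C API (uxhw.h):

```c
#include <math.h>
#include <stdio.h>
#include <uxhw.h>

#define N 2 /* number of training points */

/* Illustrative training inputs and precomputed quantities; in a real
 * application these come from fitting the GP offline. */
static const double trainX[N]  = { -1.0, 1.0 };
static const double alpha[N]   = { 0.37, -0.42 };    /* K^{-1} y (placeholder) */
static const double Kinv[N][N] = { { 1.05, -0.38 },  /* K^{-1}   (placeholder) */
                                   { -0.38, 1.05 } };

/* Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2)), l = 1. */
static double kernel(double a, double b)
{
	double d = a - b;
	return exp(-0.5 * d * d);
}

int main(void)
{
	/* The uncertain input: a Gaussian distribution rather than a point. */
	double x = UxHwDoubleGaussDist(0.5, 0.2);

	/* Standard GP predictive equations; because x carries a distribution,
	 * UxHw propagates that distribution through every operation below. */
	double k[N];
	double mean = 0.0;
	double var  = kernel(x, x);

	for (int i = 0; i < N; i++)
	{
		k[i]  = kernel(x, trainX[i]);
		mean += k[i] * alpha[i];
	}
	for (int i = 0; i < N; i++)
	{
		for (int j = 0; j < N; j++)
		{
			var -= k[i] * Kinv[i][j] * k[j];
		}
	}
	if (var < 0.0) /* guard against numerical round-off */
	{
		var = 0.0;
	}

	/* Reparameterization trick: y = mean + sqrt(var)*eps, eps ~ N(0, 1),
	 * keeps the stochastic component separate from the GP equations. */
	double eps = UxHwDoubleGaussDist(0.0, 1.0);
	double y   = mean + sqrt(var) * eps;

	printf("output distribution: %lf\n", y);

	return 0;
}
```

Compiled for and run on a UxHw-enabled core (for example, through the Signaloid Cloud Compute Engine), the value of y carries the full output distribution; on a conventional platform the same code would compute only a single point value.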
Figure 3: Execution on UxHw. The output distribution of the simple benchmark Gaussian process prediction with an uncertain input described above, as calculated using the UxHw API. It took 3.4 ms to compute this distribution (Signaloid Cloud Compute Engine with core class C0-Pro, Jupiter microarchitecture, and representation size 128).
Figure 4: The output distribution of the simple benchmark Gaussian process prediction with an uncertain input described above, as calculated using a Monte Carlo simulation. It took 340 ms to compute this distribution running on an Intel Xeon (AWS r7iz EC2 instance).
Figure 5: The ground-truth distribution for the simple benchmark Gaussian process prediction with an uncertain input described above, obtained using a 1,000,000-sample Monte Carlo simulation. It took 3.6 s to compute this distribution on an Intel Xeon (AWS r7iz EC2 instance).
The Takeaway
Computing the output distribution of a Gaussian process when the input is itself a distribution is generally difficult. Traditional methods for solving this problem, such as Monte Carlo simulation, can be computationally expensive and complicated to implement. It is possible to derive a simple algorithm that uses Signaloid's UxHw technology to compute this output distribution approximately 100-fold faster than using Monte Carlo simulation.
References
[1] Williams, Christopher K. I., and Carl Edward Rasmussen. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press, 2006.
[2] Frazier, Peter I. "A Tutorial on Bayesian Optimization." arXiv preprint arXiv:1807.02811 (2018).
[3] Morita, Yuki, et al. "Applying Bayesian Optimization with Gaussian Process Regression to Computational Fluid Dynamics Problems." Journal of Computational Physics 449 (2022): 110788.
[4] Deisenroth, Marc Peter. Efficient Reinforcement Learning Using Gaussian Processes. Vol. 9. KIT Scientific Publishing, 2010.
[5] Petangoda, Janith, Chatura Samarakoon, and Phillip Stanley-Marbell. "Gaussian Process Predictions with Uncertain Inputs Enabled by Uncertainty-Tracking Processor Architectures." NeurIPS 2024 Workshop on Machine Learning with New Compute Paradigms.
[6] Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." arXiv preprint arXiv:1312.6114 (2013).
[7] Full source code: https://github.com/physical-computation/uncertain-gaussian-process-code/blob/main/src/main.c