Fixed broadcasting rules for gpflow.models.model.predict_y, partially resolves #1461. by mohit-rajpal · Pull Request #1597 · GPflow/GPflow

PR type: new feature

Related issue(s)/PRs: #1461

Summary

Proposed changes

This change fixes the broadcasting rules for predict_y so that full_cov=True and full_output_cov=True are handled correctly. The fix leans on TensorFlow's broadcasting rules, applied in a way that stays shape-compatible with the covariance matrix/tensor when full_cov/full_output_cov = True.
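To illustrate the shapes involved, here is a NumPy sketch (not the actual TF implementation) of why the two cases need different handling: with full_cov=False the likelihood noise broadcasts elementwise onto the marginal variances, while with full_cov=True it must land only on the diagonal of the full covariance.

```python
import numpy as np

N = 4  # number of test points
noise = 0.1  # Gaussian likelihood variance (illustrative value)

# full_cov=False: predict_f returns marginal variances of shape [N, 1];
# the scalar likelihood noise broadcasts elementwise.
f_var = np.ones((N, 1))
y_var = f_var + noise  # shape [N, 1]

# full_cov=True: predict_f returns a full covariance of shape [1, N, N];
# the noise must be added to the diagonal only, not to every entry.
f_cov = np.eye(N)[None, :, :]
y_cov = f_cov + noise * np.eye(N)  # shape [1, N, N], off-diagonals unchanged
```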

What alternatives have you considered?

I think we should keep predict_log_density as-is for full_cov/full_output_cov = True. With large covariance matrices/tensors there is no efficient way to compute these densities exactly without hitting OOM issues, and I strongly prefer leaving predict_log_density unchanged over a version that either takes far too long to run or silently approximates the result.
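A rough back-of-the-envelope calculation (illustrative numbers, not from the PR) shows why the exact computation is prohibitive: the joint covariance over N test points and P outputs is a dense [N*P, N*P] matrix.

```python
# Memory footprint of a dense joint covariance over N test points and
# P outputs, stored as float64.
N, P = 10_000, 10
side = N * P  # the joint covariance is [N*P, N*P]
bytes_needed = side * side * 8  # 8 bytes per float64 entry
print(f"{bytes_needed / 1e9:.1f} GB")  # 80.0 GB for this example
```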

Minimal working example

import numpy as np

import gpflow

for foc in [False, True]:
    for oc in [False, True]:
        print(f"full_output_cov={foc}, full_cov={oc}")
        rng = np.random.RandomState(123)
        N = 100  # Number of training observations
        X = rng.rand(N, 1) * 2 - 1  # X values
        M = 50  # Number of inducing locations
        kernel = gpflow.kernels.SquaredExponential()
        Z = X[:M, :].copy()  # Initialize inducing locations to the first M inputs in the dataset
        m = gpflow.models.SVGP(kernel, gpflow.likelihoods.Gaussian(), Z, num_data=N)

        pX = np.linspace(10, 20, 4)[:, None]  # Test locations
        _, pYv = m.predict_y(pX, full_output_cov=foc, full_cov=oc)  # Predict Y values at test locations
        _, pFv = m.predict_f(pX, full_output_cov=foc, full_cov=oc)
        print(pYv.shape)
        print(pYv - pFv)

        # Multi-output case
        num_elements = 7
        L = 2
        Zinit = [
            gpflow.inducing_variables.InducingPoints(np.linspace(0, 100, 60)[:, None])
            for _ in range(L)
        ]
        kern_list = [gpflow.kernels.SquaredExponential() for _ in range(L)]

        # Linear model of coregionalization
        kernel = gpflow.kernels.LinearCoregionalization(
            kern_list, W=np.random.randn(num_elements, L)
        )

        # Create multi-output inducing variables from Zinit
        iv = gpflow.inducing_variables.SeparateIndependentInducingVariables(Zinit)

        # Initialize mean of variational posterior to be of shape MxL
        q_mu = np.zeros((60, L))
        # Initialize sqrt(Σ) of variational posterior to be of shape LxMxM
        q_sqrt = np.repeat(np.eye(60)[None, ...], L, axis=0) * 1.0

        # Create SVGP model as usual and optimize
        m = gpflow.models.SVGP(
            kernel, gpflow.likelihoods.Gaussian(), inducing_variable=iv, q_mu=q_mu, q_sqrt=q_sqrt
        )

        _, pYv = m.predict_y(pX, full_output_cov=foc, full_cov=oc)  # Predict Y values at test locations
        _, pFv = m.predict_f(pX, full_output_cov=foc, full_cov=oc)
        print(pYv.shape)
        print(pYv - pFv)

PR checklist

  • New features: code is well-documented
    • detailed docstrings (API documentation)
    • notebook examples (usage demonstration)
  • The bug case / new feature is covered by unit tests
  • Code has type annotations
  • I ran the black+isort formatter (make format)
  • I locally tested that the tests pass (make check-all)

Release notes

Fully backwards compatible: yes

If not, why is it worth breaking backwards compatibility:

Commit message (for release notes): Fixed broadcasting rules for gpflow.models.model.predict_y, partially resolves #1461.