Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add drop_first option to HSGPPeriodic #7115

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

bwengals
Copy link
Contributor

@bwengals bwengals commented Jan 23, 2024

Description

Adds drop_first, defaults to False, in the arguments for HSGPPeriodic. Sometimes, but not always, the first sine and cosine basis vectors are all zeros and all ones respectively. Previously, we were always removing the first sine basis vector, whether it was all zeros or not. When drop_first=False, no basis vectors are dropped. When drop_first=True, both are dropped. The all-zero sine component can be safely dropped because it contributes nothing to the model. The all-one cosine component will likely cause sampling problems by adding an extra intercept to the model (which is what got me to do the PR).

If the user would like to do more specific things like keep the cosine intercept but drop the first sine component, they can fall back to prior_linearized and do their own thing.

I'm wondering if always dropping the first sine basis vector -- even though its not always all zeros, is what led to the super slight mismatches between the HSGP approx and the unapproximated GP @theorashid? Whether or not the first sine basis vector is all zeros (and so the first cosine basis vec is all ones) depends on the period that is set. I uncommented the test that was xfailed before. Still fails, so maybe its something else. Will look into this more.

TODOs

  • Test that the indexing works with an odd numbered m.
  • Verify if always dropping the sine component was causing the slight mismatch in the approx to the true.

Related Issue

  • None

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc--7115.org.readthedocs.build/en/7115/

Copy link

codecov bot commented Jan 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (627a8dd) 90.41% compared to head (c667927) 92.36%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7115      +/-   ##
==========================================
+ Coverage   90.41%   92.36%   +1.94%     
==========================================
  Files         101      101              
  Lines       16930    16937       +7     
==========================================
+ Hits        15308    15644     +336     
+ Misses       1622     1293     -329     
Files Coverage Δ
pymc/gp/hsgp_approx.py 95.80% <100.00%> (+0.18%) ⬆️

... and 6 files with indirect coverage changes

@@ -473,11 +472,15 @@ def __init__(
self,
m: int,
scale: Optional[Union[float, TensorLike]] = 1.0,
drop_first=False,
Copy link
Contributor

@juanitorduz juanitorduz Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to add it to the dostrings ;) Also type: drop_first: bool = Flase

self.f = pm.Deterministic(name, f, dims=dims)
if self.drop_first:
# Drop first sine term (all zeros), drop first cos term (all ones)
beta = pm.Normal(f"{name}_hsgp_coeffs_", size=(m * 2) - 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one is the old case with size=(m * 2 - 1))? With m*2, you might be accidentally using the same beta twice, or have some redundancy

Copy link
Contributor Author

@bwengals bwengals Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither, now drop_first=True removes both the first sine and first cos terms, and drop_first=False keeps all terms. So whenever the first sine term is all zeros, then the first cosine term is all ones. My reasoning was that wanting to keep the quasi-intercept cos term while dropping the all zeros sin term was more of an "experts only" move, so someone can do this with prior_linearized. Having all three options as an argument felt kind of cumbersome? What do you think?

With m*2, you might be accidentally using the same beta twice, or have some redundancy

True, need to double check this is all good, especially if someone uses an odd number m.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with the drop_first=True case, but why would anyone want the drop_first=False case with the first sine term when the first eigenfunction is zero? Doesn't that always make one of the beta redundant? I would just have the drop_first=True case and the original case as drop_first=False. I was just following the sums in the maths:

Screenshot 2024-01-24 at 20 54 05

If you go against the maths, please add some comments to explain why. Otherwise pymc devs in 2034 might get confused when they look at appendix b in the original paper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise pymc devs in 2034

Love the optimism!

Copy link
Contributor Author

@bwengals bwengals Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but why would anyone want the drop_first=False case with the first sine term when the first eigenfunction is zero? Doesn't that always make one of the beta redundant?

Yup it does. OK, I made a dumb mistake. When sin(x) = 0, cos(x) is either 1 or -1 (been a while since trig class), not cos(x) always equals one like I wrote. So it only happens sometimes that the cosine term is all ones. Could be all -1's or be a mix of 1's and -1's. So drop_first = True always dropping both the first sine and cosine term is a bad idea.

To see what I'm trying to explain (sorry kind of poorly), try running the code below, but change period from 1 to 2 to 3.

  • period = 1: sine term all zeros (gotta drop it), cosine term all ones (should probably drop it, or at least be warned? If you already have an intercept in your model the sampler will bog down and you should remove one of them.)
  • period = 2: sine term all zeros (gotta drop it), cosine term alternates -1 and 1 (gotta keep it)
  • period = 3: both terms oscillate (keep both... right? though you point out @theorashid that the math says drop the sine term, so I'm not sure)

Samples of the prior predictive for the gp look good in all cases.

import pymc as pm
import numpy as np
import matplotlib.pyplot as plt
import pytensor
import pytensor.tensor as pt

X = np.linspace(0, 10, 1000)[:, None]

with pm.Model() as model:
    period = 1
    scale = pm.HalfNormal("scale", 10)
    cov_func = pm.gp.cov.Periodic(1, period=period, ls=1.0)

    m = 200
    gp = pm.gp.HSGPPeriodic(m=m, scale=scale, cov_func=cov_func)
    X_mean = np.mean(X, axis=0)
    X = pm.MutableData("X", X)
    Xs = X - X_mean
    
    # FROM pm.gp.hsgp_approx.calc_basis_periodic(Xs, cov_func.period, m=m)
    w0 = (2 * pt.pi) / period  # angular frequency defining the periodicity
    m1 = pt.tile(w0 * Xs, m)
    m2 = pt.diag(pt.arange(0, m, 1))
    mw0x = m1 @ m2

    (phi_cos, phi_sin), psd = gp.prior_linearized(Xs=Xs)

    plt.plot(phi_cos[0, :].eval())
    plt.plot(phi_sin[0, :].eval())

   beta = pm.Normal(f"beta", size=m * 2)
    beta_cos = beta[:m]
    beta_sin = beta[m:]

    cos_term = phi_cos @ (psd * beta_cos)
    sin_term = phi_sin @ (psd * beta_sin)
    f = pm.Deterministic("f", cos_term + sin_term)

    #prior = pm.sample_prior_predictive()
    #import arviz as az
    #f = az.extract(prior.prior, var_names="f")
    #plt.figure(figsize=(10, 4))
    #plt.plot(X.eval().flatten(), f.mean(dim="sample").data)
    #plt.xticks(np.arange(10));

Copy link
Contributor

@theorashid theorashid Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I stared at the equation again and I can't see how sin(0 * w0 * x) can be anything other than 0. Equally, cos(0 * w0 * x) is always 1 – so there's always an intercept term, I suppose.

So mw0x there for me has shape (1000, 200) (i.e. (X.size, m)). So phi_cos, phi_sin both have shape (1000, 200). So by plotting plt.plot(phi_cos[0, :].eval()) you are plotting something of shape 200 i.e. m. So you're plotting the first term of each basis vector rather than plt.plot(phi_cos[:, 0].eval()) which is the contribution of the first (0th) basis across X. I find that to be 0 for sine and 1 for cos throughout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are plotting something of shape 200

Yuuuuuup. You're right. I plotted it wrong, so sorry for spreading my confusion around! It looks like I found that the cosine term is somtimes all ones, so the issue with the extra intercept is real. But then plotted it wrong and proceeded to draw lots of incorrect conclusions from there.

What the code should do is either:

  • Sine term is always all zeros, always drop (as the code originally did)
  • Cosine term is sometimes all ones. Allow user to optionally drop it.

Is that right @theorashid?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds right to me. Maybe remove the DeprecationWarning from the HSGP class too, if you want to stick with drop_first consistent after all

@@ -229,62 +229,73 @@ def test_conditional(self, model, cov_func, X1, parameterization):

class TestHSGPPeriodic(_BaseFixtures):
def test_parametrization(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add some drop first tests


self.f = pm.Deterministic(name, f, dims=dims)
if self.drop_first:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder is we would like to inform the user when drop_first=Flase and the fist basis vectors are zero or ones

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this would be very cool. The basis vectors are made in pytensor. I tried using constant_fold to automatically check if the first sin term is all zeros and drop it automatically, but since X can be mutable it didn't work.

@ricardoV94 ricardoV94 added the GP Gaussian Process label Feb 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancements GP Gaussian Process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants