-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add drop_first
option to HSGPPeriodic
#7115
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #7115 +/- ##
==========================================
+ Coverage 90.41% 92.36% +1.94%
==========================================
Files 101 101
Lines 16930 16937 +7
==========================================
+ Hits 15308 15644 +336
+ Misses 1622 1293 -329
|
@@ -473,11 +472,15 @@ def __init__( | |||
self, | |||
m: int, | |||
scale: Optional[Union[float, TensorLike]] = 1.0, | |||
drop_first=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember to add it to the dostrings ;) Also type: drop_first: bool = Flase
self.f = pm.Deterministic(name, f, dims=dims) | ||
if self.drop_first: | ||
# Drop first sine term (all zeros), drop first cos term (all ones) | ||
beta = pm.Normal(f"{name}_hsgp_coeffs_", size=(m * 2) - 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one is the old case with size=(m * 2 - 1))
? With m*2
, you might be accidentally using the same beta
twice, or have some redundancy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither, now drop_first=True
removes both the first sine and first cos terms, and drop_first=False
keeps all terms. So whenever the first sine term is all zeros, then the first cosine term is all ones. My reasoning was that wanting to keep the quasi-intercept cos term while dropping the all zeros sin term was more of an "experts only" move, so someone can do this with prior_linearized
. Having all three options as an argument felt kind of cumbersome? What do you think?
With m*2, you might be accidentally using the same beta twice, or have some redundancy
True, need to double check this is all good, especially if someone uses an odd number m
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with the drop_first=True
case, but why would anyone want the drop_first=False
case with the first sine term when the first eigenfunction is zero? Doesn't that always make one of the beta
redundant? I would just have the drop_first=True
case and the original case as drop_first=False
. I was just following the sums in the maths:
If you go against the maths, please add some comments to explain why. Otherwise pymc devs in 2034 might get confused when they look at appendix b in the original paper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise pymc devs in 2034
Love the optimism!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but why would anyone want the drop_first=False case with the first sine term when the first eigenfunction is zero? Doesn't that always make one of the beta redundant?
Yup it does. OK, I made a dumb mistake. When sin(x) = 0, cos(x) is either 1 or -1 (been a while since trig class), not cos(x) always equals one like I wrote. So it only happens sometimes that the cosine term is all ones. Could be all -1's or be a mix of 1's and -1's. So drop_first = True
always dropping both the first sine and cosine term is a bad idea.
To see what I'm trying to explain (sorry kind of poorly), try running the code below, but change period
from 1 to 2 to 3.
- period = 1: sine term all zeros (gotta drop it), cosine term all ones (should probably drop it, or at least be warned? If you already have an intercept in your model the sampler will bog down and you should remove one of them.)
- period = 2: sine term all zeros (gotta drop it), cosine term alternates -1 and 1 (gotta keep it)
- period = 3: both terms oscillate (keep both... right? though you point out @theorashid that the math says drop the sine term, so I'm not sure)
Samples of the prior predictive for the gp look good in all cases.
import pymc as pm
import numpy as np
import matplotlib.pyplot as plt
import pytensor
import pytensor.tensor as pt
X = np.linspace(0, 10, 1000)[:, None]
with pm.Model() as model:
period = 1
scale = pm.HalfNormal("scale", 10)
cov_func = pm.gp.cov.Periodic(1, period=period, ls=1.0)
m = 200
gp = pm.gp.HSGPPeriodic(m=m, scale=scale, cov_func=cov_func)
X_mean = np.mean(X, axis=0)
X = pm.MutableData("X", X)
Xs = X - X_mean
# FROM pm.gp.hsgp_approx.calc_basis_periodic(Xs, cov_func.period, m=m)
w0 = (2 * pt.pi) / period # angular frequency defining the periodicity
m1 = pt.tile(w0 * Xs, m)
m2 = pt.diag(pt.arange(0, m, 1))
mw0x = m1 @ m2
(phi_cos, phi_sin), psd = gp.prior_linearized(Xs=Xs)
plt.plot(phi_cos[0, :].eval())
plt.plot(phi_sin[0, :].eval())
beta = pm.Normal(f"beta", size=m * 2)
beta_cos = beta[:m]
beta_sin = beta[m:]
cos_term = phi_cos @ (psd * beta_cos)
sin_term = phi_sin @ (psd * beta_sin)
f = pm.Deterministic("f", cos_term + sin_term)
#prior = pm.sample_prior_predictive()
#import arviz as az
#f = az.extract(prior.prior, var_names="f")
#plt.figure(figsize=(10, 4))
#plt.plot(X.eval().flatten(), f.mean(dim="sample").data)
#plt.xticks(np.arange(10));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I stared at the equation again and I can't see how sin(0 * w0 * x)
can be anything other than 0. Equally, cos(0 * w0 * x)
is always 1 – so there's always an intercept term, I suppose.
So mw0x
there for me has shape (1000, 200)
(i.e. (X.size, m)
). So phi_cos, phi_sin
both have shape (1000, 200)
. So by plotting plt.plot(phi_cos[0, :].eval())
you are plotting something of shape 200
i.e. m
. So you're plotting the first term of each basis vector rather than plt.plot(phi_cos[:, 0].eval())
which is the contribution of the first (0th) basis across X
. I find that to be 0 for sine and 1 for cos throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are plotting something of shape 200
Yuuuuuup. You're right. I plotted it wrong, so sorry for spreading my confusion around! It looks like I found that the cosine term is somtimes all ones, so the issue with the extra intercept is real. But then plotted it wrong and proceeded to draw lots of incorrect conclusions from there.
What the code should do is either:
- Sine term is always all zeros, always drop (as the code originally did)
- Cosine term is sometimes all ones. Allow user to optionally drop it.
Is that right @theorashid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds right to me. Maybe remove the DeprecationWarning
from the HSGP class too, if you want to stick with drop_first
consistent after all
@@ -229,62 +229,73 @@ def test_conditional(self, model, cov_func, X1, parameterization): | |||
|
|||
class TestHSGPPeriodic(_BaseFixtures): | |||
def test_parametrization(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add some drop first tests
|
||
self.f = pm.Deterministic(name, f, dims=dims) | ||
if self.drop_first: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder is we would like to inform the user when drop_first=Flase
and the fist basis vectors are zero or ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree this would be very cool. The basis vectors are made in pytensor. I tried using constant_fold
to automatically check if the first sin term is all zeros and drop it automatically, but since X can be mutable it didn't work.
for more information, see https://pre-commit.ci
Description
Adds
drop_first
, defaults toFalse
, in the arguments forHSGPPeriodic
. Sometimes, but not always, the first sine and cosine basis vectors are all zeros and all ones respectively. Previously, we were always removing the first sine basis vector, whether it was all zeros or not. Whendrop_first=False
, no basis vectors are dropped. Whendrop_first=True
, both are dropped. The all-zero sine component can be safely dropped because it contributes nothing to the model. The all-one cosine component will likely cause sampling problems by adding an extra intercept to the model (which is what got me to do the PR).If the user would like to do more specific things like keep the cosine intercept but drop the first sine component, they can fall back to
prior_linearized
and do their own thing.I'm wondering if always dropping the first sine basis vector -- even though its not always all zeros, is what led to the super slight mismatches between the HSGP approx and the unapproximated GP @theorashid? Whether or not the first sine basis vector is all zeros (and so the first cosine basis vec is all ones) depends on the period that is set. I uncommented the test that was xfailed before. Still fails, so maybe its something else. Will look into this more.
TODOs
m
.Related Issue
Checklist
Type of change
📚 Documentation preview 📚: https://pymc--7115.org.readthedocs.build/en/7115/