CDMS: add whole-molecule query tool & refactor metadata acquisition #3173

Merged (1 commit) on Jan 17, 2025

Conversation

keflavich
Contributor

@keflavich keflavich commented Jan 12, 2025

This is a workaround for the CDMS bugs (#3095 and related).

The query tool appears to mangle the original table in unpredictable ways.

closes #2901
closes #3094

@pep8speaks

pep8speaks commented Jan 12, 2025

Hello @keflavich! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2025-01-17 14:50:02 UTC

codecov bot commented Jan 12, 2025

Codecov Report

Attention: Patch coverage is 31.57895% with 78 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@ca584f6). Learn more about missing BASE report.
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
astroquery/linelists/cdms/core.py 26.21% 76 Missing ⚠️
astroquery/jplspec/lookup_table.py 85.71% 1 Missing ⚠️
astroquery/linelists/cdms/setup_package.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3173   +/-   ##
=======================================
  Coverage        ?   67.40%           
=======================================
  Files           ?      229           
  Lines           ?    18599           
  Branches        ?        0           
=======================================
  Hits            ?    12536           
  Misses          ?     6063           
  Partials        ?        0           

@bsipocz bsipocz added the cdms label Jan 14, 2025
Member

@bsipocz bsipocz left a comment

Please fix the indexing issue so the tests pass. There is also a failing doctest in the narrative docs; please update that file, too.

# accounts for three formats, e.g.: '058501' or 'H2C2S' or '058501 H2C2S'
badlist = (self.MALFORMATTED_MOLECULE_LIST +  # noqa
           [y for x in self.MALFORMATTED_MOLECULE_LIST for y in x.split()])
if payload['Molecules'] in badlist:
Member

I would say drop line 146, as request can deal with a dict data input, but you cannot index it by key if you turn it into a list of tuples. A dictionary would be better anyway, as it would be consistent with our docstrings (even though upstream accepts more types right now).
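
For readers following this thread, the indexing problem can be sketched as below. This is a minimal illustration and not the code in the PR: the list contents reuse the '058501 H2C2S' spelling from the code comment above purely as an example, and `ExampleKey` is a hypothetical payload field.

```python
# Hypothetical entries; the real list is an attribute of the CDMS query class
# and its actual contents are not shown in this conversation.
MALFORMATTED_MOLECULE_LIST = ["058501 H2C2S"]

# Accept three spellings of each entry: the tag, the name, or "tag name".
badlist = MALFORMATTED_MOLECULE_LIST + [
    part for entry in MALFORMATTED_MOLECULE_LIST for part in entry.split()
]

# The same payload, once as a dict and once as a list of (key, value) tuples.
payload_dict = {"Molecules": "H2C2S", "ExampleKey": 1}  # ExampleKey is made up
payload_pairs = list(payload_dict.items())

# Key lookup works on the dict form.
print(payload_dict["Molecules"] in badlist)

# The list-of-tuples form cannot be indexed by key.
try:
    payload_pairs["Molecules"]
except TypeError as error:
    print(f"list of tuples cannot be indexed by key: {error}")
```

Keeping the payload as a dict until the request is sent means a check like `payload['Molecules'] in badlist` can run without converting back.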

Contributor Author

Good point. I did this, I hope it worked.... I ran into more local configuration issues w/pytest that affect only astroquery. grumble.

astroquery/linelists/cdms/core.py (outdated)
astroquery/linelists/cdms/tests/test_cdms_remote.py (outdated)
@bsipocz bsipocz added this to the v0.4.8 milestone Jan 14, 2025
bsipocz
bsipocz previously approved these changes Jan 14, 2025
Member

@bsipocz bsipocz left a comment

Please add a changelog entry, fix the docs, and rebase for the merge commit. Then this should be good to go.

@keflavich
Contributor Author

could you direct me to the doctest failure? I don't see it on CI, and I can't run the damned tests locally again...

@keflavich
Contributor Author

Don't merge this yet, I discovered some additional fixes that are needed (the molecule tables use different names that need to be merged...)

@bsipocz bsipocz modified the milestones: v0.4.8, v0.4.9 Jan 15, 2025
@bsipocz
Member

bsipocz commented Jan 15, 2025

doc fixes: pytest docs/linelists/cdms/cdms.rst --doctest-plus-generate-diff=overwrite -R should take care of it

@keflavich keflavich changed the title CDMS: add whole-molecule query tool CDMS: add whole-molecule query tool & refactor metadata acquisition Jan 16, 2025
@bsipocz bsipocz dismissed their stale review January 16, 2025 20:05

new code is being added

@bsipocz
Member

bsipocz commented Jan 16, 2025

You need to rebase to pick up changes from last night's release.

@keflavich keflavich force-pushed the cdms_cats branch 2 times, most recently from 4cb4195 to 97c8bb1 Compare January 16, 2025 23:52
@bsipocz
Member

bsipocz commented Jan 17, 2025

Test failure is related.

@keflavich
Contributor Author

yes, I'm trying to figure out how to 'sanitize' the unicode for windows now....

@bsipocz
Member

bsipocz commented Jan 17, 2025

> yes, I'm trying to figure out how to 'sanitize' the unicode for windows now....

summon @pllim, she is very good with solving windows problems 😄

@keflavich
Contributor Author

I think I solved it.... wait for CI, but I had to replace unicode <U+0096> (in python, '\x96') with simple '-', which is how a human would enter it under almost all circumstances. When I look up 96, it's "start of guarded area", which means nothing to me. It is rendered on the web as a slightly different '-' character - possibly intended as a superscript dash.

@keflavich
Contributor Author

I spoke too soon, there's also an instance of \x90 somewhere.
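
A minimal sketch of the kind of sanitizing described above, under stated assumptions rather than as the code merged here. Only the '\x96' to '-' replacement comes from this conversation; the helper names and the sample string are hypothetical. As background, the byte 0x96 decodes to an en dash (U+2013) in Windows-1252, which may be why the page renders it as a dash.

```python
# Only the '\x96' -> '-' mapping is taken from the discussion; add further
# entries for any other problem characters that turn up (for example '\x90').
REPLACEMENTS = {"\x96": "-"}


def find_c1_controls(text):
    """Return the sorted unique C1 control characters (U+0080 to U+009F)."""
    return sorted({char for char in text if "\x80" <= char <= "\x9f"})


def sanitize(text, replacements=REPLACEMENTS):
    """Replace known problem characters with plain-ASCII equivalents."""
    for bad_char, good_char in replacements.items():
        text = text.replace(bad_char, good_char)
    return text


# Hypothetical sample containing both characters mentioned in the thread.
sample = "trans\x96HCOOH \x90 test"

# Report which C1 control characters are present before cleaning.
print([f"U+{ord(char):04X}" for char in find_c1_controls(sample)])

# After cleaning, '\x96' is a plain dash; '\x90' is still there because no
# replacement for it is defined in this sketch.
cleaned = sanitize(sample)
print(find_c1_controls(cleaned))

# Interpreted as a Windows-1252 byte, 0x96 is an en dash.
print(b"\x96".decode("cp1252") == "\u2013")
```

Scanning for the whole C1 range first, rather than fixing one character at a time, would surface cases like the '\x90' instance in a single pass.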

@bsipocz
Member

bsipocz commented Jan 17, 2025

btw @keflavich - how does this relate to #3094 and #2901? If neither of those is made obsolete by this, could you have a look at them, too, while this module is fresh in your mind?

@keflavich
Contributor Author

This "fixes" #3094 to the extent that a fix is possible: it gives a better error message for the few known cases where there's a problem upstream.

#2901.... maybe I can merge in here. Hm. Thanks for noting that.

@keflavich
Contributor Author

Wow, green!

@bsipocz are you OK with this merging in #2901?

(squashing soon)

@bsipocz
Member

bsipocz commented Jan 17, 2025

> @bsipocz are you OK with this merging in #2901?

Yes, we just haven't merged that due to the missing changelog and test. Including it here and closing the obsolete ones sounds good to me.

@pllim
Member

pllim commented Jan 17, 2025

Well, I am glad you figured it out because +2,871 −1,224 😅

@bsipocz
Member

bsipocz commented Jan 17, 2025

Haha, that's just deception, changes to the datafiles :)

Member

@bsipocz bsipocz left a comment

OK, this is good to go. I had one minor comment, and noticed a test failure, but it was quicker to just fix that.

astroquery/jplspec/lookup_table.py
astroquery/linelists/cdms/core.py (outdated)
astroquery/linelists/cdms/core.py
@bsipocz
Member

bsipocz commented Jan 17, 2025

What about #3095, is this sufficient to close it too?

@keflavich
Contributor Author

yes, this closes #3095

malformatted catalogs, improve LUT, refactor and significantly robustify
CDMS data table handling.  Also, update cached metadata files

add more regression tests

whitespace

whitespace

setup for data file

almost there ... unicode bad

col name

recode unicode raised minus (character 96) as simple dash

try replacing the text before ascii-reading it

oops

dashes

try a different approach...

try a different approach... part 2

super minor

remove unnecessary test that I just added

allow lookuptable to skip regex searching for exact matches

no single-char variables

looks like my fixes didn't work, and of course I made a bunch of stupid errors.  I might have to give up for the night

hooray!  got a different error this time.  Now just guesswork though....

Entry column

changelog

CI: ignoring a Deprecation we don't directly use but trigger somewhere in the stack

finish test

update docstr, inline comment
@keflavich
Contributor Author

Ready for final review / merge

@bsipocz bsipocz merged commit b005fc9 into astropy:main Jan 17, 2025
9 checks passed
@bsipocz
Member

bsipocz commented Jan 17, 2025

Thanks!

4 participants