Tests #18
base: main
Conversation
I agree on the pytest approach.
@KristijanArmeni thanks for this! Can you somehow add this to our CI workflow?
Sure @soul-codes. Is there a CI set up somewhere? Otherwise, we want to choose a CI framework/provider first. I use CircleCI in a few of my projects and it works fine (and I see it in some mature OSS projects too). It has a free tier that I think should work fine. I guess GitHub Actions might work too, but I don't have experience with those. We'd need to set up an account and link the repo with CircleCI. Then it's a matter of setting up a small CircleCI config file and it should be running our tests remotely.
@KristijanArmeni this is where we set up GitHub CI workflows. Feel free to add a new workflow that performs tests. Here are the docs on writing them. Perhaps @DeanEby can help, as he's been doing some CI things on the repo recently? 🙏
@@ -0,0 +1,3 @@
[tool.pytest.ini_options]
testpaths = ["tests"]
@KristijanArmeni a thought -- what do you think about allowing each analyzer to have its own test folder so that each module is self contained? How tricky would it be to configure something like that?
I don't think it should be too tricky, as it is one of the two layouts supported by pytest. I can't say I see an obvious advantage for one or the other layout atm. I guess once you have many tests, it becomes more manageable to put them alongside the application code?
Separate test folder:

> Putting tests into an extra directory outside your actual application code might be useful if you have many functional tests or for other reasons want to keep tests separate from actual application code (often a good idea)

Tests as part of modules:

> Inlining test directories into your application package is useful if you have direct relation between tests and application modules and want to distribute them along with your application
See more under 'Tests layout' here: https://docs.pytest.org/en/7.1.x/explanation/goodpractices.html
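For illustration, the two layouts could look roughly like this (the file and folder names are hypothetical, loosely following the hashtags analyzer in this PR):

```
# Layout 1: one top-level tests folder (what this PR does)
repo/
├── analyzers/
│   └── hashtags.py
└── tests/
    └── test_hashtags.py

# Layout 2: tests inlined with each analyzer, so each module is self-contained
repo/
└── analyzers/
    └── hashtags/
        ├── main.py
        └── tests/
            └── test_hashtags.py
```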
Hey! I'd be happy to help out with this. Should I implement pytest tests for other analyzers, or perhaps try to add the current test to the CI workflow?
A CI workflow of some kind would be great.
@soul-codes @DeanEby If I understand correctly, we could just add one CI step to the current workflow. Meaning, the build does not proceed if tests are not passing. Alternatively, we could have a separate workflow.

EDIT: I see that the current trigger for …
Awesome @DeanEby! Thanks for looking into this. Shall I just add a commit to this PR? Or do you want to add it? Either way is fine by me.
You can add the commit. It would be nice to get this merged so I can start working on additional tests :)
Force-pushed from 676b379 to a9eec6d
@DeanEby Thanks, I added it. Seems like all platforms are passing, except the macOS 12 build:
Sounds like it can safely be removed? Outside of that, this seems ready to merge. @soul-codes feel free to throw an extra pair of eyes on this if need be.
I think what we'll want to do is create a reusable fake context class for testing that obeys the context interface's intended semantics. The context interface is there precisely so that the analyzer doesn't need to care how it's implemented.
I can help with that.
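A minimal sketch of what such a fake context could look like (the class name, method names, and the use of polars dataframes here are assumptions for illustration, not the actual MangoTango context interface):

```python
import polars as pl


class FakeAnalyzerContext:
    """Hypothetical test double obeying the intended semantics of the
    context interface: the analyzer asks the context for its input and
    hands back its output without knowing how either is stored."""

    def __init__(self, input_df: pl.DataFrame):
        # Hold test data in memory instead of reading parquet from disk.
        self._input_df = input_df
        self.outputs: dict[str, pl.DataFrame] = {}

    def load_input(self) -> pl.DataFrame:
        # The real context would do something like
        # input_reader.preprocess(pl.read_parquet(input_reader.parquet_path));
        # the fake simply returns the in-memory test dataframe.
        return self._input_df

    def save_output(self, name: str, df: pl.DataFrame) -> None:
        # Capture outputs so tests can assert on them afterwards.
        self.outputs[name] = df
```

With a fake like this, the `check_analyzer_context()` special-casing mentioned in the PR description would become unnecessary, since the analyzer would call the same interface in tests and in production.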
strategy:
  matrix:
I don't think it's necessary to run datasci tests in all environments. Tools like polars, pandas and such should be cross-platform enough.
Opening this WIP pull request just to get the ball rolling regarding setting up tests (#10). Here I added a small unit test that tests the behavior of the `gini()` method and the full hashtag analyzer on a small input dataframe with 5 rows. See the current idea of how this works and the demo video below.

**Current test architecture**

Standard pytest stuff:

- A `/tests` folder with a `test_hashtags.py` testing script (eventually each analyzer would have a `test_<analyzer-name>.py` script that would test that analyzer's methods).
- A `pyproject.toml` which holds configuration options for pytest (`pyproject.toml` can also be used in the future to add packaging configuration).

Mangotango specific stuff:

- An `AnalyzerContextDummy()` class that's used to mirror `PrimaryContextAnalyzer()` and its methods/attributes, and is only used to fetch the test data within the analyzer.
- A `check_analyzer_context()` function within `analyzer.main()` that checks whether the context passed to the analyzer is the real one or the dummy class. This is not elegant and is ad hoc, but I couldn't figure out how to get around `input_reader.preprocess(pl.read_parquet(input_reader.parquet_path))`, which loads data from disk, whereas the test data are created within the test script on the fly.
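For concreteness, here is a rough sketch of what such a test could look like (the import paths, the dataframe column, and the `AnalyzerContextDummy` constructor are assumptions for illustration; only `gini()`, `hashtags.main()`, and the 5-row input come from this PR):

```python
import polars as pl
import pytest

# Hypothetical import paths -- the real module layout may differ.
from analyzers import hashtags
from tests.helpers import AnalyzerContextDummy


def test_gini_uniform_counts():
    # The Gini coefficient of a perfectly uniform distribution is 0
    # (assuming gini() accepts a plain sequence of counts).
    assert hashtags.gini([2, 2, 2, 2, 2]) == pytest.approx(0.0)


def test_hashtags_main_small_input():
    # A 5-row toy dataframe built on the fly, so nothing is read from disk.
    df = pl.DataFrame({"message": ["#a #b", "#a", "#b #c", "#a #c", "#c"]})
    context = AnalyzerContextDummy(df)
    # The analyzer should run end-to-end against the dummy context.
    hashtags.main(context)
```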
The main idea is to use `pytest`, which can be invoked in the root folder. It collects all the tests present in the newly added `/tests` folder. For example, right now I added `test_hashtags.py`, which contains a test of the `gini()` and `hashtags.main()` functions. Invoking pytest (on this branch) looks like so:
functions. Invoking pytest (on this branch) looks like so:pytest_test.mov