forked from argilla-io/argilla
feat: Some feedback dataset improvements (argilla-io#3937)
<!-- Thanks for your contribution! As part of our Community Growers initiative 🌱, we're donating Justdiggit bunds in your name to reforest sub-Saharan Africa. To claim your Community Growers certificate, please contact David Berenstein in our Slack community or fill in this form https://tally.so/r/n9XrxK once your PR has been merged. -->

# Description

The main improvements that this PR brings are:

1. **Feedback dataset class method unification**: All expected methods are defined in the base class, so they must be available in every dataset implementation. Where a method makes less sense or is not implemented yet, the user is notified with a warning (@davidberenstein1957 review and improve, please).

2. **More general workflow with response unification**: The unification workflow supports datasets connected to Argilla. This means that `prepare_for_training` can be applied to remote datasets.

   `unify_responses` returns a dataset in which the responses are unified. As a common practice, returning data is preferable to modifying values internally, because it avoids unexpected side effects.

   So, the unification workflow should look like this:

   ```python
   from argilla import MultiLabelQuestionStrategy, FeedbackDataset

   # Load a dataset that is connected to a running Argilla instance
   dataset = FeedbackDataset.from_argilla(name="my-dataset")

   strategy = MultiLabelQuestionStrategy("majority")  # "disagreement", "majority_weighted (WIP)"

   # Returns a dataset with unified responses
   unified_dataset = dataset.unify_responses(
       question=dataset.question_by_name("tags"),
       strategy=strategy,
   )
   # Continue working with unified_dataset ...
   ```

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] Refactor (change restructuring the codebase without changing functionality)
- [X] Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

Some of those flows have been tested locally.

**Checklist**

- [ ] I added relevant documentation
- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [x] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: davidberenstein1957 <[email protected]>
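
The two design points in the description (every expected method declared on the base class, with a warning where an implementation does not support it, and unification that returns a new dataset instead of mutating the original) can be illustrated with a small standalone sketch. This is not Argilla's implementation: `DatasetBase`, `InMemoryDataset`, `sync_remote`, and the record shape are hypothetical names chosen for illustration, and a plain majority vote stands in for the real unification strategies.

```python
import copy
import warnings
from abc import ABC, abstractmethod
from collections import Counter


class DatasetBase(ABC):
    """Hypothetical base class: every expected method is declared here."""

    def __init__(self, records):
        # Illustrative record shape: a dict with a "responses" list of labels.
        self.records = records

    @abstractmethod
    def unify_responses(self):
        """Return a NEW dataset whose records carry a unified response."""

    def sync_remote(self):
        # Default for implementations where the method makes less sense:
        # warn the caller instead of failing with AttributeError.
        warnings.warn(
            f"`sync_remote` is not supported by {type(self).__name__}.",
            UserWarning,
            stacklevel=2,
        )
        return None


class InMemoryDataset(DatasetBase):
    """Hypothetical implementation that has no remote counterpart."""

    def unify_responses(self):
        # Work on a deep copy so the original dataset is left untouched.
        unified = copy.deepcopy(self)
        for record in unified.records:
            # Simple majority vote over the collected responses.
            winner, _count = Counter(record["responses"]).most_common(1)[0]
            record["unified"] = winner
        return unified


dataset = InMemoryDataset([{"responses": ["a", "b", "a"]}])
unified_dataset = dataset.unify_responses()
```

Because `unify_responses` returns a copy, `dataset` still holds only the raw responses after the call, which is the "no side effects" property the description argues for.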
1 parent 37b7074, commit cc4cfdf

Showing 20 changed files with 673 additions and 377 deletions.