Delimit YAML in text output of validate_upload #1099

Closed
jswelling opened this issue May 24, 2022 · 3 comments

@jswelling
Collaborator

The text output of validate_upload.py consists of a section of free text followed by a section of YAML containing schema version info, etc. There is no explicit delimiter between the two sections, and since the free-text section is of variable length, confusion may result. Unfortunately there is external software which explicitly looks for the string 'No Errors' as the first line, so adding a prefix to the text section is problematic.

One could add a prefix character to all the YAML lines, as long as that character is not itself YAML syntax. For example, delimiting with '#' would be problematic because stripping the prefix from YAML comment lines would be ambiguous. Alternatively, inserting a known break line like '# yaml follows' would be a possible solution.
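
A minimal sketch of the break-line idea, for illustration only. It is not the validate_upload.py implementation: the function names `join_report` and `split_report` and the `schema_version` key in the usage note are hypothetical, and the marker text is simply the one suggested above.

```python
from typing import Tuple

# Hypothetical break line, taken from the suggestion in this issue.
YAML_MARKER = "# yaml follows"


def join_report(free_text: str, yaml_text: str, marker: str = YAML_MARKER) -> str:
    """Build the report with the free text first, then the marker, then the YAML.

    The free text is not prefixed, so a leading 'No Errors' line is unchanged.
    """
    return f"{free_text.rstrip()}\n{marker}\n{yaml_text.lstrip()}"


def split_report(report: str, marker: str = YAML_MARKER) -> Tuple[str, str]:
    """Split a report into (free_text, yaml_text) at the first marker line.

    If no marker line is found, the whole report is returned as free text.
    """
    lines = report.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == marker:
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:])
    return report, ""
```

For example, `split_report(join_report("No Errors\n", "schema_version: 2\n"))` gives back the free text and the YAML as separate strings. Because the marker is itself a YAML comment, everything from the marker line onward still parses as YAML.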

@mccalluc
Contributor

@jswelling -

> Unfortunately there is external software which explicitly looks for the string 'No Errors' as the first line, so adding a prefix to the text section is problematic.

What software is this? I had understood that you were using it from Python rather than wrapping the CLI. Or are there (at least) two different usages on your side?

@jswelling
Collaborator Author

The software that generates the table of unpublished datasets looks for validation_report.txt in the top level directory, and declares the dataset to have been validated if that file is present and starts with 'No Errors'. This feature is pretty obsolete now that data is usually provided in the form of Uploads, so it could be disabled. If the desired solution is to modify that first line, an issue should be created in ingest-pipeline to disable this check.
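
The check described above might look roughly like the following. This is a sketch under assumptions, not the actual ingest-pipeline code: the function name `looks_validated` is hypothetical, and only the file name and the 'No Errors' prefix come from the description.

```python
from pathlib import Path


def looks_validated(dataset_dir) -> bool:
    """Return True if validation_report.txt exists in the top-level
    directory and its contents start with 'No Errors'."""
    report = Path(dataset_dir) / "validation_report.txt"
    if not report.is_file():
        return False
    with report.open(encoding="utf-8") as f:
        # Only the first line matters for this check.
        return f.readline().startswith("No Errors")
```

A check of this shape breaks as soon as anything is written ahead of the 'No Errors' line, which is why changing the first line of the report would require disabling it.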

@mccalluc
Contributor

Filed hubmapconsortium/ingest-pipeline#635; Closing this.
