Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring HDT Cat by creating reusable components and signatures #138

Merged
merged 2 commits into from
Jan 12, 2022
Merged

Refactoring HDT Cat by creating reusable components and signatures #138

merged 2 commits into from
Jan 12, 2022

Conversation

AlyHdr
Copy link

@AlyHdr AlyHdr commented Jan 10, 2022

In this pull request we worked out a new version of HDT cat that eliminates some unnecessary code and unifies the classes used to iterate over the sections of the dictionary in order to create intersections and unions out of the two given files, in this pull request we introduced some main components:

  • CatWrapper: Wraps around an iterator of a dictionary section and returns a defined object CatElement which always contains the next element to be added to the new section + the mapping which points to which sections an element may go or leave to.
  • CatIntersection: Gets the intersection of two given dictionary sections.
  • CatUnion: Gets the union of two given dictionary sections.
  • CatElement: an element representing the next entity(String) to be added in the new section.
  • SectionUtil: creates a section out of a given list of iterators of intersections and unions while filing the necessary mappings to translate the triple IDs after the creation of the new dictionary. This is used to create any section of the dictionary and can be used to any other structure.

Some extra tests has been added triggering a bug that was discovered during the implementation of this new version, which comes out when using specific string in the RDF files. The problem was in comparing CompactStrings.

@D063520
Copy link
Contributor

D063520 commented Jan 10, 2022

I'm in favour of this refactoring. The code was difficult to read and it is much more logic.
NOTE: this fix should be integrated though .... #128 which maked HDTCat work also on windows. The original message in the pull requests states clearly what the changes are

@mielvds
Copy link
Member

mielvds commented Jan 10, 2022

Hi! Thanks for the contribution, happy to merge, but like @D063520 mentions, you are reverting some of the fixes from #128, namely catching the exceptions when calling Files.delete(). Basically, leave hdt-java-core/src/main/java/org/rdfhdt/hdt/hdt/impl/HDTImpl.java untouched.

@AlyHdr
Copy link
Author

AlyHdr commented Jan 10, 2022

Changes has been made to take #128 into account.

@mielvds mielvds added this to the 2.1.3 milestone Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants