Fail if duplicate keys are present #177

Open
wants to merge 2 commits into
base: master
from

Conversation

Ahajha

@Ahajha Ahajha commented Oct 1, 2024

#175
This should at least improve the error message. It's not perfect, but the cause of the error should now end up somewhere on the screen, and the check avoids the naive O(n^2) cost of comparing every key against every other key.
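To illustrate the complexity point, here is a minimal sketch (not the code in this PR) of a duplicate check that compares a key only against keys already placed in the same hash bucket. The helper name `has_duplicate_keys`, the `int` keys, and the toy modulo hash are all assumptions made for the example; it needs C++14 or later and N > 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of frozen: reports whether `keys` contains a
// duplicate by hashing each key into one of M buckets and comparing it only
// against the keys already stored in that bucket.
template <std::size_t M, std::size_t N>
constexpr bool has_duplicate_keys(const std::array<int, N>& keys) {
  std::size_t bucket_items[M][N] = {};  // indices of the items in each bucket
  std::size_t bucket_sizes[M] = {};     // number of items in each bucket
  for (std::size_t i = 0; i < N; ++i) {
    // Toy hash (an assumption): the key value modulo the bucket count.
    const std::size_t b = static_cast<std::size_t>(keys[i]) % M;
    for (std::size_t k = 0; k < bucket_sizes[b]; ++k) {
      if (keys[bucket_items[b][k]] == keys[i]) {
        return true;  // same bucket and equal key: duplicate found
      }
    }
    bucket_items[b][bucket_sizes[b]++] = i;
  }
  return false;
}

// Evaluated at compile time: distinct keys pass, a repeated key is detected.
static_assert(!has_duplicate_keys<4>(std::array<int, 3>{{1, 2, 3}}),
              "distinct keys must not be reported as duplicates");
static_assert(has_duplicate_keys<4>(std::array<int, 3>{{5, 9, 5}}),
              "a repeated key must be reported");
```

Equal keys always hash to the same bucket, so nothing is missed, and the inner loop is bounded by the bucket size rather than by the number of keys seen so far.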

I added the following test locally:

TEST_CASE("duplicates") {
  [[maybe_unused]] constexpr auto map =
      frozen::make_unordered_map<frozen::string, int>({{"ha", 1}, {"ha", 2}});
}

This gives the following on clang 15:

[build] /home/alex/Documents/frozen-1/tests/test_unordered_map.cpp:283:35: error: constexpr variable 'map' must be initialized by a constant expression
[build]   [[maybe_unused]] constexpr auto map =
[build]                                   ^
[build] /home/alex/Documents/frozen-1/include/frozen/bits/pmh.h:102:7: note: non-constexpr function 'check' cannot be used in a constant expression
[build]       check("Duplicate keys present, check your input data");
[build]       ^
...

This error message only applies to the unordered containers, which makes sense since the change is in the hashing code. The ordered variants (map, set) don't suffer from the cryptic template depth error anyway, so there is nothing to fix there.

@serge-sans-paille
Owner

I like the idea. A few comments though

@@ -103,6 +103,11 @@ pmh_buckets<M> constexpr make_pmh_buckets(const carray<Item, N> & items,
    bool rejected = false;
    for (std::size_t i = 0; i < items.size(); ++i) {
      auto & bucket = result.buckets[hash(key(items[i]), static_cast<std::size_t>(result.seed)) % M];
      for (const auto item_index : bucket) {
        if (key(items[item_index]) == key(items[i])) {
Owner

  • This should use the KeyEqual parameter.
  • This would probably be better if key(items[i]) were saved in a temporary variable.
  • Move this loop to a helper function?
  • Move the check under #ifndef NDEBUG?
  • Instead of calling (exit), what about:
+      extern void check(const char[]);
+      check("Duplicate keys present, check your input data");
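The last suggestion relies on the fact that a constexpr function may contain a call to a non-constexpr function, as long as that call is not reached during constant evaluation. A minimal sketch of the idea follows; the helper `all_distinct_or_fail` is hypothetical, and unlike the declaration-only `check` suggested above, this stand-in `check` has a body (an assumption) so the example also links when the failing path is reached at run time. The pairwise scan is only there to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Stand-in for the suggested `check`. It is deliberately NOT constexpr, so
// reaching it during constant evaluation is a compile error. Giving it a
// throwing body is an assumption made for this sketch.
inline void check(const char* message) { throw std::logic_error(message); }

// Hypothetical helper: returns true when all keys are distinct, otherwise
// calls the non-constexpr `check`.
template <std::size_t N>
constexpr bool all_distinct_or_fail(const int (&keys)[N]) {
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < i; ++j) {
      if (keys[i] == keys[j]) {
        check("Duplicate keys present, check your input data");
      }
    }
  }
  return true;
}

constexpr int unique_keys[] = {1, 2, 3};
constexpr int duplicate_keys[] = {7, 8, 7};

// Distinct keys never reach `check`, so this is a valid constant expression.
static_assert(all_distinct_or_fail(unique_keys), "keys are distinct");

// Uncommenting the next line should fail to compile, because constant
// evaluation would have to call the non-constexpr `check`:
// constexpr bool bad = all_distinct_or_fail(duplicate_keys);
```

The message string passed to `check` is what surfaces in the compiler note, as in the clang output quoted in the PR description.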

Author

@Ahajha Ahajha Oct 1, 2024

I agree with all of these except maybe the #ifndef NDEBUG comment. For my use case, I'm writing a library where the keys end up being supplied by the user, and I think their experience would be better if they always saw a helpful message, even in release mode. Perhaps we could add a macro that forces the check to happen, while by default it runs only in debug mode? Something like #if !defined NDEBUG || defined FROZEN_LETITGO_ENABLE_DUPLICATE_KEY_ASSERTIONS? (We could also default that macro to be defined with NDEBUG, and the user can manually enable it, which simplifies the check.)
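A rough sketch of how that gating could look, using the macro name proposed above. Everything else here (`SKETCH_CHECK_DUPLICATE_KEYS`, `duplicate_key_check_enabled`, `keys_accepted`) is hypothetical and only illustrates the preprocessor condition; it is not the library's code.

```cpp
#include <cassert>
#include <cstddef>

// The check is compiled in for debug builds, or whenever the user defines
// the opt-in macro proposed in this thread.
#if !defined(NDEBUG) || defined(FROZEN_LETITGO_ENABLE_DUPLICATE_KEY_ASSERTIONS)
#define SKETCH_CHECK_DUPLICATE_KEYS 1
#else
#define SKETCH_CHECK_DUPLICATE_KEYS 0
#endif

// Hypothetical query so other code can see which mode was selected.
constexpr bool duplicate_key_check_enabled() {
  return SKETCH_CHECK_DUPLICATE_KEYS != 0;
}

// Hypothetical helper: returns false only when the check is compiled in and
// finds a duplicate; with the check compiled out it accepts any input.
template <std::size_t N>
constexpr bool keys_accepted(const int (&keys)[N]) {
  (void)keys;  // avoids an unused-parameter warning when the check is off
#if SKETCH_CHECK_DUPLICATE_KEYS
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < i; ++j) {
      if (keys[i] == keys[j]) {
        return false;
      }
    }
  }
#endif
  return true;
}
```

A library that always wants the helpful message could define FROZEN_LETITGO_ENABLE_DUPLICATE_KEY_ASSERTIONS before including the header, regardless of NDEBUG.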

Author

Also, if we add another preprocessor macro, and since the old error message should hopefully go away in debug mode, should I update the README to mention the new macro?

Author

Just some running commentary: extracting that check into a separate function might be more trouble than it's worth. Because of the inputs it takes, it ends up needing to be heavily templated and type annotated. I'll try to get it working just to show what it would look like.

I'll need to pass KeyEqual down the stack quite a bit - though this is for correctness so that's fine.
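To show why the comparator matters for correctness, here is a small sketch in which a user-supplied equality treats keys as equal that `==` would consider distinct. The comparators and the `contains_duplicate` helper are hypothetical, and the pairwise scan is used only for brevity; the real check lives in the bucket loop discussed above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical comparator mirroring plain operator==.
struct DefaultEqual {
  constexpr bool operator()(char a, char b) const { return a == b; }
};

// Hypothetical comparator: ASCII letters compare equal regardless of case.
struct CaseInsensitiveEqual {
  static constexpr char lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
  }
  constexpr bool operator()(char a, char b) const {
    return lower(a) == lower(b);
  }
};

// Duplicate detection that takes the comparator as a parameter instead of
// hard-coding operator==.
template <std::size_t N, class KeyEqual>
constexpr bool contains_duplicate(const char (&keys)[N],
                                  const KeyEqual& equal) {
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < i; ++j) {
      if (equal(keys[i], keys[j])) {
        return true;
      }
    }
  }
  return false;
}

constexpr char keys[] = {'a', 'A', 'b'};

// With plain equality 'a' and 'A' differ; with the custom comparator they
// are the same key, so only the comparator-aware check sees the duplicate.
static_assert(!contains_duplicate(keys, DefaultEqual{}),
              "operator== treats 'a' and 'A' as distinct");
static_assert(contains_duplicate(keys, CaseInsensitiveEqual{}),
              "the custom comparator treats 'a' and 'A' as equal");
```

In a hashed container the hash function also has to agree with the comparator for equal keys to land in the same bucket, which this sketch does not model.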

Author

I've incorporated everything except the NDEBUG comment; once we reach a consensus I'll add that in.


2 participants