Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 1994 #2031

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Issue 1994 #2031

wants to merge 6 commits into from

Conversation

dctrue2
Copy link

@dctrue2 dctrue2 commented Dec 4, 2021

  • Issue 1994 demonstrates how integer overflow issues are not properly handled.
  • Added validation checks for parsing integers to ensure they fit within each data type's defined range; added corresponding tests.

NatalieSty and others added 4 commits December 4, 2021 16:16
Convert each integer type to a BigInteger to validate that the value is within the proper range, otherwise throw a NumberFormatException.

Added corresponding tests to JsonParserTest.
@Marcono1234
Copy link
Collaborator

Your changes look good, thanks! Could you please:

  1. Add @throws to the documentation of the JsonElement methods to mention that they throw an exception for lossy conversion

  2. Maybe consider throwing an exception other than NumberFormatException. That class is described as:

    Thrown to indicate that the application has attempted to convert a string to one of the numeric types

    That does not seem very fitting because there is no conversion from string when the overflow check is done. Maybe throwing java.lang.ArithmeticException would be better. That one is also used by JDK methods such as Math.addExact when overflow occurs.
    Though on the other hand the JsonReader methods say that they throw NumberFormatException, so that would require catching and wrapping the ArithmeticException. Maybe we should wait here for the comment of a maintainer.

  3. It might be good to retrieve the short, int, ... value from the parsed BigInteger you are using for the range check (and also use it for the exception messages) instead of parsing the value again in the return statement at the end.

  4. It could also be worth trying to avoid the overhead of always converting to String then parsing as BigInteger. When value is a boxed primitive, BigInteger or BigDecimal its value could be tested directly (without parsing). E.g. getAsInt() could check if value is Short, Integer (for those intValue() can be returned without overflow check) or Long (here an overflow check would be needed). But this might be to some extent premature optimization, so we should wait for a comment from a maintainer as well.

  5. Add a test which tests the use case shown in Integer overflow when calling fromJson() on JsonObject #1994? All your tests directly interact with the JsonPrimitive, but it would be good to verify that JsonTreeReader also uses your changed methods. E.g. by calling gson.fromJson(new JsonPrimitive(...), int.class) and verifying that it detects the overflow.

  6. Maybe move the new test methods you added to the class JsonPrimitiveTest instead because they are not really parsing anything.

One issue with this approach might be lossless conversion from floating point values. Because your code uses BigInteger any decimal numbers will result in a NumberFormatException. Maybe it would therefore be better to parse as BigDecimal. Or there might also be better solutions to this. What do you think?

Sorry for this long comment; I hope some of the points are helpful though.

Also as side note: GitHub understands certain keywords. Could you please edit the description of your pull request and add the following, so GitHub understands that this pull request closes that issue:

Fixes #1994


Note that I am not a member of this project. Feel free to consider my comment only as suggestion.

@NatalieSty NatalieSty force-pushed the issue-1994 branch 2 times, most recently from 374354c to 7c58d0a Compare December 5, 2021 02:14
@eamonnmcmanus eamonnmcmanus linked an issue Dec 5, 2021 that may be closed by this pull request
Copy link
Member

@eamonnmcmanus eamonnmcmanus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on!

Though this is potentially a breaking change, it seems like the right thing to do, both because it fixes potential silent data loss and because it is consistent with what we do elsewhere (as detailed in the issue).

The new tests are great, but I had a few suggestions for the implementation changes.

@@ -203,6 +203,17 @@ public float getAsFloat() {
*/
@Override
public long getAsLong() {

if (isNumber()) {
final BigInteger val = getAsBigInteger();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is suddenly very expensive if value is already a Long. getAsBigInteger() will convert it to a string which it will then parse back into a new BigInteger object. If this method is called more than once, it will do the same thing every time. I think it would be worth adding instanceof checks to bypass this for when value is Long or any of the smaller java.lang types, and likewise in the other cases. Also, you're checking isNumber() here, so it looks as if you could reasonably move the true-case from the isNumber() ? code below.

Also we can borrow an idea from the implementation of BigInteger.longValueExact() for a simpler overflow check. (Since that method was added in Java 8 we unfortunately can't use it directly.)

I'm imagining code something like this:

if (isNumber()) {
  if (value instanceof Long || value instanceof Integer || value instanceof Short || value instanceof Byte) {
    return ((Number) value).longValue();
  }
  BigInteger val = getAsBigInteger();
  if (val.bitLength() >= 64) {...throw...}
  return val.longValue();
}
return Long.parseLong(getAsString());


try {
jsonObject.get("bigIntegerU").getAsLong();
fail("Big Integer values greater than Long.MAX_VALUE should not parse as a Long");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to write BigInteger here with no space.

return isNumber() ? getAsNumber().intValue() : Integer.parseInt(getAsString());
}

/**
* convenience method to get this element as a primitive byte.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods can all throw NumberFormatException, even before this change. I think it is the javadoc of the overridden methods in JsonElement that should be updated to reflect that.

@eamonnmcmanus
Copy link
Member

I've run this change against Google's internal tests and unfortunately there were quite a few failures. It looks as if there might be about 10 distinct root causes, which fall into two main categories:

  1. Reading floating-point values into int fields. In the failures I saw, the floating-point values were things like 45.0, so these actually do the right thing currently and will fail with this change.
  2. Genuine overflows, sometimes because the value is big enough to fit into an unsigned int or long but not a signed one.

Although we could imagine tweaks to allow these cases to continue to work, or perhaps to update the tests so they don't trigger the problems, my feeling is that it is probably best to leave well enough alone. So I am probably going to close this PR and the associated issue.

@dctrue2
Copy link
Author

dctrue2 commented Dec 7, 2021

Sounds good @eamonnmcmanus and @Marcono1234! We greatly appreciate your feedback. Clearly, building in the necessary checks to make all possible test cases resolve as expected is a bit more laborious and expensive than at first glance from the initial issue posting.

@Marcono1234
Copy link
Collaborator

2. Genuine overflows, sometimes because the value is big enough to fit into an unsigned int or long but not a signed one.

Just curious, are those tests running directly against the JsonPrimitive methods modified here, or are they using Gson.fromJson with a JsonElement (or subtype) as input? In case of the latter, then at least personally I think these tests are somewhat questionable because they explicitly depend on the discrepancy between parsing from a Reader (respectively JsonReader, where unsigned values are not supported) vs. parsing from a JsonElement (respectively JsonTreeReader).
In case of the former, then this could be solved by adding new getAsXExact methods, which could even be internal at first and only be used by JsonTreeReader. However, the question is whether that would be desired.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integer overflow when calling fromJson() on JsonObject
4 participants