ConsString does not transforms to java.lang.String when passed to the host java object #247

Open
AlexTrotsenko opened this issue Oct 1, 2015 · 37 comments
Labels
Java Interop Issues related to the interaction between Java and JavaScript Triage Issues yet to be triaged

Comments

@AlexTrotsenko

As far as I understand, because of LiveConnect we should never have org.mozilla.javascript.ConsString passed to the arguments of the Java host object. And only java.lang.String should be passed instead.

Currently, when JS code passes the result of 'JavaScript string' + 'Java string' as an argument to a method of a Java host object, the Java code receives an org.mozilla.javascript.ConsString instead of a java.lang.String.

A similar issue is described here:
https://issues.alfresco.com/jira/browse/ALF-20856

@AlexTrotsenko AlexTrotsenko changed the title ConsString does not transforms to ConsString does not transforms to java.lang.String when passed to the host java object Oct 1, 2015
@EduFrazao

EduFrazao commented May 1, 2019

Hi guys. This is a very serious issue. It broke almost every previously working script that interacts with Java object instances. I had to downgrade to 1.7R3 to keep my code working. The only problem I noticed is that I can no longer pass strings that were concatenated in JavaScript to my Java classes. Is there any workaround? I have thousands of little scripts that do that, and refactoring all of them to call .toString() would cause too much trouble.

@rbri
Collaborator

rbri commented May 1, 2019 via email

@gbrail
Collaborator

gbrail commented May 1, 2019 via email

@EduFrazao

Can you please provide a simple test case.

Sure. I will do that today. Thanks for answer!

@rbri
Collaborator

rbri commented May 7, 2019

There are some performance measures available in #373.

@rbri
Collaborator

rbri commented May 7, 2019

BTW still waiting for the sample

@EduFrazao

BTW still waiting for the sample
I'm really sorry for not sending the test case. I'm working hard here, but I will attach it tonight. Thanks for your patience.

@victordiaz

any update on this issue?

@tonygermano
Contributor

I'm not able to reproduce this, and don't see that a test case was ever included when requested.

This is the oldest version of Rhino I have. I ran it at version 0. Exact same results from version 1.7.6 at version 180 and 1.7.13 at version 200.

Rhino 1.7 release 5 PRERELEASE 2012 06 29
js> java.lang.Object.__javaObject__.getMethod('getClass').invoke('a' + new java.lang.String('b'))
class java.lang.String

@victordiaz or @michbarsinai do you have any examples of where this is still broken for you?

@michbarsinai

@tonygermano thanks for looking into this!

Here's the problem:

const e = [[some Java object whose .name property is the java.lang.String "ABC"]]
e.name=="ABC"     // true
e.name==="ABC"    // false
e.name=="A"+"BC"  // true
e.name==="A"+"BC" // false

Intuitively, I would expect all of the equality tests to be true (hope I'm correct here :-). I assume that has to do with ConsString, but I might be wrong.

@p-bakker
Collaborator

p-bakker commented Jul 6, 2021

@michbarsinai Think this issue is about going from JavaScript > Java, whereas your example is from Java > JavaScript

However, what you're pointing out does seem like a bug to me, created a separate case for this: #992

@p-bakker
Collaborator

p-bakker commented Jul 6, 2021

I stand corrected, with regards to the comment above: the behavior of isJavaPrimitiveWrap is the inversion of what I thought it would be: you need to set it to false to make sure Java primitives are exposed in JavaScript land as JavaScript primitives.

So, there's no bug with regards to comment #247 (comment), it's just how you configure Rhino

@tonygermano
Contributor

@michbarsinai Your current issue is mainly because when e.name is a java.lang.String then typeof e.name returns "object" (as all java objects do) while typeof "ABC" returns "string" so it will be impossible for === to return true, but == will coerce the object to a string before doing the comparison. This behavior is described in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Equality

If one of the operands is an object and the other is a number or a string, try to convert the object to a primitive using the object's valueOf() and toString() methods.

Another issue with the same root cause for confusion is #891.

You do have some options in javascript land:

String(e.name) === "ABC" // explicit conversion to string primitive
e.name + '' === "ABC" // string concatenation always returns string primitive
e.name.equals("ABC") // embrace the java-ness

Do not do new String(e.name) or you will create a javascript String object rather than a primitive and have the same problem with strict equality.

I.e, in any javascript environment, not just Rhino,

String("ABC") === "ABC" // true
new String("ABC") === "ABC" // false
new String("ABC").toString() === "ABC" // true

toString() works in that last example because we started with a javascript object. Calling toString() on a java object will return a java.lang.String.

@tonygermano
Contributor

tonygermano commented Jul 8, 2021

Setting the javaPrimitiveWrap property, which @p-bakker pointed out, to false is another option to have it automatically convert strings returned by java methods to primitives if this is desirable 100% of the time. The example below shows this behavior in the rhino shell, but you could also set it in java before starting to execute the javascript code.

js> var o = new java.lang.Object()
js> o.toString()
java.lang.Object@326de728
js> typeof o.toString()
object
js> org.mozilla.javascript.Context.getCurrentContext().getWrapFactory().setJavaPrimitiveWrap(false)
js> o.toString()
java.lang.Object@326de728
js> typeof o.toString()
string
js> o.toString() === 'java.lang.Object@326de728'
true

@michbarsinai

Thanks @p-bakker and @tonygermano for your inputs! I'll check this soon.

@p-bakker
Collaborator

p-bakker commented Jul 11, 2021

As for the original reported issue: think the linked alfresco case contains the info to (try to) reproduce it

@p-bakker
Collaborator

p-bakker commented Jul 12, 2021

This issue is still around on the latest master.

Code to reproduce, based on the sample in the linked Alfresco case below. Albeit not exactly the same as the issue reported in the Alfresco case, where the NativeObject is passed as a param to a Java method, I think it demonstrates the issue

package tmp;

import org.mozilla.javascript.Context;
import org.mozilla.javascript.ContextFactory;
import org.mozilla.javascript.NativeObject;
import org.mozilla.javascript.RhinoException;
import org.mozilla.javascript.ScriptableObject;

public class Test {

    public static void main(String[] ar) {
        final Context context = ContextFactory.getGlobal().enterContext();
        context.setLanguageVersion(Context.VERSION_ES6);
        ScriptableObject scope = context.initStandardObjects();

        try {
            test(context, scope);
        } catch (RhinoException ex) {
            System.out.println(ex.getScriptStackTrace());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            Context.exit();
        }
    }

    private static void test(Context cx, ScriptableObject scope) {
        String evaluationScript = """
            var df = new java.text.SimpleDateFormat('yyyy-MM-dd');
            var theDate = df.format(new Date());

            // FAILS
            var stamptext = 'Example text ' + theDate.toString();

            // WORKS:
            //stamptext = ('Example text ' + theDate).toString();
            var x = {
                test: stamptext
            };

            x
            """;

        try {
            NativeObject obj = (NativeObject) cx.evaluateString(scope, evaluationScript, "EvaluationScript", 1, null);

            System.out.println(obj.get("test").getClass().getName()); // Map get
            System.out.println(obj.get("test", obj).getClass().getName()); // Scriptable get
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

@p-bakker p-bakker added the bug Issues considered a bug label Jul 12, 2021
@tonygermano
Contributor

I don't think this is a bug. As reported, the problem was with sending a string from JS to a java method, which does appear to work correctly.

The test in the previous comment is accessing properties of a NativeObject from java in the same way internal Rhino code would. Rhino would still need to perform a conversion step before passing these values back to Java, in which case it also knows which type a java method expects.

Context.jsToJava(obj.get("test"), java.lang.String.class) will correctly convert ConsString to a java.lang.String, though obj.get("test").toString() is shorter in this case. (Can/should jsToJava be changed to use generics to return the type requested instead of Object to avoid the extra cast? The method already throws if it can't convert.)

A javascript string returned as a result of interacting directly with native objects from java could be cast directly to a CharSequence as ConsString does implement it, but a NativeString still could not.

Likewise, you can't assume that a number will be returned as a java.lang.Double if you bypass the conversion step.

If you change the script in your test method like below, it will actually return a java.lang.Integer.

String evaluationScript = """
    var x = {
        test: 3.0
    };

    x
""";

So maybe the contract should just be that a javascript string primitive is a java.lang.CharSequence, as a number primitive is a java.lang.Number?
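
The looser contract proposed above can be sketched without Rhino on the classpath. This is a minimal illustration under stated assumptions: a StringBuilder stands in for any CharSequence that is not a java.lang.String (such as ConsString), an Integer stands in for a JS number that is not stored as a Double, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in demo: no Rhino classes are used. A StringBuilder plays the role of
// a non-String CharSequence, an Integer the role of a non-Double Number.
class LooseContractDemo {

    // Read a "JS string" without assuming it is a java.lang.String.
    static String asString(Object value) {
        if (value instanceof CharSequence) {
            return value.toString();
        }
        throw new IllegalArgumentException("not a string-like value: " + value);
    }

    // Read a "JS number" without assuming it is a java.lang.Double.
    static double asDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        throw new IllegalArgumentException("not a number: " + value);
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("test", new StringBuilder("Example text ").append("2021-07-12"));
        props.put("count", Integer.valueOf(3));

        System.out.println(asString(props.get("test")));
        System.out.println(asDouble(props.get("count")));
        // A direct cast such as (String) props.get("test") would throw a
        // ClassCastException here, which mirrors the surprise in this thread.
    }
}
```

Code written against CharSequence and Number keeps working whichever concrete class is returned, at the cost of one explicit toString() or doubleValue() call.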

@p-bakker
Collaborator

Have a look at the linked case from Alfresco: the issue there is sending a NativeObject into a Java method and then getting one of its properties in Java code, which is expected to be a String, but instead is a ConsString instance

And prior to the implementation of ConsString, it would of course return a String

I'm not really sure that it's expected behavior to have a ConsString instance leak out into Java code: ConsStrings are imho an internal thing to Rhino for optimizing string concatenation within scripting.

Either as soon as they are stored somewhere (as key or value in a NativeObject/Map/Set/...) they should be stored as a regular String primitive.

Or else, shouldn't at least the methods of the Map/List interface do the check and convert ConsString instances to Strings?

@tonygermano
Contributor

I saw what was happening, but I don't think it's a bug.

As you point out, it is sending a javascript object to a Java method, not a string. The javascript object is being sent as a NativeObject. A string (ConsString or not) would be sent as a String, especially if the java method signature was expecting a String, but I think also to minimize breaking changes for older implementations that accepted Objects and were trying to cast them directly to Strings.

Once you start pulling data out of a NativeObject from Java, I don't think there is any contract that says everything that is returned will be automatically converted from javascript to java, because that is exactly how Rhino itself interacts with a NativeObject, and you wouldn't want all of those automatic conversions happening internally in the engine.

The Context.jsToJava method that I mentioned exists for that purpose. The javadoc says:

Convert a JavaScript value into the desired type. Uses the semantics defined with LiveConnect3 and throws an Illegal argument exception if the conversion cannot be performed.

I see this as similar to the getX methods on a JDBC ResultSet. The getObject method will return what it actually is, but getString will always return a string, even if the column is actually a non-string type.

My suggestion (I think this can be accomplished with generics) is to allow:

String strValue = Context.jsToJava(obj.get("test"), String.class);
Integer intValue = Context.jsToJava(obj.get("test"), Integer.class);
// instead of
String strValue = (String) Context.jsToJava(obj.get("test"), String.class);
Integer intValue = (Integer) Context.jsToJava(obj.get("test"), Integer.class);
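
The generics idea can be sketched as a thin typed front end over an untyped conversion routine. Everything below is hypothetical: convertUntyped is a simplified stand-in that handles only two target types, not Rhino's Context.jsToJava, and convert is just one way such a signature might look.

```java
class TypedConversionSketch {

    // Stand-in for an untyped conversion routine that returns Object.
    // It is NOT Rhino's Context.jsToJava and supports only two target types.
    static Object convertUntyped(Object value, Class<?> target) {
        if (target == String.class && value instanceof CharSequence) {
            return value.toString();
        }
        if (target == Integer.class && value instanceof Number) {
            return Integer.valueOf(((Number) value).intValue());
        }
        throw new IllegalArgumentException(
                "cannot convert " + value + " to " + target.getName());
    }

    // Generic front end: the caller gets a T back and needs no cast.
    static <T> T convert(Object value, Class<T> target) {
        return target.cast(convertUntyped(value, target));
    }

    public static void main(String[] args) {
        String s = convert(new StringBuilder("abc"), String.class);
        Integer i = convert(Double.valueOf(3.0), Integer.class);
        System.out.println(s + " " + i);
    }
}
```

One caveat with this shape: Class.cast rejects boxed values when given a primitive class token such as int.class, so a real version would need to treat primitive targets separately.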

The get methods in NativeObject return Object. While a javascript number is usually a java.lang.Double, you might get something else which is still a Number but won't cast to a Double. In the same way, I think it would be fair to say while a javascript string is usually a java.lang.String, you might get something else which is still a CharSequence but won't cast to a String.

I considered your last question about specifically addressing ConsString in the Map/List interfaces, which seems reasonable, but should numbers get the same treatment, then? Converting a ConsString to a String also changes the internal state of the ConsString. An odd thing I noticed is that the Map.get method is actually defined in ScriptableObject, even though only NativeObject actually implements Map. I don't see ScriptableObject.get called very much in the code base except in the NativeObject Map implementation itself, and a few other places that are likely calling the wrong method in the first place.

var o = {value: ""};
while(next = getNext()) {
    o.value += next;
}

You'd have to be really careful that whatever changes were made didn't cause that loop to flatten the ConsString on every iteration.
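
To make the cost concrete, here is a hypothetical miniature of a lazily flattened concatenation node. It is not Rhino's ConsString, only a sketch of the general technique, and the counter exists purely to show how many real copies each strategy makes.

```java
// Hypothetical miniature of a lazily flattened concatenation node.
// It illustrates the idea only and is not Rhino's ConsString.
final class RopeSketch implements CharSequence {
    static int flattenCount = 0; // how many times a real copy was made

    private CharSequence left;
    private CharSequence right;
    private final int length;
    private boolean flat = false;

    RopeSketch(CharSequence left, CharSequence right) {
        this.left = left;
        this.right = right;
        this.length = left.length() + right.length();
    }

    // Copy all pieces into one String, once, and cache the result.
    private String flatten() {
        if (!flat) {
            flattenCount++;
            StringBuilder sb = new StringBuilder(length);
            appendTo(this, sb);
            left = sb.toString();
            right = "";
            flat = true;
        }
        return (String) left;
    }

    // Recursive walk; fine for a demo, too deep for very long chains.
    private static void appendTo(CharSequence cs, StringBuilder sb) {
        if (cs instanceof RopeSketch && !((RopeSketch) cs).flat) {
            RopeSketch rope = (RopeSketch) cs;
            appendTo(rope.left, sb);
            appendTo(rope.right, sb);
        } else {
            sb.append(cs.toString());
        }
    }

    @Override public int length() { return length; }
    @Override public char charAt(int index) { return flatten().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return flatten().substring(start, end);
    }
    @Override public String toString() { return flatten(); }

    // Appends n pieces and leaves the result unflattened.
    static CharSequence appendLazily(int n) {
        CharSequence value = "";
        for (int i = 0; i < n; i++) {
            value = new RopeSketch(value, "x"); // no copying yet
        }
        return value;
    }

    // Appends n pieces but flattens after every step, as a store that
    // converted values eagerly would do.
    static CharSequence appendFlatteningEachTime(int n) {
        CharSequence value = "";
        for (int i = 0; i < n; i++) {
            value = new RopeSketch(value, "x").toString(); // copies every time
        }
        return value;
    }
}
```

In this sketch the lazy loop performs a single copy when the result is finally read, while the eager loop copies the whole accumulated string on every iteration.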

@p-bakker
Collaborator

Well, agree with pretty much everything you say, except when it comes to ConsString :-)

That JS numbers resolve to different Number implementations in Java land is just the nature of the game, but to me ConsString is really just an internal thing to Rhino and as soon as it gets stored as a key or value, I think it should be resolved to a String, so ConsString instances will never leak to Java code.

Haven't tested it, just briefly looked at their code, but I think both Nashorn and GraalJS (both of which employ similar tactics to optimize String concatenation in JS I think) resolve their internal string representation to a String primitive when storing it as a key or value inside an Object/Map/Set/....

Curious what others think about this

@tonygermano
Contributor

I think the fact that a javascript string is often compatible with a java String is also an implementation detail. The value I mentioned that is returned by ScriptableObject.get as an Integer (even though I set the value to 3.0) will automatically be converted to a Double if that number is sent to a java method, because that is what LiveConnect3 specifies. I don't think that numbers and strings are different in this case. When you get a property value directly from a Rhino object in Java you are still in Rhino internals land, and there are functions to convert from js values to Java values and specify the desired target type.

If we do decide to make Map.get perform an auto-conversion for js strings, I feel the case to auto-convert js numbers is just as strong. We would have to put a strong warning on it to not call that method for internal operations, and we would also need to avoid converting on storage to a property value, or it would degrade ConsString performance to worse than if we weren't using it at all in common scenarios such as the loop in my most recent comment.

ConsStrings already get converted to Strings when being used as keys. The internal put methods must take a String, Symbol, or int as a key. The Map.put method on NativeObject is the only one that takes an Object for the key, and it throws an UnsupportedOperationException.

@tonygermano
Contributor

So, when a js object is sent to a java method from a js script or used as a return value from a script, what if instead of sending the actual internal object itself, it gets wrapped in a Scriptable that does perform conversion of values on its get and put methods? Would that solve the interoperability problem without needing to change the behavior of internal methods?

@michbarsinai

My 2¢ as a programmer that embeds Rhino in larger Java applications, and who also went a bit into Rhino code (not a very strong opinion but @p-bakker asked what others think):
I was never in a situation where getting a ConsString was useful. If at all, it created a special case for me when I had to do comparisons etc. So in my experience, any design that prevents them from being leaked into Javaland unless explicitly requested makes sense.

That being said, the point about number conversion is solid. I think there is a difference though, as Java does have a Number class, complete with sub-class hierarchy of Long, Double, etc. So when an API returns a Number, Java programmers know that the result could be subclassed if needed. String, on the other hand, is a final class and so people don't expect any polymorphic behavior here. Theoretically, the API could be changed to return CharSequence instead of String, and that would align the two cases.

That being said, the above solution might backfire on equality tests and hashCodes for users, since conceptually people expect ConsString and String to be equal whenever they hold the same char sequence. So there are multiple aspects to be balanced here. I suppose the "principle of least surprise" is a good guideline in these cases. The JS side is more prone to surprises because of its dynamic type system, so that side might need more protection against surprises.
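
The equality and hashCode concern is easy to reproduce with plain JDK classes. In this sketch a StringBuilder stands in for any CharSequence that is not a java.lang.String; whether a particular implementation overrides equals or hashCode is up to that class, so this shows the general hazard rather than Rhino's exact behavior. The helper name sameText is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CharSequenceEqualityDemo {

    // Compare by content, whatever CharSequence implementations are involved.
    static boolean sameText(CharSequence a, CharSequence b) {
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        String plain = "ABC";
        // Stand-in for a CharSequence that is not a java.lang.String.
        CharSequence other = new StringBuilder("A").append("BC");

        System.out.println(plain.equals(other));        // false
        System.out.println(other.equals(plain));        // false
        System.out.println(plain.contentEquals(other)); // true

        Map<Object, Integer> lookup = new HashMap<>();
        lookup.put(plain, 1);
        System.out.println(lookup.get(other));            // null
        System.out.println(lookup.get(other.toString())); // 1
    }
}
```

Normalizing with toString() before comparing or using a value as a map key avoids both problems.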

@p-bakker p-bakker removed the bug Issues considered a bug label Jul 15, 2021
@p-bakker p-bakker added Java Interop Issues related to the interaction between Java and JavaScript Triage Issues yet to be triaged labels Jul 15, 2021
@tonygermano
Contributor

tonygermano commented Jul 15, 2021

I don't know if there is an API that dictates what should happen in this case. We are talking about calling a Rhino method that is heavily used internally by the engine that gets a property by key and returns an Object representing a javascript (not Java) value. Context.jsToJava exists to convert a javascript value to a java value. Not using that means you are still manipulating javascript values using java. You can already say that a javascript string can be represented as a CharSequence because both internal representations of a js string implement that interface, but I don't think that is explicitly stated anywhere. Starting in 1.7.14, you also won't even be able to say that a Number is always a number, because if it's a BigInteger (also a Number,) then it's a bigint.

I'm wondering if a Wrapper (it doesn't actually have to be a Scriptable like I previously suggested) with a well-defined API to ease interacting with javascript objects from java should be the way to go, in the same way that we have a wrapper for interacting with java objects from javascript. A Context flag could optionally be used to automatically wrap js objects passed to java methods, but I don't see any reason why objects couldn't be manually wrapped (and unwrapped if needed) in java as well. The wrapper would also have to handle Callables among other things I'm probably not thinking of so they can be invoked from java.

Javascript functions are not subclasses of NativeObject, and therefore do not implement Map, but they probably should since they are also js objects with enumerable properties. That could be solved by having the wrapper itself implement Map. Javascript arrays should implement Map in addition to List, as they are also objects that can have named as well as indexed keys. Pretty much anything that you can call Object.keys() (edit: or rather Object.getOwnPropertyNames()... but only enumerable properties?) on should implement Map in its (user facing) java representation.

Having a wrapper with a clean api would also ease the pain of making api breaking changes to the internal objects in the future as that would become the preferred way to interact with objects from java code.

Context.javaToJs and related methods would unwrap the object if passed from java back to javascript.
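
A rough sketch of what such a wrapper could look like on the read path, with a plain java.util.Map standing in for the script object. The class name ConvertingView and its conversion rule (any CharSequence becomes a String) are assumptions made for illustration; nothing here is an existing Rhino API.

```java
import java.util.AbstractMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical read-only view; a plain Map stands in for the script object.
final class ConvertingView extends AbstractMap<String, Object> {
    private final Map<String, Object> target;

    ConvertingView(Map<String, Object> target) {
        this.target = target;
    }

    // Conversion rule assumed for this sketch: string-like values become
    // java.lang.String, everything else passes through unchanged.
    static Object toJava(Object value) {
        return (value instanceof CharSequence) ? value.toString() : value;
    }

    @Override
    public Object get(Object key) {
        return toJava(target.get(key));
    }

    @Override
    public boolean containsKey(Object key) {
        return target.containsKey(key);
    }

    // Returns a converted snapshot of the entries, not a live view.
    @Override
    public Set<Map.Entry<String, Object>> entrySet() {
        Set<Map.Entry<String, Object>> out = new LinkedHashSet<>();
        for (Map.Entry<String, Object> e : target.entrySet()) {
            out.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), toJava(e.getValue())));
        }
        return out;
    }
}
```

Because conversion happens only when a value is read through the view, the wrapped object keeps its internal representation, which is the property the performance argument earlier in the thread depends on.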

@tonygermano
Contributor

tonygermano commented Jul 15, 2021

Another surprise with calling any of the NativeObject.get (Map or Scriptable) methods is that none of those are the ecma262 get operation. They only check "own" properties instead of looking up the prototype chain. If the target object does not have a property defined for that key it will return UniqueTag.ID_NOT_FOUND instead of something where Undefined.isUndefined returns true. We are firmly in internals and not user land here. edit: The Map.get method will return null as you would expect from a Map in java when a key is not found.

Considering a potential user API wrapper, I think the Map operations should continue to only deal with "own" properties, converting values as they are returned, but there should also be a different get method to allow walking the prototype chain.
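
The distinction between an "own"-property lookup and a lookup that walks the prototype chain can be modeled in a few lines. The class below is a hypothetical toy model, not Rhino code; its NOT_FOUND sentinel only imitates the idea of a dedicated "not found" marker.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of an object with own properties and a prototype link.
final class ProtoObjectSketch {
    // Sentinel returned when a key is absent, so that a stored null or
    // undefined-like value can be told apart from "no such property".
    static final Object NOT_FOUND = new Object();

    private final Map<String, Object> own = new HashMap<>();
    private final ProtoObjectSketch prototype; // may be null

    ProtoObjectSketch(ProtoObjectSketch prototype) {
        this.prototype = prototype;
    }

    void put(String key, Object value) {
        own.put(key, value);
    }

    // Looks at this object's own properties only.
    Object getOwn(String key) {
        return own.containsKey(key) ? own.get(key) : NOT_FOUND;
    }

    // Walks the prototype chain until the key is found or the chain ends.
    Object getProperty(String key) {
        for (ProtoObjectSketch o = this; o != null; o = o.prototype) {
            if (o.own.containsKey(key)) {
                return o.own.get(key);
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        ProtoObjectSketch base = new ProtoObjectSketch(null);
        base.put("inherited", "from prototype");
        ProtoObjectSketch child = new ProtoObjectSketch(base);
        child.put("own", "from child");

        System.out.println(child.getOwn("inherited") == NOT_FOUND);
        System.out.println(child.getProperty("inherited"));
    }
}
```

A user-facing wrapper could expose both lookups under separate names, so that Map-style access stays limited to own properties.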

@tonygermano
Contributor

tonygermano commented Jul 16, 2021

I've never used GraalVM, so I might be incorrect in the details.

https://www.graalvm.org/reference-manual/js/JavaInteroperability/#access-to-javascript-objects-from-java

This says,

JavaScript objects are exposed to Java code as instances of com.oracle.truffle.api.interop.java.TruffleMap. This class implements Java’s Map interface.

I can't find TruffleMap in the javadocs, though, so I think that might be dated information.

I believe this is actually how javascript objects appear to java: https://www.graalvm.org/truffle/javadoc/org/graalvm/polyglot/Value.html

It is a wrapper class that can contain many different types of values and creates a public API for working with them. The as instance method works in the same way as Context.jsToJava (including my suggestion to use generics for the return type) if you want to convert from the multilanguage api class to a native java object. There are also methods to return values as java primitives.

@rbri
Collaborator

rbri commented Jul 21, 2021

Sorry i was not able to follow the whole discussion but...
Rhino is about having a js engine that is well integrated with java and supports writing extensions in java. I think we should stay with that goal and make this more or less simple.
From my point of view leaking the ConsString into java code is really confusing and i can't think of a really stable approach to avoid this every time the code flow crosses the border between js and java. Because of this we might consider to remove ConsString. Maybe the impact on the performance is not that big.

Will try this in HtmlUnit and report my results.

@p-bakker
Collaborator

The gist of the discussion (imho) is whether ConsString is an internal thing of Rhino and this should never leak out into Java, or whether it's just another Type one can expect to get in Java from JavaScript.

My position is the first, that is an internal thing, just to optimize string concatenation inside JavaScript and this should never leak to Java land.

Not sure if removing ConsString is the solution: to my knowledge it provides significant performance improvements when concatenating strings in JavaScript.

You're saying that you cannot see a way to prevent ConsString instances from leaking into Java, but aren't the only areas where that can still happen if ConsString instances are used as keys or values in JS Objects/Sets/Maps? If so, wouldn't that be relatively easy to fix?

@rbri
Collaborator

rbri commented Jul 22, 2021

Ok, my guess about the performance impact was a bit too optimistic: I did some measurements with the HtmlUnit test suite, which runs several test suites from real JS frameworks, and they show a 25% drop in performance. I think removing ConsString is not an option.

@p-bakker regarding the conversion, I'd like to have a look at the HtmlUnit code - there are at least some places where ConsStrings are handled from the Java code. Maybe I can make a unit test to show the cases. And then we can discuss if this is correct/expected or not.

@rbri
Collaborator

rbri commented Jul 22, 2021

Have checked the HtmlUnit (core-js) code and it looks like there is no special handling for ConsString, which means ConsString does not seem to leak into the Java code for the cases we have tests for.

@gbrail
Collaborator

gbrail commented Jul 22, 2021 via email

@rbri
Collaborator

rbri commented Jul 22, 2021

A bit off topic: found two minor optimizations for our ConsString handling but have to do some testing first

@rbri
Collaborator

rbri commented Jul 23, 2021

Have done a PR (#1000) containing the optimization and i found a place where we leak a ConsString into java. For me the case looks really like the one from this issue.

More details in the PR.

@p-bakker
Collaborator

Haven't tested it, but just looking at the testcode in your PR, the PR doesn't seem to address the issue that initially triggered this case and is on display in #247 (comment)

FWIW, the PR does seem to have its merits and looks good to me though

@p-bakker
Collaborator

I think that part of this confusion is because the interface between
embedded Java code and Rhino itself is not always clear. I'm not sure what
we can do, but perhaps those of you who embed Java code in Rhino in
different ways may have some ideas. (The projects I have worked on that use
Rhino generally implement JavaScript objects in Java, usually using the old
"@JSFunction" and other annotations, so that they effectively give users a
JavaScript API that happens to be implemented in Java.)

I think one thing we can do is flatten ConsString instances when they are stored as keys or values in JavaScript Objects, Maps and Sets. Think Nashorn & GraalJS do that as well

Another improvement could be making NativeString implement CharSequence, as suggested in one of the comments

@tonygermano
Contributor

I pointed out already that keys are already flattened (at least for Objects; I did not look at Maps, but it would make sense to flatten them there if they currently are not,) and values should not be flattened or you can erase all of the performance benefits (I also provided an example where that would be the case.)

From what I found of GraalVM, it defines an API specifically for passing cross-language values that lets you access them in a language independent way. There are methods to convert to native types, but they are not automatic. I don't know how Nashorn worked.

PR (#1000) looked fine. The test cases involve a javascript string being returned from javascript to java and ensuring it has been converted to String if it was a ConsString.

The case in question is when an Object is returned from javascript to java, and then java accesses properties of the javascript object, which may contain a ConsString, because the Object does not know it is being accessed from java. This is where GraalVM appears to wrap the Object in another API layer (an org.graalvm.polyglot.Value) before returning it to java, and gives access through that wrapper rather than giving access to internal Rhino methods.

I lean towards documenting the behavior when the internal methods are used (which is also useful for Rhino developers) and providing a graalvm-like wrapper as an option for Scriptables that do not convert to "primitive" java types to do some of the auto-conversion for user-facing code.

I think having a single java type with a well-defined API for handling dynamic js types which hides all of the implementation details is the right way to go for accessing values of non-primitive javascript types from java.

The other detail of the graalvm polyglot value is that it is bound to a context, and if the context is closed, then all operations throw IllegalStateExceptions. I don't know if we would want to copy that behavior as well.


8 participants