
Content-Type mismatch in publishEvent call #779

Open
mukundansundar opened this issue Sep 14, 2022 · 6 comments
Labels
good first issue Good for newcomers kind/bug Something isn't working P1 size/XS triaged/resolved Items triaged and ready
Milestone

Comments

@mukundansundar
Contributor

Expected Behavior

Content-Type mismatch in publishEvent call.

When a string event is published using

client.publishEvent("pubsubName", "topicName", "This is message #0")

what is seen in the broker (publishing over the gRPC protocol) is:

{"data":"This is message #0","datacontenttype":"application/json","expiration":"2022-09-14T07:34:04Z","id":"87bc9a7b-ab08-4e7a-8583-3f1e0433ab67","pubsubname":"kafka-pubsub","source":"test-batch-pub","specversion":"1.0","topic":"testingtopic","traceid":"00-6b8bfc12d6d2bb6c7e2f80503628f2b8-8a495373d49513aa-01","traceparent":"00-6b8bfc12d6d2bb6c7e2f80503628f2b8-8a495373d49513aa-01","tracestate":"","type":"com.dapr.event.sent"}

Here, because the content type is not given in the client.publishEvent call, it is inferred by the default serializer as application/json, and that is the datacontenttype field set in the cloud event in the broker.

On subscription, the subscriber correctly reads the value and users see something like

Subscriber got: This is message #0

But when we publish the same data, This is message #0, using the publishEvent call with the content type correctly set:

PublishEventRequest req = new PublishEventRequest("pubsubName", "topicName", "This is message #0");
req.setContentType("text/plain");
client.publishEvent(req);

The data seen in the broker is:

{"data":"\"This is message #0\"","datacontenttype":"text/plain","expiration":"2022-09-14T12:32:37Z","id":"e996716d-1d11-46d7-9112-6866cbb633ce","pubsubname":"kafka-pubsub","source":"test-batch-pub","specversion":"1.0","topic":"simpletest","traceid":"00-a58c282daed4035a660abbf14208cac2-283bb02770d94908-01","traceparent":"00-a58c282daed4035a660abbf14208cac2-283bb02770d94908-01","tracestate":"","type":"com.dapr.event.sent"}

and the subscriber actually gets Subscriber got: "This is message #0"

With the right content type given, the serialization escapes the quotes. But when no content type is given, a wrong content type is assumed.
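The quoting behavior can be illustrated outside the SDK: JSON-serializing a bare String wraps it in quotes, which is exactly what the subscriber then receives as part of the data. A minimal sketch (jsonQuote is a hypothetical helper mimicking what a JSON serializer does to a bare String, not SDK code):

```java
public class QuoteDemo {
    // Hypothetical helper mimicking what a JSON serializer does to a bare
    // String: wrap it in quotes and escape any embedded quotes.
    static String jsonQuote(String s) {
        return "\"" + s.replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String payload = "This is message #0";
        // The broker stores the JSON-quoted form, so the subscriber sees
        // the surrounding quotes as part of the message data.
        System.out.println(jsonQuote(payload)); // prints: "This is message #0"
    }
}
```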

Actual Behavior

The datacontenttype field should properly match the actual data field when the event is saved as a cloud event.

Steps to Reproduce the Problem

Publish with the overloaded methods publishEvent(pubsubName, topicName, data) and publishEvent(publishEventRequest), setting the proper metadata for the full request, and compare the events seen in the broker.

Release Note

RELEASE NOTE:

@mukundansundar mukundansundar added the kind/bug Something isn't working label Sep 14, 2022
@artursouza artursouza added this to the v1.7 milestone Sep 14, 2022
@artursouza artursouza added P1 triaged/resolved Items triaged and ready size/XS good first issue Good for newcomers labels Sep 14, 2022
@artursouza
Member

This should be fixed. Probably without "unescaping" the String but by correctly handling the datacontenttype attribute.

@mukundansundar
Contributor Author

One of the issues is that the Java SDK has overloaded publishEvent methods that do not require content type as a parameter. IMO these should be deprecated in favor of requiring content type as a parameter in the overloaded functions as well.
Then the serializer should properly serialize the event based on the content type. @artursouza thoughts?

@artursouza
Member

If the user passes a String, we should skip serialization and pass it as-is and use text as content-type.

@mukundansundar
Contributor Author

@artursouza Based on the offline discussion that we had:

  1. If a content type is given, we expect the user to provide the right content type for the default serializer, or the user should have overridden the serializer in the client.
  2. If no content type is provided, then (matching the different scenarios in HTTP) we try to convert the Object data into a String, be it an integer or any other primitive, or leave it as a string. We do not serialize the string, and we set the contentType field to text/plain. (@artursouza There is a caveat here: if the event is XML, then ideally the content type should change.)
  3. For POJO and MessageLite, we serialize using the JSON mapper and set application/json. @artursouza The issue here is that even if it is a CloudEvent, we will only know if the CloudEvent type is used, or the user has to provide the application/cloudevents+json contentType.
  4. For byte[], in HTTP we should base64-encode that field and set the content type to application/octet-stream, and in gRPC we pass on the bytes as-is and set contentType to application/octet-stream.

All the contentType changes will only be done if the contentType is not given.
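The inference rules above could be sketched roughly as follows (class and method names here are hypothetical illustrations, not the actual SDK API; a real implementation would also need to honor an explicitly provided content type and handle the XML caveat):

```java
public class ContentTypeInference {
    // Hypothetical sketch of the content-type inference rules discussed above,
    // applied only when the user did not supply a content type.
    static String inferContentType(Object data) {
        if (data instanceof byte[]) {
            return "application/octet-stream"; // rule 4: raw bytes
        }
        if (data instanceof String || data instanceof Number || data instanceof Boolean) {
            return "text/plain";               // rule 2: strings and primitives (XML caveat not handled)
        }
        return "application/json";             // rule 3: POJOs / MessageLite via JSON mapper
    }

    public static void main(String[] args) {
        System.out.println(inferContentType("hello"));     // text/plain
        System.out.println(inferContentType(new byte[0])); // application/octet-stream
    }
}
```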

@artursouza Though there is an issue with respect to XML data. Any thoughts on how that can be resolved?

@artursouza artursouza modified the milestones: v1.7, v1.8 Sep 30, 2022
@artursouza artursouza modified the milestones: v1.8, v1.9 Feb 1, 2023
@MregXN
Member

MregXN commented May 17, 2023

This is the code snippet from AbstractDaprClient.java.

  /**
   * {@inheritDoc}
   */
  @Override
  public Mono<Void> publishEvent(String pubsubName, String topicName, Object data) {
    return this.publishEvent(pubsubName, topicName, data, null);
  }

  /**
   * {@inheritDoc}
   */
  @Override
  public Mono<Void> publishEvent(String pubsubName, String topicName, Object data, Map<String, String> metadata) {
    PublishEventRequest req = new PublishEventRequest(pubsubName, topicName, data)
        .setMetadata(metadata);
    return this.publishEvent(req).then();
  }

It shows that no matter which overload is called, execution ends up in the same function.

DaprClientHttp.java implements publishEvent(PublishEventRequest request). It shows that data is serialized directly, before the contentType is set. If the user does not set anything, the contentType defaults to application/json.

  public Mono<Void> publishEvent(PublishEventRequest request) {

      ...

      Object data = request.getData();
      Map<String, String> metadata = request.getMetadata();

      byte[] serializedEvent = objectSerializer.serialize(data);
      // Content-type can be overwritten on a per-request basis.
      // It allows CloudEvents to be handled differently, for example.
      String contentType = request.getContentType();
      if (contentType == null || contentType.isEmpty()) {
        contentType = objectSerializer.getContentType();
      }
      Map<String, String> headers = Collections.singletonMap("content-type", contentType);

     ...

  }

So the trigger for this bug is whether the user sets the content type manually and correctly.

Maybe we could add a check before serialization in publishEvent(): if the content type is "text/plain", then manually convert the data to byte[] without calling serialize().
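That check might look roughly like the following sketch (not the actual DaprClientHttp code; serialize here is a stand-in for the SDK's objectSerializer.serialize, simplified to show only the quoting behavior):

```java
import java.nio.charset.StandardCharsets;

public class PublishSketch {
    // Sketch of the proposed check: for text/plain String data, bypass the
    // JSON serializer and use the raw UTF-8 bytes instead.
    static byte[] toEventBytes(Object data, String contentType) {
        if ("text/plain".equalsIgnoreCase(contentType) && data instanceof String) {
            return ((String) data).getBytes(StandardCharsets.UTF_8);
        }
        return serialize(data); // fall back to the existing serializer path
    }

    // Stand-in for objectSerializer.serialize(data): a JSON serializer quotes
    // a bare String, which is the mismatch shown earlier in this issue.
    static byte[] serialize(Object data) {
        return ("\"" + data + "\"").getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // With text/plain, the payload passes through unquoted.
        System.out.println(new String(toEventBytes("hi", "text/plain"), StandardCharsets.UTF_8)); // hi
    }
}
```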

@artursouza @mukundansundar What are your opinions? I can help to fix this.

@MregXN
Member

MregXN commented Jun 7, 2023

I have raised a related issue, #6448 in dapr/dapr. Maybe we need more discussion about whether the Dapr runtime should be responsible for this bug.
@artursouza @mukundansundar

@artursouza artursouza modified the milestones: v1.10, v1.11 Oct 3, 2023
@artursouza artursouza modified the milestones: v1.11, v1.12 Feb 16, 2024
@artursouza artursouza modified the milestones: v1.12, v1.13 Jul 23, 2024
@artursouza artursouza modified the milestones: v1.13, v1.14 Oct 22, 2024