Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load CDK: avro tests have temporal types #51546

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

edgao
Copy link
Contributor

@edgao edgao commented Jan 14, 2025

(pr will fail CI until the actual bug is fixed)

test now fails on our current behavior

image

@edgao edgao requested a review from johnny-schmidt January 14, 2025 20:09
@edgao edgao requested a review from a team as a code owner January 14, 2025 20:09
Copy link

vercel bot commented Jan 14, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 28, 2025 0:15am

return when (avroValue) {
null -> NullValue
is GenericRecord ->
ObjectValue(
avroValue.schema.fields.associateTo(linkedMapOf()) { field ->
field.name() to convert(avroValue.get(field.name()))
field.name() to convert(avroValue.get(field.name()), field.schema())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using the field schema vs using the passed in schema here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goal is to avoid accidentally masking real problems. I.e. if we wrote an avro file with the wrong schema, we should read back that schema, rather than forcibly converting it to the schema we expect

…/airbyte/cdk/load/data/avro/AvroExpectedRecordMapper.kt

Co-authored-by: Francis Genet <[email protected]>
@@ -15,22 +15,15 @@ import io.airbyte.protocol.models.v0.AirbyteRecordMessageMetaChange
import java.time.Instant
import java.util.UUID

class AirbyteValueWithMetaToOutputRecord {
class AirbyteValueWithMetaToOutputRecord(val extractedAtType: ExtractedAtType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels wrong to make the extractedAtType a first class citizen. The avro schema tests in the avro toolkit validate that the schema is correct. Why should this be a feature of the ITs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class gets used for all the output formats (json/csv/avro/parquet), so it needs to know who the caller is

maybe this is actually a sign that tests shouldn't treat the airbyte_X fields specially? I.e. if RecordDiffer just operated on a plain AirbyteValue, we could skip this step? (or at least not extracted_at, since we do want RecordDiffer to ignore loaded_at / raw_id)

(probably not a change I want to make right now, but if we ever want to come back and improve the core test framework, maybe it's on that list)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants