-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load CDK: avro tests have temporal types #51546
base: master
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
.../load-avro/src/testFixtures/kotlin/io/airbyte/cdk/load/data/avro/AvroExpectedRecordMapper.kt
Outdated
Show resolved
Hide resolved
return when (avroValue) { | ||
null -> NullValue | ||
is GenericRecord -> | ||
ObjectValue( | ||
avroValue.schema.fields.associateTo(linkedMapOf()) { field -> | ||
field.name() to convert(avroValue.get(field.name())) | ||
field.name() to convert(avroValue.get(field.name()), field.schema()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using the field schema vs using the passed in schema here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goal is to avoid accidentally masking real problems. I.e. if we wrote an avro file with the wrong schema, we should read back that schema, rather than forcibly converting it to the schema we expect
…/airbyte/cdk/load/data/avro/AvroExpectedRecordMapper.kt Co-authored-by: Francis Genet <[email protected]>
@@ -15,22 +15,15 @@ import io.airbyte.protocol.models.v0.AirbyteRecordMessageMetaChange | |||
import java.time.Instant | |||
import java.util.UUID | |||
|
|||
class AirbyteValueWithMetaToOutputRecord { | |||
class AirbyteValueWithMetaToOutputRecord(val extractedAtType: ExtractedAtType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels wrong to make the extractedAtType a first class citizen. The avro schema tests in the avro toolkit validate that the schema is correct. Why should this be a feature of the ITs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class gets used for all the output formats (json/csv/avro/parquet), so it needs to know who the caller is
maybe this is actually a sign that tests shouldn't treat the airbyte_X fields specially? I.e. if RecordDiffer just operated on a plain AirbyteValue, we could skip this step? (or at least not extracted_at, since we do want RecordDiffer to ignore loaded_at / raw_id)
(probably not a change I want to make right now, but if we ever want to come back and improve the core test framework, maybe it's on that list)
(pr will fail CI until the actual bug is fixed)
test now fails on our current behavior