Wildcard topic #92
@mtagle @C0urante any thoughts on this? For one, I'd prefer not to tie this to the Schema Registry. Getting topics directly from Kafka seems preferable. Thus far, we've tried to make the dependency on the Confluent Schema Registry as pluggable as we can. Also, to be clear, a regex would only work on startup. The Kafka Connect framework doesn't yet provide consumer refreshing--once it starts, the topic list is static.
@criccomini Think you nailed it with "once it starts, the topic list is static." Seems like the best fix would be with the Kafka Connect framework itself, as opposed to this specific connector. I just talked to @ewencp about it and he pointed out that there's an existing JIRA for adding that support; he'd be glad to give you some info on how to add this functionality to the framework if you're interested. @Kenji-H let us know if that would address your needs; if not, I'm sure adding a bit of regex-fiddling to the connector shouldn't be too much work.
Thanks for your quick replies. As far as I can understand from the source code, we would still need to update this library after the fix lands in the Kafka Connect framework. When creating the mapping from schema to table, namely BigQuerySinkTask::topicsToBaseTableIds, the library reads the "topics" parameter from a properties file, so it looks like this part would need some updates. Anyway, we should start with the Kafka Connect framework. I think I have some time to contribute these changes.
KAFKA-3037 has a duplicate issue, KAFKA-3074, which is resolved. So is the regex sink already implemented?
@Kenji-H No, that's resolved as a duplicate because it was accidentally re-registered. Doing this in the framework is definitely the right choice -- there's really no reason it shouldn't be handled as a generic option for sink connectors. This will likely be a very simple JIRA to implement, and there's really only one design decision to be addressed -- whether to try to do this with the existing option and just allow regexes, or whether we should add another option (e.g. a separate `topics.regex` setting).

To propose this change, you'd write up a Kafka Improvement Proposal (KIP). These are used to let the community vet changes to public interfaces, since adding to public interfaces is a commitment to support them moving forward. This might sound like a lot of effort, but for something simple like this the KIP will just be a couple of sentences in each section of the KIP template, and since it's a well-known feature that people want, discussion will probably be pretty minimal. If you're interested in taking this on, we can help guide you through the rest of the process and get the code committed.
Thank you for your kind help. As for the design option, I was considering a similar approach. Like you said, some special characters in regular expressions (commas, for instance) would clash with the current comma-separated `topics` parsing, so a separate option seems like the safer route.
Fantastic. @Kenji-H we'll leave this issue open until everything gets resolved.
Looks like this was released in 1.1.0. Once we upgrade Kafka dependencies, I believe we can support this feature. That said, we don't have plans to upgrade to 1.1.0 yet.
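For reference, a minimal sketch of what the 1.1.0-style configuration enables (the connector name and topic names below are made up for illustration):

```properties
# Before: every topic listed explicitly, comma-separated
name=bigquery-sink
connector.class=com.wepay.kafka.connect.bigquery.BigQuerySinkConnector
topics=logs-orders,logs-payments,logs-shipments

# After (Kafka Connect 1.1.0+, KIP-215): match topics by regex instead.
# Note that topics and topics.regex are mutually exclusive.
topics.regex=logs-.*
```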
Hello, any update on the solution?
Is this feature implemented? Are we able to consume from newly added topics?
I believe not. I tested creating new tables in the source, but (1) the regex wildcard is not enabled, and (2) even if it were, you would have to reload the connector for the new tables to go through. Altering and other table operations work well, except create/delete. I was thinking of creating some Python script mechanism that would reload the connector whenever the length of the topic list changes within the Kafka cluster (table creates/deletes go through to the Kafka cluster but not through the sink connector). But I haven't gotten around to the mechanism yet.
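For what it's worth, a minimal sketch of that polling mechanism (untested; the broker address, Connect REST URL, and connector name are placeholders), using the confluent-kafka AdminClient to watch the topic count and the Kafka Connect REST API to restart the connector when it changes:

```python
import time

import requests
from confluent_kafka.admin import AdminClient

BOOTSTRAP = "localhost:9092"           # placeholder broker address
CONNECT_URL = "http://localhost:8083"  # placeholder Connect REST endpoint
CONNECTOR = "bigquery-sink"            # placeholder connector name

admin = AdminClient({"bootstrap.servers": BOOTSTRAP})

def topic_count() -> int:
    # list_topics() fetches cluster metadata; skip internal topics like
    # __consumer_offsets, which start with an underscore.
    metadata = admin.list_topics(timeout=10)
    return sum(1 for t in metadata.topics if not t.startswith("_"))

last = topic_count()
while True:
    time.sleep(30)  # crude timer-based poll, as described above
    current = topic_count()
    if current != last:
        # Bounce the connector so it re-reads the topic list.
        requests.post(f"{CONNECT_URL}/connectors/{CONNECTOR}/restart")
        last = current
```

One caveat with this approach: the restart endpoint bounces the connector instance itself, not necessarily its tasks, so depending on the Connect version the tasks may need an explicit restart as well.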
Actually I was thinking about the same solution. By the way, you mentioned altering operations work well, but dropping or renaming a column is not working; I am not sure, but this might be due to BigQuery limitations. Another thing: in Avro mode the delete operation isn't reflected in Kafka, but in JSON mode it is available, since the "after" part is null.
Ah, wasn't aware of those additional limitations. Thanks for the heads up. Regarding the script idea, if you ever get to implement it please let me know; it would be great to get to know your approach. I did mine with a timer: every n seconds, check the length of the topic list. But that solution is too clunky; I am looking for ways for the script to "passively listen/observe" the list length... if you have any hints, that would be awesome. All the best.
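One possible direction for the "passive" approach, sketched here as an assumption rather than something tested (the group id and topic pattern are placeholders): a plain consumer subscribed with a regex pattern gets rebalanced whenever a newly created topic matches the pattern after a metadata refresh, so the rebalance callback can serve as the change signal instead of a timer:

```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker address
    "group.id": "topic-watcher",            # placeholder watcher group
    # How often the client refreshes metadata, and therefore how quickly
    # a newly created topic can be noticed.
    "topic.metadata.refresh.interval.ms": 30000,
})

def on_assign(cons, partitions):
    # Fires on every rebalance, including the one triggered when a newly
    # created topic matches the pattern -- a natural place to reload the
    # connector instead of checking the topic count on a timer.
    topics = sorted({p.topic for p in partitions})
    print(f"Now watching {len(topics)} topics: {topics}")

# A leading '^' marks the subscription as a regex pattern.
consumer.subscribe(["^logs-.*"], on_assign=on_assign)

while True:
    consumer.poll(1.0)  # poll() drives the rebalance callbacks
```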
For now, we provide topic names in a properties file as follows (the topic names below are made up for illustration):
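```properties
topics=logs-orders,logs-payments,logs-shipments
```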
If there are only a few topics to deal with, it's OK to list all of them in a comma-separated way. But when you have to deal with hundreds or thousands of topics, that doesn't scale. In those situations, we would like to use a wildcard expression to specify topics.
How about implementing a method in the SchemaRegistrySchemaRetriever class to retrieve all the topic names matching a given wildcard expression?