hare4 pending tasks #6476

Open
acud opened this issue Nov 20, 2024 · 1 comment
Comments

@acud
Contributor

acud commented Nov 20, 2024

Description

In order to finish hare4 we need to address the following issues:

  • stream resets were observed on the hare4 testnet
    • need to find out whether this is resource manager related
    • this needs to be tested with a <prefix size>:<#identities> sizing that corresponds to the mainnet settings (currently 4 bytes)
    • evaluate whether raising the limits specifically for hare is good enough
    • for mainnet, the limits should be rolled out in stages: first set them very high (or infinite) for the protocol and release that. once the high limits are in place across the whole network and we have proper metrics on how many streams are actively used at a time, pick one node on our own infrastructure and decrease its allowed number of streams until it reaches a sane level. because every other node still runs with the high limits, we can be sure that any side effects only happen on that node and its immediate neighbors.
  • if the resource limit solution isn't viable, try multiplexing the data within a single stream at the protocol level
@dshulyak
Contributor

dshulyak commented Nov 30, 2024

you need to disable pubsub.WithValidatorInline(true) in hare4. that option runs validation synchronously in the same goroutine that reads messages. without that option each validation is run in a separate goroutine, limited by another parameter.

the original validation in hare is fast (sub 1ms), but if validation makes rpc requests that can block for seconds, it will stall everything. imagine there are 16 workers and you receive 16 messages with conflicting ids: in that case all protocols will be halted while each worker waits on an rpc during validation

func (h *Hare) Start() {
	h.pubsub.Register(h.config.ProtocolName, h.Handler, pubsub.WithValidatorInline(true))
	// ...
}
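
To make the 16-workers scenario concrete, here is a small self-contained model of it. It does not use the real pubsub package: `msg`, `runPool` and `fastDelivered` are hypothetical names, the blocking rpc is simulated with a channel, and the bound on async validations (the "other parameter" mentioned above) is left out. It only shows why a blocking validator run inline on the reader goroutines starves every other protocol sharing those workers.

```go
package main

import (
	"fmt"
	"time"
)

// msg is a hypothetical pubsub message with the validator that must run for it.
type msg struct {
	protocol string
	validate func()
}

// runPool starts `workers` reader goroutines. With inline=true the validator runs
// on the reader itself; otherwise it is handed off to a new goroutine so the
// reader can pick up the next message immediately.
func runPool(workers int, inline bool, in <-chan msg, done chan<- string) {
	for i := 0; i < workers; i++ {
		go func() {
			for m := range in {
				if inline {
					m.validate()
					done <- m.protocol
				} else {
					go func(m msg) {
						m.validate()
						done <- m.protocol
					}(m)
				}
			}
		}()
	}
}

// fastDelivered queues one blocking "hare" message per worker, followed by one
// fast message from another protocol, and reports whether the fast message is
// processed while the hare validators are still blocked.
func fastDelivered(workers int, inline bool) bool {
	in := make(chan msg, workers+1)
	done := make(chan string, workers+1)
	release := make(chan struct{}) // closing this simulates the rpc calls returning
	for i := 0; i < workers; i++ {
		in <- msg{protocol: "hare", validate: func() { <-release }}
	}
	in <- msg{protocol: "other", validate: func() {}}
	close(in)

	runPool(workers, inline, in, done)
	defer close(release) // unblock the simulated rpc calls so goroutines can exit

	select {
	case p := <-done:
		return p == "other"
	case <-time.After(200 * time.Millisecond):
		return false // nothing got through: every reader is stuck in validation
	}
}

func main() {
	fmt.Println("inline validation, other protocol served:", fastDelivered(16, true))
	fmt.Println("async validation, other protocol served: ", fastDelivered(16, false))
}
```

In the inline case all 16 readers are parked inside the hare validator, so the message for the other protocol is never picked up until the simulated rpc calls return. In the async case the readers stay free and the fast message goes through right away.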
