Draft changes for sharding cacheops #68
Conversation
Merge from upstream
…> 2.6.12, not verified
# # which reference cache keys we delete, they will be hanging out for a while.
# pipe.delete(*conjs_keys)

for ck in conjs_keys:
There could be lots of conjs_keys. This will just hang up.
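For illustration, here is a minimal sketch (not cacheops code; only the conjs_keys name is taken from the diff above, everything else is assumed) of the difference between deleting keys one by one and batching the deletes through a pipeline:

import itertools
import redis

r = redis.StrictRedis()

def delete_keys_naive(conjs_keys):
    # One network round trip per key: with tens of thousands of keys
    # this keeps workers waiting on Redis for a long time.
    for ck in conjs_keys:
        r.delete(ck)

def delete_keys_chunked(conjs_keys, chunk_size=1000):
    # Batch deletes through pipelines of bounded size, so neither the
    # client nor Redis has to handle one giant command or 20k round trips.
    it = iter(conjs_keys)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            break
        pipe = r.pipeline(transaction=False)
        pipe.delete(*chunk)
        pipe.execute()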
I am not sure that is so. I have experience with one system under load where about 20k conj_keys were invalidated at one moment. At that moment 1500 uwsgi workers (readers) stalled for several seconds waiting for Redis, so the whole world stopped and the service was interrupted. It happened because the current version of invalidation holds one long transaction (effectively a lock) for readers and writers. I will explain my reasoning in more detail in the comments below.
But in any case, thank you for pointing out this line. I will measure it and compare it with the current version.
Sorry, that won't really work. I already commented on a line that will make too many network requests and do too much work on the Redis side. Also, why are you getting rid of transactions in favor of locks? It is significantly slower and won't guarantee that invalidators are written in …
http://redis.io/commands/SUNION and http://redis.io/commands/SMEMBERS have equal documented complexity. But SUNION is actually heavier, because it implicitly merges all the sets on the server. Maybe it is better to merge the results of all SMEMBERS calls on the client side instead…
Now I'll try to answer your question about transactions vs. locks. The brief answer is that the CAP theorem forced me: I need (A)vailability rather than (C)onsistency, and I think (A) is more significant than (C) for a cache system -- only my IMHO. And the full answer:
To solve points 1 and 3, I took …
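To make the SUNION vs. SMEMBERS trade-off above concrete, here is a rough sketch (assuming plain redis-py; names are illustrative) of fetching the same union either server-side or by merging pipelined SMEMBERS results on the client:

import redis

r = redis.StrictRedis()

def cache_keys_via_sunion(conj_keys):
    # One multi-key command; Redis merges all the sets server-side.
    # A sharding proxy like twemproxy cannot route a command like this.
    return r.sunion(*conj_keys)

def cache_keys_via_smembers(conj_keys):
    # Proxy-friendly: each SMEMBERS touches a single key; the union is
    # computed on the client. Pipelining keeps the round trips low.
    pipe = r.pipeline(transaction=False)
    for ck in conj_keys:
        pipe.smembers(ck)
    result = set()
    for members in pipe.execute():
        result.update(members)
    return result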
Yeah, you are right. The invalidation algorithm became slower, but the system as a whole seems more responsive. Also, we must remember that invalidation is a rare event compared with cache reads and writes. So in my opinion the right questions are "how much slower" and "where is the trade-off". Thank you for your notes! I will measure the mentioned parts and get back with results.
About robustness. As you know, the current version of the invalidation algorithm has a place where the death of a worker leads to cache_keys hanging around. But it works well in 99.99...% of cases, so ... :)
In your version the death of a Redis instance (or a network split) leaves lots of cache entries hanging without invalidators. That's one reason why data keys and their invalidation information should be stored on a single instance. The other reason is that this way you can write data and invalidation in a single network round trip on a cache miss. Various sharding strategies were discussed in #35. The right approach for your case depends on how data requests in your application are distributed across datacenters, how much data you have, whether it is sharded, and so on. If you don't mind, I'd like to hear what service you're building and how everything works / is supposed to work there. Regarding twemproxy not being able to support …
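As an illustration of that point, a simplified sketch (not the actual cacheops code; key names, serialization, and timeouts are assumptions) of storing the cached data together with its invalidation sets in one pipelined round trip on a cache miss:

import pickle
import redis

r = redis.StrictRedis()

def store_on_cache_miss(cache_key, data, conj_keys, timeout=600):
    # The data key and its invalidation sets live on the same instance,
    # so everything below reaches Redis in a single pipelined round trip.
    pipe = r.pipeline(transaction=True)   # MULTI/EXEC: all or nothing
    pipe.setex(cache_key, timeout, pickle.dumps(data))
    for ck in conj_keys:
        pipe.sadd(ck, cache_key)          # register this entry as invalidatable
        pipe.expire(ck, timeout + 60)     # let invalidators outlive the data a bit
    pipe.execute()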
There is another way to not use …
In the current version the death of Redis can kill the whole service. With the sharded version I can see several scenarios:
In all cases the system will survive the failure without any trouble. I'm really sorry, the New Year holidays have caught up with me. I believe I will continue soon ...
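For context, "surviving" a dead shard usually means treating the cache as best-effort. A rough, hypothetical sketch (function names are invented for illustration, not taken from this branch) of degrading to the database when a shard is unreachable:

import pickle
import redis

r = redis.StrictRedis()

def get_or_compute(cache_key, compute, timeout=600):
    # If the shard holding cache_key is down, fall back to computing
    # the value (e.g. hitting the database) instead of raising.
    try:
        cached = r.get(cache_key)
        if cached is not None:
            return pickle.loads(cached)
    except redis.ConnectionError:
        return compute()
    data = compute()
    try:
        r.setex(cache_key, timeout, pickle.dumps(data))
    except redis.ConnectionError:
        pass  # best-effort write; the service keeps running
    return data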
Happy New Year to you, too ) I see that your approach can work. I also see downsides to it. And one of …
Hi, Alexander!
I have done some measurements. Check my test code and results, please. As you said, smembers is quite bad. But I managed to get a working variant of a slightly modified original algorithm. Please review bench_invalidation.py and see the results below.

Proxy config:
[sq@sq-twemproxy ~]$ cat nutcracker.conf

Results:
/home/axeman/sandbox/django-cacheops/virtualenv/bin/python /home/axeman/sandbox/django-cacheops/bench_invalidation.py
Conj keys: 10, cache keys: 10
Backend: Redis - native
Conj keys: 100, cache keys: 100
Backend: Redis - native
Conj keys: 100, cache keys: 1000
Backend: Redis - native
Conj keys: 1000, cache keys: 100
Backend: Redis - native
Process finished with exit code 0
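The actual bench_invalidation.py is not reproduced here; a hypothetical outline of this kind of benchmark (all names, key layouts, and sizes below are illustrative assumptions) might look like:

import time
import redis

r = redis.StrictRedis()

def populate(n_conj, n_cache_per_conj):
    # Create n_conj conjunction sets, each referencing n_cache_per_conj cache keys.
    conj_keys = []
    for i in range(n_conj):
        ck = 'conj:%d' % i
        r.sadd(ck, *('cache:%d:%d' % (i, j) for j in range(n_cache_per_conj)))
        conj_keys.append(ck)
    return conj_keys

def invalidate(conj_keys):
    # One candidate strategy: pipelined SMEMBERS, client-side union, then DEL.
    pipe = r.pipeline(transaction=False)
    for ck in conj_keys:
        pipe.smembers(ck)
    cache_keys = set()
    for members in pipe.execute():
        cache_keys.update(members)
    if cache_keys:
        r.delete(*cache_keys)
    r.delete(*conj_keys)

def bench(n_conj, n_cache):
    conj_keys = populate(n_conj, n_cache)
    start = time.time()
    invalidate(conj_keys)
    print('Conj keys: %d, cache keys: %d -> %.3f s' % (n_conj, n_cache, time.time() - start))

for n_conj, n_cache in [(10, 10), (100, 100), (100, 1000), (1000, 100)]:
    bench(n_conj, n_cache)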
…oft lock (twem-proxy compatible)
Hi, Alexander! What do you think about the last version? I hope it is not so bad. It looks like a good trade-off, IMHO. Would you accept it? :)
First, sorry for the slow response. Second, sorry, but I won't merge it because it goes against my own vision of how multi-redis support should work in cacheops. So I suggest you maintain your branch and use it as you like. I can link to it from the README, at least until I or somebody else comes up with something better. Anyway, good job! ) It will be interesting to see how fast/slow a cache miss is in your version.
Hi! I will get back to you if I have interesting results. Bye!
Hi, Alexander!
It seems I have got a working version of sharded cacheops. I changed the lock behavior for compatibility with twemproxy. After that I assembled a scheme with twemproxy and two Redis servers behind it, and it works perfectly. Of course, I have not tested it with high concurrency and high load yet. I plan to check my changes on staging and in production after vacation. But for now I would be happy if you would check my changes and discuss them with me.
I see some narrow places in the new behavior, like the max lock time. But there are some improvements too. For instance, the lock became more granular. Also the versions of server and client must be increased: server >= 2.6.12, client >= 2.7.4.
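A minimal sketch of such a twemproxy-compatible soft lock, assuming redis-py and the SET ... NX EX form that appeared in Redis 2.6.12 (key prefix, timeout, and function names here are illustrative, not the exact code from this branch):

import time
import redis

r = redis.StrictRedis()

def acquire_soft_lock(name, timeout=5, wait=0.01):
    # SET key value NX EX is a single-key command, so a sharding proxy can
    # route it, and EX bounds the "max lock time" if the holder dies.
    key = 'lock:' + name
    while not r.set(key, 1, nx=True, ex=timeout):
        time.sleep(wait)

def release_soft_lock(name):
    r.delete('lock:' + name)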
You can see my twemproxy config (YAML)
Thanks in advance!