
Draft changes for sharding cacheops #68

Closed · wants to merge 6 commits

Conversation

@ttyS15 (Contributor) commented Dec 30, 2013

Hi, Alexander!

It seems I have a working version of sharded cacheops. I changed the lock behavior for compatibility with twemproxy, then assembled a setup with twemproxy and two Redis servers behind it, and it works perfectly. Of course, I have not tested it under high concurrency and heavy load yet; I plan to check my changes on staging and in production after vacation.

For now, I would be happy if you could review my changes and discuss them with me.

I see some weak spots in the new behavior, like the maximum lock time, but there are improvements too: for instance, locking became more granular.

Also, the minimum versions of the server and client must be raised: Redis server >= 2.6.12, redis-py client >= 2.7.4.
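
These minimums line up with SET gaining the NX and PX options in Redis 2.6.12, which redis-py 2.7.4 exposes as keyword arguments; together they let a lock be taken atomically with a single command that twemproxy can route. A minimal sketch of such a spin lock, with hypothetical key names and timeouts rather than the actual patch:

    import time
    import uuid

    import redis

    client = redis.StrictRedis()

    def acquire_lock(name, timeout_ms=5000, spin_ms=10):
        # SET key value NX PX <ms> takes the lock atomically in one command,
        # so it passes through twemproxy. Redis >= 2.6.12 introduced NX/PX
        # on SET; redis-py >= 2.7.4 exposes them as keyword arguments.
        token = uuid.uuid4().hex.encode()
        while True:
            if client.set(name, token, nx=True, px=timeout_ms):
                return token
            time.sleep(spin_ms / 1000.0)  # spin until release or expiry

    def release_lock(name, token):
        # Delete the lock only if we still own it. This GET/DEL pair is not
        # atomic, which is one of the trade-offs of dropping MULTI.
        if client.get(name) == token:
            client.delete(name)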

You can see my twemproxy config (YAML):

redis1:
  listen: 192.168.144.170:6379
  redis: true
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: true
  timeout: 400
  server_retry_timeout: 2000
  server_failure_limit: 1
  servers:
   - 192.168.144.151:6379:1
   - 192.168.144.61:6379:1

Thanks in advance!

# # which reference cache keys we delete, they will be hanging out for a while.
# pipe.delete(*conjs_keys)

for ck in conjs_keys:
@Suor (Owner) commented:

There could be lots of conjs_keys. This will just hang.

@ttyS15 (Contributor, Author) replied:

I am not sure that is so. I have seen one system under load where about 20k conj_keys were invalidated at one moment. At that moment, 1500 uwsgi workers (readers) idled for several seconds waiting for Redis, so the whole world stopped and service was interrupted. It happened because the current version of invalidation holds one long transaction (effectively a lock) over readers and writers. I will explain my thinking in more detail in the comments below.

In any case, thank you for flagging this line. I will measure it and compare with the current version.

@Suor (Owner) commented Dec 30, 2013

Sorry, that won't really work. I already commented on a line that will do too many network requests and too much work on the Redis side.

Also, why are you getting rid of transactions in favor of locks? It is significantly slower and won't guarantee that invalidators are written within the cache_thing() call.

@ttyS15 (Contributor, Author) commented Dec 31, 2013

http://redis.io/commands/SUNION and http://redis.io/commands/SMEMBERS have equal documented complexity, but SUNION is actually more expensive because it implicitly merges all the sets.

Maybe merging the results of all the SMEMBERS calls on the client side would be a good idea. I'll check it. Thank you!
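
A minimal sketch of that client-side merge, assuming redis-py and hypothetical key names: one SMEMBERS per conjunction key goes down a non-transactional pipeline and the union happens in Python, so every command stays twemproxy-compatible:

    import redis

    client = redis.StrictRedis()

    def union_on_client(conj_keys):
        # One SMEMBERS per conjunction set, pipelined without MULTI so the
        # batch is twemproxy-friendly; the union itself happens in Python.
        pipe = client.pipeline(transaction=False)
        for key in conj_keys:
            pipe.smembers(key)
        cache_keys = set()
        for members in pipe.execute():  # one reply per SMEMBERS call
            cache_keys.update(members)
        return cache_keys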

Now I'll try to answer your question about transactions vs. locks. The brief answer is that the CAP theorem forced me: I need (A)vailability rather than (C)onsistency, and I think availability matters more than consistency for a cache system -- just my opinion.

And the full answer:

  1. I need a cacheops version that can be distributed across several servers, because:
  • our Redis node has reached a load of about 80k rps and consumes 60-80% CPU;
  • several thousand workers spread across 3 datacenters depend on a single process (Redis), which can be blocked for an unbounded time (see my comment above);
  • a single-threaded Redis process cannot make effective use of a multicore CPU.

  2. I want the system to stay live and interactive overall (at least for most readers) even during cache invalidation.

  3. I want to improve failover for the Redis node used by cacheops. At present we use a hot-cold scheme with Sentinel and a hardware balancer. Ugly, sorry!

To solve points 1 and 3 I took twemproxy. By design, twemproxy does not provide full compatibility with the Redis protocol: as you can see (https://github.com/twitter/twemproxy/blob/master/notes/redis.md), it does not support MULTI and has only partial support for SUNION (effectively no support in my case). So I had no choice. As a side effect of replacing the transaction with a spin lock for invalidators, I got (I believe):

  • the invalidation process has gaps during which other workers can do their work;
  • readers are not affected by the lock (especially with several Redis nodes);
  • different invalidation processes can run concurrently.

Yes, you are right: the invalidation algorithm became slower, but the system as a whole seems more interactive. Also, we should remember that invalidation is a rare event compared with cache reads and writes. So in my opinion, the right questions are "how much slower" and "where is the trade-off".

Thank you for your notes! I will measure the mentioned parts and get back with results.

@ttyS15 (Contributor, Author) commented Dec 31, 2013

About robustness: as you know, the current version of the invalidation algorithm has a spot where the death of a worker leads to hanging cache_keys. But it works fine in 99.99...% of cases, so... :)

@Suor (Owner) commented Dec 31, 2013

In your version, the death of a Redis instance (or a network split) leaves lots of cache entries hanging without invalidators. That's one reason why data keys and their invalidation information should be stored on a single instance. The other reason is that this way you can write the data and its invalidation in a single network round trip on a cache miss.
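
A minimal sketch of that single round trip, assuming redis-py and hypothetical names rather than cacheops' actual internals:

    import redis

    client = redis.StrictRedis()

    def cache_with_invalidators(cache_key, data, conj_keys, timeout):
        # A transactional pipeline (MULTI ... EXEC) batches the cache write
        # and its invalidator bookkeeping into one network round trip.
        pipe = client.pipeline()
        pipe.setex(cache_key, timeout, data)
        for conj_key in conj_keys:
            pipe.sadd(conj_key, cache_key)      # make the entry findable for invalidation
            pipe.expire(conj_key, timeout * 2)  # invalidators should outlive the entry
        pipe.execute()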

Various sharding strategies were discussed in #35. The right approach for your case depends on how data requests in your application are distributed across datacenters, how much data you have, whether it is sharded, and so on. If you don't mind, I'd like to hear what service you're building and how everything works or is supposed to work there.

Regarding twemproxy not being able to support MULTI: client-side sharding would work here and shouldn't be too hard to implement.
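
A minimal sketch of that client-side sharding idea, with a hypothetical crc32-based router and the addresses from the twemproxy config reused purely for illustration: each key maps to exactly one node, so MULTI keeps working because a transaction never spans shards:

    import zlib

    import redis

    # One connection per shard; the addresses are illustrative.
    shards = [
        redis.StrictRedis(host='192.168.144.151', port=6379),
        redis.StrictRedis(host='192.168.144.61', port=6379),
    ]

    def shard_for(key):
        # Route a key to a shard by hash. If a cache entry and its
        # invalidators are routed by the same part of the key, they land
        # on one instance and MULTI keeps working there.
        return shards[zlib.crc32(key) % len(shards)]

    # The whole transaction targets a single node, so MULTI/EXEC is fine:
    pipe = shard_for(b'conj:some_model').pipeline()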

@Suor (Owner) commented Dec 31, 2013

There is another way to avoid MULTI: rewrite all the cache miss/invalidation logic in Lua; that way you can use twemproxy. You will, however, need to send the invalidation request to all cache instances.
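
A minimal sketch of that Lua route, assuming redis-py's register_script and hypothetical key names; the script is an illustration, not cacheops code:

    import redis

    client = redis.StrictRedis()

    # The script runs atomically server-side, so no MULTI/EXEC is needed.
    # Under twemproxy, all KEYS passed to one EVAL must hash to the same
    # shard (e.g. via a shared hash tag), hence invalidation still has to
    # be sent to every instance separately.
    cache_thing = client.register_script("""
    local timeout = tonumber(ARGV[1])
    redis.call('SETEX', KEYS[1], timeout, ARGV[2])
    for i = 2, #KEYS do
        redis.call('SADD', KEYS[i], KEYS[1])       -- register the invalidator
        redis.call('EXPIRE', KEYS[i], timeout * 2)
    end
    """)

    cache_thing(keys=['cache:q1', 'conj:a', 'conj:b'], args=[600, 'payload'])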

@ttyS15 (Contributor, Author) commented Dec 31, 2013

In the current version, the death of Redis can kill the whole service. With the sharded version I can see several scenarios:

  1. It's just a cache! Let it be slightly corrupted; in any case it will be fully restored after the expiration time. We lose only k/n of the keys, where k is the number of failed nodes and n the total number of nodes (e.g. one dead node out of two costs half the keys). Moreover, some of those keys would never have been read anyway. Not a problem.
  2. It's just a cache, but it must be consistent. Let's flush it if some nodes have died.
  3. We love our cache and are not sure we can survive a cold start. OK, let's set up failover for each node and sleep well.

In all cases the system will survive a failure without much trouble.

I am really sorry, the New Year holidays are calling me. I believe I will continue soon...

@Suor (Owner) commented Dec 31, 2013

Happy New Year to you, too )

I see that your approach can work. I also see downsides to it, and one of them is code that differs significantly from what the one-node scenario dictates. So think a bit about the client-side sharding I described above.

@ttyS15 (Contributor, Author) commented Jan 17, 2014

Hi, Alexander

I have done some measurements; please check my test code and the results. As you said, SMEMBERS is quite bad, but I managed to get a working variant of a slightly modified original algorithm. Please review bench_invalidation.py and see the results below.

Proxy config

[sq@sq-twemproxy ~]$ cat nutcracker.conf 
redis1:
  listen: 192.168.144.170:6379
  redis: true
  hash_tag: "::"
  hash: md5
  distribution: ketama
  auto_eject_hosts: true
  timeout: 40000
  server_retry_timeout: 2000
  server_failure_limit: 1
  servers:
   - 192.168.144.151:6379:1
   - 192.168.144.61:6379:1
[sq@sq-twemproxy ~]$ 

Results

/home/axeman/sandbox/django-cacheops/virtualenv/bin/python /home/axeman/sandbox/django-cacheops/bench_invalidation.py

Conj keys: 10, cache keys: 10

Backend: Redis - native
time: 0.00198602676392
Backend: TWEM - native
error: unsupported
Backend: Redis - twem_proxy_compat
time: 0.00952076911926
Backend: TWEM - twem_proxy_compat
time: 0.0115759372711
Backend: Redis - twem_proxy_compat_opt_1
time: 0.00291299819946
Backend: TWEM - twem_proxy_compat_opt_1
time: 0.0034761428833
Backend: Redis - twem_proxy_compat_opt_2
time: 0.00259709358215
Backend: TWEM - twem_proxy_compat_opt_2
time: 0.00280117988586
Backend: Redis - twem_proxy_compat_opt_3
time: 0.00199699401855
Backend: TWEM - twem_proxy_compat_opt_3
time: 0.00243401527405

Conj keys: 100, cache keys: 100

Backend: Redis - native
time: 0.0036518573761
Backend: TWEM - native
error: unsupported
Backend: Redis - twem_proxy_compat
time: 0.749498128891
Backend: TWEM - twem_proxy_compat
time: 0.17934012413
Backend: Redis - twem_proxy_compat_opt_1
time: 0.0529139041901
Backend: TWEM - twem_proxy_compat_opt_1
time: 0.133848905563
Backend: Redis - twem_proxy_compat_opt_2
time: 0.0429902076721
Backend: TWEM - twem_proxy_compat_opt_2
time: 0.0433089733124
Backend: Redis - twem_proxy_compat_opt_3
time: 0.00362396240234
Backend: TWEM - twem_proxy_compat_opt_3
time: 0.00596594810486

Conj keys: 100, cache keys: 1000

Backend: Redis - native
time: 0.0252318382263
Backend: TWEM - native
error: unsupported
Backend: Redis - twem_proxy_compat
time: 1.64295601845
Backend: TWEM - twem_proxy_compat
time: 2.11577010155
Backend: Redis - twem_proxy_compat_opt_1
time: 0.523754119873
Backend: TWEM - twem_proxy_compat_opt_1
time: 24.5513391495
Backend: Redis - twem_proxy_compat_opt_2
time: 0.4054210186
Backend: TWEM - twem_proxy_compat_opt_2
time: 0.411654949188
Backend: Redis - twem_proxy_compat_opt_3
time: 0.0190849304199
Backend: TWEM - twem_proxy_compat_opt_3
time: 0.107830047607

Conj keys: 1000, cache keys: 100

Backend: Redis - native
time: 0.00504016876221
Backend: TWEM - native
error: unsupported
Backend: Redis - twem_proxy_compat
time: 0.161504030228
Backend: TWEM - twem_proxy_compat
time: 0.373972892761
Backend: Redis - twem_proxy_compat_opt_1
time: 0.0837070941925
Backend: TWEM - twem_proxy_compat_opt_1
time: 0.322618961334
Backend: Redis - twem_proxy_compat_opt_2
time: 0.0658929347992
Backend: TWEM - twem_proxy_compat_opt_2
time: 0.068078994751
Backend: Redis - twem_proxy_compat_opt_3
time: 0.00423884391785
Backend: TWEM - twem_proxy_compat_opt_3
time: 0.00618290901184

Process finished with exit code 0

@ttyS15 (Contributor, Author) commented Jan 20, 2014

Hi, Alexander

What do you think about the last version? I hope it is not so bad; it looks like a good trade-off, IMHO. Would you accept it? :)

@Suor (Owner) commented Jan 26, 2014

First, sorry for the slow response. Second, sorry, but I won't merge this because it goes against my own vision of how multi-Redis support should work in cacheops.

So I suggest you maintain your branch and use it as you like. I can link to it from the README, at least until I or somebody else comes up with something better.

Anyway, good job! ) It will be interesting to see how fast or slow a cache miss is in your version.

@ttyS15 (Contributor, Author) commented Feb 7, 2014

Hi!
It's a pity. :( But in any case, thank you for your attention! I will create a branch in my fork as you suggested, and I hope to keep it in sync as long as possible.

I will get back to you if I have interesting results.

Bye!

@ttyS15 closed this Feb 7, 2014