-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Multi-server support #35
Comments
There are two different ways to use multiple redises. First is when there are separate caches on different servers (this is handy when you have geographically distributed application servers), second when cache is distributed between redises. In both cases you need to invalidate on all servers. (You probably can ignore this and invalidate only local cache in first case when you know that queries on different app servers are different). I assume you are talking about second case - sharding. Regarding schemes, storing all schemes on all instances is a bad idea, as far as I can see, cause then you'll need to update them on all instances on every cache miss. I was thinking about storing making each instance kind of self contained cache which holds its schemes and invalidators. This way you can write to cache to a single instance, you'll still need to invalidate on all of them, but I didn't found a general way to avoid this. Also I should ask. Why do you need this? Are you out of CPU on single redis instance? out of memory on single server? need more durability? (In the latter case replication should work). I ask all these questions because the answers will determine what kind of multi-server support you need. I myself actually looked into first variant of multi-server support, but found it's ok to rely on timeout invalidation for not so high coherent data. |
My reasons would be, in order of importance:
And 1. should also be the easiest to solve indeed. Set up 1 master with N slaves and use for example rediscluster to do all the communication with the cache. It already knows to do all writes to the master and distribute reads to the slaves by default. We still write to a single server so that remains simple. And actual sharding would be the ideal solution (if implemented fairly efficiently of course) for the 2nd problem, similarly how you can easily add N memcache servers if using more standard Django caching solutions. |
Sharding won't work with redis cluster cause we need to store our keys with their invalidators on single instance and cluster stores keys as it wants. And you most probably don't need cluster for 1. So it's out of question as I know. For 1 the single thing is needed from cacheops is failover, redis already handles replication. 2 is not so easy. One need to map keys to instances and then use single instance for data, invalidators and schemes in I was thinking to rewrite cacheops with lua scripting, this will allow me to trash in memory cache of schemes, and that will make sharding trivial. My very initial hacks are in scripting branch. |
I think I stand closely to solving this problem. Please check my approach. |
Given production-ready redis cluster and not using transactions but scripting now, this is relatively easy to implement. I don't need this functionality, but if anyone is willing to make a pull request I'll work with him/her on it. Here are my preliminary designs. I see 2 approaches. Shard data logically. By model or some condition, e.g. user id, geographic region or whatever. This is especially convinient when you already shard your database. In this scenario each of redis clusters 16384 shards acts as independent cacheops database - each key goes to same shard with its invalidation set and schemes. This way we invalidate in single place with one request. This could be facilitated by use of hash tags - parts of keys in def hash_tag_func(model, attname, value):
# Shard these by model
if model in (City, Country, Category):
return model.__name__
# Shard users by id
elif model is User and attname == 'id':
return 'u%d' % value
# Shard the rest by user
else:
user_fk = first(f.attname for f in model._meta.fields
if f.rel and f.rel.to is User)
assert user_fk, "Need foreign key to User to shard %s" % model.__name__
if field == user_fk:
return 'u%d' % value Whatever cache_key = '{u42}q:G7A...'
conj_key = '{u42}conj:...'
... Hash tag function is called each time we make a cache fetch, save or invalidation. When we have dnf tree (or queryset from which that is derivable) we pass all elementary conditions from that to this function, in Shard by full cache key, invalidate everywhere. It's actually as simple as it sounds. When we Both approaches could also be used with client side sharding or twemproxy, so redis cluster is not strictly needed. Also, they are not mutually exclusive, eventually both could be implemented. |
We are using redis cluster together with redis-py-cluster in the rest of our application, so I think I could integrate that into django-cacheops. At very first attempt, changing redis.StrictRedis to rediscluster.StrictRedisCluster produces the following errors: Upon read, it complaints about accessing accessing keys from a different node, in the
Repeating the read 3 (number of master nodes in the cluster) times in eventually succeeds. Upon invalidation, (e.g. when saving an model), in the
Any suggestions on how to approach this are appreciated. |
@anac0nda you need to decide how you shard. Cacheops has keys depending on each other so you can't just distribute them by generic hash key as redis cluster does. See my comment with Note that to use second approach from there you need to disable redis cluster check if key belongs to node. Not sure whether this is possible, you may need to use just and a bunch of nodes instead of cluster. |
A note: we have 2 approaches here hash tag function vs router function. A second one also solves master/slave. |
I went to an approach similar to the second one (shard by full cache key), since in my scenario there is no suitable way to logically shard (load is not distributed evenly amongst users or models). Before caching with So, let's say key For the invalidation phase, the models are cleared from all nodes with the following updated wildcard:
I also added a loop in the I did some basic testing (with 2 types of clusters: 3 masters only and 3 master/3 slaves) with our applications and it seems to be working correctly. I will create a PR when I run the tests suite. Meanwhile, please have a look at the attached patch file. Comments on the approach are welcome! |
Looked through the patch. Working approach overall, I see 2 major issues though:
|
I successfully ran the tests, both against cluster and a single redis instance. Points you mention are now addressed in the new patch.
Let me know what you think. |
Nodes change check as written probably won't work. Here are my considerations:
|
Hi @Suor and @anac0nda. I got an email a day ago from @anac0nda i think it was asking for some input into this question. I answered his email but i would like to make a shorter information dump here on how redis cluster works. Redis cluster is based around a hash slots system (~16000 slots). Each key that you send to a cluster gets hashed with CRC16 algorithm with modulus on slot size. Each command then puts their data into a bucket. For example if you run Regarding how nodes work and is handled. When a cluster node event happens different things happen with the cluster based on what happens. If a node is added if you want to expand your capacity in the cluster, you migrate some of the hash slots from the existing nodes to the new node in the cluster. The client and redis itself can already handle this case where it will still server data fetching operations from the old hash slot location during the migration, and all writers will be set for the new one. The transition is seamless and basically the only thing you will notice in the client layer is a slowdown in how many operations each client can handle because there is overhead in dealing with the error handling and new routing. I would argue that there really is never a use case for a user of any redis client to actually base anything around how the cluster really looks. The only use-case i have figured out until now is if you want to monitor cluster changes with some monitoring solution or send notifications in case the cluster does some fail over in case a master node goes down. But i have never seen any reason to add a feature where a client can track node changes and react based on those changes. One other thing is that i have looked through the patch that was submitted but i can't really figure out why you really need this node tracking in the client layer. If you want to, you can enlighten me why this is really needed within this context. If you want to solve the problem with sending all keys that belongs to say an instance of a model object to the same node, you should really look at how regular SQL works with their auto incr on the primary key and use that ID inside the If this is not the case, please point me in the right direction of the problem that this node tracking is trying to solve :) |
@Grokzen the issue with using redis cluster for cacheops is that doesn't fit this whole lot of hashslots model. The reason for this is that cacheops uses one keys refering to other keys: sets with key names. Each such set references some event and keys in a set is cache keys of queries dependent on this event. If all queries in an app can be separated into some independent shards when we could use The solution in a patch by @anac0nda is to make all keys on single node to have same
The prefixes are calculated by trial and error here. As you can see when new node is added it won't be used unless
The exact outcome depends on how slots are redistributed. This theretically could be resolved by binding keys to hash slots not nodes, but that would be impractical. The thing is if you shard by cache key hash then you need to perform invalidation procedure on each shard, which is too much. Also theoretically you can shard not to all hash slots but to some amount of them, say Another approach that probably makes more sense is using Just a Bunch Of Redises + Client Side Sharding + (optionally) Sentinel. This way you won't need to use any prefixes but just send each request to appropriate redis. You will still need to run invalidation procedure on all master nodes. |
Do you currently have a unique ID generation strategy that is in place right now if i would to use only 1 server? If i for example want to cache a query-set object, how do it know that the next time i run the code it will fetch what object from redis? I do not have that much time to dig through all code and understand it all so it would be great if i could be pointed tot he correct place :) |
@Grokzen its |
@Suor what about integrate cachops with https://github.com/zhihu/redis-shard ? |
@genxstylez sharding lib is a trivial part of making cacheops multi-server, the main part is choosing sharding or routing strategy. See above. |
Happy to be the champion here. |
Anything moved here? |
Implementing cache key prefix was a step towards it. The idea is to use Setting everything up turned up to be bigger than I first expected. So I will appreciate any help especially with setting up test env for this. |
I see. Thank you. |
Any update on this? Opened since 2013 😱 |
People often implement some custom solution, which is hardly generalizable, or just don't have time to clean it up and contribute. This exists and open for discussion and reference, there is no active work now. |
Hi, I come across this problem and would love to have my own simple patch to make this work. from cacheops.redis import CacheopsRedis
from cacheops.redis import redis_client, handle_connection_failure
from rediscluster import StrictRedisCluster
class StrictRedisCluster1(StrictRedisCluster):
def _determine_slot(self, *args):
print(args) # debugging
return super(StrictRedisCluster1, self)._determine_slot(*args)
# this class is used in django setting
class CacheopsRedisCluster(StrictRedisCluster1, CacheopsRedis):
get = handle_connection_failure(StrictRedisCluster.get) I understand this would not work. I already set the prefix I'm still reading the codebase, I hope you can point me into some direction to achive that @Suor. |
|
Hello again guys, so I manage to create a small patch to make cacheops work with redis cluster. Its not clean but might help someone in needed. So I ask your permission to share it here Suor: I have a folder compat_cacheops.zip to override some of cacheops method. |
Hello @AndyHoang, thanks for your example. I think those things might be incorporated as long as it still works for single redis scenario. |
Hello, Apparently redis-py have added RedisCluster support to version v4.1.0rc1: Hopefully it will now be easier to also support this with cacheops in a future version... |
Sharding strategy has been possible to set up for a while via The new insideout mode is a path to do it without sharding, i.e. with random rather than logical splitting. One will still need to rewrite The key difference is that in that mode cache keys refer conj keys - instead of the conj sets containing cache keys, so if we get a split we simply dim that cache key invalid, while in traditional cacheops mode we need conj sets to be there to perform invalidation and in a split case we have undead cache keys. |
Are there any updates on this? I basically need to connect cacheops to myAWS Redis Cluster with cluster mode turned on. Am I right in saying that it's not as simple as simply updating the cache redis path / URL in settings.py to the Redis Cluster? If not, does anyone have any solutions for working with AWS redis clusters? I'm basically looking to create a better failover setup in production. |
There is no ready to use solution for this, you are right. People create something that works for them but either don't share it or don't generalize it enough to be included. It's hard to generalize things like this to be fair. And I don't need it myself so not working on it either. |
Thanks. Totally get that it's not an easy solution but I figured it should be a pretty common one. If anyone has a working solution they'd be willing to share, please let me / us etc know. I think there would be a lot of interest in it |
There is this https://github.com/ParcelPerform/django-cacheops |
Thanks. I think I saw that before and was put off by the point that so much customisation was done to the extent that @Suor said it couldnt be merged as it invalidated / conflicted with a large part of the existing codebase |
Just checking in to see if anyone has any updates on this issue? I'm playing around with a custom CACHEOPS_PREFIX method to create a prefix specific to the user logged (i.e. "u123") as that makes the most sense for the way my app is structured. Is it as simple as doing that to ensure that the cluster hashes (and therefore distributes to the correct node in the cluster) or do i need to do something in addition to this? Such as creating a custom CACHEOPS_CLIENT_CLASS class? Has anyone actually solved this before? I've seen a bunch of patches here and there but they seem to change a lot and I'd like to avoid that. Has anyone achieved this with a custom CACHEOPS_CLIENT_CLASS and CACHEOPS_PREFIX combo? If so, would you mind sharing your code so that others can learn what worked? |
This would be also a very nice thing that I've been seriously looking at.
I think some options would be to integrate a smarter client wrapper instead of redis-py like:
https://github.com/gmr/mredis
https://github.com/salimane/rediscluster-py
Or just implement it directly, with the minimum required feature set for cacheops. I quite like how rediscluster works.
There are some obstacles however.
Any thoughts on this matter would be appreciated.
The text was updated successfully, but these errors were encountered: