Add some mild thread documentation
since reading the code is probably incredibly confusing now.
dormando committed Sep 3, 2012
1 parent 8963836 commit fa24ccf
Showing 1 changed file with 46 additions and 0 deletions.
46 changes: 46 additions & 0 deletions doc/threads.txt
@@ -0,0 +1,46 @@
WARNING: This document is currently a stub. It is incomplete, but is provided
to give a rough overview of how threads are implemented.

Multithreading in memcached *was* originally simple:

- One listener thread
- N "event worker" threads
- Some misc background threads

Each worker thread is assigned connections and runs its own epoll loop. The
central hash table, LRU lists, and some statistics counters are covered by
global locks. Protocol parsing and data transfer happen in the worker threads.
Data lookups and modifications happen under central locks.

THIS HAS CHANGED!

I do need to flesh this out more, and it'll need a lot more tuning, but it has
changed in the following ways:

- A secondary small hash table of locks is used to lock an item by its hash
value. This prevents multiple threads from acting on the same item at the
same time.
- This secondary hash table is mapped to the central hash table's buckets.
This allows multiple threads to access the hash table in parallel. Only one
thread at a time may read or write against a particular hash table bucket.
- Atomic refcounts per item are used to manage garbage collection and
mutability.
- A central lock is still held around any "item modifications": any change to
any item's flags, to the LRU state, or any refcount increment is still
centrally locked.

- When pulling an item off of the LRU tail for eviction or re-allocation, the
system must attempt to lock the item's bucket, which is done with a trylock
to avoid deadlocks. If the bucket is in use (and not by that thread), the
system walks up the LRU a little in an attempt to fetch a non-busy item.
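
As a rough illustration of the lock table and the trylock walk described in
the list above, here is a minimal sketch in C. It is not memcached's code:
sketch_item, item_locks, lock_for, find_evictable, ITEM_LOCK_POWER and
MAX_TAIL_TRIES are all hypothetical names and values, and masking the low bits
of the hash value is only an assumed way of mapping items to locks. The
"already locked by this thread" case and the central lock around LRU changes
are left out.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_LOCK_POWER 4                        /* hypothetical: 2^4 = 16 locks */
#define ITEM_LOCK_COUNT (1u << ITEM_LOCK_POWER)
#define MAX_TAIL_TRIES  5                        /* hypothetical limit on the LRU walk */

/* Hypothetical, stripped-down item: just what the sketch needs. */
typedef struct sketch_item {
    struct sketch_item *prev;   /* next item toward the LRU head */
    uint32_t hv;                /* precomputed hash value of the item's key */
} sketch_item;

/* The small secondary table of locks. */
static pthread_mutex_t item_locks[ITEM_LOCK_COUNT];

static void item_locks_init(void) {
    for (unsigned i = 0; i < ITEM_LOCK_COUNT; i++)
        pthread_mutex_init(&item_locks[i], NULL);
}

/* Assumption: pick a lock from the low bits of the hash value, so every item
 * whose hash shares those bits is guarded by the same lock. */
static pthread_mutex_t *lock_for(uint32_t hv) {
    return &item_locks[hv & (ITEM_LOCK_COUNT - 1)];
}

/* Start at the LRU tail and walk toward the head for a few steps. Return the
 * first item whose lock could be taken without blocking, or NULL if every
 * candidate was busy. On success the caller owns lock_for(it->hv) and must
 * unlock it when done. */
static sketch_item *find_evictable(sketch_item *tail) {
    sketch_item *it = tail;
    for (int tries = 0; it != NULL && tries < MAX_TAIL_TRIES;
         tries++, it = it->prev) {
        if (pthread_mutex_trylock(lock_for(it->hv)) == 0)
            return it;          /* not busy: lock acquired */
        /* busy: skip this item rather than block, which avoids deadlock */
    }
    return NULL;
}
```

Using trylock instead of a blocking lock means the evicting thread never waits
on a lock while it may already hold another one; it gives up on a busy item
and moves on.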

Since I'm sick of hearing it:

- If you remove the per-thread stats lock, CPU usage goes down by less than a
point of a percent, and it does not improve scalability.
- In my testing, the remaining global STATS_LOCK calls never seem to collide.

Yes, more stats can be moved to threads, and those locks can actually be
removed entirely on x86-64 systems. However, my tests haven't shown that to be
beneficial so far, so I've prioritized other work. Apologies for the rant, but
it's a common question.
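
For readers unfamiliar with the pattern, this is roughly what a per-thread
stats lock looks like. It is a sketch under assumptions, not memcached's
actual code: sketch_thread_stats, worker_stats, record_get, aggregate_stats
and NUM_WORKERS are hypothetical names. Each worker only ever takes its own
mutex, so the lock is contended only while stats are being aggregated.

```c
#include <pthread.h>
#include <stdint.h>

#define NUM_WORKERS 4   /* hypothetical worker thread count */

/* Hypothetical per-thread counters, each guarded by its own mutex. */
typedef struct {
    pthread_mutex_t mutex;
    uint64_t get_cmds;
    uint64_t get_hits;
} sketch_thread_stats;

static sketch_thread_stats worker_stats[NUM_WORKERS];

static void stats_init(void) {
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&worker_stats[i].mutex, NULL);
        worker_stats[i].get_cmds = 0;
        worker_stats[i].get_hits = 0;
    }
}

/* Called by worker number `worker` on its own counters only. */
static void record_get(int worker, int hit) {
    sketch_thread_stats *s = &worker_stats[worker];
    pthread_mutex_lock(&s->mutex);
    s->get_cmds++;
    if (hit)
        s->get_hits++;
    pthread_mutex_unlock(&s->mutex);
}

/* Called by whichever thread reports stats: sum every worker's counters,
 * taking each worker's lock briefly in turn. */
static void aggregate_stats(uint64_t *get_cmds, uint64_t *get_hits) {
    *get_cmds = 0;
    *get_hits = 0;
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_lock(&worker_stats[i].mutex);
        *get_cmds += worker_stats[i].get_cmds;
        *get_hits += worker_stats[i].get_hits;
        pthread_mutex_unlock(&worker_stats[i].mutex);
    }
}
```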
