Plurk Open Source - open source projects by Plurk Inc.

The hash ring

LightCloud uses consistent hashing to create a hash ring. Consistent hashing is used by most key-value stores to make it possible to add and remove nodes without invalidating too many keys.

Hash ring

How consistent hashing works

Consistent hashing is fairly simple (and smart way of distributing keys). It can be best explained by the idea that you have a ring that goes from 0 to some big number. Given a node A, you find a placement for A on the ring by running hash_function(A), the hash_function should generally mix the values well - good candidates for the hash function are MD5 or SHA1. Given a (key, value) pair you find the key's placement on the ring by running hash_function(key). A node holds all the keys that have a value lower than itself, but greater than the preceding node.

Tom White has written a great blog post about consistent hashing, take a look at it, it explains the idea in much greater detail.

Amir (the developer of LightCloud) has implemented a reference implementation of a hash ring (~50 lines so it should be easy to copy other langauges): Consistent hashing implemented simply in Python.

The specialties of LightCloud's hash ring

One of the big differences between LightCloud and other key-value databases is that it does not copy keys around to other nodes. Amazon's Dynamo (that's used to power Amazon S3) uses following replication approach:

Amazon Dynamo

A problem with this approach is that data is "eventually" consistent and one needs to build complex structures to route efficiently (stuff like routing tables, membership management etc.). We wanted to simply this and looked at ways to reuse TokyoTyrant's master-master replication. The solution we chose was to replicate nodes via master-master replication (instead of replicating keys on the ring):

LightCloud storing of key A

A big problem with this approach is that one will do a lot of lookups if new nodes are added to the system. I.e. routing becomes an issue. To solve this without routing tables we introduced two hash rings - one for lookups and another one for storage nodes:

LightCloud lookup and storage ring

Nodes of LightCloud

Each LightCloud node is a TokyoTyrant server. These nodes are managed by LightCloud manager and a sample node is run in following way:

command = "ttserver -host %(host)s -port %(port)s "\
              "%(master_cfg)s "\
              "-pid %(data)s/%(name)s.pid "\
              "-dmn "\
              "-le -ulog %(data)s/logs/%(id)s/ -sid %(id)s "\
              "-rts %(data)s/data/%(id)s.rts "\
              "-ulim 128m "\
              "-ext %(cur_pwd)s/extensions/our.lua "\
              "-thnum 6 -tout 5 "\
              "%(data)s/data/%(id)s.tch%(opts)s"

Communication with TokyoTyrant

LightCloud can communicate with TokyoTyrant in following ways:

  • Using Tyrant's binary protocol
  • Using memcached's protocol

LightCloud uses pytyrant as default (since it's binary and supports more features such as calling Lua extensions). But memcached can be used as well.

Powered by Skeletonz