forked from vitessio/vitess

vitess provides servers and tools which facilitate scaling of MySQL databases for large scale web services.

thorwang/vitess

Vitess

Vitess is a set of servers and tools meant to facilitate scaling of MySQL databases for the web. It's currently used as a fundamental component of YouTube's MySQL infrastructure.

Features

This is an incomplete list of Vitess features.

Clients

  • A Python DBAPI 2.0 compliant client interface.
  • A Go client interface.
  • Simple BSON-based protocol with SASL authentication.

vttablet

Smart middleware that sits in front of MySQL and serves client requests.

  • Connection pooling.
  • SQL parser: Although very close, the vtocc SQL parser is not SQL-92 compliant. It has left out constructs that are deemed uncommon or OLTP-unfriendly. It should, however, allow most queries used by a well-behaved web application.
  • Query rewrite and sanitization (adding limits, avoiding non-deterministic updates).
  • Query consolidation: reuse the results of an in-flight query to any subsequent requests that were received while the query was still executing.
  • Row cache: the MySQL buffer cache is optimized for range scans over indices and tables, particularly when data is densely packed. Unfortunately, it performs poorly on tables that are accessed randomly. The rowcache instead maintains row-based caches (using memcached as its backend) and keeps them consistent by fielding all DMLs that could potentially affect them.
  • DML annotation: Every DML is rewritten to include a comment field at the end of a query identifying the primary key of the rows it changed.
  • Integrated query killer for queries that take too long to return data.
  • Discard idle backend connections to avoid offline db errors.
  • Transaction management: Ability to limit the number of concurrent transactions and manage deadlines.
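
The query consolidation feature listed above can be illustrated with a small sketch. This is not the vttablet implementation (which is written in Go); it is a minimal Python illustration of the idea, assuming a hypothetical `execute` callable that runs a SQL string against the backend. The names `Consolidator` and `_Pending` are made up for this example.

```python
import threading


class _Pending:
    """State shared between the request that runs a query and its waiters."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class Consolidator:
    """Reuse the result of an in-flight query for identical concurrent requests."""

    def __init__(self, execute):
        self._execute = execute   # hypothetical backend hook: callable(sql) -> result
        self._lock = threading.Lock()
        self._in_flight = {}      # SQL text -> _Pending

    def query(self, sql):
        with self._lock:
            pending = self._in_flight.get(sql)
            leader = pending is None
            if leader:
                # First request for this SQL text: register it as in flight.
                pending = _Pending()
                self._in_flight[sql] = pending
        if leader:
            try:
                pending.result = self._execute(sql)
            except Exception as exc:
                pending.error = exc
            finally:
                # Unregister first so later requests trigger a fresh execution.
                with self._lock:
                    del self._in_flight[sql]
                pending.done.set()
        else:
            # Identical query already running: wait for its result instead.
            pending.done.wait()
        if pending.error is not None:
            raise pending.error
        return pending.result
```

Only requests that arrive while the first one is still executing share its result; once it completes, the next identical query goes to the backend again, so this is not a cache.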

Management

  • Cluster configuration is stored using pluggable lock service backends, with ZooKeeper enabled by default.
  • vtctl: command line tool that performs most management operations, such as initializing a server or changing its type.
  • Separate replication and serving graph management.
  • vtctld: HTTP daemon giving an overview of the configuration (there's also a JSON API available).
  • Fast database snapshotting and reloading (much faster than mysqldump and LOAD DATA INFILE).

Sharding

  • Keyspaces: All tables in a sharded database need to contain a “key” column. Vitess will use these values to decide the target shard for such data. All tables that are indexed by a set of keys are known as a keyspace, which basically represents the logical database that combines all the shards that store them.
  • Range based sharding: The main advantage of this scheme is that the shard map is a simple in-memory lookup. The downside of this scheme is that it creates hot-spots for sequentially increasing keys. In such cases, we recommend that the application hash the keys so they distribute more randomly.
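
As a rough illustration of the two points above, the sketch below models a range-based shard map as a sorted list of range starts and shows how hashing spreads sequential keys. The shard boundaries, shard names, the one-byte key space, and the use of MD5 are all assumptions made for this example and are not taken from Vitess.

```python
import bisect
import hashlib

# Hypothetical shard map: shard i owns keys from SHARD_STARTS[i] up to
# (but not including) the next start; the last shard runs to 0xFF.
SHARD_STARTS = [0x00, 0x40, 0x80, 0xC0]
SHARD_NAMES = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key):
    """Return the shard whose range contains key (an integer in 0..255)."""
    index = bisect.bisect_right(SHARD_STARTS, key) - 1
    return SHARD_NAMES[index]


def hashed_key(row_id):
    """Map a sequential id to a pseudo-random key in 0..255."""
    digest = hashlib.md5(str(row_id).encode("ascii")).digest()
    return digest[0]


# Sequential keys all land in one shard (a hot spot) ...
print({shard_for(k) for k in range(10)})
# ... while hashed keys spread across the shard map.
print(sorted({shard_for(hashed_key(i)) for i in range(1000)}))
```

The lookup is a binary search over an in-memory list, which is what makes the shard map cheap to consult. The trade-off of hashing is that range scans over the original key order no longer map to a single shard.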

Replication

  • Split replication: replicate only part of the replication stream, based on the provided criteria (useful for vertical or horizontal resharding).
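
To make the idea concrete, here is a minimal sketch of filtering a replication stream by key range, as a horizontal reshard might do. The `Event` shape and the `split_stream` helper are hypothetical and only illustrate the filtering criterion; they do not reflect how Vitess represents or transports replication events.

```python
from collections import namedtuple

# Hypothetical event shape: the sharding key of the affected row plus the
# statement to replay on the destination.
Event = namedtuple("Event", ["key", "statement"])


def split_stream(events, start, end):
    """Yield only the events whose key falls in the half-open range [start, end)."""
    for event in events:
        if start <= event.key < end:
            yield event


stream = [
    Event(10, "insert into t values (10)"),
    Event(150, "insert into t values (150)"),
    Event(99, "update t set v = 1 where id = 99"),
]
# A destination shard that owns keys 0..99 replays only its own events.
for event in split_stream(stream, 0, 100):
    print(event.statement)
```

Events are passed through in their original order, which matters because statements for the same row must be replayed in sequence.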

Tools

  • zkocc: ZooKeeper connection pooler and cache.
  • zkctl: manage ZooKeeper instances.
  • mysqlctl: manage MySQL instances.
  • zk: command line ZooKeeper client and explorer.

Dependencies

Development

Install Go.

cd $WORKSPACE
sudo apt-get install automake libtool flex bison memcached python-dev python-mysqldb libssl-dev g++ mercurial git
go get code.google.com/p/opts-go
git clone git@github.com:youtube/vitess.git src/github.com/youtube/vitess
cd src/github.com/youtube/vitess
./bootstrap.sh
. ./dev.env

Optionally:

VTDATAROOT=... #  $VTROOT/vt if not set
VTPORTSTART=15000

To run the tests:

make  # run the tests

License

Unless otherwise noted, the vitess source files are distributed under the BSD-style license found in the LICENSE file.
