posix-stack: Introduce new, port based, load balancing algorithm for connections

The new algorithm distributes connections according to the peer's port.
The designated shard is the peer's port number modulo the shard count
(smp::count). This allows a client to connect to a specific shard by
choosing its local port carefully.
Gleb Natapov committed Sep 12, 2018
1 parent 426f979 commit f51eea8
Showing 5 changed files with 72 additions and 9 deletions.
42 changes: 42 additions & 0 deletions doc/network-connection-load-balancing.md
@@ -0,0 +1,42 @@
# Motivation

In sharded systems like Seastar, work must be distributed evenly across
all shards to get maximum performance from the system. The networking
subsystem plays its part in this. For instance, if all connections on a
server are served by a single shard, the system runs at the speed of that
one shard while all other shards are underutilized.

# Common ways to distribute work received over network between shards

Two common ways to distribute work between shards are:
- do the work on the shard that received it
- choose the shard that does the work based on the data being processed
(one way to do this is shard = hash(data) % smp_count)
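
The second approach can be sketched as follows. This is a minimal illustration, not Seastar code: `shard_for_key` is a hypothetical helper, `std::hash` stands in for whatever hash function an application actually uses, and `smp_count` stands in for the number of shards.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper (not part of Seastar): map a piece of data to the
// shard that should process it, using shard = hash(data) % smp_count.
inline unsigned shard_for_key(const std::string& key, unsigned smp_count) {
    assert(smp_count > 0);  // a modulo by zero would be undefined
    return static_cast<unsigned>(std::hash<std::string>{}(key) % smp_count);
}
```

The same key always maps to the same shard within one process, so whichever shard receives a request can forward it to the owner. Note that `std::hash` is not guaranteed to produce the same values across different builds or standard libraries, so a real system that shares this mapping between machines would likely pick a fixed hash function instead.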

# Load Balancing

These two approaches call for different strategies for distributing
connections between shards. The first works best if each CPU has the
same number of connections (assuming each connection generates the same
amount of work). The second works best if data arrives at the shard where
it is going to be processed; the actual connection distribution does
not matter.

Seastar's posix stack supports both strategies. The desired one is
chosen by specifying a load balancing algorithm in the listen_options
passed to the reactor::listen() call. The available options are:

- load_balancing_algorithm::connection_distribution

Places each new connection on the shard with the smallest
number of connections of the same type.

- load_balancing_algorithm::port

The destination shard is chosen as a function of the client's local port:
shard = port_number % num_shards. This lets a client ensure that a
connection is processed by a specific shard by choosing its local port
accordingly (the client needs to know the number of shards in the server,
which can be negotiated over a different channel).
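
To illustrate the port rule, here is a small sketch of both sides of the mapping. `shard_for_port` mirrors the formula above; `pick_local_port` is a hypothetical client-side helper, and the default port range 49152 to 65535 is an assumption (the IANA dynamic range), not something this commit prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Server side: shard = port_number % num_shards.
inline unsigned shard_for_port(std::uint16_t peer_port, unsigned num_shards) {
    assert(num_shards > 0);
    return peer_port % num_shards;
}

// Client side (hypothetical helper): return the first local port in
// [first, last] that the server would map to target_shard, if any.
inline std::optional<std::uint16_t> pick_local_port(unsigned target_shard,
                                                    unsigned num_shards,
                                                    std::uint16_t first = 49152,
                                                    std::uint16_t last = 65535) {
    assert(num_shards > 0 && target_shard < num_shards);
    // Use a wider counter so the loop terminates when last == 65535.
    for (std::uint32_t p = first; p <= last; ++p) {
        if (p % num_shards == target_shard) {
            return static_cast<std::uint16_t>(p);
        }
    }
    return std::nullopt;
}
```

With 8 shards, `pick_local_port(3, 8)` yields 49155, and `shard_for_port(49155, 8)` is 3. A client would bind its socket to the chosen port before connecting; if that port is already in use, the next candidate is `port + num_shards`. The mapping only holds if nothing between client and server rewrites the source port, which is typically not the case behind NAT.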
17 changes: 17 additions & 0 deletions net/api.hh
@@ -243,6 +243,17 @@ class server_socket {
std::unique_ptr<net::server_socket_impl> _ssi;
bool _aborted = false;
public:
+ enum class load_balancing_algorithm {
+ // This algorithm tries to distribute all connections equally between all shards.
+ // It does this by sending new connections to a shard with smallest amount of connections.
+ connection_distribution,
+ // This algorithm distributes new connection based on peer's tcp port. Destination shard
+ // is calculated as a port number modulo number of shards. This allows a client to connect
+ // to a specific shard in a server given it knows how many shards server has by choosing
+ // src port number accordingly.
+ port,
+ default_ = connection_distribution
+ };
/// Constructs a \c server_socket not corresponding to a connection
server_socket();
/// \cond internal
@@ -271,6 +282,12 @@ public:
};
/// @}

+ struct listen_options {
+ bool reuse_address = false;
+ server_socket::load_balancing_algorithm lba = server_socket::load_balancing_algorithm::default_;
+ transport proto = transport::TCP;
+ };

class network_stack {
public:
virtual ~network_stack() {}
7 changes: 4 additions & 3 deletions net/posix-stack.cc
@@ -187,7 +187,8 @@ template <transport Transport>
future<connected_socket, socket_address>
posix_server_socket_impl<Transport>::accept() {
return _lfd.accept().then([this] (pollable_fd fd, socket_address sa) {
- auto cth = _conntrack.get_handle();
+ auto cth = _lba == server_socket::load_balancing_algorithm::connection_distribution ?
+ _conntrack.get_handle() : _conntrack.get_handle(ntoh(sa.as_posix_sockaddr_in().sin_port) % smp::count);
auto cpu = cth.cpu();
if (cpu == engine().cpu_id()) {
std::unique_ptr<connected_socket_impl> csi(
@@ -334,12 +335,12 @@ posix_network_stack::listen(socket_address sa, listen_options opt) {
return _reuseport ?
server_socket(std::make_unique<posix_reuseport_server_tcp_socket_impl>(sa, engine().posix_listen(sa, opt)))
:
- server_socket(std::make_unique<posix_server_tcp_socket_impl>(sa, engine().posix_listen(sa, opt)));
+ server_socket(std::make_unique<posix_server_tcp_socket_impl>(sa, engine().posix_listen(sa, opt), opt.lba));
} else {
return _reuseport ?
server_socket(std::make_unique<posix_reuseport_server_sctp_socket_impl>(sa, engine().posix_listen(sa, opt)))
:
- server_socket(std::make_unique<posix_server_sctp_socket_impl>(sa, engine().posix_listen(sa, opt)));
+ server_socket(std::make_unique<posix_server_sctp_socket_impl>(sa, engine().posix_listen(sa, opt), opt.lba));
}
}

10 changes: 9 additions & 1 deletion net/posix-stack.hh
@@ -54,6 +54,10 @@ class conntrack {
_cpu_load[cpu]++;
return cpu;
}
+ shard_id force_cpu(shard_id cpu) {
+ _cpu_load[cpu]++;
+ return cpu;
+ }
};

lw_shared_ptr<load_balancer> _lb;
@@ -92,6 +96,9 @@ public:
handle get_handle() {
return handle(_lb->next_cpu(), _lb);
}
+ handle get_handle(shard_id cpu) {
+ return handle(_lb->force_cpu(cpu), _lb);
+ }
};

class posix_data_source_impl final : public data_source_impl {
@@ -139,8 +146,9 @@ class posix_server_socket_impl : public server_socket_impl {
socket_address _sa;
pollable_fd _lfd;
conntrack _conntrack;
+ server_socket::load_balancing_algorithm _lba;
public:
- explicit posix_server_socket_impl(socket_address sa, pollable_fd lfd) : _sa(sa), _lfd(std::move(lfd)) {}
+ explicit posix_server_socket_impl(socket_address sa, pollable_fd lfd, server_socket::load_balancing_algorithm lba) : _sa(sa), _lfd(std::move(lfd)), _lba(lba) {}
virtual future<connected_socket, socket_address> accept();
virtual void abort_accept() override;
};
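
The bookkeeping behind `next_cpu()` and `force_cpu()` can be sketched in isolation. This is an illustrative stand-in, not the Seastar `conntrack` class: the class name, the use of `std::min_element` for the least-loaded choice, and the `closed_cpu()` release step are assumptions, since the diff shows only the tail of `next_cpu()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-in for per-listener connection counters.
class load_counter_sketch {
    std::vector<unsigned> _cpu_load;
public:
    explicit load_counter_sketch(std::size_t shards) : _cpu_load(shards, 0) {
        assert(shards > 0);
    }

    // connection_distribution: choose the shard with the fewest connections
    // (the first such shard on ties) and count the new connection there.
    std::size_t next_cpu() {
        auto it = std::min_element(_cpu_load.begin(), _cpu_load.end());
        ++*it;
        return static_cast<std::size_t>(it - _cpu_load.begin());
    }

    // port: the caller already chose the shard; only record the load.
    std::size_t force_cpu(std::size_t cpu) {
        assert(cpu < _cpu_load.size());
        ++_cpu_load[cpu];
        return cpu;
    }

    // Assumed release step: called when a connection on `cpu` closes.
    void closed_cpu(std::size_t cpu) {
        assert(cpu < _cpu_load.size() && _cpu_load[cpu] > 0);
        --_cpu_load[cpu];
    }

    unsigned load(std::size_t cpu) const { return _cpu_load.at(cpu); }
};
```

Both paths increment the same counter, so the per-shard load stays accurate regardless of which algorithm placed a connection. The port path simply skips the search and trusts the shard computed from the peer's port.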
5 changes: 0 additions & 5 deletions net/socket_defs.hh
@@ -61,11 +61,6 @@ namespace net {
class inet_address;
}

- struct listen_options {
- bool reuse_address = false;
- transport proto = transport::TCP;
- };

struct ipv4_addr {
uint32_t ip;
uint16_t port;