As an R3Search document index grows, so does the time spent responding to queries. One solution to this problem is to divide the index into smaller shards and host each shard on a separate R3Search node, forming a cluster. Because each node searches a smaller index, query response times improve. Adding documents to the collection becomes slightly slower, however, because the cluster must first determine which shard stores each new document.
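The shard-lookup step above is commonly done by hashing the document ID. A minimal sketch, assuming hash-based routing (R3Search's actual routing scheme is not specified here, and the function and parameter names are illustrative):

```python
import hashlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard by hashing it modulo the shard count.

    Hypothetical sketch: every node computes the same shard for the
    same document ID, so any node can route an incoming document.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# The mapping is deterministic and always lands within the shard range.
print(shard_for("doc-42", 3))
```

Because the mapping is deterministic, no central lookup table is needed; the cost of adding a document is just this extra hash computation plus a hop to the owning shard.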
Another way to improve performance and availability is replication, or mirroring of a node. If a cluster is configured with more nodes than shards, the nodes hosting each shard elect a leader, which copies the index to each replica. Incoming queries are then load balanced across the leader and all replicas, providing a further increase in speed.
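The query load balancing described above can be sketched as a simple round-robin rotation over a shard's leader and replicas. This is a minimal illustration; the node names are placeholders, not real R3Search identifiers:

```python
from itertools import cycle

# One shard hosted on a leader plus two replicas (illustrative names).
shard_nodes = ["leader", "replica-1", "replica-2"]
next_node = cycle(shard_nodes)  # round-robin over all copies of the shard

# Each incoming query goes to the next node in rotation, so the
# read load is spread evenly across the three copies.
targets = [next(next_node) for _ in range(6)]
print(targets)
# ['leader', 'replica-1', 'replica-2', 'leader', 'replica-1', 'replica-2']
```

With three copies of the shard, each node serves roughly a third of the read traffic, which is where the additional speedup comes from.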
Each node in the cluster must be configured with the same number of shards, or the cluster will never reach a ready state.
To manage cluster state during hardware failures and downtime, each cluster requires at least one Zookeeper.
R3Search Cluster Configuration
Sample configuration diagram:
In this example, the index is split across three shards. If a replica node fails, the shard remains intact. If a leader fails, a replica becomes the leader and the shard remains intact. In both cases the shard becomes vulnerable because it is no longer replicated.
Redundancy in a Cluster
Fault tolerance can be achieved by adding Zookeepers, which operate as a quorum: more than half of the Zookeepers must remain active after a fault. This makes an odd number of Zookeepers the ideal configuration, since rounding up to an even number adds no additional fault tolerance.
For example, with three Zookeepers you can lose one and maintain operation, because more than half remain; with four Zookeepers you can still only lose one and keep a majority running. If the required tolerance is one server failure, four Zookeepers are no more tolerant than three. If tolerance of two server failures is required, five Zookeepers are ideal and a sixth adds no benefit.
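The majority rule above reduces to a one-line formula: an ensemble of n Zookeepers tolerates the loss of floor((n - 1) / 2) servers. A small worked example:

```python
def fault_tolerance(n_zookeepers: int) -> int:
    """Number of servers that can fail while a strict majority survives.

    More than half of n must stay up, so at most (n - 1) // 2 may fail.
    """
    return (n_zookeepers - 1) // 2

for n in range(3, 7):
    print(n, "->", fault_tolerance(n))
# 3 -> 1, 4 -> 1, 5 -> 2, 6 -> 2
```

The output shows why even ensemble sizes are wasteful: moving from three to four, or five to six, buys no extra tolerance, which matches the three-versus-four comparison above.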
Redundant R3Search shards create fault tolerance: in a leader/replica configuration, where each shard's data is duplicated across several nodes, all but one node hosting a given shard could fail and the index would still be available.
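The availability rule implied here is that the full index remains queryable as long as every shard retains at least one live copy. A minimal sketch, assuming a hypothetical cluster layout (the shard and node names are illustrative, not a real R3Search API):

```python
# Three shards, each with a leader and one replica (illustrative layout).
cluster = {
    "shard-1": ["leader", "replica"],
    "shard-2": ["leader", "replica"],
    "shard-3": ["leader", "replica"],
}

def index_available(live_copies: dict[str, list[str]]) -> bool:
    """True if every shard still has at least one surviving copy."""
    return all(len(copies) >= 1 for copies in live_copies.values())

# Lose every copy but one per shard: the index is still served.
degraded = {shard: copies[:1] for shard, copies in cluster.items()}
print(index_available(degraded))                     # True

# Lose all copies of one shard: the index is no longer complete.
print(index_available({**cluster, "shard-2": []}))   # False
```

Note the asymmetry with the Zookeeper rule: replicas only need one survivor per shard, while Zookeepers need a strict majority of the ensemble.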