Project | Microcluster/Microcloud |
---|---|
Status | Drafting |
Author(s) | @whershberger |
Approver(s) | @tomp @maria-seralessandri @masnax |
Release | LTS |
Internal ID | LX080 |
Abstract
This specification describes a mechanism to perform cluster recovery in the
event of dqlite quorum loss for Microcluster and Microcluster-based projects:
- A recovery API to be provided by microcluster
- The recovery CLI to be used in Microcloud
Rationale
The dqlite database provided by Microcluster ensures fault-tolerance and high-availability using Raft. The protocol requires a quorum of cluster members (a majority of voters) in order to read/write to the database. In the event of a catastrophic failure of a majority of cluster voters, the database on any remaining members will become unavailable. A mechanism is needed in order to recover database functionality on any remaining cluster members.
This mechanism also allows for IP address changes for any/all nodes in the cluster.
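The quorum requirement described above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of Microcluster; the function names `quorumSize` and `hasQuorum` are hypothetical and only illustrate when a cluster can no longer serve the database.

```go
package main

import "fmt"

// quorumSize returns the minimum number of voters that must be reachable
// for a Raft cluster with the given number of voters to make progress
// (a strict majority).
func quorumSize(voters int) int {
	return voters/2 + 1
}

// hasQuorum reports whether the surviving voters still form a majority.
func hasQuorum(totalVoters, survivingVoters int) bool {
	return survivingVoters >= quorumSize(totalVoters)
}

func main() {
	// A three-voter cluster needs two voters; with only one left the
	// database becomes unavailable and recovery is required.
	fmt.Println(quorumSize(3), hasQuorum(3, 1), hasQuorum(3, 2))
}
```

With three voters, losing one member is tolerated, while losing two is the situation this specification addresses.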
Constraints
Per canonical/lxd#13524, the cluster recovery process must be performed on exactly one member, after which the dqlite database directory must be copied from the recovered member to all other surviving members. This is because dqlite’s recovery process forces consensus by circumventing the usual Raft mechanism; the DB copy is necessary to ensure consistency between the remaining cluster members.
dqlite also requires that all databases are shut down during recovery. While it may be possible to ensure this with microcluster still running, it would require exposing additional API endpoints over the network to coordinate/control the database state.
Microcluster Specification
The following public structures/methods will be created:
type LocalMember struct {
	DqliteID uint64
	Address  string
	Role     string
	Name     string
}
// microcluster/app.go
func (m *MicroCluster) GetLocalClusterMembers() ([]LocalMember, error)
// RecoverFromQuorumLoss can be used to recover database access when a quorum of
// members is lost and cannot be recovered (e.g. hardware failure).
// This function requires that:
// - All cluster members' databases are not running
// - The current member has the most up-to-date raft log (usually the member
// which was most recently the leader)
//
// RecoverFromQuorumLoss will take a database backup before attempting the
// recovery operation.
//
// RecoverFromQuorumLoss should be invoked _exactly once_ for the entire cluster.
// This function creates a gz-compressed tarball
// path.Join(m.FileSystem.StateDir, "recovery_db.tar.gz"). This tarball should
// be manually copied by the user to the state dir of all other cluster members.
//
// On start, Microcluster will automatically check for & load the recovery
// tarball. A database backup will be taken before the load.
func (m *MicroCluster) RecoverFromQuorumLoss(members []LocalMember) error
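A caller might combine the two proposed methods roughly as follows. This is a sketch under stated assumptions: `recoverer`, `recoverWithout`, and `fakeCluster` are hypothetical names introduced so the example runs without Microcluster, and the "at least one voter remains" guard is an assumption rather than documented behaviour. Only `LocalMember`, `GetLocalClusterMembers`, and `RecoverFromQuorumLoss` come from this specification.

```go
package main

import (
	"errors"
	"fmt"
)

// LocalMember mirrors the struct proposed in this specification.
type LocalMember struct {
	DqliteID uint64
	Address  string
	Role     string
	Name     string
}

// recoverer is a hypothetical interface covering the two proposed
// *MicroCluster methods, so the sketch can run without microcluster.
type recoverer interface {
	GetLocalClusterMembers() ([]LocalMember, error)
	RecoverFromQuorumLoss(members []LocalMember) error
}

// recoverWithout marks every member named in lost as "spare" and passes the
// updated list to RecoverFromQuorumLoss. As an assumed safety check, it
// refuses to proceed if no voter would remain.
func recoverWithout(r recoverer, lost map[string]bool) ([]LocalMember, error) {
	members, err := r.GetLocalClusterMembers()
	if err != nil {
		return nil, err
	}

	voters := 0
	for i := range members {
		if lost[members[i].Name] {
			members[i].Role = "spare"
		}
		if members[i].Role == "voter" {
			voters++
		}
	}
	if voters == 0 {
		return nil, errors.New("no voters would remain after recovery")
	}

	if err := r.RecoverFromQuorumLoss(members); err != nil {
		return nil, err
	}
	return members, nil
}

// fakeCluster is a stand-in used only to exercise the sketch.
type fakeCluster struct {
	members   []LocalMember
	recovered []LocalMember
}

func (f *fakeCluster) GetLocalClusterMembers() ([]LocalMember, error) {
	return append([]LocalMember(nil), f.members...), nil
}

func (f *fakeCluster) RecoverFromQuorumLoss(members []LocalMember) error {
	f.recovered = members
	return nil
}

func main() {
	c := &fakeCluster{members: []LocalMember{
		{DqliteID: 1, Address: "192.0.2.101:9443", Role: "voter", Name: "c1"},
		{DqliteID: 2, Address: "192.0.2.102:9443", Role: "voter", Name: "c2"},
		{DqliteID: 3, Address: "192.0.2.103:9443", Role: "voter", Name: "c3"},
	}}

	// c2 and c3 are assumed to be permanently lost.
	members, err := recoverWithout(c, map[string]bool{"c2": true, "c3": true})
	if err != nil {
		fmt.Println("recovery failed:", err)
		return
	}
	for _, m := range members {
		fmt.Println(m.Name, m.Role)
	}
}
```

In a real caller, all databases would need to be stopped first, and the resulting tarball would still have to be copied to the other surviving members by hand, as the function comment above describes.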
Upgrade Handling
Microcloud Specification
API changes
None
CLI changes
The following commands will be added to microcloud:
microcloud cluster edit
$ microcloud cluster edit
You should only run this command if:
- A quorum of cluster members is permanently lost
- You are *absolutely* sure all microcloud instances are stopped (sudo snap stop microcloud)
- This instance has the most up to date database
Do you want to proceed? (yes/no):
Cluster info is opened in the editor:
# Member roles and addresses can be modified. Unrecoverable nodes should be
# given the role `spare`.
#
# `voter` - Voting member of the database. A majority of voters is a quorum.
# `stand-by` - Non-voting member of the database; can be promoted to voter.
# `spare` - Not a member of the database.
#
# The edit is aborted if:
# - the number of members changes
# - the name of any member changes
# - the id of any member changes
# - the address of any member changes
# - no changes are made
members:
  - name: c1
    role: voter
    address: 192.0.2.101:9443
    id: 5908841199928984794
  - name: c2
    role: voter
    address: 192.0.2.102:9443
    id: 14814744052722818096
  - name: c3
    role: voter
    address: 192.0.2.103:9443
    id: 7765948834043589852
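The abort conditions listed in the editor header could be checked along these lines. This is a minimal sketch, assuming members are compared by position (the edited list keeps the original order); the `member` type and `validateEdit` function are hypothetical and not part of Microcloud. The address check follows the listed rule as written and is the one to relax if address edits become supported.

```go
package main

import (
	"errors"
	"fmt"
)

// member is a hypothetical in-memory form of one entry in the edited YAML.
type member struct {
	Name    string
	Role    string
	Address string
	ID      uint64
}

// validateEdit returns an error if the edit must be aborted according to the
// rules in the editor header. Members are compared positionally (assumption).
func validateEdit(before, after []member) error {
	if len(before) != len(after) {
		return errors.New("the number of members changed")
	}

	changed := false
	for i := range before {
		b, a := before[i], after[i]
		switch {
		case b.Name != a.Name:
			return fmt.Errorf("name of member %q changed to %q", b.Name, a.Name)
		case b.ID != a.ID:
			return fmt.Errorf("id of member %q changed", b.Name)
		case b.Address != a.Address:
			return fmt.Errorf("address of member %q changed", b.Name)
		}
		if b.Role != a.Role {
			changed = true
		}
	}

	if !changed {
		return errors.New("no changes were made")
	}
	return nil
}

func main() {
	before := []member{
		{Name: "c1", Role: "voter", Address: "192.0.2.101:9443", ID: 5908841199928984794},
		{Name: "c2", Role: "voter", Address: "192.0.2.102:9443", ID: 14814744052722818096},
	}

	// Copy the list and mark c2 as unrecoverable.
	after := append([]member(nil), before...)
	after[1].Role = "spare"

	fmt.Println(validateEdit(before, before)) // rejected: nothing changed
	fmt.Println(validateEdit(before, after))  // accepted: only a role changed
}
```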
If no changes are made:
Cluster edit aborted; no changes made
If changes were made:
Cluster changes applied; new database state saved to /var/snap/microcloud/common/recovery_db.tar.gz
*Before* starting any cluster member, copy /var/snap/microcloud/common/recovery_db.tar.gz to /var/snap/microcloud/common/recovery_db.tar.gz on all remaining cluster members.
Microcloud will load this file during startup.
Database changes
None
Upgrade handling
None
Future work
- It would be ideal to expose a CLI interface to easily determine the latest dqlite segment (as is done in LXD). “This instance has the most up to date database” and “most recently the leader” are a bit hand-wavy for my taste. Such a change should be trivial.
- Implement the ability to change member addresses during recovery. This requires:
a. An update of the trust store (partially implemented)
b. An update of daemon.yaml
c. An update of the internal_cluster_members table after the database is accessible (a la patches in LXD?)
d. An update of dqlite yaml files info.yaml and cluster.yaml?