Update godeps for etcd 3.0.4

Timothy St. Clair
2016-07-22 13:54:40 -05:00
parent 456c43c22d
commit 5f008faa8b
457 changed files with 25492 additions and 10481 deletions

vendor/github.com/coreos/etcd/raft/README.md (new file, 198 lines)

@@ -0,0 +1,198 @@
# Raft library
Raft is a protocol with which a cluster of nodes can maintain a replicated state machine.
The state machine is kept in sync through the use of a replicated log.
For more details on Raft, see "In Search of an Understandable Consensus Algorithm"
(https://ramcloud.stanford.edu/raft.pdf) by Diego Ongaro and John Ousterhout.
A simple example application, _raftexample_, is also available to help illustrate
how to use this package in practice:
https://github.com/coreos/etcd/tree/master/contrib/raftexample
## Notable Users
- [cockroachdb](https://github.com/cockroachdb/cockroach) A Scalable, Survivable, Strongly-Consistent SQL Database
- [etcd](https://github.com/coreos/etcd) A distributed reliable key-value store
- [tikv](https://github.com/pingcap/tikv) Distributed transactional key value database powered by Rust and Raft
- [swarmkit](https://github.com/docker/swarmkit) A toolkit for orchestrating distributed systems at any scale.
## Usage
The primary object in raft is a Node. You either start a Node from scratch
using raft.StartNode or start a Node from some initial state using raft.RestartNode.
To start a node from scratch:
```go
storage := raft.NewMemoryStorage()
c := &raft.Config{
	ID:              0x01,
	ElectionTick:    10,
	HeartbeatTick:   1,
	Storage:         storage,
	MaxSizePerMsg:   4096,
	MaxInflightMsgs: 256,
}
n := raft.StartNode(c, []raft.Peer{{ID: 0x02}, {ID: 0x03}})
```
To restart a node from previous state:
```go
storage := raft.NewMemoryStorage()

// recover the in-memory storage from persistent
// snapshot, state and entries.
storage.ApplySnapshot(snapshot)
storage.SetHardState(state)
storage.Append(entries)

c := &raft.Config{
	ID:              0x01,
	ElectionTick:    10,
	HeartbeatTick:   1,
	Storage:         storage,
	MaxSizePerMsg:   4096,
	MaxInflightMsgs: 256,
}

// restart raft without peer information.
// peer information is already included in the storage.
n := raft.RestartNode(c)
```
Now that you are holding onto a Node you have a few responsibilities:
First, you must read from the Node.Ready() channel and process the updates
it contains. These steps may be performed in parallel, except as noted in step
2.
1. Write HardState, Entries, and Snapshot to persistent storage if they are
not empty. Note that when writing an Entry with Index i, any
previously-persisted entries with Index >= i must be discarded.
2. Send all Messages to the nodes named in the To field. No message may be sent
until the latest HardState has been persisted to disk, together with all Entries
written by any previous Ready batch (Messages may be sent while entries from the
same batch are being persisted). To reduce I/O latency, an optimization can be
applied to let the leader write to disk in parallel with its followers (as
explained in section 10.2.1 of the Raft thesis). If any Message has type MsgSnap,
call Node.ReportSnapshot() after it has been sent (these messages may be large).
Note: marshalling messages is not thread-safe; make sure that no new entries are
persisted while marshalling. The easiest way to achieve this is to serialise the
messages directly inside your main raft loop.
3. Apply Snapshot (if any) and CommittedEntries to the state machine.
If any committed Entry has Type EntryConfChange, call Node.ApplyConfChange()
to apply it to the node. The configuration change may be cancelled at this point
by setting the NodeID field to zero before calling ApplyConfChange
(but ApplyConfChange must be called one way or the other, and the decision to cancel
must be based solely on the state machine and not external information such as
the observed health of the node).
4. Call Node.Advance() to signal readiness for the next batch of updates.
This may be done at any time after step 1, although all updates must be processed
in the order they were returned by Ready.
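The truncation rule in step 1 can be sketched as follows. This is a minimal illustration, not the library's storage code: `entry` is a hypothetical stand-in for `raftpb.Entry`, `appendEntries` is a made-up helper, and both slices are assumed to hold contiguous, ascending indexes.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for raftpb.Entry; only Index and Data
// matter for this sketch.
type entry struct {
	Index uint64
	Data  string
}

// appendEntries writes ents to log. Any stored entry whose Index is greater
// than or equal to the first new Index is discarded first, as step 1 requires.
// Assumption: log and ents each hold contiguous, ascending indexes.
func appendEntries(log, ents []entry) []entry {
	if len(ents) == 0 {
		return log
	}
	first := ents[0].Index
	keep := 0
	for keep < len(log) && log[keep].Index < first {
		keep++
	}
	// Build a fresh slice so the stored prefix is not overwritten in place.
	out := make([]entry, 0, keep+len(ents))
	out = append(out, log[:keep]...)
	return append(out, ents...)
}

func main() {
	log := []entry{{1, "a"}, {2, "b"}, {3, "c"}}
	// Writing an entry at index 2 discards the old entries 2 and 3.
	log = appendEntries(log, []entry{{2, "x"}})
	fmt.Println(log) // prints [{1 a} {2 x}]
}
```

A disk-backed store would apply the same rule, truncating the on-disk log at the first new index before writing.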
Second, all persisted log entries must be made available via an
implementation of the Storage interface. The provided MemoryStorage
type can be used for this (if you repopulate its state upon a
restart), or you can supply your own disk-backed implementation.
Third, when you receive a message from another node, pass it to Node.Step:
```go
func recvRaftRPC(ctx context.Context, m raftpb.Message) {
	n.Step(ctx, m)
}
```
Finally, you need to call `Node.Tick()` at regular intervals (probably
via a `time.Ticker`). Raft has two important timeouts: heartbeat and the
election timeout. However, internally to the raft package time is
represented by an abstract "tick".
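To make the relation between ticks and wall-clock time concrete, here is a small sketch. It assumes the application calls `Node.Tick()` every 100ms (a hypothetical interval) and uses the `ElectionTick: 10` and `HeartbeatTick: 1` values from the Config examples above. The helper name `timeouts` is invented for illustration. The library also randomizes the election timeout internally, so treat the election value as a baseline rather than an exact figure.

```go
package main

import (
	"fmt"
	"time"
)

// timeouts converts tick counts into wall-clock durations for a given tick
// interval. It is a hypothetical helper, not part of the raft package.
func timeouts(tick time.Duration, electionTick, heartbeatTick int) (election, heartbeat time.Duration) {
	return time.Duration(electionTick) * tick, time.Duration(heartbeatTick) * tick
}

func main() {
	election, heartbeat := timeouts(100*time.Millisecond, 10, 1)
	fmt.Println(election, heartbeat) // prints 1s 100ms
}
```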
The total state machine handling loop will look something like this:
```go
for {
	select {
	case <-s.Ticker:
		s.Node.Tick()
	case rd := <-s.Node.Ready():
		saveToStorage(rd.State, rd.Entries, rd.Snapshot)
		send(rd.Messages)
		if !raft.IsEmptySnap(rd.Snapshot) {
			processSnapshot(rd.Snapshot)
		}
		for _, entry := range rd.CommittedEntries {
			process(entry)
			if entry.Type == raftpb.EntryConfChange {
				var cc raftpb.ConfChange
				cc.Unmarshal(entry.Data)
				s.Node.ApplyConfChange(cc)
			}
		}
		s.Node.Advance()
	case <-s.done:
		return
	}
}
```
To propose changes to the state machine from your node take your application
data, serialize it into a byte slice and call:
```go
n.Propose(ctx, data)
```
If the proposal is committed, data will appear in committed entries with type
raftpb.EntryNormal. There is no guarantee that a proposed command will be
committed; you may have to re-propose after a timeout.
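One way to re-propose after a timeout is sketched below. Everything except the `Propose(ctx, data)` signature is an assumption made for illustration: the `proposer` interface, the `committed` channel (which the application's Ready loop would signal once it sees the entry in CommittedEntries), `fakeNode`, and `runDemo` are all hypothetical. The sketch uses the standard library `context` package.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// proposer is the single method of raft.Node that this sketch relies on.
type proposer interface {
	Propose(ctx context.Context, data []byte) error
}

var errNotCommitted = errors.New("proposal not committed after all attempts")

// proposeWithRetry proposes data, then waits up to wait for a signal on
// committed. If none arrives, it proposes again, up to attempts times.
func proposeWithRetry(ctx context.Context, p proposer, data []byte,
	committed <-chan struct{}, wait time.Duration, attempts int) error {
	for i := 0; i < attempts; i++ {
		if err := p.Propose(ctx, data); err != nil {
			return err
		}
		timer := time.NewTimer(wait)
		select {
		case <-committed:
			timer.Stop()
			return nil
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
			// Timed out; loop around and propose again.
		}
	}
	return errNotCommitted
}

// fakeNode is a test double that "commits" only its commitOn-th proposal.
type fakeNode struct {
	calls     int
	commitOn  int
	committed chan struct{}
}

func (f *fakeNode) Propose(ctx context.Context, data []byte) error {
	f.calls++
	if f.calls == f.commitOn {
		f.committed <- struct{}{} // channel is buffered, so this does not block
	}
	return nil
}

// runDemo drives proposeWithRetry against a fakeNode and returns the number
// of proposals made along with the final error.
func runDemo(commitOn, attempts int) (int, error) {
	n := &fakeNode{commitOn: commitOn, committed: make(chan struct{}, 1)}
	err := proposeWithRetry(context.Background(), n, []byte("set x=1"),
		n.committed, 5*time.Millisecond, attempts)
	return n.calls, err
}

func main() {
	calls, err := runDemo(3, 5)
	fmt.Println(calls, err) // prints 3 <nil>
}
```

A retried proposal can end up committed more than once if the earlier copy was only delayed, so an application using this pattern would likely need idempotent commands or some form of deduplication in its state machine.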
To add or remove a node in a cluster, build a ConfChange struct 'cc' and call:
```go
n.ProposeConfChange(ctx, cc)
```
After the config change is committed, a committed entry with type
raftpb.EntryConfChange will be returned. You must apply it to the node through:
```go
var cc raftpb.ConfChange
cc.Unmarshal(data)
n.ApplyConfChange(cc)
```
Note: An ID represents a unique node in a cluster for all time. A
given ID MUST be used only once even if the old node has been removed.
This means that, for example, IP addresses make poor node IDs since they
may be reused. Node IDs must be non-zero.
## Implementation notes
This implementation is up to date with the final Raft thesis
(https://ramcloud.stanford.edu/~ongaro/thesis.pdf), although our
implementation of the membership change protocol differs somewhat from
that described in chapter 4. The key invariant that membership changes
happen one node at a time is preserved, but in our implementation the
membership change takes effect when its entry is applied, not when it
is added to the log (so the entry is committed under the old
membership instead of the new). This is equivalent in terms of safety,
since the old and new configurations are guaranteed to overlap.
To ensure that we do not attempt to commit two membership changes at
once by matching log positions (which would be unsafe since they
should have different quorum requirements), we simply disallow any
proposed membership change while any uncommitted change appears in
the leader's log.
This approach introduces a problem when you try to remove a member
from a two-member cluster: If one of the members dies before the
other one receives the commit of the confchange entry, then the member
cannot be removed any more since the cluster cannot make progress.
For this reason it is highly recommended to use three or more nodes in
every cluster.
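The overlap claim and the two-member problem both come down to majority arithmetic, which the following sketch works through. The helper names (`quorum`, `tolerated`, `majoritiesOverlap`) are hypothetical and are not the raft package's API.

```go
package main

import "fmt"

// quorum returns the number of nodes that form a majority of n nodes.
func quorum(n int) int { return n/2 + 1 }

// tolerated returns how many nodes may fail while a majority remains.
func tolerated(n int) int { return n - quorum(n) }

// majoritiesOverlap reports whether every majority of an n-node configuration
// shares at least one node with every majority of the (n+1)-node
// configuration obtained by adding one node. Removing a node is the same
// case read in the other direction.
//
// A majority of the old configuration has quorum(n) members. A majority of
// the new one has quorum(n+1) members, at most one of which is the added
// node, so at least quorum(n+1)-1 of them are old nodes. Two subsets of the
// n old nodes must intersect when their sizes sum to more than n.
func majoritiesOverlap(n int) bool {
	return quorum(n)+quorum(n+1)-1 > n
}

func main() {
	for _, n := range []int{2, 3, 5} {
		fmt.Printf("%d nodes: quorum %d, tolerates %d failure(s), overlaps with %d-node config: %v\n",
			n, quorum(n), tolerated(n), n+1, majoritiesOverlap(n))
	}
}
```

With two members the quorum is two, so a single failure stops all progress, including the commit of the entry that would remove the failed member. With three members one failure can be tolerated.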

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -77,7 +77,7 @@ not empty. Note that when writing an Entry with Index i, any
previously-persisted entries with Index >= i must be discarded.
2. Send all Messages to the nodes named in the To field. It is important that
no messages be sent until after the latest HardState has been persisted to disk,
no messages be sent until the latest HardState has been persisted to disk,
and all Entries written by any previous Ready batch (Messages may be sent while
entries from the same batch are being persisted). To reduce the I/O latency, an
optimization can be applied to make leader write to disk in parallel with its
@@ -137,6 +137,7 @@ The total state machine handling loop will look something like this:
cc.Unmarshal(entry.Data)
s.Node.ApplyConfChange(cc)
}
}
s.Node.Advance()
case <-s.done:
return
@@ -209,10 +210,10 @@ stale log entries:
passes 'MsgHup' to its Step method and becomes (or remains) a candidate to
start a new election.
'MsgBeat' is an internal type that signals leaders to send a heartbeat of
'MsgBeat' is an internal type that signals the leader to send a heartbeat of
the 'MsgHeartbeat' type. If a node is a leader, the 'tick' function in
the 'raft' struct is set as 'tickHeartbeat', and sends periodic heartbeat
messages of the 'MsgBeat' type to its followers.
the 'raft' struct is set as 'tickHeartbeat', and triggers the leader to
send periodic 'MsgHeartbeat' messages to its followers.
'MsgProp' proposes to append data to its log entries. This is a special
type to redirect proposals to leader. Therefore, send method overwrites

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -200,6 +200,7 @@ func StartNode(c *Config, peers []Peer) Node {
}
n := newNode()
n.logger = c.Logger
go n.run(r)
return &n
}
@@ -212,6 +213,7 @@ func RestartNode(c *Config) Node {
r := newRaft(c)
n := newNode()
n.logger = c.Logger
go n.run(r)
return &n
}
@@ -228,6 +230,8 @@ type node struct {
done chan struct{}
stop chan struct{}
status chan chan Status
logger Logger
}
func newNode() node {
@@ -238,10 +242,13 @@ func newNode() node {
confstatec: make(chan pb.ConfState),
readyc: make(chan Ready),
advancec: make(chan struct{}),
tickc: make(chan struct{}),
done: make(chan struct{}),
stop: make(chan struct{}),
status: make(chan chan Status),
// make tickc a buffered chan, so raft node can buffer some ticks when the node
// is busy processing raft messages. Raft node will resume process buffered
// ticks when it becomes idle.
tickc: make(chan struct{}, 128),
done: make(chan struct{}),
stop: make(chan struct{}),
status: make(chan chan Status),
}
}
@@ -306,7 +313,7 @@ func (n *node) run(r *raft) {
r.Step(m)
case m := <-n.recvc:
// filter out response message from unknown From.
if _, ok := r.prs[m.From]; ok || !IsResponseMsg(m) {
if _, ok := r.prs[m.From]; ok || !IsResponseMsg(m.Type) {
r.Step(m) // raft never returns an error
}
case cc := <-n.confc:
@@ -325,7 +332,7 @@ func (n *node) run(r *raft) {
// block incoming proposal when local node is
// removed
if cc.NodeID == r.id {
n.propc = nil
propc = nil
}
r.removeNode(cc.NodeID)
case pb.ConfChangeUpdateNode:
@@ -381,6 +388,8 @@ func (n *node) Tick() {
select {
case n.tickc <- struct{}{}:
case <-n.done:
default:
n.logger.Warningf("A tick missed to fire. Node blocks too long!")
}
}
@@ -392,7 +401,7 @@ func (n *node) Propose(ctx context.Context, data []byte) error {
func (n *node) Step(ctx context.Context, m pb.Message) error {
// ignore unexpected local messages receiving over network
if IsLocalMsg(m) {
if IsLocalMsg(m.Type) {
// TODO: return an error?
return nil
}

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -75,7 +75,6 @@ type Progress struct {
func (pr *Progress) resetState(state ProgressStateType) {
pr.Paused = false
pr.RecentActive = false
pr.PendingSnapshot = 0
pr.State = state
pr.ins.reset()
@@ -167,9 +166,9 @@ func (pr *Progress) isPaused() bool {
func (pr *Progress) snapshotFailure() { pr.PendingSnapshot = 0 }
// maybeSnapshotAbort unsets pendingSnapshot if Match is equal or higher than
// the pendingSnapshot
func (pr *Progress) maybeSnapshotAbort() bool {
// needSnapshotAbort returns true if snapshot progress's Match
// is equal or higher than the pendingSnapshot.
func (pr *Progress) needSnapshotAbort() bool {
return pr.State == ProgressStateSnapshot && pr.Match >= pr.PendingSnapshot
}

View File

@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -29,8 +29,6 @@ import (
const None uint64 = 0
const noLimit = math.MaxUint64
var ErrSnapshotTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")
// Possible values for StateType.
const (
StateFollower StateType = iota
@@ -156,7 +154,9 @@ type raft struct {
// the leader id
lead uint64
// leadTransferee is id of the leader transfer target when its value is not zero.
// Follow the procedure defined in raft thesis 3.10.
leadTransferee uint64
// New configuration is ignored if there exists unapplied configuration.
pendingConf bool
@@ -229,7 +229,7 @@ func newRaft(c *Config) *raft {
}
r.becomeFollower(r.Term, None)
nodesStrs := make([]string, 0)
var nodesStrs []string
for _, n := range r.nodes() {
nodesStrs = append(nodesStrs, fmt.Sprintf("%x", n))
}
@@ -397,6 +397,8 @@ func (r *raft) reset(term uint64) {
r.heartbeatElapsed = 0
r.resetRandomizedElectionTimeout()
r.abortLeaderTransfer()
r.votes = make(map[uint64]bool)
for id := range r.prs {
r.prs[id] = &Progress{Next: r.raftLog.lastIndex() + 1, ins: newInflights(r.maxInflight)}
@@ -421,12 +423,9 @@ func (r *raft) appendEntry(es ...pb.Entry) {
// tickElection is run by followers and candidates after r.electionTimeout.
func (r *raft) tickElection() {
if !r.promotable() {
r.electionElapsed = 0
return
}
r.electionElapsed++
if r.pastElectionTimeout() {
if r.promotable() && r.pastElectionTimeout() {
r.electionElapsed = 0
r.Step(pb.Message{From: r.id, Type: pb.MsgHup})
}
@@ -442,6 +441,10 @@ func (r *raft) tickHeartbeat() {
if r.checkQuorum {
r.Step(pb.Message{From: r.id, Type: pb.MsgCheckQuorum})
}
// If current leader cannot transfer leadership in electionTimeout, it becomes leader again.
if r.state == StateLeader && r.leadTransferee != None {
r.abortLeaderTransfer()
}
}
if r.state != StateLeader {
@@ -547,6 +550,11 @@ func (r *raft) Step(m pb.Message) error {
}
return nil
}
if m.Type == pb.MsgTransferLeader {
if r.state != StateLeader {
r.logger.Debugf("%x [term %d state %v] ignoring MsgTransferLeader to %x", r.id, r.Term, r.state, m.From)
}
}
switch {
case m.Term == 0:
@@ -554,15 +562,35 @@ func (r *raft) Step(m pb.Message) error {
case m.Term > r.Term:
lead := m.From
if m.Type == pb.MsgVote {
if r.checkQuorum && r.state != StateCandidate && r.electionElapsed < r.electionTimeout {
// If a server receives a RequestVote request within the minimum election timeout
// of hearing from a current leader, it does not update its term or grant its vote
r.logger.Infof("%x [logterm: %d, index: %d, vote: %x] ignored vote from %x [logterm: %d, index: %d] at term %d: lease is not expired (remaining ticks: %d)",
r.id, r.raftLog.lastTerm(), r.raftLog.lastIndex(), r.Vote, m.From, m.LogTerm, m.Index, r.Term, r.electionTimeout-r.electionElapsed)
return nil
}
lead = None
}
r.logger.Infof("%x [term: %d] received a %s message with higher term from %x [term: %d]",
r.id, r.Term, m.Type, m.From, m.Term)
r.becomeFollower(m.Term, lead)
case m.Term < r.Term:
// ignore
r.logger.Infof("%x [term: %d] ignored a %s message with lower term from %x [term: %d]",
r.id, r.Term, m.Type, m.From, m.Term)
if r.checkQuorum && (m.Type == pb.MsgHeartbeat || m.Type == pb.MsgApp) {
// We have received messages from a leader at a lower term. It is possible that these messages were
// simply delayed in the network, but this could also mean that this node has advanced its term number
// during a network partition, and it is now unable to either win an election or to rejoin the majority
// on the old term. If checkQuorum is false, this will be handled by incrementing term numbers in response
// to MsgVote with a higher term, but if checkQuorum is true we may not advance the term on MsgVote and
// must generate other messages to advance the term. The net result of these two features is to minimize
// the disruption caused by nodes that have been removed from the cluster's configuration: a removed node
// will send MsgVotes which will be ignored, but it will not receive MsgApp or MsgHeartbeat, so it will not
// create disruptive term increases
r.send(pb.Message{To: m.From, Type: pb.MsgAppResp})
} else {
// ignore other cases
r.logger.Infof("%x [term: %d] ignored a %s message with lower term from %x [term: %d]",
r.id, r.Term, m.Type, m.From, m.Term)
}
return nil
}
r.step(r, m)
@@ -572,7 +600,6 @@ func (r *raft) Step(m pb.Message) error {
type stepFunc func(r *raft, m pb.Message)
func stepLeader(r *raft, m pb.Message) {
// These message types do not require any progress for m.From.
switch m.Type {
case pb.MsgBeat:
@@ -594,6 +621,11 @@ func stepLeader(r *raft, m pb.Message) {
// drop any new proposals.
return
}
if r.leadTransferee != None {
r.logger.Debugf("%x [term %d] transfer leadership to %x is in progress; dropping proposal", r.id, r.Term, r.leadTransferee)
return
}
for i, e := range m.Entries {
if e.Type == pb.EntryConfChange {
if r.pendingConf {
@@ -615,7 +647,7 @@ func stepLeader(r *raft, m pb.Message) {
// All other message types require a progress for m.From (pr).
pr, prOk := r.prs[m.From]
if !prOk {
r.logger.Debugf("no progress available for %x", m.From)
r.logger.Debugf("%x no progress available for %x", r.id, m.From)
return
}
switch m.Type {
@@ -638,7 +670,7 @@ func stepLeader(r *raft, m pb.Message) {
switch {
case pr.State == ProgressStateProbe:
pr.becomeReplicate()
case pr.State == ProgressStateSnapshot && pr.maybeSnapshotAbort():
case pr.State == ProgressStateSnapshot && pr.needSnapshotAbort():
r.logger.Debugf("%x snapshot aborted, resumed sending replication messages to %x [%s]", r.id, m.From, pr)
pr.becomeProbe()
case pr.State == ProgressStateReplicate:
@@ -652,6 +684,11 @@ func stepLeader(r *raft, m pb.Message) {
// an update before, send it now.
r.sendAppend(m.From)
}
// Transfer leadership is in progress.
if m.From == r.leadTransferee && pr.Match == r.raftLog.lastIndex() {
r.logger.Infof("%x sent MsgTimeoutNow to %x after received MsgAppResp", r.id, m.From)
r.sendTimeoutNow(m.From)
}
}
}
case pb.MsgHeartbeatResp:
@@ -687,6 +724,33 @@ func stepLeader(r *raft, m pb.Message) {
pr.becomeProbe()
}
r.logger.Debugf("%x failed to send message to %x because it is unreachable [%s]", r.id, m.From, pr)
case pb.MsgTransferLeader:
leadTransferee := m.From
lastLeadTransferee := r.leadTransferee
if lastLeadTransferee != None {
if lastLeadTransferee == leadTransferee {
r.logger.Infof("%x [term %d] transfer leadership to %x is in progress, ignores request to same node %x",
r.id, r.Term, leadTransferee, leadTransferee)
return
}
r.abortLeaderTransfer()
r.logger.Infof("%x [term %d] abort previous transferring leadership to %x", r.id, r.Term, lastLeadTransferee)
}
if leadTransferee == r.id {
r.logger.Debugf("%x is already leader. Ignored transferring leadership to self", r.id)
return
}
// Transfer leadership to third party.
r.logger.Infof("%x [term %d] starts to transfer leadership to %x", r.id, r.Term, leadTransferee)
// Transfer leadership should be finished in one electionTimeout, so reset r.electionElapsed.
r.electionElapsed = 0
r.leadTransferee = leadTransferee
if pr.Match == r.raftLog.lastIndex() {
r.sendTimeoutNow(leadTransferee)
r.logger.Infof("%x sends MsgTimeoutNow to %x immediately as %x already has up-to-date log", r.id, leadTransferee, leadTransferee)
} else {
r.sendAppend(leadTransferee)
}
}
}
@@ -718,6 +782,8 @@ func stepCandidate(r *raft, m pb.Message) {
case len(r.votes) - gr:
r.becomeFollower(r.Term, None)
}
case pb.MsgTimeoutNow:
r.logger.Debugf("%x [term %d state %v] ignored MsgTimeoutNow from %x", r.id, r.Term, r.state, m.From)
}
}
@@ -753,6 +819,9 @@ func stepFollower(r *raft, m pb.Message) {
r.id, r.raftLog.lastTerm(), r.raftLog.lastIndex(), r.Vote, m.From, m.LogTerm, m.Index, r.Term)
r.send(pb.Message{To: m.From, Type: pb.MsgVoteResp, Reject: true})
}
case pb.MsgTimeoutNow:
r.logger.Infof("%x [term %d] received MsgTimeoutNow from %x and starts an election to get leadership.", r.id, r.Term, m.From)
r.campaign()
}
}
@@ -841,11 +910,21 @@ func (r *raft) addNode(id uint64) {
func (r *raft) removeNode(id uint64) {
r.delProgress(id)
r.pendingConf = false
// do not try to commit or abort transferring if there is no nodes in the cluster.
if len(r.prs) == 0 {
return
}
// The quorum size is now smaller, so see if any pending entries can
// be committed.
if r.maybeCommit() {
r.bcastAppend()
}
// If the removed node is the leadTransferee, then abort the leadership transferring.
if r.state == StateLeader && r.leadTransferee == id {
r.abortLeaderTransfer()
}
}
func (r *raft) resetPendingConf() { r.pendingConf = false }
@@ -900,3 +979,11 @@ func (r *raft) checkQuorumActive() bool {
return act >= r.quorum()
}
func (r *raft) sendTimeoutNow(to uint64) {
r.send(pb.Message{To: to, Type: pb.MsgTimeoutNow})
}
func (r *raft) abortLeaderTransfer() {
r.leadTransferee = None
}

View File

@@ -22,7 +22,7 @@ package raftpb
import (
"fmt"
proto "github.com/gogo/protobuf/proto"
proto "github.com/golang/protobuf/proto"
math "math"
)
@@ -34,6 +34,10 @@ var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
const _ = proto.ProtoPackageIsVersion1
type EntryType int32
const (
@@ -66,23 +70,26 @@ func (x *EntryType) UnmarshalJSON(data []byte) error {
*x = EntryType(value)
return nil
}
func (EntryType) EnumDescriptor() ([]byte, []int) { return fileDescriptorRaft, []int{0} }
type MessageType int32
const (
MsgHup MessageType = 0
MsgBeat MessageType = 1
MsgProp MessageType = 2
MsgApp MessageType = 3
MsgAppResp MessageType = 4
MsgVote MessageType = 5
MsgVoteResp MessageType = 6
MsgSnap MessageType = 7
MsgHeartbeat MessageType = 8
MsgHeartbeatResp MessageType = 9
MsgUnreachable MessageType = 10
MsgSnapStatus MessageType = 11
MsgCheckQuorum MessageType = 12
MsgHup MessageType = 0
MsgBeat MessageType = 1
MsgProp MessageType = 2
MsgApp MessageType = 3
MsgAppResp MessageType = 4
MsgVote MessageType = 5
MsgVoteResp MessageType = 6
MsgSnap MessageType = 7
MsgHeartbeat MessageType = 8
MsgHeartbeatResp MessageType = 9
MsgUnreachable MessageType = 10
MsgSnapStatus MessageType = 11
MsgCheckQuorum MessageType = 12
MsgTransferLeader MessageType = 13
MsgTimeoutNow MessageType = 14
)
var MessageType_name = map[int32]string{
@@ -99,21 +106,25 @@ var MessageType_name = map[int32]string{
10: "MsgUnreachable",
11: "MsgSnapStatus",
12: "MsgCheckQuorum",
13: "MsgTransferLeader",
14: "MsgTimeoutNow",
}
var MessageType_value = map[string]int32{
"MsgHup": 0,
"MsgBeat": 1,
"MsgProp": 2,
"MsgApp": 3,
"MsgAppResp": 4,
"MsgVote": 5,
"MsgVoteResp": 6,
"MsgSnap": 7,
"MsgHeartbeat": 8,
"MsgHeartbeatResp": 9,
"MsgUnreachable": 10,
"MsgSnapStatus": 11,
"MsgCheckQuorum": 12,
"MsgHup": 0,
"MsgBeat": 1,
"MsgProp": 2,
"MsgApp": 3,
"MsgAppResp": 4,
"MsgVote": 5,
"MsgVoteResp": 6,
"MsgSnap": 7,
"MsgHeartbeat": 8,
"MsgHeartbeatResp": 9,
"MsgUnreachable": 10,
"MsgSnapStatus": 11,
"MsgCheckQuorum": 12,
"MsgTransferLeader": 13,
"MsgTimeoutNow": 14,
}
func (x MessageType) Enum() *MessageType {
@@ -132,6 +143,7 @@ func (x *MessageType) UnmarshalJSON(data []byte) error {
*x = MessageType(value)
return nil
}
func (MessageType) EnumDescriptor() ([]byte, []int) { return fileDescriptorRaft, []int{1} }
type ConfChangeType int32
@@ -168,29 +180,32 @@ func (x *ConfChangeType) UnmarshalJSON(data []byte) error {
*x = ConfChangeType(value)
return nil
}
func (ConfChangeType) EnumDescriptor() ([]byte, []int) { return fileDescriptorRaft, []int{2} }
type Entry struct {
Type EntryType `protobuf:"varint,1,opt,name=Type,enum=raftpb.EntryType" json:"Type"`
Term uint64 `protobuf:"varint,2,opt,name=Term" json:"Term"`
Index uint64 `protobuf:"varint,3,opt,name=Index" json:"Index"`
Data []byte `protobuf:"bytes,4,opt,name=Data" json:"Data,omitempty"`
Type EntryType `protobuf:"varint,1,opt,name=Type,json=type,enum=raftpb.EntryType" json:"Type"`
Term uint64 `protobuf:"varint,2,opt,name=Term,json=term" json:"Term"`
Index uint64 `protobuf:"varint,3,opt,name=Index,json=index" json:"Index"`
Data []byte `protobuf:"bytes,4,opt,name=Data,json=data" json:"Data,omitempty"`
XXX_unrecognized []byte `json:"-"`
}
func (m *Entry) Reset() { *m = Entry{} }
func (m *Entry) String() string { return proto.CompactTextString(m) }
func (*Entry) ProtoMessage() {}
func (m *Entry) Reset() { *m = Entry{} }
func (m *Entry) String() string { return proto.CompactTextString(m) }
func (*Entry) ProtoMessage() {}
func (*Entry) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{0} }
type SnapshotMetadata struct {
ConfState ConfState `protobuf:"bytes,1,opt,name=conf_state" json:"conf_state"`
ConfState ConfState `protobuf:"bytes,1,opt,name=conf_state,json=confState" json:"conf_state"`
Index uint64 `protobuf:"varint,2,opt,name=index" json:"index"`
Term uint64 `protobuf:"varint,3,opt,name=term" json:"term"`
XXX_unrecognized []byte `json:"-"`
}
func (m *SnapshotMetadata) Reset() { *m = SnapshotMetadata{} }
func (m *SnapshotMetadata) String() string { return proto.CompactTextString(m) }
func (*SnapshotMetadata) ProtoMessage() {}
func (m *SnapshotMetadata) Reset() { *m = SnapshotMetadata{} }
func (m *SnapshotMetadata) String() string { return proto.CompactTextString(m) }
func (*SnapshotMetadata) ProtoMessage() {}
func (*SnapshotMetadata) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{1} }
type Snapshot struct {
Data []byte `protobuf:"bytes,1,opt,name=data" json:"data,omitempty"`
@@ -198,9 +213,10 @@ type Snapshot struct {
XXX_unrecognized []byte `json:"-"`
}
func (m *Snapshot) Reset() { *m = Snapshot{} }
func (m *Snapshot) String() string { return proto.CompactTextString(m) }
func (*Snapshot) ProtoMessage() {}
func (m *Snapshot) Reset() { *m = Snapshot{} }
func (m *Snapshot) String() string { return proto.CompactTextString(m) }
func (*Snapshot) ProtoMessage() {}
func (*Snapshot) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{2} }
type Message struct {
Type MessageType `protobuf:"varint,1,opt,name=type,enum=raftpb.MessageType" json:"type"`
@@ -217,9 +233,10 @@ type Message struct {
XXX_unrecognized []byte `json:"-"`
}
func (m *Message) Reset() { *m = Message{} }
func (m *Message) String() string { return proto.CompactTextString(m) }
func (*Message) ProtoMessage() {}
func (m *Message) Reset() { *m = Message{} }
func (m *Message) String() string { return proto.CompactTextString(m) }
func (*Message) ProtoMessage() {}
func (*Message) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{3} }
type HardState struct {
Term uint64 `protobuf:"varint,1,opt,name=term" json:"term"`
@@ -228,30 +245,33 @@ type HardState struct {
XXX_unrecognized []byte `json:"-"`
}
func (m *HardState) Reset() { *m = HardState{} }
func (m *HardState) String() string { return proto.CompactTextString(m) }
func (*HardState) ProtoMessage() {}
func (m *HardState) Reset() { *m = HardState{} }
func (m *HardState) String() string { return proto.CompactTextString(m) }
func (*HardState) ProtoMessage() {}
func (*HardState) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{4} }
type ConfState struct {
Nodes []uint64 `protobuf:"varint,1,rep,name=nodes" json:"nodes,omitempty"`
XXX_unrecognized []byte `json:"-"`
}
func (m *ConfState) Reset() { *m = ConfState{} }
func (m *ConfState) String() string { return proto.CompactTextString(m) }
func (*ConfState) ProtoMessage() {}
func (m *ConfState) Reset() { *m = ConfState{} }
func (m *ConfState) String() string { return proto.CompactTextString(m) }
func (*ConfState) ProtoMessage() {}
func (*ConfState) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{5} }
type ConfChange struct {
ID uint64 `protobuf:"varint,1,opt,name=ID" json:"ID"`
Type ConfChangeType `protobuf:"varint,2,opt,name=Type,enum=raftpb.ConfChangeType" json:"Type"`
NodeID uint64 `protobuf:"varint,3,opt,name=NodeID" json:"NodeID"`
Context []byte `protobuf:"bytes,4,opt,name=Context" json:"Context,omitempty"`
ID uint64 `protobuf:"varint,1,opt,name=ID,json=iD" json:"ID"`
Type ConfChangeType `protobuf:"varint,2,opt,name=Type,json=type,enum=raftpb.ConfChangeType" json:"Type"`
NodeID uint64 `protobuf:"varint,3,opt,name=NodeID,json=nodeID" json:"NodeID"`
Context []byte `protobuf:"bytes,4,opt,name=Context,json=context" json:"Context,omitempty"`
XXX_unrecognized []byte `json:"-"`
}
func (m *ConfChange) Reset() { *m = ConfChange{} }
func (m *ConfChange) String() string { return proto.CompactTextString(m) }
func (*ConfChange) ProtoMessage() {}
func (m *ConfChange) Reset() { *m = ConfChange{} }
func (m *ConfChange) String() string { return proto.CompactTextString(m) }
func (*ConfChange) ProtoMessage() {}
func (*ConfChange) Descriptor() ([]byte, []int) { return fileDescriptorRaft, []int{6} }
func init() {
proto.RegisterType((*Entry)(nil), "raftpb.Entry")
@@ -1766,3 +1786,55 @@ var (
ErrInvalidLengthRaft = fmt.Errorf("proto: negative length found during unmarshaling")
ErrIntOverflowRaft = fmt.Errorf("proto: integer overflow")
)
var fileDescriptorRaft = []byte{
// 753 bytes of a gzipped FileDescriptorProto
0x1f, 0x8b, 0x08, 0x00, 0x00, 0x09, 0x6e, 0x88, 0x02, 0xff, 0x64, 0x53, 0xc1, 0x6e, 0xdb, 0x38,
0x10, 0xb5, 0x64, 0xd9, 0xb2, 0x47, 0x89, 0xc3, 0x30, 0xde, 0x05, 0x11, 0x04, 0x5e, 0xaf, 0xb1,
0x07, 0x23, 0x8b, 0x64, 0x77, 0x7d, 0xd8, 0xc3, 0xde, 0x12, 0x7b, 0x81, 0x04, 0x58, 0x07, 0x5b,
0xc7, 0xe9, 0xa1, 0x45, 0x51, 0x30, 0x16, 0x2d, 0xbb, 0x8d, 0x44, 0x81, 0xa2, 0xd3, 0xe4, 0x52,
0x14, 0xe8, 0xa1, 0x97, 0x7e, 0x40, 0x3f, 0x29, 0xc7, 0x7c, 0x41, 0xd1, 0xa4, 0x3f, 0x52, 0x90,
0xa2, 0x6c, 0x29, 0xbe, 0x91, 0xef, 0x0d, 0x67, 0xde, 0xbc, 0x19, 0x02, 0x08, 0x3a, 0x95, 0x87,
0xb1, 0xe0, 0x92, 0xe3, 0xaa, 0x3a, 0xc7, 0x97, 0xbb, 0xcd, 0x80, 0x07, 0x5c, 0x43, 0x7f, 0xa8,
0x53, 0xca, 0x76, 0xde, 0x43, 0xe5, 0xdf, 0x48, 0x8a, 0x5b, 0xfc, 0x3b, 0x38, 0xe3, 0xdb, 0x98,
0x11, 0xab, 0x6d, 0x75, 0x1b, 0xbd, 0xed, 0xc3, 0xf4, 0xd5, 0xa1, 0x26, 0x15, 0x71, 0xec, 0xdc,
0x7d, 0xfd, 0xa5, 0x34, 0x72, 0xe4, 0x6d, 0xcc, 0x30, 0x01, 0x67, 0xcc, 0x44, 0x48, 0xec, 0xb6,
0xd5, 0x75, 0x96, 0x0c, 0x13, 0x21, 0xde, 0x85, 0xca, 0x69, 0xe4, 0xb3, 0x1b, 0x52, 0xce, 0x51,
0x95, 0xb9, 0x82, 0x30, 0x06, 0x67, 0x40, 0x25, 0x25, 0x4e, 0xdb, 0xea, 0x6e, 0x8c, 0x1c, 0x9f,
0x4a, 0xda, 0xf9, 0x60, 0x01, 0x3a, 0x8f, 0x68, 0x9c, 0xcc, 0xb8, 0x1c, 0x32, 0x49, 0x15, 0x88,
0xff, 0x06, 0x98, 0xf0, 0x68, 0xfa, 0x3a, 0x91, 0x54, 0xa6, 0x8a, 0xbc, 0x95, 0xa2, 0x3e, 0x8f,
0xa6, 0xe7, 0x8a, 0x30, 0xc9, 0xeb, 0x93, 0x0c, 0x50, 0xc5, 0x75, 0xa5, 0x82, 0x2e, 0x53, 0x9c,
0x80, 0x16, 0x58, 0xd0, 0xa5, 0x91, 0xce, 0x0b, 0xa8, 0x65, 0x0a, 0x94, 0x44, 0xa5, 0x40, 0xd7,
0x34, 0x12, 0xf1, 0x3f, 0x50, 0x0b, 0x8d, 0x32, 0x9d, 0xd8, 0xeb, 0x91, 0x4c, 0xcb, 0x53, 0xe5,
0x26, 0xef, 0x32, 0xbe, 0xf3, 0xb1, 0x0c, 0xee, 0x90, 0x25, 0x09, 0x0d, 0x18, 0x3e, 0x00, 0x6d,
0x9e, 0x71, 0x78, 0x27, 0xcb, 0x61, 0xe8, 0x35, 0x8f, 0x9b, 0x60, 0x4b, 0x5e, 0xe8, 0xc4, 0x96,
0x5c, 0xb5, 0x31, 0x15, 0xfc, 0x49, 0x1b, 0x0a, 0x59, 0x36, 0xe8, 0xac, 0xcd, 0xa4, 0x05, 0xee,
0x15, 0x0f, 0xf4, 0xc0, 0x2a, 0x39, 0x32, 0x03, 0x57, 0xb6, 0x55, 0xd7, 0x6d, 0x3b, 0x00, 0x97,
0x45, 0x52, 0xcc, 0x59, 0x42, 0xdc, 0x76, 0xb9, 0xeb, 0xf5, 0x36, 0x0b, 0x9b, 0x91, 0xa5, 0x32,
0x31, 0x78, 0x0f, 0xaa, 0x13, 0x1e, 0x86, 0x73, 0x49, 0x6a, 0xb9, 0x5c, 0x06, 0xc3, 0x3d, 0xa8,
0x25, 0xc6, 0x31, 0x52, 0xd7, 0x4e, 0xa2, 0xa7, 0x4e, 0x66, 0x0e, 0x66, 0x71, 0x2a, 0xa3, 0x60,
0x6f, 0xd8, 0x44, 0x12, 0x68, 0x5b, 0xdd, 0x5a, 0x96, 0x31, 0xc5, 0xf0, 0x6f, 0x00, 0xe9, 0xe9,
0x64, 0x1e, 0x49, 0xe2, 0xe5, 0x6a, 0xe6, 0xf0, 0xce, 0x2b, 0xa8, 0x9f, 0x50, 0xe1, 0xa7, 0x4b,
0x92, 0xf9, 0x64, 0xad, 0xf9, 0x44, 0xc0, 0xb9, 0xe6, 0x92, 0x15, 0xb7, 0x5a, 0x21, 0xb9, 0xb6,
0xca, 0xeb, 0x6d, 0x75, 0x7e, 0x85, 0xfa, 0x72, 0x29, 0x71, 0x13, 0x2a, 0x11, 0xf7, 0x59, 0x42,
0xac, 0x76, 0xb9, 0xeb, 0x8c, 0xd2, 0x4b, 0xe7, 0xb3, 0x05, 0xa0, 0x62, 0xfa, 0x33, 0x1a, 0x05,
0x7a, 0xb6, 0xa7, 0x83, 0x82, 0x02, 0x7b, 0x3e, 0xc0, 0x7f, 0x9a, 0x2f, 0x68, 0xeb, 0x05, 0xf9,
0x39, 0xbf, 0xf0, 0xe9, 0xbb, 0xb5, 0x1d, 0xd9, 0x83, 0xea, 0x19, 0xf7, 0xd9, 0xe9, 0xa0, 0xa8,
0x2b, 0xd2, 0x18, 0x26, 0xe0, 0xf6, 0x79, 0x24, 0xd9, 0x8d, 0x34, 0x5f, 0xce, 0x9d, 0xa4, 0xd7,
0xfd, 0xbf, 0xa0, 0xbe, 0xfc, 0xd8, 0x78, 0x0b, 0x3c, 0x7d, 0x39, 0xe3, 0x22, 0xa4, 0x57, 0xa8,
0x84, 0x77, 0x60, 0x4b, 0x03, 0xab, 0xc2, 0xc8, 0xda, 0xff, 0x64, 0x83, 0x97, 0x5b, 0x55, 0x0c,
0x50, 0x1d, 0x26, 0xc1, 0xc9, 0x22, 0x46, 0x25, 0xec, 0x81, 0x3b, 0x4c, 0x82, 0x63, 0x46, 0x25,
0xb2, 0xcc, 0xe5, 0x7f, 0xc1, 0x63, 0x64, 0x9b, 0xa8, 0xa3, 0x38, 0x46, 0x65, 0xdc, 0x00, 0x48,
0xcf, 0x23, 0x96, 0xc4, 0xc8, 0x31, 0x81, 0xcf, 0xb9, 0x64, 0xa8, 0xa2, 0x44, 0x98, 0x8b, 0x66,
0xab, 0x86, 0x55, 0x6b, 0x81, 0x5c, 0x8c, 0x60, 0x43, 0x15, 0x63, 0x54, 0xc8, 0x4b, 0x55, 0xa5,
0x86, 0x9b, 0x80, 0xf2, 0x88, 0x7e, 0x54, 0xc7, 0x18, 0x1a, 0xc3, 0x24, 0xb8, 0x88, 0x04, 0xa3,
0x93, 0x19, 0xbd, 0xbc, 0x62, 0x08, 0xf0, 0x36, 0x6c, 0x9a, 0x44, 0x6a, 0x40, 0x8b, 0x04, 0x79,
0x26, 0xac, 0x3f, 0x63, 0x93, 0xb7, 0xcf, 0x16, 0x5c, 0x2c, 0x42, 0xb4, 0x81, 0x7f, 0x82, 0xed,
0x61, 0x12, 0x8c, 0x05, 0x8d, 0x92, 0x29, 0x13, 0xff, 0x31, 0xea, 0x33, 0x81, 0x36, 0xcd, 0xeb,
0xf1, 0x3c, 0x64, 0x7c, 0x21, 0xcf, 0xf8, 0x3b, 0xd4, 0xd8, 0x7f, 0x09, 0x8d, 0xe2, 0x48, 0xd4,
0xdb, 0x15, 0x72, 0xe4, 0xfb, 0x6a, 0x26, 0xa8, 0x84, 0x09, 0x34, 0x57, 0xf0, 0x88, 0x85, 0xfc,
0x9a, 0x69, 0xc6, 0x2a, 0x32, 0x17, 0xb1, 0x4f, 0x65, 0xca, 0xd8, 0xc7, 0xe4, 0xee, 0xa1, 0x55,
0xba, 0x7f, 0x68, 0x95, 0xee, 0x1e, 0x5b, 0xd6, 0xfd, 0x63, 0xcb, 0xfa, 0xf6, 0xd8, 0xb2, 0xbe,
0x7c, 0x6f, 0x95, 0x7e, 0x04, 0x00, 0x00, 0xff, 0xff, 0xb3, 0x85, 0x0b, 0xb6, 0xd4, 0x05, 0x00,
0x00,
}
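
The descriptor above starts with `0x1f, 0x8b, 0x08`, the gzip magic number followed by the deflate method byte, which matches the "gzipped FileDescriptorProto" comment. As a rough illustration of how such a byte slice can be packed and unpacked with the standard library, here is a minimal sketch. The helper names `gzipBytes` and `gunzipBytes` are made up for this example, and the payload is a placeholder string rather than a real serialized descriptor.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipBytes compresses raw into a gzip stream (hypothetical helper).
func gzipBytes(raw []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipBytes inflates a gzip stream back into the original bytes
// (hypothetical helper).
func gunzipBytes(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Placeholder payload; a real descriptor would be protobuf-encoded bytes.
	raw := []byte("stand-in for a serialized FileDescriptorProto")

	packed, err := gzipBytes(raw)
	if err != nil {
		panic(err)
	}
	// The first two bytes of any gzip stream are the magic number.
	fmt.Printf("magic: %#x %#x\n", packed[0], packed[1])

	unpacked, err := gunzipBytes(packed)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", string(unpacked) == string(raw))
}
```

Unpacking the real `fileDescriptorRaft` bytes would additionally require decoding the result as a protobuf message, which is outside the scope of this sketch.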


@@ -46,6 +46,8 @@ enum MessageType {
MsgUnreachable = 10;
MsgSnapStatus = 11;
MsgCheckQuorum = 12;
MsgTransferLeader = 13;
MsgTimeoutNow = 14;
}
message Message {


@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -168,10 +168,10 @@ func (rn *RawNode) ApplyConfChange(cc pb.ConfChange) *pb.ConfState {
// Step advances the state machine using the given message.
func (rn *RawNode) Step(m pb.Message) error {
// ignore unexpected local messages received over the network
if IsLocalMsg(m) {
if IsLocalMsg(m.Type) {
return ErrStepLocalMsg
}
if _, ok := rn.raft.prs[m.From]; ok || !IsResponseMsg(m) {
if _, ok := rn.raft.prs[m.From]; ok || !IsResponseMsg(m.Type) {
return rn.raft.Step(m)
}
return ErrStepPeerNotFound
@@ -226,3 +226,8 @@ func (rn *RawNode) ReportSnapshot(id uint64, status SnapshotStatus) {
_ = rn.raft.Step(pb.Message{Type: pb.MsgSnapStatus, From: id, Reject: rej})
}
// TransferLeader tries to transfer leadership to the given transferee.
func (rn *RawNode) TransferLeader(transferee uint64) {
_ = rn.raft.Step(pb.Message{Type: pb.MsgTransferLeader, From: transferee})
}
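
The hunks above suggest a split between two entry points: `Step` refuses local message types that arrive from outside, while `TransferLeader` builds a `MsgTransferLeader` itself and passes it directly to the inner state machine. The following is a minimal, self-contained sketch of that pattern. Every type here (`MessageType`, `Message`, `core`, `rawNode`) and the error value are hypothetical stand-ins, not the etcd types, and the guard checks only the one message type needed for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// MessageType and Message are hypothetical stand-ins for the raftpb types.
type MessageType int

const (
	MsgApp MessageType = iota
	MsgTransferLeader
)

type Message struct {
	Type MessageType
	From uint64
}

// errStepLocalMsg is an illustrative sentinel, not the library's value.
var errStepLocalMsg = errors.New("cannot step a local message")

// core plays the role of the inner state machine and records what it steps.
type core struct{ stepped []Message }

func (c *core) step(m Message) error {
	c.stepped = append(c.stepped, m)
	return nil
}

type rawNode struct{ raft *core }

// Step is the entry point for messages received from peers.
// Local-only message types are refused here.
func (rn *rawNode) Step(m Message) error {
	if m.Type == MsgTransferLeader {
		return errStepLocalMsg
	}
	return rn.raft.step(m)
}

// TransferLeader constructs the local message itself and hands it straight
// to the inner state machine, so the guard in Step does not apply.
func (rn *rawNode) TransferLeader(transferee uint64) {
	_ = rn.raft.step(Message{Type: MsgTransferLeader, From: transferee})
}

func main() {
	rn := &rawNode{raft: &core{}}

	err := rn.Step(Message{Type: MsgTransferLeader, From: 2})
	fmt.Println("from network:", err)

	rn.TransferLeader(2)
	fmt.Println("messages stepped locally:", len(rn.raft.stepped))
}
```

In this sketch the `From` field carries the transferee ID, mirroring how the diff populates the message.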


@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.


@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -29,8 +29,14 @@ var ErrCompacted = errors.New("requested index is unavailable due to compaction"
// index is older than the existing snapshot.
var ErrSnapOutOfDate = errors.New("requested index is older than the existing snapshot")
// ErrUnavailable is returned by Storage interface when the requested log entries
// are unavailable.
var ErrUnavailable = errors.New("requested entry at index is unavailable")
// ErrSnapshotTemporarilyUnavailable is returned by the Storage interface when the required
// snapshot is temporarily unavailable.
var ErrSnapshotTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")
// Storage is an interface that may be implemented by the application
// to retrieve log entries from storage.
//
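
The sentinel errors declared in the hunk above let a caller tell a transient condition apart from a real failure. The sketch below shows one way a caller might react to a "temporarily unavailable" snapshot by deferring the work instead of failing. The interface `snapshotSource`, the type `flakySource`, the function `trySnapshot`, and the lower-case error variable are all assumptions made for this example; the diff does not show how the library itself handles the error. The vendored code predates `errors.Is`, so it would compare with `==` instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errSnapshotTemporarilyUnavailable mirrors the message of the sentinel in
// the diff; the variable itself is a local stand-in.
var errSnapshotTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")

// snapshotSource is a hypothetical, reduced slice of a storage interface.
type snapshotSource interface {
	Snapshot() (string, error)
}

// flakySource fails a fixed number of times before producing a snapshot.
type flakySource struct{ failures int }

func (f *flakySource) Snapshot() (string, error) {
	if f.failures > 0 {
		f.failures--
		return "", errSnapshotTemporarilyUnavailable
	}
	return "snapshot-data", nil
}

// trySnapshot returns ready=false with a nil error when the snapshot is only
// temporarily unavailable, so the caller can retry later. Any other error is
// passed through unchanged.
func trySnapshot(s snapshotSource) (data string, ready bool, err error) {
	data, err = s.Snapshot()
	switch {
	case err == nil:
		return data, true, nil
	case errors.Is(err, errSnapshotTemporarilyUnavailable):
		return "", false, nil
	default:
		return "", false, err
	}
}

func main() {
	src := &flakySource{failures: 1}
	for attempt := 1; attempt <= 2; attempt++ {
		data, ready, err := trySnapshot(src)
		fmt.Println("attempt", attempt, "ready:", ready, "data:", data, "err:", err)
	}
}
```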
@@ -220,12 +226,14 @@ func (ms *MemoryStorage) Compact(compactIndex uint64) error {
// TODO (xiangli): ensure the entries are continuous and
// entries[0].Index > ms.entries[0].Index
func (ms *MemoryStorage) Append(entries []pb.Entry) error {
ms.Lock()
defer ms.Unlock()
if len(entries) == 0 {
return nil
}
first := ms.ents[0].Index + 1
ms.Lock()
defer ms.Unlock()
first := ms.firstIndex()
last := entries[0].Index + uint64(len(entries)) - 1
// shortcut if there is no new entry.
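
The hunk above moves the empty-input check ahead of the lock and replaces the inline `ms.ents[0].Index + 1` with a call to `ms.firstIndex()`. The rest of the function is cut off in this view, so the sketch below fills it in under stated assumptions: `entry` is a hypothetical stand-in for `pb.Entry`, `ents[0]` is treated as a dummy entry marking the compacted prefix, and a gap in the log is reported as an error. It illustrates the overall shape only and should not be taken as the vendored implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a hypothetical stand-in for pb.Entry.
type entry struct {
	Index uint64
	Data  string
}

// memoryStorage keeps a dummy entry at ents[0]; real entries start at ents[1].
type memoryStorage struct {
	sync.Mutex
	ents []entry
}

func newMemoryStorage() *memoryStorage {
	return &memoryStorage{ents: make([]entry, 1)}
}

// firstIndex is assumed to return the index right after the dummy entry.
func (ms *memoryStorage) firstIndex() uint64 {
	return ms.ents[0].Index + 1
}

// Append adds entries, overwriting any stored suffix that overlaps them.
func (ms *memoryStorage) Append(entries []entry) error {
	// Return before taking the lock when there is nothing to do.
	if len(entries) == 0 {
		return nil
	}
	ms.Lock()
	defer ms.Unlock()

	first := ms.firstIndex()
	last := entries[0].Index + uint64(len(entries)) - 1
	// Shortcut: everything offered has already been compacted away.
	if last < first {
		return nil
	}
	// Drop the part of the input that precedes the first stored index.
	if first > entries[0].Index {
		entries = entries[first-entries[0].Index:]
	}

	offset := entries[0].Index - ms.ents[0].Index
	switch {
	case uint64(len(ms.ents)) > offset:
		// Overlap: keep the stored prefix and replace the rest.
		ms.ents = append([]entry{}, ms.ents[:offset]...)
		ms.ents = append(ms.ents, entries...)
	case uint64(len(ms.ents)) == offset:
		// Contiguous: plain append.
		ms.ents = append(ms.ents, entries...)
	default:
		return fmt.Errorf("gap between stored index %d and appended index %d",
			ms.ents[len(ms.ents)-1].Index, entries[0].Index)
	}
	return nil
}

func main() {
	ms := newMemoryStorage()
	_ = ms.Append([]entry{{Index: 1}, {Index: 2}, {Index: 3}})
	_ = ms.Append([]entry{{Index: 3, Data: "x"}, {Index: 4}})
	fmt.Println("stored entries (including dummy):", len(ms.ents))
	fmt.Println("gap error:", ms.Append([]entry{{Index: 7}}))
}
```

Checking the length first means an empty call never contends for the mutex, which appears to be the motivation for the reordering shown in the diff.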


@@ -1,4 +1,4 @@
// Copyright 2015 CoreOS, Inc.
// Copyright 2015 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -46,12 +46,13 @@ func max(a, b uint64) uint64 {
return b
}
func IsLocalMsg(m pb.Message) bool {
return m.Type == pb.MsgHup || m.Type == pb.MsgBeat || m.Type == pb.MsgUnreachable || m.Type == pb.MsgSnapStatus || m.Type == pb.MsgCheckQuorum
func IsLocalMsg(msgt pb.MessageType) bool {
return msgt == pb.MsgHup || msgt == pb.MsgBeat || msgt == pb.MsgUnreachable ||
msgt == pb.MsgSnapStatus || msgt == pb.MsgCheckQuorum || msgt == pb.MsgTransferLeader
}
func IsResponseMsg(m pb.Message) bool {
return m.Type == pb.MsgAppResp || m.Type == pb.MsgVoteResp || m.Type == pb.MsgHeartbeatResp || m.Type == pb.MsgUnreachable
func IsResponseMsg(msgt pb.MessageType) bool {
return msgt == pb.MsgAppResp || msgt == pb.MsgVoteResp || msgt == pb.MsgHeartbeatResp || msgt == pb.MsgUnreachable
}
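
With the predicates now taking a `pb.MessageType` instead of a whole `pb.Message`, a caller can classify a type without building a message first. The sketch below restates the two predicates from the hunk above as `switch` statements over a locally defined enum. The `MessageType` declared here is a hypothetical stand-in whose numeric values are illustrative only and do not match the protobuf definition.

```go
package main

import "fmt"

// MessageType is a local stand-in; values are illustrative, not the
// protobuf enum numbers.
type MessageType int32

const (
	MsgHup MessageType = iota
	MsgBeat
	MsgAppResp
	MsgVoteResp
	MsgHeartbeatResp
	MsgUnreachable
	MsgSnapStatus
	MsgCheckQuorum
	MsgTransferLeader
	MsgTimeoutNow
)

// isLocalMsg reports whether a type is generated by the node itself
// and is therefore not expected to arrive from a peer.
func isLocalMsg(t MessageType) bool {
	switch t {
	case MsgHup, MsgBeat, MsgUnreachable, MsgSnapStatus,
		MsgCheckQuorum, MsgTransferLeader:
		return true
	}
	return false
}

// isResponseMsg reports whether a type is a reply to an earlier request.
func isResponseMsg(t MessageType) bool {
	switch t {
	case MsgAppResp, MsgVoteResp, MsgHeartbeatResp, MsgUnreachable:
		return true
	}
	return false
}

func main() {
	for _, t := range []MessageType{MsgTransferLeader, MsgTimeoutNow, MsgUnreachable} {
		fmt.Println(t, "local:", isLocalMsg(t), "response:", isResponseMsg(t))
	}
}
```

Note that, per the diff, `MsgTransferLeader` is added to the local set while `MsgTimeoutNow` is not, and `MsgUnreachable` appears in both predicates.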
// EntryFormatter can be implemented by the application to provide human-readable formatting