In Search of an Understandable Consensus Algorithm - Study Questions
-
Why does raft use randomized timout values?
-
“GFS and HFDS use a seprate replicated stat machine to manage leader election and store configuration”, what does the author mean? Give concrete examples.
-
What is the max # of servers that can fail while raft is still avaiable? what is the min # of servers that can fail while raft becomes unavailable?
-
Descript the seqence of actions when a CANDIDATE handle an incoming AppendEntries call?
-
How can a candidate discover current leader or a new term? Why it becomes a follower upon such discovery?
-
During election, it is possible that no one is elected, how can that happen? How does raft deal with that?
- consider the cases of leader/follower crashes, where can these scenarios happen?
a. follower may have missing entries b. follower may have extra uncommited entries c. follower may have both missing entries and extra entries
-
When and how will raft fix the inconsitences in Figure 7)?
-
Why we need to ensure that any elected leader for an given has ALL of commited entries from the previous term? Why this property is satisfied in raft?
- Describe the scenario in Figure 8
a.replicated on majority of servers but not commited entry b.such entry in a) got overwritten
-
What if broadCastTime « electionTimeout is violated? what if electionTimeout !« mean time between failures (MTBF)?
-
Describe the disjoint majorities problem, how they can be triggered during membership changes, when each server just switch directly
-
What happened when cluster leader itself is not included in the new membership?
-
Give a scenario where removed cluster members can trick the new leaders into a follower, if without the remedy done by the raft.
- What kind of metadata should snapshot include?