Releases: databendlabs/openraft

Fix internal dependency. Nothing changed.

28 Feb 00:52
v0.7.5

Doc: update change log for 0.7.5

Improve membership management

25 Feb 13:34

Changed:

  • Changed: 1bd22edc remove AddLearnerError::Exists, which is not actually used; by 张炎泼; 2022-09-30

  • Changed: c6fe29d4 change-membership does not return error when replication lags; by 张炎泼; 2022-10-22

    If blocking is true, Raft::change_membership(..., blocking) will
    block until replication to the new nodes becomes up to date.
    But it won't return an error when proposing the change-membership log.

    • Change: remove two errors: LearnerIsLagging and LearnerNotFound.

    • Fix: #581

Fixed:

  • Fixed: 2896b98e changing membership should not remove replication to all learners; by 张炎泼; 2022-09-30

    When changing membership, replication to learners (non-voters) that
    are not promoted to voters should be kept.

    E.g.: with a cluster of voters {0} and learners {1, 2, 3}, changing
    membership to {0, 1, 2} should not remove replication to node 3.

    Only replications to removed members should be removed.
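The rule above can be sketched with plain sets. This is only an illustration, not openraft's implementation: `replication_targets_after_change` is a hypothetical helper, node ids are assumed to be `u64`, and the returned set includes the leader itself for simplicity.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not an openraft API): given the old voters, the
/// current learners and the new voters, return the nodes that should still
/// be replicated to after the membership change.
fn replication_targets_after_change(
    old_voters: &BTreeSet<u64>,
    learners: &BTreeSet<u64>,
    new_voters: &BTreeSet<u64>,
) -> BTreeSet<u64> {
    // Only voters that are dropped from the membership lose their replication.
    let removed: BTreeSet<u64> = old_voters.difference(new_voters).copied().collect();

    // Everything else (kept voters, promoted learners, untouched learners) stays.
    old_voters
        .iter()
        .chain(learners.iter())
        .chain(new_voters.iter())
        .copied()
        .filter(|id| !removed.contains(id))
        .collect()
}
```

With voters {0} and learners {1, 2, 3}, changing membership to {0, 1, 2} leaves node 3 in the result, which matches the example in the entry above.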

Added:

  • Added: 9a22bb03 add rocks-store as a RaftStorage implementation based on rocks-db; by 张炎泼; 2023-02-22

0.7.3

23 Sep 06:40

Changed:

  • Changed: 25e94c36 InstallSnapshotResponse: replies the last applied log id; Do not install a smaller snapshot; by 张炎泼; 2022-09-22

    A follower may skip installing a snapshot if it already has a higher
    last_applied log id locally.
    In that case, it ignores the snapshot and responds with its local
    last_applied log id.

    This way the applied state (i.e., last_applied) never reverts.
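A minimal sketch of this follower-side rule, under simplifying assumptions: `LogId` is reduced to a (term, index) pair, and `Follower` and `handle_install_snapshot` are hypothetical names, not openraft types.

```rust
/// Simplified log id. The derived ordering compares `term` first, then `index`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct LogId {
    term: u64,
    index: u64,
}

/// Hypothetical follower state, reduced to the one field that matters here.
struct Follower {
    last_applied: Option<LogId>,
}

impl Follower {
    /// Install the snapshot only if it does not move `last_applied` backward.
    /// Returns the `last_applied` value the follower would report to the leader.
    fn handle_install_snapshot(&mut self, snapshot_last_log_id: LogId) -> Option<LogId> {
        if self.last_applied > Some(snapshot_last_log_id) {
            // The local state is already ahead: ignore the snapshot.
            return self.last_applied;
        }
        // (Actual snapshot installation is elided in this sketch.)
        self.last_applied = Some(snapshot_last_log_id);
        self.last_applied
    }
}
```

Because `None` compares smaller than any `Some(..)`, a follower with no applied log always accepts the snapshot.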

Fixed:

  • Fixed: 21684bbd potential inconsistency when installing snapshot; by 张炎泼; 2022-09-22

    The conflicting logs that are before snapshot_meta.last_log_id should
    be deleted before installing a snapshot.

    Otherwise, if a node crashes, there is a chance that the snapshot is
    installed but conflicting logs are left in the store.

0.7.1:

22 Aug 16:33

v0.7.1

Added:

  • Added: ea696474 add feature-flag: bt enables backtrace; by 张炎泼; 2022-03-12

    --features bt enables backtrace when generating errors.
    By default, errors do not contain backtrace info.

    Thus openraft can be built on stable rust by default.

    To use on stable rust with backtrace, set RUSTC_BOOTSTRAP=1, e.g.:

    RUSTUP_TOOLCHAIN=stable RUSTC_BOOTSTRAP=1 make test
    

v0.7.0-alpha.3

Changed:

  • Changed: f99ade30 API: move default impl methods in RaftStorage to StorageHelper; by 张炎泼; 2022-07-04

Fixed:

  • Fixed: 44381b0c when handling append-entries, if prev_log_id is purged, it should not delete any logs.; by 张炎泼; 2022-08-14

    When handling append-entries, if the local log at prev_log_id.index is
    purged, a follower should not treat it as a conflict and should
    not delete all logs. Otherwise committed logs would be lost.

    To fix this issue, use last_applied instead of committed:
    last_applied is always a committed log id, while committed is not
    persisted and may be smaller than the actually applied log id when a
    follower is restarted.
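The check described above might look roughly like the following. This is a sketch under assumptions, not the actual openraft code: `check_prev_log` and `PrevLogCheck` are hypothetical names, and the request is assumed to come from a legitimate leader, whose log contains every committed entry.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LogId {
    term: u64,
    index: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum PrevLogCheck {
    Match,
    Conflict,
}

/// Hypothetical check run by a follower when handling append-entries.
/// `local_at_prev_index` is the local log id stored at `prev_log_id.index`,
/// or `None` if that entry is purged or missing.
fn check_prev_log(
    local_at_prev_index: Option<LogId>,
    prev_log_id: LogId,
    last_applied: Option<LogId>,
) -> PrevLogCheck {
    // Applied logs are committed, so everything at or before last_applied
    // is assumed to agree with the leader, even if it is purged locally.
    if let Some(applied) = last_applied {
        if prev_log_id.index <= applied.index {
            return PrevLogCheck::Match;
        }
    }
    match local_at_prev_index {
        Some(local) if local == prev_log_id => PrevLogCheck::Match,
        _ => PrevLogCheck::Conflict,
    }
}
```

Only a `Conflict` result would lead to deleting local logs, so a purged entry covered by last_applied no longer triggers deletion.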

v0.7.0-alpha.2

Fixed:

  • Fixed: 30058c03 #424 wrong range when searching for membership entries: [end-step, end).; by 张炎泼; 2022-07-03

    The iterating range searching for membership log entries should be
    [end-step, end), not [start, end).
    With this bug it will return duplicated membership entries.
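To illustrate why the range matters, here is a small sketch of a backward, chunked scan. The function name and the `&[bool]` log representation are assumptions made for the example; they are not openraft's API.

```rust
/// Scan a log backward in chunks of `step` entries and collect the indexes of
/// up to `n` most recent membership entries. `is_membership[i]` tells whether
/// the entry at index `i` is a membership entry.
fn last_membership_indexes(is_membership: &[bool], step: usize, n: usize) -> Vec<usize> {
    let mut found = Vec::new();
    let mut end = is_membership.len();

    while end > 0 && found.len() < n {
        let chunk_start = end.saturating_sub(step);
        // Scan only [end-step, end). Scanning [0, end) in every round would
        // visit the same entries again and report them more than once.
        for i in (chunk_start..end).rev() {
            if is_membership[i] {
                found.push(i);
                if found.len() == n {
                    break;
                }
            }
        }
        end = chunk_start;
    }
    found
}
```

Each entry is visited at most once, so an index can never appear twice in the result.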

v0.7.0-alpha.1

Fixed:

  • Fixed: d836d85c if there may be more logs to replicate, continue to call send_append_entries in next loop, no need to wait heartbeat tick; by lichuang; 2022-01-04

  • Fixed: 5a026674 defensive_no_dirty_log hangs tests; by YangKian; 2022-01-08

  • Fixed: 8651625e save leader_id if a higher term is seen when handling append-entries RPC; by 张炎泼; 2022-01-10

    Problem:

    A follower saves hard state (term=msg.term, voted_for=None)
    when it sees msg.term > local.term while handling an append-entries RPC.

    This is correct but not ideal. It is correct because:

    • In one term, only an established leader will send append-entries;

    • Thus, there is a quorum voted for this leader;

    • Thus, no matter what voted_for is saved, it is still correct. E.g.
      when handling append-entries, a follower node could save hard state
      (term=msg.term, voted_for=Some(ANY_VALUE)).

    The problem is that a follower already knows the legal leader for a term
    but still does not save it. This leads to an unstable cluster state: The
    test sometimes fails.

    Solution:

    A follower always saves hard state with the id of a known legal leader.

  • Fixed: 1a781e1b when lack entry, the snapshot to build has to include at least all purged logs; by 张炎泼; 2022-01-18

  • Fixed: a0a94af7 span.enter() in async loop causes memory leak; by 张炎泼; 2022-06-17

    It is explained in:
    https://onesignal.com/blog/solving-memory-leaks-in-rust/

Changed:

  • Changed: c9c8d898 trait RaftStore: remove get_membership_config(), add last_membership_in_log() and get_membership() with default impl; by drdr xp; 2022-01-04

    Goal: minimize the work for users to implement a correct raft application.

    Now RaftStorage provides default implementations for get_membership()
    and last_membership_in_log().

    These two methods can be implemented purely in terms of the other basic
    methods that the user implements.

  • Changed: abda0d10 rename RaftStorage methods do_log_compaction: build_snapshot, delete_logs_from: delete_log; by 张炎泼; 2022-01-15

  • Changed: a52a9300 RaftStorage::get_log_state() returns last purge log id; by 张炎泼; 2022-01-16

    • Change: get_log_state() returns the last_purged_log_id instead of the first_log_id.
      This is because there are cases in which the log is empty:
      when a snapshot that covers all logs is installed,
      or when max_applied_log_to_keep is 0.

      Returning None does not make clear whether there are no logs at all or
      all logs are deleted.

      In such cases, raft still needs to maintain log continuity
      when replicating. Thus the last log id that once existed is important.
      Previously this was done by checking the last_applied_log_id, which is
      dirty and buggy.

      Now an implementation of RaftStorage has to maintain the
      last_purged_log_id in its store.

    • Change: Remove first_id_in_log(), last_log_id(), first_known_log_id(),
      because concepts are changed.

    • Change: Split delete_logs() into two methods for clarity:

      delete_conflict_logs_since() for deleting conflicting logs when the
      receiving end of replication finds a conflicting log.

      purge_logs_upto() for cleaning applied logs.

    • Change: Rename finalize_snapshot_installation() to install_snapshot().
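The idea behind tracking last_purged_log_id can be sketched with a toy in-memory log. This is not the RaftStorage trait: `MemLog` is hypothetical, the method signatures are simplified, and both bounds are assumed to be inclusive.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LogId {
    term: u64,
    index: u64,
}

/// Toy in-memory log store, keyed by log index.
#[derive(Default)]
struct MemLog {
    logs: BTreeMap<u64, LogId>,
    last_purged_log_id: Option<LogId>,
}

impl MemLog {
    fn append(&mut self, id: LogId) {
        self.logs.insert(id.index, id);
    }

    /// Remove applied logs up to `upto` (inclusive) and remember the bound.
    fn purge_logs_upto(&mut self, upto: LogId) {
        self.logs = self.logs.split_off(&(upto.index + 1));
        self.last_purged_log_id = Some(upto);
    }

    /// Remove conflicting logs starting at `since` (inclusive).
    fn delete_conflict_logs_since(&mut self, since: LogId) {
        let _ = self.logs.split_off(&since.index);
    }

    /// Returns (last_purged_log_id, last_log_id). When the log is empty, the
    /// last log id falls back to the last purged one, so continuity is kept.
    fn get_log_state(&self) -> (Option<LogId>, Option<LogId>) {
        let last = self.logs.values().next_back().copied().or(self.last_purged_log_id);
        (self.last_purged_log_id, last)
    }
}
```

After purging every entry, `get_log_state()` still reports the last log id that once existed, which is the distinction that returning `None` could not express.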

  • Changed: 7424c968 remove unused error MembershipError::Incompatible; by 张炎泼; 2022-01-17

  • Changed: beeae721 add ChangeMembershipError sub error for reuse; by 张炎泼; 2022-01-17

Fix: span.enter() in async loop causes memory leak

17 Jun 14:16

Fixed:

  • Fixed: 4cd2a12b span.enter() in async loop causes memory leak; by 张炎泼; 2022-06-17

v0.6.4

03 Jan 13:15

v0.6.4

v0.6.3

v0.6.2

Fixed:

  • Fixed: 4d58a51e a non-voter not in joint config should not block replication; by drdr xp; 2021-08-31

  • Fixed: eed681d5 race condition of concurrent snapshot-install and apply.; by drdr xp; 2021-09-01

    Problem:

    Concurrent snapshot-install and apply mess up last_applied.

    finalize_snapshot_installation runs in the RaftCore thread.
    apply_to_state_machine runs in a separate tokio task(thread).

    Thus there is a chance that last_applied is reset to a previous value:

    • apply_to_state_machine is called and finished in a thread.

    • finalize_snapshot_installation is called in RaftCore thread and
      finished with last_applied updated.

    • RaftCore thread finished waiting for apply_to_state_machine, and
      updated last_applied to a previous value.

    RaftCore: -.    install-snapshot,         .-> replicate_to_sm_handle.next(),
               |    update last_applied=5     |   update last_applied=2
               |                              |
               v                              |
    task:      apply 2------------------------'
    --------------------------------------------------------------------> time
    

    Solution:

    Rule: All changes to state machine must be serialized.

    A temporary simple solution for now is to call all methods that modify the
    state machine in the RaftCore thread.
    But this way it blocks the RaftCore thread.

    A better way is to move all tasks that modify the state machine to a
    standalone thread, and send update requests back to RaftCore to update
    its fields such as last_applied.
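The serialization rule can be illustrated with a single worker thread fed through a channel. This sketch uses std threads and channels rather than tokio, and `SmCommand` and `run_state_machine_worker` are hypothetical names; it shows the idea, not openraft's design.

```rust
use std::sync::mpsc;
use std::thread;

/// Every change to the state machine is expressed as a command.
enum SmCommand {
    Apply { index: u64 },
    InstallSnapshot { last_index: u64 },
}

/// Run all commands on one dedicated thread and return the `last_applied`
/// value reported back after each command.
fn run_state_machine_worker(commands: Vec<SmCommand>) -> Vec<u64> {
    let (cmd_tx, cmd_rx) = mpsc::channel::<SmCommand>();
    let (applied_tx, applied_rx) = mpsc::channel::<u64>();

    let worker = thread::spawn(move || {
        let mut last_applied = 0u64;
        // Commands are handled strictly one at a time, in arrival order.
        for cmd in cmd_rx {
            let index = match cmd {
                SmCommand::Apply { index } => index,
                SmCommand::InstallSnapshot { last_index } => last_index,
            };
            // A stale command must never move last_applied backward.
            if index > last_applied {
                last_applied = index;
            }
            applied_tx.send(last_applied).unwrap();
        }
    });

    for cmd in commands {
        cmd_tx.send(cmd).unwrap();
    }
    drop(cmd_tx); // Closing the channel lets the worker loop finish.
    worker.join().unwrap();
    applied_rx.iter().collect()
}
```

Because only the worker touches `last_applied`, a late `Apply { index: 2 }` arriving after a snapshot at index 5 cannot reset it to 2.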

  • Fixed: a48a3282 handle-vote should compare last_log_id in dictionary order, not in vector order; by drdr xp; 2021-09-09

    A log {term:2, index:1} is definitely greater than log {term:1, index:2} in raft spec.
    Comparing log id in the way of term1 >= term2 && index1 >= index2 blocks election:
    no one can become a leader.
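The difference between the two orders is easy to see in code. The following is a simplified sketch: `LogId` is reduced to two integer fields and both function names are made up for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LogId {
    term: u64,
    index: u64,
}

/// The buggy "vector order" check: both fields must be greater or equal.
fn ge_vector_order(a: LogId, b: LogId) -> bool {
    a.term >= b.term && a.index >= b.index
}

/// Dictionary order: compare the term first, and the index only on a tie.
fn ge_dictionary_order(a: LogId, b: LogId) -> bool {
    (a.term, a.index) >= (b.term, b.index)
}
```

For {term:2, index:1} and {term:1, index:2}, the vector-order check says neither is greater than or equal to the other, so two candidates holding these logs would reject each other's votes. Dictionary order always ranks the first one higher.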

  • Fixed: 228077a6 a restarted follower should not wait too long to elect. Otherwise the entire cluster hangs; by drdr xp; 2021-11-19

  • Fixed: 6c0ccaf3 consider joint config when starting up and committing.; by drdr xp; 2021-12-24

    • Change: MembershipConfig supports more than 2 configs.

    • Make fields in MembershipConfig private.
      Provide methods to manipulate membership.

    • Fix: commit without replication only when membership contains only one
      node. Previously it just checks the first config, which results in
      data loss if the cluster is in a joint config.

    • Fix: when starting up, count all nodes, not only the nodes in the
      first config, to decide if it is a single node cluster.
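A sketch of the corrected single-node check, assuming a joint membership is represented as a slice of node-id sets. `is_single_node_cluster` is a hypothetical helper, not the MembershipConfig API.

```rust
use std::collections::BTreeSet;

/// Returns true only if all configs of a (possibly joint) membership
/// together contain exactly one distinct node.
fn is_single_node_cluster(configs: &[BTreeSet<u64>]) -> bool {
    let all_nodes: BTreeSet<u64> = configs.iter().flatten().copied().collect();
    all_nodes.len() == 1
}
```

Looking only at the first config would report a joint membership of {1} and {1, 2, 3} as a single-node cluster, and committing without replication in that state is what the entry above describes as data loss.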

  • Fixed: b390356f first_known_log_id() should returns the min one in log or in state machine; by drdr xp; 2021-12-28

  • Fixed: cd5a570d clippy warning; by lichuang; 2022-01-02

Changed:

  • Changed: deda6d76 remove PurgedMarker. keep logs clean; by drdr xp; 2021-09-09

    Changing the log (adding a PurgedMarker, originally SnapshotPointer) makes it
    difficult to impl install-snapshot for a RaftStore without a lock
    protecting both logs and state machine.

    Adding a PurgedMarker and installing the snapshot have to be atomic in
    the storage layer. But usually logs and state machine are stored separately,
    e.g., logs are stored on a fast flash disk and the state machine is stored
    somewhere else.

    To get rid of the big lock, PurgedMarker is removed and installing a
    snapshot does not need to keep consistent with logs any more.

  • Changed: 734eec69 VoteRequest: use last_log_id:LogId to replace last_log_term and last_log_index; by drdr xp; 2021-09-09

  • Changed: 74b16524 introduce StorageError. RaftStorage gets rid of anyhow::Error; by drdr xp; 2021-09-13

    StorageError is an enum of DefensiveError and StorageIOError.
    An error a RaftStorage impl returns could be a defensive check error
    or an actual io operation error.

    Why:

    anyhow::Error is not enough to support the flow control in RaftCore.
    It is untyped, so RaftCore cannot decide what to do next
    based on the returned error.

    Inside raft, anyhow::Error should never be used, although it could be used as
    source() of some other error types.
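A rough sketch of how a typed error enables this kind of flow control. The type names follow the entry above, but the fields, the `Reaction` enum and the `react` function are assumptions for illustration and do not reflect openraft's real definitions or its actual handling.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
struct DefensiveError {
    reason: String,
}

#[derive(Debug)]
struct StorageIOError {
    source: io::Error,
}

/// A storage error is either a failed defensive check or a real io failure.
#[derive(Debug)]
enum StorageError {
    Defensive(DefensiveError),
    IO(StorageIOError),
}

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StorageError::Defensive(e) => write!(f, "defensive check failed: {}", e.reason),
            StorageError::IO(e) => write!(f, "storage io error: {}", e.source),
        }
    }
}

impl std::error::Error for StorageError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            StorageError::Defensive(_) => None,
            StorageError::IO(e) => Some(&e.source),
        }
    }
}

/// Hypothetical reactions a caller might choose between.
#[derive(Debug, PartialEq, Eq)]
enum Reaction {
    ReportBug,
    ShutDown,
}

/// With a typed error the caller can branch on the variant.
fn react(err: &StorageError) -> Reaction {
    match err {
        StorageError::Defensive(_) => Reaction::ReportBug,
        StorageError::IO(_) => Reaction::ShutDown,
    }
}
```

With an opaque error type, the `match` in `react` would not be possible without downcasting.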

  • Changed: 46bb3b1c RaftStorage::finalize_snapshot_installation is no more responsible to delete logs included in snapshot; by drdr xp; 2021-09-13

    A RaftStorage should be as simple and intuitive as possible.

    One should be able to correctly impl a RaftStorage without reading the
    guide but just by guessing what a trait method should do.

    RaftCore is able to do the job of deleting logs that are included in
    the state machine, RaftStorage should just do what is asked.

  • Changed: 2cd23a37 use structopt to impl config default values; by drdr xp; 2021-09-14

  • Changed: ac4bf4bd InitialState: rename last_applied_log to last_applied; by drdr xp; 2021-09-14

  • Changed: 74283fda RaftStorage::do_log_compaction() do not need to delete logs any more raft-core will delete them.; by drdr xp; 2021-09-14

  • Changed: 112252b5 RaftStorage add 2 API: last_id_in_log() and last_applied_state(), remove get_last_log_id(); by drdr xp; 2021-09-15

  • Changed: 7f347934 simplify membership change; by drdr xp; 2021-09-16

    • Change: if leadership is lost, the cluster is left with the joint
      config.
      A caller that does not receive a response to the change-membership
      request should always re-send it to ensure the membership config is
      applied.

    • Change: remove joint-uniform logic from RaftCore, which brings a lot
      of complexity to the raft impl. This logic is now done in Raft (which
      is a shell to control RaftCore).

    • Change: RaftCore.membership is changed to ActiveMembership, which
      includes a log id and a membership config.
      This change lets raft check if a membership is
      committed by comparing its log index with the committed index.

    • Change: when adding an existing non-voter, it returns an Ok value
      instead of an Err.

    • Change: add arg blocking to add_non_voter and change_membership.
      A blocking change_membership still waits for the two config change
      logs to commit.
      blocking only indicates whether to wait for replication to the
      non-voter to be up to date.

    • Change: remove non_voters. Merge it into nodes.
      Now both voters and non-voters share the same replication handle.

    • Change: remove field ReplicationState.is_ready_to_join, it
      can be just calculated when needed.

    • Change: remove is_stepping_down, membership.contains() is quite
      enough.

    • Change: remove consensus_state.

  • Changed: df684131 bsearch to find matching log between leader and follower; by drdr xp; 2021-12-17

    • Refactor: simplify algo to find matching log between leader and follower.
      It adopts a binary-search like algo:

      The leader tracks the max matched log id(self.matched) and the least unmatched log id(self.max_possible_matched_index).

      The follower just responds whether the prev_log_id in
      AppendEntriesRequest matches the log at prev_log_id.index in its
      store.

      Remove the case-by-case algo.

    • Change: RaftStorage adds new APIs: try_get_log_entries(),
      first_id_in_log() and first_known_log_id().

      These are not stable and may be removed soon.

    • Fix: the timeout for Wait() should be a total timeout. Otherwise a
      Wait() never quits.

    • Fix: when sending an append-entries request, if a log is not found, it
      should retry loading, but not enter snapshot state.
      Because a log may be deleted by RaftCore just after Replication reads
      prev_log_id from the store.

    • Refactor: The two replication loops, the line-rate loop and the
      snapshot loop, should not change the ReplicationState, but instead
      return an error.
      Otherwise the state has to be checked everywhere.

    • Refactor: simplify receiving RaftCore messages: split
      drain_raft_rx() into process_raft_event() and
      try_drain_raft_rx().

    • Featur...
