Module error_handling


§Error Handling

d-engine returns errors through two channels depending on your integration mode:

  • Standalone (gRPC): ErrorCode in response
  • Embedded (Rust): Result<T, E> from API

Both modes use the same error categories defined in proto/error.proto.


§Error Categories

§Network Errors (1000-1999)

Connection problems. Retry with backoff.

| Code | When | Action |
|------|------|--------|
| CONNECTION_TIMEOUT | Network slow/unreachable | Retry after 3-5s |
| INVALID_ADDRESS | Malformed URL | Fix address |
| LEADER_CHANGED | Leader re-elected | Retry with new leader (see metadata) |

§Business Errors (4000-4999)

Cluster state or request issues. Handle based on error.

| Code | When | Action |
|------|------|--------|
| NOT_LEADER | Write sent to follower | Redirect to leader |
| CLUSTER_UNAVAILABLE | < 2/3 nodes available | Wait and retry |
| INVALID_REQUEST | Bad request format | Fix request |
| STALE_OPERATION | Based on old cluster state | Refresh state and retry |
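
Because the two categories occupy fixed numeric ranges, a client can classify a raw code without knowing every individual value. The following is a minimal sketch under that assumption; `Category` and `category` are illustrative names, not part of the d-engine API, and codes outside the two documented ranges are lumped into `Other`.

```rust
/// Error categories, keyed by the numeric ranges documented above.
/// This enum is a hypothetical helper, not a d-engine type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    /// 1000-1999: connection problems, retry with backoff.
    Network,
    /// 4000-4999: cluster state or request issues.
    Business,
    /// Anything outside the two documented ranges.
    Other,
}

/// Classify a raw numeric error code by the range it falls in.
pub fn category(code: u32) -> Category {
    match code {
        1000..=1999 => Category::Network,
        4000..=4999 => Category::Business,
        _ => Category::Other,
    }
}
```

For example, `category(1500)` yields `Category::Network`, while a code such as `2500` falls into `Category::Other`.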

§Handling in Standalone Mode

Use Leader Hint to redirect to the leader:

currentAddr := "127.0.0.1:9081" // Start with any node

for i := 0; i < maxRedirects; i++ {
    conn, err := grpc.NewClient(currentAddr, ...)
    if err != nil {
        return err // Invalid target or dial options
    }
    client := pb.NewRaftClientServiceClient(conn)

    resp, err := client.HandleClientWrite(ctx, req)
    conn.Close() // The unary call has returned; release the connection
    if err != nil {
        return err // Network error
    }

    if resp.Error == error_pb.ErrorCode_SUCCESS {
        break // Success
    }

    // Got NOT_LEADER: follow the leader hint
    if resp.Error == error_pb.ErrorCode_NOT_LEADER && resp.Metadata != nil {
        leaderAddr := resp.Metadata.LeaderAddress
        if leaderAddr != nil && *leaderAddr != "" {
            currentAddr = *leaderAddr // Redirect
            continue
        }
    }

    // Any other error code, or NOT_LEADER without a usable hint:
    // stop instead of resending to the same node.
    return fmt.Errorf("write failed: %v", resp.Error)
}

See Quick Start for a complete working example.


§Handling in Embedded Mode

Check Result:

match client.put(key, value).await {
    Ok(_) => { /* success */ }
    Err(e) => {
        // e is ClientApiError
        match e.code() {
            ErrorCode::NotLeader => { /* handle */ }
            ErrorCode::ClusterUnavailable => { /* retry */ }
            _ => { /* other errors */ }
        }
    }
}

§Leader Hint

When you get a NOT_LEADER error, check metadata.LeaderAddress:

if resp.Error == error_pb.ErrorCode_NOT_LEADER && resp.Metadata != nil {
    leaderAddr := resp.Metadata.LeaderAddress  // e.g., "0.0.0.0:9082"
    leaderId := resp.Metadata.LeaderId          // e.g., "2"
    // Reconnect to leaderAddr
}

This allows an immediate redirect to the leader instead of trying every node in turn.
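
The hint-following logic can also be sketched independently of any transport. In the Rust sketch below, `WriteOutcome`, `write_with_redirect`, and the `send` closure are hypothetical stand-ins for a real client call; only the idea of redirecting to the hinted leader address comes from the text above.

```rust
/// Outcome of one write attempt. A hypothetical stand-in for the response's
/// error code plus its optional leader-address metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum WriteOutcome {
    Success,
    NotLeader { leader_address: Option<String> },
}

/// Send a write starting at `start`, following leader hints for at most
/// `max_attempts` attempts. `send` performs one attempt against an address.
/// Returns the address that accepted the write, or a description of the failure.
pub fn write_with_redirect<F>(
    start: &str,
    max_attempts: usize,
    mut send: F,
) -> Result<String, String>
where
    F: FnMut(&str) -> WriteOutcome,
{
    let mut addr = start.to_string();
    for _ in 0..max_attempts {
        match send(&addr) {
            WriteOutcome::Success => return Ok(addr),
            // A non-empty hint: redirect straight to the reported leader.
            WriteOutcome::NotLeader { leader_address: Some(hint) } if !hint.is_empty() => {
                addr = hint;
            }
            // No usable hint: the caller must fall back to another strategy.
            WriteOutcome::NotLeader { .. } => {
                return Err(format!("{} is not the leader and gave no hint", addr));
            }
        }
    }
    Err(format!("gave up after {} attempts", max_attempts))
}
```

Bounding the number of attempts matters because a hint can be stale during an election, in which case two nodes may briefly point at each other.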


§Retry Strategy

| Error | Retry? | How |
|-------|--------|-----|
| CONNECTION_TIMEOUT | Yes | Exponential backoff |
| LEADER_CHANGED | Yes | Immediate with new leader |
| NOT_LEADER | Yes | Redirect to leader |
| CLUSTER_UNAVAILABLE | Yes | Wait longer, don’t hammer |
| INVALID_REQUEST | No | Fix request first |
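
The table above can be expressed as a small decision function. This is a sketch only: `Code`, `Retry`, `backoff`, and `decide` are hypothetical names, and the concrete durations are assumptions. The 3 s base echoes the "Retry after 3-5s" guidance for CONNECTION_TIMEOUT, while the 10 s base and both caps are illustrative.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the error codes listed in the retry table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    ConnectionTimeout,
    LeaderChanged,
    NotLeader,
    ClusterUnavailable,
    InvalidRequest,
}

/// What the caller should do next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Retry {
    /// Wait for the given delay, then retry.
    After(Duration),
    /// Retry immediately against the leader from the response metadata.
    Redirect,
    /// Do not retry; the request itself must be fixed.
    Never,
}

/// Exponential backoff: `base * 2^attempt`, saturating and capped at `cap`.
pub fn backoff(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Map an error code and a zero-based attempt counter to a retry decision.
pub fn decide(code: Code, attempt: u32) -> Retry {
    match code {
        // Assumed: 3 s base, 60 s cap.
        Code::ConnectionTimeout => {
            Retry::After(backoff(attempt, Duration::from_secs(3), Duration::from_secs(60)))
        }
        Code::LeaderChanged | Code::NotLeader => Retry::Redirect,
        // Assumed: a longer 10 s base and 120 s cap, to avoid hammering the cluster.
        Code::ClusterUnavailable => {
            Retry::After(backoff(attempt, Duration::from_secs(10), Duration::from_secs(120)))
        }
        Code::InvalidRequest => Retry::Never,
    }
}
```

Production clients often add random jitter to the computed delay so that many clients do not retry in lockstep; it is omitted here to keep the sketch deterministic.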

§See Also