Function dense_rank_str 

pub fn dense_rank_str<T: Integer>(
    arr: StringAVT<'_, T>,
) -> Result<IntegerArray<i32>, KernelError>

Computes SQL DENSE_RANK() ranking for string data with lexicographic dense ordering.

Implements dense ranking for string values using lexicographic comparison: identical strings receive the same rank, and each subsequent distinct string receives the next consecutive rank, so no rank values are skipped. Typical applications are alphabetical ranking and encoding textual categories.

§Parameters

  • arr - String array view containing textual values for dense ranking

§Returns

Returns Result<IntegerArray<i32>, KernelError>:

  • Success: An IntegerArray<i32> of dense ranks forming a consecutive sequence with no gaps
  • Error: KernelError if capacity validation fails
  • Null handling: Null string elements receive a rank value of zero
  • Null mask: Marks which positions hold valid ranks
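
The return contract above can be illustrated with a minimal standard-library sketch. It is not the kernel's implementation: `dense_rank_with_nulls` is a hypothetical helper, `Option<&str>` stands in for a nullable string array, and a `Vec<bool>` stands in for the null mask. Ranks are assumed to start at 1, as in the example at the end of this page.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the kernel's output contract:
/// returns (ranks, validity mask), where null inputs get rank 0
/// and a `false` mask entry.
fn dense_rank_with_nulls(values: &[Option<&str>]) -> (Vec<i32>, Vec<bool>) {
    // Collect the distinct non-null strings in sorted order.
    let distinct: BTreeSet<&str> = values.iter().flatten().copied().collect();
    let sorted: Vec<&str> = distinct.into_iter().collect();

    let mut ranks = Vec::with_capacity(values.len());
    let mut mask = Vec::with_capacity(values.len());
    for v in values {
        match v {
            Some(s) => {
                // Position among the distinct values, shifted to start at 1.
                let idx = sorted.binary_search(s).expect("value was collected above");
                ranks.push(idx as i32 + 1);
                mask.push(true);
            }
            None => {
                // Null input: zero placeholder, marked invalid in the mask.
                ranks.push(0);
                mask.push(false);
            }
        }
    }
    (ranks, mask)
}
```

Consumers should consult the mask before reading a rank, since a zero is a placeholder rather than a rank below 1.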

§Dense String Ranking

  • DENSE_RANK() semantics: Identical strings receive the same rank, with no gaps between ranks
  • Lexicographic ordering: Standard dictionary-style string comparison
  • Case sensitivity: Maintains case-sensitive comparison (“Apple” ≠ “apple”)
  • UTF-8 support: Proper handling of Unicode string sequences
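
To make the ordering rules concrete, here is a small sketch that assumes the kernel's comparison matches Rust's built-in `str` ordering (byte-wise over UTF-8); the documentation above does not confirm this. `dense_rank_bytewise` is a hypothetical helper, not part of simd_kernels. Under that assumption, uppercase ASCII letters sort before lowercase ones, so "Apple" and "apple" are distinct values with different ranks.

```rust
/// Hypothetical helper: dense-ranks strings using Rust's default
/// `str` ordering (byte-wise comparison of the UTF-8 encoding).
fn dense_rank_bytewise(values: &[&str]) -> Vec<i32> {
    // Sort a copy and drop duplicates to get the distinct values in order.
    let mut sorted: Vec<&str> = values.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    // Each value's rank is its 1-based position among the distinct values.
    values
        .iter()
        .map(|v| sorted.binary_search(v).expect("value is present") as i32 + 1)
        .collect()
}
```

For the input `["apple", "Apple", "Zebra", "apple"]`, the distinct values sort as "Apple", "Zebra", "apple", so the sketch yields `[3, 1, 2, 3]`.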

§Use Cases

  • Alphabetical dense ranking: Creating compact alphabetical orderings
  • Categorical encoding: Converting string categories to dense integer codes
  • Text analytics: Establishing lexicographic ordinality for text processing
  • Database operations: SQL DENSE_RANK() for string-valued columns

§Examples

use minarrow::StringArray;
use simd_kernels::kernels::window::dense_rank_str;

let arr = StringArray::<u32>::from_slice(&["banana", "apple", "cherry", "apple"]);
let result = dense_rank_str((&arr, 0, arr.len())).unwrap();
// Ranks: [2, 1, 3, 1]; both "apple" entries tie at rank 1