pub struct BatchTokenizer { /* private fields */ }
Batch tokenizer
Processes multiple texts in parallel using Rayon. Internally it manages a pool of tokenizers so that each thread works independently.
Implementations
impl BatchTokenizer
pub fn default_pool_size() -> usize
Default pool size (the number of CPU cores).
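As a rough illustration of how a core-count default can be obtained with the standard library alone, the sketch below uses `std::thread::available_parallelism`. The function name `default_pool_size_sketch` is hypothetical, and whether the crate uses this call or another mechanism is not stated on this page; the value reported by the standard library can also differ from the physical core count (for example under container CPU limits).

```rust
use std::thread;

/// Hypothetical stand-in for `BatchTokenizer::default_pool_size`.
/// Assumption: "number of CPU cores" is approximated by the parallelism
/// the standard library reports for the current process.
fn default_pool_size_sketch() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        // Fall back to a single tokenizer if the value cannot be determined.
        .unwrap_or(1)
}
```

The fallback to 1 keeps the pool usable on platforms where the parallelism query fails.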
pub fn new() -> Result<BatchTokenizer, Error>
pub fn with_pool_size(pool_size: usize) -> Result<BatchTokenizer, Error>
pub fn split_with_overlap(text: &str, chunk_size: usize, overlap: usize) -> Vec<String>
Splits text into overlapping chunks.
Adds overlap between adjacent chunks to preserve context.
Arguments
text - the text to split
chunk_size - the chunk size
overlap - the overlap size (in characters)
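The idea of overlapping chunks can be sketched as a sliding window over the characters of the input. This is a minimal illustration, not the crate's implementation: the name `split_with_overlap_sketch` is hypothetical, `chunk_size` is assumed to be measured in characters like `overlap`, and clamping an overlap that is too large is an assumption made here so the window always advances.

```rust
/// Hypothetical sketch of overlap chunking; the real
/// `BatchTokenizer::split_with_overlap` may behave differently.
fn split_with_overlap_sketch(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    // Work on chars so multi-byte text (e.g. Korean) is never cut mid-character.
    let chars: Vec<char> = text.chars().collect();
    if chars.is_empty() || chunk_size == 0 {
        return Vec::new();
    }

    // Assumption: an overlap >= chunk_size is clamped so each step moves forward.
    let overlap = overlap.min(chunk_size - 1);
    let step = chunk_size - overlap;

    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // The last window reached the end of the text.
        }
        // The next chunk begins `overlap` characters before the previous end.
        start += step;
    }
    chunks
}
```

With this sketch, splitting "abcdefghij" with a chunk size of 4 and an overlap of 1 yields "abcd", "defg", and "ghij": each chunk starts with the last character of the one before it.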
pub fn available_tokenizers(&self) -> usize
Number of tokenizers currently available.
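To make the pool idea concrete, here is a minimal checkout-and-return pool built only on the standard library. Everything in it is an assumption for illustration: `WhitespaceTokenizer`, `TokenizerPool`, `with_tokenizer`, and `tokenize_batch` are hypothetical names, and scoped `std::thread` workers stand in for Rayon, which the crate itself is described as using. The private fields of `BatchTokenizer` are not documented here, so its real layout may differ.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for whatever tokenizer type the crate pools.
struct WhitespaceTokenizer;

impl WhitespaceTokenizer {
    fn tokenize(&self, text: &str) -> Vec<String> {
        text.split_whitespace().map(str::to_owned).collect()
    }
}

/// Hypothetical pool: idle tokenizers sit in a mutex-guarded Vec.
struct TokenizerPool {
    idle: Mutex<Vec<WhitespaceTokenizer>>,
}

impl TokenizerPool {
    fn with_pool_size(pool_size: usize) -> Self {
        let idle = (0..pool_size).map(|_| WhitespaceTokenizer).collect();
        Self { idle: Mutex::new(idle) }
    }

    /// Number of tokenizers that are not checked out right now.
    fn available_tokenizers(&self) -> usize {
        self.idle.lock().unwrap().len()
    }

    /// Checks out one tokenizer, runs `f` without holding the lock,
    /// then returns the tokenizer. Yields `None` if the pool is empty.
    fn with_tokenizer<R>(&self, f: impl FnOnce(&WhitespaceTokenizer) -> R) -> Option<R> {
        let tokenizer = self.idle.lock().unwrap().pop()?;
        let result = f(&tokenizer);
        self.idle.lock().unwrap().push(tokenizer);
        Some(result)
    }
}

/// One scoped thread per text; results come back in input order.
fn tokenize_batch(pool: &TokenizerPool, texts: &[&str]) -> Vec<Option<Vec<String>>> {
    thread::scope(|s| {
        let handles: Vec<_> = texts
            .iter()
            .map(|&t| s.spawn(move || pool.with_tokenizer(|tok| tok.tokenize(t))))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

In this sketch the count reported by `available_tokenizers` drops while a tokenizer is checked out and recovers once it is returned. A production pool would more likely block or retry than return `None` when every tokenizer is busy.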
Auto Trait Implementations
impl Freeze for BatchTokenizer
impl RefUnwindSafe for BatchTokenizer
impl Send for BatchTokenizer
impl Sync for BatchTokenizer
impl Unpin for BatchTokenizer
impl UnsafeUnpin for BatchTokenizer
impl UnwindSafe for BatchTokenizer
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.