eld_llm 0.0.1

An LLM built from scratch in Rust

# BPE Public Benchmarks

## Compression ratio (chars / tokens)

| Model | WikiText-2 | Penn Treebank | BooksCorpus |
|---|---|---|---|
| GPT-2 (reported) ||||
| tiktoken (GPT-2 vocab) ||||
| Your BPE v1 ||||
| Your BPE v2 ||||
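
A minimal sketch of how a cell in this table could be computed, assuming characters are counted as Unicode scalar values rather than bytes. The function name `compression_ratio` is hypothetical and not part of `eld_llm`; the token count is expected to come from whichever tokenizer is being measured.

```rust
/// Compression ratio as defined in the heading: characters per token.
/// `token_count` is the number of tokens the tokenizer produced for `text`.
/// Returns `None` when there are no tokens, to avoid dividing by zero.
fn compression_ratio(text: &str, token_count: usize) -> Option<f64> {
    if token_count == 0 {
        return None;
    }
    // Assumption: count Unicode scalar values, not bytes, so that
    // multi-byte characters are not counted more than once.
    let chars = text.chars().count();
    Some(chars as f64 / token_count as f64)
}
```

A higher value means each token covers more text on average. If bytes per token is preferred instead, `text.len()` would replace `text.chars().count()`, and the column header should say so.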

## Fertility score (avg tokens per word)

| Model | WikiText-2 | Penn Treebank | BooksCorpus |
|---|---|---|---|
| GPT-2 (reported) ||||
| tiktoken (GPT-2 vocab) ||||
| Your BPE v1 ||||
| Your BPE v2 ||||
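
One way to compute fertility, sketched under the assumption that a "word" is a whitespace-separated chunk of text. The name `fertility` and the `encode` callback are hypothetical stand-ins for the tokenizer under test, not an existing `eld_llm` API.

```rust
/// Fertility as defined in the heading: average tokens per word.
/// `encode` is any function that maps text to token ids.
/// Returns `None` when the text contains no words.
fn fertility<F>(text: &str, encode: F) -> Option<f64>
where
    F: Fn(&str) -> Vec<u32>,
{
    // Assumption: words are delimited by Unicode whitespace.
    let words = text.split_whitespace().count();
    if words == 0 {
        return None;
    }
    let tokens = encode(text).len();
    Some(tokens as f64 / words as f64)
}
```

The word definition should be identical for every row of the table, since a different splitting rule changes the denominator and makes the scores incomparable.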

## OOV rate (% unknown tokens)

| Model | WikiText-2 | Penn Treebank | BooksCorpus |
|---|---|---|---|
| GPT-2 (reported) ||||
| tiktoken (GPT-2 vocab) ||||
| Your BPE v1 ||||
| Your BPE v2 ||||
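
A possible way to compute the OOV rate, assuming the vocabulary reserves a single id for unknown tokens. Both `oov_rate_percent` and `unk_id` are hypothetical names. Byte-level vocabularies such as GPT-2's can represent any input byte and have no unknown token, so this value is expected to be zero for them.

```rust
/// OOV rate as defined in the heading: percentage of unknown tokens.
/// `unk_id` is the id the vocabulary uses for unknown tokens (an assumption;
/// byte-level vocabularies have no such id).
/// Returns `None` for an empty token sequence.
fn oov_rate_percent(token_ids: &[u32], unk_id: u32) -> Option<f64> {
    if token_ids.is_empty() {
        return None;
    }
    let unknown = token_ids.iter().filter(|&&id| id == unk_id).count();
    Some(100.0 * unknown as f64 / token_ids.len() as f64)
}
```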

## Encode throughput (tokens / sec)

| Model | WikiText-2 | Penn Treebank | BooksCorpus |
|---|---|---|---|
| GPT-2 (reported) ||||
| tiktoken (GPT-2 vocab) ||||
| Your BPE v1 ||||
| Your BPE v2 ||||
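
A rough sketch of a throughput measurement using only the standard library. The name `encode_throughput`, the `runs` parameter, and the `encode` callback are hypothetical. A simple wall-clock loop like this is noisy, so a dedicated benchmark harness is likely a better fit for numbers that go into the table.

```rust
use std::time::Instant;

/// Encode throughput as defined in the heading: tokens per second.
/// Encodes `text` `runs` times and divides the total token count
/// by the elapsed wall-clock time.
/// Returns `None` if no tokens were produced or no time elapsed.
fn encode_throughput<F>(text: &str, runs: u32, encode: F) -> Option<f64>
where
    F: Fn(&str) -> Vec<u32>,
{
    let start = Instant::now();
    let mut total_tokens: usize = 0;
    for _ in 0..runs {
        // Using the length keeps the result of each call observable.
        total_tokens += encode(text).len();
    }
    let secs = start.elapsed().as_secs_f64();
    if total_tokens == 0 || secs == 0.0 {
        return None;
    }
    Some(total_tokens as f64 / secs)
}
```

For comparable rows, the same machine, build profile (for example `--release`), and input text should be used for every tokenizer.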