A native async Rust client for Apache Kafka 4.0 and newer.
kafkit-client is built on tokio and kafka-protocol. It exposes producer,
consumer, share-group consumer, and admin APIs without wrapping librdkafka or the
Java client.
Status
This crate targets modern Kafka clusters only:
- Apache Kafka 4.0+.
- KRaft clusters.
- Modern consumer groups using KIP-848.
- Transaction protocol v2.
- Share-group protocol support.
Classic consumer runtime protocols, ZooKeeper-era assumptions, and older broker protocol paths are intentionally out of scope. The admin client can still list, describe, and delete classic groups.
Install
```toml
[dependencies]
kafkit-client = "0.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```
Performance
kafkit-client is designed as a native async Rust client rather than a wrapper
around librdkafka. In the local release-mode comparison app in this repository,
it currently outperforms rdkafka on both producer and consumer throughput for
the measured workload.
Benchmark setup:
- Command mode: `--release`.
- Machine: Apple M3 Pro, 12 logical CPUs, 36 GiB RAM.
- OS/toolchain: macOS 26.3.1, Rust 1.95.0, Docker 29.4.0.
- Kafka: `apache/kafka:4.2.0` managed by testcontainers.
- Payload: 100,000 messages, 256-byte values.
- Topic: three partitions.
- Compression: none.
- Producer: `linger.ms = 0`, concurrency 256.
- Consumer: topic prefilled with 100,000 records by the comparison app, commits disabled.
| Workload | kafkit-client msg/s | rdkafka msg/s | kafkit speedup |
|---|---|---|---|
| Produce | 122,024.45 | 12,129.49 | 10.06x |
| Consume | 114,726.34 | 31,006.31 | 3.70x |
Producer p50, p95, and p99 latencies in that run were also lower:
| Client | p50 us | p95 us | p99 us | max us |
|---|---|---|---|---|
| kafkit-client | 1,038 | 2,709 | 8,495 | 272,895 |
| rdkafka | 18,751 | 37,727 | 55,807 | 87,935 |
Consumer throughput was also higher in the same comparison:
| Client | Elapsed ms | Msg/s | MiB/s |
|---|---|---|---|
| kafkit-client | 871.64 | 114,726.34 | 28.01 |
| rdkafka | 3,225.15 | 31,006.31 | 7.57 |
Consumer latency percentiles are intentionally not highlighted because the benchmark records payload age from prefill time. For consumers, the useful number from this run is fetch/decode throughput with commits disabled.
These are not universal Kafka performance numbers. They are a repeatable local
comparison for this repo's benchmark shape. Different broker versions, network
latency, compression, partition counts, batch sizes, commit behavior, message
sizes, and durability settings can move the result. In this run, broker env vars
were not set, so the benchmark app started separate Kafka testcontainers for
kafkit-client and rdkafka. That isolates the clients but does not put both
clients on the same broker process. The full report, commands, and specs are in
examples/perf-comparison/PERFORMANCE_SUMMARY.md.
How It Works
The producer has the same broad shape as Kafka's Java producer: application send calls enqueue records, a dedicated sender loop drains ready batches, batches are grouped by leader, and produce requests are dispatched asynchronously. The sender keeps a bounded number of requests in flight per broker connection, configured by `ProducerConfig::max_in_flight_requests_per_connection` and defaulting to five, which matches the Java producer default. A generic sketch of this bounded in-flight dispatch follows the step list below.
Internally, the producer path is:
- `send(...)` validates and enqueues a `ProduceRecord` into the accumulator.
- The accumulator groups records by topic/partition and seals batches when they reach `batch_size`, expire `linger`, or need to flush.
- Metadata maps each partition to its current leader.
- Ready batches are grouped by leader and encoded into Kafka produce requests.
- Produce requests are sent in the background with bounded in-flight tracking.
- Completions resolve the original application futures, retry retriable errors, or fail records when delivery timeout is exceeded.
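The step list above describes the crate's documented pipeline; the block below is only a generic tokio sketch of the bounded in-flight idea from it, using a semaphore to cap how many requests per connection are still awaiting responses while dispatch keeps going. All types and names in it are illustrative, not kafkit-client's internals.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Stand-in for an encoded produce request headed to one partition leader.
struct ProduceRequest {
    payload: Vec<u8>,
}

// Placeholder for writing a request to a broker connection and awaiting
// the correlated response.
async fn send_and_await_response(_request: ProduceRequest) {}

// Keep dispatching requests while capping how many are awaiting a broker
// response at once, in the spirit of max_in_flight_requests_per_connection.
async fn dispatch_pipelined(requests: Vec<ProduceRequest>, max_in_flight: usize) {
    let permits = Arc::new(Semaphore::new(max_in_flight));
    let mut handles = Vec::new();

    for request in requests {
        // Wait for a free in-flight slot before sending the next request.
        let permit = permits
            .clone()
            .acquire_owned()
            .await
            .expect("semaphore closed");
        handles.push(tokio::spawn(async move {
            send_and_await_response(request).await;
            drop(permit); // response handled: release the in-flight slot
        }));
    }

    for handle in handles {
        let _ = handle.await;
    }
}

#[tokio::main]
async fn main() {
    let requests = (0u8..16).map(|i| ProduceRequest { payload: vec![i] }).collect();
    dispatch_pipelined(requests, 5).await;
}
```

Allowing several outstanding requests per connection is what lets the sender keep batches moving instead of stalling on each broker round trip.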
The consumer runtime follows a similar single-owner async model. It owns the
metadata cache, coordinator connection, leader connections, assignment state,
offset state, heartbeat scheduling, and buffered records. Polling asks the runtime
for records; the runtime fetches from partition leaders, decodes record batches
with kafka-protocol, and preserves partial trailing batches for the next fetch
instead of treating a fetch-boundary partial batch as fatal.
Internally, the consumer path is:
- Subscription or manual assignment updates the runtime-owned assignment state.
- Metadata maps assigned partitions to their current leaders.
- Fetch requests are grouped by leader connection.
- Responses are decoded into `ConsumerRecord` batches in the runtime.
- Any records beyond the current poll demand stay buffered for later polls.
- Offset commits are explicit unless auto-commit is enabled.
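As a rough illustration rather than the crate's internals, the following tokio sketch shows a single-owner runtime loop that owns the record buffer, answers poll requests over a channel, and carries leftover records across polls, mirroring the buffering step described above. All types and names are illustrative.

```rust
use tokio::sync::{mpsc, oneshot};

// Stand-in for a decoded record.
struct ConsumerRecord {
    partition: i32,
    offset: i64,
}

// Placeholder for fetching from partition leaders and decoding batches.
async fn fetch_from_leaders() -> Vec<ConsumerRecord> {
    (0i64..10).map(|offset| ConsumerRecord { partition: 0, offset }).collect()
}

// Commands the consumer handle sends to the single-owner runtime task.
enum Command {
    Poll {
        max: usize,
        reply: oneshot::Sender<Vec<ConsumerRecord>>,
    },
}

// The runtime task owns all consumer state; callers only exchange messages with it.
async fn run_consumer_runtime(mut commands: mpsc::Receiver<Command>) {
    let mut buffered: Vec<ConsumerRecord> = Vec::new();

    while let Some(Command::Poll { max, reply }) = commands.recv().await {
        if buffered.is_empty() {
            buffered.extend(fetch_from_leaders().await);
        }
        // Hand back at most `max` records; anything beyond the current poll
        // demand stays buffered for later polls.
        let take = max.min(buffered.len());
        let _ = reply.send(buffered.drain(..take).collect());
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(16);
    tokio::spawn(run_consumer_runtime(rx));

    let (reply_tx, reply_rx) = oneshot::channel();
    tx.send(Command::Poll { max: 4, reply: reply_tx }).await.unwrap();
    let records = reply_rx.await.unwrap();
    println!("polled {} records, first offset {}", records.len(), records[0].offset);
}
```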
Why It Is Faster In These Benchmarks
The performance advantage in the current comparison comes from a few concrete implementation choices:
- Native Tokio I/O avoids crossing an FFI/native client boundary for every produce or consume operation.
- The client targets Kafka 4.x and modern protocols directly, so it does not carry compatibility branches for older broker eras.
- Producer dispatch is pipelined: the sender no longer waits for one produce response before issuing the next request to a leader.
- Completion handling is decoupled from request dispatch, so the sender can keep batches moving while previous requests are still awaiting broker responses.
- The default benchmark uses zero linger, which avoids intentionally delaying partial batches in a latency-sensitive workload.
- Consumer fetches are runtime-owned and decode directly into Rust records without an FFI/native-client handoff.
- Fetch decoding handles partial batch boundaries without forcing an error/retry path, which matters when a broker response ends at a batch boundary.
- The consumer benchmark disables commits, so the comparison isolates fetch and decode throughput rather than coordinator commit latency.
librdkafka remains a mature and heavily optimized Kafka client. The claim here
is narrower: for this repo's current release-mode benchmark, with the settings
above, kafkit-client is faster.
Quick Start
Use KafkaClient when you want a compact, topic-scoped setup for application
code:
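(The snippet below is a hedged sketch: this README names the `KafkaClient` type and its `topic(...)` selector, but the constructor and the send shape shown here are assumptions.)

```rust
use kafkit_client::KafkaClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical constructor; the exact connect/builder call may differ.
    let client = KafkaClient::connect("localhost:9092").await?;

    // Topic-scoped handle used by the ergonomic producer/consumer setup.
    let topic = client.topic("example-topic");

    // Hypothetical send shape on the topic-scoped handle.
    topic.send(Some("key-1"), b"hello kafkit").await?;

    Ok(())
}
```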
Producer
Use KafkaProducer and ProducerConfig directly when you need lower-level
control over batching, compression, idempotence, transactions, or explicit
topic/partition routing:
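(Sketch only: the `ProducerConfig::new` constructor and the `send` signature are assumptions; this README establishes the `KafkaProducer` and `ProducerConfig` types plus `flush()` and `shutdown()`.)

```rust
use kafkit_client::{KafkaProducer, ProducerConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical config construction; batching, linger, compression, and
    // idempotence knobs would be set here in a real program.
    let config = ProducerConfig::new("localhost:9092");
    let producer = KafkaProducer::new(config).await?;

    // send(...) enqueues a record into the accumulator and resolves once the
    // broker acknowledges the batch containing it.
    producer.send("example-topic", Some("key-1"), b"payload").await?;

    // flush() waits for buffered records; shutdown() drains and closes connections.
    producer.flush().await?;
    producer.shutdown().await?;
    Ok(())
}
```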
Consumer
Consumers use Kafka's modern group protocol. The ergonomic builder subscribes to
the topic selected by KafkaClient::topic(...):
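(Sketch only: the `ConsumerConfig::new` constructor, the `subscribe` and `commit` calls, and the record accessors are assumptions; this README establishes `KafkaConsumer`, `ConsumerConfig`, `poll().await`, and explicit or auto commits.)

```rust
use kafkit_client::{ConsumerConfig, KafkaConsumer};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical config shape: bootstrap servers plus a KIP-848 group id.
    let config = ConsumerConfig::new("localhost:9092", "example-group");
    let consumer = KafkaConsumer::new(config).await?;
    consumer.subscribe(&["example-topic"]).await?;

    // poll() returns a batch of records buffered by the runtime.
    let records = consumer.poll().await?;
    for record in &records {
        println!("partition={} offset={}", record.partition(), record.offset());
    }

    // Commit explicitly; skip this if auto-commit is enabled.
    consumer.commit().await?;
    consumer.shutdown().await?;
    Ok(())
}
```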
Admin
The admin client covers common cluster and topic operations:
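(Sketch only: the method names and argument shapes below are assumptions; this README lists topic create/list/describe/delete/config, group description, broker metadata lookup, and cluster description as the admin surface.)

```rust
use kafkit_client::{AdminConfig, KafkaAdmin};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let admin = KafkaAdmin::new(AdminConfig::new("localhost:9092")).await?;

    // Hypothetical call shapes for the operations listed above.
    admin.create_topic("example-topic", 3, 1).await?;
    for name in admin.list_topics().await? {
        println!("topic: {name}");
    }
    let cluster = admin.describe_cluster().await?;
    println!("cluster: {cluster:?}");

    Ok(())
}
```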
Security
TLS and SASL are configured on the same builder/config types used for plain connections:
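(Sketch only: the `with_tls_ca_file` and `with_sasl_scram_sha_256` builder names are invented for illustration; this README states only that TLS and SASL/PLAIN plus SCRAM are configured on the same builder/config types used for plain connections.)

```rust
use kafkit_client::{KafkaProducer, ProducerConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical security setters on the same ProducerConfig used for plain
    // connections: a custom CA file plus SASL/SCRAM-SHA-256 credentials.
    let config = ProducerConfig::new("broker.internal:9093")
        .with_tls_ca_file("certs/ca.pem")
        .with_sasl_scram_sha_256("app-user", "app-secret");

    let producer = KafkaProducer::new(config).await?;
    producer.send("example-topic", Some("key-1"), b"secure payload").await?;
    producer.shutdown().await?;
    Ok(())
}
```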
Supported security features:
- TLS with system roots, custom CA files, server-name override, and client certs.
- SASL/PLAIN.
- SASL/SCRAM-SHA-256.
- SASL/SCRAM-SHA-512.
Feature Highlights
- Async producer, consumer, share-group consumer, and admin client.
- Topic-scoped `KafkaClient` facade for concise application setup.
- Lower-level `KafkaProducer`, `KafkaConsumer`, `KafkaShareConsumer`, and `KafkaAdmin` APIs for explicit configuration.
- Record headers, nullable values, tombstones, and explicit record timestamps.
- Configurable compression, linger, batching, retry backoff, request timeout, and delivery timeout.
- Producer `flush()` and explicit `shutdown()` controls.
- Transactional producer support, including `send_offsets_to_transaction`.
- Manual assignment, topic-list subscription, and pattern subscription.
- Seek, position, pause, resume, committed-offset lookup, beginning/end offset lookup, and timestamp offset lookup.
- Topic create/list/describe/delete/config APIs, consumer group description, broker metadata lookup, and cluster description.
- `tracing` instrumentation.
Choosing an API
- Use `KafkaClient` for service code that mostly works with one topic at a time.
- Use `KafkaProducer` with `ProducerConfig` for explicit partitioning, batching, compression, idempotence, or transactions.
- Use `KafkaConsumer` with `ConsumerConfig` for direct control over group, assignment, fetch, offset, and commit behavior.
- Use `KafkaShareConsumer` for Kafka share-group consumption.
- Use `KafkaAdmin` with `AdminConfig` for cluster and topic management.
Operational Notes
- Call `shutdown().await` on producers and consumers when your application is stopping so background tasks can drain and close connections cleanly.
- Call `flush().await` on producers when you need all currently buffered records acknowledged before continuing.
- Consumers return batches from `poll().await`; commit the records you have processed, or configure auto-commit if that matches your processing model.
- Transactional producers must set a transactional id, call `init_transactions().await`, then use `begin_transaction()`, `commit_transaction()`, or `abort_transaction()`.
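The sketch below strings together the transactional calls from the last note above. `init_transactions`, `begin_transaction`, `commit_transaction`, and `abort_transaction` are the names this README uses; the `with_transactional_id` setter, the `send` signature, and the `.await` on each transaction call are assumptions.

```rust
use kafkit_client::{KafkaProducer, ProducerConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical setter for the transactional id, which must be set before
    // init_transactions() is called.
    let config = ProducerConfig::new("localhost:9092").with_transactional_id("example-tx");
    let producer = KafkaProducer::new(config).await?;

    producer.init_transactions().await?;
    producer.begin_transaction().await?;

    match producer.send("example-topic", Some("key-1"), b"tx payload").await {
        // All records sent since begin_transaction() become visible atomically.
        Ok(_) => producer.commit_transaction().await?,
        // On failure, abort so none of the transaction's records are exposed.
        Err(_) => producer.abort_transaction().await?,
    }

    producer.shutdown().await?;
    Ok(())
}
```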
Runnable Examples
The crate includes real programs under `examples/`, each of which expects a local Kafka 4.0+ broker on `localhost:9092`. See `examples/README.md` for environment variables and Docker Compose notes.