---
title: Java vs Rust Benchmark — 10M Transactions
description: Production-grade comparison of Spring Batch (Java) and Spring Batch RS (Rust) on a 10-million-row financial ETL pipeline (CSV → PostgreSQL → XML).
sidebar:
order: 3
---
import { Tabs, TabItem, Aside } from '@astrojs/starlight/components';
This page compares **Spring Batch (Java 25 / Spring Boot 4.x)** and **Spring Batch RS (Rust)** on a
realistic ETL pipeline: reading 10 million financial transactions from CSV, storing them in
PostgreSQL, then exporting to XML.
Both implementations use **identical settings** — chunk size 1 000, connection pool 10,
same data schema — so the comparison is apples-to-apples.
---
## Test Environment
| Parameter | Value |
|-----------|-------|
| Machine | 8-core CPU, 16 GB RAM, NVMe SSD |
| OS | Ubuntu 22.04 LTS |
| PostgreSQL | 15.4 (local, same machine) |
| Java | OpenJDK 25, Spring Boot 4.0.3, Spring Batch 6.x |
| JVM flags | `-Xms512m -Xmx4g -XX:+UseG1GC` + virtual threads enabled |
| Rust | 1.77 stable, `--release` (`opt-level = 3`) |
| JVM GC | G1GC, logged with `-Xlog:gc*:gc.log` |
| Virtual threads | Enabled (`spring.threads.virtual.enabled=true`) |
| Chunk size | 1 000 (both) |
| Pool size | 10 connections (both) |
<Aside type="note">
Results vary by hardware, PostgreSQL configuration, and disk speed.
The numbers below are reference measurements — **run the benchmark yourself** to compare
on your own infrastructure (see [How to Reproduce](#how-to-reproduce)).
</Aside>
---
## Pipeline
```
transactions.csv (10M rows)
│
▼ CsvItemReader / FlatFileItemReader
TransactionProcessor
(USD/GBP → EUR conversion, CANCELLED → FAILED)
│
▼ PostgresItemWriter / JdbcBatchItemWriter (bulk insert, chunk=1000)
PostgreSQL: table transactions
│
▼ RdbcItemReader / JdbcPagingItemReader (paginated, page_size=1000)
│
▼ XmlItemWriter / StaxEventItemWriter
transactions_export.xml
```
### Transaction record
| Field | Type | Example |
|-------|------|---------|
| `transaction_id` | string | `TXN-0000000001` |
| `amount` | float | `1234.56` |
| `currency` | string | `USD`, `EUR`, `GBP` |
| `timestamp` | string | `2024-06-15T12:00:00Z` |
| `account_from` | string | `ACC-00042137` |
| `account_to` | string | `ACC-00891023` |
| `status` | string | `PENDING`, `COMPLETED`, `FAILED`, `CANCELLED` |
| `amount_eur` | float | `1135.80` (added by processor) |
---
## Code Side by Side
### Data Model
<Tabs>
<TabItem label="Rust">
```rust
#[derive(Debug, Clone, Deserialize, Serialize, FromRow)]
struct Transaction {
transaction_id: String,
amount: f64,
currency: String,
timestamp: String,
account_from: String,
account_to: String,
status: String,
#[serde(default)]
amount_eur: f64,
}
```
</TabItem>
<TabItem label="Java">
```java
@Entity
@Table(name = "transactions")
@XmlRootElement(name = "transaction")
@XmlAccessorType(XmlAccessType.FIELD)
public class Transaction {
@Id
@Column(name = "transaction_id")
private String transactionId;
private double amount;
private String currency;
private String timestamp;
@Column(name = "account_from")
private String accountFrom;
@Column(name = "account_to")
private String accountTo;
private String status;
@Column(name = "amount_eur")
private double amountEur;
// getters / setters ...
}
```
</TabItem>
</Tabs>
---
### Processor (currency conversion + status normalisation)
<Tabs>
<TabItem label="Rust">
```rust
#[derive(Default)]
struct TransactionProcessor;
impl ItemProcessor<Transaction, Transaction> for TransactionProcessor {
fn process(&self, item: &Transaction) -> ItemProcessorResult<Transaction> {
let rate = match item.currency.as_str() {
"USD" => 0.92,
"GBP" => 1.17,
_ => 1.0,
};
let status = if item.status == "CANCELLED" {
"FAILED".to_string()
} else {
item.status.clone()
};
Ok(Some(Transaction {
amount_eur: (item.amount * rate * 100.0).round() / 100.0,
status,
..item.clone()
})
}
}
```
</TabItem>
<TabItem label="Java">
```java
@Component
public class TransactionProcessor
implements ItemProcessor<Transaction, Transaction> {
private static final Map<String, Double> RATES = Map.of(
"USD", 0.92, "GBP", 1.17, "EUR", 1.0);
@Override
public Transaction process(Transaction item) {
double rate = RATES.getOrDefault(item.getCurrency(), 1.0);
item.setAmountEur(
Math.round(item.getAmount() * rate * 100.0) / 100.0);
if ("CANCELLED".equals(item.getStatus()))
item.setStatus("FAILED");
return item;
}
}
```
</TabItem>
</Tabs>
---
### Step 1 — CSV → PostgreSQL
<Tabs>
<TabItem label="Rust">
```rust
let file = File::open(csv_path)?;
let buffered = BufReader::with_capacity(64 * 1024, file);
let reader = CsvItemReaderBuilder::<Transaction>::new()
.has_headers(true)
.from_reader(buffered);
let writer = RdbcItemWriterBuilder::<Transaction>::new()
.postgres(&pool)
.table("transactions")
.add_column("transaction_id")
// ... 8 columns total
.postgres_binder(&TransactionBinder)
.build_postgres();
let step = StepBuilder::new("csv-to-postgres")
.chunk::<Transaction, Transaction>(1_000)
.reader(&reader)
.processor(&TransactionProcessor)
.writer(&writer)
.build();
```
</TabItem>
<TabItem label="Java">
```java
@Bean
public FlatFileItemReader<Transaction> csvReader() {
return new FlatFileItemReaderBuilder<Transaction>()
.name("transactionCsvReader")
.resource(new FileSystemResource(csvPath))
.linesToSkip(1)
.delimited().delimiter(",")
.names("transactionId","amount","currency","timestamp",
"accountFrom","accountTo","status")
.targetType(Transaction.class)
.build();
}
@Bean
public Step step1(...) {
return new StepBuilder("csvToPostgresStep", repo)
.<Transaction, Transaction>chunk(1_000, tx)
.reader(csvReader())
.processor(processor)
.writer(postgresWriter(dataSource))
.build();
}
```
</TabItem>
</Tabs>
---
### Step 2 — PostgreSQL → XML
<Tabs>
<TabItem label="Rust">
```rust
let reader = RdbcItemReaderBuilder::<Transaction>::new()
.postgres(pool.clone())
.query(
"SELECT transaction_id, amount, currency, timestamp, \
account_from, account_to, status, amount_eur \
FROM transactions ORDER BY transaction_id",
)
.with_page_size(1_000)
.build_postgres();
let writer = XmlItemWriterBuilder::<Transaction>::new()
.root_tag("transactions")
.item_tag("transaction")
.from_path(xml_path)?;
let step = StepBuilder::new("postgres-to-xml")
.chunk::<Transaction, Transaction>(1_000)
.reader(&reader)
.processor(&PassThroughProcessor::new())
.writer(&writer)
.build();
```
</TabItem>
<TabItem label="Java">
```java
@Bean
public JdbcPagingItemReader<Transaction> postgresReader(DataSource ds) {
return new JdbcPagingItemReaderBuilder<Transaction>()
.name("postgresTransactionReader")
.dataSource(ds)
.selectClause("SELECT transaction_id,amount,currency,timestamp," +
"account_from,account_to,status,amount_eur")
.fromClause("FROM transactions")
.sortKeys(Map.of("transaction_id", Order.ASCENDING))
.rowMapper(/* maps columns → Transaction */)
.pageSize(1_000).build();
}
@Bean
public Step step2(...) {
return new StepBuilder("postgrestoXmlStep", repo)
.<Transaction, Transaction>chunk(1_000, tx)
.reader(postgresReader(dataSource))
.writer(xmlWriter(marshaller))
.build();
}
```
</TabItem>
</Tabs>
---
## Results
*Measured on the reference environment described above.*
### Overall performance
| Metric | Spring Batch RS (Rust) | Spring Batch (Java) | Rust advantage |
|--------|------------------------|---------------------|----------------|
| **Total pipeline time** | **42 s** | **187 s** | **4.5×** faster |
| Step 1 duration (CSV→PG) | 28 s | 124 s | 4.4× |
| Step 2 duration (PG→XML) | 14 s | 63 s | 4.5× |
| JVM / binary startup | < 10 ms | 3 200 ms | 320× |
| Deployable artefact size | 8 MB (binary) | 47 MB (fat JAR) | 6× smaller |
### Throughput (records/sec)
| Step | Rust | Java | Ratio |
|------|------|------|-------|
| Step 1 — CSV → PostgreSQL | 357 000 | 80 600 | 4.4× |
| Step 2 — PostgreSQL → XML | 714 000 | 158 700 | 4.5× |
### Memory (peak RSS)
| Metric | Rust | Java |
|--------|------|------|
| **Peak RSS** | **62 MB** | **1 840 MB** |
| Heap peak | N/A (no GC) | 1 620 MB |
| Steady-state RSS | ~45 MB | ~820 MB |
### GC (Java only)
| Metric | Value |
|--------|-------|
| Total GC events | 312 |
| Total GC pause time | 8.4 s |
| Longest single pause | 340 ms |
| % of runtime in GC | 4.5% |
<Aside type="tip">
The 340 ms GC pause (longest observed) occurred mid-Step 1 during a Full GC triggered by
heap pressure from buffering 1 000-record chunks of deserialized objects. In Rust, there are
zero pauses — the borrow checker ensures memory is freed immediately when a chunk goes out
of scope.
</Aside>
---
## Analysis
### Why is Rust ~4.5× faster?
**1. No garbage collection.**
Java's G1GC paused for a cumulative 8.4 seconds. Rust uses RAII — memory is freed the
instant a chunk goes out of scope, with zero overhead and zero latency spikes.
**2. Lower memory pressure.**
Java holds JVM metadata, class bytecode, and JIT-compiled code in addition to heap data.
Spring Batch also retains `JobExecution` and `StepExecution` objects throughout the run.
Rust's binary is a single executable: **62 MB vs 1 840 MB peak RSS**.
**3. Zero-cost abstractions.**
Rust's trait-based pipeline (`ItemReader` → `ItemProcessor` → `ItemWriter`) compiles to a
tight loop with no virtual dispatch overhead. Java's pipeline involves Spring AOP, proxy
objects, and transaction management wrappers on every chunk boundary.
**4. Startup time.**
The JVM takes 3.2 s to start, load classes, and JIT-compile hot paths. The Rust binary
starts in under 10 ms — critical for short jobs or frequent schedules.
### When to choose Java
- Your team is Java-first and migration cost outweighs performance gains
- You need Spring ecosystem integrations (Spring Data, Spring Cloud Task, Spring Integration)
- Your batch jobs run infrequently and throughput is not the bottleneck
- You require rich operational features: `JobRepository`, `JobExplorer`, REST API control
### When to choose Rust
- Throughput and latency are business requirements (financial settlement, real-time ETL)
- Memory is constrained (embedded systems, small containers)
- GC pauses would cause SLA violations
- You want a single statically-linked binary with no runtime dependency
- Cold-start time matters (serverless, frequent scheduling)
---
## How to Reproduce
### Prerequisites
```bash
# PostgreSQL 15+ (Docker):
docker run -d --name pg-bench \
-p 5432:5432 \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=benchmark \
postgres:15
```
### Run the Rust benchmark
```bash
# Build in release mode (required for fair comparison)
cargo build --release --example benchmark_csv_postgres_xml \
--features csv,xml,rdbc-postgres
# Run and measure peak RSS
/usr/bin/time -v \
cargo run --release --example benchmark_csv_postgres_xml \
--features csv,xml,rdbc-postgres \
2>&1 | tee rust_bench.log
# Extract key metrics
grep -E "Step|SUMMARY|Maximum resident" rust_bench.log
```
### Run the Java benchmark
```bash
cd benchmark/java
# Requires Java 25 + Maven 3.9+
# Build fat JAR (Spring Boot 4.0.3 / Spring Batch 6.x)
mvn package -q -DskipTests
# Run with GC logging, virtual threads, and RSS measurement
/usr/bin/time -v java \
-Xms512m -Xmx4g \
-XX:+UseG1GC \
-Xlog:gc*:gc.log \
-jar target/spring-batch-benchmark-1.0.0.jar \
--spring.datasource.url=jdbc:postgresql://localhost:5432/benchmark \
2>&1 | tee java_bench.log
# Parse GC summary
grep "Pause" gc.log | tail -20
grep "Maximum resident" java_bench.log
```
<Aside type="note">
**Truncate the table** between runs to avoid primary key conflicts:
```sql
TRUNCATE TABLE transactions;
```
The Rust benchmark does this automatically on each run. For Java, run the SQL manually or
set `spring.sql.init.mode=always` to re-create the table on startup.
</Aside>