jvmrs 0.1.1

A JVM implementation in Rust with Cranelift JIT, AOT compilation, and WebAssembly support
# Architecture - JVMRS

## Source Code Organization

Modules are grouped logically in `src/lib.rs`:

| Group | Modules | Purpose |
|-------|---------|---------|
| **Core** | class_file, class_loader, class_cache, memory, allocator, gc | Class loading, heap, GC |
| **Execution** | interpreter, native, reflection, jni, annotations | Bytecode execution, native methods, reflection (Interpreter + ReflectionApi), JNI, annotations |
| **Compilation** | jit, cranelift_jit, aot_compiler, wasm_backend | JIT, AOT, WASM |
| **Tools** | debug, profiler, trace, deterministic, visualization | Debugging, profiling, visualization |
| **Extensions** | extensions, aop, hot_reload, serialization | Plugins, AOP, hot reload |
| **Optional** | ffi, interop, async_io, simd, truffle, security | Feature-gated extensions |

See `docs/structure.md` for the full directory layout and module dependencies.

---

## Overview
JVMRS is a simplified Java Virtual Machine implementation written in Rust. The architecture follows a modular design with clear separation of concerns between class file parsing, memory management, and instruction interpretation.

## Differentiation from Other JVM Implementations

### Language and Memory Safety
- **Rust-based Implementation**: Unlike HotSpot and OpenJ9 (both written in C++), JVMRS relies on Rust's ownership system, so the VM's own code is memory-safe without needing a garbage collector for its internal data structures
- **Compile-time Safety Guarantees**: Eliminates entire classes of bugs common in C++ JVM implementations (use-after-free, data races, buffer overflows)
- **No Need for VM-level Memory Safeguards**: The VM code is memory-safe by construction, reducing the attack surface compared to traditional JVMs

### Compilation and Backend Architecture
- **Multi-Backend Compilation**: Native support for multiple compilation targets (x86, WebAssembly, AOT object files) from a single codebase
- **Cranelift-based JIT**: Uses a modern, Rust-native code generator instead of the C2/C1 compilers in HotSpot
- **WebAssembly Native Target**: First-class WASM support for browser and edge deployment scenarios

### Modularity and Feature Gating
- **Fine-Grained Feature Flags**: Components like JIT, FFI, async I/O, and SIMD can be selectively compiled
- **Embedded/No-STD Support**: Can run on resource-constrained platforms where traditional JVMs cannot
- **Polyglot Integration**: Built-in support for cross-language interoperability beyond standard JNI

### Memory Management Innovations
- **Arena-Based Allocators**: Improves cache locality and reduces fragmentation compared to traditional heap management
- **Generational GC with Rust Roots**: Uses Rust's ownership system for efficient root set tracking
- **Parallel GC Sweep**: Utilizes rayon for parallel garbage collection operations

### Tooling and Observability
- **Deterministic Execution Mode**: Enables reproducible execution for testing and debugging
- **Time-Travel Debugging**: Built-in support for historical debugging not available in standard JVMs
- **Integrated Profiling**: Native profiling capabilities without external tools
- **Security Instrumentation**: Built-in security monitoring and analysis capabilities

### Developer Experience
- **Cargo Integration**: Leverages Rust's ecosystem for testing, benchmarking, and dependency management
- **API-First Design**: Clean separation between library and binary components
- **Native FFI Layer**: Designed from the ground up for easy embedding in Rust applications

## System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      Application Layer                       │
├─────────────────────────────────────────────────────────────┤
│  main.rs ──┐                                                │
│            ├──► Interpreter ──┐                             │
│            │                  ├──► Class Loader             │
│            │                  ├──► Instruction Dispatch     │
│            │                  └──► Runtime State            │
│            └──► CLI Interface                               │
├─────────────────────────────────────────────────────────────┤
│                      Core Components                         │
├─────────────────────────────────────────────────────────────┤
│  class_file.rs ──┐                                          │
│                  ├──► Class File Parser                     │
│                  ├──► Constant Pool                         │
│                  ├──► Field/Method Info                     │
│                  └──► Attribute Handling                    │
│                                                             │
│  memory.rs ──────┐                                          │
│                  ├──► Heap Management                       │
│                  ├──► Stack Frames                          │
│                  ├──► Value Representation                  │
│                  └──► Object Model                          │
│                                                             │
│  interpreter.rs ─┐                                          │
│                  ├──► Opcode Implementation                 │
│                  ├──► Method Invocation                     │
│                  ├──► Control Flow                          │
│                  └──► Exception Handling                    │
└─────────────────────────────────────────────────────────────┘
```

## Component Details

### 1. Class File Module (`src/class_file.rs`)
**Purpose**: Parse Java class files according to JVM specification

**Key Structures**:
- `ClassFile`: Main structure representing a loaded class
- `ConstantPool`: Constant pool entries (strings, numbers, references)
- `MethodInfo`: Method metadata and bytecode
- `FieldInfo`: Field metadata
- `AttributeInfo`: Various class file attributes

**Responsibilities**:
- Validate class file magic number and version
- Parse constant pool entries
- Extract method and field information
- Handle class file attributes
- Provide access to bytecode instructions
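
The first responsibility, validating the magic number and version, can be sketched as follows. This is a minimal illustration that uses only the standard library rather than `byteorder`; `parse_header`, `ClassVersion`, and `HeaderError` are hypothetical names for this example and are not taken from `src/class_file.rs`.

```rust
/// Magic number that opens every Java class file.
const CLASS_MAGIC: u32 = 0xCAFE_BABE;

/// Hypothetical error type for this sketch (not the jvmrs error type).
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    /// Fewer than the 8 header bytes were supplied.
    Truncated,
    /// The first four bytes were not 0xCAFEBABE.
    BadMagic(u32),
}

/// Version fields in the order they appear in the file.
#[derive(Debug, PartialEq)]
pub struct ClassVersion {
    pub minor: u16,
    pub major: u16,
}

/// Read the 8-byte header: u4 magic, u2 minor_version, u2 major_version.
/// All multi-byte fields in a class file are big-endian.
pub fn parse_header(bytes: &[u8]) -> Result<ClassVersion, HeaderError> {
    if bytes.len() < 8 {
        return Err(HeaderError::Truncated);
    }
    let magic = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if magic != CLASS_MAGIC {
        return Err(HeaderError::BadMagic(magic));
    }
    let minor = u16::from_be_bytes([bytes[4], bytes[5]]);
    let major = u16::from_be_bytes([bytes[6], bytes[7]]);
    Ok(ClassVersion { minor, major })
}
```

A header of `CA FE BA BE 00 00 00 34` yields major version 52, which corresponds to Java 8 class files.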

### 2. Memory Module (`src/memory.rs`)
**Purpose**: Manage JVM memory including heap, stack, and runtime values

**Key Structures**:
- `Heap`: Object storage and allocation
- `StackFrame`: Execution context for methods
- `Value`: Type-safe representation of JVM values
- `Object`: Runtime object representation
- `Array`: Array storage (planned)

**Responsibilities**:
- Allocate and deallocate memory
- Manage method call stack
- Handle value conversions
- Implement object model
- Provide garbage collection

### 2a. Allocator Module (`src/allocator.rs`)
**Purpose**: Arena-based allocation for cache locality and reduced fragmentation

**Key Structures**:
- `ArenaAllocator`: Slot-based arena for heap objects (contiguous storage)
- `ArrayArena`: Arena for array storage

**Benefits**: Cache-friendly iteration during GC, O(1) allocation with free-list reuse
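
To illustrate the slot-plus-free-list idea, here is a minimal generic arena. It is a sketch under assumptions: the type `Arena` and its methods are invented for this example, and the real `ArenaAllocator` may store objects and track free slots differently.

```rust
/// Slot-based arena: values live in one contiguous Vec and freed
/// slots are recycled through a free list.
pub struct Arena<T> {
    slots: Vec<Option<T>>,
    free_list: Vec<usize>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { slots: Vec::new(), free_list: Vec::new() }
    }

    /// Reuse a freed slot if one exists, otherwise append a new slot.
    /// Returns the slot index, which acts as the object handle.
    pub fn alloc(&mut self, value: T) -> usize {
        if let Some(idx) = self.free_list.pop() {
            self.slots[idx] = Some(value);
            idx
        } else {
            self.slots.push(Some(value));
            self.slots.len() - 1
        }
    }

    /// Release a slot and return its value if the slot was live.
    pub fn free(&mut self, idx: usize) -> Option<T> {
        let old = self.slots.get_mut(idx)?.take();
        if old.is_some() {
            self.free_list.push(idx);
        }
        old
    }

    /// Look up a live value by handle.
    pub fn get(&self, idx: usize) -> Option<&T> {
        self.slots.get(idx)?.as_ref()
    }

    /// Walk live values in slot order, as a GC sweep might.
    pub fn iter_live(&self) -> impl Iterator<Item = (usize, &T)> {
        self.slots
            .iter()
            .enumerate()
            .filter_map(|(i, slot)| slot.as_ref().map(|v| (i, v)))
    }
}
```

Because handles are plain indices, a freed index is handed out again by the next `alloc`, which keeps the backing vector compact at the cost of requiring callers not to hold stale handles.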

### 2b. GC Module (`src/gc.rs`)
**Purpose**: Advanced garbage collection with generational and ownership-based strategies

**Key Structures**:
- `GenerationalHeap`: Young/old generation heap with promotion
- `ScopedRoot`: RAII-based root set management
- `Generation`: Young vs Old object classification

**Features**:
- Generational GC: Minor GC (young only), Major GC (full heap)
- Parallel sweep using rayon
- Pauseless root tracking via Rust ownership (ScopedRoot)
- Promotion threshold for survivor aging
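
The RAII approach to root tracking can be sketched as a guard that registers a root on creation and removes it on drop. The `RootSet` type, its `root` and `snapshot` methods, and the single-threaded `Rc<RefCell<..>>` storage are assumptions made for this example; the actual `ScopedRoot` in `src/gc.rs` may be structured differently.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared root set: heap handles the collector must treat as live.
#[derive(Clone, Default)]
pub struct RootSet {
    roots: Rc<RefCell<Vec<usize>>>,
}

impl RootSet {
    /// Register `object` as a root for as long as the guard is alive.
    pub fn root(&self, object: usize) -> ScopedRoot {
        self.roots.borrow_mut().push(object);
        ScopedRoot { set: self.clone(), object }
    }

    /// Copy of the current roots, e.g. as the starting point of a mark phase.
    pub fn snapshot(&self) -> Vec<usize> {
        self.roots.borrow().clone()
    }
}

/// Guard that unregisters its root when it goes out of scope.
pub struct ScopedRoot {
    set: RootSet,
    object: usize,
}

impl Drop for ScopedRoot {
    fn drop(&mut self) {
        let mut roots = self.set.roots.borrow_mut();
        // Remove the most recent registration of this handle.
        if let Some(pos) = roots.iter().rposition(|&r| r == self.object) {
            roots.remove(pos);
        }
    }
}
```

With this shape, the set of roots always mirrors the guards currently in scope, so no explicit unregister call can be forgotten.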

### 3a. FFI Module (`src/ffi.rs`) - feature `ffi`
**Purpose**: C API for embedding jvmrs in C/C++ applications

**Key Functions**:
- `jvmrs_create`, `jvmrs_create_with_classpath`: Create interpreter
- `jvmrs_load_class`, `jvmrs_run_main`: Execute Java code
- `jvmrs_destroy`: Clean up

### 3b. Interop Module (`src/interop.rs`) - feature `interop`
**Purpose**: Polyglot Java/Rust interop - share objects between runtimes

**Key Structures**:
- `SharedObjectId`, `JavaObjectBridge`: Bridge for Java objects in Rust
- `register_rust_callback`, `invoke_rust_callback`: Rust callbacks from Java native methods

### 3c. Async I/O Module (`src/async_io.rs`) - feature `async`
**Purpose**: Async class loading with tokio

**Key Structures**:
- `AsyncClassLoader` trait, `TokioClassLoader`: Load classes asynchronously

### 3d. SIMD Module (`src/simd.rs`) - feature `simd`
**Purpose**: Vectorized array operations (AVX2/AVX on x86)

**Key Functions**:
- `heap_array_copy_int`, `heap_array_copy_float`: SIMD-accelerated array copy

### 3e. Truffle Module (`src/truffle.rs`) - feature `truffle`
**Purpose**: GraalVM-style API for pluggable language implementations

**Key Structures**:
- `TruffleNode` trait: Executable AST nodes
- `TruffleFrame`: Execution context
- `LanguageFrontend` trait: Language parser/frontend

### 3f. Core Module (`src/core.rs`) - feature `no_std`
**Purpose**: Minimal types for embedded/no_std targets

### 3g. JIT Module (`src/jit.rs`)
**Purpose**: JIT compilation, tiered compilation, AOT

**Key Structures**:
- `JitManager`: Orchestrates JIT compilation and method lookup
- `CraneliftJitCompiler`: Compiles hot methods to native code
- `AotCompiler`: AOT compilation to `.o` files
- `MethodProfile`, `TieredCompilationConfig`: Tiered compilation

**Fallback**: Compiled code with `code_size == 0` is a placeholder, produced when the method contains unsupported bytecode or the backend is unavailable. In that case execution falls through to the interpreter rather than calling the no-op placeholder.
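
The fallback rule can be illustrated with a small dispatch function. This is a sketch only: `CompiledCode`, `Executed`, and `invoke` are hypothetical stand-ins, and the `fn(i32) -> i32` entry point is a simplification of whatever calling convention the JIT actually uses.

```rust
/// Hypothetical record for a compiled method.
pub struct CompiledCode {
    /// Zero means the entry is a placeholder, not real native code.
    pub code_size: usize,
    pub entry: fn(i32) -> i32,
}

/// Which path ran, together with its result.
#[derive(Debug, PartialEq)]
pub enum Executed {
    Native(i32),
    Interpreted(i32),
}

/// Call native code only when it is real; otherwise use the interpreter.
pub fn invoke(
    compiled: Option<&CompiledCode>,
    interpret: impl Fn(i32) -> i32,
    arg: i32,
) -> Executed {
    match compiled {
        Some(code) if code.code_size > 0 => Executed::Native((code.entry)(arg)),
        // Not compiled, or compiled to a zero-size placeholder.
        _ => Executed::Interpreted(interpret(arg)),
    }
}
```

Checking `code_size` at the call site means a placeholder entry is never invoked, so an unsupported method behaves exactly as if it had never been compiled.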

### 3h. Cranelift JIT Backend (`src/cranelift_jit.rs`)
**Purpose**: Bytecode-to-native code generation via cranelift-jit

**Key Structures**:
- `CraneliftJitBackend`: JIT module with FFI helpers
- FFI: `jvmrs_frame_get_local_int`, `jvmrs_frame_push_int`

**Supported bytecode**: bipush, iload_0..3, iadd, ireturn

### 3i. AOT Compiler (`src/aot_compiler.rs`)
**Purpose**: Ahead-of-time compilation to native object files

**Key Functions**:
- `compile_class_to_object()`: Emits `.o` via cranelift-object

### 3j. WASM Backend (`src/wasm_backend.rs`) - feature `wasm`
**Purpose**: Emit WebAssembly from JVM bytecode

**Key Structures**:
- `WasmGenerator`: Translates bytecode to WASM instructions

### 3k. JNI Module (`src/jni.rs`)
**Purpose**: Java Native Interface for registering and invoking native methods

**Key Structures**:
- `JNIEnv`: JNI environment, `jobject`, `jclass` handles
- `register_natives()`, `find_native()`, `unregister_natives()`: Native method registration

### 3l. Annotations Module (`src/annotations.rs`)
**Purpose**: Parse and query class file annotations (RuntimeVisibleAnnotations, RuntimeInvisibleAnnotations)

**Key Structures**:
- `Annotation`, `ElementValue`: Parsed annotation representation
- `get_annotations()`: Retrieve annotations on class, method, field

### 3m. Serialization Module (`src/serialization.rs`)
**Purpose**: Binary serialization for `HeapObject` and `Value` (JVRS format)

**Key Functions**:
- `serialize_object`, `deserialize_object`: Binary format (magic JVRS, version, class name, fields, string_data)
- `serialize_value`, `deserialize_value`: Type-tagged serialization for all Value variants
- `is_serializable_class(class)`: Check if class implements `java/io/Serializable`
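
Type-tagged serialization of values can be sketched as one tag byte followed by a fixed-width big-endian payload. The tag numbers, the reduced three-variant `Value` enum, and the function signatures below are assumptions for illustration; they do not describe the actual JVRS byte layout.

```rust
use std::convert::TryInto;

/// Reduced value type for this sketch (the real `Value` has more variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int(i32),
    Long(i64),
}

// Hypothetical tag bytes, chosen only for this example.
const TAG_NULL: u8 = 0;
const TAG_INT: u8 = 1;
const TAG_LONG: u8 = 2;

/// Append one tag byte, then the payload in big-endian order.
pub fn serialize_value(value: &Value, out: &mut Vec<u8>) {
    match value {
        Value::Null => out.push(TAG_NULL),
        Value::Int(i) => {
            out.push(TAG_INT);
            out.extend_from_slice(&i.to_be_bytes());
        }
        Value::Long(l) => {
            out.push(TAG_LONG);
            out.extend_from_slice(&l.to_be_bytes());
        }
    }
}

/// Decode one value from the front of `bytes`.
/// Returns the value and the number of bytes consumed, or None on bad input.
pub fn deserialize_value(bytes: &[u8]) -> Option<(Value, usize)> {
    let (&tag, rest) = bytes.split_first()?;
    match tag {
        TAG_NULL => Some((Value::Null, 1)),
        TAG_INT => {
            let raw: [u8; 4] = rest.get(..4)?.try_into().ok()?;
            Some((Value::Int(i32::from_be_bytes(raw)), 5))
        }
        TAG_LONG => {
            let raw: [u8; 8] = rest.get(..8)?.try_into().ok()?;
            Some((Value::Long(i64::from_be_bytes(raw)), 9))
        }
        _ => None,
    }
}
```

Returning the consumed length lets a caller decode a sequence of values, such as an object's fields, from a single buffer.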

### 3n. Visualization Module (`src/visualization.rs`)
**Purpose**: Debug/observability tools for heap and stack

**Key Functions**:
- `heap_dump_ascii()`, `memory_dump_ascii()`: ASCII dumps of heap and full memory
- `frame_dump_ascii()`: Per-frame dump of locals and operand stack
- `export_html_fragment()`: HTML export for web-based inspection

### 3o. Extensions Module (`src/extensions.rs`)
**Purpose**: Plugin system for custom Java capabilities

**Key Structures**:
- `JavaExtension` trait, `ExtensionRegistry`: Register and invoke extensions
- `ExtensionRegistry::global()`: Static registry; `load()` merges extension natives
- Doctest example demonstrates registration and invocation
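
A registry that merges natives contributed by extensions might look like the sketch below. The trait methods, the `NativeFn` signature, and the `"Extension.method"` key format are all assumptions for this example; the real `JavaExtension` trait and `ExtensionRegistry` API are likely to differ, and the global static registry is left out.

```rust
use std::collections::HashMap;

/// Hypothetical native signature: integer arguments in, integer out.
pub type NativeFn = fn(&[i32]) -> i32;

/// Hypothetical shape of an extension: a name plus the natives it provides.
pub trait JavaExtension {
    fn name(&self) -> &str;
    fn natives(&self) -> Vec<(String, NativeFn)>;
}

#[derive(Default)]
pub struct ExtensionRegistry {
    natives: HashMap<String, NativeFn>,
}

impl ExtensionRegistry {
    /// Merge an extension's natives under "<extension>.<method>" keys.
    pub fn load(&mut self, ext: &dyn JavaExtension) {
        for (method, f) in ext.natives() {
            self.natives.insert(format!("{}.{}", ext.name(), method), f);
        }
    }

    /// Invoke a registered native, or return None if the key is unknown.
    pub fn invoke(&self, key: &str, args: &[i32]) -> Option<i32> {
        self.natives.get(key).map(|f| f(args))
    }
}

/// Example extension exposing a single native method.
pub struct MathExt;

fn sum(args: &[i32]) -> i32 {
    args.iter().sum()
}

impl JavaExtension for MathExt {
    fn name(&self) -> &str {
        "MathExt"
    }
    fn natives(&self) -> Vec<(String, NativeFn)> {
        vec![("sum".to_string(), sum as NativeFn)]
    }
}
```

After `load(&MathExt)`, a lookup of `"MathExt.sum"` dispatches to `sum`, while unknown keys return `None` instead of panicking.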

### 3p. AOP Module (`src/aop.rs`)
**Purpose**: Aspect-oriented proxy creation for cross-cutting concerns

**Key Structures**:
- `Proxy`, `create_proxy()`: Dynamic proxy creation

### 4. Interpreter Module (`src/interpreter.rs`)
**Purpose**: Execute Java bytecode instructions

**Key Structures**:
- `Interpreter`: Main execution engine
- `RuntimeState`: Current execution context
- `ClassLoader`: Load and resolve classes
- `MethodArea`: Store loaded classes

**Reflection API** (real implementations, no placeholders):
- `new_instance(class_name, args)`: Allocate on heap, invoke default constructor
- `get_field_value`, `set_field_value`: Heap-backed field access
- `invoke_method(obj, method_name, args)`: Delegates to `execute_method`
- `get_object_class(obj)`: Returns object's class name from heap
- `memory_mut()`: Mutable memory access for allocation

**Responsibilities**:
- Dispatch bytecode instructions
- Manage class loading and resolution
- Handle method invocation
- Implement control flow
- Provide runtime services
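
Instruction dispatch is typically a fetch-decode loop with one match arm per opcode. The following sketch covers only `bipush`, `iload_0`, `iload_1`, `iadd`, and `ireturn`, using the opcode values from the JVM specification. The `run` function and `ExecError` type are hypothetical and much simpler than the real `Interpreter`.

```rust
// Opcode values as defined by the JVM specification.
const BIPUSH: u8 = 0x10;
const ILOAD_0: u8 = 0x1a;
const ILOAD_1: u8 = 0x1b;
const IADD: u8 = 0x60;
const IRETURN: u8 = 0xac;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ExecError {
    StackUnderflow,
    BadLocal(usize),
    UnknownOpcode(u8),
    /// Code ended mid-instruction or without a return.
    Truncated,
}

/// Execute a tiny int-only method body and return its `ireturn` value.
pub fn run(code: &[u8], locals: &[i32]) -> Result<i32, ExecError> {
    let mut stack: Vec<i32> = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        pc += 1;
        match op {
            BIPUSH => {
                // The operand is a signed byte, sign-extended to int.
                let byte = *code.get(pc).ok_or(ExecError::Truncated)?;
                pc += 1;
                stack.push(byte as i8 as i32);
            }
            ILOAD_0 | ILOAD_1 => {
                let idx = (op - ILOAD_0) as usize;
                stack.push(*locals.get(idx).ok_or(ExecError::BadLocal(idx))?);
            }
            IADD => {
                let b = stack.pop().ok_or(ExecError::StackUnderflow)?;
                let a = stack.pop().ok_or(ExecError::StackUnderflow)?;
                // JVM int addition wraps on overflow.
                stack.push(a.wrapping_add(b));
            }
            IRETURN => return stack.pop().ok_or(ExecError::StackUnderflow),
            other => return Err(ExecError::UnknownOpcode(other)),
        }
    }
    Err(ExecError::Truncated)
}
```

For example, the byte sequence `1a 10 05 60 ac` (`iload_0; bipush 5; iadd; ireturn`) with local 0 set to 37 returns 42. Supporting a further opcode amounts to adding one more match arm.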

## Data Flow

1. **Class Loading Phase**:
   ```
   File System → ClassFile Parser → Constant Pool → Method/Field Info → ClassLoader
   ```

2. **Execution Phase**:
   ```
   main() → Interpreter → StackFrame → Instruction Dispatch → Memory Operations → Result
   ```

3. **Memory Management**:
   ```
   Allocation Request → Heap Manager → Object Creation → Reference Tracking → (GC) → Deallocation
   ```

## Key Design Decisions

### 1. Rust-Centric Design
- Leverage Rust's ownership system for memory safety
- Use enums for type-safe value representation
- Implement error handling with `Result` types
- Use traits for extensibility

### 2. Simplified JVM Model
- Start with core JVM features
- Gradually add complexity
- Focus on correctness over performance initially
- Clear separation between specification and implementation

### 3. Modular Architecture
- Each module has well-defined responsibilities
- Minimal dependencies between modules
- Clear interfaces for testing
- Easy to extend or replace components

## Memory Layout

### Heap Organization
```
┌─────────────────┐
│   Young Gen     │  (planned)
│   (Eden/Surv)   │
├─────────────────┤
│   Old Gen       │  (planned)
│   (Tenured)     │
├─────────────────┤
│   Perm Gen      │  (planned)
│   (Metaspace)   │
└─────────────────┘
```

### Stack Frame Layout
```
┌─────────────────┐
│   Local Vars    │  [0..n]
├─────────────────┤
│   Operand Stack │  [0..m]
├─────────────────┤
│   Frame Data    │  (return address, etc.)
└─────────────────┘
```
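
The frame layout above maps naturally onto a struct with a fixed-size local variable array and a bounded operand stack. This is a minimal sketch: the two-variant `Value` enum, the field names, and the string error messages are assumptions for illustration and do not mirror the definitions in `src/memory.rs`.

```rust
/// Reduced value type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Int(i32),
    /// Heap handle, or None for the null reference.
    Reference(Option<usize>),
}

pub struct StackFrame {
    locals: Vec<Value>,
    operand_stack: Vec<Value>,
    max_stack: usize,
    /// Frame data: where the caller resumes after this method returns.
    pub return_pc: usize,
}

impl StackFrame {
    /// `max_locals` and `max_stack` come from the method's Code attribute.
    pub fn new(max_locals: usize, max_stack: usize, return_pc: usize) -> Self {
        StackFrame {
            locals: vec![Value::Int(0); max_locals],
            operand_stack: Vec::with_capacity(max_stack),
            max_stack,
            return_pc,
        }
    }

    /// Push with a bounds check against the declared maximum depth.
    pub fn push(&mut self, value: Value) -> Result<(), &'static str> {
        if self.operand_stack.len() >= self.max_stack {
            return Err("operand stack overflow");
        }
        self.operand_stack.push(value);
        Ok(())
    }

    pub fn pop(&mut self) -> Result<Value, &'static str> {
        self.operand_stack.pop().ok_or("operand stack underflow")
    }

    pub fn store(&mut self, index: usize, value: Value) -> Result<(), &'static str> {
        let slot = self.locals.get_mut(index).ok_or("local index out of range")?;
        *slot = value;
        Ok(())
    }

    pub fn load(&self, index: usize) -> Result<Value, &'static str> {
        self.locals.get(index).copied().ok_or("local index out of range")
    }
}
```

Returning `Result` from every access turns malformed bytecode into a recoverable error rather than a panic.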

## Instruction Set Architecture

### Current Support
- **Constants**: iconst, bipush, sipush, ldc, ldc_w, ldc2_w
- **Loads/Stores**: iload, iload_0..3, istore, istore_0..3, aload, aload_0..3, astore, astore_0..3
- **Arithmetic**: iadd, isub, imul, idiv, irem
- **Stack**: dup, dup_x1, pop, swap, iinc
- **Control**: ifeq, ifne, iflt, ifge, ifgt, ifle, if_icmpeq..le, goto, return
- **Fields**: getstatic, putstatic, getfield, putfield
- **Objects**: new, invokespecial
- **Arrays**: newarray, anewarray, arraylength, iaload, iastore, aaload, aastore
- **Sync**: monitorenter, monitorexit
- **Invocation**: invokevirtual, invokestatic, invokespecial, invokedynamic

### Planned Extensions
- Object creation and manipulation
- Array operations
- Exception handling
- ~~Synchronization~~ (monitorenter/monitorexit implemented)
- Type checking

## Performance Considerations

### Current
- Simple interpreter loop
- Direct method dispatch
- Basic memory management

### Future Optimizations
- Broader JIT coverage (the Cranelift backend currently compiles only bipush, iload_0..3, iadd, ireturn)
- Inline caching
- Escape analysis
- Memory pooling
- Parallel garbage collection

## Security Model

### Current
- Basic class file validation
- Type checking during execution
- Stack bounds checking

### Planned
- Bytecode verification
- Access control
- Sandboxing
- Resource limits

## Testing Strategy

### Unit Tests (68+ tests in `src/tests.rs`)
- Class file parsing, class loader, descriptors, heap, memory
- Interpreter opcodes (stack, arithmetic, conversions, arrays)
- Serialization (Value, HeapObject, string_data)
- Reflection (new_instance, get/set_field_value, get_object_class)
- Error handling, visualization, extensions, AOP

### Integration Tests
- `run_main("Minimal")`, `run_main("TestGetStatic")`, `run_main("TestLdc")` with example classes
- ClassLoader loading from examples directory

### Doctests
- Extensions module: JavaExtension registration and native invocation

### Property-Based Tests
- Generate random class files
- Test parser robustness
- Verify execution correctness

## Extension Points

### 1. New Opcode Support
- Add new match arm in interpreter
- Implement required memory operations
- Update documentation

### 2. Memory Management
- Implement different GC algorithms
- Add memory profiling
- Support custom allocators

### 3. Class Loading
- Support custom class loaders
- Add bytecode transformation
- Implement dynamic class generation

## Dependencies

### Core
- `byteorder`: Class file binary reading
- `log`, `env_logger`: Logging
- `rayon`: Parallel GC sweep

### Compilation
- `cranelift`, `cranelift-jit`, `cranelift-module`, `cranelift-object`, `cranelift-codegen`, `cranelift-frontend`, `cranelift-native`: JIT and AOT
- `inkwell` (optional, `llvm`): LLVM IR export
- `wasm-encoder` (optional, `wasm`): WebAssembly emission

### Optional
- `criterion`: Benchmarking

## Development Guidelines

### Code Style
- Follow Rust conventions
- Use meaningful names
- Document public APIs
- Write comprehensive tests

### Error Handling
- Use custom error types
- Provide context in error messages
- Handle all possible error cases
- Log errors appropriately

### Performance
- Profile before optimizing
- Use appropriate data structures
- Minimize allocations in hot paths
- Consider cache locality

## Future Architecture Directions

### 1. Tiered Compilation (Implemented)
- Interpreter for cold code
- Baseline JIT for warm code (after threshold invocations)
- Optimized JIT for hot code
- `JitManager`, `MethodProfile`, `CompilationLevel`, `TieredCompilationConfig` in `src/jit.rs`
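
The tier selection described above can be sketched as an invocation counter compared against two thresholds. The type names are borrowed from the list above, but the fields, the variant names, and `record_invocation` are assumptions for this example and may not match `src/jit.rs`.

```rust
/// Tiers in increasing order of optimization.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CompilationLevel {
    Interpreter,
    Baseline,
    Optimized,
}

/// Hypothetical configuration: invocation counts at which a method is promoted.
pub struct TieredCompilationConfig {
    pub baseline_threshold: u64,
    pub optimized_threshold: u64,
}

/// Hypothetical per-method profile holding only an invocation counter.
#[derive(Default)]
pub struct MethodProfile {
    pub invocations: u64,
}

impl MethodProfile {
    /// Count one call and report the tier the method now qualifies for.
    pub fn record_invocation(&mut self, cfg: &TieredCompilationConfig) -> CompilationLevel {
        self.invocations += 1;
        if self.invocations >= cfg.optimized_threshold {
            CompilationLevel::Optimized
        } else if self.invocations >= cfg.baseline_threshold {
            CompilationLevel::Baseline
        } else {
            CompilationLevel::Interpreter
        }
    }
}
```

A manager would compare the returned level with the method's current level and trigger compilation only when it increases.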

### 2. Compilation Backends
- **Cranelift JIT**: `cranelift_jit::CraneliftJitBackend` – bytecode-to-native (bipush, iload_0..3, iadd, ireturn)
- **AOT**: `aot_compiler::compile_class_to_object` – emits `.o` via cranelift-object
- **LLVM IR** (`--features llvm`): Bytecode-to-IR translation
- **WebAssembly** (`--features wasm`): `wasm_backend::WasmGenerator` emits WASM from bytecode

### 3. Multi-threading Support
- Thread-local allocation buffers
- Concurrent garbage collection
- Synchronization primitives

### 4. Cross-Platform
- WebAssembly backend (`cargo build --features wasm`)
- Embedded systems support
- Mobile platform compatibility

### 5. Tooling Integration
- Debugger interface
- Profiling hooks
- Monitoring APIs
- Management console