readmdict 0.1.0

A Rust implementation for reading MDict dictionary files (.mdx format)
# rust-readmdict
A port of https://github.com/ffreemt/readmdict

A Rust implementation for reading MDict dictionary files (.mdx format).

## Usage

### Basic Usage

To open an MDX file and display basic information:

```bash
cargo run example_resources/webster.mdx
```

Output:
```
Successfully opened MDX file: example_resources/webster.mdx
Number of entries: 109353
```

### List Keys

To list the first 10 keys from the dictionary:

```bash
cargo run example_resources/webster.mdx --list-keys
```

Output:
```
Successfully opened MDX file: example_resources/webster.mdx
Number of entries: 109353

Keys:
1: 12 a.m
2: 12 midnight
3: 12 p.m.
4: 20/20
5: 20/20 hindsight
6: 20 hindsight
7: .22
8: .22s
9: 24-7
10: 24/7
... and 109343 more
```

### List Keys Since a Word

To list keys that are alphabetically equal to or greater than a specific word:

```bash
cargo run example_resources/webster.mdx --list-keys-since "apple"
```

Output:
```
Successfully opened MDX file: example_resources/webster.mdx
Number of entries: 109353

Keys since 'apple':
1: apple
2: apple cheeked
3: apple of someone's eye
4: apple pie
5: apple pies
6: apple polisher
7: apple polishers
8: apple-cheeked
9: apples
10: applesauce
... and 99401 more
```
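
The selection behind `--list-keys-since` can be sketched as a simple filter. This is a minimal illustration, not the CLI's actual code: the `keys_since` helper is hypothetical, and it assumes keys are compared byte-wise after ASCII lowercasing, which may differ from the collation the tool really uses.

```rust
/// Return the keys that compare equal to or after `start`, ignoring ASCII case.
/// Assumption: plain byte-wise ordering after lowercasing; the real tool may
/// apply a different collation.
fn keys_since<'a>(keys: &[&'a str], start: &str) -> Vec<&'a str> {
    let start = start.to_ascii_lowercase();
    keys.iter()
        .copied()
        // Keep every key whose lowercased form sorts at or after `start`.
        .filter(|k| k.to_ascii_lowercase() >= start)
        .collect()
}
```

For example, filtering `["20/20", "apple", "Apple pie", "banana", "ant"]` with `"apple"` keeps `apple`, `Apple pie`, and `banana`, preserving the input order.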

### Look Up a Word and Show Its Content

```bash
# Look up the definition of "apple"
cargo run example_resources/webster.mdx --lookup apple

# Look up a resource file in MDD
cargo run resources.mdd --lookup "image.png"
```

Example output:
```
Successfully opened MDX file: example_resources/webster.mdx
Number of entries: 109353

Looking up 'apple':
Definition:
<div class="entry">...[HTML content with definition]...</div>
```

## Features

- Read MDX dictionary files
- Extract header information and metadata
- Parse and list dictionary keys
- List keys alphabetically from a specific starting word
- Look up words and display their content from MDX files
- Look up resources and display their content from MDD files
- Support for compressed key blocks (zlib)
- Handle different MDX versions (1.x and 2.x)

## Building

```bash
cargo build --release
```

## Implementation Details

This is a Rust port of the Python readmdict library. The implementation follows a simplified file structure that closely mirrors the original Python codebase.

#### File Structure Mapping

| Python File | Rust File | Purpose |
|-------------|-----------|----------|
| `readmdict/__main__.py` | `src/main.rs` | CLI entry point and argument parsing |
| `readmdict/readmdict.py` | `src/readmdict.rs` | Core library with all classes (MDict, MDX, MDD) |
| `readmdict/pureSalsa20.py` | Use `salsa20` crate | Salsa20 encryption (external crate) |
| `readmdict/ripemd128.py` | Use `ripemd` crate | RIPEMD128 hashing (external crate) |
| N/A | `src/lib.rs` | Library entry point (re-exports from readmdict.rs) |

##### Core Classes

| Python Class/Function | Rust Equivalent | Location |
|----------------------|-----------------|----------|
| `MDict` (base class) | `struct MDict` | `src/readmdict.rs` |
| `MDX` (inherits MDict) | `struct Mdx` | `src/readmdict.rs` |
| `MDD` (inherits MDict) | `struct Mdd` | `src/readmdict.rs` |

##### Method-to-Method Mapping

**Utility Functions:**
| Python Function | Rust Function | Location |
|-----------------|---------------|----------|
| `_unescape_entities(text)` | `unescape_entities(text: &[u8]) -> Vec<u8>` | `src/readmdict.rs` |
| `_fast_decrypt(data, key)` | `fast_decrypt(data: &[u8], key: &[u8]) -> Vec<u8>` | `src/readmdict.rs` |
| `_mdx_decrypt(comp_block)` | `mdx_decrypt(comp_block: &[u8]) -> Result<Vec<u8>>` | `src/readmdict.rs` |
| `_salsa_decrypt(ciphertext, key)` | `salsa_decrypt(ciphertext: &[u8], key: &[u8]) -> Result<Vec<u8>>` | `src/readmdict.rs` |
| `_decrypt_regcode_by_deviceid(regcode, deviceid)` | `decrypt_regcode_by_deviceid(regcode: &[u8], deviceid: &[u8]) -> Result<Vec<u8>>` | `src/readmdict.rs` |
| `_decrypt_regcode_by_email(regcode, email)` | `decrypt_regcode_by_email(regcode: &[u8], email: &[u8]) -> Result<Vec<u8>>` | `src/readmdict.rs` |
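
As an illustration of how one of these utilities could translate, here is a minimal sketch of `fast_decrypt`. It assumes the Python `_fast_decrypt` swaps the nibbles of each byte and then XORs the result with the previous ciphertext byte (starting from `0x36`), the low byte of the index, and the cycling key; treat it as a reading of the original rather than a verified port.

```rust
/// Sketch of the assumed `_fast_decrypt` scheme: nibble swap, then XOR with
/// the previous ciphertext byte, the byte index, and the cycling key.
fn fast_decrypt(data: &[u8], key: &[u8]) -> Vec<u8> {
    assert!(!key.is_empty(), "key must not be empty");
    let mut previous: u8 = 0x36; // initial chaining value (assumed)
    data.iter()
        .enumerate()
        .map(|(i, &byte)| {
            let swapped = byte.rotate_left(4); // swap high and low nibbles
            // `i as u8` keeps only the low byte of the index.
            let plain = swapped ^ previous ^ (i as u8) ^ key[i % key.len()];
            previous = byte; // chain on the ciphertext byte, not the plaintext
            plain
        })
        .collect()
}
```

Because each output byte depends only on the current and previous input bytes, the function can run in a single pass without a scratch buffer.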

**MDict Class Methods:**
| Python Method | Rust Method | Purpose |
|---------------|-------------|----------|
| `__init__(fname, encoding, passcode)` | `new(fname: &str, encoding: Option<String>, passcode: Option<Passcode>) -> Result<Self>` | Constructor |
| `__len__()` | `len(&self) -> usize` | Get number of entries |
| `__iter__()` | `keys(&self) -> impl Iterator<Item = &[u8]>` | Iterator over keys |
| `keys()` | `keys(&self) -> impl Iterator<Item = &[u8]>` | Get dictionary keys |
| `_read_number(f)` | `read_number<R: Read>(&self, reader: &mut R) -> Result<u64>` | Read number from file |
| `_parse_header(header)` | `parse_header(header: &[u8]) -> Result<HashMap<String, String>>` | Parse header attributes |
| `_decode_key_block_info(data)` | `decode_key_block_info(&self, data: &[u8]) -> Result<Vec<(u64, u64)>>` | Decode key block info |
| `_decode_key_block(data, info)` | `decode_key_block(&self, data: &[u8], info: &[(u64, u64)]) -> Result<Vec<(u64, Vec<u8>)>>` | Decode key block |
| `_split_key_block(data)` | `split_key_block(&self, data: &[u8]) -> Result<Vec<(u64, Vec<u8>)>>` | Split key block into entries |
| `_read_header()` | `read_header(&mut self) -> Result<HashMap<String, String>>` | Read and parse file header |
| `_read_keys()` | `read_keys(&mut self) -> Result<Vec<(u64, Vec<u8>)>>` | Read key blocks |
| `_read_keys_brutal()` | `read_keys_brutal(&mut self) -> Result<Vec<(u64, Vec<u8>)>>` | Fallback key reading method |
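
The version-dependent width handled by `read_number` can be sketched with the standard library alone. This free-function form takes the version as a parameter instead of reading it from `self`, and it assumes that formats below 2.0 store numbers as 4-byte big-endian integers and later formats as 8-byte ones.

```rust
use std::io::{self, Read};

/// Read one big-endian number field.
/// Assumption: 4 bytes for format versions below 2.0, 8 bytes otherwise.
fn read_number<R: Read>(reader: &mut R, version: f32) -> io::Result<u64> {
    if version < 2.0 {
        let mut buf = [0u8; 4];
        reader.read_exact(&mut buf)?; // fails on a truncated file
        Ok(u64::from(u32::from_be_bytes(buf)))
    } else {
        let mut buf = [0u8; 8];
        reader.read_exact(&mut buf)?;
        Ok(u64::from_be_bytes(buf))
    }
}
```

Widening both cases to `u64` lets the rest of the parser ignore the on-disk width.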

**MDX Class Methods:**
| Python Method | Rust Method | Purpose |
|---------------|-------------|----------|
| `__init__(fname, encoding, substyle, passcode)` | `new(fname: &str, encoding: Option<String>, substyle: bool, passcode: Option<Passcode>) -> Result<Self>` | Constructor |
| `items()` | `items(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>>` | Iterator over key-value pairs |
| `_substitute_stylesheet(txt)` | `substitute_stylesheet(&self, txt: &str) -> String` | Apply stylesheet substitution |
| `_decode_record_block()` | `decode_record_block(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>>` | Decode record blocks |

**MDD Class Methods:**
| Python Method | Rust Method | Purpose |
|---------------|-------------|----------|
| `__init__(fname, passcode)` | `new(fname: &str, passcode: Option<Passcode>) -> Result<Self>` | Constructor |
| `items()` | `items(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>>` | Iterator over filename-content pairs |
| `_decode_record_block()` | `decode_record_block(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>>` | Decode record blocks |

#### Implementation Checklist

- [ ] 1. Create basic project structure (`src/lib.rs`, `src/main.rs`)
- [ ] 2. Implement core readmdict module (`src/readmdict.rs`) containing:
  - [ ] 2.1. Utility functions (`unescape_entities`, etc.)
  - [ ] 2.2. Crypto functions (`fast_decrypt`, `mdx_decrypt`, `salsa_decrypt`, etc.)
  - [ ] 2.3. Base `MDict` struct with all methods
  - [ ] 2.4. `Mdx` struct wrapping `MDict` (composition, since Rust has no inheritance)
  - [ ] 2.5. `Mdd` struct wrapping `MDict` (composition, since Rust has no inheritance)
- [ ] 3. Implement CLI interface (`src/main.rs`) matching `__main__.py`
- [ ] 4. Update `src/lib.rs` to re-export from `readmdict.rs`
- [ ] 5. Add error handling and comprehensive tests
- [ ] 6. Add documentation and usage examples
- [ ] 7. Performance optimization and benchmarking

#### Detailed Structure Plan

**src/readmdict.rs** (single file containing everything from readmdict.py):
```rust
// Imports and dependencies
use std::collections::HashMap;
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, BufReader, Cursor};
use std::path::Path;
use byteorder::{BigEndian, LittleEndian, ReadBytesExt};
use flate2::read::ZlibDecoder;
use regex::bytes::Regex;
use encoding_rs::Encoding;
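// The cipher traits live in the re-exported `cipher` module (salsa20 0.10).
use salsa20::cipher::{KeyIvInit, StreamCipher};
// The Python original is believed to run Salsa20 with rounds=8, which corresponds to `Salsa8`.
use salsa20::Salsa8;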
use ripemd::{Ripemd128, Digest};
use sha2::Sha256;
use adler::adler32;

// Error types
#[derive(Debug, thiserror::Error)]
pub enum Error {
    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),
    #[error("Invalid file format: {0}")]
    InvalidFormat(String),
    #[error("Unsupported compression type")]
    UnsupportedCompression,
    #[error("Encryption error: {0}")]
    Encryption(String),
    #[error("Invalid passcode")]
    InvalidPasscode,
    #[error("Checksum mismatch")]
    ChecksumMismatch,
    #[error("Encoding error: {0}")]
    Encoding(String),
    #[error("Parse error: {0}")]
    Parse(String),
}

pub type Result<T> = std::result::Result<T, Error>;

// Utility functions (direct ports from Python)
fn unescape_entities(text: &[u8]) -> Vec<u8> {
    // Convert HTML entities like &lt; &gt; &amp; &quot; back to < > & "
    // Implementation matches Python _unescape_entities
}

fn fast_decrypt(data: &[u8], key: &[u8]) -> Vec<u8> {
    // Nibble-swap each byte, then XOR with the previous ciphertext byte,
    // the byte index, and the cycling key
    // Direct port of Python _fast_decrypt
}

fn mdx_decrypt(comp_block: &[u8]) -> Result<Vec<u8>> {
    // MDX-specific decryption algorithm
    // Direct port of Python _mdx_decrypt
}

fn salsa_decrypt(ciphertext: &[u8], key: &[u8]) -> Result<Vec<u8>> {
    // Salsa20 decryption using external crate
    // Direct port of Python _salsa_decrypt
}

fn decrypt_regcode_by_deviceid(regcode: &[u8], deviceid: &[u8]) -> Result<Vec<u8>> {
    // Device ID based decryption
    // Direct port of Python _decrypt_regcode_by_deviceid
}

fn decrypt_regcode_by_email(regcode: &[u8], email: &[u8]) -> Result<Vec<u8>> {
    // Email based decryption
    // Direct port of Python _decrypt_regcode_by_email
}

// Passcode struct
#[derive(Debug, Clone)]
pub struct Passcode {
    pub regcode: Vec<u8>,
    pub userid: String,
}

// Base MDict struct (equivalent to Python MDict class)
#[derive(Debug)]
pub struct MDict {
    fname: String,
    encoding: String,
    passcode: Option<Passcode>,
    header: HashMap<String, String>,
    key_list: Vec<(u64, Vec<u8>)>,
    num_entries: usize,
    version: f32,
    encrypt: u8,
    number_width: usize,
    key_block_offset: u64,
    record_block_offset: u64,
    stylesheet: HashMap<String, (String, String)>,
}

impl MDict {
    // Constructor - direct port of Python MDict.__init__
    pub fn new(fname: &str, encoding: Option<String>, passcode: Option<Passcode>) -> Result<Self> {
        // Initialize struct, read header, read keys
        // Handle encoding detection and passcode validation
    }
    
    // Length - direct port of Python MDict.__len__
    pub fn len(&self) -> usize { self.num_entries }
    
    // Keys iterator - direct port of Python MDict.keys
    pub fn keys(&self) -> impl Iterator<Item = &[u8]> {
        self.key_list.iter().map(|(_, key)| key.as_slice())
    }
    
    // Private methods - direct ports from Python
    fn read_number<R: Read>(&self, reader: &mut R) -> Result<u64> {
        // Read number based on version (4 or 8 bytes)
    }
    
    fn parse_header(header: &[u8]) -> Result<HashMap<String, String>> {
        // Parse XML-like header attributes
    }
    
    fn decode_key_block_info(&self, data: &[u8]) -> Result<Vec<(u64, u64)>> {
        // Decode key block compression info
    }
    
    fn decode_key_block(&self, data: &[u8], info: &[(u64, u64)]) -> Result<Vec<(u64, Vec<u8>)>> {
        // Decompress and decode key blocks
    }
    
    fn split_key_block(&self, data: &[u8]) -> Result<Vec<(u64, Vec<u8>)>> {
        // Split key block into individual entries
    }
    
    fn read_header(&mut self) -> Result<HashMap<String, String>> {
        // Read and parse file header
    }
    
    fn read_keys(&mut self) -> Result<Vec<(u64, Vec<u8>)>> {
        // Read key blocks with encryption support
    }
    
    fn read_keys_brutal(&mut self) -> Result<Vec<(u64, Vec<u8>)>> {
        // Fallback key reading for problematic files
    }
}

// MDX struct (equivalent to Python MDX class)
#[derive(Debug)]
pub struct Mdx {
    mdict: MDict,
    substyle: bool,
}

impl Mdx {
    // Constructor - direct port of Python MDX.__init__
    pub fn new(fname: &str, encoding: Option<String>, substyle: bool, passcode: Option<Passcode>) -> Result<Self> {
        let mdict = MDict::new(fname, encoding, passcode)?;
        Ok(Self { mdict, substyle })
    }
    
    // Items iterator - direct port of Python MDX.items
    pub fn items(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>> {
        self.decode_record_block()
    }
    
    // Stylesheet substitution - direct port of Python MDX._substitute_stylesheet
    fn substitute_stylesheet(&self, txt: &str) -> String {
        // Apply stylesheet definitions to text
    }
    
    // Record block decoder - direct port of Python MDX._decode_record_block
    fn decode_record_block(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>> {
        // Decode and decompress record blocks, apply encoding and stylesheet
    }
    
    // Delegate methods to MDict
    pub fn len(&self) -> usize { self.mdict.len() }
    pub fn keys(&self) -> impl Iterator<Item = &[u8]> { self.mdict.keys() }
    pub fn header(&self) -> &HashMap<String, String> { &self.mdict.header }
}

// MDD struct (equivalent to Python MDD class)
#[derive(Debug)]
pub struct Mdd {
    mdict: MDict,
}

impl Mdd {
    // Constructor - direct port of Python MDD.__init__
    pub fn new(fname: &str, passcode: Option<Passcode>) -> Result<Self> {
        let mdict = MDict::new(fname, Some("UTF-16".to_string()), passcode)?;
        Ok(Self { mdict })
    }
    
    // Items iterator - direct port of Python MDD.items
    pub fn items(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>> {
        self.decode_record_block()
    }
    
    // Record block decoder - direct port of Python MDD._decode_record_block
    fn decode_record_block(&self) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>> {
        // Decode and decompress record blocks for binary data
    }
    
    // Delegate methods to MDict
    pub fn len(&self) -> usize { self.mdict.len() }
    pub fn keys(&self) -> impl Iterator<Item = &[u8]> { self.mdict.keys() }
    pub fn header(&self) -> &HashMap<String, String> { &self.mdict.header }
}
```
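
One of the stubs above, `unescape_entities`, is small enough to sketch in full using only the standard library. The `replace_bytes` helper is hypothetical (the plan lists `regex` as a dependency, which would also work), and the replacement order is an assumption about the Python original.

```rust
/// Replace every occurrence of `from` in `haystack` with `to`, byte-wise.
/// `from` must not be empty.
fn replace_bytes(haystack: &[u8], from: &[u8], to: &[u8]) -> Vec<u8> {
    assert!(!from.is_empty(), "pattern must not be empty");
    let mut out = Vec::with_capacity(haystack.len());
    let mut i = 0;
    while i < haystack.len() {
        if haystack[i..].starts_with(from) {
            out.extend_from_slice(to);
            i += from.len();
        } else {
            out.push(haystack[i]);
            i += 1;
        }
    }
    out
}

/// Convert `&lt;`, `&gt;`, `&quot;`, and `&amp;` back to `<`, `>`, `"`, and `&`.
fn unescape_entities(text: &[u8]) -> Vec<u8> {
    let text = replace_bytes(text, b"&lt;", b"<");
    let text = replace_bytes(&text, b"&gt;", b">");
    let text = replace_bytes(&text, b"&quot;", b"\"");
    // `&amp;` goes last so that `&amp;lt;` becomes `&lt;` rather than `<`.
    replace_bytes(&text, b"&amp;", b"&")
}
```

Each pass allocates a new buffer, which is acceptable for a short header; a single-pass scanner would avoid the intermediate copies.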

**src/lib.rs** (simple re-export):
```rust
mod readmdict;
pub use readmdict::*;
```

**src/main.rs** (direct port of __main__.py):
```rust
use clap::Parser;
use readmdict::*; // assumes the library crate is named `readmdict`, as published
use std::path::Path;
use std::fs;
use std::io::Write;

#[derive(Parser)]
#[command(name = "readmdict")]
#[command(about = "A Rust implementation of readmdict for reading MDict dictionary files")]
struct Args {
    #[arg(short = 'x', long, help = "extract mdx to source format and extract files from mdd")]
    extract: bool,
    
    #[arg(short = 's', long, help = "substitute style definition if present")]
    substyle: bool,
    
    #[arg(short = 'd', long, default_value = "data", help = "folder to extract data files from mdd")]
    datafolder: String,
    
    #[arg(short = 'e', long, default_value = "", help = "encoding for the dictionary")]
    encoding: String,
    
    #[arg(short = 'p', long, help = "passcode in format: register_code,email_or_deviceid")]
    passcode: Option<String>,
    
    #[arg(help = "mdx file name")]
    filename: Option<String>,
}

fn parse_passcode(s: &str) -> Result<Passcode> {
    // Parse passcode string in format "regcode,userid"
    let parts: Vec<&str> = s.split(',').collect();
    if parts.len() != 2 {
        return Err(Error::InvalidPasscode);
    }
    Ok(Passcode {
        regcode: hex::decode(parts[0]).map_err(|_| Error::InvalidPasscode)?,
        userid: parts[1].to_string(),
    })
}

fn main() -> Result<()> {
    let args = Args::parse();
    
    // Handle file selection (GUI fallback would require additional crate)
    let filename = match args.filename {
        Some(f) => f,
        None => {
            eprintln!("Please specify a valid MDX/MDD file");
            std::process::exit(1);
        }
    };
    
    if !Path::new(&filename).exists() {
        eprintln!("Please specify a valid MDX/MDD file");
        std::process::exit(1);
    }
    
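    // Like Python's os.path.splitext: keep the directory in `base` so the
    // companion .mdd and the extracted files resolve next to the input file.
    let base_path = Path::new(&filename).with_extension("");
    let base = base_path.to_string_lossy();
    let ext = Path::new(&filename).extension().and_then(|e| e.to_str()).unwrap_or("");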
    
    // Parse passcode if provided
    let passcode = args.passcode.as_ref()
        .map(|s| parse_passcode(s))
        .transpose()?;
    
    // Handle MDX files
    let mdx = if ext.to_lowercase() == "mdx" {
        let encoding = if args.encoding.is_empty() { None } else { Some(args.encoding.clone()) };
        let mdx = Mdx::new(&filename, encoding, args.substyle, passcode.clone())?;
        
        println!("======== {} ========", filename);
        println!("  Number of Entries : {}", mdx.len());
        for (key, value) in mdx.header() {
            println!("  {} : {}", key, value);
        }
        Some(mdx)
    } else {
        None
    };
    
    // Handle MDD files
    let mdd_filename = format!("{}.mdd", base);
    let mdd = if Path::new(&mdd_filename).exists() {
        let mdd = Mdd::new(&mdd_filename, passcode)?;
        
        println!("======== {} ========", mdd_filename);
        println!("  Number of Entries : {}", mdd.len());
        for (key, value) in mdd.header() {
            println!("  {} : {}", key, value);
        }
        Some(mdd)
    } else {
        None
    };
    
    // Extract files if requested
    if args.extract {
        // Extract MDX to text file
        if let Some(mdx) = &mdx {
            let output_filename = format!("{}.txt", base);
            let mut file = fs::File::create(&output_filename)?;
            
            for item in mdx.items() {
                let (key, value) = item?;
                file.write_all(&key)?;
                file.write_all(b"\r\n")?;
                file.write_all(&value)?;
                if !value.ends_with(b"\n") {
                    file.write_all(b"\r\n")?;
                }
                file.write_all(b"</>\r\n")?;
            }
            
            // Extract stylesheet if present
            if let Some(stylesheet) = mdx.header().get("StyleSheet") {
                let style_filename = format!("{}_style.txt", base);
                fs::write(&style_filename, stylesheet.replace('\n', "\r\n"))?;
            }
        }
        
        // Extract MDD data files
        if let Some(mdd) = &mdd {
            let data_folder = Path::new(&filename).parent().unwrap().join(&args.datafolder);
            fs::create_dir_all(&data_folder)?;
            
            for item in mdd.items() {
                let (key, value) = item?;
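                // MDD keys may start with a separator; `Path::join` would treat
                // that as an absolute path and discard `data_folder`, so strip it.
                let filename = String::from_utf8_lossy(&key).replace('\\', "/");
                let file_path = data_folder.join(filename.trim_start_matches('/'));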
                
                if let Some(parent) = file_path.parent() {
                    fs::create_dir_all(parent)?;
                }
                
                fs::write(&file_path, &value)?;
            }
        }
    }
    
    Ok(())
}
```
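
When extracting MDD resources, note that `Path::join` replaces the base path if its argument is absolute. The sketch below shows one way to turn a resource key into a path that stays under the data folder. The `resource_path` helper is hypothetical, and it assumes keys use backslashes and may begin with one.

```rust
use std::path::{Component, Path, PathBuf};

/// Map an MDD resource key to a path under `data_folder`.
/// Returns `None` if the key is empty or would escape the folder.
fn resource_path(data_folder: &Path, key: &str) -> Option<PathBuf> {
    // Normalize separators, then drop leading ones so the path is relative.
    let normalized = key.replace('\\', "/");
    let relative = Path::new(normalized.trim_start_matches('/'));
    // Accept only plain name components: no `..`, root, or drive prefix.
    let is_safe = !relative.as_os_str().is_empty()
        && relative
            .components()
            .all(|c| matches!(c, Component::Normal(_)));
    if is_safe {
        Some(data_folder.join(relative))
    } else {
        None
    }
}
```

With this helper, a key such as `\img\a.png` resolves to `data/img/a.png`, while `\..\secret` is rejected.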

#### Implementation Considerations

**Key Differences from Python:**
1. **Error Handling**: Rust uses `Result<T, E>` instead of exceptions
2. **Memory Management**: No garbage collection, explicit ownership
3. **String Handling**: Distinction between `String`, `&str`, and `Vec<u8>`
4. **Iterator Patterns**: Rust iterators are lazy and zero-cost
5. **File I/O**: More explicit error handling required
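
The first and last points can be illustrated with a small, dependency-free sketch. The `MdictError` type and `read_header_len` function are hypothetical stand-ins for the `thiserror`-based design in the plan, and the 4-byte big-endian length prefix is an assumption about the file layout.

```rust
use std::fmt;
use std::io::{self, Read};

/// Hand-written error type; the plan derives the same shape with `thiserror`.
#[derive(Debug)]
enum MdictError {
    Io(io::Error),
    InvalidFormat(String),
}

impl fmt::Display for MdictError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MdictError::Io(e) => write!(f, "IO error: {}", e),
            MdictError::InvalidFormat(msg) => write!(f, "Invalid file format: {}", msg),
        }
    }
}

impl std::error::Error for MdictError {}

impl From<io::Error> for MdictError {
    fn from(e: io::Error) -> Self {
        MdictError::Io(e)
    }
}

/// Read the header length (assumed: 4-byte big-endian prefix).
fn read_header_len<R: Read>(reader: &mut R) -> Result<usize, MdictError> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?; // `?` converts io::Error via `From`
    let len = u32::from_be_bytes(buf) as usize;
    if len == 0 {
        return Err(MdictError::InvalidFormat("empty header".to_string()));
    }
    Ok(len)
}
```

Where Python would raise from deep inside a helper, every fallible step here is visible in the signature, and callers decide whether to propagate with `?` or handle the variant.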

**External Crate Dependencies:**
- `clap`: Command-line argument parsing (replaces `argparse`)
- `flate2`: Zlib compression (replaces `zlib`)
- `salsa20`: Salsa20 encryption (replaces `pureSalsa20.py`)
- `ripemd`: RIPEMD128 hashing (replaces `ripemd128.py`)
- `encoding_rs`: Text encoding support
- `regex`: Regular expressions for parsing
- `byteorder`: Binary data reading
- `thiserror`: Error type derivation
- `hex`: Hexadecimal encoding/decoding
- `adler`: Adler32 checksums
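
To make the role of the checksum dependency concrete, here is a plain Adler-32 sketch using only the standard library. It is for illustration: the project would call the `adler` crate instead, and the `verify_block` helper is hypothetical.

```rust
/// Compute the Adler-32 checksum of `data` (as defined in RFC 1950).
fn adler32(data: &[u8]) -> u32 {
    const MOD: u32 = 65_521; // largest prime below 2^16
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in data {
        a = (a + u32::from(byte)) % MOD;
        b = (b + a) % MOD;
    }
    (b << 16) | a
}

/// Return true if a decoded block matches the checksum stored alongside it.
fn verify_block(decompressed: &[u8], expected: u32) -> bool {
    adler32(decompressed) == expected
}
```

A mismatch here is what the planned `Error::ChecksumMismatch` variant would report.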

**Performance Optimizations:**
1. **Zero-copy where possible**: Use `&[u8]` slices instead of `Vec<u8>` when data doesn't need to be owned
2. **Streaming iterators**: Process records on-demand instead of loading everything into memory
3. **Efficient string handling**: Use `Cow<str>` for strings that might not need allocation
4. **Memory mapping**: Consider using `memmap2` for large files
5. **Parallel processing**: Use `rayon` for CPU-intensive operations like decompression
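
The `Cow<str>` idea in point 3 can be shown with the standard library's lossy UTF-8 decoder. The `decode_key` name is hypothetical, and the sketch assumes UTF-8 keys; other encodings would go through `encoding_rs`.

```rust
use std::borrow::Cow;

/// Decode a key as UTF-8, borrowing the input when it is already valid
/// and allocating only when invalid bytes must be replaced.
fn decode_key(raw: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(raw)
}
```

Valid input such as `b"apple"` comes back as `Cow::Borrowed` with no allocation, while input containing an invalid byte yields `Cow::Owned` with U+FFFD substituted.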

**Testing Strategy:**
1. **Unit tests**: Test each utility function and method individually
2. **Integration tests**: Test with real MDX/MDD files
3. **Property-based tests**: Use `proptest` for edge cases
4. **Benchmark tests**: Compare performance with Python implementation
5. **Compatibility tests**: Ensure output matches Python version exactly