database-replicator 5.3.4

Universal database-to-PostgreSQL replication CLI. Supports PostgreSQL, SQLite, MongoDB, and MySQL.
# MongoDB Replication Guide

This guide explains how to replicate MongoDB databases to PostgreSQL using database-replicator's JSONB storage approach.

---

## SerenAI Cloud Replication

**New to SerenAI?** Sign up at [console.serendb.com](https://console.serendb.com) to get started with managed cloud replication.

When replicating to SerenDB targets, this tool runs your replication jobs on SerenAI's cloud infrastructure automatically:

```bash
export SEREN_API_KEY="your-api-key"  # Get from console.serendb.com
database-replicator init \
  --source "mongodb://user:pass@mongo-host:27017/db" \
  --target "postgresql://user:pass@your-db.serendb.com:5432/db"
```

For non-SerenDB targets, use the `--local` flag to run replication locally.

---

## Overview

The tool automatically detects MongoDB connection strings and replicates collections to PostgreSQL using a JSONB storage model. All BSON types are preserved with full type fidelity, including ObjectIds, DateTimes, Binary data, and MongoDB-specific types.

**Key Features:**
- **Zero Configuration**: Automatic source type detection from connection URL
- **Type Preservation**: Lossless conversion of all BSON types to JSONB
- **Security First**: Read-only connections, URL validation, NoSQL injection prevention
- **MongoDB Types**: Full support for ObjectId, DateTime, Binary, Decimal128, and more
- **Metadata Tracking**: Each document includes source type and replication timestamp

## Quick Start

### Basic Replication

Replicate an entire MongoDB database to PostgreSQL:

```bash
database-replicator init \
  --source "mongodb://localhost:27017/mydb" \
  --target "postgresql://user:pass@target-host:5432/db"
```

The tool automatically:
1. Detects that the source is MongoDB (from the `mongodb://` URL prefix)
2. Validates the connection string for security
3. Connects to MongoDB and verifies the connection
4. Extracts the database name from the URL
5. Lists all collections (excluding system collections)
6. Creates corresponding JSONB tables in PostgreSQL
7. Converts and inserts all documents

### Supported Connection Strings

The tool recognizes these MongoDB URL formats:
- `mongodb://localhost:27017/mydb` - Standard MongoDB connection
- `mongodb://user:pass@localhost:27017/mydb` - With authentication
- `mongodb://host1:27017,host2:27017/mydb?replicaSet=rs0` - Replica sets
- `mongodb+srv://cluster.mongodb.net/mydb` - MongoDB Atlas (SRV)

**Important**: The database name must be included in the connection URL (e.g., `/mydb`).

## Data Type Mapping

MongoDB BSON types are mapped to JSONB as follows:

### Primitive Types

| BSON Type | JSON Type | Example Input | JSON Output |
|-----------|-----------|---------------|-------------|
| String | string | `"Hello"` | `"Hello"` |
| Int32 | number | `42` | `42` |
| Int64 | number | `9223372036854775807` | `9223372036854775807` |
| Double | number | `3.14159` | `3.14159` |
| Boolean | boolean | `true` | `true` |
| Null | null | `null` | `null` |
| Undefined | null | `undefined` | `null` |

### MongoDB-Specific Types

| BSON Type | JSON Representation | Example |
|-----------|---------------------|---------|
| ObjectId | Object with `_type` and `$oid` | `{"_type": "objectid", "$oid": "507f1f77bcf86cd799439011"}` |
| DateTime | Object with `_type` and `$date` (milliseconds) | `{"_type": "datetime", "$date": 1678886400000}` |
| Binary | Object with `_type`, `subtype`, and base64 `data` | `{"_type": "binary", "subtype": 0, "data": "SGVsbG8="}` |
| Decimal128 | String (preserves precision) | `"123.456789012345"` |
| RegularExpression | Object with `_type`, `pattern`, and `options` | `{"_type": "regex", "pattern": "^test", "options": "i"}` |
| Timestamp | Object with `_type`, `t` (time), and `i` (increment) | `{"_type": "timestamp", "t": 1234567890, "i": 1}` |
| MaxKey | Object with `_type` | `{"_type": "maxkey"}` |
| MinKey | Object with `_type` | `{"_type": "minkey"}` |

### Special Cases

**Non-Finite Doubles:**
- `NaN`, `Infinity`, and `-Infinity` are converted to strings for JSON compatibility
- Example: `NaN` → `"NaN"`

**Arrays:**
- MongoDB arrays are preserved as JSON arrays
- Nested arrays are fully supported
- Example: `[1, 2, [3, 4]]` → `[1, 2, [3, 4]]`

**Embedded Documents:**
- Nested documents are preserved as JSON objects
- Full depth nesting is supported
- Example: `{user: {name: "Alice"}}` → `{"user": {"name": "Alice"}}`

**ObjectId Handling:**
- Always converted to hex string in `$oid` field
- Preserves exact ObjectId value for round-trip compatibility
- Example: `ObjectId("507f1f77bcf86cd799439011")` → `{"_type": "objectid", "$oid": "507f1f77bcf86cd799439011"}`
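
One practical consequence of these mappings: Decimal128 values (see the table above) arrive as plain strings, so cast them back to `numeric` for comparisons or arithmetic. A minimal sketch, assuming a hypothetical `products` table with a Decimal128 `price` field:

```sql
-- Decimal128 is stored as a string to preserve precision; cast for arithmetic
SELECT
    id,
    (data->>'price')::numeric AS price,
    (data->>'price')::numeric * 1.08 AS price_with_tax
FROM products
WHERE (data->>'price')::numeric > 100;
```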

## PostgreSQL Table Structure

Each MongoDB collection is converted to a PostgreSQL table with this schema:

```sql
CREATE TABLE IF NOT EXISTS "collection_name" (
    id TEXT PRIMARY KEY,
    data JSONB NOT NULL,
    _source_type TEXT NOT NULL DEFAULT 'mongodb',
    _migrated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Performance indexes
CREATE INDEX IF NOT EXISTS "idx_collection_name_data" ON "collection_name" USING GIN (data);
CREATE INDEX IF NOT EXISTS "idx_collection_name_source" ON "collection_name" (_source_type);
```

**Field Descriptions:**
- `id`: String ID from MongoDB's `_id` field (ObjectId converted to hex, or string/number as-is)
- `data`: JSONB containing the complete MongoDB document
- `_source_type`: Always `'mongodb'` for MongoDB replications
- `_migrated_at`: Timestamp of when the document was replicated
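
For illustration, a source document like `{_id: ObjectId("507f1f77bcf86cd799439011"), name: "Alice", age: 30}` would land in the table roughly as follows (output abridged):

```sql
SELECT id, data FROM users LIMIT 1;

--            id             |                             data
-- --------------------------+------------------------------------------------------------------
--  507f1f77bcf86cd799439011 | {"_id": {"_type": "objectid", "$oid": "507f1f77bcf86cd799439011"},
--                           |  "name": "Alice", "age": 30}
```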

## Querying Replicated Data

### Basic Queries

Query JSONB data using PostgreSQL's JSONB operators:

```sql
-- Get all documents
SELECT id, data FROM users;

-- Get specific fields
SELECT
    id,
    data->>'name' AS name,
    data->>'email' AS email,
    (data->>'age')::int AS age
FROM users;

-- Filter by JSONB fields
SELECT * FROM users WHERE data->>'name' = 'Alice';

-- Range queries
SELECT * FROM users WHERE (data->>'age')::int > 25;

-- Check for missing keys (an absent key yields SQL NULL)
SELECT * FROM users WHERE data->'age' IS NULL;

-- Check for JSON null values
SELECT * FROM users WHERE data->'age' = 'null'::jsonb;
```

### Working with MongoDB ObjectIds

```sql
-- Extract ObjectId as hex string
SELECT
    id,
    data->'_id'->>'$oid' AS objectid_hex,
    data->>'name' AS name
FROM users;

-- Find by ObjectId
SELECT * FROM users
WHERE data->'_id'->>'$oid' = '507f1f77bcf86cd799439011';

-- Check for ObjectId type
SELECT * FROM users
WHERE data->'_id'->>'_type' = 'objectid';
```

### Working with Dates

```sql
-- Extract DateTime as timestamp
SELECT
    id,
    to_timestamp((data->'created_at'->>'$date')::bigint / 1000) AS created_at,
    data->>'name' AS name
FROM users;

-- Filter by date range
SELECT * FROM events
WHERE to_timestamp((data->'timestamp'->>'$date')::bigint / 1000)
    BETWEEN '2024-01-01' AND '2024-12-31';

-- Group by date
SELECT
    DATE_TRUNC('day', to_timestamp((data->'created_at'->>'$date')::bigint / 1000)) AS date,
    COUNT(*) AS count
FROM events
GROUP BY date
ORDER BY date;
```

### Working with Binary Data

```sql
-- Check if a field is Binary
SELECT id, data->'avatar'->>'_type' AS type
FROM users
WHERE data->'avatar'->>'_type' = 'binary';

-- Decode Binary data
SELECT
    id,
    data->>'name' AS name,
    decode(data->'avatar'->>'data', 'base64') AS avatar_bytes
FROM users
WHERE data->'avatar'->>'_type' = 'binary';

-- Get Binary subtype
SELECT
    id,
    (data->'avatar'->>'subtype')::int AS binary_subtype
FROM users
WHERE data->'avatar'->>'_type' = 'binary';
```

### Working with Nested Documents

```sql
-- Query nested fields
SELECT * FROM users
WHERE data->'address'->>'city' = 'New York';

-- Extract nested data
SELECT
    id,
    data->>'name' AS name,
    data->'address'->>'street' AS street,
    data->'address'->>'city' AS city,
    data->'address'->>'zip' AS zip
FROM users;

-- Query deeply nested fields
SELECT * FROM orders
WHERE data->'payment'->'card'->>'type' = 'visa';
```

### Working with Arrays

```sql
-- Check if array contains value
SELECT * FROM products
WHERE data->'tags' ? 'electronics';

-- Check if array contains any of these values
SELECT * FROM products
WHERE data->'tags' ?| array['electronics', 'computers'];

-- Check if array contains all of these values
SELECT * FROM products
WHERE data->'tags' ?& array['electronics', 'sale'];

-- Get array length
SELECT
    id,
    data->>'name' AS name,
    jsonb_array_length(data->'tags') AS tag_count
FROM products;

-- Expand array to rows
SELECT
    id,
    data->>'name' AS product_name,
    tag
FROM products,
    jsonb_array_elements_text(data->'tags') AS tag;
```

### Advanced Queries

```sql
-- Full-text search
SELECT * FROM articles
WHERE to_tsvector('english', data->>'content') @@ to_tsquery('mongodb & postgresql');

-- Aggregations with nested fields
SELECT
    data->'metadata'->>'category' AS category,
    COUNT(*) AS count,
    AVG((data->>'price')::numeric) AS avg_price,
    SUM((data->>'quantity')::int) AS total_quantity
FROM products
GROUP BY data->'metadata'->>'category';

-- Complex filtering
SELECT * FROM users
WHERE (data->>'age')::int BETWEEN 25 AND 35
    AND data->'address'->>'country' = 'USA'
    AND data->'tags' ? 'premium';

-- Join across collections
SELECT
    u.data->>'name' AS user_name,
    o.data->>'total' AS order_total,
    to_timestamp((o.data->'created_at'->>'$date')::bigint / 1000) AS order_date
FROM users u
JOIN orders o ON u.id = (o.data->>'user_id');
```

## Security Features

### Connection String Validation

The tool validates MongoDB connection strings before use:

**Checks Performed:**
- URL must start with `mongodb://` or `mongodb+srv://`
- Connection string cannot be empty
- Format is validated by the MongoDB driver
- Ping test confirms server accessibility

**Rejected Patterns:**
- Wrong protocols (postgresql://, mysql://, http://)
- Missing protocol prefix
- Malformed URLs

### Collection Name Validation

Collection names are validated to prevent NoSQL injection (a sketch of the naming rule follows the lists below):

**Allowed:**
- Alphanumeric characters (a-z, A-Z, 0-9)
- Underscores (_)
- Names starting with letters or underscores

**Rejected:**
- SQL keywords (SELECT, DROP, etc.)
- Special characters ($, ., ;, etc.)
- Shell metacharacters
- System collection names (system.*)
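
The naming rule amounts to a conventional identifier pattern. A minimal sketch, assuming the documented rule corresponds to the regex below (the tool's internal check may differ):

```sql
-- Hypothetical mirror of the documented rule: letters, digits, and underscores
-- only, starting with a letter or underscore. SQL-keyword and system.*
-- rejection are separate checks layered on top of this pattern.
SELECT 'user_profiles' ~ '^[A-Za-z_][A-Za-z0-9_]*$' AS valid;  -- true
SELECT 'system.users'  ~ '^[A-Za-z_][A-Za-z0-9_]*$' AS valid;  -- false: contains '.'
SELECT 'users;drop'    ~ '^[A-Za-z_][A-Za-z0-9_]*$' AS valid;  -- false: metacharacter
```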

### Read-Only Operations

**Guarantees:**
- All MongoDB operations are read-only queries
- No insert, update, or delete operations are performed
- No administrative commands are executed
- Collections are listed and read only

### Credential Protection

**Security Measures:**
- Credentials in URLs are never logged
- Error messages sanitize connection strings
- Passwords are not exposed in stack traces
- Connection validation doesn't leak credentials

## Remote Execution

Run replication jobs on SerenAI-managed cloud infrastructure:

```bash
database-replicator init \
  --source "mongodb://SOURCE_CONNECTION" \
  --target "postgresql://TARGET_CONNECTION"
```

**Benefits:**

- No local resources consumed
- Automatic retry and error handling
- Logs stored in CloudWatch
- Job monitoring via API
- Managed security and credentials

### Authentication

Remote execution requires a SerenDB API key for authentication. The tool obtains the API key in one of two ways:

#### Option 1: Environment Variable (Recommended for scripts)

```bash
export SEREN_API_KEY="your-api-key-here"
database-replicator init --source "..." --target "..."
```

#### Option 2: Interactive Prompt

If `SEREN_API_KEY` is not set, the tool will prompt you to enter your API key:

```text
Remote execution requires a SerenDB API key for authentication.

You can generate an API key at:
  https://console.serendb.com/api-keys

Enter your SerenDB API key: [input]
```

**Getting Your API Key:**

1. Sign up for SerenDB at [console.serendb.com/signup](https://console.serendb.com/signup)
2. Navigate to [console.serendb.com/api-keys](https://console.serendb.com/api-keys)
3. Generate a new API key
4. Copy and save it securely (you won't be able to see it again)

**Security Note:** Never commit API keys to version control. Use environment variables or secure credential management.

## Performance Considerations

### Index Strategy

The tool automatically creates these indexes on each replicated table:

```sql
-- GIN index for efficient JSONB queries
CREATE INDEX IF NOT EXISTS "idx_collection_name_data" ON "collection_name" USING GIN (data);

-- Source type filter index
CREATE INDEX IF NOT EXISTS "idx_collection_name_source" ON "collection_name" (_source_type);
```

For optimal query performance, create additional indexes based on your access patterns:

```sql
-- Index on frequently queried fields
CREATE INDEX idx_users_email ON users ((data->>'email'));
CREATE INDEX idx_users_age ON users (((data->>'age')::int));
CREATE INDEX idx_orders_user_id ON orders ((data->>'user_id'));

-- Index on nested fields
CREATE INDEX idx_users_city ON users ((data->'address'->>'city'));

-- Index for date range queries
CREATE INDEX idx_events_timestamp ON events (to_timestamp((data->'timestamp'->>'$date')::bigint / 1000));
```

### Batch Insert Performance

**Default Behavior:**
- Documents are inserted in batches of 1000
- Each batch is a single transaction
- Progress is logged per collection

**Tips for Large Collections:**
- Replication time scales linearly with document count
- Network bandwidth is typically the bottleneck
- Target PostgreSQL should have sufficient disk space
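
Because each batch commits as its own transaction, committed rows become visible immediately; one rough way to watch a large collection load is to poll the target table (the table name below is a placeholder):

```sql
-- Re-run while replication is in flight to watch the count grow
SELECT COUNT(*) AS rows_so_far,
       MAX(_migrated_at) AS last_batch_at
FROM collection_name;
```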

### Query Optimization

**Best Practices:**
- Always cast JSONB values to appropriate types for comparisons
- Use `->` for intermediate navigation, `->>` only for final text extraction
- Create indexes on frequently queried fields
- Use GIN indexes for containment queries (`?`, `?|`, `?&`)
- Consider materialized views for complex aggregations

**Example Optimization:**

```sql
-- Slow: No index, string comparison
SELECT * FROM users WHERE data->>'age' = '25';

-- Better: Cast to int, but still no index
SELECT * FROM users WHERE (data->>'age')::int = 25;

-- Best: Create index, then query
CREATE INDEX idx_users_age ON users (((data->>'age')::int));
SELECT * FROM users WHERE (data->>'age')::int = 25;
```

## Troubleshooting

### Connection Issues

**Error**: `Failed to connect to MongoDB server`

**Causes:**
- MongoDB server is not running
- Network connectivity issues
- Incorrect hostname or port
- Firewall blocking connection

**Solutions:**
```bash
# Test connectivity
mongo "mongodb://localhost:27017/mydb" --eval "db.version()"

# Check MongoDB is running
systemctl status mongod  # Linux
brew services list | grep mongodb  # macOS

# Verify connection string format
# Must include database name: mongodb://host:port/dbname
```

### Database Name Missing

**Error**: `MongoDB URL must include database name`

**Cause**: Connection URL doesn't specify which database to replicate

**Solution**: Add database name to URL:
```bash
# Wrong
mongodb://localhost:27017

# Correct
mongodb://localhost:27017/mydb
```

### Empty Collection Results

**Symptom**: Replication succeeds but some collections appear empty

**Causes:**
- Collection was actually empty in MongoDB
- System collections were filtered out
- Connection timeout during large collection read

**Verification:**
```javascript
// In mongo shell
use mydb
db.collectionName.countDocuments()
```
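
Then compare against the row count on the PostgreSQL side (quote the table name if the collection name is mixed-case):

```sql
-- Should match countDocuments() from the mongo shell
SELECT COUNT(*) FROM "collectionName";
```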

### System Collections Not Replicated

**Behavior**: Collections starting with `system.` are not replicated

**Reason**: System collections are internal MongoDB metadata and are intentionally excluded

**Collections Filtered:**
- `system.indexes`
- `system.users`
- `system.profile`
- All other `system.*` collections

### Type Conversion Warnings

**Warning**: `Document X has unsupported _id type, using doc number`

**Cause**: Document has an `_id` field with a type that's not ObjectId, String, Int32, or Int64

**Impact**: Document will get a generated ID instead of using the original `_id`

**Resolution**: This is safe - the original `_id` is preserved in the `data` JSONB field
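
To inspect affected documents, the original `_id` can be read back out of the JSONB alongside the generated key:

```sql
-- Compare the generated key with the original _id preserved in the document
SELECT id AS generated_id, data->'_id' AS original_id
FROM collection_name
LIMIT 10;
```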

## Best Practices

### Before Replication

**Planning:**
1. **Analyze Source Data**:
   ```javascript
   // In mongo shell
   db.stats()  // Database statistics
   db.collectionName.stats()  // Per-collection stats
   db.collectionName.find().limit(10)  // Sample data
   ```

2. **Estimate Target Size**:
   - JSONB storage is typically ~1.5-2x larger than BSON (a 10 GB MongoDB database needs roughly 15-20 GB for data alone)
   - Plan for additional space for indexes
   - Account for the metadata fields (`_source_type`, `_migrated_at`)

3. **Check Disk Space**:
   ```sql
   -- On target PostgreSQL
   SELECT pg_size_pretty(pg_database_size('your_db'));
   ```

4. **Review Connection Credentials**:
   - Ensure MongoDB user has read access
   - Verify PostgreSQL user can create tables
   - Test connections before full replication

### During Replication

**Monitoring:**
1. **Watch Progress**: Replication logs show collection-by-collection progress
2. **Monitor Resources**: Check CPU, memory, and network on both systems
3. **Verify Data**: Spot-check replicated documents

**For Large Databases:**
- Run during off-peak hours if possible
- Monitor MongoDB server load
- Ensure stable network connection
- Consider replicating collections individually if needed

### After Replication

**Verification:**
```sql
-- Check row counts
SELECT COUNT(*) FROM collection_name;

-- Verify recent replications
SELECT MAX(_migrated_at) FROM collection_name;

-- Check for data integrity
SELECT id, data FROM collection_name LIMIT 10;

-- Verify all expected collections replicated
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public';
```

**Optimization:**
```sql
-- Analyze tables for query planning
ANALYZE collection_name;

-- Create application-specific indexes
CREATE INDEX idx_custom ON collection_name ((data->>'field_name'));

-- Consider vacuum for space reclamation
VACUUM ANALYZE collection_name;
```

**Backup:**
```bash
# Backup replicated data
pg_dump -h target-host -U user -Fc db > mongodb_replication_backup.dump
```

## FAQ

### Can I replicate multiple databases?

No, each invocation replicates one database. To replicate multiple databases:

```bash
# Replicate database 1
database-replicator init \
  --source "mongodb://localhost:27017/db1" \
  --target "postgresql://user:pass@target:5432/db"

# Replicate database 2
database-replicator init \
  --source "mongodb://localhost:27017/db2" \
  --target "postgresql://user:pass@target:5432/db"
```

### Is this a one-time replication or continuous sync?

**One-time replication only**. MongoDB-to-PostgreSQL replication does not support continuous sync.

**Reasons:**
- MongoDB doesn't have built-in logical replication to PostgreSQL
- Change streams would require additional infrastructure
- JSONB storage model is optimized for snapshot replications

**For continuous sync**, consider:
- MongoDB Change Streams with custom sync application
- Third-party replication tools (Debezium, etc.)
- Periodic re-replication for batch updates

### What happens to indexes?

**MongoDB indexes are not replicated**. Only data is converted to JSONB.

**Recommendation**: Create PostgreSQL indexes based on your query patterns:
```sql
-- Replace MongoDB index {email: 1}
CREATE INDEX idx_users_email ON users ((data->>'email'));

-- Replace MongoDB compound index {status: 1, created_at: -1}
CREATE INDEX idx_orders_status_date ON orders (
    (data->>'status'),
    to_timestamp((data->'created_at'->>'$date')::bigint / 1000) DESC
);
```

### Can I query replicated data like a MongoDB database?

**Partially**. PostgreSQL's JSONB supports many MongoDB-like operations, but not all:

**✅ Supported:**
- Field access: `data->>'field'`
- Nested field access: `data->'nested'->>'field'`
- Array containment: `data->'tags' ? 'value'`
- Existence checks: `data ? 'field'`

**❌ Not Supported Natively:**
- MongoDB query syntax (no `$gt`, `$in`, `$regex` operators)
- Aggregation pipeline
- Map-reduce operations

**Solution**: Use PostgreSQL-native SQL with JSONB operators (see Querying Replicated Data section).
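
As a concrete translation, a MongoDB filter such as `db.users.find({age: {$gt: 25}, status: "active"})` becomes:

```sql
SELECT * FROM users
WHERE (data->>'age')::int > 25
  AND data->>'status' = 'active';
```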

### How do I handle schema changes?

JSONB storage is schema-less, so documents can have different fields:

```sql
-- Find documents with a specific field
SELECT * FROM users WHERE data ? 'premium_features';

-- Handle optional fields safely
SELECT
    id,
    data->>'name' AS name,
    COALESCE(data->>'email', 'no-email@example.com') AS email
FROM users;
```

### What about MongoDB transactions?

**Not preserved**. Each document is replicated independently.

**Implications:**
- Referential integrity is not enforced during replication
- Related documents may be replicated in different batches
- No ACID guarantees across collections during replication

**Recommendation**: Verify data relationships after replication if critical.
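
For example, a simple orphan check, assuming a hypothetical `orders` collection that references `users` through a `user_id` field:

```sql
-- Orders whose referenced user was not replicated (or never existed)
SELECT o.id
FROM orders o
LEFT JOIN users u ON u.id = o.data->>'user_id'
WHERE u.id IS NULL;
```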

### Can I reverse the replication (PostgreSQL back to MongoDB)?

**Yes, but with manual work**. The JSONB data can be exported and reimported to MongoDB:

```bash
# 1. Export as JSON, one document per line (the data column already
#    holds the complete document, including the original _id)
psql "postgresql://user:pass@target:5432/db" \
  -c "\copy (SELECT data FROM users) TO 'users.json'"

# 2. Import to MongoDB
mongoimport --db mydb --collection users --file users.json
```

However, MongoDB-specific types (ObjectId, DateTime) would need manual conversion back to BSON types.
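
For instance, the ObjectId wrapper can be rewritten into MongoDB Extended JSON (`{"$oid": "..."}`), which `mongoimport` understands, before exporting; a sketch of the idea:

```sql
-- Strip the tool's wrapper down to Extended JSON for the _id field
SELECT jsonb_set(
    data,
    '{_id}',
    jsonb_build_object('$oid', data->'_id'->>'$oid')
) AS extended_json_doc
FROM users
WHERE data->'_id'->>'_type' = 'objectid';
```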

### How do I replicate only specific collections?

Currently, the tool replicates all collections in a database. To replicate specific collections:

**Option 1: Create a temporary database with only desired collections**
```javascript
// In mongo shell
use temp_db
db.users.insertMany(db.getSiblingDB('mydb').users.find().toArray())
db.orders.insertMany(db.getSiblingDB('mydb').orders.find().toArray())
// Then replicate temp_db
```

**Option 2: Drop unwanted tables after replication**
```sql
DROP TABLE IF EXISTS unwanted_collection;
```

Future versions may support collection-level filtering.

### What's the replication speed?

**Typical rates** (depends on network, hardware, document complexity):
- Small documents (<1KB): 5,000-10,000 docs/sec
- Medium documents (1-10KB): 1,000-5,000 docs/sec
- Large documents (>10KB): 100-1,000 docs/sec

**Example**: A 1 million document collection with 2KB average documents falls in the medium range above. At 1,700-5,000 docs/sec, that works out to roughly 200-600 seconds:
- Estimated time: 3-10 minutes

**Factors affecting speed:**
- Network latency between MongoDB and PostgreSQL
- Document size and complexity
- Number of nested fields/arrays
- Source and target server resources

### Is the replication safe?

**Yes**. The tool uses read-only connections to MongoDB and validates all inputs.

**Safety features:**
- MongoDB connection is read-only (no write operations)
- Collection names are validated for SQL injection
- Connection strings are validated before use
- All operations are logged for audit trail
- Target data is transactional (rollback on error)

**Best practice**: Always test replications on non-production data first.