Module double_write

Double-write buffer for torn write protection.

NVMe drives typically guarantee that a single 4 KiB sector is written atomically, but NOT that a larger page (e.g., 16 KiB) is. If power fails partway through writing a 16 KiB page, only some of its sectors may reach the media, leaving the WAL page partially written (torn).

CRC32C detects torn writes during replay, but detection alone cannot repair them: without the double-write buffer, the record is lost even though it was already acknowledged to the client.

The double-write buffer solves this:

  1. Before writing to the WAL, write the record to the double-write file.
  2. fsync the double-write file.
  3. Write to the WAL file.
  4. fsync the WAL file.
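
The ordering above can be sketched with plain `std::fs::File` handles. This is a minimal illustration, not the module's actual code: the function name `append_with_double_write` and the idea of passing both files as arguments are assumptions made for the example. `File::sync_all` is the standard-library call that plays the role of fsync here.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Hypothetical helper (not part of this module's API): make `record`
/// durable in the double-write file before it is written to the WAL.
fn append_with_double_write(dw: &mut File, wal: &mut File, record: &[u8]) -> io::Result<()> {
    // Steps 1 and 2: write the copy and force it to stable storage.
    dw.write_all(record)?;
    dw.sync_all()?;

    // Steps 3 and 4: only now touch the WAL. If this write is torn,
    // an intact copy already exists in the double-write file.
    wal.write_all(record)?;
    wal.sync_all()?;
    Ok(())
}
```

The key property is that the second `write_all` never starts until the first `sync_all` has returned, so at no point is the WAL the only place where a partially written record could live.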

On recovery, if a WAL record’s CRC fails:

  • Check the double-write buffer for an intact copy (verify CRC).
  • If found, use the double-write copy to reconstruct the WAL page.
  • If not found, the record is truly lost (pre-fsync crash).
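
The recovery decision can be sketched as follows. Everything named here is an assumption for illustration: the `Record` layout (payload plus stored checksum), the `Outcome` enum, and the `recover` function are hypothetical, and the lookup of the matching double-write copy is reduced to an `Option` argument. The `crc32c` function is a simple bitwise software implementation of the Castagnoli CRC; a real implementation would more likely use a table-driven or hardware-accelerated version.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask if the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical record layout: payload plus the checksum stored with it.
struct Record {
    payload: Vec<u8>,
    crc: u32,
}

impl Record {
    fn new(payload: Vec<u8>) -> Self {
        let crc = crc32c(&payload);
        Record { payload, crc }
    }

    fn is_intact(&self) -> bool {
        crc32c(&self.payload) == self.crc
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The WAL record passed its CRC check.
    Intact(Vec<u8>),
    /// The WAL record was torn; the double-write copy was used instead.
    Repaired(Vec<u8>),
    /// Neither copy verifies.
    Lost,
}

/// `dw_copy` is the double-write entry for the same record, if one exists.
fn recover(wal: &Record, dw_copy: Option<&Record>) -> Outcome {
    if wal.is_intact() {
        return Outcome::Intact(wal.payload.clone());
    }
    match dw_copy {
        // The copy's own CRC is verified before it is trusted.
        Some(copy) if copy.is_intact() => Outcome::Repaired(copy.payload.clone()),
        _ => Outcome::Lost,
    }
}
```

Note that the double-write copy is checked with the same CRC before use, so a copy that was itself torn is treated the same as a missing one.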

The double-write file is a fixed-size circular buffer: only the most recent N records are kept, and older ones are overwritten. This is fine because torn writes can only happen on the most recent write, so older WAL records never need their double-write copies.
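
The overwrite behaviour of a fixed-size circular buffer can be illustrated in memory. This sketch assumes each record carries a monotonically increasing sequence number and that slot `seq % N` holds it; the `RingSketch` type and its methods are hypothetical and say nothing about how `DoubleWriteBuffer` actually lays out its file.

```rust
/// In-memory stand-in for a fixed-size circular buffer of N slots.
struct RingSketch {
    /// Each slot remembers which sequence number it currently holds.
    slots: Vec<Option<(u64, Vec<u8>)>>,
}

impl RingSketch {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        RingSketch { slots: vec![None; capacity] }
    }

    fn slot_index(&self, seq: u64) -> usize {
        (seq % self.slots.len() as u64) as usize
    }

    /// Store `record`, overwriting whatever older record shared the slot.
    fn put(&mut self, seq: u64, record: &[u8]) {
        let idx = self.slot_index(seq);
        self.slots[idx] = Some((seq, record.to_vec()));
    }

    /// Return the record only if the slot still holds this exact sequence number.
    fn get(&self, seq: u64) -> Option<&[u8]> {
        match &self.slots[self.slot_index(seq)] {
            Some((stored, record)) if *stored == seq => Some(record.as_slice()),
            _ => None,
        }
    }
}
```

With a capacity of 4, writing records 0 through 5 leaves records 2 to 5 retrievable, while records 0 and 1 have been overwritten by 4 and 5. Storing the sequence number alongside the payload is what lets `get` distinguish a live entry from a stale one in the same slot.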

Structs

DoubleWriteBuffer
Double-write buffer file.