pub struct MarkerRequest {
pub file: Option<Vec<u8>>,
pub filename: Option<String>,
pub file_url: Option<String>,
pub output_format: Vec<OutputFormat>,
pub mode: ProcessingMode,
pub max_pages: Option<u32>,
pub page_range: Option<String>,
pub paginate: bool,
pub skip_cache: bool,
pub disable_image_extraction: bool,
pub disable_image_captions: bool,
pub save_checkpoint: bool,
pub add_block_ids: bool,
pub include_markdown_in_chunks: bool,
pub keep_spreadsheet_formatting: bool,
pub page_schema: Option<Value>,
pub segmentation_schema: Option<String>,
pub additional_config: Option<Value>,
pub extras: Option<String>,
pub fence_synthetic_captions: bool,
pub webhook_url: Option<String>,
}
Request parameters for the DataLab Marker conversion API.
Exactly one of file or file_url must be set.
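The "exactly one of" rule can be checked before a request is sent. The following is a minimal sketch, not part of this crate: `SourceFields` is a hypothetical stand-in that mirrors only the `file` and `file_url` fields, and `validate_source` is an illustrative helper name.

```rust
/// Hypothetical stand-in that mirrors only the two source fields of
/// `MarkerRequest`; it is not a type from this crate.
#[derive(Default)]
struct SourceFields {
    file: Option<Vec<u8>>,
    file_url: Option<String>,
}

/// Illustrative check (an assumption, not the crate's API):
/// succeeds only when exactly one of `file` / `file_url` is set.
fn validate_source(req: &SourceFields) -> Result<(), &'static str> {
    match (req.file.is_some(), req.file_url.is_some()) {
        (true, false) | (false, true) => Ok(()),
        (true, true) => Err("set either `file` or `file_url`, not both"),
        (false, false) => Err("one of `file` or `file_url` is required"),
    }
}
```

Running a check like this on the client side reports a malformed request before any bytes are uploaded.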
Fields
file: Option<Vec<u8>>
Raw file bytes to upload. Required when file_url is not set.
filename: Option<String>
Filename for the uploaded file (e.g. "paper.pdf"). Used when file is set.
file_url: Option<String>
Public URL to the file. Alternative to file.
output_format: Vec<OutputFormat>
Output format(s). Defaults to [Markdown].
mode: ProcessingMode
Processing mode. Defaults to Balanced.
max_pages: Option<u32>
Maximum number of pages to process.
page_range: Option<String>
Page range (0-indexed). E.g. "0-5" or "1,3,5".
paginate: bool
Insert page delimiters in output.
skip_cache: bool
Force reprocessing even if cached.
disable_image_extraction: bool
Skip extracting images.
disable_image_captions: bool
Skip generating image captions.
save_checkpoint: bool
Save intermediate checkpoint for downstream extraction steps.
add_block_ids: bool
HTML mode only: adds data-block-id attributes.
include_markdown_in_chunks: bool
Include markdown alongside chunks output.
keep_spreadsheet_formatting: bool
Preserve spreadsheet table structure.
page_schema: Option<Value>
JSON schema for structured data extraction.
segmentation_schema: Option<String>
Schema for document segmentation.
additional_config: Option<Value>
Extra Marker config (e.g. force_ocr, languages).
extras: Option<String>
Comma-separated extras: track_changes, chart_understanding, etc.
fence_synthetic_captions: bool
Fence auto-generated captions.
webhook_url: Option<String>
URL to POST results to when processing completes.
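Two of the fields above, page_range and extras, are plain strings with a small internal format. The sketch below shows one way to build them from typed values. All three helper names are hypothetical and not part of this crate, and the output only reproduces the forms shown in the field descriptions ("0-5", "1,3,5", comma-separated extras).

```rust
/// Hypothetical helper: formats individual 0-indexed pages as "1,3,5".
/// Returns `None` for an empty list so the result can be assigned
/// directly to an `Option<String>` field such as `page_range`.
fn page_list(pages: &[u32]) -> Option<String> {
    if pages.is_empty() {
        return None;
    }
    let parts: Vec<String> = pages.iter().map(|p| p.to_string()).collect();
    Some(parts.join(","))
}

/// Hypothetical helper: formats a span of 0-indexed pages as "start-end".
/// How the service treats the end bound is not stated in the field
/// description, so this only produces the documented textual form.
fn page_span(start: u32, end: u32) -> String {
    format!("{start}-{end}")
}

/// Hypothetical helper: joins extras names into the comma-separated
/// form described for the `extras` field.
fn extras_list(names: &[&str]) -> Option<String> {
    if names.is_empty() {
        None
    } else {
        Some(names.join(","))
    }
}
```

Returning `Option<String>` keeps the "not set" case distinct from an empty string when assigning to these fields.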