Crate cml_rs

§CML (Content Markup Language)

CML is a semantic markup language designed for long-term, interpretable content storage. It separates content from presentation, so documents stay readable independently of any renderer and can be chunked and embedded for vector-based semantic search.

§CML Structure

<cml profile="core" version="0.2" encoding="utf-8">
  <header>
    <title>Document Title</title>
    <author role="author">Author Name</author>
    <date type="created" when="2025-12-22"/>
  </header>
  <body>
    <section id="intro">
      <heading size="1">Introduction</heading>
      <paragraph>Content here with <em>inline elements</em>.</paragraph>
    </section>
  </body>
  <footer>
  </footer>
</cml>

§Features

  • Profile-based schemas: Domain-specific document structures (law, code, edu)
  • Pathless references: namespace:identifier format (e.g., president:47)
  • Active documents: Currency conversion and date localization (planned)
  • Validation: ID uniqueness, reference integrity, structural correctness
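The pathless reference format and two of the validation checks listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: `PathlessRef`, `parse_ref`, `first_duplicate_id`, and `dangling_refs` are hypothetical names, and the rules used here (split at the first colon, reject an empty namespace or identifier, treat a reference as valid when it matches a known ID exactly) are guesses that the CML specification may tighten or contradict.

```rust
use std::collections::HashSet;

/// A parsed pathless reference such as `president:47`.
/// Hypothetical type for illustration; not part of the cml_rs API.
#[derive(Debug, PartialEq, Eq)]
pub struct PathlessRef<'a> {
    pub namespace: &'a str,
    pub identifier: &'a str,
}

/// Split `namespace:identifier` at the first colon.
/// Returns `None` if the colon is missing or either side is empty
/// (an assumed rule, not taken from the CML specification).
pub fn parse_ref(s: &str) -> Option<PathlessRef<'_>> {
    let (namespace, identifier) = s.split_once(':')?;
    if namespace.is_empty() || identifier.is_empty() {
        return None;
    }
    Some(PathlessRef { namespace, identifier })
}

/// ID uniqueness: return the first ID that appears more than once, if any.
pub fn first_duplicate_id<'a>(ids: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already present.
    ids.iter().copied().find(|id| !seen.insert(*id))
}

/// Reference integrity: return every reference that matches no known ID.
pub fn dangling_refs<'a>(refs: &[&'a str], known: &HashSet<&str>) -> Vec<&'a str> {
    refs.iter().copied().filter(|r| !known.contains(*r)).collect()
}
```

For example, `parse_ref("president:47")` yields the namespace `president` and the identifier `47`, while `parse_ref("no-colon")` yields `None`. Borrowing slices of the input rather than allocating new strings keeps the parser cheap enough to run over every reference in a document.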

§Usage

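use cml_rs::{CmlGenerator, CmlParser, CmlValidator};

// `?` needs an enclosing function that returns a `Result`. Boxing the error
// here assumes `CmlError` implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a CML document from a string
    let xml = r#"<cml profile="core" version="0.2" encoding="utf-8">
  <header><title>Test</title></header>
  <body><paragraph>Hello!</paragraph></body>
  <footer></footer>
</cml>"#;

    let doc = CmlParser::parse_str(xml)?;

    // Validate ID uniqueness, reference integrity, and structure
    CmlValidator::validate(&doc)?;

    // Generate CML from the parsed document
    let _regenerated = CmlGenerator::generate(&doc)?;

    Ok(())
}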

Re-exports§

pub use generator::CmlGenerator;
pub use parser::CmlParser;
pub use profile::Profile;
pub use profile::ProfileRegistry;
pub use profile::ResolvedProfile;
pub use validator::CmlValidator;
pub use chunker::Chunk;
pub use chunker::CmlChunker;
pub use chunker::CHUNK_OVERLAP_TOKENS;
pub use chunker::MAX_CHUNK_TOKENS;
pub use embedding_store::ChunkMatch;
pub use embedding_store::EmbeddingStore;
pub use embedding_store::MatchType;
pub use embedding_store::EMBEDDING_DIM;
pub use id_generator::BookstackIdGenerator;
pub use id_generator::CodeIdGenerator;
pub use id_generator::ElementId;
pub use id_generator::LegalIdGenerator;
pub use types::*;

Modules§

chunker
Profile-aware semantic chunking for CML v0.2 documents.
embedding_store
SQLite-based embedding store with FTS5 hybrid search
generator
CML v0.2 XML Generator
id_generator
Hybrid ID generation system for CML elements
parser
CML v0.2 XML Parser
profile
CML Profile System
types
CML v0.2 types and structures
validator
CML v0.2 Document Validator
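The chunker module re-exports MAX_CHUNK_TOKENS and CHUNK_OVERLAP_TOKENS, which suggests chunks are capped in size and share some tokens with their neighbors. The sketch below shows that general idea as a plain sliding window over whitespace-separated tokens. It is not the crate's profile-aware semantic chunker: `chunk_with_overlap` is a hypothetical function, whitespace splitting stands in for a real tokenizer, and the constant values are made up because the real ones are not shown on this page.

```rust
/// Hypothetical limits for illustration only; the real values of
/// `MAX_CHUNK_TOKENS` and `CHUNK_OVERLAP_TOKENS` in cml_rs are not shown here.
pub const MAX_TOKENS: usize = 4;
pub const OVERLAP_TOKENS: usize = 1;

/// Split whitespace-separated tokens into windows of at most `max` tokens.
/// Each window after the first starts with the last `overlap` tokens of the
/// previous window.
pub fn chunk_with_overlap(text: &str, max: usize, overlap: usize) -> Vec<String> {
    assert!(max > 0 && overlap < max, "overlap must be smaller than max");

    let tokens: Vec<&str> = text.split_whitespace().collect();
    let step = max - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < tokens.len() {
        let end = (start + max).min(tokens.len());
        chunks.push(tokens[start..end].join(" "));
        if end == tokens.len() {
            break; // the final window reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With these made-up limits, the ten tokens `a b c d e f g h i j` become three chunks: `a b c d`, `d e f g`, and `g h i j`. The repeated boundary token is what lets a search match text that would otherwise be cut in half at a chunk edge.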

Enums§

CmlError
Errors that can occur during CML processing

Type Aliases§

Result
Result type for CML operations