§CML (Content Markup Language)
CML is a semantic markup language designed for long-term, interpretable content storage. It separates content from presentation and enables efficient vector-based semantic search.
§CML Structure
<cml profile="core" version="0.2" encoding="utf-8">
<header>
<title>Document Title</title>
<author role="author">Author Name</author>
<date type="created" when="2025-12-22"/>
</header>
<body>
<section id="intro">
<heading size="1">Introduction</heading>
<paragraph>Content here with <em>inline elements</em>.</paragraph>
</section>
</body>
<footer>
</footer>
</cml>
§Features
- Profile-based schemas: Domain-specific document structures (law, code, edu)
- Pathless references: namespace:identifier format (e.g., president:47)
- Active documents: Currency conversion, date localization (future)
- Validation: ID uniqueness, reference integrity, structural correctness
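The pathless reference format and the ID uniqueness check listed above can be illustrated with a minimal, standard-library-only sketch. It does not use or reproduce the crate's own API: `Reference`, `parse_reference`, and `duplicate_ids` are hypothetical names, and splitting at the first colon and rejecting empty parts are assumptions about how a `namespace:identifier` string might be read.

```rust
use std::collections::HashSet;

/// Hypothetical holder for the two parts of a pathless reference.
#[derive(Debug, PartialEq, Eq)]
struct Reference<'a> {
    namespace: &'a str,
    identifier: &'a str,
}

/// Splits `namespace:identifier` at the first colon (an assumption).
/// Returns `None` if the colon is missing or either side is empty.
fn parse_reference(raw: &str) -> Option<Reference<'_>> {
    let (namespace, identifier) = raw.split_once(':')?;
    if namespace.is_empty() || identifier.is_empty() {
        return None;
    }
    Some(Reference { namespace, identifier })
}

/// Returns each ID that occurs more than once, reported a single time,
/// in the order in which the first repeat is encountered.
fn duplicate_ids<'a>(ids: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut reported = HashSet::new();
    let mut duplicates = Vec::new();
    for &id in ids {
        // `insert` returns false when the value was already present.
        if !seen.insert(id) && reported.insert(id) {
            duplicates.push(id);
        }
    }
    duplicates
}

fn main() {
    if let Some(r) = parse_reference("president:47") {
        println!("namespace={}, identifier={}", r.namespace, r.identifier);
    }
    let duplicates = duplicate_ids(&["intro", "body", "intro"]);
    println!("duplicate ids: {:?}", duplicates);
}
```

For checks on real documents, use `CmlValidator::validate`, as shown under Usage; this sketch only makes the two ideas concrete.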
§Usage
use cml_rs::{CmlParser, CmlGenerator, CmlValidator};
// Parse CML
let xml = r#"<cml profile="core" version="0.2" encoding="utf-8">
<header><title>Test</title></header>
<body><paragraph>Hello!</paragraph></body>
<footer></footer>
</cml>"#;
let doc = CmlParser::parse_str(xml)?;
// Validate
CmlValidator::validate(&doc)?;
// Generate
let xml = CmlGenerator::generate(&doc)?;
Re-exports§
pub use generator::CmlGenerator;
pub use parser::CmlParser;
pub use profile::Profile;
pub use profile::ProfileRegistry;
pub use profile::ResolvedProfile;
pub use validator::CmlValidator;
pub use chunker::Chunk;
pub use chunker::CmlChunker;
pub use chunker::CHUNK_OVERLAP_TOKENS;
pub use chunker::MAX_CHUNK_TOKENS;
pub use embedding_store::ChunkMatch;
pub use embedding_store::EmbeddingStore;
pub use embedding_store::MatchType;
pub use embedding_store::EMBEDDING_DIM;
pub use id_generator::BookstackIdGenerator;
pub use id_generator::CodeIdGenerator;
pub use id_generator::ElementId;
pub use id_generator::LegalIdGenerator;
pub use types::*;
Modules§
- chunker
- Profile-aware semantic chunking for CML v0.2 documents.
- embedding_store
- SQLite-based embedding store with FTS5 hybrid search
- generator
- CML v0.2 XML Generator
- id_generator
- Hybrid ID generation system for CML elements
- parser
- CML v0.2 XML Parser
- profile
- CML Profile System
- types
- CML v0.2 types and structures
- validator
- CML v0.2 Document Validator
Enums§
- CmlError
- Errors that can occur during CML processing
Type Aliases§
- Result
- Result type for CML operations