//! Provides filesystem storage abstractions for managing workspaces, folders, and documents.
//!
//! This module defines the core structures and logic for interacting with the
//! application's data model on disk. It establishes conventions for how
//! workspaces, organizational folders, and multi-file documents are represented
//! in the file system.
//!
//! # Core Concepts
//!
//! * **[`Workspace`]:** The root container for all managed data. A workspace corresponds
//!   to a directory on the filesystem. It contains documents, folders, and a special
//!   `.markhor` subdirectory for internal workspace configuration (such as `config.json`)
//!   and potential future caches or indexes. Users typically start by creating a workspace
//!   with [`Workspace::create`] or opening an existing one with [`Workspace::open`].
//! * **[`Folder`]:** Represents a standard directory within a workspace used for
//!   organizing documents and other folders. Folders are discovered via methods like
//!   [`Workspace::list_folders`] or [`Folder::list_folders`].
//! * **[`Document`]:** A logical representation of a single piece of content that may
//!   consist of multiple related files. For example, an imported PDF might result in
//!   a document containing the original `.pdf` file, a generated `.md` transcription,
//!   and perhaps an `.html` version.
//!   * Documents are identified by a metadata file with a `.markhor` extension
//!     (e.g., `my_report.markhor`). This file contains metadata such as a unique ID.
//!   * All other files belonging to that document reside in the *same directory* and
//!     share the *same base name* as the `.markhor` file, but with different extensions.
//! * **[`ContentFile`]:** Represents an individual file that is part of a [`Document`]
//!   (excluding the `.markhor` metadata file itself). Instances are obtained via
//!   [`Document::files`] or [`Document::files_by_extension`].
//!
//! # File Naming Conventions
//!
//! The association between a document's metadata (`basename.markhor`) and its content
//! files is determined by naming convention:
//!
//! * **Standard Files:** Files like `basename.pdf`, `basename.txt`.
//! * **Suffixed Files:** In cases where a single document component might produce multiple
//! files of the same type (e.g., splitting tabs into separate files), a hexadecimal
//! suffix is used: `basename.{hex}.extension` (e.g., `basename.a1.md`, `basename.a2.md`).
//!
//! The library automatically discovers these files based on the document's base name.
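//!
//! The matching rule above can be sketched as a small predicate. This is a minimal
//! illustration, not this module's implementation: `belongs_to` and `is_hex` are
//! hypothetical helpers, and the sketch assumes that the hex suffix may use upper- or
//! lowercase digits and that extensions contain no further dots.
//!
//! ```rust
//! /// Hypothetical helper (not part of this module's API): checks for a non-empty
//! /// string made only of hexadecimal digits.
//! fn is_hex(s: &str) -> bool {
//!     !s.is_empty() && s.chars().all(|c| c.is_ascii_hexdigit())
//! }
//!
//! /// Hypothetical helper: returns true if `file_name` looks like a content file of
//! /// the document `basename`, i.e. `basename.ext` or `basename.{hex}.ext`.
//! fn belongs_to(basename: &str, file_name: &str) -> bool {
//!     // The name must start with "basename."; anything else cannot match.
//!     let Some(rest) = file_name
//!         .strip_prefix(basename)
//!         .and_then(|r| r.strip_prefix('.'))
//!     else {
//!         return false;
//!     };
//!     let parts: Vec<&str> = rest.split('.').collect();
//!     match parts.as_slice() {
//!         // Standard file; the `.markhor` metadata file is not a content file.
//!         [ext] => !ext.is_empty() && *ext != "markhor",
//!         // Suffixed file: basename.{hex}.ext
//!         [hex, ext] => is_hex(hex) && !ext.is_empty() && *ext != "markhor",
//!         _ => false,
//!     }
//! }
//! ```
//!
//! Under these assumptions, `belongs_to("doc", "doc.pdf")` and
//! `belongs_to("doc", "doc.a1.md")` hold, while `doc.markhor`, `document.pdf`, and
//! `doc.notes.md` are rejected.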
//!
//! # Conflict Detection
//!
//! To prevent ambiguity and data corruption, strict conflict detection rules are enforced
//! when creating ([`Document::create`]) or moving ([`Document::move_to`]) documents. These
//! rules prevent:
//!
//! * Direct overwrites of existing document metadata files.
//! * Accidental "adoption" of unrelated files that happen to match a document's naming pattern.
//! * Ambiguity between base documents (e.g., `doc.markhor`) and suffixed documents
//! (e.g., `doc.4.markhor`) regarding ownership of files like `doc.4.txt`.
//!
//! Operations will return a [`ConflictError`] if any rule is violated.
//!
//! ## Conflict Rules for Creating/Moving Documents
//!
//! A conflict exists in the target directory if **any** of the following conditions are met:
//!
//! 1. **Direct Markhor Conflict:** The file `target_basename.markhor` already exists.
//!    * *Example:* Trying to create `doc.markhor` when `doc.markhor` exists.
//!    * *Reason:* Cannot have two identical document definitions.
//! 2. **Orphan File Conflict:** An existing file (which is *not* a `.markhor` file) already matches the file pattern for the *potential document*. This file would be implicitly "adopted" by the new document, potentially misrepresenting its origin.
//!    * *Example:* Trying to create `doc.markhor` when `doc.txt` exists.
//!    * *Example:* Trying to create `doc.markhor` when `doc.a1.pdf` exists.
//!    * *Example:* Trying to create `report.1a.markhor` when `report.1a.csv` exists.
//!    * *Reason:* Avoids accidentally associating unrelated files with the new document. It forces explicit action if these files *should* belong to the new document (e.g., renaming or moving them first).
//! 3. **Suffix-Base Ambiguity Conflict:** The potential document has a hex suffix (`target_basename = true_base.{hex}`), AND the corresponding base document (`true_base.markhor`) already exists.
//!    * *Example:* Trying to create `doc.4.markhor` when `doc.markhor` exists.
//!    * *Reason:* Files like `doc.4.txt` could potentially belong to *either* `doc.markhor` (as `doc.{hex}.txt`) *or* `doc.4.markhor` (as `doc.4.txt`). This creates ambiguity about ownership, even for files not yet created.
//! 4. **Base-Suffix Ambiguity Conflict:** The potential document *does not* have a hex suffix (`target_basename = true_base`), AND *any* suffixed document (`true_base.{hex}.markhor`) already exists.
//!    * *Example:* Trying to create `doc.markhor` when `doc.4.markhor` exists.
//!    * *Reason:* Similar to rule 3, files like `doc.4.txt` would have ambiguous ownership between `doc.markhor` and `doc.4.markhor`.
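//!
//! The four rules can be sketched as a pure function over the file names already
//! present in the target directory. This is only an illustration under stated
//! assumptions, not the logic behind [`Document::create`] or [`Document::move_to`]:
//! the `Conflict` enum and all helper functions are hypothetical, the rules are
//! checked in the order listed, and only the first match is reported.
//!
//! ```rust
//! /// Hypothetical conflict kinds mirroring rules 1-4 (not the crate's `ConflictError`).
//! #[derive(Debug, PartialEq)]
//! enum Conflict {
//!     DirectMarkhor,
//!     OrphanFile,
//!     SuffixBase,
//!     BaseSuffix,
//! }
//!
//! fn is_hex(s: &str) -> bool {
//!     !s.is_empty() && s.chars().all(|c| c.is_ascii_hexdigit())
//! }
//!
//! /// Splits `true_base.{hex}` into `(true_base, hex)` if the name has a hex suffix.
//! fn split_hex_suffix(basename: &str) -> Option<(&str, &str)> {
//!     let (base, suffix) = basename.rsplit_once('.')?;
//!     (!base.is_empty() && is_hex(suffix)).then_some((base, suffix))
//! }
//!
//! /// True for non-`.markhor` files named `target.ext` or `target.{hex}.ext`.
//! fn matches_pattern(target: &str, file_name: &str) -> bool {
//!     let Some(rest) = file_name
//!         .strip_prefix(target)
//!         .and_then(|r| r.strip_prefix('.'))
//!     else {
//!         return false;
//!     };
//!     let parts: Vec<&str> = rest.split('.').collect();
//!     match parts.as_slice() {
//!         [ext] => !ext.is_empty() && *ext != "markhor",
//!         [hex, ext] => is_hex(hex) && !ext.is_empty() && *ext != "markhor",
//!         _ => false,
//!     }
//! }
//!
//! /// Returns the first rule violated by creating `target.markhor` next to `existing`.
//! fn find_conflict(target: &str, existing: &[&str]) -> Option<Conflict> {
//!     // Rule 1: the metadata file itself already exists.
//!     let markhor = format!("{target}.markhor");
//!     if existing.iter().any(|f| *f == markhor) {
//!         return Some(Conflict::DirectMarkhor);
//!     }
//!     // Rule 2: an existing content file would be silently adopted.
//!     if existing.iter().any(|f| matches_pattern(target, f)) {
//!         return Some(Conflict::OrphanFile);
//!     }
//!     match split_hex_suffix(target) {
//!         // Rule 3: the target is suffixed and its base document exists.
//!         Some((true_base, _)) => {
//!             let base_markhor = format!("{true_base}.markhor");
//!             if existing.iter().any(|f| *f == base_markhor) {
//!                 return Some(Conflict::SuffixBase);
//!             }
//!         }
//!         // Rule 4: the target is a base name and a suffixed document exists.
//!         None => {
//!             let suffixed_exists = existing.iter().any(|f| {
//!                 f.strip_suffix(".markhor")
//!                     .and_then(split_hex_suffix)
//!                     .map_or(false, |(base, _)| base == target)
//!             });
//!             if suffixed_exists {
//!                 return Some(Conflict::BaseSuffix);
//!             }
//!         }
//!     }
//!     None
//! }
//! ```
//!
//! With this sketch, `find_conflict("doc.4", &["doc.markhor"])` reports
//! `Conflict::SuffixBase`, and `find_conflict("doc", &["doc.4.markhor"])` reports
//! `Conflict::BaseSuffix`, matching the examples for rules 3 and 4.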
//!
//! # Asynchronous API
//!
//! All filesystem I/O operations within this module are `async` and rely on the `tokio`
//! runtime. Methods that perform I/O return `Result<T, Error>`, where [`Error`] encapsulates
//! potential issues like I/O errors, serialization errors, or conflicts.
//!
//! # Example Usage
//!
//! ```rust,no_run
//! use markhor_core::storage::{Storage, Workspace, Document, Error}; // Adjust path as needed
//! use std::sync::Arc;
//! use tempfile::tempdir; // For example purposes
//!
//! #[tokio::main]
//! async fn main() -> Result<(), Box<dyn std::error::Error>> {
//!     // Create a temporary directory for the workspace
//!     let temp_dir = tempdir()?;
//!     let ws_path = temp_dir.path().to_path_buf();
//!
//!     // Create a new workspace
//!     let storage = Arc::new(Storage::new());
//!     let ws = Workspace::create(&storage, &*ws_path).await?;
//!     println!("Workspace created at: {}", ws.path().display());
//!
//!     // Create a new document within the workspace
//!     let root = ws.root().await;
//!     let doc = root.create_document("my_doc").await?;
//!     println!("Document created with ID: {}", doc.id());
//!
//!     // List documents in the workspace root
//!     let root_docs = root.list_documents().await?;
//!     println!("Found {} documents in workspace root.", root_docs.len());
//!     assert_eq!(root_docs.len(), 1);
//!
//!     // Clean up (in real code, workspace persists)
//!     drop(temp_dir);
//!     Ok(())
//! }
//! ```
use crateEvent;
pub use Document;
pub use ;
pub use ;
pub use Workspace;
use HashMap;
use ;
use ;
use Error;
use fs;
use ;
use ;
use ;
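/// Extension (without the leading dot) of a document's metadata file.
pub const MARKHOR_EXTENSION: &str = "markhor";
/// Name of the workspace subdirectory that holds internal configuration.
pub const INTERNAL_DIR_NAME: &str = ".markhor";
// File name of the workspace configuration inside the internal directory.
const WORKSPACE_CONFIG_FILENAME: &str = "config.json"; // Using .json for clarity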
// Standard Result type for the library, with this module's `Error` as the error type
pub type Result<T> = std::result::Result<T, Error>;