# Memories Table DDL - Simple Storage Architecture
## Current Table Definition
This is the finalized DDL for the `memories` table after removing the unused cognitive tiering features and committing to a simple storage architecture with context-aware deduplication.
```sql
CREATE TABLE memories (
-- Core Identity
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-- Content Storage
content TEXT NOT NULL,
content_hash VARCHAR(64) NOT NULL,
-- Context Preservation (Encoding Specificity Principle)
context_fingerprint VARCHAR(64) NOT NULL,
context TEXT NOT NULL,
summary TEXT NOT NULL,
-- Metadata & Organization
metadata JSONB DEFAULT '{}',
tags TEXT[] DEFAULT '{}',
-- Chunking Support (Semantic Boundaries)
chunk_index INTEGER DEFAULT NULL,
total_chunks INTEGER DEFAULT NULL,
parent_id UUID DEFAULT NULL,
-- Temporal Tracking
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Performance Indexes
CREATE INDEX idx_memories_content_hash ON memories(content_hash);
CREATE INDEX idx_memories_context_fingerprint ON memories(context_fingerprint);
CREATE INDEX idx_memories_created_at ON memories(created_at DESC);
CREATE INDEX idx_memories_updated_at ON memories(updated_at DESC);
CREATE INDEX idx_memories_parent_id ON memories(parent_id) WHERE parent_id IS NOT NULL;
CREATE INDEX idx_memories_tags ON memories USING gin(tags);
-- Context-Aware Deduplication Constraint
ALTER TABLE memories ADD CONSTRAINT memories_context_dedup_key
UNIQUE (content_hash, context_fingerprint);
```
## Column Descriptions
### Core Fields
- **`id`**: Unique identifier for each memory entry (UUID)
- **`content`**: The actual text content being stored
- **`content_hash`**: SHA-256 hash of the content for deduplication
### Context Preservation
- **`context_fingerprint`**: Hash combining content, context, summary, and tags for semantic deduplication
- **`context`**: Contextual information about when/where/why this memory was stored
- **`summary`**: Brief summary of the content for quick retrieval
### Organization
- **`metadata`**: Flexible JSONB field for additional structured data
- **`tags`**: Array of text tags for categorization and search
### Chunking Support
- **`chunk_index`**: Position of this chunk within a larger document (NULL for non-chunked)
- **`total_chunks`**: Total number of chunks in the parent document
- **`parent_id`**: Reference to the parent memory if this is a chunk
### Temporal
- **`created_at`**: Timestamp when the memory was first stored
- **`updated_at`**: Timestamp of the last update to this memory
## Key Features
### Context-Aware Deduplication
The composite unique constraint on `(content_hash, context_fingerprint)` allows:
- Same content to exist with different contexts
- Prevention of exact duplicates (same content + same context)
- Implementation of Tulving & Thomson's encoding specificity principle
### Semantic Chunking
The chunking fields support:
- Breaking large documents into semantic boundaries (sentences/paragraphs)
- Maintaining relationships between chunks via `parent_id`
- Preserving document structure through `chunk_index` ordering
### Efficient Search
Multiple indexes support:
- Fast content-based lookups via `content_hash`
- Context-aware retrieval via `context_fingerprint`
- Temporal queries via timestamp indexes
- Tag-based filtering using GIN index
## Migration Path
To update an existing database to this schema:
1. **Run Migration 006**: `migrations/006_finalize_simple_storage.sql`
- Removes cognitive tiering columns if present
- Ensures all required columns exist
- Creates proper indexes and constraints
2. **Verify Schema**:
```sql
\d memories
\di memories*
SELECT conname, contype, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'memories'::regclass;
```
## Statistics View
A simple statistics view is available for monitoring:
```sql
CREATE VIEW memory_statistics AS
SELECT
COUNT(*) as total_memories,
COUNT(DISTINCT content_hash) as unique_contents,
COUNT(DISTINCT context_fingerprint) as unique_contexts,
AVG(LENGTH(content)) as avg_content_length,
COUNT(DISTINCT parent_id) as parent_documents,
SUM(CASE WHEN chunk_index IS NOT NULL THEN 1 ELSE 0 END) as chunked_memories,
MIN(created_at) as oldest_memory,
MAX(created_at) as newest_memory
FROM memories;
```
Query with: `SELECT * FROM memory_statistics;`
## Notes
- **Removed Features**: Cognitive tiering system (tier, last_accessed, access_count, importance_score, consolidation_strength) has been removed to simplify the architecture
- **Preserved Features**: Context-aware deduplication and semantic chunking remain as they provide value without complexity
- **Future Compatibility**: The JSONB `metadata` field allows for extensibility without schema changes