prollytree 0.3.2

A prolly (probabilistic) tree for efficient storage, retrieval, and modification of ordered data.
Documentation
Advanced Usage
==============

This guide covers advanced features and performance optimization techniques for ProllyTree.

Performance Optimization
-------------------------

Batch Operations
~~~~~~~~~~~~~~~~

For better performance when inserting many items, use batch operations:

.. code-block:: python

   from prollytree import ProllyTree

   tree = ProllyTree()

   # Instead of individual inserts
   for i in range(1000):
       tree.insert(f"key_{i}".encode(), f"value_{i}".encode())

   # Use batch insert (much faster)
   batch_data = [
       (f"key_{i}".encode(), f"value_{i}".encode())
       for i in range(1000)
   ]
   tree.insert_batch(batch_data)

Storage Backends
~~~~~~~~~~~~~~~~

Choose the appropriate storage backend for your use case:

.. code-block:: python

   from prollytree import ProllyTree, VersionedKvStore

   # In-memory (fastest, not persistent)
   tree = ProllyTree()

   # File-based storage (persistent)
   tree = ProllyTree(storage_type="file", path="/path/to/data")

   # Versioned storage with Git-like history
   store = VersionedKvStore("/path/to/versioned_data")

Tree Configuration
~~~~~~~~~~~~~~~~~~

Tune tree parameters for your workload:

.. code-block:: python

   from prollytree import ProllyTree, TreeConfig

   # Default configuration
   config = TreeConfig()

   # Custom configuration for specific workloads
   config = TreeConfig(
       base=8,        # Higher base for wider trees (good for read-heavy)
       modulus=128,   # Higher modulus for deeper trees (good for write-heavy)
   )

   tree = ProllyTree(config=config)

Concurrent Access
-----------------

Thread Safety
~~~~~~~~~~~~~

For multi-threaded applications:

.. code-block:: python

   import threading
   from prollytree import ProllyTree

   # Create a thread-safe tree
   tree = ProllyTree(thread_safe=True)

   def worker(thread_id):
       for i in range(100):
           key = f"thread_{thread_id}_key_{i}".encode()
           value = f"value_{i}".encode()
           tree.insert(key, value)

   # Start multiple threads
   threads = []
   for i in range(4):
       t = threading.Thread(target=worker, args=(i,))
       threads.append(t)
       t.start()

   for t in threads:
       t.join()

Memory Management
-----------------

LRU Cache
~~~~~~~~~

Enable LRU caching for read-heavy workloads:

.. code-block:: python

   from prollytree import ProllyTree, CacheConfig

   cache_config = CacheConfig(
       max_size=10000,  # Cache up to 10k nodes
       eviction_policy="lru"
   )

   tree = ProllyTree(cache_config=cache_config)

Memory Monitoring
~~~~~~~~~~~~~~~~~

Monitor memory usage:

.. code-block:: python

   tree = ProllyTree()

   # Insert data
   for i in range(10000):
       tree.insert(f"key_{i}".encode(), f"value_{i}".encode())

   # Get memory statistics
   stats = tree.get_memory_stats()
   print(f"Nodes in memory: {stats['node_count']}")
   print(f"Memory usage: {stats['memory_bytes']} bytes")
   print(f"Cache hit rate: {stats['cache_hit_rate']}%")

Data Serialization
-------------------

Custom Serialization
~~~~~~~~~~~~~~~~~~~~~

For complex data types:

.. code-block:: python

   import json
   import pickle
   from prollytree import ProllyTree

   tree = ProllyTree()

   # JSON serialization
   def store_json(tree, key, data):
       serialized = json.dumps(data).encode('utf-8')
       tree.insert(key.encode('utf-8'), serialized)

   def load_json(tree, key):
       data = tree.find(key.encode('utf-8'))
       return json.loads(data.decode('utf-8')) if data else None

   # Usage
   complex_data = {
       "user": "alice",
       "scores": [95, 87, 92],
       "metadata": {"premium": True, "last_login": "2023-01-01"}
   }

   store_json(tree, "user:alice", complex_data)
   retrieved = load_json(tree, "user:alice")

SQL Advanced Queries
---------------------

Complex Joins and Aggregations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from prollytree import ProllySQLStore

   sql_store = ProllySQLStore("/path/to/sql_data")

   # Create tables
   sql_store.execute("""
       CREATE TABLE users (
           id INTEGER PRIMARY KEY,
           name TEXT,
           department_id INTEGER,
           salary REAL
       )
   """)

   sql_store.execute("""
       CREATE TABLE departments (
           id INTEGER PRIMARY KEY,
           name TEXT,
           budget REAL
       )
   """)

   # Complex aggregation query
   result = sql_store.execute("""
       SELECT
           d.name as department,
           COUNT(u.id) as employee_count,
           AVG(u.salary) as avg_salary,
           MAX(u.salary) as max_salary,
           SUM(u.salary) as total_salary
       FROM departments d
       LEFT JOIN users u ON d.id = u.department_id
       GROUP BY d.id, d.name
       HAVING COUNT(u.id) > 0
       ORDER BY avg_salary DESC
   """)

Error Handling and Debugging
-----------------------------

Exception Handling
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from prollytree import ProllyTree, ProllyTreeError, StorageError

   try:
       tree = ProllyTree(storage_type="file", path="/invalid/path")
       tree.insert(b"key", b"value")
   except StorageError as e:
       print(f"Storage error: {e}")
   except ProllyTreeError as e:
       print(f"Tree operation error: {e}")
   except Exception as e:
       print(f"Unexpected error: {e}")

Debug Mode
~~~~~~~~~~

.. code-block:: python

   # Enable debug logging
   tree = ProllyTree(debug=True, log_level="DEBUG")

   # Validate tree structure
   is_valid = tree.validate()
   if not is_valid:
       print("Tree structure is corrupted!")

   # Get detailed statistics
   stats = tree.get_debug_stats()
   print(f"Tree height: {stats['height']}")
   print(f"Node distribution: {stats['node_distribution']}")
   print(f"Rebalancing events: {stats['rebalance_count']}")

Migration and Backup
---------------------

Data Export/Import
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Export tree data
   tree.export_to_file("/path/to/backup.json", format="json")
   tree.export_to_file("/path/to/backup.bin", format="binary")

   # Import tree data
   new_tree = ProllyTree()
   new_tree.import_from_file("/path/to/backup.json", format="json")

This advanced guide covers performance optimization, concurrent access patterns, memory management, complex data operations, and debugging techniques for ProllyTree.