Module vectorscan::alloc

Available on crate feature alloc only.

Routines for overriding the allocators used in several components of vectorscan.

Use Cases

set_allocator() sets all of the allocators at once, while the set_*_allocator() functions such as set_db_allocator() override the allocation logic for individual components of vectorscan. In either case, the get_*_allocator() functions such as get_db_allocator() enable introspection of the active allocator (which defaults to libc::malloc() and libc::free() if unset).

Nonstandard Allocators

These methods can be used to wrap nonstandard allocators such as jemalloc for vectorscan usage:

 #[cfg(feature = "compiler")]
 fn main() -> Result<(), vectorscan::error::VectorscanError> {
   use vectorscan::{expression::*, flags::*, matchers::*};
   use jemallocator::Jemalloc;

   // Use jemalloc for all vectorscan allocations.
   vectorscan::alloc::set_allocator(Jemalloc.into())?;

   // Everything works as normal.
   let expr: Expression = "(he)ll".parse()?;
   let db = expr.compile(Flags::default(), Mode::BLOCK)?;

   let mut scratch = db.allocate_scratch()?;

   let mut matches: Vec<&str> = Vec::new();
   scratch
     .scan_sync(&db, "hello".into(), |m| {
       matches.push(unsafe { m.source.as_str() });
       MatchResult::Continue
     })?;
   assert_eq!(&matches, &["hell"]);
   Ok(())
 }

Inspecting Live Allocations

This module also supports inspecting live allocations with LayoutTracker::current_allocations() without changing the allocation logic itself, by wrapping the standard System allocator:

 #[cfg(feature = "compiler")]
 fn main() -> Result<(), vectorscan::error::VectorscanError> {
   use vectorscan::{expression::*, flags::*, database::*, alloc::*};
   use std::{alloc::System, mem::ManuallyDrop};

   // Wrap the standard Rust System allocator.
   let tracker = LayoutTracker::new(System.into());
   // Register it as the allocator for databases.
   assert!(set_db_allocator(tracker)?.is_none());

   // Create a database.
   let expr: Expression = "asdf".parse()?;
   let mut db = expr.compile(Flags::SOM_LEFTMOST, Mode::BLOCK)?;

   // Get the database allocator we just registered and view its live allocations:
   let allocs = get_db_allocator().as_ref().unwrap().current_allocations();
   // Verify that only the single known db was allocated:
   assert_eq!(1, allocs.len());
   let (p, _layout) = allocs[0];
   // .as_ref_native() and .as_mut_native() provide references to the wrapped pointer:
   let db_ptr: *mut NativeDb = db.as_mut_native();
   assert_eq!(p.as_ptr() as *mut NativeDb, db_ptr);

   // Demonstrate that we can actually use this pointer as a reference to the database,
   // although we have to be careful about shared mutable access,
   // so we can't run the drop code for example.
   let db = ManuallyDrop::new(unsafe { Database::from_native(p.as_ptr() as *mut NativeDb) });
   // We can inspect properties of the database with this reference:
   assert_eq!(db.database_size()?, 936);
   Ok(())
 }
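
To make the idea of a layout-tracking wrapper concrete, here is a minimal sketch that uses only the standard library. TrackingAlloc and demo() are hypothetical names chosen for illustration. This is not the implementation of LayoutTracker, only one way such a wrapper could record each live pointer together with its Layout:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::collections::HashMap;
use std::ptr::NonNull;
use std::sync::Mutex;

/// Hypothetical stand-in for a layout-tracking wrapper (not vectorscan's LayoutTracker).
pub struct TrackingAlloc<A: GlobalAlloc> {
  inner: A,
  // Address of each live allocation, mapped to the layout it was requested with.
  live: Mutex<HashMap<usize, Layout>>,
}

impl<A: GlobalAlloc> TrackingAlloc<A> {
  pub fn new(inner: A) -> Self {
    Self { inner, live: Mutex::new(HashMap::new()) }
  }

  /// Allocate through the wrapped allocator and record the result.
  /// Returns None for zero-sized requests and for allocation failure.
  pub fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
    if layout.size() == 0 {
      return None;
    }
    // SAFETY: the layout has a non-zero size, as GlobalAlloc::alloc() requires.
    let p = NonNull::new(unsafe { self.inner.alloc(layout) })?;
    self.live.lock().unwrap().insert(p.as_ptr() as usize, layout);
    Some(p)
  }

  /// Free a pointer previously returned by allocate().
  /// Unknown pointers are left alone and reported with `false`.
  pub fn deallocate(&self, p: NonNull<u8>) -> bool {
    let removed = self.live.lock().unwrap().remove(&(p.as_ptr() as usize));
    match removed {
      Some(layout) => {
        // SAFETY: `p` came from `self.inner` with exactly this layout.
        unsafe { self.inner.dealloc(p.as_ptr(), layout) };
        true
      }
      None => false,
    }
  }

  /// Snapshot of every allocation that has not been freed yet.
  pub fn current_allocations(&self) -> Vec<(NonNull<u8>, Layout)> {
    self.live.lock().unwrap().iter()
      .map(|(&addr, &layout)| (NonNull::new(addr as *mut u8).unwrap(), layout))
      .collect()
  }
}

/// Allocate one 64-byte block from System, then report the number of live
/// allocations before and after freeing it.
pub fn demo() -> (usize, usize) {
  let tracker = TrackingAlloc::new(System);
  let layout = Layout::from_size_align(64, 8).unwrap();
  let p = tracker.allocate(layout).expect("allocation failed");
  let before = tracker.current_allocations().len();
  tracker.deallocate(p);
  (before, tracker.current_allocations().len())
}
```

Keeping the Layout next to each pointer is what lets the wrapper free a block later without the caller having to remember its size and alignment.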

Global State

These methods mutate global process state by setting the function pointers used for alloc and free. This module therefore requires the "alloc" feature, which in turn requires the "static" feature. The "static" feature statically links the vectorscan native library to ensure exclusive access to this global state.
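
The examples in this module show set_db_allocator() returning None on first registration and the previously registered tracker afterwards. One way to picture that behavior is a single process-wide slot. The following is a minimal sketch under that assumption, using only the standard library. Slot and DB_SLOT are hypothetical names, and a string label stands in for a real allocator handle. It does not describe how vectorscan stores its allocators internally:

```rust
use std::sync::Mutex;

/// Hypothetical registry slot; `T` stands in for whatever allocator handle is stored.
pub struct Slot<T>(Mutex<Option<T>>);

impl<T> Slot<T> {
  /// An empty slot, usable in a `static`.
  pub const fn new() -> Self {
    Self(Mutex::new(None))
  }

  /// Install `new` and hand back whatever was registered before.
  pub fn set(&self, new: T) -> Option<T> {
    self.0.lock().unwrap().replace(new)
  }
}

impl<T: Clone> Slot<T> {
  /// Return a copy of the current registration, if any.
  pub fn get(&self) -> Option<T> {
    self.0.lock().unwrap().clone()
  }
}

/// One slot per process, shared by every caller.
pub static DB_SLOT: Slot<&'static str> = Slot::new();
```

Because the slot is a static, every caller in the process observes the same registration, which is why exclusive access to the underlying native state matters.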

Lifetimes and Dangling Pointers

These methods allow the registered allocator to be reset more than once over the lifetime of the program, but trying to drop any object allocated with a previous allocator will cause an error. ManuallyDrop and the from_native() methods (such as Database::from_native()) can be used to manage the lifetime of objects across multiple allocators:

 #[cfg(feature = "compiler")]
 fn main() -> Result<(), vectorscan::error::VectorscanError> {
   use vectorscan::{expression::*, flags::*, database::*, matchers::*, alloc::*};
   use std::{alloc::System, mem::ManuallyDrop};

   // Set the process-global allocator to use for Database instances:
   let tracker = LayoutTracker::new(System.into());
   // There was no custom allocator registered yet.
   assert!(set_db_allocator(tracker)?.is_none());

   let expr: Expression = "asdf".parse()?;
   // Use ManuallyDrop to avoid calling the vectorscan db free method,
   // since we will be invalidating the pointer by changing the allocator,
   // and the .try_drop() method and Drop impl both call into
   // whatever allocator is currently active to free the pointer, which will error.
   let mut db = ManuallyDrop::new(expr.compile(Flags::SOM_LEFTMOST, Mode::BLOCK)?);

   // Change the allocator to a fresh LayoutTracker:
   let tracker = set_db_allocator(LayoutTracker::new(System.into()))?.unwrap();
   // Get the extant allocations from the old LayoutTracker:
   let allocs = tracker.current_allocations();
   // Verify that only the single known db was allocated:
   assert_eq!(1, allocs.len());
   let (p, layout) = allocs[0];
   let db_ptr: *mut NativeDb = db.as_mut_native();
   assert_eq!(p.as_ptr() as *mut NativeDb, db_ptr);

   // Despite having reset the allocator, our previous db is still valid
   // and can be used for matching:
   let mut scratch = db.allocate_scratch()?;
   let mut matches: Vec<&str> = Vec::new();
   scratch.scan_sync(&db, "asdf asdf".into(), |m| {
     matches.push(unsafe { m.source.as_str() });
     MatchResult::Continue
   })?;
   assert_eq!(&matches, &["asdf", "asdf"]);

   // We can deserialize something from somewhere else into the db handle:
   let expr: Literal = "hello".parse()?;
   let serialized_db = expr.compile(Flags::SOM_LEFTMOST, Mode::BLOCK)?.serialize()?;
   // Ensure the allocated database is large enough to contain the deserialized one:
   assert!(layout.size() >= serialized_db.deserialized_size()?);
   // NB: overwrite the old database!
   unsafe { serialized_db.deserialize_db_at(db.as_mut_native())?; }

   // Reuse the same database object now:
   scratch.setup_for_db(&db)?;
   matches.clear();
   scratch.scan_sync(&db, "hello hello".into(), |m| {
     matches.push(unsafe { m.source.as_str() });
     MatchResult::Continue
   })?;
   assert_eq!(&matches, &["hello", "hello"]);

   // Deallocate the db by hand here (ensure no other handles point to it):
   tracker.deallocate(p);
   // NB: `db` is now INVALID and points to FREED MEMORY!!!
   Ok(())
 }

Allocation Failures

Allocation failure should cause vectorscan methods to fail with VectorscanRuntimeError::NoMem:

 #[cfg(feature = "compiler")]
 fn main() -> Result<(), vectorscan::error::VectorscanError> {
   use vectorscan::{expression::*, flags::*, matchers::*, alloc::*, error::*};
   use std::{alloc::{GlobalAlloc, Layout}, mem::ManuallyDrop, ptr};

   let expr: Expression = "asdf".parse()?;
   // Wrap in ManuallyDrop because we will be clobbering the allocator,
   // including the free methods.
   let db = ManuallyDrop::new(expr.compile(Flags::SOM_LEFTMOST, Mode::BLOCK)?);

   struct BadAllocator;
   unsafe impl GlobalAlloc for BadAllocator {
     unsafe fn alloc(&self, _layout: Layout) -> *mut u8 { ptr::null_mut() }
     // If we wanted to cover allocations made before registering this one,
     // we could fall back to libc::free() for unrecognized pointers.
     unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
   }

   set_allocator(BadAllocator.into())?;

   // Most allocation methods fail with NoMem:
   assert!(matches!(
     db.allocate_scratch(),
     Err(VectorscanRuntimeError::NoMem),
   ));

   // Allocation failures during compilation are reported slightly differently:
   match expr.compile(Flags::SOM_LEFTMOST, Mode::BLOCK) {
     Err(VectorscanCompileError::Compile(CompileError { message, expression })) => {
       assert!(message == "Unable to allocate memory.");
       assert!(expression == Some(ExpressionIndex(0)));
     },
     _ => unreachable!(),
   }
   Ok(())
 }
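
An allocator that always fails is the bluntest way to exercise these error paths. For finer control, a test could use an allocator that fails only once a byte budget is exhausted. The following is a minimal sketch using only the standard library. QuotaAlloc is a hypothetical helper and not part of vectorscan; it is assumed here that it could be registered the same way as BadAllocator above, since it also implements GlobalAlloc:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical test helper: forwards to System until a byte budget runs out.
pub struct QuotaAlloc {
  remaining: AtomicUsize,
}

impl QuotaAlloc {
  pub const fn new(budget: usize) -> Self {
    Self { remaining: AtomicUsize::new(budget) }
  }

  /// Bytes that can still be allocated before alloc() starts returning null.
  pub fn remaining(&self) -> usize {
    self.remaining.load(Ordering::SeqCst)
  }
}

unsafe impl GlobalAlloc for QuotaAlloc {
  unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
    // Atomically reserve the bytes; the update is rejected if the budget is too small.
    let reserved = self.remaining.fetch_update(
      Ordering::SeqCst,
      Ordering::SeqCst,
      |r| r.checked_sub(layout.size()),
    );
    if reserved.is_err() {
      return ptr::null_mut();
    }
    // SAFETY: the caller upholds the GlobalAlloc::alloc() contract for `layout`.
    let p = unsafe { System.alloc(layout) };
    if p.is_null() {
      // The underlying allocation failed, so return the reserved bytes.
      self.remaining.fetch_add(layout.size(), Ordering::SeqCst);
    }
    p
  }

  unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
    // SAFETY: the caller guarantees `ptr` came from alloc() with this layout.
    unsafe { System.dealloc(ptr, layout) };
    self.remaining.fetch_add(layout.size(), Ordering::SeqCst);
  }
}
```

Unlike BadAllocator, this helper must only be handed pointers it allocated itself, because dealloc() both frees the memory and credits the budget.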

Modules

  • chimera
    Routines for overriding the allocators used in the chimera library.
