subsecond/lib.rs
#![allow(clippy::needless_doctest_main)]
//! # Subsecond: Hot-patching for Rust
//!
//! Subsecond is a library that enables hot-patching for Rust applications. This allows you to change
//! the code of a running application without restarting it. This is useful for game engines, servers,
//! and other long-running applications where the typical edit-compile-run cycle is too slow.
//!
//! Subsecond also implements a technique we call "ThinLinking" which makes compiling Rust code
//! significantly faster in development mode, and which can be used outside of hot-patching.
//!
//! # Usage
//!
//! Subsecond is designed to be simple for both application developers and library authors.
//!
//! Simply call your existing functions with [`call`] and Subsecond will automatically detour
//! that call to the latest version of the function.
//!
//! ```rust
//! for x in 0..5 {
//!     subsecond::call(|| {
//!         println!("Hello, world! {}", x);
//!     });
//! }
//! ```
//!
//! To actually load patches into your application, a third-party tool that implements the Subsecond
//! compiler and protocol is required. Subsecond is built and maintained by the Dioxus team, so we
//! suggest using the Dioxus CLI.
//!
//! To install the Dioxus CLI, we recommend using [`cargo binstall`](https://crates.io/crates/cargo-binstall):
//!
//! ```sh
//! cargo binstall dioxus-cli
//! ```
//!
//! The Dioxus CLI provides several tools for development. To run your application with Subsecond enabled,
//! use `dx serve` - this takes the same arguments as `cargo run` but will automatically hot-reload your
//! application when changes are detected.
//!
//! As of Dioxus 0.7, the `--hotpatch` flag is required to use hotpatching while Subsecond is still experimental:
//!
//! ```sh
//! dx serve --hotpatch
//! ```
//!
//! ## How it works
//!
//! Subsecond works by detouring function calls through a jump table. This jump table contains the latest
//! version of the program's function pointers, and when a function is called, Subsecond will look up
//! the function in the jump table and call that version instead.
//!
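//! As a minimal, illustrative sketch (not Subsecond's real machinery - the functions and the
//! `HashMap`-based table here are stand-ins), a jump-table detour boils down to "look up my own
//! address before calling":
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! fn greet_v1() -> &'static str { "hello" }
//! fn greet_v2() -> &'static str { "hello, patched" } // pretend this came from a patch
//!
//! // Map of original function address -> latest function address.
//! fn detour(table: &HashMap<usize, usize>, f: fn() -> &'static str) -> &'static str {
//!     let addr = f as usize;
//!     let latest = table.get(&addr).copied().unwrap_or(addr);
//!     // SAFETY: both addresses refer to functions with the same signature.
//!     let latest: fn() -> &'static str = unsafe { std::mem::transmute(latest) };
//!     latest()
//! }
//!
//! let mut table = HashMap::new();
//! assert_eq!(detour(&table, greet_v1), "hello"); // no patch applied yet
//! table.insert(greet_v1 as usize, greet_v2 as usize);
//! assert_eq!(detour(&table, greet_v1), "hello, patched"); // call is detoured
//! ```
//!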
//! Unlike libraries like [detour](https://crates.io/crates/detour), Subsecond *does not* modify your
//! process memory. Patching pointers is wildly unsafe and can lead to crashes and undefined behavior.
//!
//! Instead, an external tool compiles only the parts of your project that changed, links them together
//! using the addresses of the functions in your running program, and then sends the new jump table to
//! your application. Subsecond then applies the patch and continues running. Since Subsecond doesn't
//! modify memory, the program must have a runtime integration to handle the patching.
//!
//! If the framework you're using doesn't integrate with Subsecond, you can rely on the fact that calls
//! to stale [`call`] instances will emit a safe panic that is automatically caught and retried
//! by the next [`call`] instance up the callstack.
//!
//! Subsecond is only enabled when `debug_assertions` are enabled, so you can safely ship your application
//! with Subsecond as a dependency without worrying about performance overhead.
//!
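//! The catch-and-retry mechanism can be sketched with plain `std` (using a stand-in payload type
//! and a one-shot "patch" flag in place of [`HotFnPanic`] and a real patch):
//!
//! ```rust
//! use std::panic::{self, AssertUnwindSafe};
//! use std::sync::atomic::{AtomicBool, Ordering};
//!
//! struct StalePanic; // stand-in for Subsecond's panic payload
//! static PATCHED: AtomicBool = AtomicBool::new(false);
//!
//! fn inner() -> &'static str {
//!     if !PATCHED.swap(true, Ordering::Relaxed) {
//!         // Stale on the first run: unwind up to the enclosing call.
//!         panic::panic_any(StalePanic);
//!     }
//!     "ran with fresh code"
//! }
//!
//! fn call_with_retry(f: fn() -> &'static str) -> &'static str {
//!     loop {
//!         match panic::catch_unwind(AssertUnwindSafe(f)) {
//!             Ok(v) => return v,
//!             // Swallow only our own payload; anything else keeps unwinding.
//!             Err(e) if e.downcast_ref::<StalePanic>().is_some() => continue,
//!             Err(e) => panic::resume_unwind(e),
//!         }
//!     }
//! }
//!
//! assert_eq!(call_with_retry(inner), "ran with fresh code");
//! ```
//!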
//! ## Workspace support
//!
//! Subsecond currently only patches the "tip" crate - i.e. the crate in which your `main.rs` is located.
//! Changes to crates outside this crate will be ignored, which can be confusing. We plan to add full
//! workspace support in the future, but for now be aware of this limitation. Crate setups that have
//! a `main.rs` importing a `lib.rs` won't patch sensibly, since the crate becomes a library for itself.
//!
//! This is due to limitations in rustc itself, where the build graph is non-deterministic and changes
//! to functions that forward generics can cause a cascade of codegen changes.
//!
//! ## Globals, statics, and thread-locals
//!
//! Subsecond *does* support hot-reloading of globals, statics, and thread-locals. However, there are several limitations:
//!
//! - You may add new globals at runtime, but their destructors will never be called.
//! - Globals are tracked across patches, but renames are considered to be *new* globals.
//! - Changes to static initializers will not be observed.
//!
//! Subsecond purposefully handles statics this way, since many libraries like Dioxus and Tokio rely
//! on persistent global runtimes.
//!
//! HUGE WARNING: Currently, thread-locals in the "tip" crate (the one being patched) will seemingly
//! reset to their initial value on new patches. This is because we don't currently bind thread-locals
//! in the patches to their original addresses in the main program. Sufficiently complex setups that
//! rely heavily on thread-locals in the tip crate might crash or even segfault. We plan to fix this
//! in the future, but for now, be aware of this limitation.
//!
//! ## Struct layout and alignment
//!
//! Subsecond currently does not support hot-reloading of structs. This is because the generated code
//! assumes a particular layout and alignment of the struct. If layout or alignment change and new
//! functions are called referencing an old version of the struct, the program will crash.
//!
//! To mitigate this, framework authors can integrate with Subsecond to either dispose of the old struct
//! or re-allocate the struct in a way that is compatible with the new layout. This is called "re-instancing."
//!
//! In practice, frameworks that implement Subsecond patching properly will throw out the old state,
//! so you should never witness a segfault due to misalignment or size changes. Frameworks are
//! encouraged to aggressively dispose of old state that might cause size and alignment changes.
//!
//! We'd like to lift this limitation in the future by providing utilities to re-instantiate structs,
//! but for now it's up to framework authors to handle this. For example, Dioxus apps simply throw
//! out the old state and rebuild it from scratch.
//!
//! ## Pointer versioning
//!
//! Currently, Subsecond does not "version" function pointers. We have plans to provide this metadata
//! so framework authors can safely memoize changes without much runtime overhead. Frameworks like
//! Dioxus and Bevy circumvent this issue by using the `TypeId` of structs passed to hot functions as
//! well as the `ptr_address` method on [`HotFn`] to determine if the function pointer has changed.
//!
//! Currently, the `ptr_address` method will always return the most up-to-date version of the function
//! even if the function contents themselves did not change. In essence, this is equivalent to a
//! versioning scheme where every function is considered "new" on each patch. This means that framework
//! authors who integrate re-instancing in their apps might dispose of old state too aggressively.
//! For now, this is the safer and more practical approach.
//!
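//! As a hedged sketch of that memoization strategy (the types and names below are illustrative,
//! not Subsecond's API), a framework can key its state on the last-seen pointer and dispose of
//! it whenever the pointer changes:
//!
//! ```rust
//! #[derive(PartialEq, Clone, Copy)]
//! struct FnKey(u64); // stand-in for a `HotFnPtr`
//!
//! struct Memoized {
//!     key: FnKey,
//!     state: Vec<u8>, // framework state built "above" the function
//! }
//!
//! impl Memoized {
//!     /// Returns true if the state survived, false if it was re-instanced.
//!     fn update(&mut self, current: FnKey) -> bool {
//!         if self.key == current {
//!             return true; // same pointer: safe to keep state
//!         }
//!         self.state.clear(); // pointer changed: dispose aggressively
//!         self.key = current;
//!         false
//!     }
//! }
//!
//! let mut m = Memoized { key: FnKey(1), state: vec![1, 2, 3] };
//! assert!(m.update(FnKey(1)));  // unpatched: state preserved
//! assert!(!m.update(FnKey(2))); // patched: state disposed
//! assert!(m.state.is_empty());
//! ```
//!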
//! ## Nesting calls
//!
//! Subsecond calls are designed to be nested. This provides clean integration points to know exactly
//! where a hooked function is called.
//!
//! The highest-level call is `fn main()`, though by default this is not hooked, since initialization
//! code tends to be side-effectful and modify global state. Instead, we recommend wrapping the
//! hot-patch points manually with [`call`].
//!
//! ```rust
//! fn main() {
//!     // Changes to the `for` loop will cause an unwind to this call.
//!     subsecond::call(|| {
//!         for x in 0..5 {
//!             // Changes to the `println!` will be isolated to this call.
//!             subsecond::call(|| {
//!                 println!("Hello, world! {}", x);
//!             });
//!         }
//!     });
//! }
//! ```
//!
//! The goal here is to provide granular control over where patches are applied to limit loss of state
//! when new code is loaded.
//!
//! ## Applying patches
//!
//! When running under the Dioxus CLI, the `dx serve` command will automatically apply patches when
//! changes are detected. Patches are delivered over the [Dioxus Devtools](https://crates.io/crates/dioxus-devtools)
//! websocket protocol and received by the corresponding websocket client in your app.
//!
//! If you're using Subsecond in your own application that doesn't have a runtime integration, you can
//! build an integration using the [`apply_patch`] function. This function takes a `JumpTable`, which
//! the dioxus-cli crate can generate.
//!
//! To add support for the Dioxus Devtools protocol to your app, you can use the [dioxus-devtools](https://crates.io/crates/dioxus-devtools)
//! crate, which provides a `connect` method that will automatically apply patches to your application.
//!
//! Unfortunately, one design quirk of Subsecond is that running apps need to communicate the address
//! of `main` to the patcher. This is due to a security technique called [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization)
//! which randomizes the addresses of functions in memory. See the subsecond-harness and subsecond-cli
//! crates for more details on how to implement the protocol.
//!
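//! The offset arithmetic itself is simple. As a self-contained sketch (all addresses below are
//! made up; in reality they come from the running process and the patch linker):
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Rebase a jump table from link-time addresses to live addresses.
//! fn rebase(
//!     map: HashMap<u64, u64>,
//!     aslr_reference: u64,   // address of `main` observed at runtime
//!     linked_reference: u64, // address of `main` the table was built against
//!     new_offset: u64,       // load bias of the freshly dlopen'd patch
//! ) -> HashMap<u64, u64> {
//!     let old_offset = aslr_reference.wrapping_sub(linked_reference);
//!     map.into_iter()
//!         .map(|(k, v)| (k.wrapping_add(old_offset), v.wrapping_add(new_offset)))
//!         .collect()
//! }
//!
//! let table = HashMap::from([(0x1000_u64, 0x2000_u64)]);
//! let rebased = rebase(table, 0x5000_1000, 0x1000, 0x7000_0000);
//! assert_eq!(rebased[&0x5000_1000], 0x7000_2000);
//! ```
//!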
//! ## ThinLink
//!
//! ThinLink is a program linker for Rust that is designed to be used with Subsecond. It implements
//! the powerful patching system that Subsecond uses to hot-reload Rust applications.
//!
//! ThinLink is simply a wrapper around your existing linker, but with extra features:
//!
//! - Automatic dynamic linking to dependencies
//! - Generation of Subsecond jump tables
//! - Diffing of object files for function invalidation
//!
//! Because ThinLink performs very little actual linking, it drastically speeds up traditional Rust
//! development. With a development-optimized profile, ThinLink can shrink an incremental build to less than 500ms.
//!
//! ThinLink is automatically integrated into the Dioxus CLI, though it's currently not available as
//! a standalone tool.
//!
//! ## Limitations
//!
//! Subsecond is a powerful tool, but it has several limitations. We talk about them above, but here's
//! a quick summary:
//!
//! - Struct hot-reloading requires re-instancing or unwinding
//! - Statics are tracked but not destructed
//!
//! ## Platform support
//!
//! Subsecond works across all major platforms:
//!
//! - Android (arm64-v8a, armeabi-v7a)
//! - iOS (arm64)
//! - Linux (x86_64, aarch64)
//! - macOS (x86_64, aarch64)
//! - Windows (x86_64, arm64)
//! - WebAssembly (wasm32)
//!
//! If you have a new platform you'd like to see supported, please open an issue on the Subsecond repository.
//! We are keen to add support for new platforms like wasm64, riscv64, and more.
//!
//! Note that iOS devices are currently not supported due to code-signing requirements. We hope to fix
//! this in the future, but for now you can use the simulator to test your app.
//!
//! ## Adding the Subsecond badge to your project
//!
//! If you're a framework author and want your users to know that your library supports Subsecond, you
//! can add the Subsecond badge to your README! Users will know that your library is hot-reloadable and
//! can be used with Subsecond.
//!
//! [](https://crates.io/crates/subsecond)
//!
//! ```markdown
//! [](https://crates.io/crates/subsecond)
//! ```
//!
//! ## License
//!
//! Subsecond and ThinLink are licensed under the MIT license. See the LICENSE file for more information.
//!
//! ## Supporting this work
//!
//! Subsecond is a project by the Dioxus team. If you'd like to support our work, please consider
//! [sponsoring us on GitHub](https://github.com/sponsors/DioxusLabs) or eventually deploying your
//! apps with Dioxus Deploy (currently under construction).

pub use subsecond_types::JumpTable;

use std::{
    backtrace,
    mem::transmute,
    panic::AssertUnwindSafe,
    sync::{atomic::AtomicPtr, Arc, Mutex},
};

/// Call a given function with hot-reloading enabled. If the function's code changes, `call` will use
/// the new version of the function. If code *above* the function changes, this will emit a panic
/// that forces an unwind to the next [`call`] instance.
///
/// Rust on WASM does not support unwinding, so [`call`] will not track dependency graph changes.
/// If you are building a framework for use on WASM, you will need to use `Subsecond::HotFn` directly.
///
/// However, if you wrap your calling code in a future, you *can* simply drop the future, which will
/// cause `drop` to execute and get something similar to unwinding. This is not great if RefCells are open.
pub fn call<O>(mut f: impl FnMut() -> O) -> O {
    // Only run in debug mode - the rest of this function will dissolve away
    if !cfg!(debug_assertions) {
        return f();
    }

    let mut hotfn = HotFn::current(f);
    loop {
        let res = std::panic::catch_unwind(AssertUnwindSafe(|| hotfn.call(())));

        // If the call succeeds, just return the result; otherwise we try to handle the panic if it's our own.
        let err = match res {
            Ok(res) => return res,
            Err(err) => err,
        };

        // If this is our panic then let's handle it, otherwise we just resume unwinding
        let Some(_hot_payload) = err.downcast_ref::<HotFnPanic>() else {
            std::panic::resume_unwind(err);
        };
    }
}

// We use an AtomicPtr with a leaked JumpTable and Relaxed ordering to give us a global jump table
// with very little overhead. Reading this amounts to a Relaxed atomic load, which is basically
// no overhead. We might want to look into using a thread_local with a stop-the-world approach
// just in case multiple threads try to call the jump table before synchronization with the runtime.
// For Dioxus purposes, this is not a big deal, but for libraries like Bevy which heavily rely on
// multithreading, it might become an issue.
static APP_JUMP_TABLE: AtomicPtr<JumpTable> = AtomicPtr::new(std::ptr::null_mut());
static HOTRELOAD_HANDLERS: Mutex<Vec<Arc<dyn Fn() + Send + Sync>>> = Mutex::new(Vec::new());

/// Register a function that will be called whenever a patch is applied.
///
/// This handler will be run immediately after the patch library is loaded into the process and the
/// JumpTable has been set.
pub fn register_handler(handler: Arc<dyn Fn() + Send + Sync + 'static>) {
    HOTRELOAD_HANDLERS.lock().unwrap().push(handler);
}

/// Get the current jump table, if it exists.
///
/// This will return `None` if no jump table has been set yet.
///
/// # Safety
///
/// The `JumpTable` returned here is a pointer into a leaked box. While technically this reference is
/// valid, we might change the implementation to invalidate the pointer between hotpatches.
///
/// You should only use this lifetime in temporary contexts - not *across* hotpatches!
pub unsafe fn get_jump_table() -> Option<&'static JumpTable> {
    let ptr = APP_JUMP_TABLE.load(std::sync::atomic::Ordering::Relaxed);
    if ptr.is_null() {
        return None;
    }

    Some(unsafe { &*ptr })
}

unsafe fn commit_patch(table: JumpTable) {
    APP_JUMP_TABLE.store(
        Box::into_raw(Box::new(table)),
        std::sync::atomic::Ordering::Relaxed,
    );
    HOTRELOAD_HANDLERS
        .lock()
        .unwrap()
        .clone()
        .iter()
        .for_each(|handler| {
            handler();
        });
}

/// A panic issued by the [`call`] function if the caller would be stale if called. This causes
/// an unwind to the next [`call`] instance that can properly handle the panic and retry the call.
///
/// This technique allows Subsecond to provide hot-reloading of codebases that don't have a runtime integration.
#[derive(Debug)]
pub struct HotFnPanic {
    _backtrace: backtrace::Backtrace,
}

/// A pointer to a hot-patched function.
#[non_exhaustive]
#[derive(PartialEq, Eq, Hash, Clone, Copy, Debug)]
pub struct HotFnPtr(pub u64);

impl HotFnPtr {
    /// Create a new [`HotFnPtr`].
    ///
    /// The safe way to get one is through [`HotFn::ptr_address`].
    ///
    /// # Safety
    ///
    /// The underlying `u64` must point to a valid function.
    pub unsafe fn new(index: u64) -> Self {
        Self(index)
    }
}

/// A hot-reloadable function.
///
/// To call this function, use the [`HotFn::call`] method. This will automatically use the latest
/// version of the function from the JumpTable.
pub struct HotFn<A, M, F>
where
    F: HotFunction<A, M>,
{
    inner: F,
    _marker: std::marker::PhantomData<(A, M)>,
}

impl<A, M, F: HotFunction<A, M>> HotFn<A, M, F> {
    /// Create a new [`HotFn`] instance with the current function.
    ///
    /// Whenever you call [`HotFn::call`], it will use the current function from the [`JumpTable`].
    pub const fn current(f: F) -> HotFn<A, M, F> {
        HotFn {
            inner: f,
            _marker: std::marker::PhantomData,
        }
    }

    /// Call the function with the given arguments.
    ///
    /// This will unwrap the [`HotFnPanic`] panic, propagating up to the next [`HotFn::call`].
    ///
    /// If you want to handle the panic yourself, use [`HotFn::try_call`].
    pub fn call(&mut self, args: A) -> F::Return {
        self.try_call(args).unwrap()
    }

    /// Get the address of the function in memory, which might be different than the original.
    ///
    /// This is useful for implementing a memoization strategy to safely preserve state across
    /// hot-patches. If the `ptr_address` of a function did not change between patches, then the
    /// state that exists "above" the function is still valid.
    ///
    /// Note that Subsecond does not track this state over time, so it's up to the runtime integration
    /// to track this state and diff it.
    pub fn ptr_address(&self) -> HotFnPtr {
        if size_of::<F>() == size_of::<fn() -> ()>() {
            let ptr: usize = unsafe { std::mem::transmute_copy(&self.inner) };
            return HotFnPtr(ptr as u64);
        }

        let known_fn_ptr = <F as HotFunction<A, M>>::call_it as *const () as usize;
        if let Some(jump_table) = unsafe { get_jump_table() } {
            if let Some(ptr) = jump_table.map.get(&(known_fn_ptr as u64)).cloned() {
                return HotFnPtr(ptr);
            }
        }

        HotFnPtr(known_fn_ptr as u64)
    }

    /// Attempt to call the function with the given arguments.
    ///
    /// If this function is stale and can't be updated in place (i.e., changes occurred above this call),
    /// then this function will emit a [`HotFnPanic`] which can be unwrapped and handled by the next
    /// [`call`] instance.
    pub fn try_call(&mut self, args: A) -> Result<F::Return, HotFnPanic> {
        if !cfg!(debug_assertions) {
            return Ok(self.inner.call_it(args));
        }

        unsafe {
            // Try to handle known function pointers. This is *really really* unsafe, but due to how
            // rust trait objects work, it's impossible to make an arbitrary usize-sized type implement Fn()
            // since that would require a vtable pointer, pushing out the bounds of the pointer size.
            if size_of::<F>() == size_of::<fn() -> ()>() {
                return Ok(self.inner.call_as_ptr(args));
            }

            // Handle trait objects. This will occur for sizes other than usize. Normal rust functions
            // become ZSTs and thus their <T as SomeFn>::call becomes a function pointer to the function.
            //
            // For non-ZST (trait object) types, there might be an issue. The real call function
            // will likely end up in the vtable and will never be hot-reloaded since the signature takes self.
            if let Some(jump_table) = get_jump_table() {
                let known_fn_ptr = <F as HotFunction<A, M>>::call_it as *const () as u64;
                if let Some(ptr) = jump_table.map.get(&known_fn_ptr).cloned() {
                    // The type sig of the cast should match the call_it function.
                    // Technically function pointers need to be aligned, but that alignment is 1, so we're good.
                    let call_it = transmute::<*const (), fn(&F, A) -> F::Return>(ptr as _);
                    return Ok(call_it(&self.inner, args));
                }
            }

            Ok(self.inner.call_it(args))
        }
    }

    /// Attempt to call the function with the given arguments, using the given [`HotFnPtr`].
    ///
    /// You can get a [`HotFnPtr`] from [`Self::ptr_address`].
    ///
    /// If this function is stale and can't be updated in place (i.e., changes occurred above this call),
    /// then this function will emit a [`HotFnPanic`] which can be unwrapped and handled by the next
    /// [`call`] instance.
    ///
    /// # Safety
    ///
    /// The [`HotFnPtr`] must be to a function whose argument layouts haven't changed.
    pub unsafe fn try_call_with_ptr(
        &mut self,
        ptr: HotFnPtr,
        args: A,
    ) -> Result<F::Return, HotFnPanic> {
        if !cfg!(debug_assertions) {
            return Ok(self.inner.call_it(args));
        }

        unsafe {
            // Try to handle known function pointers. This is *really really* unsafe, but due to how
            // rust trait objects work, it's impossible to make an arbitrary usize-sized type implement Fn()
            // since that would require a vtable pointer, pushing out the bounds of the pointer size.
            if size_of::<F>() == size_of::<fn() -> ()>() {
                return Ok(self.inner.call_as_ptr(args));
            }

            // Handle trait objects (sizes other than usize). The type sig of the cast should match
            // the call_it function. Technically function pointers need to be aligned, but that
            // alignment is 1, so we're good.
            let call_it = transmute::<*const (), fn(&F, A) -> F::Return>(ptr.0 as _);
            Ok(call_it(&self.inner, args))
        }
    }
}

/// Apply the patch using a given jump table.
///
/// # Safety
///
/// This function is unsafe because it detours existing functions in memory. This is *wildly* unsafe,
/// especially if the JumpTable is malformed. Only run this if you know what you're doing.
///
/// If the pointers are incorrect, function type signatures will be incorrect and the program will crash,
/// sometimes in a way that requires a restart of your entire computer. Be careful.
///
/// # Warning
///
/// This function will load the library and thus allocates. It cannot be used when the program is
/// stopped (i.e. in a signal handler).
pub unsafe fn apply_patch(mut table: JumpTable) -> Result<(), PatchError> {
    // On non-wasm platforms we can just use libloading and the known ASLR offsets to load the library
    #[cfg(any(unix, windows))]
    {
        // On Android we try to circumvent permissions issues by copying the library to a memmap and then libloading that
        #[cfg(target_os = "android")]
        let lib = Box::leak(Box::new(android_memmap_dlopen(&table.lib)?));

        #[cfg(not(target_os = "android"))]
        let lib = Box::leak(Box::new({
            match libloading::Library::new(&table.lib) {
                Ok(lib) => lib,
                Err(err) => return Err(PatchError::Dlopen(err.to_string())),
            }
        }));

        // Use the `main` symbol as a sentinel for the current executable. This is basically a
        // cross-platform version of `__mh_execute_header` on macOS that we can use to base the executable.
        let old_offset = aslr_reference() - table.aslr_reference as usize;

        // Use the `main` symbol as a sentinel for the loaded library. Might want to move away
        // from this at some point, or make it configurable.
        let new_offset = unsafe {
            // Leak the library. dlopen is basically a no-op on many platforms and if we even try to drop it,
            // some code might be called (ie drop) that results in really bad crashes (restart your computer...)
            //
            // This code currently assumes "main" always makes it to the export list (which it should)
            // and requires coordination from the CLI to export it.
            lib.get::<*const ()>(b"main")
                .ok()
                .unwrap()
                .try_as_raw_ptr()
                .unwrap()
                .wrapping_byte_sub(table.new_base_address as usize) as usize
        };

        // Modify the jump table to be relative to the base address of the loaded library
        table.map = table
            .map
            .iter()
            .map(|(k, v)| {
                (
                    (*k as usize + old_offset) as u64,
                    (*v as usize + new_offset) as u64,
                )
            })
            .collect();

        commit_patch(table);
    };

    // On wasm, we need to download the module, compile it, and then run it.
    #[cfg(target_arch = "wasm32")]
    wasm_bindgen_futures::spawn_local(async move {
        use js_sys::{
            ArrayBuffer, Object, Reflect,
            WebAssembly::{self, Instance, Memory, Module, Table},
        };
        use wasm_bindgen::prelude::*;
        use wasm_bindgen::JsValue;
        use wasm_bindgen::UnwrapThrowExt;
        use wasm_bindgen_futures::JsFuture;

        let funcs: Table = wasm_bindgen::function_table().unchecked_into();
        let memory: Memory = wasm_bindgen::memory().unchecked_into();
        let exports: Object = wasm_bindgen::exports().unchecked_into();

        let path = table.lib.to_str().unwrap();
        if !path.ends_with(".wasm") {
            return;
        }

        // Fetch + decode the patch wasm. Both awaits are pure I/O — they
        // touch no shared state, so the future is safe to drop here.
        let response: web_sys::Response =
            JsFuture::from(web_sys::window().unwrap_throw().fetch_with_str(path))
                .await
                .unwrap()
                .unchecked_into();
        if !response.ok() {
            panic!(
                "Failed to patch wasm module at {} - response failed with: {}",
                path,
                response.status_text()
            );
        }
        let dl_bytes: ArrayBuffer = JsFuture::from(response.array_buffer().unwrap())
            .await
            .unwrap()
            .unchecked_into();

        // Pre-compile. This is the slow part — V8 hands it to a worker —
        // and yields the longest. The result is a Module with no side
        // effects on host JS state, so cancelling here is also safe.
        let module: Module = JsFuture::from(WebAssembly::compile(dl_bytes.unchecked_ref()))
            .await
            .unwrap()
            .unchecked_into();

        // ── HOST-STATE-MUTATING SECTION ───────────────────────────────
        //
        // Below we grow shared linear memory and the indirect function
        // table, then async-instantiate the patch into them and commit
        // the new jump table. There IS one `.await` for the instantiate
        // (we can't avoid it: Chrome disallows synchronous
        // `new WebAssembly.Instance` on the main thread for modules
        // larger than 8MB, and patches routinely cross that), but it's
        // safe — the original race we fixed wasn't about yielding here:
        //
        // * `memory.grow` and `funcs.grow` each return their PRIOR
        //   length atomically. Concurrent `apply_patch` tasks therefore
        //   each get a unique, non-overlapping `memory_base` /
        //   `table_base`, so two patches can't land on the same region.
        // * Host code can't observe the half-instantiated patch: the
        //   new memory pages are zero, the new table slots are null,
        //   and the jump table isn't committed until the very end of
        //   this block, so nothing redirects through the new slots.
        // * The original bug — using `memory.buffer().byteLength()`
        //   captured before the awaits, which returned 0 if the buffer
        //   had been detached by a concurrent grow — is gone because
        //   we derive `memory_base` from `memory.grow()`'s return
        //   value instead.
        // * Cancellation between grow and `commit_patch` leaks memory
        //   pages and table slots, but doesn't corrupt anything.
        const PAGE_SIZE: u32 = 64 * 1024;
        let patch_pages = (dl_bytes.byte_length() as f64 / PAGE_SIZE as f64).ceil() as u32 + 1;

        // Use grow's return value (the prior page count) to derive
        // memory_base. Atomic w.r.t. concurrent grows, unlike reading
        // memory.buffer().byteLength().
        let prev_pages = memory.grow(patch_pages);
        let memory_base = (prev_pages + 1) * PAGE_SIZE;

        // grow returns the prior table length, which is __table_base.
        let table_base = funcs.grow(table.ifunc_count as u32).unwrap();

        // Rebase the jump table entries onto the patch's table slot range.
        for v in table.map.values_mut() {
            *v += table_base as u64;
        }

        // Build the env import object: copy every host export through and
        // add __memory_base / __table_base globals so the patch's PIC
        // code resolves correctly.
        let env = Object::new();
        for key in Object::keys(&exports) {
            Reflect::set(&env, &key, &Reflect::get(&exports, &key).unwrap()).unwrap();
        }
        for (name, value) in [("__table_base", table_base), ("__memory_base", memory_base)] {
            let descriptor = Object::new();
            Reflect::set(&descriptor, &"value".into(), &"i32".into()).unwrap();
            Reflect::set(&descriptor, &"mutable".into(), &false.into()).unwrap();
            let global = WebAssembly::Global::new(&descriptor, &value.into()).unwrap();
            Reflect::set(&env, &name.into(), &global.into()).unwrap();
        }
        let imports = Object::new();
        Reflect::set(&imports, &"env".into(), &env).unwrap();

        // Async instantiation of the precompiled module. We use the
        // (Module, imports) form of `WebAssembly.instantiate`, which
        // resolves directly to an `Instance` (not `{module, instance}`).
        // This is the no-size-limit path; the synchronous
        // `new WebAssembly.Instance` constructor is capped at 8MB on
        // Chrome's main thread.
        let instance: Instance = JsFuture::from(WebAssembly::instantiate_module(&module, &imports))
            .await
            .unwrap()
            .unchecked_into();
        let inst_exports: Object = instance.exports();

        // Run the patch's relocation thunks and constructors. Order
        // matters: data relocs first (write memory_base- and table_base-
        // relative pointers into the patch's data segment), then global
        // relocs (adjust GOT.func.internal globals by __table_base —
        // wasm-ld synthesizes those as element-segment-relative offsets),
        // then ctors. `dyn_into` instead of `unchecked_into` so missing
        // exports just no-op rather than throwing.
        for func_name in [
            "__wasm_apply_data_relocs",
            "__wasm_apply_global_relocs",
            "__wasm_call_ctors",
        ] {
            if let Ok(val) = Reflect::get(&inst_exports, &func_name.into()) {
                if let Ok(func) = val.dyn_into::<js_sys::Function>() {
                    _ = func.call0(&JsValue::undefined());
                }
            }
        }

        unsafe { commit_patch(table) };
    });

    Ok(())
}
692
693#[derive(Debug, PartialEq, thiserror::Error)]
694pub enum PatchError {
695 /// The patch failed to apply.
696 ///
697 /// This returns a string instead of the Dlopen error type so we don't need to bring the libloading
698 /// dependency into the public API.
699 #[error("Failed to load library: {0}")]
700 Dlopen(String),
701
702 /// The patch failed to apply on Android, most likely due to a permissions issue.
703 #[error("Failed to load library on Android: {0}")]
704 AndroidMemfd(String),
705}
706
707/// This function returns the address of the main function in the current executable. This is used as
708/// an anchor to reference the current executable's base address.
709///
710/// The point here being that we have a stable address both at runtime and compile time, making it
711/// possible to calculate the ASLR offset from within the process to correct the jump table.
712///
713/// It should only be called from the main executable *first* and not from a shared library since it
714/// self-initializes.
715#[doc(hidden)]
pub fn aslr_reference() -> usize {
    #[cfg(target_family = "wasm")]
    return 0;

    #[cfg(not(target_family = "wasm"))]
    unsafe {
        use std::ffi::c_void;

        // The first call to this function must occur in the main executable so that `main`
        // resolves against the executable itself; the result is cached here for later calls.
        static mut MAIN_PTR: *mut c_void = std::ptr::null_mut();

        if MAIN_PTR.is_null() {
            #[cfg(unix)]
            {
                MAIN_PTR = libc::dlsym(libc::RTLD_DEFAULT, c"main".as_ptr() as _);
            }

            #[cfg(windows)]
            {
                extern "system" {
                    fn GetModuleHandleA(lpModuleName: *const i8) -> *mut std::ffi::c_void;
                    fn GetProcAddress(
                        hModule: *mut std::ffi::c_void,
                        lpProcName: *const i8,
                    ) -> *mut std::ffi::c_void;
                }

                MAIN_PTR =
                    GetProcAddress(GetModuleHandleA(std::ptr::null()), c"main".as_ptr() as _) as _;
            }
        }

        MAIN_PTR as usize
    }
}
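
// A hedged sketch (hypothetical names and example addresses, not part of the
// public API) of how a host might use `aslr_reference` to compute the ASLR
// slide and rebase a link-time address when correcting the jump table.
#[allow(dead_code)]
fn rebase_for_aslr(link_time_main: usize, link_time_addr: usize, runtime_main: usize) -> usize {
    // The slide is where `main` actually landed minus where the linker placed it...
    let slide = runtime_main.wrapping_sub(link_time_main);
    // ...and the same slide applies to every other address in the executable.
    link_time_addr.wrapping_add(slide)
}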

/// On Android, we can't dlopen libraries that aren't placed inside /data/data/<package_name>/lib/
///
/// If the device isn't rooted, then we can't push the library there.
/// This is a workaround that copies the library into a memfd and then dlopens it.
///
/// I haven't tested it on device yet, so if it doesn't work, then we can simply revert to using
/// "adb root" and then pushing the library to the /data/data folder instead of the tmp folder.
///
/// Android provides a flag for dlopen to use a file descriptor instead of a path, presumably
/// because they want to support this:
/// - https://developer.android.com/ndk/reference/group/libdl
/// - https://developer.android.com/ndk/reference/structandroid/dlextinfo
#[cfg(target_os = "android")]
unsafe fn android_memmap_dlopen(file: &std::path::Path) -> Result<libloading::Library, PatchError> {
    use std::ffi::{c_void, CStr, CString};
    use std::os::fd::{AsRawFd, BorrowedFd};
    use std::ptr;

    #[repr(C)]
    struct ExtInfo {
        flags: u64,
        reserved_addr: *const c_void,
        reserved_size: libc::size_t,
        relro_fd: libc::c_int,
        library_fd: libc::c_int,
        library_fd_offset: libc::off64_t,
        library_namespace: *const c_void,
    }

    extern "C" {
        fn android_dlopen_ext(
            filename: *const libc::c_char,
            flags: libc::c_int,
            ext_info: *const ExtInfo,
        ) -> *const c_void;
    }

    use memmap2::MmapAsRawDesc;
    use std::os::unix::prelude::{FromRawFd, IntoRawFd};

    let contents = std::fs::read(file)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to read file: {}", e)))?;
    let mfd = memfd::MemfdOptions::default()
        .create("subsecond-patch")
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to create memfd: {}", e)))?;
    mfd.as_file()
        .set_len(contents.len() as u64)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to set memfd length: {}", e)))?;

    let raw_fd = mfd.into_raw_fd();

    let mut map = memmap2::MmapMut::map_mut(raw_fd)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to map memfd: {}", e)))?;
    map.copy_from_slice(&contents);
    let map = map
        .make_exec()
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to make memfd executable: {}", e)))?;

    let filename = c"/subsecond-patch";

    let info = ExtInfo {
        flags: 0x10, // ANDROID_DLEXT_USE_LIBRARY_FD
        reserved_addr: ptr::null(),
        reserved_size: 0,
        relro_fd: 0,
        library_fd: raw_fd,
        library_fd_offset: 0,
        library_namespace: ptr::null(),
    };

    let flags = libloading::os::unix::RTLD_LAZY | libloading::os::unix::RTLD_LOCAL;

    let handle = libloading::os::unix::with_dlerror(
        || {
            let ptr = android_dlopen_ext(filename.as_ptr() as _, flags, &info);
            if ptr.is_null() {
                None
            } else {
                Some(ptr)
            }
        },
        |err| err.to_str().unwrap_or_default().to_string(),
    )
    .map_err(|e| {
        PatchError::AndroidMemfd(format!(
            "android_dlopen_ext failed: {}",
            e.unwrap_or_default()
        ))
    })?;

    let lib = unsafe { libloading::os::unix::Library::from_raw(handle as *mut c_void) };
    let lib: libloading::Library = lib.into();
    Ok(lib)
}

/// A trait that enables types to be hot-patched.
///
/// This trait is only implemented for `FnMut` types, which naturally includes function pointers and
/// closures that can be re-run. `FnOnce` closures are currently not supported since the hot-patching
/// system we use implies that the function can be called multiple times.
pub trait HotFunction<Args, Marker> {
    /// The return type of the function.
    type Return;

    /// The real function type. This is meant to be a function pointer.
    /// When we call `call_as_ptr`, we will transmute the function to this type and call it.
    type Real;

    /// Call the HotFunction with the given arguments.
    ///
    /// # Why
    ///
    /// "rust-call" isn't stable, so we wrap the underlying call with our own, giving it a stable vtable entry.
    /// This is more important than it seems since this function becomes "real" and can be hot-patched.
    fn call_it(&mut self, args: Args) -> Self::Return;

    /// Call the HotFunction as if it were a function pointer.
    ///
    /// # Safety
    ///
    /// This is only safe if the underlying type is a function (function pointer or virtual/fat pointer).
    /// Using this will use the JumpTable to find the patched function and call it.
    unsafe fn call_as_ptr(&mut self, _args: Args) -> Self::Return;
}

macro_rules! impl_hot_function {
    (
        $(
            ($marker:ident, $($arg:ident),*)
        ),*
    ) => {
        $(
            /// A marker type for the function.
            /// This is hidden with the intention to seal this trait.
            #[doc(hidden)]
            pub struct $marker;

            impl<T, $($arg,)* R> HotFunction<($($arg,)*), $marker> for T
            where
                T: FnMut($($arg),*) -> R,
            {
                type Return = R;
                type Real = fn($($arg),*) -> R;

                fn call_it(&mut self, args: ($($arg,)*)) -> Self::Return {
                    #[allow(non_snake_case)]
                    let ( $($arg,)* ) = args;
                    self($($arg),*)
                }

                unsafe fn call_as_ptr(&mut self, args: ($($arg,)*)) -> Self::Return {
                    unsafe {
                        if let Some(jump_table) = get_jump_table() {
                            let real = std::mem::transmute_copy::<Self, Self::Real>(&self) as *const ();

                            // Android implements MTE / pointer tagging and we need to preserve the tag.
                            // If we leave the tag in place, indexing our jump table will fail and patching won't work (or will crash!).
                            // This is only handled on 64-bit platforms since pointer tagging is not available on 32-bit platforms.
                            // In dev, Dioxus disables MTE to work around this issue, but we still handle it anyway.
                            #[cfg(all(target_pointer_width = "64", target_os = "android"))] let tag = real as u64 & 0xFF00_0000_0000_0000;
                            #[cfg(all(target_pointer_width = "64", target_os = "android"))] let real = real as u64 & 0x00FF_FFFF_FFFF_FFFF;

                            #[cfg(target_pointer_width = "64")] let real = real as u64;

                            // No tag byte on 32-bit platforms, but we still widen to u64 since the host always writes 64-bit addresses
                            #[cfg(target_pointer_width = "32")] let real = real as u64;

                            if let Some(ptr) = jump_table.map.get(&real).cloned() {
                                // Re-apply the tag - though this might not be required (we aren't calling malloc for a new pointer)
                                #[cfg(all(target_pointer_width = "64", target_os = "android"))] let ptr: u64 = ptr | tag;

                                #[cfg(target_pointer_width = "64")] let ptr: u64 = ptr;
                                #[cfg(target_pointer_width = "32")] let ptr: u32 = ptr as u32;

                                // Macro-rules requires unpacking the tuple before we call it
                                #[allow(non_snake_case)]
                                let ( $($arg,)* ) = args;

                                #[cfg(target_pointer_width = "64")]
                                type PtrWidth = u64;
                                #[cfg(target_pointer_width = "32")]
                                type PtrWidth = u32;

                                return std::mem::transmute::<PtrWidth, Self::Real>(ptr)($($arg),*);
                            }
                        }

                        self.call_it(args)
                    }
                }
            }
        )*
    };
}
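
// A hedged, self-contained sketch of the marker-parameter trick the macro above
// relies on: the extra `Marker` type parameter keeps the blanket impls for
// different arities from overlapping, since `FnMut() -> R` and `FnMut(A) -> R`
// could otherwise both match the same `T`. `Callable`, `Arity0`, and `Arity1`
// are illustrative names, not part of this crate's API.
#[allow(dead_code)]
struct Arity0;
#[allow(dead_code)]
struct Arity1;

#[allow(dead_code)]
trait Callable<Args, Marker> {
    type Return;
    fn call_it(&mut self, args: Args) -> Self::Return;
}

impl<T, R> Callable<(), Arity0> for T
where
    T: FnMut() -> R,
{
    type Return = R;
    fn call_it(&mut self, _args: ()) -> R {
        // Zero-argument arity: call the closure directly.
        self()
    }
}

impl<T, A, R> Callable<(A,), Arity1> for T
where
    T: FnMut(A) -> R,
{
    type Return = R;
    fn call_it(&mut self, args: (A,)) -> R {
        // One-argument arity: unpack the tuple before calling.
        self(args.0)
    }
}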

impl_hot_function!(
    (Fn0Marker,),
    (Fn1Marker, A),
    (Fn2Marker, A, B),
    (Fn3Marker, A, B, C),
    (Fn4Marker, A, B, C, D),
    (Fn5Marker, A, B, C, D, E),
    (Fn6Marker, A, B, C, D, E, F),
    (Fn7Marker, A, B, C, D, E, F, G),
    (Fn8Marker, A, B, C, D, E, F, G, H),
    (Fn9Marker, A, B, C, D, E, F, G, H, I)
);
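
// A hedged sketch of the MTE tag handling inside `call_as_ptr`: the top byte of
// an Android tagged pointer is split off before the jump-table lookup and
// re-applied to the patched address afterwards. The value used below is an
// arbitrary example, not a real pointer.
#[allow(dead_code)]
fn split_mte_tag(tagged: u64) -> (u64, u64) {
    // The top byte carries the memory tag on MTE-enabled Android.
    let tag = tagged & 0xFF00_0000_0000_0000;
    // The remaining 56 bits are the address used as the jump-table key.
    let addr = tagged & 0x00FF_FFFF_FFFF_FFFF;
    (tag, addr)
}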