subsecond/lib.rs
#![allow(clippy::needless_doctest_main)]
//! # Subsecond: Hot-patching for Rust
//!
//! Subsecond is a library that enables hot-patching for Rust applications. This allows you to change
//! the code of a running application without restarting it. This is useful for game engines, servers,
//! and other long-running applications where the typical edit-compile-run cycle is too slow.
//!
//! Subsecond also implements a technique we call "ThinLinking" which makes compiling Rust code
//! significantly faster in development mode, and which can be used outside of hot-patching.
//!
//! # Usage
//!
//! Subsecond is designed to be simple for both application developers and library authors.
//!
//! Simply call your existing functions with [`call`] and Subsecond will automatically detour
//! that call to the latest version of the function.
//!
//! ```rust
//! for x in 0..5 {
//!     subsecond::call(|| {
//!         println!("Hello, world! {}", x);
//!     });
//! }
//! ```
//!
//! To actually load patches into your application, a third-party tool that implements the Subsecond
//! compiler and protocol is required. Subsecond is built and maintained by the Dioxus team, so we
//! suggest using the Dioxus CLI.
//!
//! To install the Dioxus CLI, we recommend using [`cargo binstall`](https://crates.io/crates/cargo-binstall):
//!
//! ```sh
//! cargo binstall dioxus-cli
//! ```
//!
//! The Dioxus CLI provides several tools for development. To run your application with Subsecond enabled,
//! use `dx serve` - this takes the same arguments as `cargo run` but will automatically hot-reload your
//! application when changes are detected.
//!
//! As of Dioxus 0.7, the `--hotpatch` flag is required to use hotpatching while Subsecond is still experimental:
//!
//! ```sh
//! dx serve --hotpatch
//! ```
//!
//! ## How it works
//!
//! Subsecond works by detouring function calls through a jump table. This jump table contains the latest
//! version of the program's function pointers, and when a function is called, Subsecond will look up
//! the function in the jump table and call that version instead.
//!
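//! As a minimal, illustrative sketch (not Subsecond's real machinery - the functions and the
//! `HashMap`-based table here are stand-ins), a jump-table detour boils down to "look up my own
//! address before calling":
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! fn greet_v1() -> &'static str { "hello" }
//! fn greet_v2() -> &'static str { "hello, patched" } // pretend this came from a patch
//!
//! // Map of original function address -> latest function address.
//! fn detour(table: &HashMap<usize, usize>, f: fn() -> &'static str) -> &'static str {
//!     let addr = f as usize;
//!     let latest = table.get(&addr).copied().unwrap_or(addr);
//!     // SAFETY: both addresses refer to functions with the same signature.
//!     let latest: fn() -> &'static str = unsafe { std::mem::transmute(latest) };
//!     latest()
//! }
//!
//! let mut table = HashMap::new();
//! assert_eq!(detour(&table, greet_v1), "hello"); // no patch applied yet
//! table.insert(greet_v1 as usize, greet_v2 as usize);
//! assert_eq!(detour(&table, greet_v1), "hello, patched"); // call is detoured
//! ```
//!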
//! Unlike libraries like [detour](https://crates.io/crates/detour), Subsecond *does not* modify your
//! process memory. Patching pointers is wildly unsafe and can lead to crashes and undefined behavior.
//!
//! Instead, an external tool compiles only the parts of your project that changed, links them together
//! using the addresses of the functions in your running program, and then sends the new jump table to
//! your application. Subsecond then applies the patch and continues running. Since Subsecond doesn't
//! modify memory, the program must have a runtime integration to handle the patching.
//!
//! If the framework you're using doesn't integrate with Subsecond, you can rely on the fact that calls
//! to stale [`call`] instances will emit a safe panic that is automatically caught and retried
//! by the next [`call`] instance up the callstack.
//!
//! Subsecond is only enabled when `debug_assertions` are enabled, so you can safely ship your application
//! with Subsecond as a dependency without worrying about performance overhead.
//!
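//! The catch-and-retry mechanism can be sketched with plain `std` (using a stand-in payload type
//! and a one-shot "patch" flag in place of [`HotFnPanic`] and a real patch):
//!
//! ```rust
//! use std::panic::{self, AssertUnwindSafe};
//! use std::sync::atomic::{AtomicBool, Ordering};
//!
//! struct StalePanic; // stand-in for Subsecond's panic payload
//! static PATCHED: AtomicBool = AtomicBool::new(false);
//!
//! fn inner() -> &'static str {
//!     if !PATCHED.swap(true, Ordering::Relaxed) {
//!         // Stale on the first run: unwind up to the enclosing call.
//!         panic::panic_any(StalePanic);
//!     }
//!     "ran with fresh code"
//! }
//!
//! fn call_with_retry(f: fn() -> &'static str) -> &'static str {
//!     loop {
//!         match panic::catch_unwind(AssertUnwindSafe(f)) {
//!             Ok(v) => return v,
//!             // Swallow only our own payload; anything else keeps unwinding.
//!             Err(e) if e.downcast_ref::<StalePanic>().is_some() => continue,
//!             Err(e) => panic::resume_unwind(e),
//!         }
//!     }
//! }
//!
//! assert_eq!(call_with_retry(inner), "ran with fresh code");
//! ```
//!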
//! ## Workspace support
//!
//! Subsecond currently only patches the "tip" crate - i.e. the crate in which your `main.rs` is located.
//! Changes to crates outside this crate will be ignored, which can be confusing. We plan to add full
//! workspace support in the future, but for now be aware of this limitation. Crate setups that have
//! a `main.rs` importing a `lib.rs` won't patch sensibly, since the crate becomes a library for itself.
//!
//! This is due to limitations in rustc itself, where the build graph is non-deterministic and changes
//! to functions that forward generics can cause a cascade of codegen changes.
//!
//! ## Globals, statics, and thread-locals
//!
//! Subsecond *does* support hot-reloading of globals, statics, and thread-locals. However, there are several limitations:
//!
//! - You may add new globals at runtime, but their destructors will never be called.
//! - Globals are tracked across patches, but renames are considered to be *new* globals.
//! - Changes to static initializers will not be observed.
//!
//! Subsecond purposefully handles statics this way, since many libraries like Dioxus and Tokio rely
//! on persistent global runtimes.
//!
//! HUGE WARNING: Currently, thread-locals in the "tip" crate (the one being patched) will seemingly
//! reset to their initial value on new patches. This is because we don't currently bind thread-locals
//! in the patches to their original addresses in the main program. Sufficiently complex setups that
//! rely heavily on thread-locals in the tip crate might crash or even segfault. We plan to fix this
//! in the future, but for now, be aware of this limitation.
//!
//! ## Struct layout and alignment
//!
//! Subsecond currently does not support hot-reloading of structs. This is because the generated code
//! assumes a particular layout and alignment of the struct. If layout or alignment change and new
//! functions are called referencing an old version of the struct, the program will crash.
//!
//! To mitigate this, framework authors can integrate with Subsecond to either dispose of the old struct
//! or re-allocate the struct in a way that is compatible with the new layout. This is called "re-instancing."
//!
//! In practice, frameworks that implement Subsecond patching properly will throw out the old state,
//! so you should never witness a segfault due to misalignment or size changes. Frameworks are
//! encouraged to aggressively dispose of old state that might cause size and alignment changes.
//!
//! We'd like to lift this limitation in the future by providing utilities to re-instantiate structs,
//! but for now it's up to framework authors to handle this. For example, Dioxus apps simply throw
//! out the old state and rebuild it from scratch.
//!
//! ## Pointer versioning
//!
//! Currently, Subsecond does not "version" function pointers. We have plans to provide this metadata
//! so framework authors can safely memoize changes without much runtime overhead. Frameworks like
//! Dioxus and Bevy circumvent this issue by using the `TypeId` of structs passed to hot functions as
//! well as the `ptr_address` method on [`HotFn`] to determine if the function pointer has changed.
//!
//! Currently, the `ptr_address` method will always return the most up-to-date version of the function
//! even if the function contents themselves did not change. In essence, this is equivalent to a
//! versioning scheme where every function is considered "new" on each patch. This means that framework
//! authors who integrate re-instancing in their apps might dispose of old state too aggressively.
//! For now, this is the safer and more practical approach.
//!
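//! As a hedged sketch of that memoization strategy (the types and names below are illustrative,
//! not Subsecond's API), a framework can key its state on the last-seen pointer and dispose of
//! it whenever the pointer changes:
//!
//! ```rust
//! #[derive(PartialEq, Clone, Copy)]
//! struct FnKey(u64); // stand-in for a `HotFnPtr`
//!
//! struct Memoized {
//!     key: FnKey,
//!     state: Vec<u8>, // framework state built "above" the function
//! }
//!
//! impl Memoized {
//!     /// Returns true if the state survived, false if it was re-instanced.
//!     fn update(&mut self, current: FnKey) -> bool {
//!         if self.key == current {
//!             return true; // same pointer: safe to keep state
//!         }
//!         self.state.clear(); // pointer changed: dispose aggressively
//!         self.key = current;
//!         false
//!     }
//! }
//!
//! let mut m = Memoized { key: FnKey(1), state: vec![1, 2, 3] };
//! assert!(m.update(FnKey(1)));  // unpatched: state preserved
//! assert!(!m.update(FnKey(2))); // patched: state disposed
//! assert!(m.state.is_empty());
//! ```
//!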
//! ## Nesting calls
//!
//! Subsecond calls are designed to be nested. This provides clean integration points to know exactly
//! where a hooked function is called.
//!
//! The highest-level call is `fn main()`, though by default this is not hooked, since initialization
//! code tends to be side-effectful and modify global state. Instead, we recommend wrapping the
//! hot-patch points manually with [`call`].
//!
//! ```rust
//! fn main() {
//!     // Changes to the `for` loop will cause an unwind to this call.
//!     subsecond::call(|| {
//!         for x in 0..5 {
//!             // Changes to the `println!` will be isolated to this call.
//!             subsecond::call(|| {
//!                 println!("Hello, world! {}", x);
//!             });
//!         }
//!     });
//! }
//! ```
//!
//! The goal here is to provide granular control over where patches are applied to limit loss of state
//! when new code is loaded.
//!
//! ## Applying patches
//!
//! When running under the Dioxus CLI, the `dx serve` command will automatically apply patches when
//! changes are detected. Patches are delivered over the [Dioxus Devtools](https://crates.io/crates/dioxus-devtools)
//! websocket protocol and received by the corresponding websocket client in your app.
//!
//! If you're using Subsecond in your own application that doesn't have a runtime integration, you can
//! build an integration using the [`apply_patch`] function. This function takes a `JumpTable`, which
//! the dioxus-cli crate can generate.
//!
//! To add support for the Dioxus Devtools protocol to your app, you can use the [dioxus-devtools](https://crates.io/crates/dioxus-devtools)
//! crate, which provides a `connect` method that will automatically apply patches to your application.
//!
//! Unfortunately, one design quirk of Subsecond is that running apps need to communicate the address
//! of `main` to the patcher. This is due to a security technique called [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization)
//! which randomizes the addresses of functions in memory. See the subsecond-harness and subsecond-cli
//! crates for more details on how to implement the protocol.
//!
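//! The offset arithmetic itself is simple. As a self-contained sketch (all addresses below are
//! made up; in reality they come from the running process and the patch linker):
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Rebase a jump table from link-time addresses to live addresses.
//! fn rebase(
//!     map: HashMap<u64, u64>,
//!     aslr_reference: u64,   // address of `main` observed at runtime
//!     linked_reference: u64, // address of `main` the table was built against
//!     new_offset: u64,       // load bias of the freshly dlopen'd patch
//! ) -> HashMap<u64, u64> {
//!     let old_offset = aslr_reference.wrapping_sub(linked_reference);
//!     map.into_iter()
//!         .map(|(k, v)| (k.wrapping_add(old_offset), v.wrapping_add(new_offset)))
//!         .collect()
//! }
//!
//! let table = HashMap::from([(0x1000_u64, 0x2000_u64)]);
//! let rebased = rebase(table, 0x5000_1000, 0x1000, 0x7000_0000);
//! assert_eq!(rebased[&0x5000_1000], 0x7000_2000);
//! ```
//!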
//! ## ThinLink
//!
//! ThinLink is a program linker for Rust that is designed to be used with Subsecond. It implements
//! the powerful patching system that Subsecond uses to hot-reload Rust applications.
//!
//! ThinLink is simply a wrapper around your existing linker, but with extra features:
//!
//! - Automatic dynamic linking to dependencies
//! - Generation of Subsecond jump tables
//! - Diffing of object files for function invalidation
//!
//! Because ThinLink performs very little actual linking, it drastically speeds up traditional Rust
//! development. With a development-optimized profile, ThinLink can shrink an incremental build to less than 500ms.
//!
//! ThinLink is automatically integrated into the Dioxus CLI, though it's currently not available as
//! a standalone tool.
//!
//! ## Limitations
//!
//! Subsecond is a powerful tool, but it has several limitations. We talk about them above, but here's
//! a quick summary:
//!
//! - Struct hot-reloading requires re-instancing or unwinding
//! - Statics are tracked but not destructed
//!
//! ## Platform support
//!
//! Subsecond works across all major platforms:
//!
//! - Android (arm64-v8a, armeabi-v7a)
//! - iOS (arm64)
//! - Linux (x86_64, aarch64)
//! - macOS (x86_64, aarch64)
//! - Windows (x86_64, arm64)
//! - WebAssembly (wasm32)
//!
//! If you have a new platform you'd like to see supported, please open an issue on the Subsecond repository.
//! We are keen to add support for new platforms like wasm64, riscv64, and more.
//!
//! Note that iOS devices are currently not supported due to code-signing requirements. We hope to fix
//! this in the future, but for now you can use the simulator to test your app.
//!
//! ## Adding the Subsecond badge to your project
//!
//! If you're a framework author and want your users to know that your library supports Subsecond, you
//! can add the Subsecond badge to your README! Users will know that your library is hot-reloadable and
//! can be used with Subsecond.
//!
//! [](https://crates.io/crates/subsecond)
//!
//! ```markdown
//! [](https://crates.io/crates/subsecond)
//! ```
//!
//! ## License
//!
//! Subsecond and ThinLink are licensed under the MIT license. See the LICENSE file for more information.
//!
//! ## Supporting this work
//!
//! Subsecond is a project by the Dioxus team. If you'd like to support our work, please consider
//! [sponsoring us on GitHub](https://github.com/sponsors/DioxusLabs) or eventually deploying your
//! apps with Dioxus Deploy (currently under construction).

pub use subsecond_types::JumpTable;

use std::{
    backtrace,
    mem::transmute,
    panic::AssertUnwindSafe,
    sync::{atomic::AtomicPtr, Arc, Mutex},
};

/// Call a given function with hot-reloading enabled. If the function's code changes, `call` will use
/// the new version of the function. If code *above* the function changes, this will emit a panic
/// that forces an unwind to the next [`call`] instance.
///
/// Rust on WASM does not support unwinding, so [`call`] will not track dependency graph changes.
/// If you are building a framework for use on WASM, you will need to use `Subsecond::HotFn` directly.
///
/// However, if you wrap your calling code in a future, you *can* simply drop the future, which will
/// cause `drop` to execute and get something similar to unwinding. This is not great if RefCells are open.
pub fn call<O>(mut f: impl FnMut() -> O) -> O {
    // Only run in debug mode - the rest of this function will dissolve away
    if !cfg!(debug_assertions) {
        return f();
    }

    let mut hotfn = HotFn::current(f);
    loop {
        let res = std::panic::catch_unwind(AssertUnwindSafe(|| hotfn.call(())));

        // If the call succeeds, just return the result; otherwise we try to handle the panic if it's our own.
        let err = match res {
            Ok(res) => return res,
            Err(err) => err,
        };

        // If this is our panic then let's handle it, otherwise we just resume unwinding
        let Some(_hot_payload) = err.downcast_ref::<HotFnPanic>() else {
            std::panic::resume_unwind(err);
        };
    }
}

// We use an AtomicPtr with a leaked JumpTable and Relaxed ordering to give us a global jump table
// with very little overhead. Reading this amounts to a Relaxed atomic load, which is basically
// no overhead. We might want to look into using a thread_local with a stop-the-world approach
// just in case multiple threads try to call the jump table before synchronization with the runtime.
// For Dioxus purposes, this is not a big deal, but for libraries like Bevy which heavily rely on
// multithreading, it might become an issue.
static APP_JUMP_TABLE: AtomicPtr<JumpTable> = AtomicPtr::new(std::ptr::null_mut());
static HOTRELOAD_HANDLERS: Mutex<Vec<Arc<dyn Fn() + Send + Sync>>> = Mutex::new(Vec::new());

/// Register a function that will be called whenever a patch is applied.
///
/// This handler will be run immediately after the patch library is loaded into the process and the
/// JumpTable has been set.
pub fn register_handler(handler: Arc<dyn Fn() + Send + Sync + 'static>) {
    HOTRELOAD_HANDLERS.lock().unwrap().push(handler);
}

/// Get the current jump table, if it exists.
///
/// This will return `None` if no jump table has been set yet.
///
/// # Safety
///
/// The `JumpTable` returned here is a pointer into a leaked box. While technically this reference is
/// valid, we might change the implementation to invalidate the pointer between hotpatches.
///
/// You should only use this lifetime in temporary contexts - not *across* hotpatches!
pub unsafe fn get_jump_table() -> Option<&'static JumpTable> {
    let ptr = APP_JUMP_TABLE.load(std::sync::atomic::Ordering::Relaxed);
    if ptr.is_null() {
        return None;
    }

    Some(unsafe { &*ptr })
}

unsafe fn commit_patch(table: JumpTable) {
    APP_JUMP_TABLE.store(
        Box::into_raw(Box::new(table)),
        std::sync::atomic::Ordering::Relaxed,
    );
    HOTRELOAD_HANDLERS
        .lock()
        .unwrap()
        .clone()
        .iter()
        .for_each(|handler| {
            handler();
        });
}

/// A panic issued by the [`call`] function if the caller would be stale if called. This causes
/// an unwind to the next [`call`] instance that can properly handle the panic and retry the call.
///
/// This technique allows Subsecond to provide hot-reloading of codebases that don't have a runtime integration.
#[derive(Debug)]
pub struct HotFnPanic {
    _backtrace: backtrace::Backtrace,
}

/// A pointer to a hot-patched function.
#[non_exhaustive]
#[derive(PartialEq, Eq, Hash, Clone, Copy, Debug)]
pub struct HotFnPtr(pub u64);

impl HotFnPtr {
    /// Create a new [`HotFnPtr`].
    ///
    /// The safe way to get one is through [`HotFn::ptr_address`].
    ///
    /// # Safety
    ///
    /// The underlying `u64` must point to a valid function.
    pub unsafe fn new(index: u64) -> Self {
        Self(index)
    }
}

/// A hot-reloadable function.
///
/// To call this function, use the [`HotFn::call`] method. This will automatically use the latest
/// version of the function from the JumpTable.
pub struct HotFn<A, M, F>
where
    F: HotFunction<A, M>,
{
    inner: F,
    _marker: std::marker::PhantomData<(A, M)>,
}

impl<A, M, F: HotFunction<A, M>> HotFn<A, M, F> {
    /// Create a new [`HotFn`] instance with the current function.
    ///
    /// Whenever you call [`HotFn::call`], it will use the current function from the [`JumpTable`].
    pub const fn current(f: F) -> HotFn<A, M, F> {
        HotFn {
            inner: f,
            _marker: std::marker::PhantomData,
        }
    }

    /// Call the function with the given arguments.
    ///
    /// This will unwrap the [`HotFnPanic`] panic, propagating up to the next [`HotFn::call`].
    ///
    /// If you want to handle the panic yourself, use [`HotFn::try_call`].
    pub fn call(&mut self, args: A) -> F::Return {
        self.try_call(args).unwrap()
    }

    /// Get the address of the function in memory, which might be different than the original.
    ///
    /// This is useful for implementing a memoization strategy to safely preserve state across
    /// hot-patches. If the `ptr_address` of a function did not change between patches, then the
    /// state that exists "above" the function is still valid.
    ///
    /// Note that Subsecond does not track this state over time, so it's up to the runtime integration
    /// to track this state and diff it.
    pub fn ptr_address(&self) -> HotFnPtr {
        if size_of::<F>() == size_of::<fn() -> ()>() {
            let ptr: usize = unsafe { std::mem::transmute_copy(&self.inner) };
            return HotFnPtr(ptr as u64);
        }

        let known_fn_ptr = <F as HotFunction<A, M>>::call_it as *const () as usize;
        if let Some(jump_table) = unsafe { get_jump_table() } {
            if let Some(ptr) = jump_table.map.get(&(known_fn_ptr as u64)).cloned() {
                return HotFnPtr(ptr);
            }
        }

        HotFnPtr(known_fn_ptr as u64)
    }

    /// Attempt to call the function with the given arguments.
    ///
    /// If this function is stale and can't be updated in place (i.e., changes occurred above this call),
    /// then this function will emit a [`HotFnPanic`] which can be unwrapped and handled by the next
    /// [`call`] instance.
    pub fn try_call(&mut self, args: A) -> Result<F::Return, HotFnPanic> {
        if !cfg!(debug_assertions) {
            return Ok(self.inner.call_it(args));
        }

        unsafe {
            // Try to handle known function pointers. This is *really really* unsafe, but due to how
            // rust trait objects work, it's impossible to make an arbitrary usize-sized type implement Fn()
            // since that would require a vtable pointer, pushing out the bounds of the pointer size.
            if size_of::<F>() == size_of::<fn() -> ()>() {
                return Ok(self.inner.call_as_ptr(args));
            }

            // Handle trait objects. This will occur for sizes other than usize. Normal rust functions
            // become ZSTs and thus their <T as SomeFn>::call becomes a function pointer to the function.
            //
            // For non-ZST (trait object) types, there might be an issue. The real call function
            // will likely end up in the vtable and will never be hot-reloaded since the signature takes self.
            if let Some(jump_table) = get_jump_table() {
                let known_fn_ptr = <F as HotFunction<A, M>>::call_it as *const () as u64;
                if let Some(ptr) = jump_table.map.get(&known_fn_ptr).cloned() {
                    // The type sig of the cast should match the call_it function.
                    // Technically function pointers need to be aligned, but that alignment is 1, so we're good.
                    let call_it = transmute::<*const (), fn(&F, A) -> F::Return>(ptr as _);
                    return Ok(call_it(&self.inner, args));
                }
            }

            Ok(self.inner.call_it(args))
        }
    }

    /// Attempt to call the function with the given arguments, using the given [`HotFnPtr`].
    ///
    /// You can get a [`HotFnPtr`] from [`Self::ptr_address`].
    ///
    /// If this function is stale and can't be updated in place (i.e., changes occurred above this call),
    /// then this function will emit a [`HotFnPanic`] which can be unwrapped and handled by the next
    /// [`call`] instance.
    ///
    /// # Safety
    ///
    /// The [`HotFnPtr`] must be to a function whose argument layouts haven't changed.
    pub unsafe fn try_call_with_ptr(
        &mut self,
        ptr: HotFnPtr,
        args: A,
    ) -> Result<F::Return, HotFnPanic> {
        if !cfg!(debug_assertions) {
            return Ok(self.inner.call_it(args));
        }

        unsafe {
            // Try to handle known function pointers. This is *really really* unsafe, but due to how
            // rust trait objects work, it's impossible to make an arbitrary usize-sized type implement Fn()
            // since that would require a vtable pointer, pushing out the bounds of the pointer size.
            if size_of::<F>() == size_of::<fn() -> ()>() {
                return Ok(self.inner.call_as_ptr(args));
            }

            // Handle trait objects (sizes other than usize). The type sig of the cast should match
            // the call_it function. Technically function pointers need to be aligned, but that
            // alignment is 1, so we're good.
            let call_it = transmute::<*const (), fn(&F, A) -> F::Return>(ptr.0 as _);
            Ok(call_it(&self.inner, args))
        }
    }
}

/// Apply the patch using a given jump table.
///
/// # Safety
///
/// This function is unsafe because it detours existing functions in memory. This is *wildly* unsafe,
/// especially if the JumpTable is malformed. Only run this if you know what you're doing.
///
/// If the pointers are incorrect, function type signatures will be incorrect and the program will crash,
/// sometimes in a way that requires a restart of your entire computer. Be careful.
///
/// # Warning
///
/// This function will load the library and thus allocates. It cannot be used when the program is
/// stopped (i.e. in a signal handler).
pub unsafe fn apply_patch(mut table: JumpTable) -> Result<(), PatchError> {
    // On non-wasm platforms we can just use libloading and the known ASLR offsets to load the library
    #[cfg(any(unix, windows))]
    {
        // On Android we try to circumvent permissions issues by copying the library to a memmap and then libloading that
        #[cfg(target_os = "android")]
        let lib = Box::leak(Box::new(android_memmap_dlopen(&table.lib)?));

        #[cfg(not(target_os = "android"))]
        let lib = Box::leak(Box::new({
            match libloading::Library::new(&table.lib) {
                Ok(lib) => lib,
                Err(err) => return Err(PatchError::Dlopen(err.to_string())),
            }
        }));

        // Use the `main` symbol as a sentinel for the current executable. This is basically a
        // cross-platform version of `__mh_execute_header` on macOS that we can use to base the executable.
        let old_offset = aslr_reference() - table.aslr_reference as usize;

        // Use the `main` symbol as a sentinel for the loaded library. Might want to move away
        // from this at some point, or make it configurable.
        let new_offset = unsafe {
            // Leak the library. dlopen is basically a no-op on many platforms and if we even try to drop it,
            // some code might be called (ie drop) that results in really bad crashes (restart your computer...)
            //
            // This code currently assumes "main" always makes it to the export list (which it should)
            // and requires coordination from the CLI to export it.
            lib.get::<*const ()>(b"main")
                .ok()
                .unwrap()
                .try_as_raw_ptr()
                .unwrap()
                .wrapping_byte_sub(table.new_base_address as usize) as usize
        };

        // Modify the jump table to be relative to the base address of the loaded library
        table.map = table
            .map
            .iter()
            .map(|(k, v)| {
                (
                    (*k as usize + old_offset) as u64,
                    (*v as usize + new_offset) as u64,
                )
            })
            .collect();

        commit_patch(table);
    };

    // On wasm, we need to download the module, compile it, and then run it.
    #[cfg(target_arch = "wasm32")]
    wasm_bindgen_futures::spawn_local(async move {
        use js_sys::{
            ArrayBuffer, Object, Reflect,
            WebAssembly::{self, Instance, Memory, Module, Table},
        };
        use wasm_bindgen::prelude::*;
        use wasm_bindgen::JsValue;
        use wasm_bindgen::UnwrapThrowExt;
        use wasm_bindgen_futures::JsFuture;

        let funcs: Table = wasm_bindgen::function_table().unchecked_into();
        let memory: Memory = wasm_bindgen::memory().unchecked_into();
        let exports: Object = wasm_bindgen::exports().unchecked_into();

        let path = table.lib.to_str().unwrap();
        if !path.ends_with(".wasm") {
            return;
        }

        // Fetch + decode the patch wasm. Both awaits are pure I/O — they
        // touch no shared state, so the future is safe to drop here.
        let response: web_sys::Response =
            JsFuture::from(web_sys::window().unwrap_throw().fetch_with_str(path))
                .await
                .unwrap()
                .unchecked_into();
        if !response.ok() {
            panic!(
                "Failed to patch wasm module at {} - response failed with: {}",
                path,
                response.status_text()
            );
        }
        let dl_bytes: ArrayBuffer = JsFuture::from(response.array_buffer().unwrap())
            .await
            .unwrap()
            .unchecked_into();

        // Pre-compile. This is the slow part — V8 hands it to a worker —
        // and yields the longest. The result is a Module with no side
        // effects on host JS state, so cancelling here is also safe.
        let module: Module = JsFuture::from(WebAssembly::compile(dl_bytes.unchecked_ref()))
            .await
            .unwrap()
            .unchecked_into();

        // ── HOST-STATE-MUTATING SECTION ───────────────────────────────
        //
        // Below we grow shared linear memory and the indirect function
        // table, then async-instantiate the patch into them and commit
        // the new jump table. There IS one `.await` for the instantiate
        // (we can't avoid it: Chrome disallows synchronous
        // `new WebAssembly.Instance` on the main thread for modules
        // larger than 8MB, and patches routinely cross that), but it's
        // safe — the original race we fixed wasn't about yielding here:
        //
        // * `memory.grow` and `funcs.grow` each return their PRIOR
        //   length atomically. Concurrent `apply_patch` tasks therefore
        //   each get a unique, non-overlapping `memory_base` /
        //   `table_base`, so two patches can't land on the same region.
        // * Host code can't observe the half-instantiated patch: the
        //   new memory pages are zero, the new table slots are null,
        //   and the jump table isn't committed until the very end of
        //   this block, so nothing redirects through the new slots.
        // * The original bug — using `memory.buffer().byteLength()`
        //   captured before the awaits, which returned 0 if the buffer
        //   had been detached by a concurrent grow — is gone because
        //   we derive `memory_base` from `memory.grow()`'s return
        //   value instead.
        // * Cancellation between grow and `commit_patch` leaks memory
        //   pages and table slots, but doesn't corrupt anything.
        const PAGE_SIZE: u32 = 64 * 1024;
        let patch_pages = (dl_bytes.byte_length() as f64 / PAGE_SIZE as f64).ceil() as u32 + 1;

        // Use grow's return value (the prior page count) to derive
        // memory_base. Atomic w.r.t. concurrent grows, unlike reading
        // memory.buffer().byteLength().
        let prev_pages = memory.grow(patch_pages);
        let memory_base = (prev_pages + 1) * PAGE_SIZE;

        // grow returns the prior table length, which is __table_base.
        let table_base = funcs.grow(table.ifunc_count as u32).unwrap();

        // Rebase the jump table entries onto the patch's table slot range.
        for v in table.map.values_mut() {
            *v += table_base as u64;
        }

        // Build the env import object: copy every host export through and
        // add __memory_base / __table_base globals so the patch's PIC
        // code resolves correctly.
        let env = Object::new();
        for key in Object::keys(&exports) {
            Reflect::set(&env, &key, &Reflect::get(&exports, &key).unwrap()).unwrap();
        }
        for (name, value) in [("__table_base", table_base), ("__memory_base", memory_base)] {
            let descriptor = Object::new();
            Reflect::set(&descriptor, &"value".into(), &"i32".into()).unwrap();
            Reflect::set(&descriptor, &"mutable".into(), &false.into()).unwrap();
            let global = WebAssembly::Global::new(&descriptor, &value.into()).unwrap();
            Reflect::set(&env, &name.into(), &global.into()).unwrap();
        }
        let imports = Object::new();
        Reflect::set(&imports, &"env".into(), &env).unwrap();

        // Async instantiation of the precompiled module. We use the
        // (Module, imports) form of `WebAssembly.instantiate`, which
        // resolves directly to an `Instance` (not `{module, instance}`).
        // This is the no-size-limit path; the synchronous
        // `new WebAssembly.Instance` constructor is capped at 8MB on
        // Chrome's main thread.
        let instance: Instance = JsFuture::from(WebAssembly::instantiate_module(&module, &imports))
            .await
            .unwrap()
            .unchecked_into();
        let inst_exports: Object = instance.exports();

        // Run the patch's relocation thunks and constructors. Order
        // matters: data relocs first (write memory_base- and table_base-
        // relative pointers into the patch's data segment), then global
        // relocs (adjust GOT.func.internal globals by __table_base —
        // wasm-ld synthesizes those as element-segment-relative offsets),
        // then ctors. `dyn_into` instead of `unchecked_into` so missing
        // exports just no-op rather than throwing.
        for func_name in [
            "__wasm_apply_data_relocs",
            "__wasm_apply_global_relocs",
            "__wasm_call_ctors",
        ] {
            if let Ok(val) = Reflect::get(&inst_exports, &func_name.into()) {
                if let Ok(func) = val.dyn_into::<js_sys::Function>() {
                    _ = func.call0(&JsValue::undefined());
                }
            }
        }

        unsafe { commit_patch(table) };
    });

    Ok(())
}
692
693#[derive(Debug, PartialEq, thiserror::Error)]
694pub enum PatchError {
695 /// The patch failed to apply.
696 ///
697 /// This returns a string instead of the Dlopen error type so we don't need to bring the libloading
698 /// dependency into the public API.
699 #[error("Failed to load library: {0}")]
700 Dlopen(String),
701
702 /// The patch failed to apply on Android, most likely due to a permissions issue.
703 #[error("Failed to load library on Android: {0}")]
704 AndroidMemfd(String),
705}
706
707/// This function returns the address of the main function in the current executable. This is used as
708/// an anchor to reference the current executable's base address.
709///
710/// The point here being that we have a stable address both at runtime and compile time, making it
711/// possible to calculate the ASLR offset from within the process to correct the jump table.
712///
713/// It should only be called from the main executable *first* and not from a shared library since it
714/// self-initializes.
715#[doc(hidden)]
pub fn aslr_reference() -> usize {
    #[cfg(target_family = "wasm")]
    return 0;

    #[cfg(not(target_family = "wasm"))]
    unsafe {
        use std::ffi::c_void;

        // The first call to this function must occur in the main executable so that `main`
        // resolves against the executable itself; the result is cached here for later calls.
        static mut MAIN_PTR: *mut c_void = std::ptr::null_mut();

        if MAIN_PTR.is_null() {
            #[cfg(unix)]
            {
                MAIN_PTR = libc::dlsym(libc::RTLD_DEFAULT, c"main".as_ptr() as _);
            }

            #[cfg(windows)]
            {
                extern "system" {
                    fn GetModuleHandleA(lpModuleName: *const i8) -> *mut std::ffi::c_void;
                    fn GetProcAddress(
                        hModule: *mut std::ffi::c_void,
                        lpProcName: *const i8,
                    ) -> *mut std::ffi::c_void;
                }

                MAIN_PTR =
                    GetProcAddress(GetModuleHandleA(std::ptr::null()), c"main".as_ptr() as _) as _;
            }
        }

        MAIN_PTR as usize
    }
}
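
// A hedged sketch (hypothetical names and example addresses, not part of the
// public API) of how a host might use `aslr_reference` to compute the ASLR
// slide and rebase a link-time address when correcting the jump table.
#[allow(dead_code)]
fn rebase_for_aslr(link_time_main: usize, link_time_addr: usize, runtime_main: usize) -> usize {
    // The slide is where `main` actually landed minus where the linker placed it...
    let slide = runtime_main.wrapping_sub(link_time_main);
    // ...and the same slide applies to every other address in the executable.
    link_time_addr.wrapping_add(slide)
}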

/// On Android, we can't dlopen libraries that aren't placed inside /data/data/<package_name>/lib/
///
/// If the device isn't rooted, then we can't push the library there.
/// This is a workaround that copies the library into a memfd and then dlopens it.
///
/// I haven't tested it on device yet, so if it doesn't work, then we can simply revert to using
/// "adb root" and then pushing the library to the /data/data folder instead of the tmp folder.
///
/// Android provides a flag for dlopen to use a file descriptor instead of a path, presumably
/// because they want to support this:
/// - https://developer.android.com/ndk/reference/group/libdl
/// - https://developer.android.com/ndk/reference/structandroid/dlextinfo
#[cfg(target_os = "android")]
unsafe fn android_memmap_dlopen(file: &std::path::Path) -> Result<libloading::Library, PatchError> {
    use std::ffi::{c_void, CStr, CString};
    use std::os::fd::{AsRawFd, BorrowedFd};
    use std::ptr;

    #[repr(C)]
    struct ExtInfo {
        flags: u64,
        reserved_addr: *const c_void,
        reserved_size: libc::size_t,
        relro_fd: libc::c_int,
        library_fd: libc::c_int,
        library_fd_offset: libc::off64_t,
        library_namespace: *const c_void,
    }

    extern "C" {
        fn android_dlopen_ext(
            filename: *const libc::c_char,
            flags: libc::c_int,
            ext_info: *const ExtInfo,
        ) -> *const c_void;
    }

    use memmap2::MmapAsRawDesc;
    use std::os::unix::prelude::{FromRawFd, IntoRawFd};

    let contents = std::fs::read(file)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to read file: {}", e)))?;
    let mfd = memfd::MemfdOptions::default()
        .create("subsecond-patch")
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to create memfd: {}", e)))?;
    mfd.as_file()
        .set_len(contents.len() as u64)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to set memfd length: {}", e)))?;

    let raw_fd = mfd.into_raw_fd();

    let mut map = memmap2::MmapMut::map_mut(raw_fd)
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to map memfd: {}", e)))?;
    map.copy_from_slice(&contents);
    let map = map
        .make_exec()
        .map_err(|e| PatchError::AndroidMemfd(format!("Failed to make memfd executable: {}", e)))?;

    let filename = c"/subsecond-patch";

    let info = ExtInfo {
        flags: 0x10, // ANDROID_DLEXT_USE_LIBRARY_FD
        reserved_addr: ptr::null(),
        reserved_size: 0,
        relro_fd: 0,
        library_fd: raw_fd,
        library_fd_offset: 0,
        library_namespace: ptr::null(),
    };

    let flags = libloading::os::unix::RTLD_LAZY | libloading::os::unix::RTLD_LOCAL;

    let handle = libloading::os::unix::with_dlerror(
        || {
            let ptr = android_dlopen_ext(filename.as_ptr() as _, flags, &info);
            if ptr.is_null() {
                None
            } else {
                Some(ptr)
            }
        },
        |err| err.to_str().unwrap_or_default().to_string(),
    )
    .map_err(|e| {
        PatchError::AndroidMemfd(format!(
            "android_dlopen_ext failed: {}",
            e.unwrap_or_default()
        ))
    })?;

    let lib = unsafe { libloading::os::unix::Library::from_raw(handle as *mut c_void) };
    let lib: libloading::Library = lib.into();
    Ok(lib)
}

/// A trait that enables types to be hot-patched.
///
/// This trait is only implemented for `FnMut` types, which naturally includes function pointers and
/// closures that can be re-run. `FnOnce` closures are currently not supported since the hot-patching
/// system we use implies that the function can be called multiple times.
pub trait HotFunction<Args, Marker> {
    /// The return type of the function.
    type Return;

    /// The real function type. This is meant to be a function pointer.
    /// When we call `call_as_ptr`, we will transmute the function to this type and call it.
    type Real;

    /// Call the HotFunction with the given arguments.
    ///
    /// # Why
    ///
    /// "rust-call" isn't stable, so we wrap the underlying call with our own, giving it a stable vtable entry.
    /// This is more important than it seems since this function becomes "real" and can be hot-patched.
    fn call_it(&mut self, args: Args) -> Self::Return;

    /// Call the HotFunction as if it were a function pointer.
    ///
    /// # Safety
    ///
    /// This is only safe if the underlying type is a function (function pointer or virtual/fat pointer).
    /// Using this will use the JumpTable to find the patched function and call it.
    unsafe fn call_as_ptr(&mut self, _args: Args) -> Self::Return;
}

macro_rules! impl_hot_function {
    (
        $(
            ($marker:ident, $($arg:ident),*)
        ),*
    ) => {
        $(
            /// A marker type for the function.
            /// This is hidden with the intention to seal this trait.
            #[doc(hidden)]
            pub struct $marker;

            impl<T, $($arg,)* R> HotFunction<($($arg,)*), $marker> for T
            where
                T: FnMut($($arg),*) -> R,
            {
                type Return = R;
                type Real = fn($($arg),*) -> R;

                fn call_it(&mut self, args: ($($arg,)*)) -> Self::Return {
                    #[allow(non_snake_case)]
                    let ( $($arg,)* ) = args;
                    self($($arg),*)
                }

                unsafe fn call_as_ptr(&mut self, args: ($($arg,)*)) -> Self::Return {
                    unsafe {
                        if let Some(jump_table) = get_jump_table() {
                            let real = std::mem::transmute_copy::<Self, Self::Real>(&self) as *const ();

                            // Android implements MTE / pointer tagging and we need to preserve the tag.
                            // If we leave the tag in place, indexing our jump table will fail and patching won't work (or will crash!).
                            // This is only handled on 64-bit platforms since pointer tagging is not available on 32-bit platforms.
                            // In dev, Dioxus disables MTE to work around this issue, but we still handle it anyway.
                            #[cfg(all(target_pointer_width = "64", target_os = "android"))] let tag = real as u64 & 0xFF00_0000_0000_0000;
                            #[cfg(all(target_pointer_width = "64", target_os = "android"))] let real = real as u64 & 0x00FF_FFFF_FFFF_FFFF;

                            #[cfg(target_pointer_width = "64")] let real = real as u64;

                            // No tag byte on 32-bit platforms, but we still widen to u64 since the host always writes 64-bit addresses
                            #[cfg(target_pointer_width = "32")] let real = real as u64;

                            if let Some(ptr) = jump_table.map.get(&real).cloned() {
                                // Re-apply the tag - though this might not be required (we aren't calling malloc for a new pointer)
                                #[cfg(all(target_pointer_width = "64", target_os = "android"))] let ptr: u64 = ptr | tag;

                                #[cfg(target_pointer_width = "64")] let ptr: u64 = ptr;
                                #[cfg(target_pointer_width = "32")] let ptr: u32 = ptr as u32;

                                // Macro-rules requires unpacking the tuple before we call it
                                #[allow(non_snake_case)]
                                let ( $($arg,)* ) = args;

                                #[cfg(target_pointer_width = "64")]
                                type PtrWidth = u64;
                                #[cfg(target_pointer_width = "32")]
                                type PtrWidth = u32;

                                return std::mem::transmute::<PtrWidth, Self::Real>(ptr)($($arg),*);
                            }
                        }

                        self.call_it(args)
                    }
                }
            }
        )*
    };
}
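
// A hedged, self-contained sketch of the marker-parameter trick the macro above
// relies on: the extra `Marker` type parameter keeps the blanket impls for
// different arities from overlapping, since `FnMut() -> R` and `FnMut(A) -> R`
// could otherwise both match the same `T`. `Callable`, `Arity0`, and `Arity1`
// are illustrative names, not part of this crate's API.
#[allow(dead_code)]
struct Arity0;
#[allow(dead_code)]
struct Arity1;

#[allow(dead_code)]
trait Callable<Args, Marker> {
    type Return;
    fn call_it(&mut self, args: Args) -> Self::Return;
}

impl<T, R> Callable<(), Arity0> for T
where
    T: FnMut() -> R,
{
    type Return = R;
    fn call_it(&mut self, _args: ()) -> R {
        // Zero-argument arity: call the closure directly.
        self()
    }
}

impl<T, A, R> Callable<(A,), Arity1> for T
where
    T: FnMut(A) -> R,
{
    type Return = R;
    fn call_it(&mut self, args: (A,)) -> R {
        // One-argument arity: unpack the tuple before calling.
        self(args.0)
    }
}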

impl_hot_function!(
    (Fn0Marker,),
    (Fn1Marker, A),
    (Fn2Marker, A, B),
    (Fn3Marker, A, B, C),
    (Fn4Marker, A, B, C, D),
    (Fn5Marker, A, B, C, D, E),
    (Fn6Marker, A, B, C, D, E, F),
    (Fn7Marker, A, B, C, D, E, F, G),
    (Fn8Marker, A, B, C, D, E, F, G, H),
    (Fn9Marker, A, B, C, D, E, F, G, H, I)
);
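
// A hedged sketch of the MTE tag handling inside `call_as_ptr`: the top byte of
// an Android tagged pointer is split off before the jump-table lookup and
// re-applied to the patched address afterwards. The value used below is an
// arbitrary example, not a real pointer.
#[allow(dead_code)]
fn split_mte_tag(tagged: u64) -> (u64, u64) {
    // The top byte carries the memory tag on MTE-enabled Android.
    let tag = tagged & 0xFF00_0000_0000_0000;
    // The remaining 56 bits are the address used as the jump-table key.
    let addr = tagged & 0x00FF_FFFF_FFFF_FFFF;
    (tag, addr)
}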