pyo3_log/
lib.rs

#![forbid(unsafe_code)]
#![doc(
    html_root_url = "https://docs.rs/pyo3-log/0.2.1/pyo3-log/",
    test(attr(deny(warnings))),
    test(attr(allow(unknown_lints, non_local_definitions)))
)]
#![warn(missing_docs)]

//! A bridge from Rust to Python logging
//!
//! The library can be used to install a [logger][log::Log] into Rust that will forward the
//! messages over to the Python [logging](https://docs.python.org/3/library/logging.html) system.
//! This can be useful when writing a native Python extension module in Rust where it is desirable
//! to log from the Rust side too.
//!
//! The library internally depends on the [`pyo3`] crate. This dependency is not exposed through
//! the public API, so the bridge also works from extension modules that don't use [`pyo3`]
//! directly. It will nevertheless bring the dependency in, which may be worth considering if the
//! module doesn't want it.
//!
//! # Simple usage
//!
//! Each extension module has its own global variables, so the installed logger is independent of
//! any other Rust native extensions. It is therefore up to each module to set a logger for itself
//! if it wants one.
//!
//! By calling the [`init`] function from a place that runs only once (for example the top-level
//! module of the extension), the logger is registered and the logging macros (e.g.
//! [`info`][log::info]) send their messages over to the Python side.
//!
//! ```rust
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn log_something() {
//!     info!("Something!");
//! }
//!
//! #[pymodule]
//! fn my_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
//!     pyo3_log::init();
//!
//!     m.add_wrapped(wrap_pyfunction!(log_something))?;
//!     Ok(())
//! }
//! ```
//!
//! # Performance, Filtering and Caching
//!
//! Ideally, the logging system would always consult the Python loggers to know which messages
//! should or should not be logged. However, one of the reasons for using Rust instead of Python is
//! performance. Part of that is giving up the GIL in long-running computations to let other
//! threads run at the same time.
//!
//! Therefore, acquiring the GIL and calling into the Python interpreter on each
//! [`trace`][log::trace] message only to discover that it is not to be logged would be
//! prohibitively slow. Two techniques are employed here.
//!
//! First, level filters are applied before consulting the Python side. By default, only messages
//! at the [`Debug`][Level::Debug] level or more severe are considered for sending over to Python.
//! This can be overridden using the [`filter`][Logger::filter] and
//! [`filter_target`][Logger::filter_target] methods.
//!
//! Second, the Python loggers and their effective log levels are cached on the Rust side on the
//! first use of the given module. This means that for a disabled level, only the first logging
//! attempt in the given module will acquire the GIL, while future ones will short-circuit before
//! ever reaching Python.
//!
//! This is good for performance, but can lead to messages being incorrectly logged or not logged
//! in certain situations ‒ if Rust logs before the Python logging system is set up properly, or
//! when it is reconfigured at runtime.
//!
//! For these reasons it is possible to turn caching off when constructing the logger (at the cost
//! of performance) and to clear the cache manually through the [`ResetHandle`].
//!
//! To tune the caching and filtering, the logger needs to be created manually:
//!
//! ```rust
//! # use log::LevelFilter;
//! # use pyo3::prelude::*;
//! # use pyo3_log::{Caching, Logger};
//! #
//! # fn main() -> PyResult<()> {
//! # Python::with_gil(|py| {
//! let handle = Logger::new(py, Caching::LoggersAndLevels)?
//!     .filter(LevelFilter::Trace)
//!     .filter_target("my_module::verbose_submodule".to_owned(), LevelFilter::Warn)
//!     .install()
//!     .expect("Someone installed a logger before us :-(");
//!
//! // Some time in the future when logging changes, reset the caches:
//! handle.reset();
//! # Ok(())
//! # })
//! # }
//! ```
//!
//! # Mapping
//!
//! The logging `target` is mapped to the name of the logger on the Python side, replacing all
//! `::` occurrences with `.` (both form a hierarchy in their respective language).
//!
//! Log levels are mapped to the same-named ones. [`Trace`][Level::Trace] doesn't exist on the
//! Python side, but is mapped to a level with value 5.
//!
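//! As a std-only sketch (hypothetical helper code, not part of this crate's API), the target
//! translation is a simple string replacement, and the numeric values match Python's `logging`
//! levels:
//!
//! ```rust
//! // Rust target -> Python logger name.
//! let py_name = "my_module::sub".replace("::", ".");
//! assert_eq!(py_name, "my_module.sub");
//!
//! // Numeric level values on the Python side; `Trace` has no Python
//! // equivalent, so the value 5 (just below DEBUG = 10) is used.
//! let levels = [("ERROR", 40), ("WARNING", 30), ("INFO", 20), ("DEBUG", 10), ("TRACE", 5)];
//! assert_eq!(levels[4], ("TRACE", 5));
//! ```
//!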
//! # Interaction with Python GIL
//!
//! Under the hood, the logging routines call into Python. That means they need to acquire the
//! Global Interpreter Lock of Python.
//!
//! This has several consequences. One of them is the performance considerations mentioned above.
//!
//! The other is a risk of deadlocks if threads are used from within the extension code without
//! releasing the GIL.
//!
//! ```rust
//! use std::thread;
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn deadlock() {
//!     info!("This logs fine");
//!
//!     let background_thread = thread::spawn(|| {
//!         info!("This'll deadlock");
//!     });
//!
//!     background_thread.join().unwrap();
//! }
//! # let _ = deadlock;
//! ```
//!
//! The above code will deadlock, because the `info` call in the background thread needs the GIL
//! that's held by the `deadlock` function. One needs to give up the GIL to let the other threads
//! run, something like this:
//!
//! ```rust
//! use std::thread;
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn dont_deadlock(py: Python<'_>) {
//!     info!("This logs fine");
//!
//!     py.allow_threads(|| {
//!         let background_thread = thread::spawn(|| {
//!             info!("This'll not deadlock");
//!         });
//!
//!         background_thread.join().unwrap();
//!     });
//! }
//! # let _ = dont_deadlock;
//! ```

use std::cmp;
use std::collections::HashMap;
use std::sync::Arc;

use arc_swap::ArcSwap;
use log::{Level, LevelFilter, Log, Metadata, Record, SetLoggerError};
use pyo3::prelude::*;
use pyo3::types::PyTuple;

/// A handle into a [`Logger`], able to reset its caches.
///
/// This handle can be used to manipulate a [`Logger`] even after it has been installed. Its main
/// purpose is to reset the internal caches, for example if the logging settings on the Python side
/// changed.
#[derive(Clone, Debug)]
pub struct ResetHandle(Arc<ArcSwap<CacheNode>>);

impl ResetHandle {
    /// Reset the internal logger caches.
    ///
    /// This removes all the cached loggers and levels (if there were any). Future logging calls
    /// may cache them again, using the current Python logging settings.
    pub fn reset(&self) {
        // Overwrite whatever is in the cache directly. This must win in case of any collisions
        // (the caching uses compare_and_swap to let the reset win).
        self.0.store(Default::default());
    }
}

/// What the [`Logger`] can cache.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
#[non_exhaustive]
pub enum Caching {
    /// Disables caching.
    ///
    /// Every time a log message passes the filters, the code goes to the Python side to check if
    /// the message shall be logged.
    Nothing,

    /// Caches the Python `Logger` objects.
    ///
    /// The logger objects (which should stay the same during the lifetime of a Python application)
    /// are cached. However, the log levels are not. This saves some calls into Python code during
    /// a logging call, but the GIL still needs to be acquired even if the message doesn't
    /// eventually get output anywhere.
    Loggers,

    /// Caches both the Python `Logger` objects and their respective effective log levels.
    ///
    /// Therefore, once a `Logger` has been cached, it is possible to decide on the Rust side if a
    /// message would get logged or not. If the message is not to be logged, no Python code is
    /// called and the GIL doesn't have to be acquired.
    LoggersAndLevels,
}

impl Default for Caching {
    fn default() -> Self {
        Caching::LoggersAndLevels
    }
}


#[derive(Debug)]
struct CacheEntry {
    filter: LevelFilter,
    logger: PyObject,
}

impl CacheEntry {
    fn clone_ref(&self, py: Python<'_>) -> Self {
        CacheEntry {
            filter: self.filter,
            logger: self.logger.clone_ref(py),
        }
    }
}

#[derive(Debug, Default)]
struct CacheNode {
    local: Option<CacheEntry>,
    children: HashMap<String, Arc<CacheNode>>,
}

impl CacheNode {
    /// Clones the nodes along `path` in a copy-on-write manner, stores `entry` at the end of the
    /// path and returns the new root.
    fn store_to_cache_recursive<'a, P>(
        &self,
        py: Python<'_>,
        mut path: P,
        entry: CacheEntry,
    ) -> Arc<Self>
    where
        P: Iterator<Item = &'a str>,
    {
        let mut me = CacheNode {
            children: self.children.clone(),
            local: self.local.as_ref().map(|e| e.clone_ref(py)),
        };
        match path.next() {
            Some(segment) => {
                let child = me.children.entry(segment.to_owned()).or_default();
                *child = child.store_to_cache_recursive(py, path, entry);
            }
            None => me.local = Some(entry),
        }
        Arc::new(me)
    }
}

/// The `Logger`
///
/// The actual `Logger` that can be installed into the Rust side and will send messages over to
/// Python.
///
/// It can be either created directly and then installed, passed to other aggregating log systems,
/// or the [`init`] or [`try_init`] functions may be used if defaults are good enough.
#[derive(Debug)]
pub struct Logger {
    /// Filter used as a fallback if none of the `filters` match.
    top_filter: LevelFilter,

    /// Mapping of filters to modules.
    ///
    /// The most specific one will be used, falling back to `top_filter` if none matches. Stored as
    /// full paths, with `::` separators (i.e. before converting them from Rust to Python).
    filters: HashMap<String, LevelFilter>,

    /// The imported Python `logging` module.
    logging: Py<PyModule>,

    /// Caching configuration.
    caching: Caching,

    /// The cache with loggers and level filters.
    ///
    /// The nodes form a tree ‒ each one potentially holding a cache entry (or not) and might have
    /// some children.
    ///
    /// When updating, the whole path from the root is cloned in a copy-on-write manner and the Arc
    /// here is switched. In case of collisions (e.g. someone already replaced the root since
    /// starting the update), the update is just thrown away.
    cache: Arc<ArcSwap<CacheNode>>,
}

impl Logger {
    /// Creates a new logger.
    ///
    /// It defaults to having a filter for [`Debug`][LevelFilter::Debug].
    pub fn new(py: Python<'_>, caching: Caching) -> PyResult<Self> {
        let logging = py.import("logging")?;
        Ok(Self {
            top_filter: LevelFilter::Debug,
            filters: HashMap::new(),
            logging: logging.into(),
            caching,
            cache: Default::default(),
        })
    }

    /// Installs this logger as the global one.
    ///
    /// When installing, it also sets the corresponding [maximum level][log::set_max_level],
    /// constructed using the filters in this logger.
    pub fn install(self) -> Result<ResetHandle, SetLoggerError> {
        let handle = self.reset_handle();
        let level = cmp::max(
            self.top_filter,
            self.filters
                .values()
                .copied()
                .max()
                .unwrap_or(LevelFilter::Off),
        );
        log::set_boxed_logger(Box::new(self))?;
        log::set_max_level(level);
        Ok(handle)
    }

    /// Provides the reset handle of this logger.
    ///
    /// Note that installing the logger also returns a reset handle. This function is available if,
    /// for example, the logger will be passed to some other logging system that connects multiple
    /// loggers together.
    pub fn reset_handle(&self) -> ResetHandle {
        ResetHandle(Arc::clone(&self.cache))
    }

    /// Configures the default logging filter.
    ///
    /// Log messages will be filtered according to the configured filters. If one provided by a
    /// [`filter_target`][Logger::filter_target] call matches, it takes preference. If none
    /// matches, this one is used.
    ///
    /// The default filter if none is set is [`Debug`][LevelFilter::Debug].
    pub fn filter(mut self, filter: LevelFilter) -> Self {
        self.top_filter = filter;
        self
    }

    /// Sets a filter for a specific target, overriding the default.
    ///
    /// This'll match targets with the same name and all the children in the module hierarchy. In
    /// case multiple match, the most specific one wins.
    ///
    /// With the following configuration, modules will log at these levels:
    ///
    /// ```rust
    /// # use log::LevelFilter;
    /// # use pyo3_log::Logger;
    ///
    /// Logger::default()
    ///     .filter(LevelFilter::Warn)
    ///     .filter_target("xy".to_owned(), LevelFilter::Debug)
    ///     .filter_target("xy::aa".to_owned(), LevelFilter::Trace);
    /// ```
    ///
    /// * `whatever` => `Warn`
    /// * `xy` => `Debug`
    /// * `xy::aa` => `Trace`
    /// * `xy::aabb` => `Debug`
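    ///
    /// The resolution rule above can be sketched with a std-only helper (hypothetical code that
    /// mirrors the crate's private lookup, not part of the public API):
    ///
    /// ```rust
    /// use std::collections::HashMap;
    ///
    /// // The longest configured `::`-separated prefix of the target wins.
    /// fn resolve(filters: &HashMap<&str, u8>, default: u8, target: &str) -> u8 {
    ///     let mut level = default;
    ///     let mut end = 0;
    ///     for segment in target.split("::") {
    ///         end += if end == 0 { segment.len() } else { segment.len() + 2 };
    ///         if let Some(l) = filters.get(&target[..end]) {
    ///             level = *l;
    ///         }
    ///     }
    ///     level
    /// }
    ///
    /// let filters = HashMap::from([("xy", 1), ("xy::aa", 2)]);
    /// assert_eq!(resolve(&filters, 0, "whatever"), 0);
    /// assert_eq!(resolve(&filters, 0, "xy::aabb"), 1);
    /// assert_eq!(resolve(&filters, 0, "xy::aa::deep"), 2);
    /// ```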
    pub fn filter_target(mut self, target: String, filter: LevelFilter) -> Self {
        self.filters.insert(target, filter);
        self
    }

    /// Finds a node in the cache.
    ///
    /// The hierarchy separator is `::`.
    fn lookup(&self, target: &str) -> Option<Arc<CacheNode>> {
        if self.caching == Caching::Nothing {
            return None;
        }

        let root = self.cache.load();
        let mut node: &Arc<CacheNode> = &root;
        for segment in target.split("::") {
            match node.children.get(segment) {
                Some(sub) => node = sub,
                None => return None,
            }
        }

        Some(Arc::clone(node))
    }

    /// Sends one log record over to the Python side.
    ///
    /// Returns a logger to be cached, if any. If it already found a cached logger or if caching is
    /// turned off, returns None.
    fn log_inner(
        &self,
        py: Python<'_>,
        record: &Record,
        cache: &Option<Arc<CacheNode>>,
    ) -> PyResult<Option<PyObject>> {
        let msg = format!("{}", record.args());
        let log_level = map_level(record.level());
        let target = record.target().replace("::", ".");
        let cached_logger = cache
            .as_ref()
            .and_then(|node| node.local.as_ref())
            .map(|local| &local.logger);
        let (logger, cached) = match cached_logger {
            Some(cached) => (cached.bind(py).clone(), true),
            None => (
                self.logging
                    .bind(py)
                    .getattr("getLogger")?
                    .call1((&target,))?,
                false,
            ),
        };
        // We need to check the level ourselves. For some reason, the logger's `handle` method
        // does not check it. And besides, we can save ourselves a few Python calls if it's turned
        // off.
        if is_enabled_for(&logger, record.level())? {
            let none = py.None();
            // TODO: kv pairs, if enabled as a feature?
            let record = logger.call_method1(
                "makeRecord",
                (
                    target,
                    log_level,
                    record.file(),
                    record.line().unwrap_or_default(),
                    msg,
                    PyTuple::empty(py), // args
                    &none,              // exc_info
                ),
            )?;
            logger.call_method1("handle", (record,))?;
        }

        let cache_logger = if !cached && self.caching != Caching::Nothing {
            Some(logger.into())
        } else {
            None
        };

        Ok(cache_logger)
    }

    /// Finds the most specific filter configured for `target`, falling back to the default filter
    /// if no prefix matches.
    fn filter_for(&self, target: &str) -> LevelFilter {
        let mut start = 0;
        let mut filter = self.top_filter;
        while let Some(end) = target[start..].find("::") {
            if let Some(f) = self.filters.get(&target[..start + end]) {
                filter = *f;
            }
            start += end + 2;
        }
        if let Some(f) = self.filters.get(target) {
            filter = *f;
        }

        filter
    }

    fn enabled_inner(&self, metadata: &Metadata, cache: &Option<Arc<CacheNode>>) -> bool {
        let cache_filter = cache
            .as_ref()
            .and_then(|node| node.local.as_ref())
            .map(|local| local.filter)
            .unwrap_or_else(LevelFilter::max);

        metadata.level() <= cache_filter && metadata.level() <= self.filter_for(metadata.target())
    }

    fn store_to_cache(&self, py: Python<'_>, target: &str, entry: CacheEntry) {
        let path = target.split("::");

        let orig = self.cache.load();
        // Construct a new cache tree with the entry inserted, to become the new root.
        let new = orig.store_to_cache_recursive(py, path, entry);
        // Note: In case of collision, the cache update is lost. This is fine, as we simply lose a
        // tiny bit of performance and will cache the thing next time.
        //
        // We err on the side of losing it here (instead of overwriting), because if the cache is
        // reset, we don't want to re-insert the old value we have.
        self.cache.compare_and_swap(orig, new);
    }
}

impl Default for Logger {
    fn default() -> Self {
        Python::with_gil(|py| {
            Self::new(py, Caching::LoggersAndLevels).expect("Failed to initialize python logging")
        })
    }
}

impl Log for Logger {
    fn enabled(&self, metadata: &Metadata) -> bool {
        let cache = self.lookup(metadata.target());

        self.enabled_inner(metadata, &cache)
    }

    fn log(&self, record: &Record) {
        let cache = self.lookup(record.target());

        if self.enabled_inner(record.metadata(), &cache) {
            Python::with_gil(|py| {
                // If an exception was triggered before this attempt to log,
                // store it to the side for now and restore it afterwards.
                let maybe_existing_exception = PyErr::take(py);
                match self.log_inner(py, record, &cache) {
                    Ok(Some(logger)) => {
                        let filter = match self.caching {
                            Caching::Nothing => unreachable!(),
                            Caching::Loggers => LevelFilter::max(),
                            Caching::LoggersAndLevels => extract_max_level(logger.bind(py))
                                .unwrap_or_else(|e| {
                                    // See the detailed NOTE below.
                                    e.restore(py);
                                    LevelFilter::max()
                                }),
                        };

                        let entry = CacheEntry { filter, logger };
                        self.store_to_cache(py, record.target(), entry);
                    }
                    Ok(None) => (),
                    Err(e) => {
                        // NOTE: If an exception was triggered _during_ logging, restore it as the
                        // current Python exception. We have to use PyErr::restore because we
                        // cannot return a PyResult from the Log trait's log method.
                        e.restore(py);
                    }
                };

                // If there was a prior exception, restore it now. This ensures that the earliest
                // thrown exception is the one that's visible to the caller.
                if let Some(e) = maybe_existing_exception {
                    e.restore(py);
                }
            })
        }
    }

    fn flush(&self) {}
}

fn map_level(level: Level) -> usize {
    match level {
        Level::Error => 40,
        Level::Warn => 30,
        Level::Info => 20,
        Level::Debug => 10,
        Level::Trace => 5,
    }
}

fn is_enabled_for(logger: &Bound<'_, PyAny>, level: Level) -> PyResult<bool> {
    let level = map_level(level);
    logger.call_method1("isEnabledFor", (level,))?.is_truthy()
}

fn extract_max_level(logger: &Bound<'_, PyAny>) -> PyResult<LevelFilter> {
    use Level::*;
    for l in &[Trace, Debug, Info, Warn, Error] {
        if is_enabled_for(logger, *l)? {
            return Ok(l.to_level_filter());
        }
    }

    Ok(LevelFilter::Off)
}

/// Installs a default instance of the logger.
///
/// In case a logger is already installed, an error is returned. On success, a handle to reset the
/// internal caches is returned.
///
/// The default logger has a filter set to [`Debug`][LevelFilter::Debug] and caching set to
/// [`LoggersAndLevels`][Caching::LoggersAndLevels].
pub fn try_init() -> Result<ResetHandle, SetLoggerError> {
    Logger::default().install()
}

/// Similar to [`try_init`], but panics if there's a previous logger already installed.
pub fn init() -> ResetHandle {
    try_init().unwrap()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn default_filter() {
        let logger = Logger::default();
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Debug);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Debug);
    }

    #[test]
    fn set_filter() {
        let logger = Logger::default().filter(LevelFilter::Info);
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Info);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Info);
    }

    #[test]
    fn filter_specific() {
        let logger = Logger::default()
            .filter(LevelFilter::Warn)
            .filter_target("hello_world".to_owned(), LevelFilter::Debug)
            .filter_target("hello_world::sub".to_owned(), LevelFilter::Trace);
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Debug);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Trace);
        assert_eq!(
            logger.filter_for("hello_world::sub::multi::level"),
            LevelFilter::Trace
        );
        assert_eq!(
            logger.filter_for("hello_world::another"),
            LevelFilter::Debug
        );
        assert_eq!(
            logger.filter_for("hello_world::another::level"),
            LevelFilter::Debug
        );
        assert_eq!(logger.filter_for("other"), LevelFilter::Warn);
    }
}