pyo3_log/lib.rs

#![forbid(unsafe_code)]
#![doc(
    html_root_url = "https://docs.rs/pyo3-log/0.2.1/pyo3-log/",
    test(attr(deny(warnings))),
    test(attr(allow(unknown_lints, non_local_definitions)))
)]
#![warn(missing_docs)]

//! A bridge from Rust to Python logging
//!
//! The library can be used to install a [logger][log::Log] into Rust that will send the messages
//! over to the Python [logging](https://docs.python.org/3/library/logging.html). This can be
//! useful when writing a native Python extension module in Rust and it is desirable to log from
//! the Rust side too.
//!
//! The library internally depends on the [`pyo3`] crate. This is not exposed through the public
//! API and it should work from extension modules not using [`pyo3`] directly. It'll nevertheless
//! still bring the dependency in, which might be worth considering if the module doesn't want to
//! use it.
//!
//! # Simple usage
//!
//! Each extension module has its own global variables, so the installed logger is also
//! independent of other Rust native extensions. It is therefore up to each one to set a logger
//! for itself if it wants one.
//!
//! By calling the [`init`] function from a place that's run only once (maybe from the top-level
//! module of the extension), the logger is registered and the logging macros
//! (e.g. [`info`][log::info]) send their messages over to the Python side.
//!
//! ```rust
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn log_something() {
//!     info!("Something!");
//! }
//!
//! #[pymodule]
//! fn my_module(m: Bound<'_, PyModule>) -> PyResult<()> {
//!     pyo3_log::init();
//!
//!     m.add_wrapped(wrap_pyfunction!(log_something))?;
//!     Ok(())
//! }
//! ```
//!
//! The following example shows how the same is done with the declarative inline module syntax
//! introduced in PyO3 0.23.
//!
//! ```rust
//! # mod test_declarative_example {
//! use pyo3::prelude::*;
//! use log::info;
//!
//! #[pymodule]
//! mod my_module {
//!     use super::*;
//!
//!     #[pymodule_init]
//!     fn init(_m: &Bound<'_, PyModule>) -> PyResult<()> {
//!         pyo3_log::init();
//!         Ok(())
//!     }
//!
//!     #[pyfunction]
//!     fn log_something() {
//!         info!("Something!");
//!     }
//! }
//! # }
//! ```
//!
//! # Performance, Filtering and Caching
//!
//! Ideally, the logging system would always consult the Python loggers to know which messages
//! should or should not be logged. However, one of the reasons for using Rust instead of Python is
//! performance. Part of that is giving up the GIL in long-running computations to let other
//! threads run at the same time.
//!
//! Therefore, acquiring the GIL and calling into the Python interpreter on each
//! [`trace`][log::trace] message only to find out that it is not to be logged would be
//! prohibitively slow. There are two techniques employed here.
//!
//! First, level filters are applied before consulting the Python side. By default, only messages
//! at the [`Debug`][Level::Debug] level and more severe are considered for sending over to Python.
//! This can be overridden using the [`filter`][Logger::filter] and
//! [`filter_target`][Logger::filter_target] methods.
//!
//! Second, the Python loggers and their effective log levels are cached on the Rust side on the
//! first use of the given module. This means that on a disabled level, only the first logging
//! attempt in the given module will acquire the GIL, while future ones will short-circuit before
//! ever reaching Python.
//!
//! This is good for performance, but could lead to messages being logged or suppressed incorrectly
//! in certain situations ‒ if Rust logs before the Python logging system is set up properly or
//! when it is reconfigured at runtime.
//!
//! For these reasons it is possible to turn caching off on construction of the logger (at the cost
//! of performance) and to clear the cache manually through the [`ResetHandle`].
//!
//! To tune the caching and filtering, the logger needs to be created manually:
//!
//! ```rust
//! # use log::LevelFilter;
//! # use pyo3::prelude::*;
//! # use pyo3_log::{Caching, Logger};
//! #
//! # fn main() -> PyResult<()> {
//! # Python::attach(|py| {
//! let handle = Logger::new(py, Caching::LoggersAndLevels)?
//!     .filter(LevelFilter::Trace)
//!     .filter_target("my_module::verbose_submodule".to_owned(), LevelFilter::Warn)
//!     .install()
//!     .expect("Someone installed a logger before us :-(");
//!
//! // Some time in the future when logging changes, reset the caches:
//! handle.reset();
//! # Ok(())
//! # })
//! # }
//! ```
//!
//! # Mapping
//!
//! The logging `target` is mapped into the name of the logger on the Python side, replacing all
//! `::` occurrences with `.` (both form a hierarchy in their respective languages).
//!
//! Log levels are mapped to the same-named ones. The [`Trace`][Level::Trace] level doesn't exist
//! on the Python side, but is mapped to a custom level with value 5.
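//!
//! For illustration (a minimal sketch; the module and message names here are made up), a message
//! logged from Rust like this:
//!
//! ```rust
//! log::info!(target: "my_module::submodule", "hello from Rust");
//! ```
//!
//! is handled by the Python logger named `my_module.submodule`, the same one that
//! `logging.getLogger("my_module.submodule")` would return on the Python side.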
//!
//! # Interaction with Python GIL
//!
//! Under the hood, the logging routines call into Python. That means they need to acquire the
//! Global Interpreter Lock of Python.
//!
//! This has several consequences. One of them is the performance consideration mentioned above.
//!
//! The other is a risk of deadlocks if threads are used from within the extension code without
//! releasing the GIL.
//!
//! ```rust
//! use std::thread;
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn deadlock() {
//!     info!("This logs fine");
//!
//!     let background_thread = thread::spawn(|| {
//!         info!("This'll deadlock");
//!     });
//!
//!     background_thread.join().unwrap();
//! }
//! # let _ = deadlock;
//! ```
//!
//! The above code will deadlock, because the `info` call in the background thread needs the GIL
//! that's held by the `deadlock` function. One needs to give up the GIL to let the other threads
//! run, something like this:
//!
//! ```rust
//! use std::thread;
//! use log::info;
//! use pyo3::prelude::*;
//!
//! #[pyfunction]
//! fn dont_deadlock(py: Python<'_>) {
//!     info!("This logs fine");
//!
//!     py.detach(|| {
//!         let background_thread = thread::spawn(|| {
//!             info!("This'll not deadlock");
//!         });
//!
//!         background_thread.join().unwrap();
//!     });
//! }
//! # let _ = dont_deadlock;
//! ```

use std::cmp;
use std::collections::HashMap;
use std::sync::Arc;

use arc_swap::ArcSwap;
use log::{Level, LevelFilter, Log, Metadata, Record, SetLoggerError};
use pyo3::prelude::*;
use pyo3::types::PyTuple;

/// A handle into a [`Logger`], able to reset its caches.
///
/// This handle can be used to manipulate a [`Logger`] even after it has been installed. Its main
/// purpose is to reset the internal caches, for example if the logging settings on the Python side
/// changed.
#[derive(Clone, Debug)]
pub struct ResetHandle(Arc<ArcSwap<CacheNode>>);

impl ResetHandle {
    /// Reset the internal logger caches.
    ///
    /// This removes all the cached loggers and levels (if there were any). Future logging calls
    /// may cache them again, using the current Python logging settings.
    pub fn reset(&self) {
        // Overwrite whatever is in the cache directly. This must win in case of any collisions
        // (the caching uses compare_and_swap to let the reset win).
        self.0.store(Default::default());
    }
}

/// What the [`Logger`] can cache.
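///
/// The variant is chosen when constructing the [`Logger`]. A minimal sketch (assuming an embedded
/// or auto-initialized Python interpreter, as in the other examples):
///
/// ```rust
/// # use pyo3::prelude::*;
/// # use pyo3_log::{Caching, Logger};
/// # fn main() -> PyResult<()> {
/// # Python::attach(|py| {
/// // Always ask Python whether a message should be logged (no caching at all).
/// let _logger = Logger::new(py, Caching::Nothing)?;
/// # Ok(())
/// # })
/// # }
/// ```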
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
#[non_exhaustive]
#[derive(Default)]
pub enum Caching {
    /// Disables caching.
    ///
    /// Every time a log message passes the filters, the code goes to the Python side to check if
    /// the message shall be logged.
    Nothing,

    /// Caches the Python `Logger` objects.
    ///
    /// The logger objects (which should stay the same during the lifetime of a Python application)
    /// are cached. However, the log levels are not. This means there's some amount of calling of
    /// Python code saved during a logging call, but the GIL still needs to be acquired even if the
    /// message doesn't eventually get output anywhere.
    Loggers,

    /// Caches both the Python `Logger` and their respective effective log levels.
    ///
    /// Therefore, once a `Logger` has been cached, it is possible to decide on the Rust side if a
    /// message would get logged or not. If the message is not to be logged, no Python code is
    /// called and the GIL doesn't have to be acquired.
    #[default]
    LoggersAndLevels,
}

#[derive(Debug)]
struct CacheEntry {
    filter: LevelFilter,
    logger: Py<PyAny>,
}

impl CacheEntry {
    fn clone_ref(&self, py: Python<'_>) -> Self {
        CacheEntry {
            filter: self.filter,
            logger: self.logger.clone_ref(py),
        }
    }
}

#[derive(Debug, Default)]
struct CacheNode {
    local: Option<CacheEntry>,
    children: HashMap<String, Arc<CacheNode>>,
}

impl CacheNode {
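    /// Creates an updated copy of this subtree with `entry` stored at the given `path`.
    ///
    /// The nodes along the path are cloned in a copy-on-write fashion; untouched siblings keep
    /// sharing the original `Arc`s.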
    fn store_to_cache_recursive<'a, P>(
        &self,
        py: Python<'_>,
        mut path: P,
        entry: CacheEntry,
    ) -> Arc<Self>
    where
        P: Iterator<Item = &'a str>,
    {
        let mut me = CacheNode {
            children: self.children.clone(),
            local: self.local.as_ref().map(|e| e.clone_ref(py)),
        };
        match path.next() {
            Some(segment) => {
                let child = me.children.entry(segment.to_owned()).or_default();
                *child = child.store_to_cache_recursive(py, path, entry);
            }
            None => me.local = Some(entry),
        }
        Arc::new(me)
    }
}

/// The `Logger`
///
/// The actual `Logger` that can be installed into the Rust side and will send messages over to
/// Python.
///
/// It can be created directly and then installed or passed to another aggregating log system, or
/// the [`init`] or [`try_init`] functions may be used if the defaults are good enough.
#[derive(Debug)]
pub struct Logger {
    /// Filter used as a fallback if none of the `filters` match.
    top_filter: LevelFilter,

    /// Mapping of filters to modules.
    ///
    /// The most specific one will be used, falling back to `top_filter` if none matches. Stored as
    /// full paths, with `::` separators (i.e. before converting them from Rust to Python).
    filters: HashMap<String, LevelFilter>,

    /// The prefix to prepend to all log targets.
    prefix: Option<String>,

    /// The imported Python `logging` module.
    logging: Py<PyModule>,

    /// Caching configuration.
    caching: Caching,

    /// The cache with loggers and level filters.
    ///
    /// The nodes form a tree ‒ each one potentially holding a cache entry (or not) and possibly
    /// having some children.
    ///
    /// When updating, the whole path from the root is cloned in a copy-on-write manner and the Arc
    /// here is switched. In case of collisions (e.g. someone already replaced the root since
    /// starting the update), the update is just thrown away.
    cache: Arc<ArcSwap<CacheNode>>,
}

impl Logger {
    /// Creates a new logger.
    ///
    /// It defaults to having a filter for [`Debug`][LevelFilter::Debug].
    pub fn new(py: Python<'_>, caching: Caching) -> PyResult<Self> {
        let logging = py.import("logging")?;
        Ok(Self {
            top_filter: LevelFilter::Debug,
            filters: HashMap::new(),
            prefix: None,
            logging: logging.into(),
            caching,
            cache: Default::default(),
        })
    }

    /// Installs this logger as the global one.
    ///
    /// When installing, it also sets the corresponding [maximum level][log::set_max_level],
    /// constructed using the filters in this logger.
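    ///
    /// A minimal sketch of installing with a custom filter (any logger installed earlier would
    /// make this fail):
    ///
    /// ```rust
    /// # use log::LevelFilter;
    /// # use pyo3_log::Logger;
    /// let handle = Logger::default()
    ///     .filter(LevelFilter::Info)
    ///     .install()
    ///     .expect("no other logger should be installed yet");
    /// // The global maximum level now matches the most permissive configured filter.
    /// assert_eq!(log::max_level(), LevelFilter::Info);
    /// # drop(handle);
    /// ```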
    pub fn install(self) -> Result<ResetHandle, SetLoggerError> {
        let handle = self.reset_handle();
        let level = cmp::max(
            self.top_filter,
            self.filters
                .values()
                .copied()
                .max()
                .unwrap_or(LevelFilter::Off),
        );
        log::set_boxed_logger(Box::new(self))?;
        log::set_max_level(level);
        Ok(handle)
    }

    /// Provides the reset handle of this logger.
    ///
    /// Note that installing the logger also returns a reset handle. This function is available if,
    /// for example, the logger will be passed to some other logging system that connects multiple
    /// loggers together.
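    ///
    /// A minimal sketch (here the logger is simply dropped afterwards; what to do with it is up to
    /// the caller):
    ///
    /// ```rust
    /// # use pyo3::prelude::*;
    /// # use pyo3_log::{Caching, Logger};
    /// # fn main() -> PyResult<()> {
    /// # Python::attach(|py| {
    /// let logger = Logger::new(py, Caching::LoggersAndLevels)?;
    /// // Keep a handle around before handing the logger over elsewhere.
    /// let handle = logger.reset_handle();
    /// // ... pass `logger` to some aggregating log system here ...
    /// handle.reset();
    /// # drop(logger);
    /// # Ok(())
    /// # })
    /// # }
    /// ```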
    pub fn reset_handle(&self) -> ResetHandle {
        ResetHandle(Arc::clone(&self.cache))
    }

    /// Configures the default logging filter.
    ///
    /// Log messages will be filtered according to this filter. If one provided by a
    /// [`filter_target`][Logger::filter_target] matches, it takes preference. If none matches,
    /// this one is used.
    ///
    /// The default filter, if none is set, is [`Debug`][LevelFilter::Debug].
    pub fn filter(mut self, filter: LevelFilter) -> Self {
        self.top_filter = filter;
        self
    }

    /// Sets a filter for a specific target, overriding the default.
    ///
    /// This'll match targets with the same name and all the children in the module hierarchy. In
    /// case multiple match, the most specific one wins.
    ///
    /// With this configuration, modules will log at the following levels:
    ///
    /// ```rust
    /// # use log::LevelFilter;
    /// # use pyo3_log::Logger;
    ///
    /// Logger::default()
    ///     .filter(LevelFilter::Warn)
    ///     .filter_target("xy".to_owned(), LevelFilter::Debug)
    ///     .filter_target("xy::aa".to_owned(), LevelFilter::Trace);
    /// ```
    ///
    /// * `whatever` => `Warn`
    /// * `xy` => `Debug`
    /// * `xy::aa` => `Trace`
    /// * `xy::aabb` => `Debug`
    pub fn filter_target(mut self, target: String, filter: LevelFilter) -> Self {
        self.filters.insert(target, filter);
        self
    }

    /// Sets a prefix to prepend to log targets before sending log messages to Python.
    ///
    /// This allows for Python-side arrangements where logging configurations are only
    /// attached to logging names other than the root.
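    ///
    /// A small sketch (the prefix name is made up for illustration):
    ///
    /// ```rust
    /// # use pyo3_log::Logger;
    /// // With this prefix, a Rust target `my_module::sub` is sent to the Python logger
    /// // named `my_package.my_module.sub`.
    /// Logger::default().set_prefix("my_package");
    /// ```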
    pub fn set_prefix(mut self, prefix: &str) -> Self {
        self.prefix = Some(prefix.replace("::", "."));
        self
    }

    /// Finds a node in the cache.
    ///
    /// The hierarchy separator is `::`.
    fn lookup(&self, target: &str) -> Option<Arc<CacheNode>> {
        if self.caching == Caching::Nothing {
            return None;
        }

        let root = self.cache.load();
        let mut node: &Arc<CacheNode> = &root;
        for segment in target.split("::") {
            match node.children.get(segment) {
                Some(sub) => node = sub,
                None => return None,
            }
        }

        Some(Arc::clone(node))
    }

    /// Logs stuff
    ///
    /// Returns a logger to be cached, if any. If it already found a cached logger or if caching is
    /// turned off, returns `None`.
    fn log_inner(
        &self,
        py: Python<'_>,
        record: &Record,
        cache: &Option<Arc<CacheNode>>,
    ) -> PyResult<Option<Py<PyAny>>> {
        let msg = format!("{}", record.args());
        let log_level = map_level(record.level());
        let mut target = record.target().replace("::", ".");
        target = match &self.prefix {
            Some(prefix) => format!("{}.{}", prefix, target),
            None => target,
        };
        let cached_logger = cache
            .as_ref()
            .and_then(|node| node.local.as_ref())
            .map(|local| &local.logger);
        let (logger, cached) = match cached_logger {
            Some(cached) => (cached.bind(py).clone(), true),
            None => (
                self.logging
                    .bind(py)
                    .getattr("getLogger")?
                    .call1((&target,))?,
                false,
            ),
        };
        // We need to check for this ourselves. For some reason, the logger.handle does not check
        // it. And besides, we can save ourselves a few Python calls if it's turned off.
        if is_enabled_for(&logger, record.level())? {
            let none = py.None();

            #[allow(unused_mut)]
            let mut extra = py.None().into_bound(py);

            #[cfg(feature = "kv")]
            if record.key_values().count() > 0 {
                // write structured data to 'extra', serializing the values
                use log::kv::{Key, Value, VisitSource};
                use pyo3::types::{PyDict, PyString};

                struct PyDictVisitor<'p> {
                    dict: Bound<'p, PyDict>,
                }

                impl<'kvs, 'p> VisitSource<'kvs> for PyDictVisitor<'p> {
                    fn visit_pair(
                        &mut self,
                        key: Key<'kvs>,
                        value: Value<'kvs>,
                    ) -> Result<(), log::kv::Error> {
                        let py_key = PyString::new(self.dict.py(), key.as_str());
                        let py_value = PyString::new(self.dict.py(), &value.to_string());

                        let _ = self.dict.set_item(py_key, py_value);
                        Ok(())
                    }
                }

                let mut visitor = PyDictVisitor {
                    dict: PyDict::new(py),
                };
                let _ = record.key_values().visit(&mut visitor);

                extra = visitor.dict.into_any();
            }

            let record = logger.call_method1(
                "makeRecord",
                (
                    target,
                    log_level,
                    record.file(),
                    record.line().unwrap_or_default(),
                    msg,
                    PyTuple::empty(py), // args
                    &none,              // exc_info
                    &none,              // func
                    extra,              // extra
                ),
            )?;

            logger.call_method1("handle", (record,))?;
        }

        let cache_logger = if !cached && self.caching != Caching::Nothing {
            Some(logger.into())
        } else {
            None
        };

        Ok(cache_logger)
    }

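    /// Finds the most specific configured filter for the given target.
    ///
    /// Walks the `::`-separated prefixes of `target` from the shortest to the longest and keeps
    /// the last (most specific) match, falling back to `top_filter` when nothing matches.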
    fn filter_for(&self, target: &str) -> LevelFilter {
        let mut start = 0;
        let mut filter = self.top_filter;
        while let Some(end) = target[start..].find("::") {
            if let Some(f) = self.filters.get(&target[..start + end]) {
                filter = *f;
            }
            start += end + 2;
        }
        if let Some(f) = self.filters.get(target) {
            filter = *f;
        }

        filter
    }

    fn enabled_inner(&self, metadata: &Metadata, cache: &Option<Arc<CacheNode>>) -> bool {
        let cache_filter = cache
            .as_ref()
            .and_then(|node| node.local.as_ref())
            .map(|local| local.filter)
            .unwrap_or_else(LevelFilter::max);

        metadata.level() <= cache_filter && metadata.level() <= self.filter_for(metadata.target())
    }

    fn store_to_cache(&self, py: Python<'_>, target: &str, entry: CacheEntry) {
        let path = target.split("::");

        let orig = self.cache.load();
        // Construct a new cache structure with the new entry and swap in its root below.
        let new = orig.store_to_cache_recursive(py, path, entry);
        // Note: In case of collision, the cache update is lost. This is fine, as we simply lose a
        // tiny bit of performance and will cache the thing next time.
        //
        // We err on the side of losing it here (instead of overwriting), because if the cache is
        // reset, we don't want to re-insert the old value we have.
        self.cache.compare_and_swap(orig, new);
    }
}

impl Default for Logger {
    fn default() -> Self {
        Python::attach(|py| {
            Self::new(py, Caching::LoggersAndLevels).expect("Failed to initialize python logging")
        })
    }
}

impl Log for Logger {
    fn enabled(&self, metadata: &Metadata) -> bool {
        let cache = self.lookup(metadata.target());

        self.enabled_inner(metadata, &cache)
    }

    fn log(&self, record: &Record) {
        let cache = self.lookup(record.target());

        if self.enabled_inner(record.metadata(), &cache) {
            Python::attach(|py| {
                // If an exception was raised before this attempt to log,
                // store it to the side for now and restore it afterwards.
                let maybe_existing_exception = PyErr::take(py);
                match self.log_inner(py, record, &cache) {
                    Ok(Some(logger)) => {
                        let filter = match self.caching {
                            Caching::Nothing => unreachable!(),
                            Caching::Loggers => LevelFilter::max(),
                            Caching::LoggersAndLevels => extract_max_level(logger.bind(py))
                                .unwrap_or_else(|e| {
                                    // See detailed NOTE below
                                    e.restore(py);
                                    LevelFilter::max()
                                }),
                        };

                        let entry = CacheEntry { filter, logger };
                        self.store_to_cache(py, record.target(), entry);
                    }
                    Ok(None) => (),
                    Err(e) => {
                        // NOTE: If an exception was triggered _during_ logging, restore it as the
                        // current Python exception. We have to use PyErr::restore because we
                        // cannot return a PyResult from the Log trait's log method.
                        e.restore(py);
                    }
                };

                // If there was a prior exception, restore it now.
                // This ensures that the earliest thrown exception is the one visible to the caller.
                if let Some(e) = maybe_existing_exception {
                    e.restore(py);
                }
            })
        }
    }

    fn flush(&self) {}
}

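/// Maps Rust log levels to Python's numeric logging levels.
///
/// Python's standard levels are `ERROR` = 40, `WARNING` = 30, `INFO` = 20 and `DEBUG` = 10;
/// `Trace` has no standard Python counterpart and maps to the custom value 5.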
fn map_level(level: Level) -> usize {
    match level {
        Level::Error => 40,
        Level::Warn => 30,
        Level::Info => 20,
        Level::Debug => 10,
        Level::Trace => 5,
    }
}

fn is_enabled_for(logger: &Bound<'_, PyAny>, level: Level) -> PyResult<bool> {
    let level = map_level(level);
    logger.call_method1("isEnabledFor", (level,))?.is_truthy()
}

fn extract_max_level(logger: &Bound<'_, PyAny>) -> PyResult<LevelFilter> {
    use Level::*;
    for l in &[Trace, Debug, Info, Warn, Error] {
        if is_enabled_for(logger, *l)? {
            return Ok(l.to_level_filter());
        }
    }

    Ok(LevelFilter::Off)
}

/// Installs a default instance of the logger.
///
/// In case a logger is already installed, an error is returned. On success, a handle to reset the
/// internal caches is returned.
///
/// The default logger has a filter set to [`Debug`][LevelFilter::Debug] and caching set to
/// [`LoggersAndLevels`][Caching::LoggersAndLevels].
pub fn try_init() -> Result<ResetHandle, SetLoggerError> {
    Logger::default().install()
}

/// Similar to [`try_init`], but panics if there's a previous logger already installed.
pub fn init() -> ResetHandle {
    try_init().unwrap()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn default_filter() {
        let logger = Logger::default();
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Debug);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Debug);
    }

    #[test]
    fn set_filter() {
        let logger = Logger::default().filter(LevelFilter::Info);
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Info);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Info);
    }

    #[test]
    fn filter_specific() {
        let logger = Logger::default()
            .filter(LevelFilter::Warn)
            .filter_target("hello_world".to_owned(), LevelFilter::Debug)
            .filter_target("hello_world::sub".to_owned(), LevelFilter::Trace);
        assert_eq!(logger.filter_for("hello_world"), LevelFilter::Debug);
        assert_eq!(logger.filter_for("hello_world::sub"), LevelFilter::Trace);
        assert_eq!(
            logger.filter_for("hello_world::sub::multi::level"),
            LevelFilter::Trace
        );
        assert_eq!(
            logger.filter_for("hello_world::another"),
            LevelFilter::Debug
        );
        assert_eq!(
            logger.filter_for("hello_world::another::level"),
            LevelFilter::Debug
        );
        assert_eq!(logger.filter_for("other"), LevelFilter::Warn);
    }
717}