1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192
//! # stpref //! //! stpref (**s**ingle-**t**hreaded **prof**iler) is a very simple //! profiling utility for single-threaded applications, inspired by //! [`hprof`](https://crates.io/crates/hprof). Mostly intended for //! games. Before using this crate, be sure to check the //! [Notes](#notes) section. //! //! # Usage //! //! Using the crate consists of two parts: measuring and //! analyzing. //! //! The measurement part consists of calling the `perf_measure!` macro //! at the start of every code block we want to measure, and giving it //! an identifiable name. //! //! ```no_run //! # #[macro_use] extern crate stperf; fn main() { //! let process = |n| std::thread::sleep(std::time::Duration::from_millis(n)); //! //! perf_measure!("top level processing"); //! { //! perf_measure!("light initial processing"); //! process(100); //! } //! for _ in 0..5 { //! perf_measure!("heavy processing"); //! process(100); // "heavy processing" //! } //! # } //! ``` //! //! The analysis part starts with printing out the information in any //! of the following ways. //! //! ``` //! # #[macro_use] extern crate stperf; fn main() { //! // Simply print out the data with some sensible defaults for configuration. //! stperf::print(); //! //! // Print out the data, but configure the output. In this case, we //! // use a different format, and specify that we want to see the //! // timings with 3 decimals. //! stperf::print_with_format(stperf::format::COMPATIBLE, 3); //! //! // Just get the formatted string like with print_with_format, //! // except it's a String so you can print it out somewhere else than //! // stdout(). (A GUI, for example.) //! let s = stperf::get_formatted_string(stperf::format::STREAMLINED, 2); //! # } //! ``` //! //! And then you get to ponder the deeper meaning of a graph like this: //! //! ```text //! ╶──┬╼ top level processing - 100.0%, 600 ms/loop, 1 samples //! ├───╼ light initial processing - 16.7%, 100 ms/loop, 1 samples //! └───╼ heavy processing - 83.3%, 500 ms/loop, 5 samples //! ``` //! //! ## How to read the graph //! //! The graph will show scopes inside scopes in a tree-like //! structure, with each indentation implying a deeper scope, and the //! branches illustrating who is whose child and sibling. //! //! * The percents represent the fraction of the time the process took //! from its parent's processing time. //! //! * The ms/loop represents the total time the process took to finish //! inside the shallowest scope ("root scope"); the graph above shows //! 500ms for "heavy processing" even though a single process took //! 100ms, since it was ran 5 times inside "top level processing" (the //! shallowest scope, or "root scope"). //! //! * The samples show how many perf_measure!'s were ran for this row //! of data. //! //! # Overhead //! //! The crate has a bit of performance overhead. Here's a few specific //! causes, with performance measured on the machine this crate was //! developed on (i7-6700k @ 4.1GHz): //! //! * The print/string formatting functions are quite heavy, as they go //! through all the measurement data. One call measures at about 3.0ms //! for 100k samples. //! //! * The [`perf_measure!`](macro.perf_measure.html) macro is pretty //! light, but when used in large quantities, may be noticeable. One //! call measures at about 200ns with a --release flag, 1µs //! without. However, stprof can track its own overhead to a degree, //! so the reported overhead is only about 50ns with --release, and //! 360ns without. //! //! All this said, it's important to note: the most useful information //! this profiler gives you is the percents, not the absolute timing //! value. //! //! # Notes //! //! ## Performance note //! The crate accumulates a pretty large amount of data in a small //! amount of time, if you're using it in a realtime application. A //! recommended way of displaying and handling the measurement data is //! as follows: //! 1. Set an update interval, eg. 1 second. //! 2. Every interval, print out the data (eg. `stperf::print()`), and //! cleanup (`stperf::reset()`). //! This way, you'll always have quite a few samples (1 second is a //! long amount of time to gather data), and they'll be fresh. And //! you'll avoid filling up your ram. //! //! ## `disabled`-feature //! Be sure to enable the `disabled` feature for your release builds, //! as this will practically make this crate disappear in place, even //! as your code stays the same. //! //! # Examples //! ``` //! # #[macro_use] extern crate stperf; fn main() { //! use std::thread; //! use std::time::Duration; //! //! let process = || { //! perf_measure!("processing"); //! thread::sleep(Duration::from_millis(100)); //! }; //! //! for _ in 0..2 { //! perf_measure!("main"); //! for _ in 0..2 { //! perf_measure!("inner operations"); //! process(); //! } //! process(); //! } //! //! stperf::print(); //! // Prints out: //! // ╶──┬╼ main - 100.0%, 300 ms/loop, 2 samples //! // ├──┬╼ inner operations - 66.7%, 200 ms/loop, 4 samples //! // │ └───╼ processing - 100.0%, 200 ms/loop, 4 samples //! // └───╼ processing - 33.3%, 100 ms/loop, 2 samples //! # } //! ``` #![deny(missing_docs)] #[cfg(not(feature = "disabled"))] #[macro_use] extern crate lazy_static; pub mod format; #[allow(dead_code, unused_variables)] mod measurement_tracker; pub use measurement_tracker::MeasurementTracker; #[cfg(not(feature = "disabled"))] mod measurement; #[cfg(not(feature = "disabled"))] pub use measurement::{measure, reset}; #[cfg(not(feature = "disabled"))] mod formatter; #[cfg(not(feature = "disabled"))] pub use formatter::{get_formatted_string, print, print_with_format}; #[cfg(feature = "disabled")] #[allow(dead_code, unused_variables)] mod disabled; #[cfg(feature = "disabled")] pub use disabled::*; /// Logs the time between this call and the end of the current scope. #[cfg(not(feature = "disabled"))] #[macro_export] macro_rules! perf_measure { ($s: expr) => { use std::time::Instant; use stperf::measure; #[allow(unused_variables)] let measurement = measure(Instant::now(), $s); }; } /// Logs the time between this call and the end of the current scope. #[cfg(feature = "disabled")] #[macro_export] macro_rules! perf_measure { ($s: expr) => {}; }