tacet 0.4.2

Detect timing side channels in cryptographic code
Documentation
Calibration test visualization for timing-oracle.

Generates publication-quality plots from calibration test CSV data.

Usage:
    uv run plot_calibration.py <data_dir> [--output <output_dir>]

Example:
    CALIBRATION_DATA_DIR=./calibration_data cargo test --release --test calibration_power
    uv run scripts/plot_calibration.py ./calibration_data --output ./plots

Imports: argparse, pathlib.Path, numpy (np), pandas (pd), plotnine

COLORS
primary #2563eb, secondary #0d9488, accent #f97316, error #ef4444,
text #374151, muted #9ca3af, light #e5e7eb, background #ffffff

theme_timing_oracle()
Clean, minimal theme for timing-oracle plots.

wilson_ci(successes: int, trials: int, confidence: float = 0.95) -> tuple[float, float]
Wilson score confidence interval for binomial proportion.

infer_effect_from_test_name(test_name: str) -> float
Infer injected effect size from test name patterns.

plot_power_curve(df: pd.DataFrame, output_path: Path)
Plot power curve: detection rate vs effect size.

X-axis: Injected effect size (ns)
Y-axis: Detection rate (power) %
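
The detection rate on the y-axis and its 95% Wilson interval can be sketched in plain Python as below. The column names `decision` and `injected_effect_ns`, and the convention that a `fail` decision counts as a detection, come from the script's own strings. The helper names `wilson_interval` and `power_by_effect`, the list-of-dicts input, and the fixed z = 1.96 are assumptions made for illustration, not the script's actual code (which works on a pandas DataFrame).

```python
import math
from collections import defaultdict


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data: the interval is uninformative
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    margin = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    return (max(0.0, center - margin), min(1.0, center + margin))


def power_by_effect(rows):
    """Group rows by injected effect; return (effect, rate, ci_low, ci_high) tuples."""
    counts = defaultdict(lambda: [0, 0])  # effect -> [detected, total]
    for row in rows:
        bucket = counts[row["injected_effect_ns"]]
        bucket[0] += row["decision"] == "fail"  # a "fail" decision is a detection
        bucket[1] += 1
    result = []
    for effect in sorted(counts):
        detected, total = counts[effect]
        low, high = wilson_interval(detected, total)
        result.append((effect, detected / total, low, high))
    return result
```

Each returned tuple corresponds to one point on the curve, with `ci_low` and `ci_high` bounding the shaded ribbon.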

Plot title: Power Curve: Detection Rate vs Effect Size
Plot subtitle: Shaded region shows 95% Wilson confidence interval
Output file: power_curve.png

plot_fpr_calibration(df: pd.DataFrame, output_path: Path)
Plot FPR calibration: observed FPR with confidence intervals.

X-axis: Test configuration
Y-axis: False positive rate %

Plot title: FPR Calibration: False Positive Rate Under Null
Plot subtitle: Error bars show 95% Wilson confidence interval
Output file: fpr_calibration.png

plot_coverage_calibration(df: pd.DataFrame, output_path: Path)
Plot coverage calibration: CI coverage rate by effect size.

X-axis: Injected effect size
Y-axis: Coverage rate %
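
A minimal sketch of the coverage rate per effect size follows. The column names `ci_low_ns`, `ci_high_ns`, and `injected_effect_ns` appear in the script; the function name `coverage_by_effect` and the list-of-dicts input are hypothetical, chosen so the example runs without pandas.

```python
from collections import defaultdict


def coverage_by_effect(rows):
    """Return {effect: fraction of rows whose CI contains the injected effect}."""
    counts = defaultdict(lambda: [0, 0])  # effect -> [covered, total]
    for row in rows:
        low, high = row.get("ci_low_ns"), row.get("ci_high_ns")
        if low is None or high is None:
            continue  # rows without a reported CI are skipped
        true_effect = row["injected_effect_ns"]
        counts[true_effect][0] += low <= true_effect <= high
        counts[true_effect][1] += 1
    return {effect: covered / total for effect, (covered, total) in sorted(counts.items())}
```

A well-calibrated 95% interval should give rates near 0.95 for every effect size.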

Plot title: Coverage Calibration: 95% CI Contains True Value
Plot subtitle: Nominal coverage = 95%, minimum acceptable = 85%
Output file: coverage_calibration.png

plot_effect_estimation(df: pd.DataFrame, output_path: Path)
Plot effect estimation accuracy: estimated vs true effect.

X-axis: True injected effect (ns)
Y-axis: Estimated effect (ns)

Plot title: Effect Estimation Accuracy
Plot subtitle: Dashed line shows perfect estimation (y = x)
Output file: effect_estimation.png

plot_bayesian_calibration(df: pd.DataFrame, output_path: Path)
Plot Bayesian calibration curve: stated probability vs empirical frequency.

X-axis: Stated P(leak) binned into deciles
Y-axis: Empirical frequency of true positives in that bin
Diagonal line: Perfect calibration (y = x)
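
The binning behind this curve can be sketched as below. The column names `leak_probability` and `injected_effect_ns`, and the rule that a trial is a true positive when the injected effect is positive, are taken from the script. The function name `calibration_curve` and the left-closed bin edges are assumptions for this pure-Python illustration; the script itself bins with pandas, whose edge handling may differ.

```python
def calibration_curve(rows, n_bins: int = 10):
    """Return (bin_midpoint, empirical_rate, count) for each non-empty probability bin."""
    bins = [[0, 0] for _ in range(n_bins)]  # per bin: [true positives, total]
    for row in rows:
        p = row["leak_probability"]
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index][0] += row["injected_effect_ns"] > 0
        bins[index][1] += 1
    curve = []
    for i, (positives, total) in enumerate(bins):
        if total:
            curve.append(((i + 0.5) / n_bins, positives / total, total))
    return curve
```

Points on the diagonal, where the empirical rate equals the bin midpoint, indicate that the stated probabilities match observed frequencies.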

Plot title: Bayesian Calibration Curve
Plot subtitle: Stated P(leak) vs empirical true positive rate (dashed = perfect calibration)
Output file: bayesian_calibration.png

plot_power_curves_faceted(df: pd.DataFrame, output_path: Path)
Plot power curves faceted by attacker model.

One panel per attacker model showing power vs effect size.

Attacker models (inferred from test names): AdjacentNetwork, RemoteNetwork, Research, SharedHardware, PostQuantumSentinel, Unknown
Plot title: Power Curves by Attacker Model
Plot subtitle: Detection rate vs effect size (horizontal lines: 70%, 90% targets)
Output file: power_curves_faceted.png

plot_estimation_bias(df: pd.DataFrame, output_path: Path)
Plot estimation bias: bias and RMSE by effect size.

X-axis: True effect size
Y-axis: Bias (estimated - true) as percentage
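
A minimal sketch of the two quantities named above, assuming `shift_ns` holds the estimated effect and `injected_effect_ns` the true one (both column names appear in the script). The function name `bias_and_rmse` and the list-of-dicts input are hypothetical and stand in for the script's pandas grouping.

```python
import math
from collections import defaultdict


def bias_and_rmse(rows):
    """Return {effect: (mean_bias_pct, rmse_ns)} from estimated and injected effects."""
    errors = defaultdict(list)  # effect -> list of (estimate - true) in ns
    for row in rows:
        true_effect = row["injected_effect_ns"]
        if true_effect > 0 and row.get("shift_ns") is not None:
            errors[true_effect].append(row["shift_ns"] - true_effect)
    summary = {}
    for effect, errs in sorted(errors.items()):
        mean_bias_pct = sum(errs) / len(errs) / effect * 100  # signed, relative to truth
        rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
        summary[effect] = (mean_bias_pct, rmse)
    return summary
```

Bias is signed, so over- and under-estimates can cancel; RMSE does not cancel and therefore also reflects the spread of the estimates.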

Plot title: Estimation Bias by Effect Size
Plot subtitle: Dashed lines show ±20% bias threshold
Output file: estimation_bias.png

plot_summary_dashboard(df: pd.DataFrame, output_path: Path)
Create a comprehensive summary showing key metrics.
r�unmeasurabler|r}r�r�rr�)rrrrr~rr�r�r�rr4r�rrr)�tpr�rgr��	empirical�	deviation)rrz�
================================================================================
                    CALIBRATION SUMMARY REPORT
================================================================================

OVERVIEW
--------
Total Trials:     z>8z
Completed:        � (rMz.1fz%)
Unmeasurable:     zJ%)

FALSE POSITIVE RATE (FPR)
-------------------------
FPR:              z>7.1fz%  [95% CI: z% - z%]
Trials:           uL
Target:           ≤ 5% (α = 0.05), acceptable ≤ 10%
Status:           r�u✓ PASSu✗ FAILz%

STATISTICAL POWER
-----------------r�z70%i�z90%z95%r�u✓u✗�
z>6.0fzns effect: z>5.1fz%] (n=z) z,

CI COVERAGE
-----------
Coverage:         u;
Target:           ≥ 85% (nominal 95%)
Status:           r�zD

BAYESIAN CALIBRATION
--------------------
Mean Calibration Error: z%
Max Deviation:          uL%
Target:                 ≤ 15% mean, ≤ 25% max
Status:                 rrzS

================================================================================
zsummary.txt�wNr�)rHrr�rOrhr�r�rG�appendr�r�r*r�r��indexr[r^�sorted�open�writer�)#ryrz�total_trials�	completedrfr�r��coverage_df�bayesian_df�fpr_rate�
fpr_trials�fpr_low�fpr_high�power_stats�effectr�r�r��rate�low�highr��coverage_total�
coverage_rate�coverage_low�
coverage_highr+�bin_agg�mean_calibration_error�max_calibration_error�summary_textr`�target�status�fs#                                   rP�plot_summary_dashboardr��sZ��
�r�7�L��B�*�~��7�8�9�I��r�Z�.�N�:�;�<�L��[�/�W�,�-�H�
�;��5�(�
)�F���_�
�2�3�K��*�+�1�1�3�4�K��<�<��:�&�&�0�6�6�8����[�
�%�c�6�*�+=��+G�*L�*L�*N�&O�Q[�\����2<�/��g��K��>�>��3�4�;�;�=�F���z�!�+?�"@�F�"J�K��"�:�.�&�8�=�=�?���F���+0�1�9�x�%�'�!��%�h��6�	��T��"�"�F�D�#�t�U�#C�D�>����!�(�(��l�0K�(�L��� � ��[�)�[�9M�-N�N��1�2�k�,�6O�O�Q��c�e�
�!��-�N�8F��8J�G�n�4�PQ�M�*3�G�^�*L�'�L�-�IS�F�M�>�<��EO�B�
�~�|�]�����#�#�%��'/�0D�'E��'I��#�$�!)�);� <�r� A�I�I�#�N�QS�S�VZ�Z�����"�"�:�.�2�2�*�/�3�
�� '�t�}�w�w�/?�?���� '�
�
���0D� D�I�I�K����!(��!5�!:�!:�!<�� '�� 4� 8� 8� :��8<�5�� 5�� ��#�$��R�.��9�#9�#�#=�c�"B�C���#�2�l�&?��&C�C�%H�I��C�<��&�l�7�3�;�s�2C�4��QT��UX�GY�Z��b�/�"�!)�T�!1�:�z�B�C�!�L�*'-�[�&9�"���c�4�� �C�-��v��}�e�%���$�,��E�������k�$�s�(�5�)��c�#�g�c�]�$�t�C�x�PS�n�TZ�[\�Z]�]_�`f�_g�k�	k��':���!��$�U�+�<��S�8H��7M�T�R_�`c�Rc�dg�Qh�i�!�"�%�&�!.�$�!6�:�J�G�H�0��3�E�:�;�.�s�2�5�9�:�'=��'E�J_�cg�Jg��mw�x�y���L�(
�k�M�)�3�	/�1�	�����
0�	�I�k�M�1�2�
3�4�
�,��
0�	/�s�*O!�!
O/�data_dirc	���[URS55nU(d[SU35e/nUHPn[R"U5nURU5 [
SURS[U5S35 MR U(d[S	5e[R"US
S9$![a&n[
SURSU35 SnAM�SnAff=f)z'Load all CSV files from data directory.z*.csvzNo CSV files found in z
  Loaded: rjz	 records)z  Warning: Failed to load z: Nz"No valid CSV files could be loadedT)�ignore_index)�list�glob�
ValueErrorr(�read_csvrmr�r.rH�	Exception�concat)r��	csv_files�dfsr�ry�es      rP�	load_datar�s����X�]�]�7�+�,�I���1�(��<�=�=�
�C�
��	>����Q��B��J�J�r�N��J�q�v�v�h�b��R��	��;�<�	���=�>�>�
�9�9�S�t�,�,��
�	>��.�q�v�v�h�b���<�=�=��	>�s�AB/�/
C�9C�Cc�|�[R"SS9nURS[SS9 URSS[[S5S	S
9 UR	5nUR
R
5(d[SUR
35 gURRS
S
S9 [SUR
S35 [UR
5n[S[U535 [SURS35 [X!R5 [X!R5 [X!R5 [X!R5 [!X!R5 [#X!R5 [%X!R5 ['X!R5 [S5 g)Nz,Generate calibration plots for timing-oracle)�descriptionr�z#Directory containing CSV data files)�type�helpz--outputz-oz./plotszOutput directory for plots)r��defaultr�z!Error: Data directory not found: rpT)�parents�exist_okz
Loading data from z...zTotal records: z
Generating plots to z
Done!r)�argparse�ArgumentParser�add_argumentr�
parse_argsr��existsr��output�mkdirr�rHr�rJr�r,r�rrdr�)�parser�argsrys   rP�mainr�&sT��
�
$�
$�1_�
`�F�
���
��4Y��Z�
���
�D�t�T�)�_�So��p�����D��=�=���!�!�
�1�$�-�-��A�B��	�K�K���d�T��2�	� �����s�
3�4�	�4�=�=�	!�B�	�O�C��G�9�
%�&�	�"�4�;�;�-�s�
3�4��R���%��b�+�+�.���[�[�)��b�+�+�.��b�+�+�.��2�{�{�+���[�[�)��2�{�{�+�	�)��rO�__main__)rY)<�__doc__r��pathlibr�numpyr\�pandasr(�plotninerrrrrr	r
rrr
rrrrrrrrrrrrrrrrrrr r!rMrQr�rs�tuplerhr�rx�	DataFramer�r�r�rr,rJrdr�r�r��__name__�exitrNrOrP�<module>r�s����������������L���
��
�
��	
���D���c��u���e�UZ�l�H[��D�3��5��<B9����B9�D�B9�J6=�R�\�\�6=��6=�r@B�"�,�,�@B�T�@B�F6?�r�|�|�6?�$�6?�rIB�"�,�,�IB�T�IB�XTB�"�,�,�TB�T�TB�n?=�R�\�\�?=��?=�Dy�r�|�|�y�$�y�@-��-����-�,
�D�z�����L�rO