rsclaw 2026.5.1

AI Agent Engine Compatible with OpenClaw
Documentation
+
L��i*����Rt^RIt^RIt^RIt^RIt^RItRtRtRt	RRRRRR	R
RRRR
RRR//RR.///RRRRRR	RRRRR
RRR/RRR//RRR.///RRRRRR	RRRRR
RRR//RR.///RRRRRR	RRRRR
RRR/RRR//RR.///RRRRRR	RRRRR
RRR//RR.///.t
RRRR/RRRR /RRRR!/.tRR"R#R$R%R&/RR'R#R(R%R)/RR*R#]R%R+/RR,R#]R%R+/RR-R#R.R%R//RR0R#]R%R/RR1R#]R%R/.tR8R2R3llt
R4R5ltR6t]R78Xd
]!4R#R#)9u 
Test: prefix_kv_cache=1 — verify KV cache hit with stable system+tools prefix.

Sends 3 sequential requests with identical system+tools, appending one message
each turn. Measures TTFT and checks cached_tokens to verify prefix cache hit.

Usage:
    python tests/test_prefix_kvcache.py
Nzhttp://macstudio.localz#https://api.gaterouter.ai/openai/v1z�You are a helpful AI assistant. You help users with coding, data analysis, and general questions.
Always respond concisely. Use Chinese when the user speaks Chinese.
Platform: macOS. Shell: bash/zsh.�type�function�name�execute_command�descriptionzRun a shell command�
parameters�object�
properties�command�string�required�
write_filezWrite content to a file�path�content�	read_filezRead a file�memoryz Search or store long-term memory�action�query�
web_searchzSearch the web�role�useru'你好,请简单介绍一下自己。uG帮我写一个 Python 函数,计算斐波那契数列的第 N 项。uM现在修改这个函数,加入缓存优化,使用 lru_cache 装饰器。zdeepseek/deepseek-chat�basezhttps://api.deepseek.com/v1�key_env�DEEPSEEK_API_KEYz!doubao/doubao-seed-2-0-pro-260215z(https://ark.cn-beijing.volces.com/api/v3�ARK_API_KEYzgoogle/gemini-2.5-flash�GATE_ROUTER_KEYzanthropic/claude-sonnet-4.6zllama/qwen3.5-fastzhttp://218.22.75.183:8000/v1�LLAMA_REMOTE_KEYzollama/qwen3.5:9bzollama/gemma4:26bc�x�V^8�dQhR\R\R\R\R\R\R\R\/#)	�r�key�model_id�system�messages�tools�	is_ollama�return)�str�list�bool�dict)�formats"�;/Users/oopos/dev/github-rsclaw/tests/test_prefix_kvcache.py�__annotate__r,7sQ��m�m��m�3�m�#�m�s�m�d�m��m�+/�m�<@�m�c��V'df\P!RTRRRRV/.V,RVUu.uFpRRRVR,/NK	upR	R
RRR
RRR^�//4P4pVR2p	RR/p
MaRVRRRRV/.V,RVR	R
R^�RR/pRV9gRV9dRR/VR&\P!V4P4pVR2p	RRV2RR/p
\PPW�V
R7p\P!4p
Rp.pRpRp/p\PPV^<R7;_uu_4pVEF8pVPRR 7P4pV'd�V'gK6\P!V4pTPR!/4pTPRR"4pT'd!Tf\P!4T
,
pT'dTPT4TPR#4'dNTPR$4pR%T;'g^R&TPR'^4R(TPR)^4R*,/pEM'EKVPR+4'gEK3VR,,P4pVR-8XdM�\P!V4pTPR..4pT'dmT^,PR//4pTPRR"4pT'd!Tf\P!4T
,
pT'dTPT4TPR04pT'gEKTpTPR14;'g/PR24pEK;	RRR4\P!4T
,
pR4TR5TRR"P%T4R2TR%TPR%4R(TPR(4R0TR3R/#uupi \PdEK�i;i \PdEK�i;i +'giL�;i \ dpR3\#T4/uRp?#Rp?ii;i)6z7Send a streaming request and measure TTFT + total time.�modelr"rr!rr#rr�streamT�thinkF�options�temperatureg333333�?�num_predictz	/api/chatzContent-Typezapplication/json�
max_tokens�doubao�ark�disabled�thinkingz/chat/completions�
AuthorizationzBearer )�data�headersN��timeout�replace)�errors�message��done�prompt_eval_count�
prompt_tokens�completion_tokens�
eval_count�prompt_eval_ms�prompt_eval_durationi@Bzdata: :�NNz[DONE]�choices�delta�usage�prompt_tokens_details�
cached_tokens�error�ttft�total)�json�dumps�encode�urllib�request�Request�time�perf_counter�urlopen�decode�strip�loads�JSONDecodeError�get�append�
startswith�	Exceptionr&�join)rrr r!r"r#r$�t�body�urlr<�payload�req�t0rQ�
content_partsrOrE�
usage_data�resp�raw_line�line�chunk�msg�cr;rKrL�u�erRs&&&&&&&                        r+�call_streamingru7s*����z�z��X��&�(�I�v�>�?�(�J��u�U�u�!�v�z�:�q��}�E�u�U��d��U��
�s�M�3�?�

���6�8�	
���i� ��!�#5�6��
�X��&�(�I�v�>�?�(�J��U��d��#��3�

���x��5�D�=�#)�:�"6�G�J���z�z�'�"�)�)�+����'�(���w�s�e�_��.�
��
�.�.�
 �
 ���
 �
A�C�	
�	�	�	�B��D��M��M��M��J�6!�
�^�^�
#�
#�C��
#�
4�
4�� �����i��8�>�>�@���� �!� $�
�
�4� 0�� �)�)�I�r�2�C����	�2�.�A��T�\�#�0�0�2�R�7���%�,�,�Q�/��y�y��(�(�(-�	�	�2E�(F�
�+�]�-?�-?�a�/����<��1K�,�e�i�i�8N�PQ�.R�U^�.^�&�
�
�)� �?�?�8�4�4� ���8�>�>�+�D��x�'��!� $�
�
�4� 0��$�i�i�	�2�6�G�� '��
���w�� ;��!�I�I�i��4�����#'�#4�#4�#6��#;�D��)�0�0��3��	�	�'�*�A��q�%&�
�)*���/F�)G�)M�)M�2�(R�(R�Sb�(c�
�c!�5�n
����"�$�E������2�7�7�=�)��������8��*�.�.�)9�:�����	�	��wV��Z �/�/�!� �!��2 �/�/�!� �!��G5�
4��h�!���Q�� � ��!�s��O�)P.�96P�0P�3O!�	+P�5'P�'P�P� AP�,#P�O>�&P�2P�3'P�'P�P�"P�9P.�!O;	�6P�:O;	�;P�>P	�P�P	�P�P+	�&P.�+P.�.Q�9Q�Q�Qc�$�V^8�dQhR\/#)r�cfg)r))r*s"r+r,r,�s��7�7�D�7r-c�6�VR,pVR,pVPR4pRV9dVPR^4^,MTp\PP	VR,;'gRR4;'gRp\RV24\RR024.p.p\
\4EF�wr�VPV	4\W%V\V\V4p
V
P	R	4'd;\R
V^,RV
R	,R,24VPV
4EMV
R
,'dV
R
,R,M^pV
P	R4pV
P	R4p
V
P	R4pVeRV2MRpV
'dRV
2MRpV'dRVR
R2MRpRP\RVVV.44p\R
V^,RVR
RV
R,R
RV24VPV
4V
R,'dV
R,R,MR pVPR!R"RV/4EK�	\V4^8�Ed?\;QJdR#V4F'dKR$M	R%M
!R#V44'EdVU
u.uF+q�P	R
4'gKV
R
,R,NK-	pp
\V4^8�d�V^,^8�d&^VR1,V^,,,
^d,M^pV^
8�dR&M
VR28�dR'MR(p\R)V^,R
R*VR1,R
R+VR,VR-
R.2	4VU
u.uF(q�P	R4fKV
P	R4NK*	pp
V'd\R/V24V#uup
iuup
i)3z+Run 3-turn prefix cache test for one model.rrzollama/�/rrBz
  z  rPz	    Turn z: ERROR :N�PNrQi�rOrErHNzcached=zprompt=zeval=z.0f�msz: TTFT=z>7.0fz
ms  total=rRz.1fzs  r:N��N�OKr�	assistantc3�L"�TFqPR4'*x�K	R#5i)rPN)r`)�.0�rs& r+�	<genexpr>�test_model.<locals>.<genexpr>�s��� E�W��U�U�7�^�!3�!3�W�s�"$FT�FASTER�STABLE�SLOWERz    => TTFT trend: zms -> zms (z, z+.0fz%)z    => Cached tokens: z2--------------------------------------------------�����i����)rb�split�os�environr`�print�	enumerate�TURNSraru�
SYSTEM_PROMPT�TOOLSrd�filter�len�all)rwrrr$r rr"�results�turn_idx�turn_msgr��ttft_ms�cached�prompt�eval_ms�
cached_str�
prompt_str�eval_str�extra�
reply_text�ttfts�improvement�trend�cached_valss&                       r+�
test_modelr��s.���v�;�D��v�;�D����	�*�I�(+�t��t�z�z�#�q�!�!�$��H�
�*�*�.�.��Y��-�-�2�r�
2�
8�
8�b�C�	�D���-��	�B�v�h�-���H��G�'��.�������!��4�h�
�x��PY�Z���5�5��>�>��I�h�q�j�\��!�G�*�S�/�1B�C�D��N�N�1���&'��i�i�!�F�)�d�"�Q������'������'���%�%�(�)��+1�+=�w�v�h�'�2�
�+1�w�v�h�'�r�
�.5�U�7�3�-�r�*�2���	�	�&��
�J��'I�J�K��
�	�(�1�*��W�W�U�O�:�a��j�QT�EU�UX�Y^�X_�`�a����q��,-�Y�<�<�Q�y�\�$�'�T�
������i��D�E�5/�:�7�|�q��S�S� E�W� E�S�S�S� E�W� E�E�E�+2�D�7�a�e�e�F�m�!��6��T�!�!�7��D��u�:��?�>C�A�h��l�1�u�R�y�5��8�3�3�s�:�PQ�K� +�b� 0�H�+�PS�BS�h�Ya�E��'��a���~�V�E�"�I�c�?�$�u�g�UW�Xc�dh�Wi�ik�l�m�<C�i�7�a�e�e�O�F\�1�1�5�5��1�7�K�i���.�{�m�<�=��N��E��js�N�-N�	N�"Nc�"�\R4\R4\R4\R\\4R24\R\\4R24\R\\4R24\R4\R	4\
F�pVR
,pV'dA\PPV4'g\RVR,R
VR24KTRVR,9d*\PP\R2^R7\V4K�	\RR24\R4\R4\R4\R4R# \d\RTR,R24K�i;i)�=z  Prefix KV Cache Test (mode=1)z  System prompt: z chars (stable)z	  Tools: z	 (stable)z	  Turns: z (append-only)z<  Expected: TTFT should decrease or stay stable across turnsz>            (prefix cache hit = no re-prefill of system+tools)rz
  SKIP rz (z	 not set)�ollamaz	/api/tagsr=z (ollama not reachable)�
zD  Note: TTFT improvement depends on provider's cache implementation.z@  Cloud APIs: automatic prefix caching (OpenAI/DeepSeek/Gemini).z.  Local: depends on llama.cpp/vLLM slot reuse.Nz<============================================================)r�r�r�r�r��MODELSr�r�r`rVrWr[�
OLLAMA_URLrcr�)rwrs  r+�mainr��sJ��	�(�O�	�
+�,�	�(�O�	��c�-�0�1��
A�B�	�I�c�%�j�\��
+�,�	�I�c�%�j�\��
0�1�	�H�J�	�J�L����i�.���2�:�:�>�>�'�2�2��I�c�&�k�]�"�W�I�Y�?�@���s�6�{�"�
����&�&�*��Y�'?��&�K�	�3���
�B�v�h�-��	�
P�Q�	�
L�M�	�
:�;�	�V�H����
��	�#�f�+��.E�F�G��
�s�5(E)�)!F�
F�__main__)F)�__doc__rSrYr��sys�urllib.requestrVr��GATE_ROUTER_URLr�r�r�r�rur�r��__name__�r-r+�<module>r�st���+�*�*�
%�
�7��%�
��Z��f�.?��Pe�gs�v|�G�IU�Xa�dj�lt�cu�Wv�xB�EN�DO�vP�&Q�R��Z��f�l�M�Kd�fr�u{�~F�HT�W]�`f�hp�_q�s|�E�GO�~P�VQ�S]�`f�hq�_r�us�&t�u��Z��f�k�=�-�Ye�hn�px�{G�JP�SY�[c�Rd�Ie�gq�tz�s{�h|�&}�~��Z��f�h�
�Gi�kw�{A�CK�MY�\d�gm�ow�fx�zA�DJ�LT�CU�[V�Xb�em�dn�zo�&p�q��Z��f�l�M�K[�]i�lr�t|�K�NU�X^�`h�Wi�Mj�lv�y@�xA�lB�&C�D�	���V�Y� I�J��V�Y� i�j��V�Y� o�p�	���
%�v�/L�i�Yk�l��
0�&�:d�fo�q~���
&����L]�^��
*�F�O�Y�Pa�b��
!�6�+I�9�Vh�i��
 �&�*�i��F��
 �&�*�i��F�
��m�`7�t�<�z���F�r-