"""F1 evaluation metrics (ported from LoCoMo)."""

import re
import string
from collections import Counter

from nltk.stem import PorterStemmer  # import path assumed (NLTK)

# Shared stemmer instance used by f1_score.
ps = PorterStemmer()


def normalize_answer(s: str) -> str:
    """Normalize answer for comparison."""
    # Drop commas first so that numbers such as "1,000" stay a single token.
    s = str(s).replace(",", "")
    # Replace articles and "and" with a space, ignoring case.
    s = re.sub(r"\b(a|an|the|and)\b", " ", s, flags=re.IGNORECASE)
    # Strip all remaining punctuation characters.
    s = "".join(ch for ch in s if ch not in string.punctuation)
    # Lowercase and collapse runs of whitespace.
    s = " ".join(s.lower().split())
    return s


def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 with Porter stemming."""
    pred_tokens = [ps.stem(w) for w in normalize_answer(prediction).split()]
    gt_tokens = [ps.stem(w) for w in normalize_answer(ground_truth).split()]
    if not pred_tokens or not gt_tokens:
        return 0.0
    # Multiset intersection: each shared stem counts min(pred count, gt count) times.
    common = Counter(pred_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)