Teacher, how should I handle the garbled pred output in metric.py? #4201

demouo · 2024-06-11T08:16:10Z

Reminder

I have read the README and searched the existing issues.

System Info

Running SFT with LoRA as usual.

Reproduction

Running SFT with LoRA as usual.

Expected behavior

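The model being fine-tuned is Qwen2-7B; the same problem also occurs with ChatGLM3-6B.
I wanted to see some custom metrics during evaluation, so I modified metric.py as follows.
In the ComputeMetrics class in metric.py, I added the following to __call__: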

print("preds shape", preds.shape)
print("labels shape", labels.shape)

Output:

preds shape (10, 440, 65024)
labels shape (10, 440)

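The dimensions do not match, so I took the index of the largest value along the last dimension of preds (that is, the most probable token at each position):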

# Convert the probability distribution to class indices
if preds.ndim == 3:
    preds = np.argmax(preds, axis=-1)
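To make the shape change concrete, here is a minimal sketch with made-up numbers (the toy array and its values are assumptions, not data from this run). It shows how `np.argmax` over the last axis turns a `(batch, seq_len, vocab)` array into a `(batch, seq_len)` array of token ids:

```python
import numpy as np

# Toy scores with shape (batch=2, seq_len=3, vocab=4); values are invented.
logits = np.array([
    [[0.1, 2.0, 0.3, 0.0], [1.5, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 3.0]],
    [[0.0, 0.0, 1.0, 0.5], [0.9, 0.1, 0.0, 0.0], [0.2, 0.7, 0.1, 0.0]],
])

# Pick the highest-scoring vocabulary index at every position.
token_ids = np.argmax(logits, axis=-1)

print(logits.shape)     # (2, 3, 4)
print(token_ids.shape)  # (2, 3)
print(token_ids)
```

After this step the array has the same shape as `labels`, but each id is only the single most likely token at that position.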

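After this change the code runs and produces output, but the decoded preds are garbled while the labels decode correctly, which suggests the tokenizer is not the problem.
The test code: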

for pred, label in zip(decoded_preds, decoded_labels):
    # "没jieba之前" means "before jieba", i.e. the raw decoded text
    print("没jieba之前pred,", pred)
    print("没jieba之前label,", label)

    hypothesis = list(jieba.cut(pred))
    reference = list(jieba.cut(label))

    if len(" ".join(hypothesis).split()) == 0 or len(" ".join(reference).split()) == 0:
        result = {"rouge-1": {"f": 0.0}, "rouge-2": {"f": 0.0}, "rouge-l": {"f": 0.0}}
    else:
        rouge = Rouge()
        scores = rouge.get_scores(" ".join(hypothesis), " ".join(reference))
        result = scores[0]

    # ...

Output:

没jieba之前pred, 0: are a helpful assistant.
n:I are given helpfulless. and I are talking a conversation with a user. The seeker isates the conversation by and they are to respond with the seeker based The seeker should should as ['m tostrategy} to to { following. and the response is {reply}.Available is  few of 1 conversations: and each strategy are their descriptions are as follows:
1-focusedWhat the clarification or to the topic or better the seekereeeker clarify their problem they they are.
 
-ended questions are preferred for as:：bing：00igh:  questions are be used to confirm specific information. 
 Validationate： Paraphrasing： Rest technique rest clear direct versionphrasing of the help-seeker's words. can help them clarify the situation from clearly. Strategy Validation： Feelings： Ackiculating the validate the help-seeker’s feelings, This Validation-disclosure： Shareulge personal experiences or you have had, that that you have with the help-seeker. help empathy understanding. 
 Validationirmation： Validationassurance： Ackirm the help-seer’s feelings and abilities, and abilities. provide reassurance that hope. 
 S Information： Offer specific and how to deal the improve avoid careful not avoid bestep and provide the what to do. 
 Problemal： Provide information that help help-seeker, such example, a, statistics, or, or, etc examples asking their. 
 Problem： Any ofasantries, small small strategies strategies that are not fit into the above categories.Question used with you, I's important of the strategies too I
user
I used Aff strategy the conversation, and my reply is I do you fears feel about this situation how reply

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
没jieba之前label, I used Question for the conversation, and my reply is How do your children feel about this? And his?
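One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the 3-D `preds` are logits from a teacher-forced forward pass (evaluation without generation), then the argmax at position t is the model's guess for the token at position t + 1, and guesses are produced for prompt and padding positions too. Decoding that array directly would give text that resembles the prompt with many wrong words, much like the pred shown above. The sketch below uses toy ids to show how such predictions could be lined up with the labels. The helper name `align_teacher_forced` is hypothetical, and `IGNORE_INDEX = -100` is an assumed value for the masked-label marker.

```python
import numpy as np

IGNORE_INDEX = -100  # assumed value of the masked-label marker


def align_teacher_forced(pred_ids, labels, ignore_index=IGNORE_INDEX):
    """Pair each predicted token with the label it was trying to predict.

    Assumes pred_ids[:, t] is the guess for the token at position t + 1
    (teacher forcing) and that prompt/padding labels equal ignore_index.
    """
    shifted_preds = pred_ids[:, :-1]   # drop the guess made after the last token
    shifted_labels = labels[:, 1:]     # drop the first label, which has no guess
    mask = shifted_labels != ignore_index
    return [(p[m], l[m]) for p, l, m in zip(shifted_preds, shifted_labels, mask)]


# Toy ids: two masked prompt positions, three response tokens, one padding slot.
labels = np.array([[-100, -100, 7, 8, 9, -100]])
pred_ids = np.array([[5, 7, 8, 4, 0, 0]])

pairs = align_teacher_forced(pred_ids, labels)
kept_preds, kept_labels = pairs[0]
print(kept_preds)   # guesses for the response positions only
print(kept_labels)  # the response tokens themselves
```

The alignment has to run before `IGNORE_INDEX` is replaced with `pad_token_id`, since the mask depends on it. Even when aligned, these are per-position guesses conditioned on the reference prefix, not a generated reply, so ROUGE or BLEU computed on them is not comparable to scores on generated text. In Hugging Face `Seq2SeqTrainingArguments`, the `predict_with_generate` option makes evaluation use generation; whether it applies to this setup is not confirmed in the issue.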

The full metric.py code is as follows:

from dataclasses import dataclass
from typing import TYPE_CHECKING, Dict, Sequence, Tuple, Union

import numpy as np

from ...extras.constants import IGNORE_INDEX
from ...extras.packages import is_jieba_available, is_nltk_available, is_rouge_available

if TYPE_CHECKING:
    from transformers.tokenization_utils import PreTrainedTokenizer

if is_jieba_available():
    import jieba  # type: ignore

if is_nltk_available():
    from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

if is_rouge_available():
    from rouge_chinese import Rouge


@dataclass
class ComputeMetrics:
    r"""
    Wraps the tokenizer into metric functions, used in Seq2SeqPeftTrainer.
    """

    tokenizer: "PreTrainedTokenizer"

    def __call__(self, eval_preds: Sequence[Union[np.ndarray, Tuple[np.ndarray]]]) -> Dict[str, float]:
        r"""
        Uses the model predictions to compute metrics.
        """
        preds, labels = eval_preds
        score_dict = {"rouge-1": [], "rouge-2": [], "rouge-l": [], "bleu-4": [], "acc": []}
        
        # Convert the probability distribution to class indices
        if preds.ndim == 3:
            preds = np.argmax(preds, axis=-1)
            
        preds = np.where(preds != IGNORE_INDEX, preds, self.tokenizer.pad_token_id)
        labels = np.where(labels != IGNORE_INDEX, labels, self.tokenizer.pad_token_id)
        
        # Print the shapes
        print("preds shape", preds.shape)
        print("labels shape", labels.shape)

        # Print some of the predicted and label tokens
        print("Predicted tokens:", preds[0][:10])
        print("Label tokens:", labels[0][:10])

        decoded_preds = self.tokenizer.batch_decode(preds, skip_special_tokens=True)
        decoded_labels = self.tokenizer.batch_decode(labels, skip_special_tokens=True)

        for pred, label in zip(decoded_preds, decoded_labels):
            
            # "没jieba之前" means "before jieba", i.e. the raw decoded text
            print("没jieba之前pred,", pred)
            print("没jieba之前label,", label)
            
            hypothesis = list(jieba.cut(pred))
            reference = list(jieba.cut(label))
            
            if len(" ".join(hypothesis).split()) == 0 or len(" ".join(reference).split()) == 0:
                result = {"rouge-1": {"f": 0.0}, "rouge-2": {"f": 0.0}, "rouge-l": {"f": 0.0}}
            else:
                rouge = Rouge()
                scores = rouge.get_scores(" ".join(hypothesis), " ".join(reference))
                result = scores[0]

            for k, v in result.items():
                score_dict[k].append(round(v["f"] * 100, 4))

            bleu_score = sentence_bleu([list(label)], list(pred), smoothing_function=SmoothingFunction().method3)
            score_dict["bleu-4"].append(round(bleu_score * 100, 4))
        
        print(score_dict)
        print()
        ans = {k: float(np.mean(v)) for k, v in score_dict.items()}
        print(ans)
        
        return ans
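A side note on the averaging step: `score_dict` above contains an `"acc"` key that the loop never fills, and `np.mean` of an empty list returns `nan` with a RuntimeWarning. A minimal sketch of one way to skip empty metrics follows; the helper name `mean_of_nonempty` is hypothetical and the scores are invented.

```python
import numpy as np


def mean_of_nonempty(score_dict):
    """Average each metric, skipping metrics that collected no values."""
    return {k: float(np.mean(v)) for k, v in score_dict.items() if len(v) > 0}


# Toy scores; "acc" is left empty on purpose.
scores = {"rouge-1": [10.0, 30.0], "bleu-4": [5.0], "acc": []}
averaged = mean_of_nonempty(scores)
print(averaged)
```

Dropping the unused key from `score_dict` would have the same effect.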

Others

No response


github-actions bot added the "pending" label (This problem is yet to be addressed) on Jun 11, 2024

hiyouga added the "wontfix" label (This will not be worked on) and removed the "pending" label on Jun 14, 2024

hiyouga closed this as not planned on Jun 14, 2024
