
manually fix PLBart tokenizer
ArthurZucker committed Sep 26, 2024
1 parent 0317895 commit e71a01a
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions src/transformers/models/plbart/tokenization_plbart.py
@@ -130,6 +130,7 @@ def __init__(
         tgt_lang=None,
         sp_model_kwargs: Optional[Dict[str, Any]] = None,
         additional_special_tokens=None,
+        clean_up_tokenization_spaces=True,
         **kwargs,
     ):
         # Mask token behave like a normal word, i.e. include the space before it
@@ -200,6 +201,7 @@ def __init__(
             tgt_lang=tgt_lang,
             additional_special_tokens=_additional_special_tokens,
             sp_model_kwargs=self.sp_model_kwargs,
+            clean_up_tokenization_spaces=clean_up_tokenization_spaces,
             **kwargs,
         )
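The fix above follows a general pattern: a tokenizer subclass must both accept `clean_up_tokenization_spaces` in its own `__init__` signature and forward it explicitly to the parent class, otherwise a user-supplied value is dropped and the parent silently falls back to its default. A minimal sketch of that pattern (using hypothetical stand-in classes, not the real `transformers` tokenizers):

```python
# Hypothetical stand-in for the base tokenizer class: it stores the flag
# and applies a crude cleanup (removing spaces before punctuation) when
# the flag is enabled.
class BaseTokenizer:
    def __init__(self, clean_up_tokenization_spaces=True, **kwargs):
        self.clean_up_tokenization_spaces = clean_up_tokenization_spaces

    def decode_cleanup(self, text):
        # Simplified stand-in for the real cleanup logic.
        if self.clean_up_tokenization_spaces:
            return text.replace(" .", ".").replace(" ,", ",")
        return text


class PLBartLikeTokenizer(BaseTokenizer):
    def __init__(self, src_lang=None, clean_up_tokenization_spaces=True, **kwargs):
        # The commit's fix in miniature: accept the flag AND forward it,
        # so a caller-supplied value survives into the base class.
        super().__init__(
            clean_up_tokenization_spaces=clean_up_tokenization_spaces,
            **kwargs,
        )
        self.src_lang = src_lang


# With the flag disabled, spacing is left untouched.
tok = PLBartLikeTokenizer(clean_up_tokenization_spaces=False)
print(tok.decode_cleanup("hello , world ."))  # hello , world .
```

Without the explicit forwarding, `clean_up_tokenization_spaces=False` passed by the caller would never reach `BaseTokenizer`, which would initialize with its default of `True`.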
