Create test cases for the Tokenizer #27
Comments
Working on this with @pedroborgescruz.
Hi, @luke-carlson! We started working on this issue today. We made progress, but would like to ask a question before continuing: (1) What is the difference between a .vocab file (the parameter in https://github.com/apple/ml-mdm/blob/main/ml_mdm/language_models/factory.py#124) and a token_file (the parameter of the functions in https://github.com/apple/ml-mdm/blob/main/ml_mdm/language_models/tokenizer.py#45)?
Hey Pedro
PR: #41
We should create a test case that loads a file, creates a tokenizer (https://github.com/apple/ml-mdm/blob/main/ml_mdm/language_models/tokenizer.py), and then asserts that the tokenizer was created.
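
A rough sketch of what such a test could look like. Since the thread does not show the real constructor, `StandInTokenizer`, `make_vocab_file`, and the one-token-per-line file layout are all hypothetical placeholders, not the ml-mdm API; the real class from `ml_mdm/language_models/tokenizer.py` would be swapped in with whatever arguments it actually takes.

```python
import tempfile
from pathlib import Path


class StandInTokenizer:
    """Hypothetical stand-in, NOT the ml-mdm Tokenizer.

    Assumes a plain-text file with one token per line.
    """

    def __init__(self, token_file):
        with open(token_file, encoding="utf-8") as f:
            # Skip blank lines; keep tokens in file order.
            self.vocab = [line.rstrip("\n") for line in f if line.strip()]
        self.token_to_id = {tok: i for i, tok in enumerate(self.vocab)}


def make_vocab_file(directory, tokens):
    """Write a tiny vocabulary file (hypothetical format) and return its path."""
    path = Path(directory) / "toy.vocab"
    path.write_text("\n".join(tokens) + "\n", encoding="utf-8")
    return path


def test_tokenizer_is_created():
    # Load a file, create a tokenizer, assert that it was created.
    with tempfile.TemporaryDirectory() as tmp:
        vocab_path = make_vocab_file(tmp, ["<pad>", "hello", "world"])
        tokenizer = StandInTokenizer(vocab_path)
    assert tokenizer is not None
    assert len(tokenizer.vocab) == 3
    assert tokenizer.token_to_id["hello"] == 1
```

Writing the vocabulary into a temporary directory keeps the test independent of any checked-in fixture; if the repository already ships a small vocab file, pointing the test at that file may be simpler.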