Guide to fine-tuning ReDimNet #14

Open
pongthang opened this issue Nov 1, 2024 · 13 comments

@pongthang

Hi, since ReDimNet was not trained on kids' data, accuracy drops (60-70%) when I try it on a kids dataset, so I want to fine-tune the model on kids' voices. Could you help with this? Could you provide the training script, or an example of one, and some wiki notes on how you trained it? By the way, it is a good project. Thanks.

@vanIvan
Collaborator

vanIvan commented Nov 4, 2024

Hi @pongthang, thank you for taking a look at our work! We used a slightly modified version of the wespeaker pipeline, and we will discuss within our team when we can publish it. For now, please refer to the wespeaker pipeline. The only main difference in ours is that we removed feature calculation from the data loading pipeline and moved it into the model. I believe wespeaker has added the ReDimNet architecture and recipes for it.
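
The change described above, moving feature calculation out of the data loader and into the model, can be illustrated with a small framework-free sketch. This is not the ReDimNet or wespeaker code: `log_energy_features`, `ModelWithFrontend`, and `mean_std_backbone` are hypothetical names, and the log-energy frontend is only a stand-in for a real spectrogram layer. In a PyTorch pipeline the same idea would typically be a module whose forward pass computes the features before calling the backbone.

```python
import math
from typing import Callable, List, Sequence


def log_energy_features(waveform: Sequence[float],
                        frame_len: int = 400,
                        hop: int = 160) -> List[float]:
    """Toy frontend (assumption): one log-energy value per frame."""
    feats = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-10))
    return feats


def mean_std_backbone(features: Sequence[float]) -> List[float]:
    """Placeholder backbone (assumption): mean and std as a 2-dim 'embedding'."""
    n = len(features)
    mean = sum(features) / n
    var = sum((f - mean) ** 2 for f in features) / n
    return [mean, math.sqrt(var)]


class ModelWithFrontend:
    """Hypothetical wrapper: raw waveform in, embedding out."""

    def __init__(self,
                 frontend: Callable[[Sequence[float]], List[float]],
                 backbone: Callable[[Sequence[float]], List[float]]) -> None:
        self.frontend = frontend
        self.backbone = backbone

    def __call__(self, waveform: Sequence[float]) -> List[float]:
        # Feature calculation happens inside the model, not in the data loader.
        features = self.frontend(waveform)
        return self.backbone(features)


# The data loader now only needs to yield raw samples (1 s of a 440 Hz tone here).
waveform = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
model = ModelWithFrontend(log_energy_features, mean_std_backbone)
embedding = model(waveform)
```

One practical consequence of this layout is that the exported model carries its own feature configuration, so training and inference cannot silently disagree on frame length or hop size.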

@pongthang
Author

Thank you. It would be great if you could share the training process. Yes, the wespeaker pipeline has added the ReDimNet architecture and recipes for it. I will follow that. Thank you again for your support.

@vanIvan
Collaborator

vanIvan commented Nov 6, 2024

We will soon be releasing a few more models trained on a much larger dataset. They should perform better across other domains. Stay tuned.

@pongthang
Author

> We will soon be releasing a few more models trained on a much larger dataset. They should perform better across other domains. Stay tuned.

This is great. How can I find out when new models are released?

@MonolithFoundation

@vanIvan Will it support Chinese speaker verification? Is there any estimate for this? Really hoping for this.

@vanIvan
Collaborator

vanIvan commented Nov 13, 2024

Yes, the new models will be pretrained on voxblink2 and fine-tuned on voxblink2+vox2+cnceleb. They will perform better on Chinese.

@pongthang
Author

pongthang commented Nov 14, 2024

@vanIvan, hi, are you planning to train smaller models, similar in size to b0 or b1, on the new dataset?

@MonolithFoundation

@vanIvan Sounds extremely good. Is there any estimate of when the new models will be released?

@vanIvan
Collaborator

vanIvan commented Nov 14, 2024

@MonolithFoundation We released two new models yesterday: S and M models trained on the voxblink2 dataset. Please check the README and evaluation pages for more information.

@vanIvan
Collaborator

vanIvan commented Nov 14, 2024

@pongthang No, we were not planning to train smaller models on the voxblink2 dataset.

@MonolithFoundation

@vanIvan Sorry, I mistyped. I meant: when will the models that combine voxblink2+vox2+cnceleb be released?

@vanIvan
Collaborator

vanIvan commented Nov 14, 2024

@MonolithFoundation We already released them yesterday. Both the S and M models have two sets of weights: a version pretrained on voxblink2 and a version fine-tuned on voxblink2+vox2+cnceleb12.

@MonolithFoundation

Thank you so much.
Is there an inference script that can be used to run inference with these models?
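
The thread does not point to an inference script, so the following is only a sketch of the scoring step that usually follows embedding extraction in speaker verification: cosine similarity between two embeddings compared against a threshold. It does not use any ReDimNet API. The embeddings are made-up placeholder vectors, `same_speaker` is a hypothetical helper, and the threshold of 0.5 is an assumption that would need tuning on a development set for the target domain (for example, kids' voices).

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-norm embedding")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a: Sequence[float],
                 emb_b: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Hypothetical decision rule; the threshold is a placeholder, not a tuned value."""
    return cosine_similarity(emb_a, emb_b) >= threshold


# Placeholder embeddings; real ones would come from the speaker model.
enroll = [0.2, 0.9, -0.1]
test_same = [0.25, 0.85, -0.05]
test_diff = [-0.7, 0.1, 0.6]

accept = same_speaker(enroll, test_same)
reject = same_speaker(enroll, test_diff)
```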
