0.28
This update includes several enhancements to the Easy XTTS Trainer, aimed at improving the quality of trained models and providing more control over the training process.
- Improved Audio Segmentation: The trainer now identifies optimal split points between segments by locating the quietest points in the audio. This produces cleaner transitions between segments and makes abrupt cutoffs, or stray fragments of the previous or next segment, less likely. That in turn improves the quality and naturalness of the synthesized speech and helps eliminate artifacts.
- Integrated Audio Preprocessing: You can now apply the following audio processing steps directly within Pandrator as a part of the training workflow:
  - Normalization: Normalize audio to a target LUFS value (default -16.0). Use `--normalize <value>` to specify a different target.
  - De-essing: Reduce sibilance with the `--dess` flag.
  - Noise Reduction: Apply DeepFilterNet noise reduction with `--denoise`.
  - Dynamic Range Compression: Use the `--compress` option with profiles for `male`, `female`, or `neutral` voices.
  - Sample Rate Control: Use `--sample-rate` to explicitly set the sample rate (22050 Hz or 44100 Hz). 22050 Hz is recommended.
- Training Options:
  - Training/Validation Split: The `--training-proportion` argument (e.g., `--training-proportion 8_2`) now controls the train/validation split ratio.
  - Segmentation Methods: The trainer supports three segmentation methods: `maximise-punctuation`, `punctuation-only`, and `mixed`. The `--method-proportion` argument controls the ratio for the `mixed` method.
- Pandrator Integration: As in previous versions, trained models and two reference audio samples (a random one from the 10% longest segments and the fastest one from the 70% longest segments) are automatically made available in Pandrator for immediate generation.
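To illustrate the idea behind quietest-point segmentation, here is a minimal sketch in plain Python. It is not the trainer's actual code: the function names (`frame_rms`, `quietest_split`), the use of RMS energy as the loudness measure, and the `search_radius` and `frame_size` parameters are all assumptions made for this example.

```python
import math


def frame_rms(samples, start, size):
    """Root-mean-square energy of the frame samples[start:start + size]."""
    frame = samples[start:start + size]
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def quietest_split(samples, target, search_radius, frame_size):
    """Return a split index near `target`, placed in the quietest frame.

    Hypothetical helper: scans non-overlapping frames within
    `search_radius` samples of `target` and returns the centre of the
    frame with the lowest RMS energy.
    """
    lo = max(0, target - search_radius)
    hi = min(len(samples) - frame_size, target + search_radius)
    if hi < lo:
        raise ValueError("audio is too short for the requested frame size")
    best_start = min(
        range(lo, hi + 1, frame_size),
        key=lambda i: frame_rms(samples, i, frame_size),
    )
    return best_start + frame_size // 2


# Synthetic signal: loud, then a short near-silent gap, then loud again.
samples = [0.5] * 100 + [0.01] * 20 + [0.5] * 100
split = quietest_split(samples, target=90, search_radius=50, frame_size=10)
```

In this toy signal the intended cut at sample 90 would land inside loud audio, so the helper moves it into the quiet gap that starts at sample 100. A real implementation would work on decoded audio and would probably use a search window expressed in milliseconds.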
These changes provide more precise control over the training process and should result in higher-quality custom XTTS voices.
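As a rough illustration of how a value such as `8_2` passed to `--training-proportion` could be interpreted, the sketch below treats it as a train:validation ratio of 8:2. That reading, the helper names `parse_proportion` and `split_segments`, and the simple in-order split are assumptions for this example; the trainer may parse the value or assign segments differently.

```python
def parse_proportion(value):
    """Turn a string such as "8_2" into (train_fraction, validation_fraction).

    Hypothetical helper: assumes the two numbers are relative weights.
    """
    parts = value.split("_")
    if len(parts) != 2:
        raise ValueError(f"expected a value like '8_2', got {value!r}")
    train, validation = (float(p) for p in parts)
    if train <= 0 or validation < 0:
        raise ValueError("proportions must be positive")
    total = train + validation
    return train / total, validation / total


def split_segments(segments, value):
    """Split a list of segments into training and validation parts."""
    train_fraction, _ = parse_proportion(value)
    cut = round(len(segments) * train_fraction)
    return segments[:cut], segments[cut:]


train_set, validation_set = split_segments(list(range(10)), "8_2")
```

With ten segments and `8_2`, the first eight go to training and the last two to validation. A real pipeline would likely shuffle the segments before splitting.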
Self-contained packages
I've prepared packages (archives) that you can simply unpack - everything is preinstalled in its own portable conda environment. You can download them from here.
You can use the launcher to start Pandrator, update it and install new features.
Package | Contents | Unpacked Size |
---|---|---|
1 | Pandrator and Silero | 4GB |
2 | Pandrator and XTTS (CPU only) | 7GB |
3 | Pandrator and XTTS | 14GB |
4 | Pandrator, XTTS, RVC, WhisperX (for dubbing) and XTTS fine-tuning | 36GB |
Installer
You may use the installer/launcher below, which was created from the `pandrator_installer_launcher.py` file in the repository, or use the source file directly. Please remember to run the executable as an administrator. It's possible that Windows or your antivirus software will flag it as a threat. You may whitelist it, or, if you're not comfortable doing that, review the code in the repository and install Pandrator manually.