What is the difference between val.txt and train.txt? #104

Open
wants to merge 68 commits into base: main
Commits (68)
d66cfc0
add japanese cleaners
CjangCjengh Aug 2, 2022
2c14fc1
modify
CjangCjengh Aug 2, 2022
6eea3ae
modify
CjangCjengh Aug 2, 2022
0e62593
modify
CjangCjengh Aug 3, 2022
5f2f483
ban numba logging
CjangCjengh Aug 3, 2022
67d967d
fix
CjangCjengh Aug 3, 2022
9aeb60e
fix
CjangCjengh Aug 3, 2022
5ff178a
fix
CjangCjengh Aug 8, 2022
5e81711
add new cleaner
CjangCjengh Aug 9, 2022
4ebc313
add korean cleaners
CjangCjengh Aug 9, 2022
544faea
fix
CjangCjengh Aug 9, 2022
cea906a
add new dataset
CjangCjengh Aug 9, 2022
fa25950
add requirement
CjangCjengh Aug 9, 2022
65dff1e
fix
CjangCjengh Aug 10, 2022
e3e355d
delete symbol.py
CjangCjengh Aug 14, 2022
f4e80d1
temp
CjangCjengh Aug 14, 2022
2beed95
Revert "temp"
CjangCjengh Aug 14, 2022
73d4adc
Revert "delete symbol.py"
CjangCjengh Aug 14, 2022
d38ed30
fix
CjangCjengh Aug 14, 2022
b2a8ff0
change filelists
CjangCjengh Aug 14, 2022
036ab34
changed dataset
CjangCjengh Aug 15, 2022
eeed9da
add chinese cleaners
CjangCjengh Aug 15, 2022
d1bad2e
Update symbols.py
CjangCjengh Aug 15, 2022
e632160
Update symbols.py
CjangCjengh Aug 15, 2022
71a7051
add new dataset
CjangCjengh Aug 16, 2022
c3b1073
Update japanese_ss_base2.json
CjangCjengh Aug 16, 2022
24b77e0
Update utils.py
CjangCjengh Aug 16, 2022
4bd1125
fix transcript
CjangCjengh Aug 16, 2022
f88bed1
update filelist
CjangCjengh Aug 16, 2022
9f7f1fb
fix
CjangCjengh Aug 16, 2022
ba9bd7d
add new dataset
CjangCjengh Aug 19, 2022
2f84796
update dataset
CjangCjengh Aug 19, 2022
8bc1ae4
update dataset
CjangCjengh Aug 21, 2022
fc7f3b4
add new dataset
CjangCjengh Aug 21, 2022
1353962
add new cleaners
CjangCjengh Aug 22, 2022
5163e7c
add new dataset
CjangCjengh Aug 23, 2022
b9d65ea
fix
CjangCjengh Aug 23, 2022
26eb734
fix
CjangCjengh Aug 24, 2022
d3fe8d3
fix
CjangCjengh Aug 24, 2022
c822690
add filelist
CjangCjengh Aug 29, 2022
7268900
add new dataset
CjangCjengh Sep 1, 2022
87c2331
fix
CjangCjengh Sep 3, 2022
7f0b17b
Update train_ms.py
CjangCjengh Sep 8, 2022
59ba031
Update train.py
CjangCjengh Sep 8, 2022
f074a36
add Sanskrit symbols
CjangCjengh Sep 17, 2022
e6a58ad
add Sanskrit cleaners
CjangCjengh Sep 20, 2022
5b11af4
add new cleaners
CjangCjengh Sep 30, 2022
93e0250
add cleaners
CjangCjengh Oct 1, 2022
8c83d4d
support shanghainese
CjangCjengh Oct 4, 2022
3835070
fix
CjangCjengh Oct 9, 2022
ee23eaa
add chinese dialect cleaners
CjangCjengh Oct 11, 2022
d3bef23
Add Dockerfile
YumeAyai Oct 20, 2022
c4558b2
Merge pull request #1 from YumeAyai/main
CjangCjengh Oct 20, 2022
27d42b9
fix
CjangCjengh Oct 11, 2022
a3008a5
fix cleaners
CjangCjengh Oct 30, 2022
93f64db
Update colab.ipynb
CjangCjengh Oct 30, 2022
ca6b460
update readme
CjangCjengh Oct 30, 2022
2de6e4d
Update cleaners.py
CjangCjengh Nov 5, 2022
57343b3
Add files via upload
NaruseMioShirakana Nov 24, 2022
404fb3d
Delete Libtorch directory
NaruseMioShirakana Nov 24, 2022
150ec49
Add files via upload
NaruseMioShirakana Nov 24, 2022
6f536fd
Merge pull request #2 from NaruseMioShirakana/main
CjangCjengh Nov 24, 2022
a88d0f2
update cleaners
CjangCjengh Dec 2, 2022
4952f60
Update cleaners.py
CjangCjengh Dec 2, 2022
a688354
Docker: install libsndfile & build monotonic_align
kaoet Jan 17, 2023
14155e9
Update a for loop in data utils
xdtdaniel Feb 27, 2023
92c659f
Merge pull request #4 from xdtdaniel/patch-1
CjangCjengh Jul 4, 2023
43c74f3
Merge pull request #3 from kaoet/main
CjangCjengh Dec 6, 2023
26 changes: 26 additions & 0 deletions .dockerignore
@@ -0,0 +1,26 @@
**/__pycache__
**/.venv
**/.classpath
**/.dockerignore
**/.env
**/.git
**/.gitignore
**/.project
**/.settings
**/.toolstarget
**/.vs
**/.vscode
**/*.*proj.user
**/*.dbmdl
**/*.jfm
**/bin
**/charts
**/docker-compose*
**/compose*
**/Dockerfile*
**/node_modules
**/npm-debug.log
**/obj
**/secrets.dev.yaml
**/values.dev.yaml
README.md
3 changes: 3 additions & 0 deletions .gitignore
@@ -9,3 +9,6 @@ __pycache__
build
*.c
monotonic_align/monotonic_align
/.vs/vits/FileContentIndex
configs/dracu_japanese_base2.json
configs/tolove_japanese_base2.json
3 changes: 3 additions & 0 deletions .vs/ProjectSettings.json
@@ -0,0 +1,3 @@
{
"CurrentProjectSetting": null
}
9 changes: 9 additions & 0 deletions .vs/VSWorkspaceState.json
@@ -0,0 +1,9 @@
{
"ExpandedNodes": [
"",
"\\filelists",
"\\text"
],
"SelectedNode": "\\text\\symbols.py",
"PreviewInSolutionExplorer": false
}
Binary file added .vs/slnx.sqlite
Binary file not shown.
Empty file.
Binary file added .vs/vits/v17/.suo
Binary file not shown.
23 changes: 23 additions & 0 deletions Dockerfile
@@ -0,0 +1,23 @@
# For more information, please refer to https://aka.ms/vscode-docker-python
FROM python:3.7-slim

# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1

# Install pip requirements
COPY requirements.txt .
RUN apt-get update
RUN apt-get install -y vim
RUN apt-get install -y gcc
RUN apt-get install -y g++
RUN apt-get install -y cmake
RUN apt-get install -y libsndfile1
RUN python -m pip install -r requirements.txt

WORKDIR /content
COPY . /content

# Build monotonic alignment search
RUN cd monotonic_align && python3 setup.py build_ext --inplace
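
The final Dockerfile step compiles the Cython monotonic alignment search extension in place. A quick way to confirm the build inside the container is a toy call to monotonic_align.maximum_path; the sketch below is not part of this PR and assumes the upstream VITS signature maximum_path(neg_cent, mask) with [batch, frames, tokens] tensors.

import torch
import monotonic_align  # resolves only after the build_ext step above succeeded

# Dummy sizes: one utterance, 50 mel frames, 20 text tokens (frames >= tokens).
b, t_t, t_s = 1, 50, 20
neg_cent = torch.randn(b, t_t, t_s)  # alignment scores, as computed in models.py
mask = torch.ones(b, t_t, t_s)       # every (frame, token) position is valid here

path = monotonic_align.maximum_path(neg_cent, mask)
print(path.shape)  # torch.Size([1, 50, 20]): a hard monotonic alignment path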
121 changes: 121 additions & 0 deletions Libtorch C++ Infer/VITS-LibTorch.cpp
@@ -0,0 +1,121 @@
#include <iostream>
#include <torch/torch.h>
#include <torch/script.h>
#include <string>
#include <vector>
#include <locale>
#include <codecvt>
#include <direct.h>
#include <fstream>
typedef int64_t int64;
namespace Shirakana {

struct WavHead {
char RIFF[4];
long int size0;
char WAVE[4];
char FMT[4];
long int size1;
short int fmttag;
short int channel;
long int samplespersec;
long int bytepersec;
short int blockalign;
short int bitpersamples;
char DATA[4];
long int size2;
};

int conArr2Wav(int64 size, int16_t* input, const char* filename) {
WavHead head = { {'R','I','F','F'},0,{'W','A','V','E'},{'f','m','t',' '},16,
1,1,22050,22050 * 2,2,16,{'d','a','t','a'},
0 };
head.size0 = size * 2 + 36;
head.size2 = size * 2;
std::ofstream ocout;
char* outputData = (char*)input;
ocout.open(filename, std::ios::out | std::ios::binary);
ocout.write((char*)&head, 44);
ocout.write(outputData, (int32_t)(size * 2));
ocout.close();
return 0;
}

inline std::wstring to_wide_string(const std::string& input)
{
std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
return converter.from_bytes(input);
}

inline std::string to_byte_string(const std::wstring& input)
{
std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
return converter.to_bytes(input);
}
}

#define val const auto
int main()
{
torch::jit::Module Vits;
std::string buffer;
std::vector<int64> text;
std::vector<int16_t> data;
while(true)
{
while (true)
{
std::cin >> buffer;
if (buffer == "end")
return 0;
if(buffer == "model")
{
std::cin >> buffer;
Vits = torch::jit::load(buffer);
continue;
}
if (buffer == "endinfer")
{
Shirakana::conArr2Wav(data.size(), data.data(), "temp\\tmp.wav");
data.clear();
std::cout << "endofinfe";
continue;
}
if (buffer == "line")
{
std::cin >> buffer;
while (buffer.find("endline")==std::string::npos)
{
text.push_back(std::atoi(buffer.c_str()));
std::cin >> buffer;
}
val InputTensor = torch::from_blob(text.data(), { 1,static_cast<int64>(text.size()) }, torch::kInt64);
std::array<int64, 1> TextLength{ static_cast<int64>(text.size()) };
val InputTensor_length = torch::from_blob(TextLength.data(), { 1 }, torch::kInt64);
std::vector<torch::IValue> inputs;
inputs.push_back(InputTensor);
inputs.push_back(InputTensor_length);
if (buffer.length() > 7)
{
std::array<int64, 1> speakerIndex{ (int64)atoi(buffer.substr(7).c_str()) };
inputs.push_back(torch::from_blob(speakerIndex.data(), { 1 }, torch::kLong));
}
val output = Vits.forward(inputs).toTuple()->elements()[0].toTensor().multiply(32276.0F);
val outputSize = output.sizes().at(2);
val floatOutput = output.data_ptr<float>();
int16_t* outputTmp = (int16_t*)malloc(sizeof(float) * outputSize);
if (outputTmp == nullptr) {
throw std::exception("out of memory");
}
for (int i = 0; i < outputSize; i++) {
*(outputTmp + i) = (int16_t) * (floatOutput + i);
}
data.insert(data.end(), outputTmp, outputTmp+outputSize);
free(outputTmp);
text.clear();
std::cout << "endofline";
}
}
}
//model S:\VSGIT\ShirakanaTTSUI\build\x64\Release\Mods\AtriVITS\AtriVITS_LJS.pt
}
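
The console above speaks a small whitespace-separated protocol on stdin: model <path> loads a traced model, line <id> ... endline (or endline<speaker>) synthesizes one utterance from symbol IDs, endinfer flushes the accumulated audio to temp\tmp.wav, and end exits. The following Python sketch of a host process driving that protocol is illustrative only; the binary name VITS-LibTorch.exe, the model path, and the token IDs are placeholders rather than anything defined by this PR.

import subprocess

# Launch the compiled console (binary name is an assumption for this sketch).
proc = subprocess.Popen(
    ["VITS-LibTorch.exe"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def send(*tokens):
    # The C++ side reads whitespace-separated tokens with std::cin >>.
    proc.stdin.write(" ".join(str(t) for t in tokens) + "\n")
    proc.stdin.flush()

def wait_for(marker):
    # The console prints bare markers such as "endofline" without newlines.
    buf = ""
    while not buf.endswith(marker):
        ch = proc.stdout.read(1)
        if not ch:
            raise RuntimeError("console exited before sending " + marker)
        buf += ch

send("model", r"Mods\AtriVITS\AtriVITS_LJS.pt")  # placeholder model path
send("line", 10, 23, 5, 42, "endline")           # placeholder symbol IDs; endline<N> selects speaker N
wait_for("endofline")
send("endinfer")                                 # write accumulated audio to temp\tmp.wav
wait_for("endofinfe")
send("end")                                      # terminate the console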
142 changes: 142 additions & 0 deletions Libtorch C++ Infer/toLibTorch.ipynb
@@ -0,0 +1,142 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import IPython.display as ipd\n",
"\n",
"import os\n",
"import json\n",
"import math\n",
"import torch\n",
"from torch import nn\n",
"from torch.nn import functional as F\n",
"from torch.utils.data import DataLoader\n",
"\n",
"import ../commons\n",
"import ../utils\n",
"from ../data_utils import TextAudioLoader, TextAudioCollate, TextAudioSpeakerLoader, TextAudioSpeakerCollate\n",
"from ../models import SynthesizerTrn\n",
"from ../text.symbols import symbols\n",
"from ../text import text_to_sequence\n",
"\n",
"from scipy.io.wavfile import write\n",
"\n",
"\n",
"def get_text(text, hps):\n",
" text_norm = text_to_sequence(text, hps.data.text_cleaners)\n",
" if hps.data.add_blank:\n",
" text_norm = commons.intersperse(text_norm, 0)\n",
" text_norm = torch.LongTensor(text_norm)\n",
" return text_norm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#############################################################\n",
"# #\n",
"# Single Speakers #\n",
"# #\n",
"#############################################################"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hps = utils.get_hparams_from_file(\"configs/XXX.json\") #将\"\"内的内容修改为你的模型路径与config路径\n",
"net_g = SynthesizerTrn(\n",
" len(symbols),\n",
" hps.data.filter_length // 2 + 1,\n",
" hps.train.segment_size // hps.data.hop_length,\n",
" **hps.model).cuda()\n",
"_ = net_g.eval()\n",
"\n",
"_ = utils.load_checkpoint(\"/path/to/model.pth\", net_g, None)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"stn_tst = get_text(\"こんにちは\", hps)\n",
"with torch.no_grad():\n",
" x_tst = stn_tst.cuda().unsqueeze(0)\n",
" x_tst_lengths = torch.LongTensor([stn_tst.size(0)]).cuda()\n",
" traced_mod = torch.jit.trace(net_g,(x_tst, x_tst_lengths,sid))\n",
" torch.jit.save(traced_mod,\"OUTPUTLIBTORCHMODEL.pt\")\n",
" audio = net_g.infer(x_tst, x_tst_lengths, noise_scale=.667, noise_scale_w=0.8, length_scale=1)[0][0,0].data.cpu().float().numpy()\n",
"ipd.display(ipd.Audio(audio, rate=hps.data.sampling_rate, normalize=False))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#############################################################\n",
"# #\n",
"# Multiple Speakers #\n",
"# #\n",
"#############################################################"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hps = utils.get_hparams_from_file(\"./configs/XXX.json\") #将\"\"内的内容修改为你的模型路径与config路径\n",
"net_g = SynthesizerTrn(\n",
" len(symbols),\n",
" hps.data.filter_length // 2 + 1,\n",
" hps.train.segment_size // hps.data.hop_length,\n",
" n_speakers=hps.data.n_speakers,\n",
" **hps.model).cuda()\n",
"_ = net_g.eval()\n",
"\n",
"_ = utils.load_checkpoint(\"/path/to/model.pth\", net_g, None)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"stn_tst = get_text(\"こんにちは\", hps)\n",
"with torch.no_grad():\n",
" x_tst = stn_tst.cuda().unsqueeze(0)\n",
" x_tst_lengths = torch.LongTensor([stn_tst.size(0)]).cuda()\n",
" sid = torch.LongTensor([4]).cuda()\n",
" traced_mod = torch.jit.trace(net_g,(x_tst, x_tst_lengths,sid))\n",
" torch.jit.save(traced_mod,\"OUTPUTLIBTORCHMODEL.pt\")\n",
" audio = net_g.infer(x_tst, x_tst_lengths, sid=sid, noise_scale=.667, noise_scale_w=0.8, length_scale=1)[0][0,0].data.cpu().float().numpy()\n",
"ipd.display(ipd.Audio(audio, rate=hps.data.sampling_rate, normalize=False))"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
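
Once either export cell has written OUTPUTLIBTORCHMODEL.pt, the traced module can be smoke-tested from Python before moving to the C++ console. This is a hedged sketch rather than part of the notebook: it assumes hps and get_text() from the cells above are still in scope, and that the traced forward returns a tuple whose first element is the audio tensor, which is also what VITS-LibTorch.cpp expects.

import torch

# Reload the TorchScript module written by torch.jit.save above.
traced = torch.jit.load("OUTPUTLIBTORCHMODEL.pt").cuda().eval()

stn_tst = get_text("こんにちは", hps)            # reuses the notebook's helper and hps
x_tst = stn_tst.cuda().unsqueeze(0)
x_tst_lengths = torch.LongTensor([stn_tst.size(0)]).cuda()
sid = torch.LongTensor([4]).cuda()               # drop this argument for the single-speaker export

with torch.no_grad():
    out = traced(x_tst, x_tst_lengths, sid)      # same inputs as the trace call
    audio = out[0][0, 0].cpu().float().numpy()   # first tuple element holds the waveform
print(audio.shape, hps.data.sampling_rate)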