!!! warning
    We assume no responsibility for any illegal use of the codebase. Please refer to your local laws regarding the DMCA (Digital Millennium Copyright Act) and other relevant laws in your area. <br/>
    This codebase and all models are released under the CC-BY-NC-SA-4.0 license.

Professional Windows users may consider WSL2 or Docker to run the codebase.

Non-professional Windows users can use the following steps to run the codebase without a Linux environment (with model compilation support, i.e. `torch.compile`):

1. Run `install_env.bat` to install the environment.
    - To choose whether downloads use a mirror site, edit the `USE_MIRROR` item in `install_env.bat`. `USE_MIRROR=false` downloads the latest stable version of torch from the original site. `USE_MIRROR=true` downloads the latest version of torch from a mirror site. The default is `true`.
    - To choose whether the compilation environment is installed, edit the `INSTALL_TYPE` item in `install_env.bat`. `INSTALL_TYPE=preview` downloads the preview version with the compiled environment. `INSTALL_TYPE=stable` downloads the stable version without the compiled environment.
2. If `INSTALL_TYPE=preview`, execute this step (optional, for activating the compiled model environment):
    - Download `LLVM-17.0.6-win64.exe`, double-click to install it, choose an appropriate installation location, and most importantly, check `Add Path to Current User` to add it to the environment variables.
    - In the Visual Studio Installer, click the `Modify` button, find the `Desktop development with C++` option, and check it for download.
3. Double-click `start.bat` to open the Fish-Speech training and inference configuration WebUI page.

To start the inference WebUI directly, edit `API_FLAGS.txt` in the project root directory and modify the first three lines as follows:
--infer
# --api
# --listen ...
...
To start the API server instead, edit `API_FLAGS.txt` in the project root directory and modify the first three lines as follows:
# --infer
--api
--listen ...
...
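
The two variants above differ only in which of the first three lines are commented out. As a minimal sketch, assuming a POSIX shell (for example WSL2 or Git Bash) and that each flag sits on its own line, the switch could be scripted as shown here. The helper name `set_mode` and the temporary sample file are hypothetical and not part of the project; the `...` after `--listen` is kept as a placeholder for whatever value your file contains.

```shell
# Hypothetical helper (not part of Fish-Speech): switch an API_FLAGS.txt-style
# file between WebUI mode (--infer) and API mode (--api and --listen).
# Assumes a disabled flag is prefixed with "# " at the start of its line.
set_mode() {
    file="$1"
    mode="$2"
    tmp="${file}.tmp"
    case "$mode" in
        infer)
            # Enable --infer, comment out --api and --listen.
            sed -e 's/^# *--infer/--infer/' \
                -e 's/^--api/# --api/' \
                -e 's/^--listen/# --listen/' "$file" > "$tmp"
            ;;
        api)
            # Comment out --infer, enable --api and --listen.
            sed -e 's/^--infer/# --infer/' \
                -e 's/^# *--api/--api/' \
                -e 's/^# *--listen/--listen/' "$file" > "$tmp"
            ;;
        *)
            echo "usage: set_mode FILE infer|api" >&2
            return 1
            ;;
    esac
    mv "$tmp" "$file"
}

# Demo on a throwaway copy so the real API_FLAGS.txt is never touched.
demo_dir="$(mktemp -d)"
printf '%s\n' '--infer' '# --api' '# --listen ...' > "$demo_dir/API_FLAGS.txt"
set_mode "$demo_dir/API_FLAGS.txt" api
cat "$demo_dir/API_FLAGS.txt"
```

Running the helper twice with the same mode leaves the file unchanged, since an already commented line no longer matches the uncommented pattern. Editing the file by hand remains the simplest route on Windows.
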
Double-click `run_cmd.bat` to enter the conda/python command line environment of this project.

# Create a Python 3.10 virtual environment (you can also use virtualenv)
conda create -n fish-speech python=3.10
conda activate fish-speech
# Install pytorch
pip3 install torch torchvision torchaudio
# Install fish-speech
pip3 install -e .[stable]
# (Ubuntu / Debian User) Install sox
apt install libsox-dev
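
After the commands above finish, a quick sanity check can confirm that the activated environment looks as expected. The following is only a sketch: the helper name `is_py310` is hypothetical, and treating 3.10.x as the expected version is an assumption taken from the `conda create` line above (other versions may also work). `torch.__version__` and `torch.cuda.is_available()` are standard PyTorch attributes.

```shell
# Hypothetical helper: succeed if a version string such as "3.10.12" is Python 3.10.x.
is_py310() {
    case "$1" in
        3.10|3.10.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the active interpreter without aborting if something is missing.
if command -v python >/dev/null 2>&1; then
    ver="$(python -c 'import platform; print(platform.python_version())')"
    if is_py310 "$ver"; then
        echo "Python $ver: OK"
    else
        echo "Python $ver: expected 3.10.x (is the fish-speech environment active?)"
    fi
    # Print the installed torch version and whether CUDA is visible.
    python -c 'import torch; print("torch", torch.__version__, "CUDA:", torch.cuda.is_available())' \
        || echo "torch is not importable in this environment"
else
    echo "python not found on PATH; activate the fish-speech environment first"
fi
```

If the CUDA check reports `False` on a machine with an NVIDIA GPU, the installed torch build is likely CPU-only and may need to be reinstalled with a CUDA-enabled wheel.
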

- Added `lora` fine-tuning support.
- Added `gradient checkpointing`, `causal sampling`, and `flash-attn` support.
- Updated the `text2semantic` model, supporting phoneme-free mode.