Documentation is under construction, and English is not fully supported yet.

This codebase is released under the BSD-3-Clause license, and all models are released under the CC-BY-NC-SA-4.0 license. Please refer to LICENSE for more details.

We take no responsibility for any illegal use of the codebase. Please refer to your local laws regarding the DMCA and other related regulations.
We strongly recommend that Windows users run the codebase under WSL2 or Docker.
```bash
# Basic environment setup
conda create -n fish-speech python=3.10
conda activate fish-speech
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

# Install flash-attn (Linux only)
pip3 install ninja && MAX_JOBS=4 pip3 install flash-attn --no-build-isolation

# Install fish-speech
pip3 install -e .
```
Download the required vqgan and text2semantic models from our Hugging Face repo. Note that the download URLs must use `resolve/main` (not `blob/main` or `raw/main`), otherwise you will get an HTML page or an LFS pointer file instead of the checkpoint:

```bash
mkdir -p checkpoints
wget https://huggingface.co/fishaudio/speech-lm-v1/resolve/main/vqgan-v1.pth -O checkpoints/vqgan-v1.pth
wget https://huggingface.co/fishaudio/speech-lm-v1/resolve/main/text2semantic-400m-v0.1-4k.pth -O checkpoints/text2semantic-400m-v0.1-4k.pth
```
Generate semantic tokens from text:

```bash
python tools/llama/generate.py \
    --text "Hello" \
    --num-samples 2 \
    --compile
```
You may want to use `--compile` to fuse CUDA kernels for faster inference (~25 tokens/sec -> ~300 tokens/sec).
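To put those numbers in perspective, here is a small back-of-the-envelope helper. The token rates are the ones quoted above; the function itself is purely illustrative and not part of the repository:

```python
def generation_time(num_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to autoregressively generate `num_tokens` tokens
    at a steady throughput of `tokens_per_sec`."""
    return num_tokens / tokens_per_sec

# Rates quoted above: ~25 tok/s without --compile, ~300 tok/s with it.
eager = generation_time(1000, 25)      # 40.0 s
compiled = generation_time(1000, 300)  # ~3.33 s
print(f"eager: {eager:.1f}s, compiled: {compiled:.1f}s, "
      f"speedup: {eager / compiled:.0f}x")
```

Note that kernel compilation itself takes time, so the first few generations after enabling `--compile` will be slower before the speedup kicks in.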
Generate vocals from semantic tokens:

```bash
python tools/vqgan/inference.py -i codes_0.npy
```
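The `codes_0.npy` file produced by the previous step is a plain NumPy array of semantic token ids, so it can be inspected before vocoding. A quick sketch, using a hypothetical stand-in array (the real dtype and shape depend on the model):

```python
import numpy as np

# Hypothetical stand-in for a generated codes_0.npy: a 1-D array of token ids.
codes = np.array([12, 873, 45, 991, 7], dtype=np.int64)
np.save("codes_0.npy", codes)

# Load it back and check dtype/shape before handing the file to the vocoder.
loaded = np.load("codes_0.npy")
print(loaded.dtype, loaded.shape)  # int64 (5,)
```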
Since loading and shuffling the dataset is very slow and memory-consuming, we use a Rust server to load and shuffle it. The server is based on gRPC and can be built with:
```bash
cd data_server
cargo build --release
```