Server/fish-speech @ cccad3e098d97bf1cdcaefc6d149de091d4893da

Requirements

GPU Memory: 12GB (Inference)
System: Linux, WSL
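
As a quick sanity check against the memory requirement above, the following sketch asks `nvidia-smi` for each GPU's total memory and compares it to a threshold. The helper name `vram_ok` and the 12000 MiB default are assumptions made for illustration (cards sold as 12 GB often report slightly under 12288 MiB); it only applies to NVIDIA GPUs.

```shell
# Hypothetical helper: succeeds when the reported VRAM (in MiB) meets a threshold.
# The 12000 MiB default is an assumed approximation of the 12GB requirement.
vram_ok() {
    total_mib="$1"
    required_mib="${2:-12000}"
    [ "$total_mib" -ge "$required_mib" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU: total memory in MiB, without header or units.
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
    while read -r mib; do
        if vram_ok "$mib"; then
            echo "GPU with ${mib} MiB: meets the inference requirement"
        else
            echo "GPU with ${mib} MiB: below the 12GB inference requirement"
        fi
    done
else
    echo "nvidia-smi not found; cannot check GPU memory"
fi
```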

System Setup

First, install the system libraries for pyaudio and sox, which are used for audio processing.

apt install portaudio19-dev libsox-dev ffmpeg

Conda

conda create -n fish-speech python=3.12
conda activate fish-speech

# Select the correct cuda version for your system from [cu126, cu128, cu129]
pip install -e .[cu129]
# Or for cpu only
pip install -e .[cpu]
# You can also omit the extra if you want to use the default torch index
pip install -e .
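
If you are unsure which extra to pick, the sketch below reads the CUDA version reported in the `nvidia-smi` header (the highest version the installed driver supports) and maps it to the closest extra that does not exceed it. The helper `pick_cuda_extra` is hypothetical, and falling back to `cpu` when the driver is older than CUDA 12.6 is an assumption, not project guidance.

```shell
# Hypothetical helper: maps a CUDA version such as "12.8" to one of the extras
# listed above (cu126, cu128, cu129). Prints "cpu" when none fits (assumed fallback).
pick_cuda_extra() {
    major="${1%%.*}"
    case "$1" in
        *.*) rest="${1#*.}"; minor="${rest%%.*}" ;;
        *)   minor=0 ;;
    esac
    # Encode "12.8" as 1208 so versions compare as integers.
    ver=$((major * 100 + minor))
    if [ "$ver" -ge 1209 ]; then echo cu129
    elif [ "$ver" -ge 1208 ]; then echo cu128
    elif [ "$ver" -ge 1206 ]; then echo cu126
    else echo cpu
    fi
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # The nvidia-smi header contains a field like "CUDA Version: 12.9".
    cuda_ver="$(nvidia-smi | sed -n 's/.*CUDA Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1)"
    if [ -n "$cuda_ver" ]; then
        echo "Suggested command: pip install -e .[$(pick_cuda_extra "$cuda_ver")]"
    fi
fi
```

The script only prints a suggestion; it does not install anything, so you can review the chosen extra before running `pip`.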

UV

# Select the correct cuda version for your system from [cu126, cu128, cu129]
uv sync --python 3.12 --extra cu129
# Or for cpu only
uv sync --python 3.12 --extra cpu

Intel Arc XPU support

conda create -n fish-speech python=3.12
conda activate fish-speech

conda install libstdcxx -c conda-forge

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu

pip install -e .
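
After any of the installation routes above, a check along these lines can confirm which PyTorch backend the active environment actually sees. This is a minimal sketch: the helper name `torch_backend` is hypothetical, and it assumes `python3` resolves to the interpreter of the `fish-speech` environment.

```shell
# Hypothetical helper: prints "cuda", "xpu", or "cpu" depending on what PyTorch
# can use, or "torch-missing" if python3 or torch cannot be loaded.
torch_backend() {
    python3 - <<'EOF' 2>/dev/null || echo "torch-missing"
import torch

if torch.cuda.is_available():
    print("cuda")
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    print("xpu")
else:
    print("cpu")
EOF
}

torch_backend
```

On an Intel Arc system the expected result would be `xpu`; if it prints `cpu`, the nightly XPU wheels were probably not picked up.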

!!! warning

The `compile` option is not supported on Windows and macOS. If you want to run with compile, you need to install Triton yourself.

Docker Setup

See the inference documentation to use Docker for the WebUI or the API server.
