## Requirements - GPU Memory: 12GB (Inference) - System: Linux, WSL ## System Setup First you need install pyaudio and sox, which is used for audio processing. ``` bash apt install portaudio19-dev libsox-dev ffmpeg ``` ### Conda ```bash conda create -n fish-speech python=3.12 conda activate fish-speech # Select the correct cuda version for your system from [cu126, cu128, cu129] pip install -e .[cu129] # Or for cpu only pip install -e .[cpu] # You can also omit the extra if you want to use the default torch index pip install -e . ``` ### UV ```bash # Select the correct cuda version for your system from [cu126, cu128, cu129] uv sync --python 3.12 --extra cu129 # Or for cpu only uv sync --python 3.12 --extra cpu ``` ### Intel Arc XPU support ```bash conda create -n fish-speech python=3.12 conda activate fish-speech conda install libstdcxx -c conda-forge pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu pip install -e . ``` !!! warning The `compile` option is not supported on windows and macOS, if you want to run with compile, you need to install trition by yourself. ## Docker Setup See [inference](./inference.md) to use docker for the webui or the API server.