Lengyue 1 year ago
parent
commit
543072ee92
4 changed files with 177 additions and 11 deletions
  1. docs/en/index.md (+57 -2)
  2. docs/ja/index.md (+58 -2)
  3. docs/pt/index.md (+58 -3)
  4. docs/zh/index.md (+4 -4)

+ 57 - 2
docs/en/index.md

@@ -105,10 +105,65 @@ pip3 install torch torchvision torchaudio
 # Install fish-speech
 pip3 install -e .[stable]
 
-# (Ubuntu / Debian User) Install sox
-apt install libsox-dev
+# (Ubuntu / Debian User) Install sox + ffmpeg
+apt install libsox-dev ffmpeg
 ```
 
+## Docker Setup
+
+1. Install NVIDIA Container Toolkit:
+
+    To use a GPU for model training and inference inside Docker, you need to install the NVIDIA Container Toolkit.
+
+    For Ubuntu users:
+
+    ```bash
+    # Add repository
+    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
+        && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
+            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+    # Install nvidia-container-toolkit
+    sudo apt-get update
+    sudo apt-get install -y nvidia-container-toolkit
+    # Restart Docker service
+    sudo systemctl restart docker
+    ```
+
+    On other Linux distributions, follow the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
+
+2. Pull and run the fish-speech image
+
+    ```shell
+    # Pull the image
+    docker pull fishaudio/fish-speech:latest-dev
+    # Run the image
+    docker run -it \
+        --name fish-speech \
+        --gpus all \
+        -p 7860:7860 \
+        fishaudio/fish-speech:latest-dev \
+        zsh
+    # To use a different host port, change the -p parameter to YourPort:7860
+    ```
+
+3. Download model dependencies
+
+    Make sure you are in the terminal inside the Docker container, then download the required `vqgan` and `llama` models from our Hugging Face repository.
+
+    ```bash
+    huggingface-cli download fishaudio/fish-speech-1.4 --local-dir checkpoints/fish-speech-1.4
+    ```
+
+4. Configure environment variables and access WebUI
+
+    In the terminal inside the Docker container, run `export GRADIO_SERVER_NAME="0.0.0.0"` so that the Gradio service can be reached from outside the container.
+    Then, in the same terminal, run `python tools/webui.py` to start the WebUI service.
+
+    If you're using WSL or macOS, visit [http://localhost:7860](http://localhost:7860) to open the WebUI.
+
+    If it is deployed on a remote server, replace `localhost` with the server's IP address.
+
 ## Changelog
 
 - 2024/09/10: Updated Fish-Speech to 1.4 version, with an increase in dataset size and a change in the quantizer's n_groups from 4 to 8.
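
Review note on the Docker steps added above: the `-p YourPort:7860` mapping and the "replace localhost with your server's IP" instruction can be sketched as two small helpers. This is only a dry-run illustration, not part of the commit. The function names `port_mapping` and `webui_url` are hypothetical, and `192.0.2.10` is a documentation-only example address. Nothing is started; the script just prints the command and URL a reader would use.

```shell
#!/usr/bin/env bash
# Dry-run helpers (hypothetical names): they only print strings.

CONTAINER_PORT=7860   # port the WebUI listens on inside the container, per the docs

# Build the value for `docker run -p`; the host port defaults to 7860.
port_mapping() {
    local host_port="${1:-$CONTAINER_PORT}"
    printf '%s:%s\n' "$host_port" "$CONTAINER_PORT"
}

# Build the URL to open in a browser; the host defaults to localhost.
webui_url() {
    local host="${1:-localhost}"
    local host_port="${2:-$CONTAINER_PORT}"
    printf 'http://%s:%s\n' "$host" "$host_port"
}

# Example: publish the WebUI on host port 8080 of a server at 192.0.2.10
# (example address, an assumption for illustration only).
echo "docker run -it --name fish-speech --gpus all -p $(port_mapping 8080) fishaudio/fish-speech:latest-dev zsh"
echo "Open: $(webui_url 192.0.2.10 8080)"
```

Only the host side of the mapping changes. The container side stays at 7860, so the URL must use the host port chosen in `-p`.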

+ 58 - 2
docs/ja/index.md

@@ -101,12 +101,68 @@ pip3 install torch torchvision torchaudio
 # fish-speechをインストールします。
 pip3 install -e .[stable]
 
-# (Ubuntu / Debianユーザー) soxをインストールします。
-apt install libsox-dev
+# (Ubuntu / Debianユーザー) sox + ffmpegをインストールします。
+apt install libsox-dev ffmpeg
 ```
 
+## Docker セットアップ
+
+1. NVIDIA Container Toolkit のインストール:
+
+    Docker で GPU を使用してモデルのトレーニングと推論を行うには、NVIDIA Container Toolkit をインストールする必要があります:
+
+    Ubuntu ユーザーの場合:
+
+    ```bash
+    # リポジトリの追加
+    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
+        && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
+            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+    # nvidia-container-toolkit のインストール
+    sudo apt-get update
+    sudo apt-get install -y nvidia-container-toolkit
+    # Docker サービスの再起動
+    sudo systemctl restart docker
+    ```
+
+    他の Linux ディストリビューションを使用している場合は、以下のインストールガイドを参照してください:[NVIDIA Container Toolkit Install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)。
+
+2. fish-speech イメージのプルと実行
+
+    ```shell
+    # イメージのプル
+    docker pull fishaudio/fish-speech:latest-dev
+    # イメージの実行
+    docker run -it \
+        --name fish-speech \
+        --gpus all \
+        -p 7860:7860 \
+        fishaudio/fish-speech:latest-dev \
+        zsh
+    # 他のポートを使用する場合は、-p パラメータを YourPort:7860 に変更してください
+    ```
+
+3. モデルの依存関係のダウンロード
+
+    Docker コンテナ内のターミナルにいることを確認し、huggingface リポジトリから必要な `vqgan` と `llama` モデルをダウンロードします。
+
+    ```bash
+    huggingface-cli download fishaudio/fish-speech-1.4 --local-dir checkpoints/fish-speech-1.4
+    ```
+
+4. 環境変数の設定と WebUI へのアクセス
+
+    Docker コンテナ内のターミナルで、`export GRADIO_SERVER_NAME="0.0.0.0"` と入力して、外部から Docker 内の gradio サービスにアクセスできるようにします。
+    次に、Docker コンテナ内のターミナルで `python tools/webui.py` と入力して WebUI サービスを起動します。
+
+    WSL または macOS の場合は、[http://localhost:7860](http://localhost:7860) にアクセスして WebUI インターフェースを開くことができます。
+
+    サーバーにデプロイしている場合は、localhost をサーバーの IP に置き換えてください。
+
 ## 変更履歴
 
+- 2024/09/10: Fish-Speech を Ver.1.4 に更新し、データセットのサイズを増加させ、quantizer n_groups を 4 から 8 に変更しました。
 - 2024/07/02: Fish-Speech を Ver.1.2 に更新し、VITS デコーダーを削除し、ゼロショット能力を大幅に強化しました。
 - 2024/05/10: Fish-Speech を Ver.1.1 に更新し、VITS デコーダーを実装して WER を減少させ、音色の類似性を向上させました。
 - 2024/04/22: Fish-Speech Ver.1.0 を完成させ、VQGAN および LLAMA モデルを大幅に修正しました。

+ 58 - 3
docs/pt/index.md

@@ -104,12 +104,67 @@ pip3 install torch torchvision torchaudio
 # Instale o fish-speech
 pip3 install -e .[stable]
 
-# Para os Usuário do Ubuntu / Debian: Instale o sox
-apt install libsox-dev
+# Para os Usuários do Ubuntu / Debian: Instale o sox + ffmpeg
+apt install libsox-dev ffmpeg
 ```
 
-## Histórico de Alterações
+## Configuração do Docker
+
+1. Instale o NVIDIA Container Toolkit:
+
+    Para usar a GPU com Docker para treinamento e inferência de modelos, você precisa instalar o NVIDIA Container Toolkit:
+
+    Para usuários Ubuntu:
+
+    ```bash
+    # Adicione o repositório remoto
+    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
+        && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
+            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+    # Instale o nvidia-container-toolkit
+    sudo apt-get update
+    sudo apt-get install -y nvidia-container-toolkit
+    # Reinicie o serviço Docker
+    sudo systemctl restart docker
+    ```
+
+    Para usuários de outras distribuições Linux, consulte o guia de instalação: [NVIDIA Container Toolkit Install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
+
+2. Baixe e execute a imagem fish-speech
+
+    ```shell
+    # Baixe a imagem
+    docker pull fishaudio/fish-speech:latest-dev
+    # Execute a imagem
+    docker run -it \
+        --name fish-speech \
+        --gpus all \
+        -p 7860:7860 \
+        fishaudio/fish-speech:latest-dev \
+        zsh
+    # Se precisar usar outra porta, modifique o parâmetro -p para YourPort:7860
+    ```
 
+3. Baixe as dependências do modelo
+
+    Certifique-se de estar no terminal do contêiner Docker e, em seguida, baixe os modelos necessários `vqgan` e `llama` do nosso repositório HuggingFace.
+
+    ```bash
+    huggingface-cli download fishaudio/fish-speech-1.4 --local-dir checkpoints/fish-speech-1.4
+    ```
+
+4. Configure as variáveis de ambiente e acesse a WebUI
+
+    No terminal do contêiner Docker, digite `export GRADIO_SERVER_NAME="0.0.0.0"` para permitir o acesso externo ao serviço gradio dentro do Docker.
+    Em seguida, no terminal do contêiner Docker, digite `python tools/webui.py` para iniciar o serviço WebUI.
+
+    Se estiver usando WSL ou macOS, acesse [http://localhost:7860](http://localhost:7860) para abrir a interface WebUI.
+
+    Se estiver implantando em um servidor, substitua localhost pelo IP do seu servidor.
+
+## Histórico de Alterações
+- 10/09/2024: Fish-Speech atualizado para a versão 1.4, aumentado o tamanho do conjunto de dados, quantizer n_groups 4 -> 8.
 - 02/07/2024: Fish-Speech atualizado para a versão 1.2, removido o Decodificador VITS e aprimorado consideravelmente a capacidade de zero-shot.
 - 10/05/2024: Fish-Speech atualizado para a versão 1.1, implementado o decodificador VITS para reduzir a WER e melhorar a similaridade de timbre.
 - 22/04/2024: Finalizada a versão 1.0 do Fish-Speech, modificados significativamente os modelos VQGAN e LLAMA.

+ 4 - 4
docs/zh/index.md

@@ -100,8 +100,8 @@ pip3 install torch torchvision torchaudio
 # 安装 fish-speech
 pip3 install -e .[stable]
 
-# (Ubuntu / Debian 用户) 安装 sox
-apt install libsox-dev
+# (Ubuntu / Debian 用户) 安装 sox + ffmpeg
+apt install libsox-dev ffmpeg
 ```
 
 ## Docker 配置
@@ -133,13 +133,13 @@ apt install libsox-dev
 
     ```shell
     # 拉取镜像
-    docker pull fishaudio/fish-speech
+    docker pull fishaudio/fish-speech:latest-dev
     # 运行镜像
     docker run -it \
         --name fish-speech \
         --gpus all \
         -p 7860:7860 \
-        fishaudio/fish-speech \
+        fishaudio/fish-speech:latest-dev \
         zsh
     # 如果需要使用其他端口,请修改 -p 参数为 YourPort:7860
     ```