{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Command-Line Inference" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### For Windows" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "bat" } }, "outputs": [], "source": [ "!chcp 65001" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### For Linux" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import locale\n", "locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## API Client\n", "\n", "The API server must already be running in a terminal.\n", "\n", "> The reference audio must be a local file path.\n", "\n", "> The reference text can be either a file path or the text itself." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "!python -m tools.post_api \\\n", " --text \"Hello everyone, I am an open-source text-to-speech model developed by Fish Audio.\" \\\n", " --reference_audio \"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.wav\" \\\n", " --reference_text \"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.lab\" \\\n", " --streaming True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## For Testing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 0. Download the model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use %env so the mirror endpoint persists for later `!` commands;\n", "# `!set` / `!export` only affect a throwaway subshell.\n", "%env HF_ENDPOINT=https://hf-mirror.com\n", "!huggingface-cli download fishaudio/fish-speech-1.2 --local-dir checkpoints/fish-speech-1.2/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Generate a prompt from audio:\n", "> If you plan to let the model pick a random voice, you can skip this step.\n", "\n", "You should get a `fake.npy` file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "# Enter the path to your audio file here:\n", "src_audio = r\"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.wav\"\n", "\n", "!python tools/vqgan/inference.py \\\n", " -i \"{src_audio}\" \\\n", " --checkpoint-path \"checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth\"\n", "\n", "from IPython.display import Audio, display\n", "audio = Audio(filename=\"fake.wav\")\n", "display(audio)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Generate semantic tokens from text:\n", "> This command creates `codes_N` files in the working directory, where N is an integer starting from 0.\n", "\n", "> You can use `--compile` to fuse CUDA kernels for faster inference." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "!python tools/llama/generate.py \\\n", " --text \"人间灯火倒映湖中,她的渴望让静水泛起涟漪。若代价只是孤独,那就让这份愿望肆意流淌。流入她所注视的世间,也流入她如湖水般澄澈的目光。\" \\\n", " --prompt-text \"唷,找本堂主有何贵干呀?嗯?你不知道吗,往生堂第七十七代堂主就是胡桃我啦!嘶,不过瞧你的模样,容光焕发,身体健康,嗯…想必是为了工作以外的事来找我,对吧?\" \\\n", " --prompt-tokens \"fake.npy\" \\\n", " --checkpoint-path \"checkpoints/fish-speech-1.2\" \\\n", " --num-samples 2\n", " # --compile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Generate speech from semantic tokens:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "!python tools/vqgan/inference.py \\\n", " -i \"codes_0.npy\" \\\n", " --checkpoint-path \"checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth\"\n", "\n", "from IPython.display import Audio, display\n", "audio = Audio(filename=\"fake.wav\")\n", "display(audio)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 2 }