|
|
@@ -0,0 +1,77 @@
|
|
|
+# Start Agent
|
|
|
+
|
|
|
+## Requirements
|
|
|
+
|
|
|
+- GPU memory: At least 8GB(under quanization), 16GB or more is recommanded.
|
|
|
+- Disk usage: 10GB
|
|
|
+
|
|
|
+## Download Model
|
|
|
+
|
|
|
+You can get the model by:
|
|
|
+
|
|
|
+```bash
|
|
|
+huggingface-cli download fishaudio/fish-agent-v0.1-3b --local-dir checkpoints/fish-agent-v0.1-3b
|
|
|
+```
|
|
|
+
|
|
|
+Put them in the 'checkpoints' folder.
|
|
|
+
|
|
|
+You also need the fish-speech model which you can download instructed by [inference](inference.md).
|
|
|
+
|
|
|
+So there will be 2 folder in the checkpoints.
|
|
|
+
|
|
|
+The `checkpoints/fish-speech-1.4` and `checkpoints/fish-agent-v0.1-3b`
|
|
|
+
|
|
|
+## Environment Prepare
|
|
|
+
|
|
|
+If you already have Fish-speech, you can directly use by adding the follow instruction:
|
|
|
+```bash
|
|
|
+pip install cachetools
|
|
|
+```
|
|
|
+
|
|
|
+!!! note
|
|
|
+ Please use the Python version below 3.12 for compile.
|
|
|
+
|
|
|
+If you don't have, please use the below commands to build yout environment:
|
|
|
+
|
|
|
+```bash
|
|
|
+sudo apt-get install portaudio19-dev
|
|
|
+
|
|
|
+pip install -e .[stable]
|
|
|
+```
|
|
|
+
|
|
|
+## Launch The Agent Demo.
|
|
|
+
|
|
|
+To build fish-agent, please use the command below under the main folder:
|
|
|
+
|
|
|
+```bash
|
|
|
+python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
|
|
|
+```
|
|
|
+
|
|
|
+The `--compile` args only support Python < 3.12 , which will greatly speed up the token generation.
|
|
|
+
|
|
|
+It won't compile at once (remember).
|
|
|
+
|
|
|
+Then open another terminal and use the command:
|
|
|
+
|
|
|
+```bash
|
|
|
+python -m tools.e2e_webui
|
|
|
+```
|
|
|
+
|
|
|
+This will create a Gradio WebUI on the device.
|
|
|
+
|
|
|
+When you first use the model, it will come to compile (if the `--compile` is True) for a short time, so please wait with patience.
|
|
|
+
|
|
|
+## Gradio Webui
|
|
|
+<p align="center">
|
|
|
+ <img src="../assets/figs/agent_gradio.png" width="75%">
|
|
|
+</p>
|
|
|
+
|
|
|
+Have a good time!
|
|
|
+
|
|
|
+## Performance
|
|
|
+
|
|
|
+Under our test, a 4060 laptop just barely runs, but is very stretched, which is only about 8 tokens/s. The 4090 is around 95 tokens/s under compile, which is what we recommend.
|
|
|
+
|
|
|
+# About Agent
|
|
|
+
|
|
|
+The demo is an early alpha test version, the inference speed needs to be optimised, and there are a lot of bugs waiting to be fixed. If you've found a bug or want to fix it, we'd be very happy to receive an issue or a pull request.
|