spicysama 9f9f0a1fb3 Agent (#648)		1 year ago
.github	ecaa69e7fc Update docs (#626)	1 year ago
docs	8f481e64d8 feat: Added KR pages & Documentations (#607)	1 year ago
fish_speech	9f9f0a1fb3 Agent (#648)	1 year ago
tools	9f9f0a1fb3 Agent (#648)	1 year ago
.dockerignore	e413df7145 perf: Optimizing docker builds (#547)	1 year ago
.gitignore	9f9f0a1fb3 Agent (#648)	1 year ago
.pre-commit-config.yaml	97625fb8e7 [pre-commit.ci] pre-commit autoupdate (#599)	1 year ago
.project-root	5707699dfd Handle adaptive number of codebooks	2 years ago
.readthedocs.yaml	fe293ca492 Use readthedocs instead of github action	2 years ago
API_FLAGS.txt	9f9f0a1fb3 Agent (#648)	1 year ago
LICENSE	b91815e074 Switch to CC-BY-NC-SA 4.0 license	2 years ago
README.md	8f481e64d8 feat: Added KR pages & Documentations (#607)	1 year ago
docker-compose.dev.yml	f6c56c68d4 Update docker-compose.dev.yml	1 year ago
dockerfile	23fa4d7e38 Fix dockerfile for `pyaudio` (#623)	1 year ago
dockerfile.dev	23fa4d7e38 Fix dockerfile for `pyaudio` (#623)	1 year ago
entrypoint.sh	e413df7145 perf: Optimizing docker builds (#547)	1 year ago
inference.ipynb	dad516d86d update checkpoint path	1 year ago
install_env.bat	f15d9f23a9 feat: enable more workers in `api.py` (#621)	1 year ago
mkdocs.yml	4f097ef2f4 remove ghcr & update docker registry	1 year ago
pyproject.toml	e37a445f51 Fix backend (#627)	1 year ago
pyrightconfig.json	6d57066e52 Update pre-commit hook	2 years ago
run_cmd.bat	8702c61100 From whisper to sensevoice (#482)	1 year ago
start.bat	46440f25be A few small changes to the scripts (#414)	1 year ago

Fish Speech

**English** | [简体中文](docs/README.zh.md) | [Portuguese](docs/README.pt-BR.md) | [日本語](docs/README.ja.md) | [한국어](docs/README.ko.md)

This codebase and all models are released under the CC-BY-NC-SA-4.0 license. Please refer to LICENSE for details.

Features

Zero-shot & Few-shot TTS: Input a 10 to 30-second vocal sample to generate high-quality TTS output. For detailed guidelines, see Voice Cloning Best Practices.
Multilingual & Cross-lingual Support: Copy and paste multilingual text into the input box; there is no need to specify the language. Currently supports English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish.
No Phoneme Dependency: The model has strong generalization capabilities and does not rely on phonemes for TTS. It can handle text in any language script.
Highly Accurate: Achieves a low CER (Character Error Rate) and WER (Word Error Rate) of around 2% for 5-minute English texts.
Fast: With fish-tech acceleration, the real-time factor is approximately 1:5 on an NVIDIA RTX 4060 laptop and 1:15 on an NVIDIA RTX 4090.
WebUI Inference: Features an easy-to-use, Gradio-based web UI compatible with Chrome, Firefox, Edge, and other browsers.
GUI Inference: Offers a PyQt6 graphical interface that works seamlessly with the API server. Supports Linux, Windows, and macOS. See GUI.
Deploy-Friendly: Easily set up an inference server with native support for Linux, Windows, and macOS, minimizing speed loss.
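
The accuracy and speed figures quoted in the feature list can be made concrete with a short sketch. The code below is not part of Fish Speech; the function names (`edit_distance`, `wer`, `cer`, `realtime_ratio`) are hypothetical helpers written for illustration. It assumes the usual definitions: WER and CER are the Levenshtein edit distance between a reference transcript and a recognized transcript, divided by the reference length in words or characters. It also assumes that a ratio such as 1:5 means one second of processing yields about five seconds of audio.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are individual characters."""
    return error_rate(list(reference), list(hypothesis))


def realtime_ratio(processing_seconds, audio_seconds):
    """Seconds of audio produced per second of processing (assumed reading of 1:N)."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(wer("the quick brown fox", "the quick brown box"))  # 0.25
    # Illustrative numbers only: 2 s of compute for 10 s of audio is a 1:5 ratio.
    print(realtime_ratio(2.0, 10.0))  # 5.0
```

In practice the hypothesis transcript would come from running a speech recognizer over the synthesized audio, and text normalization (case, punctuation) before scoring affects the result noticeably.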