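Preface

Back in May, my kid wanted our AI agent to talk in the voice of the singer Zhou Shen, so I set up a machine and built a local Spark-TTS install. Most of the effort went into matching an old CUDA version with PyTorch: I downloaded and installed different versions and drivers several times, and it took a few evenings before I settled on a combination that works.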

I. Install Conda

Skip this step if Conda is already installed.

  • Download Miniconda and install it.
  • Make sure to check “Add Conda to PATH” during installation.
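
After installation, one way to confirm that the "Add Conda to PATH" option took effect is to ask Python where the `conda` executable lives. This is a minimal sketch using only the standard library; the helper name `find_tool` is made up for illustration and is not part of Conda or Spark-TTS.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    # git is only needed if you pick the Git options later in this guide.
    for tool in ("conda", "git"):
        path = find_tool(tool)
        print(f"{tool}: {path if path else 'not found on PATH'}")
```

If `conda` shows up as not found, open a new Command Prompt first: PATH changes made by an installer usually only apply to terminals started afterwards.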

Download Spark-TTS

You have two options to get the files:

Option 1 (Recommended for Windows): Download ZIP manually

  • Go to Spark-TTS GitHub
  • Click “Code” > “Download ZIP”, then extract it.

Option 2: Use Git (Optional)

II. Create the Conda Environment

Open Command Prompt (cmd) and run:

conda create -n sparktts python=3.12 -y
conda activate sparktts

This creates and activates a Python 3.12 environment for Spark-TTS.
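
To double-check that the right environment is active before installing anything, a small script can compare the running interpreter against what this guide expects. The sketch below assumes the environment name `sparktts` and Python 3.12 from the commands above, and relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets; the function name `check_env` is hypothetical.

```python
import os
import sys


def check_env(expected_env="sparktts", expected_python=(3, 12)):
    """Return a list of problems; an empty list means the environment looks right."""
    problems = []
    active = os.environ.get("CONDA_DEFAULT_ENV")  # set by `conda activate`
    if active != expected_env:
        problems.append(f"active conda env is {active!r}, expected {expected_env!r}")
    if tuple(sys.version_info[:2]) != tuple(expected_python):
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, expected_python))
        problems.append(f"Python is {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    issues = check_env()
    print("Environment OK" if not issues else "\n".join(issues))
```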


III. Install Dependencies

Inside the Spark-TTS folder (whether from ZIP or Git), run:

pip install -r requirements.txt


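IV. Install PyTorch

Note: the command below is meant to pick the right PyTorch build for your CUDA or CPU setup automatically. The URL it passes to --index-url, however, is a documentation page rather than a pip package index, so pip finds no packages there. If it fails, use one of the download.pytorch.org/whl/... indexes from the commented lines instead.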

pip install torch torchvision torchaudio --index-url https://pytorch.org/get-started/previous-versions/

# OR Manually install a specific CUDA version (if needed)
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118  # Older GPUs

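On older machines with an older CUDA install, this step can fail to find a usable torch build. Here pip reports "from versions: none", meaning it found no torchvision packages at all at the given index URL: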

(sparktts) I:\AI-src\Spark-TTS>pip install torch torchvision torchaudio --index-url https://pytorch.org/get-started/previous-versions/
Looking in indexes: https://pytorch.org/get-started/previous-versions/
Requirement already satisfied: torch in d:\anaconda3\envs\sparktts\lib\site-packages (2.5.1)
ERROR: Could not find a version that satisfies the requirement torchvision (from versions: none)
ERROR: No matching distribution found for torchvision

Fix:

First check the locally installed CUDA version:

(sparktts) I:\AI-src\Spark-TTS>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

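Note: the CUDA runtime on this machine was originally version 10 and could later be upgraded as far as 12.1. With CUDA 12.1, however, pip kept installing the +cpu build of PyTorch, which has no CUDA support, so I settled on CUDA 11.8.

Reinstall with the index-url that matches CUDA 11.8: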

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
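
The step from "what does nvcc report" to "which index-url do I pass" can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, and the lookup table deliberately contains just the two wheel indexes named in this article (cu118 and cu121) instead of guessing URLs for other CUDA releases.

```python
import re

# Only the two wheel indexes mentioned in this article; other CUDA
# releases are not guessed and simply yield None.
KNOWN_INDEXES = {
    "11.8": "https://download.pytorch.org/whl/cu118",
    "12.1": "https://download.pytorch.org/whl/cu121",
}


def cuda_release(nvcc_output):
    """Extract the 'release X.Y' version from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def pip_command(nvcc_output):
    """Build the pip command for a known CUDA release, or return None."""
    index = KNOWN_INDEXES.get(cuda_release(nvcc_output))
    if index is None:
        return None
    return f"pip install torch torchvision torchaudio --index-url {index}"


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.8, V11.8.89"
    print(pip_command(sample))
```

Returning None for an unknown release is intentional: it is safer to stop and look up the right index by hand than to construct a URL that may not exist.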

Verify:

(sparktts) I:\AI-src\Spark-TTS>python ver.py
PyTorch version: 2.6.0+cu118
TorchVision version: 0.21.0+cu118
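
The suffix after the plus sign is what tells a CUDA build (+cu118) apart from a CPU-only build (+cpu). As a rough sketch, the check can be automated as below; `build_flavor` and `is_cuda_build` are illustrative names, and the script falls back gracefully when torch is not installed. A build with no suffix at all is reported as untagged, since the suffix alone cannot classify it.

```python
def build_flavor(version):
    """Split '2.6.0+cu118' into ('2.6.0', 'cu118'); the tag is None if absent."""
    base, _, tag = version.partition("+")
    return base, (tag or None)


def is_cuda_build(version):
    """True when the version tag names a CUDA build such as cu118."""
    _, tag = build_flavor(version)
    return tag is not None and tag.startswith("cu")


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        kind = "CUDA build" if is_cuda_build(torch.__version__) else "CPU or untagged build"
        print(f"torch {torch.__version__} ({kind})")
        # The runtime check is the final word: it is False on +cpu builds.
        print("CUDA available:", torch.cuda.is_available())
```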

V. Download the Spark-TTS Model

There are two ways to get the model files. Pick one:

Option 1 (Recommended): Using Python
Create a new file in the Spark-TTS folder called download_model.py, paste this inside, and run it:

from huggingface_hub import snapshot_download
import os

# Set download path
model_dir = "pretrained_models/Spark-TTS-0.5B"

# Check if model already exists
if os.path.exists(model_dir) and len(os.listdir(model_dir)) > 0:
    print("Model files already exist. Skipping download.")
else:
    print("Downloading model files...")
    snapshot_download(
        repo_id="SparkAudio/Spark-TTS-0.5B",
        local_dir=model_dir,
        resume_download=True  # Resumes partial downloads
    )
    print("Download complete!")

Run it with:

python download_model.py

✅ Option 2: Using Git (If You Installed It)

mkdir pretrained_models
git clone https://huggingface.co/SparkAudio/Spark-TTS-0.5B pretrained_models/Spark-TTS-0.5B

Either method works—choose whichever is easier for you.

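I recommend using git directly. Run the clone from the Spark-TTS folder: in the log below it was started from inside pretrained_models, so the model lands one directory deeper than the path used in the command. The final "Filtering content" line is Git LFS fetching the large model files.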

(sparktts) I:\AI-src\Spark-TTS\pretrained_models>git clone https://huggingface.co/SparkAudio/Spark-TTS-0.5B pretrained_models/Spark-TTS-0.5B
Cloning into 'pretrained_models/Spark-TTS-0.5B'...
remote: Enumerating objects: 80, done.
remote: Counting objects: 100% (76/76), done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 80 (delta 21), reused 0 (delta 0), pack-reused 4 (from 1)
Unpacking objects: 100% (80/80), 3.63 MiB | 1.43 MiB/s, done.
Updating files: 100% (31/31), done.
Filtering content: 100% (4/4), 3.66 GiB | 8.44 MiB/s, done.

VI. Run Spark-TTS

For an interactive browser-based interface, run:

python webui.py

This launches a local web server where you can enter text and generate speech or clone a voice.

(sparktts) I:\AI-src\Spark-TTS>python webui.py
D:\anaconda3\envs\sparktts\Lib\site-packages\torch\nn\utils\weight_norm.py:143: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
Missing tensor: mel_transformer.spectrogram.window
Missing tensor: mel_transformer.mel_scale.fb
* Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
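
The log shows the server bound to 0.0.0.0:7860, which means it listens on all network interfaces; from the same machine you would normally open http://127.0.0.1:7860 in the browser. If the page does not load, a quick way to tell "server not running" from "browser problem" is to test the port directly. This is a small standard-library sketch; the function name `is_listening` is made up, and the port is the one from the log above.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "up" if is_listening() else "not reachable"
    print(f"Web UI on port 7860 is {state}")
```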

VII. Debugging and Troubleshooting

🔎 Before Asking for Help
Many common issues are already covered in existing discussions, documentation, or online resources. Please:

  • Search GitHub issues first 🕵️‍♂️
  • Check the documentation 📖
  • Google or use AI tools (ChatGPT, DeepSeek, etc.)

If you still need help, please explain what you’ve already tried so we can assist you better!


Now you’re good to go! 🚀🔥

Happy TTS-ing.

VIII. Possible Causes When webui.py Fails to Start

torch and torchvision version mismatch

The error output looks like this:

(sparktts) I:\AI-src\Spark-TTS>python webui.py
Traceback (most recent call last):
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\utils\import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\sparktts\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 999, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\models\wav2vec2\modeling_wav2vec2.py", line 40, in <module>
    from ...modeling_utils import PreTrainedModel
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\modeling_utils.py", line 48, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\loss\loss_utils.py", line 19, in <module>
    from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\loss\loss_deformable_detr.py", line 4, in <module>
    from ..image_transforms import center_to_corners_format
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\image_transforms.py", line 22, in <module>
    from .image_utils import (
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\image_utils.py", line 58, in <module>
    from torchvision.transforms import InterpolationMode
ImportError: cannot import name 'InterpolationMode' from 'torchvision.transforms' (D:\anaconda3\envs\sparktts\Lib\site-packages\torchvision\transforms\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "I:\AI-src\Spark-TTS\webui.py", line 23, in <module>
    from cli.SparkTTS import SparkTTS
  File "I:\AI-src\Spark-TTS\cli\SparkTTS.py", line 23, in <module>
    from sparktts.models.audio_tokenizer import BiCodecTokenizer
  File "I:\AI-src\Spark-TTS\sparktts\models\audio_tokenizer.py", line 22, in <module>
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\utils\import_utils.py", line 1767, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\utils\import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\sparktts\Lib\site-packages\transformers\utils\import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.wav2vec2.modeling_wav2vec2 because of the following error (look up to see its traceback):
cannot import name 'InterpolationMode' from 'torchvision.transforms' (D:\anaconda3\envs\sparktts\Lib\site-packages\torchvision\transforms\__init__.py)

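To check whether the torch and torchvision versions match, follow these steps:

1. Check the installed torch and torchvision versions

Run the following code in the Python environment (saved here as ver.py) to print the installed torch and torchvision versions: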

import torch
import torchvision

print("PyTorch version:", torch.__version__)
print("TorchVision version:", torchvision.__version__)

Output:

(sparktts) I:\AI-src\Spark-TTS>python ver.py
PyTorch version: 2.5.1+cpu
TorchVision version: 0.2.0

2. Check whether the versions match

Each PyTorch release series normally has one matching torchvision series. Common pairings:

PyTorch version    TorchVision version
2.6.x              0.21.x
2.5.x              0.20.x
2.4.x              0.19.x
2.3.x              0.18.x
2.2.x              0.17.x
2.1.x              0.16.x
2.0.x              0.15.x
1.13.x             0.14.x
1.12.x             0.13.x
1.11.x             0.12.x
1.10.x             0.11.x
1.9.x              0.10.x
1.8.x              0.9.x
1.7.x              0.8.x
1.6.x              0.7.x
1.5.x              0.6.x
1.4.x              0.5.x
1.3.x              0.4.x
1.2.x              0.4.x
1.1.x              0.3.x
1.0.x              0.2.x

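If your torch and torchvision versions do not match, you can run into compatibility problems such as the ImportError shown above, where torch 2.5.1 was paired with torchvision 0.2.0.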
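
The table lookup can also be done in code. The sketch below hard-codes a handful of the pairs from the table above and compares only the major.minor series, ignoring patch numbers and build tags such as +cpu. The names `COMPATIBLE`, `minor_series`, and `versions_match` are illustrative, not part of torch or torchvision.

```python
# A subset of the pairs from the table above:
# torch minor series -> torchvision minor series.
COMPATIBLE = {
    "2.5": "0.20", "2.4": "0.19", "2.3": "0.18", "2.2": "0.17",
    "2.1": "0.16", "2.0": "0.15", "1.13": "0.14", "1.12": "0.13",
}


def minor_series(version):
    """Reduce '2.5.1+cpu' to '2.5' (drop the build tag and the patch number)."""
    base = version.split("+", 1)[0]
    return ".".join(base.split(".")[:2])


def versions_match(torch_version, torchvision_version):
    """Return True or False, or None when the torch series is not listed."""
    expected = COMPATIBLE.get(minor_series(torch_version))
    if expected is None:
        return None
    return minor_series(torchvision_version) == expected


if __name__ == "__main__":
    # The broken pair from the error log, then the repaired pair.
    print(versions_match("2.5.1+cpu", "0.2.0"))       # False
    print(versions_match("2.5.1+cpu", "0.20.1+cpu"))  # True
```

The series are compared as strings on purpose: "0.2" and "0.20" are different torchvision series, which is exactly the mix-up in the log.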

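3. Adjust the versions so they match

If you find a mismatch, you can fix it in one of the following ways:

Method 1: Upgrade or downgrade torchvision

Install the torchvision version that corresponds to your torch version. For example:

pip install torchvision==0.20.1  # matches torch 2.5.x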
(sparktts) I:\AI-src\Spark-TTS>pip install torchvision==0.20.1
Collecting torchvision==0.20.1
  Downloading torchvision-0.20.1-cp312-cp312-win_amd64.whl.metadata (6.2 kB)
Requirement already satisfied: numpy in d:\anaconda3\envs\sparktts\lib\site-packages (from torchvision==0.20.1) (2.2.3)
Requirement already satisfied: torch==2.5.1 in d:\anaconda3\envs\sparktts\lib\site-packages (from torchvision==0.20.1) (2.5.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in d:\anaconda3\envs\sparktts\lib\site-packages (from torchvision==0.20.1) (11.1.0)
Requirement already satisfied: filelock in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (3.17.0)
Requirement already satisfied: typing-extensions>=4.8.0 in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (4.12.2)
Requirement already satisfied: networkx in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (3.4.2)
Requirement already satisfied: jinja2 in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (3.1.6)
Requirement already satisfied: fsspec in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (2025.3.0)
Requirement already satisfied: setuptools in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (75.8.0)
Requirement already satisfied: sympy==1.13.1 in d:\anaconda3\envs\sparktts\lib\site-packages (from torch==2.5.1->torchvision==0.20.1) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in d:\anaconda3\envs\sparktts\lib\site-packages (from sympy==1.13.1->torch==2.5.1->torchvision==0.20.1) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in d:\anaconda3\envs\sparktts\lib\site-packages (from jinja2->torch==2.5.1->torchvision==0.20.1) (2.1.5)
Downloading torchvision-0.20.1-cp312-cp312-win_amd64.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 53.9 kB/s eta 0:00:00
Installing collected packages: torchvision
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.2.0
    Uninstalling torchvision-0.2.0:
      Successfully uninstalled torchvision-0.2.0
Successfully installed torchvision-0.20.1

Method 2: Upgrade or downgrade torch

If you would rather keep a specific torchvision version, adjust the torch version instead. For example:

pip install torch==2.4.1 torchvision==0.19.1

Method 3: Manage versions with Conda

If you use Conda, you can install a matching pair with:

conda install pytorch==2.4.1 torchvision==0.19.1 -c pytorch

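4. Verify the installation

After adjusting the versions, run the version check again to confirm that they match and that no compatibility problems remain: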

(sparktts) I:\AI-src\Spark-TTS>python ver.py
PyTorch version: 2.5.1+cpu
TorchVision version: 0.20.1+cpu

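If the versions match but you still hit problems, try the following:

  • Clear cached packages left over from old versions:
    pip cache purge
    
  • Reinstall the dependencies in a brand-new virtual environment:
    conda create -n new_env python=3.9
    conda activate new_env
    pip install torch torchvision
    

Following the steps above keeps the torch and torchvision versions matched and avoids these compatibility problems.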

IX. CUDA Runtime Version Mismatch

PS C:\Users\jm> nvidia-smi
Mon Mar 10 22:12:53 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 538.78                 Driver Version: 538.78       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce MX150         WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0              N/A / ERR! |      0MiB /  2048MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
PS C:\Users\jm> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:04_Central_Daylight_Time_2018
Cuda compilation tools, release 10.0, V10.0.130
PS C:\Users\jm>

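The driver supports CUDA up to 12.2 (the "CUDA Version" field in the nvidia-smi header), but the installed CUDA toolkit reported by nvcc is 10.0, so a newer toolkit is needed. Here I installed 12.2.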
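
The comparison between the two tools can be sketched in Python by pulling the version fields out of their text output. The function names are made up for illustration, and the parsing assumes the output formats shown in the logs above ("CUDA Version: X.Y" from nvidia-smi, "release X.Y" from nvcc).

```python
import re


def driver_cuda_version(nvidia_smi_output):
    """Read the 'CUDA Version: X.Y' field from the nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_cuda_version(nvcc_output):
    """Read the 'release X.Y' field from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def compare(nvidia_smi_output, nvcc_output):
    """Describe how the installed toolkit relates to the driver's maximum."""
    driver = driver_cuda_version(nvidia_smi_output)
    toolkit = toolkit_cuda_version(nvcc_output)
    if driver is None or toolkit is None:
        return "could not parse one of the outputs"
    d, t = f"{driver[0]}.{driver[1]}", f"{toolkit[0]}.{toolkit[1]}"
    if toolkit > driver:
        return f"toolkit {t} is newer than the driver maximum {d}"
    if toolkit < driver:
        return f"toolkit {t} is older than the driver maximum {d}"
    return f"toolkit and driver maximum are both {d}"


if __name__ == "__main__":
    smi = "| NVIDIA-SMI 538.78   Driver Version: 538.78   CUDA Version: 12.2 |"
    nvcc = "Cuda compilation tools, release 10.0, V10.0.130"
    print(compare(smi, nvcc))
```

Versions are compared as integer tuples rather than strings so that, for example, 10.0 correctly sorts above 9.2.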

CUDA Toolkit Archive | NVIDIA Developer

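Find the matching version there, download it, and install it locally. During installation the system rebooted and the install failed, so I ran it a second time; the whole process took about an hour.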

C:\Users\jm>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:42:34_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

Note: depending on the machine, some setups can use the newer 12.1 while others can only use the older 11.8.


9ong@TsingChan. This article was produced with AI assistance.