Fix the bug where non-English speech always has 'Speak' prepended to each sentence (#652)
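To make the change concrete, here is a minimal sketch of the prompt string before and after this commit. The helper `build_prompt` and its `with_prefix` flag are hypothetical names used only for illustration; they are not part of `tools/llama/generate.py`, which builds the string inline in `encode_tokens`.

```python
# Hypothetical helper for illustration only; not part of the repository.
def build_prompt(text: str, with_prefix: bool) -> str:
    # Before this commit the user turn began with the literal "Speak: ";
    # after it, the user turn contains only the cleaned input text.
    prefix = "Speak: " if with_prefix else ""
    return f"<|im_start|>user\n{prefix}{text}<|im_end|><|im_start|>assistant\n"


old_prompt = build_prompt("你好", with_prefix=True)
new_prompt = build_prompt("你好", with_prefix=False)

print(repr(old_prompt))
print(repr(new_prompt))
```

According to the commit message, the English word in the old template ended up at the start of non-English output; the new template leaves only the user's text between the chat markers.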

cocktailpeanut 1 year ago
Parent
Current commit
ec2c5b70fb
1 file changed, with 1 insertion and 1 deletion
+ 1 - 1
tools/llama/generate.py

@@ -602,7 +602,7 @@ def encode_tokens(
     num_codebooks=4,
 ):
     string = clean_text(string)
-    string = f"<|im_start|>user\nSpeak: {string}<|im_end|><|im_start|>assistant\n"
+    string = f"<|im_start|>user\n{string}<|im_end|><|im_start|>assistant\n"
 
     new_tokens = tokenizer.encode(
         string,