# 🎯 RC-Competition-DeepSeek-R1-Distill-Llama-70B-bnb-4bit-SFTOnly-ChatML-CoT-v2-20251211_0619
## 📋 Model Description

This model is trained in the **ChatML format** and uses `DataCollatorForCompletionOnlyLM` for precise loss masking.
## 🆕 v2.2 Improvements

- ✅ Uses a string-mode response template to avoid token-boundary issues
- ✅ The user turn contributes no loss at all
- ✅ Only the assistant turn (reasoning process + answer) is learned — see the sketch after this list
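A minimal sketch of how this masking is typically wired up with TRL. This is an illustration rather than the published training script, and the exact response-template string is an assumption:

```python
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

# Assumed setup: the ChatML assistant marker is passed as a plain string,
# so the collator tokenizes it and assigns label -100 to everything before
# it. Only tokens after "<|im_start|>assistant\n" contribute to the loss.
tokenizer = AutoTokenizer.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit"
)
collator = DataCollatorForCompletionOnlyLM(
    response_template="<|im_start|>assistant\n",  # hypothetical exact string
    tokenizer=tokenizer,
)
```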
## 🎭 Training Format

```
<|im_start|>user
[passage text]          ← ❌ no loss
<|im_end|>
<|im_start|>assistant
<think>
[reasoning process]     ← ✅ loss computed
</think>
答案: X                  ← ✅ loss computed ("答案" = "Answer")
<|im_end|>
```
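For concreteness, here is an illustrative (not the author's) way to render one training example into this layout, assuming a minimal ChatML Jinja template is attached to the tokenizer:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit"
)
# Minimal ChatML template (an assumption: the stock DeepSeek chat template
# differs, so the training pipeline would have swapped in something like this).
tokenizer.chat_template = (
    "{% for message in messages %}"
    "<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)
messages = [
    {"role": "user", "content": "[passage text]"},
    {"role": "assistant", "content": "<think>\n[reasoning process]\n</think>\n答案: 2"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False))
```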
## 📊 Training Parameters

| Parameter | Value |
|---|---|
| Base Model | unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit |
| LoRA Rank | 256 |
| LoRA Alpha | 256 |
| Learning Rate | 2e-05 |
| Effective Batch Size | 32 |
| Epochs | 1 |
| Max Sequence Length | 4096 |
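For reference, the table above maps onto an Unsloth LoRA configuration roughly like the following. The target modules are an assumption (the card does not list them), shown here with the common choice for Llama-style models:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=256,           # LoRA Rank from the table
    lora_alpha=256,  # LoRA Alpha from the table
    # Assumed target modules; the card does not specify them.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```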
## 💻 Usage

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    "kunhsiang/RC-Competition-DeepSeek-R1-Distill-Llama-70B-bnb-4bit-SFTOnly-ChatML-CoT-v2-20251211_0619",
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# Prompt layout matches training: passage, question, then numbered options
# ("文章內容" = passage, "問題" = question, "選項" = options)
messages = [{"role": "user", "content": "[文章內容]\n問題: ...\n選項: 1. ... 2. ... 3. ... 4. ..."}]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
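The model emits its reasoning inside `<think>...</think>` followed by a `答案: X` line, so the final choice can be pulled out with a little post-processing. A minimal sketch (the regex and variable names are illustrative):

```python
import re

text = tokenizer.decode(outputs[0], skip_special_tokens=False)
# Drop the reasoning block, then match the trained "答案: X" answer line.
completion = text.split("</think>")[-1]
match = re.search(r"答案[::]\s*([1-4])", completion)
print(match.group(1) if match else "no answer found")
```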
## 📅 Version Info

- Training date: 20251211_0619
- Training framework: Unsloth + TRL