Add pipeline tag and GitHub link

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -1,11 +1,12 @@
  ---
- library_name: transformers
  base_model: princeton-nlp/Llama-3-Base-8B-SFT
+ datasets:
+ - HuggingFaceH4/ultrafeedback_binarized
+ library_name: transformers
  tags:
  - alignment-handbook
  - generated_from_trainer
- datasets:
- - HuggingFaceH4/ultrafeedback_binarized
+ pipeline_tag: text-generation
  model-index:
  - name: llama-3-8b-dpo-ultrafeedback-decrease_linear-1.0to0.95
  results: []
@@ -16,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->

  # llama-3-8b-dpo-ultrafeedback-decrease_linear-1.0to0.95

- This is a model released from the preprint: [DPO-Shift: Shifting the Distribution of Direct Preference Optimization](https://arxiv.org/abs/2502.07599). Please refer to our [repository](https://github.com/Meaquadddd/DPO-Shift) for more details.
+ This model is released from the preprint: [DPO-Shift: Shifting the Distribution of Direct Preference Optimization](https://arxiv.org/abs/2502.07599). For more details, please refer to our [repository](https://github.com/Meaquadddd/DPO-Shift).


  This model is a fine-tuned version of [princeton-nlp/Llama-3-Base-8B-SFT](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT) on the HuggingFaceH4/ultrafeedback_binarized dataset.
@@ -83,4 +84,4 @@ The following hyperparameters were used during training:
  - Transformers 4.44.2
  - Pytorch 2.4.0+cu121
  - Datasets 2.21.0
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1
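
As a quick sanity check on the metadata this change produces, the front matter can be read back and the new fields inspected. The following is a minimal sketch, not part of the PR: `parse_front_matter` is a hypothetical helper written for illustration, it is not a full YAML parser, and the `README` string is reconstructed from the diff above (the indentation of `results: []` is an assumption).

```python
# Minimal sketch: read top-level keys from a model card's YAML front matter.
# Hypothetical helper, not a full YAML parser. It only handles the flat
# "key: value" and "key:" followed by "- item" shapes seen in this README.

# Front matter as it would look after this PR (reconstructed from the diff;
# the indentation of "results: []" is assumed).
README = """---
base_model: princeton-nlp/Llama-3-Base-8B-SFT
datasets:
- HuggingFaceH4/ultrafeedback_binarized
library_name: transformers
tags:
- alignment-handbook
- generated_from_trainer
pipeline_tag: text-generation
model-index:
- name: llama-3-8b-dpo-ultrafeedback-decrease_linear-1.0to0.95
  results: []
---
"""


def parse_front_matter(readme_text):
    """Return a dict of top-level front-matter keys (strings or lists of strings)."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing fence of the front matter
        if line.startswith("- ") and current_key is not None:
            # List item under the most recent top-level key; nested
            # mappings are kept as raw strings rather than parsed.
            if isinstance(meta[current_key], list):
                meta[current_key].append(line[2:].strip())
        elif line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip()
            meta[current_key] = value if value else []
        # Indented lines (nested content) are skipped in this sketch.
    return meta


meta = parse_front_matter(README)
print(meta["pipeline_tag"], meta["library_name"], meta["datasets"])
```

For real model cards, a proper YAML parser is the safer choice; the hand-rolled scan above only keeps the example free of third-party dependencies.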