---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: question-answering
library_name: transformers
---


<div align="center">
  <b style="font-size: 40px;">Gen-8B-R2</b>
</div>

Note: We are still working on this.

Are you looking for a more robust and reliable generation model for your RAG system?

Gen-8B-R2 is a generation model that effectively mitigates hallucinations caused by retrieval noise and information overload.

See the details in our [paper](https://arxiv.org/pdf/2503.04789).


### What is Gen-8B-R2?

This model is a variant of Ext2Gen-8B-R2 that disables the sentence-extraction step and generates the answer directly from the chunk list.

See the details of Ext2Gen-8B-R2 at https://huggingface.co/DISLab/Ext2Gen-8B-R2.

### Recommended Prompt

- `query`: the query to answer
- `chunk_list`: the list of retrieved chunks, e.g., ["chunk 1", "chunk 2", "chunk 3"]

```python

from transformers import AutoTokenizer

# Load the tokenizer so the chat template can be applied.
# "DISLab/Gen-8B-R2" is assumed here; replace it with the model id you actually use.
tokenizer = AutoTokenizer.from_pretrained("DISLab/Gen-8B-R2")

def prepare_sample_text(prompt):
    # Wrap the prompt as a single user turn and render it with the model's chat template.
    row_json = [{"role": "user", "content": prompt}]
    return tokenizer.apply_chat_template(row_json, tokenize=False)

def format_prompt_template(query, chunk_list):
    # Prefix each chunk with a 1-based chunk ID, then join the chunks with blank lines.
    chunk_list = ['[Chunk ID: ' + str(idx+1) + '] ' + chunk_text for idx, chunk_text in enumerate(chunk_list)]
    chunk_list = '\n\n'.join(chunk_list)

    prompt = '''
You are an expert assistant trained to generate answers based on document chunks.


### Generation Instruction:
- Answer to the Query based on the given Chunk List.


### Query: 
%s


### Chunk List:
%s


### Output:
''' % (query, chunk_list)
    
    return prompt.strip()


# `query` is the question to answer; `noisy_chunks` is the retrieved chunk list (see the example below).
prompt = format_prompt_template(query, noisy_chunks)
prompt = prepare_sample_text(prompt)
```
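For concreteness, here is a minimal end-to-end sketch of how the two helpers above can be called. The query and chunk texts are made-up placeholders (loosely echoing the example output shown further below), not data from the paper.

```python
# Hypothetical inputs for illustration only.
query = "How many people were killed at the Chelmno extermination camp?"
noisy_chunks = [
    "Chelmno was the first Nazi extermination camp, operating from 1941 to 1945.",
    "Estimates of the number of victims range from 150,000 to 300,000, mainly Jews.",
    "The camp memorial can be visited near the village of Chelmno nad Nerem.",  # distractor chunk
]

prompt = format_prompt_template(query, noisy_chunks)
prompt = prepare_sample_text(prompt)
print(prompt)  # the chat-formatted string fed to the model
```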


Note that, since the extraction step is disabled in this variant, the prompt above yields only the final answer to the query (no extracted sentences).

The output follows a consistent format, as shown in the example below.

```
The estimated number of deaths at Chelmno is 150-300,000, mainly Jews.
```

### Recommended Generation Parameters

```python
max_new_tokens=1024, # or 2048 
do_sample=True,
temperature=0.8,
top_p=0.9,
```