Retrieval In Decoder benefits generative models for explainable complex question answering
Neural Networks, ISSN: 0893-6080, Vol: 181, Article: 106833
2025
Metrics Details
- Captures: 16
- Readers: 16
Article Description
Large-scale Language Models (LLMs) using Chain-of-Thought prompting demonstrate exceptional performance across a variety of tasks. However, factual hallucination remains a significant challenge in practical applications. Prevailing retrieval-augmented methods treat the retriever and generator as separate components and, through intensive supervised training, inadvertently restrict the generator’s capabilities to those of the retriever. In this work, we propose RID, an unsupervised Retrieval In Decoder framework for multi-granularity decoding that integrates retrieval directly into the decoding process of generative models. It dynamically adjusts decoding granularity based on retrieval outcomes and corrects the decoding direction through retrieval’s direct impact on the next token. Moreover, we introduce a reinforcement-learning-driven knowledge distillation method for adaptive explanation generation, so that the framework also applies to Small-scale Language Models (SLMs). Experimental results on six public benchmarks show that RID surpasses popular LLMs and existing retrieval-augmented methods, demonstrating its effectiveness for models of different scales and verifying its applicability and scalability.
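The abstract describes the mechanism only at a high level, but its core idea, consulting a retriever at each decoding step and letting the evidence act on the next-token distribution, can be illustrated compactly. The sketch below is a hypothetical reconstruction, not the paper's implementation: `retrieve`, `bias_logits`, and the uncertainty heuristic used to vary retrieval granularity are all illustrative assumptions.

```python
# Hypothetical sketch of retrieval-in-decoding. None of these names come
# from the paper; the retriever, the logit-biasing rule, and the
# uncertainty heuristic are illustrative stand-ins.
import math
from collections import Counter

def retrieve(prefix_tokens, corpus, k=2):
    """Toy retriever: rank passages by token overlap with the decoded prefix."""
    ranked = sorted(corpus,
                    key=lambda p: len(set(prefix_tokens) & set(p.split())),
                    reverse=True)
    return ranked[:k]

def bias_logits(logits, evidence, alpha=2.0):
    """Shift next-token logits toward tokens attested in retrieved evidence."""
    counts = Counter(tok for passage in evidence for tok in passage.split())
    return {tok: score + alpha * math.log1p(counts[tok])
            for tok, score in logits.items()}

def decode(next_logits, corpus, max_len=8):
    """Greedy decoding that consults the retriever at every step.

    Granularity is adjusted dynamically: when the model's own distribution
    is flat (small gap between the top two logits), more evidence is pulled
    in; when it is peaked, retrieval is kept minimal.
    """
    out = []
    for _ in range(max_len):
        logits = next_logits(out)                    # token -> score
        top2 = sorted(logits.values(), reverse=True)[:2]
        uncertain = len(top2) < 2 or top2[0] - top2[1] < 1.0
        evidence = retrieve(out, corpus, k=3 if uncertain else 1)
        logits = bias_logits(logits, evidence)       # correct decoding direction
        out.append(max(logits, key=logits.get))
    return out
```

The design point the abstract emphasizes is that evidence acts on the next-token distribution during decoding, steering the generator directly rather than gating it through a separately supervised retriever; the reinforcement-learning-driven distillation step for SLMs is not sketched here.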
Bibliographic Details
- http://www.sciencedirect.com/science/article/pii/S0893608024007573
- http://dx.doi.org/10.1016/j.neunet.2024.106833
- http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85208195300&origin=inward
- http://www.ncbi.nlm.nih.gov/pubmed/39509813
- https://linkinghub.elsevier.com/retrieve/pii/S0893608024007573
Elsevier BV