Hallucination-resistant multimodal content generation through knowledge graph-based reinforcement learning

Liang Zeng, Xinyi Lin, Shanping Yu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal large models exhibit remarkable capabilities in understanding and generating content by integrating diverse types of data, including text and images. In practical applications, however, they face significant challenges related to hallucination, where generated content may be inaccurate or misleading. To address these concerns, this study introduces a chain-of-thought framework for trusted content generation based on knowledge graph reinforcement learning that effectively mitigates hallucinations. The framework incorporates a chain-of-thought mechanism to strengthen model reasoning, thereby improving interpretability. By leveraging an external structured knowledge graph, it optimizes the trajectory of the generated content, ensuring that outputs are informed by reliable contextual information. Furthermore, reinforcement learning techniques bolster the credibility of the generated responses. Experimental evaluations on the VQA-RAD and SLAKE datasets demonstrate that this approach achieves significant improvements in medical visual question answering tasks. The framework not only elevates the quality of content generation but also enhances the interpretability of the model.

Original language: English
Article number: 103783
Journal: Information Fusion
Volume: 127
DOIs
Publication status: Published - Mar 2026

Keywords

  • Chain of thought
  • Generative artificial intelligence
  • Hallucination
  • Knowledge graph
  • Reinforcement learning
