HLINC: HaLlucination Inference via Neurosymbolic Computation


Overview

HLINC is a modular neurosymbolic approach for detecting and explaining hallucinations in knowledge-grounded LLM conversations. We evaluate HLINC on the HaluEval Hallucination Evaluation Benchmark datasets (user queries paired with ChatGPT responses that may contain hallucinations) and measure its ability to detect and explain those hallucinations.

  • Stage 1 uses ChatGPT as a Semantic Parser, converting knowledge-grounded questions and answers into Microsoft's Z3 Theorem Prover syntax.
  • Stage 2 runs all of the converted code through the Z3 Theorem Prover; any syntax error raised by the logical solver is passed back to the Semantic Parser (ChatGPT) together with the error message as added context, so the code can be regenerated.
  • Stage 3 runs the (repaired) Z3 code through the Theorem Prover to detect and explain the hallucinations, as sketched below.
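
To make the pipeline concrete, here is a minimal sketch (Python, using the z3-solver bindings) of the kind of program Stage 1 emits and Stage 3 checks. The facts, variable names, and the example claim are illustrative assumptions, not code from this repository; the actual generated programs are in the data/ files listed under Results. The idea: facts from the knowledge passage are added as tracked assertions, the answer's claim is asserted on top of them, and an unsat result both flags the hallucination and yields an unsat core naming the conflicting facts, which serves as the explanation.

```python
# Hypothetical sketch of a Stage 1 output checked in Stage 3
# (illustrative only; not the repository's generated code).
# Requires: pip install z3-solver
from z3 import Solver, Bool, Implies, Not, unsat

solver = Solver()

# Facts parsed from the knowledge passage (names are made up for illustration).
released_2010 = Bool('inception_released_2010')
released_2012 = Bool('inception_released_2012')

# Track each assertion so an unsat core can name the conflicting facts.
solver.assert_and_track(released_2010, 'knowledge_release_year_2010')
solver.assert_and_track(Implies(released_2010, Not(released_2012)),
                        'knowledge_single_release_year')

# Claim extracted from the LLM answer ("Inception was released in 2012").
solver.assert_and_track(released_2012, 'answer_claim_release_year_2012')

if solver.check() == unsat:
    # The answer contradicts the knowledge: flag the hallucination and use
    # the unsat core as the explanation of which assertions conflict.
    print('Hallucination detected')
    print('Conflicting assertions:', solver.unsat_core())
else:
    print('Answer is consistent with the grounding knowledge')
```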

Results

Dataset                        | Approach | Correctly Detected Hallucinations | Explainability
HaluEval Dialogue w/ Knowledge | HLINC    | 8610/10000 (86.10%)               | YES
HaluEval Dialogue w/ Knowledge | ChatGPT  | 9110/10000 (91.10%)               | NO

HaluEval Dialogue w/ Knowledge

Stage 1: data/stage-1-dialogue.txt

Stage 2: data/stage-2-dialogue.txt

  • Syntax Errors Detected: 423/10000 (4.23%)
  • Syntax Errors Fixed: 265/423

Stage 3:

  • Correctly Detected Hallucinations: 8610/10000 (86.10%)

Dataset                   | Approach | Correctly Detected Hallucinations | Explainability
HaluEval Q/A w/ Knowledge | HLINC    | 7149/10000 (71.49%)               | YES
HaluEval Q/A w/ Knowledge | ChatGPT  | 7800/10000 (78.00%)               | NO

HaluEval Q/A w/ Knowledge

Stage 1: data/stage-1-qa.txt

Stage 2: data/stage-2-qa.txt

  • Syntax Errors Detected: 795/10000 (7.95%)
  • Syntax Errors Fixed: 489/795

Stage 3:

  • Correctly Detected Hallucinations: 7149/10000 (71.49%)

Files

Z3 Semantic Parser: notebooks/z3_semantic_parser.ipynb

Z3 Logical Solver: notebooks/z3_theorem_prover.ipynb


Appendix

Example of a Correct Answer with no syntax errors

Example of a Hallucinated Answer with a syntax error
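
For the syntax-error case above, the Stage 2 repair step can be pictured as a small retry loop. The sketch below is an illustrative assumption, not the repository's implementation: ask_semantic_parser stands in for the ChatGPT Semantic Parser call, and the generated programs are assumed to be z3py snippets that can be executed directly.

```python
# Hypothetical sketch of the Stage 2 syntax-error feedback loop
# (illustrative only; `ask_semantic_parser` is a stand-in for the ChatGPT call).
def run_with_repair(z3_code: str, ask_semantic_parser, max_retries: int = 1) -> str:
    """Execute a generated z3py program; on failure, ask the parser to fix it."""
    for _ in range(max_retries + 1):
        try:
            exec(z3_code, {})      # run the generated Z3 program
            return z3_code         # no syntax/runtime error: keep it as-is
        except Exception as err:
            # Pass the broken program back with the error message as added
            # context so the Semantic Parser can regenerate corrected code.
            z3_code = ask_semantic_parser(
                "The following Z3 program failed with the error "
                f"`{err}`. Please return a corrected program:\n{z3_code}"
            )
    return z3_code  # best effort once the retry budget is exhausted
```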

Acknowledgements

Thanks to "LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers" for the inspiration to this work!

@inproceedings{OGLZ_LINC_2023,
	author={Theo X. Olausson* and Alex Gu* and Ben Lipkin* and Cedegao E. Zhang* and Armando Solar-Lezama and Joshua B. Tenenbaum and Roger P. Levy},
	title={LINC: A neuro-symbolic approach for logical reasoning by combining language models with first-order logic provers},
	year={2023},
	booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing},
}

Thanks to "HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models" for the datasets used in this work!

@misc{HaluEval,
  author = {Junyi Li and Xiaoxue Cheng and Wayne Xin Zhao and Jian-Yun Nie and Ji-Rong Wen},
  title = {HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models},
  year = {2023},
  journal={arXiv preprint arXiv:2305.11747},
  url={https://arxiv.org/abs/2305.11747}
}

Reference

@misc{HLINC,
  author = {Hayden Moore},
  title = {HLINC: A Neurosymbolic Approach for Detecting and Explaining LLM Hallucinations in Knowledge-Grounded Contexts},
  year = {2025},
}
