A Systematic Framework for Enhancing Retrieval-Augmented Generation for Tabular Data
Vol. 14, No. 6, pp. 400-416, Jun. 2025
https://doi.org/10.3745/TKIPS.2025.14.6.400
Abstract
This study proposes a novel framework that combines structural data processing, user query classification, and a self-feedback
loop to optimize the performance of Retrieval-Augmented Generation (RAG) models on tabular data. While RAG models
excel at complex data processing and question answering by integrating retrieval-based and generative capabilities, specific design
strategies tailored to the unique characteristics of tabular data have not been sufficiently explored. To address this gap, this study presents
a detailed design framework focused on improving the key components of RAG models in tabular data processing. The proposed framework
systematically enhances the efficiency and accuracy of RAG models by refining their ability to structure and interpret tabular datasets.
Furthermore, an empirical analysis was conducted to determine the optimal combination of generative models, embedding models, and
the number of retrieved documents for effective tabular data processing. To support this study, a new QA dataset was constructed to
better evaluate RAG models on tabular data tasks. Experimental results demonstrate that the structural approach explicitly preserves the
relationships and context within the data, while the user query classification strategy contributes to maximizing the efficiency of the
RAG process. Additionally, the self-feedback loop enhances the quality of generated responses through iterative evaluation and refinement,
effectively mitigating hallucination issues and ensuring reliable, high-quality responses even for complex queries. By integrating these
optimization strategies, this study refines the design direction for RAG models tailored to tabular data and provides practical insights
into their deployment. This work expands the applicability of RAG models specialized in tabular data processing and enhances their
potential across various application domains. These findings provide a foundational resource for RAG model design and performance
optimization, offering valuable guidance for addressing practical challenges in tabular data processing and advancing future AI-driven
data analysis systems.
Cite this article
[IEEE Style]
E. Lee, Y. Lee, and H. Bae, "A Systematic Framework for Enhancing Retrieval-Augmented Generation for Tabular Data," The Transactions of the Korea Information Processing Society, vol. 14, no. 6, pp. 400-416, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.6.400.
[ACM Style]
Eunbin Lee, Younghan Lee, and Ho Bae. 2025. A Systematic Framework for Enhancing Retrieval-Augmented Generation for Tabular Data. The Transactions of the Korea Information Processing Society, 14, 6, (2025), 400-416. DOI: https://doi.org/10.3745/TKIPS.2025.14.6.400.