Authors :
Prashant Kaushik
Volume/Issue :
Volume 9 - 2024, Issue 1 - January
Google Scholar :
http://tinyurl.com/2abeuc48
Scribd :
http://tinyurl.com/4mzhwxwe
DOI :
https://doi.org/10.5281/zenodo.10597748
Abstract :
The paper investigates the feasibility of
generative models for graph-to-text generation tasks,
particularly in a zero-shot setting where no fine-tuning
or additional training resources are used. The study
evaluates the performance of GPT-3 and ChatGPT on
graph-to-text datasets, comparing their results with
those of fine-tuned language models such as
T5 and BART. The findings reveal that generative
models, specifically GPT-3 and ChatGPT, can
produce fluent and coherent text, achieving notable
BLEU scores of 11.07 and 11.18 on the AGENDA and
WebNLG datasets, respectively, for longer texts. Despite
this success, error analysis highlights challenges for
real-world product use. In particular, generative models
struggle to understand semantic relations
among entities in context, leading to the generation of text
containing hallucinations or irrelevant information. As part of
the error analysis, the study employs BERT to detect
machine-generated text, achieving high
macro-F1 scores. The text generated by the generative
models is made publicly available,
contributing to the research community's understanding
of the capabilities and limitations of such models in the
context of graph-to-text generation tasks.
Keywords :
LLMs, Large Language Models, Generative Models, Graph to Text, Text Generation, BLEU, ROUGE.