
Automated Generation and Evaluation of Interactive-Fiction Serious Games with Open-Weight LLMs

Abstract

This work investigates whether open-weight large language models can automatically generate runnable and educationally faithful serious games in a constrained, text-only interactive-fiction (IF) setting. The target games are station-based single-player serious games for knowledge assessment, implemented as IF in a structured, machine-readable text format, and used here as a first step towards later ambient scenarios. A fully automated pipeline called SINE (Serious Interactive Narrative Engine) is evaluated with four prompting strategies, grammar-guided decoding, deterministic validation, and a repair agent. Across a staged evaluation with 240 seeds and increasing complexity, finalist configurations reach success rates between roughly 68% and 86% on the joint criterion of compilation, playability, and learning-goal fidelity. Repair iterations proved central to robustness, whereas grammar masking on top of reasoning prompts did not consistently improve outcomes. The study provides a reproducible benchmark setup, open artifacts, and a constrained generation pipeline as a basis for later extensions toward broader serious game scenarios.
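The generate, validate, and repair stages named in the abstract can be pictured as a bounded loop. The following is only a minimal sketch under stated assumptions, not SINE's actual implementation: the names `Verdict`, `generate_with_repair`, `max_repairs`, and the toy stand-in functions are all hypothetical, and the real pipeline would call an LLM and an IF compiler where the toys manipulate strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical result of deterministic validation."""
    compiles: bool
    playable: bool
    goals_covered: bool
    errors: List[str] = field(default_factory=list)

    @property
    def success(self) -> bool:
        # Joint criterion: compilation, playability, learning-goal fidelity.
        return self.compiles and self.playable and self.goals_covered


def generate_with_repair(
    generate: Callable[[], str],
    validate: Callable[[str], Verdict],
    repair: Callable[[str, List[str]], str],
    max_repairs: int = 3,
) -> Tuple[str, Verdict, int]:
    """Generate once, then repair until validation passes or the budget ends."""
    game = generate()
    verdict = validate(game)
    attempts = 0
    while not verdict.success and attempts < max_repairs:
        game = repair(game, verdict.errors)
        verdict = validate(game)
        attempts += 1
    return game, verdict, attempts


# Toy stand-ins (assumptions for illustration only, no LLM involved).
def toy_generate() -> str:
    return "station A"


def toy_validate(game: str) -> Verdict:
    errors = []
    if "END" not in game:
        errors.append("missing END")
    if "goal" not in game:
        errors.append("missing goal")
    has_end = "END" in game
    return Verdict(has_end, has_end, "goal" in game, errors)


def toy_repair(game: str, errors: List[str]) -> str:
    # Fixes only the first reported error per iteration.
    return game + (" END" if errors[0] == "missing END" else " goal")


if __name__ == "__main__":
    game, verdict, attempts = generate_with_repair(
        toy_generate, toy_validate, toy_repair
    )
    print(game, verdict.success, attempts)
```

With the toy functions, two repair iterations are needed before the joint criterion holds, which illustrates why a repair budget can matter for the overall success rate.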
Original language: English
Article number: 2932
Journal: Applied Sciences (Switzerland)
Volume: 16
Issue number: 6
ISSN: 2076-3417
DOIs
Publication status: Published - 18.03.2026

Funding

Funder: Stiftung Innovation in der Hochschullehre
Funder number: 1001-3214
