AI-Powered Personalized Storybook Generator – Using Multi-Modal Content Generation

15 Jun

Authors: Anushka Satav, Siddhi Shinde, Ashish Singh, Supriya Jagtap

Abstract: This paper presents an AI-driven system that creates and shares illustrated children’s storybooks through a single end-to-end multimodal workflow. Given a short user prompt and a chapter count, a large language model (LLM) produces a complete story: a title, a cover idea, and an ordered list of chapters, each with narrative text and a tailored illustration prompt. Generating the whole multi-chapter arc in a single pass keeps the narrative consistent from start to finish. An image is generated for each chapter from its scene description and displayed in an interactive reader. Text-to-speech (TTS) reads the story aloud, which makes it more accessible and engaging. The prototype does not yet carry a story over across sessions or synchronize the displayed text with the audio in real time. It also has no built-in safety checks, so age-appropriate content filtering is a priority for future work. Overall, the system demonstrates a streamlined approach to automating rich multimedia storytelling.
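The single-pass story structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the JSON layout, the field names, and the helpers `build_story_request` and `parse_story` are all assumptions made for illustration, and no specific LLM, image, or TTS service is called.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Chapter:
    text: str                  # narrative text shown in the reader
    illustration_prompt: str   # scene description passed to an image model


@dataclass
class Storybook:
    title: str
    cover_idea: str
    chapters: List[Chapter]


def build_story_request(user_prompt: str, chapter_count: int) -> str:
    """Compose one instruction that asks for the whole story at once.

    The wording and the JSON keys are hypothetical, not taken from the paper.
    """
    if chapter_count < 1:
        raise ValueError("chapter_count must be at least 1")
    return (
        f"Write a children's story about: {user_prompt}\n"
        "Return JSON with keys 'title', 'cover_idea', and 'chapters'. "
        f"'chapters' must hold exactly {chapter_count} objects, each with "
        "'text' and 'illustration_prompt'."
    )


def parse_story(raw_json: str, chapter_count: int) -> Storybook:
    """Parse the model's JSON reply and check the requested chapter count."""
    data = json.loads(raw_json)
    chapters = [
        Chapter(text=c["text"], illustration_prompt=c["illustration_prompt"])
        for c in data["chapters"]
    ]
    if len(chapters) != chapter_count:
        raise ValueError(
            f"expected {chapter_count} chapters, got {len(chapters)}"
        )
    return Storybook(
        title=data["title"],
        cover_idea=data["cover_idea"],
        chapters=chapters,
    )
```

Under these assumptions, each parsed chapter's `illustration_prompt` would feed the image step and its `text` would feed both the reader and the TTS step, so all three modalities derive from one generated structure.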

DOI: http://doi.org/