Storytelling AI for shadow-playing
Narratron is an interactive projector that augments hand shadow puppetry with AI-generated storytelling. Designed for all ages, it transforms traditional physical shadow plays into an immersive and phygital storytelling experience.
Narratron has received several awards, including Core77, DNA Paris, and MUSE. It was exhibited at the MIT Architecture Gallery in 2024.
This project was created in collaboration with Aria Bao, and we have submitted an affiliated paper to the Association for Computing Machinery (ACM). Read the full paper here
Founding Designer
Mar - May 2023
Role
Interaction Designer, Industrial Designer
Skills
Tools
Illustrator, Figma, Rhino, Next.js, Arduino
1
Design
Overview
A storytelling projector powered by gen AI
Narratron is an interactive projector that augments hand shadow puppetry with AI-generated storytelling. Designed for all ages, it transforms traditional physical shadow plays into an immersive and phygital storytelling experience.
My contribution
Designing a phygital interface
As one of the two owners of this project, I:
Led the ideation, prototyping, and testing of all product design crafts and interaction flows
Led the development of the backend infrastructure and the integration of LLMs through APIs
Led the writing of the affiliated research paper
2
Process
Behind the scenes
Motivation
Hand shadow puppetry is one of the oldest forms of storytelling, practiced across cultures. To enhance that experience with multimodal artificial intelligence, Narratron lets users interact with their hand shadows through AI-generated auditory and visual outputs of the story their hand shadows are telling.
Define user story
To determine the interaction pattern of this phygital interface, I sketched a user story to communicate the idea.
The user’s experience with Narratron begins with the startup screen, which creates a serene, focused ambiance and sets the stage for an immersive journey. As the user turns Narratron on, they are greeted by instructions preparing them for the experience ahead.

Once the startup screen fades away, the user is free to play with their hand shadows however they wish, experimenting with different shapes, sizes, and movements while Narratron’s camera captures the intricate hand shadow shapes they create.

The captured shapes are then analyzed by trained image classifiers integrated into Narratron, which interpret the hand shadows and translate them into animal keywords. These keywords serve as the foundation for the next step of the process: generating a complete story.
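The flow above can be sketched as a small state machine. The state and event names below are illustrative, not taken from the actual codebase:

```typescript
// States of a Narratron session (names are hypothetical).
type NarratronState = "startup" | "play" | "capture" | "classify" | "storytelling";

// Transition table: each state advances when its triggering event fires.
const transitions: Record<NarratronState, Partial<Record<string, NarratronState>>> = {
  startup: { instructionsDismissed: "play" },       // startup screen fades away
  play: { shutterTapped: "capture" },               // user taps the shutter
  capture: { frameGrabbed: "classify" },            // camera captures the shadow
  classify: { keywordFound: "storytelling" },       // classifier emits an animal keyword
  storytelling: { crankSpun: "play" },              // spin the crank to keep playing
};

function next(state: NarratronState, event: string): NarratronState {
  // Events that don't apply to the current state are simply ignored.
  return transitions[state][event] ?? state;
}
```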
Form factor and interaction
Inspired by the physical affordances of hand cranks and shutters, the original interfaces of movie projectors and cameras, Narratron's form factor takes a minimalist design approach intended to create a frictionless user experience.
Tap the shutter to capture hand shadow
Spin the crank to develop story
Prototyping and testing
During the process, we prototyped and tested Narratron from multiple perspectives, including its form factor and hardware interface, its software interface, and its AI capabilities. We used a breadboard to iterate on the physical interaction and a React-based web interface to validate the story generation, then connected the two parts together.
Tech stack
Architecture
To generate the story, Narratron employs the GPT-3.5 language model. The animal keyword identified from the user's hand shadow is passed to GPT-3.5, which generates a story seamlessly combining plotlines, dialogue, and descriptive elements. While the story is being generated, Narratron simultaneously generates a corresponding image with Stable Diffusion representing the animal associated with the user's hand shadow. This image is then projected onto the surface, adding a visual layer to the audio experience and deepening the user's connection to the narrative.
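Because both branches start from the same classified keyword, the story and image generation can run in parallel. The sketch below stubs the GPT-3.5 and Stable Diffusion calls with placeholder functions; all names and return values are hypothetical:

```typescript
// Stand-in for the GPT-3.5 story call (real version hits the OpenAI API).
async function generateStory(keyword: string): Promise<string> {
  return `Once upon a time, a ${keyword} wandered into the meadow...`;
}

// Stand-in for the Stable Diffusion backdrop call.
async function generateImage(keyword: string): Promise<string> {
  return `image-of-${keyword}.png`;
}

// Kick off both branches at once so the projected backdrop is ready
// by the time narration of the story begins.
async function tellStory(keyword: string) {
  const [story, imageUrl] = await Promise.all([
    generateStory(keyword),
    generateImage(keyword),
  ]);
  return { story, imageUrl };
}
```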
Image classifier training
Story generation by GPT-3.5-turbo
We used a few-shot prompting strategy to tune the model, ensuring a stable story structure and a child-friendly writing style.
A snippet of the few-shot learning training data
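To illustrate the shape of the technique, a few-shot prompt for the Chat Completions API interleaves example keyword-and-story pairs before the live keyword from the classifier. The system instruction and example pairs below are invented for illustration, not the project's actual data:

```typescript
// Few-shot message array for a chat-style completion call.
const messages = [
  {
    role: "system",
    content:
      "You are a children's storyteller. Given an animal, write one short, " +
      "gentle paragraph that continues the shadow-play story.",
  },
  // Example pair 1: keyword in, story paragraph out.
  { role: "user", content: "rabbit" },
  {
    role: "assistant",
    content: "A small rabbit hopped into the moonlit meadow, ears twitching at a rustle from the trees...",
  },
  // Example pair 2.
  { role: "user", content: "bird" },
  {
    role: "assistant",
    content: "High above, a little bird circled the meadow, carrying a tiny letter tied to its leg...",
  },
  // The live keyword from the image classifier goes last.
  { role: "user", content: "wolf" },
];
```

Keeping the examples short and uniformly structured is what stabilizes the output: the model mirrors the tone and length of the assistant turns it has just seen.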
Hosting all the visual and auditory outputs on a React app, we used react-webcam for video streaming input, sent frames to the Teachable Machine API for image classification, passed the extracted data (the character name) into a tuned OpenAI Completions API call to create the story, narrated the story through react-text-to-speech, sent the story to the Stable Diffusion API to generate the backdrop, and wired all Arduino interactions into the system through the serial port.
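The hardware side of that wiring can be sketched as follows, assuming the Arduino firmware writes one token per physical event over serial; the "SHUTTER" and "CRANK" message names are hypothetical:

```typescript
// Tokens the firmware is assumed to print, one per line, over serial.
type HardwareEvent = "SHUTTER" | "CRANK";

// Normalize a raw serial line into a recognized event, or null for noise.
function parseSerialLine(line: string): HardwareEvent | null {
  const token = line.trim().toUpperCase();
  return token === "SHUTTER" || token === "CRANK" ? (token as HardwareEvent) : null;
}

// Map each hardware event to one of Narratron's two interactions.
function dispatch(
  event: HardwareEvent,
  app: { capture(): void; advance(): void },
): void {
  if (event === "SHUTTER") {
    app.capture(); // tap the shutter → capture the hand shadow
  } else {
    app.advance(); // spin the crank → develop the story
  }
}
```

In the real app these handlers would be fed by the serial-port read loop; stubbing them this way let us test the interaction mapping without hardware attached.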
Code snippets
Industrial design
The industrial design prioritizes stability as a desktop device and usability for both parents and children, with an easy-to-grip crank knob and a big, friendly top button.
Mechanical drawing
Chassis assembly
Components
Brand design
The warm yellow tone evokes sunlight, joy, and creativity while maintaining a modern, tech-forward appearance. The rounded corners, soft shadows, and cloudy forms in the logo reflect the product's child-friendly nature, communicating safety, accessibility, and approachability - essential qualities for a children's technology product.
Read Next