
Storytelling AI for shadow-playing



Kickstarting in progress


Yubo Zhao, Aria Bao


Mar - May 2023


Illustrator, Figma, Rhino

Next.js, Arduino

Academic recognition

ACM UIST '23 Student Innovation Contest Finalist


A storytelling projector powered by gen AI

Narratron is an interactive projector that augments hand shadow puppetry with AI-generated storytelling. Designed for all ages, it transforms traditional physical shadow plays into an immersive and phygital storytelling experience.

Front view

My contribution

Designing a phygital interface

As one of the two owners of this project, I was mainly responsible for:

  • Leading the ideation, prototyping, and testing of all product design work and interaction flows

  • Leading the development of the backend infrastructure and the integration of LLMs through APIs

  • Leading the writing of the accompanying research paper

Final design


Hand shadow puppetry is one of the oldest forms of storytelling and is practiced across cultures. To enhance that experience with multimodal artificial intelligence, Narratron lets users play with hand shadows while it generates audio and visual output for the story those shadows are telling.

Detailed view

Behind the scenes

Prototyping and testing

During the process, we prototyped and tested Narratron from multiple perspectives, including its form factor and hardware interface, its software interface, and its AI capabilities.

User story

To determine the interaction pattern for this phygital interface, I sketched a user story to communicate the idea.

User story board

Form factor

Inspired by the physical affordances of hand cranks and shutters, the original interfaces of movie projectors and cameras, Narratron’s form factor takes a minimalist approach intended to create a frictionless user experience.

Tap the shutter to capture hand shadow
Spin the crank to develop story
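
The two gestures above can be read as a small state machine. The sketch below is only an illustration of that idea under stated assumptions: the state names, the input names, and the rule that a crank turn does nothing before a shadow has been captured are all hypothetical and are not taken from the actual firmware or web app.

```typescript
// Hypothetical interaction states; the real device may track more than this.
type State = "idle" | "captured" | "narrating";

// Hypothetical input events, e.g. forwarded from the microcontroller.
type Input = "shutter" | "crank";

// Pure transition function: given the current state and an input, return the next state.
function nextState(state: State, input: Input): State {
  if (input === "shutter") {
    // Assumption: tapping the shutter always (re)captures the current hand shadow.
    return "captured";
  }
  // input === "crank"
  if (state === "idle") {
    // Assumption: spinning the crank before any capture has no effect.
    return "idle";
  }
  // With a captured shadow, spinning the crank starts or continues the story.
  return "narrating";
}

// Usage: replay a short sequence of gestures.
let state: State = "idle";
for (const input of ["crank", "shutter", "crank", "crank"] as Input[]) {
  state = nextState(state, input);
}
console.log(state);
```

Keeping the transition logic in a pure function like this makes it easy to test without any hardware attached.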

Made from commercial-grade SLS-printed nylon, this standalone product packs the projector unit, speaker, microcontroller, and all other electronic components into a confined interior space. The edges are ergonomically bevelled in a functionalist, understated aesthetic, and the center of mass is positioned to balance stability and portability, so Narratron works both as a desktop device and as a handheld gadget.


Tech stack


The history of hand shadow play is nearly untraceable; it was widely practiced long before the Greek shadow show Karagiozis or the Chinese shadow puppetry Pi Ying Xi existed. It is a prelinguistic and transcultural form of storytelling that entertains and educates the younger generation; it is also a stimulus for creative production, through mimicking the things we see and telling the stories we relate. Narratron, in that sense, deeply embeds AI into this intelligent collective effort of hands, eyes, and brains as a true “fairytale copilot”. Its multimodal nature, which combines visual, auditory, tactile, and textual I/O and is supported by the collaboration of an LLM, an image classifier, a speech synthesizer, and a diffusion model, demonstrates how seamlessly we can interact bodily with AI. By bridging the digital and the physical, we connect the ancient and the future.


The user’s experience with Narratron begins with the startup screen, which creates a serene and focused ambiance and sets the stage for an immersive journey. When the user turns on Narratron, they are greeted by instructions that prepare them for the experience ahead. Once the startup screen fades away, the user is free to explore and play with their hand shadows in any way they wish. They can experiment with different shapes, sizes, and movements while Narratron’s camera captures the hand shadow shapes they create.

The captured hand shadow shapes are then analyzed by trained image classifiers integrated into Narratron. The classifiers interpret the hand shadows and translate them into animal keywords. These keywords serve as the foundation for the next step of the process: generating a complete story.
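
As a rough sketch of the last step above, turning classifier output into an animal keyword could look like the following. The `Prediction` shape, the `pickAnimalKeyword` helper, and the 0.6 confidence threshold are all assumptions made for illustration; the source does not describe the classifier’s actual output format.

```typescript
// Hypothetical shape of one classifier prediction.
interface Prediction {
  label: string;       // animal name, e.g. "rabbit"
  probability: number; // confidence between 0 and 1
}

// Return the most likely animal keyword, or null when no prediction
// reaches the (assumed) minimum confidence.
function pickAnimalKeyword(
  predictions: Prediction[],
  minConfidence = 0.6
): string | null {
  let best: Prediction | null = null;
  for (const p of predictions) {
    if (best === null || p.probability > best.probability) {
      best = p;
    }
  }
  if (best === null || best.probability < minConfidence) {
    return null;
  }
  return best.label;
}

// Usage: made-up predictions for a single captured frame.
const keyword = pickAnimalKeyword([
  { label: "rabbit", probability: 0.82 },
  { label: "dog", probability: 0.11 },
  { label: "bird", probability: 0.07 },
]);
console.log(keyword); // "rabbit"
```

Returning null for low-confidence frames gives the interface a chance to ask the user to try the shadow again instead of telling a story about the wrong animal.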

Image classifier training

To generate the story, Narratron uses the GPT-3.5 language model. The animal keyword identified from the user’s hand shadow is passed to GPT-3.5, which generates a story that combines plotlines, dialogue, and descriptive elements. While the story is being generated, Narratron uses Stable Diffusion to generate a corresponding image of the animal associated with the user’s hand shadow. This image is then projected onto the surface, adding a visual component to the audio experience and strengthening the user’s connection to the narrative.
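
The parallel story and image generation described above can be sketched with `Promise.all`. This is a minimal illustration, not the project’s backend: `StoryGenerator`, `ImageGenerator`, the prompt wording, and the stub generators are hypothetical stand-ins for the real calls to the language model and the image model.

```typescript
// Hypothetical generator signatures. In a real build these would wrap
// requests to the language model and the image model.
type StoryGenerator = (prompt: string) => Promise<string>;
type ImageGenerator = (prompt: string) => Promise<string>; // resolves to an image URL or data URI

interface Scene {
  keyword: string;
  story: string;
  image: string;
}

// Assumed prompt wording, for illustration only.
function buildStoryPrompt(keyword: string): string {
  return `Write a short story about a ${keyword} with a clear plot, some dialogue, and vivid description.`;
}

function buildImagePrompt(keyword: string): string {
  return `A storybook illustration of a ${keyword}`;
}

// Start both generations at once and wait until both have finished.
async function generateScene(
  keyword: string,
  generateStory: StoryGenerator,
  generateImage: ImageGenerator
): Promise<Scene> {
  const [story, image] = await Promise.all([
    generateStory(buildStoryPrompt(keyword)),
    generateImage(buildImagePrompt(keyword)),
  ]);
  return { keyword, story, image };
}

// Usage with stub generators that only echo their prompts.
const stubStory: StoryGenerator = async (prompt) => `STORY<${prompt}>`;
const stubImage: ImageGenerator = async (prompt) => `IMAGE<${prompt}>`;

generateScene("rabbit", stubStory, stubImage).then((scene) => {
  console.log(scene.story);
  console.log(scene.image);
});
```

Passing the generators in as parameters keeps the orchestration testable offline and makes it straightforward to swap either model later.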

Story generation by GPT-3.5-turbo
Hardware assembly
Chassis assembly
Final product design


© Yubo Zhao 2023 All rights reserved.