‘Life By The Drop’ – A Guide to AI Visual Storytelling

AI tools shine in the realm of visual storytelling. With some creativity and the right knowledge, you can craft stunning short visual stories solely using AI technology. This guide will explore the process of creating captivating short story videos, similar to the example provided below.

Step 1 – Brainstorm (ChatGPT)

Head over to ChatGPT (3.5 or 4) and tell it you want to brainstorm ideas for a short visual story and include any amplifying information you have in mind. Review your choices or ask it to brainstorm more ideas if you don’t like the options. Then pick an option like the one below:

A visual story about the cycle of life told through a single drop of rain.

  • Theme: Connection, cycle of life, and perspective.
  • Visuals: Follow a single raindrop from its formation in the clouds, descending to earth, nourishing plants, and eventually evaporating to begin the cycle anew. Include close-ups of the environments it impacts.
  • Emotions: Fosters a feeling of interconnectedness and the wonder of life’s cycles.

Step 2 – Storyboard Outline (ChatGPT)

Tell ChatGPT you want to do a written storyboard for the visual story you picked. This will give you a high-level picture of what the visuals will look like and how the script will sound.

Introduction to the Clouds

Visuals: Begin with a serene view of the sky, where clouds gather and swirl, brewing a storm. The camera zooms into one particular cloud, teeming with the energy of impending rainfall.

Emotion: Evoke a sense of anticipation and the vastness of the natural world.

The Birth of a Raindrop

Visuals: Focus on a single droplet as it forms, highlighting its unique journey as it falls from the sky. Use slow-motion effects to capture the beauty of its descent towards the earth.

Emotion: Instill wonder and the feeling of beginning a new journey.

The Descent and Impact

Visuals: Follow the raindrop as it falls through different environments—over a bustling city, a serene countryside, or a dense forest. Show its impact as it hits the ground, perhaps feeding a tiny plant or joining a river.

Emotion: Convey the interconnectedness of all things and the small, vital role each element plays in the cycle of life.

Nourishment and Transformation

Visuals: Show the immediate effect of the raindrop on its surroundings—how it gives life to a struggling plant or becomes a part of a river that nourishes the landscape. You could also depict its journey through the groundwater, eventually evaporating.

Emotion: Highlight the themes of growth, contribution, and transformation.

Return to the Clouds

Visuals: As the raindrop evaporates, follow its ascent back into the atmosphere, where it rejoins the cloud, ready to begin the cycle anew. The final scene mirrors the first, with the clouds gathering energy once more.

Emotion: Inspire awe for the cyclical nature of life and a sense of completion, yet continuity.

Step 3 – Image Prompts (ChatGPT)

Tell ChatGPT you want to create image prompts for each step in the storyboard. You can specify which text-to-image generator you plan to use, since that may help ChatGPT tailor each prompt. In this case,

Introduction to the Clouds

Prompt: Create a serene and vast sky at the brink of a storm, where clouds gather and swirl with energy. The visual should emphasize the vastness of the sky and the anticipation of rainfall, with a focus on one cloud, charged and ready to release rain. The palette should consist of deep blues and grays, illuminated by subtle sunlight peeking through the formations.

The Birth of a Raindrop

Prompt: Illustrate the moment a single raindrop forms within a cloud and begins its descent towards Earth. The visual should evoke wonder, focusing on the droplet’s journey, with soft lighting to emphasize its clarity and purity.

The Descent and Impact


A. The journey of a raindrop as it falls over a bustling city with gleaming wet surfaces. This scene should convey the interconnectedness and the life-giving essence of water.

B. The journey of a raindrop as it falls over a serene countryside dotted with green fields. This scene should convey the interconnectedness and the life-giving essence of water.

C. The journey of a raindrop as it falls over a dense, vibrant forest. This scene should convey the interconnectedness and the life-giving essence of water.

Nourishment and Transformation


A. A wide shot of a small greyish-green pine tree sapling stretching toward the sun

B. A swelling river, carrying the essence of life, endlessly flowing towards the unknown, and a distant sun

Return to the Clouds

Prompt: Depict raindrops evaporating and ascending back to the atmosphere as water vapor, where they rejoin the clouds, with the clouds now regathering energy after the rain. This visual should focus on the cyclical nature of the raindrop’s journey, showcasing the transition from water back to vapor, rising against a backdrop of a calming sky, signifying continuity and the eternal cycle of water.

Step 4 – Narration (ChatGPT, Eleven Labs)

After you have your text-to-image prompts, ask ChatGPT to write a script for each scene based on them. Keep in mind that most text-to-video and image-to-video generators currently produce 4-second clips. You can tell ChatGPT to keep each scene’s narration to 4 seconds. This doesn’t always work, but the narration usually lands between 4 and 6 seconds per scene. You can always cut the scene or slow its speed when editing to fit the narration length exactly.
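The slow-down mentioned above is simple arithmetic: divide the clip length by the narration length to get the playback speed to set in your editor. Here is a minimal sketch; the function name `playback_speed` is made up for illustration and is not part of any tool named in this guide.

```python
def playback_speed(clip_seconds: float, narration_seconds: float) -> float:
    """Return the speed multiplier that makes a clip last as long as its narration.

    Values below 1.0 mean the clip must be slowed down; values above 1.0
    mean it can be sped up or trimmed instead.
    """
    if clip_seconds <= 0 or narration_seconds <= 0:
        raise ValueError("durations must be positive")
    return clip_seconds / narration_seconds


# A 4-second generated clip under 6 seconds of narration.
speed = playback_speed(4.0, 6.0)
print(f"{speed:.2f}x")
```

A 4-second clip under 6 seconds of narration works out to roughly two-thirds speed. Very low values can make the motion look choppy, so at that point it may be better to shorten the narration.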

Once you have something you like, head over to Eleven Labs and plug in your script. Choose a voice and play the script so you can see how long each scene is. If one scene is longer than 6 seconds, it might help to reduce the word count in the scene. This will help when it comes time to edit the scenes together.
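Before you paste a line into Eleven Labs, you can get a rough idea of its length from its word count. The sketch below assumes a pace of 2.5 words per second (about 150 words per minute). That figure is my assumption, not something Eleven Labs reports, and the real length depends on the voice you pick. The helper names are hypothetical.

```python
WORDS_PER_SECOND = 2.5  # assumed narration pace; adjust after timing your chosen voice


def estimate_seconds(text: str, words_per_second: float = WORDS_PER_SECOND) -> float:
    """Estimate spoken length from word count alone."""
    return len(text.split()) / words_per_second


def words_to_cut(text: str, max_seconds: float = 6.0,
                 words_per_second: float = WORDS_PER_SECOND) -> int:
    """Return how many words to trim so the line fits within max_seconds."""
    budget = int(max_seconds * words_per_second)
    return max(0, len(text.split()) - budget)


line = ("Amidst a formidable sky, clouds converge, "
        "whispering promises of impending rain.")
print(f"about {estimate_seconds(line):.1f} s, cut {words_to_cut(line)} words")
```

Treat the estimate as a first pass only. The playback in Eleven Labs is still the real measurement.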

Scene 1: Introduction to the Clouds (6 seconds)

Amidst a formidable sky, clouds converge, whispering promises of impending rain.

Scene 2: The Birth of a Raindrop (6 seconds)

From the heart of the storm, a raindrop emerges, beginning its solitary descent to Earth.

Scene 3 (A, B, and C): The Descent and Impact (18 seconds)

As it falls, each drop glimpses worlds below: across cities, where lights mimic stars; above countryside meadows, painting them with life’s elixir; in a forest, where trees bow in gratitude. Each land embraces the droplet’s touch, a fleeting kiss awakening the soil.

Scene 4 (A and B): Nourishment and Transformation (12 seconds)

Nourished, a new sapling stretches towards the sun, reborn from the raindrop’s sacrifice.

Meanwhile, rivers swell, carrying the essence of life, endlessly flowing towards the unknown.

Scene 5: Return to the Clouds (12 seconds)

Ascending as vapor, the droplets return, rejoining the clouds.

In this eternal cycle, each end is but a prelude to another beginning, a testament to life’s perpetual renewal.
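To check the timing before you generate video, you can tally the scene lengths listed above. This short sketch uses the durations from the script and the number of visuals per scene from the Step 3 prompts. The variable names are only illustrative.

```python
# (scene title, narration seconds, number of visuals from the Step 3 prompts)
scenes = [
    ("Introduction to the Clouds", 6, 1),
    ("The Birth of a Raindrop", 6, 1),
    ("The Descent and Impact", 18, 3),
    ("Nourishment and Transformation", 12, 2),
    ("Return to the Clouds", 12, 1),
]

# Total narration time across the whole story.
total_seconds = sum(seconds for _, seconds, _ in scenes)

# Screen time each individual visual has to cover within its scene.
seconds_per_visual = [seconds / visuals for _, seconds, visuals in scenes]

for (title, _, _), each in zip(scenes, seconds_per_visual):
    print(f"{title}: {each:.0f} s per visual")
print(f"Total narration: {total_seconds} s")
```

Any visual that has to cover much more than 4 seconds is a candidate for a second clip or a shorter narration line.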

Step 5 – Image Generation (Leonardo AI)

Leonardo Seed: 339120896

Input the prompts from Step 3. This may require some tweaking until you get a result you’re happy with. Once you get the first image the way you want it, make sure to save the seed # in the document you’re working in. With each subsequent image generation, use the same seed # for a consistent look and feel throughout each image.
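One way to keep the seed and prompts together is a small manifest you can copy from each time you generate an image. The sketch below is just a note-keeping helper under my own assumptions. It does not call Leonardo, and the `ImageJob` class and its field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass

SEED = 339120896  # the Leonardo seed recorded for this story


@dataclass
class ImageJob:
    """One image to generate: which scene it belongs to, its prompt, and the shared seed."""
    scene: str
    prompt: str
    seed: int = SEED


jobs = [
    ImageJob("4A", "A wide shot of a small greyish-green pine tree sapling "
                   "stretching toward the sun"),
    ImageJob("4B", "A swelling river, carrying the essence of life, endlessly "
                   "flowing towards the unknown, and a distant sun"),
]

# Serialize to JSON so the manifest can be pasted into your working document.
manifest = json.dumps([asdict(job) for job in jobs], indent=2)
print(manifest)
```

Because every entry defaults to the same seed, it is harder to forget to reuse it on a later image.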

Step 6 – Video Clip Generation (Leonardo AI, Runway)

As far as AI has come, this step is still the most difficult. However, text-to-video and image-to-video generation is much better than it was last year, so it requires less iteration. For this particular video, I used Leonardo and Runway side by side. I generated each image in Leonardo, then tried both Leonardo’s new motion generation feature and Runway’s image-to-video feature. If I liked the video generated in Leonardo, I used it. If not, I used Runway. Leonardo does a decent job at image-to-video but lacks control. If you need to fine-tune something, it’s better to use Runway because it has the motion brush tool and camera controls.

Step 7 – Editing (CapCut)

Once you have your script audio file and video clips of all your scenes, it’s time to edit. In this case, I used CapCut because its free version is excellent, especially if, like me, you aren’t an advanced video editor. For sound effects, you can create your own in Eleven Labs or use the existing ones in CapCut.

And there you have it, a full workflow for creating short visual stories with nothing but AI tools.

