Walkthrough · 10 minute read

How Melted Crayons Works

From a single photo to a fully narrated comic. Here's the path most creators take, in the order it makes sense to learn it.

Section 01

Characters & Voices

Every story starts with characters. Upload a photo once, and the AI uses it as a visual reference across every panel — so the hero in panel 1 looks like the hero in panel 6, regardless of art style.

Uploading a reference photo

Go to Studio → Characters → + New Character. Choose a clear photo of yourself, your kid, your pet, anyone you want to star in your comics.

Best results: a front-facing or three-quarter photo with the face clearly visible
Avoid: sunglasses, heavy shadows, multiple people in one shot, low-resolution images
Photos are resized automatically before upload, so high-res phone photos work fine
The AI will extract the subject onto a transparent background — you'll see a preview before saving

No photo? Generate one from a description

In the character creator, switch the source toggle to ✨ Generate from Description. Type what you want — “a young girl with curly red hair, freckles, green eyes, wearing a yellow raincoat” — and the AI produces a clean reference portrait you can use as the character's identity anchor.

Best for: fictional or anonymous heroes — dragons, robots, space pirates, cartoon kids, made-up characters of any kind
Honest tradeoff: real photos preserve identity slightly better across many panels. Generated characters are still very consistent — just not pixel-perfect for real-world resemblance
Click ✨ Try Again to regenerate if the first result isn't quite right. Each attempt counts as 1 generation, the same as creating a panel
The generated image flows through the same background-removal step as uploaded photos. Once extracted, the character behaves identically downstream

Background removal happens automatically

The moment you upload a photo or generate one from a description, the AI starts extracting the subject onto a transparent background — no extra button to click. You'll see “Removing background” under the Save button while it runs. The extracted cutout shows next to the source when it's done; the Save button activates as soon as both are ready.

Runs in your browser — your photo never leaves your device for the extraction step. Privacy bonus, and zero per-extraction cost.
The first time you do this, the background-removal model downloads (~30 MB, cached after that). Later extractions take 1–3 seconds.
If the result looks off, click Retry next to the preview to re-run the extraction on the same source.

Naming and panel defaults

Give your character a name. For people-style characters you can also set a default pose (standing, action, sitting, running, flying) and default expression (neutral, happy, angry, etc.). When you add this character to a new panel, those defaults pre-fill the panel's pose and expression dropdowns — override per panel if you want a different action.

Voice selection also lives on the character (see below) so the same character speaks with the same voice across every story they appear in.
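
If it helps to picture how pose and expression defaults pre-fill a panel, here is a minimal Python sketch. The names (Character, PanelCasting, cast_in_panel) are hypothetical and only illustrate the rule described above; this is not the app's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Character:
    name: str
    default_pose: Optional[str] = None        # e.g. "standing", "running"
    default_expression: Optional[str] = None  # e.g. "neutral", "happy"


@dataclass
class PanelCasting:
    character: str
    pose: Optional[str]
    expression: Optional[str]


def cast_in_panel(character, pose=None, expression=None):
    """Pre-fill from the character's defaults; explicit per-panel values win."""
    return PanelCasting(
        character=character.name,
        pose=pose if pose is not None else character.default_pose,
        expression=(expression if expression is not None
                    else character.default_expression),
    )


hero = Character("Mia", default_pose="standing", default_expression="happy")
print(cast_in_panel(hero))                  # both defaults pre-filled
print(cast_in_panel(hero, pose="running"))  # pose overridden, expression kept
```

The override only affects the panel you are editing; the character's saved defaults stay the same for the next panel.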

Voice selection

Pick a voice for your character's narration. When you add dialogue or narration to a panel and click the speaker icon, this voice will read it aloud.

Filter by gender — quick toggle row above the dropdown (All / Female / Male / Any-Neutral) so you can narrow to voices that match your character's gender
Voices are also grouped by vibe — narrator, young, mature, hero, villain, character, announcer
Each option shows the gender label inline (e.g. [Male] Earnest Young Man) so it's scannable even without the filter
Click the play icon next to a selected voice to preview it before committing
Leave a character's voice unset to use the story's default narrator instead

Products & brands as characters

Characters aren't limited to people. You can also create Products (like a coffee cup or a sneaker) or Brands (a logo or mascot). The AI preserves their exact appearance the same way it preserves a face. Useful for product storytelling, ads, or branded comics.

👉 Once you have at least one character, you're ready to start a story.

Section 02

Style, Layout & Aspect

Before you create panels, you set the look of your story. These choices apply across every panel for visual consistency, but you can override any of them per panel later.

Choosing an art style

Melted Crayons ships with 30+ art styles — comic book, anime, watercolor, LEGO, Ghibli-inspired, claymation, pixel art, noir, pop art, cyberpunk, and more. Each style has its own visual language, and the AI is trained to render in it consistently.

Storytelling pick: “storybook” for bedtime stories or “comic” for classic sequential art
Show-stopping pick: “LEGO” or “Ghibli” — these styles are instantly recognizable and feel premium
Brand pick: “cinematic” or “photorealistic” for product or marketing comics
You can change the style mid-story, but the AI works hardest to stay consistent when you keep one style throughout

Aspect ratio (panel shape)

Pick the shape of each panel:

4:3 — classic comic book proportions, balanced for reading
16:9 or 21:9 — wide landscape, cinematic for action shots
1:1 — square, ideal if you plan to share to Instagram
9:16 — vertical, for TikTok / IG Story exports

You can mix aspect ratios within a single story — a wide establishing shot, then a tight portrait close-up, then a square action panel. Each one is independent.

Resolution and quality

Default is 2K, which is sharp enough for screen and most prints. Higher tiers can render at 3K for poster-quality detail. Output size, watermark, and commercial license are all set by your plan.
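
To get a feel for how panel shape and resolution interact, here is a rough Python sketch that turns an aspect ratio into pixel dimensions. It assumes "2K" means a long edge of 2048 pixels, which is an assumption made for illustration; the app's exact output sizes may differ, and panel_size is a hypothetical helper.

```python
def panel_size(aspect: str, long_edge: int = 2048) -> tuple[int, int]:
    """Return (width, height) for an aspect ratio such as "16:9".

    long_edge=2048 is an assumed meaning of "2K", not a documented value.
    """
    w, h = (int(part) for part in aspect.split(":"))
    if w >= h:
        # Landscape or square: the width is the long edge.
        return long_edge, round(long_edge * h / w)
    # Portrait: the height is the long edge.
    return round(long_edge * w / h), long_edge


for ratio in ["4:3", "16:9", "21:9", "1:1", "9:16"]:
    print(ratio, panel_size(ratio))
```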

👉 Your style + aspect ratio is the visual foundation. Now you build panels on top of it.

Section 03

Panel Creation

This is the core loop. Every panel goes through four moves: write the scene, cast your characters, add the text, and refine until it's right.

The Scene — describing what's happening

The scene prompt is a short description of what the panel shows. Write it like you'd describe a movie shot to a friend.

“A dragon flies low over a glowing forest at dusk. Mist curls between the trees. A castle rises in the distance.”

Be specific about lighting and mood: “dusk,” “rain-soaked,” “cozy interior,” “harsh noon sun”
Describe the camera: “low-angle shot,” “close-up,” “wide establishing view”
Don't describe the characters here — the AI handles them via your reference photos. Just describe the world they're in.
You can also pick a scene preset from the dropdown (forest, castle, city, etc.) instead of writing your own

Cinematography (optional but powerful)

For more cinematic shots, pick from the four cinematography fields:

Shot: close-up, medium, wide, establishing, etc.
Camera angle: low-angle (heroic), high-angle (vulnerable), dutch angle (off-balance), eye-level (neutral)
Lighting: golden hour, neon noir, soft daylight, moody shadows
Mood: joyful, tense, melancholic, epic

These add cinematic language to your prompt without you having to write it manually. Especially useful when you want a panel to feel like a movie still, not a generic illustration.

Casting — adding characters to the scene

Click + Add Character to drop one of your created characters into the panel. You can:

Add multiple characters per panel (up to your panel limit)
Drag each character on the canvas to set their position
Set per-panel pose (running, sitting, fighting) and expression (smiling, surprised, angry)
If a character has a default pose set, it pre-fills here — you can override

The AI uses each character's reference photo for identity preservation, so they'll look like the same person across every panel even though the AI is generating fresh images.

Text — captions and dialogue

Two kinds of text overlay:

Captions: narration boxes — usually placed at the top or bottom of a panel. Style options include narration (default), thought, action, and location stamps.
Dialogue: speech bubbles tied to a character. Choose between speech (default), thought (cloud-style), shout (loud emphasis), and whisper (italic).

For both, you can dial:

Position: drag anywhere on the panel
Width: 10–90% of panel width — controls wrapping
Font size, family, and color
Corner radius: sharp rectangle, soft round, or full pill — match the comic-book style you're going for
Tail direction (dialogue only): point to who's speaking

Click the speaker icon on a panel to hear the captions and dialogue read aloud using each character's voice.

Regeneration — making it right

The first generation rarely lands perfectly. You have three tools to iterate, plus undo and redo to move between versions:

Generate — full regeneration with the same prompt. Good if a panel just landed weird and you want a fresh attempt.
Refine — type a text edit instruction (e.g. “make the dragon larger” or “change the lighting to morning”) and the AI edits the panel without redoing it from scratch.
Recast (the gradient button with the refresh-halo icon) — keeps the scene, background, and lighting exactly as they are, but generates fresh character poses and expressions. Different take, same set. Each click rolls a different angle, gesture, and energy so successive recasts feel distinct.
Undo / Redo — every regeneration is saved. Step back through previous versions or step forward again with the arrows.

💡 Pro tip: Recast is your best friend when the composition is right but the character pose isn't. Click it two or three times in a row — each click produces a noticeably different take.

Visual cues built into the editor

A few small affordances in the studio that help you see what you've filled in and what each control does:

Tab progress dots — small green dots appear next to Scene, Characters, and Text when those tabs have content. Quick way to see what's done at a glance.
Helper captions under each action button — Generate (“Build a fresh panel”), Recast (“Try new poses”), Refine (“Edit with text”) — so the three regen options are easy to tell apart.
Disabled-Generate hint — when the Generate button is greyed out, a small message points to the editor where you need to type your scene description.
Story Settings auto-collapse — settings (engine, aspect, narrator, art style, layout) hide behind a gear button so they don't crowd the workspace once you've set them.

Render words inside the image (in-scene text)

Below the canvas there's a toggle labeled “Render words inside the image”. This is for text that should appear inside the artwork — signs, posters, billboards, comic-style sound effects baked into the scene.

Use this for: a street sign saying “BEWARE,” a billboard reading “GOTHAM TIMES,” a t-shirt graphic, a chalkboard, a book cover in the scene
Don't confuse with captions and dialogue — those go in the Text tab and are drawn as overlays on top of the panel. The toggle is for words rendered by the AI inside the artwork itself
Modern image engines render text fairly cleanly — works best with short phrases (1–4 words)

Section 04

AI Storytelling

A few things happen behind the scenes that are worth knowing — they explain why the app behaves the way it does and how to get the most out of it.

Cross-panel style consistency

When you generate panel 2 and beyond, the AI looks at panel 1 to understand your story's exact visual language — palette, lighting, brushwork, mood — and matches it. This is why your comic feels like a coherent piece, not a collection of random AI images.

The technical name is style anchoring. It works automatically. Your job is to make sure your first panel looks the way you want the rest of the story to look.

Regenerate the source panel and the anchor refreshes automatically — if you redo panel 1 to a new look, later panels you generate will pick up the new style anchor instead of staying locked to the old one.
Existing panels stay in their original style until you regenerate them. Style changes don't retroactively re-render finished panels.
Changing the story's default art style (in story settings) clears the cached anchor and warns you — your existing panels keep their look, new panels follow the new style.
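
The three anchor rules above can be pictured with a toy model. This Python sketch is an illustration under assumptions: the Story class, its fields, and the "take" labels are hypothetical stand-ins, and the real anchor is an image reference rather than a string.

```python
class Story:
    def __init__(self, art_style: str):
        self.art_style = art_style
        self.anchor = None   # look of the current source panel (panel 1)
        self.panels = {}     # panel number -> look it was rendered with
        self._takes = 0      # how many times panel 1 has been generated

    def generate(self, number: int) -> str:
        if number == 1:
            self._takes += 1
            look = f"{self.art_style} (take {self._takes})"
            self.anchor = look  # regenerating panel 1 refreshes the anchor
        else:
            # Match the cached anchor if there is one, else the story style.
            look = self.anchor or self.art_style
        self.panels[number] = look  # only this panel changes
        return look

    def set_art_style(self, art_style: str) -> None:
        self.art_style = art_style
        self.anchor = None  # a style change clears the cached anchor


story = Story("watercolor")
story.generate(1)
story.generate(2)
story.generate(1)            # redo panel 1 with a new look
story.generate(3)            # picks up the refreshed anchor
story.set_art_style("noir")
story.generate(4)            # follows the new style
print(story.panels)
```

Panel 2 keeps the look it was rendered with even after panel 1 is redone, which matches the rule that finished panels are never re-rendered retroactively.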

💡 If your first panel isn't quite right, regenerate it before you build out the rest of the story so later panels anchor to the version you actually like.

Character identity preservation

Your character reference photos are passed to the AI on every single panel where that character appears. The AI is instructed to reproduce their face, features, hair, and skin tone precisely. This is what lets the same hero appear across many panels and styles without slowly morphing into someone else.

The better your reference photo (clear, well-lit, front-facing), the better the consistency. A blurry side profile gives the AI less to work with.

The AI engines

Melted Crayons uses two AI image engines under the hood:

Gemini Nano Banana 2 — Google's latest. Excellent character consistency and fine-detail rendering.
SeedDream 5.0 — ByteDance's engine. Strong stylistic range and fast generation.

You can switch between them in the story settings. Both are high-quality; sometimes one nails a specific style or character better than the other. If a panel feels off, try the other engine.

Narration and voice

Narration is text-to-speech with natural prosody. When you click the speaker on a panel, the caption text and dialogue text are voiced in order, each line using its character's assigned voice (or your story's default narrator for captions and any dialogue without a per-character voice).

Cartesia voices — fast, natural, available on every plan including free.
ElevenLabs voices — premium voice acting with richer emotional range, unlocked on Creator and Studio plans. Marked in the picker with a 🔒 badge for free / Storyteller users so you can see what's available.
Filter by gender — quick toggle row above the voice dropdown narrows to female / male / neutral so you can match a character without scrolling every voice.
Conversational pauses — the player adds natural beats between clips: a longer breath when a different character takes their turn, a shorter pause when the same speaker continues. Feels like reading aloud, not reading a script.
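
As a rough picture of how voices and pauses fit together, here is a small Python sketch that builds a playlist for one panel. The pause lengths, the voice IDs, and the function names are all assumptions chosen for illustration; the app's real timings are not documented here.

```python
SPEAKER_CHANGE_PAUSE = 0.6  # seconds; illustrative value only
SAME_SPEAKER_PAUSE = 0.25   # seconds; illustrative value only


def resolve_voice(line, character_voices, default_narrator):
    """Use the character's voice if one is set, else the story narrator."""
    speaker = line.get("speaker")  # None for captions
    return character_voices.get(speaker) or default_narrator


def build_playlist(lines, character_voices, default_narrator):
    playlist = []
    previous_speaker = None
    for index, line in enumerate(lines):
        speaker = line.get("speaker")
        if index > 0:
            # Shorter beat when the same speaker continues, longer on a change.
            same = speaker == previous_speaker
            pause = SAME_SPEAKER_PAUSE if same else SPEAKER_CHANGE_PAUSE
            playlist.append(("pause", pause))
        voice = resolve_voice(line, character_voices, default_narrator)
        playlist.append(("speak", voice, line["text"]))
        previous_speaker = speaker
    return playlist


lines = [
    {"speaker": None, "text": "Dusk fell over the forest."},  # caption
    {"speaker": "Mia", "text": "Did you hear that?"},
    {"speaker": "Mia", "text": "There it is again."},
    {"speaker": "Rex", "text": "It's just the wind."},        # no voice set
]
playlist = build_playlist(lines, {"Mia": "voice-a"}, "narrator-voice")
for step in playlist:
    print(step)
```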

Reading order: position controls playback

The audio reads your captions and dialogue in the order they're positioned on the panel — top to bottom, then left to right within a row. There's no separate sequence to manage; just drop each bubble where you want a reader's eye to land first.

Numbered badges on every non-empty bubble show the read order at a glance — cyan for captions, magenta for dialogue.
Drag to reorder — moving a bubble higher, lower, left, or right updates its number instantly.
Two bubbles within ~10% of the panel height read as a row — a slight vertical drift between them won't flip the order, matching how the eye actually groups them.
Empty bubbles (no text yet) don't take a number and aren't spoken.
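
The ordering rules above can be sketched in a few lines of Python. This is one plausible reading, not the app's actual code: positions are assumed to be fractions of the panel (0 to 1), and each bubble is compared with the first bubble of the current row to decide whether it belongs to that row. The reading_order function and the bubble fields are hypothetical.

```python
ROW_TOLERANCE = 0.10  # fraction of panel height, from the "~10%" rule


def reading_order(bubbles):
    # Empty bubbles take no number and are not spoken.
    spoken = [b for b in bubbles if b["text"].strip()]
    spoken.sort(key=lambda b: b["y"])  # top to bottom

    rows = []
    for bubble in spoken:
        # Assumption: compare with the first (topmost) bubble of the row.
        if rows and bubble["y"] - rows[-1][0]["y"] <= ROW_TOLERANCE:
            rows[-1].append(bubble)
        else:
            rows.append([bubble])

    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: b["x"]))  # left to right
    return ordered


bubbles = [
    {"id": "A", "x": 0.6, "y": 0.10, "text": "Look out!"},
    {"id": "B", "x": 0.1, "y": 0.15, "text": "What?"},
    {"id": "C", "x": 0.3, "y": 0.80, "text": "Night fell."},
    {"id": "D", "x": 0.5, "y": 0.40, "text": ""},
]
order = [b["id"] for b in reading_order(bubbles)]
print(order)  # ['B', 'A', 'C']
```

Bubble B sits slightly lower than A but within the row tolerance, so the two are read left to right, and the empty bubble D is skipped.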

When audio re-renders vs. plays from cache

Audio is generated on demand and cached. Re-listens are free — audio streams straight from Cloudinary. The audio re-renders automatically when:

You edit the text of any caption or dialogue
You drag a bubble to a new position (re-orders playback)
You change a character's voice
You change the story's default narrator voice

Re-renders count against your monthly generation limit, the same as regenerating a panel.
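
One way to picture the cache behavior is a key built from everything that affects the audio. The Python sketch below is an assumption about how such a cache could work, not a description of the app's internals: audio_cache_key is a hypothetical helper, and this version only reacts to drags that actually change the playback order.

```python
import hashlib
import json


def audio_cache_key(ordered_lines):
    """ordered_lines: list of (text, voice_id) pairs in playback order.

    Captions and unvoiced dialogue are assumed to carry the story's default
    narrator as their voice_id, so changing the narrator changes the key too.
    """
    payload = json.dumps(ordered_lines, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


base = [("Dusk fell.", "narrator"), ("Did you hear that?", "voice-a")]
edited = [("Night fell.", "narrator"), ("Did you hear that?", "voice-a")]
reordered = list(reversed(base))
revoiced = [("Dusk fell.", "narrator"), ("Did you hear that?", "voice-b")]

print(audio_cache_key(base) == audio_cache_key(list(base)))  # same content
print(audio_cache_key(base) == audio_cache_key(edited))      # text changed
```

If the key matches a stored clip, the audio plays from cache; any change to text, order, or voice produces a new key and triggers a re-render.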

Why it sometimes takes a moment

Generating a single panel involves resizing your reference photos, sending them to the AI, constructing the prompt, generating the image (~10–30 seconds), and post-processing. Most panels complete within 30 seconds. If a panel takes longer, the system shows a slow-generation banner with the option to cancel and retry.

If a generation fails, you don't pay for it — only successful panels count toward your monthly limit.

What counts as a generation

Most paid actions in the studio share the same generation counter — your monthly plan cap covers all of them collectively:

Generating a panel — 1 generation
Recast (try new poses on a panel) — 1 generation per click
Refine (text-edit a panel) — 1 generation
Generate-from-description in the character creator — 1 generation per Try Again
Panel narration (TTS) — 1 generation per panel that gets voiced
Uploading a character photo — free (background removal runs in your browser and does not use full image generation)

When you hit your monthly cap, prepaid credit packs cover the overflow. When credits run out too, the upgrade prompt opens.
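
Put together, the counting rules look roughly like the Python sketch below. The action names, the Account fields, and the assumption that one credit covers one generation are hypothetical labels for illustration; the costs themselves come from the list above.

```python
from dataclasses import dataclass

COSTS = {
    "generate_panel": 1,
    "recast": 1,
    "refine": 1,
    "generate_character": 1,  # each Try Again in the character creator
    "narrate_panel": 1,
    "upload_photo": 0,        # background removal is free
}


@dataclass
class Account:
    monthly_cap: int
    used_this_month: int = 0
    credits: int = 0


def funding_source(account, action):
    """Decide what would pay for an action before it runs."""
    cost = COSTS[action]
    if cost == 0:
        return "free"
    if account.used_this_month + cost <= account.monthly_cap:
        return "plan"
    if account.credits >= cost:
        return "credits"
    return "upgrade_prompt"


def record_success(account, action, source):
    """Call only after the action succeeds; failures are never counted."""
    cost = COSTS[action]
    if source == "plan":
        account.used_this_month += cost
    elif source == "credits":
        account.credits -= cost


account = Account(monthly_cap=2, credits=1)
history = []
actions = ["generate_panel", "recast", "upload_photo", "refine", "narrate_panel"]
for action in actions:
    source = funding_source(account, action)
    record_success(account, action, source)
    history.append(source)
print(history)  # ['plan', 'plan', 'free', 'credits', 'upgrade_prompt']
```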

Exporting your story

Once your panels are done, click Export to download your story:

PDF — full storybook with cover page, all panels, and dialogue baked in
PNG — individual panels at high resolution
Instagram, TikTok, Twitter, Facebook — pre-formatted for each platform with captions and dialogue burned in
Video (coming soon) — fully narrated comic video for YouTube and social

What's coming next

A few features are on the roadmap based on early-tester feedback:

Collaborative editing — invite friends, family, or your team to a shared story so multiple people can build a comic together, like a shared Google Doc for stories. Click Share → Invite Collaborators to get notified when it ships.
Video export — fully narrated comic videos with intros, transitions, and outros, ready to drop on YouTube or TikTok.
Custom voices — design your own narrator voice from a short audio sample (Creator tier perk).

🎉 That's the whole product. Now go make something cool.

Ready to start?

The fastest way to learn is by making one.

Create your first character · Start a new story