Walkthrough · 10 minute read
How Melted Crayons Works
From a single photo to a fully narrated comic. Here's the path most creators take, in the order it makes sense to learn it.
Section 01
Characters & Voices
Every story starts with characters. Upload a photo once, and the AI uses it as a visual reference across every panel — so the hero in panel 1 looks like the hero in panel 6, regardless of art style.
Uploading a reference photo
Go to Studio → Characters → + New Character. Choose a clear photo of yourself, your kid, your pet, anyone you want to star in your comics.
- Best results: a front-facing or three-quarter photo with the face clearly visible
- Avoid: sunglasses, heavy shadows, multiple people in one shot, low-resolution images
- Photos are resized automatically before upload, so high-res phone photos work fine
- The AI will extract the subject onto a transparent background — you'll see a preview before saving
No photo? Generate one from a description
In the character creator, switch the source toggle to ✨ Generate from Description. Type what you want — “a young girl with curly red hair, freckles, green eyes, wearing a yellow raincoat” — and the AI produces a clean reference portrait you can use as the character's identity anchor.
- Best for: fictional or anonymous heroes — dragons, robots, space pirates, cartoon kids, made-up characters of any kind
- Honest tradeoff: real photos preserve identity slightly better across many panels. Generated characters are still very consistent — just not pixel-perfect for real-world resemblance
- Click ✨ Try Again to regenerate if the first result isn't quite right. Each attempt counts as 1 generation, the same as creating a panel
- The generated image flows through the same background-removal step as uploaded photos. Once extracted, the character behaves identically downstream
Background removal happens automatically
The moment you upload a photo or generate one from a description, the AI starts extracting the subject onto a transparent background — no extra button to click. You'll see “Removing background” under the Save button while it runs. The extracted cutout shows next to the source when it's done; the Save button activates as soon as both are ready.
- Runs in your browser — your photo never leaves your device for the extraction step. Privacy bonus, and zero per-extraction cost.
- The first time you do this, the AI model downloads (~30 MB, cached from then on). Subsequent extractions take 1–3 seconds.
- If the result looks off, click Retry next to the preview to re-run the extraction on the same source.
Naming and panel defaults
Give your character a name. For people-style characters you can also set a default pose (standing, action, sitting, running, flying) and default expression (neutral, happy, angry, etc.). When you add this character to a new panel, those defaults pre-fill the panel's pose and expression dropdowns — override per panel if you want a different action.
Voice selection also lives on the character (see below) so the same character speaks with the same voice across every story they appear in.
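The "defaults pre-fill, override per panel" rule can be sketched in a few lines. This is only an illustration of the behavior described above, not the app's actual data model: the `Character` and `PanelCasting` classes and their field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Character:
    # Hypothetical shape: a name plus the optional defaults set in the creator.
    name: str
    default_pose: Optional[str] = None
    default_expression: Optional[str] = None


@dataclass
class PanelCasting:
    # One character placed in one panel, with optional per-panel overrides.
    character: Character
    pose: Optional[str] = None
    expression: Optional[str] = None

    def effective_pose(self) -> Optional[str]:
        # A per-panel choice wins; otherwise fall back to the character default.
        return self.pose or self.character.default_pose

    def effective_expression(self) -> Optional[str]:
        return self.expression or self.character.default_expression
```

For example, a hero saved with a default pose of "standing" shows up as "standing" in a new panel, and switches to "running" only in the panel where you pick it.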
Voice selection
Pick a voice for your character's narration. When you add dialogue or narration to a panel and click the speaker icon, this voice will read it aloud.
- Filter by gender — quick toggle row above the dropdown (All / Female / Male / Any-Neutral) so you can narrow to voices that match your character's gender
- Voices are also grouped by vibe — narrator, young, mature, hero, villain, character, announcer
- Each option shows the gender label inline (e.g. [Male] Earnest Young Man) so it's scannable even without the filter
- Click the play icon next to a selected voice to preview it before committing
- Leave a character's voice unset to use the story's default narrator instead
Products & brands as characters
Characters aren't limited to people. You can also create Products (like a coffee cup or a sneaker) or Brands (a logo or mascot). The AI preserves their exact appearance the same way it preserves a face. Useful for product storytelling, ads, or branded comics.
Section 02
Style, Layout & Aspect
Before you create panels, you set the look of your story. These choices apply across every panel for visual consistency, but you can override any of them per panel later.
Choosing an art style
Melted Crayons ships with 30+ art styles — comic book, anime, watercolor, LEGO, Ghibli-inspired, claymation, pixel art, noir, pop art, cyberpunk, and more. Each style has its own visual language, and the AI is trained to render in it consistently.
- Storytelling pick: “storybook” for bedtime stories or “comic” for classic sequential art
- Show-stopping pick: “LEGO” or “Ghibli” — these styles are instantly recognizable and feel premium
- Brand pick: “cinematic” or “photorealistic” for product or marketing comics
- You can change the style mid-story, but the AI works hardest to stay consistent when you keep one style throughout
Aspect ratio (panel shape)
Pick the shape of each panel:
- 4:3 — classic comic book proportions, balanced for reading
- 16:9 or 21:9 — wide landscape, cinematic for action shots
- 1:1 — square, ideal if you plan to share to Instagram
- 9:16 — vertical, for TikTok / IG Story exports
You can mix aspect ratios within a single story — a wide establishing shot, then a tight portrait close-up, then a square action panel. Each one is independent.
Resolution and quality
Default is 2K, which is sharp enough for screen and most prints. Higher tiers can render at 3K for poster-quality detail. Output size, watermark, and commercial license are all set by your plan.
Section 03
Panel Creation
This is the core loop. Every panel goes through four moves: write the scene, cast your characters, add the text, and refine until it's right.
The Scene — describing what's happening
The scene prompt is a short description of what the panel shows. Write it like you'd describe a movie shot to a friend.
“A dragon flies low over a glowing forest at dusk. Mist curls between the trees. A castle rises in the distance.”
- Be specific about lighting and mood: “dusk,” “rain-soaked,” “cozy interior,” “harsh noon sun”
- Describe the camera: “low-angle shot,” “close-up,” “wide establishing view”
- Don't describe the characters here — the AI handles them via your reference photos. Just describe the world they're in.
- You can also pick a scene preset from the dropdown (forest, castle, city, etc.) instead of writing your own
Cinematography (optional but powerful)
For more cinematic shots, pick from the four cinematography fields:
- Shot: close-up, medium, wide, establishing, etc.
- Camera angle: low-angle (heroic), high-angle (vulnerable), dutch angle (off-balance), eye-level (neutral)
- Lighting: golden hour, neon noir, soft daylight, moody shadows
- Mood: joyful, tense, melancholic, epic
These add cinematic language to your prompt without you having to write it manually. Especially useful when you want a panel to feel like a movie still, not a generic illustration.
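To picture what "adding cinematic language to your prompt" might look like, here is a minimal sketch that appends whichever of the four fields are set to the scene text. The function name `build_scene_prompt`, the "Cinematography:" label, and the exact wording are assumptions for illustration; the real prompt format is not documented here.

```python
def build_scene_prompt(scene: str, shot: str = "", angle: str = "",
                       lighting: str = "", mood: str = "") -> str:
    """Append whichever cinematography fields are set to the scene text."""
    parts = [
        f"{shot} shot" if shot else "",
        f"{angle} camera angle" if angle else "",
        f"{lighting} lighting" if lighting else "",
        f"{mood} mood" if mood else "",
    ]
    # Unset fields are skipped so they add nothing to the prompt.
    details = ", ".join(p for p in parts if p)
    scene = scene.strip()
    if not details:
        return scene
    return f"{scene} Cinematography: {details}."
```

Leaving every field empty returns the scene unchanged, which matches the fields being optional.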
Casting — adding characters to the scene
Click + Add Character to drop one of your created characters into the panel. You can:
- Add multiple characters per panel (up to your panel limit)
- Drag each character on the canvas to set their position
- Set per-panel pose (running, sitting, fighting) and expression (smiling, surprised, angry)
- If a character has a default pose set, it pre-fills here — you can override
The AI uses each character's reference photo for identity preservation, so they'll look like the same person across every panel even though the AI is generating fresh images.
Text — captions and dialogue
Two kinds of text overlay:
- Captions: narration boxes — usually placed at the top or bottom of a panel. Style options include narration (default), thought, action, and location stamps.
- Dialogue: speech bubbles tied to a character. Choose between speech (default), thought (cloud-style), shout (loud emphasis), and whisper (italic).
For both, you can dial:
- Position: drag anywhere on the panel
- Width: 10–90% of panel width — controls wrapping
- Font size, family, and color
- Corner radius: sharp rectangle, soft round, or full pill — match the comic-book style you're going for
- Tail direction (dialogue only): point to who's speaking
Click the speaker icon on a panel to hear the captions and dialogue read aloud using each character's voice.
Regeneration — making it right
The first generation rarely lands perfectly. You have three tools to iterate, plus undo and redo to move between versions:
- Generate — full regeneration with the same prompt. Good if a panel just landed weird and you want a fresh attempt.
- Refine — type a text edit instruction (e.g. “make the dragon larger” or “change the lighting to morning”) and the AI edits the panel without redoing it from scratch.
- Recast (the gradient button with the refresh-halo icon) — keeps the scene, background, and lighting exactly as they are, but generates fresh character poses and expressions. Different take, same set. Each click rolls a different angle, gesture, and energy so successive recasts feel distinct.
- Undo / Redo — every regeneration is saved. Step back through previous versions or step forward again with the arrows.
Visual cues built into the editor
A few small affordances in the studio that help you see what you've filled in and what each control does:
- Tab progress dots — small green dots appear next to Scene, Characters, and Text when those tabs have content. Quick way to see what's done at a glance.
- Helper captions under each action button — Generate (“Build a fresh panel”), Recast (“Try new poses”), Refine (“Edit with text”) — so the three regen options are easy to tell apart.
- Disabled-Generate hint — when the Generate button is greyed out, a small message points to the editor where you need to type your scene description.
- Story Settings auto-collapse — settings (engine, aspect, narrator, art style, layout) hide behind a gear button so they don't crowd the workspace once you've set them.
Render words inside the image (in-scene text)
Below the canvas there's a toggle labeled “Render words inside the image”. This is for text that should appear inside the artwork — signs, posters, billboards, comic-style sound effects baked into the scene.
- Use this for: a street sign saying “BEWARE,” a billboard reading “GOTHAM TIMES,” a t-shirt graphic, a chalkboard, a book cover in the scene
- Don't confuse with captions and dialogue — those go in the Text tab and are drawn as overlays on top of the panel. The toggle is for words rendered by the AI inside the artwork itself
- Modern image engines render text fairly cleanly — works best with short phrases (1–4 words)
Section 04
AI Storytelling
A few things happen behind the scenes that are worth knowing — they explain why the app behaves the way it does and how to get the most out of it.
Cross-panel style consistency
When you generate panel 2 and beyond, the AI looks at panel 1 to understand your story's exact visual language — palette, lighting, brushwork, mood — and matches it. This is why your comic feels like a coherent piece, not a collection of random AI images.
The technical name is style anchoring. It works automatically. Your job is to make sure your first panel looks the way you want the rest of the story to look.
- Regenerate the source panel and the anchor refreshes automatically — if you redo panel 1 to a new look, later panels you generate will pick up the new style anchor instead of staying locked to the old one.
- Existing panels stay in their original style until you regenerate them. Style changes don't retroactively re-render finished panels.
- Changing the story's default art style (in story settings) clears the cached anchor and warns you — your existing panels keep their look, new panels follow the new style.
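The anchor rules above can be modeled as a small state machine. This is a rough sketch under stated assumptions, not how the app is implemented: the `Story` class, its fields, and the use of image ids as stand-ins for rendered panels are all hypothetical, and it assumes panel 1 is always the source panel.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Story:
    art_style: str
    panels: dict[int, str] = field(default_factory=dict)  # panel number -> image id
    style_anchor: Optional[str] = None  # image id that later panels try to match

    def generate_panel(self, number: int, image_id: str) -> Optional[str]:
        """Store a panel and return the anchor it was matched against, if any."""
        matched_against = None if number == 1 else self.style_anchor
        self.panels[number] = image_id
        if number == 1:
            # Regenerating the source panel refreshes the anchor.
            self.style_anchor = image_id
        return matched_against

    def set_art_style(self, style: str) -> None:
        # Changing the default style clears the cached anchor.
        # Finished panels are left untouched.
        if style != self.art_style:
            self.art_style = style
            self.style_anchor = None
```

Note that nothing in the sketch re-renders existing entries in `panels`, which mirrors the rule that style changes are never applied retroactively.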
Character identity preservation
Your character reference photos are passed to the AI on every single panel where that character appears. The AI is instructed to reproduce their face, features, hair, and skin tone precisely. This is what lets the same hero appear across many panels and styles without slowly morphing into someone else.
The better your reference photo (clear, well-lit, front-facing), the better the consistency. A blurry side profile gives the AI less to work with.
The AI engines
Melted Crayons uses two AI image engines under the hood:
- Gemini Nano Banana 2 — Google's latest. Excellent character consistency and fine-detail rendering.
- SeedDream 5.0 — ByteDance's engine. Strong stylistic range and fast generation.
You can switch between them in the story settings. Both are high-quality; sometimes one nails a specific style or character better than the other. If a panel feels off, try the other engine.
Narration and voice
Narration is text-to-speech with natural prosody. When you click the speaker on a panel, the caption text and dialogue text are voiced in order, each line using its character's assigned voice (or your story's default narrator for captions and any dialogue without a per-character voice).
- Cartesia voices — fast, natural, available on every plan including free.
- ElevenLabs voices — premium voice acting with richer emotional range, unlocked on Creator and Studio plans. Marked in the picker with a 🔒 badge for free / Storyteller users so you can see what's available.
- Filter by gender — quick toggle row above the voice dropdown narrows to female / male / neutral so you can match a character without scrolling every voice.
- Conversational pauses — the player adds natural beats between clips: a longer breath when a different character takes their turn, a shorter pause when the same speaker continues. Feels like reading aloud, not reading a script.
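Two of the rules above are easy to express in code: the fallback to the story's narrator when a character has no voice, and the longer pause on a speaker change. The sketch below is illustrative only. The function names are hypothetical, and the pause lengths are made-up placeholder values, since this guide does not state the real ones.

```python
from typing import Optional

# Placeholder pause lengths in seconds (assumed, not the app's real values).
PAUSE_SAME_SPEAKER = 0.3
PAUSE_NEW_SPEAKER = 0.7


def resolve_voice(character_voice: Optional[str], narrator_voice: str) -> str:
    """Use the character's voice if set, else the story's default narrator."""
    return character_voice or narrator_voice


def pauses_between(speakers: list[str]) -> list[float]:
    """Return the pause to insert after each clip except the last."""
    return [
        PAUSE_SAME_SPEAKER if current == following else PAUSE_NEW_SPEAKER
        for current, following in zip(speakers, speakers[1:])
    ]
```

With speakers `["narrator", "Mia", "Mia", "Leo"]`, the sketch yields a long pause, a short pause, then a long pause.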
Reading order: position controls playback
The audio reads your captions and dialogue in the order they're positioned on the panel — top to bottom, then left to right within a row. There's no separate sequence to manage; just drop each bubble where you want a reader's eye to land first.
- Numbered badges on every non-empty bubble show the read order at a glance — cyan for captions, magenta for dialogue.
- Drag to reorder — moving a bubble higher, lower, left, or right updates its number instantly.
- Two bubbles within ~10% of the panel height read as a row — a slight vertical drift between them won't flip the order, matching how the eye actually groups them.
- Empty bubbles (no text yet) don't take a number and aren't spoken.
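The ordering rules above amount to a small sorting algorithm. Here is a minimal sketch of one way to implement them. The `Bubble` class and its fractional coordinates are hypothetical, and the row-grouping detail (comparing each bubble to the topmost bubble of the current row) is an assumption about how "within ~10%" is measured.

```python
from dataclasses import dataclass


@dataclass
class Bubble:
    text: str
    x: float  # left position as a fraction of panel width (0.0 to 1.0)
    y: float  # top position as a fraction of panel height (0.0 to 1.0)


ROW_TOLERANCE = 0.10  # the "~10% of the panel height" rule


def reading_order(bubbles: list[Bubble]) -> list[Bubble]:
    """Return non-empty bubbles in playback order."""
    # Empty bubbles get no number and are not spoken.
    spoken = [b for b in bubbles if b.text.strip()]
    spoken.sort(key=lambda b: b.y)

    # Group bubbles into rows, top to bottom.
    rows: list[list[Bubble]] = []
    for bubble in spoken:
        if rows and bubble.y - rows[-1][0].y <= ROW_TOLERANCE:
            rows[-1].append(bubble)
        else:
            rows.append([bubble])

    # Within each row, read left to right.
    ordered: list[Bubble] = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: b.x))
    return ordered
```

In this sketch, a bubble at the right that sits slightly higher than one at the left still reads second, because both fall in the same row.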
When audio re-renders vs. plays from cache
Audio is generated on demand and cached. Re-listens are free — audio streams straight from Cloudinary. The audio re-renders automatically when:
- You edit the text of any caption or dialogue
- You drag a bubble to a new position (re-orders playback)
- You change a character's voice
- You change the story's default narrator voice
Re-renders count against your monthly generation limit, same as a panel re-generation.
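One common way to get this "re-render only when something audible changes" behavior is to derive a cache key from everything that affects the sound. The sketch below shows the idea. It is not a description of the app's internals: `audio_cache_key` and the dict shape for lines are hypothetical, and it assumes a drag only matters when it changes the reading order.

```python
import hashlib
import json


def audio_cache_key(lines: list[dict], narrator_voice: str) -> str:
    """Hash everything that changes how a panel sounds.

    `lines` holds the panel's spoken lines, already in reading order. Each
    item is a dict with "text" and an optional "voice" (hypothetical shape).
    """
    payload = {
        "narrator": narrator_voice,
        "lines": [
            # Lines without their own voice fall back to the narrator.
            {"text": line["text"], "voice": line.get("voice") or narrator_voice}
            for line in lines
        ],
    }
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

If the key matches a stored clip, the player can stream the cached audio. If the text, order, character voice, or narrator changes, the key changes and a fresh render is needed.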
Why it sometimes takes a moment
Generating a single panel involves resizing your reference photos, sending them to the AI, constructing the prompt, generating the image (~10–30 seconds), and post-processing the result. Most panels complete within 30 seconds. If a panel takes longer, the system shows a slow-generation banner with the option to cancel and retry.
If a generation fails, you don't pay for it — only successful panels count toward your monthly limit.
What counts as a generation
Most paid actions in the studio share the same generation counter — your monthly plan cap covers all of them collectively:
- Generating a panel — 1 generation
- Recast (try new poses on a panel) — 1 generation per click
- Refine (text-edit a panel) — 1 generation
- Generate-from-description in the character creator — 1 generation per Try Again
- Panel narration (TTS) — 1 generation per panel that gets voiced
- Uploading a character photo — free (uses cheaper background-removal, not full image generation)
When you hit your monthly cap, prepaid credit packs cover the overflow. When credits run out too, the upgrade prompt opens.
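The billing order described above (monthly cap first, then prepaid credits, then the upgrade prompt) can be sketched as follows. The action keys, the `Usage` class, and the `charge` function are hypothetical names for illustration. The per-action costs are taken from the list above.

```python
from dataclasses import dataclass

# Per-action costs from the list above; the key names are hypothetical.
ACTION_COST = {
    "generate_panel": 1,
    "recast": 1,
    "refine": 1,
    "character_from_description": 1,
    "panel_narration": 1,
    "upload_character_photo": 0,
}


@dataclass
class Usage:
    monthly_cap: int
    used_this_month: int = 0
    credits: int = 0  # prepaid credit pack balance


def charge(usage: Usage, action: str) -> str:
    """Charge one successful action and report where it was billed."""
    cost = ACTION_COST[action]
    if cost == 0:
        return "free"
    if usage.used_this_month + cost <= usage.monthly_cap:
        usage.used_this_month += cost
        return "monthly"
    if usage.credits >= cost:
        usage.credits -= cost
        return "credits"
    return "upgrade_prompt"
```

Because failed generations are not billed, a function like `charge` would only be called after an action succeeds.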
Exporting your story
Once your panels are done, click Export to download your story:
- PDF — full storybook with cover page, all panels, and dialogue baked in
- PNG — individual panels at high resolution
- Instagram, TikTok, Twitter, Facebook — pre-formatted for each platform with captions and dialogue burned in
- Video (coming soon) — fully narrated comic video for YouTube and social
What's coming next
A few features are on the roadmap based on early-tester feedback:
- Collaborative editing — invite friends, family, or your team to a shared story so multiple people can build a comic together, like a shared Google Doc for stories. Click Share → Invite Collaborators to get notified when it ships.
- Video export — fully narrated comic videos with intros, transitions, and outros, ready to drop on YouTube or TikTok.
- Custom voices — design your own narrator voice from a short audio sample (Creator tier perk).
Ready to start?
The fastest way to learn is by making one.