Generational Generation


(new anime? not until Saturday!)

The relative ease of customizing Stable Diffusion models means that thousands of people are stirring the pot and training their own. This is good, since the official models are biased and censored, but it’s also bad, because the derivative models are biased in different directions, and often over-trained to the point that they simply snap when you find their edges.

Most people don’t do their custom training against the base SD models; they layer their collection of picture/keyword pairs on top of one that’s already been “uncensored” or augmented in some way, with the two major anime branches being Illustrious and Pony. What this means in practice is that feeding the same settings to related models will often produce very similar results.

So, just how similar do they get?

I’ve been using SwarmUI’s grid feature to evaluate different models by passing them all the same prompt, seed, and settings.
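For anyone who wants to reproduce the idea outside of a UI, here is a minimal sketch of what "same prompt, seed, and settings, different model" boils down to. It is not SwarmUI's API; the `GenSettings` record, the `model_grid` helper, the seed, the step count, the CFG scale, and the second and third model names are all made up for illustration. Only the CAT model name and the prompt come from this post.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenSettings:
    """One generation job. Field names and defaults are hypothetical."""
    model: str
    prompt: str
    seed: int
    steps: int = 30          # assumed value, not from the post
    cfg_scale: float = 6.0   # assumed value, not from the post


def model_grid(base: GenSettings, models: list[str]) -> list[GenSettings]:
    """Return one job per model, identical to `base` except for `model`."""
    return [replace(base, model=name) for name in models]


base = GenSettings(
    model="CAT - Citron Anime Treasures",
    prompt=("laughing, standing with arms spread, head back, grounded stance, "
            "freedom in motion, outdoors, at Santorini, Greece"),
    seed=12345,  # placeholder seed
)

# Placeholder model names; swap in whatever checkpoints are installed.
jobs = model_grid(base, [base.model, "some-pony-model", "some-3d-model"])

for job in jobs:
    print(job.model, job.seed, job.steps, job.cfg_scale)
```

Each record would then be handed to whatever actually renders the image. The point is only that the model is the single axis that varies, so any difference in the output comes from the model.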

For each set, I used a character LoRA (a small patch model that adds character/style/location data on top of other models, with success varying by heredity), and generated multiple pictures in my go-to model for cute-and-occasionally-naughty material, CAT - Citron Anime Treasures (Illustrious-based), until I found something that looked like a decent starting point:

Setting aside the boilerplate and the character trigger words, the prompt was:

laughing, standing with arms spread, head back, grounded stance, freedom in motion, outdoors, at Santorini, Greece


A Cuddle of Chikas

What the official SD XL model returns:

Diverging:

Still in the neighborhood:

Pony-based, but still surprisingly similar:

Furry Chika:

PVC Chika:

Plausible cosplay at similar locations:

Naughty cosplay:

Not Chika, but cute:

Also Not Chika, also cute:

Completely different gal:

One thing that happens with models trained on Danbooru, etc. is that they already know about common anime characters. Just put “misty” in your prompt, and you’ll see her and other Pokémon gals, even if you’re clearly using it as an adjective to describe the background scenery (“misty mountains” could go either way…).

Chika’s definitely built into many of the anime models, so here’s a much newer gal:

[character trigger] grinning, Standing on toes, arms stretched outward, playful and dynamic positioning, outdoors, at Hamat Gader Hot Springs, Israel

Starting point (CAT)

Closest match

Different enough to be interesting

Well, I knew she had a crush…

Wait, who are they?

Not Angelica, but…

This chick showed up in at least six different 3D models.

I could have used a few more copies of these two:

Official SD model

Much better than its Chika!

Random bicycle is random

Do you see “bicycle” or “handle bars” anywhere in the prompt? No. So, why did it show up three times in different 3D models (and only 3D, not any of the 2D)? Because the character trigger prompt was:

angelica-default, blue eyes, blonde hair, long hair, feather hair ornament, red cape, white armor, white skirt, bike shorts, thigh boots, white gloves

Sure enough, none of the 3D results included bike shorts. Those models apparently read “bike” as a bicycle rather than as part of a clothing tag.
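A rough way to catch this sort of thing before burning GPU time is to split the prompt into its comma-separated tags and flag any tag containing a word that could be read on its own. This is only a sketch: the `AMBIGUOUS` word list is hand-made and hypothetical, and it says nothing about how any particular model actually tokenizes a prompt.

```python
def split_tags(prompt: str) -> list[str]:
    """Split a comma-separated prompt into trimmed, non-empty tags."""
    return [tag.strip() for tag in prompt.split(",") if tag.strip()]


def leaky_words(prompt: str, ambiguous: set[str]) -> dict[str, list[str]]:
    """Map each ambiguous word to the tags in the prompt that contain it."""
    hits: dict[str, list[str]] = {}
    for tag in split_tags(prompt):
        for word in tag.lower().split():
            if word in ambiguous:
                hits.setdefault(word, []).append(tag)
    return hits


# Hypothetical list of words that mean something else on their own.
AMBIGUOUS = {"bike", "misty"}

trigger = ("angelica-default, blue eyes, blonde hair, long hair, "
           "feather hair ornament, red cape, white armor, white skirt, "
           "bike shorts, thigh boots, white gloves")

print(leaky_words(trigger, AMBIGUOUS))
# {'bike': ['bike shorts']}
```

The same check flags “misty mountains” for the Pokémon problem mentioned earlier, as long as “misty” is on the list.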

