(new anime? not until Saturday!)
The relative ease of customizing Stable Diffusion models means that thousands of people are stirring the pot and training their own. This is good, since the official models are biased and censored, but it’s also bad, because the derivative models are biased in different directions, and often over-trained to the point that they simply snap when you find their edges.
Most people don’t do their custom training against the base SD models; they layer their collection of picture/keyword pairs on top of one that’s already been “uncensored” or augmented in some way, with the two major anime branches being Illustrious and Pony. What this means in practice is that feeding the same settings to related models will often produce very similar results.
So, just how similar do they get?
I’ve been using SwarmUI’s grid feature to evaluate different models by passing them all the same prompt, seed, and settings.
For each set, I used a character LoRA (a small patch model that adds character, style, or location data to other models, with success varying by heredity), and generated multiple pictures in my go-to model for cute-and-occasionally-naughty material, CAT - Citron Anime Treasures (Illustrious-based), until I found something that looked like a decent starting point:
Setting aside the boilerplate and the character trigger words, the prompt was:
laughing, standing with arms spread, head back, grounded stance, freedom in motion, outdoors, at Santorini, Greece
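For anyone who wants to script the same kind of comparison, here is a minimal Python sketch of the idea: one job per model, with the prompt, seed, and settings held constant so that only the model changes. This is not SwarmUI's API; the model names, the seed, and the `steps`/`cfg_scale` settings are all hypothetical placeholders, and the character trigger and boilerplate are left out.

```python
def build_grid(models, prompt, seed, settings):
    """Return one job dict per model, all sharing prompt, seed, and settings."""
    return [
        {"model": model, "prompt": prompt, "seed": seed, **settings}
        for model in models
    ]


# Hypothetical model names and settings, for illustration only.
models = ["model-a", "model-b", "model-c"]
prompt = (
    "laughing, standing with arms spread, head back, grounded stance, "
    "freedom in motion, outdoors, at Santorini, Greece"
)
jobs = build_grid(models, prompt, seed=12345, settings={"steps": 30, "cfg_scale": 6})

for job in jobs:
    print(job["model"], job["seed"], job["steps"], job["cfg_scale"])
```

Because everything except `model` is identical across jobs, any difference between the resulting pictures can be pinned on the model itself.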
One thing that happens with models trained on Danbooru, etc. is that they already know about common anime characters. Just put “misty” in your prompt, and you’ll see her and other Pokémon gals, even if you’re clearly using it as an adjective to describe the background scenery (“misty mountains” could go either way…).
Chika’s definitely built into many of the anime models, so here’s a much newer gal:
[character trigger] grinning, Standing on toes, arms stretched outward, playful and dynamic positioning, outdoors, at Hamat Gader Hot Springs, Israel
This chick showed up in at least six different 3D models.
Much better than its Chika!
Do you see “bicycle” or “handlebars” anywhere in the prompt? No. So why did a bicycle show up three times in different 3D models (and only 3D, not any of the 2D)? Because the character trigger prompt was:
angelica-default, blue eyes, blonde hair, long hair, feather hair ornament, red cape, white armor, white skirt, bike shorts, thigh boots, white gloves
Sure enough, none of the 3D results included bike shorts.
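A rough way to spot this kind of leak before generating anything is to look for multi-word tags that contain a word a model might also read as a concept in its own right. The Python sketch below is only a heuristic: it says nothing about how any model actually tokenizes or weights a prompt, and the `concepts` watchlist is a hypothetical list you would have to maintain yourself.

```python
def split_tags(prompt):
    """Split a comma-separated prompt into lowercase, trimmed tags."""
    return [tag.strip().lower() for tag in prompt.split(",") if tag.strip()]


def leaky_words(prompt, standalone_concepts):
    """Map each multi-word tag to the words in it that are also on the watchlist."""
    hits = {}
    for tag in split_tags(prompt):
        words = tag.split()
        if len(words) < 2:
            continue  # single-word tags cannot be split apart by the model
        found = [word for word in words if word in standalone_concepts]
        if found:
            hits[tag] = found
    return hits


trigger = (
    "angelica-default, blue eyes, blonde hair, long hair, feather hair ornament, "
    "red cape, white armor, white skirt, bike shorts, thigh boots, white gloves"
)
concepts = {"bike", "feather", "misty"}  # hypothetical watchlist

for tag, words in leaky_words(trigger, concepts).items():
    print(f"{tag!r} may leak: {words}")
```

With this watchlist, the check flags “feather hair ornament” and “bike shorts”, and it would flag “misty mountains” the same way.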