Use of Generative AI by ENTIA Comics

Introduction

Generative AI caused considerable controversy during its initial breakthrough into the mainstream. This strong reaction was driven by the ability of tools like Midjourney, Stable Diffusion and ChatGPT to produce substantial output from relatively minuscule input from the user. In effect, a machine could now perform complex tasks and generate creations that formerly required a skillful human specialist with years of education and experience.

The power and autonomy of generative tools allow human creators to minimize their own involvement while still maintaining a coherent artistic output. But is AI merely a tool, or can it be regarded as a co-creator of whatever the human operator prompts it to render? And if so, who is the “real” creator – the person or the machine? These are valid questions, and they call for full disclosure about the use of generative AI in the creation of the Universe of ELZA.

The Question of Value

The perceived value of a manuscript is based on the quality of its writing; an illustration is likewise valued for the craftsmanship behind its creation. But the creator’s skill is not the only factor that dictates the value of their work – there is also the matter of the idea!

Plenty of mainstream productions have proved the thesis: no matter how much money or how many man-hours of the world’s best craftsmen are spent on a film, series or comic, if there is no strong thematic content or valuable idea behind it, the creation is doomed to be consumed and forgotten by its audience.

History has also proven that a unique and well-developed idea can turn the debut novel of a newcomer into a best seller, or make a film or series of mediocre visual quality into a cult classic! The same can be said about comics, or sequential art, which combines the communicative qualities of both visual art and literature.

But what happens to perceived value when Generative AI gets involved? The answer is fairly simple – it replaces part of the human labor in the creation of the product, and thereby affects exactly that aspect of value. A completely AI-Generated creation may thus be seen as aesthetically pleasing, yet perceived as devoid of value because it lacks both human craftsmanship and valuable ideas.

This trend can be clearly seen in games with procedurally generated locations – players value exploration in them much less than in handcrafted experiences, even if the latter were initially generated procedurally and then edited by the developers into their final shape.

This means that for an AI-Generated work of art to be perceived as valuable in its own right, it has to contain unique ideas at its core and bear human craftsmanship in its final form. Such a product, where the human creator and the machine are equally involved, can be classified as AI-Assisted: the AI becomes a mere tool or executor, while the human retains the role of the author.

In regard to comics, such an AI-Assisted approach can take many forms: an artist who draws illustrations by hand may ask AI to supply the story with witty dialogue, a writer may illustrate their stories with AI-Generated imagery, or anything in between. The general rule seems to be that the more human craftsmanship is involved in a product, the higher its perceived value will tend to be.

Because of this value-focused approach, the Universe of ELZA features an AI-free* handcrafted narrative together with AI-Assisted illustrations. This means that every article and every line of dialogue in the final product is written by a human. Likewise, all character, item and location designs are deliberately created by a human, even when they are based on AI-Generated Raw imagery.

The AI-Generated images, too, are carefully prompted by the human creator to represent very specific ideas, and they are heavily altered by hand afterwards. As a result, the final product generally contains few to no random elements that were not originally intended by the author.

*All written articles and dialogue may still be processed by AI to correct grammar and spelling. However, no AI is used during the brainstorming/writing phase. This is why writing takes so long at times. 🙂

The Artistic Intention vs. Generative AI

As stated above, AI-Generated illustrations tend to have little to no perceived value compared to handcrafted traditional art. This is because, unlike with traditional techniques, a creator needs very little skill to prompt for an image compared to drawing one by hand. And it is completely true: the image below was created with an extremely simplistic prompt… and it looks dashing!

Prompting for a random result:

“beautiful woman, sci-fi setting”

(Yes! This is the exact prompt for this image!)

But there is a caveat. This image has no creative value because it lacks any artistic intention. It is also completely unsuitable for a sequential project like a comic, both because of its randomness and because it is impossible to replicate by similar means. On the other hand, such a short prompt can be used for brainstorming, as the Generative AI will spit out thousands of completely random variations of a similar concept.
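
As a technical aside, this kind of brainstorming can be reproduced locally with Stable Diffusion through the open-source diffusers library. The sketch below is only an illustration of the principle, assuming the stock Stable Diffusion 1.5 checkpoint and an arbitrary batch of seeds; it is not the exact setup used for the Universe of ELZA.

# A minimal sketch of short-prompt brainstorming with Stable Diffusion via
# the Hugging Face diffusers library. The model, seed range and file names
# are illustrative assumptions, not the exact ENTIA Comics setup.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "beautiful woman, sci-fi setting"  # the exact short prompt quoted above

# Every seed produces a completely different interpretation of the same idea,
# which is exactly what makes short prompts useful for brainstorming.
for seed in range(16):
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"brainstorm_{seed:03d}.png")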

Creating deliberate imagery with Generative AI requires a different approach to prompting. It is much more demanding of the prompter’s art direction skills, but it also results in valuable output that is coherent with the creator’s artistic intention. The illustration below features an original character called The Huntress.

The prompt for this image includes even the smallest details that make the AI-Generated output consistent with the original concept. As a result, the image features a much better palette of complementary colors and presents a photorealistic rendition of the character that remains visually consistent with the stylized version of The Huntress.

Prompting for a deliberate result:

a shot from an 90s science fiction film. middle-aged athletic hairless Iranian woman, tanned skin, slim nose, brown eyes, bald head, angry expression, looking away. she is wearing a futuristic sharp angular black composite armor over a black carbon fiber undersuit, tall protective collar, sleek shoulder pads, wearing a white ceramic chest plate, black tactical gloves. holding a sleek futuristic white blaster rifle with a white stock, white rangefinder scope on top of it. she is standing in an action pose, looking away. lush purple alien jungle, red trees, blue mist in the background. action shot, cinematic shot, intricate details, film grain, in style of 90s retro futuristic aesthetics, masterpiece, hd, sharp focus


Sophisticated prompting alone is rarely enough, though, to produce imagery that can be efficiently used to assemble a comic. Even when user interfaces for Generative AI provide control over a character’s pose and the image perspective, and can load pre-trained LoRAs for character consistency, they still fail to generate output with sufficient fidelity to tell a complex story. This is where a skilled human touch is required.
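
For readers curious about the mechanism behind such character LoRAs, the sketch below shows roughly how a pre-trained LoRA is attached to a Stable Diffusion pipeline in code (the diffusers library stands in for the graphical interfaces mentioned above). The LoRA file name, the prompt and the strength value are hypothetical placeholders.

# A minimal sketch of loading a pre-trained character LoRA with diffusers.
# "huntress_lora.safetensors", the prompt and the strength value are
# hypothetical examples, not actual ENTIA Comics assets.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The LoRA weights bias every generation towards one consistent character design.
pipe.load_lora_weights("huntress_lora.safetensors")

image = pipe(
    "huntress character, black composite armor, purple alien jungle",
    cross_attention_kwargs={"scale": 0.8},  # how strongly the LoRA is applied
).images[0]
image.save("huntress_lora_test.png")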

Everything Begins With a Script

Before even touching the AI tools, a coherent story has to be written. Image 1 shows a fairly early example of a script for NADIA #1. Its format is heavily inspired by the screenwriting process. This practice proved to be inefficient, though, and had to be simplified.

Currently, each comic goes into production in the form of a treatment that describes each scene in detail but omits most of the dialogue except for the most important parts. When the treatment is complete, a mockup of the comic can be drawn based on it. Because Generative AI is used instead of pencil and paper, an Image Library of Raw AI-Generated images has to be created first.

Image 1 – Everything begins with a script.

Writing a detailed script for a comic is not paramount, but it is very helpful for setting up the story and developing the narrative details.

In this image, some lines describe what is shown in the illustrations, while others represent the dialogue.

Traditionally, comics prefer a different format, in which all lines are written directly into speech bubbles over pre-visualized illustrations. This way, every dialogue line is organically adjusted to its place on the page.

Crafting the Image Library of Raw Generations

Plenty of the AI-Generated pre-visualization/reference Raw Images for illustrations are produced by a “shotgun method”, where multiple variants of a character or scene are generated at once. Unlike drawing by hand or working with 3D models, this approach opens up much wider artistic exploration, where AI-specific randomness sometimes provides better options than those originally intended.

Image 2 – There are only 9000 ways to create any scene.

After the outline is written, a concepting process begins. The appearance of each character is planned and prompted.

The same prompt with different descriptions of the pose/angle/expression is used to generate a plethora of images to choose from.

As Image 2 illustrates, very few of the total number of AI-Generated images are good enough to become illustrations in the comic. Most are kept strictly as pose/prompt references or for future use.
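
In code terms, the “shotgun method” boils down to keeping one base prompt fixed and sweeping pose, angle and expression descriptors across many seeds. The sketch below illustrates the idea with the diffusers library; the descriptor lists, batch size and output folder are assumptions made for the example.

# A minimal sketch of the "shotgun method": one fixed base prompt, many
# pose/angle variations, several candidates per combination. Descriptors,
# batch size and paths are illustrative assumptions.
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

base = "middle-aged athletic woman, black composite armor, purple alien jungle"
poses = ["standing in an action pose", "crouching behind cover", "walking towards camera"]
angles = ["low angle shot", "close-up portrait", "wide establishing shot"]

os.makedirs("image_library", exist_ok=True)
for pose in poses:
    for angle in angles:
        prompt = f"{base}, {pose}, {angle}"
        # Four candidates per combination; most will only ever serve as
        # pose or prompt reference, exactly as described above.
        images = pipe(prompt, num_images_per_prompt=4).images
        for i, img in enumerate(images):
            name = f"{pose}_{angle}_{i}".replace(" ", "_")
            img.save(f"image_library/{name}.png")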


Unfortunately, this method is good only for creating simpler character-focused imagery. Illustrations with complex poses, unusual angles or multiple characters require a different approach.

Image 3 – Manual scene composition.

Illustrations where the poses and angle are too important to be left to AI randomness are made with the help of ControlNet, which allows deliberate placement of a skeleton for each character.
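
For the technically inclined, pose-guided generation of this kind can be reproduced with an OpenPose ControlNet roughly as sketched below; the skeleton image file and the prompt are hypothetical placeholders for a manually composed two-character scene.

# A minimal sketch of pose-guided generation with an OpenPose ControlNet.
# "two_character_skeletons.png" stands for a manually composed skeleton
# image and is a hypothetical placeholder.
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The skeleton image fixes where each character stands and how they pose;
# the text prompt only decides how the scene is rendered around them.
pose_image = Image.open("two_character_skeletons.png")
image = pipe(
    "two armored figures arguing in a cramped spaceship corridor, cinematic shot",
    image=pose_image,
).images[0]
image.save("posed_scene.png")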


When a critical mass of images for each scene from the outline has been created, the mockup stage can begin. More concepts can still be generated later if the story requires them.

Assembling the Mockup of the Comic

During the Mockup stage, Raw images are inserted, adjusted, and then gradually replaced first with their manually edited versions and finally with the Final Art. These edits involve replaced backgrounds, hand-painted outfits, handcrafted details and other manual adjustments that make the images ready for the AI-Assisted upscale into the Final Art.

Image 4 – A mockup example based on the script from Image 1.

Images are added to the pages in order to create a pre-visualization of the final comic. These pages are not final, but they give a really good sense of what works and what is missing.

This particular example page is far from perfect:

  • The top image of the city seems to be too small to properly give the reader a sense of place.
  • Images in the door opening sequence lack variety.
  • Elza’s reaction shot at the bottom is too narrow to effectively convey her emotions.

All these flaws will be corrected as the mockup evolves towards its final shape. At this stage, even whole pages can be added or removed in order to improve the pacing and visual flow of the comic.


Just like the final comic, the mockup is assembled inside CLIP Studio Paint. This makes it very easy to finalize the comic by simply replacing the mockup’s material with complete illustrations. The journey from a Raw AI-Generated image to the Final Art is depicted below.

From Raw AI-Generated Image to the Final Art

No image created by ENTIA Comics is ever released without being significantly altered on the way to its final shape. This principle is important for three reasons:

  • It preserves the strong artistic direction behind every aspect of the Universe of ELZA.
  • The hands-on manual approach makes it possible to eliminate the inherent flaws of AI-Generated imagery while maintaining AI-powered speed.
  • Manual alterations and edits create a satisfying sense of ownership for the creator, while readers can feel the care put into each illustration.

Image 5 clearly shows the kind of journey that awaits every single illustration on its way from a Raw AI-Generated piece to the Final Art, which is manually composed in Photoshop.

Image 5 – From an AI-Generated image to the AI-Assisted Final Art in 4 steps.

Taking a detour from Nadia, this is a progression of an illustration featuring Kira from the graphic novel The DESCEND.

The first image was AI-Generated using the “shotgun approach” and picked from the Image Library. Kira’s clothes and environment do not match the artistic intention, though, so they have to be manually edited.

The second image is much closer to the desired result, as it features the right pose, angle, clothes and background. The red patch on Kira’s shoulder is an original handcrafted design.

The third image is an intermediate AI upscale. Here Kira is successfully matched with the environment, but some artifacts appear. Parts of this image will be further edited and processed with AI before being manually combined with handcrafted elements.

The fourth image is the Final Art. It is made of multiple AI-Upscaled elements, some hand-drawn and some based on AI-Generated imagery:

  • The spaceship is back in the background.
  • Her red shoulder patch is consistent with the original design while being seamlessly stylized.
  • Rebecka can be seen studying a plant in the background.

All these steps give the creator impeccable control and turn AI-Generated imagery into a form of handcrafted digital collage. Image 5 shows that no compromises on color, shape or form were made during the creation of the Final Art – the end image becomes exactly what the author intended.
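
The intermediate AI upscale mentioned as the third step of Image 5 can be approximated with an off-the-shelf diffusion upscaler, as in the sketch below. The model choice, file names and prompt are assumptions for illustration and not necessarily the tools used on the actual pages.

# A minimal sketch of an intermediate AI upscale pass, assuming the public
# Stable Diffusion x4 upscaler. File names and the prompt are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

# The manually edited low-resolution panel goes in...
low_res = Image.open("kira_edited_lowres.png").convert("RGB")

# ...and a short prompt steers the upscaler while it invents the missing
# detail. The artifacts it introduces are then corrected by hand, as
# described in the caption above.
upscaled = pipe(
    prompt="woman with a red shoulder patch aboard a spaceship, film grain",
    image=low_res,
).images[0]
upscaled.save("kira_upscaled_4x.png")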

The Final Result

After all the images have been developed into the Final Art, they can be inserted into the comic, replacing the imagery from the Mockup phase. This part of the process allows for even more adjustments in order to achieve the best final result. Plenty of those changes can be clearly seen when comparing Image 4 and Image 6.

Image 6 – The finished panel of a comic.

As the final step, all the completed illustrations are brought together, replacing the previously shown AI-Generated images inside the mockup comic.

Here are the key differences between this final product and the Mockup:

  • The exterior image has been moved to a separate page. Instead, an interior image of Nadia’s apartment was added to provide an even better sense of place.
  • The characters also got some more room to express themselves.
  • Text was added inside the speech bubbles.

Conclusion

Generative AI is a powerful tool that can rival human craft, but its output has little to no perceived value if no human labor is added to it. The human-centric approach described here provides both the comics and the articles of the Universe of ELZA with an authentic sincerity that no machine is capable of achieving. Because of that, even despite the prominent use of AI tools, the creations of ENTIA Comics maintain a high perceived value.

It is important to note, though, that while most actors aim to optimize away human labor by replacing it with AI-powered tools, ENTIA Comics has a completely different vision. As of now, the use of Generative AI is vital to making the Universe of ELZA possible, as it is created by a single human with limited artistic skills. As more demanding stories are written and more illustrations are assembled, those skills develop along with the project. This means that as the Universe of ELZA grows, it will feature ever more of the human touch and less AI involvement in its creation. This is the vision. This is the promise.