AI Boom's Essential Picks: Scenario
Generative AI for Gaming Assets
Scenario.com is a Generative AI Engine that allows users to create custom image generators. Using cutting-edge AI technology, you can train your own AI model to create images in a specific style or genre, ensuring unmatched style consistency. It's a really exciting platform for anyone looking to use AI to create art assets!
Scenario's goal is to provide a blend of customization and style consistency. Models on Scenario are fine-tuned on top of Stable Diffusion. Users define style and object parameters so that the generated images stay consistent and match their intent.
Understanding Scenario
Scenario offers control over specific concepts, thanks to its fine-tuning capabilities on a Stable Diffusion foundation. Beyond a web application, Scenario extends its reach to:
- iOS & Android apps
- a Discord bot
- a Unity plugin
- an expansive API for bespoke creative projects.
Guide: Training your Custom Model
Image Dimensions for Training
- If training on Stable Diffusion 1.5, all images should be square, at 512x512. Larger images will be cropped to 512x512, so it's best to prepare your images to focus on the subject or the parts of the scene that are of interest.
- If training on SDXL by Scenario, all images should be 1024x1024.
Images outside these dimensions need adjustments. For more information see: Formatting Images for use on Scenario.
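As a rough illustration of preparing images ahead of time, the sketch below computes a square crop box and checks whether an image already matches the target size for each base model. It assumes a centered crop, which may not match how Scenario crops internally. The function names and the keys `sd15` and `sdxl` are hypothetical labels chosen for this example.

```python
# Target side length per base model, taken from the guidance above.
# The keys "sd15" and "sdxl" are illustrative labels, not Scenario identifiers.
TARGET_SIDE = {"sd15": 512, "sdxl": 1024}


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def needs_adjustment(width: int, height: int, model: str) -> bool:
    """True if the image is not already a square of the target size."""
    side = TARGET_SIDE[model]
    return (width, height) != (side, side)


box = center_square_box(1920, 1080)
print(box)  # (420, 0, 1500, 1080)
print(needs_adjustment(1024, 1024, "sdxl"))  # False
```

The returned box uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it can be passed straight to an image library before resizing to the target side length. Moving the box off center is often worthwhile when the subject is not in the middle of the frame.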
Improving Model Performance
Build your dataset from 5 to 50 unique, diverse images. The quality and diversity of the images are pivotal.
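A quick pre-upload check along these lines might look like the sketch below. It only catches byte-identical duplicates and counts the unique images against the 5 to 50 range. It cannot judge visual diversity or quality, which still need a human eye. The `check_dataset` helper and the sample file names are hypothetical.

```python
import hashlib

MIN_IMAGES, MAX_IMAGES = 5, 50  # range suggested above


def check_dataset(images: dict[str, bytes]) -> list[str]:
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    seen: dict[str, str] = {}  # content hash -> first file name with that content
    for name, data in images.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            problems.append(f"{name} duplicates {seen[digest]}")
        else:
            seen[digest] = name
    unique = len(seen)
    if not MIN_IMAGES <= unique <= MAX_IMAGES:
        problems.append(f"{unique} unique images, expected {MIN_IMAGES} to {MAX_IMAGES}")
    return problems


# Stand-in byte strings; in practice these would be the raw file contents.
sample = {"a.png": b"one", "b.png": b"two", "c.png": b"one"}
for problem in check_dataset(sample):
    print(problem)
```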
Guide: How to caption a dataset for training a LoRA model on SDXL
Image captioning within generative AI is a crucial process. These captions teach the AI model to understand and interpret various elements within a picture. But initially, datasets are usually a mixed bag of information, and the goal of this process is to separate the parts of the dataset that are not crucial to training from those that you want to prompt.
The Basics of Captioning for LoRA on SDXL
Understand the Caption Style
Think of captions as three separate elements that you may wish to utilize: trigger, class, and descriptors.
Trigger
The trigger is a specific keyword to activate the subject or style in model generations. The trigger should be a unique token that is not already recognized by the foundation model. Test how unique a trigger is by prompting your unique word using SDXL under "Foundational" in the model library, and ensuring nothing discernible is generated.
A common way to create a unique trigger is to choose a memorable word and remove all the vowels in that word. For example, "chicken" might become the unique trigger word "chkn".
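The vowel-removal trick is simple enough to sketch in a few lines. Note that strictly dropping vowels from "chicken" gives "chckn", slightly longer than the "chkn" above, which also trims the repeated "c". Either form can work, as long as the result generates nothing discernible when tested against the foundation model. The `make_trigger` name is hypothetical.

```python
VOWELS = set("aeiouAEIOU")


def make_trigger(word: str) -> str:
    """Drop every vowel from the word to form a candidate trigger token."""
    return "".join(ch for ch in word if ch not in VOWELS)


print(make_trigger("chicken"))  # chckn
```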
Class
Class describes any important subjects in an image, such as a man, woman, sword, or otherwise.
Descriptors
Descriptors include unique details like actions, colors, or emotions that are not present in all images. This can also include styles of art.
Make a note of any words used in the captions, including trigger words, so you can reuse them in prompts later. Saving these words in the tags section of the model's page can be helpful.
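One way to keep the three elements straight is to assemble each caption from them explicitly. The sketch below assumes a "trigger class, descriptor, descriptor" layout, which is just one plausible convention and not a format Scenario prescribes. The `build_caption` helper and the sample values are hypothetical.

```python
def build_caption(class_name: str, descriptors: list[str], trigger: str = "") -> str:
    """Join an optional trigger, the class, and descriptors into one caption."""
    # The trigger, when present, sits directly in front of the class word.
    subject = f"{trigger} {class_name}".strip()
    parts = [subject] + [d.strip() for d in descriptors if d.strip()]
    return ", ".join(parts)


caption = build_caption(
    "woman", ["iridescent swimsuit", "golden earrings"], trigger="chkn"
)
print(caption)  # chkn woman, iridescent swimsuit, golden earrings
```

Keeping the inputs as separate values also makes it easy to save the trigger and descriptor words as tags afterwards.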
Creating Effective Captions
Effective captions vary depending on training goals. Considering the image below, here are some methods for approaching captioning based on the goal of your training.
If the goal is to train the style of the image, a user might describe various aspects of the portrait, including the woman's captivating presence, the shimmering outfit, and the iridescent colors of the material wrapping around her curves. In this type of caption, a unique trigger word is not indicated. A caption might read:
The image features a woman with a captivating presence. She is posed against a dark background that emphasizes her shimmering, metallic-golden outfit.
Her expression is intense and alluring, with piercing eyes that seem to gaze directly at the viewer. The lighting plays a crucial role in the composition, with a strategic use of shadows to add depth and highlight her facial structure, as well as the contours of her body and outfit. The contrast between the golden tones of her outfit and the dark background creates a luxurious and exotic atmosphere.
If the goal is to train the woman as the subject of a LoRA model, a user typically needs fewer descriptive words. In this case, a trigger word is optional and can be used to indicate a particular kind of woman. It is less important to describe every detail in the scene and more important to describe details as they relate to the woman. It is also useful to include any words you may want to prompt with again, such as colors, since the model may associate them more strongly with the subject after training. An example of this type of caption could be:
a woman with a captivating presence, her makeup is dramatic, with bold, dark eyebrows, smoky eyeshadow, and a natural, glossy lip color. Her cheeks are adorned with a striking highlighter that matches the iridescence of her outfit, lending an ethereal glow to her visage. She wears large, circular earrings and a choker, both of a golden hue that complements her ensemble. Her outfit is a body-hugging, one-piece swimwear with a high-cut leg design, showcasing a spectrum of iridescent colors from cobalt blue to emerald green and vibrant purple, reminiscent of an oil slick. The material has a fine, almost liquid texture that wraps around her form, catching the light and accentuating her curves.
Captioning can be leveraged in nuanced and creative ways. The Co-founder & CEO at Scenario shared a GPT called Caption Crafter, which aims to help write concise and accurate captions for images of characters: https://chat.openai.com/g/g-GVg2vl8yi-caption-crafter
Consistency is Key
Ensure consistency in the captioning approach across the dataset. This helps the AI model learn and apply concepts more effectively.
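A small consistency check can catch captions that drifted from the chosen approach. The sketch below flags captions that omit the trigger word, assuming captions are held as plain strings keyed by file name. The `captions_missing_trigger` helper and the sample captions are hypothetical, and the check covers only this one aspect of consistency.

```python
def captions_missing_trigger(captions: dict[str, str], trigger: str) -> list[str]:
    """Return the names of captions that lack the trigger as a whole word."""
    missing = []
    for name, text in captions.items():
        # Strip common punctuation so "chkn," still counts as the trigger.
        words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
        if trigger.lower() not in words:
            missing.append(name)
    return missing


captions = {
    "01.txt": "chkn woman, golden earrings",
    "02.txt": "a woman in an iridescent swimsuit",
}
print(captions_missing_trigger(captions, "chkn"))  # ['02.txt']
```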
Work with the Foundational Model
Leverage captioning by finding where the foundational model (SDXL) already associates concepts, styles, and aesthetics. Use those in your captions and prompts. SDXL is strong, and it can be easier to work with it than to try to fight against its training.
Roadmap
3D Asset Creation & Image to 3D
At present, Scenario doesn't support the direct creation of 3D assets. However, they are closely monitoring advancements in the AI domain and are eager to incorporate 3D functionalities once a robust pipeline emerges.
Textures
Generate rich, seamless textures with intricate details.
Skyboxes
Set immersive scenes with high-quality backdrops.
Image to Video
Animate still images into captivating video content.
Downloading CKPT Files
Currently, CKPT files aren't available for download, but this feature is on the horizon.
Affiliate Disclaimer:
Some of the links here are affiliate links, meaning AI Boom may earn a commission at no extra cost to you. This support helps us keep up with the latest in AI innovations.