Popular AI image generation service Midjourney has introduced one of its most frequently requested features: the ability to consistently recreate characters across new images.
This has, by its very nature, been a major hurdle for AI image generators to date.
That's because most AI image generators rely on "diffusion models," tools similar to or based on Stability AI's open source Stable Diffusion image generation algorithm. Broadly speaking, these work by taking text typed in by a user and attempting to piece together an image, pixel by pixel, that matches the description, based on the similar images and text tags the model learned from a massive (and controversial) training dataset of millions of human-generated images.
Why consistent characters are so powerful and elusive for AI-generated images
However, as with text-based large language models (LLMs) such as OpenAI's ChatGPT or Cohere's new Command-R, the problem with all generative AI applications is the inconsistency of their responses: the AI generates something new every time a prompt is typed in, even if the prompt is repeated or some of the same keywords are used.
This is great for generating entirely new content (images, in Midjourney's case). But what if you're creating a storyboard for a film, a novel, a graphic novel, a comic book, or some other visual medium, where you want the same character or characters to move through it, appearing in different scenes and settings with different facial expressions and props?
This exact scenario, typically required for narrative continuity, has so far been extremely difficult to achieve with generative AI. But Midjourney is now taking a crack at it, introducing a new tag, "--cref" (short for "character reference"), that users can add to the end of their text prompts in the Midjourney Discord. Midjourney will attempt to match the character's facial features, body type, and even clothing from a URL the user pastes after the tag.
As the feature evolves and is refined, it could take Midjourney from being a cool toy or source of ideas to more of a professional tool.
How to use the new Midjourney Consistent Character feature
The tag works best with previously generated Midjourney images. So, for example, a user's workflow would begin by first generating or retrieving the URL of a previously generated character.
Suppose you start from scratch and generate a new character with the prompt "a muscular bald man with beads and an eyepatch."
![](https://venturebeat.com/wp-content/uploads/2024/03/cfr0z3n_a_muscular_bald_man_with_a_a_bead_and_eye_patch_555ae74e-d7ac-4011-b566-e5055af61ffe.png?resize=1456%2C816&strip=all)
Upscale the image you like most, then control-click it in the Midjourney Discord server to find the "Copy Link" option.
![](https://venturebeat.com/wp-content/uploads/2024/03/Screenshot-2024-03-11-at-9.21.06%E2%80%AFPM.png?resize=1418%2C1218&strip=all)
Then type a new prompt, "wearing a white tuxedo standing in a villa --cref (URL)," pasting in the URL of the image you generated, and Midjourney will attempt to generate the same character as before in the new setting you typed in.
![](https://venturebeat.com/wp-content/uploads/2024/03/Screenshot-2024-03-11-at-9.27.58%E2%80%AFPM.png?resize=1342%2C734&strip=all)
As you can see, the results are far from an exact match for the original character (or the original prompt), but they're definitely encouraging.
In addition, the user can append the tag "--cw" (for "character weight") after the "--cref (URL)" string, followed by a number from 0 to 100, e.g. "--cref (URL) --cw 100." The lower the "cw" number, the more variance the resulting image will have; the higher the number, the more closely the new image will follow the original reference.
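Putting the pieces together, a full prompt using both tags might look like the following sketch (the URL here is a placeholder for a link copied from a previously generated image, and the weight value is purely illustrative):

```
wearing a white tuxedo standing in a villa --cref https://example.com/character.png --cw 50
```

A weight near 100 asks Midjourney to preserve the face, hair, and clothing of the reference; a weight near 0 keeps only the face.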
As you can see in this example, entering a very low "--cw 8" actually returned what we wanted: the white tuxedo. However, the character's signature eyepatch was removed in the process.
![](https://venturebeat.com/wp-content/uploads/2024/03/Screenshot-2024-03-11-at-9.32.34%E2%80%AFPM.png?resize=1470%2C950&strip=all)
Well, there's nothing a little "Vary (Region)" editing can't fix.
![](https://venturebeat.com/wp-content/uploads/2024/03/cfr0z3n_no_glasses_black_eyepatch_ae55f47b-44dd-40d4-ab40-24669cb821b6.png?resize=1456%2C816&strip=all)
OK, the eyepatch is on the wrong eye…but we're getting there!
You can even combine multiple characters into one by using two "--cref" tags side by side, each with its own URL.
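As a sketch, a blended prompt along those lines might look like the following (both URLs are placeholders for links to previously generated character images):

```
two men playing chess in a villa --cref https://example.com/character1.png --cref https://example.com/character2.png
```

Midjourney's announcement also describes passing multiple URLs after a single tag, as in "--cref URL1 URL2," similar to multi-image or style prompts.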
The feature rolled out just this evening, but artists and creators are already testing it. Give it a try yourself if you have Midjourney, and read founder David Holz's full note on it below.
@everyone, today we're testing a new "Character Reference" feature. This is similar to the "Style Reference" feature, except instead of matching a reference style, it tries to make the character match a "Character Reference" image.
How to use
- Type "--cref URL" after your prompt, with a URL to an image of the character
- You can use "--cw" to modify the reference strength from 100 to 0
- Strength 100 ("--cw 100") is the default and uses the face, hair, and clothing
- At strength 0 ("--cw 0"), it will focus just on the face (good for changing outfits, hair, etc.)
What it means
- This feature works best when using characters made from Midjourney images. It's not designed for real people or photos (which, like regular image prompts, may end up distorted)
- Cref works like a regular image prompt, except it "focuses" on the character's traits
- The precision of this technique is limited; it won't exactly copy dimples, freckles, T-shirt logos, etc.
- Cref works with both Niji and normal MJ models, and can also be combined with "--sref"

Advanced features
- You can use more than one URL to blend the information and characters from multiple images, like this: "--cref URL1 URL2" (this is similar to multi-image or style prompts)
How does it work on the web alpha?
- Drag or paste an image into the imagine bar, and three icons will now appear. Selecting one of these sets whether the image is used as an image prompt, a style reference, or a character reference. To use an image for multiple categories, hold down Shift while selecting options
Please note that this and other features may change abruptly while MJ V6 is in alpha, but an official V6 beta is coming soon. We'd love to hear everyone's thoughts on ideas and features, and we hope you enjoy this early release and find it useful for building your stories and worlds.
VentureBeat's mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.