Meta joins the generative AI model race, introduces CM3leon for text and images
Aiming for top-tier performance in generative AI, Meta recently announced CM3leon, a cutting-edge AI model that focuses on text-to-image generation.


Highlights
- Meta has built a new AI model to create more coherent imagery than existing image models
- CM3leon is the first multimodal model trained with a recipe adapted from text-only language models
Meta (previously Facebook) has introduced a generative artificial intelligence (AI) model called "CM3leon" (pronounced like chameleon) that performs both text-to-image and image-to-text generation.
"CM3leon is the first multimodal model trained with a recipe adapted from text-only language models, including a large-scale retrieval-augmented pre-training stage and a second multitask supervised fine-tuning (SFT) stage," Meta wrote on Friday in a blog post.
What’s unique in Meta's newly launched model?
The company claimed that, with CM3leon's capabilities, image generation tools can produce more coherent imagery that more closely follows the input prompts.
Meta claimed that CM3leon requires five times less computing power and a smaller training dataset than earlier transformer-based methods.
According to a recent report, CM3leon outperformed Google's Parti, a text-to-image model, and set a new state of the art for text-to-image generation by achieving an FID (Fréchet Inception Distance) score of 4.88 on the most widely used image generation benchmark (zero-shot MS-COCO).
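For readers unfamiliar with the metric, FID compares the statistics of generated images with those of real images, and a lower score indicates a closer match. The formula below is the standard definition of the metric, not something taken from Meta's post. The symbols are the conventional ones and are introduced here for illustration: \mu_r and \Sigma_r are the mean and covariance of Inception-v3 features extracted from real images, and \mu_g and \Sigma_g are the same statistics for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A score of 0 would mean the two feature distributions have identical means and covariances.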
The tech behemoth added that CM3leon excels at a variety of vision-language tasks, including long-form captioning and visual question answering. Despite being trained on a dataset of only three billion text tokens, CM3leon delivers zero-shot performance superior to that of larger models trained on larger datasets. However, the use of a dataset containing millions of licensed images has reportedly exposed Meta to legal challenges and criticism over information misuse.
Meta bets big on CM3leon
While traditional image generators often struggle with complex objects and understanding prompts, some of the images generated by CM3leon demonstrate its ability to handle intricate designs.
“With the goal of creating high-quality generative models, we believe CM3leon’s strong performance across a variety of tasks is a step toward higher-fidelity image generation and understanding,” Meta said.
“Models like CM3leon could ultimately help boost creativity and better applications in the metaverse. We look forward to exploring the boundaries of multimodal language models and releasing more models in the future,” the tech giant further added.