
Meta joins the generative AI race, introduces CM3leon for text and images

Aiming for top-tier performance in generative AI, Meta recently announced CM3leon, a new AI model that focuses on text-to-image generation.

New Delhi, UPDATED: Jul 17, 2023 17:13 IST

Highlights

  • Meta has built a new AI model to create more coherent imagery than existing image models
  • CM3leon is the first multimodal model trained with a recipe adapted from text-only language models

Meta (previously Facebook) has introduced a generative artificial intelligence (AI) model called "CM3leon" (pronounced like chameleon) that performs both text-to-image and image-to-text generation.

"CM3leon is the first multimodal model trained with a recipe adapted from text-only language models, including a large-scale retrieval-augmented pre-training stage and a second multitask supervised fine-tuning (SFT) stage," Meta wrote on Friday in a blog post. 

What’s unique about Meta's newly launched model?

The company claimed that CM3leon lets image generation tools produce more coherent imagery that follows input prompts more closely.

Meta claimed that CM3leon was trained with five times less computing power and a smaller training dataset than earlier transformer-based methods.

CM3leon is reported to have outperformed Google's text-to-image model Parti and set a new state of the art for text-to-image generation, achieving an FID (Fréchet Inception Distance, where lower is better) score of 4.88 on the most widely used image generation benchmark (zero-shot MS-COCO).

The tech giant added that CM3leon excels at a variety of vision-language tasks, including long-form captioning and visual question answering. Despite being trained on a dataset of only three billion text tokens, CM3leon's zero-shot performance is superior to that of larger models trained on larger datasets. However, the use of a dataset that includes millions of licensed images has drawn legal challenges and led the company to face criticism over the misuse of information.

Meta bets big on CM3leon 

While traditional image generators often struggle with complex objects and understanding prompts, some of the images generated by CM3leon demonstrate its ability to handle intricate designs.

“With the goal of creating high-quality generative models, we believe CM3leon’s strong performance across a variety of tasks is a step toward higher-fidelity image generation and understanding,” Meta said.

“Models like CM3leon could ultimately help boost creativity and better applications in the metaverse. We look forward to exploring the boundaries of multimodal language models and releasing more models in the future,” the tech giant added.

 

Published on: Jul 17, 2023 13:45 IST
Posted by: Nidhi Bhardwaj