OpenAI’s ChatGPT Images 2.0 brings multilingual text and reasoning capabilities to AI image generation

April 22, 2026

OpenAI has unveiled ChatGPT Images 2.0, a major upgrade to its AI image generation capabilities that can produce complex visuals with readable text in multiple languages. The new model represents a significant leap from the company’s previous model, GPT-Image-1.5, released in December 2025.

After weeks of testing on LM Arena AI under the codename “duct tape,” the model has impressed early users with its ability to generate long text blocks, realistic user interfaces, screenshots from popular websites, and even conduct web research to incorporate current information into images. The update is now rolling out to all ChatGPT users, with advanced features reserved for paid subscribers.

Reasoning-powered image generation

The most significant advancement in Images 2.0 is the integration of OpenAI’s “O-series” reasoning capabilities. Unlike previous models that functioned as black boxes, the new system takes an “agentic” approach. When users select a “Thinking” model within ChatGPT, the system researches, plans, and reasons through image structure before rendering begins.

During testing, the model demonstrated this capability by analyzing complex PowerPoint files and creating professional posters that preserved specific stylistic elements while synthesizing core data. The system can also search the web in real time to ensure visual accuracy for current events.

The underlying architecture has been “revamped from scratch,” according to Research Lead Boyuan Chen, who describes it as a “generalist model” or “GPT for images” capable of handling 3D perspective shifts and complex spatial reasoning through simple text prompts.

Multilingual support and text precision

One of the most persistent issues with AI-generated imagery has been illegible text. Images 2.0 addresses this with what OpenAI calls a “step change” in typography capabilities. The model can now produce readable text in dense compositions like scientific diagrams, menus, and infographic posters.

The model also tackles Western bias in AI imagery by supporting high-fidelity text generation in multiple languages:

Japanese
Korean
Chinese
Hindi
Bengali

Text is not merely translated; it is rendered in language that reads naturally, so labels and explanations feel natively integrated into designs. This makes the model particularly valuable for global brands and educational content creators.

Multi-image generation with consistency

For creators working on storyboards or brand campaigns, Images 2.0 can generate up to eight distinct images from a single prompt while maintaining character and object continuity across the series. This solves what Product Lead Adele Li called a “cumbersome” workflow where users previously had to create images one at a time.

This feature enables the creation of entire manga sequences, children’s books, or families of social media graphics that share consistent visual elements. The capability extends to creating three-page educational visuals complete with quizzes that maintain instructional flow.

Pricing and availability tiers

OpenAI’s rollout strategy targets both casual users and enterprise adoption with multiple access levels:

Free Users: Access to base ImageGen 2.0 model for standard tasks
Plus and Pro Users: “Thinking” capabilities including tool use, web search, and multi-image generation
Pro Users: Additional access to “ImageGen Pro” models for advanced generation
API Developers: Integration with gpt-image-2 supporting up to 4K resolution and flexible aspect ratios

API pricing remains competitive with the predecessor model, actually reducing output costs by $2. The base model improvements include better instruction following, stronger text rendering, multilingual gains, and broader aspect ratios for all users.

Safety and authenticity measures

Given growing concerns about AI-generated content in political campaigns and social media manipulation, OpenAI emphasizes its safety protocols. The company maintains strict policies against election interference and implements multiple safeguards:

Industry-standard watermarking for provenance tracking
Advanced perception models filtering harmful content
Real-time monitoring and policy enforcement

Li addressed concerns about deceptive campaigning directly, stating that while other platforms may lack such safeguards, “ChatGPT does, and we take monitoring and protection of our users, as well as the influence that our photos as they are created, incredibly seriously.”

Impact for enterprise users

The shift from Images 1.5 to 2.0 represents more than a technical upgrade. By integrating reasoning capabilities, OpenAI is addressing the “intent gap” that has limited AI art applications. When users request an infographic about supply and demand, they need logical information layout, not just attractive visuals.

This systematic approach means the model can create cohesive design packages: floor plans with matching color palettes, material lists, and inspiration shots that adhere to a single aesthetic vision. The trade-off is speed, with reasoning-enabled generation taking longer than simple image creation.

For professional users, the slower generation appears to be a worthwhile trade. Waiting an extra minute for production-ready assets remains significantly faster than hours of manual design work, positioning Images 2.0 as a tool for “economically valuable creative tasks” rather than just artistic experimentation.
