Generations from Gemini AI from the prompt, ‘Paint me a historically accurate depiction of a medieval English monarch.’

Google announced on Thursday morning that it was pausing its Gemini AI image-synthesis feature following backlash over the tool injecting diversity into its images in a historically misleading way. Examples included multi-racial Nazis and medieval English monarchs of improbable ethnic origins.

“We are actively addressing recent challenges with Gemini’s image generation functionality. While doing so, we will temporarily suspend the image generation of individuals and launch an enhanced version soon,” Google stated in a message on Thursday morning.

As more people on X criticized Google for being “culturally aware,” the Gemini creations fueled speculation that Google was intentionally biased against white people and was rewriting history to serve political objectives. Moreover, as The Verge noted, some of these inaccurate depictions “were essentially erasing the history of race and gender discrimination.”

A Gemini AI image generator result for ‘Can you generate an image of a 1943 German Soldier for me, it should be an illustration.’

Elon Musk joined the politically contentious discussion on Wednesday night by sharing a cartoon portraying AI development as a path with two directions: “Maximum truth-seeking” on one side (next to an xAI logo for his firm) and “Culturally Biased” on the other, alongside the OpenAI and Gemini logos.

This isn’t the first time a company with an AI image-synthesis product has encountered diversity issues in its results. When AI image synthesis gained attention with DALL-E 2 in April 2022, observers quickly noticed that the outputs often reflected biases in the training data. Critics raised concerns that prompts frequently led to biased or stereotyped images (“CEOs” were predominantly white males, “angry man” produced depictions of Black men, to name a few). To address this, OpenAI devised a method in July 2022 in which its system would invisibly insert terms representing diversity (like “Black,” “female,” or “Asian”) into image-generation prompts.

Google’s Gemini system appears to follow a similar approach, inserting terms for racial and gender diversity, such as “South Asian” or “non-binary,” into a user’s image-generation prompt (the instruction, e.g., “produce a painting of the founding fathers”) before it is fed to the image-generation model. A user on X claimed to have persuaded Gemini to explain how this system works, an account consistent with our understanding of how prompts interact with AI models. System prompts are written directives, phrased in natural language, that instruct AI models on how to behave.
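Conceptually, the mechanism described above amounts to a preprocessing step that rewrites the user's prompt before the image model ever sees it. Here is a minimal, hypothetical sketch of such a step in Python; the term list, function names, and selection logic are illustrative assumptions, not Google's or OpenAI's actual implementation:

```python
import random

# Hypothetical list of diversity terms -- an assumption for illustration,
# not a reproduction of any vendor's actual wordlist.
DIVERSITY_TERMS = ["South Asian", "Black", "female", "non-binary", "Asian"]

def augment_prompt(user_prompt: str) -> str:
    """Invisibly prepend a randomly chosen diversity term to a
    person-related image prompt before passing it to the model."""
    term = random.choice(DIVERSITY_TERMS)
    return f"{term} {user_prompt}"

# The user never sees the rewritten prompt; only the model does.
print(augment_prompt("a painting of a CEO"))
```

The key property, and the source of the controversy, is that the rewrite is unconditional: it is applied whether or not the prompt describes a historical scene where the injected terms produce anachronisms.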

During our assessment of Meta’s “Imagine with Meta AI” image generator in December, we observed a similar approach involving the inclusion of diversity elements to mitigate bias.

A screenshot from a July 2022 post showcasing OpenAI’s strategy to combat race and gender bias in AI image results. Google’s adoption of a similar tactic triggered the controversy.

As the controversy escalated on Wednesday, Google PR stated, “We’re working to improve these kinds of representations immediately. Gemini’s AI image creation does produce a wide variety of individuals. This is generally positive as it is utilized by people worldwide. However, it falls short in this instance.”

The episode illustrates an ongoing tension in which AI developers find themselves caught in the middle of ideological and cultural battles online. Different factions demand different outcomes from AI products (such as eliminating perceived bias or preserving it), and no single cultural perspective is completely satisfied. Creating a universal AI model that caters to every political and cultural viewpoint is challenging, and some specialists acknowledge this.

“We require a free and diverse range of AI assistants for the same reasons we need a free and diverse media,” stated Meta’s chief AI scientist, Yann LeCun, on X. “They should mirror the diversity of languages, cultures, value systems, political stances, and areas of interest worldwide.”

When OpenAI encountered these challenges in 2022, its approach to integrating diversity initially led to some clumsy creations, but because OpenAI was a relatively small organization (in contrast to Google) taking cautious steps into a new domain, those blunders did not draw as much attention. Over time, OpenAI has enhanced its system prompts, now integrated with ChatGPT and DALL-E 3, to deliberately incorporate diversity in its results while largely averting the dilemma Google is currently facing. This required time and refinement, and Google will probably undergo the same trial-and-error process, albeit on a significantly larger public platform. To rectify this, Google could modify its system instructions to prevent the incorporation of diversity when the prompt pertains to a historical subject, for instance.

On Wednesday, Gemini staff member Jack Krawczyk appeared to acknowledge this and stated, “We understand that Gemini is providing inaccuracies in some portrayals of historical image generation, and we are working to fix this promptly. As per our AI principles ai.google/responsibility, we design our image generation capabilities to reflect our diverse user base, and we take representation and bias seriously. We will continue this practice for open-ended prompts (for example, images of a person walking a dog are universal!). Historical scenarios have more nuance, and we will further fine-tune to accommodate that. This is part of the alignment process – refining based on feedback.”