Non-hyphenated Department

Hyphenated Title

Pica, our resident Artist in Exile and Director of Photography, tries out hyphenated prompts in Stable Diffusion. Clearly, the people used to generate training data for images used in Stability AI's diffusion model were as confused about hyphenated identities as everyone else.

Pica Somatich

10th November, 2024

Utterly Random

Art

An AI robot drawing a portrait

In a world where hyphens are the delicate bridges between identities, one might ponder: how do AI image generators parse hyphenated subjects, such as Indian-American? And what happens when a Labrador Retriever enters the mix? Here are some random test results from an experiment with online and offline AI diffusion models. The title image above is a composite of hand-drawn artwork and AI art created using SD1.5 and Dall-E 2.

*1. 'An Irish-American'*

*2. An Irish-American?*

Dall-E 3 created Image 1 using the prompt “An Irish-American”. It created four images, all of heavily freckled, ginger-haired men holding a glass of beer. Image 2 was close to what I had in mind. It was created locally using SDXL Base, the SDXL Refiner, a custom illustration LoRA, a bloated prompt, and a few rounds of in-painting. I haven’t drawn anything into the image.

*3. 'An Indian-American Woman'*

*4. 'An American-Indian Woman'*

Images 3 and 4 were both created by Dall-E 3 using the prompts shown in the captions. Merely switching the words around the hyphen resulted in a person who looked significantly different. The “Indian-American” looks more “Indian”, while the heavily bejewelled “American-Indian” looks like a Caucasian wearing Indian clothes. Both characters seem to be shopping for vegetables and spices in an exotic market! Image 3 could easily be “an Indian woman”.
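For anyone who wants to repeat the word-swap test on their own prompts, here is a minimal sketch in Python. The helper name `swap_hyphenated` is made up for illustration and is not part of Dall-E, Stable Diffusion, or any other tool mentioned here. It only rewrites the prompt text; feeding the two versions to an image generator is left to you.

```python
import re

# Matches a hyphenated pair of plain words, e.g. "Indian-American".
_HYPHEN_PAIR = re.compile(r"\b([A-Za-z]+)-([A-Za-z]+)\b")


def swap_hyphenated(prompt: str) -> str:
    """Swap the two halves of the first hyphenated pair in a prompt.

    Hypothetical helper for illustration only. Prompts without a
    hyphenated pair are returned unchanged.
    """
    return _HYPHEN_PAIR.sub(
        lambda m: f"{m.group(2)}-{m.group(1)}", prompt, count=1
    )


# Print each prompt next to its swapped twin, ready to paste into a generator.
for prompt in ["An Indian-American Woman", "a Labrador Retriever"]:
    print(prompt, "->", swap_hyphenated(prompt))
```

Running both versions of a prompt with otherwise identical settings makes it easier to see how much of the difference comes from word order alone.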

*5. 'An Indian-American Labrador retriever'*

*6. SDXL*

Image 5 was created by Dall-E. It is identical, in content, to images created using the prompt “a Labrador Retriever”. I’m not sure how Dall-E interpreted the Indian-American aspect of the prompt. Image 6 was what I wanted to create. Again, it was made in SDXL with the setup mentioned earlier.
