Appreciating the poetic misunderstandings of AI art

What does an “Art Deco Buddhist temple” look like? The phrase is almost absurd; it’s hard to imagine a Buddhist temple built in the Art Deco style, the Western aesthetic of the early twentieth century known for its streamlined architecture and clean shapes. But that hasn’t deterred the @images_ai Twitter account, which promises “images generated by AI machines.” When another Twitter user submitted this prompt in early August, @images_ai responded with an image resembling an orientalist Disney castle, a mixture of pointed spires and angled red roofs over a patterned gray stone facade. Or maybe it looks like the archetypal Chinese Buddhist temple crossed with a McDonald’s – a fleeting, half-remembered image from a dream, frozen in a JPEG on social media.

Since its launch in late June, @images_ai has gained an audience for its constant stream of surreal, glitchy, sometimes beautiful, sometimes shocking images created with open-source machine-learning tools. The account has produced everything from a version of Salvador Dalí’s painting “The Persistence of Memory” in the neon-pastel style of Lisa Frank to a rendering of “the edge of reality and time,” a frightening whirlwind of floating eyes, hourglasses, and windows looking out onto nowhere. It sometimes publishes dozens of images a day. But the owner of the account is a human being. “People think I’m a bot all the time,” Sam Burton-King, a twenty-year-old student at Northwestern University, recently told me. “They think if they tag me with something, it’ll just be done. Most of the requests I get are crap.”

Burton-King, who uses they/them pronouns, is not a programmer but a musician who came to Northwestern from the U.K. on a scholarship. They started as a math major but, finding the classes too difficult, turned to philosophy and music. This year, they noticed that an account they followed on Twitter, a vaporwave music label called DMT Tapes FL, had posted some AI-generated images – eerie, cyberpunk landscapes in bright blues and pinks. They noticed other such art accounts, not all powered by AI, among them @gameauras, which offers “video game images with elegiac auras” and now has more than sixty thousand followers. Burton-King decided “to try and make my own feed and have an interesting art page that was created by these machines,” they said. Their project is both an artistic archive and a collaboration with the public, an aesthetic, AI-driven game of telephone in which the fun is seeing what the machine gets wrong.

To create art for @images_ai, Burton-King feeds written prompts into what’s known as a generative adversarial network (GAN), a machine-learning system in which two artificial neural networks – computer models that loosely mimic the brain’s information processing – compete with each other to produce a result that best matches the query. Burton-King often runs the jobs in browser tabs while in class; each image takes ten to twenty minutes to generate. The final picture jerkily emerges from a field of pixelated static as the machine works through anywhere from one hundred to two thousand iterations, coming closer and closer to something it identifies as an “Art Deco Buddhist temple.”
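The adversarial setup is easier to see in code. Below is a minimal sketch, assuming PyTorch; the tiny generator and discriminator are hypothetical toy models for illustration, not the networks Burton-King actually uses.

```python
# Minimal GAN sketch (assumes PyTorch). The toy generator and discriminator
# below are hypothetical stand-ins, not the models behind @images_ai.
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 32 * 32 * 3  # tiny 32x32 RGB images, flattened

# Generator: maps random noise to a fake image.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Tanh(),
)

# Discriminator: scores how "real" an image looks.
discriminator = nn.Sequential(
    nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_images = torch.rand(16, image_dim) * 2 - 1  # placeholder "real" batch

for step in range(100):
    # Discriminator step: learn to tell real images apart from generated ones.
    noise = torch.randn(16, latent_dim)
    fake_images = generator(noise).detach()
    d_loss = bce(discriminator(real_images), torch.ones(16, 1)) + \
             bce(discriminator(fake_images), torch.zeros(16, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: learn to fool the discriminator into scoring fakes as real.
    fake_images = generator(torch.randn(16, latent_dim))
    g_loss = bce(discriminator(fake_images), torch.ones(16, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In each round the discriminator learns to separate real images from fakes while the generator learns to fool it; the pictures @images_ai posts come out of much larger versions of this same tug-of-war.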

The neural networks used by Burton-King include one called CLIP, which was trained on a database of four hundred million text-image pairs pulled from sites across the Internet, possibly including social networks such as Pinterest and Reddit. CLIP was released in January by the organization OpenAI, with the aim of determining how well a written description matches a corresponding image. (Comparing the image of a dog with the word “cat,” CLIP would find a weak correlation.) Shortly after its launch, however, a twenty-three-year-old artist named Ryan Murdock realized that the program’s process could be run in reverse: you could type in text, like “cat,” and, starting from pixels, keep “updating the image iteratively,” Murdock told me, until CLIP determines that it looks like a cat. “This way, you can turn CLIP from being a classifier into something that can guide the generation of images,” he said. Murdock pioneered the technique, combining CLIP with a commonly used GAN in a program he called Big Sleep. (@images_ai started out using Big Sleep, then moved on to another system that follows Murdock’s method.)
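As a rough sketch of that reversal – assuming PyTorch and OpenAI’s open-source clip package, with the prompt, learning rate, and step count chosen purely for illustration – you can start from random pixels and nudge them toward whatever CLIP scores as a better match for the text. Big Sleep itself steers a GAN’s latent vector rather than raw pixels, which is what yields coherent imagery.

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # keep everything in fp32 so gradients flow cleanly
for p in model.parameters():
    p.requires_grad_(False)  # freeze CLIP; only the pixels get optimized

# Encode the prompt once; this embedding is the fixed target.
tokens = clip.tokenize(["a photo of a cat"]).to(device)
with torch.no_grad():
    text_features = model.encode_text(tokens)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Start from random pixels and update the image iteratively.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

# CLIP's expected input normalization constants.
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

for step in range(300):
    normalized = (image.clamp(0, 1) - mean) / std
    image_features = model.encode_image(normalized)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    # Maximize cosine similarity between the image and the text embedding.
    loss = -(image_features * text_features).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```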

CLIP is able to deduce when something looks like a cat, but it can also draw deeply problematic conclusions. At one point, Murdock created a system capable of taking an image and automatically producing a text description of it. “I never published it because it produced horribly biased and cruel captions when you put people into it,” he said. It spat out derogatory terms – racist and ableist language. He continued, “It’s no wonder that when you get these really powerful neural networks including, like, everything on Reddit, they’re going to come out with some really disturbing attitudes.” In this sense, he added, AI-generated images are a “reflection of the collective unconscious of the Internet.” CLIP inherited the prejudices of its source material along with its information.

“The art is in telling the good from the bad,” Burton-King said: determining what words to enter, what size to make the images, and when to stop the generative process. But the most compelling aspect of the account might be its ability to fulfill an artistic fever dream, a kind of magic spell: “You can just type something and have it manifest in front of you,” Burton-King said. “I think that’s the main draw for everyone.” It doesn’t even require proficiency in coding; @images_ai posted a tutorial so that anyone can try it using open-source tools online.

What do people want to see? The worst and most common @images_ai requests are Internet memes, according to Burton-King. “People ask for Shrek, or Big Chungus, or Donald Trump all the time in various situations,” they said. One successful prompt was “Elon Musk is in pain,” which resulted in a grotesque collage of grinning faces and Tesla chargers, and drew subsequent requests for scenes of other tortured tech entrepreneurs. Burton-King also receives many K-pop-themed requests (“stan Loona”), which they have refused outright for fear of being inundated with fans sending similar prompts. But they know what makes a good prompt. As we spoke, they scrolled through @images_ai’s Twitter notifications: “Someone asked for ‘a million explosions at a time,’” they said. “I can’t imagine what that would do.” (They decided to find out.) Their favorite prompts are lines from poems or novels: “Evocative words very often make for evocative images,” they continued. Two lines from William Blake’s early-nineteenth-century poem “Jerusalem” – “And did those feet in ancient time / Walk upon England’s mountains green . . .” – gave rise to an idyllic green landscape beset by pairs of giant feet and dark, menacing factories, presumably the “dark Satanic Mills” of the poem.

The machine will “invariably encapsulate the mood of the text, very rarely in a way that you anticipate,” Burton-King said. But the translation is far from literal – like all visual art, it is a matter of interpretation. Looking at the image the Blake lines prompted, a viewer might just as easily recall Shelley’s “two vast and trunkless legs of stone.”

While the Dalí-Lisa Frank mashup that Burton-King created is surprisingly faithful to both artists’ styles, and several Basquiat mashups have worked well, there is no mistaking these works for human creations. “There is an odd random quality to it,” Burton-King said. AI images tend to include what look like tiles, with particular patterns repeating in different places, which Burton-King likens to tapestries. Nor do the images cohere as a whole; their components do not resolve into a unified composition. “There is a kind of inconsistency that is very difficult to emulate,” they added. The pleasure of @images_ai comes from the queasy gap between the human prompt and the automated result, the poetic misunderstandings that highlight the limits of machine intelligence. Call it a pleasant form of the uncanny valley.
