“Flowers can be enjoyed without knowing about the interactions of soil, air, moisture, and seeds of which they are the result. But they cannot be understood without taking just these interactions into account — and theory is a matter of understanding….
“Theory is concerned with discovering the nature of the production of works of art and of their enjoyment in perception. How is it that the everyday making of things grows into that form of making which is genuinely artistic? How is it that our everyday enjoyment of scenes and situations develops into the peculiar satisfaction that attends the experience which is emphatically esthetic? These are the questions theory must answer. The answers cannot be found, unless we are willing to find the germs and roots in matters of experience that we do not currently regard as esthetic. Having discovered these active seeds, we may follow the course of their growth into the highest forms of finished and refined art.”
For two final iris posts this season, I sifted through the 235 photos I’ve posted so far and selected a few dozen that I thought could be most effectively rendered on black backgrounds. The galleries below (and in the next post) demonstrate, I think, how removing background elements can emphasize the shapes, colors, and structures of these flowers. Apart from eliminating the backgrounds by converting them to black, I didn’t make any other color or texture changes to these images from those posted previously.
Lately I’ve been trying to educate myself on some of the artificial intelligence tools that have been emerging across various disciplines, about which you have probably seen breathless-sounding news coverage ranging from descriptions of these tools as world-changing to equally breathless heralding of the end of the human race. Having spent three decades working in information technology, I’m not that surprised by the hyperbole, which reflects two recurring themes embedded in most technological advances: these new things are hyped as miraculous; and the next versions of any of them will fix all the problems everyone sees in the current versions. Neither of these is true, of course, but the framing does grab attention and perhaps helps further public discussion, while the wizardry remains largely behind the curtains.
The term “artificial intelligence” is a broad concept that includes a wide variety of technological implementations, some of which have been available for a while across different types of software tools. Before I retired, for example, one of my last projects was to evaluate a customer support platform that was capable of responding to verbal or written support requests, of learning from its interactions with humans, and of improving its ability to respond to reported problems as it engaged in those interactions. In all likelihood, you’ve experienced something like this, happily or not, when you’ve requested help with a software program or web site by telephone, email, or with a chatbot. Similarly, products like Adobe Lightroom and Photoshop now include capabilities that are supported by artificial intelligence, notably spot removal tools that are more capable of recognizing content and matching patterns; and the ability to select objects, subjects, and backgrounds in an image with greater accuracy than previous iterations.
Implementations like these differ, in significant ways, from the newer, user-facing variations of artificial intelligence, which are already being widely used to generate content. Since the universe of available tools is as large as it is, I settled on two I would spend some time with: ChatGPT, the language model with which you can engage conversationally; and Adobe Firefly, a program that can generate images from text prompts. I’ve been using ChatGPT for research (with wildly erratic and often disturbing results) for a few months and taking notes on the experience; but as my notes have reached about 5000 words, I’ve not yet sorted them out enough to write anything better than stream-of-consciousness observations, so I’m going to sit on those notes a little longer.
Adobe Firefly is available for anyone to use, for free, and you can sign in to use it with Adobe, Apple, Google, or Facebook accounts. Firefly lets you describe, in words, an image you’d like to generate. It supports content types categorized as art, graphic, or photo, so of course “photo” is what interested me the most. Here, for example, is one of the images it generated from my prompt “white iris on black background” in the “photo” style:
Firefly automatically generates the image with the watermark in the lower left corner, to indicate that it is an AI image. Aside from that, though, it’s not quite reminiscent of an actual photograph, especially the iris standards (the uppermost section of the bloom), which seem to lack the fine details you’d find in a photograph. And I could never get Firefly to create a pure black background (there were always some shades of gray behind whatever variation it generated), so I imported it into Lightroom, updated the background, adjusted shadows and added some texture, and ended up with this…
… which is much closer to a photograph in appearance, and eerily resembles one I might have taken. It’s still not quite right — yet it’s difficult to explain in words why it strikes me as “not quite right” — but since it was my first attempt at generating an AI image, I figured I’d eventually learn how to get more “photo-realistic” results.
I decided to try something more complicated, and used the prompt “Mausoleum of a wealthy family at a Victorian Garden cemetery similar to Oakland Cemetery in Atlanta, Georgia, surrounded by hydrangeas” to generate the next four images. I doubt that Firefly recognized “similar to Oakland Cemetery” as relevant to the images it generated; though “Victorian Garden cemetery” is certainly a specific type of cemetery well-represented by images and words in books, articles, and web sources.
Here are its “photographs” of four mausoleums that do not exist:
The first thing I noticed about these images was that they all contained perspective errors: they’re slightly crooked horizontally, or the buildings appear tilted backward — yet this type of perspective error is common in architectural photographs, simply because the person with the camera is much shorter than any building, and it’s very easy to hold the camera off-level and create these distortions (especially with wide-angle lenses). While it’s impossible to speak in terms of “intentionality” with AI images whose training you know nothing about, I thought it was interesting that it included what most photographers would consider mistakes — apparently intentionally!
I took the Firefly images and did what I would do if I had photographed these in real life: I imported them into Lightroom, removed the watermarks and a few spots, made some color and contrast adjustments, then straightened or tilted each image, ending up with these…
… which are certainly now more respectable-looking as photographs. And there are some elements of each image that struck me as especially insightful, given the prompt I used. Aside from the obvious Victorian-style architecture, notice in the first photograph that the tool created a roof with some missing shingles (on the left side), which would reflect such a building’s age and some wear and tear. Further, it included a piece of plywood between the grass and the center sidewalk — something I often do see at Oakland Cemetery, where the old culverts (originally used for drainage and hosing horse doo-doo from the gravesites and pathways) have deteriorated. Both these elements suggest that the tool is capable of great specificity in the images it generates.
Could you tell that these images were not produced with a camera? Or that they were images of structures that don’t exist? At first glance, it might be nearly impossible, and two of the photos (the bottom pair) didn’t seem to reveal any hints of their AI source. A couple of them show problems with the hydrangeas, where those to the left and right side of the frame have no detail. They’re just shapeless blobs whose structure couldn’t be recovered in Lightroom or Photoshop (though they could be replaced with use of a healing tool), but their flawed appearance at the edges might be missed since we tend to focus our eyes toward an image’s center anyway.
There are, however, structural or architectural mistakes in the first two, which (according to a conversation I had with ChatGPT) are common to AI-generated images. Take a look near the mausoleum entrances in this pair, then let your eye follow the columns from the ceiling down. You’ll see that the columns on both the right and left sides start at the correct location, but the columns on the left side end too far forward, toward the middle of the sidewalk, like they might in an M.C. Escher illusion.
Here’s the relevant portion of each photo, zoomed in so you can take a closer look:
Now you should very clearly see the flawed column “design”; the facade of this building, if it could exist, would likely fall down. Once you see the flaws, you can’t unsee them; every time I look at these images now, that’s the first thing I notice. But what’s compelling to me is that more often than not, Firefly generated plausible images of entirely imaginary buildings that were architecturally correct.
While scrounging around the web trying to learn more about AI image generators, I came across the suggestion that a photography prompt could contain information about a camera and lens combination, and the software would generate an image consistent with their characteristics. So, for example, instead of just using “Iris on a black background” as a prompt, I could type “Photograph of an iris on a black background, taken with a Sony A99ii camera and Sony 100mm lens.” While I couldn’t confirm that those additional details made a difference — because every time you change the prompt, Firefly automatically generates wholly new images, making it hard to compare — I did become convinced that starting the prompt with “Photograph of” might matter. Here, for example, are two images generated with the prompt “Photograph of a blue heron at the edge of a pond”…
… where I only removed the Firefly watermark and made a few shadow and contrast adjustments in Lightroom to emphasize the herons. These images are not evidently of lower quality, nor any less like photographs, than any of the thousands of blue heron images you might find on the web. And unlike the AI-imagined mausoleums above, blue herons (just not these blue herons) do exist, despite the fact that I didn’t photograph any.
Outside the realm of graphic arts, photography typically captures an instant in an experience, with the experience implied in the relationship between a shared photograph and its viewers. With an AI-generated image, the photographer’s experience is eliminated: there is no living interaction with the external world, and whatever story a photograph might represent is reduced to phrases typed at a keyboard. What this might mean for the evolution of photography is something I’ll speculate on in the next post in this series, where I’ll also share some additional photos of animals that I didn’t take.
Thanks for reading and taking a look!
My previous iris posts for this season are: