The AI Behind ChatGPT Looks to Visualize the World
Nextgov explored the image-generation capability of OpenAI's DALL-E artificial intelligence program.
In my previous Nextgov column, I reviewed the new ChatGPT artificial intelligence, asking it to perform tasks ranging from programming in C++ to telling me a bedtime story. I even interviewed the AI about why some people are afraid of artificial intelligences and the importance of ethics as the science of AI moves forward.
I found that the ChatGPT AI from OpenAI was extremely adept at fielding just about any kind of question I could throw at it. Even though it’s not connected to the internet or any live data streams—so you can’t ask it about current events after 2021—it generally provided much better and more detailed information than you would ever find in something like a Google search. The AI is currently free to use, so everyone should give it a try. All you need to do is set up an account and get started.
In a future column, I plan to interview several AI scientists about what the introduction of a robust and impressive AI like ChatGPT might mean for government, private industry and even the world. I have already talked with one leader in this field, Navrina Singh, founder and CEO of Credo AI. She spoke with me about the importance of ethical AI in a previous column, and I am sure some of the other scientists working in this field will also have interesting things to say.
However, this column hits in the middle of the holidays, right between Christmas and the New Year, so I wanted to do something a little more lighthearted. I decided to examine the picture-generating capabilities of ChatGPT’s graphical partner AI, called DALL-E. The DALL-E image generator is also free to use. However, users are given a limited number of credits each month, which they can spend to generate images, probably as a defense against the supercomputer that runs it getting slammed with requests for images featuring cheeseburger-eating cats.
So, if you are one of the few folks who are working this week (like me), then my hope is that you will get a little extra enjoyment from this article and the images that DALL-E and I generated.
It’s worth noting that the science of AI-based image generation is not without controversy. Most image-generation AIs like DALL-E have been trained by ingesting some combination of images from classic masters, historical works of art, commercial artwork, photographs and other images found on the internet. The AI then tries to generate original art based on user requests and the artwork it has learned from, but what happens if it draws too heavily from someone else’s work? Does that create a copyright issue? Also, once generated, who owns the created image? The AI, which is presumably owned by a company, generated the image, but the user crafted the description, sometimes in quite a bit of detail, that helped to make the finished product. As you can see, there are still quite a few legal and ethical issues to work out with AI-generated art.
However, let’s put that aside for now, and just have some fun with DALL-E. The thing to know about AI-generated images is that while the AI has potentially millions of data points to draw from, it still needs humans to describe exactly what they want to see. If you ask for something simple, like a lemon tree, then you will probably get it, although it will likely look pretty generic. If instead you imagine a photograph of a wizened old farmer wearing faded denim picking a fat, overripe lemon in a verdant orchard at sunrise, then you had better tell the AI all about your vision if you want it to generate anything close to that. The AI may be good, but it can’t yet read your mind.
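For readers who would rather script their requests than type them into the web page, OpenAI also exposes DALL-E through its API. What follows is only a minimal sketch, assuming the official openai Python package as it shipped in late 2022 and an API key stored in the OPENAI_API_KEY environment variable; it simply contrasts a terse prompt with a detailed one, using the lemon-tree example above.

```python
import os
import openai

# Assumes the 2022-era openai Python package and a key in OPENAI_API_KEY.
openai.api_key = os.environ["OPENAI_API_KEY"]

# A terse prompt tends to produce a fairly generic image.
simple = openai.Image.create(
    prompt="A lemon tree",
    n=1,
    size="1024x1024",
)

# A detailed prompt steers the model much closer to what you imagined.
detailed = openai.Image.create(
    prompt=(
        "A photograph of a wizened old farmer wearing faded denim, picking a "
        "fat, overripe lemon in a verdant orchard at sunrise"
    ),
    n=1,
    size="1024x1024",
)

# Each response includes a temporary URL for the generated image.
print(simple["data"][0]["url"])
print(detailed["data"][0]["url"])
```

The only real difference between the two calls is the prompt, which is the whole point: the more of your vision you put into words, the closer the result gets.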
I started off with something funny just to see what DALL-E could do. I had some of the recent NASA missions on my mind, so I asked for a picture of “A fat and happy-looking calico cat wearing a NASA space suit with a clear helmet sitting on Martian sand. In the background, the night sky is alive with colorful meteor showers.” And this is what I got:
That was pretty good, and quite funny. Next, I tried to tap into the AI’s ability to mimic master painters and artists. I asked the AI to generate “A portrait in the style of Vincent Van Gogh depicting how artificial intelligence and humans will work together to solve the world’s problems in the future.” The following is what it generated, which I didn’t really understand, but the AI might be showing a concept that is over my head.
My wife, who knows a lot more about art than I do, suggested that Van Gogh’s style might not be the right medium to demonstrate AI and human partnerships. She suggested Salvador Dali instead. So I asked the same question but replaced Van Gogh’s name. My new request was for “A portrait in the style of Salvador Dali depicting how artificial intelligence and humans will work together to solve the world’s problems in the future.” The results were much better, I think.
I could totally understand what the AI artist was trying to depict in the Salvador Dali image, with the human and machine minds merging.
Next, I decided to see how realistic an image the AI could generate. Portraits are nice, but I was curious about producing something akin to a photograph. So the first thing I did was ask DALL-E to produce “A photo of a government scientist working in front of a bank of computers on AI and other scientific projects.” This is what it made for me.
I hadn’t really thought about it, but I probably assumed that the AI would create an image of a white, nerdy-looking guy in a lab coat, especially since I didn’t specify what the scientist should look like. I was impressed that it instead chose a Black woman to represent the default scientist. That suggests OpenAI’s efforts to eliminate bias from its models are working. Not that the AI would never generate a nerdy white guy to represent a scientist, but it’s not the only choice, and maybe not the default one.
The photo itself is impressive. However, I do want to critique the facial structure of the woman the AI generated. If you look closely, her eyes are somewhat out of proportion, as are her glasses, which are thicker on one side. DALL-E seems to have trouble with delicate facial features in general, but it really struggles with evenly spaced eyes. I noticed this in just about every image I generated over the past month. If you go back and look at the first image of the NASA cat in this article, even one of its eyes is malformed. It’s somewhat hidden in the darker fur on one side of its face, but look closely and you will see the flaw. I am not sure why that happens, but it might be because the AI does not understand that the eyes of people and animals are, for the most part, symmetrical. It’s a nearly universal flaw in the images it generates and something that OpenAI might want to work on.
Finally, in addition to generating images from scratch, you can also upload an image, like your own photo, to DALL-E and then have the AI manipulate it. There is also an editing function, although it’s currently in beta. That is how I helped craft the final image for this column, which is “A photo of jolly Santa Claus sitting in a flying sleigh being pulled by reindeer. He is reading NextGov magazine as he zooms away into a very starry night sky.”
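The upload-and-edit workflow can also be scripted. The snippet below is a hedged sketch, again assuming the 2022-era openai Python package; my_photo.png and my_mask.png are hypothetical local files standing in for whatever you upload, the image must be a square PNG, and the transparent areas of the mask mark where DALL-E is allowed to repaint.

```python
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Generate a variation of an uploaded square PNG (hypothetical file name).
variation = openai.Image.create_variation(
    image=open("my_photo.png", "rb"),
    n=1,
    size="1024x1024",
)

# Or edit the image: transparent regions of the mask tell DALL-E
# which parts of the picture it may repaint to match the prompt.
edited = openai.Image.create_edit(
    image=open("my_photo.png", "rb"),
    mask=open("my_mask.png", "rb"),
    prompt="Add a very starry night sky behind the subject",
    n=1,
    size="1024x1024",
)

print(variation["data"][0]["url"])
print(edited["data"][0]["url"])
```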
And with that, I am going to zoom away too since this is my final column for 2022. I hope you enjoyed this gallery illustrating my foray into AI-generated art. Give it a try if you think you can do better (which you probably can). And have a safe and happy holiday season.
John Breeden II is an award-winning journalist and reviewer with over 20 years of experience covering technology. He is the CEO of the Tech Writers Bureau, a group that creates technological thought leadership content for organizations of all sizes. Twitter: @LabGuys