Midjourney is one of the leading drivers of the emerging technology of using artificial intelligence (AI) to create visual imagery from text prompts. The San Francisco-based startup recently made news as the engine behind the artwork that won an award in a Colorado state fair competition, and that’s unlikely to be the last complicated issue that AI art will face in the coming years.
Midjourney differentiates itself from others in the space by emphasizing the painterly aesthetics of the images it produces. The platform is not trying to create photorealistic images that can be mistaken for photographs, and CEO David Holz says he is personally very uneasy with the uncanny quality of deepfakes and other work that simulates reality too closely. Instead, Holz says Midjourney is designed to unlock the creativity of ordinary people by giving them tools to make beautiful pictures just by describing them.
But despite the humanist, consumer-oriented focus of the company, there are inevitable questions about implications for commercial art and professional artists. I interviewed Holz for a broader piece on the potential disruptions AI art is likely to cause in the production of imagery for entertainment, videogames and publishing. Here is a longer excerpt from our conversation where Holz provides more depth and context as he addresses those issues and expounds on his vision for the company, the industry and the technology. The interview has been edited for length and clarity.
Rob Salkowitz, Forbes Contributor: What’s your role and title?
David Holz, Midjourney: I’m the founder and CEO. I usually just prefer being called the founder, though, because CEO sounds very businessy, and we’re not very businessy. We’re an applied research lab that makes products.
What is Midjourney’s mission?
We like to say we’re trying to expand the imaginative powers of the human species. The goal is to make humans more imaginative, not make imaginative machines, which I think is an important distinction.
Can you give a brief history of the company to date?
We started working on the imagination part of our company about a year and a half ago. There were some breakthroughs on diffusion models, people understanding CLIP, OpenAI, that sort of thing. Almost everyone involved in this is in San Francisco, and we all realized this is going to get serious, that it’s different from a lot of other stuff.
What does Midjourney see as the benefit of this text-to-image technology for business and society?
I’m definitely more concerned about society than business. We’re a consumer product, but maybe 30%-50% of our users right now are professionals. The majority are not. Artists on the platform tell us it allows them to be more creative and explorative in the beginning, coming up with a lot of ideas in a short amount of time.
Right now, our professional users are using the platform for concepting. The hardest part of [a commercial art project] is often at the beginning, when the stakeholder doesn’t know what they want and has to see some ideas to react to. Midjourney can help people converge on the idea they want much more quickly, because iterating on those concepts is very laborious.
Another advantage for artists is it gives people confidence in areas they’re not confident in. Most if not all artists feel like there’s some part of art they can’t do well. It might be colors, composition, backgrounds. We have a famous character designer using our product and people ask him why would you use an AI since you’re so good already. And he said, “well, I’m only good at the character part. This is helping me with the rest, the world, the background, the color schemes.”
About how many people are using the product?
Millions are using it. Our Discord is over two million. It’s the biggest active Discord server by far now.
Does Midjourney’s license allow for commercial use of imagery generated by the platform?
Yes. But if you’re working for a company with more than a million dollars in annual revenue, we ask that you buy a corporate license.
How was the dataset built?
It’s just a big scrape of the Internet. We use the open data sets that are published and train across those. And I’d say that’s something that 100% of people do. We weren’t picky. The science is really evolving quickly in terms of how much data you really need, versus the quality of the model. It’s going to take a few years to really figure things out, and by that time, you may have models that you train with almost nothing. No one really knows what they can do.
Did you seek consent from living artists or work still under copyright?
No. There isn’t really a way to get a hundred million images and know where they’re coming from. It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry. There’s no way to find a picture on the Internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.
Can artists opt out of being included in your data training model?
We’re looking at that. The challenge now is finding out what the rules are, and how to figure out if a person is really the artist of a particular work or just putting their name on it. We haven’t encountered anyone who wants their name taken out of the data set.
Can artists opt out of being named in prompts?
Not right now. We’re looking at that. Again, we’d have to find a way to authenticate those requests, which can get complicated.
What do you say to commercial artists concerned this will destroy their livelihood? At a certain point, why would an art director hire an illustrator to produce work like concept art, production design, backgrounds – those sorts of things – when they can just enter prompts and get useful output much more quickly and at much lower cost?
It’s a lot of work still. It’s not just like “make me a background.” It might be ten times less work, but it is way more work than a manager is going to do.
I think there’s kind of two ways this could go. One way is to try to provide the same level of content that people consume at a lower price, right? And the other way to go about it is to build wildly better content at the prices that we’re already willing to spend. I find that most people, if they’re already spending money, and you have the choice between wildly better content or cheaper content, actually choose wildly better content. The market has already established a price that people are willing to pay.
I think that some people will try to cut artists out. They will try to make something similar at a lower cost, and I think they will fail in the market. I think the market will go towards higher quality, more creativity, and vastly more sophisticated, diverse and deep content. And the people who are actually able to use the tools to do that, like the artists, are the ones who are going to win.
These technologies actually create a much deeper appreciation and literacy in the visual medium. You might actually have the demand outstrip the ability to produce at that level, and then maybe you’ll actually be raising the salaries of artists. It could be weird, but that’s what’s going to happen. The pace of that demand increase for both quality and diversity will lead to some wonderful and unexpected projects getting made.
A generation of students graduated from art schools, many of them heavily in debt, counting on relatively well-paid jobs in entertainment production, videogame production, commercial art and so on. How does the emergence of AI text-to-image platforms impact their future?
I think some people will try to cut costs, and some people will try to expand ambitions. I think the people who expand ambitions will still be paying all those same salaries, and the people who try to cut costs, I think will fail.
AI is typically used at scale for stuff like call centers or checking bags at airports, the sort of jobs that people don’t really care to do. And the value proposition is that it frees people up to do more rewarding, more interesting kinds of jobs. But art jobs are rewarding and interesting. People work their entire lives and develop their skills to get these kinds of jobs. Why would you point this technology at that level of the economy as a kind of business focus and priority for the stuff that you’re doing?
Personally, I’m not. My stuff is not made for professional artists. If they like to use it, then that is great. My stuff is made for people like this woman in Hong Kong who came to me and said, “The one thing in Hong Kong that your parents never want you to be is an artist, and I’m a banker now. I’m living a good banker life. But with Midjourney now I’m actually starting to get a taste of this experience of being the person I actually wanted to be.” Or a guy at the truck stop who’s making his own baseball cards with wild images, just for fun. It’s made for those people, because, like most people, they don’t ever get to do these things.
It’s important to emphasize that this is not about art. This is about imagination. Imagination is sometimes used for art but it’s often not. Most of the images created on Midjourney aren’t being used professionally. They aren’t even being shared. They’re just being used for these other purposes, these very human needs.
Nevertheless, the output of your product is imagery, which has commercial value in a professional context in addition to all of those other properties. And this is very disruptive to that economy.
I think it’s like we’re making a boat, and somebody can race with the boat, but it doesn’t mean that the boat’s about racing. If you use the boat to race then maybe like, yeah, sure. In that moment it is. But the human side really matters, and I think that we’re not… We want to make pictures look pretty. We don’t see ourselves as trying to create art as part of our thing. We want the world to be more imaginative. We would rather make beautiful things than ugly things.
Do you believe that any government body has jurisdiction or authority to regulate this technology? And if so, do you think they should?
I don’t know. Regulation is interesting. You have to balance the freedom to do something with the freedom to be protected. The technology itself isn’t the problem. It’s like water. Water can be dangerous, you can drown in it. But it’s also essential. We don’t want to ban water just to avoid the dangerous parts.
Well, we do want to be sure our water is clean.
Yes, that’s true.
Source: https://www.forbes.com/sites/robsalkowitz/2022/09/16/midjourney-founder-david-holz-on-the-impact-of-ai-on-art-imagination-and-the-creative-economy/