tl;dr: Generative AI is evolving at an exhilarating pace. The latest algorithm by Nvidia converts text into 3D meshes twice as fast as projects published barely two months ago. Technical capability is already outpacing our ability to work with it.
Last week’s paper by Nvidia scientists demonstrated the exponential speed at which the generative AI space is evolving. This explosion of activity – especially visible over the last 9 months – will have an impact on every part of life, not least on product design, engineering and production. The changes will unshackle the industry from structural constraints in the way ideas are communicated, empower faster innovation cycles and ultimately allow it to deliver its sustainability promises.
Having been told for years that AI would fundamentally revolutionize the way we work, few expected the creative sector to be among its first victims. The advent of GPT-3’s human-like text generation in 2020 brought the possibilities into sharper focus. It’s been a wild ride since then: DALL-E (text-to-image), Whisper (speech recognition), and most recently Stable Diffusion (text-to-image) not only increased the capabilities of speech and visual AI tools, but also reduced the resources required to use them (from 175bn parameters for GPT-3 to 900mn for Stable Diffusion).
Stable Diffusion’s size means it occupies less than 5 GB of disk space and can run on an ordinary laptop. Not only that: unlike OpenAI (which is mainly funded by Microsoft and publishes GPT-3, DALL-E and Whisper), Stable Diffusion is open source, meaning others can build on its learnings much more readily. We are therefore only seeing the beginning of the innovation cycle – there is much more to come, as Nvidia’s paper now shows.
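How low the barrier has become is easiest to see in code. The minimal sketch below assumes the Hugging Face diffusers and torch packages and the publicly released Stable Diffusion v1.5 weights; the prompt and file names are purely illustrative.

```python
# Minimal sketch: generating a concept image locally with Stable Diffusion.
# Assumes the `diffusers` and `torch` packages and the public
# "runwayml/stable-diffusion-v1-5" weights (a download of a few GB).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision keeps memory within laptop-GPU range
)
pipe = pipe.to("cuda")  # requires a GPU; on CPU, drop torch_dtype and expect much slower runs

# The prompt is illustrative; any text description works.
image = pipe("a lightweight aluminium bracket, product design render").images[0]
image.save("bracket_concept.png")
```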
Stable Diffusion’s backers (stability.ai) are further turbocharging this trend by providing technological and financial grants to other teams taking the exploration in new directions. Additionally, a plethora of projects is making the tools available to an ever-broader range of users, among them plugins for Blender, an open-source design tool, and for Adobe’s proprietary Photoshop. Full API access to the tools is being funded with big venture capital dollars, meaning that hundreds of millions of software developers, not just a few hundred thousand data engineers, will now build their own tools on top of these algorithms.
Speech, images and text are among the first verticals to be disrupted by these technologies. But 3D is not far behind. Beyond niche generative art, cartoons are the obvious first point of application. There’s already a Pokémon generator based on Stable Diffusion. Visual Effects and movies are next. But many other sectors are likely to be disrupted – among them interior design with Interiorai.com leading the charge.
In all this excitement, applying the innovations to Design & Engineering feels like an afterthought. Yet it is likely to be the area ultimately most significantly impacted. Of course, there are initial challenges: for one, Stable Diffusion and its compatriots are not yet very precise. That’s not a problem for cartoons, but it is a major challenge for any attempt to transform text into full 3D geometries for industrial use – an area that has seen some nascent interest (a project called Bits101 was launched in Israel in 2015). This may be the holy grail of the industry, but there are many intermediate challenges that may be much easier to solve.

These include improved object recognition (the YOLO algorithm is already being used to great effect), which will lead to better quoting and annotation – improving quality and reducing mistakes. Plugins should also make it easier to use generative AI to develop basic designs (Primitives), which can then be further edited in design tools to tighten tolerances as required – an approach already used in Altair’s Inspire, which does the same with Finite Element Analysis. These Primitives can also serve as a synthetic database of annotated models, of which there is a dearth in the 3D CAD industry. Physna’s CEO and founder points this out in an article detailing the company’s own attempts to use these novel methods to create detailed 3D designs, and highlights a number of pitfalls in using synthetic data to drive these algorithms. Creating 3D designs from 2D drawings is another potential application area, as is intelligent CAM – feeding off a library of tool wear to determine the best machining strategies.
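To make the object-recognition step concrete, here is a minimal sketch using a pretrained YOLOv5 model loaded via PyTorch Hub. The image file name is a placeholder, and a real quoting or annotation pipeline would fine-tune the model on labelled engineering parts rather than relying on the generic pretrained classes.

```python
# Minimal sketch: off-the-shelf object detection with a pretrained YOLOv5 model.
# Assumes `torch` plus the ultralytics/yolov5 hub package; "part_photo.jpg" is a
# placeholder for a shop-floor image.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

results = model("part_photo.jpg")   # run inference on a single image
results.print()                     # summary of detected classes and confidences

# Bounding boxes as a pandas DataFrame: xmin, ymin, xmax, ymax, confidence, class, name
detections = results.pandas().xyxy[0]
print(detections[["name", "confidence"]])
```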
These challenges are important and lucrative to address in their own right. Yet their main impact will be to help evolve the idea-to-design pathway by ultimately reducing the reliance on 3D designs to communicate intent. Designs, whether 2D or 3D, have served as the primary means of translating customers’ needs into final products. That constrains the industry, because these designs act as a black box in which all those valuable customer insights, manufacturing constraints and company objectives are stored, unable to be disentangled, let alone identified. This means that when something changes, it is next to impossible simply to adjust the design. This is why manufacturing innovations such as 3D printing take so long to be adopted and perennially disappoint short-term investors. The components that make up an aircraft are “set” from the moment they are designed, despite a productive life of 20+ years. There is almost no scope for innovation – improvements must await the launch of the next generation.
Being able to change a single constraint and let an algorithm such as Stable Diffusion reconstitute the design and production parameters will significantly speed up the adoption of new innovations and allow us to build lighter, better-performing products, faster. As already happens in Formula 1 and in systems design, future engineers will act as constraint managers, able to express in words, and with reference to data sources, what the objectives and limitations of a product are.
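What might such a constraint-first specification look like? The sketch below is purely illustrative: neither the DesignConstraints class nor the generate_design() call is an existing API; they merely stand in for the idea of engineers stating objectives and limits rather than drawing geometry.

```python
# Illustrative only: a hypothetical constraint specification an engineer might
# hand to a generative design service. Neither DesignConstraints nor
# generate_design() is a real API; they stand in for the "constraint manager" idea.
from dataclasses import dataclass, field

@dataclass
class DesignConstraints:
    description: str                     # design intent in plain language
    max_mass_kg: float                   # hard physical limit
    min_safety_factor: float             # structural requirement
    materials: list[str] = field(default_factory=list)
    manufacturing_process: str = "CNC milling"

bracket_spec = DesignConstraints(
    description="Mounting bracket for a 12 kg actuator, bolted to a 6 mm plate",
    max_mass_kg=0.4,
    min_safety_factor=2.0,
    materials=["AlSi10Mg", "6061-T6"],
    manufacturing_process="additive (laser powder bed fusion)",
)

# A future generative backend could then regenerate geometry whenever a single
# constraint changes, e.g.:
# new_design = generate_design(bracket_spec)   # hypothetical call
```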
Without speeding up the engineering process for new and existing products in this way, we have almost no means of achieving the ambitious sustainability targets we must set ourselves. To do this, we must first agree on a language we can use to communicate beyond designs. This new semantic model is the obvious gap in the innovations outlined above. A number of companies have already started to experiment with it, such as nTopology with its concept of Fields. And yet the pace of change is slow, unlike that of the algorithms the semantic model will feed. Nvidia’s new algorithm is reportedly over twice as fast as DreamFusion, published less than two months ago. Product and engineering companies need to start capturing their ideas in new, future-proof ways now in order to make the most of the possibilities this explosion of generative AI holds. The speed of change in algorithms has shown, once again, that Moore’s Law applies wherever tools are being digitized. The challenge remains our human inability to embrace this change and deploy new communication methods capable of unlocking the potential of these tools, despite the urgency of the task.
Source: https://www.forbes.com/sites/andrewegner/2022/11/24/what-nvidias-new-text-to-3d-means-for-engineering–product-design/