As part of its GTC 2021 conference, Nvidia presented on April 16 its research work on generating 3D models from 2D images. The project is called GANverse3D and is based, as its name suggests, on a generative adversarial network (GAN), a machine learning technique that stands out for its ability to "create".
The best known of these is StyleGAN, which can create photorealistic portraits of fictitious people. Nvidia released it in December 2018, and subsequently presented GauGAN, which generates paintings from rough sketches, and GameGAN, which reproduces a playable Pac-Man purely by visual imitation.
GANverse3D differs from these in that it generates a 3D model from a single image. This is a particularly complex task because a 2D image does not show all the angles of an object, so the GAN must generate the missing viewpoints to complete the model. The difficulty of the problem is reflected in the fact that only three types of objects are showcased in the research: birds, horses and cars.
And the only use case that Nvidia deems strong enough for now involves cars. To showcase this new GAN, the company took the example of the legendary KITT, the Pontiac Firebird with a consciousness of its own at the heart of the television series Knight Rider (broadcast as K2000 in France). The model was trained on 55,000 images of cars showing multiple viewing angles.
It is able to distinguish the different elements of a car, such as headlights, windows and wheels. Once training is finished, a single image of KITT is enough to build the 3D model. After the GAN generates the textures, the Omniverse and PhysX tools are used to improve their quality and realism, then the model is placed in a driving scene with other cars to produce a video. While training the GAN takes several days in the data center, inference from a single image takes only 65 milliseconds on a V100 GPU, according to Sanja Fidler, director of Nvidia's research lab in Toronto.
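Nvidia has not published the code for this pipeline, but the flow described above (invert a single image into a latent code, have the GAN synthesize the unseen viewpoints, then lift them to a textured 3D model) can be sketched conceptually. Everything below is a toy illustration with placeholder functions and shapes, not Nvidia's API:

```python
# Conceptual sketch of a GANverse3D-style inference pipeline.
# All function names, latent sizes and mesh fields are illustrative
# placeholders standing in for the real neural-network stages.
import time

def invert_to_latent(image):
    # Stand-in for GAN inversion: map one 2D image to a latent code.
    return [sum(image) / len(image)] * 8  # toy 8-dim latent

def render_novel_views(latent, n_views=6):
    # Stand-in for the GAN synthesizing unseen viewpoints of the object.
    return [[v + i for v in latent] for i in range(n_views)]

def fit_textured_mesh(views):
    # Stand-in for lifting the multi-view predictions to a textured mesh,
    # which would then be handed to tools like Omniverse for refinement.
    return {"vertices": len(views) * 100, "textures": len(views)}

image = [0.2, 0.5, 0.7, 0.1]  # toy "pixels" standing in for a car photo
start = time.perf_counter()
latent = invert_to_latent(image)
views = render_novel_views(latent)
mesh = fit_textured_mesh(views)
elapsed_ms = (time.perf_counter() - start) * 1000
print(mesh["vertices"], len(views))
```

The key point the sketch captures is that all the heavy lifting happens at training time; inference is a single forward pass, which is why Nvidia can quote a 65 ms figure per image.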
As the video and illustrative image in this article show, the result is not of exceptional visual quality, but according to Nvidia it is still superior to what an inverse-graphics network trained on Pascal3D can produce. And Richard Kerris, who runs Omniverse at Nvidia, sees real uses for car manufacturers. "When creating promotional images, the featured vehicle requires a lot of effort and attention, but the context elements in the background don't need as much detail," he explains.
GANverse3D, he says, is a way to quickly create these background elements at no cost. It could also be applied to any other illustration requiring cars, such as artist renderings for urban development projects. And since GANverse3D works with Omniverse, the models it generates can quickly be imported into designers' usual tools.