
The absurd beauty of Nvidia’s GauGAN 2 AI hack

Typing the words “excellent reports” into Nvidia’s GauGAN 2 artificial intelligence program automatically produces surreal images. Tiernan Ray / / Nvidia

Type the words “excellent reports” into Nvidia’s new artificial intelligence demo, GauGAN 2, and you’ll see a picture of what looks like large chunks of foam insulation struggling in a lake against a snowy background.

Add more words, like “superb reporting nice,” and you’ll see the image morph into something new, a barely recognizable shape, perhaps a Formula 1 race car that has been digested, moving along what looks like a highway, in front of blurry views of a man-made structure.

zdnet-superb-reporting-comely-3.png

GauGAN 2 produces a strange interpretation of the phrase “superb reporting nice.” Tiernan Ray / / Nvidia

Roll the dice, using a small button bearing an image of two dice, and the same phrase becomes a creepy landscape, shrouded in mist, with an open mouth of some kind of organic nature, but completely unidentifiable as to its exact species.

zdnet-superb-reporting-comely-4.png

Another roll of the dice produces this strange landscape plus a creature. Tiernan Ray / / Nvidia

Typing phrases is one way of controlling GauGAN, an algorithm developed by graphics chip giant Nvidia to show the state of the art in AI. The original GauGAN program was introduced in early 2019 as a way to draw a sketch and have the program automatically generate a photorealistic image from it.

The term “GAN” in the name refers to a large class of neural network programs, called generative adversarial networks, introduced in 2014 by Ian Goodfellow and his colleagues. GANs use two neural networks that operate in opposition, one of which produces output that it refines repeatedly until the second neural network accepts it as valid. The competitive nature of this back and forth explains why they are called “adversarial.”
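The back and forth between the two networks can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not GauGAN's method: the "images" are single numbers drawn from a bell curve, the generator is a straight line, the discriminator is a single logistic unit, and every name and parameter (`train_toy_gan`, `real_mean`, `lr`) is invented for illustration.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Hypothetical toy setup: real samples come from N(real_mean, 1).
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: G(z) = a * z + b
    w, c = 0.1, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(real_mean, 1.0)
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(G(z)),
        # i.e. try to make the discriminator accept the fake.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w  # d/dx of log D(x) at x_fake
        a += lr * grad_x * z
        b += lr * grad_x
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print("generator:", a, b)
    print("discriminator:", w, c)
```

Each loop iteration is one round of the contest: the discriminator is first nudged to score the real sample higher than the fake, then the generator is nudged so that its fake scores higher. Real GANs replace both linear pieces with deep networks and the single numbers with images.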

Nvidia has done pioneering work in expanding GANs, in particular by introducing “StyleGAN” in 2018, which made it possible to generate highly realistic fake photos of people.

In the original GauGAN from 2019, Nvidia used a similar approach, allowing a landscape to be drawn as regions, known as a segmentation map. These high-level abstractions, such as lakes, rivers, and fields, are converted into a structural template, and the GauGAN program then fills in the drawn segmentation map with realistic-looking imagery.
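To make the idea of a segmentation map concrete, here is a minimal sketch, assuming a map is simply a grid of integer class labels that gets expanded into one binary mask per class before a model consumes it. The label names, the banded layout, and both helper functions are hypothetical illustrations, not Nvidia's actual data format.

```python
# Hypothetical class names; the index in this tuple is the label id.
LABELS = ("sky", "mountain", "lake")


def make_segmentation_map(width, height):
    """Build a toy map: three horizontal bands of sky, mountain, lake."""
    rows = []
    for y in range(height):
        if y < height // 3:
            label = 0  # sky
        elif y < 2 * height // 3:
            label = 1  # mountain
        else:
            label = 2  # lake
        rows.append([label] * width)
    return rows


def one_hot(seg_map, num_classes):
    """Expand the label grid into one binary mask per class."""
    return [
        [[1 if cell == k else 0 for cell in row] for row in seg_map]
        for k in range(num_classes)
    ]


if __name__ == "__main__":
    seg = make_segmentation_map(width=8, height=6)
    for row in seg:
        print(" ".join(LABELS[cell][0] for cell in row))
    masks = one_hot(seg, len(LABELS))
    print(len(masks), "masks of size", len(masks[0]), "x", len(masks[0][0]))
```

The point of the structure is that the drawing carries only layout, which region is which kind of thing, and leaves all texture and lighting for the generator to invent.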

The second version of the program has been updated to handle language. The idea is that GauGAN 2 is prompted with sensible phrases, terms related to landscapes, such as “coast ripples cliffs.” The GauGAN 2 program will respond by generating a realistic-looking scene that matches this input.

According to Nvidia, the program was developed, in its “training” phase, by being fed 10 million high-quality landscape images, using the Selene supercomputer built from Nvidia GPUs.

A segmentation map can also be created automatically, allowing you to go back and change the layout of the landscape, just as in the original GauGAN.

As Nvidia describes GauGAN 2 in a blog post, the combination of text, image, and segmentation map is a breakthrough in multimodal AI:

GauGAN2 combines segmentation mapping, inpainting, and text-to-image generation in a single model, making it a powerful tool for creating photorealistic art with a mix of words and drawings. This demo is one of the first to combine multiple modalities (text, semantic segmentation, sketch, and style) within a single GAN framework. This makes it faster and easier to turn an artist’s vision into a high-quality AI-generated image.

The practical benefit, according to Nvidia, is that a few words can be used to get a basic image without any drawing and then tweak the details to refine the end result.

But adding words that have nothing to do with landscapes starts to generate crazy artifacts. In deep learning terminology, the weird images produced by silly phrases are the result of the program having to deal with language that is “out of distribution,” that is, not captured in the training data that was supplied to the machine. Faced with phrases it cannot reconcile, the program still tries to match an image to the phrase.

As can be seen in a series of images, “coast ripples cliffs” initially produces a very faithful image. Adding out-of-place words (bicycle, New York City, the name Cassandra) begins to change and reshape the landscape in strange ways.

coast-ripples-cliffs-2.png

Automatic output by GauGAN2 of the phrase “coast ripples cliffs”. Tiernan Ray / / Nvidia

coast-ripples-cliffs-bike-new-york-cassandra-drill-plane-wisely-tire-ostentatious.png

Automatic output by GauGAN2 of the phrase “coast ripples cliffs bicycle New York Cassandra drill plane wisely tire ostentatious”. Tiernan Ray / / Nvidia

Even more interesting things happen when all the landscape words are removed, leaving only the nonsense. Weird and futuristic landscapes appear, or multicolored amoebas.

cassandra-drill-plane-wisely-flamboyant-tire.png

GauGAN2 automatic output for the phrase “Cassandra drill plane wisely flamboyant tire”. Tiernan Ray / / Nvidia

ostentatious-2.png

Automatic GauGAN2 output for the word “ostentatious”. Tiernan Ray / / Nvidia

glitzy-3.png

Automatic GauGAN2 output for the word “glitzy”. Tiernan Ray / / Nvidia

wisely-tire-ostentatious-2.png

GauGAN2 automatic output for the phrase “wisely tire ostentatious”. Tiernan Ray / / Nvidia

wisely-tire-flamboyant-3.png

GauGAN2 automatic output for the phrase “wisely tire flamboyant”. Tiernan Ray / / Nvidia

The experiment can be taken even further with lengthy phrases that are suggestive without being exactly descriptive. Try entering the first line of T.S. Eliot’s poem The Waste Land, “April is the cruellest month, breeding lilacs out of the dead land.”

This results in striking images that are, in fact, quite appropriate. As you roll the dice, many suitable landscape variations appear, with only slight artifacts in some cases.

“April is the cruellest month, breeding lilacs out of the dead land,” T.S. Eliot, The Waste Land.

April-is-the-cruellest-month-raising-lilacs-from-the-dead-earth-3.png

Tiernan Ray / / Nvidia

Thanks to the innovations of StyleGAN, GauGAN is able to apply a style to the image, conditioning the output on the look of another image, much like a mash-up.

The application of the style to Eliot’s poem distorts the faithful images of the landscape to the point of rendering them unrecognizable. Once again, a host of strange objects appear, some with a disgusting organic quality, others just fragments of what was once an image.

April-is-the-cruellest-month-raising-lilacs-from-the-dead-earth-8.png

Tiernan Ray / / Nvidia

April-is-the-cruellest-month-raising-lilacs-from-the-dead-earth-5.png

Tiernan Ray / / Nvidia

April-is-the-cruellest-month-raising-lilacs-from-the-dead-earth-14.png

Tiernan Ray / / Nvidia

You can also upload pictures and even draw in GauGAN 2. Submitting an old photograph taken at Þingvellir, the site of the former Icelandic parliament, did not yield much. The image was barely transformed, at least judging from limited evidence.

thingfetlir.jpg

A photo taken at Þingvellir, the site of the former Icelandic parliament, was hardly altered when it was submitted to GauGAN2. Tiernan Ray

However, adding the word “Þingvellir” produced a landscape realistic enough to pass for the site of Þingvellir.

thingvellir.png

GauGAN2’s output for the word “Þingvellir” was in the spirit of the old Icelandic landscape. Tiernan Ray / / Nvidia

Adding the word “volcano,” we get another striking landscape, less realistic, more surreal.

thingvellir-volcano.png

GauGAN2 automatic output for the phrase “Þingvellir volcano”. Tiernan Ray / / Nvidia

The addition of an out-of-place word, like “technology,” shook up the landscape even more, adding some bizarre and absurd figures.

thingvellir-technology-2.png

GauGAN2 automatic output for the phrase “Þingvellir technology”. Tiernan Ray / / Nvidia

Instead of uploading a photo of a landscape, you can draw one, as was the case in the original GauGAN. Again, choosing something that does not fit the demo, a drawing not of a landscape but of a person’s head, gives more interesting results. The face can be re-skinned, so to speak, using the mash-up function. Rolling the dice, you get interesting variations.

self portrait.jpg

Drawing directly in GauGAN2. Tiernan Ray / / Nvidia

output-gaugan-jpg-10.png

Drawing of a redesigned head using the layering feature in GauGAN2. Tiernan Ray / / Nvidia

output-gaugan-jpg-19.png

Drawing of a redesigned head using GauGAN2’s layering feature. Tiernan Ray / / Nvidia

Combining the drawing with the word “Þingvellir” produced subtle changes, as did adding further words such as “volcano” and “rift”. The image was reworked to have a volcano-like texture.

self-portrait-plus-thingvellir.png

Drawing of a head combined with the words “Þingvellir volcano rift” and restyled using the layering feature in GauGAN2. Tiernan Ray / / Nvidia

Note that navigating the application’s user interface can be difficult on desktop browsers. For whatever reason, it seems to work better on a tablet browser, like an iPad.

Source: “.com”
