An open source VQGAN+CLIP powered model that produces illustrated-style images of almost anything
Accomplice uses the neural network combination VQGAN+CLIP (which stands for Vector Quantized Generative Adversarial Network and Contrastive Language–Image Pre-training… a mouthful, I know!), trained on the ImageNet dataset, to understand your text and example image input and create one or more unique, high-quality images from it.
ImageNet is an image dataset that catalogs and organizes over 100,000 words and phrases, and it aims to provide at least 1,000 images to illustrate each of them. The images for each concept are quality-controlled and human-annotated. In total, ImageNet offers tens of millions of cleanly labeled and sorted images covering most of the concepts in the world's largest database of words and phrases.
Basically, VQGAN+CLIP is a pairing of neural networks that can be broken down into two pieces.
GANs (Generative Adversarial Networks) are systems where two neural networks are pitted against one another: a generator, which synthesizes images or data, and a discriminator, which scores how plausible the results are. The two feed back on each other, so the generator incrementally improves at producing convincing output.
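To make that tug-of-war concrete, here is a minimal sketch of the adversarial training loop in PyTorch. The tiny generator and discriminator, the random placeholder data, and the hyperparameters are illustrative assumptions, not Accomplice's actual architecture.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

# Generator: turns random noise into synthetic samples.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
# Discriminator: scores how plausible a sample looks (real vs. generated).
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.randn(32, data_dim)        # placeholder for real training images
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator tries to tell real from fake.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator tries to fool the discriminator.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```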
And CLIP (Contrastive Language–Image Pre-training) is the companion neural network that judges how well an image matches a natural language description. Your text prompt is fed to CLIP, and its score is used to steer what the VQGAN generates toward that description.
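Putting the two pieces together, here is a rough sketch of the kind of CLIP-guided loop that VQGAN+CLIP runs, assuming PyTorch and OpenAI's clip package are installed. The small stand-in decoder, the latent size, the prompt, and the step count are all illustrative assumptions, not the real VQGAN or Accomplice's pipeline.

```python
import torch
import torch.nn as nn
import clip

device = "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

decoder = nn.Sequential(  # placeholder decoder: latent grid -> 224x224 RGB image
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 14 -> 28
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),   # 28 -> 56
    nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ReLU(),    # 56 -> 112
    nn.ConvTranspose2d(8, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 112 -> 224
)

# Freeze both networks: only the latent image code gets optimized.
for p in list(clip_model.parameters()) + list(decoder.parameters()):
    p.requires_grad_(False)

# CLIP turns the text prompt into a feature vector once, up front.
with torch.no_grad():
    tokens = clip.tokenize(["an illustration of a fox in a forest"]).to(device)
    text_features = clip_model.encode_text(tokens)

# The "image" being generated is a latent grid that the decoder renders.
latent = torch.randn(1, 64, 14, 14, requires_grad=True)
optimizer = torch.optim.Adam([latent], lr=0.05)

clip_mean = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
clip_std = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)

for step in range(200):
    image = decoder(latent)                                   # render current latent
    image_features = clip_model.encode_image((image - clip_mean) / clip_std)
    # CLIP scores how well the rendered image matches the prompt;
    # nudging the latent to raise that score steers the image toward the text.
    loss = -torch.cosine_similarity(image_features, text_features).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```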
This model creates 1 image per credit
Browse images created using ImageNet
Sign up today and get 5 free credits to use on this and any of the other models Accomplice has to offer
Start creating with this model for free