Stable Diffusion

Emad Mostaque argues that it is "peoples' responsibility as to whether they are ethical, moral, and legal in how they operate this technology", and that putting the capabilities of Stable Diffusion into the hands of the public would result in the technology providing a net benefit, in spite of the potential negative consequences. In addition, Mostaque argues that the intention behind the open availability of Stable Diffusion is to end the control and dominance over such technologies of corporations, which have previously only developed closed AI systems for image synthesis. This is reflected by the fact that any restrictions Stability AI places on the content that users may generate can easily be bypassed due to the availability of the source code.
512×512 resolution; the version 2.0 update of the Stable Diffusion model later introduced the ability to natively generate images at 768×768 resolution. Another challenge is generating human limbs, due to the poor quality of limb data in the LAION database. The model is insufficiently trained to understand human limbs and faces due to the lack of representative features in the database, and prompting the model to generate such images can confound it. Stable Diffusion XL (SDXL) version 1.0, released in July 2023, introduced native 1024×1024 resolution and improved generation for limbs and text.
the original model. This approach ensures that training with small datasets of image pairs does not compromise the integrity of production-ready diffusion models. The "zero convolution" is a 1×1 convolution with both weight and bias initialized to zero. Before training, all zero convolutions produce zero output, preventing any distortion caused by ControlNet. No layer is trained from scratch; the process is still fine-tuning, keeping the original model secure. This method enables training on small-scale or even personal devices.
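The zero-initialisation argument can be checked with a toy sketch. This is an illustration under stated assumptions, not ControlNet's actual code: the helper names (`conv1x1`, `locked_block`, `trainable_block`, `controlnet_block`) and the single-channel arithmetic blocks are hypothetical stand-ins for real network layers.

```python
def conv1x1(pixels, weight, bias):
    """Single-channel 1x1 convolution: scale and shift every pixel independently."""
    return [[weight * p + bias for p in row] for row in pixels]


def add(a, b):
    """Element-wise sum of two equally sized 2-D grids."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def locked_block(pixels):
    # Hypothetical stand-in for a frozen block of the original model.
    return [[2.0 * p + 1.0 for p in row] for row in pixels]


def trainable_block(pixels):
    # Hypothetical stand-in for the trainable copy (starts from the same weights).
    return [[2.0 * p + 1.0 for p in row] for row in pixels]


def controlnet_block(x, condition, zero_in, zero_out):
    """output = locked(x) + zero_out(trainable(x + zero_in(condition)))."""
    injected = add(x, conv1x1(condition, *zero_in))
    residual = conv1x1(trainable_block(injected), *zero_out)
    return add(locked_block(x), residual)


x = [[0.5, -1.0], [2.0, 0.0]]
condition = [[1.0, 1.0], [0.0, 3.0]]
zero = (0.0, 0.0)  # weight and bias both initialised to zero

# Before any training the control branch contributes nothing.
untrained = controlnet_block(x, condition, zero, zero)
# Once the zero convolutions have moved away from zero, the condition takes effect.
trained = controlnet_block(x, condition, (0.5, 0.0), (0.1, 0.0))
```

With both zero convolutions at their initial values, `untrained` equals `locked_block(x)`, which is the sense in which the original model is left undisturbed at the start of fine-tuning.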
prompts". Negative prompts are a feature included in some front-end implementations, including Stability AI's own DreamStudio cloud service, and allow the user to specify prompts which the model should avoid during image generation. The specified prompts may be undesirable image features that would otherwise be present within image outputs due to the positive prompts provided by the user, or due to how the model was originally trained, with mangled human hands being a common example.
However, this fine-tuning process is sensitive to the quality of new data; low-resolution images, or images whose resolution differs from the original data, can not only fail to teach the model the new task but also degrade its overall performance. Even when the model is additionally trained on high-quality images, it is difficult for individuals to run models on consumer electronics. For example, the training process for waifu-diffusion requires a minimum 30 GB of
as the model was primarily trained on images with English descriptions. As a result, generated images reinforce social biases and are from a western perspective, as the creators note that the model lacks data from other communities and cultures. The model gives more accurate results for prompts that are written in English in comparison to those written in other languages, with western or white cultures often being the default representation.
a German non-profit which receives funding from Stability AI. The Stable Diffusion model was trained on three subsets of LAION-5B: laion2B-en, laion-high-resolution, and laion-aesthetics v2 5+. A third-party analysis of the model's training data identified that out of a smaller subset of 12 million images taken from the original wider dataset used, approximately 47% of the sample size of images came from 100 different domains, with
which fills the masked space with newly generated content based on the provided prompt. A dedicated model specifically fine-tuned for inpainting use-cases was created by Stability AI alongside the release of Stable Diffusion 2.0. Conversely, outpainting extends an image beyond its original dimensions, filling the previously empty space with content generated based on the provided prompt.
a longer duration of time; however, a smaller value may result in visual defects. Another configurable option, the classifier-free guidance scale value, allows the user to adjust how closely the output image adheres to the prompt. More experimental use cases may opt for a lower scale value, while use cases aiming for more specific outputs may use a higher value.
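The effect of the guidance scale can be sketched with the usual classifier-free guidance combination, in which the guided noise estimate is the unconditional estimate plus the scale times the difference between the conditional and unconditional estimates. The function name `apply_guidance` and the flat lists of floats are assumptions made for illustration; a real sampler applies the same arithmetic to tensors produced by the U-Net.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    """Combine two noise predictions as uncond + scale * (cond - uncond)."""
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(noise_uncond, noise_cond)
    ]


# Toy predictions for a two-element "latent" (illustrative values only).
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

weak = apply_guidance(uncond, cond, 1.0)    # equals the conditional prediction
strong = apply_guidance(uncond, cond, 7.5)  # pushed further along the prompt direction
```

A scale of 1.0 reproduces the conditional prediction unchanged, and larger values move the estimate further in the direction suggested by the prompt, which matches the description of higher values giving more specific outputs.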
Hypernetworks steer results towards a particular direction, allowing Stable Diffusion-based models to imitate the art style of specific artists, even if the artist is not recognised by the original model; they process the image by finding key areas of importance such as hair and eyes, and then patch these areas in secondary latent space.
in which the visual features of image data are changed and anonymized. The same process may also be useful for image upscaling, in which the resolution of an image is increased, with more detail potentially being added to the image. Additionally, Stable Diffusion has been experimented with as a tool for image compression. Compared to
Stable Diffusion has issues with degradation and inaccuracies in certain scenarios. Initial releases of the model were trained on a dataset that consists of 512×512 resolution images, meaning that the quality of generated images noticeably degrades when user specifications deviate from its "expected"
ControlNet is a neural network architecture designed to manage diffusion models by incorporating additional conditions. It duplicates the weights of neural network blocks into a "locked" copy and a "trainable" copy. The "trainable" copy learns the desired condition, while the "locked" copy preserves
which affects the output image. Users may opt to randomize the seed in order to explore different generated outputs, or use the same seed to obtain the same image output as a previously generated image. Users are also able to adjust the number of inference steps for the sampler; a higher value takes
More traditional visual artists have expressed concern that widespread usage of image synthesis software such as Stable Diffusion may eventually lead to human artists, along with photographers, models, cinematographers, and actors, gradually losing commercial viability against AI-based competitors.
The Stable Diffusion model supports the ability to generate new images from scratch through the use of a text prompt describing elements to be included or omitted from the output. Existing images can be re-drawn by the model to incorporate new elements described by a text prompt (a process known as
The model was initially trained on the laion2B-en and laion-high-resolution subsets, with the last few rounds of training done on LAION-Aesthetics v2 5+, a subset of 600 million captioned images which the LAION-Aesthetics Predictor V2 predicted that humans would, on average, give a score of at
implementations of Stable Diffusion, which allow users to modify the weight given to specific parts of the text prompt. Emphasis markers allow users to add or reduce emphasis to keywords by enclosing them with brackets. An alternative method of adjusting weight to parts of the prompt are "negative
Stable Diffusion also includes another sampling script, "img2img", which consumes a text prompt, path to an existing image, and strength value between 0.0 and 1.0. The script outputs a new image based on the original image that also features elements provided within the text prompt. The strength
Stable Diffusion is notably more permissive in the types of content users may generate, such as violent or sexually explicit imagery, in comparison to other commercial products based on generative AI. Addressing the concerns that the model may be used for abusive purposes, CEO of Stability AI,
The text-to-image sampling script within Stable Diffusion, known as "txt2img", consumes a text prompt in addition to assorted option parameters covering sampling types, output image dimensions, and seed values. The script outputs an image file based on the model's interpretation of the prompt.
data scraped from the web, where 5 billion image-text pairs were classified based on language and filtered into separate datasets by resolution, a predicted likelihood of containing a watermark, and predicted "aesthetic" score (e.g. subjective visual quality). The dataset was created by
An "embedding" can be trained from a collection of user-provided images, and allows the model to generate visually similar images whenever the name of the embedding is used within a generation prompt. Embeddings are based on the "textual inversion" concept developed by researchers from
The architecture is named "multimodal diffusion transformer" (MMDiT), where "multimodal" means that it mixes text and image encodings inside its operations. This differs from previous versions of DiT, where the text encoding affects the image encoding, but not vice versa.
The XL version uses the same LDM architecture as previous versions, only larger: it has a larger UNet backbone, a larger cross-attention context, and two text encoders instead of one, and it was trained on multiple aspect ratios (not just the square aspect ratio used by previous versions).
The Transformer architecture used for SD 3.0 has three "tracks", for original text encoding, transformed text encoding, and image encoding (in latent space). The transformed text encoding and image encoding are mixed during each transformer block.
Additional use-cases for image modification via img2img are offered by numerous front-end implementations of the Stable Diffusion model. Inpainting involves selectively modifying a portion of an existing image delineated by a user-provided
Radford, Alec; Kim, Jong Wook; Hallacy, Chris; Ramesh, Aditya; Goh, Gabriel; Agarwal, Sandhini; Sastry, Girish; Askell, Amanda; Mishkin, Pamela (February 26, 2021). "Learning Transferable Visual Models From Natural Language Supervision".
claiming that these companies have infringed the rights of millions of artists by training AI tools on five billion images scraped from the web without the consent of the original artists. The same month, Stability AI was also sued by
", giving medical advice, automatically creating legal obligations, producing legal evidence, and "discriminating against or harming individuals or groups based on ... social behavior or ... personal or personality characteristics ...
of the provided input image, and generates a new output image based on both the text prompt and the depth information, which allows the coherence and depth of the original input image to be maintained in the generated output.
Gal, Rinon; Alaluf, Yuval; Atzmon, Yuval; Patashnik, Or; Bermano, Amit H.; Chechik, Gal; Cohen-Or, Daniel (August 2, 2022). "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion".
Podell, Dustin; English, Zion; Lacey, Kyle; Blattmann, Andreas; Dockhorn, Tim; Müller, Jonas; Penna, Joe; Rombach, Robin (July 4, 2023). "SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis".
value denotes the amount of noise added to the original image before it is denoised. A higher strength value produces more variation within the image but may produce an image that is not semantically consistent with the prompt provided.
Stable Diffusion claims no rights on generated images and freely gives users the rights of usage to any generated images from the model provided that the image content is not illegal or harmful to individuals.
along with the model (pretrained weights). It applies the Creative ML OpenRAIL-M license, a form of Responsible AI License (RAIL), to the model (M). The license prohibits certain use cases, including crime,
where vector representations for specific tokens used by the model's text encoder are linked to new pseudo-words. Embeddings can be used to reduce biases within the original model, or mimic visual styles.
a user interface for Stable Diffusion, took place, with the hackers claiming they targeted users who committed "one of our sins", which included AI-art generation, art theft, and promoting cryptocurrency.
least 5 out of 10 when asked to rate how much they liked them. The LAION-Aesthetics v2 5+ subset also excluded low-resolution images and images which LAION-5B-WatermarkDetection identified as carrying a
The images Stable Diffusion was trained on have been filtered without human input, leading to some harmful images and large amounts of private and sensitive information appearing in the training data.
Meng, Chenlin; He, Yutong; Song, Yang; Song, Jiaming; Wu, Jiajun; Zhu, Jun-Yan; Ermon, Stefano (January 4, 2022). "SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations".
Meng, Chenlin; He, Yutong; Song, Yang; Song, Jiaming; Wu, Jiajun; Zhu, Jun-Yan; Ermon, Stefano (August 2, 2021). "SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations".
capturing a more fundamental semantic meaning of the image. Gaussian noise is iteratively applied to the compressed latent representation during forward diffusion. The U-Net block, composed of a
(2021). This paper describes the CLIP method for training text encoders, which convert text into floating point vectors. Such text encodings are used by the diffusion model to create images.
ViT-L/14 text encoder is used to transform text prompts to an embedding space. Researchers point to increased computational efficiency for training and generation as an advantage of LDMs.
the output from forward diffusion backwards to obtain a latent representation. Finally, the VAE decoder generates the final image by converting the representation back into pixel space.
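The forward-diffusion step that the U-Net learns to reverse can be sketched in closed form: a latent z_0 noised to step t is sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps, where eps is standard Gaussian noise and abar_t is the running product of (1 - beta). The linear beta schedule, the step count, and the helper names below are assumptions borrowed from the general diffusion-model literature for illustration; Stable Diffusion's own schedule and latent shapes may differ.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule (illustrative, not Stable Diffusion's exact one)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """abar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(latent, t, abar, rng):
    """Jump directly to step t: sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    signal = math.sqrt(abar[t])
    noise = math.sqrt(1.0 - abar[t])
    return [signal * z + noise * rng.gauss(0.0, 1.0) for z in latent]


betas = linear_betas(1000)
abar = cumulative_alphas(betas)

latent = [1.0, -2.0, 0.5]  # stand-in for a VAE-encoded image
slightly_noisy = add_noise(latent, 10, abar, random.Random(0))
mostly_noise = add_noise(latent, 999, abar, random.Random(0))
```

Early steps keep most of the signal (abar close to 1), while by the final step the latent is almost pure Gaussian noise, which is the starting point for the reverse, denoising pass.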
Chambon, Pierre; Bluethgen, Christian; Langlotz, Curtis P.; Chaudhari, Akshay (October 9, 2022). "Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains".
Four of the original five authors (Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz) later joined Stability AI and released subsequent versions of Stable Diffusion.
Luzi, Lorenzo; Siahkoohi, Ali; Mayer, Paul M.; Casco-Rodriguez, Josue; Baraniuk, Richard (October 21, 2022). "Boomerang: Local sampling on image manifolds using diffusion models".
and Robin Rombach of CompVis, who were among the researchers who had earlier invented the latent diffusion model architecture used by Stable Diffusion. Stability AI also credited
The SD XL Refiner, released at the same time, has the same architecture as SD XL, but it was trained for adding fine details to preexisting images via text-conditional img2img.
in 2022 which can fine-tune the model to generate precise, personalised outputs that depict a specific subject, following training via a set of images which depict the subject.
The CCDH, a campaign group, tested four of the largest public-facing AI platforms: Midjourney, OpenAI's ChatGPT Plus, Stability.ai's DreamStudio and Microsoft's Image Creator.
Accessibility for individual developers can also be a problem. In order to customize the model for new use cases that are not included in the dataset, such as generating
The denoising step can be flexibly conditioned on a string of text, an image, or another modality. The encoded conditioning data is exposed to denoising U-Nets via a
"guided image synthesis") through its diffusion-denoising mechanism. In addition, the model also allows the use of prompts to partially alter existing images via
A "hypernetwork" is a small pretrained neural network that is applied to various points within a larger neural network, and refers to the technique created by
Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli (March 12, 2015). "Deep Unsupervised Learning using Nonequilibrium Thermodynamics".
Esser, Patrick; Kulal, Sumith; Blattmann, Andreas; Entezari, Rahim; Müller, Jonas; Saini, Harry; Levi, Yam; Lorenz, Dominik; Sauer, Axel (March 5, 2024),
There are different methods for performing img2img. The main method is SDEdit, which first adds noise to an image, then denoises it as usual in txt2img.
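One way to picture how a strength value between 0.0 and 1.0 maps onto this noise-then-denoise procedure is as a choice of how far into the noise schedule the input image is pushed, and therefore how many denoising steps are run afterwards. The sketch below assumes one common convention (steps to run = floor of steps times strength); the function name `img2img_steps` is hypothetical and individual front-ends may compute this differently.

```python
def img2img_steps(num_inference_steps, strength):
    """Return (start_index, steps_to_run) for an SDEdit-style img2img pass.

    strength = 0.0 adds no noise and runs no denoising steps (image unchanged);
    strength = 1.0 noises the image fully and runs the whole schedule.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    start_index = num_inference_steps - steps_to_run
    return start_index, steps_to_run


# With a 50-step sampler, half strength skips the first 25 steps of the schedule.
start, run = img2img_steps(50, 0.5)
```

Under this convention a higher strength discards more of the original image, which is consistent with higher values producing more variation in the output.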
and outpainting, when used with an appropriate user interface that supports such features, of which numerous different open source implementations exist.
with greater than 80% probability. Final rounds of training additionally dropped 10% of text conditioning to improve Classifier-Free Diffusion Guidance.
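Dropping a fraction of the text conditioning during training is what lets a single model produce both conditional and unconditional predictions, which classifier-free guidance then combines at sampling time. A minimal sketch of such caption dropout follows; the helper name `maybe_drop_caption`, the use of an empty string as the "no prompt" input, and the sample captions are assumptions made for illustration rather than details taken from the actual training code.

```python
import random


def maybe_drop_caption(caption, rng, drop_prob=0.10):
    """With probability drop_prob, replace the caption by an empty prompt."""
    return "" if rng.random() < drop_prob else caption


rng = random.Random(0)
captions = ["a photograph of an astronaut riding a horse"] * 10_000
training_prompts = [maybe_drop_caption(c, rng) for c in captions]

# Fraction of training examples that ended up unconditional.
dropped_fraction = training_prompts.count("") / len(training_prompts)
```

Over many examples, roughly one prompt in ten is blanked out, so the model also learns to denoise without any text guidance.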
The technical license for the model was released by the CompVis group at Ludwig Maximilian University of Munich. Development was led by Patrick Esser of
An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data.
million in the text encoder, Stable Diffusion is considered relatively lightweight by 2022 standards, and unlike other diffusion models, it can run on
(2022). This paper describes CFG, which allows the text encoding vector to steer the diffusion model towards creating the image described by the text.
to allow users to identify an image as generated by Stable Diffusion, although this watermark loses its efficacy if the image is resized or rotated.
adaptations of Stable Diffusion created through additional retraining have been used for a variety of different use-cases, from medical imaging to
A depth-guided model, named "depth2img", was introduced with the release of Stable Diffusion 2.0 on November 24, 2022; this model infers the
(2021, updated in 2022). This paper describes the latent diffusion model (LDM). This is the backbone of the Stable Diffusion architecture.
It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as
Running Stable Diffusion with 10 GB or more of VRAM is recommended; however, users with less VRAM may opt to load the weights in
Stable Diffusion was trained on pairs of images and captions taken from LAION-5B, a publicly available dataset derived from
inclined to dismiss most of the lawsuit filed by Andersen, McKernan, and Ortiz but allowed them to file a new complaint.
the recent methods used for image compression in Stable Diffusion face limitations in preserving small text and faces.
There are three methods in which user-accessible fine-tuning can be applied to a Stable Diffusion model checkpoint:
International Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, LA. pp. 10684–10695.
along with the attention mechanism, resulting in the desired image depicting a representation of the trained concept.
(a German nonprofit which assembled the dataset on which Stable Diffusion was trained) as supporters of the project.
To address the limitations of the model's initial training, end-users may opt to implement additional training to
All released by CompVis. There is no "version 1.0". 1.1 gave rise to 1.2, and 1.2 gave rise to both 1.3 and 1.4.
and an optional text encoder. The VAE encoder compresses the image from pixel space to a smaller dimensional
An image generated with Stable Diffusion based on the text prompt "a photograph of an astronaut riding a horse"
Introduced in 2015, diffusion models are trained with the objective of removing successive applications of
have been brought up, due to such images generated by Stable Diffusion being shared on websites such as
". The user owns the rights to their generated output images, and is free to use them commercially.
until a configured number of steps have been reached, guided by the CLIP text encoder pretrained on
4773: 4030: 3378: 1388: 242: 3298: 2884:
Zhang, Lvmin (February 10, 2023). "Adding Conditional Control to Text-to-Image Diffusion Models".
4818: 4803: 4456: 4451: 4351: 4219: 4000: 3927: 3842: 3328: 2783: 2382: 1946:"Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion's Image Generator" 1248: 1196: 380: 368: 1121: 877:. In addition to Stability's interfaces, many third party open source interfaces exist, such as 456: 4927: 4778: 4538: 4257: 4252: 3634:"Hackers Target AI Users With Malicious Stable Diffusion Tool on GitHub to Protest 'Art Theft'" 2646: 2330: 1141: 352: 342: 3873: 1588: 223:
with a computational donation from Stability and training data from non-profit organizations.
4808: 4793: 4758: 4446: 4346: 4214: 3906: 3243: 2753: 1910: 1297: 1228: 412:
and an important link was made between this purely physical field and deep learning in 2015.
364: 282: 227: 4676: 3752: 4828: 4783: 4229: 4174: 4020: 4015: 3950: 2120: 2009: 1685: 754: 424: 109: 3450:
Rombach, Robin; Blattmann, Andreas; Lorenz, Dominik; Esser, Patrick; Ommer, Björn (2022).
3268: 3006: 1425: 816:
The ability of img2img to add noise to the original image makes it potentially useful for
8: 4403: 4381: 4130: 4125: 4083: 4035: 1745: 1327: 610: 539: 185: 135: 3456:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
3213: 3180: 2300: 1475:. CompVis - Machine Vision and Learning Research Group, LMU Munich. September 17, 2022. 1067:(2022). Describes rectified flow, which is used for the backbone architecture of SD 3.0. 4788: 4366: 3459: 3430: 3124: 2885: 2833: 2809: 2669: 2603: 2474: 2155: 2099: 1888: 1861: 1831: 1731: 1642: 1160: 817: 212: 324:
process used by Stable Diffusion. The model generates images by iteratively denoising
4854: 4842: 4646: 4298: 4169: 4162: 3103: 2754:"stable-diffusion-tools/emphasis at master · JohannesGaessler/stable-diffusion-tools" 2236: 1192: 821: 739: 639: 578: 515: 3692:"Getty Images suing the makers of popular AI art tool for allegedly stealing photos" 2932: 98: 4599: 4589: 4190: 4140: 4135: 4078: 4066: 2098:
Ho, Jonathan; Salimans, Tim (July 25, 2022). "Classifier-Free Diffusion Guidance".
600:
generation outputs to match more specific use-cases, a process also referred to as
585: 289: 220: 142: 3945: 2547:"NVIDIA Quietly Launches GeForce RTX 3080 12GB: More VRAM, More Power, More Money" 1885:
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
1065:
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
28: 4712: 4656: 4478: 4120: 4040: 2443: 2187:"A startup wants to democratize the tech behind DALL-E 2, consequences be damned" 1041:
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
384: 348: 249:. This marked a departure from previous proprietary text-to-image models such as 230: 189: 3953:: Investigation on sensitive and private data in Stable Diffusions training data 3353: 3007:"A friendly guide to local AI image gen with Stable Diffusion and Automatic1111" 308: 4686: 4651: 4641: 4466: 4224: 4050: 3403: 2414:"Stability AI releases Stable Diffusion XL, its next-gen image synthesis model" 2265: 1137: 1133: 360: 234: 3661:"AI art tools Stable Diffusion and Midjourney targeted with copyright lawsuit" 1568: 1472: 1328:"Leaked deck raises questions over Stability AI's Series A pitch to investors" 4891: 4631: 4611: 4528: 4207: 3913:"Step by Step visual introduction to Diffusion Models. - Blog by Kemal Erdem" 3540: 1103: 406: 258: 181: 3071: 561:
characters ("waifu diffusion"), new data and further training are required.
4717: 4548: 3963: 3011: 2907:"Stable Diffusion in your pocket? "Draw Things" brings AI images to iPhone" 1223: 1153: 1059:
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
486: 376: 325: 197: 54: 771: 4813: 4584: 4493: 4488: 4110: 4088: 3507:"This artist is dominating AI-generated art. And he's not happy about it" 2063: 1976:"This artist is dominating AI-generated art. And he's not happy about it" 902: 887:, which aims to decrease the amount of prompting needed by the user, and 780: 746: 4707: 4666: 4661: 4574: 4483: 4391: 4303: 4283: 3601: 1858:
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
1213: 1184: 1149: 1145: 1071:
Scaling Rectified Flow Transformers for High-resolution Image Synthesis
653: 631: 511: 356: 293: 254: 208: 49: 39: 211:, outpainting, and generating image-to-image translations guided by a 4702: 4671: 4569: 4413: 4376: 4313: 4267: 4262: 4247: 1035:
Learning Transferable Visual Models From Natural Language Supervision
844: 573:, which exceeds the usual resource provided in such consumer GPUs as 566: 528: 499: 495: 409: 355:, developed by the CompVis (Computer Vision & Learning) group at 321: 312:
Diagram of the latent diffusion architecture used by Stable Diffusion
954:
Initialized with the weights of 1.2, not 1.4. Released by RunwayML.
711:
Demonstration of the effect of negative prompts on image generation
156: 4604: 4436: 3464: 3435: 2890: 2838: 2814: 2674: 2608: 2582: 2479: 2104: 1893: 1866: 1836: 1736: 1647: 1499:"The new killer app: Creating AI art will absolutely crush your PC" 624:
developer Kurumuz in 2021, originally intended for text-generation
503: 461:
The 3.0 version completely changes the backbone. Not a UNet, but a
432: 2495: 2150: 2148: 2146: 1358:"Revolutionizing image generation by AI: Turning text into images" 634:
is a deep learning generation model developed by researchers from
241:, and it can run on most consumer hardware equipped with a modest 215:. Its development involved researchers from the CompVis Group at 4727: 4564: 4518: 4441: 4341: 4336: 4288: 3154: 2831: 2472: 1218: 1110: 889: 883: 664: 660: 621: 347:
Models in Stable Diffusion series before SD 3 all used a kind of
329: 201: 3541:"Midjourneyを超えた? 無料の作画AI「 #StableDiffusion 」が「AIを民主化した」と断言できる理由" 4742: 4722: 4594: 4386: 3723:"US judge finds flaws in artists' lawsuit against AI companies" 3480:"LICENSE.md · stabilityai/stable-diffusion-xl-base-1.0 at main" 2143: 1883:
Liu, Xingchao; Gong, Chengyue; Liu, Qiang (September 7, 2022),
1573: 1188: 1172: 635: 614: 584:
The creators of Stable Diffusion acknowledge the potential for
574: 507: 278: 250: 3452:"High-Resolution Image Synthesis With Latent Diffusion Models" 1796:"Text-to-Image Generation with Stable Diffusion and OpenVINO™" 4543: 4523: 4513: 4508: 4503: 4498: 4461: 4293: 2933:"AI can be easily used to make fake election photos - report" 2496:"Riffusion - Stable diffusion for real-time music generation" 1803: 1180: 1114: 865:
Stability provides an online image generation service called
558: 491: 372: 363:
on training images, which can be thought of as a sequence of
297: 1635:
High-Resolution Image Synthesis with Latent Diffusion Models
1047:
High-Resolution Image Synthesis with Latent Diffusion Models
994:
The XL 1.0 base model has 3.5 billion parameters, making it
4533: 3449: 1855: 1828: 1631: 829: 825: 570: 498:
taking up 8.5% of the subset, followed by websites such as
542:
for a total of 150,000 GPU-hours, at a cost of $600,000.
3299:"stabilityai/stable-diffusion-xl-base-1.0 · Hugging Face" 1602: 1467: 1465: 1463: 1461: 1459: 1457: 1455: 1453: 1451: 1420: 1418: 1416: 1414: 428: 3028:"Fooocus is the easiest way to create AI art on your PC" 2962:"Stability AI open sources its AI-powered design studio" 2784:"Stable Diffusion v2.1 and DreamStudio Updates 7-Dec 22" 2600: 1539:"Anyone can use this AI art generator — that's the risk" 1298:"Diffuse The Rest - a Hugging Face Space by huggingface" 1272:"How to Run Stable Diffusion Locally to Generate Images" 3427: 2988:"Stability AI is open-sourcing its DreamStudio web app" 1010:
Distilled from XL 1.0 to run in fewer diffusion steps.
881:, which is the most popular and offers extra features, 2493: 1632:
Rombach; Blattmann; Lorenz; Esser; Ommer (June 2022).
1448: 1411: 869:. The company also released an open source version of 3783:"From RAIL to Open RAIL: Topologies of RAIL Licenses" 801:: Modified image created with Stable Diffusion XL 1.0 667:
to trade off model performance for lower VRAM usage.
Stable Diffusion

Image-generating machine learning model

Original author(s): Runway, CompVis, and Stability AI
Developer(s): Stability AI
Initial release: August 22, 2022
Stable release: SDXL 1.0 (model) / July 26, 2023
Repository: github.com/Stability-AI/generative-models
Written in: Python
Type: Text-to-image model
License: Creative ML OpenRAIL-M
Website: stability.ai/stable-image

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.

Development

Stable Diffusion originated from a project called Latent Diffusion, developed in Germany by researchers at Ludwig Maximilian University in Munich and Heidelberg University.

Training procedures

The model was trained using 256 Nvidia A100 GPUs on Amazon Web Services.

SD3 was trained at a cost of around $10 million.

Text to image generation

Generated images are tagged with an invisible digital watermark.

Each txt2img generation involves a specific seed value.

Releases

Version number | Release date | Parameter | Notes
1.1, 1.2, 1.3, 1.4 | August 2022 | |
1.5 | October 2022 | 983M |
2.0 | November 2022 | | Retrained from scratch on a filtered dataset.
2.1 | December 2022 | | Initialized with the weights of 2.0.
XL 1.0 | July 2023 | 3.5B | Around 3.5x larger than previous versions.
XL Turbo | November 2023 | |
3.0 | February 2024 (early preview) | 800M to 8B | A family of models.

Key papers

A 2021 paper describes SDEdit, aka "img2img".
Classifier-Free Diffusion Guidance.
A 2023 paper describes SDXL.
A 2024 paper describes SD 3.0.

Training cost

SD 2.0: 0.2 million hours on A100 (40GB).

Litigation

In January 2023, three artists, Sarah Andersen, Kelly McKernan, and Karla Ortiz, filed a copyright infringement lawsuit against Stability AI, Midjourney, and DeviantArt.

License

Unlike models like DALL-E, Stable Diffusion makes its source code available.
August 23, 2023 3104:Dartmouth College 2551:www.anandtech.com 2222:r/StableDiffusion 1915:www.cs.utexas.edu 1028: 1027: 822:data augmentation 740:digital watermark 640:Boston University 579:GeForce 30 series 516:Wikimedia Commons 175: 174: 4935: 4875:Machine learning 4865: 4864: 4845: 4600:Action selection 4590:Self-driving car 4397:Stable Diffusion 4362:Speech synthesis 4327: 4326: 4191:Machine learning 4067:Gradient descent 3988: 3981: 3974: 3965: 3964: 3942: 3940: 3938: 3923: 3921: 3919: 3894: 3893: 3891: 3889: 3869: 3863: 3862: 3860: 3858: 3839: 3833: 3832: 3830: 3828: 3809: 3803: 3802: 3800: 3798: 3779: 3773: 3772: 3770: 3768: 3749: 3743: 3742: 3740: 3738: 3718: 3712: 3711: 3709: 3707: 3687: 3681: 3680: 3678: 3676: 3656: 3650: 3649: 3647: 3645: 3636: 3628: 3622: 3621: 3619: 3617: 3598: 3592: 3591: 3589: 3587: 3567: 3561: 3560: 3558: 3556: 3536: 3527: 3526: 3524: 3522: 3502: 3496: 3495: 3493: 3491: 3476: 3470: 3469: 3467: 3447: 3441: 3440: 3438: 3425: 3419: 3418: 3416: 3414: 3400: 3394: 3393: 3391: 3389: 3375: 3369: 3368: 3366: 3364: 3350: 3344: 3343: 3341: 3339: 3325: 3319: 3318: 3316: 3314: 3295: 3289: 3288: 3286: 3284: 3265: 3259: 3258: 3256: 3254: 3240: 3234: 3233: 3231: 3229: 3210: 3201: 3200: 3198: 3196: 3177: 3171: 3170: 3168: 3166: 3151: 3145: 3144: 3142: 3140: 3121: 3115: 3114: 3112: 3110: 3093: 3087: 3086: 3084: 3082: 3068: 3062: 3061: 3059: 3057: 3042: 3036: 3035: 3023: 3017: 3016: 3002: 2996: 2995: 2983: 2977: 2976: 2974: 2972: 2957: 2951: 2950: 2945: 2943: 2928: 2922: 2921: 2919: 2917: 2902: 2896: 2895: 2893: 2881: 2875: 2874: 2872: 2870: 2850: 2844: 2843: 2841: 2829: 2820: 2819: 2817: 2805: 2796: 2795: 2780: 2774: 2773: 2771: 2769: 2750: 2744: 2743: 2742: 2740: 2723: 2717: 2716: 2714: 2712: 2693: 2680: 2679: 2677: 2665: 2659: 2658: 2642: 2636: 2635: 2620: 2614: 2613: 2611: 2598: 2592: 2591: 2573: 2567: 2566: 2564: 2562: 2542: 2536: 2535: 2534: 2532: 2514: 2508: 2507: 2491: 2485: 2484: 2482: 2470: 2464: 2463: 2461: 2459: 2440: 2434: 2433: 2431: 
2429: 2409: 2403: 2402: 2400: 2398: 2379: 2373: 2372: 2370: 2368: 2349: 2343: 2342: 2340: 2338: 2319: 2313: 2312: 2297: 2286: 2285: 2283: 2281: 2262: 2247: 2246: 2240: 2232: 2230: 2228: 2213: 2207: 2206: 2204: 2202: 2182: 2176: 2175: 2173: 2171: 2152: 2141: 2140: 2138: 2136: 2116: 2110: 2109: 2107: 2095: 2084: 2083: 2081: 2079: 2060: 2054: 2053: 2052: 2050: 2032: 2026: 2025: 2023: 2021: 2005: 1996: 1995: 1993: 1991: 1972: 1966: 1965: 1963: 1961: 1941: 1926: 1925: 1923: 1921: 1907: 1898: 1897: 1896: 1880: 1871: 1870: 1869: 1853: 1842: 1841: 1839: 1826: 1815: 1814: 1812: 1810: 1792: 1786: 1785: 1783: 1781: 1762: 1756: 1755: 1749: 1741: 1739: 1727: 1721: 1720: 1712: 1706: 1705: 1703: 1701: 1681: 1668: 1667: 1665: 1663: 1657: 1650: 1640: 1629: 1618: 1617: 1615: 1613: 1599: 1593: 1592: 1585: 1579: 1578: 1565: 1559: 1558: 1556: 1554: 1534: 1519: 1518: 1516: 1514: 1495: 1489: 1488: 1486: 1484: 1469: 1446: 1445: 1443: 1441: 1422: 1409: 1408: 1406: 1404: 1384: 1378: 1377: 1375: 1373: 1354: 1348: 1347: 1345: 1343: 1324: 1318: 1317: 1315: 1313: 1294: 1288: 1287: 1285: 1283: 1267: 1261: 1260: 1245: 915: 912: 783: 774: 705: 694: 683: 586:algorithmic bias 422: 418: 271:Latent Diffusion 192:techniques. The 178:Stable Diffusion 171: 170: 163: 160: 158: 112: 107: 104: 102: 100: 31: 24: 22:Stable Diffusion 20: 4943: 4942: 4938: 4937: 4936: 4934: 4933: 4932: 4888: 4887: 4886: 4881: 4833: 4747: 4713:Google DeepMind 4691: 4657:Geoffrey Hinton 4616: 4553: 4479:Project Debater 4425: 4323:Implementations 4318: 4272: 4236: 4179: 4121:Backpropagation 4055: 4041:Tensor calculus 3995: 3992: 3936: 3934: 3926: 3917: 3915: 3911: 3903: 3898: 3897: 3887: 3885: 3880:(in Japanese). 3870: 3866: 3856: 3854: 3841: 3840: 3836: 3826: 3824: 3811: 3810: 3806: 3796: 3794: 3781: 3780: 3776: 3766: 3764: 3751: 3750: 3746: 3736: 3734: 3719: 3715: 3705: 3703: 3688: 3684: 3674: 3672: 3657: 3653: 3643: 3641: 3629: 3625: 3615: 3613: 3600: 3599: 3595: 3585: 3583: 3568: 3564: 3554: 3552: 3547:(in Japanese). 
3537: 3530: 3520: 3518: 3503: 3499: 3489: 3487: 3486:. July 26, 2023 3478: 3477: 3473: 3448: 3444: 3426: 3422: 3412: 3410: 3402: 3401: 3397: 3387: 3385: 3377: 3376: 3372: 3362: 3360: 3352: 3351: 3347: 3337: 3335: 3327: 3326: 3322: 3312: 3310: 3297: 3296: 3292: 3282: 3280: 3267: 3266: 3262: 3252: 3250: 3242: 3241: 3237: 3227: 3225: 3212: 3211: 3204: 3194: 3192: 3179: 3178: 3174: 3164: 3162: 3153: 3152: 3148: 3138: 3136: 3123: 3122: 3118: 3108: 3106: 3094: 3090: 3080: 3078: 3070: 3069: 3065: 3055: 3053: 3052:. December 2023 3044: 3043: 3039: 3024: 3020: 3003: 2999: 2984: 2980: 2970: 2968: 2958: 2954: 2941: 2939: 2929: 2925: 2915: 2913: 2903: 2899: 2882: 2878: 2868: 2866: 2851: 2847: 2830: 2823: 2806: 2799: 2782: 2781: 2777: 2767: 2765: 2752: 2751: 2747: 2738: 2736: 2725: 2724: 2720: 2710: 2708: 2695: 2694: 2683: 2666: 2662: 2653:(in Japanese). 2643: 2639: 2622: 2621: 2617: 2599: 2595: 2574: 2570: 2560: 2558: 2543: 2539: 2530: 2528: 2520:Waifu Diffusion 2515: 2511: 2492: 2488: 2471: 2467: 2457: 2455: 2442: 2441: 2437: 2427: 2425: 2410: 2406: 2396: 2394: 2381: 2380: 2376: 2366: 2364: 2357:Paperspace Blog 2351: 2350: 2346: 2336: 2334: 2321: 2320: 2316: 2299: 2298: 2289: 2279: 2277: 2264: 2263: 2250: 2234: 2233: 2226: 2224: 2214: 2210: 2200: 2198: 2183: 2179: 2169: 2167: 2154: 2153: 2144: 2134: 2132: 2117: 2113: 2096: 2087: 2077: 2075: 2062: 2061: 2057: 2048: 2046: 2033: 2029: 2019: 2017: 2006: 1999: 1989: 1987: 1974: 1973: 1969: 1959: 1957: 1942: 1929: 1919: 1917: 1909: 1908: 1901: 1881: 1874: 1854: 1845: 1827: 1818: 1808: 1806: 1794: 1793: 1789: 1779: 1777: 1764: 1763: 1759: 1743: 1742: 1728: 1724: 1713: 1709: 1699: 1697: 1682: 1671: 1661: 1659: 1655: 1638: 1630: 1621: 1611: 1609: 1601: 1600: 1596: 1587: 1586: 1582: 1567: 1566: 1562: 1552: 1550: 1535: 1522: 1512: 1510: 1497: 1496: 1492: 1482: 1480: 1471: 1470: 1449: 1439: 1437: 1424: 1423: 1412: 1402: 1400: 1385: 1381: 1371: 1369: 1356: 1355: 1351: 1341: 1339: 1326: 1325: 1321: 1311: 1309: 1296: 1295: 1291: 1281: 1279: 

Index


Developer(s): Stability AI
Repository: github.com/Stability-AI/generative-models
Written in: Python
Type: Text-to-image model
Website: stability.ai/stable-image

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.