Authors :
Adithya R; Adnan Ahmed S; Kishor D; Ramkumar K; Mrs. M. Sumithra
Volume/Issue :
Volume 8 - 2023, Issue 4 - April
Google Scholar :
https://bit.ly/3TmGbDi
Scribd :
https://bit.ly/41vj5wA
DOI :
https://doi.org/10.5281/zenodo.7927460
Abstract :
This study proposes latent diffusion as a novel method for
text-to-image synthesis, the difficult task of generating accurate
images from textual descriptions. The proposed method relies on a
generative adversarial network (GAN) with a stability criterion that
improves the stability and convergence of training. The criterion is
based on the Lipschitz constant and the Jacobian norm, which measure
the smoothness and robustness of the generator network. The results
indicate that the proposed method outperforms existing
state-of-the-art techniques in image quality and training stability.
Potential applications include computer vision, image editing, and
artistic creation. The work highlights the importance of stability in
GAN training, adds to the growing body of research on text-to-image
synthesis, and suggests directions for further study.
Keywords :
CNN, RNN, GANs, VAEs, GDM, LDM, MIDAS
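
The abstract names the Lipschitz constant and the Jacobian norm as the basis of the stability criterion but does not say how they are computed. As a purely illustrative sketch, not the authors' implementation, the snippet below estimates both quantities for a small toy generator using NumPy. The names `toy_generator`, `jacobian_fd`, and `local_lipschitz_estimate`, the two-layer tanh architecture, and the random weights are all assumptions introduced here for illustration.

```python
import numpy as np

def toy_generator(z, W1, W2):
    """Hypothetical two-layer generator: z -> W2 @ tanh(W1 @ z)."""
    return W2 @ np.tanh(W1 @ z)

def jacobian_fd(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z, shape (out_dim, in_dim)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((f(z + step) - f(z - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def local_lipschitz_estimate(f, samples):
    """Largest Jacobian spectral norm over the samples.

    For a differentiable map this is a lower bound on the
    Lipschitz constant, since it only probes the sampled points.
    """
    return max(np.linalg.norm(jacobian_fd(f, z), ord=2) for z in samples)

# Assumed toy setup: 4-dim latent, 8 hidden units, 16-dim output.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(16, 8))
G = lambda z: toy_generator(z, W1, W2)

samples = rng.normal(size=(32, 4))
estimate = local_lipschitz_estimate(G, samples)

# tanh is 1-Lipschitz, so the product of the layers' spectral norms
# is an upper bound on the generator's Lipschitz constant.
upper_bound = np.linalg.norm(W2, ord=2) * np.linalg.norm(W1, ord=2)

print(f"sampled Jacobian-norm estimate: {estimate:.3f}")
print(f"spectral-norm upper bound:      {upper_bound:.3f}")
```

The sampled estimate always sits at or below the spectral-norm bound; a training-time criterion of the kind the abstract describes would presumably penalise or constrain one of these quantities, though the abstract does not specify which.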