Authors :
Nishu Kumari
Volume/Issue :
Volume 7 - 2022, Issue 12 - December
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3GZ9INQ
DOI :
https://doi.org/10.5281/zenodo.7551296
Abstract :
This text-to-image converter aims to examine the
conversion of data between modalities (text and image).
The work is motivated by the evolution of human-machine
communication, which has introduced modalities that are
natural to humans, such as gestures, speech, sound, and
vision. One of the main challenges of this "multimodal"
learning is learning a shared representation between the
distinct modalities and predicting the missing
information (by retrieval or synthesis) from one
conditioned modality to another. Researchers work on
various kinds of conversion, such as text-to-speech,
speech-to-image, and text-to-image synthesis, and vice
versa; in this paper, however, we focus on
image-to-audio and image-to-text synthesis.