Review of Generative Models

Research Article
Open access

Ruixi Wang 1*
  • 1 Northeastern University
  • *corresponding author 20205397@stu.neu.edu.cn
Published on 1 August 2023 | https://doi.org/10.54254/2755-2721/8/20230269
ACE Vol.8
ISSN (Print): 2755-273X
ISSN (Online): 2755-2721
ISBN (Print): 978-1-915371-63-8
ISBN (Online): 978-1-915371-64-5

Abstract

The advancement of generative AI models has been remarkable since 2022. Several eye-catching generative AI models have been introduced to the public, including models for text and image generation. Although their outputs are produced by large-scale neural networks trained extensively with deep learning algorithms, these models can achieve average or above-average quality and creativity in many fields, such as painting and literature. This paper examines some of the AI models currently available, discusses their underlying principles and histories, and offers insight into what the future may hold. As the technology advances, we can expect to see even more innovative and creative applications.

Keywords:

Generative models, artificial intelligence, NLP, image generation, deep learning

Wang, R. (2023). Review of Generative Models. Applied and Computational Engineering, 8, 524-529.

References

[1]. Hinton, G. E., & Sejnowski, T. J. (1983). Optimal perceptual inference. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 448-453.

[2]. Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504-507.

[3]. Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

[4]. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 2672-2680.

[5]. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.

[6]. Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.

[7]. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9.

[8]. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Sutskever, I. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.

[9]. OpenAI. (2023). GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.

[10]. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., ... & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), 1-38.

[11]. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10684-10695.

[12]. Yang, L., Zhang, Z., Song, Y., & Wu, F. (2022). Diffusion models: A comprehensive survey of methods and applications. arXiv preprint arXiv:2209.00796.

[13]. Weng, L. (2021, July 11). What are Diffusion Models? Lil’Log. https://lilianweng.github.io/posts/2021-07-11-diffusion-models/.

[14]. Sean. (2022, November 25). Stable Diffusion principle interpretation. Zhihu column. https://zhuanlan.zhihu.com/p/583124756.

[15]. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. (2016). Generative adversarial text to image synthesis. International conference on machine learning, 1060-1069.

[16]. OpenAI. (2021). DALL-E. OpenAI. https://openai.com/research/dall-e.

[17]. O'Connor, R. (2022, April 21). Here's how the DALL-E 2 works. Zhihu column. https://zhuanlan.zhihu.com/p/502389739.

[18]. OpenAI. (2021). DALL-E. OpenAI. https://openai.com/research/dall-e.

[19]. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., ... & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.

[20]. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv preprint arXiv:2303.10130.


Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 2023 International Conference on Software Engineering and Machine Learning

ISBN: 978-1-915371-63-8 (Print) / 978-1-915371-64-5 (Online)
Editors: Anil Fernando, Marwan Omar
Conference website: http://www.confseml.org
Conference date: 19 April 2023
Series: Applied and Computational Engineering
Volume number: Vol.8
ISSN: 2755-273X (Print) / 2755-2721 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series' published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).
