Research Article
Published on 25 March 2024
Jin, Y. (2024). Analysis of the methods and performances for data augmentation: Image, text and audio. Applied and Computational Engineering, 51, 208-215.

Analysis of the methods and performances for data augmentation: Image, text and audio

Yuying Jin *, 1
  • 1 Penn State Behrend

* Author to whom correspondence should be addressed.

https://doi.org/10.54254/2755-2721/51/20241354

Abstract

With rapid advances in computational power and the spread of machine learning scenarios, a wide range of artificial intelligence applications have become feasible in recent years. With this in mind, this study explores the application of data augmentation in machine learning and deep learning. Specifically, the paper first introduces the background and research history of data augmentation and then reviews recent research progress. It describes the definition, common methods, and evaluation metrics of data augmentation in detail. Three data augmentation models, AutoAugment, AugGPT, and SpecAugment++, are then introduced in turn, including their principles, experimental results, and evaluation. Finally, based on this analysis, the limitations and prospects of the field are discussed, and the main findings and research implications of the paper are summarized. Overall, these results shed light on and can guide further exploration of data augmentation.
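To make the audio-augmentation idea concrete, the following is a minimal sketch of SpecAugment-style frequency and time masking (the mechanism underlying SpecAugment and its SpecAugment++ variant) written in NumPy. The function name, parameter names, and default mask widths are illustrative assumptions, not taken from the paper or from any official implementation.

```python
import numpy as np

def spec_augment(spec, max_freq_mask=8, max_time_mask=16, rng=None):
    """Apply SpecAugment-style masking to a spectrogram.

    spec: 2-D array of shape (freq_bins, time_steps); a masked copy
    is returned and the input is left unchanged. Mask widths are
    drawn uniformly from [0, max_*]; the defaults are illustrative.
    """
    rng = rng or np.random.default_rng()
    out = spec.copy()
    n_freq, n_time = out.shape

    # Frequency mask: zero a random band of consecutive frequency bins.
    f = int(rng.integers(0, max_freq_mask + 1))
    f0 = int(rng.integers(0, max(1, n_freq - f + 1)))
    out[f0:f0 + f, :] = 0.0

    # Time mask: zero a random span of consecutive time steps.
    t = int(rng.integers(0, max_time_mask + 1))
    t0 = int(rng.integers(0, max(1, n_time - t + 1)))
    out[:, t0:t0 + t] = 0.0
    return out
```

Because the masks only hide parts of the input rather than altering labels, each call yields a new training example from the same labeled spectrogram, which is the core appeal of this family of methods.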

Keywords

Data augmentation, image data, text data, audio data


Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 4th International Conference on Signal Processing and Machine Learning

Conference website: https://www.confspml.org/
ISBN: 978-1-83558-347-0 (Print) / 978-1-83558-348-7 (Online)
Conference date: 15 January 2024
Editor: Marwan Omar
Series: Applied and Computational Engineering
Volume number: Vol. 51
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).