Research Article
Open access
Published on 25 July 2024
Li, X.; Shen, Q.; Yang, T. (2024). Design and optimization of multidimensional data models for enhanced OLAP query performance and data analysis. Applied and Computational Engineering, 69, 161-166.

Design and optimization of multidimensional data models for enhanced OLAP query performance and data analysis

Xu Li 1, Qi Shen *,2, Tiancheng Yang 3
  • 1 The University of Sheffield, Sheffield, UK
  • 2 Singapore Management University, Singapore
  • 3 University of Birmingham, Birmingham, UK

* Author to whom correspondence should be addressed.

https://doi.org/10.54254/2755-2721/69/20241503

Abstract

This paper explores the design and optimization of multidimensional data models to improve the query performance and data analysis capabilities of OLAP (Online Analytical Processing) systems. It compares three dimensional modeling techniques (Star Schema, Snowflake Schema, and Galaxy Schema) in terms of query complexity, data redundancy, storage requirements, and ease of maintenance. It also examines three aggregation strategies (Pre-Aggregation, Dynamic Aggregation, and Hybrid Aggregation), focusing on how well each balances query response time, storage efficiency, flexibility, and computational cost. The study further investigates performance optimization techniques, including query optimization, partitioning, and materialized views, and uses case studies and experimental data to illustrate their benefits and challenges. The findings underscore the importance of tailoring optimization strategies in OLAP systems to varying business needs and query patterns, and highlight the trade-offs between performance gains, storage requirements, and implementation complexity.
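
As a rough illustration of the terms used in the abstract, the sketch below builds a tiny star schema and contrasts dynamic aggregation (computing totals at query time) with pre-aggregation (storing the totals in a summary table). It is not taken from the paper: the table names, column names, and sample rows are hypothetical, and because SQLite has no native materialized views, a plain summary table stands in for one here.

```python
import sqlite3

def build_star_schema(conn):
    """Create a tiny star schema: one fact table joined to two dimension tables.
    All table names, column names, and rows are hypothetical sample data."""
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "books"), (2, "games")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(1, 2023), (2, 2024)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 10.0), (1, 2, 20.0), (2, 2, 5.0), (2, 2, 7.5)])

# Roll-up of sales by category and year, expressed as a star join.
ROLLUP_QUERY = """
    SELECT p.category, d.year, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
"""

def dynamic_aggregation(conn):
    """Dynamic aggregation: join and group the fact rows at query time."""
    return conn.execute(ROLLUP_QUERY).fetchall()

def pre_aggregate(conn):
    """Pre-aggregation: store the roll-up once in a summary table.
    The table must be rebuilt whenever the fact table changes."""
    conn.execute("DROP TABLE IF EXISTS agg_sales_by_category_year")
    conn.execute("CREATE TABLE agg_sales_by_category_year AS " + ROLLUP_QUERY)

def pre_aggregated_lookup(conn):
    """Answer the same question by reading the stored summary, with no joins."""
    return conn.execute(
        "SELECT category, year, total FROM agg_sales_by_category_year "
        "ORDER BY category, year"
    ).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
pre_aggregate(conn)
print(dynamic_aggregation(conn))
print(pre_aggregated_lookup(conn))
```

Both calls return the same rows. The pre-aggregated path trades extra storage and a refresh step for a cheaper read, which is the kind of trade-off the abstract refers to; a hybrid strategy would pre-aggregate only the most frequently requested roll-ups and compute the rest on demand.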

Keywords

Multidimensional data models, OLAP, Star Schema, Snowflake Schema, Galaxy Schema

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 6th International Conference on Computing and Data Science

Conference website: https://www.confcds.org/
ISBN: 978-1-83558-459-0 (Print) / 978-1-83558-460-6 (Online)
Conference date: 12 September 2024
Editors: Alan Wang, Roman Bauer
Series: Applied and Computational Engineering
Volume number: Vol. 69
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).