Automatic Configuration Tuning on Cloud Database: A Survey

Abstract—Faced with the challenges of big data, modern cloud database management systems (DBMS) are designed to store, organize, and retrieve data efficiently, supporting performance, scalability, and reliability for complex data processing and analysis. However, achieving good performance in modern databases is non-trivial: they are notorious for having dozens of configurable knobs, covering hardware setup, software setup, and physical and logical database design, that control runtime behavior and affect performance. To find configurations that achieve optimal performance, extensive research has been conducted on automatic parameter tuning in DBMS. This paper provides a comprehensive survey of the predominant configuration tuning techniques, including Bayesian optimization-based, neural network-based, reinforcement learning-based, and search-based solutions. Moreover, it investigates the fundamental aspects of the parameter tuning pipeline, including tuning objective, workload characterization, feature pruning, knowledge from experience, configuration recommendation, and experimental settings. For each component, we compare the techniques and their corresponding solutions, and we introduce the experimental settings used for performance evaluation. Finally, we conclude the paper and present future research opportunities. This paper aims to help future researchers and practitioners better understand automatic parameter tuning in cloud databases by presenting state-of-the-art solutions, research directions, and evaluation benchmarks.

1 INTRODUCTION

In an increasingly digitized age, vast and diverse volumes of data are generated from many sources, including mobile devices, social media platforms, and sensors. Faced with this data explosion, cloud database management systems (DBMS) for data storage, coupled with big data analytics frameworks (BDAF), have emerged as powerful solutions for handling and processing massive, intricate data sets in a scalable and flexible manner. This makes them invaluable tools for organizations grappling with the challenges of big data and digital transformation [1], [2].

However, achieving good performance in modern DBMSs is non-trivial. Modern DBMSs have hundreds of configurable knobs, covering hardware setup, software configuration, and physical and logical database design, that affect their performance [3]–[6]. An efficient parameter configuration can strike a balance between resource utilization, query responsiveness, and cost-effectiveness, while an inappropriate one can lead to significant performance degradation and inefficient use of system resources [3], [6]–[13].

Source: https://hackernoon.com/automatic-configuration-tuning-on-cloud-database-a-survey?source=rss