Authors :
Sangeetha Mandapaka
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
Google Scholar :
https://tinyurl.com/4fak2wp4
Scribd :
https://tinyurl.com/mrv4uk62
DOI :
https://doi.org/10.38124/ijisrt/25oct1435
Abstract :
This article presents a framework for integrating machine learning models into PostgreSQL to optimize query performance and manage workloads dynamically. The integration marks a shift from static, rule-based optimization to adaptive, data-driven approaches that respond to changing conditions. PostgreSQL's extensible architecture allows ML-enhanced components to be implemented without modifying core database code. The framework covers four areas: query optimizer enhancement using gradient boosting and neural networks, adaptive indexing mechanisms that automatically adjust to workload patterns, dynamic resource allocation through workload classification and forecasting, and a model training pipeline. Experimental evaluations across analytical, transactional, and hybrid workloads demonstrate significant improvements in cardinality estimation accuracy, execution plan quality, and resource utilization, along with reduced administrative overhead. The modular design enables incremental adoption in production environments while maintaining compatibility with existing applications, illustrating how traditional relational database systems can evolve to meet modern data challenges through machine learning integration.
Keywords :
Machine Learning Integration, PostgreSQL Extensibility, Adaptive Query Optimization, Workload Management, Learned Index Structures.
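The abstract reports improvements in cardinality estimation accuracy without naming a metric. One metric commonly used for this purpose in the query optimization literature is the q-error, the factor by which an estimate is off in either direction. The sketch below assumes that metric; the function names and the row counts are hypothetical illustration values, not results from the article.

```python
from statistics import median


def q_error(estimated: float, actual: float) -> float:
    """Return max(est/actual, actual/est), clamping both values to at least 1 row.

    A q-error of 1.0 means the estimate was exact; 10.0 means it was off
    by a factor of ten, whether as an over- or an underestimate.
    """
    est = max(float(estimated), 1.0)
    act = max(float(actual), 1.0)
    return max(est / act, act / est)


def summarize(pairs):
    """Summarize q-errors over (estimated_rows, actual_rows) pairs."""
    errors = sorted(q_error(e, a) for e, a in pairs)
    return {"median": median(errors), "max": errors[-1]}


# Hypothetical (estimated_rows, actual_rows) pairs, made up for illustration.
baseline_estimates = [(100, 1000), (50, 50), (2000, 500)]
learned_estimates = [(800, 1000), (50, 50), (600, 500)]

if __name__ == "__main__":
    print("baseline:", summarize(baseline_estimates))
    print("learned: ", summarize(learned_estimates))
```

In practice the estimated and actual row counts could be read from `EXPLAIN (ANALYZE)` output for each plan node, which would make the same comparison possible between the stock planner and a learned estimator.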