A Framework for Predicting Enhanced Oil Recovery Performance Using Machine Learning
Table Of Contents
Chapter ONE
INTRODUCTION
- 1.1 Introduction to Machine Learning in Enhanced Oil Recovery Performance
- 1.2 Background of Machine Learning Applications in Petroleum Engineering
- 1.3 Statement of the Challenges in EOR Performance Prediction
- 1.4 Aim and Objectives of Developing a Predictive Framework
- 1.5 Research Questions Addressing EOR Efficiency Prediction
- 1.6 Research Hypotheses on Machine Learning Model Performance
- 1.7 Significance of a Predictive Framework for EOR Optimization
- 1.8 Scope and Delimitations of Machine Learning Framework Development
- 1.9 Limitations in Data Availability and Model Generalizability
- 1.10 Organisation of the Thesis on Machine Learning Framework
- 1.11 Operational Definitions of Key Terms (e.g., EOR, Machine Learning, Prediction Accuracy)
Chapter TWO
LITERATURE REVIEW
- 2.1 Conceptual Overview of Enhanced Oil Recovery Techniques
- 2.2 Conceptual Review of Machine Learning in Oil and Gas Industry
- 2.3 Theoretical Frameworks: Artificial Neural Networks and Support Vector Machines
- 2.4 Empirical Studies on Machine Learning for EOR Performance Prediction
- 2.5 Comparative Analyses of Traditional vs. Machine Learning Methods in EOR
- 2.6 Data Types and Sources for Machine Learning in Petroleum Engineering
- 2.7 Challenges and Limitations in Prior Machine Learning EOR Studies
- 2.8 Identified Gaps in Existing Literature on EOR Prediction Models
- 2.9 Development of a Conceptual Model for EOR Performance Prediction
- 2.10 Summary of Literature Review Findings and Thematic Synthesis
- 2.11 Conceptual Model Diagram for EOR Performance Prediction Framework
- 2.12 Critical Reflection on the Literature Gaps and Research Justification
Chapter THREE
SYSTEM DESIGN AND IMPLEMENTATION
- 3.1 Research Design: Development and Validation of Predictive Machine Learning Models
- 3.2 Philosophical Paradigm: Pragmatism for Data-Driven Model Development
- 3.3 Population of the Study: EOR Data Sets from Selected Reservoirs
- 3.4 Sample Size Determination and Sampling Technique for Data Collection
- 3.5 Data Sources: Well Logs, Production Data, Core Samples, and Laboratory Results
- 3.6 Data Collection Instruments: Data Extraction Protocols and Software Tools
- 3.7 Validity and Reliability of Data and Analytical Instruments
- 3.8 Data Preprocessing and Feature Engineering for Model Input
- 3.9 Model Specification: Selection and Tuning of Machine Learning Algorithms
- 3.10 Ethical Considerations in Data Handling and Model Deployment
Chapter FOUR
SYSTEM TESTING AND EVALUATION
- ANALYSIS AND DISCUSSION
- 4.1 Presentation of Raw Data and Data Cleaning Processes
- 4.2 Descriptive Statistics of Key Reservoir and Operational Variables
- 4.3 Model Training: Performance Metrics and Validation Results
- 4.4 Hypotheses Testing: Model Accuracy, Precision, Recall, and F1-Score
- 4.5 Interpretation of Machine Learning Model Outcomes in EOR Context
- 4.6 Comparative Analysis of Different Machine Learning Algorithms
- 4.7 Discussion of Findings in Relation to Theoretical Frameworks and Prior Studies
- 4.8 Implications for EOR Performance Optimization and Field Application
Chapter FIVE
SUMMARY, CONCLUSION AND RECOMMENDATIONS
- 5.1 Summary of Key Findings on Machine Learning-Based EOR Prediction
- 5.2 Conclusions on the Effectiveness of the Developed Framework
- 5.3 Contributions to Petroleum Engineering Literature and Practice
- 5.4 Practical Recommendations for Industry Application
- 5.5 Limitations of the Study and Considerations for Implementation
- 5.6 Suggestions for Further Research on Machine Learning and EOR Optimization
Thesis Abstract
Enhancing oil recovery efficiency remains a critical challenge in the petroleum industry, primarily because of the complex, nonlinear relationships among reservoir properties, recovery methods, and operational parameters. Traditional predictive techniques often rely on deterministic models that lack adaptability and may fail to forecast performance accurately under varying reservoir conditions. This study aims to develop a robust, data-driven framework that uses machine learning algorithms to predict enhanced oil recovery (EOR) performance with higher precision and operational relevance. The specific objectives are to identify the key reservoir and operational parameters affecting EOR outcomes, to train and validate machine learning models for performance prediction, and to establish a practical decision-support system for optimizing EOR strategies. The research adopts a quantitative, exploratory design grounded in supervised learning. The population comprises 350 reservoir performance datasets collected from offshore and onshore fields with diverse geological characteristics and EOR techniques, such as chemical flooding, thermal recovery, and gas injection. Stratified random sampling is used to select a representative sample of 200 datasets, ensuring adequate variation across reservoir types, fluid properties, and EOR methods. Data are drawn from historical production records, geophysical logs, petrophysical measurements, and operational reports obtained from industry databases and field operators. The primary data collection instruments are structured data extraction forms and validated digital data repositories, with additional validation by cross-referencing multiple data sources. Model development begins with preprocessing: normalization, feature selection by recursive feature elimination, and data balancing to address class imbalance.
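The stratified random sampling step described above (200 datasets drawn from a population of 350) can be illustrated with a minimal sketch. It assumes proportional allocation across a single stratification variable. The field name `eor_method`, the stratum sizes, and the helper `stratified_sample` are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, n_total, seed=0):
    """Draw n_total records, allocating to each stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    # Proportional allocation with largest-remainder rounding, so the
    # per-stratum counts sum exactly to n_total.
    total = len(records)
    quotas = {k: n_total * len(v) / total for k, v in strata.items()}
    counts = {k: int(q) for k, q in quotas.items()}
    leftover = n_total - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1

    # Simple random sampling without replacement inside each stratum.
    sample = []
    for k, members in strata.items():
        sample.extend(rng.sample(members, counts[k]))
    return sample


# Hypothetical population of 350 datasets; the stratum sizes are made up.
population = (
    [{"id": i, "eor_method": "chemical"} for i in range(140)]
    + [{"id": i, "eor_method": "thermal"} for i in range(140, 245)]
    + [{"id": i, "eor_method": "gas"} for i in range(245, 350)]
)
sample = stratified_sample(population, "eor_method", 200)
```

In practice the strata would likely combine several variables (reservoir type, fluid properties, EOR method), which this single-key sketch does not attempt.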
Several machine learning algorithms (Random Forests, Gradient Boosting Machines, and Support Vector Machines) are trained with cross-validation to limit overfitting, and their hyperparameters are tuned by grid search. Model evaluation uses mean squared error (MSE) and R-squared for continuous outcomes, and the area under the receiver operating characteristic curve (AUC-ROC) for classification outcomes, to assess predictive accuracy and robustness. The analytical framework is anchored in Reservoir Engineering Decision Theory and System Dynamics Theory, which provide the conceptual basis for integrating reservoir properties with operational data into a dynamic predictive model. The study also explores explainable AI techniques, such as SHAP (SHapley Additive exPlanations), to interpret model predictions and identify influential parameters. The expected outcome of this research is a validated machine learning-based predictive model capable of accurately estimating EOR performance across different reservoir contexts, thus assisting engineers in strategic decision-making. It is anticipated that the models will highlight critical parameters, such as reservoir heterogeneity, fluid saturation levels, injection rates, and chemical concentrations, that significantly influence recovery efficiency. Furthermore, the study aims to demonstrate the model’s superior predictive capability over conventional empirical or semi-empirical methods, as evidenced by improved performance metrics (e.g., an R-squared of at least 0.85 and an AUC-ROC of at least 0.90). This research contributes to the existing body of knowledge by integrating advanced machine learning techniques into petroleum reservoir management, facilitating a shift toward more intelligent, data-centric EOR planning.
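To make the evaluation metrics named above concrete, the following sketch computes MSE, R-squared, and AUC-ROC from their standard definitions using only the Python standard library. The input values are invented for illustration and are not results from the study. The AUC here uses the pairwise-ranking formulation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case).

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def auc_roc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(
        (1.0 if p > n else 0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative values only: hypothetical recovery factors and success labels.
observed_rf = [0.30, 0.40, 0.50, 0.60]
predicted_rf = [0.32, 0.38, 0.53, 0.57]
mse = mean_squared_error(observed_rf, predicted_rf)
r2 = r_squared(observed_rf, predicted_rf)

success_labels = [1, 0, 1, 0, 1]
success_scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = auc_roc(success_labels, success_scores)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a few hundred datasets; a library implementation would be the usual choice for larger data.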
The resulting framework provides a foundation for the future development of real-time predictive tools and decision support systems, promoting reservoir management strategies that optimize recovery while minimizing operational risks and costs. Based on the findings, the recommendations include adopting the predictive framework for routine reservoir evaluation, expanding the dataset with real-time monitoring data, and exploring hybrid models that combine machine learning with physics-based simulation for greater predictive power. The study concludes that an evidence-based, machine learning-driven framework offers significant advantages for optimizing enhanced oil recovery performance, ultimately contributing to more sustainable and economically viable petroleum extraction practices.
Thesis Overview
This research aims to develop a practical framework that uses machine learning to predict how well enhanced oil recovery (EOR) methods will perform in extracting oil from mature reservoirs. EOR uses techniques such as chemical flooding, thermal recovery, and gas injection to recover more oil than primary and secondary methods can. Accurately predicting the success of these techniques helps oil companies plan better, optimize resource use, and reduce costs.
The study addresses a knowledge gap: existing methods for predicting EOR performance are often based on simplified models or limited historical data, which makes their predictions less reliable. Because machine learning can find complex patterns in large datasets, the research aims to use it to improve the accuracy of these predictions and provide a more robust decision-making tool.
The research process involves several key steps. First, the researcher will gather existing EOR project data from a sample of approximately 50 to 100 reservoirs, including factors such as reservoir properties, fluid characteristics, injection parameters, and recovery outcomes. Data will be collected from industry databases and published research papers. Then, relevant machine learning algorithms such as random forests, support vector machines, and neural networks will be trained on this data to identify patterns that influence EOR success.
Analysis will focus on comparing the accuracy and predictive power of each model, using techniques like cross-validation and metrics such as R-squared, mean absolute error, and confusion matrices for classification tasks. The goal is to identify the most reliable model and develop a conceptual framework that integrates these predictions for practical application.
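The model comparison described above can be sketched as a k-fold cross-validation loop that scores each candidate by mean absolute error and selects the lowest. This is a minimal illustration under stated assumptions. The two candidates (`fit_mean`, `fit_line`) are deliberately simple stand-ins for the random forest, support vector machine, and neural network models named in the overview. The data are synthetic, and all function names are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares fit with a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cross_val_mae(fit, xs, ys, k=5, seed=0):
    """Mean absolute error averaged over k held-out folds."""
    fold_maes = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [abs(ys[i] - model(xs[i])) for i in held_out]
        fold_maes.append(sum(errors) / len(errors))
    return sum(fold_maes) / k


# Synthetic data for illustration: the outcome rises linearly with one feature, plus noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [0.2 + 0.05 * x + rng.uniform(-0.01, 0.01) for x in xs]

scores = {
    "mean_baseline": cross_val_mae(fit_mean, xs, ys),
    "linear_fit": cross_val_mae(fit_line, xs, ys),
}
best = min(scores, key=scores.get)  # model with the lowest cross-validated MAE
```

Using the same seed for every candidate keeps the folds identical across models, so differences in the scores reflect the models rather than the split.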
The expected contribution of this study is a validated, user-friendly framework that engineers and decision-makers can use to forecast EOR performance more accurately. Ultimately, the research is intended to improve the efficiency and economics of EOR projects, guide better investment decisions, and advance the scientific understanding of complex reservoir behavior through machine learning.