Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods

Jawad Ali; Ubedullah Ansari; Fateh Ali; Tariq Javed; Imran Ahmed Hullio

doi:10.62762/RS.2025.333184

Volume 2, Issue 1, 2026

Reservoir Science, Volume 2, Issue 1, 2026: 65-80

Open Access | Research Article | 27 February 2026

Jawad Ali 1,*; Ubedullah Ansari 1; Fateh Ali 1; Tariq Javed 1; Imran Ahmed Hullio 1

1 Institute of Petroleum and Natural Gas Engineering, Mehran University of Engineering and Technology, Jamshoro 76062, Pakistan

* Corresponding Author: Jawad Ali, [email protected]

ARK: ark:/57805/rs.2025.333184

Received: 27 November 2025, Accepted: 10 February 2026, Published: 27 February 2026

Abstract

Selecting the most suitable enhanced oil recovery (EOR) technique remains challenging because historical datasets are severely class-imbalanced and traditional screening criteria have limitations. To address the imbalance while preserving domain knowledge, this study proposes a novel machine learning framework that incorporates domain-informed synthetic data generation strictly constrained by established EOR screening criteria. An initial dataset of 583 documented EOR projects was compiled from field reports and public databases. After rigorous cleaning, 575 valid samples were retained and then augmented to 760 instances, with class sizes ranging from 60 to 110 samples. This reduced the imbalance ratio from 123:1 to approximately 1.8:1. The augmented dataset was processed using principal component analysis (PCA) for dimensionality reduction, followed by hyperparameter tuning and 5-fold cross-validation. Among the evaluated models, K-Nearest Neighbors (KNN) and Random Forest achieved the highest macro-averaged performance (F1-scores of 0.89 and 0.85, respectively). The results demonstrate that domain-guided synthetic data generation significantly improves model accuracy and robustness for multi-class EOR screening, offering reservoir engineers a reliable, machine learning-supported decision-making tool.
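The augmentation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "constrained by screening criteria" means drawing synthetic samples only from inside each method's allowed parameter window, and that the imbalance ratio is the largest class count divided by the smallest (consistent with the reported class sizes, since 110/60 ≈ 1.8). The method names, feature names, numeric ranges, and the helper functions `imbalance_ratio`, `synthesize`, and `augment` are all hypothetical placeholders.

```python
import random
from collections import Counter

# Hypothetical screening windows (min, max) per method and feature.
# Placeholder numbers for illustration only; they are NOT taken from the paper.
SCREENING = {
    "method_A": {"depth_ft": (500.0, 5000.0), "viscosity_cp": (100.0, 10000.0)},
    "method_B": {"depth_ft": (2500.0, 12000.0), "viscosity_cp": (0.1, 10.0)},
}


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synthesize(method, rng):
    """Draw one synthetic sample uniformly inside the method's screening window."""
    return {feat: rng.uniform(lo, hi) for feat, (lo, hi) in SCREENING[method].items()}


def augment(samples, labels, target, rng):
    """Top up every class that has fewer than `target` samples."""
    out_x, out_y = list(samples), list(labels)
    for method, n in Counter(labels).items():
        for _ in range(max(0, target - n)):
            out_x.append(synthesize(method, rng))
            out_y.append(method)
    return out_x, out_y


rng = random.Random(0)  # fixed seed so the sketch is repeatable
labels = ["method_A"] * 40 + ["method_B"] * 2
samples = [synthesize(lab, rng) for lab in labels]  # stand-in for real project records
aug_x, aug_y = augment(samples, labels, target=30, rng=rng)
print(imbalance_ratio(labels), "->", round(imbalance_ratio(aug_y), 2))
```

Uniform sampling is the simplest choice that keeps every synthetic point inside its window. A real workflow would probably also need to respect correlations between reservoir properties, which this sketch ignores.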

Keywords

EOR screening; machine learning; screening criteria; imbalanced data; multi-class classification; enhanced oil recovery

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest.

AI Use Statement

The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style

Ali, J., Ansari, U., Ali, F., Javed, T., & Hullio, I. A. (2026). Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods. Reservoir Science, 2(1), 65–80. https://doi.org/10.62762/RS.2025.333184

RIS Format

TY  - JOUR
AU  - Ali, Jawad
AU  - Ansari, Ubedullah
AU  - Ali, Fateh
AU  - Javed, Tariq
AU  - Hullio, Imran Ahmed
PY  - 2026
DA  - 2026/02/27
TI  - Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods
JO  - Reservoir Science
T2  - Reservoir Science
JF  - Reservoir Science
VL  - 2
IS  - 1
SP  - 65
EP  - 80
DO  - 10.62762/RS.2025.333184
UR  - https://www.icck.org/article/abs/RS.2025.333184
KW  - EOR screening
KW  - machine learning
KW  - screening criteria
KW  - imbalanced data
KW  - multi-class classification
KW  - enhanced oil recovery
AB  - Selecting the most suitable enhanced oil recovery (EOR) technique remains challenging due to severe class imbalance in historical datasets and the limitations of traditional screening criteria. To address data imbalance while preserving domain knowledge, this study proposes a novel machine learning framework that incorporates domain-informed synthetic data generation strictly constrained by established EOR screening criteria. An initial dataset of 583 documented EOR projects was compiled from field reports and public databases. After rigorous cleaning, 575 valid samples were retained and subsequently augmented to 760 balanced instances (class sizes ranging from 60–110 samples per class). This reduced the imbalance ratio from 123:1 to approximately 1.8:1. The augmented dataset was processed using principal component analysis (PCA) for dimensionality reduction, followed by hyperparameter tuning and 5-fold cross-validation. Among the evaluated models, K-Nearest Neighbors (KNN) and Random Forest achieved the highest macro-averaged performance (F1-score of 0.89 and 0.85, respectively). The results demonstrate that domain-guided synthetic data generation significantly improves model accuracy and robustness for multi-class EOR screening, offering reservoir engineers a reliable, machine learning-supported decision-making tool.
SN  - 3070-2356
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  -

BibTeX Format

@article{Ali2026Applicatio,
  author = {Jawad Ali and Ubedullah Ansari and Fateh Ali and Tariq Javed and Imran Ahmed Hullio},
  title = {Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods},
  journal = {Reservoir Science},
  year = {2026},
  volume = {2},
  number = {1},
  pages = {65--80},
  doi = {10.62762/RS.2025.333184},
  url = {https://www.icck.org/article/abs/RS.2025.333184},
  abstract = {Selecting the most suitable enhanced oil recovery (EOR) technique remains challenging due to severe class imbalance in historical datasets and the limitations of traditional screening criteria. To address data imbalance while preserving domain knowledge, this study proposes a novel machine learning framework that incorporates domain-informed synthetic data generation strictly constrained by established EOR screening criteria. An initial dataset of 583 documented EOR projects was compiled from field reports and public databases. After rigorous cleaning, 575 valid samples were retained and subsequently augmented to 760 balanced instances (class sizes ranging from 60–110 samples per class). This reduced the imbalance ratio from 123:1 to approximately 1.8:1. The augmented dataset was processed using principal component analysis (PCA) for dimensionality reduction, followed by hyperparameter tuning and 5-fold cross-validation. Among the evaluated models, K-Nearest Neighbors (KNN) and Random Forest achieved the highest macro-averaged performance (F1-score of 0.89 and 0.85, respectively). The results demonstrate that domain-guided synthetic data generation significantly improves model accuracy and robustness for multi-class EOR screening, offering reservoir engineers a reliable, machine learning-supported decision-making tool.},
  keywords = {EOR screening, machine learning, screening criteria, imbalanced data, multi-class classification, enhanced oil recovery},
  issn = {3070-2356},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Copyright © 2026 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

Reservoir Science

ISSN: 3070-2356 (Online)
