Reservoir Science, Volume 2, Issue 1, 2026: 65-80

Open Access | Research Article | 27 February 2026
Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods
1 Institute of Petroleum and Natural Gas Engineering, Mehran University of Engineering and Technology, Jamshoro 76062, Pakistan
* Corresponding Author: Jawad Ali, [email protected]
ARK: ark:/57805/rs.2025.333184
Received: 27 November 2025, Accepted: 10 February 2026, Published: 27 February 2026  
Abstract
Selecting the most suitable enhanced oil recovery (EOR) technique remains challenging because historical datasets are severely class-imbalanced and traditional screening criteria have known limitations. To address the imbalance while preserving domain knowledge, this study proposes a machine learning framework that incorporates domain-informed synthetic data generation strictly constrained by established EOR screening criteria. An initial dataset of 583 documented EOR projects was compiled from field reports and public databases. After cleaning, 575 valid samples were retained and then augmented to 760 more evenly balanced instances (60–110 samples per class), which reduced the imbalance ratio from 123:1 to approximately 1.8:1. The augmented dataset was processed using principal component analysis (PCA) for dimensionality reduction, followed by hyperparameter tuning and 5-fold cross-validation. Among the evaluated models, K-Nearest Neighbors (KNN) and Random Forest achieved the highest macro-averaged F1-scores (0.89 and 0.85, respectively). The results demonstrate that domain-guided synthetic data generation significantly improves model accuracy and robustness for multi-class EOR screening, offering reservoir engineers a reliable, machine learning-supported decision-making tool.
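
The balancing step and the reported metric can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how synthetic samples are drawn, so uniform sampling inside a screening window is an assumption, and the method names, feature names, and numeric ranges below are hypothetical placeholders rather than published screening criteria. The imbalance ratio is taken here as the largest class count divided by the smallest, which is consistent with 110/60 ≈ 1.8. PCA, hyperparameter tuning, and cross-validation are not covered.

```python
import random
from collections import Counter

# Hypothetical screening windows (min, max) per EOR method and feature.
# Placeholder values for illustration only, NOT the criteria used in the paper.
SCREENING_RANGES = {
    "method_A": {"api_gravity": (20.0, 40.0), "depth_ft": (2000.0, 9000.0)},
    "method_B": {"api_gravity": (10.0, 25.0), "depth_ft": (500.0, 4000.0)},
}


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def within_criteria(sample, method):
    """True if every feature of the sample lies inside the method's window."""
    return all(lo <= sample[name] <= hi
               for name, (lo, hi) in SCREENING_RANGES[method].items())


def synthesize(method, n, rng):
    """Draw n synthetic samples uniformly inside the method's window (assumed scheme)."""
    ranges = SCREENING_RANGES[method]
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
            for _ in range(n)]


def augment_to_minimum(samples, labels, min_per_class, rng):
    """Top up every class below min_per_class with constrained synthetic samples."""
    out_samples, out_labels = list(samples), list(labels)
    for method, count in Counter(labels).items():
        extra = synthesize(method, max(0, min_per_class - count), rng)
        out_samples.extend(extra)
        out_labels.extend([method] * len(extra))
    return out_samples, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy stand-in for a cleaned dataset: 110 projects of one method, 5 of another.
rng = random.Random(0)
samples = synthesize("method_A", 110, rng) + synthesize("method_B", 5, rng)
labels = ["method_A"] * 110 + ["method_B"] * 5
print(imbalance_ratio(labels))  # 22.0

# Raise every class to at least 60 samples, as in the 60-110 range reported above.
aug_samples, aug_labels = augment_to_minimum(samples, labels, 60, rng)
print(f"{imbalance_ratio(aug_labels):.2f}")  # 1.83
print(all(within_criteria(s, m) for s, m in zip(aug_samples, aug_labels)))  # True
```

Macro-averaging weights each class equally regardless of its size, which is why it suits a multi-class problem where some EOR methods have few documented projects.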

Keywords
EOR screening
machine learning
screening criteria
imbalanced data
multi-class classification
enhanced oil recovery

Data Availability Statement
Data will be made available on request.

Funding
This work received no external funding.

Conflicts of Interest
The authors declare no conflicts of interest.

AI Use Statement
The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate
Not applicable.

Cite This Article
APA Style
Ali, J., Ansari, U., Ali, F., Javed, T., & Hullio, I. A. (2026). Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods. Reservoir Science, 2(1), 65–80. https://doi.org/10.62762/RS.2025.333184
BibTeX Format
@article{Ali2026Applicatio,
  author = {Jawad Ali and Ubedullah Ansari and Fateh Ali and Tariq Javed and Imran Ahmed Hullio},
  title = {Application of Machine Learning for Effective Screening of Enhanced Oil Recovery Methods},
  journal = {Reservoir Science},
  year = {2026},
  volume = {2},
  number = {1},
  pages = {65-80},
  doi = {10.62762/RS.2025.333184},
  url = {https://www.icck.org/article/abs/RS.2025.333184},
  abstract = {Selecting the most suitable enhanced oil recovery (EOR) technique remains challenging due to severe class imbalance in historical datasets and the limitations of traditional screening criteria. To address data imbalance while preserving domain knowledge, this study proposes a novel machine learning framework that incorporates domain-informed synthetic data generation strictly constrained by established EOR screening criteria. An initial dataset of 583 documented EOR projects was compiled from field reports and public databases. After rigorous cleaning, 575 valid samples were retained and subsequently augmented to 760 balanced instances (class sizes ranging from 60–110 samples per class). This reduced the imbalance ratio from 123:1 to approximately 1.8:1. The augmented dataset was processed using principal component analysis (PCA) for dimensionality reduction, followed by hyperparameter tuning and 5-fold cross-validation. Among the evaluated models, K-Nearest Neighbors (KNN) and Random Forest achieved the highest macro-averaged performance (F1-score of 0.89 and 0.85, respectively). The results demonstrate that domain-guided synthetic data generation significantly improves model accuracy and robustness for multi-class EOR screening, offering reservoir engineers a reliable, machine learning-supported decision-making tool.},
  keywords = {EOR screening, machine learning, screening criteria, imbalanced data, multi-class classification, enhanced oil recovery},
  issn = {3070-2356},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
CC BY Copyright © 2026 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
Reservoir Science

ISSN: 3070-2356 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/