A Machine Learning Framework for Artificial Lift Method Selection with Physics-Informed Data Balancing
Research Article  ·  Published: 19 June 2026
Reservoir Science
Volume 2, Issue 3, 2026: 228-260
Open Access

1 Department of Petroleum Engineering, University of Louisiana at Lafayette, Lafayette, LA 70504, United States
2 Department of Engineering Data Science, University of Houston, Houston, TX 77204, United States
3 Institute of Petroleum and Natural Gas Engineering, Mehran University of Engineering and Technology, Jamshoro, Sindh 76062, Pakistan
4 College of Petroleum Engineering, China University of Petroleum, Beijing, Beijing 102249, China
* Corresponding Author: Sohail Nawab, [email protected]

Abstract

Selecting the optimal artificial lift method with machine learning remains challenging because reservoir characteristics, fluid properties, and operational constraints interact in complex ways. Conventional approaches rely on engineering expertise and static screening criteria, which are often insufficient to capture these multifactorial dependencies. This study presents a framework for classifying the most suitable lift method among four common techniques: electrical submersible pumps (ESP), gas lift, rod pumps, and progressing cavity pumps (PCP). A dataset of 990 wells was compiled with twelve physically meaningful parameters: depth, temperature, gas-oil ratio (GOR), API gravity, reservoir pressure, water cut, production rate, viscosity, sand production, deviation, H\(_2\)S presence, and formation type. To address class imbalance, two balancing methods were evaluated: SMOTE and the proposed DI-SDG method, which derives feature-specific perturbation limits from published intra-class variability data. Six ML models were trained using five-fold stratified cross-validation. Random Forest achieved the best test performance (accuracy: 91.41%, precision: 92.47%, recall: 92.67%, macro-\(F_1\): 0.9255), followed by XGBoost (\(F_1\): 0.9125) and Gradient Boosting (\(F_1\): 0.9006). Generalization was validated through blind well testing on independent field cases. Analysis of 18 misclassified samples showed that most errors occurred where operating envelopes overlap, particularly between ESP and gas lift in intermediate GOR ranges. Misclassified samples averaged 64% prediction confidence versus 82% for correct classifications, which suggests a 75% confidence threshold below which a case should receive additional evaluation. Overall, the physics-informed balancing approach provides an accurate, interpretable framework for artificial lift selection with reliable performance on field data.
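
The confidence-threshold idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `triage`, the well names, and the probability values are hypothetical, and the sketch assumes that "confidence" means the highest predicted class probability. Only the 75% threshold and the four class labels come from the abstract.

```python
from typing import Dict, Tuple

# Threshold suggested in the abstract; everything else here is illustrative.
REVIEW_THRESHOLD = 0.75


def triage(probabilities: Dict[str, float],
           threshold: float = REVIEW_THRESHOLD) -> Tuple[str, float, bool]:
    """Return (predicted method, confidence, needs_review).

    Confidence is assumed to be the largest class probability. A prediction
    is flagged only when its confidence is strictly below the threshold.
    """
    method, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return method, confidence, confidence < threshold


# Hypothetical class probabilities for two wells (made-up values).
wells = {
    "well_A": {"ESP": 0.82, "Gas Lift": 0.10, "Rod Pump": 0.05, "PCP": 0.03},
    "well_B": {"ESP": 0.48, "Gas Lift": 0.40, "Rod Pump": 0.07, "PCP": 0.05},
}

for name, probs in wells.items():
    method, conf, review = triage(probs)
    note = " -> flag for additional evaluation" if review else ""
    print(f"{name}: {method} ({conf:.0%}){note}")
```

In this sketch, a well whose top two classes have similar probabilities, as in the ESP and gas lift overlap described above, falls below the threshold and is routed to an engineer instead of being accepted automatically.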

Keywords

artificial lift selection, machine learning, screening criteria, class imbalance, multi-class classification, ESP, gas lift, rod pump, PCP, domain-informed data augmentation

Data Availability Statement

The raw production data used in this study are not publicly available due to confidentiality agreements with participating oil companies. However, anonymized summary statistics and aggregated results are provided within the article. The corresponding author may be contacted for reasonable data requests, which will be subject to the aforementioned confidentiality constraints.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest. For transparency, Author Sarfraz A. Jokhio previously held industry positions with an artificial lift equipment provider (Wood Group ESP) and an operating company (Saudi Aramco); these prior affiliations had no role in the study's design, analysis, or conclusions.

AI Use Statement

The authors declare that Grammarly was used for language editing and grammar correction during the preparation of this manuscript. No generative AI was employed for content generation, data analysis, or interpretation. The authors have reviewed the manuscript and take full responsibility for the content of this work.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style
Nawab, S., Ali, M., Hullio, I. A., Noshad, Liu, N., & Jokhio, S. A. (2026). A Machine Learning Framework for Artificial Lift Method Selection with Physics-Informed Data Balancing. Reservoir Science, 2(3), 228-260. https://doi.org/10.62762/RS.2026.704585
Export Citation
RIS Format
Compatible with EndNote, Zotero, Mendeley, and other reference managers
TY  - JOUR
AU  - Nawab, Sohail
AU  - Ali, Muhammad
AU  - Hullio, Imran Ahmed
AU  - Noshad
AU  - Liu, Ning
AU  - Jokhio, Sarfraz A.
PY  - 2026
DA  - 2026/06/19
TI  - A Machine Learning Framework for Artificial Lift Method Selection with Physics-Informed Data Balancing
JO  - Reservoir Science
T2  - Reservoir Science
JF  - Reservoir Science
VL  - 2
IS  - 3
SP  - 228
EP  - 260
DO  - 10.62762/RS.2026.704585
UR  - https://www.icck.org/article/abs/RS.2026.704585
KW  - artificial lift selection
KW  - machine learning
KW  - screening criteria
KW  - class imbalance
KW  - multi-class classification
KW  - ESP
KW  - gas lift
KW  - rod pump
KW  - PCP
KW  - domain-informed data augmentation
SN  - 3070-2356
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  - 
BibTeX Format
Compatible with LaTeX, BibTeX, and other reference managers
@article{Nawab2026A,
  author = {Sohail Nawab and Muhammad Ali and Imran Ahmed Hullio and Noshad and Ning Liu and Sarfraz A. Jokhio},
  title = {A Machine Learning Framework for Artificial Lift Method Selection with Physics-Informed Data Balancing},
  journal = {Reservoir Science},
  year = {2026},
  volume = {2},
  number = {3},
  pages = {228--260},
  doi = {10.62762/RS.2026.704585},
  url = {https://www.icck.org/article/abs/RS.2026.704585},
  keywords = {artificial lift selection, machine learning, screening criteria, class imbalance, multi-class classification, ESP, gas lift, rod pump, PCP, domain-informed data augmentation},
  issn = {3070-2356},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Copyright © 2026 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Reservoir Science, ISSN: 3070-2356 (Online). Preserved at Portico.