Abstract
Acoustic impedance (Z) is a fundamental parameter in geophysical subsurface characterization: it governs seismic reflection coefficients and enables reservoir property quantification through seismic inversion. Conventional derivation of Z relies on density (ρ) and P-wave velocity (V_p) logs, yet these datasets are frequently unavailable because of operational constraints, tool limitations, or borehole irregularities. Existing empirical methods, such as neutron porosity-based formulas, rest on restrictive assumptions, including dependence on matrix and fluid constants, low shale tolerance (< 25%), and negligible secondary porosity, that limit their applicability in heterogeneous formations. To overcome these challenges, we present a robust machine learning workflow that predicts Z directly from commonly available well logs, circumventing the need for sonic or density data. A multi-well dataset comprising gamma-ray (GR), neutron porosity (NPHI), and deep resistivity (R_D) logs together with formation tops was analyzed. Pearson correlation analysis identified GR, NPHI, and log-transformed resistivity (R_Dlog) as the optimal predictors. Data preprocessing included Isolation Forest-based outlier removal and logarithmic transformation of resistivity. The XGBoost regressor, selected for its scalability in handling nonlinear interactions, was trained on 80% of the data, with hyperparameters optimized via cross-validated grid search. Model performance was evaluated using mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²). The optimized model achieved an R² of 0.916 (training) and 0.808 (testing), with RMSE values of 718.3 and 1070, respectively. Independent validation on a blind well demonstrated strong generalization (R² = 0.869, RMSE = 981.3), with predicted Z logs showing stratigraphic fidelity and suppression of the high-amplitude artifacts inherent to sonic-derived impedance.
Compared with empirical methods, the machine learning workflow eliminates reliance on matrix/fluid constants, accommodates shale volumes > 25%, and mitigates errors from secondary porosity or gas effects. It therefore offers a scalable, cost-effective way to improve seismic inversion accuracy in data-scarce or lithologically complex settings.
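As a minimal illustration of the quantities named above, the following sketch computes Z from density and P-wave velocity, the normal-incidence reflection coefficient between consecutive samples, and the three evaluation metrics (MAE, RMSE, R²). It is not the authors' implementation: the function names, the sample values, and the units are assumptions chosen only to make the definitions concrete.

```python
import math


def acoustic_impedance(density, vp):
    """Acoustic impedance Z = rho * V_p, computed sample by sample."""
    return [rho * v for rho, v in zip(density, vp)]


def reflection_coefficients(z):
    """Normal-incidence reflection coefficient at each interface
    between consecutive impedance samples: (Z2 - Z1) / (Z2 + Z1)."""
    return [(z2 - z1) / (z2 + z1) for z1, z2 in zip(z, z[1:])]


def regression_metrics(observed, predicted):
    """Return MAE, RMSE and R^2 for predicted versus observed values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical values for illustration only (not taken from the study).
density = [2.30, 2.45, 2.60]          # assumed units: g/cm^3
vp = [3000.0, 3500.0, 4200.0]         # assumed units: m/s
z_observed = acoustic_impedance(density, vp)
z_predicted = [7000.0, 8500.0, 11000.0]  # stand-in for model output

rc = reflection_coefficients(z_observed)
metrics = regression_metrics(z_observed, z_predicted)
print(rc, metrics)
```

In the workflow described here, `z_predicted` would come from the trained regressor rather than being typed in by hand; the metric definitions are unchanged.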