Abstract
We propose a self-supervised deep learning methodology, based on a one-dimensional adaptation of the Convolutional Visual Transformer (CVT) model, for characterizing the Earth's subsurface from well log data. Our foundation model is pre-trained on unlabeled multivariate sensor data from diverse basins and then fine-tuned to identify geological formations in the Williston Basin. The model achieves an average F1 score of 0.94 across six key formations; compared with baseline models including U-Net, XGBoost, SVM, and KNN, it converges faster, is more robust to missing inputs, and is more accurate. We further validate the model in geologically distinct basins, including the Groningen gas field, which supports its generalization to other fields. The framework applies to hydrocarbon exploration, carbon storage, geothermal energy, and groundwater reservoir characterization, supporting scalable and transferable AI solutions for the energy transition.