Abstract
Machine-learning interatomic potentials (MLIPs) enable large-scale atomistic simulations at moderate computational cost while retaining ab initio accuracy. In recent years, MLIPs trained on coupled-cluster data (particularly CCSD(T), which includes single, double, and perturbative triple excitations) have emerged as a promising route to achieve chemical accuracy (1 kcal/mol) beyond the limits of density functional theory (DFT) and to incorporate nonempirical van der Waals (vdW) interactions. Most existing approaches are, however, still not straightforwardly applicable to systems with extended covalent networks, such as covalent organic frameworks (COFs), because of the limited availability of CCSD(T) under periodic boundary conditions. Here we present a methodology to train MLIPs with CCSD(T) accuracy for systems with extended covalent networks. The approach is based on the Δ-learning method, combining a dispersion-corrected tight-binding baseline with an MLIP trained on the differences between the target CCSD(T) energies and the baseline energies. This Δ-learning strategy enables training on compact molecular fragments while preserving transferability to periodic systems. Dispersion interactions are accounted for by including vdW-bound multimers in the training set, and the combination with a vdW-aware tight-binding baseline allows the formally local MLIP to attain CCSD(T)-level accuracy even for systems dominated by long-range vdW forces. The resulting potential yields root-mean-square energy errors below 0.4 meV/atom on both training and test sets and reproduces electronic total atomization energies, bond lengths, harmonic vibrational frequencies, and intermolecular interaction energies for benchmark molecular systems. We apply the method to a prototypical quasi-two-dimensional COF composed of carbon and hydrogen. The COF structure, interlayer binding energies, and hydrogen absorption are analyzed at CCSD(T) accuracy.
Overall, the developed methodology opens a practical route to large-scale atomistic simulations for systems with extended covalent networks and vdW interactions with chemical accuracy.
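
The Δ-learning decomposition described above can be written schematically as follows. This is a minimal sketch rather than the exact formulation used in this work; the symbols $\mathbf{R}$ (atomic configuration), $E_{\mathrm{TB}}$ (dispersion-corrected tight-binding baseline energy), $\Delta E_{\mathrm{ML}}$ (MLIP correction), and $E_{\mathrm{pred}}$ (final predicted energy) are notation assumed here for illustration.

```latex
\begin{align}
  % Training target: difference between the reference and baseline energies
  \Delta E(\mathbf{R}) &= E_{\mathrm{CCSD(T)}}(\mathbf{R}) - E_{\mathrm{TB}}(\mathbf{R}), \\
  % Prediction: baseline plus the learned correction, with
  % \Delta E_{\mathrm{ML}} fitted to approximate \Delta E
  E_{\mathrm{pred}}(\mathbf{R}) &= E_{\mathrm{TB}}(\mathbf{R}) + \Delta E_{\mathrm{ML}}(\mathbf{R}).
\end{align}
```

In this picture the MLIP only has to learn the correction $\Delta E$, while the baseline term is evaluated directly for the system of interest.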