We present a robust and efficient method to implicitly account for solvation effects in modern semiempirical quantum mechanics and force fields. A computationally efficient yet accurate solvation model based on the analytical linearized Poisson–Boltzmann (ALPB) model is parameterized for the extended tight binding (xTB) and density functional tight binding (DFTB) methods as well as for the recently proposed GFN-FF general force field. The proposed methods perform well over a broad range of systems and applications, from conformational energies through transition-metal complexes to large supramolecular association reactions of charged species. For hydration free energies of small molecules, GFN1-xTB(ALPB) reaches the accuracy of sophisticated explicitly solvated approaches, with a mean absolute deviation of only 1.4 kcal/mol compared to experiment. Logarithmic octanol–water partition coefficients (log Kow) are computed with a mean absolute deviation of about 0.65 using GFN2-xTB(ALPB) compared to experimental values, indicating a consistent description of differential solvent effects. Overall, more than twenty solvents are parameterized and tested for each of the six semiempirical methods. The parameterizations are readily available in the xtb and dftb+ programs for diverse computational applications.
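As a minimal sketch of how the two quantities quoted above relate to solvation free energies, the following Python snippet converts a pair of solvation free energies into log Kow using the standard thermodynamic relation log Kow = -(ΔG_octanol - ΔG_water) / (RT ln 10) and evaluates a mean absolute deviation against reference values. The function names, the temperature default, and the treatment of octanol as a single pure solvent are assumptions made for illustration; they are not taken from the xtb or dftb+ programs.

```python
import math

# Gas constant in kcal/(mol K); solvation free energies are assumed to be in kcal/mol.
R_KCAL = 1.987204e-3


def log_kow(dg_solv_water, dg_solv_octanol, temperature=298.15):
    """Return log10 of the octanol-water partition coefficient.

    Both arguments are solvation free energies of the same solute (kcal/mol).
    A more negative value in octanol than in water gives a positive log Kow.
    """
    transfer = dg_solv_octanol - dg_solv_water  # water -> octanol transfer free energy
    return -transfer / (R_KCAL * temperature * math.log(10.0))


def mean_absolute_deviation(computed, reference):
    """Mean of |computed - reference| over paired values."""
    if len(computed) != len(reference) or not computed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(c - r) for c, r in zip(computed, reference)) / len(computed)
```

At room temperature, RT ln 10 is roughly 1.36 kcal/mol, so an error of about 1.4 kcal/mol in the transfer free energy corresponds to an error of about one log unit in Kow.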
Automatic differentiation (AD) has emerged as an integral part of machine learning, accelerating model development by enabling gradient-based optimization without explicit analytical derivatives. Recently, the benefits of AD and of computing arbitrary-order derivatives with respect to any variable have also been recognized in the field of quantum chemistry. In this work, we present dxtb, an open-source, fully differentiable framework for semiempirical extended tight-binding (xTB) methods. Developed entirely in Python and leveraging PyTorch for array operations, dxtb facilitates extensibility and rapid prototyping while maintaining computational efficiency. Through comprehensive code vectorization and optimization, we essentially reach the speed of compiled xTB programs for high-throughput calculations of small molecules. The excellent performance also scales to large systems, and batch operability yields additional benefits for execution on parallel hardware. In particular, energy evaluations are on par with existing programs, whereas automatically differentiated nuclear derivatives are only 2 to 5 times slower than their analytical counterparts. We showcase the utility of AD in dxtb by calculating various molecular and spectroscopic properties, highlighting its capacity to enhance and simplify such evaluations. Furthermore, the framework streamlines optimization tasks and offers seamless integration of semiempirical quantum chemistry in machine learning, paving the way for physics-inspired end-to-end differentiable models. Ultimately, dxtb aims to further advance the capabilities of semiempirical methods, providing an extensible foundation for future developments and hybrid machine learning applications. The framework is accessible at https://github.com/grimme-lab/dxtb.
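To illustrate what "derivatives without explicit analytical expressions" means, here is a small, self-contained sketch of forward-mode AD with dual numbers, applied to a Lennard-Jones pair energy and checked against the hand-derived derivative. This is not dxtb's or PyTorch's API (PyTorch uses reverse-mode AD on tensors); the `Dual` class, the toy potential, and all function names are hypothetical and chosen only to keep the example free of dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Value together with its derivative with respect to one chosen input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = _lift(other)
        # Quotient rule.
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

    def __rtruediv__(self, other):
        return _lift(other) / self

    def __pow__(self, n):
        # Power rule for a constant integer exponent.
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Toy pair energy; works for floats and for Dual numbers alike."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def ad_derivative(func, x):
    """dE/dx obtained by seeding the input with derivative 1."""
    return func(Dual(x, 1.0)).der


def analytical_derivative(r, epsilon=1.0, sigma=1.0):
    """Hand-derived dE/dr of the Lennard-Jones energy, for comparison."""
    return 4.0 * epsilon * (-12.0 * sigma ** 12 / r ** 13 + 6.0 * sigma ** 6 / r ** 7)
```

The energy function is written once and never differentiated by hand; the derivative falls out of the arithmetic rules attached to `Dual`. The extra bookkeeping carried through every operation is also the reason AD derivatives typically cost somewhat more than tuned analytical ones.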
We present the first benchmark set focusing on the conformational energies of highly flexible, long n-alkane chains, termed ACONFL. Unbranched alkanes are ubiquitous building blocks in nature, so the goal is to calculate their properties as accurately as possible to improve the modeling of, e.g., complex (biological) systems. Very accurate DLPNO-CCSD(T1)/CBS reference values are provided, which allow for a statistically meaningful evaluation of even the best available density functional methods. The performance of established and modern (dispersion corrected) density functionals is comprehensively assessed. The recently introduced r²SCAN-V functional shows excellent performance, similar to efficient composite DFT methods like B97-3c and r²SCAN-3c, which provide an even better cost-accuracy ratio, while almost reaching the accuracy of much more computationally demanding hybrid or double hybrid functionals with large QZ AO basis sets. In addition, we investigated the performance of common wavefunction methods, where MP2/CBS surprisingly performs worse than simple D4 dispersion corrected Hartree–Fock. Furthermore, we investigate the performance of several semiempirical and force field methods, which are commonly used for the generation of conformational ensembles in multilevel workflows or in large scale molecular dynamics studies. Outstanding performance is obtained by the recently introduced general force field, GFN-FF, while other commonly applied methods like the universal force field yield large errors. We recommend ACONFL as a helpful benchmark set for parameterization of new semiempirical or force field methods and machine learning potentials as well as a meaningful validation set for newly developed DFT or dispersion methods.
The recently proposed second revision of the SCAN meta-GGA density-functional approximation (DFA) [Furness et al., J. Phys. Chem. Lett. 2020, 11, 8208-8215; termed r²SCAN] is used to construct an efficient composite electronic-structure method termed r²SCAN-3c, expanding the "3c" series (hybrid: HSE/PBEh-3c, GGA: B97-3c, HF: HF-3c) to the mGGA level. To this end, the unaltered r²SCAN functional is combined with a tailor-made triple-zeta Gaussian AO-basis as well as with refitted D4 and gCP corrections for London-dispersion and basis-set superposition error. The performance of the new method is evaluated for the GMTKN55 thermochemical database covering large parts of chemical space with about 1500 data points, as well as additional benchmarks for noncovalent interactions, organometallic reactions, lattice energies of organic molecules and ices, and for the adsorption on polar salt and non-polar coinage-metal surfaces. These comprehensive tests reveal a spectacular performance and robustness of r²SCAN-3c for reaction energies and noncovalent interactions in molecular and periodic systems, as well as outstanding conformational energies and consistent structures. At just one tenth of the cost, r²SCAN-3c provides one of the best results of all semi-local DFT/QZ methods ever tested for the GMTKN55 benchmark database. Specifically for reaction and conformational energies as well as for noncovalent interactions, the new method outperforms hybrid-DFT/QZ approaches, compared to which the computational savings are even larger (factor 100-1000). In relation to other "3c" methods, r²SCAN-3c by far surpasses the accuracy of its predecessor B97-3c at only about twice the cost. Perhaps the most relevant remaining systematic deviation of r²SCAN-3c is due to self-interaction error (SIE), owing to its mGGA nature. However, SIE is notably reduced compared to other (m)GGAs, as is demonstrated for several examples. All in all, this remarkably efficient and robust method is chosen as our new group default, replacing previous low-level DFT and partially even expensive high-level methods in most standard applications for systems with up to several hundreds of atoms.
The computational treatment of large molecular structures is of increasing interest in many fields of modern chemistry. Accordingly, efficient quantum chemical approaches are needed to perform sophisticated investigations on such systems. This motivated the development of the well-established "Our own N-layered integrated molecular orbital and molecular mechanics" (ONIOM) multi-layer scheme [L. W. Chung et al., Chem. Rev., 2015, 115, 5678-5796]. In this work, we present the specific implementation of the ONIOM scheme into the xtb semi-empirical extended tight-binding program package and its application to challenging transition-metal complexes. The efficient and broadly applicable GFNn-xTB and -FF methods are applied in the ONIOM framework to elucidate reaction energies, geometry optimizations, and explicit solvation effects for metal-organic systems with up to several hundreds of atoms. It is shown that an ONIOM-based combination of density functional theory, semi-empirical, and force-field methods can be used to drastically reduce the computational costs and thus enable the investigation of huge systems with almost no loss in accuracy.
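The subtractive energy expression behind the ONIOM scheme can be sketched in a few lines of Python. The two-layer form combines a low-level calculation on the full ("real") system with high- and low-level calculations on the inner ("model") region; the three-layer form adds an intermediate region treated at a medium level. The function and argument names below are hypothetical, and the sketch ignores link atoms and gradients; it is not the interface of the xtb program.

```python
def oniom2_energy(e_low_real, e_high_model, e_low_model):
    """Two-layer subtractive ONIOM energy.

    E(ONIOM2) = E_low(real) + E_high(model) - E_low(model)
    All energies must be in the same unit.
    """
    return e_low_real + e_high_model - e_low_model


def oniom3_energy(e_low_real, e_med_intermediate, e_low_intermediate,
                  e_high_model, e_med_model):
    """Three-layer subtractive ONIOM energy.

    E(ONIOM3) = E_low(real)
                + E_med(intermediate) - E_low(intermediate)
                + E_high(model)       - E_med(model)
    """
    return (e_low_real
            + e_med_intermediate - e_low_intermediate
            + e_high_model - e_med_model)
```

The form of the expression shows where the savings come from: the expensive method is only ever evaluated on the small model region, while the full system is seen only by the cheapest method. If the high and low levels happened to agree on the model region, the two-layer result would reduce to the plain low-level energy of the real system.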
We have developed a new method to accurately account for solvation effects in semiempirical quantum mechanics based on a polarizable continuum model (PCM). The extended conductor-like polarizable continuum model (CPCM-X) incorporates a computationally efficient domain decomposition conductor-like screening model (ddCOSMO) for extended tight binding (xTB) methods and uses a post-processing approach based on established solvation models, like the conductor-like screening model for real solvents (COSMO-RS) and the universal solvent model based on solute electron density (SMD). According to various benchmarks, the approach performs well across a broad range of systems and applications, including hydration free energies, non-aqueous solvation free energies, and large supramolecular association reactions of neutral and charged species. Our method for computing solvation free energies is much more accurate than the current methods in the xtb program package. It improves the accuracy of solvation free energies by up to 40% for larger supramolecular association reactions to match even the accuracy of higher-level DFT-based solvation models like COSMO-RS and SMD while being computationally more than 2 orders of magnitude faster. The proposed method and the underlying ddCOSMO model are readily available for a wide variety of solvents and are accessible in xtb for use in various computational applications.