2.2. Molecular Hamiltonians

A molecular Hamiltonian is defined in three steps:

  1. Set up a molecular geometry (see Specifying the molecular geometry)

  2. Define a Gaussian basis set (see Specifying the atomic basis set)

  3. Calculate the matrix representation of all contributions separately that are defined in the molecular Hamiltonian (see Computing the matrix representation of the Hamiltonian)

Before a HF or post-HF calculation can be performed, the overlap matrix of the atomic basis set as well as the molecular orbitals need to be specified (see Computing the overlap matrix and defining the orbitals)

2.2.1. Specifying the molecular geometry

The basis set reader in PyBEST requires the molecular coordinates to be defined in an xyz file. If an xyz file is present, only the (relative) path to the file (in most cases just the file name) needs to be specified. If the molecular geometry is not present as an xyz file, PyBEST will internally convert it to a file in the xyz format and store it in a temporary directory. Note that PyBEST can load/dump the molecular geometry from/to different file formats. The file format is determined automatically using the extension or prefix of the filename. For more information on supported file formats, refer to Input/Output Operations: the IOData Container.

In addition to providing the filename, the molecular geometry can be specified as described below.

2.2.1.1. Reading the molecular geometry from file

The molecular geometry can be read from file using the method from_file() of the IOData class,

mol = IOData.from_file(filename)

For instance, the molecular coordinates can be provided in the conventional xyz file format,

mol = IOData.from_file('molecule.xyz')

If just the filename is given, PyBEST assumes that the file is present in the working directory. If the coordinate file is lying elsewhere, the absolute or relative path have to be provided, for instance,

mol = IOData.from_file('./coords/molecule.xyz')

2.2.1.2. Constructing a molecular geometry from scratch

A molecular geometry can also be generated using Python code. The following example constructs the atomic coordinates for a water molecule and writes the corresponding geometry information to an xyz file. Note that the coordinates have to be defined in bohr (atomic units). The xyz file is represented in Angstrom.

Listing 2.1 data/examples/hamiltonian/water.py
import numpy as np

from pybest.io.iodata import IOData
from pybest.utils import get_com

# define the molecule
natom = 3
alpha = 109.0 / 2.0 * np.pi / 180.0  # bond angle in radian
radius = 1.8  # distance between two neighboring atoms in bohr

# define the coordinates and elements
coordinates = np.zeros((natom, 3))
atom = np.array(["O", "H", "H"], dtype=object)  # must be strings

# update coordinates of the hydrogens (the oxygen atoms remains at the origin)
coordinates[1, 1] = -radius * np.sin(alpha)
coordinates[1, 2] = radius * np.cos(alpha)
coordinates[2, 1] = radius * np.sin(alpha)
coordinates[2, 2] = radius * np.cos(alpha)

# assign coordinates to container
mol = IOData(coordinates=coordinates, atom=atom, title="Water")

# move center of mass to the origin
com = get_com(mol)
mol.coordinates = mol.coordinates - com

# write the molecule to an XYZ file (optional)
mol.to_file("water.xyz")

2.2.2. Specifying the atomic basis set

PyBEST uses the Libint2 basis set reader. Thus, the supported basis set format is the g94 format. Note that Libint2 needs to be compiled to support up to a maximum angular momentum quantum number of seven, that is, I functions.

PyBEST is distributed with its own basis set library. An atomic basis set can be read in from the PyBEST basis set library using the function call get_gobasis().

2.2.2.1. Supported atomic basis sets in the PyBEST basis set library

All supported basis sets are located in src/pybest/data/basis and include

  1. Minimal basis sets

    • STO-3G, STO-6G

  2. Pople basis sets

    • 3-21g

    • 6-31g, 6-31g*, 6-31g**, 6-31++g**

  3. Correlation consistent basis sets

    • cc-pVDZ, cc-pVTZ, cc-pVQZ, cc-pV5Z

    • cc-pcVDZ, cc-pcVTZ, cc-pcVQZ

    • aug-cc-pVDZ, aug-cc-pVTZ, aug-cc-pVQZ, aug-cc-pV5Z

  4. ANO basis sets

    • ANO-RCC (large)

    • ANO-RCC-VDZ, ANO-RCC-VDZP, ANO-RCC-VTZ, ANO-RCC-VTZP, ANO-RCC-VQZP

2.2.2.2. The same basis set for different atoms

By default, all atoms in the molecule are assigned the same atomic basis set. To define an atomic basis set, the molecular geometry has to be provided to the get_gobasis() (see Reading the molecular geometry from file). The atomic basis set can then be constructed by only one function call,

gobasis = get_gobasis(basisname, coordinate_filename)

where basisname is the name of the basis set (a string) and coordinate_filename is the file containing the coordinates. If a cc-pVDZ basis is used and the coordinates are stored in the mol.xyz file, the basis set can be defined as follows

gobasis = get_gobasis('cc-pvdz', 'mol.xyz')

2.2.2.3. Loading user-defined basis sets from file

If an atomic basis set is not present in the PyBEST basis set library, it can be read from file. Note that the basis set file has to be available in the g94 file format. To read in a user-defined basis set file, the basisname argument of the get_gobasis() function has to be substituted by the path to the basis set file.

gobasis = get_gobasis(path_to_basis, coordinate_filename)

path_to_basis can be either the absolute or relative path. If the user-specified basis set is stored in the my_own_dz.g94 file, the above example becomes

gobasis = get_gobasis('./my_own_dz.g94', 'mol.xyz')

Note

The file extension .g94 has to be explicitely provided. Only with the file extension, PyBEST will read a user-defined basis set instead of searching for the basis set in its basis set library.

2.2.2.4. Loading geometry and basis set information from another file or external program

The molecular geometry and basis set information can also be read in from another file generated by external software like (Molpro, ORCA, PSI4, etc). Supported file formats are .mkl and .molden (in addition to PyBEST’s internal .h5 format). As for molecular geometries, the from_file() method has to be used to load the file.

# Load the geometry, orbital basis, and wave function from a molden file:
mol = IOData.from_file('water.molden')

# Print the number of basis functions
print(mol.gobasis.nbasis)

The corresponding information on the basis set or molecular orbitals is stored in the corresponding attributes. For instance, the basis set object is available in the gobasis attribute as the return value. See also Input/Output Operations: the IOData Container for more details on how to handle I/O operations.

2.2.3. Specifying dummy or ghost atoms and active fragments

In PyBEST, dummy or ghost atoms and active molecular fragments are defined using the get_gobasis() function, that is, during the construction of the atomic basis set.

2.2.3.1. Dummy or ghost atoms

All dummy or ghost atoms are passed as a list to the get_gobasis() function that contains the indices of all dummy/ghost atoms as its elements. The elements are sorted according to their order in the xyz coordinate file.

# set first (0) and second (1) element in mol.xyz as dummy/ghost atom
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', dummy=[0,1])

Note

In PyBEST, all indexing is performed using the Python convention, that is, beginning with 0.

2.2.3.2. Active fragments

PyBEST allows us to divide one xyz coordinate file in an active and inactive fragment. The inactive fragment will be neglected during the construction of the basis set and molecular Hamiltonian.

# define molecule containing atoms 3, 4, and 5 as active fragment contained
# in the mol.xyz file
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', active_fragment=[3,4,5])

2.2.3.3. Combining ghost atoms and active fragments

The dummy option and the active_fragment option can be combined so that additional ghost or dummy atoms can be defined in the active fragment.

# define molecule containing atoms 3, 4, and 5 as active fragment contained
# in the mol.xyz file, set atom index 3 to be a ghost atom
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', active_fragment=[3,4,5], dummy=[3])

Note

The dummy index has to be included in the active_fragment list. If this is not the case, an exception will be raised.

2.2.4. Computing the matrix representation of the Hamiltonian

As soon as the molecular geometry and atomic basis sets are defined, the matrix representation of the molecular Hamiltonian can be calculated. In the current version of PyBEST, each term has to be evaluated separately. The molecular Hamiltonian contains only one- and two-electron integrals as well as a constant term due to the nuclei-nuclei repulsion. Various one- and two-electron integrals are implemented in PyBEST (see also List of Supported One- and Two-Electron Integrals for a list of supported integrals). All integrals are calculated using the LibInt package [valeev2019] using the modern C++ API, except the Cholesky-decomposed two-electron repulsion integrals, which are evaluated using the LibChol Library.

All one-electron integrals as well as the overlap matrix of the atomic basis set are stored as two-index objects (that is, in its full or dense matrix representation), which are implemented in the DenseLinalgFactory class. The two-electron integrals are either stored as a dense four-index object (DenseLinalgFactory class, memory intensive) or Cholesky decomposed (CholeskyLinalgFactory class, more memory efficient). See PyBEST objects: the LinalgFactory for more information on the LinalgFactory and its features. The evaluation of all one- and two-electron integrals is performed by various PyBEST function calls.

2.2.4.1. The Schrödinger Hamiltonian

The electronic Schrödigner Hamiltonian,

(2.1)\[{\hat{H}}_{\mathrm{el}} = \sum_i^{N_{\mathrm{el}}} \hat{T}_i + \sum_i^{N_{\mathrm{el}}} \sum_I^{N_{\mathrm{nuc}}} \hat{V}_{iI} + \frac{1}{2} \sum_{i,j}^{N_{\mathrm{el}}} \hat{V}_{i,j} + \frac{1}{2} \sum_{I,J}^{N_{\mathrm{nuc}}} \hat{V}_{I,J}\]

which contains the kinetic energy of the electrons, the electron-nucleus attraction, the electron-electron repulsion, and the nucleus-nucleus repulsion terms. In PyBEST, each term has to be evaluated explicitly.

Before computing all one- and two-electron integrals, the Linalg factory has to be specified. Currently, two Linalg factories are supported: DenseLinalgFactory and CholeskyLinalgFactory, which differ only in the representation of the two-electron repulsion integrals. The one-electron integrals are represented in their dense representation using the DenseLinalgFactory class, irrespective of the choice between DenseLinalgFactory or CholeskyLinalgFactory. For both cases, the number of atomic basis functions has to be passed as an argument. It is stored in the nbasis attribute of the PyBasis instance (here gobasis).

# use DenseLinalgFactory
lf = DenseLinalgFactory(gobasis.nbasis)
# use CholeskyLinalgFactory
lf = CholeskyLinalgFactory(gobasis.nbasis)

Note

The DenseLinalgFactory and CholeskyLinalgFactory differ only in their representation of the two-electron integrals. All smaller dimensional tensors are represented using the DenseLinalgFactory.

The kinetic energy can be calculated using the compute_kinetic() function, which takes an instance of PyBasis as argument,

# evaluate kinetic energy of electrons and store in kin
kin = compute_kinetic(gobasis)

Similarly, the electron repulsion integrals are determined

# evaluate electron repulsion integrals and store in eri
eri = compute_eri(gobasis)
# or evaluate Cholesky-decomposed two-electron integals, the default threshold is
# set to 1e-3
eri = compute_cholesky_eri(gobasis, threshold=1e-8)

The potential energy associated with the nuclei is determined in a similar fashion,

# Nuclear-electron attraction term stored in na
na = compute_nuclear(gobasis)
# Nuclear-nuclear repulsion term stored as dictionary in nuc
nuc = compute_nuclear_repulsion(gobasis)

A complete example can be found here:

Listing 2.2 data/examples/hamiltonian/schrodinger.py, lines 3-39
from pybest import context
from pybest.gbasis import get_gobasis
from pybest.gbasis.gobasis import (
    compute_dipole,
    compute_eri,
    compute_kinetic,
    compute_nuclear,
    compute_nuclear_repulsion,
    compute_overlap,
    compute_quadrupole,
)
from pybest.io.iodata import IOData
from pybest.linalg.dense import DenseLinalgFactory
from pybest.utils import get_com

# Molecular Schrodinger Hamiltonian for the water molecule
# --------------------------------------------------------

# Define coordinate file.
# ----------------------------------------------------
coord = context.get_fn("test/water.xyz")

# Create a Gaussian basis set
# ---------------------------
gobasis = get_gobasis("cc-pvdz", coord)

# Create a linalg factory
# -----------------------
lf = DenseLinalgFactory(gobasis.nbasis)

# Compute integrals in the atom-centered Gaussian basis set
# ---------------------------------------------------------
kin = compute_kinetic(gobasis)
na = compute_nuclear(gobasis)
er = compute_eri(gobasis)
nuc = compute_nuclear_repulsion(gobasis)

2.2.5. Computing the overlap matrix and defining the orbitals

To perform a HF or post-HF calculation, the overlap matrix of the atomic orbital basis set has to be calculated and a set of (molecular) orbitals has to be defined. The overlap matrix can be calculated using the compute_overlap() method, which requires an instance of PyBasis as argument. The return value is a dense two-index object of DenseLinalgFactory (as all one-electron integrals).

# overlap matrix of the atomic basis set
olp = compute_overlap(gobasis)

The (molecular) orbitals are initialized using the create_orbital() method of the DenseLinalgFactory class,

# create molecular orbitals
orb = lf.create_orbital()

The orbitals are an instance of the DenseOrbital class. The instance orb contains the AO/MO coefficients (orb.coeffs), the occupation numbers (orb.occupations), and orbital energies (orb.energies), which are stored as two- (coefficients) or one-dimensional (occupations, energies) numpy arrays. See also Working with Orbitals in PyBEST for more details on orbitals and how to work with orbitals in PyBEST.

2.2.6. Computing the multipole moment integrals

PyBEST supports the calculation of the electric dipole and quadrupole moment integrals. The electric dipole integrals can be calculated using the compute_dipole() method, which requires an instance of PyBasis as argument. The return value is a list containing the overlap matrix and each component {x, y, z} of the (Cartesian) electric dipole operator, each being a dense two-index object of DenseLinalgFactory (as all one-electron integrals).

# electric dipole moment integrals of the atomic basis set
olp, mux, muy, muz = compute_dipole(gobasis)
# or store as tuple
dipole = compute_dipole(gobasis)

By default, the multipole origin is set to {0,0,0}. This can be adjusted by setting the x, y, and z arguments to some new origin. If you wish to take the center of mass (COM) of the molecule as the new origin, you first determine the COM and then pass it explicitly to the compute_dipole() method,

# determine COM
x, y, z = get_com(gobasis)

# electric dipole moment integrals of the atomic basis set wrt COM
olp, mux, muy, muz = compute_dipole(gobasis, x=x, y=y, z=z)
# or store as tuple
dipole = compute_dipole(gobasis, x=x, y=y, z=z)

The electric quadrupole moment integrals can be determined in a similar way using the compute_quadrupole() method. In addition to the overlap matrix and each component of the (Cartesian) electric dipole operator, all 6 components, {xx, xy, xz, yy, yz, zz}, of the Cartesian electric quadrupole operator are returned (thus, 10 objects in total),

# electric quadrupole moment integrals of the atomic basis set
# ------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(gobasis)
# or store as tuple
quadrupole = compute_quadrupole(gobasis)

Similar to the electric dipole moment integrals, the quadrupole moment integrals are determined with respect to the origin of the Cartesian coordinate system, {0,0,0}. This can be adjusted by setting the x, y, and z arguments to some new origin,

# determine COM
x, y, z = get_com(gobasis)

# electric quadrupole moment integrals of the atomic basis set wrt COM
# --------------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(gobasis, x=x, y=y, z=z)
# or store as tuple
quadrupole = compute_quadrupole(gobasis, x=x, y=y, z=z)

PyBEST provides additional wrappers that can be used to determine the electric dipole moment (see Computing the electric dipole moment for more details).

2.2.7. Example Python script

2.2.7.1. The molecular Hamiltonian for the water molecule

This example shows how to set up the molecular Hamiltonian, orbitals, and overlap matrix for the water molecule.

Listing 2.3 data/examples/hamiltonian/schrodinger.py, lines 3-ff
from pybest import context
from pybest.gbasis import get_gobasis
from pybest.gbasis.gobasis import (
    compute_dipole,
    compute_eri,
    compute_kinetic,
    compute_nuclear,
    compute_nuclear_repulsion,
    compute_overlap,
    compute_quadrupole,
)
from pybest.io.iodata import IOData
from pybest.linalg.dense import DenseLinalgFactory
from pybest.utils import get_com

# Molecular Schrodinger Hamiltonian for the water molecule
# --------------------------------------------------------

# Define coordinate file.
# ----------------------------------------------------
coord = context.get_fn("test/water.xyz")

# Create a Gaussian basis set
# ---------------------------
gobasis = get_gobasis("cc-pvdz", coord)

# Create a linalg factory
# -----------------------
lf = DenseLinalgFactory(gobasis.nbasis)

# Compute integrals in the atom-centered Gaussian basis set
# ---------------------------------------------------------
kin = compute_kinetic(gobasis)
na = compute_nuclear(gobasis)
er = compute_eri(gobasis)
nuc = compute_nuclear_repulsion(gobasis)

# Compute overlap matrix of atom-centered basis set
# -------------------------------------------------
olp = compute_overlap(gobasis)

# Create orbitals that store the AO/MO coefficients
# -------------------------------------------------
orb = lf.create_orbital()

# Get center of mass
# ------------------
x, y, z = get_com(gobasis)

# electric dipole moment integrals of the atomic basis set wrt COM
# ----------------------------------------------------------------
olp, mux, muy, muz = compute_dipole(gobasis, x=x, y=y, z=z)

# electric quadrupole moment integrals of the atomic basis set wrt COM
# --------------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(
    gobasis, x=x, y=y, z=z
)