(WIP) Numerical Methods for Density Functional Theory

Introduction

It has been known for years that the physical laws underlying any arbitrary physical systems can be accurately described using quantum mechanics. While this viewpoint is philosophically elegant, it is not really useful as quantum mechanical descriptions of many-particle interactions, is formulated in a Hilbert’s space which scales exponentially with the number of particles, posing a computational bottleneck, which Bellman called the Curse of Dimensionality.

One can often get away with this, by modelling systems with mean field, classical particle based and even continuum based approximations as one often does in quantum field theory, kinetic theory of fluids and fluid mechanics respectively. Molecular Spectroscopy is one discipline where errors induced by such approximations is often dire, requiring explicit quantum mechanical descriptions despite the aforementioned bottleneck due to Bellman.

Luckily there are several approximations that have been proposed that aim at circumventing this problem. One such method is Density Functional Theory. In what follows, I present a self-consistent introduction to Density Functional Theory (DFT) from the perspective of numerical analysis and an efficient algorithmic implementation.

A Many Body Problem

For the rest of the article, I will refer to a many body quantum system with \(n\) electrons and \(m\) nuclear centers. It is known that the state of this system can be completely described with the help of its wavefunction, \(|\Psi\rangle\). If, the quantum system has \(p\) spin states, then the wavefunction, \(|\Psi\rangle : \mathbb{R}^{p^{m+n}} \times \mathbb{R}^{+} \mapsto \mathbb{C}\). Its evolution, is defined axiomatically by Schrodinger’s equation:

\[\begin{equation} \label{eq:MBP} i\bar{h} \frac{\partial |\Psi\rangle}{\partial t} = \left( -\frac{\bar{h}^2}{2M} \nabla^2 + V \right) |\Psi\rangle \end{equation}\]

The potential operator \(V\) in (\ref{eq:MBP}) is usually dominated by the electrostatic interactions among the different charge centers in the system. If \(Z_1, Z_2\) are the atomic numbers of two nuclear centers, located at \(r_1, r_2\), then the electrostatic potential between the two is

\[\begin{equation} \label{eq:electrostatic} V(r_1,r_2) = \frac{1}{4 \pi \epsilon} \frac{Z_1 Z_2 e^2}{\|r_1 - r_2\|} + V_{ext} \end{equation}\]

It is common to expand out (\ref{eq:MBP}-\ref{eq:electrostatic}) in its long form as follows.

\[\begin{equation} \label{eq:Ham} \begin{aligned} i\bar{h} \frac{\partial |\Psi\rangle}{\partial t} = \left( -\frac{\bar{h}^2}{2} \sum_{i=1}^{m} \frac{1}{M_i}\nabla_{r_i}^2 -\frac{\bar{h}^2}{2} \sum_{j=1}^{n} \frac{1}{Me_{i}}\nabla_{r_j}^2\\ + \frac{1}{4 \pi \epsilon} \sum_{i=i}^{n} \sum_{j=1}^{m} \frac{Z_i e^2}{\|r_i - r_j\|} + \frac{1}{4 \pi \epsilon} \sum_{i=i}^{n} \sum_{j=1}^{n} \frac{e^2}{\|r_i - r_j\|} + \frac{1}{4 \pi \epsilon} \sum_{i=i}^{m} \sum_{j=1}^{m} \frac{Z_i Z_j e^2}{\|r_i - r_j\|} + V_{ext}\right) |\Psi\rangle \end{aligned} \end{equation}\]

The long form operator is known as the many electronic Hamiltonian. By itself, equation (\ref{eq:Ham}) is a parabolic partial differential equation with no analytical solutions, except for very simplified cases.

The alternative of simulating the system numerically is also not computationally tractable. Say, \(p=2, m=6, n=60\). This corresponds to \(\Psi\) with \(2^{66} \approx 7.37 \times 10^{19}\) input parameters. If one were to parameterize each of these input directions with only \(10\) indicator functions then the state vector \(\tilde{\Psi}\) that uniquely represents a quantum state has \(10 \times 2^{66}\) components. This poses a burden that would be too heavy to bear for systems of larger sizes.

Born Oppenheimer approximation

Making the observation that nuclear centers are substantially more massive than the electrons, Born and Oppenheimer postulated that the electrons and nuclear centers evolve in two different timescales. Effectively, the contributions of the nuclei to the many electronic Hamiltonian can be dropped and the nuclei can in-turn be simulated using Newton’s laws govered by a potential surface \(E\) obtained from solving the stationary Schrodinger’s equation.

\[\begin{equation} \label{eq:SSE} H_e |\Psi\rangle := \left(-\frac{\bar{h}^2}{2} \sum_{j=1}^{n} \frac{1}{Me_{i}}\nabla_{r_j}^2\\ + \frac{1}{4 \pi \epsilon} \sum_{i=i}^{n} \sum_{j=1}^{m} \frac{Z_i e^2}{\|r_i - r_j\|} + \frac{1}{4 \pi \epsilon} \sum_{i=i}^{n} \sum_{j=1}^{n} \frac{e^2}{\|r_i - r_j\|} + V_{ext}\right) |\Psi\rangle = E |\Psi\rangle \end{equation}\]

While, the BO approximation certainly reduces the complexity of solving (\ref{eq:MBP}), the wavefunction is still sufficiently massive to discourage its computation, even on the largest computational facilities on earth. This is the curse of dimensionality that I alluded to, in the introduction. In what follows, I will elaborate on how DFT resolves this problem.

Density Functional Theory

\(|\Psi\rangle(z_1,z_2,....)\) has an exponentially growing input parameter space. So, all computations that involve \(|\Psi\rangle\) are prohibitively expensive. DFT addresses this problem, by restating (\ref{eq:SSE}) using an observable of \(|\Psi\rangle\) that takes in fewer parameters – the electron density of the system, \(\rho\), which is defined as:

\[\begin{equation} \label{eq:density} \rho(r) = n \sum_{s=\{0,1\}} \int \: dz_2 dz_3 ... dz_n \: \Psi^{*}((r,s),z_2,...,z_n) \Psi((r,s),z_2,...,z_n) \end{equation}\]

It would seem, that computing the density, in-turn requires knowledge of the wavefunction in the first place, which we have established as difficult to assemble on a computer. So, is this whole notion moot? Yes and No. While it is not computationally tractable to compute \(\rho\) directly, one can reformulate (\ref{eq:SSE}) in a variational setting and optimize for \(\rho\) indirectly.

Consider the Ritz minimization iteration in the context of (\ref{eq:SSE}).

\(\begin{equation} E_{min} = \arg \min_{|\Psi\rangle \in \Theta} \frac{\langle\Psi|H_e|\Psi\rangle}{\langle\Psi|\Psi\rangle} \end{equation}\) where \(\Theta\) is the hypothesis space of antisymmetric functions \(|\Psi\rangle\) with \(\langle\Psi|\Psi\rangle = 1\).