Absolute Free Energy Estimators
Relative free energy calculations produce \(\Delta\Delta G\) values (free energy differences between pairs of ligands). To compare these with experiment, or to rank ligands by affinity, we need absolute free energies (\(\Delta G\)) for each ligand.
This requires an estimator: a method that takes the network of relative free energies and produces absolute values.
Maximum Likelihood Estimation (MLE)
The Maximum Likelihood Estimation (MLE) [1] method is the default estimator used in cinnabar to obtain absolute free energies (\(\Delta G\)) from a network of relative free energies (\(\Delta\Delta G\)).
The Core Idea
To place every ligand on a common absolute scale, we need to find a set of \(\Delta G\) values that best explain all relative differences (\(\Delta\Delta G\)) simultaneously. The MLE method does this by asking:
What set of \(\Delta G\) values makes the observed data most likely, given the reported uncertainties?
Because every edge enters the same likelihood, redundant paths and cycles in the graph are reconciled in a single fit rather than edge by edge.
The Likelihood Function
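Consider an edge between two ligands i and j, with observed relative free energy \(\Delta\Delta G_{ij}\) and
uncertainty \(\sigma_{ij}\). The model assumes each measurement is normally distributed about the difference of the underlying absolute values (taking \(\Delta\Delta G_{ij}\) to mean \(\Delta G_j - \Delta G_i\)):
\[\Delta\Delta G_{ij} \sim \mathcal{N}\left(\Delta G_j - \Delta G_i,\ \sigma_{ij}^2\right)\]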
The likelihood is the product of probabilities for all edges in the graph. The MLE procedure finds the set of \(\Delta G\) values that maximises this likelihood (or equivalently, minimises the negative log-likelihood).
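As an illustration, for independent Gaussian edges the negative log-likelihood reduces, up to a constant, to a weighted sum of squares:

\[-\ln L = \sum_{(i,j)} \frac{\left(\Delta\Delta G_{ij} - (\Delta G_j - \Delta G_i)\right)^2}{2\sigma_{ij}^2} + \text{const}\]

so the maximisation can be sketched as a weighted least-squares problem. The snippet below is a minimal NumPy sketch under that assumption, not cinnabar's own implementation. The function name `mle_absolute`, the edge tuple layout, and the sign convention \(\Delta\Delta G_{ij} = \Delta G_j - \Delta G_i\) are assumptions made for the example.

```python
import numpy as np

def mle_absolute(edges, n_ligands):
    """Gaussian maximum-likelihood fit of absolute free energies.

    edges: iterable of (i, j, ddg_ij, sigma_ij), where ddg_ij is modelled
           as dg[j] - dg[i] (assumed sign convention) and sigma_ij > 0.
    Returns an array of dg values centred so that their mean is zero.
    """
    edges = list(edges)
    A = np.zeros((len(edges), n_ligands))
    b = np.zeros(len(edges))
    for row, (i, j, ddg, sigma) in enumerate(edges):
        # Dividing each row by sigma turns the weighted problem into an
        # ordinary least-squares problem on the scaled system.
        A[row, i] = -1.0 / sigma
        A[row, j] = 1.0 / sigma
        b[row] = ddg / sigma
    # The system is rank deficient (a constant offset is undetermined);
    # lstsq returns the minimum-norm solution, which we centre explicitly.
    dg, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dg - dg.mean()

# Hypothetical three-ligand cycle whose edges are mutually consistent.
example_edges = [(0, 1, 1.0, 0.2), (1, 2, 0.5, 0.2), (0, 2, 1.5, 0.2)]
dg = mle_absolute(example_edges, n_ligands=3)
print(dg)
```

Because the example cycle closes exactly, the fitted differences reproduce every input edge; with inconsistent edges the fit would instead spread the discrepancy according to the uncertainties.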
Uncertainty Propagation
Input uncertainties (\(\sigma_{ij}\)) are explicitly included in the likelihood function, so more precise edges (smaller uncertainty) carry greater weight in determining the solution. The flip side is that an edge reported with high confidence but poor accuracy can skew the entire network, so robust uncertainty estimates on the input data are crucial.
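The weighting effect can be seen on a small cycle whose edges disagree. The sketch below assumes the Gaussian model reduces to weighted least squares and uses a hypothetical helper `fit_absolute` with made-up numbers; it is not cinnabar's code. The path 0→1→2 implies a difference of 2.0 between ligands 0 and 2, while the direct edge reports 3.0.

```python
import numpy as np

def fit_absolute(edges, n_ligands):
    """Weighted least-squares fit; edges are (i, j, ddg_ij, sigma_ij)."""
    A = np.zeros((len(edges), n_ligands))
    b = np.zeros(len(edges))
    for row, (i, j, ddg, sigma) in enumerate(edges):
        A[row, i], A[row, j] = -1.0 / sigma, 1.0 / sigma
        b[row] = ddg / sigma
    dg, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dg - dg.mean()

def triangle(sigma_02):
    """Inconsistent cycle: 1.0 + 1.0 along the path, 3.0 on the direct edge."""
    return [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 3.0, sigma_02)]

# All three edges equally uncertain: the discrepancy is shared evenly.
equal = fit_absolute(triangle(1.0), 3)
# Direct edge reported as very precise: the fit follows it almost exactly.
precise = fit_absolute(triangle(0.01), 3)

print("equal sigmas, fitted dG2 - dG0:", equal[2] - equal[0])
print("precise 0-2 edge, fitted dG2 - dG0:", precise[2] - precise[0])
```

In the second case the two other edges absorb nearly all of the cycle-closure error, which is exactly the behaviour that makes an overconfident but inaccurate edge damaging.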
Centering of Results
The absolute \(\Delta G\) scale is arbitrary: adding a constant to all \(\Delta G\) values does not change any relative differences \(\Delta\Delta G\).
As a result, the MLE solution is typically centred around zero (or another chosen reference). To compare with experimental
values, an experimental shift must be applied. By default cinnabar will align the mean of predicted and
experimental \(\Delta G\) in the plotting functions.
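A mean alignment of this kind can be sketched in a few lines. The function name `shift_to_experiment` and the numbers are hypothetical, and the snippet only illustrates the idea of matching means; it does not reproduce cinnabar's plotting code.

```python
import numpy as np

def shift_to_experiment(dg_pred, dg_exp):
    """Shift predictions by a constant so their mean matches experiment.

    Both inputs must list the same ligands in the same order.
    """
    dg_pred = np.asarray(dg_pred, dtype=float)
    dg_exp = np.asarray(dg_exp, dtype=float)
    return dg_pred - dg_pred.mean() + dg_exp.mean()

# Hypothetical zero-centred predictions and experimental values (kcal/mol).
pred = [-0.8, 0.1, 0.7]
exp = [-9.6, -8.9, -8.2]
shifted = shift_to_experiment(pred, exp)
print(shifted)
```

The shift is a single constant, so every pairwise difference between ligands is unchanged; only the overall offset moves.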
Limitations
The MLE method cannot automatically use multiple independent measurements of the same edge to improve precision. Each edge must be represented by a single \(\Delta\Delta G\) and uncertainty. If multiple measurements are available, they should be combined externally (e.g. via weighted averaging) before input to the estimator.
The MLE method automatically adds small non-zero uncertainties to edges with exactly zero reported uncertainty to ensure numerical stability.
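For the first limitation, one common way to combine repeated measurements of an edge is an inverse-variance weighted average. The sketch below assumes the repeats are independent with Gaussian errors and non-zero uncertainties; `combine_repeats` is a hypothetical helper, not part of cinnabar.

```python
import numpy as np

def combine_repeats(values, sigmas):
    """Inverse-variance weighted mean of repeated measurements of one edge.

    Assumes independent Gaussian errors and strictly positive sigmas.
    Returns (combined value, combined uncertainty).
    """
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    weights = 1.0 / sigmas**2
    mean = np.sum(weights * values) / np.sum(weights)
    sigma = np.sqrt(1.0 / np.sum(weights))
    return mean, sigma

# Three hypothetical repeats of the same edge with equal uncertainty.
ddg, sigma = combine_repeats([1.2, 0.8, 1.0], [0.2, 0.2, 0.2])
print(ddg, sigma)
```

The combined value and uncertainty can then be supplied as the single entry for that edge. If the repeats scatter more than their reported uncertainties suggest, the combined uncertainty from this formula will be too optimistic and a more conservative estimate may be preferable.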