Plotting Best Practices

Visualisation plays a crucial role in assessing the accuracy of binding free energy calculations. The same raw data can be represented in many different ways, and the choice of plot or analysis metric determines which aspects of performance are highlighted. This page explains the plot types available in cinnabar, and the best practices they follow. For more detailed information we recommend reading the companion article [1].

Edgewise \(\Delta\Delta G\) Plots

Edgewise relative free energy plots allow direct comparison of calculated and experimental \(\Delta\Delta G\) values. These plots are best for method developers, as they show how well individual transformations are predicted and highlight outliers. These plots can be generated from an FEMap using the plot_DDGs method.

Best practices:

  • Only error statistics (RMSE, MUE) are shown by default. Correlation measures (\(R^{2}\), \(\rho\)) are not meaningful, as the direction of a relative transformation is arbitrary: reversing an edge flips the sign of both the calculated and experimental values, which changes the correlation but leaves the error unchanged.

  • Error bars are shown to represent uncertainty in both calculated and experimental values.

  • Uncertainties on the statistics are estimated by bootstrapping (1000 samples with replacement) to provide 95% confidence intervals.
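The bootstrapping described in the last bullet can be sketched with plain NumPy. This is a minimal illustration of the general technique, not cinnabar's own implementation: the function names (`bootstrap_statistic`, `rmse`, `mue`) and the example values are hypothetical, and the sketch resamples data points only, without propagating the per-point uncertainties.

```python
import numpy as np


def rmse(calc, exp):
    """Root mean squared error between calculated and experimental values."""
    return float(np.sqrt(np.mean((calc - exp) ** 2)))


def mue(calc, exp):
    """Mean unsigned error between calculated and experimental values."""
    return float(np.mean(np.abs(calc - exp)))


def bootstrap_statistic(calc, exp, statistic, n_bootstrap=1000, ci=0.95, seed=0):
    """Return (point estimate, lower bound, upper bound) for a statistic.

    The bounds are percentile confidence limits obtained by resampling
    the (calc, exp) pairs with replacement n_bootstrap times.
    """
    calc = np.asarray(calc, dtype=float)
    exp = np.asarray(exp, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(calc)
    samples = np.empty(n_bootstrap)
    for i in range(n_bootstrap):
        # Draw n indices with replacement and recompute the statistic.
        idx = rng.integers(0, n, size=n)
        samples[i] = statistic(calc[idx], exp[idx])
    low = float(np.percentile(samples, 100 * (1 - ci) / 2))
    high = float(np.percentile(samples, 100 * (1 + ci) / 2))
    return statistic(calc, exp), low, high


# Hypothetical edgewise ΔΔG values in kcal/mol, for illustration only.
calc_ddg = np.array([1.2, -0.5, 0.3, 2.1, -1.4])
exp_ddg = np.array([1.0, -0.9, 0.8, 1.5, -1.0])

rmse_value, rmse_low, rmse_high = bootstrap_statistic(calc_ddg, exp_ddg, rmse)
mue_value, mue_low, mue_high = bootstrap_statistic(calc_ddg, exp_ddg, mue)
print(f"RMSE: {rmse_value:.2f} [95%: {rmse_low:.2f}, {rmse_high:.2f}]")
print(f"MUE:  {mue_value:.2f} [95%: {mue_low:.2f}, {mue_high:.2f}]")
```

With only a handful of edges the intervals are wide, which is the point of reporting them: two methods whose intervals overlap heavily cannot be ranked reliably on that dataset.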

Absolute \(\Delta G\) Plots

Absolute free energy plots compare calculated and experimental \(\Delta G\) values for each ligand. Absolute \(\Delta G\) predictions can be obtained from a connected network of relative free energies via an estimator. These plots are useful for both method developers and users, as they show how well the overall ranking of ligands is predicted. Because they depend on the entire network, they accumulate errors from multiple aspects of the calculation, including edge accuracy, uncertainty quantification, perturbation network design and the choice of estimator, and so give a complete picture of protocol performance. These plots can be generated from an FEMap using the plot_DGs method.

Best practices:

  • Both error (RMSE, MUE) and correlation (\(R^{2}\), \(\rho\)) statistics are shown by default, as absolute data are directional and correlation is meaningful.

  • Error bars are shown to represent uncertainty in both calculated and experimental values.

  • Uncertainties on the statistics are estimated by bootstrapping (1000 samples with replacement) to provide 95% confidence intervals.

  • Mean centering is applied by default to align the mean of calculated and experimental values. This is needed when analyzing the outcomes of relative free energy simulations, as they determine \(\Delta G\) only up to an additive constant and the absolute scale is therefore arbitrary.
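To make the mean-centering bullet concrete, the sketch below shifts a set of calculated \(\Delta G\) values so that their mean matches the experimental mean. It assumes one possible convention (shifting only the calculated values); the helper name `mean_center` and the numbers are hypothetical and are not taken from cinnabar.

```python
import numpy as np


def mean_center(calc_dg, exp_dg):
    """Shift calculated ΔG values so their mean equals the experimental mean.

    Only a constant offset is applied, so differences between ligands
    (and therefore rankings and correlations) are unchanged.
    """
    calc_dg = np.asarray(calc_dg, dtype=float)
    exp_dg = np.asarray(exp_dg, dtype=float)
    return calc_dg - calc_dg.mean() + exp_dg.mean()


def rmse(calc, exp):
    """Root mean squared error between calculated and experimental values."""
    return float(np.sqrt(np.mean((calc - exp) ** 2)))


# Hypothetical values in kcal/mol: the calculated set sits on an
# arbitrary scale, offset by roughly 5 kcal/mol from experiment.
exp_dg = np.array([-9.0, -8.0, -10.0])
raw_calc_dg = np.array([-4.2, -2.9, -4.9])

centered_calc_dg = mean_center(raw_calc_dg, exp_dg)

rmse_before = rmse(raw_calc_dg, exp_dg)
rmse_after = rmse(centered_calc_dg, exp_dg)
print(f"RMSE before centering: {rmse_before:.2f}")
print(f"RMSE after centering:  {rmse_after:.2f}")
```

Before centering, the RMSE is dominated by the arbitrary offset and says little about the method. After centering, it reflects only how well the spread of values around the mean is reproduced.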

Pairwise (all-to-all) \(\Delta\Delta G\) Plots

Pairwise or all-to-all relative free energy plots compare all possible pairwise \(\Delta\Delta G\) values between ligands in a dataset. These are generated by calculating pairwise differences from the estimated absolute \(\Delta G\) values. These plots are especially useful for method developers because they remove analysis biases introduced by the perturbation network design: every method is compared on the same set of pairwise values, even if the underlying networks differ. These plots can be generated from an FEMap using the plot_all_DDGs method.

Best practices:

  • Follow the same guidelines as for edgewise plots (error statistics only, with uncertainties represented).

  • Use these plots for fair, network-independent comparisons between methods.
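The construction of all-to-all values from absolute \(\Delta G\) estimates can be sketched as follows. This is an illustration under stated assumptions rather than cinnabar's implementation: the function names, ligand labels and values are hypothetical, and the sign convention (second ligand minus first) is an arbitrary choice.

```python
from itertools import combinations

import numpy as np


def all_pairwise_ddg(dg):
    """Map each unordered ligand pair (a, b) to ΔG(b) - ΔG(a).

    dg is a dict of ligand name -> ΔG. Names are sorted so that two
    methods evaluated on the same ligands yield the same set of pairs.
    """
    return {(a, b): dg[b] - dg[a] for a, b in combinations(sorted(dg), 2)}


def pairwise_rmse(calc_dg, exp_dg):
    """RMSE over all ligand pairs present in both dicts."""
    calc_pairs = all_pairwise_ddg(calc_dg)
    exp_pairs = all_pairwise_ddg(exp_dg)
    common = sorted(set(calc_pairs) & set(exp_pairs))
    errors = np.array([calc_pairs[p] - exp_pairs[p] for p in common])
    return float(np.sqrt(np.mean(errors ** 2)))


# Hypothetical absolute ΔG values in kcal/mol.
calc_dg = {"lig_a": -9.2, "lig_b": -7.9, "lig_c": -9.9}
exp_dg = {"lig_a": -9.0, "lig_b": -8.0, "lig_c": -10.0}

calc_pairs = all_pairwise_ddg(calc_dg)
pair_rmse = pairwise_rmse(calc_dg, exp_dg)

# A constant offset on the calculated ΔG values cancels in every pair,
# so the pairwise RMSE does not depend on the arbitrary absolute scale.
offset_calc_dg = {name: value + 5.0 for name, value in calc_dg.items()}
offset_pair_rmse = pairwise_rmse(offset_calc_dg, exp_dg)

print(f"{len(calc_pairs)} pairs, pairwise RMSE = {pair_rmse:.2f}")
```

For N ligands this gives N(N-1)/2 pairs regardless of which edges were actually simulated, which is why the resulting statistics are comparable across methods that used different perturbation networks.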

Summary of Best Practices

  • Use \(\Delta\Delta G\) plots for edge-level diagnostics.

  • Use absolute \(\Delta G\) plots for global performance and ranking.

  • Always represent uncertainties, both at the individual estimate level and on reported statistics (e.g. RMSE).

  • When comparing across methods, use all-to-all pairwise \(\Delta\Delta G\) to enable fair comparisons.

References