Modeling biomolecular flexibility and plasticity is important for understanding the function of biological systems. Hence, it is desirable to know precisely what can move and how. Given a network representation of a biomolecular structure, its intrinsic flexibility can be identified efficiently by determining the number and spatial distribution of bond-rotational degrees of freedom in the network. With the graph theory-based approach implemented in the FIRST program by Thorpe et al., the flexibility of a given biomolecule can be analyzed within a few seconds. We use rigidity analysis to investigate large biomolecules such as the ribosome, analyze the structural determinants of thermostability, approximate the change of vibrational entropy upon binding of biomolecules, understand allosteric transmission in biomolecules, and sample biomolecular conformational spaces. Detailed information about rigidity analysis and its application to biomolecules is provided in [144].
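The degree-of-freedom counting mentioned above can be illustrated with a pebble game, the kind of combinatorial algorithm that underlies graph theory-based rigidity analysis. The following is only a minimal sketch under simplifying assumptions: it plays the two-dimensional (k = 2, l = 3) pebble game on a bar-joint graph, whereas biomolecular networks require a three-dimensional variant. The function names and the toy graphs are hypothetical and are not part of FIRST or CNA.

```python
def pebble_game(n_vertices, edges, k=2, l=3):
    """Play a (k, l)-pebble game; return (independent, redundant) edge lists.

    Illustrative 2D sketch only (k=2, l=3); assumes no self-loops.
    """
    free = [k] * n_vertices                  # free pebbles per vertex
    out = [[] for _ in range(n_vertices)]    # out[u]: edges covered by u's pebbles
    independent, redundant = [], []

    def collect(root, keep):
        """Try to bring one more free pebble to `root` without taking it from `keep`."""
        seen = {root}
        parent = {}
        stack = [root]
        while stack:
            x = stack.pop()
            for y in list(out[x]):
                if y in seen:
                    continue
                seen.add(y)
                parent[y] = x
                if y != keep and free[y] > 0:
                    # Reverse the path root -> ... -> y, shifting one pebble to root.
                    free[y] -= 1
                    while y != root:
                        x = parent[y]
                        out[x].remove(y)
                        out[y].append(x)
                        y = x
                    free[root] += 1
                    return True
                stack.append(y)
        return False

    for u, v in edges:
        # An edge is independent if l + 1 pebbles can be gathered on its ends.
        while free[u] + free[v] < l + 1:
            if not (collect(u, v) or collect(v, u)):
                break
        if free[u] + free[v] >= l + 1:
            owner, other = (u, v) if free[u] > 0 else (v, u)
            free[owner] -= 1
            out[owner].append(other)
            independent.append((u, v))
        else:
            redundant.append((u, v))
    return independent, redundant


def internal_dof(n_vertices, independent, k=2, l=3):
    """Internal degrees of freedom left after removing the l rigid-body motions."""
    return k * n_vertices - len(independent) - l


# Toy graphs (hypothetical): a four-membered ring and the same ring with both diagonals.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
braced = square + [(0, 2), (1, 3)]

ind_sq, red_sq = pebble_game(4, square)
ind_br, red_br = pebble_game(4, braced)
print(internal_dof(4, ind_sq), red_sq)   # 1 []        -> one hinge-like motion remains
print(internal_dof(4, ind_br), red_br)   # 0 [(1, 3)]  -> rigid, one redundant constraint
```

In this toy setting, the ring retains one internal degree of freedom, while the doubly braced ring is rigid and overconstrained by one bar. Identifying where the remaining degrees of freedom and the redundant constraints sit is what yields the spatial distribution of flexible and rigid regions.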
The Constraint Network Analysis (CNA) approach [93] aims at linking information from rigidity analysis with biomolecular structure, (thermo-)stability, and function. CNA functions as a front- and back-end to FIRST, for which the C++-based CNA interface module pyFIRST was developed. In this way, the computational efficiency of FIRST is preserved in CNA-driven computations [93]. Going beyond the mere identification of flexible and rigid regions in a biomolecule, CNA allows for I) performing constraint dilution simulations that consider a temperature dependence of non-covalent interactions [67], II) computing a comprehensive set of global and local indices for quantifying biomolecular stability [86], and III) performing rigidity analysis on ensembles of network topologies (ENT). For the latter, structural ensembles [19] and ensembles based on the concept of fuzzy non-covalent constraints (ENTFNC) can be used [96]. To facilitate the processing of the highly information-rich results obtained from CNA, the VisualCNA plugin [116] for PyMOL and the CNA web server [94] have been developed.
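The idea behind a constraint dilution simulation can be sketched as follows. The sketch assumes that each non-covalent constraint carries an energy and that the network is diluted by progressively lowering an energy cutoff, so that weaker constraints are removed first. The data layout, the cutoff values, and the function names are hypothetical illustrations, not the CNA interface. The stability index values passed to `transition_point` are placeholders for the output of a rigidity analysis run at each dilution step.

```python
def dilute(constraints, e_cut):
    """Keep only constraints at least as strong (as negative in energy) as e_cut.

    `constraints` is a list of (atom_i, atom_j, energy) tuples (hypothetical layout).
    """
    return [c for c in constraints if c[2] <= e_cut]


def dilution_trajectory(constraints, cutoffs):
    """Return the list of retained constraints for each cutoff, in the given order."""
    return [dilute(constraints, e_cut) for e_cut in cutoffs]


def transition_point(cutoffs, index_values):
    """Return the cutoff at which a stability index drops most between two steps."""
    drops = [index_values[i] - index_values[i + 1]
             for i in range(len(index_values) - 1)]
    step = max(range(len(drops)), key=drops.__getitem__)
    return cutoffs[step + 1]


# Invented toy data: four hydrogen bonds with energies in kcal/mol.
hbonds = [(0, 5, -5.2), (1, 7, -3.1), (2, 9, -0.8), (3, 4, -1.9)]
cutoffs = [-0.5, -1.0, -2.0, -4.0, -6.0]

for e_cut, kept in zip(cutoffs, dilution_trajectory(hbonds, cutoffs)):
    print(e_cut, len(kept))

# Placeholder index values, one per cutoff, standing in for a global
# stability index that would come from a rigidity analysis of each network.
toy_index = [1.0, 0.95, 0.9, 0.3, 0.1]
print(transition_point(cutoffs, toy_index))
```

Locating the largest drop is only one simple way to flag a transition along the trajectory; the global and local indices mentioned above offer a more detailed characterization.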
Monitoring the decay of network rigidity along a constraint dilution trajectory helps to improve the understanding of the relationship between biomolecular structure, activity, and thermostability. CNA was successfully applied to a variety of tasks, ranging from the comparison of proteins from mesophilic and thermophilic organisms [37,54] to series of orthologs [68,125] to variants with only a few substitutions [117,129]. From these studies, we provided direct evidence for the “principle of corresponding states”, according to which mesophilic and thermophilic homologs have similar flexibility and rigidity characteristics at their respective optimal growth temperatures [54]. We also obtained good to very good correlations between predicted and experimental thermostability values [67,117] and emphasized the contribution of interface stability to the thermostability of multimeric structures [125]. For prospective studies, we developed a strategy to predict amino acid substitutions optimal for thermostability improvement [129]. The strategy combines a structural ensemble-based weak spot prediction for the wild type protein by CNA, filtering of weak spots according to sequence conservation, computational site saturation mutagenesis, assessment of the structural quality of the variant structures, and screening of the variants for increased structural rigidity by ENTFNC-based CNA.
Allostery is the process by which biomolecules transmit the effect of binding at one site to another, often distal, functional site. Due to the non-local character of rigidity percolation, rigidity analysis provides insights into how altered stability due to binding of an allosteric effector affects sites all across the network. Such a long-range effect was first demonstrated for the protein-protein complex Ras/Raf [19]. Inspired by this observation, a computationally highly efficient approximation of changes in the vibrational entropy (ΔSvib) upon binding to biomolecules, based on rigidity theory, has been introduced recently [146]. Compared to state-of-the-art computational methods for computing ΔSvib, this approach yields significant and good to fair correlations for datasets of protein-protein and protein-small molecule complexes as well as in alanine scanning. More recently, an ensemble-based perturbation approach has been introduced for gaining a deeper structure-based understanding of the relationship between changes in static properties and allosteric signal transmission in biomolecules [158]. By applying a free energy perturbation approach to results of rigidity analysis, free energies of cooperativity and pathways of allosteric signaling are computed. The approach was successfully applied to biomolecules showing ligand-based K- and V-type allostery, respectively, and yielded free energies of cooperativity for binding of the allosteric and orthosteric ligands that agree with the underlying mechanisms of negative and positive cooperativity. For nucleic acid systems, we proposed an allosteric signal transmission pathway within the large ribosomal subunit [40], which was later confirmed by two independent experimental studies. In another study, we used FIRST to investigate the interplay between the ligand binding site, tertiary loop-loop interactions, and the switching sequence in the aptamer domain of the guanine-sensing riboswitch [156]. Our findings suggest that the distant tertiary interactions and the ligand binding cooperatively stabilize the P1 region and in this way influence the regulation of genes.
Modeling conformational transitions of macromolecules is computationally challenging. Coarse-grained normal mode approaches based on elastic network theory have emerged as efficient alternatives for investigating large-scale conformational changes. It has been shown that functionally important conformational changes of proteins can be described by low-frequency normal modes, which are robust and insensitive to higher coarse-graining [52]. Accordingly, we have introduced NMSim, a three-step approach for multi-scale modeling of macromolecular conformational changes. The first two steps are based on recent developments in rigidity and elastic network theory (together termed Rigid Cluster Normal Mode Analysis, RCNMA) [24,26]. In the final step, the recently introduced idea of constrained geometric simulations of diffusive motions in proteins is extended: new macromolecule conformers are generated by deforming the structure along low-energy normal mode directions predicted by RCNMA plus random direction components [57]. NMSim has been used to sample large-scale domain motions during phosphate group transfer in the pyruvate phosphate dikinase (PPDK). From this, a previously unknown intermediate state of PPDK has been identified, which was confirmed by X-ray crystallography [145,152]. In connection with quantitative FRET studies and integrative structure modeling, NMSim has been used for unbiased and FRET-guided generation of structural ensembles [139]. NMSim is accessible via a web server [75].
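The principle of deforming a structure along low-frequency normal modes plus a random component can be sketched with a plain anisotropic elastic network model. This is a simplified illustration, not RCNMA or NMSim: it omits the rigid-cluster coarse-graining and the constrained geometric simulation step that would restore the stereochemistry of the deformed structure. The cutoff, force constant, step size, random weight, and function names are all assumptions chosen for the example, and the coordinates are random points rather than a real structure.

```python
import numpy as np


def anm_hessian(coords, cutoff=10.0, gamma=1.0):
    """Hessian of an anisotropic elastic network with uniform springs (assumed model)."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def low_frequency_modes(hessian, n_modes=3, n_trivial=6):
    """Return the lowest non-trivial eigenvalues and modes (skips rigid-body modes)."""
    eigvals, eigvecs = np.linalg.eigh(hessian)   # eigenvalues in ascending order
    sel = slice(n_trivial, n_trivial + n_modes)
    return eigvals[sel], eigvecs[:, sel]


def perturb_along_modes(coords, modes, step=0.5, random_weight=0.2, rng=None):
    """Displace coords by `step` along a mix of low modes and a random direction."""
    rng = np.random.default_rng() if rng is None else rng
    mode_dir = modes @ rng.normal(size=modes.shape[1])
    mode_dir /= np.linalg.norm(mode_dir)
    noise = rng.normal(size=mode_dir.shape)
    noise /= np.linalg.norm(noise)
    direction = (1.0 - random_weight) * mode_dir + random_weight * noise
    direction /= np.linalg.norm(direction)
    return coords + step * direction.reshape(coords.shape)


# Toy input: 12 random points in a 10 x 10 x 10 box stand in for C-alpha atoms.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(12, 3))
hessian = anm_hessian(coords, cutoff=20.0)       # cutoff large enough to connect all pairs
eigvals, modes = low_frequency_modes(hessian)
new_coords = perturb_along_modes(coords, modes, step=0.5, rng=rng)
print(new_coords.shape)
```

Repeating the perturbation step, each time followed by a structure correction, would yield a trajectory of conformers; in the sketch, only a single uncorrected step is shown.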