It has become increasingly evident that the conformational distributions of intrinsically disordered proteins or regions are strongly dependent on their amino acid composition and sequence. To facilitate a systematic investigation of these sequence-ensemble relationships, we selected a set of 16 naturally occurring intrinsically disordered regions of identical length but with large differences in amino acid composition, hydrophobicity, and charge patterning. We probed their conformational ensembles with single-molecule Förster resonance energy transfer (FRET), complemented by circular dichroism (CD) and nuclear magnetic resonance (NMR) spectroscopy as well as small-angle X-ray scattering (SAXS). The set of disordered proteins shows a strong dependence of the chain dimensions on sequence composition, with chain volumes differing by up to a factor of 6. The residue-specific intrachain interaction networks that underlie these pronounced differences were identified using atomistic simulations combined with ensemble reweighting, revealing the important role of charged, aromatic, and polar residues. To advance a transferable description of disordered protein regions, we further employed the experimental data to parametrize a coarse-grained model for disordered proteins that includes an explicit representation of the FRET fluorophores and successfully describes experiments with different dye pairs. Our findings demonstrate the value of integrating experiments and simulations for advancing our quantitative understanding of the sequence features that determine the conformational ensembles of intrinsically disordered proteins.
© 2024 The Authors. Published by American Chemical Society.