Protein function depends on charge interactions and charge-biased regions, which are involved in a wide range of cellular and biochemical processes. We report the development of a new algorithm, implemented in Python, and its use to identify charge clusters (CC), classified as negative (NCC), positive (PCC) and mixed (MCC), and to compare their occurrence in mitochondrial proteins across plant groups. To characterize the resulting CC, statistical, structural and functional analyses were conducted. Screening of 105 399 protein sequences showed that 2.6 %, 0.48 % and 0.03 % of the proteins contain NCC, PCC and MCC, respectively. Mitochondrial proteins encoded by the nuclear genome of green algae have the highest proportions of both PCC (1.6 %) and MCC (0.4 %), whereas mitochondrial proteins encoded by the nuclear genome of the "other plants" group have the highest proportion of NCC (7.5 %). Mapping of the identified CC showed that they are mainly located in the terminal regions of the proteins. Annotation showed that proteins with CC are classified as binding proteins, are involved in transmembrane transport processes, and are mainly located in the membrane. Scanning of the CC revealed 2373 sites and 192 motif profiles within NCC, and 784 sites and 149 motif profiles within PCC. The investigation of CC within pentatricopeptide repeat-containing proteins revealed that they are involved in correct and specific RNA editing. CC were shown to play a key role in providing structural and functional information on complex protein assemblies, which could be useful in biotechnological applications.
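The abstract does not describe how the detection algorithm works. As a purely illustrative sketch, and not the authors' implementation, a charge cluster scan could be written as a fixed-length sliding window that classifies each window by its fraction of charged residues and merges overlapping hits of the same type. The function names, the window length of 10, the 0.6 and 0.2 thresholds, and the choice to count only K/R as positive and D/E as negative are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumption: only Lys/Arg count as positive and Asp/Glu as negative;
# histidine is treated as uncharged in this sketch.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def classify_window(window: str, min_fraction: float = 0.6,
                    min_minor_fraction: float = 0.2) -> str:
    """Return 'NCC', 'PCC', 'MCC' or '' for one window (hypothetical thresholds)."""
    n = len(window)
    pos = sum(aa in POSITIVE for aa in window)
    neg = sum(aa in NEGATIVE for aa in window)
    if neg / n >= min_fraction:
        return "NCC"
    if pos / n >= min_fraction:
        return "PCC"
    # Mixed cluster: high total charge with a meaningful share of each sign.
    if (pos + neg) / n >= min_fraction and min(pos, neg) / n >= min_minor_fraction:
        return "MCC"
    return ""


def find_charge_clusters(seq: str, window: int = 10,
                         min_fraction: float = 0.6) -> List[Tuple[str, int, int]]:
    """Scan seq and merge overlapping windows of the same cluster type.

    Returns (type, start, end) tuples with 0-based, end-exclusive coordinates.
    """
    clusters: List[Tuple[str, int, int]] = []
    for start in range(len(seq) - window + 1):
        kind = classify_window(seq[start:start + window], min_fraction)
        if not kind:
            continue
        end = start + window
        # Extend the previous cluster if it has the same type and overlaps.
        if clusters and clusters[-1][0] == kind and start <= clusters[-1][2]:
            clusters[-1] = (kind, clusters[-1][1], end)
        else:
            clusters.append((kind, start, end))
    return clusters
```

Under these assumptions, calling `find_charge_clusters` on each sequence of a proteome and counting the sequences with at least one NCC, PCC or MCC hit would yield per-type proportions of the kind reported above; the actual values depend entirely on the window length and thresholds chosen.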
Keywords: Charged Clusters; Functional study; Green algae; Land plants; Mitochondrial proteins; Python detection algorithm.
Copyright © 2024 Elsevier B.V. and Mitochondria Research Society. All rights reserved.