Lytic polysaccharide monooxygenases (LPMOs) of family AA9 catalyze the oxidative cleavage of glycosidic bonds in cellulose and related polysaccharides. The N-terminal half of AA9 LPMOs displays a very high sequence variability that contrasts with the substrate simplicity so far observed for these enzymes. To understand the cause of the high multigenicity that prevails in the family, we performed a clustering analysis of the N-terminal region of 3400 sequences of family AA9 LPMOs, and evaluated how well the clusters coincide with distal visible features that may accompany functional differences. A method based on local pairwise alignments was devised to avoid the pitfalls of a global multiple alignment. Our analysis allowed the definition of 64 clusters, which successfully segregated several visible features associated with LPMO family AA9, such as the presence of carbohydrate-binding modules, of modules of unknown function, and of the conspicuous H → R substitution at the first residue of the histidine brace that holds the catalytic copper. Our analysis shows that the hypervariability of the N-terminal half of the AA9 sequences is not driven by random evolution, as sequence similarity does not follow taxonomy alone. The results suggest that some clusters may be able to target chitin instead of cellulose, and that preference for C1 or C4 oxidation (or lack thereof) does not appear to constitute a strong evolutionary constraint. From an evolutionary standpoint, there seem to be few constraints on the N-terminal half of the sequences other than the conservation of the histidine brace. The weak evolutionary constraints that apply to the N-terminal half of AA9 LPMOs explain both their hypervariability and multigenicity.
Keywords: Bioinformatics; Evolution; Fungal lifestyle; Lytic polysaccharide monooxygenases; Modular structure; Regioselectivity.
Copyright © 2017 Elsevier Ltd. All rights reserved.