Proteins are the building blocks of life. While proteins and their localization within cells and sub-cellular compartments are well defined, the proteins predicted to be secreted to form the extracellular matrix, or matrisome, remain elusive in the model organism C. elegans. Here, we used a bioinformatic approach combining gene orthology and protein structure analysis with an extensive curation of the literature to define the C. elegans matrisome. We found that, as in the human genome, ~4% of C. elegans genes (719 out of ~20,000) encode matrisome proteins, including 181 collagens, 35 glycoproteins, 10 proteoglycans, and 493 matrisome-associated proteins. We report that 173 of the 181 collagen genes are unique to nematodes and are predicted to encode cuticular collagens, which we propose to group into five clusters. To facilitate the use of our lists and classification by the scientific community, we developed an automated annotation tool to identify ECM components in large datasets. We also established a novel database of all C. elegans collagens (CeColDB). Last, we provide examples of how the newly defined C. elegans matrisome can be used for annotation and gene ontology analyses of transcriptomic, proteomic, and RNAi screening data. Because C. elegans is a widely used model organism for high-throughput genetic and drug screens and for studying biological and pathological processes, the conserved matrisome genes may aid in identifying potential drug targets. In addition, the nematode-specific matrisome may be exploited to target parasitic infections of humans and crops.
Keywords: Basement membrane; Collagen; Cuticle; Extracellular matrix; Nematode.
© 2018 The Author(s).