Complete variable domain sequences of monoclonal antibody light chains identified from untargeted RNA sequencing data

Front Immunol. 2023 Apr 18:14:1167235. doi: 10.3389/fimmu.2023.1167235. eCollection 2023.

Abstract

Introduction: Monoclonal antibody light chain proteins secreted by clonal plasma cells cause tissue damage through amyloid deposition and other mechanisms. The unique protein sequence associated with each case contributes to the diversity of clinical features observed in patients. Extensive work has characterized many light chains associated with multiple myeloma, light chain amyloidosis and other disorders, which we have collected in the publicly accessible database AL-Base. However, light chain sequence diversity makes it difficult to determine the contribution of specific amino acid changes to pathology. Sequences of light chains associated with multiple myeloma provide a useful comparison for studying mechanisms of light chain aggregation, but relatively few monoclonal sequences have been determined. Therefore, we sought to identify complete light chain sequences from existing high-throughput sequencing data.

Methods: We developed a computational approach using the MiXCR suite of tools to extract complete rearranged IGVL-IGJL sequences from untargeted RNA sequencing data. This method was applied to whole-transcriptome RNA sequencing data from 766 newly diagnosed patients in the Multiple Myeloma Research Foundation CoMMpass study.

Results: An IGVL-IGJL sequence was defined as monoclonal when >50% of the assigned IGK or IGL reads from a sample mapped to that single sequence. Clonal light chain sequences were identified in 705/766 samples from the CoMMpass study. Of these, 685 sequences covered the complete IGVL-IGJL region. The identities of the assigned sequences are consistent with their associated clinical data and with partial sequences previously determined from the same cohort of samples. Sequences have been deposited in AL-Base.
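
The >50% clonality criterion can be illustrated with a short Python sketch. This is not the authors' pipeline: the function name `call_monoclonal`, the input format (a mapping from each distinct light chain sequence to its read count in one sample) and the example counts are hypothetical. The sketch also assumes that IGK and IGL reads are pooled into a single denominator, which the abstract does not specify.

```python
from typing import Dict, Optional, Tuple


def call_monoclonal(
    read_counts: Dict[str, int], threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (sequence, read fraction) for the dominant light chain, or None.

    read_counts maps each distinct IGVL-IGJL sequence found in one sample
    to the number of IGK/IGL reads assigned to it (hypothetical input format).
    A sequence is called monoclonal only if its fraction is strictly greater
    than the threshold, mirroring the ">50%" wording of the abstract.
    """
    total = sum(read_counts.values())
    if total == 0:
        # No light chain reads were assigned, so no call can be made.
        return None
    # Pick the sequence with the most assigned reads.
    sequence, count = max(read_counts.items(), key=lambda item: item[1])
    fraction = count / total
    if fraction > threshold:
        return sequence, fraction
    return None


# Illustrative, made-up read counts for two samples.
clonal_sample = {"CLONE_A": 900, "CLONE_B": 60, "CLONE_C": 40}
polyclonal_sample = {"CLONE_A": 40, "CLONE_B": 35, "CLONE_C": 25}

print(call_monoclonal(clonal_sample))
print(call_monoclonal(polyclonal_sample))
```

With the strict inequality, a sample whose top sequence holds exactly half of the reads is not called monoclonal. At most one sequence per sample can pass the threshold.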

Discussion: Our method allows routine identification of clonal antibody sequences from RNA sequencing data collected for gene expression studies. The sequences identified represent, to our knowledge, the largest collection of multiple myeloma-associated light chains reported to date. This work substantially increases the number of monoclonal light chains known to be associated with non-amyloid plasma cell disorders and will facilitate studies of light chain pathology.

Keywords: AL amyloidosis; MiXCR; antibody light chain; antibody repertoire sequencing; antibody sequence; monoclonal gammopathy; multiple myeloma; plasma cell dyscrasia.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antibodies, Monoclonal / genetics
  • Humans
  • Immunoglobulin Light Chains / genetics
  • Immunoglobulin Light Chains / metabolism
  • Multiple Myeloma*
  • RNA
  • Sequence Analysis, RNA

Substances

  • RNA
  • Immunoglobulin Light Chains
  • Antibodies, Monoclonal

Grants and funding

This work was supported by funding from the American Cancer Society via an Institutional Research Grant to the Boston University (BU) Cancer Center (IRG-17-176-39); the BU Clinical and Translational Sciences Institute, via NIH award 1UL1TR001430; the BU Bioinformatics Program and the BU Genome Science Institute via the Bioinformatics Masters Summer Internship Program; The Wildflower Foundation; the Karin Grunebaum Cancer Research Foundation; and the BU Amyloid Research Fund.