Background Cancer testis antigens (CTAs) are tumor antigens that have highly tissue-restricted expression but are often expressed in diverse malignancies. Because they are highly immunogenic and their expression is largely limited to tumor cells, CTAs have become a prime target for cancer vaccination and T-cell-based therapy with chimeric T-cell receptors. In this study, we investigated the landscape of 17 CTAs (NY-ESO-1, LAGE-1A, and 15 other CTAs) in the context of the tumor immune microenvironment of real-world clinical tumors spanning multiple histologies.

Methods RNA-seq was performed on 5450 FFPE tumors, and the expression of each of the 17 CTAs was classified as Positive (nRPM≥20) or Negative (nRPM<20). Pearson correlation analysis was conducted on the nRPM values of each CTA to determine co-expression relationships among the 17 CTAs. To visualize patterns in the CTA expression landscape, a heatmap was generated using hierarchical clustering, with Pearson’s correlation as the distance measure, to reveal patterns in CTA status across all samples and CTAs.
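The three analysis steps described in the Methods (thresholding, pairwise Pearson correlation, and correlation-based hierarchical clustering) can be illustrated with a minimal sketch. This is not the pipeline used in the study. The names `CTA_A` to `CTA_D` and their nRPM values are hypothetical toy data, and average linkage with distance 1 - r is an assumption, since the abstract does not state the linkage method.

```python
import math

POSITIVE_THRESHOLD = 20.0  # nRPM cutoff stated in the Methods


def classify(nrpm):
    """Label one nRPM value as Positive (>= 20) or Negative (< 20)."""
    return "Positive" if nrpm >= POSITIVE_THRESHOLD else "Negative"


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster(expr, n_clusters):
    """Agglomerative clustering of CTAs using distance 1 - r.

    Average linkage is an assumption made for this sketch.
    """
    names = list(expr)
    dist = {(a, b): 1.0 - pearson(expr[a], expr[b]) for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical nRPM values for four placeholder CTAs across five samples.
expr = {
    "CTA_A": [5, 30, 80, 2, 45],
    "CTA_B": [4, 28, 90, 1, 50],
    "CTA_C": [60, 3, 1, 70, 2],
    "CTA_D": [55, 5, 2, 65, 4],
}

status = {cta: [classify(v) for v in values] for cta, values in expr.items()}
groups = cluster(expr, n_clusters=2)
```

In this toy input, `CTA_A` and `CTA_B` rise and fall together, as do `CTA_C` and `CTA_D`, so the two pairs fall into separate groups. In practice a library routine such as SciPy's hierarchical clustering would replace the hand-written loop.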

Results The 5450 tumor samples analyzed in this study spanned 39 histologic types and were predominantly lung cancer (40.4%), followed by colorectal cancer (10.6%) and breast cancer (8.6%). Positive CTA prevalence ranged from 2.4% (GAGE13) to 31.5% (XAGE1B). Expression was significantly correlated across most CTA pairs; only GAGE10, XAGE1B, MLANA, MAGEA4, GAGE13, and SSX2 lacked a significant correlation with at least one other CTA. Three key groups of co-expressed CTAs were observed: 1) NY-ESO-1, LAGE-1A, MAGEA12, MAGEA3, MAGEA1, MAGEA10, and MAGEA4 (0.36≤R≤0.82); 2) GAGE12J, GAGE2, GAGE1, and GAGE13 (0.58≤R≤0.72); 3) SSX2, BAGE, and MAGEC2 (0.4≤R≤0.58). The three remaining CTAs (GAGE10, XAGE1B, and MLANA) had little or no correlation (R<0.22) with each other or with any other CTA. Clustering CTAs across all samples revealed three CTA expression clusters: 1) samples that express multiple CTAs; 2) samples that express mostly XAGE1B, in which lung cancer was over-represented (p=1.51e-296); 3) samples that express mostly GAGE10, in which neuroendocrine tumors were over-represented (p=1.64e-05).
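The abstract does not say which test produced the over-representation p-values. One common choice for this kind of question is a one-sided hypergeometric test, sketched here as an assumption rather than as the method used in the study. The function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: total number of samples
    K: samples of the histology of interest
    n: samples in the cluster
    k: samples in the cluster that are of that histology
    """
    total = comb(N, n)
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical counts: 20 samples, 8 of one histology,
# and a cluster of 5 samples containing 4 of that histology.
p_value = hypergeom_upper_tail(k=4, K=8, n=5, N=20)
```

A small p-value here means the histology appears in the cluster more often than expected if cluster membership were independent of histology.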

Conclusions Across multiple cancer subtypes, the expression of a CTA occurs in the context of other CTAs, and specific groups of CTAs are likely to be co-expressed, forming expression patterns characteristic of tumor subgroups. These findings provide a scientific basis for selecting appropriate CTAs and designing multiplex vaccination in immunotherapies for a variety of tumors. Further studies are needed to understand the relationship of these CTAs with traditional and emerging immuno-oncology biomarkers.