Status |
Public on Dec 16, 2019 |
Title |
A large peptidome dataset improves HLA class I epitope prediction across most of the human population |
Organism |
Homo sapiens |
Experiment type |
Expression profiling by high throughput sequencing
|
Summary |
Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines.
|
Overall design |
RNA transcript expression was quantified for mono-allelic HLA-C cell lines (HLA-C*04:01 and HLA-C*07:01, 4 replicates each) to assess the contribution of transcript expression to HLA class I peptide prediction and to compare it against HLA-A and HLA-B.
|
Contributor(s) |
Wu CJ, Keskin DB |
Citation(s) |
31844290 |
|
Submission date |
May 15, 2019 |
Last update date |
Mar 16, 2020 |
Contact name |
Nir Hacohen |
Organization name |
Broad Institute
|
Lab |
Nir Hacohen
|
Street address |
415 Main Street
|
City |
Cambridge |
State/province |
MA |
ZIP/Postal code |
02142 |
Country |
USA |
|
|
Platforms (1) |
|
Samples (8)
|
|
Relations |
BioProject |
PRJNA543098 |
SRA |
SRP198497 |
Supplementary file | Size | Download | File type/resource |
GSE131267_RAW.tar | 9.4 Mb | (http)(custom) | TAR (of TXT) |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |