Status |
Public on Feb 16, 2021 |
Title |
Synthetic spike-in controls enable sensitive and reproducible cell-free methylome interrogation |
Organisms |
Homo sapiens; synthetic construct |
Experiment type |
Methylation profiling by high throughput sequencing
|
Summary |
Background. The cell-free methylated DNA immunoprecipitation-sequencing (cfMeDIP-seq) method is adapted to work with low-input DNA and with circulating cell-free DNA (cfDNA). This method allows for epigenetic profiling from liquid biopsy samples, providing potential information about tissue of origin. Similar to classical immunoprecipitation-based enrichment protocols, interpretation requires a reference or control to draw inference against a composite experimental baseline and against designed standards, allowing for cross-experiment comparisons.

Methods. To meet the need for a reference control in cfMeDIP-seq experiments, we designed spike-in controls and integrated the use of unique molecular indexes (UMIs) to adjust for polymerase chain reaction (PCR) bias and for immunoprecipitation bias caused by the fragment length, G+C content, and CpG density of the DNA fragments. This enables absolute quantification of methylated DNA in picomoles, while retaining epigenomic information that allows for sensitive, tissue-specific detection as well as comparable results between different experiments. We designed 54 DNA fragments with combinations of methylation status (methylated and unmethylated), fragment length in base pairs (bp) (80 bp, 160 bp, 320 bp), G+C content (35%, 50%, 65%), and fraction of CpGs within a fragment (1/80 bp, 1/40 bp, 1/20 bp). We checked the spike-in control DNA sequences to ensure they had no cross-alignment to the human genome and minimized formation of secondary structures to avoid issues with amplification. We carried out cfMeDIP-seq on solely spike-in DNA fragments, on spike-in DNA added to sheared HCT116 genomic DNA, or on spike-in DNA added to cfDNA from acute myeloid leukemia (AML) samples to assess technical and biological biases, to determine the optimal amount of spike-in DNA required for an experiment, and to assess batch effects, respectively.

Results. We show that cfMeDIP-seq enriches for highly methylated regions, with less than 0.01% non-specific binding and a preference for DNA fragments with high G+C content and CpG fraction. The use of 0.01 ng of spike-in control DNA results in sufficient sequencing reads to adjust for variance due to fragment length, G+C content, and CpG fraction without negatively impacting the number of sequencing reads generated for each sample. With a known amount of each spike-in control, we generated a generalized linear model that can absolutely quantify molar amount from read counts while adjusting for fragment length, G+C content, and CpG fraction. Using our spike-in controls, we show that we can greatly mitigate batch effects, reducing batch-associated variance in the data to ≤5% of the total variance.

Conclusions. The incorporation of spike-in controls allows for easier interpretation of data generated from cfMeDIP-seq and MeDIP-seq experiments when compared to relative read count. Through the use of a generalized linear model tailored to each experiment, the molar amount for each genomic region can be calculated, greatly mitigating both biological and technical biases in the data. We have created an R package, spiky, to convert read counts to DNA picomoles while adjusting for fragment length, G+C content, and CpG fraction.
|
|
|
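The spike-in design described in the summary is a full factorial over four factors (2 methylation states × 3 lengths × 3 G+C contents × 3 CpG fractions = 54 fragments). The Python sketch below enumerates that grid and shows one common way to convert a double-stranded DNA mass to picomoles. It is illustrative only and is not taken from the spiky package: the names `design_grid`, `ng_to_pmol`, and `AVG_BP_MASS` are hypothetical, and the average mass of 660 g/mol per base pair is a textbook approximation assumed here, not a value stated in this record.

```python
from itertools import product

# Design factors exactly as listed in the summary.
METHYLATION = ("methylated", "unmethylated")
LENGTH_BP = (80, 160, 320)
GC_PERCENT = (35, 50, 65)
CPG_SPACING_BP = (80, 40, 20)  # one CpG per this many bp (1/80, 1/40, 1/20)

# Assumption: average molar mass of one double-stranded base pair, in g/mol.
AVG_BP_MASS = 660.0


def design_grid():
    """Return every combination of the four design factors as a list of dicts."""
    return [
        {
            "methylation": meth,
            "length_bp": length,
            "gc_percent": gc,
            "cpg_spacing_bp": spacing,
        }
        for meth, length, gc, spacing in product(
            METHYLATION, LENGTH_BP, GC_PERCENT, CPG_SPACING_BP
        )
    ]


def ng_to_pmol(mass_ng, length_bp):
    """Convert a dsDNA mass in ng to pmol for fragments of length_bp.

    ng * 1000 gives pg; pg divided by g/mol gives pmol.
    """
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


grid = design_grid()
print(len(grid))  # number of distinct fragment designs
```

For example, `ng_to_pmol(0.01, 160)` gives the molar amount that 0.01 ng of DNA would represent if it consisted only of 160 bp fragments. How the spike-in mass is actually divided among the 54 constructs is not stated in this record, so the sketch makes no assumption about it.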
Overall design |
Evaluation of the use of synthetic spike-in control DNA in cfMeDIP-seq experiments.
>>>Submitter states that raw data for all AML patient samples have been deposited at EGA under study accession EGAS00001005069<<<
|
Web link |
https://doi.org/10.1101/2021.02.12.430289
|
|
|
Contributor(s) |
Wilson SL, Shen SY, De Carvalho DD, Hoffman MM |
Citation(s) |
PMID: 36160046 |
|
Submission date |
Feb 05, 2021 |
Last update date |
Oct 04, 2022 |
Contact name |
Michael Hoffman |
E-mail(s) |
[email protected]
|
Organization name |
Princess Margaret Cancer Centre
|
Department |
Research
|
Lab |
Hoffman lab
|
Street address |
101 College St.
|
City |
Toronto |
State/province |
ON |
ZIP/Postal code |
M5G 1L7 |
Country |
Canada |
|
|
Platforms (3) |
GPL17769 |
Illumina MiSeq (synthetic construct) |
GPL21697 |
NextSeq 550 (Homo sapiens) |
GPL24676 |
Illumina NovaSeq 6000 (Homo sapiens) |
|
Samples (25)
|
GSM5067060 |
6428_Synthetic DNA only. Output. |
GSM5067061 |
6543_0.1ng of synthetic DNA. 10 ng of HCT116. |
GSM5067062 |
6544_0.1ng of synthetic DNA. 10 ng of HCT116. |
GSM5067063 |
6545_0.05ng of synthetic DNA. 10ng of HCT116. |
GSM5067064 |
6546_0.05ng of synthetic DNA. 10ng of HCT116. |
GSM5067065 |
6547_0.01ng of synthetic DNA. 10ng of HCT116. |
GSM5067066 |
6548_0.01ng of synthetic DNA. 10ng of HCT116. |
GSM5067067 |
6654_0.01ng of synthetic DNA in 10ng of AML sample: Patient 1 |
GSM5067068 |
6655_0.01ng of synthetic DNA in 10ng of AML sample: Patient 2 |
GSM5067069 |
6656_0.01ng of synthetic DNA in 10ng of AML sample: Patient 3 |
GSM5067070 |
6657_0.01ng of synthetic DNA in 10ng of AML sample: Patient 4 |
GSM5067071 |
6658_0.01ng of synthetic DNA in 10ng of AML sample: Patient 5 |
GSM5067072 |
A29_0.01ng of synthetic DNA in 10ng of AML sample: Patient 1 |
GSM5067073 |
A35_0.01ng of synthetic DNA in 10ng of AML sample: Patient 2 |
GSM5067074 |
A37_0.01ng of synthetic DNA in 10ng of AML sample: Patient 3 |
GSM5067075 |
A56_0.01ng of synthetic DNA in 10ng of AML sample: Patient 4 |
GSM5067076 |
A64_0.01ng of synthetic DNA in 10ng of AML sample: Patient 5 |
GSM5067077 |
A29_DT_0.01ng of synthetic DNA in 10ng of AML sample: Patient 1 |
GSM5067078 |
A35_DT_0.01ng of synthetic DNA in 10ng of AML sample: Patient 2 |
GSM5067079 |
A37_DT_0.01ng of synthetic DNA in 10ng of AML sample: Patient 3 |
GSM5067080 |
A56_DT_0.01ng of synthetic DNA in 10ng of AML sample: Patient 4 |
GSM5067081 |
A64_DT_0.01ng of synthetic DNA in 10ng of AML sample: Patient 5 |
|
Relations |
BioProject |
PRJNA699882 |
SRA |
SRP304910 |
Supplementary file | Size | Download | File type/resource |
GSE166259_RAW.tar | 28.8 Gb | (http)(custom) | TAR (of BED) |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |