Status |
Public on Apr 26, 2021 |
Title |
Leveraging histone modifications to improve genome annotation |
Organisms |
Glycine max; Zea mays |
Experiment type |
Genome binding/occupancy profiling by high throughput sequencing
|
Summary |
With the creation of accurate, chromosome-scale genomes, the next challenge facing the genomics community is the accurate identification of transcriptional units, distinguishing them from aberrant transcriptional noise. This has proven to be a challenge, as annotation by traditional means, such as short-read RNA-seq followed by transcriptome assembly, is prone to the generation of in silico artifacts. To address this issue, we took advantage of epigenomic data in the form of ChIP-seq to annotate plant genomes in an unbiased manner, identify potential annotation issues, and identify novel genes. Histone modifications appear in the genome in a reproducible and predictable manner, making them an ideal resource to use in annotation. Trimethylation of histone 3 lysine 4 (H3K4me3), as well as acetylation of histone 3 lysine 56 (H3K56ac), are well documented to coincide with initiation of transcription by polymerase II (Pol II) at promoter sequences. These initiation marks, paired with marks deposited across the gene body during transcriptional elongation, such as histone 3 lysine 36 tri-methylation (H3K36me3) and histone 3 lysine 4 mono-methylation (H3K4me1), offer a framework to begin identifying complete transcriptional units. We leveraged these data on a genome-wide scale, allowing for identification of annotations discordant with empirical data. In total, 13,159 potential annotation issues were found in Zea mays across three different tissues, which were corroborated using complementary RNA-based approaches. Upon correction and validation, genes were extended by an average of 2,128 base pairs, and the length of discovered novel genes was 1,962 base pairs. Application of this method to five additional plant genomes revealed a variety of novel gene annotations, including 13,836 in Asparagus officinalis, 2,724 in Setaria viridis, 2,446 in Sorghum bicolor, 8,631 in Glycine max, and 2,585 in Phaseolus vulgaris.
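The idea of flagging annotations that are discordant with histone-mark data can be illustrated with a minimal sketch. This is not the submitters' pipeline: the `Interval` type, the `overhang` and `flag_discordant` helpers, and the 500 bp threshold are all hypothetical choices made for illustration. The sketch assumes that elongation-mark domains (for example H3K36me3 or H3K4me1) have already been called as intervals, and it reports annotated genes whose overlapping domain extends well past the annotated boundaries.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def overhang(gene: Interval, domain: Interval) -> int:
    """Bases of a mark domain that lie outside an overlapping gene.

    Returns 0 when the two intervals do not overlap or when the
    domain is fully contained in the gene.
    """
    if gene.chrom != domain.chrom:
        return 0
    if domain.end <= gene.start or domain.start >= gene.end:
        return 0
    left = max(0, gene.start - domain.start)
    right = max(0, domain.end - gene.end)
    return left + right


def flag_discordant(
    genes: Dict[str, Interval],
    domains: Iterable[Interval],
    min_overhang: int = 500,  # hypothetical threshold, not from the record
) -> List[Tuple[str, int]]:
    """List (gene name, largest overhang) for genes at or above the threshold."""
    domains = list(domains)
    flagged = []
    for name, gene in genes.items():
        worst = max((overhang(gene, d) for d in domains), default=0)
        if worst >= min_overhang:
            flagged.append((name, worst))
    return flagged
```

A quadratic scan like this is only suitable for small examples; a genome-scale comparison would normally sort the intervals or use an interval index.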
|
|
|
Overall design |
ChIP-seq of histone modifications H3K36me3, H3K4me1, and H3K56ac for Glycine max. ChIP-seq for H3K4me1 of Zea mays ear.
|
|
|
Contributor(s) |
Medieta P, Schmitz R, Zhang X |
Citation(s) |
34568920 |
|
Submission date |
Nov 05, 2020 |
Last update date |
Oct 06, 2021 |
Contact name |
Robert J Schmitz |
E-mail(s) |
[email protected]
|
Organization name |
University of Georgia
|
Department |
Genetics
|
Street address |
B416 Davison Life Sciences
|
City |
Athens |
State/province |
GA |
ZIP/Postal code |
30602 |
Country |
USA |
|
|
Platforms (4)
|
GPL20156 |
Illumina NextSeq 500 (Zea mays) |
GPL22410 |
Illumina NextSeq 500 (Glycine max) |
GPL25410 |
Illumina NovaSeq 6000 (Zea mays) |
GPL28801 |
Illumina NovaSeq 6000 (Glycine max) |
|
Samples (6)
|
|
Relations |
BioProject |
PRJNA674873 |
SRA |
SRP291339 |
Supplementary file | Size | Download | File type/resource |
GSE160944_RAW.tar | 1.2 Mb | (http)(custom) | TAR (of BED) |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
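Since the processed data are distributed as a TAR archive of BED files, a short sketch of reading them may be useful. It uses only the Python standard library. Two points are assumptions rather than facts from this record: that archive members may be gzip-compressed (detected here by a `.gz` suffix), and that only the first three BED columns (chromosome, start, end) are needed. The helper names `parse_bed` and `read_bed_members` are hypothetical.

```python
import gzip
import io
import tarfile
from typing import Iterable, Iterator, List, Tuple

BedRecord = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield (chrom, start, end) from BED lines, skipping header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2])


def read_bed_members(tar_path: str) -> Iterator[Tuple[str, List[BedRecord]]]:
    """Yield (member name, BED records) for each regular file in the archive."""
    with tarfile.open(tar_path) as archive:
        for member in archive:
            if not member.isfile():
                continue
            raw = archive.extractfile(member).read()
            # Assumption: gzip-compressed members carry a .gz suffix.
            if member.name.endswith(".gz"):
                raw = gzip.decompress(raw)
            text = raw.decode("utf-8")
            yield member.name, list(parse_bed(text.splitlines()))
```

After downloading the archive, calling `read_bed_members("GSE160944_RAW.tar")` would iterate over its members; each file is read fully into memory, which is acceptable for an archive of this size (1.2 Mb).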