|
Status |
Public on Jan 31, 2021 |
Title |
E3 |
Sample type |
SRA |
|
|
Source name |
Myeloma cell lines (500 cells)
|
Organism |
Homo sapiens |
Characteristics |
cell type: Myeloma cell lines
sample: sample2
cell_number: 500
|
Extracted molecule |
polyA RNA |
Extraction protocol |
Samples were processed using the Drop-seq DolomiteBio Nadia encapsulator system. For nanopore sequencing, cDNA was amplified with 25 SMART PCR reactions and sequencing libraries were prepared using the Oxford Nanopore LSK-109 library preparation kit.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
MinION |
|
|
Description |
Myeloma cells
|
Data processing |
We basecalled the raw fast5 data with Guppy (v) from Oxford Nanopore Technologies in GPU mode (guppy_basecaller --compress-fastq -c dna_r9.4.1_450bps_hac.cfg -x "cuda:1"), running on a GTX 1080 Ti graphics card. For each read we identified the barcode and UMI sequence by searching for the polyA region and the flanking regions before and after the barcode/UMI. Accurately sequenced barcodes were identified based on their dual nucleotide complementarity. These unambiguous barcodes were then used as a guide to error-correct the ambiguous barcodes in a second-pass correction step: we performed fuzzy searching using a Levenshtein distance of 4 (unless otherwise stated in the figure legend) and replaced each original ambiguous barcode with the matching unambiguous sequence. A whitelist of barcodes was then generated using UMI-tools whitelist (umi_tools whitelist --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --set-cell-number=1000) [3]. This whitelist was used to assess the cell-to-read-count ratio and as an input for UMI-tools extract. Next, the barcode and UMI sequence of each read were extracted and placed in the read2 header using UMI-tools extract (umi_tools extract --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --whitelist=whitelist.txt). Reads were then aligned with minimap2 [10] (-ax splice -uf --MD --sam-hit-only --junc-bed) against the reference transcriptomes for human (hg38) and mouse (mm10). The resulting SAM file was converted to a BAM file, then sorted and indexed using samtools [11]. The transcript name was then added as an XT tag within the BAM file using pysam. Finally, UMI-tools count (umi_tools count --per-gene --gene-tag=XT --per-cell --double-barcode) was used to count features per cell, and the output was converted to Matrix Market format. We modified UMI-tools count to handle the double nucleotide UMIs as defined below.
This count matrix was then used as an input into the standard Seurat pipeline.
Genome_build: hg38/mm10
Supplementary_files_format_and_content: mtx
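The second-pass barcode correction described above can be illustrated with a minimal Python sketch. This is not the submitters' code. It assumes that a barcode counts as "unambiguous" when every consecutive pair of bases is identical (one reading of "dual nucleotide complementarity"), and that an ambiguous barcode is replaced only when exactly one unambiguous barcode is its closest match within the distance threshold. The function names `levenshtein`, `is_unambiguous` and `correct_barcodes`, and the tie-breaking rule, are hypothetical.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # substitution (free if bases match)
            ))
        prev = curr
    return prev[-1]


def is_unambiguous(barcode: str) -> bool:
    """Assumption: a barcode is trusted when each two-base block is a
    repeated nucleotide, e.g. 'AATTCCGG'."""
    if len(barcode) == 0 or len(barcode) % 2 != 0:
        return False
    return all(barcode[i] == barcode[i + 1] for i in range(0, len(barcode), 2))


def correct_barcodes(barcodes: List[str], max_distance: int = 4) -> List[Optional[str]]:
    """First pass: collect trusted barcodes. Second pass: replace each
    ambiguous barcode with its unique nearest trusted barcode, or None
    when no trusted barcode is close enough or the best match is tied."""
    trusted = sorted({b for b in barcodes if is_unambiguous(b)})
    corrected: List[Optional[str]] = []
    for barcode in barcodes:
        if is_unambiguous(barcode):
            corrected.append(barcode)
            continue
        scored = sorted((levenshtein(barcode, t), t) for t in trusted)
        unique_best = bool(scored) and (len(scored) == 1 or scored[1][0] > scored[0][0])
        if unique_best and scored[0][0] <= max_distance:
            corrected.append(scored[0][1])
        else:
            corrected.append(None)  # left uncorrected in this sketch
    return corrected
```

The default `max_distance=4` mirrors the Levenshtein distance stated in the record. Comparing every ambiguous barcode against every trusted barcode is quadratic, so a real data set would likely need an indexed or otherwise faster search.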
|
|
|
Submission date |
Jan 10, 2021 |
Last update date |
Feb 01, 2021 |
Contact name |
Adam Cribbs |
E-mail(s) |
[email protected]
|
Organization name |
University of Oxford
|
Department |
NDORMS
|
Street address |
Windmill Road
|
City |
Oxford |
ZIP/Postal code |
OX37LD |
Country |
United Kingdom |
|
|
Platform ID |
GPL24106 |
Series (1) |
GSE162053 |
High throughput error correction using dual nucleotide dimer blocks allows direct single-cell nanopore transcriptome sequencing |
|
Relations |
BioSample |
SAMN17274451 |
SRA |
SRX9816546 |