Here, we scaffolded the domestic Bactrian camel genome assembly (CamBac1, GCF_000767855.1) using our improved dromedary camel genome assembly (CamDro3, GCA_000803125.3) as a reference.
We used CamDro3 in a reference-guided assembly strategy implemented with Ragout v. 2.0 (Kolmogorov et al., 2014) to upgrade the Camelus bactrianus genome assembly (CamBac1, GCF_000767855.1; Wang et al., 2012) to chromosome-level scale. Briefly, we used Progressive Cactus (GitHub commit c4bed56c0cd48d23411038acb9c19bcae054837e; Paten et al., 2011a; Paten et al., 2011b) with default settings to generate HAL (Hierarchical Alignment format) alignments between CamDro3 and CamBac1, and then used Ragout with the refine and small synteny block settings to convert the alignment into a scaffolded FASTA assembly, upgrading the CamBac1 assembly to CamBac2. Before alignment with Progressive Cactus, we repeat-masked CamDro3 with RepeatMasker v. open-4.0.8 (http://www.repeatmasker.org) against the mammal repeats from RepBase RepeatMaskerEdition-20181026 (Jurka et al., 2005). We filled gaps in CamBac2 with GapFiller v. 3.0 (Boetzer & Pirovano, 2012) using default settings and Bowtie (Langmead et al., 2009) as the aligner. The paired-end reads used for gap filling were the original Illumina short reads used in the CamBac1 assembly (SRA accessions: SRR1552325, SRR1552327, SRR1552330, SRR1552336, SRR1552341, SRR1552346, SRR1552347, and SRR1552348), which we trimmed with BBDuk v. 37.76 (https://sourceforge.net/projects/bbmap/) using the following settings: ktrim=r, k=23, mink=11, hdist=1, tpe, tbo, qtrim=rl, trimq=15.
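The read-trimming step described above can be sketched as a short script that assembles one BBDuk command per SRA accession. This is a minimal illustration, not the authors' actual pipeline: the trimming settings and accession list are taken from the text, but the FASTQ file names and the adapter reference file (`adapters.fa`, passed via BBDuk's `ref=` parameter, which `ktrim=r` needs) are assumptions that the description does not specify.

```python
# Trimming settings and accessions are quoted from the description above.
# File names and the adapter reference are hypothetical placeholders.
ACCESSIONS = [
    "SRR1552325", "SRR1552327", "SRR1552330", "SRR1552336",
    "SRR1552341", "SRR1552346", "SRR1552347", "SRR1552348",
]

BBDUK_SETTINGS = [
    "ktrim=r", "k=23", "mink=11", "hdist=1",
    "tpe", "tbo", "qtrim=rl", "trimq=15",
]


def bbduk_command(accession, adapters="adapters.fa"):
    """Build a bbduk.sh argument list for one paired-end library.

    The adapter file and the <accession>_1/_2 naming are assumptions.
    """
    return [
        "bbduk.sh",
        f"in1={accession}_1.fastq.gz",
        f"in2={accession}_2.fastq.gz",
        f"out1={accession}_1.trimmed.fastq.gz",
        f"out2={accession}_2.trimmed.fastq.gz",
        f"ref={adapters}",
        *BBDUK_SETTINGS,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be inspected first.
    for acc in ACCESSIONS:
        print(" ".join(bbduk_command(acc)))
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without BBTools installed; each list could be passed to `subprocess.run` once the paths are adjusted to the local data layout.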