BiSpark: a Spark-based highly scalable aligner for bisulfite sequencing data

DC Field | Value | Language
dc.contributor.author | Soe, Seokjun | -
dc.contributor.author | Park, Yoonjae | -
dc.contributor.author | Chae, Heejoon | -
dc.date.accessioned | 2019-03-12T01:36:08Z | -
dc.date.available | 2019-03-12T10:37:46Z | -
dc.date.issued | 2018-12-10 | -
dc.identifier.citation | BMC Bioinformatics, 19(1):472 | ko_KR
dc.identifier.issn | 1471-2105 | -
dc.identifier.uri | https://hdl.handle.net/10371/146970 | -
dc.description.abstract | Background
Bisulfite sequencing is one of the major high-resolution methods for measuring DNA methylation. Because treatment with sodium bisulfite selectively converts unmethylated cytosines, processing bisulfite-treated sequencing reads requires additional steps that are computationally demanding. However, the dearth of efficient aligners designed for bisulfite-treated sequencing data has become a bottleneck for large-scale DNA methylome analyses.

Results
In this study, we present BiSpark, a highly scalable, efficient, and load-balanced bisulfite aligner designed for processing large volumes of bisulfite sequencing data. We implemented the BiSpark algorithm on Apache Spark, a memory-optimized distributed data processing framework, to achieve maximum data-parallel efficiency. The BiSpark algorithm is designed to support redistribution of imbalanced data to minimize delays in large-scale distributed environments.

Conclusions
Experimental results on methylome datasets show that BiSpark significantly outperforms other state-of-the-art bisulfite sequencing aligners in terms of alignment speed and scalability with respect to dataset size and the number of computing nodes, while providing highly consistent and comparable mapping results.

Availability
The BiSpark software package and its source code are available at https://github.com/bhi-kimlab/BiSpark/.
ko_KR
dc.description.sponsorship | This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP; Ministry of Science, ICT & Future Planning) (No. 2017R1C1B5018165); by the Basic Science Research Program through the NRF, funded by the Ministry of Education (NRF-2016R1D1A1A02937186); by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI15C3224); and by the Sookmyung Women's University Research Grants (1-1703-2032). | ko_KR
dc.language.iso | en | ko_KR
dc.publisher | BioMed Central | ko_KR
dc.subject | DNA methylation | ko_KR
dc.subject | Bisulfite sequencing | ko_KR
dc.subject | Alignment | ko_KR
dc.subject | Apache Spark | ko_KR
dc.title | BiSpark: a Spark-based highly scalable aligner for bisulfite sequencing data | ko_KR
dc.type | Article | ko_KR
dc.contributor.AlternativeAuthor | 소석준 | -
dc.contributor.AlternativeAuthor | 박윤재 | -
dc.contributor.AlternativeAuthor | 채희준 | -
dc.identifier.doi | 10.1186/s12859-018-2498-2 | -
dc.language.rfc3066 | en | -
dc.rights.holder | The Author(s) | -
dc.date.updated | 2018-12-16T04:14:33Z | -
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.