PS-MCL: parallel shotgun coarsened Markov clustering of protein interaction networks
- Authors: Lim, Yongsub; Yu, Injae; Seo, Dongmin; Kang, U; Sael, Lee
- Issue Date
- Publisher: BioMed Central
- Citation: BMC Bioinformatics, 20(Suppl 13):381
- Keywords: Graph clustering; Markov clustering; Parallel clustering; Coarsening; Non-overlapping clusters; Protein complex finding
How can we obtain fast, high-quality clusters in genome-scale bio-networks? Graph clustering is a powerful tool applied to bio-networks to solve various biological problems such as protein complex detection, disease module detection, and gene function prediction. In particular, MCL (Markov Clustering) has attracted attention for its superior performance on bio-networks. MCL, however, is skewed towards finding a large number of very small clusters (size 1-3) and fails to detect many larger clusters (size 10+). MLR-MCL (Multi-level Regularized MCL) was developed to resolve this fragmentation problem. However, MLR-MCL still suffers from fragmentation and, in some cases, generates unrealistically large clusters.
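For readers unfamiliar with MCL, the basic algorithm alternates two steps on a column-stochastic matrix: expansion (squaring the matrix) and inflation (raising entries to a power and renormalizing). The following is a minimal sketch of plain MCL only, not of MLR-MCL or PS-MCL. The function names, the inflation value of 2.0, the fixed iteration count, and the threshold `eps` are illustrative assumptions.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        if s > 0:
            for i in range(n):
                out[i][j] = m[i][j] / s
    return out


def expand(m):
    """Expansion: square the matrix (one more step of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: raise every entry to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adj, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster an undirected graph given as a dense adjacency matrix.

    Assumed settings (not from the paper): self-loops of weight 1,
    a fixed number of iterations, and a cut-off eps for "non-zero".
    """
    n = len(adj)
    m = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Rows that keep mass act as attractors; the columns they cover form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1.0
clusters = mcl(adj)
```

Higher inflation values sharpen the columns faster and tend to produce more, smaller clusters, which is one way to see where the fragmentation described above comes from.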
In this paper, we propose PS-MCL (Parallel Shotgun Coarsened MCL), a parallel graph clustering method that outperforms MLR-MCL in both running time and cluster quality. PS-MCL adopts an efficient coarsening scheme, called SC (Shotgun Coarsening), to improve on the graph coarsening in MLR-MCL. SC allows multiple nodes to be merged at a time, which improves quality, running time, and space usage. PS-MCL also parallelizes the main operations used in MLR-MCL, including matrix multiplication.
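To illustrate what merging multiple nodes at a time can look like, here is a hedged sketch of one plausible multi-node coarsening step. The abstract does not state the selection rule used by SC, so the rule below (every node points at its heaviest neighbor, and all nodes linked by such choices collapse into one super node) is an assumption, as are the function name `coarsen_multi_merge` and the dictionary-based graph layout.

```python
def coarsen_multi_merge(graph):
    """One coarsening step on an undirected weighted graph.

    graph: {node: {neighbor: weight}}, stored symmetrically.
    Assumed rule (not taken from the paper): each node selects its
    heaviest neighbor, and nodes connected by selections are merged.
    Returns (group, coarse): node -> super node id, and the coarse graph.
    """
    parent = {v: v for v in graph}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v, nbrs in graph.items():
        if not nbrs:
            continue
        # Heaviest neighbor; ties go to the smallest node id.
        u = max(sorted(nbrs), key=lambda x: nbrs[x])
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[max(ru, rv)] = min(ru, rv)

    group = {v: find(v) for v in graph}
    coarse = {g: {} for g in set(group.values())}
    for v, nbrs in graph.items():
        for u, w in nbrs.items():
            gv, gu = group[v], group[u]
            if gv != gu:
                # Sum the weights of edges that cross super nodes.
                coarse[gv][gu] = coarse[gv].get(gu, 0) + w
    return group, coarse


# Path 0-1-2-3-4 with weights 5, 4, 1, 6.
graph = {
    0: {1: 5},
    1: {0: 5, 2: 4},
    2: {1: 4, 3: 1},
    3: {2: 1, 4: 6},
    4: {3: 6},
}
group, coarse = coarsen_multi_merge(graph)
```

In this example nodes 0, 1, and 2 collapse into a single super node in one step, whereas a pairwise matching scheme could merge at most two of them. Note that this unconstrained sketch can chain many nodes together, so a practical scheme would likely need some limit on super node size.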
Experiments show that PS-MCL dramatically alleviates the fragmentation problem, and outperforms MLR-MCL in quality and running time. We also show that the running time of PS-MCL is effectively reduced with parallelization.