Multi-label classification with XGBoost for metabolic pathway prediction

DC Field | Value | Language
dc.contributor.author | Joe, Hyunwhan | -
dc.contributor.author | Kim, Hong-Gee | -
dc.date.accessioned | 2024-02-05T00:54:55Z | -
dc.date.available | 2024-02-05T09:57:03Z | -
dc.date.issued | 2024-02-01 | -
dc.identifier.citation | BMC Bioinformatics, Vol.25 no.52 | ko_KR
dc.identifier.issn | 1471-2105 | -
dc.identifier.uri | https://hdl.handle.net/10371/198976 | -
dc.description.abstract | Background
Metabolic pathway prediction is one possible approach to the systems biology problem of reconstructing an organism's metabolic network from its genome sequence. Recent studies of machine learning-based pathway prediction methods conclude that these approaches perform similarly to the most widely used method, PathoLogic, which is rule-based. One issue is that previous studies evaluated PathoLogic without taxonomic pruning, which decreases its performance.

Results
In this study, we update the evaluation results from previous studies to demonstrate that PathoLogic with taxonomic pruning outperforms previous machine learning-based approaches and that these approaches need further improvement to be competitive. Furthermore, we introduce mlXGPR, an XGBoost-based metabolic pathway prediction method built on the multi-label classification framework for pathway prediction introduced by mlLGPR. We also improve on this multi-label framework by exploiting correlations between labels using classifier chains. We propose a ranking method that determines the order of the chain so that lower-performing classifiers are placed later in the chain, where they can make greater use of the correlations between labels. We evaluate mlXGPR with and without classifier chains on single-organism and multi-organism benchmarks. Our results indicate that mlXGPR outperforms previous pathway prediction methods, including PathoLogic with taxonomic pruning, in terms of Hamming loss, precision and F1 score on single-organism benchmarks.

Conclusions
The results from our study indicate that the performance of machine learning-based pathway prediction methods can be substantially improved and can even exceed that of PathoLogic with taxonomic pruning. | ko_KR
dc.description.sponsorship | This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. RS-2023-00268071) | ko_KR
dc.language.iso | en | ko_KR
dc.publisher | BMC | ko_KR
dc.subject | Metabolic pathway prediction | -
dc.subject | BioCyc | -
dc.subject | XGBoost | -
dc.title | Multi-label classification with XGBoost for metabolic pathway prediction | ko_KR
dc.type | Article | ko_KR
dc.identifier.doi | 10.1186/s12859-024-05666-0 | ko_KR
dc.citation.journaltitle | BMC Bioinformatics | ko_KR
dc.language.rfc3066 | en | -
dc.rights.holder | The Author(s) | -
dc.date.updated | 2024-02-04T04:21:56Z | -
dc.citation.endpage | 15 | ko_KR
dc.citation.number | 52 | ko_KR
dc.citation.startpage | 1 | ko_KR
dc.citation.volume | 25 | ko_KR
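
Illustrative note: the abstract mentions a chain ordering in which lower-performing classifiers are placed later, and an evaluation using Hamming loss and F1 score. The following is a minimal sketch of those two ideas in plain Python, not the authors' implementation. It assumes that "performance" means the per-label F1 score of independently trained classifiers on held-out data, which the abstract does not specify. The function names (`hamming_loss`, `per_label_f1`, `chain_order`) and the toy label matrices are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]  # rows = samples, columns = binary labels


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of (sample, label) cells that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / total


def per_label_f1(y_true: Matrix, y_pred: Matrix) -> List[float]:
    """F1 score of each label column, treated as its own binary task."""
    scores = []
    for j in range(len(y_true[0])):
        pairs = [(row_t[j], row_p[j]) for row_t, row_p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def chain_order(label_scores: Sequence[float]) -> List[int]:
    """Label indices from best to worst score, so weak labels come last.

    Assumption: a label placed later in a classifier chain can use the
    predictions of all earlier labels as extra input features.
    """
    return sorted(range(len(label_scores)), key=lambda j: -label_scores[j])


# Toy data: 4 samples, 3 pathway labels (hypothetical values).
y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 1, 1]]

scores = per_label_f1(y_true, y_pred)
order = chain_order(scores)
print(hamming_loss(y_true, y_pred), scores, order)
```

The resulting index list could, under the same assumption, be passed as the `order` argument of a chain implementation such as scikit-learn's `ClassifierChain`, with an XGBoost classifier as the base estimator.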