Detailed Information

Fast and scalable method for distributed Boolean tensor factorization

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Park, Namyong | - |
| dc.contributor.author | Oh, Sejoon | - |
| dc.contributor.author | Kang, U. | - |
| dc.date.accessioned | 2022-05-16T08:47:07Z | - |
| dc.date.available | 2022-05-16T08:47:07Z | - |
| dc.date.created | 2020-03-31 | - |
| dc.date.issued | 2019-08 | - |
| dc.identifier.citation | VLDB Journal, Vol.28 No.4, pp.549-574 | - |
| dc.identifier.issn | 1066-8888 | - |
| dc.identifier.uri | https://hdl.handle.net/10371/179761 | - |
| dc.description.abstract | How can we analyze tensors that are composed of 0's and 1's? How can we efficiently analyze such Boolean tensors with millions or even billions of entries? Boolean tensors often represent relationship, membership, or occurrences of events such as subject-relation-object tuples in knowledge base data (e.g., 'Seoul'-'is the capital of'-'South Korea'). Boolean tensor factorization (BTF) is a useful tool for analyzing binary tensors to discover latent factors from them. Furthermore, BTF is known to produce more interpretable and sparser results than normal factorization methods. Although several BTF algorithms exist, they do not scale up for large-scale Boolean tensors. In this paper, we propose DBTF, a distributed method for Boolean CP (DBTF-CP) and Tucker (DBTF-TK) factorizations running on the Apache Spark framework. By distributed data generation with minimal network transfer, exploiting the characteristics of Boolean operations, and with careful partitioning, DBTF successfully tackles the high computational costs and minimizes the intermediate data. Experimental results show that DBTF-CP decomposes up to 16³–32³× larger tensors than existing methods in 82–180× less time, and DBTF-TK decomposes up to 8³–16³× larger tensors than existing methods in 86–129× less time. Furthermore, both DBTF-CP and DBTF-TK exhibit near-linear scalability in terms of tensor dimensionality, density, rank, and machines. | - |
| dc.language | English | - |
| dc.publisher | Springer Verlag | - |
| dc.title | Fast and scalable method for distributed Boolean tensor factorization | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1007/s00778-019-00538-z | - |
| dc.citation.journaltitle | VLDB Journal | - |
| dc.identifier.wosid | 000482951200006 | - |
| dc.identifier.scopusid | 2-s2.0-85063229652 | - |
| dc.citation.endpage | 574 | - |
| dc.citation.number | 4 | - |
| dc.citation.startpage | 549 | - |
| dc.citation.volume | 28 | - |
| dc.description.isOpenAccess | N | - |
| dc.contributor.affiliatedAuthor | Kang, U. | - |
| dc.type.docType | Article | - |
| dc.description.journalClass | 1 | - |
| dc.subject.keywordPlus | DECOMPOSITIONS | - |
| dc.subject.keywordPlus | ALGORITHMS | - |
| dc.subject.keywordPlus | SCALE | - |
| dc.subject.keywordAuthor | Tensor | - |
| dc.subject.keywordAuthor | Tensor factorization | - |
| dc.subject.keywordAuthor | Boolean CP factorization | - |
| dc.subject.keywordAuthor | Boolean Tucker factorization | - |
| dc.subject.keywordAuthor | Distributed algorithm | - |
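
For readers unfamiliar with the Boolean CP model named in the abstract, the following is a minimal single-machine sketch of how a Boolean tensor is reconstructed from three binary factor matrices. It is not the DBTF algorithm and not code from the paper: it only illustrates the standard Boolean CP model, in which sums are replaced by logical OR (so 1 + 1 = 1). The function names `boolean_cp_reconstruct` and `reconstruction_error` and the example matrices are hypothetical, chosen for illustration.

```python
import numpy as np

def boolean_cp_reconstruct(A, B, C):
    """Rebuild a Boolean tensor from binary factor matrices.

    A is I x R, B is J x R, C is K x R. Entry (i, j, k) is 1 when at least
    one rank-1 component r has A[i, r] = B[j, r] = C[k, r] = 1.
    """
    # Count how many components cover each cell, then threshold:
    # thresholding at > 0 is what turns ordinary addition into Boolean OR.
    counts = np.einsum(
        "ir,jr,kr->ijk",
        A.astype(np.int64), B.astype(np.int64), C.astype(np.int64),
    )
    return counts > 0

def reconstruction_error(X, A, B, C):
    """Number of cells where the Boolean reconstruction differs from X."""
    return int(np.sum(X.astype(bool) != boolean_cp_reconstruct(A, B, C)))

# Hypothetical rank-2 factors for a 3 x 2 x 2 tensor.
A = np.array([[1, 0], [1, 1], [0, 1]])
B = np.array([[1, 1], [0, 1]])
C = np.array([[1, 1], [0, 1]])

X = boolean_cp_reconstruct(A, B, C)
print(X.astype(int))
print("error against itself:", reconstruction_error(X, A, B, C))
```

In this example, cell (1, 0, 0) is covered by both components, yet its reconstructed value stays 1 rather than 2. That saturation is the difference between Boolean and ordinary CP factorization.
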
Files in This Item:
There are no files associated with this item.


Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
