Assessing data change in scientific datasets

DC Field | Value | Language
dc.contributor.author | Mueller, Juliane | -
dc.contributor.author | Faybishenko, Boris | -
dc.contributor.author | Agarwal, Deborah | -
dc.contributor.author | Bailey, Stephen | -
dc.contributor.author | Jiang, Chongya | -
dc.contributor.author | Ryu, Youngryel | -
dc.contributor.author | Tull, Craig | -
dc.contributor.author | Ramakrishnan, Lavanya | -
dc.date.accessioned | 2024-03-20T06:03:58Z | -
dc.date.available | 2024-03-20T06:03:58Z | -
dc.date.created | 2021-08-23 | -
dc.date.issued | 2021-08 | -
dc.identifier.citation | Concurrency Computation Practice and Experience, Vol.33 No.16, pp.1-22 | -
dc.identifier.issn | 1532-0626 | -
dc.identifier.uri | https://hdl.handle.net/10371/199153 | -
dc.description.abstract | Scientific datasets are growing rapidly and becoming critical to next-generation scientific discoveries. The validity of scientific results relies on the quality of data used and data are often subject to change, for example, due to observation additions, quality assessments, or processing software updates. The effects of data change are not well understood and difficult to predict. Datasets are often repeatedly updated and recomputing derived data products quickly becomes time consuming and resource intensive and may in some cases not even be necessary, thus delaying scientific advance. Despite its importance, there is a lack of systematic approaches for best comparing data versions to quantify the changes, and ad-hoc or manual processes are commonly used. In this article, we propose a novel hierarchical approach for analyzing data changes, including real-time (online) and offline analyses. We employ a variety of fast-to-compute numerical analyses, graphical data change representations, and more resource-intensive recomputations of a subset of the data product. We illustrate the application of our approach using three scientifically diverse use cases, namely, satellite, cosmological, and x-ray data. The results show that a variety of data change metrics should be employed to enable a comprehensive representation and qualitative evaluation of data changes. | -
dc.language | English | -
dc.publisher | John Wiley & Sons Inc. | -
dc.title | Assessing data change in scientific datasets | -
dc.type | Article | -
dc.identifier.doi | 10.1002/cpe.6245 | -
dc.citation.journaltitle | Concurrency Computation Practice and Experience | -
dc.identifier.wosid | 000624349000001 | -
dc.identifier.scopusid | 2-s2.0-85101878283 | -
dc.citation.endpage | 22 | -
dc.citation.number | 16 | -
dc.citation.startpage | 1 | -
dc.citation.volume | 33 | -
dc.description.isOpenAccess | Y | -
dc.contributor.affiliatedAuthor | Ryu, Youngryel | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.subject.keywordAuthor | data management | -
dc.subject.keywordAuthor | data versions | -
dc.subject.keywordAuthor | hierarchical data change analysis | -
dc.subject.keywordAuthor | QA | -
dc.subject.keywordAuthor | QC | -
dc.subject.keywordAuthor | scientific data change analysis | -
Files in This Item:
There are no files associated with this item.

Related Researcher

  • College of Agriculture and Life Sciences
  • Department of Landscape Architecture and Rural System Engineering
Research Area: Crop, Forest Carbon, Sensing Network, Water Cycles

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
