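Co-occurrence Pattern Learning Species Distribution Model (SDM) Quantifies Annual Reduction of American Coccinellids: Overcoming Data Inconsistencies and Insufficiencies
Korean title (translated): Estimating Annual Reduction Rates of North American Ladybugs Using a Co-occurrence Pattern Learning Species Distribution Model (SDM), with a Focus on Overcoming Data Inconsistency and Insufficiency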

DC Field | Value | Language
dc.contributor.advisor | 송영근 | -
dc.contributor.author | Hyun Yong Chung | -
dc.date.accessioned | 2023-11-20T04:37:16Z | -
dc.date.available | 2023-11-20T04:37:16Z | -
dc.date.issued | 2023 | -
dc.identifier.other | 000000179452 | -
dc.identifier.uri | https://hdl.handle.net/10371/196921 | -
dc.identifier.uri | https://dcollection.snu.ac.kr/common/orgView/000000179452 | ko_KR
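dc.description | Thesis (Master's) -- Graduate School, Seoul National University : College of Education, Interdisciplinary Program in Environmental Education, 2023. 8. 송영근. | -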
dc.description.abstract | Aim: To predict the annual distribution patterns and reduction rates of insufficiently observed species by using co-occurrence pattern learning and devising a filling-in strategy that overcomes structural and temporal inconsistencies in noisy multi-source data.

Idea: Although more than 10% of insects will face extinction in the coming decades, studies of their reduction rates, which form the basis for conservation strategies, are still limited. This limitation stems first from the dominance of unstructured records among the data available for invertebrates, second from the inconsistencies among those records, and third from their insufficiency. Although it is tempting to pool data from multiple sources, the small amount of data precludes the deep filtering normally used to handle structural and temporal inconsistencies among sources in time-series comparisons. This is the first study to estimate annual reductions with machine learning (ML) from small, multi-source, presence-only data by overcoming these inconsistencies and insufficiencies. It proposes and validates two novel strategies. (1) Co-occurrence pattern learning: by grouping low-quality, unreliable individual occurrence records into patterns, I show that structural and temporal inconsistencies can be overcome without deep filtering. (2) Filling-in strategy: I propose a procedure for estimating population trends in which model predictions fill the gaps in the collected yearly data so that years can be compared evenly.
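
The idea of grouping individual occurrence records into patterns can be illustrated with a minimal sketch. It assumes a hypothetical record schema of (site, year, species) tuples and a hypothetical list of companion species; neither the schema nor the species list comes from the thesis, and the thesis's actual feature construction may differ.

```python
from collections import defaultdict

def cooccurrence_features(records, target, vocabulary):
    """Turn raw occurrence records into per-(site, year) co-occurrence vectors.

    records: iterable of (site_id, year, species) tuples (hypothetical schema).
    target: the species to be predicted; it is never used as a feature.
    vocabulary: ordered list of companion species used as feature columns
                (hypothetical; must not contain the target).
    Returns {(site_id, year): (feature_vector, target_present)}.
    """
    seen = defaultdict(set)
    for site, year, species in records:
        seen[(site, year)].add(species)

    rows = {}
    for key, species_set in seen.items():
        # 1 if the companion species was recorded at this site-year, else 0.
        vector = [1 if s in species_set else 0 for s in vocabulary]
        rows[key] = (vector, target in species_set)
    return rows

# Toy records, invented for illustration only.
records = [
    ("s1", 2010, "Adalia bipunctata"),
    ("s1", 2010, "Harmonia axyridis"),
    ("s1", 2010, "Coccinella septempunctata"),
    ("s2", 2010, "Harmonia axyridis"),
]
vocab = ["Harmonia axyridis", "Coccinella septempunctata"]
rows = cooccurrence_features(records, "Adalia bipunctata", vocab)
```

Each feature vector describes which companion species were recorded together, so a classifier trained on these rows learns from the pattern rather than from any single record.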

Location: 51 states of the USA and 6 provinces of Canada

Taxa: four ladybug species native to North America

Methods: In Chapter 2, seven performance scores were used to evaluate presence-versus-absence predictions in three situations: (1) learning from unstructured data to predict structured data, or from low-efficiency data to predict high-efficiency data; (2) learning from data before a particular year to predict data after that year, and vice versa; (3) learning from 70% of the multi-source data to predict the remaining 30%. In both the evaluation and generalization phases, the performance of models using co-occurrence patterns (COP models) was compared with that of models using environmental information (ENV models) and with a commonly accepted benchmark.
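
Situation (3) and the presence-versus-absence scoring can be sketched as follows. This record does not name the seven performance scores, so the four computed here (accuracy, sensitivity, specificity, and the true skill statistic) are common choices picked as an assumption, and the function names are hypothetical.

```python
import random

def split_70_30(items, seed=0):
    """Shuffle reproducibly and return (70% training part, 30% held-out part)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def presence_absence_scores(y_true, y_pred):
    """Score binary presence (1) versus absence (0) predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # True skill statistic: sensitivity + specificity - 1.
        "tss": sensitivity + specificity - 1.0,
    }

train, held_out = split_70_30(range(10))
scores = presence_absence_scores(
    [1, 1, 1, 0, 0, 0, 0, 0],  # toy ground truth
    [1, 1, 0, 0, 0, 0, 1, 0],  # toy predictions
)
```

The same scoring function applies unchanged to situations (1) and (2); only the rule that decides which records go into the training and held-out parts changes.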

In Chapter 3, reduction rates and extinction status were estimated by having ML models predict the annual occupancy of each species at all coordinates where the species has appeared since 2007. The methodological validity of the new approach was checked against pre-established methods. Its reliability was tested by examining how the estimates changed under three scenarios: different extractions of pseudo-absence data points, different variable selection techniques, and the random injection of missing or false information into the presence data.
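
A minimal sketch of the filling-in idea follows, under two assumptions that this record does not confirm: an observed presence always counts as occupied while every other site-year is filled with the model's prediction, and the reduction rate is the relative drop in occupied sites between a start year and an end year. The function names and the toy model are hypothetical.

```python
def yearly_occupied_sites(sites, years, observed_presence, predict_presence):
    """Count occupied sites per year across every historical site.

    observed_presence: set of (site, year) pairs with a recorded presence.
    predict_presence: callable (site, year) -> bool standing in for a trained
                      model; it is consulted only where no record exists.
    """
    counts = {}
    for year in years:
        occupied = 0
        for site in sites:
            if (site, year) in observed_presence or predict_presence(site, year):
                occupied += 1
        counts[year] = occupied
    return counts

def reduction_rate(counts, start_year, end_year):
    """Percentage drop in occupied sites from start_year to end_year."""
    start, end = counts[start_year], counts[end_year]
    if start == 0:
        raise ValueError("no occupied sites in the start year")
    return 100.0 * (start - end) / start

# Toy inputs, invented for illustration only.
sites = ["a", "b", "c", "d"]
observed = {("a", 2007), ("a", 2008)}
predicted_present = {("b", 2007), ("c", 2007), ("b", 2008)}

def toy_model(site, year):
    return (site, year) in predicted_present

counts = yearly_occupied_sites(sites, [2007, 2008], observed, toy_model)
rate = reduction_rate(counts, 2007, 2008)
```

Because every year is evaluated over the same set of sites, years with few records and years with many records become directly comparable.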

Results: (1) The COP models exceeded the acceptable criteria at every validation step and for every species, and they also outperformed the ENV models. (2) Reduction rates were 36.4% for H. parenthesis (2007–2021; VU), 29.7% for A. bipunctata (2010–2019; NT), 23.7% for C. novemnotata (2009–2018; NT), and 14% for C. transversoguttata (2007–2018; LC). The new approach showed strong methodological validity when compared with pre-established methods. In the reliability tests, the range of estimates from the new method did not misrepresent IUCN conservation status to a significant extent.
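
How a reduction rate might map to a conservation category can be sketched with the population-reduction thresholds of IUCN Red List criterion A2 (at least 30% for VU, 50% for EN, 80% for CR, measured over ten years or three generations). Near Threatened has no fixed numeric threshold, so the 20% cutoff below is purely an assumption, and this record does not state the exact rule or time normalization the thesis applied.

```python
def category_from_reduction(rate_percent, nt_cutoff=20.0):
    """Map a percentage reduction to an IUCN-style category label.

    CR/EN/VU thresholds follow IUCN criterion A2.
    nt_cutoff is a hypothetical cutoff for Near Threatened, not an IUCN rule.
    """
    if rate_percent >= 80.0:
        return "CR"
    if rate_percent >= 50.0:
        return "EN"
    if rate_percent >= 30.0:
        return "VU"
    if rate_percent >= nt_cutoff:
        return "NT"
    return "LC"

# Reduction rates reported in the abstract.
reported = {
    "H. parenthesis": 36.4,
    "A. bipunctata": 29.7,
    "C. novemnotata": 23.7,
    "C. transversoguttata": 14.0,
}
labels = {name: category_from_reduction(rate) for name, rate in reported.items()}
```

Under the assumed cutoff, the labels agree with the categories listed in the results, but that agreement does not show that the thesis used this rule.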

Conclusion: Using co-occurrence patterns as variables together with the filling-in strategy enabled the SDM to predict species' distribution patterns and reduction rates at a finer time scale, overcoming structural and temporal inconsistencies in multi-source data that include a considerable share of citizen science records. In North America, the four native ladybug species studied have been declining steadily. This study suggests that ML models developed with COP can integrate multi-source data without filtering, which makes more data usable, and that COP-based SDMs may be advantageous for predictions at finer temporal scales (and thus more precise than commonly used SDMs built on environmental variables, which usually span decades). This can help global conservation initiatives address rare and invertebrate taxa, which frequently suffer from restricted data availability and are often underrepresented in conservation lists.
-
dc.description.tableofcontents | Chapter 1. Introduction
1.1. General background of the study 1
1.2. Purpose of the study 6
1.3. Study history 7

Chapter 2. Co-Occurrence Patterns Overcome Structural and Temporal Inconsistencies in Multi-Source Datasets, Outperforming Environmental Variables
2.1. Materials and methods 14
2.1.1. Summary of materials and methods 14
2.1.2. Target species 14
2.1.3. Occurrence data 14
2.1.4. Pseudo-absence 15
2.1.5. Variables 16
2.1.6. Development and characterization of models 17
2.1.7. Generalization 20
2.1.8. Evaluation 22
2.2. Results 23
2.2.1. Biases in multi-source data 23
2.2.2. Structural and temporal generalization 24
2.2.3. Evaluation of the developed models 31
2.2.4. Importance and correlation among variables 31
2.3. Discussion 34
2.3.1. The strength of co-occurrence pattern learning 34
2.3.2. The interpretation of used variables 35
2.3.3. The incorporation of new variables 36
2.3.4. The limitation in application 38
2.4. Conclusion 38

Chapter 3. Filling Machine Learning Predictions In Temporal Data Gaps Can Estimate Annual Reductions Across Every Historical Distribution
3.1. Materials and methods 40
3.1.1. Summary of materials and methods 40
3.1.2. Target species 40
3.1.3. Occurrence data 40
3.1.4. Pseudo-absence 40
3.1.5. Variables 40
3.1.6. Development and characterization of models 40
3.1.7. Prediction on annual distributions and reduction rates 41
3.1.8. Validity Evaluation 41
3.1.9. Reliability Evaluation 42
3.2. Results 43
3.2.1. Estimated reduction rates and conservation status 43
3.2.2. Validity comparison with pre-established methodologies 45
3.2.3. Reliability analysis on filling-in approach 49
3.2.4. Predicted distribution 51
3.3. Discussion 52
3.3.1. The theoretical rationale for the ML reduction rates 52
3.3.2. Effect of temporal fluctuations of data on various models 53
3.3.3. Practical benefits of the filling-in approach 55
3.3.4. The filling-in approach in conjunction with data filtering methods 56
3.4. Conclusion 58

Bibliography 59

Abstract in Korean 73
-
dc.format.extent | v, 68 | -
dc.language.iso | eng | -
dc.publisher | Graduate School, Seoul National University | -
dc.subject | conservation status | -
dc.subject | annual reduction rate | -
dc.subject | citizen science | -
dc.subject | presence-only | -
dc.subject | species distribution model | -
dc.subject | co-occurrence pattern | -
dc.subject.ddc | 363.7007 | -
dc.title | Co-occurrence Pattern Learning Species Distribution Model (SDM) Quantifies Annual Reduction of American Coccinellids | -
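dc.title.alternative | Estimating Annual Reduction Rates of North American Ladybugs Using a Co-occurrence Pattern Learning Species Distribution Model (SDM), with a Focus on Overcoming Data Inconsistency and Insufficiency (translated from Korean) | -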
dc.type | Thesis | -
dc.type | Dissertation | -
dc.contributor.AlternativeAuthor | 정현용 | -
dc.contributor.department | College of Education, Interdisciplinary Program in Environmental Education | -
dc.description.degree | Master's | -
dc.date.awarded | 2023-08 | -
dc.title.subtitle | Overcoming Data Inconsistencies and Insufficiencies | -
dc.identifier.uci | I804:11032-000000179452 | -
dc.identifier.holdings | 000000000050▲000000000058▲000000179452▲ | -

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
