Publications

Detailed Information

Order-Preserving Matching in Numeric Strings : 수치 문자열의 순서를 보존하는 매칭 기법

DC Field Value Language
dc.contributor.advisor박근수-
dc.contributor.author김진일-
dc.date.accessioned2017-07-13T07:01:02Z-
dc.date.available2017-07-13T07:01:02Z-
dc.date.issued2014-02-
dc.identifier.other000000016633-
dc.identifier.urihttps://hdl.handle.net/10371/118957-
dc.description학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2014. 2. 박근수.-
dc.description.abstractString matching is a fundamental problem in computer science and has been extensively studied. Sometimes a string consists of numeric values instead of alphabet characters, and we are interested in some trends in the text rather than specific patterns. We introduce a new string matching problem called order-preserving matching on numeric strings, where a pattern matches a text substring of the same length if the relative orders in the substring coincide with those of the pattern. Order-preserving matching is applicable to many scenarios such as stock price analysis and musical melody matching.
In this thesis, we define order-preserving matching in numeric strings, and present various representations of order relations and efficient algorithms of order-preserving matching with those representations. For single pattern matching, we give an O(n log m) time algorithm with the prefix representation based on the KMP algorithm, and optimize it further to obtain O(n + m log m) time with the nearest neighbor representation, where n and m are the lengths of the text and the pattern, respectively. For multiple pattern matching, we present an O((n+m) log m) time algorithm with the prefix representation based on the Aho-Corasick algorithm, where n is the text length and m is the sum of the lengths of the patterns. Our algorithms are presented in binary order relations first, and then extended to ternary order relations. With our extensions, the time complexities in binary order relations can be achieved in ternary order relations as well.
-
dc.description.tableofcontentsContents
Abstract i
Contents ii
List of Figures iv
List of Tables v
Chapter 1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Chapter 2 Order-Preserving Pattern Matching 6
2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1 Definitions of Order Relations . . . . . . . . . . . . . . . . 6
2.1.2 Number of Representations . . . . . . . . . . . . . . . . . 8
2.1.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . 8
2.2 O(n logm) Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Prefix Representation . . . . . . . . . . . . . . . . . . . . 10
2.2.2 KMP Failure Function . . . . . . . . . . . . . . . . . . . . 11
ii
2.2.3 Text Search . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.4 Construction of KMP Failure Function . . . . . . . . . . . 15
2.2.5 Correctness and Time Complexity . . . . . . . . . . . . . 17
2.3 O(n + mlogm) Algorithm . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 Nearest Neighbor Representation . . . . . . . . . . . . . . 17
2.3.2 Text Search . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.3 Construction of KMP Failure Function . . . . . . . . . . . 21
2.3.4 Correctness and Time Complexity . . . . . . . . . . . . . 22
2.3.5 Generalized Order-Preserving Matching . . . . . . . . . . 23
2.3.6 Remark on Alphabet Size . . . . . . . . . . . . . . . . . . 23
Chapter 3 Order-Preserving Multiple Pattern Matching 25
3.1 O((n + m) logm) Algorithm . . . . . . . . . . . . . . . . . . . . . 25
3.1.1 Aho-Corasick Automaton . . . . . . . . . . . . . . . . . . 26
3.1.2 Aho-Corasick Failure Function . . . . . . . . . . . . . . . 27
3.1.3 Text Search . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.1.4 Construction of Aho-Corasick Failure Function . . . . . . 29
3.1.5 Correctness and Time Complexity . . . . . . . . . . . . . 32
Chapter 4 Extensions to Ternary Order Relations 33
4.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2 Extension of Prefix Representation . . . . . . . . . . . . . . . . . 34
4.3 Extension of Nearest Neighbor Representation . . . . . . . . . . . 38
4.4 Generalized Order-Preserving KMP Algorithm . . . . . . . . . . 42
Chapter 5 Conclusion 45
Bibliography 47
-
dc.formatapplication/pdf-
dc.format.extent3476658 bytes-
dc.format.mediumapplication/pdf-
dc.language.isoen-
dc.publisher서울대학교 대학원-
dc.subject.ddc621-
dc.titleOrder-Preserving Matching in Numeric Strings-
dc.title.alternative수치 문자열의 순서를 보존하는 매칭 기법-
dc.typeThesis-
dc.contributor.AlternativeAuthorKim Jinil-
dc.description.degreeDoctor-
dc.citation.pagesv, 52-
dc.contributor.affiliation공과대학 전기·컴퓨터공학부-
dc.date.awarded2014-02-
Appears in Collections:
Files in This Item:

Altmetrics

Item View & Download Count

  • mendeley

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share