Designing a System Prototype for Construction Document Management Using Automated Tagging and Visualization
- College of Engineering, Dept. of Civil & Environmental Engineering
- Seoul National University Graduate School
- Construction Document; Document Management System; Automated Tagging; Automated Visualization; Text Mining
- Thesis (Master's) -- Seoul National University Graduate School: Dept. of Civil & Environmental Engineering, August 2015. 지석호.
- A large amount of text data has accumulated over time in the construction industry. Important and useful information gained as experience from previous construction projects is mainly recorded in document form. Such information can serve as best practice for upcoming projects by delivering lessons learned for better risk management and project control. Thus, text-based information plays an important role in business strategy development in the highly competitive construction industry. To benefit from this text-based information, practical and usable text data management systems are vital.
A significant amount of construction text data is rarely utilized for new construction projects because it is difficult to access. As technology for handling text data has developed, a number of document management systems based on text mining techniques have been proposed. However, most of them focus on classifying documents and are unable to deal with the complex and diverse features of construction documents. In addition, time and energy are still wasted skimming the whole database to uncover data of interest. Lastly, because the majority of research has focused on English data, with only a few studies using Korean data, there are many constraints on applying existing English-based systems to Korea’s domestic construction industry.
Thus, a construction document management system was designed to manage text data effectively and efficiently, and to promote the transfer of data and information among system users in the domestic construction industry. To achieve this, a system prototype was developed. The proposed construction document management system comprises data collection, data processing, automated document tagging, and dataset visualization.
About 25,000 Korean Internet documents were collected using a web crawler to develop the system prototype. The collected data were processed using text mining techniques, including part-of-speech (POS) tagging, to calculate the weight of each term in a document. Each term was clarified using a construction corpus that was also developed in this study. Five keywords were automatically extracted and tagged for each document, and each tag’s sub-dataset was visualized as a word cloud based on the processed data.
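The tagging step described above can be illustrated with a minimal sketch. The abstract does not state which term-weighting scheme was used, so TF-IDF is assumed here purely for illustration. The function names (`tfidf_weights`, `top_keywords`, `tag_subdataset_counts`) are hypothetical, and the toy English tokens stand in for Korean documents that are assumed to have already been POS-tagged and filtered against a construction corpus.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document (TF-IDF, assumed scheme)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def top_keywords(weight, k=5):
    """Return the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(weight.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def tag_subdataset_counts(docs, tags, tag):
    """Aggregate term counts over every document that carries `tag`."""
    counts = Counter()
    for doc, doc_tags in zip(docs, tags):
        if tag in doc_tags:
            counts.update(doc)
    return counts


# Toy corpus: each document is a list of already-extracted terms.
docs = [
    ["bridge", "crack", "repair", "bridge", "inspection"],
    ["tunnel", "excavation", "delay", "tunnel", "cost"],
    ["bridge", "cost", "delay", "schedule", "claim"],
]
weights = tfidf_weights(docs)
# Three tags per document here only because the toy documents are tiny;
# the thesis extracts five keywords per document.
tags = [top_keywords(w, k=3) for w in weights]
cloud_input = tag_subdataset_counts(docs, tags, "tunnel")
```

The counts returned by `tag_subdataset_counts` are the kind of term-frequency table that a word cloud renderer would take as input; the rendering itself is left out of this sketch.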
The proposed system prototype was evaluated both qualitatively and quantitatively by surveying ten experts. Questionnaire scores on the significance of the system’s results, the usability of the proposed system design, and the need for it were all above four on a five-point scale. In the quantitative evaluation, which estimated the accuracy of the system’s results, the prototype achieved 84 percent accuracy on average. The evaluation results thus confirm the potential and feasibility of the proposed system.