Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy...

48
Semantic Processing of Engineering D t i PLM E i t Documents in PLM Environment * KAIST 산업 시스템공학과 * 서효원 교수 * KAIST 산업 시스템공학과 * 전상민 박사과정/한국타이어 *김경근 박사과정/국방과학연구소 *최승아 석사과정 *최승아 석사과정

Transcript of Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy...

Page 1: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

Semantic Processing of Engineering D t i PLM E i tDocuments in PLM Environment

* KAIST 산업 및 시스템공학과

* 서효원 교수

* KAIST 산업 및 시스템공학과

* 전상민 박사과정/한국타이어*김경근 박사과정/국방과학연구소

*최승아 석사과정*최승아 석사과정

Page 2: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1. Background

2 New Approach2. New Approach

3. Research Trend & Paper Introductionp

4. Introduction of Basic Algorithm

5. Case Study 1,2,3

6. Conclusion

1

Page 3: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1. Background

2 New Approach2 New Approach2. New Approach 2. New Approach

3. Research Trend & Paper Introduction3. Research Trend & Paper Introductionpp

4. Introduction of Basic Algorithm 4. Introduction of Basic Algorithm

5. Case Study 1,2,35. Case Study 1,2,3

6. Conclusion6. Conclusion

2

Page 4: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 B k d (AS IS)1. Background (AS-IS)

• 제품 개발 시 Engineering 문서 폭증• 제품 개발 시 Engineering 문서 폭증요구사항 설계 해석 제조 시험 양산어디서? 어떻게? 원하는 문서를 빠르게 얻을 수 있을까?

• PLM이 보편화/안정화/고도화 단계문서의 저장/관리 다 탐색/검색이 더 부각문서의 저장/관리 보다 탐색/검색이 더 부각

• 기존 Engineering 문서의 검색기존 Engineering 문서의 검색Keyword 검색 선택의 폭 너무 넓음

3

Page 5: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 B k d (TO BE)

• 효율적인 Engineering 문서 검색을 위해

1. Background (TO-BE)

• 효율적인 Engineering 문서 검색을 위해,• 문서 Package관리가 아닌 Text 기반 Contents 관리• Keyword 검색이 아니 의미기반 검색

• 의미기반 검색을 위해,정 의 S i 구축 필• 정보의 Semantics 구축 필요

• 이를 기반으로, 문서의 Semantic Processing 진행*Semantic Processing = Syntax Processing(NLP) + Semantic Processing(Ontology)Semantic Processing = Syntax Processing(NLP) + Semantic Processing(Ontology)

• Semantic Processing 기반 Engineering 문서 관리• 정보 검색의 효율성 (↑)• 정보 재 활용성 (↑)

정보의 통합성 (↑)• 정보의 통합성 (↑)

4

Page 6: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1 B k d

2 New Approach

1. Background1. Background

2. New Approach

3. Research Trend & Paper Introduction3. Research Trend & Paper Introductionpp

4. Introduction of Basic Algorithm 4. Introduction of Basic Algorithm

5. Case Study 1,2,35. Case Study 1,2,3

6. Conclusion6. Conclusion

5

Page 7: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 N A h2. New Approach

UC: user createdWS/SS/NS: well/semi/non structuredPCD: producer-centric data

의미 분석구문 분석

FolksonomyTaxonomy

Neutral Data

PCD

DataBase

PCD: producer centric dataCCD: consumer-centric data

Neutral DataCCD

Engineering문서

(WS/SS/NS)

의미기반검색

SemanticProcessor

Data &Knowledge

BaseEngineers

Engineer

WS data

참조의미 모델

Engineer

SNData

정보 생산 측면 정보 소비 측면SNS users

6S-NL (약식 자연어 처리) 분야별 참조모델 온톨로지 의미표현 의미 유사도 평가 자기기반 검색

Page 8: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1 B k d1. Background1. Background

2 New Approach2 New Approach

3. Research Trend & Paper Introduction

2. New Approach 2. New Approach

p

4. Introduction of Basic Algorithm 4. Introduction of Basic Algorithm

5. Case Study 1,2,35. Case Study 1,2,3

6. Conclusion6. Conclusion

7

Page 9: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

R h T d (1/2)Research Trend (1/2)

1. Wu Ying-Han; Shaw Heiu-Jou, “Document based knowledge base engineering method for ship basic design” OCEAN ENGINEERING Volume: 38 Issue: 13 Pages: 1508 1521 SEP 2011design , OCEAN ENGINEERING Volume: 38 Issue: 13 Pages: 1508-1521, SEP 2011

2. Wang Han-Hsiang; Boukamp Frank; Elghamrawy Tar, “Ontology-Based Approach to Context Representation and Reasoning for Managing Context-Sensitive Construction Information”, JOURNAL OF COMPUTING IN CIVIL ENGINEERING V l 25 I 5 P 331 346 SEP OCT 2011OF COMPUTING IN CIVIL ENGINEERING Volume: 25 Issue: 5 Pages: 331-346, SEP-OCT 2011

3. Liu S.; McMahon C. A.; Culley S. J., “A review of structured document retrieval (SDR) technology to improve information access performance in engineering document management”, COMPUTERS IN INDUSTRY V l 59 I 1 P 3 16 JAN 2008INDUSTRY Volume: 59 Issue: 1 Pages: 3-16, JAN 2008

4. S. Liu, C.A. McMahon *, M.J. Darlington, S.J. Culley, P.J. Wild, “A computational framework for retrieval of document fragments based on decomposition schemes in engineering information management”, d d i i f i 1 1Advanced Engineering Informatics 20 (2006) 401–413

5. Zhanjun, L., Karthik, R., A., (2007), " Ontology-based design information extraction and retrieval " Artificial Intelligence for Engineering Design, Analysis and Manufacturing 21, pp. 137–154.

6. Zhanjun Li, Victor Raskin, Karthik Ramani, ”Developing Engineering Ontology for Information Retrieval”, Journal of Computing and Information Science in Engineering, 3.2008, vol 8

7. Zhanjun Li, Maria C.Yang, Karthik Ramani, “A methodology for engineering ontology acquisition and validation”, Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 2009, vol 23

8

Page 10: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

R h T d (2/3)Research Trend (2/3)

8. Zhanjun Li Min Liu David C. Anderson Karthik Ramani, “Semantic-based design knowledge annotation and retrieval” Proceedings of IDETC/CIE 2005 ASME 2005 International Designannotation and retrieval , Proceedings of IDETC/CIE 2005 ASME 2005 International Design Engineering Technical Conferences & Computer and information in Engineering Conference September 24-28, 2005, Long Beach, California, USA

9. Deeptimahanti Deva Kumar, Ratna Sanyal(2008) “ Static UML Model Generator from Analysis of p y ( ) yRequirements(SUGAR)” 2008 Advanced Software Engineering & Its Applications, pp. 77–84.

10. Lin, JX ; Fox, MS ; Bilgic, T(1996) “ A Requirement Ontology for Engineering Design” Concurrent Engineering-Research and Iapplications, Vol 4, Issue3, pp. 279-291.

11. Soner, K., Ozgur, A., Orkunt, S., Samet, A., Nihan, K.C., Ferda, N.A., (2012), " An ontology-based retrieval system using semantic indexing," Information Systems, 37, pp. 294-305.

12 Li M H (2009) " A ti l kl d b d d t ll ti h f ltidi k d t b "12. Lin, M., H., (2009), " An optimal workload-based data allocation approach for multidisk databases" Data and knowledge Engineering, 68, pp. 499–508.

13. Patricia, L., (2000), " Information extraction from documents for automating software testing," Artificial Intelligence in Engineering 14 pp 63-69Artificial Intelligence in Engineering, 14, pp. 63 69.

14. Module-based Failure Propagation (MFP) model for FMEA, Int J Adv Manuf Technol, Kyoung-Won Noh, Hong-Bae Jun, Jae-Hyun Lee, Gyu-Bong Lee, Hyo-Won Suh, 2011

15. A Functional Basis for Engineering Design: Reconciling and Evolving Previous Efforts, NIST Technical Note 1447, Julie Hirtz, Robert B. Stone, Daniel A. McAdams, Simon Szykman, and Kristin L. Wood, 2002

9

Page 11: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

Ontology-based design information gy gextraction and retrieval

ZHANJUN LI and KARTHIK RAMANIArtificial Intelligence for Engineering Design, Analysis and Manufacturing (2007), 21, 137–154.

10

Page 12: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 Ab t t1. Abstract

• Increasing complexity of product design processg p y p g pthe number of design documents has exploded

• To design information retrievalShallow natural language process(NLP)Domain-specific design semantics/ontology

Text/unstructured structured/semantic-based representation

LinguisticPatten

DesignConcept &

Relationship

DOC(Text)

ApplicationSpecific

Design Semantics

• To improve the performance of design information retrievalDeveloped ontology-based query processingDeveloped ontology based query processing

Users’ requests are interpreted based on domain-specific meanings

DesignConcept

DocDesignConcept

Scoring &Pairing

Doc.Retrieval

Query

11

Page 13: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 S t A hit t & F ti l Di2. System Architecture & Functional Diagram

ODART: Ontology-based Design document Analysis and Retrieval Toolgy g y

구문분석 의미 분석 Doc Semantics QueryQuerySemantics

ProcessedQuery

Query

12

Page 14: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 O t l M d li3. Ontology Modeling

Taxonomy

< Linguistic Knowledge>

Reference Model

< Domain Knowledge>13

Reference Model

Page 15: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

4 D i S ti E t ti4. Design Semantic Extraction

< Linguistic Knowledge>< Linguistic Knowledge>

< Design Semantic/Taxonomy Model>

<Device Taxonomy>

14< Domain Knowledge>

Page 16: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

4 D i S ti E t ti4. Design Semantic Extraction

Page 17: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

5 E l ti5. Evaluation

Find products having DC motors

16

Page 18: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1 B k d1. Background1. Background

2 New Approach2 New Approach

3. Research Trend & Paper Introduction3. Research Trend & Paper Introduction

2. New Approach 2. New Approach

pp

4. Introduction of Basic Algorithm

5. Case Study 1,2,35. Case Study 1,2,3

6. Conclusion6. Conclusion

17

Page 19: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

Introduction of Basic Algorithm f ti d t ifor semantic document processing

18

Page 20: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 알고리즘 O tli1. 알고리즘 Outline의미기반 텍스트

프로세싱

문서 프로세싱

프로세싱

19

의미기반 CAD 프로세싱

Page 21: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 문서 프로세싱2. 문서 프로세싱

EngineeringEngineeringDOC

(구조/텍스트)

Bayesian classification

분류된 문서

POI Extractor

타이틀 구조 텍스트

POI Extractor

타이틀(키워드)

구조(계층)

텍스트(테이블)

TF-IDF Tokenization

단위 문장텍스트 주제

IE, OP•TF-IDF : Term Frequency - Inverse Document Frequency

인덱스/인스턴스/구조 문장

• IE : Information Extraction• OP : Ontology Population

20

Page 22: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 1 주요 알고리즘

Input Algorithm Output

2.1. 주요 알고리즘

문서 ex) FMEA Doc. Bayesian classification 분류된 문서 ex) Class 1

Brief Algorithm DescriptionD

Training Data Set Feature generation Classifier(model)

Doc

Prediction Classification

Example

( )

Training data setsw1 w2 … Class

FMEA Failure Analysis 1Evaluation Measure 1Concept Design 2

(FMEA, Failure Analysis…)

사용 R f

p gDetail Design 2CAD Drawing 3… … ..

Output : Class 1

Training data sets

사용 Reference

21

Page 23: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 2 주요 알고리즘

Input Algorithm Output

2.2. 주요 알고리즘

분류된 문서 Apache POI Extractor API 타이틀, 구조, Text 추출

Example

Output :

-Title : FMEA(FAILURE MODE AND EFFECTS ANALYSIS) Doc.

-Structure :Structure Meta-data : Image : 0 개, Table : 1개, Text : 11줄,작성자 : Mr.An, 문서생성일 : 2010.3.20…

-Text : FMEA(FAILURE MODE AND EFFECTS ANALYSIS) Doc.Reporting Person : Mr AnReporting Person : Mr. AnReporting Date : 2010.05.01-Overall assessment : There are mal-functionsin Filtering to collect dust and remove gas. It makes a fan not purifying polluted air…

사용 R f사용 Reference

22

Page 24: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 3 주요 알고리즘

Input Algorithm Output

2.3. 주요 알고리즘

문서(Text) Inverted Index Inverted index

Brief Algorithm Description

문서 Tokenization 문서의 Token List 생성

TokenNormalization

Token 별문서 정보 색인

Example

FMEA (FAILURE MODE AND EFFECTSTerm Document ID

FMEA

Stopword list

-AFMEA (FAILURE MODE AND EFFECTSFMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a fan not purifying polluted air

1 FMEA 12 FAILURE 1.23 Design 1,34 Reporting 1,2,4

A-And-Around-Every-For-From-In

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a

사용 R f

fan not purifying polluted air… … … …… … …

In-Is-It...

fan not purifying polluted air…collect dust and remove gas. It makes a fan not purifying polluted air…

사용 Reference

23

Page 25: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 4 주요 알고리즘

Input Algorithm Output

2.4. 주요 알고리즘

문서(Text) TF-IDF Text Weighting (중요도)

Brief Algorithm Description

ExampleDoc.1 Doc.2 Doc.3 …

FMEAFMEA 4 2 1Failure 5 2 4Lack 3 0 0Fan 13 0 0… … … …

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas It makes a

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to

ll t d t d It k

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to … … … …

Doc.1 Doc.2 Doc.3 …FMEA 0.365 0.5228 0.157Failure 0.406 0.5228 0.365

collect dust and remove gas. It makes a fan not purifying polluted air…collect dust and remove gas. It makes a fan not purifying polluted air…

gcollect dust and remove gas. It makes a fan not purifying polluted air…

Lack 0.602 0 0Fan 1.146 0 0… … … …… … … …

Output : ‘Fan’ is the most important word

사용 Reference

Output : Fan is the most important word.

24

Page 26: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 의미기반 텍스트 프로세싱 & 검색3. 의미기반 텍스트 프로세싱 & 검색

단위 문장 Query (문장)단위 문장

Tokenization

단어Lexicon

y (문장)

단어

Tokenization

POS Tagged

단어

POS tagging

POS Tagged 단어

POS tagging

gg단어

Concept

Concept Disambiguation

DomainOntology

단어

Concept

Concept Disambiguation

Concept

Joining

p추출

Semantic Doc.Representation

추출

Concept

Joining

pRelationship

Concept Indexing

Relationship

Similarity계산

Vector Space Model

Document획득

25

Vector Space Model

Page 27: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 1 주요 알고리즘

Input Algorithm Output

3.1. 주요 알고리즘

문장 Tokenization Token(단어)

Brief Algorithm Description

문장단어 구분자

(하이픈, 생략부호, 마침표..) Token 생성

Example

A second DC motor rotates air that makes a smell removed

A/ second/ DC/ motor / rotates/ air/ that/ makes/ a/ smell/ removed/

사용 R f

A/ second/ DC/ motor / rotates/ air/ that/ makes/ a/ smell/ removed/

사용 Reference

26

Page 28: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 2 주요 알고리즘

Input Algorithm Output

3.2. 주요 알고리즘

문장 POS(Part-of-Speech) tagging POS tagging

Brief Algorithm Description

LexiconDB

POS Tagging

Example

A/ second/ DC/ motor / rotates/ air/ that/ makes/ a/ smell/ removed

DT – DeterminerJJ - Adjective

A/ second/ DC/ motor / rotates/ air/ that/ makes/ a/ smell/ removed.

A<DT> second<JJ> DC<NN> motor > rotates<VBZ> air<NN> that<TDT> makes<VBZ> a<DT> smell<NN> removed<VBZ>

NN - Noun, singular or massVBZ - Verb, 3rd person singular presentTO – toCD - Cardinal numberNNS - Noun, plural

사용 R f

air<NN> that<TDT> makes<VBZ> a<DT> smell<NN> removed<VBZ>. NNS Noun, pluralRB – AdverbCC - Coordinating conjunction…

Lexicon DB

사용 Reference

27

Page 29: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 3 주요 알고리즘

Input Algorithm Output

3.3. 주요 알고리즘

한글 문장 한글 형태소 분석 한글 POS tagging

Brief Algorithm Description텍스트 내용 읽음

Number Dictionary

한국어 형태소 분석(Korean Morphological Analysis)System

Dictionary(283 948단어)

Tag Set TableUser Dictionary

(Domain Lexicon DB) 한국어 품사 태깅(POS Tagging)

(283,948단어)

한나눔 한글 형태소 분석기 (HanNanum v0 8 4)

Example

한나눔 한글 형태소 분석기 (HanNanum v0.8.4)

유도탄 중량을 현재 100kg에서 80kg으로 감량 설계 추진 요망

사용 R f

유도탄 중량을 현재 100kg에서 80kg으로 감량 설계 추진 요망

유도탄/NC 중량/NC 100/NN kg/F 에/PV 80/NN kg/F 감량/NC 설계/NC 추진/NC 요망/NC

System Dictionary, User Dictionary, Number Dictionary, Tag Set Table28

사용 Reference

Page 30: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 4 주요 알고리즘

Input Algorithm Output

3.4. 주요 알고리즘

문장 Concept Disambiguation 문장과 가장 유사한 Concept

Brief Algorithm Description-Wm : weight of phrase

( contain only one word : Wm = 1,No match : Wm = 0, Rightmost matched : 0.55[가장 매칭되는 단어에게 높은 점수]

Example

[ ]Rest of : split 0.45 equally )

ex) phrase : “second DC Motor”) p0.225 0.225 0.55

Concept 이 “Motor” 일 경우,Tscore= (1*(0+0+0.55) )/ 3 = 0.183

Concept이 “AC Motor”일 경우, Tscore= (1*(0+0+0 55) )/ 3 = 0 183 AC-Motor DC-Motor

Motor

사용 R f

Tscore= (1 (0+0+0.55) )/ 3 = 0.183Concept이 “DC Motor”일 경우

Tscore= (2*(0+0.225+0.55) )/ 3 = 0.516 DC Motor 가 가장 높은 점수로 선택됨.

사용 Reference

29

Page 31: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 5 주요 알고리즘

Input Algorithm Output

3.5. 주요 알고리즘

문장 Joining Concept Relationship

Brief Algorithm Description

SyntaxRule

Joining

Example

Input : DC motor rotates fan.DC<NN> motor<NN> rotates<VBZ> fan<NN>.

Syntax Rule : NN^VBZ^NN -> VBZ(NN,NN)

사용 R f

Syntax Rule : NN VBZ NN VBZ(NN,NN)Rotate(DC Fan, air)

사용 Reference

30

Page 32: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 6 주요 알고리즘

Input Algorithm Output

3.6. 주요 알고리즘

문서(Text) Concept Index Concept index

Brief Algorithm Description

문서Concept

추출Concept

Disambiguation문서 내

Concept ListConcept 별

문서 정보 색인

Example

FMEA (FAILURE MODE AND EFFECTSTerm Document ID

FMEAFMEA (FAILURE MODE AND EFFECTS

Report

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a fan not purifying polluted air

1 FMEA 12 FAILURE 1.23 Dust 1,34 Gas 1,2,4

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a

FMEA (FAILURE MODE AND EFFECTS ANALYSIS) Doc. Reporting Person : Mr. An. Reporting Date : 2010.05.01.There are mal-functions in Filtering to collect dust and remove gas. It makes a

FEMA

Failure

사용 R f

fan not purifying polluted air… … … …… … …

fan not purifying polluted air…collect dust and remove gas. It makes a fan not purifying polluted air… Dust Gas

사용 Reference

31

Page 33: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 7 주요 알고리즘

Input Algorithm Output

3.7. 주요 알고리즘

Query, 문서들 Vector Space Model (Similarity) Query와 가장 유사한 문서

Brief Algorithm Description

Example문서 내 각 단어의 Score, ex) FMEA, Failure, Dust, Gas…)

D2와 D3가가장 유사함

문서 내 각 단어의 Score, ex) FMEA, Failure, Dust, Gas )

사용 R f

가장 유사함

사용 Reference

32

Page 34: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

4 의미기반 CAD 모델 추출 및 검색4. 의미기반 CAD 모델 추출 및 검색

DOCDOCCAD주요

Parameter 추출

LexiconFeature 추출

DomainOntology

Feature 속성 추출

F OntologyFeatureDisambiguation

F t XML

Feature IndexingQuery

Feature XML

Feature IndexingQuery

33

Page 35: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1 B k d1. Background1. Background

2 New Approach2 New Approach

3. Research Trend & Paper Introduction3. Research Trend & Paper Introduction

2. New Approach 2. New Approach

pp

4. Introduction of Basic Algorithm 4. Introduction of Basic Algorithm

5. Case Study 1,2,3

6. Conclusion6. Conclusion

34

Page 36: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

C St d O tliCase Study Outline

CASE 3. 유사한 과거

CASE 2. Conceptual design ofSemantic Issue management system

FMEA 문서 추출

CASE 1. 문서기반 CAD Model 검색

PCD

Neutral DataCCD

Engineering문서

DataBase

의미기반검색

SemanticProcessor

Data &KnowledgeE i

WS data문서

(WS/SS/NS)검색Processor Knowledge

BaseEngineers

참조

Engineer

data

SN

35정보 생산 측면 정보 소비 측면

참조의미 모델

SNData

SNS users

Page 37: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

CASE STUDY 1 :

문서 기반 CAD Model 검색

36

Page 38: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 재사용을 위한 CAD M d l 검색의 문제점

Case Study 1 :문서기반 CAD Model 검색 (1/16)

1. 재사용을 위한 CAD Model 검색의 문제점

• CAD Model의 재사용• CAD Model의 재사용• 제품 설계 시 80%의 CAD Model이 재사용 되고 있음.• 성공적인 재사용은 50%의 비용 절감 효과 발생.

• 현재 CAD Model 검색 방법• CAD File 이름을 통해• CAD File 이름을 통해• 정보 시스템(PLM,PDM)의 정보(BOM) 기반 검색

사용자가 의도한 검색에 한계 (검색 결과 중 48%는 사용 불가[1])

• 보다 효과적인 검색 및 재사용을 위해서• 단순 이름 및 정보 기반이 아닌 Semantic CAD Model검색 필요• 단순 이름 및 정보 기반이 아닌 Semantic CAD Model검색 필요• CAD Model에 대한 Knowledge 추출 및 재사용 필요

[1] LI, M, Y F Zhang*, J Y H Fuh and Z M Qiu, "Towards effective mechanical design reuse:feature-based CAD model retrieval on general shapes and partial shapes". JOURNAL OFMECHANICAL DESIGN, (2009).

37

Page 39: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 CAD M d l 재 사용 예

Case Study 1 :문서기반 CAD Model 검색 (2/16)

2. CAD Model 재 사용 예

<PLM> Update Simulation Accuracy<Detail Design>

QFD,DFEMA,B/M Report

Basic Spec.(Dimension,Construction,Material)

Test ResultTest Result

Update Simulation Accuracy

Re-design DrawingM-BOM Tire

<Detail Design>

EngineeringRequirement

DesignConcept Design Simulation Test

ProductionPerformance

TestManuf.Spec.

Sub Process

CAD ReportCAD Report <CAD&CAE> ReportDrawing

CAD Mesh ODB ReportCAD Mesh ODB Report

PatternSimulation

PatternDesign

MoldDrawing

FEASimulation

Post-Process(Report)

p

: Output

LegendConstruction

DesignPre-Process(Meshing)

CAD : Output

: ProcessSidewallDesign 38

Page 40: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 문서 기반 CAD 모델 검색 프로세스 (1/2)

Case Study 1 :문서기반 CAD Model 검색 (3/16)

3.문서 기반 CAD 모델 검색 프로세스 (1/2)

유사유사도계산

Concept 문서 Semantic 문서 모델 Semantic CAD 모델 CAD 모델

39

Page 41: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 문서 기반 CAD 모델 검색 프로세스 (1/2)

Case Study 1 :문서기반 CAD Model 검색 (4/16)

Doc RepresentationSyntactic Analysis Semantic Analysis

3.문서 기반 CAD 모델 검색 프로세스 (1/2)

문서 Text 추출(Table 정보)

Tokenization

Concept 추출

Concept

Concept XML 생성

Concept IndexingDOC

Tokenization

SentenceSegmentation

pDisambiguation

Concept Joining

Concept Indexing

Semantic 문서 모델생성 (OWL)

DOCDOC

Lexicon DB Ontology DB Semantic Model DB

Feature Extraction

주요Parameter 추출

Feature XML 생성

CAD RepresentationFeature Analysis

Feature Parameter 추출

Feature 추출 Feature Relationship추출

Feature Indexing DOCDOCCAD

Disambiguation

Feature 속성 추출Semantic CAD 모델

생성

40

Page 42: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

CASE STUDY 2 :

Conceptual design ofS ti I t tSemantic Issue management system

41

Page 43: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

1 AS IS

Case Study 2 : Conceptual design ofSemantic Issue management system (1/19)

1. AS-IS

AS-ISAS-IS① 매일 매일 시스템 개발 프로젝트와 관련한

다양한 형태의 회의, 토의, 의사결정이 이루

어지고 있지만, 이에 대한 내용은 참석자 및

관련자(회의록 등을 메일로 전달받은 사람)

들에게만 전달되고 있음.

② 시스템 또는 서브시스템의 규격, 설계사항과

관련하여 토의되는 내용이 대부분이지만,

이와 관련한 분류(시스템 구성에 따른) 및 변이와 관련한 분류(시 템 구성에 따른) 및 변

경사항 반영은 현재는 모두 데이터베이스 관

리자가 직접 수행하여야 하는 상태임.

③ 심각한 overhead 발생③ 심각한 overhead 발생.

④ 변경사항 반영이 실시간으로 이루어지지 못

하고 있으며, 누락사항 발생도 충분히 가능

한 상태임한 상태임.

42

Page 44: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

2 TO BE

Case Study 2 : Conceptual design ofSemantic Issue management system (2/19)

2. TO-BE

제안 사항

① 조직 구성(OBS), 업무분할구조(WBS),

시스템 분해구조(SBS), 요구사항 전

개구조(RBS)를 Ontology 모델 등으로

구축. Reference Ontology Modelgy

② 회의 및 의사결정 데이터 입력 : 회의록

(document) 또는 SE 도구를 이용한

well-structured data.

③ 입력 데이터를 Reference Ontology③ 입력 데이터를 Reference Ontology

model을 이용, 각 조직, 업무, 시스템

에 대하여 Semantic하게 할당, 재정리.

TO-BE① 엔지니어가 출근하여 PLM시스템에 로

그인하게 되면, 항상 자기자신의 업무

(또는 조직, 담당 시스템, 요구사항)와

관련하여 토의되거나 의사 결정된 내용

을 실시간으로 확인 가능.

② 자기자신이 현재 해결하여야 할 이슈② 자기자신이 현재 해결하여야 할 이슈

(회의의 경우 Action item)을 실시간으

로 확인 가능.

43

Page 45: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

3 TO BE W k Fl

Case Study 2 : Conceptual design ofSemantic Issue management system (3/19)

3. TO-BE Work Flow

44

Page 46: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

ContentsContents

1 B k d1 B k d1. Background1. Background

2 New Approach2 New Approach

3. Research Trend & Paper Introduction3. Research Trend & Paper Introduction

2. New Approach 2. New Approach

pp

4. Introduction of Basic Algorithm 4. Introduction of Basic Algorithm

5. Case Study 1,2,35. Case Study 1,2,3

6. Conclusion

45

Page 47: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

C l iConclusion

• 제품 개발 관련 Engineering 문서 폭증 효율적 검색 방안 필요• 제품 개발 관련 Engineering 문서 폭증 효율적 검색 방안 필요

• PLM이 보편화/안정화/고도화 단계 탐색/검색 기능 부각

• Keyword 검색 선택의 폭 너무 넓음

• Semantic Processing 기반 Engineering 문서 관리/검색

• 정보 검색의 효율성 (↑)• 정보 검색의 효율성 (↑)

• 정보 재 활용성 (↑)

• 정보의 통합성 (↑)

46

Page 48: Semantic Processing of Engineering DtiPLMEi tDocuments in ...구문분석 의미분석 Taxonomy Folksonomy Neutral Data PCD Data Base PCD: producer centric data ... •TF-IDF : Term

Question and AnswerQTHANK YOU ☺

47