Presentation Transcript

Added: September 11, 2007

Infrastructures in Korea and for the Korean Language:
    Key-Sun Choi

Academic Society:
    SIG-Korean Language Computing under Korea Information Science Society: 300 members
    Korea Information Society: linguistics oriented

KIBS Korea Information Base and Systems:
    Purpose:
        To improve Korean Language Processing Technology
        To promote Korean Software Industry
    In the planning phase (1993), targeted to Hangul Wordprocessor, Machine Translation and Korean Linguistic Research
    1995 - 1997 (Phase 1): 'word'
        Two-ministry joint project + Industry
        Ministry of Science & Technology, Ministry of Culture
    1998 - 2000 (Phase 2): 'sentence'
        Only by Ministry of Science & Technology + Industry
        Will be evaluated in October, 2000
    2001 - 2003 (Phase 3): 'discourse' - not decided
    http://kibs.kaist.ac.kr/

King Sejong Project:
    Purpose:
        To promote the Korean Language Research in the linguistics side
        To prepare for the language planning for Unification of South-/North-Korea
        For International use of Korean
    Sponsor: Ministry of Culture
    Period: 1998 - 2007 (10 years)
    Items: corpus, dictionary, internationalization, terminology, education, font, old Korean
    http://www.sejong.or.kr/

KIBS: Architecture:
    [Architecture diagram; only its labels survive in the transcript]
    Levels: Engine Module Level, Engine Level, Knowledge Source Level, Application Level, Knowledge Level
    Engine modules: MA1, MA2, TA1, TA2, PA1, PA2, WSD1, WSD2, DA1, DA2, RM1, RM2
    Knowledge: Ontology, Common Knowledge, Domain Knowledge, Electronic Dictionary
    Data: Basic DB, corpus, MRD, Knowledge extractor, Master DB, Terminology DB
    Engines: MT engine, IR engine, Spell checker, Style checker, UI engine
    Applications: Word processor, MT system, Information Retrieval System, Automatic Speech Translation
    Users: End User, User (Programmer), User (lexicographer), User (Dictionary)
    Other labels: Quality Management System -- System Terminology, Distributed Resource Management System, Tagging Support Tool

KIBS: Introduction:
    Title of Project:
        KIBS I: Integrated Korean Information Base
        KIBS II: On Development of Deep-Level Processing and Quality Management Technology for Very Large Korean Information Base
    Outline:
        Term: 1994.12.4 ~ 2004.9.30 (10 years)
        Sponsor: Ministry of Science and Technology
        Staff: 50 person/year

The Goal of First step:

The Goal of Second step:

Development Tools:
    Korean Concordance Program (KCP)
    Compound Noun Browser
    Corpus Browser
    Corpus Browser by Category
    Automatic English-to-Korean Transliteration System (TLEK)
    KAIST Ontology Browser
    Korean Morphological Analyser
    Korean Tagger
    Korean Syntactic Analyser
    Editing Support Tools to Electronic Dictionary

Results & Distribution:
    Major Results
    The first (KIBS I): 1997.6. ~ present (80 sites)
        Text corpus: 10 million word phrases
        POS tagged corpus: 1 million word phrases
        Syntactic structure tagged corpus: 10 thousand sentences
        TDMS, Speech DB samples, Hand-written character DB samples
    The second (KIBS II): 1998.12. ~ present (140 sites)
        Raw corpus: 10 million word phrases
        POS tagged corpus: 200 thousand word phrases
    The third (KIBS III): 2000 (pending)
        Proper noun: 10 thousand entries
        Compound noun: 20 thousand entries
        Verb sentence pattern dictionary: 3 thousand entries, ...
    Plan to maintain and distribute ...

KORTERM:
    Korea Terminology Center for Language and Knowledge Engineering
    http://korterm.or.kr/
    http://korterm.org/

Goals of KORTERM:
    Through world-wide terminology collection, and its standardization and harmonization in local society, distribution, publication and application in language and knowledge engineering are promoted.
    Through education and consultation on terminology R&D methodology for each subject field, high-quality, highly reliable terminology and its infrastructure and system are achieved.
    Center of Terminology and Knowledge Engineering

Phases and Subjects of KORTERM:
    Phases:
        Phase 1 (1998-2000): R&D Environment and Basic Data Collection
        Phase 2 (2001-2003): Value-Added Working System
        Phase 3 (2004-2007): Operation
        Phase 4 (2008 - ): Maintenance and Extension
    Subjects (in transcript order):
        Integration of Working Terminology
        Terminology Collection (Basic S&T, Industry Standard, Economics)
        Electronic Terminology (Publication)
        R&D Environment (System Standardization)
        Terminology Theory and Education Infrastructure
        Value-Added Terminology Integration
        Terminology Collection (Extended S&T)
        Extension & Maintenance (Industry Standards)
        High-Quality Terminology Application in Language Industry
        Verification for High-Reliability and Distribution
        Multi-lingual Terminology Integration
        Terminology Collection (Humanity and Social Science)
        Maintenance and Extension
        Large-Scale Knowledge Base for Terminology
        Terminology Education Curriculum Development
        Application Product Development
        Continuous Extension and Management
        Terminology Study Promotion
        Distribution of Terminology Information Base
        Continuous Terminology Extension and Management

R & D (1):
    Basic Data (Corpus)
        Corpus for Each Subject Domain
        Electronic Dictionary for Basic Vocabulary
        Everyday Vocabulary consists of General Vocabulary and Everyday Terminology
    Internationalization of Korean Language
        South-North Korean Terminology Standardization, Korean Language Input Methods
    Korean Language Engineering
        Standardized Term Use for Information Retrieval, Machine Translation and Document Classification

R & D (2):
    Language Engineering
        Information Retrieval: Effective Internet Information Creation and Information/Knowledge Acquisition
        Multi-lingualism
        Machine Translation: Efficient Information Generation through Terminology and Vocabulary Collection and Standardization
        Wordprocessor: High Productivity by Spelling Correction, Summarization and Efficient Use

R & D (3):
    Language, Information and Terminology
        Language Education: Technical Thinking and Technical Communication
        Terminology-based Education
        Language Study: Domain-specific Language Study

Terminology Sponsors:
    Support from Government, Organization and Industry according to each specialty
    Ministry of Culture and Tourism (KORTERM Center Operation)
    Ministry of Science and Technology (R&D Fund)
    Ministry of Information and Telecommunication (R&D Fund)
    Ministry of Diplomacy and Trade
    Ministry of Industry and Resource
    Ministry of Education
    Korea Science and Technology Foundation (Event Support)

Task Configuration:
    [Diagram; only its labels survive in the transcript]
    Terminology Base (Collection): Non-standards, International Term Standard, Terminology Standard
    Language & Knowledge Product
    Environments: Language Education Environment, Terminology Information Environment, R&D Environment
    Application Use
    Terminology Symbolization
    Terminology Access Standard Channel
    Grid Size Controller
    Application-Specific Dictionary
    Language Education Adaptable to Student
    R&D, Industry, Living, Communication
    Standardization & Harmonization
    Terminological Conceptual Space

Large-Scale Speech/Language/Image DB Construction and Evaluation:
    Supported by Ministry of Science and Technology
    Two Year Project (1999.10-2001.10)

Goals:
    [Diagram; only its labels survive in the transcript]
    Final Goal
    Working Group Organization
    Survey and Planning
    IR Test Suite and Evaluation Model Recommend
    MT Test Suite and Evaluation Model Recommend
    Image Attribute Format
    Color-Lexical Entry
    MPEG7 Specification Language
    Sentence-unit Speech DB
    Prosody for Speech Synthesis
    Areas: Speech, Image, Language
    IR/QA: 90 queries / 200K documents
    MT: 5,000 sentences
    Word-unit telephone speech DB: 100 token * 500
    Image: 300 kinds - Meta Data

Question-Answering IR Test Suites:
    Test Suites for IR/QA
    Documents: 207,067 records (370MB), Newspapers
    Query Generation: 90 queries (through 300 quiz query analysis)
    Queries for WH-question and other various types of answers for NLP problem solving
    Relevant document set to include the answer, by using four kinds of commercialized IR systems by 16 kinds of methods

English-Korean MT Test Suites:
    Type Classification: About 300 Kinds
    Test Sentences and Test Query: 5,000 Records
        Extracted from Textbook and Grammar books (1999-2000)
        Will be extracted from real usage like web, newspapers (2000-2001)
    Evaluation by Yes/No Question
    Tested for 4 Commercialized English-Korean MT Systems

MT Evaluation Workbench:

Image Meta Data Editor:
    Meta data Input Workbench by XML

Image Retrieval by Meta data:

http://korterm.kaist.ac.kr/ksurimal/
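
The Question-Answering IR Test Suites slide says the relevant document set was built by running four commercialized IR systems with 16 kinds of methods. The slide does not say how the results were combined. One common way to do this is pooling: take the top-ranked documents from every run and merge them into a single candidate set for judging. The sketch below assumes that approach. The function name, the run labels, the document ids and the pool depth are all hypothetical and do not come from the presentation.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Merge the top-`depth` documents of every ranked run into one pool.

    `runs` maps a run label (for example "system/method") to a ranked list
    of document ids, best first. This is an assumed pooling scheme, not the
    procedure documented in the slides.
    """
    pool: Set[str] = set()
    for ranked_docs in runs.values():
        # Keep only the highest-ranked `depth` documents of this run.
        pool.update(ranked_docs[:depth])
    return pool


# Hypothetical rankings for a single query.
runs = {
    "system_a/method_1": ["d3", "d1", "d7"],
    "system_b/method_1": ["d1", "d9", "d2"],
    "system_c/method_2": ["d4", "d3", "d8"],
}

pool = build_pool(runs, depth=2)
print(sorted(pool))
```

Every document in the pool would then be checked by hand for whether it contains the answer to the query, so the pool depth trades judging effort against coverage.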