JS WS TAMA2001

— Ontologies in Terminology Work — Enabling Controlled Authoring: 

Dr Jörg Schütz, IAI Saarbrücken, joerg@iai.uni-sb.de

What is it about?: 

Technical information and asserted knowledge in SGML/XML environments:
- Acquisition
- Production
- Translation (localization, internationalization, globalization)
- Dissemination
- Assimilation
Shared corporate (public) knowledge backbone for
- guiding and supporting the processes
- exchange with suppliers and customers
Application and deployment scenarios

Road map: 

- An introduction to the technology used
- Why ontologies as knowledge backbone?
- How ontologies support and guide the processes
- How to build and constrain an ontology
- The ontology life-cycle (steps and rules)
- How to represent an ontology (exchange formats)
- Application examples
- Conclusion
- Questions and answers

What is Multidoc Technology?: 

Controlled Language enabling technology that identifies
- spelling mistakes
- grammar errors
- stylistic weaknesses
- terminology misuse and inconsistency
in SGML/XML encoded information objects.
Based on IAI's core language technology and a prototype designed and built in an EC project with automotive partners, a translation company and a translation technology provider:
- BMW, Jaguar, Renault, Rolls-Royce, Saab, Volvo
- ITR and STAR

What is Language Technology?: 

“Research, development and production of methods, methodologies and tools for the processing and the deployment of human language through computer systems. We distinguish between software for the algorithmic and programmatic implementation, and lingware for the formalization of the language knowledge resources (lexicons, terminologies, grammars and ontologies).”

What is a Controlled Language?: 

A controlled language consists of
- a well-defined, unambiguous vocabulary (aka terminology, preferably multilingual)
- certain grammatical restrictions
- a set of writing guidelines for different types of information objects (aka style rules)
Formalized in rules for computer use (similar to business rules):
- IF conditions THEN actions
- Rule engine operating on the results of the linguistic analysis module (linguistic engine)
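The IF conditions THEN actions format can be illustrated with a small rule engine. This is only a minimal sketch, not the Multidoc implementation: the `Token` and `Rule` types, the deprecated-term table and the sentence-length limit are all hypothetical, and the token list stands in for the output of a linguistic analysis module.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Token:
    text: str
    pos: str  # part-of-speech tag, assumed to come from the linguistic analysis


@dataclass
class Rule:
    name: str
    condition: Callable[[List[Token]], bool]  # the IF part
    action: Callable[[List[Token]], str]      # the THEN part (builds a message)


# Hypothetical terminology entry and style limit, for illustration only.
DEPRECATED_TERMS = {"automobile": "car"}
MAX_SENTENCE_LENGTH = 20


def deprecated_found(tokens: List[Token]) -> bool:
    return any(t.text.lower() in DEPRECATED_TERMS for t in tokens)


def deprecated_message(tokens: List[Token]) -> str:
    hits = [t.text for t in tokens if t.text.lower() in DEPRECATED_TERMS]
    return "; ".join(
        f"use '{DEPRECATED_TERMS[h.lower()]}' instead of '{h}'" for h in hits
    )


RULES = [
    Rule("terminology", deprecated_found, deprecated_message),
    Rule(
        "sentence-length",
        lambda ts: len(ts) > MAX_SENTENCE_LENGTH,
        lambda ts: f"sentence has {len(ts)} tokens; limit is {MAX_SENTENCE_LENGTH}",
    ),
]


def check(tokens: List[Token], rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Fire every rule whose condition holds and collect its message."""
    return [(r.name, r.action(tokens)) for r in rules if r.condition(tokens)]


sentence = [
    Token("Remove", "VERB"),
    Token("the", "DET"),
    Token("automobile", "NOUN"),
    Token("battery", "NOUN"),
]
findings = check(sentence)
```

Keeping conditions and actions as separate callables mirrors the rule format on the slide and lets terminology, grammar and style rules share one engine.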

Why is CL important?: 

Supports the complete information cycle (acquisition, production, translation, dissemination and assimilation of technical information) in several dimensions:
- terminological standardization
- accuracy
- readability and comprehensibility
- reusability and maintainability
- cost-effective, controllable and benchmarkable translation
- reduced lead times and faster time to market
Cost and time savings; better control and exchange

How does it work? /1: 

How does it work? /2: 

[Diagram; labels: Information Mapping, Conformance Checking, Controlled Language]

Why ontologies?: 

- Create and maintain the vocabulary of the domain
  - Product data: parts lists, nomenclature, ...
  - Terminology: different user profiles, views, ...
  - Translation: styles, cultures, regulations, ...
- Support the different processes and the knowledge workers (writers, translators, workshop employees, ...)
- Encode knowledge about the domain beyond the capabilities of term banks, specialized dictionaries, glossaries, indices and thesauri
- Search, retrieval and filtering of technical information
- Sharing and exchange of technical information
- Enable automation, e.g. controlled language application

How ontologies guide and support: 

Highly structured repository of knowledge making explicit the
- concepts
- attributes of concepts
- properties of concepts
- relationships between concepts
of a given domain.
Encoded (multilingual) term formation rules for
- term mining (identify new terms/concepts)
- term seeding (distribute new terms/concepts)
within a given domain.

How to build an ontology — definition: 

“An explicit formal specification of how to represent the objects, concepts, and other entities that are assumed to exist in some area of interest and the relationships that hold among them.” [Free On-line Dictionary of Computing]
“An explicit specification of some subject field. In the context of our project, it is a formal and declarative representation of the automotive service and repair domain (operating system domain): vocabulary and rules.”

How to build an ontology — concept: 

A concept
- is a language-neutral symbol that is used to represent meaning corresponding to a distinguished type of entity
- is characterized by
  - a unique label
  - an associated (intensional) definition
  - a set of characteristics encoded as attribute-value pairs
Example: CAR
  #_of_doors              2/4/5
  left/right-hand-drive   left/right
  turbo                   yes/no
  catalyst                yes/no
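One way to hold such a concept in code is a record with a label, a definition and a table of permitted attribute values. The sketch below is an assumption about a possible data structure, not the Multidoc schema; the `Concept` class, its `validate` method and the definition text are hypothetical, while the attribute names and values are taken from the CAR example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Concept:
    label: str                      # unique, language-neutral label
    definition: str                 # intensional definition
    attributes: Dict[str, Tuple[str, ...]] = field(default_factory=dict)

    def validate(self, instance: Dict[str, str]) -> List[str]:
        """Return a list of problems for a set of attribute-value pairs."""
        errors = []
        for attr, value in instance.items():
            allowed = self.attributes.get(attr)
            if allowed is None:
                errors.append(f"unknown attribute: {attr}")
            elif value not in allowed:
                errors.append(f"{attr}={value} not in {allowed}")
        return errors


CAR = Concept(
    label="CAR",
    definition="placeholder definition (not given on the slide)",
    attributes={
        "#_of_doors": ("2", "4", "5"),
        "left/right-hand-drive": ("left", "right"),
        "turbo": ("yes", "no"),
        "catalyst": ("yes", "no"),
    },
)

ok = CAR.validate({"#_of_doors": "4", "turbo": "yes"})
bad = CAR.validate({"#_of_doors": "3", "sunroof": "yes"})
```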

How to build an ontology — relation: 

A relation provides information about links between two (or more) concepts and reflects how the respective entities are related to each other:
- super-concept/sub-concept (hyperonymy/hyponymy)
- whole/part (meronymy)
- participant role
Example:
  isa(<subordinate_concept>,<super_ordinate_concept>)
  hasa(<whole>,<part>) and partof(<part>,<whole>)
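The isa, hasa and partof relations can be sketched as a small in-memory store. This is a hedged illustration only: the `Ontology` class and the concept names are hypothetical, partof is derived as the inverse of hasa, and isa is treated as transitive when collecting super-concepts.

```python
from collections import defaultdict


class Ontology:
    def __init__(self):
        self.isa = defaultdict(set)   # sub-concept -> direct super-concepts
        self.hasa = defaultdict(set)  # whole -> direct parts

    def add_isa(self, sub, sup):
        self.isa[sub].add(sup)

    def add_hasa(self, whole, part):
        self.hasa[whole].add(part)

    def partof(self, part, whole):
        """partof(<part>,<whole>) holds exactly when hasa(<whole>,<part>) does."""
        return part in self.hasa.get(whole, set())

    def ancestors(self, concept):
        """All super-concepts reachable by following isa links."""
        seen = set()
        stack = [concept]
        while stack:
            for sup in self.isa.get(stack.pop(), ()):
                if sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen


onto = Ontology()
onto.add_isa("convertible", "car")   # hypothetical concepts
onto.add_isa("car", "vehicle")
onto.add_hasa("car", "engine")

supers = onto.ancestors("convertible")
```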

How to build an ontology — relation/2: 

Example (participant roles):
  effector(<activity>,<entity>)
  effected(<activity>,<entity>)
  affected(<activity>,<entity>)
  affector(<activity>,<entity>)
  location(<activity>,<entity>)
  instrument(<activity>,<entity>)
  attribuant(<state>,<entity>)

How to build an ontology — qualia: 

Need for multidimensional views to account for different facets, e.g.
- [gasoline] as a [liquid]
- [gasoline] as a [combustible]
Qualia Theory of Pustejovsky (“mode of explanation”):
- <constitutive>: the relation between an object and its constituent parts -> [cons] with [whole] and [part]
- <formal>: that which distinguishes it within a larger domain -> [form] (isa)
- <telic>: its purpose and function -> [purp] (purpose)
- <agentive>: factors involved in its origin or “bringing it about” -> [orig] (origin)
Quale meta-entity: qua(<quale>,<concept>)
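The qualia views can be pictured as facts tagged with one of the four quale labels. The following is a rough sketch under stated assumptions: the slide gives only the binary form qua(<quale>,<concept>), so the third element (the value seen under that view) is an extension made for illustration, and assigning [liquid] to the formal role and [combustible] to the telic role is likewise an assumption.

```python
from enum import Enum


class Quale(Enum):
    CONS = "cons"  # constitutive: whole/part
    FORM = "form"  # formal: isa
    PURP = "purp"  # telic: purpose
    ORIG = "orig"  # agentive: origin


# (quale, concept, value) triples; the role assignments are assumptions.
FACTS = [
    (Quale.FORM, "gasoline", "liquid"),
    (Quale.PURP, "gasoline", "combustible"),
]


def views(concept, facts):
    """Group the facets of one concept by quale."""
    result = {}
    for quale, subject, value in facts:
        if subject == concept:
            result.setdefault(quale, []).append(value)
    return result


gasoline_views = views("gasoline", FACTS)
```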

How to build an ontology — schema: 

Ontology life-cycle: 

1. Information gathering and structural design
2. Definition of concepts
3. Definition of constraints
4. Population with instances (terminology integration)
5. Delivery
6. Deployment
Development basis of the Multidoc Ontology:
- automotive documentation; software documentation
- automotive terms (13,800 EN; 15,000 DE; 12,000 FR; 6,800 SE); software terms (9,300 EN)

Ontology snap-shot: 

Exchange formats: 

XML based, including the tools used for creation and maintenance (ontology and associated population).
Possible exchange formats:
- Pure XML (serialized with DTD or schema)
- Ontology exchange formats such as OIL, or even KIF and CGs (very close in notation, ANSI standard)
- SALT’s XLT format (ISO 12620 related)
- Topic Maps (ISO 13250 and XTM 1.0)
Evaluation of Topic Maps:
- Electronic information repository
- Publishing and information/knowledge management
- Integration, support and exchange with other resources of corporate knowledge management systems
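To make the "pure XML" option concrete, a concept could be serialized roughly as follows. This is a sketch only: the element and attribute names (`concept`, `definition`, `attribute`, `value`, `relation`) are invented for illustration and do not reproduce the Multidoc DTD or any of the standards listed; only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET


def concept_to_xml(label, definition, attributes, relations):
    """Build an XML element for one concept (hypothetical vocabulary)."""
    concept = ET.Element("concept", {"label": label})
    ET.SubElement(concept, "definition").text = definition
    for name, values in attributes.items():
        attr = ET.SubElement(concept, "attribute", {"name": name})
        for value in values:
            ET.SubElement(attr, "value").text = value
    for rel_type, target in relations:
        ET.SubElement(concept, "relation", {"type": rel_type, "target": target})
    return concept


car = concept_to_xml(
    "CAR",
    "placeholder definition",
    {"turbo": ["yes", "no"]},
    [("isa", "VEHICLE")],  # VEHICLE is a hypothetical super-concept
)

xml_text = ET.tostring(car, encoding="unicode")
restored = ET.fromstring(xml_text)  # round trip: parse the serialized form again
```

A DTD or schema for such a vocabulary would then constrain which attributes and relation types are allowed, which is where the constraint definitions of the ontology would be enforced at exchange time.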

Topic Maps (ISO 13250): 

- Genesis dates back to the early 1990s: interchange of computer documentation based on SGML; navigation within this information resource similar to indices
- Creation of the SGML DocBook DTD
- Work on “Topic Navigation Maps” as a HyTime application
- Specification of Topic Maps evolved as a general navigation-enabling model by using independent (or out-of-line) linking and addressing mechanisms, and a basis for querying that extends full-text search
- ISO standard at the end of 1999
- XML version (XTM) at the end of 2000

Topic Maps — ISO 13250 Intro: 

“This International Standard provides a standardized notation for interchangeably representing information about the structure of information resources used to define topics, and the relationships between topics. A set of one or more interrelated documents that employs the notation defined by this International Standard is called a topic map.”

Topic Maps — main building blocks: 

- Topic -> Concept
  - topic types (categorization of a topic)
  - topic names (base name, display name, sort name)
- Occurrence (linked resources) -> Instance
  - occurrence role (mnemonic)
  - occurrence type (link to the topic which further characterizes the role)
- Association (relationship between two or more topics) -> Relation
  - association types
  - association roles
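The three building blocks can be sketched as plain data records. This is an illustrative model under assumptions, not the ISO 13250 or XTM data model: the class and field names, the `related` helper, the topic identifiers and the resource path are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Occurrence:
    role: str      # mnemonic occurrence role
    resource: str  # address of the linked resource


@dataclass
class Topic:
    topic_id: str
    base_name: str
    types: Set[str] = field(default_factory=set)
    occurrences: List[Occurrence] = field(default_factory=list)


@dataclass
class Association:
    assoc_type: str
    members: Dict[str, str]  # association role -> topic id


def related(topic_id, associations):
    """List (association type, role, topic id) for the other members."""
    result = []
    for assoc in associations:
        if topic_id in assoc.members.values():
            result.extend(
                (assoc.assoc_type, role, other)
                for role, other in assoc.members.items()
                if other != topic_id
            )
    return result


pad = Topic(
    "brake-pad",
    "brake pad",
    types={"part"},
    occurrences=[Occurrence("repair-instruction", "manuals/brakes.xml#pads")],
)
system = Topic("brake-system", "brake system", types={"assembly"})
assoc = Association("partof", {"part": "brake-pad", "whole": "brake-system"})

neighbours = related("brake-pad", [assoc])
```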

Topic Maps — more building blocks: 

- Identity: enables the merging of topic maps; topics that are semantically equivalent are merged into a single topic that is the union of the characteristics of two or more topics (cf. public subject, topic map grove)
- Facets: enable the assignment of meta-data through attribute/value pairs (facet/facet value); used to create query filters; not used to qualify topic map elements
  - facet value name (token)
  - facet value type (reference to a topic which further qualifies the relevance of the value)
- Scope: the limit of validity of an assignment of a topic’s characteristic (name, occurrence, association)
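Identity-based merging and scope can be illustrated with a few lines of code. This is a simplified sketch, not the merging procedure defined by the standard: topics are plain dictionaries, identity is a single string (the `urn:example:` identifier and the resource names are invented), and a scope is reduced to one theme per name.

```python
def merge_topics(topic_maps):
    """Merge topics that share an identity by taking the union of their characteristics."""
    merged = {}
    for topic_map in topic_maps:
        for topic in topic_map:
            target = merged.setdefault(
                topic["identity"], {"names": set(), "occurrences": set()}
            )
            target["names"] |= topic["names"]
            target["occurrences"] |= topic["occurrences"]
    return merged


def names_in_scope(topic, theme):
    """Names whose validity is limited to the given theme."""
    return {name for name, scope in topic["names"] if scope == theme}


# Two hypothetical topic maps describing the same subject in different languages.
map_en = [{
    "identity": "urn:example:brake-pad",
    "names": {("brake pad", "EN")},
    "occurrences": {"manual-en.xml#pads"},
}]
map_de = [{
    "identity": "urn:example:brake-pad",
    "names": {("Bremsbelag", "DE")},
    "occurrences": {"manual-de.xml#pads"},
}]

merged = merge_topics([map_en, map_de])
german_names = names_in_scope(merged["urn:example:brake-pad"], "DE")
```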

Topic Maps — any buts?: 

Everything in a topic map is a topic:
- all types are defined as topics
- scope is defined in terms of themes, which are topics
Powerful model allowing
- self-documentation
- ontology description (the things the topic map consists of)
- efficient navigation and querying
Control information:
- topic map processing
- topic map templates (declarative part of a topic map)
Further research: query language, constraint schemas, user profiles, ...

Applications using ontologies: 

- Controlled language based authoring
- Term mining
- Classification/indexing
- Retrieval and filtering
Customer basis:
- automotive industry
- printing machine industry
- software industry

Highlights of the CL Product /1: 

Customizable:
- terminology import based on XML schema (ISO 12620 inspired)
- rules according to existing in-house style guidelines
- linguistic engine adaptable to certain SGML/XML DTD or schema elements (these elements trigger processing and thus affect the performance of the linguistic engine)
Can be integrated into, and supports, existing IT infrastructure: SGML editors, DMS, IMS and KMS
Available for several languages: DE, EN, FR, SE, ...

Highlights of the CL Product /2: 

- Increases hit rates of Translation Memory (TM) systems (can also check TM content)
- Enables the setup of quality assurance processes (quality index and translatability index), e.g. SAE J2450
- Supports B2B and B2C processes and operations
- Linguistic Engine has well-defined input/output behavior (XML based) to allow for different deployment scenarios

Highlights of the CL Product /3: 

Client/server implementation:
- GUI with error rendering, navigation and editing capability, implemented in Java
- Client (glue code between GUI and LE), implemented in C
- Linguistic Engine (LE), implemented in C
OS platforms: Solaris, Linux, HP-UX, Windows NT and 2000

Releases available: 

- Integrated SGML/XML editor (tag-safe, tag-sensitive)
- Full navigation within the loaded information object (multithreading capability)
- Server architecture with integrated management facility for user/group and resources client configuration
- Fully-fledged development environment for rule customization
- MS Word (MS Office) integration for mass market

Customer References: 

- BMW AG, German (codename: Multilint)
- Several German SMEs (codename: Tetris)
- Sun Microsystems, English (codename: SunProof)
- Volvo Car and Volvo Truck, Swedish
- Saab, Swedish
- Rolls-Royce and Bentley Motor Cars, English
- Heidelberger Druckmaschinen AG, German
- Siemens AG, German
- DUDEN (language assistant), German
- Plan Software (electronic catalogues), DE/EN/FR

Demonstration on request: 

Conclusion: 

Important aspects of an ontology as knowledge backbone:
- increases productivity
- avoids inconsistency
- permits efficient exchange
- allows for effective merging
Standardized model for language technology that supports
- integration
- effective deployment
Multilingual repository that guides and supports
- (traditional) translation tasks
- localization, internationalization and globalization processes

Further information: 

Ontologies in general:
- www.cyc.com
- www.bestweb.net/~sowa
- www.ontology.org
- www.ontoknowledge.org
- ...
Topic maps:
- www.topicmap.org
- www.infoloom.com
- www.topicmap.com (empolis/Bertelsmann)
- www.mondeca.com
- www.ontopia.net
- ...