Business Case for Semantic Interoperability : Business Case for Semantic Interoperability
Clarifications : Clarifications Difference between a ‘business case’ and a ‘scenario’
A business case describes the business objectives and drivers, proposes a solution, outlines enablers, identifies the risks of not building the solution, outlines an implementation plan, and provides operational criteria
A scenario – in our context – is an example of a situation in which the solution might be implemented
What I am going to describe today is a business case for an intelligence environment, in which scenarios help to illustrate how the solution would work, and in which semantic interoperability is a key enabler
Semantic Interoperability itself is NOT the business case…
Business Case Overview : Business Case Overview Business Objectives
Business Drivers
Proposed Solution
Enablers
Risks of Not Building
Implementation Plan
Operational Criteria
Business Objectives : Business Objectives The intelligence environment for the 21st century is not the same as the intelligence environment of the 1950s Cold War era
Cold War threat scenarios were the best we could do for the technical capabilities we had at that time
We focused on linking people – we focused on the capability to find people who posed threats rather than the threats themselves
The world has become more complex since that time – our business cases must reflect this complexity if they are to be believable and our intelligence environment must be more complex if it is to be effective
Business Objectives : Business Objectives To improve the environment in which information is collected, characterized, analyzed and converted into intelligence in the context of threat scenarios
To leverage existing knowledge of accident prevention, safety protocols, training exercises and mitigation as actionable threat scenarios
To leverage existing sources of intelligence and domain information to create a broader intelligence picture, without removing it or exposing it to discovery outside of the creating agency
To focus the intelligence analyst’s work on the evaluation and further elaboration of threat scenario conditions
To bring into the intelligence model the knowledge of domain experts to assist with threat discovery, thresholding and mitigation
Business Drivers : Business Drivers The current approach to intelligence analysis is inefficient because it relies on productivity-constrained human intelligence to manually review information and to mentally translate, harmonize and extract knowledge from sources that may or may not be related in some context or by some specific feature.
Because the scale and scope of potential threats have increased, more information is being collected but the model remains unchanged.
Humans have a rubber ceiling of time and mental capacity – human capacity should be applied in thoughtful and critical review of scenario conditions and thresholds
Systems should be used to organize and manage information, to set the context for scenario discovery and human interpretation of threat scenario levels
Business Drivers : Business Drivers We can’t see our intelligence problems – information overload hides what’s missing from the big picture and may lead analysts to spend valuable time looking at unimportant sources
It may not be possible to connect source content along common integration points due to system security constraints, lack of consistency in specifications, syntax and semantics, and analysts’ different mental models
Common data reference model across systems and sources not available and no context for building reference model
Integration of information currently dependent upon human communication, human access to information, and human analysis and mental capacity
We do not have an integrated, interoperable intelligence environment today and we need one
The Proposed Solution : The Proposed Solution
Proposed New Intelligence Model : Proposed New Intelligence Model In order to discover potential threats in today’s complex and global environment, analysts must be able to work with the “big information picture”
We contend that human intelligence is inefficiently used in the initial stages of intelligence gathering, and would be better used to analyze, interpret and critically evaluate system-generated fits of threat scenarios to current conditions.
Intelligence analysts’ intuitive understanding of threat scenarios should be leveraged to build explicit threat models
Training scenarios, accident and safety protocols should be converted to actionable threat models
Slide10 : Across Federal Enterprise
Repository of Intelligence Metadata: reference tables, maps, etc.
Sources: CIA sources, DIA sources, AGR. IA sources, Transport sources, Manufact. sources, Foreign sources, State sources, Local sources
METADATA FEEDS UP FROM EACH DOMAIN & BACK DOWN
Scenario Repository: Fictional Descriptions, Analyst’s Knowledge, Accident & Safety Protocols, Law Enforcement Reports, Critical Incident Histories
Users: Analyst, DCA Security Agent, Airport Security
Diagram labels: SI, Threshold warnings
View of the Information in Scenario context with language of that user
Centralized metadata/ “Repository of Intelligences”
Decentralized review of scenarios for ______ conclusions
Person in the context is best qualified to judge the significance of threshold alerts and to write interpretations
Components of the Reference Model : Components of the Reference Model Threat scenarios constructed as mathematical/ simulation models – built by domain experts
Central repository of surrogate information (metadata) for entities that pertain to threat scenarios
Metadata extracted from domain systems and based on domain expert knowledge
Scenarios run against metadata/metainformation to identify threat conditions or emerging threats
Mitigation or aversion strategies in place to stop the threat, rather than to simply identify an incident
Proposed New Intelligence Model : Proposed New Intelligence Model Information collected and processed by one agency needs to be available to another agency consistent with that agency’s mission, regardless of the purpose for which it was created.
Data need not move physically to be made available outside of its repository.
A cross-domain, cross-system, high level reference model of entities and attributes is proposed - this reference data should be sufficient to enable scenario analysis across agencies, domains, countries
Entities and attributes for this reference model are derived from existing domain knowledge but are extended through rich scenario modeling
Actual data for model is fed from deep surrogation of existing domain and intelligence information sources – information itself does not need to leave its source system
Proposed New Intelligence Model : Proposed New Intelligence Model The context in which we detect threats must change – as we heard from the 9/11 Commission
Detection must be proactive, not reactive – onsite rather than remote
Detection must be actionable and realistic, not coincidental and crude – train scenario March, 2004
High level reference model is driven by actionable scenarios, running against relevant cross-domain information surrogates, and presented to domain experts in a context they can evaluate
High level reference model has people looking at information to evaluate and validate potential threats, not only to identify potential threats
Proposed New Intelligence Model : Proposed New Intelligence Model Cross-domain – chemical production, product manufacturing, international shipping, coast guard, port management
Cross system – different control systems, different domain vocabularies, different kinds of information, variant attributes, different syntaxes for the same attributes
Cross organizations – different levels of security, different data collection protocols, different data retention rates, different business goals
International – different languages, different industry standards, different national goals
Must be practical, doable, buildable from existing sources
Proposed New Intelligence Model : Proposed New Intelligence Model Contextualization is critical for both effective and efficient threat analysis
We already have rich context in domain specific environments and systems – when we move from “accident” to “threat” scenarios, though, the contextualization must be scaled up
Contextualization of information for the reference model is enabled through scenarios
Contextualization means making the models meaningful for every threat situation – meaningful to the domain experts, interpretable in the domain systems
Enablers : Enablers
Think Back … : Think Back … On September 11, 2001 I was leaving a hotel in Connecticut – about to drive back through New York City en route home to Washington DC
As I passed the lobby television, I saw a jet hit one of the World Trade Towers
What assumptions did I make? What assumptions did you make? I assumed this was an accident… did you?
When did you begin to believe that it might not be an accident?
When the logical probability of it happening again at the same place within minutes was exceeded
Enablers – Example : Enablers – Example Threats are nothing more than accidents intentionally launched
If we know how to prepare for and prevent accidents, and if we have mitigation plans, can we manage threats? All threats are local
What do we know about accidents and how do we prevent them?
The answer is a simple one – we assume accidents are possible, build scenarios that identify the conditions under which they can happen, and monitor the factors that would create those conditions
The challenge is to extend the accident conditions outside of their domains to see how they might be part of a larger threat scenario
Enablers : Enablers What was most disturbing to me about the 9/11 Commission hearings was the fact that it was clear that we were just as unprepared to handle an accident as a threat scenario
Accidents, just like threats, consist of a series of events, conditions and actors
The only difference between an accident and a threat is that we assume an accident can happen and we have always assumed a threat scenario would not
Can we repurpose accident monitoring and mitigation to stop threats? What more is needed to scale up and across domains?
SI as Enabler : SI as Enabler When we scale up from accidents to threats, semantic interoperability is the critical success factor – why?
Because we create new dependencies across domains, across systems, across protocols…
What do we mean by semantic interoperability?
What form does SI take in a threat scenario? How do we solve SI challenges?
Let’s walk through a scenario
Slide21 : Semantic Interoperability Across Full Information Life Cycle
Table columns: Functions, Human Applications, Machine Applications, Semantic Factors, Architecture Factors
Functions (table rows): Content Creation, Content Descriptive Cataloging, Content Organization, Content Analysis & Indexing, Content Publishing & Distribution, Content Collection Building, Content Search & Discovery, Content Critique & Evaluation, Content Use Management, Question & Answer Function, Content Inferencing Function, Content Retention & Dispositioning
Table cell entries:
Publisher or user defined subject values
Human indexing with controlled vocabularies
Concept clustering, rule-based classification
Data dictionaries, attribute maps
Facet mapping
Content structure definitions
Hierarchies of content elements; content classification schemes
Content element tags, definitions, specs; content structure languages – xml, html,..
Faceted taxonomies of metadata attributes – each facet represents a functional access point
Metadata scheme variations; each facet has its own semantics, syntax, definitions & specs
Hierarchies as classification schemes, topic, bus. activity, org. unit, country, region, etc.
Class scheme warrant, labels, lumper/splitter, semantic vs. statistical orientation
Collection conspectus, collection development policies
Context extraction and summarization; ‘seeded’ extracts; decomposition & recomposition of grammars & sense
Human critical analysis and review; abstract/review formats
Limited to the programmatic creation of security algorithms, rather than programmatic classification of content
Humans still classify content & redact sections of content interactively
Q&A repositories with routing from Q&A to best-guess repository; question typing; phrase matching
Humans decompose questions into types, check current Q&A sources, identify best source of answer and read
Network structures – concept maps, thesauri, dictionaries, etc.
Lexical, semantic, syntax, cross-language morphological rules
Faceted structures as attribute profiles
Profile attributes have all the same semantic issues as metadata – but for people, audience and collections
Machine record structures
Concept/phrase extraction & validation
Concept extraction, categorization applied to people, audiences
LSCC, Dewey, UDC, ITU, ACM, IEEE, CTI, human classifying
Collection hierarchies to build sets of ‘like’ content
Collection scope and coverage definitions and warrant, limitations
Collection assessment expert systems using policies & metadata
Hierarchies, network and faceted structures. All are used in building search and discovery systems
Semantics here are as complex as metadata because search & discovery can be. Solve semantics at lower functional levels & bring solutions to this level
Search system human configuration using NKOs
Wide range of solutions from simple text retrieval with categorization and clustering to parametric search
Hierarchical security class schemes, copyright ownership, access rights, more complex and deep faceted user metadata to ensure observance of security rules
Hierarchical security class schemes, more complex and deep faceted user metadata to ensure that security constraints are observed
Domain and mission oriented rules for structuring and formulating abstracts, critical reviews; linkages between original content and abstract; compound
Hierarchies of content elements; associated – network links – from original content to reviews; networked grammars
Retention & disposition schedules are faceted structures, drawing on multiple attribute values for decision making
Missing values, variations in values over time such as organizational units/structure, bus. process changes, variations in print/digital
Semantics now involve complex morphological rules – language by language to deconstruct content; concept focus shifts to verbs
Best practice approach is to build repositories of Q&A, FAQs, to manage basic question types; semantic analysis of user question
Inferencing function takes on the structure of networks of concept nodes with links representing all kinds of relationships; facet tax.
Questions are now classified by type; answers are represented as complex phrases; associations are built between types and complex phrases
Expert systems with retention & disposition schedules, rules to identify actions based on metadata
Humans currently do appraisal of print content one by one or collection by collection
Expert systems with extensively elaborated scenarios and conditions
Analysts laboriously review source information and mentally make connections given their knowledge & experience
Production of Dangerous Chemicals : Production of Dangerous Chemicals Every day dangerous chemicals are produced and shipped around the world – they are manufactured in every country, and are traded commodities in our global economy
How could we detect an intentional chemical plant rupture? How could we stop it? How could we mitigate its impact if it happened? Are there safety protocols?
How would we prevent an accidental rupture? By having a strong safety protocol system in place and building process models that monitor safety conditions
Best we have now is local plant monitoring systems – what happens if we try to monitor across plants? How do we alert other plants to potential threats?
Where’s the SI Problem? : Where’s the SI Problem? Each plant has its own system, and some systems exist only for legal reporting purposes and have no proactive monitoring capabilities
There are international safety protocols for specific chemicals, but compliance is voluntary
Each plant implements the safety protocols in a different way – not all of the data required for an accident scenario is available online
Risk value of this scenario is high – rupture or theft may have both loss of life and property costs
SI challenge is high at the system level though low at the chemical production level – why?
Chemicals in Transport Sector : Chemicals in Transport Sector Chemicals transported off-site enter the transport sector either on the highways, via rail or via waterways
Chemical conditions and treatment are no longer the primary focus of the monitoring system
Monitoring system now focuses on transport and logistics – different business objectives, different business drivers, different domains – not on chemical properties or conditions
Transportation systems have a different focus, structure and attributes – chemical threat is now out of context for monitoring – systems are not interoperable
And semantics have shifted as well – railway engineers, truck drivers, or tanker captains do not have chemical expertise or safety protocols, and have no access to systems
Risks of Not Building : Risks of Not Building Risk of not being able to monitor within or across plants is that information about chemical threat conditions may be treated with varying levels of importance and security
Risk of not being able to monitor transport of chemicals nationally and internationally is that intelligence about a threat is randomized and is more difficult to isolate – best we can do is to mitigate when it occurs
Greatest risk is that randomized alerts will be ignored after a time because there is no confidence that they are “real”
Production costs of commonly used chemicals can increase, having an economic impact beyond the initial impacts
Transport insurance costs can increase
Implementation Plan : Implementation Plan Continued creation of persistent repositories of metadata/metainformation of potential intelligence value
Establish a programmatic approach to organizing intelligence information, regardless of its source system, to support the creation of a high level, cross-agency, cross-domain, cross-system reference model
Record and elaborate threat scenarios currently represented as safety protocols, accident and mitigation procedures, training models and simulations, intelligence analysts’ tacit information
Decomposition of scenarios into kinds of entities, their attributes and the kinds of linkages that represent actions in any scenario
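The decomposition step above can be pictured as a small data structure. What follows is only a minimal sketch in Python, assuming a scenario reduces to entity kinds, the attributes each kind needs, and links that name an action between two kinds. The class names (`Entity`, `Link`, `Scenario`) and the chemical plant example values are hypothetical illustrations, not part of the proposed model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Entity:
    kind: str                    # kind of entity, e.g. "plant" (hypothetical)
    attributes: Tuple[str, ...]  # attribute names the scenario needs for it


@dataclass(frozen=True)
class Link:
    source: str  # entity kind that acts
    action: str  # the action the linkage represents
    target: str  # entity kind acted upon


@dataclass
class Scenario:
    name: str
    entities: Dict[str, Entity] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_entity(self, kind: str, *attributes: str) -> None:
        self.entities[kind] = Entity(kind, tuple(attributes))

    def add_link(self, source: str, action: str, target: str) -> None:
        # A link may only connect entity kinds the scenario already declares.
        for kind in (source, target):
            if kind not in self.entities:
                raise KeyError(f"unknown entity kind: {kind}")
        self.links.append(Link(source, action, target))

    def required_attributes(self) -> Set[Tuple[str, str]]:
        # The (entity kind, attribute) pairs a reference model must supply.
        return {(e.kind, a) for e in self.entities.values() for a in e.attributes}


def build_rupture_scenario() -> Scenario:
    # Hypothetical decomposition of a chemical plant rupture scenario.
    s = Scenario("chemical_plant_rupture")
    s.add_entity("plant", "location", "tank_pressure")
    s.add_entity("chemical", "name", "quantity")
    s.add_entity("carrier", "mode", "route")
    s.add_link("plant", "produces", "chemical")
    s.add_link("carrier", "transports", "chemical")
    return s
```

Calling `build_rupture_scenario().required_attributes()` returns the entity and attribute pairs that the cross-scenario reference model would need to carry for this one scenario.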
Implementation Plan : Implementation Plan Conversion of scenario models into business process models which will be run against the cross agency reference model
Development of a high level cross-scenario reference model of entities and attributes
Map the agency data models to the cross-agency data model
Working across entities, identify common attributes for the purpose of exploring and resolving semantic variations, that is, attributes where interoperability is critical
Define the nature of semantic interoperability required attribute by attribute and propose solutions for each
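Mapping agency data models to the cross-agency model, attribute by attribute, might look something like the sketch below. It is illustrative only and assumes each agency attribute can be tied to one reference attribute plus a conversion rule. The agency names (`plant_a`, `rail_b`), attribute names, and synonym table are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical synonym table resolving vocabulary differences between domains.
SYNONYMS = {"cl2": "chlorine", "chlorine gas": "chlorine"}


def normalize_chemical(value: Any) -> str:
    key = str(value).strip().lower()
    return SYNONYMS.get(key, key)


def pounds_to_kg(value: Any) -> float:
    return round(float(value) * 0.45359237, 3)


# (agency, local attribute) -> (reference attribute, conversion rule)
ATTRIBUTE_MAP: Dict[Tuple[str, str], Tuple[str, Callable[[Any], Any]]] = {
    ("plant_a", "qty_lbs"): ("quantity_kg", pounds_to_kg),
    ("plant_a", "chem_name"): ("chemical", normalize_chemical),
    ("rail_b", "weight_kg"): ("quantity_kg", float),
    ("rail_b", "cargo"): ("chemical", normalize_chemical),
}


def harmonize(agency: str, record: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Translate one agency record into reference-model attributes.

    Returns the harmonized record and the local attributes that have no
    mapping yet, so that gaps are visible instead of silently dropped.
    """
    harmonized: Dict[str, Any] = {}
    unmapped: List[str] = []
    for local_name, value in record.items():
        rule = ATTRIBUTE_MAP.get((agency, local_name))
        if rule is None:
            unmapped.append(local_name)
            continue
        reference_name, convert = rule
        harmonized[reference_name] = convert(value)
    return harmonized, unmapped
```

For example, `harmonize("plant_a", {"qty_lbs": "2000", "chem_name": "Cl2"})` and `harmonize("rail_b", {"weight_kg": 907.185, "cargo": "Chlorine gas"})` both yield a record with `chemical` set to `chlorine` and `quantity_kg` set to `907.185`, which is the kind of agreement the reference model needs before scenarios can run across domains.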
Implementation Plan : Implementation Plan Apply semantic technologies to persistently extract and tag intelligence information according to entities and attributes found in the reference model – technologies should be configurable to enable harmonization
Review the overall reference model to identify intelligence gaps that must be filled to determine when scenario components may exist at a high confidence level
Devise sustainable methods for filling the gaps
Develop the threat context views and thresholds working with domain experts
Establish a schedule for running scenarios against the database
Establish methods for pulling reference data in from all sources
Operational Criteria – Data Reference Model : Operational Criteria – Data Reference Model Each scenario has entities and attributes for which information may be collected in many domains, in many countries, in different systems
In order to pull all of the information together to actualize the scenarios – we need a central repository
Data reference model includes metadata/metainformation required to enable scenarios - extracted entities and attributes from the information sources
Reference model is the point at which semantic interoperability is enabled and managed
Semantic interoperability is managed from the perspective of the experts who will use it to guard against threats in their specific environment
Operational Criteria – Intelligence Metadata Capture and Collection : Operational Criteria – Intelligence Metadata Capture and Collection Using concept extraction, rule-based categorization and summarization technologies all intelligence information is profiled and surrogate records are created for each item
Surrogate records contain entity extractions
Information remains in its secure source system
Surrogates are exported into central metadata repository
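The surrogate idea can be sketched briefly in Python. This is a rough illustration under stated assumptions: a plain dictionary lookup stands in for the concept extraction and rule-based categorization technologies named above, and the vocabulary, field names, and identifiers are hypothetical. The point it shows is that the surrogate carries extracted entities and a pointer back to the source, while the text itself is not copied.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical controlled vocabulary: term -> entity kind.
VOCABULARY = {
    "chlorine": "chemical",
    "ammonia": "chemical",
    "rail": "transport_mode",
    "tanker": "transport_mode",
}


@dataclass(frozen=True)
class Surrogate:
    source_system: str              # where the original item stays
    source_id: str                  # identifier within that system
    content_digest: str             # fingerprint of the item, not its text
    entities: Dict[str, List[str]]  # entity kind -> extracted terms


def make_surrogate(source_system: str, source_id: str, text: str) -> Surrogate:
    entities: Dict[str, List[str]] = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        kind = VOCABULARY.get(token)
        if kind is None:
            continue
        found = entities.setdefault(kind, [])
        if token not in found:  # record each term once per item
            found.append(token)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Surrogate(source_system, source_id, digest, entities)
```

Only the `Surrogate` object would be exported to the central metadata repository. An analyst who needs the full item would follow `source_system` and `source_id` back to the owning agency, subject to that agency's security rules.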
Operational Criteria – Scenario Library : Operational Criteria – Scenario Library The most critical component of this new intelligence model is the scenario
The richer the scenarios, the better chance we have of being able to anticipate and guard against threats
There are many potential threats to monitor – managing the threat scenarios is a central repository function
Threat scenario library also needs to contain information about improvements, evaluations from domain experts
Operational Criteria – Running Scenarios Centrally and Locally : Operational Criteria – Running Scenarios Centrally and Locally In the new intelligence model the computer models and monitors data fits to threat scenarios
This is a step above the entity based clustering we have been doing for the past 15-20 years
When threat scenarios reach critical threshold in terms of fit to data, domain experts are alerted
In this environment, we have what is close to an actionable alert with actionable intelligence
Scenarios cannot run without intelligence data but intelligence data is not actionable without scenarios
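How a scenario might be run against the metadata and raise a threshold alert can be sketched as follows. This is a simplified illustration, not the mathematical or simulation models the proposal calls for: it assumes the fit is just the fraction of scenario conditions that currently hold, and every name, condition, and threshold value is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Condition = Callable[[List[Record]], bool]


@dataclass
class ThreatScenario:
    name: str
    conditions: Dict[str, Condition]  # condition label -> test over the metadata
    threshold: float                  # fit level at which experts are alerted
    experts: List[str]                # who should evaluate the alert

    def fit(self, records: List[Record]) -> float:
        # Assumed fit measure: share of conditions that hold right now.
        if not self.conditions:
            return 0.0
        met = sum(1 for test in self.conditions.values() if test(records))
        return met / len(self.conditions)


def run_scenarios(scenarios: List[ThreatScenario], records: List[Record]) -> List[Dict[str, Any]]:
    alerts = []
    for scenario in scenarios:
        score = scenario.fit(records)
        if score >= scenario.threshold:
            alerts.append({"scenario": scenario.name, "fit": score,
                           "notify": list(scenario.experts)})
    return alerts


# Hypothetical scenario: a large chlorine load in transit with no safety report.
chlorine_in_transit = ThreatScenario(
    name="chlorine_in_transit",
    conditions={
        "large_quantity": lambda rs: any(
            r.get("chemical") == "chlorine" and r.get("quantity_kg", 0) > 500 for r in rs),
        "off_site": lambda rs: any(r.get("location") == "in_transit" for r in rs),
        "no_safety_report": lambda rs: not any(r.get("safety_report") for r in rs),
    },
    threshold=0.6,
    experts=["plant safety officer", "rail dispatcher"],
)

sample_records = [
    {"chemical": "chlorine", "quantity_kg": 907.185, "location": "in_transit"},
    {"chemical": "chlorine", "safety_report": True},
]
alerts = run_scenarios([chlorine_in_transit], sample_records)
```

With the sample records, two of the three conditions hold, so the scenario crosses its threshold and the listed domain experts are named in the alert. The machine only proposes the fit; judging whether it is significant stays with the person in context.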
Getting There From Here : Getting There From Here Many of the efforts referenced in the SI Table above are already underway in the government and outside the government
Best approach would be to commit to building the model, establish the baseline architecture components, and begin to integrate work in progress
Oversight group – centralized intelligence administration – would coordinate the construction