Presentation Transcript

Toward Improved Visualization of Unstructured Information:
March 4, 2005
National Academy of Sciences
Context 2
J. David Harris, National Security Agency

Context 2:
The definition: Visualization in the context of large, unstructured, changing data sets where the relevance, significance, and conceptual links among the data have yet to be discovered
Preliminary thoughts
Effective visualization of structured data is challenging
Unstructured data requires some type of structural mapping.
The data discovery and analysis will be imperfect.
The mapping will be imperfect, and task dependent

An Example of Real Data:
A real-world story of intrigue…
About a cell of conspiring individuals…
Who set forth on a project…
Constrained by time and money…
Characterized by deception…
To advance their own cause.
The plot is ultimately discovered…
But not before the mission is accomplished.

The Players and Their Motivation:

The Strategy:
Discovering “Project X”
Exploratory analysis… is there some data that conforms to some pattern?
[Diagram labels: END PROJECT; Acquire Land for Florida Property; START PROJECT; Walt Disney World Company; ORGANIZE; EXECUTE]

The Investigation:
[Diagram labels: Purchase Land; Disguise Travel Patterns; Acquire Mineral Rights; Define Operation; Establish Schedule; Secure Finances; Identify Purchase Agents; Establish Aliases for Agents; Obfuscate Communications; Establish Dummy Corporations]
Discovering “Project X”
Refining the pattern…but only some of the pattern is observable in the unstructured, varied, uncertain and continuously updated data
[Diagram labels: END PROJECT; Acquire Land for Florida Property; START PROJECT; Walt Disney World Company; ORGANIZE; EXECUTE]
Retrospective analysis? Or prospective analysis?

The Observables:
Establish aliases for purchase agents, and incorporate
Important information is dominated by irrelevant data
There’s very little evidence on which to base decisions…
Relevant observations, (semi-)notional:
(COMPANY NAME, POC, STATE, DATE)
(Compass East Group, Roy Davis, Delaware, 7 December 1964)
(Tomahawk Properties, Inc., Bob Price, Florida, January 1965)
(Latin-American Dev and Mgmt Corp, Roy Davis, Florida, February 1965)
(Ayefour Corporation, Bob Foster, Florida, February 1965)
(Bay Lake Properties, Inc., Bob Price, Florida, March 1965)
(Reedy Creek Ranch, Inc., M. T. Lott, Florida, June 1965)
[Figure label: Incorporations]

More Observables:
Purchase land
The plot begins to unfold…
Correlated events emerge (both within and across streams)
[Figure labels: Incorporations; Land Acquisition]
Relevant observations:
(PURCHASING AGENT, TRANSACTION DATE, LONGITUDE, LATITUDE, ACRES)

The Discovery:
Orlando Sentinel
Dateline: May 4, 1965
Reported that two real estate transactions totaling over $1.5 million had been made…for nearly 9,000 acres of land near the small Florida farming town of Orlando
Dateline: October 20, 1965
Reported that Walt Disney was secretly behind the purchase of land
LOGICAL “COMMUNITY OF INTEREST”
PHYSICAL and TEMPORAL PROXIMITY OF TRANSACTIONS
Vagueness (Dynamism) of Hypotheses
Unknown Sources of Data and Information
Relevant Data Concealed by “Noise”
Uncertain and Erroneous Observations
(Causally) Incomplete Context…Missing Data
Logical and Physical Structures

But It Was Too Late:
Disney’s “Project X”
Began in the early 1960’s
The Florida site was selected on November 22, 1963
Ayefour Corporation buys the first parcel of land on October 23, 1964
An official announcement was made by Disney on November 15, 1965
They had acquired 27,443 acres of land SW of Orlando…
And they had “big plans”
The first acre: $80
The final acre: $80,000
What did it cost? About $185/acre, on average

Why is This Context Interesting?:
The definition: Visualization in the context of large, unstructured, changing data sets where the relevance, significance, and conceptual links among the data have yet to be discovered
To enable understanding!
Retrospective Forensics Prospective Investigative Reporting Business Intelligence Security

The “Context 2” Agenda:
Ronald Coifman, Yale University: Diffusion/Inference Geometries of Data Features, Situational Awareness and Visualization
Andre Skupin, University of New Orleans: A Different Kind of Map
Stephen Eick, University of Illinois at Chicago; SSS Research: DECIDE™ Hypothesis Visualization Tool
Dave Harris, National Security Agency: Reactions and discussion

Reactions and Discussion…:
Context 2…the definition: Visualization in the context of large, unstructured, changing data sets where the relevance, significance, and conceptual links among the data have yet to be discovered
Perception and cognition of visualization
Reasoning under uncertainty

Perception and Cognition of Visualization:
Map-making
Simplification
What’s important?
Who’s the intended audience?
How might we measure interpretability?
Classification
Symbolization
Induction
Visualize…
Existence: Notation on a map that a point or area exists
Associative existence: Added absolute or relative quantity to the identified points and areas
Spatially associated existence: Spatial relationships between points and areas
This representation of the Orlando metropolitan area is targeted at tourists…
Maps are a specific type of diagram with which most people have experience

Perception and Cognition of Visualization:
How can we capture the dynamic nature of data?
Maps are snapshots…but they require little additional training
Can we place thematic overlays on top of term-document landscapes, as a means of creating different views of the same data?
How do we encourage interactivity?
What can’t be represented using topography only?
For unstructured data…
What kind of mappings can we impose?
Some structure may be due to contact or context (not content)…
What might roads represent? What about rest areas? National Parks? Hospitals?
How is uncertainty represented?

Reasoning Under Uncertainty:
A critical aspect of Context 2
Visualization of the hypotheses…
Capture the intent of the task and subject matter expertise
Guide the exploration and analysis
Customize the visualization

Reasoning Under Uncertainty:
Multiple (competing) hypotheses
Alternative models, at the onset… or after improved/diminished understanding
Machine-learning can (should?) offer data as...
Supporting evidence
Contradictory evidence
Change in the actual plan
Changing world events

What to Visualize?:
How do we decide what’s important?
We (probably) don’t need all of these observations?
[Figure labels: Incorporations; Land Acquisition]

Final Thoughts:
So…
Visualization influences hypothesis generation
Hypothesis generation influences analysis
Analysis influences visualization

Reactions and Discussion…:
Select relevant information from assembled data
Impose some kind of structure…
Apply graphic techniques
To enable understanding…
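
Editorial sketch (not part of the talk): one way to read the "logical community of interest" idea from "The Discovery" slide is as a grouping over the incorporation records listed on "The Observables" slide. The minimal Python sketch below copies those six (semi-)notional records and groups company names by point of contact. The function names and the "two or more companies" threshold are assumptions made for illustration, not anything the presentation specifies.

```python
from collections import defaultdict

# Incorporation records copied from "The Observables" slide,
# which itself labels them "(semi-)notional".
# Fields: (company name, point of contact, state, date as written).
INCORPORATIONS = [
    ("Compass East Group", "Roy Davis", "Delaware", "7 December 1964"),
    ("Tomahawk Properties, Inc.", "Bob Price", "Florida", "January 1965"),
    ("Latin-American Dev and Mgmt Corp", "Roy Davis", "Florida", "February 1965"),
    ("Ayefour Corporation", "Bob Foster", "Florida", "February 1965"),
    ("Bay Lake Properties, Inc.", "Bob Price", "Florida", "March 1965"),
    ("Reedy Creek Ranch, Inc.", "M. T. Lott", "Florida", "June 1965"),
]


def companies_by_poc(records):
    """Group company names by point of contact (POC)."""
    groups = defaultdict(list)
    for company, poc, _state, _date in records:
        groups[poc].append(company)
    return dict(groups)


def repeated_pocs(records, min_companies=2):
    """Hypothetical heuristic: keep only POCs tied to several companies."""
    return {
        poc: companies
        for poc, companies in companies_by_poc(records).items()
        if len(companies) >= min_companies
    }


if __name__ == "__main__":
    for poc, companies in repeated_pocs(INCORPORATIONS).items():
        print(poc, "->", ", ".join(companies))
```

On these six records the heuristic keeps Roy Davis and Bob Price, each linked to two companies. In the setting the talk describes, such records would be a small fraction of a much larger and noisier stream, so a grouping like this is at best a starting point for a hypothesis.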