18 IBM Text Analytic OS Architecture 011606 mod
FunSchool
Category: News & Reports
License: All Rights Reserved
Added: September 18, 2007

Presentation Transcript

IOP ’06 Open Source Intelligence Lesson Learned

Issues in using open source for intelligence:
- Growth and complexity of heterogeneous content
- Not all open source data is equal: quantitative vs. qualitative
- Requirements of Ecoinformatics Architectures

Slide3: Digital content is growing at a dramatic rate
- 10^24 bytes = 1 trillion terabytes of data, which is equivalent to all the information consumed visually by all humans in a year
- [Chart; x-axis: Years]
- Source: IBM 2005 GTO

Slide4: The scale of open source data and its heterogeneous form increase the complexity of extracting intelligence
- [Chart labels: Storage online, Medical data stored, Personal multimedia, Surveillance bytes, Photos multimedia, Scalable, Heterogeneity, Intelligence, Structured data, Free-form text]
- [Scale marks: 10^9, 10^12, 10^15, 10^21, 10^24, 10^27]
- Source: IBM 2005 GTO

Slide5: Open Source Intelligence from the periphery requires an understanding of its topology, including the strengths and weaknesses of sources in the periphery

Ecoinformatics Architectures need to be multi-layered:
[Diagram; labels listed in the order extracted]
- Cross-Page Annotators: Classification, Clustering, Communities, Ranking
- Applications: Network Associations, Search, Topic Tracking, Buzz Analysis
- Per-Page Annotators: Auto Entity Spotters, Auto Geography Spotter, Porn & Dup Detection, Customer Taxonomy Spotter
- Scale marks (pages/second): 10’s, 100’s, 1000’s
- World Wide Web, Blogs, Newspapers, Licensed Feeds, Data Bases, Intranet Data, Taxonomies, Commercial Databases
- Index, Store
- DATA ACQUISITION: Un-Structured Data, Structured Data
- Parsing/Tokenizing, Annotation, Searching, Natural Clustering, Affinity Analysis, Snippet Analysis, Trending
- Customer Applications: Performance Management, Drug Research, Business Insights Workbench
- Relevancy, Volume
- WebFountain, Business Insights Workbench, WS OmniFind II

Slide7: Finding intelligence can require a different view of the same information
[Chart: Open Source Trend on Web; y-axis: % of OSI web documents, 0.0% to 4.5%]
- Series: Congressman Rob Simmons, Douglas Rushkoff, Eliot Jardines, Major General Patrick Cammaert, Mr Arno Reuser, Robert Steele
- Annotations: "Some event happened in August"; "One dominant voice"

Slide8: Context Network of Conference Attendees to auto-spotted Companies and Universities
In this network view we don’t care about association with 'Open Source Intelligence' but with companies and universities

Slide9: Conclusions on Open Source Intelligence
- Computers don’t create intelligence, people do; computers enable smart people
- Not all open source content is equal: know the sources
- Not everything you see is right: it’s all about the CONTEXT
- Ecoinformatics architecture supports
  - Large scale analytics of open source content
  - Integration of content other than open source
  - Powerful text analytic tools to support analysis of on-topic stores
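
The layering on the multi-layered architecture slide (parsing/tokenizing, then per-page annotators, then cross-page annotators) and the "% of OSI web documents" measure on Slide7 can be illustrated with a small sketch. This is only an illustration under stated assumptions, not how WebFountain or any IBM product is implemented: the function names, the WATCH_LIST, and the sample pages are all hypothetical, and the entity spotter is a plain exact-phrase match rather than a real annotator.

```python
from collections import Counter

# Hypothetical watch list; the names are borrowed from the Slide7 chart legend.
WATCH_LIST = ["Robert Steele", "Eliot Jardines", "Arno Reuser"]


def tokenize(text):
    """Parsing/tokenizing layer: lowercase whitespace tokens, edge punctuation stripped."""
    return [tok.strip(".,;:!?()\"'").lower() for tok in text.split()]


def spot_entities(tokens, watch_list):
    """Per-page annotator: return the watch-list names found as contiguous tokens."""
    found = set()
    for name in watch_list:
        parts = name.lower().split()
        n = len(parts)
        # Slide a window of n tokens over the page and compare to the name.
        if any(tokens[i:i + n] == parts for i in range(len(tokens) - n + 1)):
            found.add(name)
    return found


def mention_share(pages, watch_list):
    """Cross-page annotator: percentage of pages that mention each name."""
    counts = Counter()
    for text in pages:
        counts.update(spot_entities(tokenize(text), watch_list))
    total = len(pages)
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in watch_list
    }


# Hypothetical sample pages standing in for acquired web documents.
pages = [
    "Robert Steele spoke about open source intelligence.",
    "A panel with Eliot Jardines and Robert Steele.",
    "Unrelated page about databases.",
    "Mr Arno Reuser presented search techniques.",
]

shares = mention_share(pages, WATCH_LIST)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.1f}%")
```

Keeping the per-page step separate from the cross-page step mirrors the slide's layering: per-page annotation depends only on one document, so it can run independently for each page, while the cross-page step only needs the annotations, not the raw text.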