AGU 2006 Woolf IN53C 02 Nathaniel
Added: October 29, 2007
License: All Rights Reserved

Presentation Transcript

‘Feature types’ as an integration bridge in the climate sciences:
Andrew Woolf (1,*), Bryan Lawrence (2), Jeremy Tandy (3), Keiran Millard (4), Dominic Lowe (2), Sam Pepler (2)
(1) CCLRC e-Science Centre, (2) British Atmospheric Data Centre, (3) Met Office, (4) HR Wallingford
(*) Corresponding author email: A.Woolf@rl.ac.uk

Outline:
Background
  ‘container’ vs ‘content’
  BADC
  feature types
The data management pipeline
  ingestion
  integration
  management
  use
Examples
  CSML
  Observations and Measurements

Background: container vs content:
Storage-centred data management focuses on container, not content
different stovepipes for different storage granularity
impacts entire pipeline
backend exposed throughout
integration difficult
maintenance complexity

Background: e.g. BADC:
British Atmospheric Data Centre
http://badc.nerc.ac.uk
UK NERC designated data centre
~60 TB, ~130 datasets
NERC programmes, Met Office, ECMWF, NASA, ...
ground-based observation networks, model output (NWP, climate), satellite data

Background: e.g. BADC:
Nearly all the data at the BADC has geospatial information
But it is not represented in a standard way
Lots of types of geospatial and temporal things with no clear categorisation

Background: e.g. BADC:
The current way of doing things makes it hard to integrate data from other data repositories…
…or other datasets…
…or even data from within the same dataset sometimes!

Background: ‘feature types’:
Emerging ISO standards
TC211 – around 40 standards for geographic information
Cover activity spectrum:
  discovery
  access
  use
ISO 19101 Domain Reference Model

Background: ‘feature types’:
[Figure: from ISO 19109 “Geographic information – Rules for Application Schema”]
Geographic ‘features’
  “abstraction of real world phenomena” [ISO 19101]
  Type or instance
  Encapsulate important semantics in universe of discourse
Application schema
  Defines semantic content and logical structure of datasets
ISO standards provide toolkit:
  spatial/temporal referencing
  geometry (1-, 2-, 3-D)
  topology
  dictionaries (phenomena, units, etc.)
  GML – canonical encoding

Background: ‘feature types’:
“lifetime of a technical implementation is shorter than the lifetime of the information it handles” (CEN/TR 15449)
Loosens coupling between storage artefacts and data management infrastructure:
  breaks the link between storage and discovery/access
  front-end can expose information rather than files
  entire infrastructure more independent of back-end

Data management pipeline: ingestion:
“What’s a dataset?”
BADC currently: “A collection of files with a common theme and administration”
Alternative: “A collection of feature instances with a common theme and administration”
  better for integration
  more natural granularity for use
  independent of physical storage format

Data management pipeline: integration:
e.g. UK NERC DataGrid

Data management pipeline: integration:
‘Feature types’ provide integration key
common language across providers/users
e.g. oceanographers / meteorologists share discussion about semantics of data despite format differences
Standard mechanism for ‘relating’ data
‘association’ is part of General Feature Model (rather than determined by file/directory structures)

Data management pipeline: management:
How to manage preservation/curation of storage artefacts?
A ‘features view’ redirects the emphasis to preserving the feature rather than the file
e.g. become less hung-up on GRIB/netCDF conversion
object-with-attributes is the curation focus
cf. OAIS (ISO 14721)

Data management pipeline: use:
Currently, have to ‘back out’ information content – ‘features’ make this explicit
enables standard patterns for ‘context’, e.g. OGC Observations and Measurements
‘Features’ are closer to applications
can be leveraged for value-added services
General Feature Model/UML ‘operations’ (Work needed on implementation!)

Data management pipeline: use:
Visualisation
  generic visualisation capability fraught!
  feature types make this more explicit
Discovery
  ‘feature collections’ more natural granularity than file/directory collections or database tables

Data management pipeline: use:
Mediator architecture
n+m, not n*m!
‘Feature types’ view

Data management pipeline: use:
Integrates climate science data within mainstream ‘spatial data infrastructure’
e.g. EU INSPIRE Directive
enhances cross-disciplinary use

Examples:
Climate Science Modelling Language (CSML)

Examples:
OGC ‘Observations and Measurements’
An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure
CSML

Summary:
Data management problems arise from traditional ‘storage-oriented’ view
‘Feature types’ encapsulate information semantics
Provides integration key across granularity range
Potential benefits for entire data management pipeline
  ingestion
  integration
  management
  use
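
The ‘n+m, not n*m’ point on the mediator architecture slide can be made concrete with a small count. This is a minimal sketch, not the authors' implementation: the format and client names are hypothetical and chosen only for illustration. With n storage formats and m client applications, direct coupling needs one adapter per (format, client) pair, whereas a shared ‘feature types’ view needs one adapter per format plus one per client.

```python
from itertools import product

# Hypothetical storage formats (n) and client applications (m); illustration only.
formats = ["netCDF", "GRIB", "NASA Ames"]
clients = ["visualisation", "discovery", "subsetting", "download"]


def point_to_point_adapters(formats, clients):
    """Every client reads every format directly: one adapter per pair."""
    return list(product(formats, clients))


def mediated_adapters(formats, clients):
    """Each format is mapped once to the shared feature-type view,
    and each client reads only that view."""
    readers = [(fmt, "feature types") for fmt in formats]
    consumers = [("feature types", client) for client in clients]
    return readers + consumers


n, m = len(formats), len(clients)
direct = point_to_point_adapters(formats, clients)
mediated = mediated_adapters(formats, clients)

print(f"direct adapters:   {len(direct)} (n*m)")
print(f"mediated adapters: {len(mediated)} (n+m)")
```

Adding a fifth client in this sketch adds three adapters under direct coupling but only one under the mediated view, which is the scaling argument the slide makes.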
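
The Observations and Measurements definition quoted in the Examples slide, together with the ‘collection of feature instances’ view of a dataset, reads naturally as a data structure. The following is a rough sketch only: the class and field names (`Feature`, `Observation`, `Dataset`, `feature_of_interest`, and so on) and the sample values are hypothetical, and this is neither the OGC schema nor the CSML encoding.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass
class Feature:
    """An abstraction of a real-world phenomenon, independent of any storage file."""
    feature_type: str   # hypothetical type name, e.g. "PointSeries"
    identifier: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Observation:
    """An event whose result estimates a property of the feature of interest,
    obtained using a specified procedure."""
    time: datetime
    feature_of_interest: Feature
    observed_property: str
    procedure: str
    result: Any


@dataclass
class Dataset:
    """A collection of feature instances with a common theme and administration."""
    theme: str
    features: list = field(default_factory=list)

    def of_type(self, feature_type):
        """Select features by type rather than by file or directory."""
        return [f for f in self.features if f.feature_type == feature_type]


# Invented sample values, for illustration only.
station = Feature("PointSeries", "station-001", {"lat": 51.5, "lon": -1.3})
obs = Observation(
    time=datetime(2006, 12, 15, 12, 0),
    feature_of_interest=station,
    observed_property="air_temperature",
    procedure="thermometer",
    result=281.4,
)
dataset = Dataset("surface observations", [station])

print(obs.observed_property, len(dataset.of_type("PointSeries")))
```

Nothing in the sketch refers to a file format, which is the point of the slides: the feature and its attributes, not the storage artefact, become the unit of discovery and curation.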