FuGE springPSI2006 (uploaded October 01, 2007)

Presentation Transcript

FuGE: A framework for developing standards for functional genomics:
  Angel Pizarro, University of Pennsylvania
  Andrew Jones, University of Manchester

Overview:
  Challenge of building data standards
  Introduction to FuGE
  Current status
  Formats developed using FuGE

Data Standards for HT Genomics:
  Major challenges in developing standards:
    Technology is still evolving
    Heterogeneous data formats (and data types) from software and instruments
    “Important” info about the starting sample is almost unlimited
    Large quantities of metadata are needed to validate results
  BUT: Most of these problems are shared by microarrays, proteomics, metabolomics, etc.

Experiment Workflow:
  [Workflow diagram: Material -> Treatment -> Material ... -> Data Acquisition -> Data -> Data Transformation -> Data]

Functional Genomics Experiment (FuGE) Object Model:
  Merges of the MAGE and PEDRo models were attempted
    Result was an even more complex model that still left other FG technologies untouched
    Main motivation was reuse of MAGE sample prep and ontology components
  FuGE project was created as an independent project from MGED and PSI
  Model of common components across FG to enable synergy between standards
    Sample description, protocols, investigation structure
  http://fuge.sourceforge.net

Architecture Details:
  FuGE mainly represented as a UML model
    UML 1.4 using MagicDraw 9.5
  Uses AndroMDA to produce platform-specific models
    XML Schema
    Language bindings and APIs: Java, Perl, C, etc.
    Database schema

FuGE Structure:
  [Package diagram: FuGE contains Common (Description, Audit, Ontology, Protocol, Reference) and Bio (Investigation, Data, Material, Conceptual Molecule)]
  Common:
    General data format management
    Auditing
    Referencing external resources
    Protocols
  Bio:
    Investigation structure
    Data
    Materials (organisms, solutions, compounds)
    Theoretical molecules, e.g. sequences

FuGE Workflow:
  [Diagram; no text in transcript]

FuGE is an Enabler:
  Serve as a basis for developing new formats
    PSI-GPS and MGED are using FuGE for developing their new data formats
  Existing formats can be tied together using FuGE
    mzData does not describe biosource separation procedure (gels, LC, etc.)
    CPAS from FHCRC does this

Use 1: Extending FuGE:
  [Diagram; no text in transcript]

Use 2: Tie Together External Formats:
  [Diagram labels: Protocol, ProtocolApplication, Material, ExternalData, mzData file, File format definition, inputMaterial, outputData]
  Protocol definition says “See ExternalData file for parameters” (rather than storing params in Protocol)
  Parser will exist to extract data / parameters from mzData file
  Material can be used to describe the sample. This connects the MS data with a separation workflow

Status of FuGE:
  Milestone 1 release - Sep 2005
  Milestone 2 release - Dec 2005
    Acceptance by PSI and MGED at this time
  Milestone 3 - Spring 2006
    Milestone 2 of GelML and spML
  Version 1.0 - Fall 2006

FuGE Extensions:
  MAGE V2
    Format for microarray data and annotations
  GelML
    Format for methods + results of 2D gels
    Milestone 1 Dec 2005
    Release scheduled for Spring/Summer 2006
  spML
    Sample processing: liquid chromatography, capillary electrophoresis, centrifugation
    Milestone 1 Dec 2005
  CPAS uses a FuGE-inspired manifest for experiments
  Metabolomics community considering
  PRIDE contemplating FuGE for data format
  Flow Cytometry community interested
  MIACA?

Summary:
  FuGE should help convergence of omics data formats:
    Single description of the sample for all types of experiment
    Shared representation of protocols
    Investigation and workflow structure for integrating different omics projects
  Good starting point, proven development methodology

Acknowledgements:
  Other FuGE developers
    Andrew Jones (Manchester), Michael Miller (Rosetta), Paul Spellman (Lawrence Berkeley)
  MGED, PSI, Fred Hutch CRC, Genologics, and various
  Contact: angel@mail.med.upenn.edu

While I have your attention…:
  Space cost
    Ultra expensive: ~$19/GB ($380 for 20GB)
    Cheap (TeraStation NAS): ~$0.80/GB ($16 for 20GB)
    Ultra cheap ($500 PC): ~$0.50/GB ($10 for 20GB)
  MIAPE confounding factors
    Will never have a complete list
    We are implicitly telling investigators that they don’t know how to do good science (a Bad Thing)
    Instead require quality assessment statistics on the data (variance, reproducibility, etc.)
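
Editor's sketch for the "Use 2: Tie Together External Formats" slide: a minimal Python illustration of how a ProtocolApplication might link an input Material to an ExternalData record that points at an mzData file, with the Protocol deferring its parameters to that file. The class names follow the labels on the slide, but the field names, the describe helper, and the sample values are assumptions made for illustration. This is not the FuGE object model or any generated FuGE API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Material:
    """Describes a sample, e.g. a fraction produced by a separation step."""
    name: str


@dataclass
class ExternalData:
    """Points at a data file kept in its native format (here, mzData)."""
    location: str
    file_format: str


@dataclass
class Protocol:
    """A procedure description that defers its parameters to the external file."""
    name: str
    parameter_note: str = "See ExternalData file for parameters"


@dataclass
class ProtocolApplication:
    """One run of a protocol, linking input materials to output data."""
    protocol: Protocol
    input_materials: List[Material] = field(default_factory=list)
    output_data: List[ExternalData] = field(default_factory=list)


def describe(app: ProtocolApplication) -> str:
    """Summarise which materials went in and which data files came out."""
    inputs = ", ".join(m.name for m in app.input_materials)
    outputs = ", ".join(f"{d.location} ({d.file_format})" for d in app.output_data)
    return f"{app.protocol.name}: {inputs} -> {outputs}"


# Hypothetical example values, not taken from the slides.
fraction = Material(name="LC fraction 3")
ms_run = ProtocolApplication(
    protocol=Protocol(name="Mass spectrometry"),
    input_materials=[fraction],
    output_data=[ExternalData(location="run01.mzData.xml", file_format="mzData")],
)
print(describe(ms_run))
```

The same Material object could also be the output of an earlier, separately described separation step, which is how the slide's point about connecting MS data to a separation workflow would play out in this sketch.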