2 Alfieri

Presentation Transcript

Slide1: 

Parameter Estimation for Cell Cycle Ordinary Differential Equation (ODE) Models Using a Grid Approach
HealthGrid 2007 Conference, Geneva, April 25th 2007
Roberta Alfieri
Institute for Biomedical Technologies, CNR, Milan, Italy
CILEA, Milan, Italy

Outline: 

Introduction to Systems Biology and Cell Cycle Modelling
Cell Cycle Model Simulation Machinery
  Database Model Section
  Simulation Engine
  User Interface
Parameter Estimation and Grid Technology

The Biological Problem: Cell Cycle: 

The cell cycle is a frequently investigated process in systems biology, especially through mathematical modelling, which is used to assess the impact that differently regulated genes can have in normal and cancer cells.
The complexity of this biological process lies in the high number of genes and protein interaction networks involved.
Quantifying the behaviour of each cell cycle component is crucial to understanding the complex mechanism of cell cycle regulation.

Cell Cycle Modelling : 

What is modelling?
  General definition: the act of describing something in a schematic representation, usually on a smaller scale.
  Systems biology definition: the design and analysis of a mathematical representation of a biological system in order to outline unknown properties of that system (its emergent properties).
Mathematical representation of a biological process:
  A set of kinetic equations that define the biochemical reactions
  A system of Ordinary Differential Equations that describes the dynamic behaviour of the model components
  Initial parameters for the kinetic equations
  Initial concentrations of the model species
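The four ingredients listed on this slide (kinetic equations, an ODE system, parameters, initial concentrations) can be illustrated with a minimal sketch. The two-species reaction chain below is a hypothetical toy model, not one of the cell cycle models discussed in the talk, and the fixed-step Runge-Kutta integrator is only a stand-in for a real solver. All names (`derivatives`, `rk4_step`, `simulate`, `k1`, `k2`) are assumptions made for illustration.

```python
import math

def derivatives(state, params):
    """Kinetic equations for a hypothetical chain A -> B -> (degraded)."""
    a, b = state
    k1, k2 = params["k1"], params["k2"]
    return [-k1 * a, k1 * a - k2 * b]

def rk4_step(f, state, params, dt):
    """Advance the ODE system by one classical fourth-order Runge-Kutta step."""
    s1 = f(state, params)
    s2 = f([x + 0.5 * dt * s for x, s in zip(state, s1)], params)
    s3 = f([x + 0.5 * dt * s for x, s in zip(state, s2)], params)
    s4 = f([x + dt * s for x, s in zip(state, s3)], params)
    return [
        x + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for x, p, q, r, w in zip(state, s1, s2, s3, s4)
    ]

def simulate(f, initial_state, params, dt, n_steps):
    """Return the list of states visited, starting from the initial concentrations."""
    state = list(initial_state)
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(f, state, params, dt)
        trajectory.append(state)
    return trajectory

params = {"k1": 0.5, "k2": 0.1}   # kinetic parameters (illustrative values)
initial = [1.0, 0.0]              # initial concentrations of A and B
trajectory = simulate(derivatives, initial, params, dt=0.01, n_steps=1000)

# For this toy model A(t) has the closed form exp(-k1 * t), which gives a sanity check.
analytic_a = math.exp(-params["k1"] * 10.0)
print(trajectory[-1][0], analytic_a)
```

A model of realistic size has the same structure, only with more species, reactions and parameters.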

Problems related to modelling: 

Simulating an ODE system is possible on a single workstation: the numerical integration of one ODE system is not very time consuming.
Parameter estimation, that is, finding the set of parameters for which the model best reproduces a specific experimental dataset, requires High Performance Computing techniques, since the computational load of searching for the best model is very high.
In silico, the kinetic parameters are estimated by solving a large number of ODE systems with different parameter values and selecting the one that gives the best solution.
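The idea of solving many ODE systems with different parameters and keeping the best one can be sketched as follows. This is a toy example under stated assumptions: the model is a single decay equation dx/dt = -k*x, the "experimental" data are synthetic values generated here with k = 0.3, and the fit is scored with a sum of squared differences. The slides do not specify the scoring function, so `sum_of_squares` is an assumption.

```python
import math

def simulate_decay(k, x0, n_samples, steps_per_sample=100, sample_interval=1.0):
    """Integrate dx/dt = -k*x with RK4 and return x at t = 0, 1, 2, ..."""
    dt = sample_interval / steps_per_sample
    x = x0
    samples = [x]
    for _ in range(n_samples - 1):
        for _ in range(steps_per_sample):
            s1 = -k * x
            s2 = -k * (x + 0.5 * dt * s1)
            s3 = -k * (x + 0.5 * dt * s2)
            s4 = -k * (x + dt * s3)
            x += dt / 6.0 * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
        samples.append(x)
    return samples

def sum_of_squares(simulated, observed):
    """Misfit between one simulation and the dataset (assumed scoring function)."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

# Synthetic stand-in for an experimental time course, sampled at t = 0..5.
observed = [math.exp(-0.3 * t) for t in range(6)]

# One independent ODE simulation per candidate parameter value.
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
costs = {k: sum_of_squares(simulate_decay(k, 1.0, 6), observed) for k in candidates}
best_k = min(costs, key=costs.get)
print(best_k)
```

Each candidate is evaluated independently of the others, which is what makes the search expensive on one machine and easy to distribute.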

Cell Cycle Database: 

The Cell Cycle Database is a relational database that integrates information about the genes and proteins involved in the yeast and mammalian cell cycle.
A section of the database is dedicated to cell cycle mathematical models:
  Model publication data: information on the published models, such as detailed publication data, authors, PubMed ID, abstract, journal information, diagram of the model, proteins involved in the model and, where available, the XML file.
  SBML data structure: the SBML components of the model, including its mathematical expressions.
  Simulation section: model simulation using XPPAUT, with direct retrieval of the results in graphical formats.

Simulation Engine Workflow: 

[Workflow diagram; labels: Core Technology, User Interface, Independent Engines]

Simulation Pipeline: 

The pipeline is composed of a series of PHP scripts and allows the visualization and computation of SBML models through:
  Data retrieval from the Cell Cycle Database
  An SBML parser
  A MathML to HTML converter, which translates the SBML mathematical expressions for display on the web interface
  XPPAUT, the simulation software chosen, which:
    solves differential equations with many different options for the numerical algorithm
    is widely used for modelling different biological pathways
    requires only a simply formatted input file
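To give an idea of what a "simply formatted input file" looks like, the sketch below writes a small XPPAUT-style ODE file from a model description. It is written in Python for illustration only and is not the PHP pipeline described on the slide. The function name `build_xpp_ode` and the toy model are hypothetical, and the file layout (equation lines, a `par` line, an `init` line, an `@` options line and a closing `done`) follows the common XPP conventions as far as we know them, so it should be checked against the XPPAUT documentation before use.

```python
def build_xpp_ode(equations, parameters, initial_values, total=100.0, dt=0.01):
    """Return the text of an XPPAUT-style ODE file for a small model.

    equations:      {species: right-hand side expression as a string}
    parameters:     {parameter name: value}
    initial_values: {species: initial concentration}
    """
    lines = ["# Generated ODE file (illustrative sketch)"]
    for species, rhs in equations.items():
        lines.append(f"d{species}/dt={rhs}")
    lines.append("par " + ",".join(f"{name}={value}" for name, value in parameters.items()))
    lines.append("init " + ",".join(f"{name}={value}" for name, value in initial_values.items()))
    lines.append(f"@ total={total},dt={dt}")
    lines.append("done")
    return "\n".join(lines) + "\n"

# Hypothetical two-species model, used only to show the file layout.
ode_text = build_xpp_ode(
    equations={"a": "-k1*a", "b": "k1*a-k2*b"},
    parameters={"k1": 0.5, "k2": 0.1},
    initial_values={"a": 1.0, "b": 0.0},
    total=10.0,
)
print(ode_text)
```

In a pipeline like the one described, the equations, parameters and initial values would come from the parsed SBML model rather than from hand-written dictionaries.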

Cell Cycle Database Model Section : 

Cell Cycle Database Model Section

Simulation Section: 

2D plot: image exported in PNG format using GnuPlot.
The simulation of a single ODE system describing a cell cycle model is possible.

Parameter estimation : 

Estimate the model that best fits real biological data.
Find the parameter set that best describes the real biological system.
Possible approaches to parameter estimation:
  Deterministic mathematical methods
  Stochastic mathematical methods

Stochastic methods: 

Evolutionary algorithms are population-based stochastic methods that rely on the idea of biological evolution:
  New generations of individuals, in numerical form, are created iteratively by recombining the best individuals of the previous generation, in order to find solutions close to the optimum (the experimental data).
Three groups of evolutionary methods:
  Genetic Algorithms
  Evolutionary Programming
  Evolutionary Strategies: the most efficient and robust, especially for continuous problems such as the solution of ODE systems (N. Saravanan, 1995).
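The iterative scheme described here can be sketched as a small evolution strategy. This is a generic (mu + lambda) variant with intermediate recombination and Gaussian mutation, chosen for brevity; the slides do not say which variant or settings were used. The cost function is a stand-in as well: it measures the distance to a hypothetical target parameter vector instead of running an ODE simulation against experimental data.

```python
import random

def cost(params, target=(0.5, 0.1)):
    """Stand-in for the simulation misfit: squared distance to a hypothetical optimum."""
    return sum((p - t) ** 2 for p, t in zip(params, target))

def evolve(mu=5, lam=20, generations=50, sigma=0.2, seed=0):
    """Minimal (mu + lambda) evolution strategy on a two-parameter problem."""
    rng = random.Random(seed)
    parents = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(mu)]
    history = []
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            # Recombination: average two distinct parents, then mutate with Gaussian noise.
            p1, p2 = rng.sample(parents, 2)
            child = [(a + b) / 2.0 + rng.gauss(0.0, sigma) for a, b in zip(p1, p2)]
            offspring.append(child)
        # Plus-selection: the best mu of parents and offspring form the next generation.
        parents = sorted(parents + offspring, key=cost)[:mu]
        history.append(cost(parents[0]))
        sigma *= 0.95  # simple fixed decay of the mutation step (an assumption)
    return parents[0], history

best, history = evolve()
print(best, history[-1])
```

Because selection keeps the parents in the pool, the best cost never gets worse from one generation to the next. In the real setting every call to `cost` would be one ODE simulation.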

Simulation Time Estimation: 

Model example (Swat et al., 2004): 9 species, 41 parameters, 22 reactions, 9 ODEs
Single numerical simulation: 4 seconds
Evolutionary computation for parameter estimation (50000 individuals for 100 generations): 231 days
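The 231-day figure follows directly from the other numbers on this slide, assuming the simulations run one after another on a single machine at 4 seconds each:

```python
SECONDS_PER_SIMULATION = 4   # single numerical simulation (from the slide)
INDIVIDUALS = 50_000         # individuals per generation (from the slide)
GENERATIONS = 100            # number of generations (from the slide)

total_simulations = INDIVIDUALS * GENERATIONS
total_seconds = total_simulations * SECONDS_PER_SIMULATION
total_days = total_seconds / 86_400  # seconds in one day

print(f"{total_simulations} simulations, about {total_days:.1f} days")
# prints "5000000 simulations, about 231.5 days"
```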

The distributed approach: 

A High Performance Computing platform such as the grid can be used to compute the solutions of a large number of independent ODE systems.
The ODE solver system has been successfully ported to the grid by creating an infrastructure able to distribute the computation efficiently.
The parameter estimation engine works on top of a set of scripts for:
  Job submission
  Monitoring of the computation
  Retrieval and integration of the results

Parameter estimation and grid: 

A parameter estimation system has been developed to find the best parameter set by computing many different simulations with the Evolutionary Strategy algorithm. It uses the grid platform to overcome the computational complexity that comes from:
  the high number of combinations of parameter values
  the high number of simulations needed to fit the data
Differences from other grid-based parameter estimation approaches:
  Type of algorithm used: the Evolutionary Strategy algorithm
  Grid platform on which the computation is performed: a grid platform based on gLite

Distributed Approach Advantages: 

Key parameter for the distribution: the number of simulations for each job.
The number of ODE systems to be simulated in a specific job is related to the computation time needed for each simulation.
We set the number of simulations for each job to 500 ODE systems, which addresses:
  the need to parallelize each single generation
  the optimization of the queue time
The average computational time for each job is about 30 minutes and, considering the queue time, the global computation needs almost a week.
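The effect of the job size can be checked with a short calculation. The sketch assumes that the figures from the time estimation slide (50000 individuals per generation, 100 generations, 4 seconds per simulation) also apply to the grid run, which the slides suggest but do not state explicitly. The helper `split_into_jobs` is a hypothetical name.

```python
INDIVIDUALS_PER_GENERATION = 50_000  # assumed to carry over from the time estimate
GENERATIONS = 100                    # assumed to carry over from the time estimate
SECONDS_PER_SIMULATION = 4           # assumed to carry over from the time estimate
SIMULATIONS_PER_JOB = 500            # job size stated on this slide

def split_into_jobs(items, job_size):
    """Partition a generation into consecutive chunks of at most job_size items."""
    return [items[i:i + job_size] for i in range(0, len(items), job_size)]

population = list(range(INDIVIDUALS_PER_GENERATION))  # placeholder individuals
jobs = split_into_jobs(population, SIMULATIONS_PER_JOB)

jobs_per_generation = len(jobs)
minutes_per_job = SIMULATIONS_PER_JOB * SECONDS_PER_SIMULATION / 60
# Idealized lower bound: all jobs of a generation run at once, with no queue time.
ideal_hours = GENERATIONS * minutes_per_job / 60

print(jobs_per_generation, round(minutes_per_job, 1), round(ideal_hours, 1))
```

Under these assumptions a generation splits into 100 jobs of roughly 33 minutes each, in line with the reported average of about 30 minutes. The idealized compute time is a little over two days, so most of the reported week would be queue time and scheduling overhead.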

Grid Deployment: 

The parameter estimation is controlled by a set of scripts that are responsible for the submission, the monitoring and the retrieval of the results for each job.
The system works one generation at a time:
  All the jobs of a generation are sent to the grid.
  When the results are retrieved, the software integrates them in order to create a new generation.
  The new generation is re-submitted to the grid.
The ODE solver system is deployed on the grid node at job execution time, and the results are retrieved from the User Interface, where the data are integrated to generate the following populations.
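The generation-by-generation workflow can be sketched with a local thread pool standing in for the grid. Nothing here uses gLite: `executor.submit` plays the role of job submission, `as_completed` the role of monitoring, and `future.result()` the role of result retrieval. The cost function, the selection rule and all names (`run_job`, `run_generation`, `next_generation`, `estimate`) are hypothetical choices made to keep the example short.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def simulate_cost(k):
    """Stand-in for one ODE simulation scored against data (hypothetical optimum at 0.3)."""
    return (k - 0.3) ** 2

def run_job(candidates):
    """Work done by one job: simulate every parameter value assigned to it."""
    return [(simulate_cost(k), k) for k in candidates]

def run_generation(population, job_size, executor):
    """Submit all jobs of a generation, wait for them, and integrate the results."""
    jobs = [population[i:i + job_size] for i in range(0, len(population), job_size)]
    futures = [executor.submit(run_job, job) for job in jobs]   # submission
    scored = []
    for future in as_completed(futures):                        # monitoring
        scored.extend(future.result())                          # retrieval
    return sorted(scored)                                       # integration

def next_generation(scored, size, sigma, rng):
    """Build the following population by mutating the best tenth of the current one."""
    elite = [k for _, k in scored[:max(1, size // 10)]]
    return [rng.choice(elite) + rng.gauss(0.0, sigma) for _ in range(size)]

def estimate(generations=10, size=40, job_size=10, seed=1):
    rng = random.Random(seed)
    population = [rng.uniform(0.0, 1.0) for _ in range(size)]
    best = None
    with ThreadPoolExecutor(max_workers=4) as executor:
        for _ in range(generations):
            scored = run_generation(population, job_size, executor)
            if best is None or scored[0] < best:
                best = scored[0]
            population = next_generation(scored, size, 0.05, rng)
    return best

best_cost, best_k = estimate()
print(best_cost, best_k)
```

A generation cannot be built until every job of the previous one has returned, so the slowest job, including its queue time, sets the pace of each step.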

Conclusion: 

We present a grid-oriented approach to solving ODE systems that describe cell cycle models, in order to make numerical simulation of the biological process easier and more accurate.
We chose to perform parameter estimation on a High Performance Computing platform such as the grid because the system is designed to estimate the best model by computing many different simulations of each model.
The implemented system is useful for managing the mathematical information related to cell cycle models and for simulating the whole process using the grid platform.

Acknowledgment: 

This project has been supported by:
  Italian FIRB-MIUR project “LITBIO”
  European projects “BioinfoGRID” and “EGEE”
People from the Bioinformatics Group at the Institute for Biomedical Technologies, CNR, Milan:
  Luciano Milanesi
  Ivan Merelli
  Ettore Mosca