ahm poster gridblast 2004
Added: December 28, 2007

Presentation Transcript

A GT3 based BLAST grid service for biomedical research

Micha Bayer (1), Aileen Campbell (2) & Davy Virdee (2)
(1) National e-Science Centre, e-Science Hub, Kelvin Building, University of Glasgow, Glasgow G12 8QQ
(2) Edikt, National e-Science Centre, e-Science Institute, 15 South College Street, Edinburgh EH8 9AA

Overview
- BLAST is a well-known program for biological sequence comparison
- it is used to compare query sequences to a set of target sequences in order to find similar sequences in the target set
- it can be extremely compute intensive
- we present a parallel implementation of BLAST delivered via a GT3 grid service
- the service is part of the BRIDGES project, a UK e-Science project aimed at providing a grid based environment for research into the genetic causes of hypertension (http://www.brc.dcs.gla.ac.uk/projects/bridges/)

Parallel BLAST
- to achieve maximum performance in a grid context, we have parallelised BLAST
- multiple query sequences are partitioned into sub-jobs on the basis of the number of idle compute nodes available and then processed on these in batches
- we have provided our own Java based scheduler which distributes sub-jobs across an array of resources

System Architecture
- the grid service uses the GT3.0.2 core only
- we have provided our own wrappers for the OpenPBS client side and the Condor submission components
- a scheduler component examines the input, polls resources for available processors and farms out subtasks to the resources
- details of resources (i.e. clusters) are held in a single XML config file, so adding new resources is easy
- target databases are located on execute nodes or on the cluster master node to minimise stage-in time; these need updating regularly

Design Issues
- no suitable metaschedulers were available at the time of designing the system, so we had to write our own
- the system only uses the GT3 core as a thin layer between client side and scheduler, since full GT3 was due to be replaced by WSRF; this minimises future porting effort

Compute Resources Used
- ScotGRID compute cluster at Glasgow Univ.: a 250 processor Linux cluster
- Condor pool at National e-Science Centre, Glasgow Univ.: 25 desktop machines, single processors

Client Side
- users of the service range from expert to low computer literacy
- the delivery mechanism chosen was therefore via the BRIDGES web portal (see below)
- a Java based graphical client to the service is downloaded via Java Webstart
- this allows for easy, centralised updates
- it also provides a good opportunity to explore client side Globus

Scheduler Algorithm
parse input and count no. of query sequences
poll resources and establish total no. of idle nodes
set number of sub-jobs to be run equal to total no. of idle nodes
calculate no. of sequences to be run per sub-job, n (= no. of sequences / no. of idle nodes)
while there are sequences left
    save n sequences to a sub-job input file
if the number of idle nodes is 0
    make up a small number of sub-jobs (currently hardcoded to 5) and evenly distribute these into queues across resources
else
    for each resource
        send i sub-jobs to the resource as separate threads, where i is the number of idle nodes on the resource
when results are complete, save to file in the original input file order
return this to the user

Summary
We have constructed a parallelised BLAST service that farms out multiple query sequences as sub-jobs to a pool of resources. Our scheduler runs over OpenPBS and Condor resources via our own Java wrappers. Client side delivery is through a Java GUI delivered via a web portal and Java Webstart.

Contact / Further Information
BRIDGES website and portal at http://www.brc.dcs.gla.ac.uk/projects/bridges/
email contact: michab@dcs.gla.ac.uk
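
The planning part of the Scheduler Algorithm above can be sketched in Java, the language the poster names for the scheduler. This is a minimal illustration under stated assumptions, not the BRIDGES code: the class name SubJobPlanner, the methods partition and plan, and the resource names are hypothetical. It also assumes that n is rounded up so no sequence is dropped, and that "evenly distribute" in the zero-idle case means round-robin across resources. Job submission and result collection are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the sub-job planning step; not the BRIDGES implementation. */
final class SubJobPlanner {

    /** Number of sub-jobs used when no node is idle (the poster states 5). */
    static final int FALLBACK_SUB_JOBS = 5;

    /** Splits queries, in input order, into at most subJobCount contiguous batches. */
    static List<List<String>> partition(List<String> queries, int subJobCount) {
        List<List<String>> batches = new ArrayList<>();
        if (queries.isEmpty() || subJobCount <= 0) {
            return batches;
        }
        // n = sequences per sub-job, rounded up (assumption) so nothing is left over
        int n = (queries.size() + subJobCount - 1) / subJobCount;
        for (int start = 0; start < queries.size(); start += n) {
            int end = Math.min(start + n, queries.size());
            batches.add(new ArrayList<>(queries.subList(start, end)));
        }
        return batches;
    }

    /** Maps each resource name to the batches it should run. */
    static Map<String, List<List<String>>> plan(List<String> queries,
                                                Map<String, Integer> idleByResource) {
        int totalIdle = 0;
        for (int idle : idleByResource.values()) {
            totalIdle += idle;
        }
        int subJobCount = totalIdle > 0 ? totalIdle : FALLBACK_SUB_JOBS;
        List<List<String>> batches = partition(queries, subJobCount);

        List<String> names = new ArrayList<>(idleByResource.keySet());
        Map<String, List<List<String>>> plan = new LinkedHashMap<>();
        for (String name : names) {
            plan.put(name, new ArrayList<>());
        }
        if (names.isEmpty()) {
            return plan;
        }
        if (totalIdle > 0) {
            // Each resource receives as many sub-jobs as it has idle nodes.
            int next = 0;
            for (String name : names) {
                int idle = idleByResource.get(name);
                for (int i = 0; i < idle && next < batches.size(); i++) {
                    plan.get(name).add(batches.get(next++));
                }
            }
        } else {
            // No idle nodes: queue the fallback sub-jobs round-robin (assumption).
            for (int i = 0; i < batches.size(); i++) {
                plan.get(names.get(i % names.size())).add(batches.get(i));
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            queries.add("query" + i);
        }
        Map<String, Integer> idle = new LinkedHashMap<>();
        idle.put("cluster", 3);      // hypothetical resource names and idle counts
        idle.put("condor-pool", 1);
        System.out.println(plan(queries, idle));
    }
}
```

Because the batches are contiguous slices of the input, concatenating the per-batch results in batch order restores the original input file order, which is what the final steps of the algorithm require.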