GGF9 Gfarm Waldarrama
Added: October 09, 2007

Presentation Transcript

Grid Datafarm and File System Services:
  Osamu Tatebe
  Grid Technology Research Center, National Institute of Advanced Industrial Science and Technology (AIST)

ATLAS/Grid Datafarm project: CERN LHC Experiment:
  [Figure labels: Truck; ATLAS Detector, 40 m x 20 m, 7000 tons; LHC perimeter 26.7 km]
  ~2000 physicists from 35 countries
  Collaboration between KEK, AIST, Titech, and ICEPP, U Tokyo

Petascale Data-intensive Computing Requirements:
  Peta/Exabyte scale files
  Scalable parallel I/O throughput: > 100 GB/s, hopefully > 1 TB/s, within a system and between systems
  Scalable computational power: > 1 TFLOPS, hopefully > 10 TFLOPS
  Efficient global sharing with group-oriented authentication and access control
  Resource management and scheduling
  System monitoring and administration
  Fault tolerance / dynamic re-configuration
  Global computing environment

Grid Datafarm (1): Global virtual file system [CCGrid 2002]:
  World-wide virtual file system
    Transparent access to dispersed file data in a Grid
    Map from virtual directory tree to physical file
    Fault tolerance and access-concentration avoidance by file replication
  [Figure labels: Grid File System; file replica creation; virtual directory tree; mapping]

Grid Datafarm (2): High-performance data processing [CCGrid 2002]:
  World-wide parallel and distributed processing
    Aggregate of files = superfile
    Data processing of superfiles = parallel and distributed data processing of member files
    Local file view
    File-affinity scheduling
  [Figure labels: Grid File System; virtual CPU; newspapers in a year (superfile), 365 newspapers; world-wide parallel & distributed processing]

Extreme I/O bandwidth support example: gfgrep - parallel grep:
  % gfrun -G gfarm:input gfgrep -o gfarm:output regexp gfarm:input

  open("gfarm:input", &f1)
  create("gfarm:output", &f2)
  set_view_local(f1)
  set_view_local(f2)
  grep regexp
  close(f1); close(f2)

  [Figure labels: sites CERN.CH and KEK.JP; gfarm:input with fragments input.1 to input.5; hosts Host1.ch, Host2.ch, Host3.ch, Host4.jp, Host5.jp; gfmd; file-affinity scheduling]

Design of AIST Gfarm Cluster I:
  Cluster node (high density and high performance)
    1U, dual 2.8 GHz Xeon, GbE
    800 GB RAID with four 3.5" 200 GB HDDs + 3ware RAID
    97 MB/s on writes, 130 MB/s on reads
  80-node experimental cluster (operational from Feb 2003)
    Force10 E600
    181st position in TOP500 (520.7 GFlops, peak 1000.8 GFlops)
    70 TB Gfarm file system with 384 IDE disks
    7.7 GB/s on writes, 9.8 GB/s on reads for a 1.7 TB file
    1.6 GB/s (= 13.8 Gbps) on file replication of a 640 GB file with 32 streams

World-wide Grid Datafarm Testbed:
  Total disk capacity: 80 TB, disk I/O bandwidth: 12 GB/s
  Sites: KEK, Titech, AIST, SDSC, Indiana U, Tsukuba U, Kasetsart U (Thailand)

Gfarm filesystem metadata:
  File status
    File ID
    Owner, file type, access permission, access times
    Number of fragments, a command history
  File fragment status
    File ID, fragment index
    Fragment file size, checksum type, checksum
  Directories
    List of file IDs and logical filenames
  Replica catalog
    File ID, fragment index, filesystem node
  Filesystem node status
    Hostname, architecture, #CPUs, . . .
  [Figure labels: file status, file fragment, directories, replica catalog, filesystem node (Gfarm filesystem metadata); Virtual File System Metadata Services; Replica Location Services]

Filesystem metadata operation:
  No direct manipulation
    Metadata is consistently managed via file operations only
    open() refers to the metadata
    close() updates or checks the metadata
    rename(), unlink(), chown(), chmod(), utime(), . . .
  New replication API
    Creation and deletion
    Inquiry and management
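
The gfgrep slide combines two ideas: each process sees only its own fragment of the superfile (local file view), and each process is placed on a host that already stores that fragment (file-affinity scheduling). The following is a minimal Python sketch of that idea, not Gfarm code. The replica table, the fragment contents, the lowercase host names, and every function name here are hypothetical, and the "least-loaded replica holder" rule is an assumed policy chosen only for illustration. The loop runs sequentially where a real system would run one process per host.

```python
import re
from collections import defaultdict

# Hypothetical replica catalog: fragment index -> hosts holding a copy.
REPLICAS = {
    0: ["host1.ch"],
    1: ["host2.ch"],
    2: ["host3.ch", "host4.jp"],
    3: ["host4.jp"],
    4: ["host5.jp"],
}

# Hypothetical fragment contents (one list of lines per fragment).
FRAGMENTS = {
    0: ["muon track", "electron"],
    1: ["jet", "muon pair"],
    2: ["photon"],
    3: ["muon"],
    4: ["tau", "kaon"],
}


def schedule_by_affinity(replicas):
    """Assign each fragment to the least-loaded host that stores a replica.

    Assumed policy for illustration only; ties are broken by host name.
    """
    load = defaultdict(int)
    plan = {}
    for index in sorted(replicas):
        host = min(replicas[index], key=lambda h: (load[h], h))
        plan[index] = host
        load[host] += 1
    return plan


def grep_fragment(pattern, lines):
    """Return the lines of one fragment (the local view) that match pattern."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def parallel_grep(pattern, replicas, fragments):
    """Plan placement, then grep each fragment separately.

    The output is kept per fragment, mirroring an output superfile whose
    member files correspond to the input fragments.
    """
    plan = schedule_by_affinity(replicas)
    output = {i: grep_fragment(pattern, fragments[i]) for i in plan}
    return plan, output


if __name__ == "__main__":
    plan, output = parallel_grep("muon", REPLICAS, FRAGMENTS)
    for index in sorted(plan):
        print(index, plan[index], output[index])
```

Because every fragment is processed where it is stored, no fragment data crosses the network in this model; only the placement decision needs the replica catalog.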
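
The metadata slides state that metadata is never edited directly and that close() "updates or checks" it. One way to read this is sketched below in Python, under stated assumptions: the first close of a fragment records its size and checksum, and a later close of the same fragment on another host (a replica) is checked against that record before the replica catalog is extended. The class names, the method name, and the choice of MD5 are hypothetical and are not taken from the Gfarm API.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FragmentStatus:
    """Per-fragment metadata named on the slide: size, checksum type, checksum."""
    size: int
    checksum_type: str
    checksum: str


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for the metadata services."""
    # (file_id, fragment_index) -> FragmentStatus
    fragments: dict = field(default_factory=dict)
    # (file_id, fragment_index) -> set of filesystem nodes (replica catalog)
    replicas: dict = field(default_factory=dict)

    def close_fragment(self, file_id, index, host, data: bytes):
        """Update metadata on first close; check it on every later close."""
        key = (file_id, index)
        status = FragmentStatus(len(data), "md5", hashlib.md5(data).hexdigest())
        known = self.fragments.get(key)
        if known is None:
            # First writer: close() updates the metadata.
            self.fragments[key] = status
        elif known != status:
            # Later writer: close() checks the metadata and rejects a mismatch.
            raise ValueError("fragment does not match recorded metadata")
        self.replicas.setdefault(key, set()).add(host)
        return status


if __name__ == "__main__":
    store = MetadataStore()
    store.close_fragment(7, 0, "host1.ch", b"abc")
    store.close_fragment(7, 0, "host4.jp", b"abc")  # accepted as a replica
    print(sorted(store.replicas[(7, 0)]))
```

Routing every change through a file operation keeps the fragment status and the replica catalog consistent with each other, which appears to be the point of the "no direct manipulation" rule.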