Towards e-Science: Scientific Data Grid in CAS: 

Yongzheng Ma, CNIC, CAS (myz@cnic.cn)
APEC TEL GRID WORKSHOP, Sep. 20, 2004, Singapore

Outline: 

- e-Science activities in CAS
  - CAS and e-Science
  - Current efforts for e-Science in CAS
- Scientific Data Grid
  - Resource
  - Middleware
  - Applications
- Conclusion

What’s e-Science: 

e-Science is the informatization of research activities.

Why e-Science?: 

Challenges in modern research:

- Problems are more complex than ever.
- The research object is not isolated, but cross-discipline and large-scale.
- Data processing, simulation, and computing have become indispensable methods.
- Scientists need more and more communication and collaboration.

Background of e-Science for CAS : 

- CAS launched the Knowledge Innovation Program in 1998; it is now time to push it forward in all aspects.
- Scientists demand a higher level of informatization to meet their requirements in research activities.
- CAS started the Informatization Program in the 10th Five-year Plan (2001-2005).
- Informatization will greatly promote technology innovation and knowledge innovation.

Informatization of Research Activities: 

- Bridge the gaps of time, space, and environment to enable global, cross-discipline, large-scale collaboration among scientists.
- Change the way scientists do research, greatly improve communication and collaboration, and advance the development of science and technology.
- Informatization of research activities is the pioneer of informatization of the whole society.

Features of e-Science: 

- Open
- Resource sharing: supercomputers, data, instruments, …
- Coordinated research
  - Working with a colleague across an ocean as if in the same lab
  - Cross-discipline, complex, coordinated problem-solving

Infrastructure for e-Science: 

- Computing resources
- Data resources
- Software resources
- Communication resources
- Human resources
- Scientific instruments: particle accelerators, telescopes, sensors, …

e-Science and Application: 

- e-Science provides an informatized environment and platform for research.
- Individual applications for specific fields and areas should be developed case by case.
- Applications are key.

Milestones of e-Science in CAS: 

- 2000: proposed “Informatized Research Environment” in the SDB project
- March 2001: proposed “Scientific Data Grid”
- August 2001: the project was funded by the CAS Informatization Program
- December 2001: proposed “China Science Grid”
- October 2002: “Scientific Data Grid” joined the China National Grid and became a key component

e-Science Activities in CAS (2001-2005): 

- Upgrading IT Infrastructure
- Constructing Scientific Research Environment
- Developing Key IT Technologies
- Demonstrating Science Applications

Upgrading IT Infrastructure: 

- Networks
  - CSTNET
    - Domestic links: 155M-2.5G
    - International links: 310M
  - CNGI (China Next Generation Internet)
    - Supported by the National Development and Reform Commission
    - 12 GigaPoPs; 2.5-10G links will be built by CAS
- Scientific Database: 10TB
- Supercomputing Environment: 5 TFLOPS
- Mass Storage System: 100TB
- Visualization Environment: SGI Onyx3000
- Lenovo 6800, installed at CNIC

DeepComp 6800: 

- Developed by the Lenovo Group Corp, China
- Completed in Nov. 2003
- Installed at CNIC, CAS in December 2003
- 2.6TB memory
- 81TB disks
- 4.183 TFLOPS Linpack performance (78.5% efficiency)
- Ranked 14th in the Top500 list (Nov. 2003)
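The Linpack figures on this slide can be related by simple arithmetic. The sketch below assumes that "efficiency" means measured Linpack performance divided by theoretical peak; the peak value it computes is derived here and is not stated on the slide, and the function name is illustrative.

```python
def implied_peak(linpack_tflops, efficiency):
    """Return the theoretical peak implied by a Linpack result.

    Assumes efficiency = measured Linpack performance / theoretical peak.
    """
    return linpack_tflops / efficiency


# Figures quoted on the slide: 4.183 TFLOPS at 78.5% efficiency.
peak = implied_peak(4.183, 0.785)
print(f"Implied theoretical peak: {peak:.2f} TFLOPS")
```

Under that assumption the machine's theoretical peak comes out a little above 5.3 TFLOPS, which is consistent with the "5 TFLOPS" supercomputing environment listed on the infrastructure slide.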

Lenovo DeepComp 6800: 


Constructing Scientific Research Environments based on the Internet: 

- Network of Field Observatories
  - Ecology network
  - Astronomical observatories
  - Weather stations
  - Mountain disaster stations
  - ……
- Network of Digital Libraries of Specimen
  - 24 (zoology, botany, fossil, mineral, …), 80% of the whole country
  - Digital Library of Specimen is starting
- Network of Digital Libraries
  - National Science & Technology Digital Library
- Network of Scientific Instruments
  - LAMOST, BEP-II, electron microscopes, …

Key IT Technologies: 

- NGI Technology
  - IPv6/IPv4 Transition
  - Network Measurement
  - IPv6 Root DNS
  - Multicast Hierarchy
  - Network Security
  - ……
- Resource Location & Addressing
- Grid Computing
  - Data Grid Middleware
  - Data Integration
  - Grid Information Service
  - Grid Security
  - Metadata
  - Grid-enabled application
  - ……

Grid-enabled Applications: 

- Virtual Observatory
- Digital Earth
- HEP Data Grid
- Bio Grid
- Chemical Integrated Information System
- ……

China Science Grid: 

- By 2005, the “Scientific Data Grid” will have been built, enabling the sharing of scientific data resources and collaboration based on it.
- Computing resources and scientific instruments will then be integrated into it.
- China Science Grid will be built on the SDG.
- Grid-enabled applications will also be developed and application grids established: bio grid, astro grid, etc.
- China Science Grid: an instance of e-Science

International Collaboration: 

- PRAGMA, 2002
- GLORIAD, Jan. 2004
  - NCSA (US), Kurchatov Institute (RU)
- KISTI (Korea)
- Internet2
- APAN
- ……

Introduction to GLORIAD: 

- Proposed network/program to be operational in 2004
- Co-developed (and to be co-funded) by the U.S., Russia, and China
- Expanded capacity for science and education collaboration (10 Gbps)
- New “Global Ring” topology for reliability and new applications
- Essential for supporting advanced S&E applications (particularly HEP, astronomy, atmospheric sciences, bioinformatics, optical network research, and network security research)

GLOBAL RING NETWORK FOR ADVANCED APPLICATIONS DEVELOPMENT Russia-China-USA Science & Education Network: 


e-Science Planning in Future: 

- Starting to plan the 11th Five-year Informatization Program (2006-2010)
  - Focus on e-Science in CAS
  - Work with CNGI (China Next Generation Internet)
- International collaboration: GLORIAD, PRAGMA, APAN, …
- Potential killer science applications: Virtual Observatory, High Energy Physics, Bioinformatics, …

Scientific Data Grid (SDG): 

- An exploration towards e-Science
- Undertaken by CAS
- Background
- Current status
  - Resource construction
  - Middleware development
  - Experimental applications

Background: 

- Scientific Data Grid (SDG) is built upon the massive scientific data resources of the Scientific Database (SDB).
- SDB is a long-term project, running since 1983, that has accumulated multi-disciplinary scientific data through the course of research activities in CAS.
- The vision of SDG is to bring these valuable data resources into full play by benefiting from advanced information technologies, in particular Grid technology.

Data Resources: 

- Scientific Database (SDB)
  - 45 institutions across 16 cities
  - 313 databases
  - 10TB total volume
- Covers many disciplines: chemistry, biology, geosciences, environment, astronomy, high energy physics, …

SDG Platform: 

- Data Center
  - Part of the nodes of DeepComp 6800
  - 20TB SAN storage
  - TFLOPS-scale computing

SDG Software Modules: 


SDG Middleware and ToolKits: 

- Grid Middleware
  - Grid Information System
  - SDG Uniform Access Interface
  - SDG Security System
- SDG Toolkits
  - SDG GIS V1.0
  - Universal Metadata Tool V2.0
  - Statistics Tool V1.1

SDG GIS V1.0: 

- Backend: MDS/LDAP
- Two types of information: system info and metadata
- Management and service
- Centralized
- Distributed
- Query

(Diagram labels: GRIP, GRRP, MDRP, SDG GIIS, SDG Sub-GIIS, SDG Applications, MDW, MDIS, C-MDIS, I-MDIS, DCIS)

SDG Universal Metadata Tool: 

- Metadata is tree-like and more flexible than fixed-column tables, which makes it difficult to handle in a web UI.
- XML files are used to store interim results.
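A minimal sketch of the idea on this slide: a tree-shaped metadata record that does not fit a fixed-column table is serialized to XML as an interim result and read back. The record contents, element names, and the `metadata_to_xml` helper are hypothetical illustrations, not the actual SDG schema or tool API.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(name, value):
    """Recursively turn a nested dict/list/scalar into an XML element."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(metadata_to_xml(key, child))
    elif isinstance(value, list):
        for item in value:
            elem.append(metadata_to_xml("item", item))
    else:
        elem.text = str(value)
    return elem


# Hypothetical metadata record with nested and repeated fields.
record = {
    "title": "Example dataset",
    "discipline": "Astronomy",
    "contact": {"institute": "Example Institute", "city": "Beijing"},
    "keywords": ["sky survey", "catalog"],
}

# Serialize the tree as an interim result (a string here; a file in practice).
root = metadata_to_xml("dataset", record)
xml_text = ET.tostring(root, encoding="unicode")

# Read the interim result back and navigate the tree by path.
parsed = ET.fromstring(xml_text)
city = parsed.find("contact/city").text
keywords = [k.text for k in parsed.findall("keywords/item")]
```

Nested and repeated fields map naturally onto child elements, whereas a fixed-column table would need extra tables or columns for each of them.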

Universal Metadata Management Tool: 


Statistics & Analysis Tool (SAT) for Data Volume: 

- Features
  - Win2000/XP, Linux
  - Java 1.4
  - Globus Toolkit 3 Core
  - Oracle, SQL Server, file system
- Deployment
  - Data nodes: 45 institutes at CAS, across 16 cities in China
  - Mediator: CNIC
- Service monitor
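The data-node and mediator arrangement can be pictured as each node reporting its local data volume and the mediator combining the reports. The sketch below is only an illustration under that assumption: the `NodeReport` fields, the `summarize` function, and the sample values are hypothetical and do not reflect the real SAT services or actual SDB figures.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Hypothetical per-node statistics record sent to the mediator."""
    institute: str
    city: str
    database_count: int
    volume_gb: float


def summarize(reports):
    """Combine node reports into grid-wide totals, as a mediator might."""
    return {
        "institutes": len({r.institute for r in reports}),
        "cities": len({r.city for r in reports}),
        "databases": sum(r.database_count for r in reports),
        "total_volume_gb": sum(r.volume_gb for r in reports),
    }


# Made-up sample reports for three data nodes.
reports = [
    NodeReport("Institute A", "Beijing", 3, 120.0),
    NodeReport("Institute B", "Shanghai", 5, 340.5),
    NodeReport("Institute C", "Beijing", 2, 80.0),
]
summary = summarize(reports)
```

Keeping the aggregation at the mediator means each institute only needs to expose its own counts, whatever local database or file system it uses.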

Slide33: 

(Diagram labels: Windows 2k/XP, Java 1.4, GT3 Core, Statistics Services)

SDG Middleware and ToolKits: 

- SDG Middleware
  - Grid Information System
  - SDG Uniform Access Interface
  - SDG Security System
- SDG Toolkits
  - Data Access Subsystem 1.0

SDG Data Access Service Framework: 

(Diagram labels: Application Clients; Internet; Grid Level Services; Information Service; Member Institutes; Node Level Services & Data Resources. Data sources shown: Oracle, SQLServer, FileSystem, mySQL, DB2, Foxpro, ……)
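One way to picture uniform access over heterogeneous node-level resources is a common interface with one adapter per kind of data source. The sketch below is an assumption-laden illustration, not the SDG Uniform Access Interface: the `DataSource`, `SqlSource`, and `FileSource` classes are hypothetical, and an in-memory SQLite database stands in for the database products named in the diagram.

```python
import sqlite3
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class DataSource(ABC):
    """Hypothetical uniform interface over one node-level data resource."""

    @abstractmethod
    def list_items(self):
        """Return the names of the items this resource holds."""


class SqlSource(DataSource):
    """Adapter for a relational database (SQLite used as a stand-in)."""

    def __init__(self, connection, table):
        self.connection = connection
        self.table = table  # trusted table name; kept simple for the sketch

    def list_items(self):
        rows = self.connection.execute(
            f"SELECT name FROM {self.table} ORDER BY name"
        )
        return [row[0] for row in rows]


class FileSource(DataSource):
    """Adapter for a plain directory on a file system."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def list_items(self):
        return sorted(p.name for p in self.directory.iterdir() if p.is_file())


def list_all(sources):
    """Query every source through the same interface."""
    return {label: source.list_items() for label, source in sources.items()}


# Made-up sample data for both kinds of source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spectra (name TEXT)")
conn.executemany("INSERT INTO spectra VALUES (?)", [("s2",), ("s1",)])

workdir = tempfile.mkdtemp()
Path(workdir, "image1.fits").write_text("placeholder")

catalog = list_all({
    "database": SqlSource(conn, "spectra"),
    "files": FileSource(workdir),
})
```

A client calling `list_all` does not need to know whether a member institute stores its data in a database or in files; supporting another backend means adding another adapter.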

Slide37: 

Data Access

SDG Middleware and ToolKits: 

- SDG Middleware
  - Grid Information System
  - SDG Uniform Access Interface
  - SDG Security System
- SDG Toolkits
  - SDG CA V1.0
  - Access Control Toolkit V1.1

SDG Security System: 

- Full process of security-related operations under the SDG Security System
- GSI-based
- Certificates are used to identify users
- Role-based local access control
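Role-based local access control keyed on a certificate identity can be sketched as two lookups: certificate subject to roles, then roles to permitted actions. The example below assumes the certificate has already been validated (for instance by GSI) and only models the local authorization step; the subject names, roles, and the `is_allowed` function are hypothetical.

```python
# Hypothetical local policy: which actions each role may perform.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "curator": {"read", "write"},
}

# Hypothetical mapping from certificate subject names to local roles.
SUBJECT_ROLES = {
    "/O=Example/CN=Alice": {"curator"},
    "/O=Example/CN=Bob": {"reader"},
}


def is_allowed(subject, action,
               subject_roles=SUBJECT_ROLES,
               role_permissions=ROLE_PERMISSIONS):
    """Return True if any role held by the subject permits the action.

    Assumes the subject name comes from an already-verified certificate.
    """
    roles = subject_roles.get(subject, set())
    return any(action in role_permissions.get(role, set()) for role in roles)


print(is_allowed("/O=Example/CN=Alice", "write"))
print(is_allowed("/O=Example/CN=Bob", "write"))
```

Because the role tables are local, each member institute can keep its own policy while relying on the shared certificates for identity.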

Security Subsystem: 


SDG Middleware and ToolKits: 

- SDG Middleware
  - Grid Information System
  - SDG Uniform Access Interface
  - SDG Security System
- SDG Toolkits
  - SDG Portal (prototype)
  - Image Process Tool 1.0
  - Storage Sharing Service

Pilot Applications: 

- Virtual Observatory
- High Energy Physics
- Global Climate Data Integration
- Bioinformatics Integration
- Resources and Environment Monitoring

China Virtual Observatory Demo: 


Conclusion: 

- Today's and tomorrow's research demands global collaboration: e-Science.
- The progress of information technology makes it possible.
- CAS is making great efforts on e-Science through its Informatization Program in the 10th Five-year Plan.
- The e-Science Program in the 11th Five-year Plan (2006-2010) is being worked out.
- e-Science will become the groundwork of research in the next five years.
- Scientific Data Grid is the first experimental project for CAS e-Science.
- A few science applications on SDG will be our exploration towards e-Science.

Slide46: 

Thank you!